Interest in IT automation is picking up rapidly, in large part because deployment of complex cloud-native technologies has dramatically increased the demand for – and value of – automation.

Migrating to cloud-native technologies can increase application and infrastructure management complexity and overwhelm IT professionals with repetitive manual work.

Automation is one approach to better manage workloads, and it can deliver important benefits, including reducing errors and allowing skilled IT professionals to shift their work toward mission-critical activities.

The first part of this report examines the broad trend toward IT automation.

The second part of the report drills into automation of monitoring and incident response workflows and presents a taxonomy framework designed to help IT professionals prioritize automation options and strategies.

Both sections are based on data obtained from surveys of IT professionals in the 451 Alliance community.

WHAT’S DRIVING IT AUTOMATION?

Cloud-native technologies such as containers, Kubernetes and microservices generate complex, dynamic environments that create new management challenges.

Embracing automation relieves some of the workload on IT teams. It can also lead to other important benefits, including improved reliability, enhanced security and faster access to resources.

Automation can also be used to address skills gaps. In a survey of IT professionals in the 451 Alliance, the top two categories of acute skills shortages were in cloud platform expertise and cloud-native functions.

IT categories facing skills shortages

Retraining staff, hiring new staff, and relying on third-party contractors or consultants are the top approaches for addressing skills gaps, but almost one-third (29%) of the respondents say they plan to implement or expand IT automation to help close skills gaps.

How organizations will address IT skills gaps

More than half (57%) of the survey respondents say that their IT environment is 'mostly manual with some automated processes,' while 28% are 'mostly automated with some manual processes' and 11% are 'almost totally manual.' Only 5% of the survey respondents say that their IT environment is highly automated.

Regardless of the current level of automation, the intent to invest more in automation is striking: Almost three-fourths (74%) of the companies in the 451 Alliance plan to increase spending on automation over the next 12 months, with 21% anticipating ‘significant’ increases.

AUTOMATION BENEFITS

Companies with at least some level of automation report a variety of benefits, including:

  • Increased reliability/consistency (cited by 52% of the participants)
  • Improved security (45%)
  • Accelerated access to infrastructure resources (37%)
  • Faster time to business value (35%)
  • Decreased operational expenditures (28%)
  • Increased infrastructure utilization (25%)

Another important benefit of IT automation is the ability to shift scarce human resources toward mission-critical projects and away from rote, manual work. This is important because IT staffs on average spend 60% of their time on ongoing business, as opposed to new, mission-critical projects.

MONITORING AND INCIDENT MANAGEMENT AUTOMATION

Today, monitoring and incident response is often characterized by disjointed approaches to automating certain responses, as well as a lack of appropriate tools and best practices, particularly in cloud and cloud-native environments.

451 Research has developed a taxonomy framework for monitoring and incident management automation that is designed to help organizations examine the benefits they might realize from adopting automation.

The taxonomy framework is organized into five categories of monitoring and incident response automation: pre-production, monitoring, incident management, autoremediation and continuous optimization.

Taxonomy Framework for Monitoring and Incident Management Automation

Each business’s experience will be different, and we encourage organizations to rate potential automations according to their needs.

For example, if you find that your team often experiences a considerable time lag between when an incident is detected and when the appropriate skilled person begins to react, you may find that automating the process of identifying the correct responder can have a high impact, can be relatively easily implemented and should be prioritized.
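As a rough illustration of that example, responder identification can be reduced to a rule lookup. The sketch below is a minimal one under stated assumptions: the rule format, the service and team names, and the `find_responder` function are all hypothetical and are not part of the 451 Research framework or any specific tool.

```python
def find_responder(service, severity, routing_rules, default="on-call-primary"):
    """Return the first responder whose rule matches the alert.

    routing_rules is a list of dicts with hypothetical keys:
    "service", "severities" (a set of strings) and "responder".
    """
    for rule in routing_rules:
        if rule["service"] == service and severity in rule["severities"]:
            return rule["responder"]
    # No rule matched: fall back to a default on-call contact.
    return default


# Illustrative rules only; real rules would come from an on-call schedule.
RULES = [
    {"service": "payments", "severities": {"critical", "high"}, "responder": "payments-sre"},
    {"service": "payments", "severities": {"low"}, "responder": "payments-dev"},
    {"service": "database", "severities": {"critical"}, "responder": "dba-on-call"},
]

print(find_responder("payments", "critical", RULES))
print(find_responder("search", "high", RULES))
```

In practice the lookup would likely draw on live schedules and service ownership data, but even a static table of this kind removes the manual step of deciding who should be paged.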

The level of difficulty to implement an automation will depend on a number of factors, including the maturity level of existing development processes, as well as available skills and tools.

For instance, to perform automated release validation, an organization will require a CI/CD tool, internal continuous development cultures and processes, and an integrated monitoring system for measuring the performance of new code.

It also involves important decisions about preferred approaches to testing, which could include canary and blue/green deploys, all of which require processes and measurement before code is validated for production.

These tools, processes, cultures and decisions must be in place to achieve the most impact from automating release validation.
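To make the measurement step concrete, a canary check can be sketched as a comparison between metrics from the current release and metrics from the canary. This is only a sketch: the `ReleaseMetrics` fields, the `validate_canary` function and both thresholds are assumptions chosen for illustration, and a real pipeline would pull these values from its monitoring system.

```python
from dataclasses import dataclass


@dataclass
class ReleaseMetrics:
    error_rate: float      # fraction of failed requests, 0.0 to 1.0
    p95_latency_ms: float  # 95th percentile latency in milliseconds


def validate_canary(baseline, canary, max_error_increase=0.01, max_latency_ratio=1.2):
    """Return (passed, reasons) for a canary compared against the baseline.

    Both thresholds are illustrative defaults, not recommended values.
    """
    reasons = []
    if canary.error_rate - baseline.error_rate > max_error_increase:
        reasons.append("error rate rose more than the allowed increase")
    if canary.p95_latency_ms > baseline.p95_latency_ms * max_latency_ratio:
        reasons.append("p95 latency exceeded the allowed ratio")
    return (not reasons, reasons)


baseline = ReleaseMetrics(error_rate=0.01, p95_latency_ms=200.0)
canary = ReleaseMetrics(error_rate=0.05, p95_latency_ms=300.0)
passed, reasons = validate_canary(baseline, canary)
print(passed, reasons)
```

A CI/CD tool could call a check like this after routing a small share of traffic to the new code, promoting the release only when it passes.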

The tools required to implement an automation may vary, with overlaps in capabilities across tool categories.

For example, grouping alerts or issues into incidents can be done by a monitoring, incident management, alerting or event-correlation tool. Or, multiple tools may be used for different types of alert or issue groups. In complex automations, multiple tools may need to be strung together.
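One simple way to group alerts, shown here purely as a sketch, is to merge alerts for the same service that arrive within a fixed time window. The `Alert` and `Incident` types, the `group_alerts` function and the five-minute window are assumptions for illustration; commercial tools typically use richer correlation logic.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    service: str
    timestamp: float  # seconds since epoch
    message: str


@dataclass
class Incident:
    service: str
    alerts: list = field(default_factory=list)


def group_alerts(alerts, window_seconds=300):
    """Group alerts into incidents by service and time proximity.

    An alert joins the latest incident for its service if it arrives within
    window_seconds of that incident's most recent alert; otherwise it opens
    a new incident.
    """
    incidents = []
    latest = {}  # service name -> most recent incident for that service
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        current = latest.get(alert.service)
        if current is not None and alert.timestamp - current.alerts[-1].timestamp <= window_seconds:
            current.alerts.append(alert)
        else:
            current = Incident(service=alert.service, alerts=[alert])
            incidents.append(current)
            latest[alert.service] = current
    return incidents


sample = [
    Alert("api", 0, "high latency"),
    Alert("api", 100, "error spike"),
    Alert("db", 50, "replication lag"),
    Alert("api", 1000, "high latency"),
]
for incident in group_alerts(sample):
    print(incident.service, len(incident.alerts))
```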

Additionally, there are general-purpose robotic process automation (RPA) tools that can be used to build some of the automations required in monitoring and incident response.

We encourage organizations to determine a comfort level in terms of human involvement for each automation. For some automations, teams may want a human to review the proposed action and authorize it.

Other automations, such as ticket creation and enrichment, may be regarded as low risk and thus not require human permissions.
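The distinction between automations that need sign-off and those that do not can be expressed as a simple gate. The following is a minimal sketch, assuming a three-level risk rating and an `approver` callback that stands in for whatever review step a team uses; the names `Risk` and `run_automation` are hypothetical.

```python
from enum import Enum


class Risk(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def run_automation(name, risk, action, approver=None, approval_threshold=Risk.MEDIUM):
    """Run action directly when risk is below the threshold.

    At or above the threshold, the action runs only if approver(name)
    returns True. With no approver available, the action is skipped.
    """
    if risk.value >= approval_threshold.value:
        if approver is None or not approver(name):
            return "skipped: approval not granted"
    action()
    return "executed"


log = []
# Low-risk automation: runs without a human in the loop.
print(run_automation("create_ticket", Risk.LOW, lambda: log.append("ticket")))
# High-risk automation: blocked until someone approves it.
print(run_automation("restart_service", Risk.HIGH, lambda: log.append("restart")))
print(run_automation("restart_service", Risk.HIGH, lambda: log.append("restart"),
                     approver=lambda name: True))
```

Lowering or raising `approval_threshold` is one way a team could adjust its comfort level over time as it gains confidence in a given automation.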

Users of 451 Research’s taxonomy framework may want to add a column rating the potential risk of each automation based on the requirements of the business.

DRAWBACKS TO AUTOMATION

Of course, there can be drawbacks to automating or partially automating processes in the context of monitoring and incident response. IT professionals in the 451 Alliance noted the following drawbacks:

  • Change management complexity (cited by 42% of the survey respondents)
  • High learning curve (40%)
  • Lack of customization (28%)
  • Skills displacement (25%)
  • Security gaps (18%)
  • Role attrition (18%)
  • Cost overruns (15%)

Only 22% of the survey participants said that their organization did not experience any of these drawbacks.

CONCLUSIONS AND RECOMMENDATIONS

For organizations interested in expanding the use of automation in monitoring and incident response, the first step should be to measure current operations in order to identify workflows that are ripe for automation. Teams may already have access to important data that enables this type of analysis in their monitoring, alerting and incident response tools.

With analysis of this data in hand, they can create a taxonomy of potential workflows that might be automated, assessing the difficulty with which an automation might be built, as well as potential benefits and risks.

We recommend starting with automations that are relatively easy to implement, with low risk and high potential impact. Examples include automatic ticket creation and ticket enrichment.
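A team could turn this recommendation into a rough ranking by rating each candidate on impact, difficulty and risk. The sketch below assumes 1-to-5 ratings and an arbitrary scoring formula; the formula, the `Candidate` type and the sample ratings are illustrative assumptions, not values from the 451 Research framework or the survey.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    impact: int      # 1 (low) to 5 (high)
    difficulty: int  # 1 (easy) to 5 (hard)
    risk: int        # 1 (low) to 5 (high)


def priority_score(candidate):
    """Higher is better: reward impact, penalize difficulty and risk.

    The weighting is an assumption; teams should tune it to their needs.
    """
    return 2 * candidate.impact - candidate.difficulty - candidate.risk


def rank_candidates(candidates):
    """Return candidates ordered from highest to lowest priority score."""
    return sorted(candidates, key=priority_score, reverse=True)


# Hypothetical ratings for illustration only.
candidates = [
    Candidate("autoremediation", impact=5, difficulty=5, risk=4),
    Candidate("ticket creation", impact=4, difficulty=1, risk=1),
    Candidate("ticket enrichment", impact=3, difficulty=1, risk=1),
]
for c in rank_candidates(candidates):
    print(c.name, priority_score(c))
```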

Continued auditing and analysis are key to early implementations. Teams must build in the capability to collect data about the automated processes so that they can measure the impact of each automation.
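Measuring impact can start with a before-and-after comparison of a single metric, such as minutes from detection to first response. The sketch below assumes the team can export those durations from its incident tooling; the `impact_report` function and the sample values are hypothetical and do not represent survey results.

```python
from statistics import mean


def impact_report(before_minutes, after_minutes):
    """Compare average durations before and after an automation.

    Returns the two averages and the percentage change. A negative
    change_pct means the duration went down after the automation.
    """
    before_avg = mean(before_minutes)
    after_avg = mean(after_minutes)
    change_pct = (after_avg - before_avg) / before_avg * 100
    return {"before_avg": before_avg, "after_avg": after_avg, "change_pct": change_pct}


# Made-up durations in minutes, for illustration only.
report = impact_report([30, 40, 50], [10, 20, 30])
print(report)
```

A comparison like this is only as good as the data behind it, so the periods being compared should cover similar incident volumes and types.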

The practice of automating monitoring and incident response workflows is in its early days, with much room to grow. The current pressures on IT Ops and DevOps teams will continue to drive interest in automation, and we anticipate significant progress in the coming years around the development of both best practices and available tools.

The full report is only available to 451 Alliance members.