In the era of the hybrid data center, which combines cloud and on-premises technologies with a mix of management methodologies, IT operations teams are grappling with ever-increasing complexity as they attempt to adapt to this new reality. One need only glance at the news to see the growing number of slow-downs and outages bringing down banking and financial operations, forcing airports to a standstill, and impairing access to medical records.
The effects of our mounting data growth urgently need addressing. There is little doubt that we have reached a stage where the complexity and rate of change have far surpassed the ability of traditional human IT teams to manage the infrastructure effectively.
Integrating with legacy
Technology advancement is a fantastic thing, but new products do not always integrate effectively with legacy environments, leaving gaping vulnerabilities. As a result, organizations are ill-equipped to keep up with the pace of change and to get to grips with how these deployments affect the behavior and performance of application workloads. The resulting slow-downs and outages hit the customer directly, causing significant financial loss to the company, not to mention reputational damage and idled workers. To help navigate this tumultuous path, artificial intelligence for IT operations (AIOps, a term coined by Gartner) has arisen as a solution. It came about as IT operations teams found that they needed a new approach to managing the growing number of elements in the technology stack and its increasing complexity.
While it is generally accepted that automation is a key priority for modern data centers (supporting IT teams in keeping operational processes running consistently and reducing the cost and time spent on maintenance), a true understanding of AIOps, it seems, evades many. In simple terms, AIOps can be viewed in a similar way: it uses anomaly detection and machine learning to enhance human understanding of the infrastructure, reducing the time it takes to locate and diagnose performance problems.
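To make the idea of anomaly detection concrete, here is a minimal sketch that flags latency samples deviating sharply from their recent baseline. It uses a simple trailing-window z-score, which is only one of many possible techniques and far simpler than what a production AIOps platform would use. The function name, the window size, the threshold and the sample data are all illustrative assumptions.

```python
from statistics import mean, stdev


def find_anomalies(samples, window=5, threshold=3.0):
    """Return indices of samples that deviate strongly from the trailing window.

    Hypothetical helper: compares each sample with the mean and standard
    deviation of the `window` samples that precede it.
    """
    anomalies = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat baseline: treat any change at all as anomalous.
            if samples[i] != mu:
                anomalies.append(i)
            continue
        if abs(samples[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Made-up response times in milliseconds, with one obvious spike.
latency_ms = [20, 21, 19, 20, 22, 21, 20, 95, 21, 20]
print(find_anomalies(latency_ms))  # [7]
```

A fixed threshold like this is easy to reason about but does not adapt to daily or seasonal patterns, which is part of why learned baselines are attractive.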
Applied to infrastructure performance management (IPM), AIOps helps ensure optimum performance by overseeing the health and utilization of business-critical, customer-facing applications and by providing alerts in advance of potential blockages or latency issues within the infrastructure.
AIOps helps to monitor and oversee the many disparate components and deployments of the hybrid data center (be it cloud, flash storage, hyperconverged, etc.). It monitors, correlates and prioritizes infrastructure processes for IT operations so that they run as smoothly as possible, whatever the stresses and strains on the ecosystem, be they ad hoc or seasonal. AIOps is also used for capacity planning across the infrastructure to optimize application availability and performance. Leveraging heuristics and algorithms, it can detect and expose anomalies and potential ticking time-bombs in the infrastructure. Its event correlation and analytics capabilities mean that it can sift through a flood of less important alerts to highlight the ones that are critical to the running of the business.
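As an illustration of event correlation and prioritization, the sketch below groups alerts raised on the same resource within a short time window into a single incident, then ranks incidents by severity weighted by how much the resource matters to the business. The Alert fields, the 60-second window, the scoring rule and the resource names are assumptions made for this example, not a description of any particular product.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    timestamp: int  # seconds since an arbitrary start
    resource: str   # hypothetical resource name, e.g. a host or storage array
    severity: int   # 1 (informational) to 5 (critical)
    message: str


def correlate(alerts, window_s=60):
    """Group alerts on the same resource that arrive within window_s of each other."""
    by_resource = defaultdict(list)
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        by_resource[alert.resource].append(alert)

    incidents = []
    for items in by_resource.values():
        current = [items[0]]
        for alert in items[1:]:
            if alert.timestamp - current[-1].timestamp <= window_s:
                current.append(alert)
            else:
                incidents.append(current)
                current = [alert]
        incidents.append(current)
    return incidents


def prioritize(incidents, criticality):
    """Rank incidents by worst severity times the resource's business weight."""
    def score(incident):
        weight = criticality.get(incident[0].resource, 1)
        return max(a.severity for a in incident) * weight
    return sorted(incidents, key=score, reverse=True)


alerts = [
    Alert(0, "array-1", 2, "latency rising"),
    Alert(30, "array-1", 4, "queue depth high"),
    Alert(45, "host-7", 5, "fan speed warning"),
    Alert(500, "array-1", 1, "snapshot complete"),
]
# Assumed business weights: array-1 backs a customer-facing application.
ranked = prioritize(correlate(alerts), {"array-1": 10, "host-7": 1})
for incident in ranked:
    print(incident[0].resource, [a.message for a in incident])
```

In this sketch the two related array-1 alerts surface first as one incident, even though the host-7 alert has the highest raw severity, because the weighting reflects business impact rather than severity alone.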
As the stack has become ever more complex and mission-critical, the capabilities of AIOps are urgently required. Traditional methods and proprietary legacy tools are simply no longer up to the job in today’s hybrid, virtualized and multi-vendor environment.
Navigating the hype
However, to be truly effective, AIOps needs a foundation of experiential machine learning (ML) to reach its proper level of maturity, and this cannot be achieved in short timeframes. For machine learning to be useful to the business, thousands upon thousands of scenarios need to be ingested, which can potentially take years. With so much hype from new companies springing up and claiming to have AIOps functionality, organizations need to be vigilant and crystal clear on what AIOps really is (and what it is not), or IT teams will find themselves ill-equipped to realize its benefits and get their hybrid data centers into shape.
Key AIOps capabilities
To achieve the true promise of AIOps, its capabilities must go beyond mere data aggregation and the application of algorithms. AIOps must represent a holistic transformation of IT operations, including the management model, that intelligently correlates data, analytics and context to automate all elements of IT operations. The question is, how does the organization go about achieving this?
The importance of application-centricity
A vital starting point is to ensure that the AIOps deployment takes an ‘application-centric’ approach. AIOps offerings should include the ability to automatically discover and map the entire infrastructure-to-application topology. This enables a deep understanding of which infrastructure resources are being consumed by every application service. The offering should also be able to proactively identify resource contention issues that may affect performance. The next layer that an app-centric approach offers, over and above aggregation, is the ability to understand the context of how all of the disparate elements of the infrastructure relate and connect to each other. Most importantly, this must include how each application interacts with the infrastructure stack and the application’s value to the business.
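A small sketch can show what an application-to-infrastructure map makes possible. Assuming the topology has already been discovered and is available as a mapping from each application to the resources it consumes, the hypothetical function below reports resources that are both shared by more than one application and running hot, which is one simple way to spot likely contention. The data structures, the 80 percent threshold and all names are illustrative assumptions.

```python
def find_contention(topology, utilization, threshold=0.8):
    """Return shared resources at or above the utilization threshold.

    topology:    application name -> set of resource names it depends on
    utilization: resource name -> current utilization between 0.0 and 1.0
    Result:      resource name -> sorted list of applications competing for it
    """
    users = {}
    for app, resources in topology.items():
        for resource in resources:
            users.setdefault(resource, set()).add(app)

    return {
        resource: sorted(apps)
        for resource, apps in users.items()
        if len(apps) > 1 and utilization.get(resource, 0.0) >= threshold
    }


# Made-up topology: two applications share one storage array.
topology = {
    "billing": {"host-1", "array-1"},
    "checkout": {"host-2", "array-1"},
    "reports": {"host-2"},
}
utilization = {"array-1": 0.92, "host-1": 0.40, "host-2": 0.55}

print(find_contention(topology, utilization))
# {'array-1': ['billing', 'checkout']}
```

Because the result names the affected applications rather than only the overloaded resource, it can be tied back to business value, which is the point of taking an application-centric view.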
Access to high-quality data
AIOps must also leverage high-quality data and analytics to deliver valuable insights that aid decision-making and optimum management. The power of true AIOps lies in going beyond the mere aggregation of alerts. This requires both reactive and proactive capabilities, operating continuously in real time, that provide advance warning of likely slow-downs so that issues can be avoided. It is the availability of operationally impacting data and analytics that empowers IT to apply AI and gain the insights needed. This allows IT operations to focus effectively on holistic performance optimization, and this vital step is where true AIOps platforms add the next crucial layer of value.
Automation for adaptive IT operations
The next key attribute of real AIOps is automation, which is crucial if operations teams are to adapt to change and act proactively to prevent slow-downs and outages. True AIOps achieves this by applying fixes and optimizations as required so that the health of the entire ecosystem can be maintained, executing with an understanding of workload behavior across the entire stack, whether on premises or in the cloud.
Long gone are the days when the IT operations role was tactically focused on simply keeping the infrastructure running. To adequately support enterprise organizations in today’s dynamic environment, IT must not only keep the business running consistently, smoothly and reliably, but also deploy all of AIOps’ capabilities: utilizing full-stack monitoring from an application and business-value perspective, obtaining the right data to deliver high-quality insights, and applying intelligent automation so that IT operations can respond in real time.