Discussion of the AI revolution is all too often focused on customer workloads, and on ensuring that facilities have enough capacity to serve the growing needs of their stakeholders. But there’s another aspect: bringing that AI technology into the data center environment itself, to act as an extra pair of eyes and ears for the hard-working engineers.
Craig Compiano is CEO of Modius, a company that specializes in data center infrastructure management (DCIM) software, bringing data center functionality together into a single system which, among other things, is capable of monitoring and even predicting failures. The system also uses machine learning (ML) to crunch through data points and provide a holistic view of what’s going on, in a format that meets customer demand for metrics.
DCD’s Kat Sullivan met up with Compiano on the DCD>Talks stage to discuss how ML-enriched DCIM can solve many of the pressing issues faced by data center operators, from sustainability metrics to predictive maintenance, and even as a solution to the ongoing skills shortage.
Compiano tells us about the decade-long journey toward widespread DCIM adoption: “We have followed the evolution of DCIM since it started as a collection of vendors in an ecosystem of data center software. Over time, the category solidified around a product suite with the capability to conduct real-time monitoring, rack visualizations, and inventory management. Today, you will find more credible DCIM vendors that can support that entire ecosystem.”
Before moving to a DCIM system, one of the first decisions to be made is exactly what needs measuring, a decision shaped by business needs, the needs of customers and stakeholders, and, increasingly, the requirements of state regulators. All of this has been made possible by the improved capabilities of AI.
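As a concrete example of one commonly tracked metric, power usage effectiveness (PUE) is simply the ratio of total facility power to IT equipment power. The short sketch below shows the calculation with made-up meter readings; the function name and numbers are illustrative only.

```python
# Minimal PUE calculation from two hypothetical meter readings.
# PUE = total facility power / IT equipment power (an industry-standard metric).

def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Return the Power Usage Effectiveness ratio."""
    if it_equipment_kw <= 0:
        raise ValueError("IT load must be positive")
    return total_facility_kw / it_equipment_kw

# Example: 1,300 kW drawn by the whole facility, 1,000 kW reaching the IT racks.
print(f"PUE: {pue(1300.0, 1000.0):.2f}")  # -> PUE: 1.30
```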
“Analyzing requires continuous real-time data because analyzing something that happened six months ago is of no value for running a real-time, mission-critical facility. I want to analyze and forecast the future through a steady stream of organized, structured data. We’ve spent 10 years creating improved tools for operators to analyze the performance of their facilities.”
Compiano goes on to explain that DCIM needs to continue evolving to offer more autonomous recommendations to assist those on the data center floor. He gives the example of liquid cooling, which needs much closer monitoring than traditional air cooling:
“As we move from air-cooling big rooms to liquid-cooling individual servers, the mean time to repair is now seconds, not minutes or hours. As the interval between an incident and an action becomes shorter, you need systems that help the operator react to, and even proactively address, impending failures.”
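To make that concrete, here is a minimal, hypothetical sketch of the kind of check such a system might run: rather than waiting for coolant temperature to cross a hard limit, it also watches the rate of change so the operator is warned while there is still time to act. The class name and thresholds are assumptions for illustration, not Modius’s implementation.

```python
from collections import deque

# Hypothetical thresholds; real values depend on the cooling design.
WARN_TEMP_C = 45.0           # absolute coolant supply temperature limit
WARN_SLOPE_C_PER_MIN = 2.0   # sustained rise that suggests an impending failure

class CoolantWatcher:
    """Flags a loop that is trending toward failure, not just one past a limit."""

    def __init__(self, window: int = 6):
        self.samples = deque(maxlen=window)  # (minute, temp_c) pairs

    def add_sample(self, minute: float, temp_c: float) -> str | None:
        self.samples.append((minute, temp_c))
        if temp_c >= WARN_TEMP_C:
            return "ALERT: coolant temperature limit exceeded"
        if len(self.samples) == self.samples.maxlen:
            (t0, v0), (t1, v1) = self.samples[0], self.samples[-1]
            slope = (v1 - v0) / (t1 - t0)
            if slope >= WARN_SLOPE_C_PER_MIN:
                return "WARNING: coolant temperature rising fast, check pump/CDU"
        return None

watcher = CoolantWatcher()
for minute, temp in enumerate([30, 31, 33, 36, 39, 43]):
    msg = watcher.add_sample(minute, temp)
    if msg:
        print(f"minute {minute}: {msg}")
```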
But that’s not to say there’s a “one-size-fits-all” approach to these issues. A DCIM must be agile to be useful, serving different customer needs in different ways, as Compiano explains:
“A small enterprise data center operator might be focused on keeping track of the inventory and making sure they have power to the rack, so a real-time, massively scalable platform may not be necessary. Some of our customers manage 20 sites across the globe and are keenly interested in extensive data collection. They're in the business of providing highly reliable infrastructure to their customers, and colo customers want real-time transparency with data feeds through APIs.”
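As an illustration of that kind of transparency, the sketch below shows how a colocation customer might poll a DCIM platform’s REST API for a rack’s latest power reading. The endpoint, field names, and authentication scheme are assumptions made for the example; they are not Modius’s actual API.

```python
import requests

# Hypothetical endpoint and field names for illustration only; a real DCIM
# platform will document its own API schema and authentication.
BASE_URL = "https://dcim.example.com/api/v1"
API_TOKEN = "replace-with-your-token"

def fetch_rack_power(site_id: str, rack_id: str) -> dict:
    """Pull the latest power reading for one rack from a DCIM REST API."""
    resp = requests.get(
        f"{BASE_URL}/sites/{site_id}/racks/{rack_id}/power",
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()

reading = fetch_rack_power(site_id="lon-03", rack_id="A12")
print(reading.get("kw"), reading.get("timestamp"))
```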
The ongoing skills shortage in the sector is causing a potential “brain drain”, and Compiano believes that software can offer a repository for all the knowledge that would otherwise be lost:
“It's hard to recruit and fill positions, and then a lot of the talent with many years of knowledge in this industry is retiring or has retired. Software provides a platform to record that knowledge base and conduct better analyses and interpretations of activity to help the operator do a better job.”
That’s not to suggest that DCIM will replace human operators, but rather that it will augment them, giving them all the information they need, even if they’re new to the industry:
“Senior tenured operators have been at it for 30 years. They understand what's going on in the data center, whether through sound, smell, or intuition, and use that expertise to bridge gaps in operations. However, keeping that experience on-site 24/7 gets expensive. Instead, you can leverage software that will perform 24/7 in a predictable and repeatable fashion. Machine learning is not new; it's just that the platforms historically did not support that massive data acquisition and analysis.”
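As a toy illustration of the kind of analysis that becomes possible once a platform can acquire data at scale, the sketch below trains a simple anomaly detector (scikit-learn’s IsolationForest, chosen here purely for brevity) on synthetic “known good” telemetry and flags readings that do not fit the pattern. The features and numbers are invented for the example and do not reflect any particular vendor’s models.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic telemetry: rows of (inlet temp degC, fan speed %, power kW).
rng = np.random.default_rng(42)
normal = rng.normal(loc=[24.0, 60.0, 8.0], scale=[0.5, 5.0, 0.4], size=(1000, 3))

# Train on a window of "known good" operation.
model = IsolationForest(contamination=0.01, random_state=0).fit(normal)

# Score fresh samples: one healthy, one with a hot inlet and a struggling fan.
fresh = np.array([[24.2, 61.0, 8.1],
                  [31.0, 95.0, 8.3]])
flags = model.predict(fresh)  # 1 = normal, -1 = anomaly
for row, flag in zip(fresh, flags):
    print(row, "anomaly" if flag == -1 else "ok")
```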
Compiano then offers his advice to anyone considering DCIM: “Any vendor will tout the part of their software they’re most proud of, but it may or may not be what the customer needs. It looks great in a demo and can be an impressive display of software capabilities, but it may or may not address the fundamental problem that the customer is trying to solve. So first, be sure of the requirements. And second, when you identify those requirements, do a POC (proof of concept) with the vendor, and turn up the software to support the specific use cases you are targeting.”
To learn more about the role a DCIM can play in your data center, you can listen to the whole DCD>Talk with Compiano.