In a previous article for DCD I suggested that true data center thermal optimization requires a proven, safe process based on thousands of real-time sensors and expert spatial models. Inevitably this involves collecting huge volumes of information, so if you're to remove much of the uncertainty from data center cooling, it's critical that you can absolutely rely on the data you're gathering.
That’s particularly the case when you’re considering the kind of control models needed to monitor critical cooling duty performance across your data center estate’s multiple CRAC/AHU units. Of course, you’ll want to track your data center cooling loads in real time using standard temperature and current measurement sensors for both chilled water and direct expansion cooling systems. However, it’s also important to continuously monitor air inlet and outlet temperatures – along with variables such as fan performance and filter condition – and to raise alerts for potential CRAC/AHU blockages.
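To make that concrete, here is a minimal sketch of what a per-unit telemetry record and a simple blockage check might look like. The field names and thresholds are purely illustrative assumptions on our part, not a description of any particular monitoring product; real limits would come from the unit's commissioning data and the site's design envelope.

```python
from dataclasses import dataclass

@dataclass
class CracReading:
    """One real-time sample from a CRAC/AHU unit (hypothetical schema)."""
    unit_id: str
    air_inlet_c: float      # return air temperature, degC
    air_outlet_c: float     # supply air temperature, degC
    fan_current_a: float    # fan motor current draw, amps
    filter_dp_pa: float     # pressure drop across the filter, pascals

# Illustrative thresholds only - real limits come from commissioning data.
MIN_DELTA_T_C = 4.0       # coil delta-T below this may indicate reduced duty
MAX_FILTER_DP_PA = 250.0  # pressure drop above this suggests a clogged filter

def check_unit(r: CracReading) -> list[str]:
    """Return alert messages for one telemetry sample."""
    alerts = []
    delta_t = r.air_inlet_c - r.air_outlet_c
    if delta_t < MIN_DELTA_T_C:
        alerts.append(f"{r.unit_id}: low delta-T ({delta_t:.1f} degC)")
    if r.filter_dp_pa > MAX_FILTER_DP_PA:
        alerts.append(f"{r.unit_id}: high filter pressure drop "
                      f"({r.filter_dp_pa:.0f} Pa), possible blockage")
    return alerts

# Example: a unit with a weak coil delta-T and a clogged filter.
print(check_unit(CracReading("CRAC-01", 24.0, 21.5, 6.2, 310.0)))
```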
That’s clearly a lot of data to capture and assimilate, so how can data centers turn this into meaningful information that can be acted on? Invariably at this stage people start talking about the role that AI should play. You’ve collected all the data - but no need to worry, AI will sort it all out for you. This kind of dynamic management sounds great, but when it comes to the kind of controls that are needed for truly 100 percent reliable data center cooling, things are never that simple.
It’s time for greater clarity about data centers and AI
We’re really excited about applying AI techniques to help optimize data center thermal performance. However, we’re concerned that AI is often presented as a universal panacea that can somehow resolve the many complexities and trade-offs associated with best practice data center thermal optimization.
We’re clearly committed to establishing best practice when it comes to enabling ‘fully-sensed’ data centers and capturing the kind of real-time, machine learning-grade data required for true thermal optimization. However, we’re currently unconvinced that AI on its own should be entrusted with the controls needed for critical cooling duty performance.
That’s why it’s time for some clarity, and for a greater awareness of the significant differences between so-called AI solutions and expert system-based controls that are actually more predictable, auditable and effective.
Achieving greater control, and increased energy savings, through expert system controls
Given the business-critical nature of most data center environments, organizations need to ensure that there’s genuine transparency around how they manage their cooling duty performance. If an AI engine has been tasked with this role, then there are real questions to be asked.
Is the AI algorithm auditable? Why did it make a particular decision in a given scenario? Can you predict that it would make a similar decision when new machine learning data is introduced into the system? Do you actually know what source data is being queried by the AI?
In contrast, an expert system-based approach is fundamentally different, with an auditable, controlled sequence that derives insight from experience and provides all the checks and balances demanded by your risk management and compliance officers. With an expert system you can challenge how the control application has been developed, check the process at any time, and also audit the system to identify the root cause of specific performance issues.
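As a toy illustration of why this matters for auditability (the rule names, conditions and thresholds below are our own assumptions, not any specific product's control logic), an expert system step can be written as explicit condition/action rules where every firing is recorded to an audit trail:

```python
from datetime import datetime, timezone

# Each rule is an explicit, reviewable (condition, action, rationale) triple.
# A compliance officer can read this table, and every firing is logged, so
# any setpoint change can be traced back to a named rule and a site state.
RULES = [
    {
        "name": "raise_setpoint_when_overcooled",
        "condition": lambda s: s["rack_inlet_max_c"] < 20.0,
        "action": "increase CRAC supply setpoint by 0.5 degC",
        "rationale": "All rack inlets well below limit; reduce overcooling.",
    },
    {
        "name": "flag_hotspot",
        "condition": lambda s: s["rack_inlet_max_c"] > 27.0,
        "action": "alert operator; hold all setpoints",
        "rationale": "Rack inlet above envelope; no automatic change.",
    },
]

def evaluate(state: dict, audit_log: list) -> list[str]:
    """Evaluate all rules against the current site state, logging firings."""
    actions = []
    for rule in RULES:
        if rule["condition"](state):
            audit_log.append({
                "time": datetime.now(timezone.utc).isoformat(),
                "rule": rule["name"],
                "state": dict(state),
                "action": rule["action"],
                "rationale": rule["rationale"],
            })
            actions.append(rule["action"])
    return actions

# Example: an overcooled hall triggers exactly one traceable rule.
log: list = []
print(evaluate({"rack_inlet_max_c": 19.2}, log))
print(log[0]["rule"])  # -> raise_setpoint_when_overcooled
```

The point of the sketch is that the "why" of every control action is a named, human-readable entry in the log, rather than the internal state of a trained model.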
But ask how an AI system actually makes a decision, and you’ll quickly find out that the technology really isn’t as accessible as you might hope. Contrast this with the robustness and accessibility of expert system control algorithms, as well as their ability to operate effectively as a manageable component within broader building management system environments, and you quickly start to question whether sealed AI engines are really appropriate for prime-time data center deployments.
However, where AI can make a big difference is in providing deep insight to support data center experts. For example, work is underway on AI-enabled multi-site thermal analytics at the core of the thermal optimization engine. This breakthrough equips users with powerful simulation tools that can deliver real-time recommendations concerning areas of inefficiency or risk in live sites. Strategically, we are convinced that proving the efficacy of the AI-driven optimization engine in this way will inevitably lead to predictive alarm functions, where ‘tested’ AI engines simulate ahead in real time to flag imminent system failures that have yet to breach an alarm condition.
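A simplified sketch of that predictive alarm idea is shown below. It uses a naive straight-line extrapolation purely for illustration; a production engine would simulate ahead with a calibrated thermal model rather than a linear fit.

```python
def minutes_to_breach(temps_c: list[float], alarm_c: float,
                      sample_interval_min: float = 1.0) -> float | None:
    """Fit a straight line through recent temperature samples and estimate
    minutes until the alarm threshold would be crossed; None if not warming."""
    n = len(temps_c)
    if n < 2:
        return None
    x_mean = (n - 1) / 2
    y_mean = sum(temps_c) / n
    num = sum((i - x_mean) * (t - y_mean) for i, t in enumerate(temps_c))
    den = sum((i - x_mean) ** 2 for i in range(n))
    slope = num / den              # degC per sample
    if slope <= 0:
        return None                # flat or cooling trend: no predicted breach
    if temps_c[-1] >= alarm_c:
        return 0.0                 # already at or beyond the threshold
    return (alarm_c - temps_c[-1]) / slope * sample_interval_min

# A hall warming at ~0.2 degC per minute, with an alarm threshold of 27 degC,
# gives roughly 11 minutes of warning before any alarm condition is breached.
print(minutes_to_breach([24.0, 24.2, 24.4, 24.6, 24.8], alarm_c=27.0))
```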
And, who knows, the future may ultimately result in autonomous AI control, but there are several key stages to go before we would promote that step. We would currently caution that entrusting your critical controls entirely to AI systems brings with it a risk cost that’s potentially much larger than any promised cooling energy savings. That doesn’t mean that AI won’t still have an important role to play in future control system environments, but it will perhaps have to wait until the essential ‘expert systems’ element is tightly integrated into the process.
Dr Stu Redshaw is the chief technology officer of EkkoSense