With data center workloads ever increasing due to advanced analytics, AI, and the digitization of every process, the average rack power draw has shot up considerably. And, as we know, with more power draw comes more waste heat that needs to be removed from the rack and eventually the white space.

In the recent past, when racks consumed up to 20kW, air-based cooling methodologies could be relied on to keep the IT hardware operating safely and efficiently. But as some racks start to exceed 30kW or more, new cooling approaches need to be used.

This is in part due to the densification of IT hardware in general, with each new CPU generation packing more processing capacity into smaller and smaller die sizes. Workloads such as artificial intelligence (AI) and machine learning (ML) rely heavily on floating-point operations, which are usually delivered by graphics processing units (GPUs). These GPUs are designed to have a normal operating temperature above 80°C (176°F) when fully utilized for a particular workload.

Although air-based cooling options exist for racks drawing more than 20kW, they are often cumbersome to install and maintain effectively, essentially passing the point of diminishing returns in terms of cooling capacity. As such, owners and operators of data centers are now cautiously looking towards liquid cooling for their new facility projects.

A short history of liquid cooling

Liquid cooling of IT equipment may seem like a new technology, but that could not be further from the truth.

Liquids in general can be a great heat transfer medium. With dielectric fluids, a little chemical engineering allows boiling and condensation points to be tailored precisely, further improving heat transfer.
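To put a rough number on this, the sketch below compares the volume of air and of water that would have to flow past a heat source to carry away the same load, using the steady-state relation Q = ρ · V · c_p · ΔT (with V as volumetric flow). The 30 kW load and the 12 K coolant temperature rise are illustrative assumptions, and the fluid properties are approximate room-temperature values.

```python
# Approximate fluid properties near room temperature (assumed values).
AIR = {"density": 1.2, "cp": 1005.0}      # kg/m^3, J/(kg*K)
WATER = {"density": 997.0, "cp": 4180.0}  # kg/m^3, J/(kg*K)


def volumetric_flow(heat_w, delta_t_k, fluid):
    """Flow in m^3/s needed to absorb heat_w watts with a delta_t_k rise."""
    return heat_w / (fluid["density"] * fluid["cp"] * delta_t_k)


rack_heat_w = 30_000.0  # illustrative 30 kW rack
delta_t_k = 12.0        # illustrative coolant temperature rise

air_flow = volumetric_flow(rack_heat_w, delta_t_k, AIR)
water_flow = volumetric_flow(rack_heat_w, delta_t_k, WATER)

print(f"Air:   {air_flow:.2f} m^3/s")
print(f"Water: {water_flow * 60_000:.1f} L/min")  # m^3/s -> litres per minute
print(f"Air needs roughly {air_flow / water_flow:.0f}x the volume flow")
```

Under these assumptions, air needs a few thousand times the volume flow of water for the same heat load, which is the basic reason liquids are attractive once rack densities climb.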

Various forms of liquid cooling have been around since the late 1800s, when they were used to insulate and cool extra-high-voltage transformers. The automotive industry is another ecosystem that relied on, and still relies on, liquid cooling: the water in a typical auto radiator.

Layout of hybrid air/liquid approach in System/360 – Exploring Innovative Cooling Solutions for IBM’s SuperComputing Systems: A Collaborative Trail Blazing Experience by Dr. Richard C. Chu, IBM Fellow

Liquid cooling entered the computer sector early in its history, when IBM announced a series of enterprise-grade computers called System/360 in 1964.

The System/360 has been one of the most enduring lines of commercially available computers. While the original hardware is now retired, S/360 code written in the 1960s is still found in new mainframes today. It was also the first computer family to share a unified instruction set across its models, making upgrades or changes to the mainframe easier than ever.

The System/360 was also cooled with a hybrid approach using both air and liquid cooling. This was quite big and cumbersome to install, but IBM developed the hybrid model to accommodate increased heat loads. With these systems, as much as 50 percent of the heat dissipated was removed from the cooling air via water-cooled heat exchangers.

Desktop PC CPU Liquid Cooling – Intel

Today, liquid cooling is a common option for desktop PCs, and the concept has essentially remained the same. The cooling loop is made up of three distinct parts: the heat plate, the supply and return pipes, and the radiators and fans.

The heat plate is essentially a metal plate that covers the whole CPU die with a small reservoir on top. The plate is engineered to be as conductive as possible in terms of heat. Any heat generated by the chip will be transferred to the reservoir on top.

The liquid in this closed loop travels via the supply and return pipes to the radiators, where heat is pushed out of the PC enclosure through the radiator fins, which are actively cooled by fans.

Consumer-grade liquid cooling options originally dealt only with CPU heat, but now almost every component of a modern-day PC can be liquid-cooled.

That is the consumer-grade option of liquid cooling – but what about larger-scale deployments and enterprise-grade solutions? We’ll look at these next in the context of the data center.

Liquid Cooling Technologies – Vlad-Gabriel Anghel

Enterprise-Grade Liquid Cooling Solutions

When analyzing liquid cooling options for enterprise-grade IT hardware, there are essentially two main categories: direct-to-chip liquid cooling (sometimes called conductive or cold plate liquid cooling) and immersive liquid cooling.

When we also consider the phases the coolant goes through (what state the fluid is in, either liquid or gas), we have five distinct types of liquid cooling.

Direct-to-Chip – Single-Phase

This method of cooling requires delivering the liquid coolant directly to the hottest components of a server, such as the CPU or GPU, with a cold plate placed directly on the chip. The electrical components are never in direct contact with the coolant.

With this method, fans are still required to provide airflow through the server to remove the residual heat. While the air-cooling infrastructure is greatly reduced, one is still required for the correct operation of this liquid cooling method.

Coolants can be either water or dielectric fluids. Water carries a risk of downtime if it leaks, although Leak Prevention Systems (LPS) are available. Single phase refers to the fact that the coolant does not change state, i.e., from a liquid to a gas.

This is also the method used in the desktop PC example above.

Direct-to-Chip – Two-Phase

The two-phase direct-to-chip liquid cooling method works like the previous single-phase method, the only difference being that the coolant changes state, from a liquid to a gas and back again, as it completes the cooling loop. These systems will always use engineered dielectric fluid.

In terms of heat rejection, two-phase systems are better than single-phase systems and have a lower risk of leakage due to the coolant's state-changing nature. They do, however, require additional controls, which will increase maintenance costs over the lifetime of the system.
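The heat-rejection advantage comes largely from latent heat: a single-phase coolant absorbs heat only by warming up (Q = m · c_p · ΔT), while a two-phase coolant also absorbs heat as it boils (Q = m · h_fg). The following is a minimal sketch of the coolant mass flow each approach would need. The chip power, specific heat, temperature rise, and latent heat are placeholder assumptions, not data for any particular fluid or product, and the two-phase case ignores any warming of the liquid before it boils.

```python
def single_phase_mass_flow(heat_w, cp_j_per_kg_k, delta_t_k):
    """kg/s of coolant that must warm by delta_t_k to absorb heat_w."""
    return heat_w / (cp_j_per_kg_k * delta_t_k)


def two_phase_mass_flow(heat_w, latent_heat_j_per_kg):
    """kg/s of coolant that must fully evaporate to absorb heat_w."""
    return heat_w / latent_heat_j_per_kg


chip_heat_w = 700.0        # hypothetical accelerator power, W
dielectric_cp = 1100.0     # assumed specific heat, J/(kg*K)
delta_t_k = 10.0           # assumed single-phase temperature rise, K
latent_heat = 100_000.0    # assumed latent heat of vaporization, J/kg

single = single_phase_mass_flow(chip_heat_w, dielectric_cp, delta_t_k)
two = two_phase_mass_flow(chip_heat_w, latent_heat)

print(f"Single-phase: {single * 1000:.1f} g/s")
print(f"Two-phase:    {two * 1000:.1f} g/s")
```

With these placeholder numbers, the two-phase loop moves the same heat with a much smaller mass of coolant. Real systems rarely evaporate all of the coolant, so the actual gap depends on the fluid and the design.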

Immersion Cooling - IT Chassis - Single Phase – Vlad-Gabriel Anghel

Immersive Liquid Cooling – IT-Chassis Single-Phase

This cooling approach uses a single-phase dielectric fluid that is in direct contact with the IT components. Servers are fully or partially immersed in this non-conductive liquid within the chassis, effectively removing heat from all sources.

The cooling can happen either passively, via conduction, or actively, using pumps. Heat exchangers and pumps can sit either inside the chassis or in a side arrangement, where the heat is transferred from the liquid to a water loop.

This approach also involves no fans, so its operation is nearly silent. In contrast, some air-cooled facilities can reach upwards of 80 dB in the data hall, with workers requiring hearing protection for longer exposures.

Immersion Cooling-Open Tub-Single Phase – Vlad-Gabriel Anghel

Immersion Cooling – Open Tub – Single-Phase

Sometimes referred to as an "open bath," this immersive liquid cooling method involves the IT equipment being completely submerged in fluid.

Essentially, it is a rack turned on its back, filled with dielectric fluid - instead of mounting servers horizontally, they are now mounted vertically.

These systems are usually fitted with centralized power supplies. The dielectric fluid is cooled through a heat exchanger and is circulated either by a pump, which can be installed inside or outside the tub, or by natural convection.

Immersion Cooling - Open-Tub Two Phase – Vlad-Gabriel Anghel

Immersion Cooling – Open Tub – Two-Phase

As with the single-phase open tub, in this method the IT equipment is completely submerged in fluid vertically within a tank. Importantly, with this approach the dielectric fluid must be capable of changing state from liquid to gas as it heats up.

In such a system, the submerged components, which are directly exposed to the fluid, generate heat that turns the liquid into a gas. The gas rises to the surface and condenses on a coil, then falls naturally back into the tank once it has cooled enough to return to a liquid state.
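The five approaches described above can be summarized as a small lookup structure. This is only an illustrative sketch: the class and field names are made up for this example, and the attribute values simply restate what the descriptions above say.

```python
from dataclasses import dataclass
from enum import Enum


class Phase(Enum):
    SINGLE = "single-phase"
    TWO = "two-phase"


@dataclass(frozen=True)
class CoolingMethod:
    name: str
    category: str         # "direct-to-chip" or "immersion"
    phase: Phase
    direct_contact: bool  # does the coolant touch the IT components?
    coolant: str


METHODS = [
    CoolingMethod("Direct-to-chip", "direct-to-chip", Phase.SINGLE,
                  False, "water or dielectric fluid"),
    CoolingMethod("Direct-to-chip", "direct-to-chip", Phase.TWO,
                  False, "engineered dielectric fluid"),
    CoolingMethod("IT chassis", "immersion", Phase.SINGLE,
                  True, "dielectric fluid"),
    CoolingMethod("Open tub", "immersion", Phase.SINGLE,
                  True, "dielectric fluid"),
    CoolingMethod("Open tub", "immersion", Phase.TWO,
                  True, "dielectric fluid"),
]

# Example query: which methods rely on the coolant changing state?
two_phase = [m for m in METHODS if m.phase is Phase.TWO]
for m in two_phase:
    print(f"{m.name} ({m.category}): {m.coolant}")
```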

What are dielectric fluids?

Dielectric liquids are used as electrical insulators in high-voltage applications, e.g., transformers, capacitors, high-voltage cables, and high-voltage switchgear.

Their functions are to provide electrical insulation, suppress corona and arcing, and serve as a coolant. Generally, they are split into two categories: fluorochemicals and hydrocarbons.

Fluorochemical fluids, generally with a lower boiling point, are predominantly used for two-phase immersion cooling.

Hydrocarbons are typically not used for two-phase immersion cooling systems, as most hydrocarbons are combustible and/or flammable. Therefore, hydrocarbons are typically only used in single-phase applications.

Both fluorochemicals (or fluorocarbons) and hydrocarbons (e.g., mineral oils, synthetic oils, natural oils) can be used for single-phase immersion cooling. Fluids with a higher boiling point (above the maximum temperature of the system) are necessary to ensure the fluid remains in the liquid phase.
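The rules of thumb above can be expressed as a small screening function. This is a simplified sketch rather than a selection procedure: the fluids, their boiling points, and the 90°C maximum system temperature are hypothetical, and the function name is invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fluid:
    name: str
    family: str             # "fluorochemical" or "hydrocarbon"
    boiling_point_c: float  # boiling point at operating pressure


def suitable_modes(fluid, max_system_temp_c):
    """Return the immersion modes the rules of thumb would allow."""
    modes = []
    if fluid.boiling_point_c > max_system_temp_c:
        # Fluid stays liquid across the operating range.
        modes.append("single-phase")
    elif fluid.family == "fluorochemical":
        # Low-boiling fluid that is expected to boil at the components;
        # hydrocarbons are excluded here because of flammability.
        modes.append("two-phase")
    return modes


MAX_SYSTEM_TEMP_C = 90.0  # assumed maximum temperature in the tank

# Hypothetical fluids for illustration only.
fluids = [
    Fluid("low-boiling fluorochemical", "fluorochemical", 50.0),
    Fluid("high-boiling fluorochemical", "fluorochemical", 130.0),
    Fluid("mineral oil", "hydrocarbon", 300.0),
    Fluid("low-boiling hydrocarbon", "hydrocarbon", 60.0),
]

results = {f.name: suitable_modes(f, MAX_SYSTEM_TEMP_C) for f in fluids}
for name, modes in results.items():
    print(f"{name}: {modes or 'not suitable'}")
```

A real selection would also weigh the other factors listed in this section, such as material compatibility, environmental impact, and lifetime cost.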

Considerations when deciding among various fluorochemicals and hydrocarbons include heat transfer performance (stability and reliability over time, etc.), ease of IT hardware maintenance, fluid hygiene and replacement needs, material compatibility, electrical properties, flammability or combustibility, environmental impact, safety-related issues, and total fluid cost over the lifetime of the tank or data center.

Current Adoption

While far from mainstream, liquid cooling is positioning itself as the cooling solution for high-performance computing. Its mainstream adoption will however depend on advances in technology and chip designs.

Retrofitting existing data centers is costly for some forms of liquid cooling, while the weight of immersion tanks makes them impractical for many current raised-floor facilities.
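As a rough illustration of the weight issue, the sketch below estimates the average floor loading of a filled immersion tank and compares it with a floor rating. Every number here is a hypothetical placeholder, and a real assessment would also consider point loads and the building's structural design, not just an average over the footprint.

```python
def floor_load_kg_per_m2(empty_tank_kg, fluid_litres, fluid_density_kg_per_l,
                         it_equipment_kg, footprint_m2):
    """Average load a filled tank places on its footprint, in kg/m^2."""
    total_kg = (empty_tank_kg
                + fluid_litres * fluid_density_kg_per_l
                + it_equipment_kg)
    return total_kg / footprint_m2


# Hypothetical tank and floor figures, for illustration only.
load = floor_load_kg_per_m2(
    empty_tank_kg=400.0,
    fluid_litres=1000.0,
    fluid_density_kg_per_l=0.85,  # assumed oil-like dielectric fluid
    it_equipment_kg=600.0,
    footprint_m2=2.0,
)
floor_rating_kg_per_m2 = 750.0    # assumed raised-floor rating

print(f"Tank load:    {load:.0f} kg/m^2")
print(f"Floor rating: {floor_rating_kg_per_m2:.0f} kg/m^2")
print("Within rating" if load <= floor_rating_kg_per_m2 else "Exceeds rating")
```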