As chip temperatures and rack densities rise, a plethora of companies have come forward to pitch their visions of the future.
The cooling demands of artificial intelligence and other high-density workloads have outstripped what air systems can handle, making some form of liquid cooling a necessity.
“When you think about the landscape of liquid cooling, we see three different technical categories,” JetCool CEO Bernie Malouin explained.
“There’s single phase immersion, dipping in the oil. And that's interesting, but there are some limitations on chip power - for a long time, they've been stuck at 400W. There are some that are trying to get that a little bit better, but not as much as is needed.”
The second category is two-phase dielectrics: “We see those handling the higher [thermal design power (TDP)] processors, so those can get to 900-1,000W. Those are fit technologically for the future of compute, but they’re held back by the chemicals.”
Many two-phase solutions use perfluoroalkyl substances (PFAS), otherwise known as forever chemicals, which are linked to human health risks, and face restrictions in the US and Europe. Companies like ZutaCore have pledged to shift to other solutions by 2026, but the move has proved slow.
“It’s a concern for a lot of our customers - they're coming to us instead because they're worried about the safety of those fluids,” Malouin said. “They're concerned about the continued availability of those fluids.”
And then there’s the third category: Direct Liquid Cooling (DLC) cold plates. “We’re one of them,” Malouin said. “There are others.”
DLC cold plates are one of the oldest forms of IT liquid cooling - simply shuttling cold liquid to metal plates mounted directly on the hottest components. They have long been used by the high-performance computing community, but JetCool believes that the concept is due for a refresh.
Instead of passing fluid over a surface, its cooling jets route fluid directly at the surface of a chip. “We have these arrays of a thousand tiny fluid jets, and we work directly with the major chipmakers - the Intels, AMDs, Nvidias - and we intelligently landscape these jets to align to where the heat sources are on a given processor.”
Rather than treating the entire chip as having a single cooling requirement, the microconvective cooling approach “tries to balance the disparate heat loads, disparate thermal requirements of certain parts of that chip stack,” Malouin said.
“When you start thinking about really integrated packages, the cores themselves might be able to run a little higher temperature, but then you might have high bandwidth memory (HBM) sections that aren't as power hungry, but have a lower temperature limit.”
Instead of a single design that has to satisfy both the high-power cores and the temperature-sensitive HBM at once, each section can be cooled at a slightly different rate. “This allows you to decouple those things and allows you to have precision cooling where you need,” Malouin said.
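To make that decoupling concrete, here is a rough back-of-the-envelope sketch in Python. Every number in it - coolant temperature, zone powers, temperature limits - is an assumption for illustration, not a JetCool or chipmaker figure; the point is only that each zone can tolerate a different junction-to-coolant thermal resistance, so a uniform cold plate ends up over-designed for the easier zones.

```python
# Illustrative zone-by-zone cooling budget. All values are assumptions for the
# sake of the arithmetic, not figures from JetCool or any chipmaker.

coolant_inlet_c = 40.0  # assumed facility water temperature, °C

# Assumed zones: hot compute cores vs. cooler but temperature-sensitive HBM
zones = {
    "cores": {"power_w": 700.0, "t_limit_c": 95.0},
    "hbm":   {"power_w": 150.0, "t_limit_c": 85.0},
}

print("Junction-to-coolant thermal resistance each zone can tolerate:")
budgets = {}
for name, z in zones.items():
    # R = (T_limit - T_coolant) / P, in °C per watt
    budgets[name] = (z["t_limit_c"] - coolant_inlet_c) / z["power_w"]
    print(f"  {name:5s}: {budgets[name]*1000:.0f} m°C/W for {z['power_w']:.0f} W")

# A one-size-fits-all plate has to meet the most demanding budget everywhere;
# per-zone jets can instead spend cooling capacity only where it is needed.
tightest = min(budgets, key=budgets.get)
print(f"Most demanding zone: {tightest} ({budgets[tightest]*1000:.0f} m°C/W)")
```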
While Malouin believes that facility-level liquid cooling is the future of data centers, the company also has a self-contained system for those looking to dip their toe in cooler waters, with a Dell partnership focused on dual socket deployments.
Two small pumping modules circulate the fluid, and an air heat exchanger rejects the heat at the other end of the Smart Plate system.
"When we add these pumps, you add some electrical draw, but you don't need the fans to be running nearly as hard, so it makes it 15-20 decibels quieter - and in net, we pull out about 100W per server after we've taken the penalty off of the pumps," Malouin claimed.
At 10 racks or more, moving to the facility level makes more sense, he said. Asked about the preferred inlet temperature, Malouin said the system was flexible but added, "we actually really like the warm fluids."
He said: "We have facilities today that are feeding us inlet cooling temperatures that are 60°C (140°F) and over. And we're still cooling those devices under full load." That's not common just yet, but Malouin believes that warmer waters will grow in popularity in places like Europe due to the heat reuse potential.
Back in the US, the company is part of the Department of Energy's COOLERCHIPS project, aimed at dramatically advancing data center cooling systems.
The focus of JetCool's $1m+ award is not just on the cooling potential, but a tantalizing secondary benefit: "We have instances where we've made the silicon intrinsically between eight and 10 percent more electrically efficient," Malouin claimed.
"That has nothing to do with the cooling system power usage, but with leakage."
Malouin doesn't mean leakage from the cooling system, but rather semiconductor leakage currents - the parasitic currents that flow through transistors even when they are not switching, which rise with temperature and eat into a chip's electrical efficiency.
The recent trend in data center cooling has been to assume that letting temperatures rise saves energy, because less of it is spent on cooling. But results, including research by Jon Summers at the Swedish research institute RISE, are finding that leakage currents in the silicon limit the benefits of running hotter.
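A simplified model illustrates the effect. The sketch below uses the common rule of thumb that static leakage power roughly doubles for every ~10°C rise in junction temperature, plus an assumed 15 percent leakage share at the baseline; neither number comes from JetCool, RISE, or any particular process node.

```python
# A minimal sketch of how junction temperature feeds back into chip power.
# Assumes leakage roughly doubles per ~10 °C - a rule of thumb, not data from
# JetCool or RISE - and a 15% leakage share at the 60 °C baseline.

def relative_leakage(tj_c: float, ref_c: float = 60.0, doubling_c: float = 10.0) -> float:
    """Leakage power relative to the reference junction temperature."""
    return 2.0 ** ((tj_c - ref_c) / doubling_c)

leakage_share_at_ref = 0.15   # assumed fraction of total chip power at 60 °C

for tj in (50.0, 60.0, 70.0, 80.0):
    leak = leakage_share_at_ref * relative_leakage(tj)
    total = (1.0 - leakage_share_at_ref) + leak   # dynamic power held constant
    print(f"Tj = {tj:.0f} °C: total power ≈ {total:.2f}× the 60 °C baseline")
```

Under those assumptions, running the junction 10°C cooler trims total chip power by around seven percent - the same ballpark as the figure Malouin cites, though the real dependence varies by process node and workload.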
"A big part of our COOLERCHIPS endeavor is to substantiate that through more rigorous scientific evidence and extrapolate it to different environments to see where it holds or where it doesn't go."
Looking even further ahead, Malouin sees an opportunity to get deeper into the silicon. “In some cases, it might actually be integrated as an embedded layer within the silicon, and then coupling that to a system that's outside that's doing some heat reuse. When we think about that holistically, we think that there's a real opportunity for a step change in data center efficiency.”
For now, the company says that it is able to support the 900W loads of the biggest Nvidia GPUs and is currently cooling undisclosed ‘bespoke’ chips that use 1,500W.
“Ultimately, you're really going to have to look at liquid cooling if you want to run not just the future of generative AI, but if you want to run the now of generative AI,” Malouin said.