The US Department of Energy will spend $40 million on 15 projects targeted at dramatically advancing data center cooling systems, aiming for at least a 10× improvement.
Run by the Advanced Research Projects Agency-Energy (ARPA-E), the program was announced last year as the 'Cooling Operations Optimized for Leaps in Energy, Reliability, and Carbon Hyperefficiency for Information Processing Systems (COOLERCHIPS)' initiative.
DCD can now exclusively detail which companies and universities were awarded funds, as well as reveal how each group hopes to change the future of cooling.
Data center cooling has become a growing challenge for the sector as rack densities skyrocket. To meet the rising demands of workloads like AI despite the slowdown in chip performance advances, companies have been forced to push more power through less and less space. This means more heat is produced, pushing some data centers beyond traditional air cooling and into different forms of liquid cooling.
Current liquid cooling approaches still have limitations and can be costly, which restricts their deployment in data centers.
If technology does not improve, but data center usage continues to grow, ARPA-E estimates that US data centers will grow to produce 900 billion kWhth (kilowatt hours of heat) by 2028. The new technology advances proposed below could dramatically reduce that, the agency argues.
Data centers have an average total-power usage effectiveness (TUE, a combination of PUE and ITUE, or IT PUE) of 1.7, the agency believes. New approaches could bring that down to 1.1, saving between 0.5 and 1.4 quads of electricity (each quad is 293 billion kilowatt-hours), it estimates.
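To put those figures in more familiar units, the short Python sketch below converts the quoted savings range into kilowatt-hours and works out what a drop from a TUE of 1.7 to 1.1 means proportionally. It uses only the numbers quoted above. The function names are illustrative, and the percentage assumes the energy delivered to the compute components stays fixed while overhead shrinks, which is a simplification rather than ARPA-E's own method.

```python
# One quad expressed in kilowatt-hours, using the figure quoted in the article.
KWH_PER_QUAD = 293e9


def quads_to_billion_kwh(quads: float) -> float:
    """Convert a quantity in quads to billions of kilowatt-hours."""
    return quads * KWH_PER_QUAD / 1e9


def tue_reduction(tue_before: float, tue_after: float) -> float:
    """Fractional drop in total energy, assuming a fixed compute load."""
    return (tue_before - tue_after) / tue_before


low = quads_to_billion_kwh(0.5)
high = quads_to_billion_kwh(1.4)
cut = tue_reduction(1.7, 1.1)

print(f"Savings range: {low:.1f} to {high:.1f} billion kWh")
print(f"Total energy per unit of compute falls by {cut:.1%}")
```

On these assumptions, the savings range is roughly 146.5 to 410.2 billion kWh, and total energy per unit of compute falls by about 35 percent.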
“Climate change, including severe weather events, threatens the functionality of data centers that are critical to connecting computing and network infrastructure that power our everyday lives,” US Secretary of Energy Jennifer M. Granholm said. “DOE is funding projects that will ensure the continued operation of these facilities while reducing the associated carbon emissions to beat climate change and reach our clean energy future.”
Alongside the opportunity to reduce emissions and water use (US data centers are believed to consume more than 500bn gallons), ARPA-E notes that cooling improvements will help facilities be more location independent. It could, DCD understands, lead to the elimination of chillers and CRACs. Some solutions could also help push the thermal design point (TDP) of chips beyond 1kW.
The full list is available below, but companies include Intel, Raytheon, and HP (not to be confused with HPE). Approaches range from coral-shaped immersion cooling heat sinks and micro-convective cooling technology to ribbon oscillating heat pipes. All of the technologies are expected to reach, at a minimum, Tier III reliability levels of 99.982 percent uptime.
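For a sense of what 99.982 percent uptime allows in practice, the sketch below converts the figure into permitted downtime per year. It assumes a 365-day year and treats the percentage as a simple annual average; the function name is illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # assumes a non-leap year


def annual_downtime_hours(uptime_percent: float) -> float:
    """Hours per year a system may be down at a given uptime percentage."""
    return (1 - uptime_percent / 100) * HOURS_PER_YEAR


downtime = annual_downtime_hours(99.982)
print(f"{downtime:.2f} hours (about {downtime * 60:.0f} minutes) per year")
```

That works out to roughly 1.6 hours of downtime a year.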
Each project could be funded for up to three years, depending on how it meets milestones along the way. After that, some of the smaller companies could be eligible for scale-up grants to help commercialize, but overall the commercialization efforts are in the hands of the companies and not ARPA-E. Should projects fail early, funds could be reinvested in the other efforts (and ARPA-E has around $2m still in reserve).
While some approaches appear exotic and may require expensive production for now, the goal is that they ultimately prove the most economical approach to cooling - not just the most efficient.
The projects are separated into four different types (from A to D). Category A covers the secondary loop of data center cooling, focusing on technology just from the facility to the surface of the chip, and making the thermal resistance as small as possible.
Category B is for modular or Edge data centers of up to 1MW. The project imagines them in environments of 40°C (104°F) ambient temperatures and 60 percent humidity, to ensure that they are designed with difficult locations in mind.
As for Category C, it is focused on the tools. ARPA-E envisions providing tools to allow cooling system designers to see live the expected reliability of a system in a data center. The tool, developed by the University of Maryland as part of the 15 projects, will be used by the other COOLERCHIPS projects to give a unified baseline measurement that they can be compared against.
Researchers will also be able to deploy their systems at the National Renewable Energy Laboratory (NREL) for testing. This is Category D, where NREL will develop a test site at its Flatirons Campus. NREL operates one of the world's most energy-efficient data centers, while the DOE more broadly is a huge supercomputing operator. But the agency has historically been an end user, rather than being involved significantly in the design of cooling systems. DCD understands this project aims to change that, with the DOE's high-performance computing industry set to be brought into COOLERCHIPS.
The COOLERCHIPS projects, in order of project cost:
Nvidia (Santa Clara, CA)
Project title: Green Refrigerant Compact Hybrid System for Ultra-Efficient and Sustainable HPC Cooling
Project description: Nvidia will develop a novel modular data center with an innovative cooling system that combines direct-to-chip, pumped two-phase and single-phase immersion in a rack manifold with built-in pumps and a liquid-vapor separator. The design cools chips with a two-phase cold plate, while the remaining server components, which have lower power density, will be submerged inside a hermetically sealed immersion sled. Green refrigerants will be used for the two-phase cooling and dielectric fluid for the immersion. The two-phase porous metal cold plate will achieve a thermal resistance as low as 0.0025°C/W.
Award amount: $5,000,000
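To illustrate what the 0.0025°C/W figure quoted for Nvidia's cold plate implies, the sketch below applies the standard steady-state relation, temperature rise = thermal resistance × power. The two power levels are taken from elsewhere in this article (500W for current data center GPUs, 1kW as the chip target); the function name is illustrative, and the result covers only the cold plate, not the rest of the thermal path.

```python
R_COLD_PLATE_C_PER_W = 0.0025  # thermal resistance quoted for the cold plate


def temperature_rise_c(thermal_resistance_c_per_w: float, power_w: float) -> float:
    """Steady-state temperature rise across a thermal resistance."""
    return thermal_resistance_c_per_w * power_w


# Power levels mentioned in the article: 500W GPUs today, 1kW chips as a target.
rises = {power_w: temperature_rise_c(R_COLD_PLATE_C_PER_W, power_w)
         for power_w in (500, 1000)}

for power_w, rise in rises.items():
    print(f"{power_w} W chip: {rise:.2f} °C rise across the cold plate")
```

Under that relation, even a 1kW chip would sit only about 2.5°C above the coolant at the cold plate.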
University of California, Davis (Davis, CA)
Holistic Modular Energy-efficient Directed Cooling Solutions (HoMEDiCS) for Edge Computing
The University of California, Davis, will develop a suite of holistic thermal management solutions for modular data centers used for Edge computing. Its design innovations include efficient heat extraction from CPU and GPU chips with a liquid-cooled loop and dissipation of this heat to the ambient by use of high-efficiency, low-cost heat exchangers. Auxiliary electronics on the server boards would be cooled with a secondary loop that rejects heat radiatively to the atmosphere.
Award amount: $3,586,473
Flexnode (Bethesda, MD)
Prefab Modular Liquid-cooled Micro Data Center
Flexnode will develop a prefabricated, modularly designed Edge data center that will leverage four key component and system-level technology advancements: a novel manifold microchannel heatsink, a chassis-based hybrid immersion cooling approach, a cost-effective additive manufacturing-enabled dry cooling heat exchanger system, and a topology optimized container housing the entire system.
Award amount: $3,500,000
University of Maryland (College Park, MD)
Multi-Objective Optimization Software for COOLERCHIPS
The University of Maryland will develop an integrated decision support software tool for the design of next-generation data centers that seamlessly links the existing open-source software for modeling reliability, energy, carbon footprint, and cost with an innovative co-simulation framework. This tool will permit data center designers to develop transformational and disruptive design advances compared to existing state-of-the-art technologies.
Award amount: $3,484,484
HP (Corvallis, OR)
Embedded Microfluidic Cooling for Nextgen High Power Server Architectures
As part of its Category A effort, HP will work with partners to develop an aggressive liquid cooling solution that reduces the need for thermal interface material and the number of thermal interfaces between high-power CPUs/GPUs and the coolant, thereby dramatically lowering the package thermal resistance. The proposed approach leverages HP’s inkjet microfluidics platform and relies on first coupling silicon microchannels to a device’s surface, and then embedding microfluidics deeper into the device as a future step. This design would reject server heat to 40°C and 60 percent relative humidity external ambient air.
HP actually tried to use its inkjet technology for cooling way back in 2002, building a working prototype, but it never commercialized the tech. At the time, it warned that chips could one day consume 100 watts. Data center GPUs now often go beyond 500W.
Award amount: $3,250,000
University of Florida (Gainesville, FL)
Hyperefficient Data Centers for Deep Decarbonization of Large-scale Computing
The University of Florida is developing what it calls a disruptive thermal management solution for cooling future CPU and GPU chips at unprecedented heat flux and power levels in data center server racks. The new technology allows for significant future growth in processor power, rejects heat directly to the ambient air external to the data center, and facilitates adoption within the existing data center infrastructure with a primary liquid cooling loop.
Award amount: $3,042,417
University of Texas at Arlington (Arlington, TX)
Holistic Co-Design of Novel Hybrid Cooling Technology for the Data Center of the Future
The University of Texas at Arlington and its collaborators will develop a novel hybrid cooling technology to address the growing need for advanced thermal management solutions for high-power data centers. At the server level, the design combines a direct-to-chip evaporative cooling module, which includes electrodeposition of metal on high-powered devices to eliminate thermal interface materials and reduce chip-to-coolant thermal resistance, with air cooling, including a Rear Door Heat Exchanger, for the rest of the system. The aim is a robust and extendible solution for the future, as well as an easy path to retrofit legacy data centers.
Award amount: $2,843,223
Raytheon Technologies Research Center (East Hartford, CT)
EXTRACT: Extra Efficient Data Centers Using Avionics Cooling Technology
Raytheon Technologies Research Center will develop Extra Efficient Data Centers with Avionics Cooling Technology (EXTRACT) with a cross-industry collaborative team. Targeted heat removal from high-power processors will be achieved with Ribbon Oscillating Heat Pipes (RHPs). Heat is extracted from servers with integrated, passive, and reliable heat spreading. The RHP technology, with record-low thermal resistance, could enable a transformational reduction in the power consumption of future data centers.
Oscillating heat pipes, also known as Pulsating Heat Pipes, use pressure-driven, two-phase fluid flow to rapidly transfer heat between heat sources and heat sinks.
Award amount: $2,504,024
University of Illinois at Urbana-Champaign (Champaign, IL)
Holistic Rack-to-Processor Power and Thermal Co-Design for Future Servers
The University of Illinois at Urbana-Champaign will develop what it says is an innovative cooling paradigm capable of both minimal energy use and maximum cooling power for future servers. Their design integrates high-performance thermal interface materials, coefficient of thermal expansion matched and reliable silicon carbide coolers, topology optimization-based design automation coupled with silicon carbide additive manufacturing, robust and cost-effective single-phase water cooling, and high primary-side temperatures to enable efficient heat dissipation to the ambient.
Award amount: $2,500,000
HRL Laboratories (Malibu, CA)
Aligned Graphite Microchannel Cooling (AGMC) System with Additively Manufactured Manifolds
As part of a Category A project, HRL Laboratories will develop a novel data center thermal management system with low thermal resistance and greater energy efficiency to reduce power consumption for the next generation of data center servers. HRL’s system utilizes aligned graphite micro-fins and additively manufactured flow manifolds to overcome performance limitations common to existing cooling blocks and provide unprecedented cooling for current and future processors.
Award amount: $2,000,000
Purdue University (West Lafayette, IN)
Confined Direct Two-phase Jet Impingement Cooling with Topology Optimized Surface Engineering and Phase Separation Using Additive Manufacturing
Purdue University, Binghamton University, and data center thermal management company Seguente Inc. will develop an innovative chip-level direct two-phase impingement jet cooling solution to drastically enhance overall thermal performance while reducing pumping power. The design includes new algorithms for topology optimization of the cooling structure, novel on-chip direct printing methods for laser powder bed fusion of multi-porosity wicks, and an additively manufactured multi-input/multi-output fluid distribution manifold.
Jet impingement cooling involves spraying a working fluid onto a surface being cooled.
Award amount: $1,881,315
Intel Federal (Austin, TX)
Enabling Two-Phase Immersion Cooling to Support High TDP
Intel Federal will develop ultra-low-thermal resistance, coral-shaped immersion cooling heat sinks integrated with a 3D vapor chamber cavity for high-power devices. Intel’s design would address the challenge of adapting two-phase immersion cooling by optimizing 3D vapor chambers to spread the heat more effectively. This is paired with innovative boiling enhancement coatings to reduce thermal resistance by promoting high nucleation site density.
Award amount: $1,711,416
University of Missouri (Columbia, MO)
Dual-mode Hybrid Two-phase Loop for Data Center Cooling
The University of Missouri will develop a hybrid mechanical-capillary-drive two-phase loop that could serve as an ideal cooling solution for data centers. The proposed technology offers numerous advantages over existing phase-change processes such as flow boiling and condensation, including dual-mode operation, low thermal resistance, high heat flux, low pumping power consumption, high power density, reliable operation, and a fully scalable design.
Award amount: $1,649,290
National Renewable Energy Laboratory (Golden, CO)
COOLERCHIPS Technical Evaluation Team
The National Renewable Energy Laboratory (NREL), Sandia National Laboratories, and the Georgia Institute of Technology will develop testing protocols to evaluate the cooling technologies developed by COOLERCHIPS projects in real data center operating conditions. The scale will range from the component level to the rack level and all the way up to full Edge data centers. This technical evaluation team will leverage the work done by the COOLERCHIPS Category C team to develop a digital twin to evaluate key parameters and help test a broad range of technologies developed by other COOLERCHIPS project teams to evaluate their thermal, reliability, and cost goals.
Award amount: $1,463,319
JetCool Technologies (Littleton, MA)
Sub-One PUE through Silicon Cooling Efficiency
JetCool will develop a micro-convective cooling technology that combines and optimizes two distinct cooling approaches to provide the highest levels of energy efficiency in data centers. JetCool’s micro-convective cooling modules lower CPU temperatures, reducing leakage current and resulting in power savings of 8-10 percent, while an in-server radiator eliminates the need for server-dedicated air cooling in the data center, providing significant additional energy savings.
Award amount: $1,265,747
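As a quick check on the headline figure, the sketch below totals the 15 award amounts listed above. The amounts are copied from this list; the shortened recipient names used as dictionary keys are only labels for illustration.

```python
# Award amounts in US dollars, as listed above.
awards = {
    "Nvidia": 5_000_000,
    "UC Davis": 3_586_473,
    "Flexnode": 3_500_000,
    "University of Maryland": 3_484_484,
    "HP": 3_250_000,
    "University of Florida": 3_042_417,
    "UT Arlington": 2_843_223,
    "Raytheon Technologies Research Center": 2_504_024,
    "University of Illinois": 2_500_000,
    "HRL Laboratories": 2_000_000,
    "Purdue University": 1_881_315,
    "Intel Federal": 1_711_416,
    "University of Missouri": 1_649_290,
    "NREL": 1_463_319,
    "JetCool Technologies": 1_265_747,
}

total = sum(awards.values())
print(f"{len(awards)} projects, ${total:,} in total")
```

The listed awards add up to just under the $40 million headline figure.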
DCD will profile some of these projects in more detail in the months to come.