Following Stellium Datacenters’ collaboration with immersion cooling specialists Submer this year, we are experiencing at close quarters the considerable challenges of designing, engineering, and installing a substantial immersion liquid cooling system.
We are doing this as a proof of concept under the Open Compute Project (OCP) umbrella.
For those unfamiliar with OCP, its hardware standards are open source (like some software), so operators know what a given rack needs in terms of size, capability, and power before it arrives onsite. This also helps ensure that the facility's design and layout are ready to support this class of equipment.
In our view, the transparent and collaborative approach of OCP is the best way to make immersion liquid cooling environments viable and accessible to our customers as soon as possible.
The goal is to help the wider industry accelerate immersion cooling’s evolution into a product that can eventually be deployed in a broader range of data centers, ultimately becoming an off-the-shelf solution for existing data center cooling environments. Without doubt, it has the capability to bring rack power density and efficiency to new levels.
But all this comes at a price in terms of the immersion environment, usually a tank filled with dielectric fluid. Both the tanks and the dielectric fluid raise a new range of issues that the data center environment must address:
- Weight – A fully loaded tank can weigh anywhere from 500kg to several thousand kilograms, requiring 20kN+ floors to support it.
- Physical size – Tanks can range from 1,000mm x 800mm x 1,500mm high to 6,000mm x 2,000mm x 2,000mm high.
- Dielectric fluid handling – This is a significant issue. Depending on the specific dielectric, there can be health and safety issues, as well as the practical issue of dealing with 230/400 volts in a dielectric environment.
- Power density – HPC power densities can be very challenging for many mature data centers.
- Power and communication interfaces with the tank.
- Removal/reinstatement of IT kit from the immersion tank.
- Specialist training for staff – The data center industry already faces a significant shortage of trained talent, and the additional training will add to that pressure.
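To make the weight and size items concrete, here is a rough sketch of a first-pass floor loading check. It assumes the "20kN+" figure is read as a uniformly distributed floor rating of 20 kN per square metre, which is our interpretation for illustration only; the function names and the 3,000kg example are hypothetical, and point loads on tank feet, floor tiles, and access routes are ignored. A structural engineer's assessment would still be needed for any real installation.

```python
G = 9.81  # standard gravity in m/s^2


def floor_pressure_kn_per_m2(mass_kg: float, length_mm: float, width_mm: float) -> float:
    """Return the load a tank spreads over its footprint, in kN per square metre.

    Assumes the weight is distributed evenly across the whole footprint.
    """
    if mass_kg <= 0 or length_mm <= 0 or width_mm <= 0:
        raise ValueError("mass and footprint dimensions must be positive")
    area_m2 = (length_mm / 1000) * (width_mm / 1000)
    force_kn = mass_kg * G / 1000
    return force_kn / area_m2


def fits_floor(mass_kg: float, length_mm: float, width_mm: float,
               floor_rating_kn_per_m2: float = 20.0) -> bool:
    """Check the distributed load against an assumed floor rating (default 20 kN/m^2)."""
    return floor_pressure_kn_per_m2(mass_kg, length_mm, width_mm) <= floor_rating_kn_per_m2


if __name__ == "__main__":
    # Smallest footprint from the list above (1,000mm x 800mm) at two loaded weights.
    for mass in (500, 3000):  # 3,000kg is a hypothetical "several thousand kilograms" case
        load = floor_pressure_kn_per_m2(mass, 1000, 800)
        print(f"{mass}kg tank: {load:.1f} kN/m^2, fits 20 kN/m^2 floor: {fits_floor(mass, 1000, 800)}")
```

Under these assumptions, the same footprint that is comfortable at 500kg exceeds the assumed rating at 3,000kg, which is why weight and physical size have to be considered together.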
While none of these elements is particularly challenging for greenfield development, many of the 3,000 existing data centers in North America, plus another 3,000 or so in Western Europe, will struggle to accommodate the access, weight, space, and staff training requirements.
However, all is not lost. For the many facilities that cannot yet meet the fundamental requirements of immersion cooling, proof of concept projects will serve as a catalyst and an example.
Cooling alternatives
Meanwhile, the existing range of HPC cooling solutions still offers robust, efficient options for data center operators and their clients:
- In-row cooling for applications up to 40kW per rack.
- Rear door cooling up to 70kW.
- Direct-to-chip/cold plate up to 100kW.
These are all tried-and-tested, robust solutions that can operate freely within existing chilled water data center environments. They have been deployed in many existing data centers. All can be configured to deliver a PUE below 1.2. In many ways, these non-immersive options maintain the flexibility of the traditional data center to evolve over time.
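As a rough illustration of how the thresholds above might be used in early planning, the sketch below lists which non-immersive options cover a given rack load and computes PUE as total facility power divided by IT power. It assumes all three limits apply per rack (the list states this only for in-row cooling), and the names `COOLING_LIMITS_KW`, `cooling_options`, and `pue` are hypothetical.

```python
# Upper limits in kW taken from the list above; treating all three as per-rack
# figures is an assumption made for this sketch.
COOLING_LIMITS_KW = [
    ("in-row cooling", 40),
    ("rear door cooling", 70),
    ("direct-to-chip/cold plate", 100),
]


def cooling_options(rack_kw: float) -> list[str]:
    """Return the non-immersive options whose stated limit covers the rack load."""
    if rack_kw <= 0:
        raise ValueError("rack load must be positive")
    return [name for name, limit_kw in COOLING_LIMITS_KW if rack_kw <= limit_kw]


def pue(total_facility_kw: float, it_kw: float) -> float:
    """Power usage effectiveness: total facility power divided by IT equipment power."""
    if it_kw <= 0:
        raise ValueError("IT load must be positive")
    return total_facility_kw / it_kw


if __name__ == "__main__":
    for load_kw in (35, 60, 120):
        print(f"{load_kw}kW rack: {cooling_options(load_kw) or 'beyond the listed options'}")
    # A PUE below 1.2 means cooling and other overheads stay under 20% of the IT load.
    print(f"PUE for 1,000kW IT load and 1,180kW total: {pue(1180, 1000):.2f}")
```

An empty result simply means the load exceeds every limit quoted above; it says nothing about whether immersion cooling is the right answer for that rack.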
Flexibility is really important. Taking an existing rack and fitting it with a rear door cooler offers an immediate solution to support HPC demands. This route to HPC also has the real value of extending the life of existing data center facilities without creating the significant construction carbon burden of building new facilities.
So, from our learning experience so far, my takeaway is that immersion cooling installation will remain a step too far for most colocation facilities that are not OCP-certified. This may come as a nasty surprise to the many data center operators and their customers who already face optimizing exponentially growing HPC and AI workloads and who see immersion cooling as a panacea for energy inefficiency. Taking steps to become OCP-ready will reduce the wait time.