Game-changing generative AI technologies are dramatically impacting the way employees work, engage with customers, and collaborate with partners. Yet behind the scenes, generative AI’s voracious appetite for resources is forcing organizations to completely reconsider how data centers are designed and managed.

Those challenges are most pointed in corporate environments, where purpose-built generative AI large language models (LLMs) are being trained on company data to provide worker and customer support tools with unprecedented natural language and data analytics capabilities.

By 2026, Gartner has predicted, over 80 percent of enterprises will be using generative AI APIs or models – and most will have deployed generative AI applications in production environments running either on hyperscale cloud platforms or within the confines of their own data centers.

While companies have long used high-performance computing (HPC) systems and specialized graphics processing units (GPUs) for intensive data processing, the growth in demand for generative AI systems – and their need to process data quickly enough to enable near real-time response to users – have driven a surge in development of specialized AI accelerator chips.

Servers running these high-powered chips – which combine GPUs with high-speed memory and include the likes of Nvidia’s GH200, AMD’s MI350, and Intel’s Gaudi 3 – are optimized to make generative AI run faster, and they’re being installed in data centers as quickly as manufacturing facilities can pump them out.

Deloitte has predicted the market for such chips will be worth more than $50 billion this year, or around 11 percent of the global chip market – increasing to several hundred billion dollars by 2027. And while this presents significant revenue possibilities for chipmakers, it also creates new challenges for companies integrating generative AI accelerators into their data centers.

Generative AI’s power problem

Widespread adoption of generative AI chips is already creating challenges, because they consume far more power than conventional chips.

Processing a generative AI query consumes roughly 10 times as much energy as a conventional database query – meaning that where a conventional data center rack typically draws eight to 10kW, generative AI-optimized systems push this to 40kW, 50kW, and even 100kW per rack.
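To put that jump in perspective, a simple back-of-the-envelope sketch can compare facility-level draw for the two rack profiles. The rack counts, per-rack figures, and PUE value below are illustrative assumptions, not measurements:

```python
# Rough, illustrative comparison of facility power draw between
# conventional and generative AI-optimized racks. All figures are
# assumed for the sake of the sketch, not vendor data.

CONVENTIONAL_KW_PER_RACK = 10   # upper end of a typical compute rack
AI_KW_PER_RACK = 50             # mid-range figure for a generative AI rack

def facility_power_kw(racks: int, kw_per_rack: float, pue: float = 1.5) -> float:
    """Total facility draw: IT load multiplied by an assumed Power Usage
    Effectiveness (PUE), which folds in cooling and distribution overhead."""
    return racks * kw_per_rack * pue

conventional = facility_power_kw(20, CONVENTIONAL_KW_PER_RACK)
ai_optimized = facility_power_kw(20, AI_KW_PER_RACK)

print(f"20 conventional racks: {conventional:.0f} kW")   # 300 kW
print(f"20 AI-optimized racks: {ai_optimized:.0f} kW")   # 1500 kW
```

Even a modest 20-rack deployment, on these assumptions, moves from a few hundred kilowatts to well over a megawatt once cooling overhead is counted – which is why the electrical accommodations below matter.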

That’s a significant jump in power requirements, demanding myriad accommodations in data center infrastructure including electrical lead-in capacity, on-site power distribution, surge protection, backup generation, and more.

Indeed, with generative AI set to scale rapidly within individual companies’ data centers – and the effect multiplied across colocation facilities housing generative AI capabilities for many tenants – generative AI is expected to account for 1.5 percent of the world’s total power consumption by 2029, according to industry research firm TechInsights.

By 2027, the Columbia University Center on Global Energy Policy (CGEP) projects, GPUs will consume 1.7 percent of the electrical capacity of the United States and four percent of all electricity sold in that country – up from two percent this year.

By that point, CGEP projects, data centers optimized for running LLMs will account for 12 percent of total commercial electricity demand – nearly double the seven percent of demand this year, and four times their share in 2022, before the release of OpenAI’s ChatGPT brought generative AI technology into the mainstream.

Surges in power consumption won’t be evenly distributed: although colocation facilities dot major cities across the APAC region and around the world, large-scale data centers tend to be concentrated in regions with reliable power generation, extensive communications interlinks, and stable geopolitical situations with strong corporate governance mechanisms.

That means generative AI’s rapid adoption will strain regional electricity supplies, while driving its biggest users to scale up data centers in well-established regional centers such as Singapore, Sydney, Manila, and Hong Kong.

Many other organizations, however, will concentrate their generative AI capabilities within their existing data centers, keeping their generative AI compute close to their users to minimize latency and complexity.

Keep a cool head as generative AI heats up

As a specialist data center architect, Oper8 Global has seen firsthand the impact of this generative AI-driven surge in demand as businesses look for assistance in developing strategies to bring the technology into their own data centers.

Most companies are allocating a portion of their data center white space to generative AI, often delineating space for a ‘micro data center’ containing one or more purpose-built racks with supporting power, cooling, and communications infrastructure.

That includes not only concentrating on high-capacity electricity supply but also new approaches to cooling generative AI accelerators that generate so much heat that conventional hot aisle/cold aisle cooling techniques simply can’t keep up.

This challenge is driving a resurgence in alternative forms of cooling – specifically, liquid cooling techniques such as direct-to-chip cooling, which extracts heat from AI accelerator chips by mounting cold plates directly onto them and circulating chilled water through the plates to absorb excess heat.
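The sizing of such a loop follows from the standard sensible-heat relation Q = ṁ·c_p·ΔT: the hotter the rack and the smaller the allowed coolant temperature rise, the more water must flow. The rack load and temperature rise below are illustrative assumptions, not figures from the article:

```python
# Illustrative estimate of the coolant flow a direct-to-chip loop needs,
# using the sensible-heat relation Q = m_dot * c_p * delta_T.
# The 100 kW rack load and 10 K temperature rise are assumed examples.

WATER_CP = 4186.0  # specific heat of water, J/(kg*K)

def required_flow_kg_per_s(heat_load_w: float, delta_t_k: float,
                           cp: float = WATER_CP) -> float:
    """Mass flow rate needed to carry away heat_load_w watts when the
    coolant is allowed to warm by delta_t_k kelvin across the loop."""
    return heat_load_w / (cp * delta_t_k)

flow = required_flow_kg_per_s(100_000, 10)  # a 100 kW rack, 10 K rise
print(f"Required flow: {flow:.2f} kg/s")    # ~2.39 kg/s of water
```

On these assumptions, even a single 100kW rack needs a couple of litres of water moving through its cold plates every second – plumbing that conventional air-cooled facilities were never designed for.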

Liquid cooling requires careful technical design and the backing of chip manufacturers, but surging heat generation in data centers is expected to push the technology into the mainstream in coming years – with this year’s estimated $2 billion to $3 billion in spending on the technology expected to grow by 25 percent annually.

That’s creating opportunities for new technologies such as DUG Cool – an innovative solution from our partner, DUG Technology, that immerses chips in a patented, non-conductive fluid, reducing the cooling energy required by up to 95 percent and cutting overall power consumption in half.

A recent Deloitte survey of 2,770 executives in 14 countries found that 55 percent had run into data-related issues that caused them to avoid certain generative AI use cases – and that most generative AI adopters have moved just 30 percent or fewer of their generative AI experiments into production. Breaking free of the limits of existing data center infrastructure, and fully embracing generative AI’s possibilities, will depend on companies adopting these new data center technologies.

By taking a careful approach to data center design that frees you from concerns about whether your infrastructure can keep up with generative AI’s demands, you’ll be well-positioned to join them.