The history of the data center industry is one of unprecedented growth. As the world grew more connected, facilities sprouted up across the globe to serve both people and enterprises.
Every day brings news of a major new facility, with ever-higher power envelopes, filled with beefier chips. And yet, despite all this, the industry has managed to avoid significant jumps in energy use - a testament to tremendous innovation and a shift in how the sector operates.
But can the good times last?
In 2020, Northwestern University, Lawrence Berkeley National Laboratory (LBNL), and Koomey Analytics published a groundbreaking study into the global energy use of data centers.
They found that, between 2010 and 2018, compute jumped a whopping 550 percent. Internet protocol (IP) traffic increased more than 10-fold, and data center storage capacity increased by an estimated factor of 25.
In that same timeframe, energy use grew just six percent to 203TWh.
This startling disparity is worth noting - in times of increasing grid uncertainty, and with data centers already under scrutiny for their carbon emissions, it's hard to imagine how the sector would be treated if it had grown linearly with compute.
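A back-of-the-envelope sketch shows the scale of that counterfactual. Only the headline numbers above come from the study; the 2010 baseline is inferred from them, so treat this as illustrative arithmetic rather than the paper's own modeling:

```python
# Headline figures from the 2020 study
energy_2018_twh = 203        # global data center energy use in 2018
energy_growth = 1.06         # energy grew ~6 percent from 2010 to 2018
compute_growth = 6.5         # a 550 percent jump, i.e. 6.5x the 2010 level

# Inferred 2010 baseline (illustrative; not quoted in the paper above)
energy_2010_twh = energy_2018_twh / energy_growth    # ~191 TWh

# Counterfactual: energy scaling linearly with compute
linear_2018_twh = energy_2010_twh * compute_growth   # ~1,245 TWh

print(f"Actual 2018 energy use:  {energy_2018_twh} TWh")
print(f"If scaled with compute: ~{linear_2018_twh:,.0f} TWh")
```

That counterfactual - well over 1,000TWh - would have put data centers at a mid-single-digit share of global electricity generation, rather than the roughly one percent the sector actually consumed.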
Flawed metrics and analysis have long oversimplified the connection between data center demand and power, for years predicting ballooning energy usage that would overwhelm the grid. "A quick way of trying to predict data center power use is to take that electricity use and scale it with some other values, so you could be scaling it with the number of people that are watching online videos, or with the population increase, or with the general size of the market from a financial point of view," the paper's co-author Dr. Arman Shehabi, of LBNL, told DCD.
"And you would start finding strange extrapolations there because there are multiple variables that are really changing every year, like you have more servers going in, but how those servers are being used is changing. So many different parts are becoming more efficient, or the storage is becoming more efficient, the processors are changing, the cooling systems that are used in data centers have changed over time.”
All this has to be included, he said: “What's the stored capacity of storage? How much is IP traffic increasing? How many workloads are we seeing? All of these things have to be taken into account."
He explained that some approaches look at how much data was used for streaming video in the early years of the century, and how much power it required, and then try to scale it to today's streaming demand. "But did the energy usage increase accordingly? Of course not," he said.
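To see why those extrapolations go astray, consider a toy model - every number below is invented for illustration, not drawn from the study - in which a forecaster scales energy with a single proxy such as IP traffic while energy per unit of traffic quietly falls each year:

```python
# Toy model (all numbers invented): why scaling energy with a single
# proxy such as IP traffic produces "strange extrapolations."
base_energy = 100.0     # TWh in a hypothetical base year

traffic_growth = 1.30   # traffic grows 30 percent per year (hypothetical)
efficiency_gain = 0.80  # energy per unit of traffic falls 20 percent per year

naive = actual = base_energy
for _ in range(10):
    naive *= traffic_growth                     # proxy-only extrapolation
    actual *= traffic_growth * efficiency_gain  # traffic up, intensity down

print(f"Naive 10-year projection: {naive:,.0f} TWh")    # ~1,379 TWh
print(f"With efficiency included: {actual:,.0f} TWh")   # ~148 TWh
```

With those made-up rates, the proxy-only forecast overshoots by nearly an order of magnitude after a decade - the same shape of error as the streaming-video extrapolations Shehabi describes.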
What happened was a remarkable feat of engineering. On the semiconductor level, processor designers like Intel and AMD eked out more and more efficiency as they followed Moore's Law. New data centers were built to best practices on cooling and design, while old facilities were slowly phased out in favor of advanced ones built by hyperscalers. The cloud also meant that server utilization skyrocketed, leaving fewer servers idling and needlessly drawing power.
"Back in 2005, I visited a data center at LBNL," Shehabi recalled. "There were racks of servers in there, and then these different computer room air conditioners that were just on the floor, just randomly placed in different locations. There were all these different places where the hot air was mixing with the cold air, and these desk fans in different locations blowing the air around. It was just grossly inefficient. It is like building a refrigerator and not putting the door on.
"So once you figure out 'let's put the door on,' your efficiency is going to jump so much from that."
In his efforts to track energy use at the time, he found data centers and server rooms where the cooling used twice as much power as the IT. That equates to an enormous power usage effectiveness (PUE) of 3.0 - a figure which has since fallen substantially, to the point where hyperscalers like Google claim a PUE of just 1.10.
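The arithmetic behind those figures is simple - PUE is total facility power divided by IT power - and a minimal sketch makes the improvement concrete (the overhead here lumps cooling together with all other non-IT load):

```python
def pue(it_power_kw: float, overhead_kw: float) -> float:
    """Power usage effectiveness: total facility power / IT power."""
    return (it_power_kw + overhead_kw) / it_power_kw

# Cooling drawing twice the IT load, as in the mid-2000s server rooms:
print(pue(it_power_kw=100, overhead_kw=200))  # 3.0

# A modern hyperscale claim, with roughly 10 percent overhead:
print(pue(it_power_kw=100, overhead_kw=10))   # 1.1
```

Put another way, a PUE 3.0 facility burns two watts of overhead for every watt of useful compute; a PUE 1.1 facility burns a tenth of a watt.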
A colleague, Jonathan Koomey, also studied the prevalence of 'zombie servers' - servers left running despite serving no purpose. He found that a massive 10 percent of the world's servers could be classed as zombies, simply frittering away power because no one switched them off. "So that was another big opportunity for efficiency increases," he said. Harder to track were servers that did have a use but ran at low utilization - meaning most of their compute went unused - a figure that has also improved with the growth of the cloud.
This is all something that should be celebrated. Without the combined effort of hundreds of thousands of data center, IT, and semiconductor workers, the sector could not have supported its growth - the grid simply would not have been able to meet the needs of the digital world.
But, as is always the curse of early success, maintaining the pace is not guaranteed. "I think going into the future, it's going to be harder, because those obvious low-hanging fruit opportunities aren't really there," Shehabi said. "And there's going to need to be other ways of having orders of magnitude efficiency increases to balance the orders of magnitude increases in services that we can expect from the industry."
There is no sign of the pace of data center build-outs slowing, particularly since the pandemic entrenched the need for hyper-connected workers. Maintaining that growth without power usage spiraling out of control will be one of the great challenges of our time.
At the chip level, there is already cause for concern. Semiconductors are simply not advancing as fast as they used to: the death of Moore's Law is now a given, but its demise has come as a long, drawn-out whimper rather than a sudden bang. Transistor density improvements began to slow as early as 2010, and the slowdown has only gathered pace since. Manufacturers are hitting the physical limits of transistors, and it's not clear how much further they can go beyond 2-3nm process nodes.
Chip designers have responded gamely, exploring new avenues for improving performance beyond transistor density. But they have also maintained compute improvements by increasing the thermal design power (TDP) of processors - essentially gaining more compute by pushing more electrical power through the chip. That increases both the power demands of the server and the need for cooling.
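A rough rack-level sketch, with hypothetical but representative figures, shows how rising TDP bites twice - once in the server's own draw, and again in the cooling overhead:

```python
# Hypothetical rack: how rising CPU TDP compounds at the facility level.
SERVERS_PER_RACK = 20  # assumed density
CPUS_PER_SERVER = 2    # assumed dual-socket servers

def rack_load_kw(tdp_w: float, pue: float = 1.2) -> float:
    """CPU power for the rack, scaled by PUE for cooling and overhead.

    Ignores memory, storage, and networking, so it understates the total.
    """
    it_kw = SERVERS_PER_RACK * CPUS_PER_SERVER * tdp_w / 1000
    return it_kw * pue

print(f"{rack_load_kw(150):.1f} kW")  # 7.2 kW with 150W parts
print(f"{rack_load_kw(350):.1f} kW")  # 16.8 kW with 350W parts
```

More than doubling the per-socket TDP more than doubles what the facility must deliver and reject as heat, even at a constant PUE.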
Then there are data center layouts. While some old facilities are still clinging on, and engineers are happy to share stories of their competitors' inefficiencies, the reality is that many of the obvious improvements have been made. Average data center PUE has fallen, but is beginning to plateau at the bleeding edge. Studies differ on what the average PUE is, but modern efficiency-focused facilities (in forgiving climates) are believed to sit around 1.2. Even if that is brought lower, dropping from 1.2 to 1.1 simply won't deliver the same improvement as falling from 2.5 to 1.2.
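A quick calculation, assuming a fixed hypothetical IT load, shows those diminishing returns:

```python
# Diminishing returns: savings shrink as PUE approaches 1.0.
IT_LOAD_MW = 1.0  # fixed, hypothetical IT load

for old, new in [(2.5, 1.2), (1.2, 1.1)]:
    saved_mw = (old - new) * IT_LOAD_MW
    pct_of_total = saved_mw / (old * IT_LOAD_MW) * 100
    print(f"PUE {old} -> {new}: saves {saved_mw:.1f} MW "
          f"({pct_of_total:.0f}% of the old total)")
```

The first drop cuts total consumption by roughly half; the second shaves off about eight percent.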
At the same time, the conditions for lowering one's PUE are getting harder, and not just because chips are getting hotter. More problematic is that the world is getting hotter. This summer, Google and Oracle data centers in the UK simply stopped working amid a blistering heat wave. The facilities were built for a pre-climate change world that no longer exists. Expect future facilities to have more cooling equipment built in, running more often, as ambient air can no longer be trusted.
Hyperscalers have also been able to lower waste with economies of scale, building vast server farms that share infrastructure equipment. Now they are targeting the Edge - either with smaller data centers within cities, or micro data centers that could be as small as a half-rack. This will mean a loss of that scale, but Edge vendors argue that they will reduce the power needed to shuttle data back and forth. The story of the Edge is still in its infancy, so it is too early to say which effect will outweigh the other.
Facing increasing scrutiny over energy use, and hotter servers, many data center operators are embracing water.
Direct liquid cooling is a longer-term goal: thanks to water's superior heat absorption, it can mean much less cooling power is required. But it demands a big change in data center hardware. A shorter-term approach is evaporative cooling, which cuts cooling energy but increases water consumption.
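The physics behind that goal is straightforward: per kilogram, water absorbs roughly four times as much heat as air for the same temperature rise - and it is vastly denser, so far less volume has to move. A minimal sketch using standard specific heat values, with a hypothetical heat load:

```python
# Q = m * c * dT: coolant mass needed to carry away a given heat load.
C_WATER = 4186  # J/(kg*K), specific heat of water
C_AIR = 1005    # J/(kg*K), specific heat of air

heat_j = 1_000_000  # 1 MJ of server heat (hypothetical load)
delta_t = 10        # allowed coolant temperature rise, in kelvin

mass_water = heat_j / (C_WATER * delta_t)  # ~24 kg, about a bucketful
mass_air = heat_j / (C_AIR * delta_t)      # ~100 kg - roughly 83 m3 of air

print(f"Water needed: {mass_water:.0f} kg")
print(f"Air needed:   {mass_air:.0f} kg")
```

Evaporative cooling instead exploits water's latent heat of vaporization - around 2.26MJ per kilogram - which is why it saves energy but consumes water.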
"With a strong focus on the PUE metric the industry has reacted and said: 'let's throw water at it.' But now you're consuming a lot more water - and so you've plugged one hole by creating another" said Shehabi.
Done right, water use could indeed help stave off massive power increases, but again it is not clear how many data center plans truly account for a changing world. The US is in the midst of a historic and unrelenting drought, and yet data centers are still competing for water (often, even, drinking water) and tapping non-renewable aquifer reserves.
In the UK, utility Thames Water launched a data center water probe in London and Slough, claiming the sector was using too much water amid a drought. But that declaration came as the Greater London Authority said that new housing projects in West London could be blocked for more than a decade because data centers have taken up all the electricity capacity - highlighting the delicate balance the sector will have to navigate as both power and cooling solutions become hard to secure.
Shehabi remains hopeful, pointing to specialist chips that are more efficient for certain workloads, and the wonderful inventiveness of the industry. But he cautioned that the challenge was immense.
The US Congress has officially kicked off the long bureaucratic process that will eventually lead to Shehabi's team creating a new report on data center power usage. "The industry has changed completely since we last looked at this, and I expect it to be a really groundbreaking report, frankly. What I don't know is how the trend of electricity use is going to look - if it's going to be the same as what we've seen in the past, or if we're going to see a bigger increase or maybe a decrease. We just don't know."