How much energy do data centers consume?
As a research scientist at Lawrence Berkeley National Lab, Dr Arman Shehabi has spent more than a decade trying to answer that question, providing comprehensive US-focused reports on the impact of the industry, and helping to predict how fast energy demands are growing.
And yet, Shehabi is the first to admit that his work is inexact, hampered by pervasive industry secrecy and a rapidly changing market.
A matter of consumption
“The goal at LBNL is to put out information that the industry, consumers, and policy-makers can use to make decisions on,” Shehabi said at day two of DCD>San Francisco 2019. “And we're here to get more information.”
In Shehabi’s last major report, from 2016, his team estimated that US data centers consumed approximately 70 billion kWh, about 1.8 percent of total US electricity consumption.
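As a quick sanity check on scale, those two figures together imply a total for US electricity consumption of roughly 3,900 billion kWh. The sketch below is only back-of-the-envelope arithmetic on the rounded numbers quoted above; the function and variable names are illustrative and do not come from the LBNL report.

```python
# Back-of-the-envelope check using the rounded figures quoted in the article.
DATA_CENTER_KWH = 70e9      # approx. US data center consumption (2016 LBNL report)
SHARE_OF_US_TOTAL = 0.018   # about 1.8 percent of total US electricity consumption


def implied_total_kwh(part_kwh: float, share: float) -> float:
    """Return the total implied by a part and its fractional share."""
    if not 0 < share <= 1:
        raise ValueError("share must be in (0, 1]")
    return part_kwh / share


total = implied_total_kwh(DATA_CENTER_KWH, SHARE_OF_US_TOTAL)
print(f"Implied US total: about {total / 1e9:,.0f} billion kWh")
```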
While still a huge number, the percentage is well below the figures in widely publicized, headline-grabbing studies that show IT usage rapidly rising to 20 percent of global energy usage.
“This area is so difficult to get data on,” Shehabi said. “It’s such a complex, highly dynamic sector that we see people either say: ‘Okay, well, it’s completely futile, I won’t deal with it.’ Or they say: ‘Let's put out sensational numbers to [get] people's attention.’ And neither of those help the cause, because if we have sensational numbers that just keep changing every year, no one is going to take it seriously.
“So I've seen numbers like this, and there's been numbers similar to this for Bitcoin recently. In both of those, really the problem is that it’s projections made off of static numbers.”
One of the reasons projections are difficult is simply because the market is constantly changing, with new innovations in both technology and business models upending the industry.
“At LBNL we do a lot of forecasts and projections for different sectors, and we'll go out 30 years for refrigerators. We can't do that for the IT sector,” Shehabi said.
Forecasting the unexpected
A 2007 report from LBNL is a case in point: It looked at the doubling of electricity usage by servers in the data center industry between 2000 and 2005, and forecast that it would continue to expand at pace, laying out various scenarios for how the industry could rise considerably from the then-1.2 percent of US consumption to something more significant.
That's not what happened. The growth of the server market slowed, partially due to the financial crash, but also due to technological advances.
Cloud companies, taking advantage of virtualization and higher server utilization rates, were able to serve more compute with fewer servers. Power demand on servers, after noticeably growing between 2000 and 2005, mostly leveled out.
Servers also improved their power scaling abilities, drawing less power during idle periods or when at low utilization. From 2005 to 2016, impressive efficiency improvements were made in storage, network and infrastructure. Companies, particularly those in hyperscale, found new ways to eliminate energy waste.
“It's hard to go even just a couple years out,” Shehabi said. Instead of the 30-year predictions made for the refrigeration industry, “our 2016 report, it went out to 2020,” with the report predicting that the industry would hit 73 billion kWh by then.
But while predictions are challenging because technological change is fundamentally unpredictable, even understanding what is happening right now is difficult, simply because of corporate secrecy.
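To see how far a projection built on a static growth rate can drift, here is a rough sketch that assumes the 2000 to 2005 doubling simply repeats every five years. It is not the method of the 2007 report, which laid out several scenarios. Two things are assumptions made for illustration: that total US consumption stays flat, so the share doubles too, and that the 1.2 percent figure applies to 2005.

```python
# Illustrative only: a "static" projection in which the 2000-2005 doubling
# repeats every five years and total US consumption is held flat.
# Both assumptions are ours, not the 2007 report's methodology.

def naive_share(start_share_pct: float, start_year: int, year: int,
                doubling_years: float = 5.0) -> float:
    """Extrapolate a percentage share that doubles every `doubling_years`."""
    return start_share_pct * 2 ** ((year - start_year) / doubling_years)


SHARE_2005_PCT = 1.2      # share cited for the 2007 report (start year assumed)
REPORTED_2016_PCT = 1.8   # share estimated in the 2016 report

for year in (2010, 2015, 2020):
    projected = naive_share(SHARE_2005_PCT, 2005, year)
    print(f"{year}: naive projection {projected:.1f} percent")

print(f"2016 report estimate: {REPORTED_2016_PCT:.1f} percent")

# Growth the 2016 report forecast: from 70 to 73 billion kWh by 2020.
forecast_growth_pct = (73 - 70) / 70 * 100
print(f"2016 report forecast growth to 2020: {forecast_growth_pct:.1f} percent")
```

Under those assumptions the naive curve passes 4 percent by 2015, against the roughly 1.8 percent the 2016 report estimated, while the report's own outlook to 2020 amounts to growth of only about 4 percent.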
“It's ironic that in an industry where it's understood that more data can improve optimization that we don't have very much data about how much energy is being used in data centers across the board,” Shehabi said. “[Companies] keep their cards close to their chest, for proprietary reasons.”
He continued: “We need to start understanding how we're interacting with the broader system. If we can have data from data centers where it can be anonymized and given to an unbiased third party, like - I don't know - a national lab, that can really help for the putting out of better numbers.”
Others in the data center sector agree that it is time to be more open.
Open up
"As an industry we need to be more transparent, because if we're going to solve the actual problem of energy, we need to be transparent about the totality of the problem itself," Adam Kramer, Switch's EVP of strategy, said at the same panel.
“And that includes working with academics, it includes working with environmentalists, it includes total transparency in order to get here. I don't think there should be any secret sauce about the total megawatt hours consumed. It's something we need to have a dialog around.”
Switch, which uses renewable energy PPAs to power its data centers, publishes its energy usage on its website. “We publish our total megawatt hours every single year - it's about 600,000 megawatt hours,” Kramer said.
QTS’ VP of energy and sustainability, Travis Wright, was also forthcoming: “Our total consumption is about a billion kilowatt hours per year and about 32 percent of that is renewable.”
Both companies were dwarfed by the hyperscaler on the panel: Google. “The last time we published it, ours is about 10 terawatt hours, which is two San Franciscos, per year,” the company’s senior director of data center energy and location strategy, Gary Demasi, said. “We have matched that with renewable energy purchases.”
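The three panelists quoted their consumption in three different units, which makes the gap hard to see at a glance. The sketch below converts the rounded figures as quoted to terawatt hours; the helper name and dictionary layout are illustrative, and the numbers are the speakers' approximations rather than audited totals.

```python
# Convert the annual figures quoted by the panelists to a common unit (TWh).
KWH_PER_UNIT = {"kWh": 1.0, "MWh": 1e3, "GWh": 1e6, "TWh": 1e9}


def to_twh(value: float, unit: str) -> float:
    """Convert an energy value in the given unit to terawatt hours."""
    return value * KWH_PER_UNIT[unit] / KWH_PER_UNIT["TWh"]


# (value, unit) exactly as each speaker stated it.
quoted = {
    "Switch": (600_000, "MWh"),
    "QTS": (1e9, "kWh"),
    "Google": (10, "TWh"),
}

annual_twh = {name: to_twh(value, unit) for name, (value, unit) in quoted.items()}
for name, twh in annual_twh.items():
    print(f"{name}: {twh:.1f} TWh per year")
```

On these rounded figures, Google's reported consumption is about ten times that of QTS and roughly 17 times that of Switch.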
But while Google has been forthcoming about its total energy usage - and commendable about how it offsets or reduces its carbon footprint - its cloud service brings with it a whole new set of transparency challenges.
Clouding the data
“We've been moving from a colo in New Jersey to Google Cloud over the past two years,” Emily Sommer, staff software engineer at Etsy, said.
“When we were in a colo situation, we had a great insight into the power efficiency of all of our systems and servers and processes. And now on the cloud we're losing that insight.”
The colo setup was simple: “I went to visit my database that I managed, I waved to the physical box,” Sommer said. Because it was an actual machine in a set location, it was easy to track how much power it consumed.
“In Google, you have virtual machines on top of virtual machines on top of virtual machines. I'd imagine it's especially difficult, given the ephemeral nature of servers.”
Now, Etsy’s services run on various servers, in various data centers, sharing those same systems with other customers. How do you track that?
"Data is one of the big gaps," Google's Demasi admitted. "The technology inside the servers is no longer just the metal container - virtualization happens across infrastructure, and may actually span multiple facilities to be as efficient as possible for the customer. So the problem gets compounded.
"[We're] working hard on that problem."
Currently, Etsy only has access to its billing data - how many hours of storage, or the amount of compute used, “but we don't have a way to translate that into kilowatt hours,” Sommer said.
This makes it hard for the company to make decisions about how to run its services in a way that reduces its energy usage, a pledge the company has made. “As engineers, we're struggling to come up with some metrics and ways to measure the impact of our code and architecture decisions.”
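To make the gap concrete, here is a purely hypothetical sketch of what a translation from billing data to kilowatt hours might look like. Neither Etsy nor Google describes such a method here: the `BillingLine` shape, the service names, and every coefficient below are made-up placeholders, and the point is that the real values would have to come from the cloud provider.

```python
from dataclasses import dataclass


@dataclass
class BillingLine:
    """One line of cloud billing data (hypothetical shape)."""
    service: str
    vcpu_hours: float = 0.0
    storage_tb_hours: float = 0.0


# Placeholder coefficients, NOT measured values. Real figures would have to
# come from the cloud provider, which is the data gap described above.
ASSUMED_WATTS_PER_VCPU = 5.0
ASSUMED_WATTS_PER_TB_STORED = 1.0
ASSUMED_PUE = 1.1  # facility overhead multiplier (power usage effectiveness)


def estimate_kwh(line: BillingLine) -> float:
    """Rough estimate: watt-hours of IT load, scaled by PUE, returned in kWh."""
    it_wh = (line.vcpu_hours * ASSUMED_WATTS_PER_VCPU
             + line.storage_tb_hours * ASSUMED_WATTS_PER_TB_STORED)
    return it_wh * ASSUMED_PUE / 1000.0


bill = [
    BillingLine("search", vcpu_hours=10_000),
    BillingLine("image-store", storage_tb_hours=50_000),
]
total_kwh = sum(estimate_kwh(line) for line in bill)
print(f"Estimated energy for this bill: {total_kwh:.1f} kWh")
```

Even this toy version shows the difficulty: the result is only as good as the per-unit power figures, and on shared, virtualized infrastructure those are not visible to the customer.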
Giving customers insight into what their role is in the shift to renewable energy and efficient practices is vital, Switch’s Kramer believes.
"We try to help educate customers just by making them aware of what their actual energy load is," he said.
Every year, the company lets its users know how much carbon they would have produced using Switch, if the company wasn't fully renewable. "Then they can start seeing what they have achieved, and they can actually start to articulate how they have [saved] a ton of carbon just because they chose a data center that is 100 percent renewable and that made a difference.
"That's a start."