Shesha Krishnapura is beaming. Of the many data center tours I have been on, Krishnapura's is clearly the one with the most enthusiastic guide. “Welcome to the world's best data center,” he says with a chuckle.
We’re in a huge, 690,000 square foot (64,100 sq m), five-story building right next to Intel’s global headquarters in Santa Clara, California. “Where you are standing is an old Intel fab called D2. The last product that was made here was the first generation Atom in 2008. It was decommissioned by the end of 2008, so by 2009 it was available - just an empty space.”
Krishnapura, Intel IT CTO and Fellow, “made a proposal to use D2 to learn how to build megascale data centers with energy efficiency, cost efficiency, and so on, in mind.”
“At the time, [then-chief administrative officer] Andy Bryant said ‘OK, you guys can have this fab to build, but first go and experiment in the old SC11 building.’” There, Krishnapura’s team built a chimney cooling system, along with a traditional CRAC unit, to achieve a power usage effectiveness (PUE) of 1.18. “It was our first experiment, and when we were successful at getting 30kW per rack with 24-inch racks, then we said ‘ok now let's go and break the barriers.’”
The result, built out over several years, is a data center with several unique elements, and an impressively low PUE.
"Today you're going to see around 150,000 servers," Krishnapura says as we head into D2P3, Intel’s largest data center, which spans 30,000 square feet (2,800 sq m), with a power capacity of 31MW.
The facility uses close-coupled evaporative cooling that relies on recycled water, helping it reach an annualized PUE of 1.06, significantly below the “worldwide average of about 1.7,” Krishnapura says.
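For readers unfamiliar with the metric: PUE is simply total facility energy divided by the energy delivered to IT equipment, so 1.0 would mean zero cooling and power-distribution overhead. A minimal sketch, using illustrative numbers rather than Intel's actual meter readings:

```python
# PUE (power usage effectiveness) = total facility energy / IT equipment
# energy. A value of 1.0 would mean every watt goes to the servers.
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    return total_facility_kwh / it_equipment_kwh

# Illustrative figures (assumed, not Intel's): a facility drawing
# 1,060 kWh overall to deliver 1,000 kWh to servers has a PUE of 1.06.
print(round(pue(1060, 1000), 2))  # 1.06 - the figure D2P3 achieves
print(round(pue(1700, 1000), 2))  # 1.7 - roughly the worldwide average cited
```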
He explains: “The city, when they process all the wastewater from homes, like sewer water and all the kitchen waste, they typically throw it into the Bay for natural evaporation. But they also sell that water for industrial use, or landscaping or other stuff, at 50 percent lower cost. So we buy that to cool this data center.”
Elsewhere in the old semiconductor fabrication plant are smaller data centers, including D2P4, which has 5MW of power capacity across 5,000 square feet (465 sq m). Thanks to free air cooling, it, too, has a PUE of 1.06 - “they have exactly the same PUE, but totally different techniques.”
The two facilities have the lowest PUE of any of Intel’s data centers. “We've closed lots of small, inefficient data centers, and are trying to reduce our average PUE across our data centers to near 1.06,” Krishnapura says. Back in 2003, the company operated 152 data centers; by 2012 the number had shrunk to 91. “Now we’re at 56.”
The reduction in data center footprint has come even as the company faces ever-growing compute demand - a rough increase of 39 percent a year. To meet this challenge with fewer sites, Intel has relied on its old friend, Moore’s Law (the observation that the number of transistors in a chip doubles every two years, proposed by Intel co-founder Gordon Moore).
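It is worth noting why that pairing works: at 39 percent annual growth, compute demand doubles roughly every two years, which happens to match the Moore's Law cadence. A quick back-of-the-envelope check:

```python
import math

# The article cites ~39 percent annual growth in compute demand. At that
# rate, demand doubles roughly every two years - conveniently close to
# the Moore's Law doubling cadence that lets fewer, denser data centers
# keep pace.
annual_growth = 0.39
doubling_time = math.log(2) / math.log(1 + annual_growth)
print(round(doubling_time, 1))  # 2.1 (years)
```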
“In 2002, we had 14,191 servers with single-core, two-socket CPUs, totaling 28,000 cores,” Krishnapura says. “Now we have 260,000 servers, we have 2 million cores, more than 315 petabytes of storage and more than half a billion network ports within the data center.”
While Krishnapura talks, I become aware of something relatively unusual about the facility: “You're already sweating, because it's very hot,” Krishnapura observes.
When retrofitting D2, Krishnapura read a paper from Google that revealed the search giant operates its facilities at 78°F (25.5°C) in the cold aisle. “We said 'why limit it at that? What's the maximum we can go to?'” All available IT equipment supported inlet temperatures of up to 95°F (35°C), so the company settled on a cold aisle target of 91°F (32.8°C).
“It ranges between around 78-91°F in the cold aisle, depending on the outside temperature. The hot aisle is usually 20-30°F hotter.”
Looking up, Krishnapura says another difference is the height. “Industry-standard full IT racks are 42U, roughly 6ft. We are much taller, our racks are 60U, it's 9ft.” They are also slimmer: instead of the standard 24-inch format, they are trimmed to 20 inches, allowing for a few more racks to be crammed in.
“In 50 linear feet, where you can put 25 standard racks, we can put 30 of them. And as they're taller, we can put a lot more servers: each rack supports all the way up to 280 servers, and each rack can support up to 43kW peak power load.”
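The density arithmetic checks out. A small sketch, with the rack widths, row length, and per-rack server count taken from the figures quoted above:

```python
# Checking the rack-density math: widths and row length come from the
# article; the rest is unit conversion.
row_inches = 50 * 12                # 50 linear feet of row space

standard_racks = row_inches // 24   # 24-inch industry-standard racks
slim_racks = row_inches // 20       # Intel's trimmed 20-inch racks
print(standard_racks, slim_racks)   # 25 30

# At up to 280 servers per tall 60U rack, one 50-foot row of the
# slim racks can hold:
print(slim_racks * 280)             # 8400 servers
```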
These are used internally for many of the things one would expect from a large enterprise, from running SAP workloads, to hosting webservers, to running Intel's video conferencing tools.
They are also used to design the company’s chips. “We use it for all the pathfinding to 7nm, to 5nm, some of the quantum physics algorithms - how the electrons scatter - all of that,” Krishnapura says.
By using Intel products in the company's facilities at scale, Krishnapura is additionally able to “give feedback back to the data center chip design business - we want to eat our own dogfood and learn from it.”
This creates a “feedback loop into the data center business, about how do we innovate and what kind of chips we want to make,” Krishnapura says. “It could be FPGAs from our Altera acquisition, or it could be the discrete graphics which we are working on, or it could be the other accelerators like Nervana.”
Less is more
But perhaps, for a business like Intel, one of the major benefits of using its own data centers is the obvious one - saving money: “The goal is to run the best data centers in the world and, from 2010 to 2017, we saved more than $2 billion, compared to having everything in the public cloud. So our cost is less than 50 percent of running in a public cloud.”
Further cost savings are being realized as the company pushes for fewer, but larger, data centers. “For every 5MW, we are paying $1.91m less in utility bills, electricity bills, every year in Santa Clara,” Krishnapura says. “For this particular data center, for example, when it is filled out, we will be paying a nearly $12m lower electricity bill for the 31MW.”
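The two figures are consistent: scaling the quoted per-5MW saving linearly up to the 31MW facility reproduces the "nearly $12m" number.

```python
# Scaling the quoted $1.91m-per-5MW annual utility saving linearly to
# the 31MW D2P3 facility.
saving_per_mw = 1.91e6 / 5           # $382,000 saved per MW per year
print(round(saving_per_mw * 31))     # 11842000 - i.e. just under $12m
```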
That electricity is used to power servers from several different vendors - DCD spotted logos from HPE, Dell and others - but most of all, it’s used to power Supermicro servers. That’s because the company partnered with Intel on Krishnapura’s brainchild: disaggregated servers.
"Green computing is not just energy-efficient computing, or using less natural resources like freshwater, there's also e-waste," Krishnapura says, describing how data center operators often rip out entire servers and replace them every five or so years - a huge financial cost, and a significant waste, when some equipment can be reused, as can the sheet metal. "There is no reason to change all of this."
Disaggregated servers "separate the CPU complex, the I/O complex and the accelerator complex," he says. "Whatever you need, you can independently upgrade easily."
Of course, companies can already upgrade aspects of their servers, but it is a difficult and laborious process. With disaggregated servers, "if people want to ‘rack it and stack it,’ they just need to remove four screws - it takes 23 percent of the time [spent previously]. You save 77 percent of the technician's time, that's a huge value."
However, with Krishnapura and Intel holding the patent, there is a downside: "It's our IP, obviously we don't want our competitor chip, graphics or other processor to get in."
The servers are currently only available from Supermicro: "In fact they were the last company that I pitched the idea to, I had pitched it to every other company... but Supermicro saw the value very quickly."
In June 2016, Krishnapura discussed the idea with Supermicro CEO Charles Liang, "and then six weeks later we had 10,000 servers deployed - it's never been done in the industry that fast. Now we have more than 80,000 disaggregated servers."
The idea, which aims to reduce the number of new servers companies need to buy, has not been immediately embraced by the industry: “It could be that their revenues are also lower,” Krishnapura says. “But what they don't understand is that as [company] IT budgets are flat, you are now giving a reason for them to upgrade every two years, instead of every four.”
This would mean that customers are likely to buy more CPUs - which will presumably come from Intel - but Krishnapura also insists this is a matter of ethics. “My whole purpose now is to use less material, less energy to deliver more compute and more aggressively bring value to the enterprises worldwide - not just to the cloud, but also for the private cloud, for HPC, and every segment.”
This article featured in the Modernization Supplement of the February issue of DCD>Magazine.