Google, Facebook and the other giants update their cooling practice every six months. What can you learn from them?
The efficiency of data centers has increased by leaps and bounds in recent years – at the leading edge, anyway. For the rest, there have been worthwhile gains, but nothing like those achieved by the webscale (or hyperscale) operators such as Amazon, Facebook, Google and Microsoft. So what is it that those operators are doing right – or rather, better? And given that their massive data centers represent only a small proportion of the world’s data center power consumption, how long might it be before the rest of us get to see the same advances and advantages?
There are plenty of things smaller data center operators can emulate, such as running the data center at a slightly higher temperature – Google now runs its cool aisles at 80F (27C), for example. Research shows that the vast majority of data center equipment can perform just as well without being chilled – today’s more efficient hardware simply does not need the amount of cooling that early mainframes and other equipment did – and that for each degree Celsius that you raise the data center’s ambient temperature, you cut the power bill by two percent.
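To get a feel for what the two-percent-per-degree rule of thumb means in practice, here is a minimal sketch in Python. The function name, the baseline bill and the choice between compounding and simple addition are all assumptions made for illustration; the figure quoted above does not say how the savings stack up over several degrees, so both readings are shown.

```python
def estimated_bill(baseline_bill, degrees_raised, saving_per_degree=0.02, compound=True):
    """Estimate the power bill after raising the ambient setpoint.

    saving_per_degree=0.02 is the rule of thumb quoted in the text.
    Whether the saving compounds per degree or adds linearly is an
    assumption, so the caller can pick either reading.
    """
    if degrees_raised < 0:
        raise ValueError("degrees_raised must be non-negative")
    if compound:
        # Each degree removes 2 percent of whatever bill remains.
        factor = (1 - saving_per_degree) ** degrees_raised
    else:
        # Each degree removes 2 percent of the original bill.
        factor = max(0.0, 1 - saving_per_degree * degrees_raised)
    return baseline_bill * factor


if __name__ == "__main__":
    baseline = 100_000.0  # hypothetical annual power bill, any currency
    for degrees in (1, 3, 5):
        compounded = estimated_bill(baseline, degrees)
        linear = estimated_bill(baseline, degrees, compound=False)
        print(f"+{degrees}C: compounded {compounded:,.0f}, linear {linear:,.0f}")
```

Over a few degrees the two readings differ only slightly, so the simpler linear estimate is usually good enough for a first pass.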
Learning by example
There are caveats, of course. Google also does thermal modelling to locate hot spots, and during the design phase it arranges its equipment to even out the temperatures. Once the facility is operational, it will move its computer room air conditioners around to reduce hot spots if necessary.
Large data centers also take care to keep the hot and cool air separate – but they do it as cost-effectively as possible. The ducting can be as simple as a heavy plastic curtain or even sheets of cardboard. Similarly, empty rack slots are blanked off, and modern servers leave the factory with all the operator-accessible slots and sockets on the cool side and all the heat-generating elements on the other.
Then there are modular data center designs, an area that has seen considerable development at Facebook. One of the company’s innovations was to pre-fabricate chassis assemblies off-site – not for the racks, but for the frames that hold all the overhead services such as cable trays, power distribution and even lighting.
These frames, 12 feet wide and 40 feet long (3.7m by 12.2m), sit atop an adjacent pair of racks and have the major cooling-related benefit of enclosing the cool aisles, so when looked at from the side, the data center is shaped like a series of n’s. The frames have two further advantages. First, the services no longer need to hang from the roof trusses, so those can be lighter. Second, cool aisle enclosure allows the air handling machinery to be sited at the sides, instead of in an overhead air supply penthouse, which again saves structural steel.
Large data centers have demonstrated unusual ways of doing the actual cooling without using mechanical chillers, too. Some use evaporative cooling, by collecting the heat from the air with water-filled coils, then using cooling towers and fans to remove it from the resulting hot water. Others, such as Google’s coastal site at Hamina in Finland, use heat exchangers to transfer waste heat to cold sea water. Using recycled water or rainwater for cooling also reduces the environmental footprint for some of these sites.
More efficient power distribution is another cooling-friendly development. As top Google engineers are fond of saying, most important here are the last six feet (2m) and then the last 10 inches (25cm) of the power network. That means first the distribution within the rack, and then within the server itself.
So Google and others now prefer 48V within the rack, stepping down from AC to 48V in a single step for best efficiency, and then supplying 48V direct to the load, where it is converted to the sub-5V levels required. Google’s research has demonstrated that this approach cuts energy losses (and therefore waste heat) by more than 30 percent compared with using 12V within the rack. Even the company’s UPS systems are now running on 48V.
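One contributing factor is easy to illustrate: for the same load, a higher bus voltage means a lower current, and resistive loss in the conductors falls with the square of the current. The Python sketch below models only that conduction loss. The load and path resistance are hypothetical values, and the sketch ignores converter losses entirely, so it should not be expected to reproduce the 30 percent figure.

```python
def conduction_loss_watts(load_watts, bus_volts, path_resistance_ohms):
    """Resistive (I^2 * R) loss in the distribution path for a given load.

    Simplified model: the load draws a constant power, so current is
    load_watts / bus_volts. Converter losses are not modelled.
    """
    if bus_volts <= 0:
        raise ValueError("bus_volts must be positive")
    current_amps = load_watts / bus_volts
    return current_amps ** 2 * path_resistance_ohms


if __name__ == "__main__":
    load = 1000.0      # hypothetical load in watts
    resistance = 0.01  # hypothetical path resistance in ohms
    for volts in (12.0, 48.0):
        loss = conduction_loss_watts(load, volts, resistance)
        print(f"{volts:.0f}V bus: {load / volts:.1f}A, {loss:.2f}W lost in conductors")
```

Under these assumptions, quadrupling the voltage cuts the current to a quarter and the conduction loss to a sixteenth.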
But perhaps the most important lesson we can learn from the webscale operators is that different parts of infrastructure interact, and that decisions cannot be taken in isolation. No longer can systems, software, networking and facilities management be confined to their individual silos, where each has no interest in saving money for the others.
We all know that the energy you put into a system to do work has to come out again in some form or other. In the case of IT equipment, almost all the electrical power you feed in comes out as heat, with a little turning into sound, light and of course electrical signals. So anything you can do to minimise the power consumption and maximise the efficiency of your systems directly feeds through into how much waste heat they generate. Less waste heat in turn means you need much less cooling effort (which means power) to get rid of it.
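The knock-on effect can be sketched with a very simple model in Python: treat all IT power as heat, and assume the cooling plant removes heat with a fixed coefficient of performance (COP, the units of heat removed per unit of electricity used). The COP value and the load figures below are hypothetical, and real plants vary with weather and load, so this is an illustration rather than a sizing tool.

```python
def cooling_power_kw(it_load_kw, cop):
    """Electrical power needed to remove the heat from a given IT load.

    Assumes all IT power ends up as heat and that the cooling plant
    has a constant coefficient of performance (cop).
    """
    if cop <= 0:
        raise ValueError("cop must be positive")
    return it_load_kw / cop


def total_power_kw(it_load_kw, cop):
    """IT load plus the cooling power required to remove its heat."""
    return it_load_kw + cooling_power_kw(it_load_kw, cop)


if __name__ == "__main__":
    cop = 4.0       # hypothetical: 4 kW of heat removed per kW of electricity
    before = 100.0  # hypothetical IT load in kW
    after = 90.0    # the same workload on more efficient systems
    saved = total_power_kw(before, cop) - total_power_kw(after, cop)
    print(f"IT saving: {before - after:.1f} kW, total saving: {saved:.1f} kW")
```

In this model every kilowatt trimmed from the IT load saves a further 1/COP kilowatts of cooling power, which is why efficiency gains in the servers are worth more than they first appear.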
This is a big part of what the webscale companies do. For example, they design their own barebones servers, optimised for their specific workloads, and then loosely couple them in software-defined architectures. These use replication and redundancy to create cheaper and more flexible distributed scale-out systems that replace the complex monolithic systems typically used by enterprises.
This is a huge conceptual and practical change, driven both by scale and by the rapid maturation of software-defined technologies of all sorts. One challenge is that it doesn’t (yet) suit every user or application, and there are good reasons why general-purpose servers are built the way they are. Outside the realms of specialist and heavyweight business applications, however, simpler systems are often a more efficient and effective choice.
Cooling experts have spoken for years of the need to connect IT operations and facilities management. The webscale companies go further: for them it is also essential to bring in new approaches to hardware design, networking and software, because everything interacts.