
When you run data center infrastructure for a company like Facebook, you face problems few have faced before. One of them is storing all the data users upload to your systems, and making sure that data is available to them on a whim, for as long as they want, without spending your entire IT budget on buying and housing ever more storage capacity.

By tracking the patterns with which users access their files, Facebook engineers have figured out that not every uploaded file needs to be available for instant delivery at any given second. In fact, the majority of uploaded files reach a point in their lifecycle where they are seldom accessed, if ever.

The company's infrastructure team figured it could create a separate storage system optimized for volume and cost rather than performance and availability, and make it every file's destination once it reaches the end of its viewing prime. Facebook is building a dedicated facility for this system, which the company calls Cold Storage, next to its massive primary data center in Prineville, Oregon, optimizing the building's design for its specific use.

Stephen Chan, a technical program manager at Facebook, took an audience at the last Open Compute Summit in Silicon Valley on a deep dive into the social network's cold-storage systems.

A modified Open Vault

The company's hardware team used a modified version of Open Vault, the custom-designed storage system used in its data centers. The typical Open Vault is a 2U chassis holding 30 drives that can operate with almost any host server.

Open Vault chassis are also installed into Facebook's custom-designed data center racks. The rack specification (like the Open Vault one) is now available as an open-source project through the company's Open Compute Project. Called Open Rack, it also had to be modified specifically for cold storage.


The overarching goals of the design effort were to reduce power consumption and cost of storage, while increasing density, Chan said.


One of the ways the design team reduced power consumption was by reducing the number of head nodes per amount of disk. The Open Rack for cold storage has a total of two head nodes, each controlling eight Open Vault systems.

While the number of head nodes was reduced, the number of disk drives was increased. The Open Vault storage rack went from 270 drives to 480, giving Facebook nearly two petabytes of storage capacity per rack.
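The per-rack figures can be checked with quick arithmetic. The article gives only the drive counts; the 4TB drive size below is an assumption chosen to match the "nearly two petabytes" figure:

```python
# Back-of-the-envelope check of the cold-storage rack capacity.
# Drive counts are from the article; the 4 TB drive size is an assumption.
DRIVES_PER_VAULT = 30    # drives per 2U Open Vault chassis
VAULTS_PER_HEAD = 8      # Open Vaults controlled by each head node
HEAD_NODES = 2           # head nodes per cold-storage rack
DRIVE_TB = 4             # assumed drive capacity, in terabytes

drives = DRIVES_PER_VAULT * VAULTS_PER_HEAD * HEAD_NODES
capacity_pb = drives * DRIVE_TB / 1000  # 1,000 TB per PB

print(drives, capacity_pb)  # 480 drives, 1.92 PB ("nearly two petabytes")
```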

The team was able to fit more drives because of the space freed up by removing extra head nodes and by going from three power shelves to one. A power shelf is where servers in a Facebook rack get their power from.


The cold-storage power shelf has seven hot-swappable power supplies, meaning six-plus-one redundancy.


Another way to reduce power consumption was making sure that only one drive is spinning per tray at any one time, meaning only 32 of the 480 drives in the rack are spinning at any moment. Since files are accessed infrequently, idle drives are spun down and powered off.
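The 32-drive figure follows from the tray layout. A sketch, assuming the standard Open Vault arrangement of two 15-drive trays per chassis (the tray split is an assumption; the rack totals are from the article):

```python
# Sketch of the one-spinning-drive-per-tray rule.
# Assumes each 30-drive Open Vault holds two 15-drive trays.
TRAYS_PER_VAULT = 2
vaults = 2 * 8                    # 2 head nodes x 8 Open Vaults each
trays = vaults * TRAYS_PER_VAULT  # trays in the rack
spinning = trays * 1              # one active drive per tray

print(spinning, round(spinning / 480 * 100, 1))  # 32 drives, ~6.7% of the rack
```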

Cold storage allows that kind of flexibility. The engineers figured they could get away with the time it takes to spin up a disk to retrieve an old file, Chan said.

In the end, they were able to bring down the rack's total power consumption to 2kW, which according to Chan is about one quarter of the power of a typical hot-storage rack at Facebook.


The first target in trying to reduce cost of the solution was storage media. The team had to find something cheap that would be able to sustain a multi-drive environment.


In cold storage, Facebook is using SATA drives that are not rated for running around the clock but are capable of spinning up and down. They do not have high rotational-vibration performance or accelerometers, and they are not hot-swappable.


If one of them fails, a tech will take down the system, replace the drive and put it back into production.


No back-up for back-up

The cold-storage racks do not have any traditional power backup, such as uninterruptible power supply (UPS) systems and diesel generators. They are also not plugged into the 48V battery cabinets Facebook uses for backup power at its primary data centers.

The data center team treats cold storage as back-up, and, as Chan said, there is no reason to back up the back-up.

The reason Facebook decided to build a separate building for cold storage was the combination of low power consumption and high storage capacity the cold-storage systems require. The team calculated that the current cold-storage data hall in Prineville will take 1.5MW to power one exabyte of storage.
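The 1.5MW-per-exabyte figure is roughly consistent with the rack numbers earlier in the article. A sketch; the gap between the computed IT load and 1.5MW would be facility overhead such as cooling and power distribution, which is an assumption here, not a figure from the article:

```python
# Rough consistency check of the 1.5 MW-per-exabyte figure
# using the per-rack numbers quoted earlier in the article.
RACK_CAPACITY_PB = 2.0  # "nearly two petabytes" per rack
RACK_POWER_KW = 2.0     # cold-storage rack draw

racks_per_eb = 1000 / RACK_CAPACITY_PB            # 500 racks per exabyte
it_power_mw = racks_per_eb * RACK_POWER_KW / 1000  # 1.0 MW of raw IT load

print(racks_per_eb, it_power_mw)
```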

The main Prineville data center would quickly run out of space with this power-to-disk ratio, Chan said. Like the storage arrays themselves, the cold-storage facility is a bare-bones, stripped-down building, with concrete slabs holding 2,800-pound racks arranged in hot-air-contained aisles cooled by a dedicated cooling system.

Just like the librarian who goes away for a while to find that obscure book you asked for, you may have to wait for a few seconds to retrieve a photo from your vacation three years ago. Facebook's software has to reach the cold-storage facility where it may have to spin up a drive before it can pop that image on your screen.


A version of this article appeared in the 29th edition of the DatacenterDynamics FOCUS magazine, out now.