On a grey July day, the barnacle-encrusted cylinder broke surface off Scotland’s Orkney Islands.
It might have been taken for some unexploded World War II ordnance, but this was bigger than any bomb. Twelve meters long, two meters in diameter, it was the size of the X Class midget submarines which trained off the Scottish coast in 1942.
But the gantry barge wasn’t retrieving a piece of military history. In harbor, pressure hoses revealed a gleaming white tube. This was more like a time capsule from the future.
The logo on the end made its identity clear: “Northern Isles,” or SSDC-002, was not lost treasure. It was the property of Microsoft.
This article appeared in Issue 38 of DCD>Magazine.
In 2018, Microsoft tethered it to land with power lines and fiber optic cables, and deliberately sank it. For the next two years, under 117 meters of seawater, 12 racks of IT equipment inside continued to run, processing workloads for the aptly named Microsoft Azure.
The underwater data center was the latest experiment in Project Natick, an ongoing effort to run servers unattended; to find whether the cloud can work underwater. In July 2020, it was time to retrieve the capsule and assess the results.
Natick began in 2013, when Microsoft researcher Sean James, previously a US Navy submariner, wrote a paper proposing underwater data centers. In 2014, the company decided to do it for real, and put together the Natick team under Ben Cutler.
In 2015, Cutler’s team took the plunge, sealing a single rack of servers in a vessel and sinking it in shallow water off the coast of California. The micro data center ran reliably for three months, before it was retrieved for assessment.
That was a proof of concept. Northern Isles - SSDC-002 - would be the real test. It had 864 servers and 27.6 petabytes of storage in a cylinder filled with unreactive nitrogen gas. It used a grand total of 240kW of power, working out to some 20kW per rack.
“The system is designed to do up to 900W/U, and has 42U racks, so we can manage more than 30kW per rack,” Cutler told DCD. The system worked - and it beat expectations.
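As a quick sanity check, a few lines of Python reproduce the arithmetic behind those figures. The numbers are the ones quoted above; the script itself is ours:

```python
# Back-of-envelope check of the power figures quoted in the article:
# 240kW total across 12 racks, and a design envelope of 900W per U
# with 42U racks.

total_kw = 240.0
racks = 12
per_rack_kw = total_kw / racks   # actual load per rack
design_kw = 0.9 * 42             # 900W/U x 42U, in kW

print(per_rack_kw)   # 20.0 - "some 20kW per rack"
print(design_kw)     # ~37.8 - "more than 30kW per rack"
```

So the deployed load ran at roughly half the design envelope.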
Cutler says SSDC-002 didn’t just equal land-based equivalents. In some respects it outperformed them. Microsoft has said that the hardware which spent two years in the underwater data center was eight times more reliable than equivalent servers running on land.
Cutler gave DCD some more detail behind that figure. Why did the seaborne servers do better than those on land? How was the comparison made? And what are the implications for the future?
There was nothing special about the kit in the box, Cutler explained: “Basically, this data center is designed to use off the shelf IT equipment.
It’s not even brand new equipment; the servers had been retired from a data center. We did put in brand new SSDs for boot drives and brand new hard disks for data. That’s literally standard, off-the-shelf equipment.”
Cutler’s team took a batch of nearly 1,000 servers that had been retired from a data center: “We split them between the land and the water, and ran the same workload in both places.”
855 of those servers went into the Northern Isles canister, and worked unattended underwater for 25 months and eight days. The remaining 135 servers toiled in a couple of racks of a normal data center, alongside hardware running Microsoft's Azure cloud.
Internal Microsoft projects sent jobs which were distributed fairly between the two sets of servers. “So the same thing was happening both in land and water,” says Cutler. “We wanted the same life experience for these servers, so just the environment is different.”
Neither group got any maintenance, he says, although “there was a lot of normal data center work activity going on all around [the land-based servers].”
SSDC-002 never handled customer data, but it ran so well it had spare capacity this year. Microsoft donated that to the fight against Covid-19, through the Folding@home project to understand the protein structures of the virus. Cutler explains: “We turned over all of our data center assets on Project Natick, through to decommissioning.”
Long cables to land could look like a security risk, so SSDC-002 picked up on modern security methods: “The path between the data center back to land was protected by post-quantum cryptography. If somehow, there was a quantum computer and someone was listening in, they wouldn’t have gotten anything. So we were a testbed for a variety of other technologies beyond straight data center technologies.”
After two years, the aquatic servers outperformed those on land. “From the 135 land servers, we lost eight,” says Cutler. “In the water, we lost six out of 855.” Cutler makes a point of saying these failure rates - of untended second-hand systems - aren’t comparable outside the project.
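Those raw counts are where the "eight times" headline figure comes from. Working through the arithmetic with the numbers Cutler quotes:

```python
# Failure rates implied by the figures Cutler gives for the
# two-year experiment.
land_failed, land_total = 8, 135
sea_failed, sea_total = 6, 855

land_rate = land_failed / land_total   # ~5.9% failed on land
sea_rate = sea_failed / sea_total      # ~0.7% failed underwater

ratio = land_rate / sea_rate           # ~8.4x
print(round(land_rate * 100, 1), round(sea_rate * 100, 1), round(ratio, 1))
```

The land cohort's failure rate was roughly 8.4 times the underwater cohort's, which Microsoft rounds to "eight times more reliable."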
The temperature in the underwater vessel was steady, and there were no passing people. But it’s likely that a major factor in the aquatic servers’ reliability was the inert nitrogen atmosphere. Earth’s atmosphere is mostly nitrogen, but the 21 percent that is oxygen, which we need to breathe, is reactive: “Oxygen reacts with stuff which causes corrosion. So our hypothesis was that removing the oxygen, and going to this nitrogen atmosphere, could result in higher reliability.”
Natick did detailed tests on long term changes to this atmosphere (see box), but the major finding is that - unsurprisingly - IT is very comfortable in an unreactive nitrogen atmosphere.
Once the unit was ashore, the internal data center was sent to Microsoft’s partner, Naval Group in France, so the equipment could be analyzed in detail, says Cutler: “to understand how different parts wore, relative to expectations.”
The eight servers which failed are probably the most intensively examined pieces of IT hardware ever, as Microsoft does a root cause analysis: “We have things like scanning electron microscopes, we have the ability to X-ray parts, and do detailed surface analysis.”
That analysis was nearly complete when we spoke to Cutler, and no surprises had yet emerged. But one conclusion seems to be that hardware can be more robust than expected. “There’s a bathtub curve for the lifetime of parts, and a sweet zone for temperatures. If you’re too hot or too cold, you can have problems. We were outside the sweet zone on the hard disks: we operated them colder than normal, and that did not hurt us. Sometimes people have preconceptions about what matters.”
Normal data centers maintain a steady temperature and humidity, and are concerned about airflow. With a sealed container, the Natick team also had to include equipment to vary pressure. “Remember your ideal gas law from school? Now if we raise the temperature, the pressure goes up. So things are a little bit different in this environment.”
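Cutler's point can be sketched with the constant-volume form of the ideal gas law, P1/T1 = P2/T2. The temperatures and starting pressure below are illustrative assumptions of ours, not Natick's own figures:

```python
# Ideal gas law in a sealed, fixed-volume vessel: with n and V
# constant, P1/T1 = P2/T2. Illustrative numbers: warm the gas
# from 15 C to 35 C, starting at one atmosphere.

p1_kpa = 101.325            # starting pressure, kPa
t1_k = 15.0 + 273.15        # starting temperature, kelvin
t2_k = 35.0 + 273.15        # warmed temperature, kelvin

p2_kpa = p1_kpa * t2_k / t1_k
print(round(p2_kpa, 1))     # ~108.4 kPa, a roughly 7% rise
```

Even a modest temperature swing moves the pressure noticeably, which is why the sealed vessel needed its own pressure management.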
Cooling was by an air-to-liquid heat exchanger between each pair of racks, says Cutler: “Each of those heat exchangers has data center fans on it, that push the air through as needed.” Seawater was pulled in from outside and run through the heat exchanger, and back out to the ocean.
That’s a big plus for Cutler: “Data centers can use a lot of water for cooling, but we don’t use any potable water. We’re just driving sea water through and then right back out. That allows us to deploy this anywhere we want to, without the need to tap into a water supply.
“That’s important, because there’s a lot of places on the planet where water is the most valuable natural resource. Even right now in the United States, half the country is in drought conditions, and we don’t want to compete with consumers and businesses for that.”
The effectiveness of the cooling also means that Natick’s data centers can be deployed in seas from the Arctic to the equator, says Cutler. They wasted very little energy in cooling, so most of the unit’s power could go to the servers, giving it a power usage effectiveness (PUE) of only 1.07.
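For readers unfamiliar with the metric, PUE is total facility power divided by IT power, so 1.07 leaves only about 7 percent overhead for cooling and everything else. A quick sketch, pairing that PUE with the 240kW figure quoted earlier (our combination of the two numbers, not Microsoft's):

```python
# PUE = total facility power / IT power.
pue = 1.07
it_kw = 240.0                    # IT load quoted earlier in the article

total_kw = pue * it_kw           # implied total draw of the facility
overhead_kw = total_kw - it_kw   # power spent on everything but IT

print(round(total_kw, 1), round(overhead_kw, 1))   # ~256.8 and ~16.8 kW
```

By comparison, a conventional data center with a PUE well above 1.1 would burn considerably more of its total draw on cooling.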
A low PUE is good, but did the SSDC-002 affect its local environment? “The water we discharge is a fraction of a degree warmer than what comes in from the ambient ocean. And it’s a very small percentage of the water that’s passing outside. So literally a few meters downstream, you can’t measure the temperature difference anymore.”
Heat dispersal is important: “Wherever we would put these things, we would look for persistent currents, and the density at which we put these things is small, so we don’t have any harmful local effects.”
In fact, says Cutler, “the sea life likes these things. It’s an artificial reef. It becomes a nice place to go foraging for stuff, and it’s a place to hide from bigger creatures.”
The cylinder was supporting large healthy anemones when it came ashore - but the seabed where it lay has returned to the way it was before Northern Isles arrived.
Although renewable energy was not in the scope of this project, Scotland has a lot of local green energy sources, and Orkney is home to the European Marine Energy Centre, a test bed for wave and tidal power generation. “It’s a facility where people go to test these renewable energy devices, and they have ‘berths’ you can lease. We actually leased one of those berths in the wave energy area.” SSDC-002 was connected to the same grid, but as a consumer: “One of the things we liked up there was the dynamic that it was a renewable environment. That’s consistent with the direction we want to go.”
The obvious question now is: what next? “This statistical result is very strong,” says Cutler. “But what do you learn in terms of tangible things you can go off and do? We know some next steps to try and do on land.”
Will there be more subsea facilities? Previously, Cutler has said the sea could be a haven for data centers, and he’s still keen to see it happen. The environment there is anything but harsh, with cooling available for nothing, he says. And real estate is cheap.
Finally, the sea floor is actually a convenient location, with more than half the world’s population living within 120 miles of the sea: “It’s a good way to get applications close to people, without the societal impact of building a giant data center in London.”
In 2017, Cutler filed a patent for Microsoft, describing a large underwater data center in which eight or more containers are lined up as an artificial reef.
Such a site could benefit from renewable energy: “Europe’s the leader in offshore wind, and a lot of those offshore wind farms are relatively close to population too. So imagine a data center, deployed at scale, and colocated with an offshore wind farm. Now I don’t have long distance power lines to get my power to the data centers, and I’ve taken out a bunch of capital costs and a bunch of risk from all those transformers in the power lines.”
Given the steady power from some wind farms, he says, “imagine a data center that has no backup generators, that has no batteries. It sits there, it’s a small fraction of the overall size of the wind farm, and it draws power from it. On a rare day, when there is no wind, it pulls power from land.
“Now, that’s not quite how that infrastructure works today. But it gets us to a mode where we take out a lot of capital costs, a lot of risks, and become much more environmentally friendly relative to infrastructure today. Batteries are an environmental challenge, and a supply challenge as cars and other things adopt batteries more broadly. So we like this idea of truly locally generated renewable power, close to customers, with very good environmental characteristics.”
Learning on land
But right now, it’s too soon to say whether Microsoft will follow up SSDC-002 with a bigger seabed facility. And Cutler says Microsoft has learned plenty, even if it never puts another data center underwater.
“We want to understand what learnings we can take from this experience and bring back to land data centers,” he says. “One aspect of the analysis that’s going on now is to understand that, and then maybe spin up some work that would be low impact, and improve reliability on land.”
“In a normal data center if something breaks, someone goes in to replace the part,” says Cutler. “In this case, we can’t do that. If something dies, it’s broken, whether it happens the minute before we take it out of the water, or right after it goes in the water.”
In fact, that model is very like the new data centers emerging on land, in remote locations, at the Edge of the network. “They will tend to be lights out, like what we did. We operated this thing for 25 months and eight days, with nobody touching it.
“And when you think about the Edge you’re gonna end up with things that operate on their own. People don’t go there for a long time because it’s too hard to get there.”
Edge data centers will tend to be identical units, deployed to varying environments, and Cutler says this process could look like an extension of the Natick idea: “The vision here is a handful of global factories with a controlled environment. You manufacture the shells, inject the servers, seal them, and you can quickly deploy them, and have a much more common experience for the servers, regardless of where we send them.”
One problem with lights-out operation has been the need to keep upgrading hardware, but that could lessen as the continued performance improvements predicted by Moore's Law come to an end.
“A huge percentage of the cost of a data center over its lifetime is the servers. In a post Moore’s Law world, there’s really no reason to change the infrastructure every two years,” he says. In this world, it will pay to arrange longer life expectancies, “because that then drives out not just cost, but environmental impact.”
He’s talking about the embodied energy and materials in hardware, as well as shipping costs and warranty work. “All that might be better spent on other things like designing smarter, better machines, rather than a lot of churn.
“High reliability is not just for Edge,” he says. “Since the 1980s, we’ve been on this curve of increased reliability. We’re trying to drive further out on that curve.”
SSDC-002 sounds historic, but it won’t end up in a museum. Cutler’s team took their commitment to recycling to extremes and, when the equipment had been dismantled and tested, recycled the canister.
When Cutler spoke to us, it had already been cut up and was scheduled to be melted down. After all, he says, the value of the project is in what we can learn from it, not in the metal container.