In its quest for simplicity, Microsoft is designing many of the traditional engineering elements out of the data center, including the backup generator.
In its move towards the software-resilient cloud data center, it has stripped away the building, leaving modules sitting outside at its latest facility in Boydton, Virginia, powered only by the grid. And it is not the only facility running without diesel protection: Microsoft has taken the same approach in Chicago and Quincy.
“There are tens of megawatts in this data center and other data centers Microsoft runs. They do not have diesel generators behind them,” Microsoft GFS Director of Data Center Architecture & Design Management David Gauthier said.
“We did a lot of looking at our grid that we connected into - its reliability and what the grid is looking like - and then decided if there is a way we can take advantage of that.
“Many people say that’s your backup, what are you doing? In our Chicago data center we actually had a couple of good blips that dropped megawatts of load. The application keeps on humming. The user gets redirected to another data center and we bring facility back up online, everything hunky dory, and we move on.”
Gauthier said Microsoft has removed the cost of maintaining and repairing generators, as well as the footprint they occupy.
Essentially, the data centers work as part of a virtual cycle – workloads can easily be provisioned to other Microsoft cloud data centers.
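The redirect behavior Gauthier describes can be pictured with a minimal sketch. It is illustrative only and is not Microsoft's implementation: the `DataCenter` class, the `route` function and the `has_power` flag are hypothetical names, and the sketch assumes a site is simply either powered by the grid or not.

```python
from dataclasses import dataclass


@dataclass
class DataCenter:
    name: str
    has_power: bool = True  # assumption: the grid is the only power source


def route(preferred: str, sites: list[DataCenter]) -> str:
    """Return the name of the site that should serve a request.

    The preferred site is used while it has power; otherwise the request
    is redirected to the first other site that is still powered.
    """
    by_name = {site.name: site for site in sites}
    home = by_name.get(preferred)
    if home is not None and home.has_power:
        return home.name
    for site in sites:
        if site.has_power:
            return site.name
    raise RuntimeError("no data center available")


sites = [DataCenter("chicago"), DataCenter("quincy")]

before = route("chicago", sites)   # normal operation
sites[0].has_power = False         # grid blip drops the Chicago site
during = route("chicago", sites)   # users are redirected elsewhere
sites[0].has_power = True          # facility is brought back online
after = route("chicago", sites)    # traffic returns to the preferred site
```

In this toy model the resilience lives entirely in the routing logic, which is the point of the approach: no generator is needed to keep any single site alive.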
The Boydton facility uses modular containers, similar to Microsoft’s Chicago facility, which Gauthier said was just one part of the evolution of Microsoft’s cloud infrastructure.
“In Chicago the containers sit inside a big warehouse – it’s nice, it’s comfortable, it’s easy to operate and it’s a learning experience for us about how these things operate inside. It is about getting our operations people comfortable with it,” Gauthier said.
But after learning in Chicago, where Microsoft opened a modular facility in 2009, Gauthier said it was time to move the containers outside. He and his team now hope that lessons learned at Boydton will show that even more can be removed in future modular builds.
“We are already learning things from this design,” Gauthier said. “Because software is resilient, it can evolve, it can change and evolve which means the data center will change, and become lower cost.”
Gauthier said this is made possible by thinking about the full stack.
“We treat the whole container space and applications as one converged system, so the application that runs inside of these is also our building management system, fire system. It is all managed as one operating application,” Gauthier said.
Chicago was Microsoft’s first software resilient data center, Gauthier said. It was designed so that the whole data center could go off the grid and keep carrying the load its containers supported.
“The container is the failure building—basically it is a container-sized computer,” Gauthier said.
The approach has also removed one of the biggest risks in the data center – human intervention.
“There is a tonne of complexity in having resiliency. There are so many ways to wrap these PDUs (power distribution units) etc around the building but in doing that we were making a lot of human errors,” Gauthier said.
“So we started to look at taking out some of that capability. It has been good for us and it’s kind of what we call ‘fail small’. We are able to really compartmentalize the failure.
“Cloud environments don’t deal well with mixed mode failures. Binary failures, losing all ten racks, are much better than losing one rack. This is harder to deal with from a software standpoint.”
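One way to read the preference for binary failures is that software treats a whole container as either up or down, never partly degraded. The sketch below illustrates that reading under stated assumptions. It is not drawn from Microsoft's systems: `container_state`, `usable_containers` and the per-rack health lists are hypothetical.

```python
def container_state(rack_ok: list[bool]) -> str:
    """Collapse per-rack health into a binary container state.

    Assumption: any failed rack takes the whole container out of service,
    so the scheduler never has to reason about a half-working container.
    """
    return "up" if all(rack_ok) else "down"


def usable_containers(containers: dict[str, list[bool]]) -> list[str]:
    """Return the names of containers that are fully healthy."""
    return [
        name
        for name, racks in containers.items()
        if container_state(racks) == "up"
    ]


containers = {
    "c1": [True] * 10,            # all ten racks healthy
    "c2": [True] * 9 + [False],   # one rack lost: treated as a full failure
    "c3": [False] * 10,           # all ten racks lost
}

usable = usable_containers(containers)
```

Here the mixed-mode case (`c2`) is handled exactly like the total failure (`c3`), so the placement logic only ever sees two states.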