Christian Belady, Microsoft’s general manager of data center services, says he has tasked his team with thinking about a 10X scale out in data centers.
Does that mean that Microsoft is going to spend ten times its US$15bn investment to date on new data centers?
This seems unlikely within any normal budgetary forecasting horizon.
But Belady says he believes Microsoft is working on data center build outs at a scale no-one has ever seen before.
What type of facilities are being planned? Where are they being built? How many? How big? We know there is more capacity under construction but the details are sketchy.
While that information remains inside Microsoft, the company did open its doors at its fully operational Dublin mega-scale data center and is being more open about its cloud-scale data center development and deployment strategy.
Obviously, no company is obliged to divulge details of how it builds and operates its data centers. But a company that has faced criticism in the past over its data center energy use understands that some level of transparency is required – not least to reassure its existing and future cloud services customers of the sustainability and stability of its infrastructure.
The firm’s open-door policy includes a revamped website featuring videos, whitepapers and blogs.
Why is it there?
Following a trip to Dublin to see its ‘colo halls’ in the original 330,000 sq ft facility, and a walk-through of its latest 132,000 sq ft extension, Belady told me that the only way to achieve the company’s goals in the cloud data center space is to industrialize.
In Microsoft’s case, the company says it will continue to develop large-scale powered shells with colo halls, containerized systems within those shells, and ITPacs – its pre-built, ready-to-deploy data centers.
The ultimate goal is to replace the failure resilience which is currently built into the physical layer hardware of generators, switchgear, circuit breakers, uninterruptible power supplies (UPSs) and power distribution units (PDUs) or the IT hardware itself with logic-based application resilience.
It means abstracting the applications from the hardware. And it means designing the IT hardware for the applications and for the form factors in which it is placed.
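The idea of logic-based application resilience can be sketched in miniature: instead of generators and UPSs keeping one location alive, the application itself retries against another redundancy zone when a call fails. This is a hypothetical illustration only – the zone names, the failure model and the retry logic below are assumptions, not Microsoft’s actual (and proprietary) software.

```python
# A minimal sketch of application-level failover across redundancy
# zones. All names and the failure behavior are hypothetical.
ZONES = ["zone-a", "zone-b", "zone-c"]

def call_zone(zone: str) -> str:
    # Stand-in for a real service call; here zone-a is "down".
    if zone == "zone-a":
        raise ConnectionError(f"{zone} unavailable")
    return f"response from {zone}"

def resilient_call() -> str:
    """Try each redundancy zone in turn, so the application survives
    a zone outage without any backup power at the failed site."""
    last_error = None
    for zone in ZONES:
        try:
            return call_zone(zone)
        except ConnectionError as err:
            last_error = err
    raise RuntimeError("all zones failed") from last_error

print(resilient_call())  # zone-a fails, zone-b answers
```

The design point is that the resilience budget moves from the physical layer (diesel, switchgear, UPS) into a few lines of routing logic.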
When pressed for exactly what this means, i.e. what software is being developed and where it resides in the stack, Microsoft handlers say this is the company’s Intellectual Property. It is not a big Data Center Infrastructure Management (DCIM) system, although DCIM does play a part.
Microsoft GM of Global Foundation Services Christian Belady
How did it get there?
All this equipment – the generators and chillers the company now plans to ‘sunset’ – was deployed because traditionally built data centers were designed in silos, separate from the IT equipment they house and far removed from the applications they run.
But since Microsoft undertook its data center journey – its original data centers having been built to get services to the market quickly – it became obvious that not all applications are equal. Microsoft data centers were designed for the highest availability based on “very conservative practices still used by most of the industry today”.
Belady says Microsoft Global Foundation Services took a look at its varied business requirements. The company says it has 200 online services. It lists MSN, Bing, Office 365 and Windows Live as examples.
“It is all integration. Part of what you are seeing is standardization and what we’re embarking on is integrating the software within the systems. Pretty soon the app is controlling the data center,” Belady says.
But Belady has greater ambitions and believes to achieve the operational scale that is required to serve cloud applications the data center sector must re-evaluate everything.
“Even the power infrastructure was far too complex. We’re looking at how we do an integration with the utility to reduce the complexity of how the power is delivered to and used within the data center.”
“We have five nines, three nines, highly available traditional builds, containers and ITPacs. However as we move forward, there is a convergence. As the apps become more resilient, requirements start converging. Chicago (Microsoft’s Chicago data center) has containers downstairs which operate at three nines – with no generators in place – while upstairs the data center has generator back up.”
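The difference between those availability tiers is easy to quantify. As a rough sketch (a standard back-of-the-envelope conversion, not a Microsoft figure), each extra “nine” cuts the allowed downtime by a factor of ten:

```python
def annual_downtime_hours(nines: int) -> float:
    """Downtime per year implied by an availability of n nines
    (e.g. 3 nines = 99.9%, 5 nines = 99.999%)."""
    unavailability = 10 ** (-nines)
    return unavailability * 365 * 24

# Three nines permits roughly 8.76 hours of downtime a year;
# five nines permits only about 5.3 minutes.
print(round(annual_downtime_hours(3), 2))       # hours/year at 99.9%
print(round(annual_downtime_hours(5) * 60, 1))  # minutes/year at 99.999%
```

Seen this way, running the Chicago containers at three nines without generators is a deliberate trade: hours of tolerable downtime per year, absorbed by resilient applications rather than diesel.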
This, Belady says, is part of Microsoft’s constant efforts to drive costs out of traditional design. This can be seen in the company’s efforts in Boydton and Columbia where ITPacs are deployed.
“In our move forward there is almost no difference [between traditional form factors and modular designs]. I don’t call it a modular design. What you’re seeing is the morphing of the two. The opportunity is in how we converge them. It is a module that is assembled – it is no different from a colo hall – you build it up in response to demand.”
Belady says not all applications are equal because otherwise Microsoft would have only one form factor. It is a hand-in-hand design – apps and design work together. It is about redundancy zones and making these predictable within the form factor.
MS Dublin, hot and cold
Removing large plant
“Every time I go into my own data centers I ask why do we need maintenance reserve capacity?” Belady says. “We have had great successes so far [in operational efficiency]. We’re using outside air, and that was a big step (to do so at scale).”
“We had centralized UPSs and now they are out in the racks. I ask if we can get rid of generators. We have some data centers that no longer have generators. That’s because of the close collaboration with the application developers and an integrated approach.”
Microsoft points out that while it does run data centers without generators, this is in certain expanded portions of facilities where SLAs allow it. Not every new or expanded data center will be built without generators, but the company is continually looking to do so.
Developing new capacity
In terms of new data center facility developments, Belady has a permanent research team. “We’re continually looking at sustainability, looking at how much material is required. We look to use recyclable materials such as steel,” he says.
“I look to my team. I’m running one of the biggest data center development teams in the world. We have 35 different development criteria – we look at a lot of different things: Proximity to population, what are the business apps running in that region, if there are other data centers in the region, energy security and the mix of energy. Total cost of long-term energy operations, political stability and risk. We essentially have a market research team which looks at all this.”
What is measured
“I’ve got all these inputs into making decisions – cost is the important metric – optimizing the whole and not just the power usage effectiveness (PUE). We’ve come such a long way. Today PUE is getting close to 1. When we were at a PUE of 3 it was really useful because when it was designed the industry sucked at measuring anything at all,” Belady says.
As the man who conceived PUE “on a plane back from Tokyo”, Belady feels many in the industry misuse the measurement.
“As one who started the PUE metric I have a close relationship with it. It is a tool and should be used as such – we should be using all the tools that are available to us.”
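For readers outside the industry, the metric Belady conceived is a simple ratio – total facility power divided by the power that actually reaches the IT equipment – with 1.0 as the ideal. The figures below are illustrative, not Microsoft’s:

```python
def pue(total_facility_kw: float, it_load_kw: float) -> float:
    """Power Usage Effectiveness: total facility power divided by
    power delivered to the IT equipment. 1.0 is the ideal; the
    overhead above 1.0 is cooling, power conversion losses, etc."""
    return total_facility_kw / it_load_kw

# A facility drawing 1,200 kW overall to feed a 1,000 kW IT load:
print(pue(1200, 1000))  # PUE of 1.2
```

A PUE of 3, the figure Belady cites from the metric’s early days, would mean two watts of overhead for every watt of computing – which is why the metric was so useful then, and why it loses discriminating power as facilities approach 1.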
Today Microsoft says it looks way beyond PUE when considering metrics. “Operating at lower costs is sustainability in action. I’ve got a team that looks at land, building, servers. In construction we look at cost per MW. The one metric that can be transferred is TCO (total cost of ownership). We make it so that our cost to run compute is as low as possible. Sustainability and low cost are the same thing.”
Lest we forget why Belady is building out this cloud-scale infrastructure – it is all in the context of Microsoft’s business needs: That of selling its software applications over a network.
The cloud has yet to deliver, says Belady. While it is clear that enterprise data centers cost a lot to design, build and operate, cloud providers have yet to prove that the business model for their services makes sense for enterprises that operate at global scale. The cloud may not yet have delivered a big enough delta to convince enterprises that it is the correct investment. This is the scale argument. Once the cloud starts delivering business value on standard environments, the traditional $100m cost for the enterprise becomes $10m. That’s a delta that is difficult to argue against.
How this is measured is interesting. “When we look at buying servers we look at application performance per watt or rather ‘performance, per dollar per watt’. And this is absolutely critical or else what we do is useless to an enterprise,” Belady says.
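That server-selection metric can be sketched as a single ratio. The formula and both server configurations below are hypothetical illustrations of the idea, not Microsoft procurement data:

```python
def perf_per_dollar_per_watt(perf: float, cost_usd: float, power_w: float) -> float:
    """Illustrative server-selection metric: benchmark performance
    divided by purchase cost and by power draw. Higher is better."""
    return perf / (cost_usd * power_w)

# Two hypothetical server models:
server_a = perf_per_dollar_per_watt(perf=10_000, cost_usd=5_000, power_w=400)
server_b = perf_per_dollar_per_watt(perf=12_000, cost_usd=7_000, power_w=550)
print(server_a > server_b)  # A wins despite lower raw performance
```

The point of folding power into the denominator is that at cloud scale the electricity bill rivals the hardware bill, so raw performance alone is a misleading buying signal.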
“The integrated approach is vital and we, the cloud providers, will be the innovators. Who is best placed to respond to the technology developments? It is the cloud provider. The cloud providers have a much faster adoption rate for new ways of operating data centers – look at air-side economization at scale. And they will also integrate faster with the utilities.”
To date, Microsoft’s all-in investment in data centers is up to US$15bn. “Look at the size of facilities like Dublin – and we’ve got other facilities that are that size. Look at the Windows 8 ecosystem – it reaches into the Cloud. Nothing that we’re seeing is suggesting that growth won’t continue,” Belady says.
This is as close as Belady comes to officially confirming that Microsoft is embarking on a huge, many-form-factor capacity build-out. (Unofficial word from the contractor world is that Microsoft is on a major capacity drive.)
In Dublin I meet with David Gauthier, the man who designs much of Microsoft’s fleet. He said he’d been on nine aeroplanes in five days. He didn’t say if he’d been flying East to West or North to South. Nor did he say exactly where those planes had landed.
Read what he had to say about what is going on inside Microsoft’s data centers in Part II of this article or read it now in the May/June 2013 DatacenterDynamicsFOCUS digital edition which is available now.