Web scale data centers: software defined because they have to be

Google, Microsoft and Facebook design their own software defined data center technologies

1 August 2013 by Yevgeniy Sverdlik - DatacenterDynamics

An aisle at Facebook's Lulea, Sweden, data center

Many of the innovative ideas that IT vendors and service providers eventually productize are implemented first by big Internet or cloud companies, and often originate there, out of necessity. This has been the case with software defined data center technologies. Companies like Google and Microsoft designed their own versions of solutions in this class because they needed them but could not find anything on the market that quite fit the bill.

 

The rules change once you reach the scale of a Google or a Microsoft. Microsoft, for example, provides 200 cloud and online services to 1bn individual customers and 20m businesses around the world. Companies of this caliber have realized that nobody is better suited than themselves to design the infrastructure that supports their scale.

 

Google’s software defined WAN

Google’s homegrown hardware includes switches, and the company created its own Software Defined Network (SDN) technology to manage them. While it was getting economies of scale out of its compute and storage infrastructure, the Wide Area Network (WAN) its services rely on so heavily was not scaling the same way. “We need to manage the WAN as a fabric and not as a collection of individual boxes,” Google engineers wrote in a white paper the company released on its SDN approach.

 

Google’s WAN is organized into two backbones: an Internet-facing one and an internal one that carries traffic between Google data centers. The latter is managed via SDN. Data center sites, each of which has multiple switch chassis, are all interconnected and managed through OpenFlow controllers. Google also has a proprietary traffic-engineering service, which collects utilization and topology data from the network and assigns paths for traffic flows, programming switches through OpenFlow.
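
To make the idea of a central traffic-engineering service more concrete, here is a minimal sketch in Python. It is not Google's algorithm, which is proprietary and not described in this article. It assumes a simple greedy approach: for each flow, find the fewest-hop path whose links all have enough spare capacity, then record a per-site next hop that a controller could push to switches. The function names, site names and data layout are all hypothetical, and no real OpenFlow library is used.

```python
from collections import deque


def find_path(links, residual, src, dst, demand):
    """Breadth-first search for a fewest-hop path with enough spare capacity."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in links.get(node, ()):
            # Only follow links that can still carry the whole flow.
            if nxt not in parent and residual[(node, nxt)] >= demand:
                parent[nxt] = node
                queue.append(nxt)
    return None  # no path with enough spare capacity


def assign_paths(capacity, utilization, flows):
    """Assign a path to each flow and derive per-site forwarding rules.

    capacity, utilization: {(site_a, site_b): Gbps} for directed links.
    flows: list of (flow_name, src_site, dst_site, demand_gbps).
    """
    links = {}
    for a, b in capacity:
        links.setdefault(a, []).append(b)
    residual = {l: capacity[l] - utilization.get(l, 0) for l in capacity}

    placed = {}  # flow name -> list of sites, or None if unplaced
    rules = {}   # site -> {flow name: next-hop site}
    for name, src, dst, demand in flows:
        path = find_path(links, residual, src, dst, demand)
        placed[name] = path
        if path is None:
            continue
        for a, b in zip(path, path[1:]):
            residual[(a, b)] -= demand          # reserve capacity on the link
            rules.setdefault(a, {})[name] = b   # what a controller would program
    return placed, rules


# Hypothetical three-site topology; the direct A->B link is already busy.
capacity = {("A", "B"): 10, ("A", "C"): 10, ("C", "B"): 10}
utilization = {("A", "B"): 8}
flows = [("copy1", "A", "B", 5), ("copy2", "A", "B", 2)]
placed, rules = assign_paths(capacity, utilization, flows)
```

In this toy topology the larger flow is steered around the congested direct link through site C, while the smaller one still fits on the direct link. A real service would also have to handle priorities, failures and splitting flows across paths, none of which is attempted here.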

 

Google admits in its white paper that OpenFlow is not perfect, but says it does the job for many network applications. Google spokespeople declined to comment further.

 

Microsoft is software defined

Microsoft has gone beyond SDN, building out a global infrastructure managed entirely via software. Christian Belady, its general manager of data center services, says the company outstripped the capacity of tools on the market long ago. “We approach every aspect of the physical, software, hardware and operational environment as an integrated system, and use software to engineer in resiliency and provide data analytics for our operations teams,” he says.

 

With advanced telemetry and tools, Microsoft can debug software faster, and its management solutions enable the company to handle incidents very quickly. Because applications are not bound to the hardware they are deployed on, workloads can be moved easily between data centers in case of failure.

 

“From the cloud platforms to the network to the hardware, our data centers today are more automated and integrated with software, and these solutions are critical in helping us maintain high service availability for customers,” Belady says.

 

Facebook dipping toes in SDN

While Facebook is a much younger company than Google and Microsoft are, its breakneck growth has forced it to develop a serious in-house infrastructure competency very quickly. The company has designed its own servers and storage hardware and has recently kicked off an effort to design its own network switch for reasons similar to Google’s: off-the-shelf switches create a performance bottleneck that prevents it from reaping the full benefits of innovation in other layers of the infrastructure.

 

Najam Ahmad, director of technical operations at Facebook, says the company does not have any definite SDN plans yet, but there is a lot of activity around it. Facebook uses BGP (an Internet routing protocol) heavily in IP networks to make routing decisions, he says.

 

There is never one best network path for services like Facebook’s, so the company needs the ability to manage traffic dynamically, based on some business logic, and BGP does not allow for such capabilities. This is why the company’s engineers are experimenting with some Facebook-specific use cases for having a forwarding plane controlled by software that sits outside of the network switch.
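
As an illustration of what "business logic" outside the switch could look like, here is a small Python sketch. It is not Facebook's design, which the article does not describe. It assumes a controller that scores each candidate next hop for a destination prefix by latency, cost and current load, then picks the lowest score. The class, fields, weights and peer names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PathOption:
    next_hop: str
    latency_ms: float
    cost_per_gb: float
    utilization: float  # fraction of link capacity in use, 0.0 to 1.0


def score(option, weights):
    """Lower is better; the weights encode the (hypothetical) business logic."""
    return (weights["latency"] * option.latency_ms
            + weights["cost"] * option.cost_per_gb
            + weights["load"] * option.utilization * 100)


def choose_next_hops(candidates, weights, max_utilization=0.9):
    """Return {prefix: next_hop}, skipping overloaded links when possible."""
    table = {}
    for prefix, options in candidates.items():
        # Fall back to all options if every link is above the threshold.
        usable = [o for o in options if o.utilization < max_utilization] or options
        table[prefix] = min(usable, key=lambda o: score(o, weights)).next_hop
    return table


# Hypothetical candidates for one documentation-range prefix.
candidates = {
    "203.0.113.0/24": [
        PathOption("peer-a", latency_ms=20, cost_per_gb=0.0, utilization=0.95),
        PathOption("transit-b", latency_ms=30, cost_per_gb=0.02, utilization=0.4),
    ]
}
weights = {"latency": 1.0, "cost": 100.0, "load": 0.1}
table = choose_next_hops(candidates, weights)
```

With these made-up numbers the nearly saturated peering link is skipped in favor of the transit link, even though the peering link has lower latency and cost. Changing the weights or the utilization threshold changes the outcome without touching the switches.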

 

Whether the solution they eventually come up with will qualify as SDN or use OpenFlow remains to be seen. Ultimately, that will not matter, since terms like SDN and OpenFlow are simply names for tools, and Facebook’s tool may not be one of them but may still do the job Facebook needs.

 

It has been clear for a while that Big Internet is becoming less reliant on the traditional vendors and service providers by the minute. It seems that once a company like Facebook reaches a certain critical mass, there is little an outside vendor can do to serve it better than it can serve its own needs.

 

This leaves traditional enterprises and service providers as primary markets for software defined data center technology. As with other products, vendors’ challenge will be in delivering solutions that satisfy everyone – a challenge much greater than the challenge of designing something exclusively for one company’s use.

 

This piece originally ran in the 30th edition of the DatacenterDynamics FOCUS magazine.
