Domain registrar and cloud hosting firm GoDaddy recently ramped up its international expansion with the addition of 10 languages and 14 markets across Asia. The company has also launched the AWS-like GoDaddy Cloud Servers, a service that lets customers quickly roll out new computing resources.

It is less well known that the company has also expanded the infrastructure within its nine data centers over the last couple of years, growing from around 37,000 servers in mid-2014 to some 55,000 servers hosting 10 million websites this year. That works out to 18,000 new servers in 20 months, or close to a thousand servers added every month.

So how do you put together the requisite infrastructure and software stack to support such rapid growth? To learn more about the work being done under the hood to create the GoDaddy cloud, DCD spoke to Elissa Murphy, CTO and executive vice president of cloud platform at GoDaddy, and David Kim, vice president of global data center services.

GoDaddy cloud – ThinkStock / pzAxe

Web-scale engineering  

“What you want to be able to do is to isolate to a particular level of the container in a secure way to get as much utilization [out] of the machine, based on the product offering that you have,” explained Murphy as she shed some light on the design tenets behind the GoDaddy cloud. “And on the backend for storage, being able to have a distributed store, or highly resilient store to serve up data to make sure things are secure and redundant.”
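To make that concrete, here is a minimal sketch of per-container isolation using the Docker SDK for Python. The image, resource caps and security settings are illustrative assumptions, not GoDaddy's actual configuration:

```python
# Illustrative sketch only: capping each container's share of a host so
# that many tenants can be packed onto one machine. Uses the Docker SDK
# for Python; image and limits are hypothetical, not GoDaddy's settings.
import docker

client = docker.from_env()

container = client.containers.run(
    "nginx:alpine",          # hypothetical tenant workload
    detach=True,
    mem_limit="256m",        # hard memory ceiling per tenant
    nano_cpus=500_000_000,   # 0.5 CPU: fractional cores raise utilization
    pids_limit=100,          # stop a runaway tenant exhausting process IDs
    cap_drop=["ALL"],        # shed privileges to isolate "in a secure way"
)
print(container.short_id)
```

Capping each tenant's CPU, memory and process count is what lets many containers share one host safely, pushing utilization up without letting workloads interfere with one another.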

Where the physical infrastructure is concerned, Kim shared that GoDaddy deploys industry-standard commodity equipment and hardware. He aims to right-size and standardize the infrastructure supporting GoDaddy’s cloud platform.

“From a physical infrastructure perspective, it is to support our scale units, really understanding our use cases and application profiles at each different layer,” said Kim. “We have applications by name, there is a platform layer, and then there’s an infrastructure layer.”

Unsurprisingly, the entire architecture is designed with the assumption that all layers of the system can, and will, fail. Murphy says this is vital, because an architecture that does not make this assumption from the get-go will eventually have to be reworked when scaling up.

“If you assume some failure rate of the disk, you build a redundant store. If you assume some failure rate of the network switches, then you locate that in different types of location. If you assume some failure of the data center, then you have geo-located data,” she said. “And you are thinking through the design problem from a scale perspective… where you are able to take any components of the system, and you are able to be resilient and that’s critical.”
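Murphy's reasoning can be made concrete with some back-of-the-envelope math. The sketch below assumes an illustrative 4 percent annual disk failure rate and independent failures (real systems must also account for correlated failures and repair windows); the figures are not GoDaddy's actual failure data:

```python
# Back-of-the-envelope sketch: assume a failure rate, then size the
# redundancy. The 4% rate is an illustrative assumption.
annual_disk_failure_rate = 0.04

def p_all_replicas_lost(p_disk: float, replicas: int) -> float:
    """Probability that every replica of an object is lost in a year,
    under the simplifying assumption of independent disk failures."""
    return p_disk ** replicas

for n in (1, 2, 3):
    print(f"{n} replica(s): ~{p_all_replicas_lost(annual_disk_failure_rate, n):.6%}")
# 1 replica(s): ~4.000000%
# 2 replica(s): ~0.160000%
# 3 replica(s): ~0.006400%
```

The same logic repeats up the stack: treat switches and whole data centers as components with their own failure rates, and place replicas across racks and regions accordingly.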

Elissa Murphy
Elissa Murphy, CTO, GoDaddy – GoDaddy Inc

Going open(Stack)

It is no secret that GoDaddy has built its current infrastructure on OpenStack, Hadoop and various other open source software. But what role did OpenStack play in the existing architecture, and would the cloud even have been possible without it? According to Murphy, OpenStack emerged as a logical choice, being open source and cost efficient, although GoDaddy already had infrastructure technology that was “quite good” from a container standpoint.

“With OpenStack emerging and being able to embrace open source and the innovation that you get from open source and cost efficiency, it made a lot of sense to move over to OpenStack,” said Murphy. “There’s economies of scale, there’s [also] giving back to the open source community to make the industry better as a whole.”

“Coming from [an] infrastructure perspective, there’s quite a bit of industry best practices and reference architectures that we can lean on and learn from, as well as contribute to,” said Kim. “And continue to further our standardization of our infrastructure, and [driving] our operations.”

But how does the GoDaddy team cope with the rapid release cadence of open source projects like OpenStack? Murphy says they typically wait until a particular version “hardens” instead of rolling it out immediately. Parallel runs do sometimes happen though, such as with Magnum, an OpenStack API service designed to work with container orchestration engines such as Docker Swarm and Kubernetes.

“Very rarely do we work with the trunk of an actual project, or immediately adopt the latest versions,” said Murphy. “You wait for the community to come and test it out, and usually a few other big players that will adopt it. You jump on after some of the issues have been worked out, and you try to contribute [back as you] work out issues as well.”

David Kim
David Kim, VP data center services, GoDaddy – GoDaddy Inc

Taking the cloud approach to infrastructure

“When you take that at a global scale, of course it gets way more complex,” said Kim. “If you are deploying, racking and stacking, and provisioning servers one at a time, you experience a long backlog [of] requests and the developers can’t get their servers quickly enough.”

But the move is inevitable, he explained: “We were able to really standardize to a handful [of] SKUs that we buy, and we were able to standardize our deployment methodology, and really optimize our supply chain to deliver full racks of servers. That involves good demand planning, capacity planning and supply planning as well. End to end, we’ve made really great strides in planning and delivering global infrastructure based on standards.”
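Standardizing on a handful of SKUs is what makes rack-at-a-time delivery automatable. As a purely hypothetical sketch (the SKU names, counts and helper below are invented for illustration; GoDaddy's actual tooling is not public), expanding one incoming rack into per-server provisioning records could look like this:

```python
# Hypothetical sketch of rack-at-a-time provisioning against a small SKU
# catalog. SKU names, counts and provision_rack() are all invented here.
from dataclasses import dataclass

@dataclass(frozen=True)
class Sku:
    name: str
    cpus: int
    ram_gb: int
    disks: int

CATALOG = {  # "a handful of SKUs that we buy"
    "web-1":   Sku("web-1",   cpus=32, ram_gb=128, disks=2),
    "store-1": Sku("store-1", cpus=24, ram_gb=256, disks=12),
}

def provision_rack(rack_id: str, sku_name: str, servers_per_rack: int = 40):
    """Expand one delivered rack into per-server provisioning records."""
    sku = CATALOG[sku_name]   # standardization: unknown SKUs are rejected
    return [
        {"host": f"{rack_id}-u{unit:02d}", "sku": sku.name}
        for unit in range(1, servers_per_rack + 1)
    ]

print(len(provision_rack("sin2-r07", "web-1")))  # 40 servers in one step
```

Because every rack of a given SKU is identical, demand planning reduces to counting racks rather than negotiating one-off server builds.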

“[We also evolved] our network architecture and operations to be more automated and software defined, moving away from traditional switches and routers to software based,” said Kim. “Where your network and security are part of the product rather than something that gets added – a more cloud scale design.”
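As an illustration of what “network and security as part of the product” can look like on an OpenStack cloud, the sketch below declares a security group in code using the openstacksdk library. The cloud name and the single HTTPS rule are assumptions made for this example, not GoDaddy's actual policy:

```python
# Minimal sketch: security policy declared in code alongside the product,
# via openstacksdk. The clouds.yaml entry name is hypothetical.
import openstack

conn = openstack.connect(cloud="godaddy-internal")  # hypothetical cloud name

sg = conn.network.create_security_group(
    name="web-tier",
    description="HTTPS only; shipped with the product, not bolted on later",
)
conn.network.create_security_group_rule(
    security_group_id=sg.id,
    direction="ingress",
    ethertype="IPv4",
    protocol="tcp",
    port_range_min=443,
    port_range_max=443,
    remote_ip_prefix="0.0.0.0/0",
)
```

Defining network and security this way means every deployment carries its policy with it, instead of waiting on a switch or firewall change ticket.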

It is worth noting that the transformation has to take place alongside existing systems that must continue functioning without any downtime. The company’s Singapore data center, set up in 2010, illustrates the continual need to upgrade ageing infrastructure, Kim said: a refresh is necessary because non-standard infrastructure increases the failure rate and the complexity of deploying new hardware.

“When you are dealing with 5,000 versus 30,000 servers, the type of scale problems you see, particularly if you geo-distribute them, change,” said Murphy in response to our question about challenges faced by the team in its journey. “Sometimes what happens is when you get to that scale, you learn about the physical limits, things that you haven’t considered in the past. Or failures that you haven’t considered.”

Kim ticked off a number of milestones and decisions from this journey. “Deploying our latest infrastructure stacks and standards. Getting the product into the particular region, deploying [a] common infrastructure framework to monitor. The pace of deployment, and [the] logistical challenge of deploying at scale,” he said.

Moving on to the next level

What will the next-generation GoDaddy platform look like, and does the cloud hosting specialist have any immediate plans to expand its presence in Singapore and in the region? The answer to the latter appears to be a “yes”, though Kim was quick to point out that GoDaddy has different peering locations in the region in addition to the main content hosting location in Singapore.

“We actually have plans to open a second site later this year in Singapore,” confirmed Kim. “We will continue to expand our footprint in the region. I don’t believe our current data center in Singapore will be enough.”

“[The next-gen GoDaddy platform] looks pretty similar to what you have in large-scale companies like Facebook and Yahoo and Google. Pieces of that infrastructure become more purpose-built,” said Murphy. “If you look at Hadoop and compare it to what a database looks like, it is basically breaking apart certain functions of the database into purpose-built components. That same kind of architectural approach is the way we’re approaching building our platform here.”

“From a physical infrastructure perspective, there’s a lot less complexity, a lot fewer workloads to design physical infrastructure for,” added Kim. “It’s about our full-stack optimization, standardizing our infrastructure [around] purpose-built designs. It’s about optimizing for maximum CPU [utilization].”

“We don’t just do technology for technology’s [sake]. We’re all in it to help our small business owners. Their success drives our success. Now our focus [is] on Asia,” he summed up.