In 2009, a group of PhD students from the Reliable, Adaptive and Distributed systems Laboratory (RAD Lab) at the University of California, Berkeley, developed a software platform for management of computer clusters.

This platform aimed to solve a single challenge: how to make the most of limited data center resources like compute, memory and storage, avoid overprovisioning and yet ensure reliable operation of all applications and workloads.

In 2010, it was accepted as an Apache Incubator project under the name Apache Mesos, and today, it is used by the world’s largest online platforms including Twitter, Netflix and eBay.

Mesosphere Inc was co-founded in 2013 by Benjamin Hindman, one of the original creators of the Apache Mesos project, to commercialize the open source technology and make it accessible for organizations that haven’t reached the size – and levels of expertise – of Uber or Airbnb.

Mesosphere took the core of Mesos and built the Data Center Operating System, or DC/OS, integrating a wide variety of open source projects along the way. “You can compare DC/OS to the platform that runs Amazon Web Services, minus the Infrastructure-as-a-Service layer,” Hindman told DCD.

Your own personal AWS

Benjamin Hindman, co-founder and chief architect of Mesosphere – Mesosphere

DC/OS uses the features of Linux to abstract server resources, pool them, and apply them in the most efficient way possible: it’s an intelligent overlay that can be used to manage your traditional IT with its virtual machines, and now, application containers. “The really exciting news about version 1.10 is it features Kubernetes. It’s one of the easiest ways to install and manage Kubernetes in your data center,” Hindman said.

Back when Mesos began, there was no Kubernetes and no Docker, so the team had to create its own container orchestration system, called Marathon. It was written in Scala, and relied heavily on the properties of Java virtual machines. Later, Marathon was taught to orchestrate Docker containers, and with version 1.10, Mesosphere can finally support all of the popular container frameworks.

“You no longer have to make a choice when it comes to the orchestrator, you no longer have to make a choice when it comes to the platform, and you can choose any cloud you want and move between them, so you will never get locked in.”

And if it looks like Mesosphere are collecting open source projects like they are trading game cards, you’re not far off the mark. “We probably have about 100 open source distributed systems that we have enabled on DC/OS today,” Hindman said. “When I worked with Airbnb, it took three weeks to get [Apache] Kafka up and running, and production-ready. Now with DC/OS, we have codified the learnings; we have essentially taken the O’Reilly books for these projects, and put the operational knowledge described in these books in the code.

“And this code enables Kafka to run in production today for a quarter of the Fortune 50, two out of three of the world’s largest car makers, five out of ten largest North American banks and many other customers, including the government. They all rely on our data services so they have the ultimate freedom of choice when it comes to cloud; they can still leverage cloud, get more flexibility and cost control, but more importantly, they get access to the latest and greatest, production-proven open source distributed systems.”

Hindman believes that the future of data centers will be defined at the edge, not the core, so the role of distributed software systems will become increasingly important.

“Edge computing is becoming much bigger, and it has nothing to do with your home sensors – it’s more about the autonomy of cars and 5G. A connected car today produces about four terabytes of data for every eight hours of driving. That data needs to be stored and analyzed somewhere. Is all of this data going into an Amazon data center? Probably not.

“We’ve seen companies building up data centers at the edge that can run some of these computations on subsets of data. And it’s very convenient if the APIs are the same as the ones that are running in a centralized data center – and that’s what our software allows you to do.”

This article appeared in the October/November issue of DCD Magazine.