There’s a fine line between courage and foolishness. Too bad it’s not a fence. - Jim Fiebig

I can honestly state that I have never felt so compelled to defend a technical purpose or cause in my life.  

A few hours ago an article was published blasting OCP (the Open Compute Project) and its certification efforts.  Its author probably hasn’t won any open source fans in the last day or so, but let me take the high ground.

Like all open source projects, Open Compute started out with a vision and a desire to bring new, alternative and sometimes superior capabilities to the masses.  This is almost always initiated by a benevolent dictator.  Linux was given to us by Linus Torvalds, GNU by Richard Stallman, and OpenStack by Rackspace and NASA.  In each case these people or companies altruistically gave the world freedoms we didn’t have before.

Like those before it, Facebook saw an opportunity to make the world a better place and created the Open Compute Project in an effort to remove gratuitous differentiation from hardware, all the while providing our ecosystem the same fundamental freedoms that have existed in open source software for decades.

Facebook’s OCP-based Yosemite – Facebook

Open Compute Certification

OCP’s certification project started out as the Compliance and Interoperability (C&I) project.  Through the much appreciated efforts of people from companies ranging from the financial services industry to big Texas-based cloud providers to Linux vendors whose name literally translates to “community”, we compiled a suite of tools that shifted the power of validating hardware from the OEMs to us.

I know what it’s like to push technology on people. I also know what it’s like to pull technology, and I can tell you the latter is far more democratic. The C&I project was doing just that: we created a democratized process whereby anyone could validate a config using open source tools.

There was no sinister motive behind who participated.  Open source has been, and always will be, the ability to manufacture your own medicine for the pain you are in.  Is Rackspace an OCP member? Yep.  Does it have the means to help the world create configs based on open standards? Yep.  Were those capabilities conveniently located down the street? Yep.

C&I matured with the tools we were given, and we created OCP Certification with two trademarks: one for those who could self-validate (OCP Ready) and one for those who wanted help certifying and ensuring that industry standard metrics were met (OCP Certified).

That being said, I know what deep certification is.  If you’ve ever had to deal with the pain of Common Criteria certification, you know it can be a nightmare. There are Evaluation Assurance Levels, there are FIPS criteria for objects in motion, and absolutely none of it is cheap.  Specialized test centers can charge upwards of a million dollars for this type of cert, but when dealing with the requirements of the Department of Defense it’s good to ensure your hardware is fault tolerant.

Fault Tolerance vs High Availability

Many people conflate these two things.  Fault tolerance creates redundancy in the hardware while high availability creates application resiliency through hardware abstraction.  If I’m a faceless, nameless, kosher test engineer who works for a company that would love to sell you two power supplies instead of one, I might be singing the same tune.

There is a place for fault tolerant gear and even a place for expensive certifications, but hyperscale / cloud environments are not that place.  Marc Andreessen has famously said “Software is eating the world” and “Software is eating hardware”.  He’s right.

Facebook / Rackspace / Google / Amazon could all lose hundreds of servers / racks a day and you would never notice.  If your workload requires fault tolerance, OCP probably isn’t right for you and you’d most likely benefit from insanely expensive certifications that ensure your hardware will last.  If you’ve designed your environment to abstract away the hardware I’d argue there’s no better choice on the market today than OCP.
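To put rough numbers on that trade-off, here is a minimal sketch in Python. It assumes every server fails independently and with the same probability, which real data centers only approximate, and the function name and the 5% failure figure are purely illustrative, not measurements from Facebook, Rackspace or anyone else.

```python
def survival_probability(replicas: int, failure_probability: float) -> float:
    """Chance that at least one of `replicas` independent copies is still up.

    Assumes each copy fails independently with the same probability
    (an illustrative simplification, not a model of any real fleet).
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    if not 0.0 <= failure_probability <= 1.0:
        raise ValueError("failure_probability must be between 0 and 1")
    # All copies must fail at once for the workload to go down.
    return 1.0 - failure_probability ** replicas


if __name__ == "__main__":
    # Hypothetical 5% chance that any single cheap server is down.
    for replicas in (1, 2, 3, 5):
        print(replicas, survival_probability(replicas, 0.05))
```

The point of the sketch is that resiliency comes from the replica count the software is designed around rather than from how much was spent hardening any single box.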

In a future world maybe entire cloud environments are run on private VLANs distributed across a million mobile phones.  Do I need to certify that a mobile phone works? If I can lose 10,000 phones and not care I’d argue that I don’t.

The new reality

The big US vendors gave up full control of their destiny the second they all decided to go down the ODM (Original Design Manufacturer) route.  Do you honestly think the ODMs that build EVERY SINGLE motherboard for EVERY SINGLE Tier 1 OEM don’t have those boards tested and certified to meet industry standards before they are stamped out in their hundreds of thousands?

Wouldn’t it be safe to assume that organizations like Facebook, Amazon and Google do this exact same thing?  The only difference is that Facebook wanted to share its designs because it didn’t see the infrastructure as a differentiator to its business model.

In closing

All of OCP’s processes and methodologies are there for anyone to critique and help improve.  It’s a transparent organization with no hidden agenda. It’s here to serve the common good.

To attack an open source project directly is at least courageous. It takes guts.  To hide behind a nameless, faceless “engineer” who hasn’t been transparent is just foolish.  In this particular case I wouldn’t mind building a fence.

Cole Crawford

@coleinthecloud

Founding Executive Director of the Open Compute Project