Just over ten years ago, Google published a whitepaper called “Power Provisioning for a Warehouse-sized Computer” which, along with developments such as the emergence of PUE and wider public awareness of data centers, started what I like to call the first industrial revolution of the data center.
With industrialization came commoditization, and the cloud guys came along and disrupted the data center market, accelerating the commoditization by massively reducing the time and cost to provision and use compute, storage and networks.
In the enterprise, we saw IT departments take ownership of the facilities teams responsible for data center power and cooling infrastructure in an effort to improve performance, productivity and efficiency, having recognized the gap between the two departments and the typical disconnect between the CIO and whoever in facilities pays the data center energy bill.
So where are we ten years on?
This is where I quote my friend Peter Gross who once said: “This industry loves innovation – so long as it’s twenty years old!”
While a lot of vendors (or at least their marketing departments) out there will undoubtedly disagree with me, I don’t think there has been nearly enough disruption or adoption of innovative technologies in the data center space over the last ten years. There are experiments pushing the boundaries, like gas-powered servers and deep-sea data center containers, but the vast majority of operators are still building big boxes with air or water cooling systems controlled by building management systems and network protocols from twenty years ago.
Yes, we’ve got data centers performing at much less embarrassing levels of inefficiency, thanks largely to a really simple metric called PUE, and we’ve somewhat narrowed the facilities/IT gap in that one person now barks the orders at the humans in both IT and facilities, but there is still a huge gap between where we are and where we could be.
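For anyone who hasn’t met it, PUE really is that simple: total facility energy divided by the energy that actually reaches the IT equipment. A minimal sketch, with purely illustrative figures:

```python
# Power Usage Effectiveness (PUE): total facility energy / IT equipment energy.
# A PUE of 1.0 would mean every watt drawn from the grid reaches the IT load.
# The figures below are purely illustrative.

def pue(total_facility_kw: float, it_load_kw: float) -> float:
    """Return the PUE ratio for a given facility draw and IT load."""
    return total_facility_kw / it_load_kw

print(pue(2000.0, 1000.0))  # 2.0 -- the kind of ratio that was once common
print(pue(1200.0, 1000.0))  # 1.2 -- closer to a well-run modern facility
```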
The good news is that there is plenty of existing technology and innovation; the real issue is adoption – as per Peter’s quote, sadly, without real market pressure to change, we humans, like it or not, are all biologically wired to be risk averse. Compound that with downtime being the primary metric operators are measured on and you get an industry that, when you step back and look, really hasn’t changed that much.
What will the next ten years look like?
I believe the second industrial revolution is on its way and the data center market will have to react with a series of step changes in order to meet the new challenges of scale.
Edge. There, I said it! Eight paragraphs in and the current industry buzzword makes its appearance. This isn’t an article espousing edge as the new cloud; rather, edge is one of a number of things happening that will (hopefully) force the second revolution of the data center.
As usual, various marketeers out there will tell you that almost every size, shape, age and type of data center can be an ‘edge’ data center, and if that fits with your own definition of edge then that’s fine; feel free to carry on as you were.
For my purposes here, I’m going to define edge as any purpose-built, unmanned data center of any size that has some sort of active air (or liquid) handling and some sort of power distribution system.
However you define edge, I think we all agree that there are going to be many more data centers as the explosion in data creation, data storage and data processing continues to impact pretty much every aspect of business, industry, the economy and our lives in general.
Not all data will end up in the mammoth public cloud data centers. These are ideally suited to large-scale, long-term data storage, low-frequency analysis and archiving, but for real-time, low-latency applications like autonomous vehicles, VR/AR and all the other smart, data-enabled applications out there, the edge is an important addition to the worldwide network of data centers.
The edge as defined above will be deployed at a ratio of at least 50:1 compared to large-scale cloud, colo and enterprise sites. Perhaps that will be 5,000:1 or even 50,000:1 within the next ten years as we see IoT data enhanced by AI and analytics being adopted more broadly.
Over the last ten years, the number of humans involved in managing a live data center has remained much the same. Very little is actually automated on the facilities side and where it is automated, the humans involved usually fiddle with it and break it.
For the snowflake data centers out there (which is still most of them), show me one of a decent size that doesn’t have the ‘hero’ engineer who knows pretty much everything about the site because he or she has been there for ten years and is essentially the font of (almost) all knowledge about it.
This industry cannot scale to where it is going and where it needs to get to with the current reliance on the good old human being. As the most advanced machines on the planet, humans are amazing, but they are terrible at scaling in an industrial way.
I’m willing to bet that if you have more than three data centers to manage in your business, then you need more than one ‘hero’ and you’ll have silos of knowledge and information within your business about the status, performance and condition of your data centers.
Now imagine you have fifty, two hundred or thousands of data centers. Telcos have had the latter experience for many years as far as switch sites and central offices are concerned – and things got a lot more complex when we moved from the PSTN to IP and all the enhanced capabilities that IP enabled for customers.
My point – which I know for some isn’t particularly popular – is that humans are already the constraining factor within the data center market. Training more of them and getting kids interested in data centers is all great, but the second revolution will need to be one of automation, and I mean of everything from racking and stacking to control. That means data centers designed to house IT equipment and specialized robots, with everything controlled by computers and software, not humans with laptops and psychrometric charts.
Walk before you can run
Every revolution starts with a first step. So what is that first step? Is it collecting data? It must involve some AI right? Should someone say DCIM at this point (I’m trying hard not to smile)?
Step one should be to extract as much of the knowledge as possible from the humans and digitize it. Digital knowledge can be tested, validated, replicated, shared broadly and stored forever.
Step one doesn’t involve making all the humans redundant; rather, it augments them by democratizing their knowledge for the greater good of the entire business.
It also, for the first time, allows that knowledge to be electronically integrated, used and augmented with the data that you collect from monitoring and site instrumentation. It’s like interfacing “Joe the local site expert” directly with the site control and monitoring systems – imagine how much smarter and more automated things become with just this first simple step.
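To make that concrete, here is a minimal sketch of what digitized expert knowledge could look like: rules of thumb written down as machine-checkable assertions and evaluated against live telemetry. The rule wording, thresholds and telemetry tags below are hypothetical, not taken from any particular site or product.

```python
# A minimal sketch: capture an expert's rules of thumb as explicit,
# machine-checkable assertions, then evaluate them against live telemetry.
# All rule wording, thresholds and telemetry tags here are hypothetical.

from dataclasses import dataclass
from typing import Callable, Dict, List

Telemetry = Dict[str, float]

@dataclass
class Rule:
    name: str                             # plain-language statement of the expert knowledge
    check: Callable[[Telemetry], bool]    # returns True when the site looks healthy

# "Joe's" knowledge written down as rules rather than kept in his head.
RULES: List[Rule] = [
    Rule("CRAH supply air should stay below 27 C while IT load is under 80%",
         lambda t: t["crah_supply_temp_c"] < 27.0 or t["it_load_pct"] >= 80.0),
    Rule("No single UPS path should run above 90% load",
         lambda t: t["ups_a_load_pct"] <= 90.0 and t["ups_b_load_pct"] <= 90.0),
]

def evaluate(telemetry: Telemetry) -> List[str]:
    """Return the names of any rules the current telemetry violates."""
    return [r.name for r in RULES if not r.check(telemetry)]

# In practice this would be fed by the site's monitoring system; values are made up.
sample = {"crah_supply_temp_c": 28.5, "it_load_pct": 65.0,
          "ups_a_load_pct": 72.0, "ups_b_load_pct": 41.0}
print(evaluate(sample))   # -> flags the CRAH supply air rule
```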
Now, I can hear you saying that this isn’t a simple step. You perhaps think that the way to do it is to try and create an AI persona called “Joe the site expert” and then have that AI persona ‘learn’ how the data center works by feeding it tons of training data and waiting for the magic to happen.
I’m all for AI but I can assure you that Joe the AI data center engineer has nowhere near the smarts of Joe the human engineer today.
There is a much simpler way – we don’t need AI to digitize the knowledge that Joe has, because data centers are extremely deterministic systems. If they were not, they’d be continually failing on us due to unforeseen conditions.
We can build a digital model of the site and encode how the devices and control systems are supposed to work and perform, separately and together as a system, then calibrate this model and feed it with live site data to continuously compare what’s actually happening with what should be happening.
An outcome of this process is that all data is cleaned, validated and given context – further knowledge that is captured electronically as metadata. Providing this post-processed data stream to an AI system makes it far more valuable too, without needing masses of training data, which often still don’t help AI systems learn about corner-case issues or valid but difficult-to-predict changes in the operating baseline, such as when an ATS unit switches paths.
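As a rough sketch of that “expected versus actual” loop, the fragment below computes what a device should be doing from a simple encoded model, compares it with the live reading and attaches the result as context metadata. The toy fan-speed model, tolerance and field names are assumptions for illustration, not any real control system.

```python
# Sketch of the deterministic-model approach: compute what a device *should*
# be doing from an encoded model, compare it with what the telemetry says it
# *is* doing, and attach the judgement as metadata. The toy model, tolerance
# and field names are illustrative assumptions.

from typing import Dict

def expected_fan_speed_pct(it_load_kw: float, design_load_kw: float = 500.0) -> float:
    """Toy model: CRAH fan speed is expected to track IT load linearly."""
    return min(100.0, 100.0 * it_load_kw / design_load_kw)

def validate_sample(sample: Dict[str, float], tolerance_pct: float = 5.0) -> Dict[str, object]:
    """Compare a live reading against the model and return it with context metadata."""
    expected = expected_fan_speed_pct(sample["it_load_kw"])
    deviation = abs(sample["fan_speed_pct"] - expected)
    return {
        **sample,
        "expected_fan_speed_pct": round(expected, 1),
        "within_model": deviation <= tolerance_pct,   # the cleaned/validated flag
        "context": "nominal" if deviation <= tolerance_pct else "deviation from model",
    }

# Live reading from the monitoring system (values made up): the model expects
# 60% fan speed at this load, so 78% gets tagged as a deviation worth a look.
print(validate_sample({"it_load_kw": 300.0, "fan_speed_pct": 78.0}))
```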
Beyond step one, we eventually need to replace the still largely disparate control and monitoring systems across facilities and IT with an integrated, system-level operating system that can operate at site, regional and global level.
The large numbers of unmanned edge sites that we’re about to see deployed over the coming ten years will require that kind of fully integrated and automated control. It’s a challenge, and on its own it won’t totally eliminate the need for humans, but that’s why robots are coming to a data center design near you soon.
The only question is will you be ahead or behind the curve?