Telcos, cloud service providers, large corporates and other organizations look set to invest vast sums of money in Edge data center infrastructure over the next five years. But provisioning and maintaining large numbers of small hosting facilities will put a serious strain on staff time and operational cost.

“The Edge is merely a location where you put some form of processing power - because you have to,” says Duncan Clubb, the head of digital infrastructure advisory at CBRE.

There’s no doubt that Edge capacity will be growing rapidly, owing to the applications that need it. It’s perhaps less obvious that these Edge resources will be a catalyst for more automation in data centers.

Edge sites will be small boxes, where space is concentrated on delivering capacity. They aren’t designed to have humans in them regularly, and the space and capacity inside is given over to processing resources.

The goal of Edge providers is to set up a collection of small facilities which can be treated like a normal data center while delivering ultra-fast local services for applications like the Internet of Things.

But there’s at least one big difference to centralized facilities. While a large cloud data center will have engineering staff on site, ten thousand small facilities distributed across a continent will not. Edge facilities will have to be built to be independent of people.

Automation and a “lights out” approach to data center management will go some way to enabling the Edge to be autonomous, but in the end mechanical automation - actual robotics - may also be necessary.

This article appeared in our Automation Supplement.

Predictive maintenance

Lights-out data centers operate with little or no intervention from engineers and technicians. The idea’s been promised for some time, but has proved elusive. Centralized facilities always have some staff available, so making equipment fully autonomous has not been a top priority.

As the first generation of Edge facilities appears, lights-out operation has become a more urgent goal - or maybe a necessity.

These facilities are based on the same technologies used in centralized data centers, but they are being delivered in smaller footprints, potentially in such large numbers that manual intervention will simply be unsustainable.

There’s one small win here: banning people from these tiny sites should make them more robust in many ways. There’s less risk from spilled coffee and errant limbs accidentally dislodging cables. But more than that, fewer interventions mean less wear and tear.

Engineers know that things tend to fail when you fiddle with them. Bulbs fail when you switch them on and a surge of electricity goes through them. Turning things on and off is asking for failures to occur.

If anyone doubted this, Microsoft proved it by running an extreme lights-out Edge data center: it operated 12 racks of equipment in a sealed container at the bottom of the sea off Scotland. The equipment lasted seven times longer than similar gear doing the same jobs in a land-based data center rack.

Edge facilities will need resilient, high-availability IT infrastructure made from components with high mean time between failures (MTBF) ratings, which will run without any local experts.

Redundant hardware configurations are a must to minimize downtime - so if infrastructure fails in one part of the Edge facility, it can automatically switch over to another. Edge infrastructure will rely heavily on “self healing” components that deploy accurate status monitoring and real time analytics to process the data collected and deliver predictive maintenance.
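To put rough numbers on why redundancy matters at unstaffed sites, the sketch below uses the standard steady-state availability formula, MTBF / (MTBF + MTTR), and assumes that redundant units fail independently and that one working unit is enough. The MTBF and repair-time figures are invented for illustration and do not come from any operator.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: the fraction of time a component is up."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def redundant_availability(single: float, copies: int) -> float:
    """Availability of several independent copies when one working copy suffices."""
    return 1.0 - (1.0 - single) ** copies


# Hypothetical figures: 50,000 hours between failures, and 72 hours to get
# an engineer and a spare part to a remote site.
single = availability(50_000, 72)
pair = redundant_availability(single, 2)

print(f"single unit:    {single:.5f}")
print(f"redundant pair: {pair:.7f}")
```

The long repair time is the point: when nobody is on site, a failed unit stays down for days, so a second unit that takes over automatically does far more for availability than it would in a staffed facility.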

A range of interconnected systems - everything from power delivery and backup to heating, ventilation and air conditioning (HVAC), cooling, environmental monitoring and physical security - will constantly check for changes in component behavior and respond accordingly.

That is a major step towards automation. Adding artificial intelligence for IT operations (AIOps) platforms brings analytics and machine learning that can identify and respond to issues automatically in real time. By using AI/ML to interrogate large sets of historical data gathered during normal operations, systems can quickly spot anomalies that indicate potential faults or performance problems. They can then fix them automatically, or alert engineers to log in remotely and orchestrate a reconfiguration.
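As a much-simplified illustration of spotting anomalies against historical data, the sketch below flags sensor readings that sit more than a few standard deviations from a baseline. Real AIOps platforms use far richer models; the z-score rule, the function name `find_anomalies` and the temperature readings are all assumptions made for this example.

```python
from statistics import mean, stdev


def find_anomalies(history, recent, threshold=3.0):
    """Return (index, value) pairs from `recent` that deviate from the
    historical baseline by more than `threshold` standard deviations."""
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # A perfectly flat history: any different value is unusual.
        return [(i, v) for i, v in enumerate(recent) if v != baseline]
    return [
        (i, v)
        for i, v in enumerate(recent)
        if abs(v - baseline) / spread > threshold
    ]


# Made-up inlet temperatures in degrees C, recorded during normal operation.
history = [22.1, 21.9, 22.0, 22.3, 21.8, 22.2, 22.0, 21.9]
# Made-up new readings, one of which is clearly out of line.
recent = [22.1, 22.0, 27.5, 22.2]

for index, value in find_anomalies(history, recent):
    print(f"reading {index} looks anomalous: {value} C")
```

In practice a flagged reading would feed an automated response or an alert to a remote engineer, rather than a print statement.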

It is likely that 2020 will have accelerated the migration to lights out data centers both at the Edge and elsewhere, if only because site visits have become more difficult, thanks to social distancing restrictions designed to hinder the spread of Covid-19.

Operators have found that customers are much more interested in using remote systems than they were before, and that increase in expertise will feed through into automated Edge implementations.

Standardization cuts complexity

Smaller Edge sites are likely to be a single dedicated, pre-integrated hardware appliance. Larger facilities may be put together from different components. All sites will demand out-of-band management of servers, networking and power, combined with processes which automate common tasks to minimize engineer visits, prioritized by frequency and complexity - so maintenance can be done remotely, and preferably automatically.

The use of standardized and certified APIs in Edge architecture will go some way to simplifying the complexities caused by having different suppliers’ hardware and software within a single facility, which is what makes automation possible at all.

Most operators will take a cleaner approach, and standardize on one particular type of architecture in the first place. This means a small parts list, and the potential to keep some replacement kit ready on site, either installed in spare rack bays, or ready for use.

Orchestration tools built around RESTful APIs can help with remote server provisioning, configuration and monitoring. The same applies to network components based on software-defined networking (SDN) and network function virtualization (NFV), which let engineers remotely control and manage connectivity between components within the server, network and storage infrastructure, as well as wide area network (WAN) links to the Internet.
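To give a flavor of what REST-driven remote management looks like, the sketch below builds (but does not send) an HTTP request asking a management endpoint to power-cycle one server. The URL layout, the `power_cycle` action name and the bearer token are hypothetical and do not describe any particular vendor's API; only Python's standard `urllib` and `json` modules are used.

```python
import json
from urllib.request import Request


def build_power_cycle_request(base_url: str, server_id: str, token: str) -> Request:
    """Build a POST request for a hypothetical management API.

    The request is only constructed here, never sent.
    """
    body = json.dumps({"action": "power_cycle"}).encode("utf-8")
    return Request(
        url=f"{base_url}/servers/{server_id}/actions",  # hypothetical path
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Placeholder site address, server name and token.
req = build_power_cycle_request(
    "https://edge-site-17.example.net/api", "rack3-node5", "TOKEN"
)
print(req.get_method(), req.full_url)
```

An orchestration tool would issue requests like this across thousands of sites from a central console, which is only practical when every site exposes the same interface.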

Ultimately, the lights-out Edge node may not be one that operators can simply lock into place and never touch again. But maintenance costs can be significantly reduced if systems are left alone to run at full or partial capacity for a number of years before successive equipment failures drop a site below the performance threshold that finally demands a visit.
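The "leave it alone until it drops below a threshold" policy can be expressed as a simple rule. The sketch below assumes every server contributes equal capacity and that the operator sets a minimum acceptable capacity as a whole-number percentage; the function names and the 40-server, 80 percent example are illustrative only.

```python
def minimum_servers(total_servers: int, min_capacity_percent: int) -> int:
    """Smallest number of working servers that still meets the threshold
    (integer ceiling division avoids floating-point rounding surprises)."""
    return (total_servers * min_capacity_percent + 99) // 100


def failures_before_visit(total_servers: int, min_capacity_percent: int) -> int:
    """How many servers can fail before a site visit becomes necessary."""
    return total_servers - minimum_servers(total_servers, min_capacity_percent)


def needs_visit(total_servers: int, failed: int, min_capacity_percent: int) -> bool:
    """True once the working servers fall below the threshold."""
    working = total_servers - failed
    return working < minimum_servers(total_servers, min_capacity_percent)


# Hypothetical site: 40 servers, must stay at or above 80 percent capacity.
print(failures_before_visit(40, 80))
print(needs_visit(40, 8, 80), needs_visit(40, 9, 80))
```

Over-provisioning a site at build time raises the number of tolerable failures, trading extra hardware cost for fewer truck rolls.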

The logistic burden

But those support visits are still a problem. The weak link in any true Edge business model is the need for a manned support model, and the logistics business around it. That's a big cost overhead.

For some operators that’s a surprise: firms offering data center management services have reported customers expecting them to manage many tiny data centers - on the same terms they handle one large one.

Experienced Edge operators are already factoring in the need for a logistics service simply to get parts to a large number of sites and get them installed and serviced. One of these, DartPoints, is in the market to buy remote services, CEO Scott Willis told DCD in a 2020 interview.

“As we get more scale, we could look at working with a third party organization that has a large footprint of tech resources round the US,” said Willis.

With or without a support partner, DartPoints will need a regional structure to support its locations, and Willis said he could see the company effectively operating as a logistics company for virtual environments.

Bring on the robots

But could operators go further in the quest for truly lights out Edge facilities?

EdgeConneX builds small regional data centers that effectively operate unmanned, using remote monitoring and sensors to track operations, so staff are only deployed when needed.

It also has remote-controlled security, so customers can be buzzed into their space remotely.

“Our whole business premise is based on lights out data centers,” EdgeConneX CIO Lance Devin told DCD. “We have 2MW sites, not 100MW behemoths. I can’t afford to put three engineers and 17 security people and two maintenance people in a site like that.”

For small remote facilities, things may have to go further: DCD has heard proposals for physical robots inside Edge facilities, ready to swap in hardware to replace or upgrade older servers.

A facility the size of a shipping container could have 12 full racks of hot-pluggable equipment and a rack of spares at the end. A gantry-style robot, like those in automated tape libraries, would be able to replace failed equipment as needed.

It might seem that physical moving parts would be a bad idea, adding to the complexity of Edge sites and needing maintenance themselves - but there are upsides.

Already, robots operate unattended in factories for long periods, and the duty cycle in the Edge box would be quite light. It would require extra hardware that would be rarely used, adding to the cost of the Edge unit, but it would pay off in cutting site visits to an absolute minimum.

A FedEx or UPS courier could simply post new parts into a secure external bin, and collect failed or broken equipment from a hopper.

Human free zones

Robots would increase the benefit of low footfall, creating an isolated facility like Microsoft's underwater Natick project.

“We operated this thing for 25 months and eight days, with nobody touching it,” David Cutler, project leader of Microsoft Natick, told DCD. “From the 135 land servers, we lost eight. In the water, we lost six out of 855.”
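The server counts quoted above translate into failure rates with a little arithmetic. The sketch below uses only the four numbers from the quote; the helper name `failure_rate` is just for illustration.

```python
def failure_rate(failed: int, total: int) -> float:
    """Fraction of servers lost over the trial."""
    return failed / total


land = failure_rate(8, 135)  # land-based comparison servers
sea = failure_rate(6, 855)   # servers in the sealed underwater container

print(f"land: {land:.1%}")
print(f"sea:  {sea:.1%}")
print(f"land rate is {land / sea:.1f} times the sea rate")
```

On these figures roughly 5.9 percent of the land servers failed against about 0.7 percent underwater, so the sealed servers failed at around one-eighth the rate.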

Keeping the outside environment out could be a significant benefit to small outdoor facilities, as they don't have lobbies. Opening the door of an Edge facility can let cold or moist air directly into the racks, where it can damage equipment.

A robot Edge facility could be sealed and filled with unreactive nitrogen, so equipment would have a longer lifetime.

These robotic Edge data centers do not yet exist, and they are a long way from the current business model of data center operators, but they could be important - and may eventually change the next generation of centralized facilities.

Edge robots colonize the cloud

The IT equipment in these sites is going to be the same kit as is used in centralized data centers, and the power and cooling architectures will be compatible - so what works in the Edge could influence the centralized cloud sites.

Robotic Edge data centers will be backward compatible with the large sites. If an automated approach is found to work within them, it won't be a huge step to apply that technique within the centralized facility.

Edge facilities are expected to catalyze change in areas such as liquid cooling, which can be expensive to retrofit to large facilities, but which may be the most efficient approach in tiny greenfield resources.

Edge facilities may even make changes in areas seen as sacrosanct, such as the 19-inch rack. Compared to that, robot arms may be easy to bring in.

Software automation developed for the Edge will be easy to retrofit onto centralized sites, and that could pave the way for robot hardware.

If the robotic Edge data center does come into existence, existing service providers may need to sit up and take notice.

Right now, no one is going to redesign a hyperscale data center for robots only. That won't happen for many years. But when it does happen, those who have automated the Edge may be the ones who lead the way in automating the hardware underlying the cloud.