A data center is a complex system comprising technology and the humans that run it. Increasing reliability is a combination of many things. You need a good design, with no single points of failure. You also need to make sure your staff are well trained, to minimize the risk of human error. And you need to maximize the fundamental reliability of your hardware.

Research suggests that there’s an upper limit to reliability, and no system can be expected to work continuously for more than 200,000 hours (more than 22 years). But the only way to get anywhere near this figure is to address the mechanical issues that will eventually bring down any hardware.
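As a quick check on that figure, the conversion from hours to years can be sketched in a few lines of Python. The helper name `hours_to_years` is purely illustrative, and the calculation assumes a 365-day year with no leap days.

```python
# Illustrative conversion only; assumes a 365-day year (no leap days).
HOURS_PER_YEAR = 24 * 365  # 8,760 hours


def hours_to_years(hours: float) -> float:
    """Convert a span of continuous operating hours into years."""
    return hours / HOURS_PER_YEAR


years = hours_to_years(200_000)
print(f"{years:.1f} years")  # prints "22.8 years"
```

On that assumption, 200,000 hours works out to a little under 23 years of uninterrupted operation.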

Preventing hardware failure is a combination of making sure you have reliable, high-quality equipment, and taking all steps to make sure it doesn’t wear out. Most of the hard work is going to be done by vendors, to ensure kit is reliable before it arrives, but once inside the data center, it’s up to the owner of that equipment to take care of it.

“Maintenance is an insurance policy,” says Brian Kinkade, market development manager at Nye Lubricants. “It makes sense to do it.”

This article appeared in Issue 39 of the DCD>Magazine.

While most data center technicians already keep a close eye on temperature, humidity, and dust in the circulating air, Kinkade says the industry should also be aware of physical wear and tear. His company specializes in preventing it, and his social media slogan is: “Solving reliability and performance challenges with grease.”

Data centers include plenty of moving parts, and also some non-moving parts which benefit from lubrication, says Kinkade, who has been developing business in the sector recently. To keep data centers up and running and minimize unplanned downtime, engineers have to know about potential failures and address them, he says.

Cooling fans and hard drives need specialized lubrication, and so does the large-scale mechanical and electrical plant, including cooling units, air handlers, and diesel generators.

Surprisingly, some of the more static parts of the data center also benefit from lubrication, Kinkade explains. The part of the data center which first got him involved in the sector is the busbar.

Busbars deliver power from the electrical room to the racks and servers. Although they’re stationary, they can still suffer from “fretting corrosion,” where a steady period of micromovements produces wear on the contacts.

The wear can remove plating on the contacts, exposing the underlying copper, which can then be oxidized. The oxide layer acts as an insulator, making the power connection less efficient, and eventually causing it to fail.

A lubricant film prevents that by minimizing contact between the metal surfaces, says Kinkade. It’s standard practice in high-vibration applications such as in motor vehicles, but he has found from experience that data center operators need to consider it on their busbars.

In normal operation, most data centers have little serious vibration - though this might change when Edge roll-outs start placing some near subway stations - but to push reliability high, lubricant on busbars can be good insurance.

But there’s another issue: to make the most of data center technicians, many organizations have moved to “rack and roll” installation, where racks are populated with equipment in a central factory, then shipped to the data center for installation.

“When they plan on shipping racks fully loaded to the data center, operators and OEMs should do a transport, shock and vibration test,” says Kinkade. The vibration and shock during transport may be more than the equipment is designed to handle - and applying a lubricant during assembly may be a way to ensure the equipment isn’t harmed during transport.

At this point, the operator needs to look at the warranty, he says: “It’s a certain bunch of agreements as to where some equipment will work and for how long.” The connections between a server and a busbar will be designed to support a certain number of insertions, and a certain amount of vibration and shock.

Servers and switches are mostly designed to be inserted in a stationary rack, not inserted in a rack which is transported. It’s conceivable that shipping a rack with the servers inserted might exceed the lifetime expectation for the physical wear on those contacts.

As rack and roll has become standard, the OEMs have understood how to handle the equipment, and Kinkade believes hardware vendors have taken it into account, perhaps thickening the plating on contacts.

“There are still people that choose to ship stuff separately and install it at the data center. There is a mix,” he says. It is up to the customer to make sure the equipment is fit for the way it is being used. Operators should keep careful track of the tolerances, make sure the equipment is sufficiently rugged, and apply any extra lubrication that is needed.

Another point about lubricating those contacts is that applying more lube can mitigate existing damage to connectors and prevent further fretting corrosion. Lubricants designed for the automotive industry are a good choice, as a car is an environment with plenty of vibration.

Nye also provides lubrication for cooling fans. Bearings are sealed inside the unit, and the best kind are sintered bearings. These are made of pressed metal powder, whose pores are impregnated with a lubricant designed to keep the fan lubricated for its whole lifetime. The lubricant should include antioxidants and be rated to work at high temperatures.

Because they are sealed, the lubrication is down to the fan manufacturer, and the same goes for hard drives, which have fluid dynamic bearings to allow the spindle motor to move freely without friction and wear. As well as having the right physical properties, that lubricant should have the right thermal range, so it doesn’t oxidize and cause debris.

Most of the time, apart from busbars and connectors, any lubrication is handled directly by the equipment vendor, and is a one-off intervention during manufacture. But there is one exception to this - which may be on the increase.

As the industry becomes more focused on its environmental impact, moves to refurbish and re-use data center equipment are coming to the fore.

“If there’s a refurbish in the business model, where last-generation servers are repurposed, you could want some grease in that process,” says Kinkade. “As an insurance policy to minimize failures, it might make sense to do it anyway. You would also want to test and measure it to make sure it works.”

And of course, any reuse process is going to involve transportation again. If equipment is re-shipped loaded into a rack, then that will introduce further vibration and wear, and might need further grease.

It’s worth underlining this point. IT people may focus on refurbishing the hardware, and clearing any data by scrubbing memory and drives.

The physical danger to IT systems might only be obvious to someone who lives and breathes lube.