The typical life cycle duration of a modern data center through design, implementation and operation is approximately 15 years.
During this time, one of the most important stages is the testing and commissioning programme. Surprisingly, though, it occupies the shortest period of the data center's life and is often neither thorough nor rigorous enough to ensure that the facility has been both designed and built to provide a stable environment for the IT systems it will host over the next 10-15 years.
Two types of test are crucial: site acceptance testing (SAT) and integrated systems testing (IST). What makes them so important at the end of the build phase of a data center project?
Testing is a tiny part of a facility’s life span
Never switched off
Unlike most other building environments, once critical infrastructure such as a data center, a nuclear power station or a hospital life support system is powered up and supporting its mission-critical application, it will never be intentionally turned off. For some clients, the engineering behind the resilience and redundancy can be referred to as Armageddon planning, with the intention that the data center will be the last thing running no matter the circumstances.
For that reason, the testing phase represents the only opportunity to ensure that the facility is as robust as expected: to make sure the design and construction work copes with a variety of different scenarios and maintains availability. The negative impacts of downtime can be losses in both financial terms and brand reputation. The testing phase gives the owner the assurance that the facility has been designed and built well and operates as the design intended.
What is covered in a SAT and IST?
- Capacity and Load testing of all systems.
- System redundancy.
- Fault resilience and tolerance (against the Uptime Institute Tier classifications I, II, III and IV).
- Generators are proven in operation and shown to function within the required time.
- UPS systems are fully proven to expectation and requirement.
- Automatic transfer switch (ATS) and static transfer switch (STS) relationships are fully proven to expectation and operation.
- Cooling system restarts at maximum load and time to bring the temperatures to within SLA (service level agreement).
- Control system behaviour and expected outcome.
- Behaviour of systems during seasonal operation.
- Effect of thermal runaway.
- Integrated testing of the various systems to ensure integrated operation of the facility as a whole.
- Black-build (full power off and re-start of the facility) operation.
- DCIM platform is fully integrated and performs as expected: alerting, monitoring, predicting.
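One item from the list above, the cooling restart check, lends itself to a small worked example. The sketch below is illustrative only: it assumes the test produces a log of (seconds since restart, supply-air temperature) readings and that the SLA is expressed as a single maximum temperature. The function name, the log format and the sample values are hypothetical, not taken from any particular commissioning tool.

```python
from typing import Optional, Sequence, Tuple


def recovery_time_s(
    readings: Sequence[Tuple[float, float]],
    sla_max_c: float,
) -> Optional[float]:
    """Seconds from the first reading until the temperature is back within
    the SLA limit and stays there for the rest of the log.

    readings: (seconds since cooling restart, temperature in deg C),
    sorted by time. Returns None if the log never settles within the limit.
    """
    recovered_at = None
    for t, temp_c in readings:
        if temp_c <= sla_max_c:
            # Remember only the start of the current in-limit run.
            if recovered_at is None:
                recovered_at = t
        else:
            # Any later excursion above the limit resets the clock.
            recovered_at = None
    if recovered_at is None:
        return None
    return recovered_at - readings[0][0]


# Hypothetical readings taken at maximum load after a cooling restart.
log = [(0, 24.0), (60, 29.5), (120, 31.0), (180, 28.0), (240, 26.5), (300, 25.0)]
print(recovery_time_s(log, sla_max_c=27.0))  # 240
```

The result would then be compared with whatever recovery time the test script and the SLA allow.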
Before SATs and ISTs are carried out, the basic level of testing should be completed successfully, and adherence to local and (where applicable) international guidelines for mechanical and electrical systems should be confirmed. A clear and detailed testing and commissioning programme is also required.
To have a successful SAT and IST for your data center, you also need detailed test scripts setting out what will be tested, how and where. The test procedures and scripts will normally form part of the sign-off with the end user or client, who may also wish to send a representative to witness some of the testing: this could be someone internal or an appointed specialist consultant.
Is a FAT good enough?
The SAT procedures should be driven by the vendor and added to by the data center specialist. The SAT procedures should be similar to the factory acceptance testing (FAT) procedures carried out by the equipment manufacturer before the equipment is shipped from the factory.
This may seem repetitious, and it may be tempting to ask: ‘Isn’t just doing the FAT acceptable?’ The answer is always NO!
The FAT is carried out in ideal factory conditions, where such tests are run on a daily basis. The SAT verifies that the equipment has been installed correctly, and that its supporting infrastructure has been too. For example, the SAT of a UPS would include the input/output cabling and batteries, which are not generally present during the FAT; similarly, a test of a chiller would check that the pipework and pump systems function correctly.
The SAT will also test capacity equipment to its designed rating and usually overload it for a short time.
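As a rough illustration of testing to the design rating and then briefly overloading, the sketch below builds a list of load-bank set points for one piece of capacity equipment. The step fractions and the 110% overload figure are assumptions chosen for the example; the real values, and how long an overload may be held, come from the manufacturer's ratings and the agreed test script.

```python
from typing import List, Sequence


def load_steps_kw(
    rated_kw: float,
    step_fractions: Sequence[float] = (0.25, 0.5, 0.75, 1.0),  # assumed steps
    overload_fraction: float = 1.1,  # assumed short-duration overload
) -> List[float]:
    """Return load set points in kW: stepped up to the design rating,
    followed by one overload step."""
    if rated_kw <= 0:
        raise ValueError("rated_kw must be positive")
    steps = [round(rated_kw * f, 1) for f in step_fractions]
    steps.append(round(rated_kw * overload_fraction, 1))
    return steps


# Hypothetical 500 kW unit.
print(load_steps_kw(500))  # [125.0, 250.0, 375.0, 500.0, 550.0]
```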
Heater load banks positioned in the white space
Verify under full load
The final stage of testing for a data center after successfully completing the SATs is the integrated system test (IST). The primary objective for the data center IST commissioning is to verify the electrical and mechanical systems under full load operating conditions, failure scenarios and maintenance operations, and ensure that the data center is ready for the IT/Server equipment to be installed.
There are two ways to replicate the heat load and airflow created by the servers.
When no racks are installed in the white space, floor standing heater load-banks are positioned within the white space and provide the same kW output as the design IT load (see above).
It is better to install rack-mounted 19-inch heater banks (below). These ‘server replicators’ are the closest thing to the real thing but, obviously, they can only be used if the racks have already been installed.
When using server replicators, it is important to replicate airflow as well as heat output, so that the delta temperature (temperature difference) across the server is also replicated.
19 in heatload tester
Server airflow typically ranges from 45.5 l/s/kW to 75.5 l/s/kW. Typically, airflow of 65.5 l/s/kW would provide a sufficient design basis. Whether it is floor standing load banks or rack mount load banks, both the kW capacity and airflow are important deciders when comparing different types of load banks.
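The link between airflow per kW and the temperature difference across the server follows from the sensible heat equation, ΔT = P / (ρ · cp · V), where P is the heat load and V the volumetric airflow. The sketch below applies it to the figures quoted above. The air properties (density 1.2 kg/m³, specific heat 1005 J/(kg·K)) are assumed typical values for air near room conditions, and the function names are illustrative.

```python
AIR_DENSITY_KG_M3 = 1.2            # assumed, air near room conditions
AIR_SPECIFIC_HEAT_J_KG_K = 1005.0  # assumed, dry air


def delta_t_k(airflow_l_s_per_kw: float) -> float:
    """Air temperature rise (K) across a load that receives the given
    airflow in litres per second for each kW of heat."""
    if airflow_l_s_per_kw <= 0:
        raise ValueError("airflow must be positive")
    airflow_m3_s = airflow_l_s_per_kw / 1000.0  # litres/s -> m^3/s
    heat_w = 1000.0                             # 1 kW of heat
    return heat_w / (AIR_DENSITY_KG_M3 * AIR_SPECIFIC_HEAT_J_KG_K * airflow_m3_s)


def total_airflow_m3_s(load_kw: float, airflow_l_s_per_kw: float) -> float:
    """Total airflow (m^3/s) the load banks must move for a given heat load."""
    return load_kw * airflow_l_s_per_kw / 1000.0


for flow in (45.5, 65.5, 75.5):
    print(f"{flow} l/s/kW -> delta T of about {delta_t_k(flow):.1f} K")
```

With these assumed air properties, 65.5 l/s/kW corresponds to a temperature rise of roughly 12.7 K, and the 45.5 to 75.5 l/s/kW range to roughly 18 K down to 11 K. A load bank that delivers the right kW with much more or much less airflow will therefore not present the cooling system with the same delta temperature as the servers it stands in for.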
The IST puts the data center through its paces, and it is the ideal opportunity to ensure that everything is right before the facility is handed over to the client. Ultimately, this is where issues are identified while there is still time to rectify or modify them before re-testing.
We would always recommend a pre-IST where time allows, to give a buffer for corrective actions to be made. Those who work within the data center industry will understand the tight timescales to get the facility up and running but should also appreciate the importance of thorough testing, as you would rather correct issues before the end user has migrated all their IT equipment over.
John Rippingale is associate director of Sudlows DMCC