Sky has rolled out a new data center infrastructure management (DCIM) solution across its UK data center estate. We take a look at how it was designed, built and installed.
Sky operates multi-channel and multi-platform television services in the UK, Ireland, Italy, Germany and Austria, covering 20m households. Its services extend from TV to mobile devices, fixed-line and broadband, in both linear and On Demand form.
Sky’s data centers are based throughout the UK and include over 1,000 server racks located in a combination of Sky owned and operated facilities along with some co-located facilities.
Sky’s primary goals
Riccardo Degli Effetti, head of data center operations (DCO) at Sky, said: “DCO is the foundation of IT at Sky and it needs to enable our business to grow. Uptime is the primary goal and responsibility of the operations team, but maximizing capacity and efficiency while minimizing our carbon footprint is high on the agenda too. As a company, Sky has set bold environment targets to reach by 2020.”
With the existing asset management solution approaching end of life, Riccardo and his data center operations team sought a replacement that would help Sky understand how its servers could be better utilised and whether idle equipment could be safely turned off. They also wanted to know what impact that would have on available capacity throughout the data centre estate, and on the energy consumption of the system.
“Capacity costs money,” Riccardo said. “The most efficient data center is the one you don’t have to build. At Sky, we want to be able to measure what capacity we have as a resource, and understand how long it will be before we need additional infrastructure. We want to know which of our servers are working efficiently and which ones are not and therefore are candidates for decommissioning or further virtualisation. In a sense, power usage effectiveness (PUE) is not that useful to us – our key metrics are CPU-utilisation and power consumption.”
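The question of how long existing capacity will last comes down to simple arithmetic. The sketch below is illustrative only and is not taken from Sky's tooling: the function name, the assumption of linear load growth and all of the figures are hypothetical.

```python
def months_until_full(capacity_kw, current_load_kw, monthly_growth_kw):
    """Estimate months of headroom left, assuming load grows linearly.

    All inputs are hypothetical planning figures in kilowatts.
    """
    if monthly_growth_kw <= 0:
        # Flat or shrinking load never exhausts the available capacity.
        return float("inf")
    headroom_kw = max(capacity_kw - current_load_kw, 0.0)
    return headroom_kw / monthly_growth_kw


# Example: a 1,000 kW site drawing 700 kW and growing by 12.5 kW a month.
print(months_until_full(1000.0, 700.0, 12.5))  # 24.0
```

Retiring idle servers lowers the current load, which pushes the date for new infrastructure further out.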
Sky had an initial list of key requirements when it decided to revamp its data center infrastructure:
- Replace multiple tools with a single integrated system
- Centralise asset management in the data centre infrastructure
- Improve efficiency by optimising power utilisation
- Improve service by matching IT demand with data center capacity and performance
- Establish an approach to supplier engagement for longevity and continuous improvement
The business case
The corporate aim is simple: to reduce energy use and increase efficiency. Funding was granted because Riccardo and his team were able to demonstrate, to a cross-discipline panel, how the project would do this.
The aim of the project is to raise awareness of IT usage across the business. It is also one which Riccardo Degli Effetti believes will enable DCIM to show its full potential: “The objective is to be at the forefront of innovation – not to follow but to lead – and this project will achieve that.”
By developing the requirements further to encompass a range of other desired and required functionality, Sky produced a performance specification and went to the market in a competitive tender. The process involved eight potential suppliers, and Sky picked a partnership between Keysource and Schneider Electric. The pairing of a data center specialist with a heavyweight DCIM solution provider promised future-proofing and ongoing collaboration.
To get the relevant people on board with the project objectives, Keysource organized workshops to model the proposed approach, explain the functionality of the toolset and demonstrate its impact on current systems and ways of working. It’s important to get stakeholders to buy into the project, says Rob Elder, director at Keysource: “With any roll out of DCIM, the success will be derived by what is achieved as a result of the data from the tool, not by deploying the tool itself.”
To ensure the relevant technical, security, network and operational resources were in place, the project drew on more than 30 subject matter experts at various stages.
Phase 1 commenced in June 2014 and a pilot was delivered in early October 2014. Within this time all aspects of the planning, engineering and development, as well as deployment and validation testing, were conducted. The second phase involves integrating the DCIM solution with the building management systems (BMS) across all sites. This will provide further benefits, and require more collaboration and interfacing.
Schneider Electric’s StruxureWare for Data Centers is a DCIM toolset which includes:
- Asset Management
- Change Management
- Capacity Planning
- IT Optimisation
- BMS Integration
The aim of integration was to develop an approach which optimised and streamlined the whole process. This resulted in a single dashboard in which assets, racks and BMS data can be viewed on one screen.
The project lets Sky track CPU utilisation, and identify assets which could be candidates for virtualising, re-provisioning or retiring. Armed with this insight, the operations team can go back to the different parts of the business that are responsible for the specific platforms or applications, and help them put their IT to better use.
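As a rough illustration of that kind of triage, the sketch below sorts servers into review buckets by average CPU utilisation. It is not Sky's or Schneider Electric's method: the thresholds, field names and sample figures are all hypothetical, and a real decision would also weigh peak load and business ownership.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    avg_cpu_pct: float   # average CPU utilisation over the review period
    avg_power_w: float   # average power draw in watts


# Hypothetical thresholds, chosen only for illustration.
IDLE_CPU_PCT = 5.0
LOW_CPU_PCT = 20.0


def triage(server):
    """Return a review bucket for one server based on average CPU use."""
    if server.avg_cpu_pct < IDLE_CPU_PCT:
        return "review for retirement"
    if server.avg_cpu_pct < LOW_CPU_PCT:
        return "review for virtualisation"
    return "keep"


def reclaimable_power_w(servers):
    """Sum the power drawn by servers flagged for retirement review."""
    return sum(s.avg_power_w for s in servers
               if triage(s) == "review for retirement")


fleet = [
    Server("app-01", 2.0, 180.0),
    Server("app-02", 12.0, 220.0),
    Server("db-01", 65.0, 350.0),
]
for s in fleet:
    print(s.name, "->", triage(s))
print("Reclaimable power (W):", reclaimable_power_w(fleet))
```

Pairing each bucket with power draw is what turns a utilisation report into a capacity figure the operations team can take back to the platform owners.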
The BMS is linked to the IT assets, so real-time changes in status or availability can be flagged via alarms through the tool. Whether an event is planned or unplanned, workloads can be deployed or managed according to the data centre capacity and availability under current conditions.
The operations team out in the field can access the DCIM data and tools, combined with real-time data, through mobile devices. A dashboard with the ability to drill down into different areas improves service delivery by allowing better-informed decisions, and reduces duplication of effort.
The real driver is efficiency and cost, says Degli Effetti, who believes that the industry-standard measurement of data centre performance, PUE, has limitations when it comes to showing how usefully a data center is performing relative to its specific IT workload.
“PUE has changed the industry, but it is not a final indication,” he says. “It doesn’t tell you whether a data centre is doing something useful or not. To do that, you need to go very granular on the IT, and previously where we had the capacity to do that, we didn’t have the intelligence in the tools to enable it.”
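The limitation is easy to see in numbers. PUE is total facility power divided by the power delivered to IT equipment, so it says nothing about what that IT equipment is doing. In the sketch below the two sites and all of their figures are hypothetical.

```python
def pue(total_facility_kw, it_kw):
    """Power usage effectiveness: total facility power over IT power."""
    if it_kw <= 0:
        raise ValueError("IT power must be positive")
    return total_facility_kw / it_kw


# Two hypothetical sites with identical PUE but very different IT usage.
site_a = {"total_kw": 600.0, "it_kw": 400.0, "avg_cpu_pct": 55.0}
site_b = {"total_kw": 300.0, "it_kw": 200.0, "avg_cpu_pct": 8.0}

for name, site in (("A", site_a), ("B", site_b)):
    print(name, "PUE:", pue(site["total_kw"], site["it_kw"]),
          "avg CPU %:", site["avg_cpu_pct"])
```

Both sites score 1.5, yet the servers in site B are mostly idle, which is why CPU utilisation and power consumption per asset are the more telling metrics here.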