Ahead of the DCD>San Francisco virtual conference, Intel's IT Chief Technology Officer Shesha Krishnapura sat down for a Q&A with DCD's Kisandka Moses to discuss Intel's cost-efficient data center strategy and how it is designed to balance improvements in quality of service (QoS), lowest unit cost, and resource utilization efficiency.
Q: It is widely reported that Intel's data center strategy, which began almost ten years ago, has resulted in up to $2.8bn in savings. Can you pinpoint any one project which contributed the most to a record reduction in cost?
A: When we published our paper last year, the savings totaled $2.8 billion over the period from 2010 through the end of 2018. We have now updated our data through the end of 2019 and can report cost savings of $3.8 billion. This stems from our efforts to turn an old Intel fabrication plant into a hyperscale data center rather than moving to the public cloud.
However, looking at any one project would not give an accurate picture. You also need to look at facility efficiencies alongside the projects conducted in other areas.
If you look at Intel, operating our own energy-efficient data centers costs only 54% of what it would cost to run the same capacity in a co-located or externally leased data center.
This figure is based on a total facility operating cost measurement, which includes capital depreciation, electricity, water, and maintenance.
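To make that measurement concrete, here is a minimal Python sketch of how such a total facility operating cost comparison can be computed. The dollar figures are hypothetical, chosen only to illustrate the roughly 54% ratio, and are not Intel's actual costs.

```python
# Hedged sketch: compares total facility operating cost of an owned data
# center vs. a leased/co-located alternative. All figures are hypothetical.

def total_operating_cost(capital_depreciation, electricity, water, maintenance):
    """Total facility operating cost, per the measurement described above."""
    return capital_depreciation + electricity + water + maintenance

# Hypothetical annual costs (USD) for the same compute capacity.
owned = total_operating_cost(capital_depreciation=5_000_000,
                             electricity=3_000_000,
                             water=200_000,
                             maintenance=800_000)
leased = total_operating_cost(capital_depreciation=9_000_000,
                              electricity=5_500_000,
                              water=300_000,
                              maintenance=1_900_000)

print(f"Owned cost as a share of leased: {owned / leased:.0%}")  # ~54%
```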
Q: How are these widespread operational efficiencies being made possible internally?
A: The cost advantage we are receiving is based on our own disaggregated server innovation. We work with server vendors to ensure that in any future server refresh, we can cut 44% of the cost by selectively upgrading the components that add value for our workloads, such as CPU and memory, while continuing to reuse other server components like drives, network, power supplies, fans, cables, and chassis. We are also making use of a variety of servers, from Intel® Xeon® E processors and Intel® Xeon® workstation processors to Intel® Xeon® Scalable processors, which make up our fleet of around 300,000 servers today, growing at 50,000+ servers annually. If you compare a four-year depreciation of those servers to a four-year guaranteed instance from the public cloud, the cost saving is greater than 500%.
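As a hedged illustration of that disaggregated-refresh arithmetic, replacing only the CPU and memory while reusing the rest of the server can yield a saving in the neighborhood of the 44% quoted. The component prices below are hypothetical, not Intel's figures.

```python
# Hedged sketch of the disaggregated refresh: upgrade only CPU and memory,
# reuse drives, network, power supplies, fans, cables, and chassis.
# Prices are hypothetical, chosen only to illustrate the ~44% saving.

full_server = {
    "cpu": 3_800, "memory": 1_800,       # refreshed each cycle
    "drives": 2_200, "network": 800,      # reused components
    "power_supplies": 500, "fans": 200,
    "cables": 100, "chassis": 600,
}

refresh_only = {"cpu", "memory"}

full_replacement_cost = sum(full_server.values())
disaggregated_cost = sum(v for k, v in full_server.items() if k in refresh_only)

saving = 1 - disaggregated_cost / full_replacement_cost
print(f"Refresh saving: {saving:.0%}")  # 44% with these hypothetical prices
```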
The next largest saving falls in the networking and storage category, where we have been very focused on making the right kinds of decisions. For example, we have three different networking vendors. Through them, we have been able to get pricing efficiency, performance, functionality, and reliability while cutting our costs by nearly two thirds on a dollars-per-port basis.
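A quick sketch of that dollars-per-port metric, with hypothetical spend and port counts:

```python
# Hedged sketch of the dollars-per-port metric used to track networking cost.
# Spend and port counts are hypothetical.

def cost_per_port(total_spend, port_count):
    return total_spend / port_count

before = cost_per_port(total_spend=3_000_000, port_count=10_000)  # $300/port
after = cost_per_port(total_spend=1_000_000, port_count=10_000)   # $100/port

print(f"Reduction: {1 - after / before:.0%}")  # 67%, i.e. nearly two thirds
```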
From a storage standpoint, we look to see whether data is more than 90 days old, then decide if it should reside in expensive storage or can sit in lower-cost storage. Using this method, we have seen our dollars per terabyte going down by more than 10% year-over-year. On top of this, our use of multi-tiered storage has substantially dropped our overall cost of storage.
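A minimal sketch of that age-based tiering rule follows. The tier names and the use of last-access time are illustrative assumptions, not Intel's actual storage classes or policy.

```python
# Hedged sketch of the 90-day tiering rule described above. Tier names
# are illustrative placeholders.

from datetime import datetime, timedelta, timezone

HOT_TIER_MAX_AGE = timedelta(days=90)

def pick_tier(last_accessed: datetime, now: datetime | None = None) -> str:
    """Route data older than 90 days to lower-cost storage."""
    now = now or datetime.now(timezone.utc)
    return "low-cost" if now - last_accessed > HOT_TIER_MAX_AGE else "performance"

# Example: data last touched ~120 days ago lands in the low-cost tier.
print(pick_tier(datetime.now(timezone.utc) - timedelta(days=120)))  # low-cost
```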
We are also looking at breakthrough techniques to overcome bandwidth constraints. We were able to use the same WAN (wide area network) capacity to achieve a fourfold improvement in data transfer rate without increasing the bandwidth.
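The interview does not name the technique, so purely as an assumption for illustration: compressing data before it crosses the WAN is one common way to raise the effective transfer rate at fixed bandwidth, as in this Python sketch.

```python
# Illustration only: the interview does not specify Intel's technique.
# Compressing data before WAN transfer is one common way to move more
# data per second over the same link.

import zlib

payload = b"highly repetitive telemetry record\n" * 10_000
compressed = zlib.compress(payload, level=6)

ratio = len(payload) / len(compressed)
print(f"Effective speedup at fixed bandwidth: ~{ratio:.1f}x")
```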
We have lowered the dollars per kilowatt for data center facility construction, so that depreciation is lower, by using a minimalistic approach to facility design, nontraditional cooling techniques like close-coupled recycled-water evaporative cooling, and running the data centers at a very hot inlet temperature of up to 91 degrees Fahrenheit. This has contributed significantly to the operational efficiency we have achieved, without negatively affecting the reliability of servers, network, and storage.
As I mentioned, we removed complex and expensive systems from our data centers, so the capital depreciation is lower. And since we are operating at a PUE of 1.06, the energy we spend on cooling has been substantially reduced. By adopting evaporative cooling using grey water rather than fresh water, in addition to removing expensive chillers and CRAC (computer room air conditioning) units, we have reduced both construction and operating costs.
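PUE (power usage effectiveness) is total facility power divided by IT equipment power, so 1.06 means roughly 6% overhead for cooling and power delivery. A minimal check, using a hypothetical IT load:

```python
# PUE = total facility power / IT equipment power.
# At the reported 1.06, overhead is about 6% of IT load.
# The load figures below are hypothetical.

def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    return total_facility_kw / it_equipment_kw

it_load = 30_000   # kW of IT equipment (hypothetical)
overhead = 1_800   # kW for cooling, power distribution losses, etc.

print(f"PUE: {pue(it_load + overhead, it_load):.2f}")  # 1.06
```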
We have created a very high-density data center, with 20-inch racks at 60U versus the 24-inch racks at 42U/45U in a traditional data center. The peak power per rack is around 43 kilowatts and the average is around 34 kilowatts per rack. For example, in one data center we put 31 megawatts of power capacity in 30,000 square feet. It has 996 server racks, each 60U and 20 inches wide, so there is phenomenal density there.
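As a quick sanity check on those figures:

```python
# Density arithmetic from the numbers quoted above.

site_power_mw = 31
racks = 996
floor_sqft = 30_000

# Installed power capacity spread across racks (distinct from the
# ~34 kW average draw quoted, which is measured load, not capacity).
print(f"Capacity per rack: {site_power_mw * 1000 / racks:.1f} kW")       # ~31.1 kW
print(f"Power density: {site_power_mw * 1_000_000 / floor_sqft:.0f} W/sq ft")  # ~1033
```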
Q: The strategy revolves around improvement in quality of service (QoS), lowest unit cost and resource utilization efficiency. Which of these metrics is most important?
A: If you look at our data center strategy's key performance indicators, it's about optimally balancing the three corners of an equilateral triangle. At the top of the triangle is quality of service, and the bottom two corners are asset utilization and unit cost.
The reason it's an equilateral triangle is that the three metrics carry equal weight. If I focus on one too heavily, the other two will suffer. For example, if I focus too much on unit cost, then I may not be able to meet my quality of service or service-level agreement to the business. Similarly, if I try to offer the highest level of service to every application whether it's needed or not, like Tier 1 storage for all types of data, then it is a waste of resources. So we need to be smart about it, which is why asset utilization, and making the right choices from a cost standpoint, matter.
Shesha will broadcast his session, "Eco-watts: lifting the veil on transforming the data center for green computing at scale", during the DCD>San Francisco virtual conference on 14-16 October.