A data center is a complex combination of computing equipment and industrial machinery, all working in tandem like a great, living creature – but at its heart lies the Central Processing Unit (CPU).
Servers form the center of a data center, and the chips within them form the core of those servers. Most are x86 machines – that is, servers backward-compatible with the instruction set that evolved from Intel’s original 8086 and 8088 CPUs, starting in 1978. With most of the 8086’s successors ending in ‘86,’ the name stuck, as did the architecture.
In Q3 2016, x86 servers brought in 90 percent of the server market revenue, according to IDC. Outside the x86 sector, vendors like IBM and ARM fight for scraps.
A new dawn
Intel sometimes refers to itself as the guardian of Moore’s Law – the observation made by one of its founders, Gordon Moore, that the number of transistors that can be packed into an integrated circuit doubles approximately every two years.
This has been a double-edged sword: on one hand, it has highlighted Intel’s importance as the dominant supplier of CPUs, responsible for roughly 98 percent of the x86 server market today, and, by extension, its influence over the direction of the industry as a whole.
On the other hand, it has made Intel’s product launches as predictable as the rising and setting of the sun: we know that the performance of each new generation of chips will conform to a well-established trajectory.
The fact that the Xeon Scalable Platform (Xeon SP) processor family, launched in June, delivers approximately 1.65 times the performance of last year’s Xeon E5 v4 will surprise absolutely no one. The new CPUs feature more cores, more inter-CPU bandwidth and support for more memory channels. But they also introduce never-before-seen features that should, in theory, enable IT equipment to do new and exciting things.
According to Navin Shenoy, who assumed responsibility for Intel’s data center division in May, Xeon SP was designed to tackle three ‘mega trends’ that are changing the face of the industry: cloud computing, artificial intelligence and 5G connectivity.
“Enterprises will need to think about how they handle these new data workloads, seamlessly move them between public and private cloud,” he said during the launch event. “Cloud service providers will need to think about how they improve performance, security, agility, utilization across their infrastructure. Communications service providers will need to deliver a completely new level of network agility and efficiency, in the face of the new service demands as 5G is deployed. And in the world of artificial intelligence, we will see a broad range of algorithms develop, and this will require a broad range of solutions.”
All new Xeon SKUs are now divided into four categories – Bronze, Silver, Gold and Platinum – based on their relative performance and target workloads, from low-power applications at the bottom of the range to high-performance computing at the top. Theoretically, the same hardware could be made to serve very different purposes, depending on the choice of silicon.
The processor family is the first to replace the tried and tested ring architecture with a mesh that adds more interconnects between cores. It also supports a larger number of cores on the same die – up to 28, whereas Broadwell would max out at 22.
“It’s a fundamental change to the processor design that allows us to increase performance by optimizing data sharing and memory access between all CPU cores,” Lisa Spelman, the director of product marketing for Intel’s data center group, explained.
“We realized that Xeon Scalable wouldn’t actually scale the performance that we wanted to see from the product, if we didn’t increase the bandwidth for the interconnect between the cores, the cache, the memory and the I/O. Without change, the interconnect, which adds a lot of value, suddenly becomes the limiter.
“That’s why I’m excited about this innovation: it provides a more direct data path for traveling between cores than the previous ring architecture did.”
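Spelman’s ‘more direct data path’ is easy to picture with a little arithmetic. The sketch below is purely illustrative – the 28-core count comes from the new parts, but the 4x7 grid and the simple hop-counting are assumptions made for the example, not Intel’s actual on-die layout or routing – yet it shows why the average distance between two cores shrinks once a ring is swapped for a mesh.

```c
/* Illustrative sketch: average core-to-core hop count on a ring vs. a
 * 2D mesh. The 4x7 grid and the hop-counting rules are assumptions for
 * the example, not Intel's real on-die topology or routing. */
#include <stdio.h>
#include <stdlib.h>

#define CORES 28
#define COLS  7   /* cores laid out as a 4x7 grid for the mesh case */

int main(void) {
    long ring_hops = 0, mesh_hops = 0, pairs = 0;

    for (int a = 0; a < CORES; a++) {
        for (int b = a + 1; b < CORES; b++) {
            /* Ring: traffic takes the shorter way around the loop. */
            int d = abs(a - b);
            ring_hops += d < CORES - d ? d : CORES - d;

            /* Mesh: traffic travels along rows and columns (Manhattan distance). */
            mesh_hops += abs(a / COLS - b / COLS) + abs(a % COLS - b % COLS);
            pairs++;
        }
    }

    printf("average hops, 28-core ring: %.2f\n", (double)ring_hops / pairs);
    printf("average hops, 4x7 mesh:     %.2f\n", (double)mesh_hops / pairs);
    return 0;
}
```

Fewer average hops means lower latency and less traffic contending for the same links – a gap that only widens as core counts grow.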
Building blocks
One of the clear benefits of Xeon SP is increased server density: a single motherboard can now support up to eight sockets without the need for additional node controllers. System builders can therefore squeeze more compute into a single rack unit, which for data center operators means greater demand for power and cooling per square meter. Average power densities in data centers are growing all the time, so this is nothing new, but a rapid increase could catch some providers off-guard.
“With PowerEdge FX, we can have four dual-socket Xeon SPs in 2U – that gives me 224 cores in 2U, with a large memory footprint, 1.5TB worth of RAM per sled,” Mark Maclean, server product manager at Dell EMC, told DCD. “It’s not just a speed bump – it’s an architectural change. Estimates will always vary, but in certain workloads, we are seeing up to 50 percent performance increase.”
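Assuming the top-of-the-range 28-core parts, the arithmetic behind that figure is simple: four sleds × two sockets × 28 cores works out to 224 cores in a 2U chassis, or 112 cores per rack unit.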
Xeon SP also places more emphasis on security, with Intel’s QuickAssist Technology – previously available as an optional extra – now bundled as standard. It involves a dedicated chip that handles tasks like compression, encryption and key management, and according to Intel, the new Xeons encrypt data twice as fast as the previous generation of processors.
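Claims like ‘twice as fast’ are easy enough to sanity-check. The sketch below is an illustration rather than a QuickAssist benchmark – it uses plain OpenSSL, which will typically lean on the CPU’s AES-NI instructions unless a QAT engine has been configured – but running the same binary on two generations of servers gives a rough like-for-like view of bulk encryption throughput.

```c
/* Rough bulk-encryption throughput check using OpenSSL's EVP API.
 * Note: by default this exercises the CPU's AES-NI path, not QuickAssist
 * offload; it is only meant for comparing one server against another.
 * Build with: gcc -O2 bench.c -lcrypto */
#include <stdio.h>
#include <time.h>
#include <openssl/evp.h>

#define BUF_SZ (1 << 20)   /* 1 MiB per call */
#define ROUNDS 2048        /* 2 GiB in total */

int main(void) {
    static unsigned char in[BUF_SZ], out[BUF_SZ + 16];
    unsigned char key[32] = {0}, iv[12] = {0};
    int len;

    EVP_CIPHER_CTX *ctx = EVP_CIPHER_CTX_new();
    EVP_EncryptInit_ex(ctx, EVP_aes_256_gcm(), NULL, key, iv);

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < ROUNDS; i++)
        EVP_EncryptUpdate(ctx, out, &len, in, BUF_SZ);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("AES-256-GCM throughput: %.2f GB/s\n",
           (double)BUF_SZ * ROUNDS / secs / 1e9);

    EVP_CIPHER_CTX_free(ctx);
    return 0;
}
```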
Another interesting feature that aims to cash in on a change in server hardware is Intel Volume Management Device (VMD), which brings RAID-like features to your expensive, high-performance NVMe solid state drives, making them hot-swappable.
“What this means is we can take advantage of some of the capabilities in the new CPUs, with enhanced bandwidth around PCIe, to have better NVMe capability,” James Leach, director of platform strategy for Cisco UCS, told DCD.
“Because that’s the real difference – when we were switching between SAS and SATA, it was very simple because we were routing the same kind of connectivity. NVMe depends on the PCIe bus, and we’re just seeing the tip of the iceberg with some of the performance that it can offer.
“As the CPUs become more capable, and as we see more cores added through FPGAs and GPUs, we need to be able to feed those cores and NVMe is one way that we can really crank up the performance on the storage side.”
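To put rough numbers to Leach’s point about the PCIe bus, a back-of-the-envelope comparison of raw interface bandwidth – using the published line rates and encoding overheads, with real drives landing somewhat lower – shows just how much wider the NVMe pipe is than SATA’s:

```c
/* Back-of-the-envelope interface bandwidth: SATA III vs. a PCIe 3.0 x4
 * NVMe drive, based on published line rates and encoding overheads. */
#include <stdio.h>

int main(void) {
    /* SATA III: 6 Gbit/s line rate with 8b/10b encoding (80% efficient). */
    double sata_gbps = 6.0 * 8.0 / 10.0;

    /* PCIe 3.0: 8 GT/s per lane with 128b/130b encoding, across 4 lanes. */
    double pcie_gbps = 8.0 * 128.0 / 130.0 * 4.0;

    printf("SATA III    : %.1f Gbit/s (~%.0f MB/s)\n", sata_gbps, sata_gbps * 1000 / 8);
    printf("PCIe 3.0 x4 : %.1f Gbit/s (~%.0f MB/s)\n", pcie_gbps, pcie_gbps * 1000 / 8);
    printf("ratio       : %.1fx\n", pcie_gbps / sata_gbps);
    return 0;
}
```

On those figures, a single PCIe 3.0 x4 drive has more than six times the interface bandwidth of a SATA device – before NVMe’s lower protocol overhead and deeper queues are even taken into account.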
Personal goals
With Xeon SP, Intel is indeed improving its HPC credentials: besides increased performance, the new chips are the first Xeons to support the AVX-512 instruction set extensions, while selected SKUs offer integrated Omni-Path networking. Both will be useful in the design of systems that deal with machine learning, analytics and 3D rendering.
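To give a flavor of what AVX-512 means in practice, here is a minimal sketch – assuming GCC on an AVX-512-capable machine, built with -mavx512f – of the fused multiply-add inner loop that dominates machine learning and rendering code, processing 16 single-precision values per instruction:

```c
/* Minimal AVX-512 sketch: y = a*x + y, 16 floats per instruction.
 * Build with: gcc -O2 -mavx512f axpy.c  (requires an AVX-512 capable CPU) */
#include <stdio.h>
#include <immintrin.h>

#define N 1024   /* assumed to be a multiple of 16 for simplicity */

int main(void) {
    static float x[N], y[N];
    for (int i = 0; i < N; i++) { x[i] = 1.0f; y[i] = 2.0f; }

    __m512 a = _mm512_set1_ps(3.0f);          /* broadcast the scalar */
    for (int i = 0; i < N; i += 16) {
        __m512 xv = _mm512_loadu_ps(&x[i]);   /* load 16 floats       */
        __m512 yv = _mm512_loadu_ps(&y[i]);
        _mm512_storeu_ps(&y[i], _mm512_fmadd_ps(a, xv, yv));
    }

    printf("y[0] = %.1f\n", y[0]);            /* 3*1 + 2 = 5.0        */
    return 0;
}
```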
Before the official launch, Intel supplied Lenovo with several thousand Xeon SP Platinum 8160 chips, which were then integrated into MareNostrum 4 – a beast of a machine hosted at the Barcelona Supercomputing Center. This immediately propelled it to the position of the 13th most powerful computer on the planet, with 11.1 petaflops of raw processing power.
But MareNostrum 4 is currently outperformed by supercomputers based on CPUs from IBM, Oracle and even AMD – that’s got to hurt if you are positioning yourself as the world’s foremost authority on chip design.
Lenovo and Intel plan to get a Xeon SP supercomputer into the Top 10 in November, and we’ll see a lot more Xeons in the Top500 going forward – but whether the company can unseat the reigning champion, China’s Sunway TaihuLight (based on processors from the National High Performance IC Design Center in Shanghai), is anyone’s guess.
As for Intel’s references to 5G – while there has been plenty of industry buzz around next-generation wireless, the first networks based on the standard are not expected to appear before 2020, and will be limited to China. Meanwhile, major manufacturers still have no idea how to build the technology into handsets without doubling their size.
Intel will release several generations of processors before it needs to contend with 5G – there’s simply no market for it at this stage. But there’s plenty of market for servers that can run existing telecommunications networks, as more and more operators experiment with SDN and NFV in order to cut their costs.
Taking all of this into account, we can conclude that Xeon SP is much more than a marketing exercise. Although not revolutionary, it offers real solutions to actual challenges in server design, with Moore’s Law gradually losing its role as the force driving Intel’s agenda. But in a new world where silicon features are king, and in the face of renewed competition from companies like AMD and Qualcomm, can Intel afford to continue simply meeting expectations?