Intel has been talking about its vision for the software-defined hyperscale data center for several years, but with few tangible results. Originally dubbed Rack Scale Architecture (RSA), the initiative has been rechristened Intel Rack Scale Design (RSD) with the recent release of the version 1.0 specifications. Compliant systems are expected to be available from key vendors before the end of 2016.
Intel, like the rest of the industry, knows the pressures data centers now face from the massive growth in cloud services, while newer trends such as the Internet of Things (IoT) are driving the need for ever more storage and compute power to handle the data being generated.
With this in mind, Intel has presented RSD as a radical reshaping of data center infrastructure in order to make it more flexible and simpler to manage, and thus easier to scale out as required.
The aim was to disaggregate the compute, memory and storage from individual server nodes and form these into pools of resources that can be allocated under software control to precisely match the requirements of each workload.
Getting to that end goal has proven a challenge for several reasons. One is that Intel intended its high-speed silicon photonics optical interconnect to form a key piece of the puzzle, and that technology has been delayed by manufacturing difficulties in getting both the laser and the logic circuitry onto the same chip.
Another is that Intel’s role in the IT industry has traditionally been that of a technology enabler, rather than an infrastructure provider selling solutions to the customer, and the firm has now realized that it has to take a more holistic approach to the entire infrastructure.
Start from the solution
“In the past we used to take our ingredients, such as CPU, Ethernet fabric and memory technologies, and would plan those independently of each other,” said Charles Wuischpard, vice president of Intel’s Data Center Group, when DCD spoke to him at the Intel Developer Forum event (IDF 2016) in San Francisco in August.
“Where we are starting to head now is to say that to truly be a solutions provider, we have to start with the solution in mind, and then make sure that the portfolio of our products implements the right hooks and features. Rack Scale Design is a kind of system-level thinking, and to do it properly, it needs to incorporate the roadmaps of all our ingredients in a coordinated way,” he added.
Intel also appears to be treading carefully, as it needs to keep its hardware vendor partners on side. If the RSD specifications prove too prescriptive, it runs the risk of firms such as Dell, Ericsson and Quanta Cloud Technology (QCT) drifting away and developing their own hyperscale solutions instead. “We don’t want to end up in a situation where [the major vendors] are each building their own proprietary solution,” Wuischpard said.
For this reason, the RSD 1.0 specifications largely focus on the top-level architecture and put in place a common management framework to drive the whole infrastructure. As the name suggests, Intel’s Rack Scale Design revolves around making the rack the basic unit of compute. Each rack is made up of drawers, which are populated with modules consisting of either compute nodes or storage. Intel requires that each drawer has a Pooled System Management Engine (PSME), a microcontroller responsible for configuring and identifying the hardware resources within the drawer.
Each PSME links to a Rack Management Module (RMM) and back to the overarching Pod Manager, where a Pod is the label used for a collection of racks that fall within a single management domain. Intel specifies that this management hardware should be connected using a separate network from the main production network fabric.
The Pod Manager discovers all the hardware within the Pod by querying the PSMEs, and exposes the resources to the orchestration layer above, which could be a commonly available platform such as OpenStack or a proprietary tool such as QCT’s System Manager software or Ericsson Command Center.
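As a rough illustration of the hierarchy described above (Pod, rack, drawer, module), the discovery step can be sketched in a few lines of Python. This is a toy model rather than Intel's code: the class names, the psme_inventory method and the capacity units are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Module:
    kind: str      # "compute" or "storage"
    capacity: int  # illustrative units: cores for compute, terabytes for storage


@dataclass
class Drawer:
    drawer_id: str
    modules: List[Module] = field(default_factory=list)

    def psme_inventory(self) -> List[Module]:
        """Hypothetical stand-in for the PSME reporting the drawer's hardware."""
        return list(self.modules)


@dataclass
class Rack:
    rack_id: str
    drawers: List[Drawer] = field(default_factory=list)


@dataclass
class Pod:
    racks: List[Rack] = field(default_factory=list)


def discover(pod: Pod) -> Dict[str, int]:
    """Walk every rack and drawer, summing capacity per resource kind."""
    totals: Dict[str, int] = {}
    for rack in pod.racks:
        for drawer in rack.drawers:
            for module in drawer.psme_inventory():
                totals[module.kind] = totals.get(module.kind, 0) + module.capacity
    return totals


pod = Pod(racks=[
    Rack("rack-1", [
        Drawer("drawer-1", [Module("compute", 64), Module("compute", 64)]),
        Drawer("drawer-2", [Module("storage", 100)]),
    ]),
    Rack("rack-2", [Drawer("drawer-1", [Module("storage", 200)])]),
])

inventory = discover(pod)
print(inventory)  # {'compute': 128, 'storage': 300}
```

The point of the sketch is that the orchestration layer sees only the aggregated totals, not the individual drawers that supply them.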
Intel claims that RSD exposes all of these resources through open application programming interfaces (APIs), and has based its APIs on the Redfish management specifications developed by the Distributed Management Task Force (DMTF) as a replacement for the Intelligent Platform Management Interface (IPMI). It has also released code for the Pod Manager, PSME and RMM under an open-source license.
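To give a flavor of what a Redfish-style API looks like to a client, the sketch below walks from the service root to a collection of systems by following "@odata.id" links, which is the navigation pattern Redfish defines. The data is an invented in-memory mock rather than output from a real Pod Manager, the node names are hypothetical, and none of the RSD-specific extensions to Redfish are shown.

```python
from typing import Callable, Dict, List

# Invented mock of a Redfish-style service, keyed by resource path.
MOCK_SERVICE: Dict[str, dict] = {
    "/redfish/v1": {"Systems": {"@odata.id": "/redfish/v1/Systems"}},
    "/redfish/v1/Systems": {
        "Members": [
            {"@odata.id": "/redfish/v1/Systems/node-1"},
            {"@odata.id": "/redfish/v1/Systems/node-2"},
        ]
    },
    "/redfish/v1/Systems/node-1": {"Id": "node-1", "Name": "Compute node 1"},
    "/redfish/v1/Systems/node-2": {"Id": "node-2", "Name": "Compute node 2"},
}


def mock_get(path: str) -> dict:
    """Stand-in for an HTTP GET that returns the resource's JSON body."""
    return MOCK_SERVICE[path.rstrip("/")]


def list_systems(get: Callable[[str], dict], root: str = "/redfish/v1") -> List[dict]:
    """Follow links from the service root to each member of the Systems collection."""
    service_root = get(root)
    collection = get(service_root["Systems"]["@odata.id"])
    return [get(member["@odata.id"]) for member in collection["Members"]]


systems = list_systems(mock_get)
system_ids = [system["Id"] for system in systems]
print(system_ids)  # ['node-1', 'node-2']
```

Against a live service, mock_get would be replaced by an authenticated HTTPS request; the link-following logic would stay the same.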
The RSD specifications call for an Ethernet-based fabric within the rack, but leave open whether a top-of-rack (TOR) or end-of-row (EOR) switch is used to connect to the backbone network. The specifications also call for power and cooling to be shared across the rack rather than provided in each node, in a similar vein to the Open Rack standard developed by the Open Compute Project (OCP).
This is no coincidence, because there are parallels between Intel’s RSD and the efforts of the OCP, which can be seen in QCT’s Rackgo X-RSD platform, announced at IDF 2016 this summer.
Rackgo X-RSD is based on RSD but uses the 21in OCP Open Rack format rather than a standard 19in rack. It can be populated with 2U four-node compute modules and 2U storage modules, the latter capable of holding up to 28 3.5in drives and four NVMe solid-state drives (SSDs).
Vendors on board
Other hyperscale platforms based on RSD include Ericsson’s HDS 8000, which features an optical backplane to interconnect the modules in its rack, and Dell’s DSS 9000, which is due to become available before the end of this year.
While each of those suppliers is pursuing its own hardware designs, Intel claims this does not matter, because the common management APIs in RSD will allow for mixing of resources in a heterogeneous data center.
“They are all building their own hardware rack designs, but you can conceive of an environment where you have racks from more than one vendor, but run through the same software and composed into virtual servers made of compute from one vendor’s rack and storage from another, and it wouldn’t make a difference,” said Wuischpard.
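The mixed-vendor scenario Wuischpard describes can be sketched as a simple first-fit allocation across resource pools. Everything here is hypothetical: the Pool class, the vendor labels and the compose function are invented for illustration and do not reflect how the Pod Manager actually composes nodes.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Pool:
    vendor: str  # which vendor's rack supplies this pool (illustrative label)
    kind: str    # "compute" or "storage"
    free: int    # free cores or terabytes


def first_fit(pools: List[Pool], kind: str, amount: int) -> Optional[Pool]:
    """Return the first pool of the given kind with enough free capacity."""
    return next((p for p in pools if p.kind == kind and p.free >= amount), None)


def compose(pools: List[Pool], cores: int, terabytes: int) -> Optional[Dict[str, str]]:
    """Reserve compute and storage for one virtual server, or nothing at all."""
    compute = first_fit(pools, "compute", cores)
    storage = first_fit(pools, "storage", terabytes)
    if compute is None or storage is None:
        return None  # no partial allocation is left behind
    compute.free -= cores
    storage.free -= terabytes
    return {"compute_from": compute.vendor, "storage_from": storage.vendor}


pools = [
    Pool("vendor-a", "compute", 32),
    Pool("vendor-b", "storage", 100),
    Pool("vendor-b", "compute", 8),
]

server = compose(pools, cores=16, terabytes=40)
print(server)  # {'compute_from': 'vendor-a', 'storage_from': 'vendor-b'}

too_big = compose(pools, cores=64, terabytes=10)
print(too_big)  # None
```

The caller never needs to know which vendor supplied which resource, which is the property the common management APIs are meant to deliver.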
With RSD 1.0, the goal of full resource composability is still some way off, but Intel claims it already offers reduced costs for data center operators through simplified management. RSD 2.0, due in 2017, will add support for pooling of resources such as FPGA accelerators, with future updates set to upgrade the orchestration support and hardware telemetry capabilities.
This article appeared in the October issue of DCD’s magazine