The greatest advance in computing was the invention of the general-purpose computer in 1945. Before then, computing was done by physically wiring electronic components to perform each calculation: a process that was, needless to say, manual, slow, expensive, and prone to error. Above all else, the general-purpose computer introduced the idea of programming as a formal way to describe computations: it abstracted the logical view of a computation from the physical one, delivering large improvements in both productivity and computational performance.
The principal inventor, John von Neumann, clearly understood the awesome power of this concept, but it is likely that even he did not foresee the incredible advances that would be unleashed over the next 75 years: an increase in the total computational power available to the world of roughly fifteen decimal orders of magnitude.
This increase was driven by the insatiable demand for computing and came from a combination of improvements in general-purpose computer architecture and advances in silicon technology. By the early 2000s, both sources of improvement had slowed sufficiently that the industry turned to “scale-out architectures.”
Throughout the evolution of computing, the industry has endeavored to improve its agility: the speed with which new applications can be deployed on existing hardware, while also improving their runtime performance. Unfortunately, there appears to be an unavoidable tradeoff between the two: improving agility comes at a cost in performance, and conversely, improving performance involves compromises to agility.
At its core, the architecture of a computing system is about making two distinct choices: a choice of the functionality implemented by elemental building blocks and a choice of how to interconnect them. The functionality of elemental building blocks—general-purpose CPUs, specialized GPUs, DRAMs, SSDs, HDDs—and particularly the way they are assembled in a server has not changed significantly in decades, although their individual performance has improved substantially.
The situation for server interconnection at data center scale is similar: the architecture of networks has stagnated even as the performance of the network building blocks has improved dramatically. Consequently, data centers across the board face severe challenges: hyper-scalers have huge power bills, need dozens of server SKUs to cover the range of computations they must support, and encounter security and reliability problems that are increasingly difficult to resolve. At the other end of the scale, Enterprise data centers have abysmally low utilization (below 8 percent), cannot deploy new applications quickly, and also face a multitude of security and reliability challenges.
It is in this context that Fungible asked a few key questions at its inception: was there a new elemental building block that would (a) dramatically improve scale-out data centers along relevant dimensions; (b) be used pervasively enough to justify building silicon; and (c) facilitate the deployment of infrastructure as code? The Data Processing Unit (DPU) was proposed to resolve the top three problems in data centers:
- Data-centric computations are ubiquitous but are performed poorly by general-purpose CPUs, because such computations are fundamentally at odds with the techniques CPUs use to provide high performance for user applications.
- Interactions between servers in a data center are highly inefficient: network utilization is low (typically less than 30 percent); latency and jitter are unacceptably high; and CPUs waste a significant fraction of their resources pushing packets. Data centers thus face a difficult choice: either pay for inefficient interactions to obtain resource pooling, or give up on resource pooling entirely.
- The configuration, control, and telemetry interfaces to servers are inconsistent and ad hoc (this is true for Enterprise data centers, less so for hyper-scalers). The result is major complexity for data center operators and a significant barrier to agility, reliability, and security.
The DPU concept provides comprehensive, clean-sheet solutions to all three problems. It performs high-performance data-centric computations at least 20x more efficiently than general-purpose CPUs, enabling these computations to be offloaded from CPUs and freeing them to run applications more efficiently. It implements TrueFabric on top of standard IP/Ethernet to enable the efficient disaggregation and pooling of virtually all data center resources. And it provides standardized APIs to orchestration software for configuration, control, and telemetry. The Fungible DPU also significantly improves the reliability and security of data centers. Finally, and perhaps most significantly, the DPU is fully programmable in a high-level language, providing agility.
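The text does not define these standardized APIs, but the value of a uniform management surface over ad hoc, per-server interfaces can be illustrated with a minimal sketch. Everything below (the `DpuEndpoint` class, its `configure`/`control`/`telemetry` methods, and the settings shown) is invented for illustration and is not Fungible's actual API:

```python
# Hypothetical sketch: every DPU-powered server exposes the same three
# verbs (configure, control, telemetry), so an orchestrator can manage a
# heterogeneous fleet uniformly instead of scripting per-vendor CLIs.
from dataclasses import dataclass, field


@dataclass
class DpuEndpoint:
    """One DPU-powered server, managed through a uniform API (illustrative)."""
    name: str
    config: dict = field(default_factory=dict)
    counters: dict = field(default_factory=dict)

    def configure(self, **settings):
        # Apply declarative settings; repeated calls are idempotent.
        self.config.update(settings)
        return self.config

    def control(self, action):
        # A small, fixed control vocabulary shared by all endpoints.
        assert action in {"start", "stop", "drain"}
        self.counters["last_action"] = action
        return action

    def telemetry(self):
        # Every endpoint reports the same schema.
        return {"name": self.name,
                "config": dict(self.config),
                "counters": dict(self.counters)}


# An orchestrator can now treat all servers identically:
fleet = [DpuEndpoint(f"server-{i}") for i in range(3)]
for dpu in fleet:
    dpu.configure(mtu=9000, encryption="on")
    dpu.control("start")
report = [d.telemetry() for d in fleet]
```

The design point is that consistency, not any particular setting, is what removes the operational complexity described above.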
To complement the resource disaggregation enabled by the DPU, we propose a data center Composer that assembles complete virtualized “bare-metal” data centers in minutes, starting from an underlay infrastructure built from DPU-powered servers connected via a standard high-performance IP/Ethernet network.
We introduce a Fungible data center as a data center built using three complementary pieces:
● A standard high-performance IP/Ethernet network
● DPU-powered server instances chosen from a small set of server types
● A logically centralized data center Composer implemented on standard x86 servers
The Composer makes full use of the configuration, control, and telemetry interfaces supported by the DPU. It treats an entire virtualized data center as code: executing this code creates a new instance of the virtual data center the code describes. Treating a virtual data center this way yields a highly agile, reliable process for creating a ready-to-use data center, time and time again; it also permits templatizing commonly occurring patterns, so that creating a new data center similar to a previous one is straightforward. In fact, the lessons learned from developing agile software at scale apply directly to delivering infrastructure as a service.
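The "data center as code" idea can be made concrete with a short hypothetical sketch: a declarative spec describes the virtual data center, a Composer-like function instantiates it, and templating is ordinary data manipulation. The spec format, the `compose` function, and the `from_template` helper are all assumptions made for illustration, not the Composer's real interface:

```python
# Hypothetical sketch: the virtual data center is plain data; executing
# ("composing") the spec creates the instance it describes, and similar
# data centers are produced by overriding a shared template.
import copy

# A spec draws server types from a small menu, with per-type counts.
BASE_TEMPLATE = {
    "network": "ip-ethernet",
    "servers": {"compute": 4, "storage": 2},
}


def compose(spec):
    """Instantiate the virtual data center implicit in the spec."""
    instances = []
    for kind, count in spec["servers"].items():
        instances.extend(f"{kind}-{i}" for i in range(count))
    return {"network": spec["network"], "instances": instances}


def from_template(template, **overrides):
    """Derive a similar data center by overriding a template's counts."""
    spec = copy.deepcopy(template)  # the template itself is never mutated
    spec["servers"].update(overrides)
    return spec


dc_a = compose(BASE_TEMPLATE)
dc_b = compose(from_template(BASE_TEMPLATE, compute=8))  # similar DC, scaled up
```

Because the spec is data, it can be versioned, reviewed, and re-executed exactly like application code, which is the repeatability the paragraph above describes.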
Finally, the ability to create and destroy virtual data centers in minutes opens the door to maximizing the utilization of the underlying resources by multiplexing them across multiple tenants.