Brocade's new benchmark: Testing x86 servers as routers

The idea of an x86 server that serves as its own virtual router is very new. Indeed, it’s so new that we don’t know quite what to expect when servers with NFV components are installed in telecommunications environments on a large scale. The goal is to be able to control the composition, as well as adjust the performance characteristics, of a communications network in real time, by moving the core functions of network routers into software that runs in virtual machines.

Manufacturers (if makers of software can be called that) are only just now taking measurements of those characteristics, or of the various observable properties of server performance that may soon be identified as characteristics.

Tuesday’s announcement from Brocade Networks and Spain-based Telefónica tells of run-of-the-mill servers with Intel Xeon E5 processors, equipped with a Red Hat orchestration platform, and with Brocade’s Vyatta 5600 virtual router installed, being able to register performance numbers in just a few hours’ time.

But those numbers don’t mean much just yet, just as the first Dow Jones Industrial Average, published in 1896, didn’t mean much either. It takes time for benchmark tests on a multitude of parts to yield enough data to start making sense. When benchmarks are done right, they result in models that tell interesting tales about manufacturing strategies, marketing tactics, and the best possible returns on investment.

Before these models can be built, however, a wealth of data must be collected. That takes time. And we’re starting at square one.

The short ramp up
Andrew Coward, Brocade Networks’ VP for service provider strategy and a lead liaison with customer Telefónica, said that over the last few years, Intel has ramped up the packet processing capabilities in its chipsets. “But the software has kind of lagged behind, until fairly recently,” Coward said.

The first Brocade Vyatta components were released at a time when the whole idea of making an x86 server run virtual routers came with a question mark as standard equipment. General-purpose CPUs were not designed to cope with the throughput that 10 GbE networks entail. Only very recently (some say this year) could an ordinary server be deemed adequate to the task. It’s not the logic that’s difficult; it’s the speed.
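To put the speed problem in rough numbers, the sketch below works out how many packets per second a saturated 10 GbE link carries and how many clock cycles a single core could spend on each one. It assumes minimum-size 64-byte Ethernet frames (the worst case for packet rate) and a hypothetical 3 GHz core. Neither figure comes from Brocade's tests, and the function names are illustrative only.

```python
# Per-packet time budget on a 10 GbE link (illustrative arithmetic only).

LINK_BPS = 10_000_000_000   # 10 GbE line rate, in bits per second
FRAME_BYTES = 64            # minimum Ethernet frame size
OVERHEAD_BYTES = 20         # preamble + start delimiter (8) and inter-frame gap (12)
CLOCK_HZ = 3_000_000_000    # ASSUMPTION: a hypothetical 3 GHz core


def packets_per_second(link_bps: int, frame_bytes: int) -> float:
    """Maximum frames per second the link can carry at this frame size."""
    bits_on_wire = (frame_bytes + OVERHEAD_BYTES) * 8
    return link_bps / bits_on_wire


def cycles_per_packet(clock_hz: int, pps: float) -> float:
    """Clock cycles one core may spend per packet and still keep up."""
    return clock_hz / pps


pps = packets_per_second(LINK_BPS, FRAME_BYTES)
budget = cycles_per_packet(CLOCK_HZ, pps)
print(f"{pps / 1e6:.2f} Mpps, roughly {budget:.0f} cycles per packet")
```

Under these assumptions a single core has on the order of 200 cycles to receive, look up, and forward each packet, which is why the routing logic matters less than how cheaply each packet can be handled.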

Many hardware engineers believe ASICs will always be superior to CPUs in high-speed network processing, for this reason alone. It’s this fallback position that has kept Cisco executives from collapsing from heart attacks.

Intel has seen this kind of problem before, where its CPU may not be suited to a given task by itself (case in point, high-speed graphics) but additions made to the package, or changes to the cores’ handling of cache memory, more than compensate for such deficiencies. This could happen again. We won’t know unless we have the tools to measure reliably.

“A 1Gb pipe on a server, for the most part, has been really hard to do,” said Coward. He noted that Intel’s first big step in tackling this problem was the release of its Data Plane Development Kit (DPDK), a set of libraries that software, including Vyatta virtual components, can use to optimize its use of the CPU. This way, different virtual network functions on Intel hardware don’t have to reinvent the wheel every time.

Starting from the bottom
DPDK, Coward continued, “has enabled companies such as Brocade to start really reaching into the performance capabilities of the Intel chipset. So for the most part today, without DPDK and without a lot of tuning, most software networking products running on Intel processors actually have very low throughput — hundreds of megabits [per second], if that. So we’re starting with a low base, and frankly, low expectations as to what can really be achieved.”

Brocade has been dealing with the frankly low expectations that customers such as telcos have had. But Coward explains that this low baseline serves as a starting point, and with proper measurement tools, Brocade and its customers can see — literally for the first time — the amount of performance improvement that DPDK, as well as Brocade’s own innovations, contribute to the design of virtual telco data centers.

“This year, the number of servers that ship with 10GbE interfaces, for example, will outsell the shipments of 1Gb interfaces,” he noted. “So the benchmark is, can you actually fill a 10Gb pipe with traffic coming into a server? Because if you can do that, the cost dynamics become really interesting. What it says is, I can now generate as much traffic as the physical interface is capable of. If I can do that, the cost-per-bit of using that platform drops significantly.”
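Coward's cost-per-bit argument can be sketched with simple division. The server price below is a made-up placeholder, not a figure from Brocade or Telefónica; the 0.5 Gbps value stands in for the "hundreds of megabits" an untuned software router is said to achieve, and `cost_per_gbps` is a hypothetical helper name.

```python
def cost_per_gbps(platform_cost: float, throughput_gbps: float) -> float:
    """Platform cost divided by the traffic it can actually sustain."""
    if throughput_gbps <= 0:
        raise ValueError("throughput must be positive")
    return platform_cost / throughput_gbps


SERVER_COST = 5000.0   # ASSUMPTION: hypothetical server price, any currency

# Same server, two throughput levels.
untuned = cost_per_gbps(SERVER_COST, 0.5)      # "hundreds of megabits"
line_rate = cost_per_gbps(SERVER_COST, 10.0)   # a filled 10Gb pipe

print(f"untuned:   {untuned:.0f} per Gbps")
print(f"line rate: {line_rate:.0f} per Gbps")
print(f"improvement: {untuned / line_rate:.0f}x")
```

The hardware cost is the same in both cases; only the denominator changes, so every gain in sustained throughput lowers the cost of each bit carried.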

In a classic price/performance benchmark system, the two measurements are plotted on the x- and y-axes of a chart. Since high performance typically comes at a premium, the resulting plot is rarely a straight line (a sign of market commoditization) but a dipping curve.

With performance on the x-axis and price on the y-axis, components that fall toward the lower right of such a chart (high performance at a low price) are rated the best investments.

Brocade’s Coward said the first challenge for his company’s benchmarks was to fill a 10-gig pipe, and to do so while consuming the least amount of CPU resources for the highest efficiency. Brocade is not looking for high CPU utilization here, because a server, after all, has other things to do than route IP traffic.

In recent months, Intel’s chipsets have become more sophisticated in the resources they offer to a virtual component. In the beginning, the Vyatta team worked hard just to reach the 1 Gbps milestone. What they learned, Coward said, was that it matters how many of the Xeon E5’s cores get consumed in the process of reaching that milestone.

“Ideally, what you want to do is fill that 10Gb pipe with traffic,” he said, “but still have some processing cycles left to do other things. In a virtualized world, you want to be able to do a firewall or NAT, or even run Office applications — think of this as a shared environment where the server is really getting a lot of application-type or database-type work, but watches all these pipes. So you want to consume as few cores as possible to generate as much traffic as possible.”

Doing one thing at once
Parallelism helps ordinary user applications to run more efficiently, and Intel has worked out new strategies for enabling more implicit parallelism for applications. But a purely sequential operation such as traffic monitoring doesn’t benefit much, if at all, from parallelism. A processor with a high core count can actually work against such a task. Even Intel has acknowledged this in recent months, and is applying itself to a solution.

“So part of what we did was [ask], can we take one core by itself and make it generate 10Gb of traffic?” Coward went on. “Meaning, if you’ve got 8 cores, then each core is capable of generating 10Gb of traffic by itself.”
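The trade-off Coward describes, filling the pipe while leaving cores free for other work, can be sketched as follows. The per-core throughput figures are hypothetical inputs, not measured results, and the helper names are invented for illustration. The sketch also assumes throughput scales linearly with the number of cores, which a real system may not achieve.

```python
import math


def cores_to_fill(pipe_gbps: float, gbps_per_core: float) -> int:
    """Whole cores needed to saturate the pipe, assuming linear scaling."""
    return math.ceil(pipe_gbps / gbps_per_core)


def cores_left(total_cores: int, pipe_gbps: float, gbps_per_core: float) -> int:
    """Cores still free for firewall, NAT, or application work."""
    return max(total_cores - cores_to_fill(pipe_gbps, gbps_per_core), 0)


TOTAL_CORES = 8    # the 8-core example from the quote
PIPE_GBPS = 10.0   # one 10Gb interface

# ASSUMPTION: two hypothetical per-core throughput levels, for comparison.
for per_core in (2.5, 10.0):
    used = cores_to_fill(PIPE_GBPS, per_core)
    free = cores_left(TOTAL_CORES, PIPE_GBPS, per_core)
    print(f"{per_core} Gbps/core: {used} core(s) routing, {free} free")
```

If one core can fill the pipe by itself, seven of the eight cores remain for everything else the server has to do.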

This is the current state of the work that Brocade Networks has been conducting with customers like Telefónica and partners such as Intel. It will take time to determine how they can best make sense of what they’re measuring. But given how fast the NFV market is growing, they’ll be under considerable pressure.
