With artificial intelligence (AI) and machine learning (ML) seemingly encroaching into every aspect of business, data centers are being pushed ever closer to their limits. Flash-based storage is often specified for the data ingestion stage, thanks to its superior random read/write performance during data preparation. Then, for training and inferencing, data must be streamed from the storage location, demanding high throughput and low latency to ensure a rapid response. NVMe SSDs with PCIe 4.0 interfaces meet these requirements by overcoming the limits imposed by PCIe 3.0. In addition, the latest SSDs deliver power savings that reduce operating expenditure (opex), and they come in new form factors that enable higher server storage densities.

While the PCIe 4.0 standard has been around since 2017, it takes time before the interface is readily available in both servers and storage. Furthermore, a move to new technology needs to coincide with a data center’s upgrade plans. Compared to PCIe 3.0, PCIe 4.0 has doubled the transfer rate from 8.0 GT/s to 16.0 GT/s, allowing data to be transferred at around two gigabytes per second per lane versus about one gigabyte per second per lane previously. For PCIe 4.0 SSDs, testing has shown that more than double the throughput of PCIe 3.0 generation drives can be achieved, while random read IOPS can improve by up to 87 percent (Figure 1). Write latency has also dropped by 67 percent.

Figure 1: PCIe 4.0 NVMe SSDs show a two-times throughput performance improvement, 87 percent more random read IOPS, and a 67 percent lower write latency. – Kioxia
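
The per-lane figures quoted above follow from the raw transfer rate. As a rough sketch, the calculation below assumes the 128b/130b line encoding used by both PCIe 3.0 and PCIe 4.0, and it ignores protocol overhead such as packet headers, so real-world throughput will be somewhat lower. The four-lane (x4) link width is an assumption typical of NVMe SSDs, not a figure taken from the tests described here.

```python
# Approximate PCIe bandwidth from the raw transfer rate.
# Assumption: 128b/130b line encoding (PCIe 3.0 and 4.0); protocol overhead ignored.
ENCODING_EFFICIENCY = 128 / 130


def lane_bandwidth_gb_per_s(transfer_rate_gt_per_s: float) -> float:
    """Usable bandwidth of one lane, in one direction, in GB/s."""
    usable_gbit_per_s = transfer_rate_gt_per_s * ENCODING_EFFICIENCY
    return usable_gbit_per_s / 8  # 8 bits per byte


def link_bandwidth_gb_per_s(transfer_rate_gt_per_s: float, lanes: int) -> float:
    """Usable bandwidth of a multi-lane link, in one direction, in GB/s."""
    return lane_bandwidth_gb_per_s(transfer_rate_gt_per_s) * lanes


if __name__ == "__main__":
    # x4 is an assumed link width, common for NVMe SSDs.
    for name, rate in (("PCIe 3.0", 8.0), ("PCIe 4.0", 16.0)):
        per_lane = lane_bandwidth_gb_per_s(rate)
        x4_link = link_bandwidth_gb_per_s(rate, lanes=4)
        print(f"{name}: {per_lane:.2f} GB/s per lane, {x4_link:.2f} GB/s over x4")
```

Under these assumptions, a lane carries about 0.98 GB/s at 8.0 GT/s and about 1.97 GB/s at 16.0 GT/s, which matches the "about one" and "around two" gigabytes per second quoted above.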

Of course, the upgrade to PCIe 4.0 delivers benefits to other hyperscale, cloud, and general-purpose server applications too. Webservers, online transactional processing (OLTP), and e-commerce rely heavily on relational database management systems (RDBMS) such as MySQL, and here significant improvements in throughput and latency are attainable.

Test data

With a focus on analyzing the attainable transactions per minute (TPM) and latency, an AMD QCT S5K server was tested with both PCIe 3.0 and PCIe 4.0 SSDs. A MySQL OLTP database environment was emulated using a test schema based upon the TPC-C benchmark. With 5,000 data warehouses loaded, employing over 400 GB of the server’s storage capacity, the environment was configured for 64 virtual users. Query response time was also configured to 1 millisecond (ms). Using HammerDB, a leading benchmarking and load testing solution for databases, TPM and latency benchmarks were obtained.

With a workload of random new order transactions interspersed with additional transactions, Kioxia CD6 Series PCIe 4.0 SSDs achieved over 1,000,000 TPM, an 89.9 percent improvement over PCIe 3.0 generation SSDs. Latency also showed a marked improvement. Measured as the time between request and data transaction, the CD6 Series managed an average read latency of just 0.16 ms and an average write latency of 0.73 ms. Compared to the PCIe 3.0 SSD, this was an improvement of 66 percent and 90 percent, respectively.

Figure 2: TPM and read/write latency improve when upgrading MySQL servers to PCIe 4.0 SSDs. – Kioxia
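
To put the percentages above into absolute terms, the PCIe 3.0 baseline can be estimated by inverting them. This is only a sketch: it assumes "improvement" means a relative gain for TPM and a relative reduction for latency, and it treats "over 1,000,000 TPM" as exactly 1,000,000. The results are implied estimates, not measured values from the test.

```python
# Back out the implied PCIe 3.0 baseline from the PCIe 4.0 results and the
# quoted percentage improvements. These are estimates, not measured values.


def baseline_from_gain(new_value: float, gain_percent: float) -> float:
    """Higher-is-better metric, assuming new = old * (1 + gain)."""
    return new_value / (1 + gain_percent / 100)


def baseline_from_reduction(new_value: float, reduction_percent: float) -> float:
    """Lower-is-better metric, assuming new = old * (1 - reduction)."""
    return new_value / (1 - reduction_percent / 100)


if __name__ == "__main__":
    tpm = baseline_from_gain(1_000_000, 89.9)       # "over 1,000,000" taken as exact
    read_ms = baseline_from_reduction(0.16, 66.0)
    write_ms = baseline_from_reduction(0.73, 90.0)
    print(f"Implied PCIe 3.0 TPM:           {tpm:,.0f}")
    print(f"Implied PCIe 3.0 read latency:  {read_ms:.2f} ms")
    print(f"Implied PCIe 3.0 write latency: {write_ms:.2f} ms")
```

On these assumptions the PCIe 3.0 drive would have delivered roughly 527,000 TPM, with average latencies of about 0.47 ms for reads and 7.3 ms for writes.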

Data center architects are also under pressure to reduce the costs of their infrastructure, and the improvements delivered by PCIe 4.0 can be used to reduce the amount of DRAM needed. For example, after limiting MySQL to use 32 GB of DRAM, the PCIe 4.0 CD6 Series-based server managed more than 600,000 TPM under the same test conditions. The PCIe 3.0-based implementation, with MySQL configured to use 64 GB DRAM, could only manage around 560,000 TPM. This allows storage architects to reduce the quantity of DRAM specified for their servers while still delivering performance improvements to customers.
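
One way to read this trade-off is as throughput per gigabyte of DRAM. The short sketch below uses the rounded figures quoted above ("more than 600,000" and "around 560,000" TPM), so the results are approximate. The TPM-per-GB metric is an illustration introduced here, not one reported by the benchmark.

```python
# Compare throughput per GB of DRAM for the two configurations described above.
# The TPM inputs are the article's rounded figures, so results are approximate.


def tpm_per_gb(tpm: float, dram_gb: float) -> float:
    """Transactions per minute delivered per GB of DRAM given to MySQL."""
    return tpm / dram_gb


if __name__ == "__main__":
    pcie4 = tpm_per_gb(600_000, 32)   # PCIe 4.0 SSD, MySQL limited to 32 GB
    pcie3 = tpm_per_gb(560_000, 64)   # PCIe 3.0 SSD, MySQL using 64 GB
    print(f"PCIe 4.0: {pcie4:,.0f} TPM per GB of DRAM")
    print(f"PCIe 3.0: {pcie3:,.0f} TPM per GB of DRAM")
    print(f"Ratio:    {pcie4 / pcie3:.2f}x")
```

With these inputs, the PCIe 4.0 configuration delivers a little over twice the throughput per gigabyte of DRAM.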

Considering opex

Another key consideration during server definition is opex, where SSDs can contribute to both energy savings and reduced drive replacement costs. According to the EU, the annual energy consumption of data storage products is expected to reach 30 TWh by 2030, or 47 TWh when infrastructure (such as uninterruptible power supplies and cooling systems) is also included. This translates directly into bottom-line expenditure for data center operators, so replacing SSDs with lower-power versions is a sensible choice.

Figure 3: A move from PCIe 3.0 to PCIe 4.0 SSDs can also enable a 50 percent reduction in DRAM use for MySQL while still delivering a better TPM. – Kioxia

NVMe 1.4 compliant SSDs can use just 19.0 W when active or 5.0 W in their ready state. With a mean time to failure (MTTF) of 2,500,000 hours, infrastructure architects can plan how many SSDs to hold in stock locally for eventual replacement. While drive replacement itself does not cause too much of an issue, the chief frustration is the need to rebuild storage.
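
These figures can be turned into rough planning numbers. The sketch below converts drive power into annual energy use and converts MTTF into an annualized failure rate. It assumes continuous operation (8,760 hours per year) and a constant failure rate (an exponential lifetime model), which is a simplification. The fleet size of 1,000 drives is a hypothetical value chosen for illustration.

```python
import math

HOURS_PER_YEAR = 8760  # assumes drives run continuously, all year


def annual_energy_kwh(power_w: float, hours: float = HOURS_PER_YEAR) -> float:
    """Energy used by one drive running at a constant power draw, in kWh."""
    return power_w * hours / 1000


def annualized_failure_rate(mttf_hours: float) -> float:
    """Probability that a drive fails within a year.

    Assumes a constant failure rate (exponential lifetime model).
    """
    return 1 - math.exp(-HOURS_PER_YEAR / mttf_hours)


def expected_replacements_per_year(drive_count: int, mttf_hours: float) -> float:
    """Expected number of drive failures per year across a fleet."""
    return drive_count * annualized_failure_rate(mttf_hours)


if __name__ == "__main__":
    fleet_size = 1000  # hypothetical fleet, for illustration only
    print(f"Active (19.0 W): {annual_energy_kwh(19.0):.1f} kWh per drive per year")
    print(f"Ready (5.0 W):   {annual_energy_kwh(5.0):.1f} kWh per drive per year")
    print(f"AFR at 2.5M h MTTF: {annualized_failure_rate(2_500_000):.2%}")
    print(f"Expected replacements per year for {fleet_size} drives: "
          f"{expected_replacements_per_year(fleet_size, 2_500_000):.1f}")
```

Under these assumptions, an MTTF of 2,500,000 hours corresponds to an annualized failure rate of about 0.35 percent, or three to four replacements per year in a fleet of 1,000 drives.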

Traditional drive form factors are also a leftover of yesteryear’s storage, designed to accommodate the spinning platters of HDDs. Thanks to the efforts of the Storage Networking Industry Association (SNIA), it has been recognized that SSD cooling can be implemented more efficiently using form factors that better match solid-state flash memory. SSDs are now offered in the Enterprise and Data Center SSD Form Factor (EDSFF). In its asymmetric E1.S variant, EDSFF enables data centers to implement denser storage while also improving cooling, thus reducing opex.

Figure 4: The EDSFF E1.S form factor targets the thermal, performance, and power requirements of cloud SSD implementations. – Kioxia

Disaggregated storage will also see tremendous benefits from the move to PCIe 4.0 SSDs. NVMe over Fabrics, or NVMe-oF, offers higher capacity utilization and storage availability while improving overall efficiency. One of the challenges in recent years has been the turmoil in the NVMe-oF market, with promising software solutions disappearing as the businesses behind them are acquired. This makes planning a challenge, as many SSD providers have focused on supporting a single NVMe-oF solution. KumoScale, KIOXIA’s own NVMe-oF solution, continues to make headway in this space. Managing storage in much the same way Kubernetes manages containerized applications, the Linux-based KumoScale can support any NVMe™ compliant SSD and any NVMe-oF compliant client. Because it bypasses the Linux operating system’s I/O stack completely, the full performance of PCIe 4.0 is passed on to the client, while the added latency can be measured in tens of microseconds.

Figure 5: KumoScale provides NVMe-oF for next-generation cloud servers, improving resource utilization, movement of workloads, and cost management. – Kioxia

While each new PCIe release can be quantified in throughput and latency improvements, for data center architects there are many more benefits to be had. Firstly, thanks to better TPM, DRAM requirements can be reduced on MySQL database servers while also improving performance. PCIe 4.0 NVMe SSDs also leverage next-generation form factors defined in EDSFF, enabling opex savings through higher density storage and improved heat management. Additionally, for those considering the move to NVMe-oF, PCIe 4.0 could be the reason to make the move. Software solutions such as KumoScale ensure that the performance benefits of this new interface revision are delivered to clients almost in their entirety, while KIOXIA’s ownership of the technology ensures that it will continue to be available for the lifetime of your storage strategy.