AMD's new Radeon Instinct GPU – Paul Mah

AMD has launched two GPUs designed specifically for data centers and cloud computing, offering much faster floating-point performance and higher efficiency than its previous-generation silicon.

The new cards offer mixed-precision capabilities that are only marginally faster than last year's MI25. The big jump is in double-precision floating-point performance, with the MI50 and MI60 delivering up to 6.7 TFLOPS and 7.4 TFLOPS respectively – an almost 10-fold increase over the 768 GFLOPS (0.77 TFLOPS) of the MI25. AMD says this makes the MI60 the world's fastest double-precision PCIe accelerator.
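The "almost 10-fold" claim is easy to sanity-check from the quoted figures (a quick back-of-the-envelope calculation, not AMD's numbers beyond those stated above):

```python
# Double-precision throughput figures quoted in the article, in TFLOPS.
mi25 = 0.768  # 768 GFLOPS on the previous-generation MI25
mi50 = 6.7
mi60 = 7.4

# Speedup of each new card relative to the MI25.
print(f"MI50: {mi50 / mi25:.1f}x")  # roughly 8.7x
print(f"MI60: {mi60 / mi25:.1f}x")  # roughly 9.6x
```

The MI60's ~9.6x jump is what the article rounds to "almost 10-fold".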

Fast and faster

Under the hood, the move to a 7nm process translates to a higher transistor count of 13.2 billion and gives the MI60 a smaller die than the MI25 – while maintaining the same 300-watt power envelope. HBM2 ECC memory serves as a safeguard against memory errors and delivers bandwidth of up to 1TB per second.

Interface bandwidth has also been beefed up, with support for next-gen PCIe 4.0. Meanwhile, the incorporation of two Infinity Fabric links per GPU enables GPU-to-GPU communication of 100GB/sec per link. Up to four devices can be daisy-chained into a 'hive,' with a maximum of two hives per server, making eight GPUs in total.
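The topology described above implies some simple aggregate numbers, worked out here from the article's stated figures (per-link bandwidth, links per GPU, and hive sizes):

```python
# Figures stated in the article.
links_per_gpu = 2       # Infinity Fabric links per GPU
link_bw_gb_s = 100      # GB/s per link
gpus_per_hive = 4       # GPUs daisy-chained into a 'hive'
hives_per_server = 2    # maximum hives per server

# Peak GPU-to-GPU fabric bandwidth available to a single GPU.
per_gpu_fabric_bw = links_per_gpu * link_bw_gb_s   # 200 GB/s
# Maximum GPU count per server.
max_gpus_per_server = gpus_per_hive * hives_per_server  # 8

print(per_gpu_fabric_bw, "GB/s per GPU;", max_gpus_per_server, "GPUs max")
```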

Open, multi-user platform

Speaking on stage at the company’s Next Horizon event, David Wang, SVP of engineering at the Radeon Technologies Group, said the MI60 was designed and optimized for data centers and the cloud, being the industry’s only hardware-virtualized GPU. Hardware safeguards make it possible for the device to be shared among multiple users without interference, he added.

This makes the MI60 ideal for applications in Virtual Desktop Infrastructure (VDI), Desktop-as-a-Service (DaaS) and cloud environments. This capability is strengthened by the ROCm open software platform, which AMD says lets customers deploy high-performance, energy-efficient heterogeneous computing systems.

AMD also announced a new version of ROCm, adding support for 64-bit Linux operating systems such as RHEL and Ubuntu, and the latest versions of popular deep learning frameworks such as TensorFlow 1.11 and PyTorch (Caffe2).

“Google believes that open source is good for everyone. We've seen how helpful it can be to open source machine learning technology, and we're glad to see AMD embracing it. With the ROCm open software platform, TensorFlow users will benefit from GPU acceleration and a more robust open source machine learning ecosystem,” Rajat Monga, engineering director of TensorFlow at Google, said in a statement.

The MI50 comes with 16GB of HBM2 ECC memory and will begin shipping in Q1 2019; the MI60 comes with 32GB of HBM2 ECC memory and is expected to ship by December.