AMD's new Radeon Instinct GPU – Paul Mah

AMD has launched two GPUs designed specifically for data centers and cloud computing, offering much faster floating-point performance and higher efficiency than its previous-generation silicon.

The new cards offer mixed-precision capabilities that are only marginally faster than last year’s MI25. The big jump comes from double-precision floating-point performance, with the MI50 and MI60 delivering up to 6.7 TFLOPS and 7.4 TFLOPS respectively, roughly a nine- to 10-fold increase from the 768 GFLOPS (about 0.77 TFLOPS) on the MI25. AMD says this makes the MI60 the world's fastest double-precision PCIe accelerator.
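For readers who want to check the size of that jump, the short Python sketch below divides the quoted peak figures. The constant and function names are illustrative only, and the numbers are the vendor-quoted peaks from this article rather than measured results.

```python
# Peak double-precision figures quoted in the article, in GFLOPS.
MI25_FP64_GFLOPS = 768
MI50_FP64_GFLOPS = 6_700   # 6.7 TFLOPS
MI60_FP64_GFLOPS = 7_400   # 7.4 TFLOPS


def speedup(new_gflops: float, old_gflops: float) -> float:
    """Return how many times faster the new peak figure is than the old one."""
    return new_gflops / old_gflops


mi50_speedup = speedup(MI50_FP64_GFLOPS, MI25_FP64_GFLOPS)
mi60_speedup = speedup(MI60_FP64_GFLOPS, MI25_FP64_GFLOPS)

print(f"MI50 vs MI25: {mi50_speedup:.1f}x")  # about 8.7x
print(f"MI60 vs MI25: {mi60_speedup:.1f}x")  # about 9.6x
```

On these figures the MI50 comes out at roughly 8.7 times the MI25's peak, and the MI60 at roughly 9.6 times.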

Fast and faster

Under the hood, the use of 7nm technology translates to a higher transistor count of 13.2 billion, and means that the MI60 has a smaller die size than the MI25 device – while maintaining the same 300 watt power envelope. HBM2 ECC memory serves as a safeguard against memory errors and delivers bandwidth of up to 1TB per second.

Interface bandwidth has also been beefed up, with support for next-gen PCIe 4.0. Meanwhile, the incorporation of two Infinity Fabric links per GPU enables GPU-to-GPU communication of 100GB/sec per link. Up to four devices can be daisy-chained into a 'hive,' with a maximum of two hives per server, making eight GPUs in total.
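The link and hive figures above can be tallied in a few lines of Python. This is a back-of-the-envelope sketch: the names are illustrative, and the per-GPU total assumes both Infinity Fabric links can be driven at full rate at the same time, which the article does not state.

```python
# Figures quoted in the article.
LINKS_PER_GPU = 2
GB_PER_SEC_PER_LINK = 100
GPUS_PER_HIVE = 4
HIVES_PER_SERVER = 2

# Assumption: both links run concurrently at their full quoted rate.
peak_link_bandwidth_per_gpu = LINKS_PER_GPU * GB_PER_SEC_PER_LINK

# Maximum number of GPUs in one server.
max_gpus_per_server = GPUS_PER_HIVE * HIVES_PER_SERVER

print(f"Peak Infinity Fabric bandwidth per GPU: {peak_link_bandwidth_per_gpu} GB/sec")
print(f"Maximum GPUs per server: {max_gpus_per_server}")
```

Under that assumption each GPU has up to 200GB/sec of Infinity Fabric bandwidth, and a fully populated server holds eight GPUs.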

Open, multi-user platform

Speaking on stage at the company’s Next Horizon event, David Wang, SVP of engineering at the Radeon Technologies Group, said the MI60 was designed and optimized for data centers and the cloud, being the industry’s only hardware-virtualized GPU. Hardware safeguards make it possible for the device to be shared among multiple users without interference, he added.

This makes the MI60 ideal for applications in Virtual Desktop Infrastructure (VDI), Desktop-as-a-Service (DaaS) and cloud environments. This capability is strengthened by the ROCm open software platform, which AMD says lets customers deploy high-performance, energy-efficient heterogeneous computing systems.

AMD also announced a new version of ROCm, adding support for 64-bit Linux operating systems such as RHEL and Ubuntu, and the latest versions of popular deep learning frameworks such as TensorFlow 1.11 and PyTorch (Caffe2).

“Google believes that open source is good for everyone. We've seen how helpful it can be to open source machine learning technology, and we're glad to see AMD embracing it. With the ROCm open software platform, TensorFlow users will benefit from GPU acceleration and a more robust open source machine learning ecosystem,” Rajat Monga, engineering director of TensorFlow at Google, said in a statement.

The MI50 comes with 16GB of HBM2 ECC memory and will begin shipping by Q1 2019; the MI60 comes with 32GB of HBM2 ECC memory and is expected to ship by December.