At a virtual GTC event, chip-maker Nvidia has announced its latest GPU - the A100.
Based on a new architecture, Ampere, the graphics processing unit delivers what Nvidia says is the largest generational performance leap in the company's history.
The majority of the world's cloud providers and server manufacturers said that they would offer the A100, which Nvidia claims will have six times the performance of the last-gen Volta architecture for training and seven times higher performance for inference. "In FP32, or single-precision, A100 is 20 times more performant than Volta," Paresh Kharya, director of marketing for accelerated computing, told reporters in a pre-briefing attended by DCD. "For AI inference that uses INT8, it's again 20 times more powerful than Volta, delivering over a petaops of performance in a single GPU."
A big leap
The A100 has more than 54 billion transistors, which the company claims makes it the world’s largest 7-nanometer processor. "The die size is 826 millimeters squared - almost 26 millimeters on one dimension and almost 32 millimeters on the other dimension," CEO Jensen Huang said in a call with reporters.
"And that basically is near the radical limits of what's possible in semiconductor manufacturing today. The largest die the world's ever made, the largest number of transistors in a compute engine the world's ever made."
The chip supports Multi-Instance GPU (MIG), which means a single A100 can be partitioned into up to seven independent, less powerful GPUs. Conversely, with upgraded NVLink technology for GPU-to-GPU connectivity, the company claims A100 servers can "act as one giant GPU."
Alibaba Cloud, Amazon Web Services, Baidu Cloud, Cisco, Dell Technologies, Google Cloud, Hewlett Packard Enterprise, Microsoft Azure, and Oracle are among the companies that plan to incorporate A100 GPUs into their cloud services and server designs.
The ongoing Covid-19 pandemic forced Nvidia's huge GTC event in Santa Clara to be canceled, with the keynote instead delivered today from Jensen Huang's kitchen. When asked how the novel coronavirus would impact the roll-out of a new GPU, the CEO called the pandemic "terribly tragic," but countered that it would be good for cloud growth and Nvidia's business.
"People are staying at home and using cloud computing a lot more than before, so the hyperscalers' compute demand has gone up tremendously. And video conferencing has gone up tremendously. Companies are realizing the importance of ensuring the continuity of work are moving a lot of their workloads to the cloud, and so that they have a hybrid computing environment. And so, cloud computing is going to see a surge.
"That surge is really quite good for our data center business and Ampere A100 is designed for data centers. It's designed to increase the throughput as well as dramatically reduce the cost of cloud computing and AI in the cloud. And so my expectation is that Ampere is going to do remarkably well. And it's our best data center GPU ever made."
Haven't I heard that name before?
Nvidia's Ampere chip has nothing to do with chip-maker Ampere, which develops an Arm chip called Ampere Altra.
Both companies' products are named after French mathematician and physicist André-Marie Ampère, one of the founders of the science of classical electromagnetism, and inventor of the solenoid and the electrical telegraph.