Nvidia has launched a new cloud supercomputing service allowing enterprises access to infrastructure and software to train advanced models for generative AI and other applications.
Offered through existing cloud providers, the DGX Cloud services provide access to dedicated clusters of Nvidia DGX hardware, which can be rented on a monthly basis. Each instance of DGX Cloud features eight Nvidia H100 or A100 80GB Tensor Core GPUs for a total of 640GB of GPU memory per node. DGX Cloud instances start at $36,999 per instance per month.
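The per-node figures above follow from simple arithmetic. The sketch below uses only the numbers quoted in this article; the function names are hypothetical, and the price is the stated starting price, so actual bills may differ.

```python
# Figures quoted in the article; names are illustrative, not an Nvidia API.
GPUS_PER_INSTANCE = 8          # H100 or A100 GPUs per DGX Cloud instance
MEMORY_PER_GPU_GB = 80         # 80GB Tensor Core GPUs
MONTHLY_PRICE_USD = 36_999     # starting price per instance per month


def instance_memory_gb(gpus=GPUS_PER_INSTANCE, mem_gb=MEMORY_PER_GPU_GB):
    """Total GPU memory on one instance, in GB."""
    return gpus * mem_gb


def monthly_cost_per_gpu(price=MONTHLY_PRICE_USD, gpus=GPUS_PER_INSTANCE):
    """Starting monthly price divided evenly across the GPUs."""
    return price / gpus


def cluster_monthly_cost(instances, price=MONTHLY_PRICE_USD):
    """Starting monthly price for a cluster (assumes no volume discount)."""
    return instances * price


if __name__ == "__main__":
    print(instance_memory_gb())       # 640
    print(monthly_cost_per_gpu())     # 4624.875
    print(cluster_monthly_cost(4))    # 147996
```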
“We are at the iPhone moment of AI. Startups are racing to build disruptive products and business models, and incumbents are looking to respond,” said Jensen Huang, founder and CEO of Nvidia. “DGX Cloud gives customers instant access to Nvidia AI supercomputing in global-scale clouds.”
Early customers include biotech company Amgen, insurance software firm CCC Intelligent Solutions, and ServiceNow, which is using DGX Cloud in conjunction with on-premises Nvidia DGX supercomputers.
“With Nvidia DGX Cloud and Nvidia BioNeMo, our researchers are able to focus on deeper biology instead of having to deal with AI infrastructure and set up ML engineering,” said Peter Grandsard, executive director of research, biologics therapeutic discovery center for research acceleration by digital innovation at Amgen. “The powerful computing and multi-node capabilities of DGX Cloud have enabled us to achieve 3x faster training of protein LLMs with BioNeMo and up to 100x faster post-training analysis with Nvidia RAPIDS relative to alternative platforms.”
DGX Cloud will initially be available through Oracle’s OCI cloud service; its OCI Supercluster provides a purpose-built RDMA network, bare-metal compute, and high-performance local and block storage that can scale to superclusters of over 32,000 GPUs.
Nvidia said Microsoft Azure is expected to begin hosting DGX Cloud next quarter, with plans to ‘soon expand’ to Google Cloud and others.
“OCI is the first platform to offer an AI supercomputer at scale to thousands of customers across every industry. This is a critical capability as more and more organizations require computing resources for their unique AI use cases,” said Clay Magouyrk, executive vice president of Oracle Cloud Infrastructure. “To support this demand, we continue to expand our work with Nvidia.”
Manuvir Das, vice president of enterprise computing at Nvidia, added: “The limitless opportunities for AI-driven innovation are helping transform virtually every business. Nvidia’s collaboration with Oracle Cloud Infrastructure puts the extraordinary supercomputing performance of Nvidia’s accelerated computing platform within reach of every enterprise.”
In its own announcement, Oracle added that Nvidia is running its newly-announced AI Foundations services through Oracle OCI on the DGX Cloud platform.
According to Oracle, the OCI Supercluster includes OCI Compute Bare Metal, a low-latency RoCE cluster based on Nvidia networking, and a choice of storage. The system can scale up to 4,096 OCI Compute Bare Metal instances with 32,768 A100 GPUs.
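The two scale limits Oracle quotes imply eight GPUs per bare-metal instance. The check below is a minimal sketch using only those two numbers; the eight-GPU figure is derived from them rather than stated by Oracle here.

```python
# Scale limits quoted by Oracle for the OCI Supercluster.
MAX_INSTANCES = 4_096
MAX_GPUS = 32_768

# divmod returns (quotient, remainder); a zero remainder means the two
# quoted limits are consistent with a whole number of GPUs per instance.
gpus_per_instance, remainder = divmod(MAX_GPUS, MAX_INSTANCES)

if __name__ == "__main__":
    print(gpus_per_instance, remainder)
```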
Oracle is also adding Nvidia’s BlueField-3 data processing units (DPUs) to its networking stack. DPUs offload networking and data-movement tasks from the CPU, managing data as it moves through a data center and helping to optimize application performance.
Clay Magouyrk, executive vice president of Oracle Cloud Infrastructure, said: “Nvidia BlueField-3 DPUs are a key component of our strategy to provide state-of-the-art, sustainable cloud infrastructure with extreme performance.”
Nvidia Hopper GPUs come to the cloud
Nvidia’s Hopper GPUs are now available as virtual instances through a number of cloud providers.
Oracle this week announced that OCI Compute Bare Metal instances with Nvidia H100 GPUs are currently in limited availability. Microsoft last week announced a preview of its own H100-powered virtual machine, the ND H100 v5.
AWS will soon be offering H100 GPUs via its EC2 P5 instances. Each P5 instance features eight H100 GPUs capable of 16 petaflops of mixed-precision performance, 640 GB of memory, and 3,200 Gbps networking connectivity. Customers will be able to scale their P5 instances to over 20,000 H100 GPUs.
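For a rough sense of scale, the P5 figures can be broken down per GPU and turned into an instance count. This is a sketch under stated assumptions: `InstanceSpec` is a hypothetical helper, not an AWS API, and the per-GPU values are a plain even split of the quoted instance totals, which AWS does not confirm here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceSpec:
    """Instance-level figures as quoted in the article (hypothetical type)."""
    gpus: int
    petaflops: float      # mixed-precision, per instance
    memory_gb: int        # per instance
    network_gbps: int     # per instance

    def per_gpu(self):
        """Even split of the instance totals across its GPUs (an assumption)."""
        return {
            "petaflops": self.petaflops / self.gpus,
            "memory_gb": self.memory_gb / self.gpus,
            "network_gbps": self.network_gbps / self.gpus,
        }

    def instances_for(self, gpu_count):
        """Instances needed to reach gpu_count GPUs, rounded up."""
        return -(-gpu_count // self.gpus)  # ceiling division on integers


P5 = InstanceSpec(gpus=8, petaflops=16, memory_gb=640, network_gbps=3200)

if __name__ == "__main__":
    print(P5.per_gpu())
    print(P5.instances_for(20_000))
```

Since AWS says customers can scale to "over 20,000" GPUs, the instance count for 20,000 is a lower bound on the largest cluster size.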
H100 instances are generally available from Cirrascale and CoreWeave.
Nvidia said Google Cloud, Lambda, Paperspace, and Vultr plan to offer H100 instances in the future.
On-premises H100 DGXs coming soon
During his GTC keynote this week, Nvidia CEO Huang said that the company’s latest-generation DGX H100 supercomputers, powered by its Hopper GPUs, are in full production and will be coming soon to enterprises worldwide.
Each H100 DGX will feature eight H100 GPUs and provide 32 petaflops of compute performance at FP8 precision. Early customers set to receive the system include the KTH Royal Institute of Technology in Sweden, Japanese conglomerate Mitsui, and Ecuadorian telco Telconet.