Despite being worth just north of three trillion dollars, Nvidia is nervous.
One of the world’s most valuable companies, the GPU maker has built its rapid ascent of the last two years on rocky foundations: While its chips are at the heart of the AI revolution, the hyperscalers that Nvidia relies on as its most important customers are also its biggest rivals.
Amazon Web Services, Microsoft, and Google are all developing their own AI chips as they look to save costs and find a unique edge.
At the same time, those same cloud companies are set to make more money from the GPUs Nvidia sells them by renting the chips out to end customers - the chip designer has claimed that, for every dollar a cloud provider spends buying a GPU, it makes back five over four years.
Nvidia could theoretically cut out the middleman and simply offer GPUs over its own cloud service, raking in all the profits. But, regulatory concerns aside, that would risk completely alienating its biggest customers and cost tens of billions to set up.
Instead, the company is trying to carve out a third option: A cloud within a cloud.
Last year, Nvidia announced the ‘DGX Cloud,’ a service offered on top of other companies’ cloud platforms. The cloud providers lease Nvidia's servers and deploy them as a cloud that Nvidia can market and sell to enterprises looking for large GPU supercomputers.
Google, Microsoft, and Oracle agreed to the proposal early on, but AWS held out until December, when it finally caved. The hyperscalers have been tight-lipped about the exact arrangement and none promote the service on their own websites.
DGX Cloud could instead be seen as Nvidia leveraging the desperate demand for its GPUs to carve out a space for itself in the cloud, grabbing some of the service revenue and developing direct relationships with end users.
“I wouldn't characterize it as a Trojan Horse at all,” DGX Cloud head and former Meta infrastructure VP Alexis Bjorlin tells DCD at Nvidia's March GTC event.
“This is a deep partnership,” she says repeatedly throughout the interview. “What we're doing is we're working deeply with the Cloud Service Providers [CSPs].”
In its May 2024 earnings report, Nvidia said it had committed to spend at least $9 billion on cloud computing services over the next few years - up from a commitment of $4.5 billion in January. That figure includes DGX Cloud, hinting at a rapid expansion of the effort.
“The DGX Cloud is an opportunity for us to enable access to Nvidia's latest technology in all the CSPs,” Bjorlin says. “When you step back from the whole thing and think about the end-user experience, it is about a full stack capability.”
Rather than an attempt by Nvidia to build a cloud business, Bjorlin says the point of DGX Cloud is to provide “the end-to-end experience. We have a full platform as a service software stack, we offer an AI foundry service.”
She adds: “A lot of our customers want to spend their software and their AI/ML expertise developing applications, not necessarily managing the underlying infrastructure - which you may think is a cloud service provider element - but it's actually the AI infra stack. I think that's where DGX Cloud offers something that's uniquely differentiated, we're meeting the customers wherever they are in their journey.”
Customers will “get access to all of the Nvidia internal experts that we have on model optimization, runtime optimization, or whatever,” she says. “I think people are looking for something that's a little more easy to consume so that they can focus on building out their own revenue-generating applications as opposed to a cost center.”
Those customers come to Nvidia and pay prices set by the company, despite the underlying hardware operating in CSP data centers.
The CSP still matters, as the customer is often using one of them for non-DGX services. “Usually, the customers come very opinionated,” Bjorlin says. “They have their data somewhere, data gravity is a significant aspect here. Yes, egress fees are going down or going [away].”
Bjorlin is somewhat contradictory on how much input Nvidia has in recommending which CSP customers should use: “We're not making the recommendation of which cloud to use, but we are able to share where we think the performance will be maximized,” she says.
The company works with each CSP to build their version of the DGX Cloud, sometimes with unique features.
“With AWS, we announced that we'd be working with their EFA [networking] and their Nitro system,” Bjorlin says. The company plans to deploy 16,384 GPUs for DGX Cloud - alongside thousands more directly through AWS.
“With OCI [Oracle’s cloud], in that instance of DGX Cloud on Blackwell, it will be with InfiniBand - which is a departure from what they've done in the past,” Bjorlin continued. “So that was a result of our co-engineering for what would be the optimal experience and capability for this generation.
“We are building the capability to integrate deeply into the CSPs, whether it's their services, or their actual infrastructure, so that we can optimize the solution across the board.”
Bjorlin claims such choices are “their decision; nothing is mandated. We're willing to work with the CSPs on what their platform is and optimize that. They are making their own choices on how to address the end-user experience.”
Last year, The Information reported that Nvidia had spoken to at least one data center about leasing its own space to run the DGX Cloud, but Bjorlin declined to comment, saying that the company was focused on working with the CSPs in a “deep partnership.”
“We're a platform that's on top of all the clouds; what we're also offering is expertise and utilization, understanding the customer's workload, their end-to-end workflow, and making the recommendations,” she says.
This is necessary for the future development of AI, she argues. “A model can change. When you move into a mixture of experts, it puts different stresses on the network, it changes how the workloads are performing.
“DGX Cloud is really just about encompassing the broadest set of where AI will evolve to, to make sure that we're designing for that, so that ultimately Nvidia GPUs are the ultimate landing spot for any of these AI workloads.”