“We're not really thinking about chips.”

Ian Buck has spent most of his life thinking about chips. But now, the head of accelerated computing at Nvidia, the world’s largest chip company, is thinking bigger.

“You can't buy Blackwell as a chip,” Buck, also the VP of the company’s data center and HPC business, tells DCD, referencing the next generation of its GPU line. “It's for good reason - it wants to be integrated with the CPU. It wants to be integrated with NVLink. It wants to be connected.”

Instead of dealing with single semiconductors, Nvidia has transformed itself into a platform business. It no longer worries about one accelerator and is instead focused on large, integrated systems.

“That was the decision we made back in the Pascal generation [in 2016], because the AI wanted to be across multiple GPUs,” Buck says. “The P100 days changed what we build and what we take to market or make available. Now it’s systems.”

This has begun to change the makeup of data centers, Buck says. “The opportunity for computing to be transformative started in supercomputing, but with the advent of AI that has broadened.

“Every data center is turning into an AI factory. It's not measured in flops or megawatts, but tokens per second and how many terabytes of data you are turning into productivity gains for your company.”
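That reframing can be put into rough numbers. As a purely illustrative sketch - the fleet size and per-GPU throughput below are assumptions, not figures from Nvidia or Buck - the "AI factory" metric trades megawatts for tokens per second and tokens per joule:

```python
# Illustrative only: hypothetical throughput and fleet figures,
# not published Nvidia numbers.

GPUS_IN_FACILITY = 10_000          # assumed fleet size
TOKENS_PER_SEC_PER_GPU = 5_000     # assumed inference throughput per GPU
WATTS_PER_GPU = 700                # Hopper-class TDP cited later in the article

facility_tokens_per_sec = GPUS_IN_FACILITY * TOKENS_PER_SEC_PER_GPU
facility_megawatts = GPUS_IN_FACILITY * WATTS_PER_GPU / 1e6

# "Tokens per second" and "tokens per joule" as factory-style output metrics,
# rather than flops or megawatts alone.
tokens_per_joule = facility_tokens_per_sec / (GPUS_IN_FACILITY * WATTS_PER_GPU)

print(f"{facility_tokens_per_sec:,.0f} tokens/s at ~{facility_megawatts:.1f} MW "
      f"({tokens_per_joule:.2f} tokens per joule)")
```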

This opportunity - whether a bubble or not - has led to a rush of new data center builds. “But they can’t wait two years for a construction project,” Buck says. “So we’ve seen an acceleration of people retiring old infrastructure; they're simply just moving out their CPU infrastructure, moving in their GPUs and accelerating, so that every data center can be an AI factory.”

He adds: “What you're going to see is not just one GPU from Nvidia, but a mix of platforms and ecosystems, allowing everyone to build the right kind of AI factory and workload that they need. Everyone will be at different phases of that journey or different optimization points.”

Of course, as much as Nvidia is trying to get away from focusing on the specific chips inside these so-called “AI factories,” their thermal design power (TDP) defines the makeup of much of the rest of the system. “Hopper is 700W, and we did air cool,” Buck says.

Nvidia's Blackwell architecture – Nvidia

“HGX B100 is also 700W; it is designed to fit right in where Hopper was,” he adds. “So when HGX B100 comes to market, all of our servers, all that data center, even the row and rack power, can stay the same.”

The industry can “take that entire ecosystem and upgrade it and deploy at scale,” Buck claims. And, he says, customers “get the full benefit of the Blackwell GPU, that FP4, the transformer engine, double the NVLink speed between them. So Blackwell will come to market so much faster than Hopper did, partly for that reason.”
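The drop-in argument comes down to simple rack-power arithmetic. A minimal sketch of that math, assuming an eight-GPU HGX baseboard and an illustrative figure for host overhead (CPUs, NICs, memory, fans) that is not from Nvidia:

```python
# Rough rack-power check for a like-for-like Hopper -> HGX B100 swap.
# The 700W GPU figure is from the article; the rest are illustrative assumptions.

GPU_TDP_W = 700            # Hopper and HGX B100, per the article
GPUS_PER_SERVER = 8        # typical HGX baseboard
HOST_OVERHEAD_W = 3_000    # assumed CPUs, NICs, memory, fans per server
SERVERS_PER_RACK = 4       # assumed rack layout

server_w = GPU_TDP_W * GPUS_PER_SERVER + HOST_OVERHEAD_W
rack_kw = server_w * SERVERS_PER_RACK / 1_000

# Because the GPU TDP is unchanged, per-server and per-rack budgets
# (and therefore row and rack power provisioning) can stay the same.
print(f"Per server: {server_w/1000:.1f} kW, per rack: {rack_kw:.1f} kW")
```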

The company also has a 1,000W version of the HGX - “same silicon, a slight modification of the servers, they have to be a little bit taller, and a different air-cooled solution. Basically, the max you can do on air cool.”

But after that point, things get a little more complicated. “For the NVL72, we want to make sure we have the best available,” Buck says, with the rack featuring B200 GPUs. “That's 1,200W per GPU, and it becomes the real driver for liquid cooling.

“Four GPUs in 1U? Liquid is critical to get the benefit of the NVL72. And that gives you the benefit of 30 times more inference performance.”
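The liquid cooling requirement is easiest to see as a thermal-load calculation. A rough sketch using the 1,200W per-GPU figure Buck cites, with the share of heat from everything else in the rack (Grace CPUs, NVLink switches, power conversion) treated as an assumption:

```python
# Why 72 GPUs in one rack pushes past practical air cooling.
# 1,200W per GPU is from the article; the 25% overhead share is an assumption.

GPUS_PER_RACK = 72
GPU_TDP_W = 1_200
NON_GPU_OVERHEAD = 0.25    # assumed share for CPUs, switches, power conversion

gpu_load_kw = GPUS_PER_RACK * GPU_TDP_W / 1_000
rack_load_kw = gpu_load_kw * (1 + NON_GPU_OVERHEAD)

# Roughly 86 kW of GPU heat in a single rack is well beyond typical
# air-cooled densities, hence direct liquid cooling on the NVL72.
print(f"GPU heat alone: ~{gpu_load_kw:.0f} kW; whole rack: ~{rack_load_kw:.0f} kW")
```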

The best isn’t always best, however. “TDP is not the right way of answering the question,” he argues. “What is the workload and what makes the most sense for your configuration? If you're doing a 7 billion parameter model inference, or 70 billion, HGX may be ideal, and it may not need 100 percent wattage the whole time.”
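Buck's sizing point can be illustrated with weight-memory arithmetic. A back-of-envelope sketch, assuming 8-bit weights and a per-GPU HBM capacity that is an assumption rather than a quoted specification, and ignoring KV-cache and activation memory:

```python
# Back-of-envelope check of whether a model's weights fit on an 8-GPU HGX box.
# Parameter counts are from the article; precision and HBM capacity are assumptions.

def weight_memory_gb(params_billions: float, bytes_per_param: float = 1.0) -> float:
    """Approximate weight footprint, assuming 8-bit (1-byte) parameters."""
    return params_billions * 1e9 * bytes_per_param / 1e9

HBM_PER_GPU_GB = 141      # assumed HBM capacity per GPU
GPUS_PER_SERVER = 8

for params in (7, 70):
    needed = weight_memory_gb(params)
    print(f"{params}B params: ~{needed:.0f} GB of weights "
          f"vs {HBM_PER_GPU_GB * GPUS_PER_SERVER} GB across the server")
```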

However, the trend is clearly toward larger chips, consuming more power and demanding more aggressive cooling. Nvidia is itself part of the US Department of Energy’s COOLERCHIPS program, focused on radical cooling solutions for ever-hotter semiconductors.

Buck declines to comment on where TDP will go, especially as the company shifts to an annual GPU release cadence. “We’re just running as fast as we can,” he says. “No waiting. No holding anything back. We'll build the best we can and keep going.”