AI is a phenomenon currently taking the world by storm, and for good reason. But what if we told you that AI isn't necessarily a new concept?
Born from high-performance computing (HPC) techniques developed over the last 20 to 40 years, AI builds on decades of data and research from the HPC space, turning them into predictive models.
“We do a lot of simulation in HPC, we have done for decades, but all of a sudden, along comes AI and now we’re bringing these HPC techniques forward into this whole new space, solving problems in a radically different way. I'm not sure that I'm seeing any limits to it yet. This is a big deal and I think it's going to change the world,” says Brendan Bouffler, head of developer relations for HPC Engineering at AWS.
In a recent DCD>Talk with DCD’s Alex Dickins, Bouffler explains that this essentially means simulations that might’ve previously taken weeks, months, or years – for example, the simulation of molecule interactions to aid in drug discovery – can now happen orders of magnitude faster.
You’ll probably agree that medical use cases like drug discovery are important, so you’d think this kind of technological advancement would have the world’s scientists jumping for joy, right?
Unfortunately, the reality is less glamorous. When we hear the word ‘scientist’, the media might have us picturing a team of great minds hard at work in white lab coats, day in and day out, for the good of humanity. What you might not envision is a group of frustrated individuals waiting for the compute resources they need to achieve scientific breakthroughs.
“In many institutions, these folks can be waiting hours, days, or even weeks to get hold of compute resources,” says Bouffler. “If we can give these people the tools they need as and when they need them, we can eliminate the wait, we can move them onto the next step in their experiment, move them onto the next iteration of their theory, so much faster.”
But of course, time is money. Even if an institution does have access to a multimillion-dollar supercomputer, there are likely hundreds, perhaps thousands, of scientists waiting to use it – and their combined salaries, not to mention the cost of the training that got them to where they are in their careers, eclipse the cost of that computer.
“To put it into perspective, we need more computers. They need to be more efficient and they need to be able to deliver massive compute resources anytime these folks need it,” explains Bouffler.
But with global data center power consumption predicted to double between now and 2030 to almost 1,300TWh – accounting for almost five percent of global electricity consumption – how do we get these resources into the hands of those who need them in the most energy-efficient way?
The answer lies in cloud computing: scalable, flexible accelerated computing resources – powered by AWS and Nvidia – that meet the demands of HPC workloads whenever scientists and engineers need them.
“This is where Nvidia does some really incredible stuff. How do we get as many compute cycles as possible for every watt of heat that we just pump out the back? That’s why Nvidia GPUs are so important to us. We could build even larger supercomputers made out of purely CPUs, and run all of this AI on them, but it would cost us more electricity, generate more heat, and do greater damage to the earth,” explains Bouffler.
To put the efficiency of GPU-powered HPC to the test, the US National Energy Research Scientific Computing Center (NERSC) measured how fast a number of applications ran on CPU-only and GPU-accelerated nodes on its latest HPC infrastructure, powered by Nvidia A100 Tensor Core GPUs. At the same time, it measured the energy consumed by those applications to quantify exactly how energy efficient GPU-accelerated HPC really is.
NERSC found that its Nvidia-powered HPC infrastructure was, on average, five times more energy efficient. In terms of raw performance, a server with four Nvidia A100 GPUs delivered a 12x increase over a dual-socket x86 server.
For the same level of performance, NERSC researchers concluded that the Nvidia GPU-accelerated system would consume 588 megawatt-hours (MWh) less energy per month than a CPU-only system, amounting to savings of about $4 million for running the same workload.
Overall, if every CPU-only data center server running HPC, AI, and data analytics workloads migrated to Nvidia GPUs, organizations would save 12TWh of electricity annually – equating to global savings of between $2 billion and $3 billion.
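To get a feel for the scale of those numbers, here is a back-of-envelope sketch of the unit conversions. The electricity rate used below is a hypothetical assumption for illustration only (actual data center rates vary widely by region and contract); the energy figures are the ones quoted above.

```python
# Back-of-envelope conversion of the energy savings quoted above into
# dollar terms, under an ASSUMED blended electricity rate.

MWH_SAVED_PER_MONTH = 588          # NERSC figure quoted above
INDUSTRY_TWH_SAVED_PER_YEAR = 12   # industry-wide figure quoted above
ASSUMED_RATE_PER_KWH = 0.20        # hypothetical rate in USD, for illustration

# NERSC-style system: 588 MWh/month -> kWh/year -> dollars/year
monthly_kwh = MWH_SAVED_PER_MONTH * 1_000
annual_kwh = monthly_kwh * 12
annual_cost_saving = annual_kwh * ASSUMED_RATE_PER_KWH

# Industry-wide: 12 TWh/year -> kWh/year -> dollars/year
industry_kwh = INDUSTRY_TWH_SAVED_PER_YEAR * 1_000_000_000
industry_saving = industry_kwh * ASSUMED_RATE_PER_KWH

print(f"NERSC-style system: {annual_kwh:,.0f} kWh/year, ~${annual_cost_saving:,.0f}/year")
print(f"Industry-wide: ~${industry_saving / 1e9:.1f}bn/year")
```

At the assumed rate, the industry-wide figure lands inside the $2 billion to $3 billion band quoted above, while the per-system saving works out to roughly $1.4 million a year, suggesting the $4 million figure accrues over a multi-year period or reflects a different local rate.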
“We might look at GPUs and think, wow, they're really hot and they’re expensive, but for the compute results they produce, they’re hundreds of times more efficient than what you would get otherwise. That's a big deal,” Bouffler continues.
“When you zoom out and look at it from the broader infrastructure point of view, when you're looking at creating buildings full of these things like we do, the way that you get energy efficiencies in there is you have to do it at scale.”
“We also need to move to a more sustainable footing for all of our energy sources, and again, that's where we think scale can actually come to the party. I believe now, AWS is the biggest corporate buyer of renewable energy in the world, and we're building our own wind and solar farms. We're on track, at the moment, for meeting our very aggressive 2025 deadline for Amazon to be carbon neutral.”
Of course, when it comes to efficiency at scale, Amazon is better positioned than most, a fact that does not escape Bouffler.
“If you’re doing this in your own data center, you're not going to be able to do it at the kind of scale that we're at, you're not going to be able to get the efficiencies that we're able to squeeze. But I think that's one of the places where the cloud particularly can contribute to how we solve this problem.”
AWS has been offering cloud services since 2006, bringing compute, networking, and storage together into cloud-based data centers – a shift that has enabled huge improvements in data center efficiency. The move to Nvidia GPU-based HPC and generative AI represents another leap forward in this arena.
And as cheaper, more sustainable computing continues to proliferate, it’s clear that partnerships like the one between AWS and Nvidia are key to solving the world’s biggest challenges – today and tomorrow – by putting these resources into the hands of those who need them most, when they need them.
From helping us answer the great mysteries of the universe to curing disease, together we can help catalyze the change we want to see in the world, at less cost to ourselves and our environment.
Learn more about how AWS and Nvidia can help accelerate HPC workloads here.