Amazon Web Services (AWS) is launching EC2 instances featuring the latest Nvidia GPUs and Trainium chips.

The instances were unveiled during a keynote speech by CEO Matt Garman at this week's re:Invent event in Las Vegas, where the cloud giant also revealed plans for its Trainium3 chip.

Trainium2 instance – AWS

Garman revealed three new EC2 instances: the P6 and Trn2 instances and the Trn2 UltraServers.

The P6 family of instances features Nvidia's latest Blackwell GPUs and will be available in 2025. AWS expects the instances to offer up to 2.5x faster compute than the current generation of GPUs.

“AWS and Nvidia have been collaborating for 14 years to ensure that we are really great at operating and running GPU workloads,” said Garman of the partnership.

The company has also been developing instances based on its in-house Trainium chips.

The Amazon EC2 Trn2 instances are now generally available and are, according to Garman, the most powerful instances for generative AI, offering 30-40 percent better performance than current GPUs. Each instance has 16 Trainium2 chips connected by NeuronLink, a high-bandwidth, low-latency interconnect, and can deliver 20.8 petaflops.

AWS is also launching the Amazon EC2 Trn2 UltraServers. Comprising four Trn2 instances connected via NeuronLink, the UltraServers contain 64 Trainium2 chips and can offer up to 83.2 FP8 petaflops of compute power.
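The UltraServer figures follow directly from the per-instance numbers above; a minimal sanity check of that arithmetic (the variable names here are illustrative, not AWS terminology):

```python
# Per-instance figures quoted in the article
chips_per_trn2_instance = 16
petaflops_per_trn2_instance = 20.8  # FP8 petaflops per Trn2 instance

# An UltraServer combines four Trn2 instances over NeuronLink
instances_per_ultraserver = 4

chips_per_ultraserver = instances_per_ultraserver * chips_per_trn2_instance
petaflops_per_ultraserver = instances_per_ultraserver * petaflops_per_trn2_instance

print(chips_per_ultraserver)      # 64 Trainium2 chips, as stated
print(petaflops_per_ultraserver)  # ~83.2 FP8 petaflops, matching the quoted figure
```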

Effectively, the UltraServers combine four instances into one node, explained Garman. “Now you can load one of these really large models all into a single node, delivering much better latency and much better performance for customers without having to break it up.”

Garman also announced Project Rainier, which is being developed alongside generative AI company Anthropic. Project Rainier will build a cluster of Trn2 UltraServers containing hundreds of thousands of Trainium chips, interconnected with third-generation, low-latency, petabit-scale EFA networking.

When completed, it is expected to be the world’s largest AI compute cluster.

“The cluster is going to be five times the number of exaflops as the current cluster that Anthropic used to train their leading set of Claude models - five times the amount of compute that they used in the current generation. I am super excited to see what the Anthropic team comes up with at that size,” said Garman.

Garman also unveiled plans for the Trainium3 chip, which he said will be coming "later next year."

"Trainium3 will be our first chip at AWS made using a 3-nanometer process node. It will give you 2x more compute than Trainium2, and will be 40 percent more efficient," said Garman.

"It will allow you to build bigger, faster, and more exciting applications."