Amazon Web Services' (AWS) next-generation Trainium3 chip will likely consume 1kW of power, according to the company’s VP of infrastructure services, Prasad Kalyanaraman.
Speaking to news outlet Fierce Networks, Kalyanaraman did not specify the chip’s wattage, but noted that “the next generation [of Trainium chips] will require liquid cooling. When a chip goes above 1,000 watts, that’s when they require liquid cooling.”
First announced in December 2020, Trainium chips are purpose-built for ‘high-performance ML training applications in the cloud.’
The company’s chip portfolio also includes Graviton2, a CPU based on Arm’s Neoverse cores, and Inferentia, a dedicated inference chip designed to help customers run AI applications.
Currently, the only other chips that draw 1kW of power are Nvidia’s upcoming Blackwell family of GPUs, although Intel is rumored to be developing a 1.5kW chip.
Almost all of AWS’ data centers currently use air cooling, but Kalyanaraman said AWS is now looking to adopt single-phase cold plate technology, rather than immersion cooling, to support high-density workloads.
He added that the company is also looking to optimize its data centers through strategic rack positioning and networking setups. That work includes ensuring the company’s next-generation switches support 51.2Tbps and working with EML, laser, and transponder providers to mix and match optical components.
According to Fierce Networks, Kalyanaraman did not provide any information about when Trainium3 would be available or when AWS would start rolling out liquid cooling in its data centers.
In September 2023, Amazon announced it would invest up to $4 billion in generative AI startup Anthropic, with the company stating that Anthropic will use AWS Trainium and Inferentia chips to build, train, and deploy its future models.