Google announced a number of artificial intelligence-focused tools and services for its cloud platform.
The company said that the Cloud TPU v5e is now available in preview, the latest generation of its in-house Tensor Processing Unit line. Compared to the TPU v4, which was released back in 2021, Google says the chip delivers up to two times higher training performance per dollar and up to 2.5 times higher inference performance per dollar for large language models and generative AI models.
The new TPU will be available in eight different virtual machine configurations, ranging from one TPU chip to more than 250 chips within a single slice. For those needing more compute, the company is rolling out 'Multislice,' a way to scale models across tens of thousands of TPU chips.
"Until now, training jobs using TPUs were limited to a single slice of TPU chips, capping the size of the largest jobs at a maximum slice size of 3,072 chips for TPU v4," Google's VP of ML, systems, and cloud AI Amin Vahdat and VP of compute and ML infrastructure Mark Lohmeyer said in a joint blog post.
"With Multislice, developers can scale workloads up to tens of thousands of chips over inter-chip interconnect (ICI) within a single pod, or across multiple pods over a data center network (DCN)."
Alongside the new TPUs, Google said that A3 virtual machines (VMs) will be generally available next month, featuring eight Nvidia H100 GPUs, dual 4th Gen Intel Xeon Scalable processors, and 2TB of memory. The instances were originally announced in May and can scale to 26,000 Nvidia H100 Hopper GPUs, although it's not clear how many H100s Google will be able to secure given the ongoing GPU shortage.
The cloud company said that generative AI startup Anthropic was an early user of the new TPU v5e and A3 VMs. While Google has invested $300 million in the startup, Anthropic is also a vocal Amazon Web Services customer.
"We’re excited to be working with Google Cloud, with whom we have been collaborating to efficiently train, deploy, and share our models," Tom Brown, Anthropic co-founder, said.
"We’re excited to be working with Google Cloud, with whom we have been collaborating to efficiently train, deploy and share our models... Google’s next-generation AI infrastructure powered by A3 and TPU v5e with Multislice will bring price-performance benefits for our workloads as we continue to build the next wave of AI.”