Meta has shared details about the next generation of its Meta Training and Inference Accelerator (MTIA), custom-made chips designed for specific Meta AI workloads.
The company said the latest version of its chips shows significant performance improvements over MTIA v1 and is already being used to power Meta's AI products and services, such as its ranking and recommendation advertising models.
The updated MTIA is built on a 5nm process and delivers 354 TOPS (tera operations per second) of INT8 (8-bit integer) compute, or 177 TFLOPS at FP16 precision. The chip runs at 1.35GHz, has a thermal design power (TDP) of 90W, and has a die area of roughly 421mm².
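Those headline figures are internally consistent. As a sanity check, here is a minimal sketch that back-derives the quoted peaks from the 1.35GHz clock, assuming an 8x8 grid of 64 processing elements and, purely for illustration, 2,048 INT8 multiply-accumulates per PE per cycle; neither figure appears in this article, and the per-PE rate is simply chosen so the totals line up:

```python
# Back-of-the-envelope check of MTIA v2's quoted peak throughput.
# ASSUMPTIONS (not from the article): an 8x8 grid of 64 processing
# elements, each performing 2,048 INT8 multiply-accumulates per cycle;
# a MAC counts as two operations (one multiply plus one accumulate).

PES = 8 * 8                   # assumed processing-element grid
MACS_PER_PE = 2_048           # hypothetical INT8 MACs per PE per cycle
OPS_PER_MAC = 2               # multiply + accumulate
CLOCK_HZ = 1.35e9             # 1.35GHz, as quoted

int8_tops = PES * MACS_PER_PE * OPS_PER_MAC * CLOCK_HZ / 1e12
fp16_tflops = int8_tops / 2   # FP16 at half the INT8 rate, as quoted

print(f"INT8 peak: {int8_tops:.0f} TOPS")      # ~354 TOPS
print(f"FP16 peak: {fp16_tflops:.0f} TFLOPS")  # ~177 TFLOPS
```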
The MTIA v2 also features an improved network-on-chip (NoC) architecture, pairing 256MB of on-chip memory at 2.7TB/s of bandwidth with 384KB of local memory per processing element (PE) at 1TB/s.
Meta said these upgrades amount to a 3.5x increase in dense compute performance over MTIA v1 and a 7x improvement in sparse compute performance. The company has also tripled each PE's local storage, doubled the on-chip SRAM while increasing its bandwidth by 3.5x, and doubled the LPDDR5 capacity.
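Taken together, the compute and bandwidth figures imply very different crossover points at each level of the memory hierarchy. The hedged sketch below, reusing the 64-PE grid assumed above, estimates how many INT8 operations per byte a kernel must perform to stay compute-bound rather than bandwidth-bound; the roofline framing is our illustration, not Meta's own analysis:

```python
# Rough roofline arithmetic from the quoted MTIA v2 figures: the "ridge
# point" (peak ops divided by bandwidth) is the arithmetic intensity at
# which a kernel stops being memory-bound. Spec numbers are from the
# article; the 64-PE grid is an assumption carried over from above.

PEAK_INT8_OPS = 354e12           # 354 TOPS, whole chip
SRAM_BW = 2.7e12                 # 256MB on-chip memory at 2.7TB/s
N_PES = 64                       # assumed 8x8 processing-element grid
PE_PEAK_OPS = PEAK_INT8_OPS / N_PES
PE_LOCAL_BW = 1e12               # 384KB local memory per PE at 1TB/s

print(f"vs. shared on-chip memory: compute-bound above "
      f"~{PEAK_INT8_OPS / SRAM_BW:.0f} INT8 ops/byte")
print(f"vs. PE-local memory: compute-bound above "
      f"~{PE_PEAK_OPS / PE_LOCAL_BW:.1f} INT8 ops/byte")
```

On these numbers, the enlarged per-PE local stores do most of the work of keeping low-arithmetic-intensity workloads, such as ranking and recommendation, fed with data.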
The MTIAs will be housed in a rack-based system that holds up to 72 accelerators: three chassis, each containing 12 boards that carry two accelerators apiece. Meta said it has also upgraded the fabric between the accelerators, and between host and accelerators, to PCIe Gen5 to increase the system's bandwidth and scalability.
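Those counts pin down the rack-level arithmetic; a minimal sketch, using the per-chip INT8 peak quoted earlier:

```python
# Aggregate capacity of the rack-based system described above, derived
# directly from the quoted chassis, board, and per-chip figures.

CHASSIS_PER_RACK = 3
BOARDS_PER_CHASSIS = 12
ACCELERATORS_PER_BOARD = 2
INT8_TOPS_PER_CHIP = 354

accelerators = CHASSIS_PER_RACK * BOARDS_PER_CHASSIS * ACCELERATORS_PER_BOARD
print(f"Accelerators per rack: {accelerators}")  # 72
print(f"Peak INT8 per rack: "
      f"{accelerators * INT8_TOPS_PER_CHIP / 1_000:.1f} POPS")  # ~25.5
```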
The inference accelerator is part of Meta’s broader full-stack development program for custom, domain-specific silicon that addresses its unique workloads and systems. The company said MTIA has already been deployed in its data centers and is now serving models in production.
“We’re designing our custom silicon to work in cooperation with our existing infrastructure as well as with new, more advanced hardware (including next-generation GPUs) that we may leverage in the future,” wrote Joel Coburn, software engineer at Meta; Eran Tal, director of hardware systems at Meta; and Nicolaas Viljoen, technical lead director of AI/HPC systems at Meta, in a blog post announcing MTIA v2.
“Meeting our ambitions for our custom silicon means investing not only in compute silicon but also in memory bandwidth, networking, and capacity as well as other next-generation hardware systems. We currently have several programs underway aimed at expanding the scope of MTIA, including support for GenAI workloads.”
Meta added that it is still looking for engineers to help it continue to develop its chips, as first reported in February 2024.
Meta's pivot to AI, and the changing silicon underpinning it, forced the company to redesign its data centers around GPUs and other accelerators and to cancel multiple projects.
The company said in February that it expects to spend as much as $37 billion on digital infrastructure in 2024.