IBM has announced two new AI-focused chips for its IBM Z mainframes.
At the Hot Chips 2024 event this week in California, Big Blue revealed architecture details for the upcoming Telum II processor and Spyre accelerator chips.
“The new technologies are designed to significantly scale processing capacity across next-generation IBM Z mainframe systems helping accelerate the use of traditional AI models and large language AI models in tandem through a new ensemble method of AI,” the company said.
Designed to power IBM Z systems, the new Telum II chip offers higher frequency and memory capacity than the first-generation Telum chip, along with a 40 percent increase in cache, an integrated AI accelerator core, and a coherently attached Data Processing Unit (DPU).
Each Telum II features eight cores running at 5.5GHz, with 36MB of L2 cache per core and a total on-chip cache capacity of 360MB. The virtual level-4 cache of 2.88GB per processor drawer is a 40 percent increase over the previous generation.
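The cache figures are easier to reconcile with a little arithmetic. Eight cores at 36MB each account for only 288MB, so the 360MB total implies more 36MB slices on the chip than there are cores. The short sketch below works through the quoted numbers. It assumes decimal units (1GB = 1,000MB), and the reading of the results as "slices per chip" and "chips per drawer" is an interpretation rather than something IBM is quoted as saying here.

```python
# Figures quoted in the article.
CORES_PER_CHIP = 8
L2_PER_CORE_MB = 36
ON_CHIP_CACHE_MB = 360
DRAWER_L4_MB = 2880  # 2.88GB, assuming 1GB = 1,000MB


def cache_breakdown():
    """Return (L2 tied to cores in MB, implied 36MB slices, implied chips per drawer)."""
    core_l2_mb = CORES_PER_CHIP * L2_PER_CORE_MB
    # How many 36MB slices are needed to reach the quoted on-chip total?
    implied_slices = ON_CHIP_CACHE_MB // L2_PER_CORE_MB
    # How many 360MB chips add up to the quoted per-drawer virtual L4?
    implied_chips = DRAWER_L4_MB // ON_CHIP_CACHE_MB
    return core_l2_mb, implied_slices, implied_chips


if __name__ == "__main__":
    core_l2_mb, slices, chips = cache_breakdown()
    print(f"L2 attached to cores: {core_l2_mb}MB")
    print(f"36MB slices implied by 360MB: {slices}")
    print(f"360MB chips implied by 2.88GB: {chips}")
```

On these numbers, the on-chip total corresponds to ten 36MB slices, two more than the core count, and the per-drawer figure corresponds to eight chips' worth of on-chip cache.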
Spyre provides up to 1TB of memory, built to work in tandem across the eight cards of a regular IO drawer, and is designed to consume no more than 75W per card. Each chip will have 32 compute cores supporting int8 and fp16 datatypes.
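As a rough illustration of what a full drawer of Spyre cards adds up to, the sketch below multiplies out the per-card figures. It assumes that the 1TB figure is the pooled memory of all eight cards, that it is split evenly between them, and that 1TB means 1,024GB. None of these assumptions is spelled out in IBM's statement, and the function name is hypothetical.

```python
# Figures quoted in the article.
CARDS_PER_DRAWER = 8
MAX_WATTS_PER_CARD = 75
POOLED_MEMORY_GB = 1024  # assumption: 1TB = 1,024GB, pooled across the drawer


def drawer_budget(cards=CARDS_PER_DRAWER):
    """Return the upper-bound power draw and per-card memory share for a drawer."""
    return {
        "max_power_w": cards * MAX_WATTS_PER_CARD,
        # Assumes the pooled memory is divided evenly between the cards.
        "memory_per_card_gb": POOLED_MEMORY_GB / cards,
    }


if __name__ == "__main__":
    budget = drawer_budget()
    print(f"Max accelerator power per drawer: {budget['max_power_w']}W")
    print(f"Memory per card: {budget['memory_per_card_gb']:.0f}GB")
```

Under those assumptions, a fully populated drawer tops out at 600W of accelerator power, with 128GB of memory per card.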
The new processor is expected to support enterprise compute solutions for LLMs, servicing the industry's complex transaction needs. The DPU is engineered to accelerate complex IO protocols for networking and storage on the mainframe. Each Spyre accelerator chip is attached via a 75-watt PCIe adapter and is based on technology developed in collaboration with IBM Research.
Both chips will be manufactured by Samsung Foundry, IBM’s long-standing fabrication partner, on Samsung's 5nm process node. Both Telum II and Spyre are set to be available next year.
"Our robust, multi-generation roadmap positions us to remain ahead of the curve on technology trends, including escalating demands of AI," said Tina Tarquinio, VP, product management, IBM Z and LinuxONE. "The Telum II Processor and Spyre Accelerator are designed to deliver high-performance, secured, and more power-efficient enterprise computing solutions. After years in development, these innovations will be introduced in our next-generation IBM Z platform so clients can leverage LLMs and generative AI at scale."
IBM said a system equipped with a Spyre cluster could leverage much more complex AI models to identify intricate fraud patterns that a less sophisticated model might have missed.
The company announced the original Telum chip in 2021. Its latest mainframe, the z16, launched in 2022 and was the first to be equipped with the Telum chip.