US semiconductor startup Cerebras claims that it has trained the largest AI model on a single device.
The company trained AI models with 20 billion parameters on its Wafer Scale Engine 2 (WSE-2) chip, the world's largest chip.
The WSE-2 has 2.6 trillion transistors. Built on TSMC's 7nm process, it has 850,000 'AI optimized' cores, 40GB of on-chip SRAM, 20 petabytes per second of memory bandwidth, and 220 petabits per second of aggregate fabric bandwidth.
The WSE-2 chip is sold packaged in the Cerebras CS-2, a 15U system that is deployed alongside HPE's SuperDome Flex. This combined system was used to train the models.
"Using the Cerebras Software Platform (CSoft), our customers can easily train state-of-the-art GPT language models (such as GPT-3 and GPT-J) with up to 20 billion parameters on a single CS-2 system," the company said in a blog post. "Running on a single CS-2, these models take minutes to set up and users can quickly move between models with just a few keystrokes."
AI companies do, however, train far larger neural networks; they simply use more than a single system to do so.
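To see why larger models usually spill across many devices, a rough back-of-the-envelope estimate helps. The arithmetic below is our own illustrative sketch (the byte-per-parameter figures are common rules of thumb for mixed-precision Adam training, not numbers from Cerebras):

```python
# Rough memory estimate for training a 20-billion-parameter model.
# Assumed rule of thumb for mixed-precision Adam training:
#   ~2 B FP16 weights + 2 B gradients + 12 B optimizer state = ~16 B/param.

def training_memory_gb(num_params: float, bytes_per_param: float = 16.0) -> float:
    """Approximate training memory footprint in gigabytes."""
    return num_params * bytes_per_param / 1e9

weights_only_gb = 20e9 * 2 / 1e9           # FP16 weights alone: 40 GB
full_training_gb = training_memory_gb(20e9)  # with gradients + optimizer: 320 GB

print(f"FP16 weights: {weights_only_gb:.0f} GB")
print(f"Full training footprint: {full_training_gb:.0f} GB")
```

By this estimate, even the weights of a 20-billion-parameter model already fill a typical 40GB GPU, and the full training state is several times larger, which is why such models are normally partitioned across many accelerators.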
Cerebras raised $250 million late last year at a $4bn valuation. Its systems are known to be used by the Argonne and Lawrence Livermore national laboratories, the Pittsburgh Supercomputing Center (PSC), AstraZeneca, GSK, Tokyo Electron Devices, and several oil and gas businesses.