Biopharmaceutical company AbbVie said that it used a Cerebras CS-2 system to train biomedical natural language processing (NLP) models.
The CS-2 features the Wafer Scale Engine 2 (WSE-2), the world's largest semiconductor, with 2.6 trillion transistors. Built on TSMC's 7nm process, it has 850,000 'AI optimized' cores, 40GB of on-chip SRAM, 20 petabytes per second of memory bandwidth, and 220 petabits per second of aggregate fabric bandwidth.
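To put those headline numbers in per-core terms, the short sketch below simply divides the quoted totals by the core count. It assumes decimal units (1GB = 10^9 bytes), that the bandwidth figure is a per-second rate, and that resources are spread evenly across cores; none of these are stated in the figures above, so the results are rough illustrations rather than official specifications.

```python
# Figures quoted for the WSE-2 above.
CORES = 850_000
SRAM_BYTES = 40e9             # 40 GB of on-chip SRAM (decimal units assumed)
MEM_BW_BYTES_PER_S = 20e15    # 20 PB/s of memory bandwidth (per-second rate assumed)


def per_core(total: float, cores: int = CORES) -> float:
    """Evenly divide a chip-wide total across all cores (an assumption)."""
    return total / cores


sram_per_core_kb = per_core(SRAM_BYTES) / 1e3
bw_per_core_gb_s = per_core(MEM_BW_BYTES_PER_S) / 1e9

print(f"SRAM per core:      ~{sram_per_core_kb:.0f} KB")
print(f"Bandwidth per core: ~{bw_per_core_gb_s:.1f} GB/s")
```

On these assumptions, each core has on the order of tens of kilobytes of local memory, which suggests the design relies on a very large number of small cores with memory placed close to compute.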
The WSE-2 chip is sold packaged in the Cerebras CS-2 system, a 15U box that also includes HPE’s SuperDome Flex.
The company claimed that it achieved performance more than 128 times that of a GPU while using one-third of the energy, although the CS-2 costs 'several million'.
“A common challenge we experience with programming and training BERT LARGE models is providing sufficient GPU cluster resources for sufficient periods of time,” said Brian Martin, Head of AI at AbbVie.
“The CS-2 system will provide wall-clock improvements that alleviate much of this challenge, while providing a simpler programming model that accelerates our delivery by enabling our teams to iterate more quickly and test more ideas.”
AbbVie has developed large AI language models to build its machine translation service, Abbelfish. This service translates and makes searchable libraries of biomedical literature across 180 languages using large Transformer models such as BERT, BERT LARGE, and BioBERT.
But the Abbelfish model has six billion parameters, making it hard to train on GPU clusters, while Cerebras's CS-2 is large enough to handle the model.
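One plausible reason a six-billion-parameter model is awkward on GPUs is memory. The sketch below estimates the training state under a common mixed-precision Adam setup. The byte counts per parameter and the 80GB GPU capacity are illustrative assumptions, not details reported by AbbVie or Cerebras, and activation memory is ignored entirely.

```python
import math

PARAMS = 6_000_000_000  # six billion parameters, as reported above

# Assumed per-parameter storage for mixed-precision training with Adam.
# This is a common rule of thumb, not AbbVie's actual configuration.
BYTES_PER_PARAM = {
    "fp16 weights": 2,
    "fp16 gradients": 2,
    "fp32 master weights": 4,
    "Adam first moment (fp32)": 4,
    "Adam second moment (fp32)": 4,
}

HYPOTHETICAL_GPU_GB = 80  # assumed memory of one high-end GPU


def training_state_gb(params: int, layout: dict[str, int]) -> float:
    """Memory for weights, gradients and optimizer state, in decimal GB."""
    return params * sum(layout.values()) / 1e9


weights_gb = PARAMS * BYTES_PER_PARAM["fp16 weights"] / 1e9
state_gb = training_state_gb(PARAMS, BYTES_PER_PARAM)
min_gpus = math.ceil(state_gb / HYPOTHETICAL_GPU_GB)

print(f"fp16 weights alone:  {weights_gb:.0f} GB")
print(f"full training state: {state_gb:.0f} GB (before activations)")
print(f"GPUs needed just to hold that state: at least {min_gpus}")
```

Under these assumptions the training state alone exceeds the memory of a single GPU, so the model has to be split across devices, which is the kind of cluster-level complexity the article describes.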
“At Cerebras Systems, our goal is to enable AI that accelerates our customer’s mission,” said Andrew Feldman, CEO and co-founder of Cerebras Systems.
“It’s not enough to provide customers with the fastest AI in the market — it also must be the most energy efficient and the easiest to deploy. It’s incredible to see AbbVie not only accelerating their massive language models, but doing so while consuming a fraction of the energy used by legacy solutions.”
Cerebras raised $250 million late last year at a $4 billion valuation. Supercomputing institutions Argonne, Lawrence Livermore, and PSC, as well as AstraZeneca, GSK, Tokyo Electron Device, and oil and gas businesses, are known to use the system.