Amazon Web Services (AWS) is giving artificial intelligence (AI) researchers free computing power to use its Trainium chips.
In what is likely an effort to challenge Nvidia's popularity among researchers, AWS is offering researchers cloud computing credits worth up to $110 million.
The program has been dubbed "Build on Trainium." In total, Amazon is making 40,000 of its first-generation Trainium chips available to researchers. The company has released two generations of Trainium, with a third in development.
Amazon will also enable researchers to directly program the chips and will publish documentation about the instruction set architecture.
AWS will conduct "multiple rounds of Amazon Research Awards calls for proposals, with selected proposals receiving AWS Trainium credits, and access to the large Trainium UltraClusters for their research."
Gadi Hutt, lead of business development for AI chips at AWS, said that this approach is designed to tempt customers who want to make small tweaks to chips that could lead to big gains when using tens of thousands of chips simultaneously.
"Think about folks that are using infrastructure and putting hundreds of millions of dollars, if not more" toward rented computing power, Hutt said. "They would take any opportunity possible to increase performance and reduce the cost."
Among those to have taken advantage of the offering are Carnegie Mellon University (CMU) and the University of California, Berkeley.
“AWS’s Build on Trainium initiative enables our faculty and students large-scale access to modern accelerators, like AWS Trainium, with an open programming model. It allows us to greatly expand our research on tensor program compilation, ML parallelization, and language model serving and tuning,” said Todd C. Mowry, a professor of computer science at CMU.
“Trainium is beyond programmable—not only can you run a program, you get low-level access to tune features of the hardware itself,” added Christopher Fletcher, an associate professor of computer science research at the University of California at Berkeley. “The knobs of flexibility built into the architecture at every step make it a dream platform from a research perspective.”
First announced in December 2020, Trainium chips are purpose-built for "high-performance ML training applications in the cloud."
Trainium2 was launched last year. Amazon is currently developing its third generation of Trainium chips, which are expected to consume 1 kW of power and require liquid cooling.