Installation work for the world's most powerful computer is now finished.
HPE and Intel last week said the installation of the 2 exaflops Aurora supercomputer was completed at the Argonne National Laboratory in Illinois.
The high-performance computing (HPC) system will have a performance capability of more than two exaflops, making it the world’s most powerful verified supercomputer, offering double the capacity of the current top of the list, Frontier. It is expected to go live later this year.
"Aurora is the first deployment of Intel's Max Series GPU, the biggest Xeon Max CPU-based system, and the largest GPU cluster in the world," said Jeff McVeigh, Intel corporate vice president and general manager of the super compute group.
The HPC system, housed at the Department of Energy’s Argonne National Lab in Lemont, comprises 10,624 blades and 21,248 Intel Xeon CPU Max Series general-purpose processors totaling more than 1.1 million cores for workloads that require CPU, as well as 63,744 Intel Data Center GPU Max Series units for AI and HPC workloads.
Aurora also has 1.36PB of on-package HBM2E memory and 19.9PB of DDR5 memory used by the CPUs, as well as 8.16PB of HBM2E carried by the Ponte Vecchio compute GPUs. Its storage subsystem uses 1,024 distributed asynchronous object storage (DAOS) nodes providing 220PB of storage, along with HPE Slingshot high-performance Ethernet.
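As a rough sanity check on the figures quoted above, the per-blade and per-chip numbers can be worked out directly. The short Python sketch below is illustrative only: it assumes decimal units (1PB = 1,000,000GB) and that the on-package CPU memory and the GPU memory are spread evenly across the chips, neither of which is stated by Intel or Argonne here. The variable names are made up for this example.

```python
# Figures as quoted in the article.
BLADES = 10_624
CPUS = 21_248
GPUS = 63_744
CPU_HBM_PB = 1.36   # on-package HBM2E on the CPUs
GPU_HBM_PB = 8.16   # HBM2E on the GPUs

# Assumption: decimal units, 1 PB = 1,000,000 GB.
GB_PER_PB = 1_000_000

# Chips per blade, assuming every blade is populated identically.
cpus_per_blade = CPUS / BLADES
gpus_per_blade = GPUS / BLADES

# Memory per chip, assuming an even split across all chips.
cpu_hbm_gb = CPU_HBM_PB * GB_PER_PB / CPUS
gpu_hbm_gb = GPU_HBM_PB * GB_PER_PB / GPUS

print(f"CPUs per blade: {cpus_per_blade:.0f}")
print(f"GPUs per blade: {gpus_per_blade:.0f}")
print(f"HBM2E per CPU: about {cpu_hbm_gb:.0f} GB")
print(f"HBM2E per GPU: about {gpu_hbm_gb:.0f} GB")
```

Under those assumptions, the quoted totals work out to two CPUs and six GPUs per blade, with roughly 64GB of HBM2E per CPU and 128GB per GPU.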
The system spans a space equivalent to two basketball courts and is distributed across 166 racks with 64 blades each. The Argonne National Laboratory does not publish official power consumption figures for Aurora. It has been suggested that for Aurora to reach its peak performance of 2.4 exaflops, it will need to consume 60MW.
While Aurora will be the most powerful supercomputer recognized globally, China reportedly has developed at least two exascale supercomputers already, with plans for up to 10 exascale HPCs by 2025. The country has not publicly released the specifications of the supercomputers, but the two in operation are both reportedly below 2 exaflops.
"While we work toward acceptance testing, we are going to be using Aurora to train some large-scale open-source generative AI models for science," said Rick Stevens, Argonne National Laboratory associate laboratory director. "Aurora, with over 60,000 Intel Max GPUs, a very fast I/O system, and an all-solid-state mass storage system, is the perfect environment to train these models."
Once acceptance testing is complete, the supercomputer can be used for a variety of workloads including weather prediction, aerodynamics research, medicine, and nuclear fusion simulations.
Aurora has had a bumpy ride to installation. The system was first pitched in 2015 as a 180 petaflops supercomputer using Intel’s Xeon Phi chips and was expected to go live in 2018. However, Intel delayed the launch of its Phi chips and eventually canceled them, leading to Aurora being reconfigured. Work eventually began in October 2021, with the system instead offering Intel’s new GPUs.
The 1.194 exaflops Frontier system, located at the Department of Energy’s Oak Ridge National Laboratory in Tennessee, is the No. 1 system on the most recent Top500 list of most powerful supercomputers. It is currently the only exascale system on the list, though more are set to join in the coming months and years.
In the US, the upcoming El Capitan is set to offer 2 exaflops when it launches later this year. NERSC-10, set to launch in 2026, could offer as much as 40 exaflops.
India is planning to deploy its own exascale system using home-grown processors. Param Shankh is set to launch in 2024.