The US National Science Foundation (NSF) is funding a new project to create a composable supercomputer.
The project, known as the Accelerating Computing for Emerging Sciences (ACES) system, will use a composable framework built on PCIe (Peripheral Component Interconnect Express) Gen5 and Intel's Sapphire Rapids (SPR) processors.
Texas A&M University, along with the University of Illinois at Urbana-Champaign and the University of Texas at Austin, will build a $5 million prototype for the NSF.
ACES will allow applications and workflows to dynamically integrate different accelerators, memory, and in-network computing protocols to rapidly process large volumes of data. Texas A&M said the system will remove bottlenecks by aggregating components on an as-needed basis, letting researchers run workloads on the processors and accelerators best suited to them.
The system will also include Intel Ponte Vecchio (PVC) GPUs (Graphics Processing Units), Intel FPGAs (Field Programmable Gate Arrays), NEC Vector Engines, NextSilicon co-processors, and Graphcore IPUs (Intelligence Processing Units), as well as Intel Optane memory and DDN Lustre storage interconnected with Mellanox NDR 400Gbps InfiniBand.
The system will reportedly use Liqid's Matrix technology.
Timothy M. Cockerill, director of user services at the Texas Advanced Computing Center (TACC) at UT Austin and a co-principal investigator on the ACES project, said researchers will essentially be able to build the custom environment they require on a per-job basis, rather than being constrained to the contents of a physical server node.
“We are thankful to NSF for the opportunity to lead such an important initiative and to our Texas A&M HPRC staff and collaborators at UT Austin and UIUC for making this a successful effort,” said Texas A&M Interim Vice President for Research Jack Baldauf. “Computational science is critical to our national needs and the ACES platform will not only advance research but also help educate the future workforce in this area.”
Texas A&M Senior Associate Vice President for Research Costas N. Georghiades added, “With the increasing complexity of computational problems in the big-data era we live in, it is no longer sufficient to use traditional supercomputers which rely only on optimizing the software.” He said that the ACES system will also be able to adapt hardware resources on the fly to tackle complex computational tasks more efficiently.
According to HPCwire, the ACES platform is expected to be deployed by September 2022 and will be housed in a data center on the Texas A&M campus. In addition to the initial $5 million grant, NSF will provide $1 million per year over five years to pay for system operation and support.