A special report conducted by the NASA Office of Inspector General (OIG) has found that the space agency’s missions are being hampered by inadequate high-end computing (HEC) resources.

According to the audit, which was first reported by The Register, NASA’s supercomputers are insufficient for its tasks, with its systems largely relying on CPUs.

NASA's Aitken supercomputer module – NASA

NASA currently has five HEC systems, housed at the NASA Advanced Supercomputing (NAS) facility at Ames Research Center in California and the NASA Center for Climate Simulation (NCCS) at Goddard Space Flight Center in Maryland. They are the 13.12-petaflops Aitken, the 8.32-petaflops Electra, the 8.1-petaflops Discover, the 7.09-petaflops Pleiades, and the 154.8-teraflops Endeavour.

Although the Aitken has been used to support NASA’s Artemis program, the report found that the machines are almost exclusively powered by old CPUs, with the supercomputers at the NAS facility running on 18,000 CPUs but only 48 GPUs. According to the report, the systems at NCCS have even fewer GPUs.

By comparison, the world’s current fastest supercomputer, Frontier, has 37,888 GPUs.

The report also said that NASA’s HEC resources are “oversubscribed and overburdened”, meaning that Mission Directorates are requesting more computing time than existing capacity can provide.

As a result, NASA teams have often resorted to purchasing their own HEC resources to meet deadlines. The report cites the example of the Space Launch System team, which spends about $250,000 annually to purchase and locally manage its own HEC clusters rather than wait for existing HEC resources to become available.

“NASA also lacks a comprehensive strategy for when to use HEC assets on the premises versus when to utilize cloud computing options—or a widespread understanding of the cost implications for each choice,” the report read. “Stakeholders told us that while they know NASA has HEC cloud computing options, they were hesitant to use them due to unknown scheduling practices or assumed higher costs.”

The OIG proposed several recommendations to the space agency, which included establishing an executive leadership team to help reposition NASA’s HEC resources so they meet the agency’s “specialized needs.”

Eight other recommendations were made, such as identifying technology gaps essential for meeting current and future needs and strategic technological and scientific requirements; developing a strategy to improve the prioritization and allocation of HEC assets; and implementing an HEC classification designation for identifying HEC assets.

The report concluded by noting that NASA’s management agreed with the recommendation to establish an HEC executive leadership team and partially agreed with the other eight recommendations.

The OIG has requested regular updates from the agency to monitor progress toward implementation and better understand NASA’s planned actions to address the issues outlined by the report.