When you set out to build something that will use technology that has yet to be invented, you know it is going to be a difficult journey.
So it proved with Blue Waters, a project that began in 2006 with a US$300m grant from the US National Science Foundation and other federal agencies to build one of the world’s biggest supercomputers.
It is run by the National Center for Supercomputing Applications (NCSA), located at the University of Illinois at Urbana-Champaign. The project is backed by the National Science Foundation and the University of Chicago and includes as a partner the Great Lakes Consortium for Petascale Computation, which is itself made up of 28 universities and research laboratories.
FINAL PHASE
Five years after the project began, and four years after being chosen as the main IT supplier, IBM withdrew from the contract. In August last year, a joint statement was issued:
“Effective August 6, 2011, IBM terminated its contract with the University of Illinois to provide the supercomputer for the National Center for Supercomputing Applications’ Blue Waters project.
“The University of Illinois and NCSA had selected IBM in 2007 as the supercomputer vendor for the Blue Waters project, based on projections of future technology development. The innovative technology that IBM ultimately developed was more complex and required significantly increased financial and technical support by IBM beyond its original expectations.
“IBM, the University of Illinois and NCSA will explore other opportunities to continue the strong working relationship established during the Blue Waters project.”
In November 2011, NCSA signed a contract with supercomputer maker Cray. In January this year, Thom Dunning, the project director, wrote in a newsletter: “The Blue Waters system that our new partner, Cray, will install in the National Petascale Computing Facility will be a CPU-GPU hybrid computing system.
“With its massive computing capability – large, fast memory subsystem, improved interconnect and large, fast IO subsystem – it will bring sustained petaflop performance to a broad range of science and engineering applications in fields such as climate change, the spread of epidemics, earthquakes, fundamental chemistry and physics, and materials science. And, with its equally impressive GPU capability, it will serve as a bridge to the technologies on which future supercomputers will be based.”
READY FOR DEPLOYMENT
The project has now reached the deployment phase, with more than half the racks and systems in place (see Box 1).
The NCSA statement says: “The vast majority of the computation resources – 90% – are general purpose CPUs, usable immediately by science teams. To support the transformation of science applications, 10% of the physical resources – but about a third of the peak FLOPs – are the hybrid GPU nodes, making Blue Waters one of the largest GPU clusters in the world. Blue Waters also invested in more aggregate memory and what could be described as the most intense storage subsystems than any other HPC system known.”
Blue Waters will be composed of 244 Cray XE6 cabinets based on the recently announced AMD Opteron 6200 Series processor (formerly codenamed Interlagos) and 32 cabinets of a future version of the recently announced Cray XK6 supercomputer, with NVIDIA’s Kepler GPU computing capability incorporated into a single, powerful hybrid supercomputer, making this by far the largest system that Cray has delivered to date.
The Blue Waters system is intentionally designed to be the most balanced and most effective system of its generation. Instead of chasing a theoretical peak performance operation rate, or a Linpack measure, the system design follows the consistent project philosophy, says the NCSA. (“The Linpack Benchmark is a measure of a computer’s floating-point rate of execution. It is determined by running a computer program that solves a dense system of linear equations.” Source: www.top500.org/project/linpack.)
This means that the project may choose to remain outside the Supercomputer Top500.
This list, which is compiled twice annually, rates the top 500 supercomputers in the world using the Linpack method.
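To make the Linpack description concrete, here is a minimal sketch of the idea in Python: solve a random dense system by Gaussian elimination with partial pivoting, then divide a conventional operation count by the elapsed time. It is an illustration only, not the benchmark code used for the Top500. The function names, the problem size and the operation count of 2n³/3 + 2n² (the figure used by the classic Linpack program) are assumptions chosen for this example.

```python
import random
import time


def solve_dense(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on copies so the caller's data is left untouched.
    a = [row[:] for row in a]
    b = b[:]
    for k in range(n):
        # Partial pivoting: move the largest entry in column k onto the diagonal.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[p][k] == 0.0:
            raise ValueError("matrix is singular")
        a[k], a[p] = a[p], a[k]
        b[k], b[p] = b[p], b[k]
        # Eliminate column k from every row below the pivot.
        for i in range(k + 1, n):
            f = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= f * a[k][j]
            b[i] -= f * b[k]
    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / a[i][i]
    return x


def linpack_flops(n):
    """Operation count conventionally credited for an n x n solve (assumed)."""
    return 2.0 * n**3 / 3.0 + 2.0 * n**2


def run_benchmark(n=100, seed=0):
    """Return (floating-point operations per second, largest residual)."""
    rng = random.Random(seed)
    a = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    start = time.perf_counter()
    x = solve_dense(a, b)
    elapsed = time.perf_counter() - start
    # Largest entry of |Ax - b|, as a check that the answer is correct.
    residual = max(
        abs(sum(a[i][j] * x[j] for j in range(n)) - b[i]) for i in range(n)
    )
    return linpack_flops(n) / elapsed, residual


if __name__ == "__main__":
    rate, residual = run_benchmark()
    print(f"{rate / 1e6:.1f} Mflop/s, max residual {residual:.2e}")
```

The reported rate depends entirely on the machine and the interpreter, which is why a single Linpack figure says little about how a balanced system behaves on real science applications.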
BLUE WATERS: IT
Cray XE6 cabinets – 244
Cray XK6 cabinets – 32
Total cabinets, including storage and server cabinets – >300
Compute nodes – >25,000
Usable storage bandwidth – >1TB/s
Aggregate system memory – >1.5 Petabytes
Memory per core – 4GB
Interconnect topology – 3D Torus
Number of disks – >17,000
Number of memory DIMMs – >190,000
Usable storage – >25 Petabytes
Peak performance – >11.5 Petaflops
Number of AMD processors – >49,000
Number of AMD x86 cores – >380,000
Number of NVIDIA GPUs – >3,000
External network bandwidth – 100Gb/s scaling to 300Gb/s
Integrated near-line environment – scaling to 500 Petabytes
Bandwidth to near-line storage – 100GB/s
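Some of the figures in the box can be cross-checked against one another. The short sketch below does so for memory and cabinet counts. It treats the quoted "more than" figures as exact lower bounds and assumes decimal units (1 Petabyte = 1,000,000GB); the variable names are illustrative only.

```python
# Lower bounds quoted in the box (each is a "more than" figure there).
X86_CORES = 380_000
MEMORY_PER_CORE_GB = 4
XE6_CABINETS = 244
XK6_CABINETS = 32
TOTAL_CABINETS = 300

# Assumption: decimal units, 1 PB = 1,000,000 GB.
GB_PER_PB = 1_000_000

# Memory implied by the core count and the per-core figure.
aggregate_memory_pb = X86_CORES * MEMORY_PER_CORE_GB / GB_PER_PB

# Compute cabinets, and what is left over for storage and servers.
compute_cabinets = XE6_CABINETS + XK6_CABINETS
other_cabinets = TOTAL_CABINETS - compute_cabinets

print(f"Implied aggregate memory: {aggregate_memory_pb:.2f} PB")
print(f"Compute cabinets: {compute_cabinets}")
print(f"Storage and server cabinets: at least {other_cabinets}")
```

Under these assumptions the memory works out to 1.52 Petabytes, in line with the quoted figure of more than 1.5 Petabytes, and the 276 compute cabinets leave at least 24 for storage and servers.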
In Part II, Deputy Director Bill Kramer talks about the project.
The full article is available to read in FOCUS 21.