April 30, 2013
Computer scientists at Lawrence Livermore National Laboratory (LLNL) and Rensselaer Polytechnic Institute have set a high performance computing speed record that opens the way to the scientific exploration of complex planetary-scale systems.
In a paper to be published in May, the joint team will announce a record-breaking simulation speed of 504 billion events per second on LLNL’s Sequoia Blue Gene/Q supercomputer, roughly 41 times the previous record of 12.2 billion events per second, set in 2009.
Constructed by IBM, the 120-rack Sequoia supercomputer has a peak performance of 25 petaflops and is the second fastest supercomputer in the world, with a total speed and capacity equivalent to about one million desktop PCs. A petaflop is a quadrillion floating point operations per second.
In addition to breaking the record for computing speed, the research team set a record for the most highly parallel “discrete event simulation,” with 7.86 million simultaneous tasks using 1.97 million cores. Discrete event simulations are used to model irregular systems with behavior that cannot be described by equations, such as communication networks, traffic flows, economic and ecological models, military combat scenarios, and many other complex systems.
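To make the idea of a discrete event simulation concrete, the following is a minimal sequential sketch in Python. It is not ROSS and does not use the Time Warp algorithm; it only illustrates the core loop of processing timestamped events in time order, where handling one event may schedule later ones. The names `run_simulation`, `on_send`, `on_receive`, and the toy packet-forwarding model with its delays are assumptions made up for illustration.

```python
import heapq

def run_simulation(initial_events, handlers, end_time):
    """Process timestamped events in time order until end_time.

    Each queue entry is (time, seq, kind, payload). The unique seq value
    breaks ties between events that share a timestamp.
    """
    queue = list(initial_events)
    heapq.heapify(queue)
    seq = len(queue)
    log = []
    while queue:
        time, _, kind, payload = heapq.heappop(queue)
        if time > end_time:
            break
        log.append((time, kind, payload))
        # A handler returns the new events caused by this one,
        # each as (delay, kind, payload).
        for delay, new_kind, new_payload in handlers[kind](time, payload):
            heapq.heappush(queue, (time + delay, seq, new_kind, new_payload))
            seq += 1
    return log

# Hypothetical toy model: a packet is forwarded until its hop budget runs out.
def on_send(time, hops_left):
    # Assumed network latency: a sent packet arrives 3 time units later.
    return [(3, "receive", hops_left)]

def on_receive(time, hops_left):
    # Assumed processing delay: forward after 1 time unit while hops remain.
    return [(1, "send", hops_left - 1)] if hops_left > 0 else []

HANDLERS = {"send": on_send, "receive": on_receive}

def demo(end_time=100):
    return run_simulation([(0, 0, "send", 2)], HANDLERS, end_time)

if __name__ == "__main__":
    for event in demo():
        print(event)
```

A sequential loop like this handles one event at a time. Parallel simulators such as the one described here instead spread events across many tasks, which is where a synchronization algorithm becomes necessary.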
Prior to the record-setting experiment, a preliminary scaling study was conducted at the Rensselaer supercomputing center, the Computational Center for Nanotechnology Innovations (CCNI). The researchers tuned parameters on the CCNI’s two-rack Blue Gene/Q system and optimized the experiment to scale up and run on the 120-rack Sequoia system.
The records were set using the ROSS (Rensselaer’s Optimistic Simulation System) simulation package developed by Carothers and his students, and using the Time Warp synchronization algorithm originally developed by Jefferson.
“The significance of this demonstration is that direct simulation of ‘planetary scale’ models is now, in principle at least, within reach,” Barnes said. “‘Planetary scale’ in the context of the joint team’s work means simulations large enough to represent all 7 billion people in the world or the entire Internet’s few billion hosts.”
“This is an exciting time to be working in high performance computing, as we explore the petascale and move aggressively toward exascale computing,” Carothers said. “We are reaching an interesting transition point where our simulation capability is limited more by our ability to develop, maintain, and validate models of complex systems than by our ability to execute them in a timely manner.”
The calculations were completed while Sequoia was in unclassified “early science” service as part of the machine’s integration period. The system is now in classified service. Sequoia is dedicated to the National Nuclear Security Administration’s (NNSA) Advanced Simulation and Computing (ASC) program for stewardship of the nation’s nuclear weapons stockpile, a joint effort by LLNL, Los Alamos National Laboratory, and Sandia National Laboratories. The ASC program provided time on Sequoia to the LLNL-Rensselaer team as the capabilities tested have potential relevance to NNSA/DOE missions. This work also was supported by LLNL’s Laboratory Directed Research and Development program.
Since opening in 2007, the CCNI has enabled researchers at Rensselaer and around the country to tackle challenges ranging from advanced manufacturing to cancer screening to sustainable energy. External funding for these research activities has exceeded $50 million and has led to an economic impact of over $130 million across New York state. A partnership between Rensselaer and IBM, CCNI currently supports a network of more than 850 researchers, faculty, and students from a mix of universities, government laboratories, and companies across a diverse spectrum of scientific and engineering disciplines.
Rensselaer Polytechnic Institute
Donald B. Johnston
Lawrence Livermore National Laboratory