First Exascale Computer Around the Corner, as Testbed 'Crusher' Starts Running Code
Crusher walks so upcoming Frontier can run.
The United States’ first exascale supercomputer, Frontier, won’t be fully operational until 2023, but computational users at the Oak Ridge Leadership Computing Facility are already running scientific codes on its architecture via Crusher, a newly unveiled test system with hardware identical to, and software similar to, that of the in-preparation machine.
“Crusher is the latest in a long line of test and development systems we have deployed for early users of OLCF platforms and is easily the most powerful of these we have ever provided,” OLCF Director of Science Bronson Messer said on Monday. “The results these code teams are realizing on the machine are very encouraging as we look toward the dawn of the exascale era with Frontier.”
Installed late last year at Oak Ridge National Laboratory in Tennessee, Frontier is a $600 million system going through integration and testing efforts ahead of its full deployment, slated for January 1. It’s going to be capable of performing more than a quintillion (a 1 followed by 18 zeros) calculations per second, an 8-fold increase in computational power over OLCF’s Summit supercomputer.
Crusher is “a 1.5-cabinet iteration of” Frontier, officials noted in the release. It contains 192 nodes equipped with central processing units and accelerators. Technology company Hewlett Packard Enterprise and chipmaker AMD produced components of both the testbed and Frontier.
Taking up 44 square feet of floor space, Crusher is “1/100th the size of the” government’s decommissioned Titan supercomputer, officials wrote. But it’s notably faster than that 4,352-square-foot system was.
“Enabling users via small testbeds is something we have done throughout the history of the OLCF,” Messer told Nextgov in an email.
In his view, all user codes face the twin concerns of node-level optimization and parallel scaling behavior, “especially for” hybrid computing architectures.
“Code development teams need a stable platform on which to work while other work continues on the main machine, including acceptance testing and other necessary tuning,” Messer explained. “Having the additional, smaller platform allows us to quickly and adeptly make any necessary changes to the software environment and to ensure that those changes will not negatively impact code performance.”
The OLCF is hosting three-day hackathons to get users’ codes up and running on Crusher, and eventually will do the same for Frontier. The press release details several projects currently running codes on the Crusher cabinet.
A nuclear physics code that can perform massive simulations of atomic nuclei is seeing 8-fold speedups on Crusher, officials noted, and a materials code that can perform large-scale calculations of up to 100,000 atoms has also been successfully deployed on the testbed.
Another project, underpinned by a partnership between the Energy Department and National Cancer Institute, aims to produce next-generation natural language processing models for precision medicine using “deep learning models that identify unseen connections between words in clinical text.” Those involved reported an 80% speedup running a model on a Crusher node, compared to previous systems.
Once Frontier reaches full user operations, Crusher will continue to be used as a test and development system to prepare scientific software for the first-of-a-kind exascale machine, Nextgov confirmed.