Oak Ridge National Laboratory

Innovation

ORNL, NOAA Launch Supercomputer for Climate Science Research

Newest Gaea system boosts performance for more-advanced climate modeling and simulation

Published: Thursday, May 4, 2023 - 12:01

(Oak Ridge National Laboratory: Oak Ridge, TN) -- In partnership with the National Oceanic and Atmospheric Administration (NOAA), Oak Ridge National Laboratory (ORNL) is launching a supercomputer dedicated to climate science research. The new system is the fifth supercomputer to be installed and run by the National Climate-Computing Research Center (NCRC) at ORNL.

The NCRC was established in 2009 as part of a strategic partnership between NOAA and the U.S. Department of Energy, and is responsible for procuring, installing, testing, and operating several supercomputers dedicated to climate modeling and simulations. The partnership’s goal is to increase NOAA’s climate modeling capabilities to further critical climate research. To that end, the NCRC has installed a series of increasingly powerful computers since 2010, each formally named “Gaea.” The latest system, also referred to as C5, is an HPE Cray machine with more than 10 petaflops (or 10 million billion calculations per second) of peak theoretical performance—almost double the power of the two previous systems combined.

C5 is one of three NOAA computers operating at ORNL. Typically, the NCRC operates only two supercomputers at a time for NOAA users, replacing them on a rotating schedule to provide uninterrupted access to more powerful machines while minimizing operational and maintenance costs.

“The power efficiency, cooling efficiency, and CPU power all increase significantly over time,” explains Paul Peltz, ORNL technical lead for Gaea. “We can replace all of the computational power of C3 with a single cabinet of C5, which has eight cabinets total.”
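To put the figures above in perspective, the arithmetic can be sketched in a few lines. This is only a rough illustration: it rounds "more than 10 petaflops" down to exactly 10, and it assumes peak performance is split evenly across the eight cabinets, which the article does not state. The variable names are illustrative.

```python
# Back-of-the-envelope arithmetic for the figures quoted in the article.
PETA = 10**15  # SI prefix: 1 petaflops = 10^15 floating-point operations per second

c5_peak_flops = 10 * PETA  # "more than 10 petaflops", rounded down to 10 here
c5_cabinets = 8            # cabinet count quoted by Peltz

# "10 million billion" calculations per second: 10 * 10^6 * 10^9
assert c5_peak_flops == 10 * 10**6 * 10**9

# Assumption (not from the article): every cabinet contributes equally.
per_cabinet_pflops = c5_peak_flops / c5_cabinets / PETA
print(f"{per_cabinet_pflops:.2f} petaflops per cabinet")
```

Under those assumptions, a single C5 cabinet would deliver on the order of 1.25 petaflops of peak performance. The article does not give C3's peak figure, so this is an upper bound implied by the quote rather than a measured comparison.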

Originally scheduled to arrive in fall 2021, C5’s delivery and installation were delayed several months by supply chain issues. “It was a unique period of time that made purchasing a system of this size very challenging,” says Chris Fuson, NOAA program manager at ORNL.


The ORNL team that installed and tested the newest Gaea system included (from left to right) Benny Sparks, Tori Robinson, Chris Coffman, Verónica Melesse Vergara, Nick Hagerty, Paul Peltz, A. J. Ruckman, and Chris Fuson. Credit: Genevieve Martin/ORNL

After the hardware arrived and C5 was assembled in summer 2022, the team began the testing and acceptance process, a standard but critical phase that pushes the system to test its reliability, stability, and performance under various workloads. This work was led by Verónica Melesse Vergara, leader of the System Acceptance and User Environment group. Working with her were ORNL staff members Tom Papatheodore, Dan Dietz, and Nick Hagerty. Initial tests, which find faulty hardware and confirm basic functionality, were followed by benchmarks and applications provided by NOAA that were representative of actual workloads.

“We load up the system with the application benchmarks and ensure the system can run with the expected performance,” says Dietz, a high-performance computing, or HPC, engineer at ORNL. “We slowly loaded up the number of copies of each benchmark running at once, easing on the gas to ensure the system doesn’t run into any issues under heavy load. We want to see consistent performance among all copies of the benchmark.”
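The ramp-up Dietz describes can be sketched roughly as follows. This is not ORNL's acceptance harness, only a minimal illustration of the idea: run a growing number of benchmark copies at once and check that runtimes stay consistent across copies. The function names, the 10% spread threshold, and the use of local threads in place of a batch scheduler are all assumptions made for the example.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def spread(timings):
    """Relative spread of runtimes: (max - min) / mean. 0.0 means identical."""
    mean = statistics.mean(timings)
    if mean == 0:
        return 0.0  # degenerate case: every copy reported zero runtime
    return (max(timings) - min(timings)) / mean


def ramp_up(run_copies, copy_counts, max_spread=0.10):
    """Increase the number of concurrent copies step by step.

    run_copies(n) must return a list of n runtimes in seconds.
    Stops at the first level whose runtimes differ by more than max_spread
    (a hypothetical threshold; a real acceptance test would define its own).
    Returns a list of (copies, spread, consistent) tuples.
    """
    results = []
    for copies in copy_counts:
        level_spread = spread(run_copies(copies))
        consistent = level_spread <= max_spread
        results.append((copies, level_spread, consistent))
        if not consistent:
            break  # stop adding load once copies diverge
    return results


def local_runner(benchmark):
    """Stand-in runner that launches copies as threads on one machine.

    On a real system the copies would be jobs submitted to the scheduler.
    """
    def timed():
        start = time.perf_counter()
        benchmark()
        return time.perf_counter() - start

    def run_copies(n):
        with ThreadPoolExecutor(max_workers=n) as pool:
            futures = [pool.submit(timed) for _ in range(n)]
            return [f.result() for f in futures]

    return run_copies


if __name__ == "__main__":
    # Toy benchmark: sleep briefly instead of running a climate model.
    runner = local_runner(lambda: time.sleep(0.01))
    for copies, level_spread, ok in ramp_up(runner, [1, 2, 4, 8]):
        print(copies, round(level_spread, 3), "ok" if ok else "inconsistent")
```

Separating the runner from the ramp logic keeps the consistency check independent of how copies are actually launched, so the same check could be fed timings collected from any job system.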

“Finding problems and fixing them before we open the system to users is rewarding,” says Vergara. “If we did our jobs correctly, then users will be able to run without major challenges; so often they are unaware of the bugs that were fixed before they had access.”

When Gaea goes into full production and is open to NOAA users, the ORNL team will take a step back and focus on system maintenance while preparations for the next system begin.

“ORNL is a custodian of the machine for NOAA,” says Peltz. “We provide strong HPC knowledge and top-class facilities, and we invest heavily in our ability to house these machines in a secure manner. Those are things that NOAA doesn’t have to worry about. This interoperability between agencies is great.”

University of Tennessee-Battelle manages ORNL for the U.S. Department of Energy’s Office of Science, the single largest supporter of basic research in the physical sciences in the United States. The Office of Science is working to address some of the most pressing challenges of our time. For more information, visit energy.gov/science.


A team from Hewlett Packard Enterprise worked with ORNL staff to install the new system. From left to right: Sean Smith, Sean Owens, Dave Garman, Cameron Thompson, Mike Sammarco, Conner Cunningham, and Austin Rice. Credit: Genevieve Martin/ORNL

About The Author

Oak Ridge National Laboratory

Oak Ridge National Laboratory is a multiprogram science and technology laboratory managed for the U.S. Department of Energy by the University of Tennessee-Battelle LLC. Scientists and engineers at ORNL conduct basic and applied research and development to create scientific knowledge and technological solutions that strengthen the nation’s leadership in key areas of science; increase the availability of clean, abundant energy; restore and protect the environment; and contribute to national security.