Argonne National Laboratory’s first exascale computer is coming soon, and will exclusively serve the research community. Scientists will use the new machine, named Aurora, to pursue some of the farthest-reaching science and engineering breakthroughs ever achieved with supercomputing.
Aurora’s revolutionary architecture, designed in collaboration with Intel and Cray, will support machine learning and data science workloads alongside traditional modeling and simulation workloads.
In terms of opening new frontiers in science, the sky’s the limit. Literally. The forthcoming machine will process data from the latest sky surveys to help answer some of the biggest questions in physics about the nature of the universe.
Researchers can also use the system to develop alternative energy sources, design safer vehicles, invent new materials, understand how our brains work, and find ways to keep us healthier and safer.
Aurora will be housed at the Argonne Leadership Computing Facility, a U.S. Department of Energy (DOE) Office of Science User Facility and a premier source of world-class computing resources for open science research since 2006.
For decades, the DOE has been building an aggressive scientific computing research program to give our nation a strategic competitive advantage in the advancement of science and technology. Today, the U.S. program is unrivaled in the world and provides groundbreaking discoveries in all fields of inquiry.
The massive machines at the core of the DOE supercomputing program are technological wonders in themselves—each a unique design, and each one ten to a hundred times more powerful than other systems used for scientific research. For the research community that relies on these machines to push the frontiers of science and engineering, big is never big enough and fast is never fast enough.
Each machine generation provides a fresh challenge to U.S. computer manufacturers—from the racks to the processors to the networking to the I/O system. Similarly, fulfilling the science potential of each new computing architecture requires significant changes to today’s software. The initiative is, and will continue to be, guided by pioneering visionaries in the mathematics and computational science community, stewarded by the DOE’s Office of Science, and operated at the cutting edge.
But while people have been using supercomputers to solve big problems for years, the capabilities of the machines that will soon begin rolling out in national labs around the country will be brand new.
Researchers will be able to run a greater diversity of workloads, including machine learning and data-intensive tasks, in addition to traditional simulations. Providing the data science software “stack” (the high-level programming languages, frameworks, and I/O middleware that are conventional toolkits) at exascale is a major effort in deploying Aurora.
Aurora will feature several technological innovations, including a revolutionary I/O system, the Distributed Asynchronous Object Store (DAOS), to support new types of workloads. Aurora will be built on a future generation of Intel® Xeon® Scalable processors accelerated by Intel’s Xe compute architecture. The Cray Slingshot™ fabric and Shasta™ platform will form the backbone of the system.
Programming techniques already in use on current systems will apply directly to Aurora. The system will be highly optimized across multiple dimensions that are key to success in simulation, data, and learning applications.
As the world’s data-centric workloads become more diverse, so do the architectures that process that data. Intel’s breadth of architectures spans scalar (CPU), vector (GPU), matrix (AI), and spatial (FPGA). These architectures, often referred to at Intel by the acronym SVMS, require an efficient software programming model to deliver performance. oneAPI addresses this with ease of use and performance, while eliminating the need to maintain separate code bases, multiple programming languages, and different tools and workflows.
The Aurora Early Science Program will prepare key applications for Aurora’s scale and architecture, and will solidify libraries and infrastructure to pave the way for other production applications to run on the system.
The program has selected 15 projects, proposed by investigator-led teams from universities and national labs and covering a wide range of scientific areas and numerical methods.
In collaboration with experts from Intel and Cray, ALCF staff will train the teams on the Aurora hardware design and how to program it. This includes not only code migration and optimization, but also mapping the complex workflows of data-focused, deep learning, and crosscutting applications. The facility will publish technical reports that detail the techniques used to prepare the applications for the new system.
In addition to fostering application readiness for the future supercomputer, the Early Science Program allows researchers to pursue innovative computational science campaigns not possible on today’s leadership-class supercomputers.
The combination of simulation, data science, and machine learning will transform how supercomputers are used for scientific discovery and innovation.
PI: Anouar Benali, Argonne National Laboratory
DOMAIN: Materials Science
PI: C.S. Chang, Princeton Plasma Physics Laboratory
PI: Thomas Dunning, Pacific Northwest National Laboratory
PI: Katrin Heitmann, Argonne National Laboratory
PI: Kenneth Jansen, University of Colorado Boulder
PI: David Bross, Argonne National Laboratory
PI: Salman Habib, Argonne National Laboratory
PI: Kenneth Jansen, University of Colorado Boulder
PI: James Proudfoot, Argonne National Laboratory
PI: Amanda Randles, Duke University and Oak Ridge National Laboratory
DOMAIN: Biological Sciences
PI: William Detmold, Massachusetts Institute of Technology
PI: Nicola Ferrier, Argonne National Laboratory
DOMAIN: Biological Sciences
PI: Noa Marom, Carnegie Mellon University
DOMAIN: Materials Science
PI: Rick Stevens, Argonne National Laboratory
DOMAIN: Biological Sciences
PI: William Tang, Princeton Plasma Physics Laboratory