The Accelerating Computing for Emerging Sciences (ACES) supercomputer team has announced that its testbed is now in operation. The ACES team, led by Honggao Liu, Executive Director of High Performance Research Computing (HPRC) at Texas A&M University, is eager to help researchers nationwide use the new system to develop programming models and workflows and to advance data science projects.
“Our newly deployed testbed, ACES, enables applications and workflows to dynamically integrate the different accelerators, memory and in-network computing protocols,” said Liu, who is the principal investigator (PI) for the cluster. “Users with hybrid programming models will be able to quickly and efficiently process large amounts of data.”
Specifically, the ACES system gives researchers the flexibility to aggregate various components – processors, accelerators and memory – as needed. For instance, users can switch to and run on the accelerators best suited to their workflows.
One such user of the ACES testbed is Dr. Jian Tao, an assistant professor in the School of Performance, Visualization & Fine Arts at Texas A&M University, who has been developing new methods to handle the results of large-scale scientific simulations. He uses the novel Graphcore Intelligence Processing Units and NVIDIA H100 GPUs available on ACES to train neural network models that improve the compression and reconstruction of output from the Community Earth System Model.
“Without access to a supercomputer like ACES, my work would be very cumbersome,” said Tao. “ACES is revolutionary as it allows researchers like me to seamlessly switch between different accelerator technologies. Its design finally makes it possible to connect numerical simulations with artificial intelligence models in a single setting.”
The ACES cluster uses Liqid’s composable framework over hybrid PCIe (Peripheral Component Interconnect Express) Gen4 and Gen5 on Intel’s Sapphire Rapids processors to offer a rich accelerator testbed consisting of Intel Ponte Vecchio GPUs (graphics processing units), Intel FPGAs (field-programmable gate arrays), NVIDIA H100 GPUs, NEC Vector Engines, NextSilicon co-processors and Graphcore IPUs (intelligence processing units). The accelerators are coupled with Intel Optane memory and DDN Lustre storage, interconnected with Mellanox NDR 400 Gbps (gigabits per second) InfiniBand, to support workflows that benefit from optimized devices.
“ACES will be an asset for not only artificial intelligence and machine learning (AI/ML) projects, but also beneficial for researchers examining data-heavy studies such as health population informatics, genomics and bioinformatics, human and agricultural life sciences, quantum chemistry, molecular dynamics, climate modeling and the list continues,” said HPRC’s Director of User Services and Research Dhruva Chakravorty, who is a co-PI for the project. “We are grateful to ACCESS for providing users with allocations on ACES and look forward to continuing to make progress during this testbed phase.”
Time on ACES will be allocated through ACCESS. If you’re a researcher interested in accelerating your work via cyberinfrastructure – at no cost to you – visit the ACCESS allocations page.
Additional project co-PIs are Lisa Perez (Texas A&M University), Shaowen Wang (University of Illinois Urbana-Champaign) and Timothy Cockerill (Texas Advanced Computing Center).
The project is funded by NSF award no. 2112356.