This year, the Practice and Experience in Advanced Research Computing (PEARC) conference will be held in Columbus, OH, from Sunday, July 20, to Thursday, July 24. The PEARC conference provides a forum for the research computing community to discuss challenges, opportunities and solutions. If you’re attending PEARC25, you’ll see ample ACCESS representation. ACCESS will be at booth 35/36 in the Exhibitor Hall, where you can stop by to speak with ACCESS staff about the program and be among the first to pick up its Plan Year 3 Highlights Book. Below is a list of ACCESS-affiliated presentations, with times and locations, to add to your calendar. Descriptions of each event are abridged; you can find complete descriptions on the full PEARC agenda here. All times listed are Eastern Time (ET). We look forward to seeing you there!

Monday, July 21
Tutorial: A Guideline to Writing a Successful Proposal for ACCESS and Other National Compute Resources
9 a.m. – 12:30 p.m.
Room: A213
Authors: Lars Koesterke and Ken Hackworth
In this session, presenters will address the two most persistent problems that researchers face during the application process: selecting the appropriate resource among the variety of choices offered and writing a successful application that translates a solid science project into a strong proposal ready to take on the competition.
Tutorial: Collaborative Cloud Science – Deploying The Littlest JupyterHub on Jetstream2
9 a.m. – 12:30 p.m.
Room: The Eisenman Room
Authors: Julian Pistorius and Stephen Bird
In this 3-hour hands-on tutorial, participants will set up an instance (aka virtual machine) on the Jetstream2 research cloud and install The Littlest JupyterHub (TLJH) to create a shared computing system. Designed for researchers and educators with basic Linux skills, the session focuses on a simple, practical setup that they can repeat at their institutions.
Workshop: Opportunities, Benefits and Challenges of Sharing Memory Between CPUs and GPUs
9 a.m. – 12:30 p.m.
Room: A216
Authors: Igor Sfiligoi, Mahidhar Tatineni, Dan Stanzione, John Cazes and Amit Ruhela
This workshop offers a comprehensive overview of next-generation platforms, focusing on unified shared memory between CPU and GPU cores. Presenters will highlight application-driven performance analysis across diverse HPC systems and share early insights and optimization techniques for workloads on distinct platforms from various vendors. Participants are expected to have at least some experience with either developing or supporting software for GPU-based systems or operating GPU-based HPC systems.
Tutorial: Intelligible, Powerful Tools for Supercomputer Users
9 a.m. – 12:30 p.m.
Room: B132
Authors: Chun-Yaung Lu, Kent Milfeld, Yinzhi Wang and Wenyang Zhang
To help supercomputer users focus on the science of their research and to minimize the workload for the consulting team, TACC has designed, developed, and maintains a collection of powerful tools for supercomputer users. In this tutorial, we will present supercomputer tools specifically designed for complex user environments (Lmod, mkmod), tools for workflow management (ibrun, launcher, launcher-GPU, Pylauncher), tools for job monitoring and profiling (Remora, Peak, amask, etc.), and GPU tools (Nsight Systems and Nsight Compute), demonstrated on the Vista and/or Frontera supercomputers at TACC. Tutorial attendees are expected to have basic Linux experience, some experience with multiprocessing (MPI/OpenMP), and familiarity with basic HPC CPU and GPU architectures.
Tutorial: Introduction to FABRIC
9 a.m. – 12:30 p.m.
Room: A220
Authors: James Griffioen, Charles Carpenter and Mami Hayashida
FABRIC is an advanced, programmable global network testbed for research and education that enables experimentation, rapid prototyping, and validation of new network and distributed computing applications and services that are impossible or impractical in the current Internet. This introductory tutorial will onboard attendees to the FABRIC network and then take them through introductory and intermediate hands-on example use cases, including: 1) Creating and deploying basic experiments, 2) Running intelligent big data computations across FABRIC, and 3) Using FABRIC’s integrated measurement framework. The tutorial will be of particular value to RCD facilitators who assist researchers in dealing with the challenges of managing, accessing, and processing large data sets; it’s designed for users who have little or no experience with FABRIC. Basic experience with the Linux command line and a minimal programming background (preferably Python) are desired. Users should also have some experience with remote login (e.g., ssh) and the use of (remote) virtual machines.
Workshop: National Cyberinfrastructure Resources in the Classroom
9 a.m. – 12:30 p.m.
Room: A214
Authors: Stephen Deems, Jeremy Fischer, Tom Maiden, Julian Pistorius, Zachary Graber and Lena Duplechin Seymour
This workshop aims to demonstrate the value of leveraging NSF-funded shared cyberinfrastructure resources to enhance the educational experience for both instructors and students. Co-led by the Pittsburgh Supercomputing Center and Indiana University, the workshop will provide participants with practical insights into using these resources for educational purposes, showcasing methods for two distinct platforms with differing capabilities, and provide a venue for rich discussion and recommendations. Participants are expected to have a basic understanding of popular computational tools utilized in classroom settings; prior experience with high-performance computing is not required.
Workshop: 2nd Workshop on Broadly Accessible Quantum Computing
Full-day workshop
Room: A212
Authors: Bruno Abreu, Tommaso Macri, Santiago Nunez-Corrales and Yipeng Huang
Building on last year’s success, this workshop will explore the latest advancements in quantum computing (QC) and its integration with high-performance computing (HPC) and related applications. This year’s edition expands discussions on practical applications, hybrid quantum-classical strategies, and funding opportunities. Through invited talks, panels and community contributions, we will address workforce development, policy considerations, and strategies for making quantum resources more accessible. Designed for participants of all backgrounds, no prior quantum computing experience is required.
Tutorial: Programming and Profiling Modern Multicore Processors
Full-day tutorial
Room: A123-A124
Authors: Amit Ruhela, Matthew Cawood, Yinzhi Wang, Hanning Chen, and Zuzanna Jedlinska
Modern processors are scaling out rather than up and increasing in complexity. Because the base frequencies for large core count chips hover between 2 and 3 GHz, researchers can no longer rely on frequency scaling to increase the performance of their applications. This tutorial will cover serial and thread-parallel optimization, including introductory and intermediate concepts of vectorization and multi-threaded programming principles. We will address CPU and GPU profiling techniques and tools, as well as give a brief overview of modern HPC architectures. The tutorial will include hands-on exercises in parallel optimization, and profiling tools will be demonstrated on TACC systems. This tutorial is designed for intermediate programmers familiar with OpenMP and MPI who wish to learn how to program for performance on modern architectures.
Workshop: Collaborating Your Way to Sustainability (Focus Week@PEARC25)
Full-day workshop
Room: A211
Authors: Claire Stirm, Nancy Maron, Juliana Casavan and Maytal Dahan
Digital projects – science gateways, data repositories, educational websites – deliver a great deal of value to users by widely sharing sophisticated tools, large data sets or access to computing capabilities among those in the academic sector who really need them. However, sustaining and scaling these projects in a way that ensures long-term growth and impact is notoriously difficult. This full-day, dynamic and exercise-based workshop offers training on sustainability strategies and practical tools to help those creating and maintaining gateways and other innovative projects, with a focus on understanding your audience and identifying useful partnerships. This workshop is ideally suited for participants who have built or are directly involved in creating or supporting innovative digital projects, such as science gateways, cyberinfrastructure, or other products and services.
Tutorial: ACES Tutorial for using Graphcore Intelligence Processing Units (IPUs) for AI/ML Workflows
1:30 – 5 p.m.
Room: A213
Authors: Zhenhua He, Joshua Winchell, Richard Lawrence, Dhruva Chakravorty, Lisa Perez and Honggao Liu
The Accelerating Computing for Emerging Sciences (ACES) computing platform, funded by the NSF and hosted at Texas A&M University, has been made available to the national cyberinfrastructure (CI) community through ACCESS and the NAIRR Pilot. This computing platform features various innovative accelerators, including the Graphcore Intelligence Processing Units (IPUs), which offer a model zoo and other utilities to help researchers speed up their AI/ML computing workflows. Researchers participating in this tutorial will learn how to port their TensorFlow (Keras) and PyTorch models for use with the Graphcore IPUs on ACES, as well as model replication and pipelining techniques to distribute workloads. Prerequisites: basic Python programming skills, knowledge of deep learning frameworks such as TensorFlow and PyTorch, and an NSF ACCESS ID (apply for an ACCESS ID).
Workshop: Campus Champions and NAIRR: Empowering AI Research Facilitation Through Collaboration
1:30 – 5 p.m.
Room: A110-A111
Authors: Michael D. Weiner, Forough Ghahramani, Cyd Burrows-Schilling, Marina Kraeva, Nitin Sukhija, Chuck Pavloski, Mike Renfro, Jason Simms and Juan Jose Garcia Mesa
The Campus Champions (CC) community has been functioning as an independent entity while partnering with other organizations in the Research Computing and Data (RCD) ecosystem since the end of the XSEDE era. These partnerships foster a dynamic and connected community of advanced research computing professionals that promotes leading practices at the frontiers of research, scholarship, teaching and industry application. This workshop is for anyone currently involved with Campus Champions or interested in joining, and it will empower attendees to facilitate AI research by sharing opportunities through the Campus Champions and spreading awareness of the NAIRR Pilot’s resources.
Workshop: Collaborating with K12 Schools: Supporting Secondary Students and Teachers in Computing
1:30 – 5 p.m.
Room: B132
Authors: Sandra Nite, Joshua Winchell and Dhruva Chakravorty
Collaborations between institutions of higher education and P12 schools foster deeper understanding of the work done by teachers, professors and administrators. Today’s P12 students will become tomorrow’s college students and members of the workforce, and collaboration among instructors and administrators at all levels of the education process benefits all of society. In this session, presenters and the audience will discuss collaboration opportunities and share successful experiences so that we can learn from each other.
Workshop: How Computational Infrastructures Can Support Scalable AI-Readiness of Data to Power Collaboration
1:30 – 5 p.m.
Room: A112-A113
Authors: Sergiu Sanielevici, Laurette Dubé, Christine Kirkpatrick, Raghu Mahiraju, Erik Schultes and Amitava Majumdar
The goal of this workshop is to discuss the synergy between making data AI-ready and the implementation of FAIR principles. Engaging in a deep dialogue with PEARC25 attendees should lead to recommendations for practitioners who develop and utilize scientific and commercial digital ecosystems to advance the creation of trustworthy and productive AI and reusable data infrastructure. These recommendations will be integrated into the stakeholder outreach strategies of the organizing institutions and shared with the advanced computing and data science communities served by the PEARC conference series. No specific skills or background are needed to attend this workshop.
Tutorial: CI Usage and Performance Data Analysis with XDMoD and NetSage for Resource Providers
1:30 – 5 p.m.
Room: A214
Authors: Aaron Weeden, Joseph P. White and Jennifer M. Schopf
In this interactive, hands-on tutorial, attendees will learn how to analyze the usage and performance of the NSF ACCESS-allocated cyberinfrastructure (CI) using the visualization and reporting capabilities of the ACCESS XDMoD and NetSage tools. ACCESS XDMoD provides system support personnel and center leadership with a wide variety of data on usage and job-level and system-level performance in real-time or through custom reporting. NetSage is an open, privacy-aware network measurement, analysis and visualization service that provides near real-time monitoring and visualization of data transfers to help ensure maximum efficiency. This tutorial will instruct attendees on how to use these tools and the wide variety of metrics available to facilitate CI system management, support and planning.
Tutorial: Deploy & Manage Kubernetes on Jetstream2 using OpenStack Magnum
1:30 – 5 p.m.
Room: The Eisenman Room
Authors: Julian Pistorius and Stephen Bird
In this 3-hour hands-on tutorial, participants will learn how to use OpenStack Magnum to create and manage Kubernetes clusters on the Jetstream2 research cloud. Designed for research software engineers and IT support staff with intermediate Linux skills and a basic understanding of containers and container orchestration, this session provides a repeatable process to build a scalable, container-based research system for their institutions.
Tutorial: Open OnDemand Overview, Customization, and App Development
1:30 – 5 p.m.
Room: A226
Authors: Alan Chalker, Julie Ma, Emily Moffat Sadeghi, Travis Ravert, Dhruva Chakravorty and Marinus Pennings
Developed by the Ohio Supercomputer Center and funded by the NSF, Open OnDemand is an open-source portal that enables web-based access to HPC services from any device with a web browser. Key features: it requires zero installation (since it runs entirely in a browser), is easy to use (via a simple interface) and is compatible with any device (even a mobile phone or tablet). The session leaders, all part of the Open OnDemand development team, will give a short overview of Open OnDemand, demo its features, give examples of customizing Open OnDemand and configuring interactive apps, and provide an overview of the development roadmap, followed by a discussion of community needs.
Tutorial: The Streetwise Guide to Jupyter Security
1:30 – 5 p.m.
Room: Trott Room
Authors: Rick Wagner and Robert Beverly
The Jupyter paradigm alters the threat landscape and presents unique challenges for infrastructure operators. Security concerns are even more pronounced in the heterogeneous environments of modern, complex, data-driven science and workflows. This tutorial – presented by a member of the Jupyter Security Subproject and a professor at a state university with a large JupyterHub deployment – will provide an overview of the Jupyter ecosystem and how it is most effectively used before diving into hands-on exercises covering the current best practices for securing and operating a JupyterHub installation in different environments. This tutorial is targeted at people looking to understand security in deploying and running Jupyter, with an emphasis on multi-user JupyterHub servers.
Expanding ACCESS: Tools and Innovations for the Broader Cyberinfrastructure Community
1:30 – 5 p.m.
Room: A225
In this session, members of ACCESS will highlight current tools, services and initiatives designed to benefit the broader advanced cyberinfrastructure community, extending beyond the direct scope of ACCESS resources. The session will feature a series of brief presentations focused on tools and resources that may be of interest to the wider community, with the goals of sparking interest and encouraging further engagement with ACCESS, and offering solutions that can be adopted and implemented at your home institution.
Tuesday, July 22
Exhibition Hall opens. ACCESS Booth 35/36 open.
8:15 a.m. – 6 p.m.
Battelle North
Systems and System Software Track: Data Driven CI System Design and Procurement with Open XDMoD
11 – 11:25 a.m.
Room: A110-A111
Authors: Thomas Furlani, Matthew Jones and Joseph White
The ability to apply data-driven design principles to customize new CI investment to best serve the intended community, as well as provide fact-based justification for its need, is critical given the important role it plays in research and economic development and its high cost. Here we describe a data-driven approach to CI system design based on workload analysis obtained using the popular open-source CI management tool Open XDMoD, and how it was leveraged in a procurement to increase the size of the cluster by 12% and provide end-users with an additional 5.6 million CPU hours annually. In addition to system design, we demonstrate Open XDMoD’s utility in providing fact-based justification for the CI procurement through usage metrics of existing CI resources.
Workforce Training and Education Track: Developing a Professional Internship Program for Software Engineering
11 – 11:25 a.m.
Room: A216
Authors: Tracy Brown, James Carson, William Allen, Maytal Dahan and Dan Stanzione
TACC operates multiple initiatives aimed at broadening participation in CI and HPC. One of these programs is the TACC Professional Internship Program (TPIP) for software engineering, an initiative that provides early career or non-traditional background participants with an immersive, hands-on team-based experience in the use of software development best practices within a production HPC environment. TPIP aims to provide participants with foundational skills that support productive careers. In this session, we’ll detail the motivation, implementation and structure for this program, as well as the impacts, insights and recommendations resulting from the past eight years.
Applications and Software Track: Developing an Interactive Online Platform for Advanced Cyber Training and Adaptive Learning Paths
11:30 – 11:45 a.m.
Room: A220-A221
Authors: Lan Zhao, Jaewoo Shin, I Luk Kim, Carol Song, Chimdia Kabuo, Jibin Joseph, Venkatesh Merwade, Jacob Hosen, Adnan Rajib and Wanju Huang
As the demand for advanced coding and cyber skills continues to grow, there is an urgent need for specialized training platforms that can keep pace with these advancements. Funded by the NSF Cybertraining Program, we designed and developed an open-access online platform, CyberFaCES, to support advanced cyber training with adaptive curriculum development. Instructors can create learning modules and combine them dynamically to design diverse learning paths. Students have access to a Jupyter Notebook environment that is seamlessly connected to the course catalog, along with access to HPC resources on the back end. Deployed on a Kubernetes-based composable system, the platform has been successfully used in teaching and training events, demonstrating broad applicability and versatility across disciplines.
Abstract Submission: Enhancing an HPC Resources Modeling Framework with a Realistic, Slurm-Like, HPC Resource Model
11:36 – 11:45 a.m.
Room: A226
Authors: Nikolay A. Simakov
HPC resources are essential for computationally intensive tasks in various scientific and engineering disciplines. Given their substantial initial and operational costs, coupled with high demand, it’s vital to utilize such resources optimally, and HPC resource simulators play a key role in achieving this. In this talk, we present our enhancement to HPCMod, an agent-based modeling framework for modeling users on HPC resources. The new HPC resource model incorporates several features of the Slurm Workload Manager, such as individual node resource allocation (cores, memory, GPUs and other tractable resources), priority factors and both priority-based and backfill scheduling. This model is significantly faster than the Slurm-based simulator, enabling more extensive HPC workload studies.
Systems and System Software Track: DeltaAI: A National Resource for AI/ML Research
11:40 – 11:55 a.m.
Room: A110-A111
Authors: Brett Bode, Gregory Bauer, Laura Herriott, Volodymyr Kindratenko and William Gropp
DeltaAI is a new NSF-funded resource supporting researchers nationwide via the ACCESS and NAIRR Pilot programs and is the most powerful GPU resource available via the ACCESS program. DeltaAI leverages the Delta environment, sharing storage and other resources while offering a more scalable platform capable of running jobs up to the full node count of the system. DeltaAI offers the latest NVIDIA H100 GPUs as part of the innovative Grace Hopper architecture. This paper describes the full architecture of DeltaAI and our experiences benchmarking applications and supporting research teams during the acceptance process for DeltaAI.
Workforce Training and Education Track: Building the HPC Workforce: RMACC’s Cohort Program for System Administrators
11:50 a.m. – 12:05 p.m.
Room: A114-A115
Authors: Shelley Knuth, Craig Earley, Brandon Reyes, Kyle Reinholt, Jan Mandel, Mitchell McGlaughlin, Jarrod Schiffbauer and Joel Sharbrough
As demand for computational power grows, hiring and retaining cyberinfrastructure professionals (CIPs) remains a significant challenge, as many trained in enterprise services lack the specialized skills required for advanced CI administration. To address this gap, a student cohort program was implemented via the Rocky Mountain Advanced Computing Consortium (RMACC), providing hands-on training in CI administration. Students gained practical experience in Slurm configuration, Linux proficiency, hardware procurement and system troubleshooting. This initiative has proven highly successful and the results emphasize the importance of integrating CI administration education into research computing programs to ensure the sustainability and growth of advanced CI support.
Workforce Training and Education Track: ByteBoost: An advanced cybertraining program designed to enhance research on testbed systems
11:50 a.m. – 12:05 p.m.
Room: A216
Authors: Wesley Brashear, Dhruva Chakravorty, Zhenhua He, Dana O’Connor, Eva Siegmann, Paola A. Buitrago and Sergiu Sanielevici
The ByteBoost Cybertraining program, funded by the NSF, was created to promote the adoption of cutting-edge computing platforms into existing and novel HPC workflows. Comprising a team representing three NSF-funded testbed systems, ByteBoost strives to increase the utilization and productivity of these technologies across established and emerging HPC-enabled disciplines. To achieve these objectives, ByteBoost invited early-career researchers from across the nation to participate in a program consisting of a series of virtual seminars followed by a week-long workshop at PSC. Participants have since presented their research projects at international conferences, incorporated training into their classes and continue to utilize the training they received on the testbed systems. We present a broad overview of the inaugural year of the ByteBoost Cybertraining program, including participant feedback and potential improvements for future iterations.
Systems and System Software Track: Sticking (with) the Landing – A modern case for Knights Landing in Resource-Constrained Environments
12:10 – 12:25 p.m.
Room: A110-A111
Authors: Bryan Johnston, Charles Crosby, Quinn Reynolds and Jennifer Schopf
The HPC Ecosystems Project has repurposed decommissioned tier-1 HPC systems into entry-level clusters across Africa for over a decade. Stampede2 Knights Landing (KNL) systems are available for global distribution through the TACC’s Legacy Computing Program and to HPC Ecosystems Project partners. To ensure the novel KNL architecture is fit for purpose for sites contemplating adopting the legacy systems, this publication provides a brief performance reference guide for prospective adopters. Benchmark tests were conducted to evaluate Stampede2’s KNL processors on modern workloads to help inform prospective adoption decisions. It was concluded that the Stampede2 KNL processors remain particularly suitable for applications that benefit from good memory bandwidth, but that multi-node use is only feasible if high-performance networking is also available.
Campus Champions Luncheon
12:30 – 2 p.m.
Room: A213-A215
Authors: Marina Kraeva, Michael D. Weiner, Chuck Pavloski, Forough Ghahramani, Cyd Burrows-Schilling, Juan Jose Garcia Mesa, Elizabeth Kwon, David Reddy, Mike Renfro, Jason Simms and Nitin Sukhija
Campus Champions is dedicated to fostering a vibrant community of research computing and data professionals committed to enabling research across diverse institutions. Our mission is to facilitate seamless access to and utilization of local, regional, and national resources and technologies, fostering collaboration, knowledge-sharing and support. We envision a community where every campus has a Campus Champion, acting as a force multiplier for research computing and data management, and where no researcher is without guidance.
Applications and Software Track: Providing On-Prem GenAI Inference Services to a Campus Community
2 – 2:15 p.m.
Room: A220-A221
Authors: Sarah Rodenbeck, Erik Gough, Athreyan Mohana Krishnan Sangeetha, Ashish, Mihir Ahlawat, Vivek Karunai Kiri Ragavan, Abhishek Muthukumar and Aanis Ahmad
The Rosen Center for Advanced Computing at Purdue University has recently released two Generative AI inference tools, AnvilGPT and Purdue GenAI Studio, to the research and campus communities. These services support over 750 users who use 10+ open-source GenAI models to aid their work. Building on HPC’s long history of using open-source tools, these services are based on customized open-source frameworks and hosted entirely on-prem. This presentation argues that building custom GenAI services from open-source frameworks is a scalable and cost-effective solution for providing access to Generative AI models. We will also share the methodology and resources required to develop and host these services and seek to be a resource for other research computing centers that wish to leverage their HPC investment to create similar services.
Birds of a Feather: Developing and Providing Broad Access to a National Cyberinfrastructure Ecosystem through NSF Support
2 – 3 p.m.
Room: A211-A212
Authors: Katie Antypas, Amy Walton and Sharon Broude Geva
The NSF’s Office of Advanced Cyberinfrastructure (OAC) has defined a vision and investment plans for CI that address the evolving needs of the science and engineering research and education community nationwide. The continuing growth in systems capabilities, speed and scale, and data availability enables exciting new research opportunities and creates challenges to ensuring broad access to and participation in these opportunities that OAC is increasingly addressing. The panelists will include OAC leadership presenting current OAC strategy initiatives and program directors who lead OAC’s program areas. Presentation topics will include an overview of recent funding opportunities in program areas such as Advanced Computing Systems; Data and Software; and Networking and Cybersecurity. Panelists will also briefly talk about current initiatives, such as the NAIRR Pilot, and highlight programs that have undergone changes or enhancements since PEARC24. The session will be tailored for the PEARC audience and is intended for current and past PIs, future PIs and anyone interested in learning about the NSF-funded CI ecosystem, including students and research CI staff at all career stages and all types of involvement with CI use, support and creation.
Workforce Training and Education Track: Understanding AI Education Needs: Insights from the NAIRR Pilot
2 – 2:15 p.m.
Room: A114-A115
Authors: Shelley Knuth, Alana Romanella, Marisa Brazil, Nitin Sukhija, Layla Freeborn and Ragan Lee
The rapid proliferation of AI tools, such as ChatGPT, has driven increased interest among researchers and educators in understanding AI applications, best practices and potential challenges. To support the U.S. research and education communities in responsibly leveraging AI, the NAIRR Pilot was launched in 2024. Through surveys and engagement efforts, the NAIRR User Experience Working Group (UEWG) has identified key obstacles, including gaps in high-performance computing (HPC) knowledge and the need for structured AI education. The “AI Unlocked” workshop, hosted in April 2025, was developed to address these needs, attracting over 768 applicants and covering a range of AI topics. Findings from workshop applications and community surveys revealed strong interest in large language models (LLMs), AI research applications and hands-on training. This presentation shares insights gathered from these surveys, highlighting key findings from the collected responses.
Workforce Training and Education Track: Empowering NAIRR “Pilots” of all skill levels to become “ACES” with HPC
2:15 – 2:30 p.m.
Room: A114-A115
Authors: Zhenhua He, Joshua Winchell, Dhruva Chakravorty, Lisa Perez and Honggao Liu
The NAIRR pilot aims to provide researchers with access to national cyberinfrastructure to advance AI research and applications. However, utilizing HPC may present a barrier for many. This presentation explores strategies to facilitate the adoption of HPC for those who use NAIRR Pilot resources via the ACES (Accelerating Computing for Emerging Sciences) system hosted by Texas A&M University. ACES, supported by the NSF, is an innovative, composable platform that features a range of accelerators. We present pointers to relevant short courses and user-friendly Open OnDemand tools, in addition to advanced user support, that help streamline AI workflows on HPC in a wide range of domains.
Birds of a Feather: Navigating Open-Source Software Commercialization: From Infancy to Maturity
4:15 – 5:15 p.m.
Room: A211-A212
Authors: Daniel Madren and Geoffrey Lentner
Open-source software (OSS) has become the foundation of modern research computing and scientific collaboration. However, transitioning an OSS project from a single-maintainer project to long-term sustainability remains a challenge for developers and institutions. This Birds of a Feather (BOF) will bring together thought leaders from across the research computing landscape to discuss sustainability and commercialization in OSS. Our panel will feature experts from various stages of this journey. We will also highlight the role of organizations like the Software Sustainability Institute (SSI), NumFOCUS and others in providing structured onramps for many OSS projects, helping to navigate governance, funding, and long-term sustainability.
Wednesday, July 23
Birds of a Feather: XDMoD Users Group
11 a.m. – 12 p.m.
Room: A211-A212
Authors: Joseph White, Aaron Weeden and Thomas Furlani
ACCESS Monitoring and Measurement Service (MMS) is an NSF-funded project that supports the comprehensive management of the NSF ACCESS program and its associated resources, the NAIRR Pilot, and HPC and CI systems in general. It does so primarily through the ACCESS XDMoD, NAIRR XDMoD and Open XDMoD tools, which track operational, performance and usage data for ACCESS, the NAIRR Pilot and local CI systems, respectively. This session will open with a general introduction to ACCESS XDMoD, NAIRR XDMoD and Open XDMoD to give attendees an understanding of their capabilities. Presenters will also demo the most recent version of XDMoD, presenting new features, including the recently updated Data Analytics Framework for XDMoD based on the widely used Jupyter Notebooks. Discussion will follow the presentation.
Birds of a Feather: Open OnDemand User Group Meeting
11 a.m. – 12 p.m.
Room: A213-A215
Authors: Alan Chalker, Emily Moffat Sadeghi and Julie Ma
This session will provide a forum for the Open OnDemand (OOD) community to exchange experiences and best practices, as well as to engage with the project development team.
Applications and Software Track: Training a Machine Learned Potential on the Cerebras Wafer Scale Engine
2 – 2:15 p.m.
Room: A112-A113
Authors: Dana O’Connor, Wissam Saidi and Paola A. Buitrago
Machine-learned potentials, which can achieve the accuracy of ab initio methods at the cost of empirical ones, have become a mainstay of materials simulation. However, these models often require large amounts of data and can take several days to train on traditional GPU architectures. With new AI accelerators now widely available, we train a simple feed-forward neural network potential to predict the total energy of a subset of the ANI dataset on GPUs, IPUs and the Cerebras wafer-scale engine (WSE). We examine the effect of the depth of the neural network, as well as the batch size, on the training time and throughput of the model.
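For readers unfamiliar with the style of model this talk benchmarks, a minimal sketch of a feed-forward potential that maps a fixed-length descriptor vector to a total energy is shown below. Everything here is an illustrative assumption (synthetic data, layer sizes, plain-numpy training loop), not the authors' architecture or the ANI dataset; depth and batch size are the two knobs the study varies.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for atomic-environment descriptors -> total energy.
# The quadratic "ground truth" is synthetic, not ANI data.
X = rng.normal(size=(256, 32))           # 256 structures, 32-dim descriptors
y = (X ** 2).sum(axis=1, keepdims=True)  # synthetic target energies

# One hidden layer; network depth is one knob the study varies.
W1 = rng.normal(scale=0.1, size=(32, 64)); b1 = np.zeros(64)
W2 = rng.normal(scale=0.1, size=(64, 1));  b2 = np.zeros(1)

def forward(x):
    h = np.tanh(x @ W1 + b1)
    return h @ W2 + b2, h

lr, batch = 1e-3, 64   # batch size is the other knob the study varies
losses = []
for step in range(200):
    idx = rng.choice(len(X), size=batch, replace=False)
    xb, yb = X[idx], y[idx]
    pred, h = forward(xb)
    err = pred - yb
    losses.append(float((err ** 2).mean()))
    # Backpropagate the mean-squared-error loss through both layers.
    g_out = 2 * err / batch
    gW2 = h.T @ g_out;  gb2 = g_out.sum(axis=0)
    gh = (g_out @ W2.T) * (1 - h ** 2)   # tanh'(z) = 1 - tanh(z)^2
    gW1 = xb.T @ gh;    gb1 = gh.sum(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

print(f"loss: {losses[0]:.1f} -> {losses[-1]:.1f}")
```

On an accelerator such as the WSE, the per-step work above is what gets mapped onto hardware, which is why batch size and depth dominate training time and throughput.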
Birds of a Feather: ColdFront – Open-Source Resource Allocation Management System User Group Meeting
2 – 3 p.m.
Room: A211-A212
Authors: Dori Sajdak, Andrew Bruno, Raminder Singh, Claire Peters, Matthew Kusz, Abhinav Thota, John LaGrone and Eric Godat
ColdFront is an open-source resource allocation management system designed by the University at Buffalo Center for Computational Research to provide a central portal for administration, reporting and measuring scientific impact of cyberinfrastructure resources. ColdFront is designed to manage allocations for a diverse set of resources, including HPC clusters, co-located departmental and lab servers, software licenses, digital storage, scientific instrumentation, cloud subscriptions and data access requests. It is designed to complement existing tools, such as Slurm and OpenStack, which manage access to individual hardware and software components. As part of a recent collaboration, the teams are working towards implementing a robust CI/CD pipeline, a governance model, a developer’s guide and a project roadmap. In this BoF, attendees will have an opportunity to hear about these developments and provide feedback on our plans.
Systems and System Software Track: ACCESS Resource Advisor
2 – 2:25 p.m.
Room: A110-A111
Authors: Sandesh Lamichhane, Aiden Hamade, Hunter Brogna, Luis Nunes, Vikram Gazula and James Griffioen
ACCESS provides many essential High Performance Computing (HPC) resources to researchers across diverse scientific disciplines. However, selecting the most appropriate resource from the numerous available options presents significant challenges, especially for researchers without extensive technical expertise. This presentation introduces the ACCESS Resource Advisor (ARA), a web-based tool that guides researchers toward suitable HPC resources by using accessible language, asking only about basic HPC concepts users are likely to understand, identifying workflow patterns and providing recommendations with minimal required input. By simplifying the resource selection process, ARA helps researchers from over a hundred scientific fields make more informed decisions about computational resources, ultimately improving research efficiency and resource utilization across the ACCESS ecosystem.
Systems and System Software Track: Software Documentation Service
2:40 – 2:55 p.m.
Room: A121-A122
Authors: Sandesh Lamichhane, Hunter Brogna, Aiden Hamade, Mathew Damrell, Daniel Segura, Vikram Gazula and James Griffioen
The ACCESS Software Documentation Service (SDS) is designed to help ACCESS HPC users identify the software available on the resources of the various ACCESS Resource Providers (RPs), an important step toward selecting an RP. Not only does it provide a single webpage with information about all RP software, but it also provides extensive detail about each piece of software, enabling researchers to discover software by research domain and type, learn its intended purpose and use, and read documentation and examples showing how to use it. This presentation describes the SDS implementation, its use within ACCESS, its benefits to both researchers and resource providers, and the stand-alone version of SDS available to HPC centers.
Birds of a Feather: At the Intersection of Artificial Intelligence and High-Performance Computing, What Happens to the Users?
4:15 – 5:15 p.m.
Room: A211-A212
Authors: Alana Romanella, Shelley Knuth, Nitin Sukhija and Marisa Brazil
Researchers across disciplines rely on HPC and AI resources, yet many face significant barriers such as complex job scheduling, software dependencies and a lack of institutional support. The NAIRR Pilot, launched in 2024, aims to democratize access to AI resources, but early evaluations highlight some user challenges. This Birds of a Feather session features HPC facilitators from leading institutions to discuss key hurdles for users at the intersection of HPC and AI, such as computational integration, user expectations and effective onboarding strategies. By exploring best practices in user support and training, we aim to enhance accessibility and ensure researchers maximize computational resources for scientific discovery.
Birds of a Feather: Node to Joy: Finding the Right Compute Resources
4:15 – 5:15 p.m.
Room: A213-A215
Authors: Jeremy Fischer, Carol X. Song, Eva Siegmann, Sergiu Sanielevici, Honggao Liu, Virginia Trueheart and Jim Griffioen
In this session, the ACCESS Resource Providers (RP) will give a brief overview of the available resources and their unique characteristics. The presentation portion of this BoF will highlight the variety of available resources and will be followed by a discussion with the community, allowing the audience to interact directly with the RP representatives. We hope to seed the discussion with topics but allow attendees to steer it, perhaps uncovering topics not suggested here.
Birds of a Feather: Expansion of Office of Advanced Cyberinfrastructure (OAC) Learning and Workforce Development (LWD) in the AI Era
4:15 – 5:15 p.m.
Room: A110-A111
Authors: J. Jenny Li
NSF’s Office of Advanced Cyberinfrastructure (OAC) Learning and Workforce Development (LWD) program supports efforts to improve the research community’s adoption of cyberinfrastructure resources by integrating core literacy and skills in advanced cyberinfrastructure, along with computational and data-driven methods, into undergraduate and graduate education.
Poster: ACCESS Resource Integration Dashboard
5:30 – 7:30 p.m.
Room: Eisenman / Trott Rooms
Authors: Dinuka De Silva, John-Paul Navarro, Esen Gokpinar-Shelton, Winona Snapp-Childs and Robert Quick
The current process of resource integration with ACCESS relies on semi-structured documentation. This process is challenging because it is lengthy and not especially user-friendly, requiring significant concierge staff time to be effective. We aim to overcome these challenges by developing a web-based application for resource integration.
Poster: Architecture for Collaboration: Building a Unified ACCESS Resource Catalog
5:30 – 7:30 p.m.
Room: Eisenman / Trott Rooms
Authors: Matt Yoder
One of the key challenges during the early years of the ACCESS program was the lack of a comprehensive online catalog for selecting advanced computing resources. Developed over seven months during 2024, the resource catalog is a collaborative effort to bring together information about resources from across the ACCESS ecosystem. It provides a one-stop online destination for researchers and educators to learn about and select resources. In this poster, I describe how the design and technical choices we made during the development of the catalog enabled the ACCESS teams to work together in new ways and set the stage for future collaboration.
Poster: Continuous HPC Performance Monitoring: Can It Run Without Affecting User Jobs?
5:30 – 7:30 p.m.
Room: Eisenman / Trott Rooms
Authors: Jyothismaria Joseph and Nikolay Simakov
HPC resources are essential for large-scale scientific research, supporting mixed computational workloads, from molecular dynamics simulations to large-scale data analysis. The performance of HPC resources must be monitored to ensure optimal speed for users’ computational needs. One way to monitor resource performance is through continuous performance monitoring, where a set of tests is executed regularly, often daily, and the performance is analyzed in an automated manner. Despite its benefits, some HPC centers hesitate to implement continuous monitoring due to concerns about consuming CPU cycles that could otherwise be used for user jobs. However, even on busy resources, scheduling leaves idle gaps between jobs, largely due to job constraints. While these periods are often short and scattered, they present an opportunity to execute performance monitoring tasks without disrupting user workloads.
Poster: Designing Human-Centered Integration Pathways for a Large-Scale Cyberinfrastructure System: A UX/UI Approach to Streamline Resource Contribution
5:30 – 7:30 p.m.
Room: Eisenman / Trott Rooms
Authors: Esen Gokpinar Shelton, John-Paul Navarro, Dinuka De Silva, Winona Snapp-Childs and Rob Quick
This NSF-funded project addresses the challenges of integrating high-performance computing (HPC) resources into ACCESS, a national cyberinfrastructure ecosystem. Resource providers have long struggled with ACCESS’s previous complex interface, resulting in high cognitive load and inefficiency. To address this, we applied human-centered design principles to develop a badge-based interface system; by organizing resources and tasks into intuitive badges, we’ve created a system that significantly improves the user experience on the ACCESS platform. This ongoing UI redesign not only enhances ACCESS but also offers a scalable framework that can be adapted to similar platforms, ultimately benefiting the broader scientific and research community by improving the accessibility and usability of critical HPC resources.
