Schedule

Tuesday May 12th


Location: Jordan Ballroom

Title: From Deception Detection to AI-Resilient Assessment: Scaling Human-Centered Analytics

Abstract: This keynote traces a research and commercialization journey that began in large-scale deception detection and automated interviewing systems and culminated in RhetorixLab, an AI-resilient assessment platform designed to restore authenticity and trust in an era of generative AI. Drawing on years of work in video-based feature extraction and behavioral signal processing, the talk highlights how rich multimodal data from voice, language, and facial expression can be captured and analyzed using commodity hardware and cloud-scale infrastructure. The discussion then pivots to higher education, where traditional text-based assessments are rapidly losing diagnostic value, and shows how asynchronous oral assessments reintroduce human judgment, communication skill, and genuine reasoning.

Dr. Steven Pentland
Associate Professor, Information Technology Management
Boise State University

Simplot A: 
Real-time HPC Workload Efficiency Monitoring and Alerting
Presenters- Jackson Mckay, Paul Fischer

Achieving efficient utilization of shared compute resources is a primary objective of HPC providers, maximizing value for both researchers and institutions. While providers leverage workload managers to allocate resources efficiently across many users over time, scheduling alone cannot prevent allocated resources from sitting idle or underutilized in terms of raw compute and memory. Common causes include misconfigured workload parameters: a user may unintentionally request too many resources, or resources of the wrong type. Between wait times and the often-opaque nature of batch execution, users may “fire-and-forget” their batch workloads and be unaware of serious inefficiencies. To address this issue, we developed a real-time workload monitoring and alerting system that rapidly informs users and HPC administrators of inefficient workloads, even while those workloads are still running. We will present the architecture of our system, which includes components from Slurm, Prometheus, VictoriaMetrics, PostgreSQL, and CHPC software. We will also provide a data-driven analysis of the results we have achieved with the system, as well as lessons learned and our future roadmap.
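
For context, a minimal sketch of the kind of query such a system might issue is shown below. It assumes a Prometheus-compatible HTTP endpoint (VictoriaMetrics exposes the same query API), and the server URL, metric name, labels, and threshold are hypothetical placeholders rather than the presenters' actual schema.

    import requests

    # Hypothetical PromQL query: average per-job CPU utilization over the last 30 minutes.
    # The metric name, label names, and server URL are placeholders, not the presenters' schema.
    PROM_URL = "http://victoriametrics.example.edu:8428/api/v1/query"
    QUERY = "avg_over_time(slurm_job_cpu_utilization[30m])"
    THRESHOLD = 0.25  # flag jobs using less than 25% of their allocated CPUs

    resp = requests.get(PROM_URL, params={"query": QUERY}, timeout=10)
    resp.raise_for_status()

    for series in resp.json()["data"]["result"]:
        job_id = series["metric"].get("job_id", "unknown")
        utilization = float(series["value"][1])
        if utilization < THRESHOLD:
            # In a real system this would trigger an email or dashboard alert.
            print(f"Job {job_id}: CPU utilization {utilization:.0%} is below the {THRESHOLD:.0%} threshold")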

Simplot B:
Updating and optimizing mutation detection pipeline
Presenter- Alan Chapman

Simplot C: 
Warewulf Basics & Cluster Management
Presenters- Michael Ennis and Joe Leister


Simplot D: 
Beyond Files: Streaming Data Management with Globus
Presenter- Vas Vasiliadis

Beyond growing data volumes, researchers are now having to deal with increasing data velocity. For example, instruments are generating data faster than it can be consumed by downstream processes, and AI/ML-guided research relies on near real-time feedback for optimizing experiments. Such data streaming applications require different tools for data management and computation than those used for traditional file-based data.

Building on the established and widely used Globus file transfer service, we present newly released capabilities that enable researchers to stream data securely across wide area networks in support of real-time data processing. We will provide an overview of the Globus data streaming service and describe how to incorporate it into your research applications.

Simplot A:
Private Intelligence at Scale: Deploying Local LLMs on HPC using Ollama and Apptainer/Singularity
Presenter- Renn Valo, NOAA

As the demand for localized Large Language Models (LLMs) grows, organizations face the challenge of providing secure, high-performance environments for AI development. This session provides a technical roadmap for deploying private, containerized AI models on High-Performance Computing (HPC) infrastructure using Apptainer.

We will walk through the end-to-end process of building a robust AI service layer, including:

  • Container Architecture: Leveraging Apptainer for seamless GPU integration and reproducibility.
  • Model Orchestration: Installing and managing models via Ollama within a containerized environment.
  • Access & Connectivity: Navigating HPC networking through port mapping and secure tunneling.
  • Developer Experience: Integrating the backend with modern tools like VS Code to create a familiar, local-feeling development workflow on remote hardware.

Attendees will leave with a practical framework for delivering "AI-as-a-Service" to their community, ensuring data privacy and maximizing shared GPU resources; a minimal sketch of the client-side access pattern appears below.
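
The sketch assumes an Ollama server is already running inside an Apptainer container on a GPU node (for example, launched with something like apptainer exec --nv ollama.sif ollama serve) and that its default port has been tunneled to the user's machine; the model name and prompt are placeholders, not the presenter's actual configuration.

    import requests

    # Assumes the Ollama default port (11434) has been forwarded to localhost.
    OLLAMA_URL = "http://localhost:11434/api/generate"

    payload = {
        "model": "llama3",  # illustrative model name
        "prompt": "Summarize the advantages of running LLMs on local HPC resources.",
        "stream": False,    # return one complete response instead of a token stream
    }

    resp = requests.post(OLLAMA_URL, json=payload, timeout=300)
    resp.raise_for_status()
    print(resp.json()["response"])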

Simplot B: NSGA-II at scale

Simplot C: 
Building the Next Generation of HPC Talent: Forming an RMACC Student Cluster Competition Team
Presenter- Layla Freeborn

The HPC community thrives on collaboration, curiosity, and fostering a forward-thinking talent pool. Student cluster competitions give the next generation of HPC professionals hands‑on supercomputing experience, real‑world problem‑solving skills, and direct exposure to the global HPC community. They blend technical skills, teamwork, and professional development in a way that few academic experiences can match. This Birds of a Feather session brings together students, faculty, and HPC professionals interested in establishing an RMACC Student Cluster Competition team. We’ll discuss what it takes to build a competitive team from the ground up, including technical skill development, mentorship opportunities, hardware and resource needs, securing vendor sponsorship, and strategies for preparing students for national HPC competitions. Participants will help shape the team’s structure, recruitment approach, and training roadmap. Whether you’re a student eager to get hands‑on experience or a professional excited to support emerging talent, this session can be the starting point for a vibrant, sustainable RMACC Student Cluster Competition team. 

Simplot D: 
Student Career Panel
Moderator- Dan Jorgensen, RMACC 

Join us for a student career panel where you will hear from HPC professionals about how they entered the HPC workforce and what they are looking for when hiring new employees. This will be a one-hour session with presentations and time for questions.

 

Simplot A: 
Redesigning the Hellgate HPC: Lessons in overcoming growing pains and purgatory on mid-sized clusters.
Presenter- Myra Jamison

Over the past four years, the University of Montana Hellgate HPC has rapidly grown from a ragtag collection of individual lab systems to the largest computing resource at our institution, comprising roughly 3,000 compute cores, 130 GPUs, and over 200 users. However, this development has exceeded the scale of our original HPC design, with consequences for stability, performance, and user experience. These issues were addressed by revisiting our stateless provisioning strategies, network/hardware topology, authentication flow, head node stack, and administrative standards of practice. We discuss our experiences in employing Ansible and Git to create scalable infrastructure and sustainable SOPs, publishing a living user documentation base, redesigning physical layout, and the obstacles, decisions, and surprises we encountered along the way.

Simplot B:
NetCDF State of the Union, Roadmap Forward
Presenter- Ward Fisher

NetCDF (Network Common Data Form) is a set of software libraries and machine-independent, self-describing file formats used to store and share array-oriented scientific data. Widely used in atmospheric and oceanic sciences, it supports efficient access to multidimensional data (e.g., temperature, wind speed), and has been a fundamental archival and operational data format used since 1990. In this workshop, the netCDF lead developer from NSF Unidata, Ward Fisher, will discuss the current state of the netCDF project, how we have adapted to the move to Cloud Computing and object data storage, and what is on the horizon for this storied project.

Simplot C:
Quantum computing HPC integration
Presenter- Coltran Hophan-Nichols 

Coltran Hophan-Nichols from Montana State University will discuss ongoing work to integrate multiple quantum computing modalities with classical high-performance computing. This work involves SLURM configurations, access controls, network configurations, and user interfaces. Hybrid quantum-classical and quantum simulation use cases will be covered.

Simplot D:
Navigating the Flash Crisis: Why Architectural Flexibility Wins in 2026

Flash pricing volatility has reshaped the storage landscape and organizations are rethinking how to get all-flash performance without runaway costs. In this session we will discuss practical approaches to navigating flash economics, cost optimization, and performance at scale.

NAND flash and DRAM prices are surging and supply is constrained, forcing requotes and delaying deals. VDURA’s mixed-fleet architecture with intelligent tiering keeps AI/HPC projects moving with more budget stability. Attendees will learn how to:

  • Keep AI/HPC projects on budget.
  • Sustain throughput under concurrency and checkpointing.
  • Avoid over-investing in flash for capacity-heavy data.

In short, VDURA’s mixed-fleet architecture delivers GPU-class performance with lower cost and greater supply-chain resilience. 

 

Wednesday May 13th

Jordan Ballroom

Join us to hear one-minute lightning talks from our student poster presenters. 

Location- Jordan Ballroom
Bryan Smith, Acting Chief Technology Officer for Nuclear Science & Technology at the Idaho National Lab

Accelerating America’s Nuclear Future: Building Advanced Nuclear Infrastructure at Speed and Scale


The United States has entered a transformative period in nuclear energy development, driven by unprecedented load growth from data centers and AI infrastructure, coupled with the most ambitious federal nuclear directives in decades. This presentation examines the convergence of technology demonstration, industrial partnership, and policy acceleration that is reshaping America’s nuclear landscape. Drawing from ongoing work at Idaho National Laboratory, we’ll explore the reactor and fuel cycle pilot projects progressing from concept to concrete deployment, including INL’s role as the nation’s premier testbed for advanced reactor technologies. The presentation will detail emerging frameworks for industry collaboration that are enabling diverse off-takers, from hyperscale data centers to military installations, to partner in deploying next-generation nuclear systems. Finally, we’ll assess progress against the aggressive timelines established by last year’s landmark nuclear Executive Orders, which call for demonstrating multiple advanced reactor designs and achieving significant new nuclear capacity by decade’s end on the path to quadrupling American nuclear capacity by 2050.

Jordan Ballroom - Room D

DLI Workshop - Data Parallelism: How to Train Deep Learning Models on Multiple GPUs
Presenter- Daniel Howard

With support from the NVIDIA Deep Learning Institute, a training workshop is offered to all RMACC attendees. Attendees will also be provided with information on how to become certified instructors, with community support from the Cyberinfrastructure Community-wide Mentorship Network (CCMNet), so they can offer this course and other DLI materials to their own communities. Course content and learning objectives follow:

 

Modern deep learning challenges leverage increasingly large datasets and more complex models. As a result, significant computational power is required to train models effectively and efficiently. Learning to distribute data across multiple GPUs during deep learning model training enables a wealth of new applications that utilize deep learning.

Learning Objectives

  • Understand how data parallel deep learning training is performed using multiple GPUs
  • Achieve maximum training throughput to make the best use of multiple GPUs
  • Distribute training to multiple GPUs using PyTorch Distributed Data Parallel (a minimal sketch follows below)
  • Understand and utilize algorithmic considerations specific to multi-GPU training performance and accuracy
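
For orientation, a minimal PyTorch Distributed Data Parallel training loop is sketched below; the toy model, dataset, and hyperparameters are illustrative only and are not taken from the workshop materials.

    import os
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP
    from torch.utils.data import DataLoader, TensorDataset
    from torch.utils.data.distributed import DistributedSampler

    def main():
        # torchrun sets RANK, LOCAL_RANK, and WORLD_SIZE for each process (one per GPU).
        dist.init_process_group(backend="nccl")
        local_rank = int(os.environ["LOCAL_RANK"])
        torch.cuda.set_device(local_rank)

        # Toy dataset and model; a real example would use an actual network and data.
        dataset = TensorDataset(torch.randn(1024, 32), torch.randn(1024, 1))
        sampler = DistributedSampler(dataset)  # gives each rank a distinct shard of the data
        loader = DataLoader(dataset, batch_size=64, sampler=sampler)

        model = torch.nn.Linear(32, 1).cuda(local_rank)
        model = DDP(model, device_ids=[local_rank])  # gradients are all-reduced across GPUs
        optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
        loss_fn = torch.nn.MSELoss()

        for epoch in range(5):
            sampler.set_epoch(epoch)  # reshuffle shards each epoch
            for x, y in loader:
                x, y = x.cuda(local_rank), y.cuda(local_rank)
                optimizer.zero_grad()
                loss = loss_fn(model(x), y)
                loss.backward()
                optimizer.step()

        dist.destroy_process_group()

    if __name__ == "__main__":
        main()  # launch with: torchrun --nproc_per_node=<num_gpus> this_script.py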

Simplot A:
Building boisestate.ai: Lessons Learned from Developing a Cost-Effective Internal AI Platform for Higher Education
Presenter- Phil Merrill

As universities race to provide generative AI access to students and faculty, the cost of commercial subscriptions at institutional scale quickly becomes unsustainable. At Boise State University, we built boisestate.ai—an open source AI platform powered by AWS Bedrock. This presentation shares the practical lessons learned from developing and operating the platform, including the consumption-based paradigm shift associated with pay-per-token pricing, seven specific cost optimization strategies (from prompt caching to semantic tool filtering), and approaches for making AI institutionally aware through MCP servers and agent skills. We'll also discuss why 2026 is shaping up to be the year of the AI agent, and how progressive disclosure and codified institutional knowledge are key to building AI that doesn't just chat — but actually gets work done for your campus.

Simplot B:
English is the Hottest New Programming Language: The AI Catalyst Model for Advanced Computing Ambassadorship
Presenter- Liza Long, Ed.D.
Academic Technology Program Manager, Idaho State Board of Education | Ph.D. Student, English, Idaho State University

As "vibe coding" and natural language interfaces become standard in advanced computing, the technical "how" is increasingly decoupled from the conceptual "what" o “why.” This shift creates a critical need for academic leaders who can bridge the gap between high-performance computing (HPC) capabilities and ethical, literate application. In Idaho, we are approaching this challenge by cultivating a culture of "ambassadorship" where faculty—particularly those from the humanities—model the critical inquiry necessary to navigate this new landscape. This presentation introduces the AI Catalyst model, co-developed with BSU Nursing professor Jason Blomquist, as a framework for decentralized AI leadership. AI Catalysts at each institution serve as bridges between technical infrastructure and pedagogical practice. I will discuss how faculty with humanities backgrounds are uniquely positioned to be these ambassadors. By applying the rigor of rhetoric, analysis, and critical thinking to "vibe coding" and AI-driven research, these catalysts model for students and peers how to be the "human in the loop." The AI Catalyst model offers a scalable blueprint for Workforce Development. It demonstrates how to move beyond top-down mandates toward a bottom-up, faculty-led movement that demystifies advanced computing. By empowering humanities-trained faculty as AI ambassadors, institutions can ensure that the next generation of researchers—regardless of their discipline—possesses the sophisticated problem-formulation skills required in a world where English has become the hottest new programming language.

Simplot C:
 Fullmoon: A Modern Web Interface for Warewulf Cluster Management
Presenter- Josh Burks

High-performance computing environments demand efficient and intuitive tools to manage increasingly complex cluster operations at scale. This presentation introduces Fullmoon: a community-developed, open-source web interface for Warewulf clusters designed to improve the administrative experience without compromising control or reliability.

Inspired by Slurm Dash, Fullmoon is built on top of the Warewulf REST API and uses Next.js to deliver a modern, responsive management dashboard. It provides advanced node sorting and filtering, streamlined node lifecycle management, input validation and sanity checks, a web-based overlay file browser, intelligent auto completion for configuration fields, and clear visualization of profile inheritance. 

The session will demonstrate Fullmoon’s capabilities, discuss architectural and security considerations for web applications in HPC environments, and highlight use cases where a web interface improves efficiency compared to traditional CLI workflows. Fullmoon provides the HPC community with an open source solution that modernizes Warewulf cluster management while preserving the robustness required for production systems.

Simplot D:
Unifying Access to Distributed Data for AI and High-Performance Computing
Presenter- Floyd Christofferson, Vice President of Product Marketing at Hammerspace

Modern HPC and AI workloads increasingly depend on data that is distributed across multiple storage systems, tiers, and locations, including on-premises clusters, institutional storage, and cloud resources. While compute performance continues to scale rapidly, data access and data movement have become primary bottlenecks, limiting utilization and complicating workflow design.

This talk examines an open, standards-based approach to unifying access to distributed data for AI and HPC workloads—without requiring proprietary clients, forklift upgrades, or disruptive data migrations. Using Hammerspace as a concrete example, the session explores how modern parallel file system standards and automated data orchestration can be used to present a single, high-performance data namespace across otherwise siloed storage systems and sites.

Attendees will learn how global namespace architectures, combined with pNFS 4.2 and policy-driven data orchestration, enable linear scaling of IOPS and throughput using existing infrastructure. The result is simplified workflow design, improved data locality, and higher sustained utilization of expensive CPU and GPU resources—particularly for AI training, inference, and data-intensive simulation workloads.

Key topics include:

  • Parallel Global File Systems with pNFS 4.2 – Leveraging open standards to provide scalable, high-performance access to distributed datasets without proprietary file systems.
  • Automated Data Orchestration – Using policy-driven data placement and movement to align data dynamically with compute, while maintaining continuous access.
  • AI and HPC Workflow Optimization – Simplifying data access across clusters and sites to reduce staging, eliminate redundant copies, and maximize compute efficiency.
     

 

Simplot A:
Campus generative AI services running on HPC
Presenter- Coltran Hophan-Nichols

Follow Montana State University’s journey to implement a flexible and open-source generative AI suite, backed by our Tempest HPC system. Learn how we were able to leverage existing infrastructure and open-source platforms to provide powerful tools to all our faculty, staff, and students with no direct cost or token restrictions. Takeaways and how the local service coheres with a broader AI portfolio will be discussed. 

Simplot B:
AI Biomolecular structure prediction tools, a year later
Presenter- Martin Cuma

The AI biomolecular prediction field keeps moving at a fast rate. In this talk, we'll outline what has happened at Utah in this area in the past year, including setting up a more performant server for Multiple Sequence Alignment (MSA) using mmseqs2, Boltz2 and Colabfold changes, and biomolecular design with Boltzgen.

Simplot C: 
Intel and GCC Compilers: Unleash the Power of Xeon 6 
Presenter-Xinmin Tian, Intel

In this presentation, we provide insights into Intel and GCC compiler optimizations and tuning for Xeon 6, with workload code examples. A case study on performance tuning of Torch.Inductor OpenMP code on Xeon 6 will be presented to show the benefits of Intel's new processor. 

Simplot D:
HPC Technology in a Turbulent Market
Presenter- Chris Reidy

Purchasing equipment has become very challenging, from prices changing almost daily, to rapidly growing power consumption. We take a quick walk through the current state of the market with input from our peers and vendors.

 

Jordan Ballroom

Simplot A: 
Compute Anywhere: Function-as-a-Service with Globus Compute
Presenter- Vas Vasiliadis

Growing data volumes, new computing paradigms, and increasing hardware heterogeneity are driving the need to execute code on diverse distributed computing resources, many of which are outside the bounds of the researcher's institution. This need may be driven by (a) the desire to compute closer to data acquisition sources, (b) the need to exploit specialized computing resources such as hardware accelerators, (c) requirements for real-time processing of data, (d) efforts to reduce energy consumption (e.g., by matching workload with hardware), and (e) the need to scale simulations beyond the limits of a single computer.

Globus Compute addresses these needs by delivering a hybrid cloud platform implementing the Function-as-a-Service (FaaS) paradigm. Researchers first register their desired function with a cloud-hosted service; they can then request invocation of that function with arbitrary input arguments to be executed on remote cyberinfrastructure. Globus Compute manages the reliable and secure execution of the function: provisioning resources, staging function code and inputs, managing safe and secure execution (optionally using containers), monitoring execution, and asynchronously returning results to users via the cloud platform.
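
As a rough sketch of this register-and-invoke pattern, the example below uses the Globus Compute Python SDK with a placeholder endpoint UUID and a toy function; it is not taken from the tutorial materials.

    from globus_compute_sdk import Executor

    def estimate_pi(num_samples):
        """A small function to run remotely; any serializable Python function works."""
        import random
        inside = sum(
            1 for _ in range(num_samples)
            if random.random() ** 2 + random.random() ** 2 <= 1.0
        )
        return 4.0 * inside / num_samples

    # Placeholder endpoint UUID; a real one identifies a Globus Compute endpoint
    # deployed on your cluster (e.g., on a login or compute node).
    ENDPOINT_ID = "00000000-0000-0000-0000-000000000000"

    with Executor(endpoint_id=ENDPOINT_ID) as ex:
        future = ex.submit(estimate_pi, 1_000_000)  # function and inputs are shipped to the endpoint
        print("pi estimate:", future.result())      # result returns asynchronously via the cloud service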

This tutorial will describe use cases for FaaS in science and demonstrate how Globus Compute can provide a common interface and approach for portable execution across different systems. Attendees will experiment with Globus Compute on virtual machines and learn how to deploy Globus Compute on their HPC cluster or other advanced computing system.

Simplot B:
Agentic AI for Advanced Research, Data Storage, and Data Management
Presenter- Earl J. Dodd, Global HPC Business Practice Leader, Principal Solution Architect
World Wide Technology (WWT), part of the Global Solutions & Architecture (GS&A) team.

Most research environments treat storage as a procurement decision. Agentic AI flips that. Workflow and storage decide whether object, file, and parallel file systems succeed or fail, and “one big, shared filesystem” often collapses under metadata-heavy orchestration.

This session presents a workflow-first approach to infrastructure design for agentic AI and workflow-based pipelines. We characterize the I/O signatures that break classic HPC defaults, including small-file fan-out, high namespace churn, checkpoint bursts, and multi-tenant contention. We then outline a tiered architecture playbook: durable object for curated corpora, high-metadata file for orchestration surfaces, high-throughput scratch for transient staging, and policy-driven movement that preserves provenance. Throughout, we use explicit decision axes, including throughput, metadata ops, latency, and durability, so teams can justify choices to leadership and align investments to measurable bottlenecks.

Simplot C: 
Building Sovereign AI Factories: A Blueprint for State-Level Economic Growth
Presenter- HPE
AI is creating clear winners and losers across economies and regions. Sovereign AI Factories offer a powerful economic fulcrum for states seeking to attract talent, investment, and sustainable growth. By pooling resources at a state level, Sovereign AI Factories unite universities, K–12 systems, corporations, research institutes, and economic development agencies around a shared AI infrastructure that no single organization could afford on its own.

This presentation will outline the three pillars of a successful Sovereign AI Factory:

Defining a Sovereign AI Factory

  • Setting goals
  • Building the coalition
  • Promoting the benefits

Defining the Hardware Architecture

  • Scaling compute, storage, and network resources
  • Addressing the need for direct liquid cooling
  • Key data center considerations

Defining the User Experience

  • Delivering a self-service cloud experience
  • Ensuring user and resource security
  • Supporting AI and HPC workloads

Simplot D: 
Workshop on using Generative AI in an HPC environment
Presenter- Scott Reed

Generative AI, especially large language models (LLMs), can write, explain, and refactor code from natural-language intent. Used well, it speeds up everything from Python prototyping and debugging to translating messy requirements into clean, runnable scripts. In this workshop we’ll start with the basics of how LLMs behave and why “context” and constraints matter, then apply that directly to HPC workflows: generating scripts and validating them for safety and reproducibility.

In this interactive workshop each participant will use AI to generate bash scripts and code. Bring a laptop and a Gmail account to log into Google Colab. We will cover LLM architecture, AI agents, skills, local vs. remote LLM access, sandboxing, quantization, Retrieval-Augmented Generation, and context engineering. We will explore the OpenAI API and multiple models that use this interface. We’ll focus on practical HPC workflows while enforcing safety guardrails and reproducibility.
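
As a taste of the interface the workshop mentions, the sketch below uses the OpenAI Python client against an OpenAI-compatible server; the base URL, API key, model name, and prompt are placeholders rather than the workshop's actual setup.

    from openai import OpenAI

    # Placeholder values: the workshop may use a hosted model or a local OpenAI-compatible server.
    client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed-locally")

    response = client.chat.completions.create(
        model="example-model",
        messages=[
            {"role": "system", "content": "You write safe, reproducible Slurm batch scripts."},
            {"role": "user", "content": "Write a Slurm script that runs a Python job on 4 CPU cores."},
        ],
        temperature=0.2,  # keep generated scripts relatively deterministic
    )

    print(response.choices[0].message.content)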


Atrium of the Visual Arts Building
Come and visit the Stein Luminary! Ask staff for directions.

The Keith and Catherine Stein Luminary is an all-digital museum space, producing a range of immersive, interactive, and sensory experiences. Combining touch-activated screens and immersive projection, we deliver cutting-edge content focused on visual and performing arts and cultural exhibitions for the Boise State community.

 

Simplot A: 
Harnessing AI: Transforming High-Performance Computing for Next-Generation Innovation
Moderator- Shelley Knuth
Panelists- Liza Long, Phil Merrill, Casey Kennington

This panel will explore the role of artificial intelligence in high-performance computing (HPC) environments, highlighting innovative applications that enhance computational efficiency and data analysis. Attendees will gain insights into future trends and collaborative strategies for leveraging AI.

Simplot B: RMACC User Facilitation Meet-Up
Moderator- Andy Monaghan

Calling all facilitators and other user support personnel! Let’s meet up in person at RMACC to discuss our successes and challenges. Topics will include experiences in user education and outreach; the use of AI for case management and other support services; meeting the needs of customers with sensitive data; and anything else that comes up!

Simplot C: System Administrator Meetup
Moderator- Michael Ennis

Meetup with members of the RMACC SysAdmin group for an informal discussion. 

Simplot D:
LoRA, RAG, RL, Agentic AI - Making sense of the different acronyms to improve LLMs and fix hallucinations
Presenter- Michael Ramshaw 

New models from leading AI organizations come out seemingly every week, but despite the constant evolution there are always hallucinations and knowledge gaps. LoRA, RAG, RL, and Agentic AI are all popular methods to improve LLM performance, and this presentation will give an overview of each method as well as discuss how "performance" can be measured in an LLM context. These methods will be compared and contrasted across several criteria, such as ease of use, resources required, and the theory behind them. Open-source models from the Hugging Face repository will primarily be used as examples, since they are common at RMACC institutions that may have restricted-access environments. 
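
As one concrete example of the methods being compared, a minimal LoRA sketch using the Hugging Face peft library is shown below; the base model and hyperparameters are illustrative, not the presenter's.

    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    # Model name and hyperparameters are illustrative; any causal LM from the
    # Hugging Face hub that fits in local GPU memory could be substituted.
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    lora_config = LoraConfig(
        r=8,                        # rank of the low-rank update matrices
        lora_alpha=16,              # scaling factor applied to the update
        lora_dropout=0.05,
        target_modules=["c_attn"],  # attention projection layer in GPT-2
        task_type="CAUSAL_LM",
    )

    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()  # typically well under 1% of the full model
    # Training then proceeds as usual (e.g., with transformers.Trainer); only the
    # small adapter weights are updated, which is what makes LoRA cheap to run.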

 

Simplot A: 
UV Package Manager: Get Your Sunscreen!  
Presenter- Mohal Khandewal

Python dependency management getting you burned? Tired of slow pip installs and juggling virtual environments on HPC systems? This session introduces UV, a lightning-fast Python package manager designed for speed, simplicity, and reproducibility. 

We’ll explore UV’s core features, including uv pip, uv venv, and uv tool, and demonstrate how they streamline package installation, environment creation, and tool management. We’ll also compare UV with traditional solutions such as pip and conda, highlighting where it excels in performance and usability, particularly in high-performance computing (HPC) environments. 


Simplot B: 
Practical Guide to Performance-Conscious Python
Presenter- Robben Migacz

Python is one of the most widely used languages in scientific computing, and its adoption on HPC systems continues to grow, particularly among users with limited training in HPC and performance-oriented software development. At the same time, the ecosystem of tools for high-performance Python has expanded rapidly, making it increasingly difficult for users—and the research computing support teams advising them—to identify effective strategies to improve performance. I will discuss the landscape of high-performance Python with an emphasis on decision-making: how to choose appropriate tools and approaches based on workload characteristics and performance goals, focusing on common performance pitfalls, practical tradeoffs, and guidance for selecting technologies. The content will be most useful for Python users and research computing and data (RCD) facilitators who support Python workflows on HPC systems.
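
As a small example of the kind of pitfall the talk addresses, the sketch below contrasts a pure-Python accumulation loop with a vectorized NumPy equivalent; exact timings will vary by system.

    import time
    import numpy as np

    # A common pitfall: element-wise work in a pure-Python loop. The vectorized
    # version pushes the loop into compiled code.
    x = np.random.rand(10_000_000)

    start = time.perf_counter()
    total_loop = 0.0
    for value in x:            # interpreted loop, one Python-level operation per element
        total_loop += value * value
    loop_time = time.perf_counter() - start

    start = time.perf_counter()
    total_vec = np.dot(x, x)   # single call into optimized, compiled code
    vec_time = time.perf_counter() - start

    print(f"loop: {loop_time:.2f}s  vectorized: {vec_time:.4f}s  "
          f"same result: {np.isclose(total_loop, total_vec)}")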

Simplot C:
Deploying and Operationalizing Intel Gaudi Systems with Kubernetes for AI Workloads
Presenter- Johnathan Lee

This session will cover ASU’s experience with the recent Intel Gaudi system donation, including the planning and buildout of new data center space to support the hardware. I will discuss the technical and operational challenges involved in bringing the systems online, lessons learned during deployment, the current status of the environment, and how we plan to integrate Gaudi into our broader research computing ecosystem to support AI and other high performance workloads.

Simplot D: 
Accelerating Scale-Out HPC and Data-Intensive Research with a Modern Parallel File System
Presenter- Joe McCormick, BeeGFS

This session examines the role of modern parallel file systems in supporting scalable, data-intensive HPC environments commonly found in academic and regional research computing centers. It discusses how BeeGFS enables high-throughput, low-latency storage architectures that effectively support a wide range of workloads, including traditional simulation, modeling, data analytics, and emerging AI-driven research.

Attendees will gain insight into BeeGFS architecture and design principles, including scalable metadata services, flexible storage tiering, and integration with high-speed networks and NVMe-based storage. The session will also present real-world user case studies from research and HPC environments, highlighting practical deployment considerations, performance characteristics, and operational lessons learned.

By focusing on production deployments and real research workflows, this talk demonstrates how BeeGFS is used today to build reliable, high-performance storage platforms that scale with growing compute and data demands in academic and research-focused HPC environments.

 

Simplot A: 
What's on the agenda for RMACC in 2026?
Presenter- Becky Yeager

Join us to learn more about what RMACC has accomplished over the last year and what is on the agenda for 2026 and beyond!

Simplot B:
INL’s software stack buildout
Presenter- Kit Menlove

Simplot C: 
Linpack and other benchmarks on heterogeneous HPC clusters 
Presenter- Nil Mu

Benchmarking is a core part of HPC operations — whether you're validating a new cluster, justifying a procurement, or hunting down a performance regression. But running benchmarks well on heterogeneous hardware introduces challenges that the documentation doesn't always prepare you for. This presentation shares our experience running the NVIDIA HPC benchmark container on A100 GPUs and submitting results to the Top500 and Green500 lists, covering the practical details of tuning HPL parameters, navigating the submission process, and the surprises we encountered along the way. We also attempted runs across mixed A100 and H100 nodes, which raised questions about how GPU generational differences affect Linpack scaling, interconnect saturation, and whether heterogeneous submissions are even meaningful. Beyond Linpack, we'll discuss our use of the OSU Micro-Benchmarks (OMB) suite as a cluster diagnostics tool, using pairwise node communication tests to identify fabric bottlenecks, misconfigured adapters, and inconsistent latency across the cluster, and more.

Simplot D:
Scalable Patent Search and Analysis Using Large Language Models with Function Calling
Presenter- Juan Jose Garcia Mesa

This work describes a scalable system for automated patent search and analysis that integrates large language models with function calling to support data retrieval and classification. The approach combines conventional data extraction from the US Patent and Trademark Office with semantic similarity search and structured function execution to enable accurate and reproducible patent management applicable to real-world institutional data analysis challenges.
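
As a generic illustration of function calling (not the presenter's system or the USPTO API), the sketch below registers a hypothetical patent-search tool with an OpenAI-style chat API and reads back the structured call the model produces; the model name and tool schema are placeholders.

    import json
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set; a local OpenAI-compatible server also works

    # Hypothetical tool definition; a real system would map this to its own search backend.
    tools = [{
        "type": "function",
        "function": {
            "name": "search_patents",
            "description": "Search patent records by keyword and year range.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"},
                    "start_year": {"type": "integer"},
                    "end_year": {"type": "integer"},
                },
                "required": ["query"],
            },
        },
    }]

    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": "Find battery electrolyte patents filed since 2020."}],
        tools=tools,
    )

    # The model replies with a structured call instead of free text; production code
    # would check whether tool_calls is present, run the search, and feed results back.
    call = response.choices[0].message.tool_calls[0]
    print(call.function.name, json.loads(call.function.arguments))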

 

 

Join us at The Stueckle Sky Center 
 1200 W University Dr, Boise, ID

5:30 pm- Take a picture on the iconic blue field of Albertson's Stadium and explore the Boise State football stadium
6:00-7:30 pm- Join us at the Stueckle Sky Center for appetizers and drinks, sponsored by Intel

Thursday May 14th

Simplot A: Evolutionary HPC

Simplot B: 
Hands-On with DeepLynx Nexus: Building an AI-Ready Data Catalog
Presenter- Drew Rizk

In 2018, Idaho National Laboratory built DeepLynx, a data warehouse designed to organize large amounts of engineering and scientific data. As INL's projects grew more complex and AI became more central to their work, the original platform couldn't scale to meet new requirements.

DeepLynx Nexus was built to address these limitations. Rather than replicating and storing large datasets, Nexus catalogs metadata and relationships about data that lives in existing systems. Think of it as a smart catalog that doesn't just tell you what data exists and where to find it, but also explains how different pieces relate to each other, where they came from, and what they mean. This rich context is exactly what AI agents need to actually do useful work.

This presentation provides a hands-on walkthrough of getting Nexus running locally and cataloging your first datasets. We'll cover installation, configuration, creating a data schema, and a brief overview of Apache Airflow, a common ETL adapter architecture we use to bring metadata into Nexus. By the end, you'll have a practical understanding of how Nexus works and how it's being used to support lab initiatives. 

Simplot C:
AWS

Simplot D:
Building the AI Workforce: Practical Models for Training the Next Generation of Research Computing Professionals
Presenters- Shelley Knuth, Craig Earley, Kyle Reinholt, John Reiland

 As demand for AI-enabled research continues to grow, institutions across higher education are facing a common challenge: how to develop the skilled workforce needed to operate advanced computing infrastructure and support researchers using AI tools. This session will highlight two complementary workforce development models designed to address this need. The RMACC Student System Administrator Cohort provides hands-on training in research computing operations, bringing together students from multiple institutions to gain practical experience supporting production high-performance computing environments. In parallel, the AI Unlocked workshop series—developed through the National Artificial Intelligence Research Resource Pilot—introduces researchers and technical staff to applied AI workflows and access to national-scale computing resources through national and regional training events. Panelists will discuss lessons learned in designing these programs, approaches for scaling training beyond a single institution, and how regional and national initiatives can work together to build sustainable AI workforce pipelines.

 

Simplot A: 
Tuning Workflow Management Systems for shared HPC resources
Presenter- Nil Mu

Workflow management systems like Nextflow are increasingly popular among researchers building computational pipelines, but their default configurations rarely account for the realities of shared HPC clusters. Left untuned, these tools can flood schedulers with thousands of short-lived jobs, request resources they never use, or create bursty submission patterns that degrade cluster performance for all users. This presentation examines Nextflow resource management on SLURM clusters with a focus on the concerns that matter most to HPC operators: scheduler interaction, fair-share impact, resource efficiency, and cluster-wide utilization. Using a computationally demanding genome alignment pipeline as an example, we'll explore how executor configuration, process-level resource directives, and monitoring strategies affect not just individual pipeline performance but overall cluster health. We'll cover common anti-patterns we've encountered—over-provisioned memory requests, runaway task submissions, poor locality awareness—and the configuration and design patterns that prevent them. Whether you're supporting researchers who use workflow managers or evaluating how to integrate them into your site's policies and documentation, the goal is to give you practical knowledge for keeping these tools running well on shared infrastructure.

Simplot B: 
Research computing service models in changing times
Presenter- Martin Cuma

The changing funding environment necessitates changes in how we provide and charge for research computing services. In this talk, we'll go over what we are discussing in Utah, including some level of operational re-charge, subscription plans, compute as a service with several priority levels, and persistent services on VMs. We hope to initiate a discussion of these topics among attendees so they can share their thoughts and experiences.

Simplot C: 
Promoting Cloud Native Development and Deployment at NCAR
Presenter- Kevin Hrpcek

Modern research environments face increasing demands for agility and reproducibility, yet are often hindered by traditional, monolithic software architectures. This talk explores the transition to cloud-native computing by leveraging containers and Kubernetes to create portable, consistent computational infrastructures that decouple applications from underlying hardware. By adopting these technologies, research teams can ensure their workflows operate identically across diverse environments, from local development machines to production-grade clusters, thereby eliminating the notorious "works on my machine" problem.

The National Center for Atmospheric Research has deployed an internal Kubernetes-based platform called CIRRUS and has been working to further modernize application development and deployments by promoting CI, CD, and GitOps practices. By implementing automated testing and declarative infrastructure management, teams can drastically reduce manual errors and accelerate the deployment of new methodologies. Ultimately, embracing these methodologies empowers researchers to focus on scientific discovery rather than logistical bottlenecks.

Simplot D: 
Beyond R1: Community support for smaller institutions
Presenter- Chris Reidy

National organizations that support research computing, such as CASC and CaRCC, tend to focus on the needs of large universities. We will talk about a project from CaRCC that is oriented toward smaller institutions.

 

Simplot A: 
AI for Anvil & Anvil for AI: Case Studies in Research, Education, and Ethical Adoption
Presenter- Susanna Gardner

This presentation explores the mutually reinforcing relationship between artificial intelligence (AI) and high-performance computing (HPC) through applied research, education, and workforce development initiatives centered on Anvil, an NSF-funded national advanced computing resource. We highlight a 2025 NSF Research Experiences for Undergraduates (REU) case study in which undergraduate researchers used large language models (LLMs) to improve Anvil’s user support infrastructure by automatically generating FAQs from historical support tickets, demonstrating AI for Anvil. At the same time, Anvil’s scalable, production-grade environment enabled realistic AI model training and evaluation, illustrating Anvil for AI. Beyond research, the session addresses ethical AI implementation frameworks, governance considerations, and classroom integration. We also discuss expanding K-12 and educator outreach, including AI-enhanced CyberSafe Heroes and Code Explorers summer camps, and a new K-12 teacher in-service focused on practical AI in the classroom. Together, these efforts demonstrate a sustainable model for responsible AI adoption that spans national cyberinfrastructure, undergraduate research, and early-pipeline education.

Simplot B:
Pre-training and fine-tuning a semantically-enriched small language model
Presenter- Casey Kennington

Language models use word-level embeddings that are trained on text using a pre-train, fine-tune training and evaluation regime. In this presentation, we will see how the embeddings can be enriched with visual knowledge as they are pre-trained and fine-tuned on multiple linguistic tasks. Knowledge of Python is needed; experience with torch and huggingface helps, but is not required. 
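
As a generic illustration of the fine-tuning half of this regime (without the visual enrichment the talk describes), the sketch below fine-tunes a small pre-trained model on a stand-in text classification task using the Hugging Face transformers Trainer; the model, dataset, and hyperparameters are placeholders.

    from datasets import load_dataset
    from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                              Trainer, TrainingArguments)

    # Illustrative only: the presenter's actual models and tasks are not shown here.
    model_name = "distilbert-base-uncased"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

    # Small sentiment task as a stand-in fine-tuning target.
    dataset = load_dataset("glue", "sst2")
    encoded = dataset.map(
        lambda ex: tokenizer(ex["sentence"], truncation=True, padding="max_length", max_length=64),
        batched=True,
    )

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="finetune-out", num_train_epochs=1,
                               per_device_train_batch_size=32),
        train_dataset=encoded["train"],
        eval_dataset=encoded["validation"],
    )
    trainer.train()            # updates the pre-trained weights on the downstream task
    print(trainer.evaluate())  # evaluation on the held-out split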

Simplot C: 
The ASU Slurm HPC Dashboard
Presenter- Johnathan Lee

This presentation will provide an overview of the Slurm HPC dashboard developed and used at ASU to monitor cluster utilization, GPU resources, and overall system health. I will explain how the dashboard supports day-to-day operations, demonstrate recently added features, and discuss the roadmap for future development as we continue expanding its capabilities to meet growing demands.

Simplot D: 
ACCESS Software Documentation Service (SDS) at CU Boulder 
Presenter- John Reiland

Are you struggling to provide comprehensive software documentation to users of your HPC system, or want to make your documentation easier to maintain? Join CU Boulder Research Computing (CURC)’s User Support Team for a demonstration of the ACCESS Software Documentation Service (SDS), a package designed to automate the creation and maintenance of software documentation for HPC clusters! We’ll give a demo of CURC’s implementation of the SDS, discuss implementation, and answer any questions you may have. 

 

Jordan Ballroom