eScholarship
Open Access Publications from the University of California

Artificial Intelligence for the Electron Ion Collider (AI4EIC)

(2024)

The Electron-Ion Collider (EIC), a state-of-the-art facility for studying the strong force, is expected to begin commissioning its first experiments in 2028. This is an opportune time for artificial intelligence (AI) to be included from the start at this facility and in all phases leading up to the experiments. The second annual workshop organized by the AI4EIC working group, which recently took place, centered on exploring all current and prospective application areas of AI for the EIC. The workshop not only benefits the EIC but also provides valuable insights for the newly established ePIC collaboration at the EIC. This paper summarizes the activities and R&D projects covered across the sessions of the workshop and provides an overview of the goals, approaches, and strategies regarding AI/ML in the EIC community, as well as cutting-edge techniques currently studied in other experiments.

Quantum-centric supercomputing for materials science: A perspective on challenges and future directions

(2024)

Computational models are an essential tool for the design, characterization, and discovery of novel materials. Computationally hard tasks in materials science stretch the limits of existing high-performance supercomputing centers, consuming much of their resources for simulation, analysis, and data processing. Quantum computing, on the other hand, is an emerging technology with the potential to accelerate many of the computational tasks needed for materials science. To do so, quantum technology must interact with conventional high-performance computing in several ways: validation of approximate results, identification of hard problems, and synergies in quantum-centric supercomputing. In this paper, we provide a perspective on how quantum-centric supercomputing can help address critical computational problems in materials science, the challenges to be faced in order to solve representative use cases, and suggested new directions.

The Early Data Release of the Dark Energy Spectroscopic Instrument

(2024)

The Dark Energy Spectroscopic Instrument (DESI) completed its 5-month Survey Validation in 2021 May. Spectra of stellar and extragalactic targets from Survey Validation constitute the first major data sample from the DESI survey. This paper describes the public release of those spectra, the catalogs of derived properties, and the intermediate data products. In total, the public release includes good-quality spectral information from 466,447 objects targeted as part of the Milky Way Survey, 428,758 as part of the Bright Galaxy Survey, 227,318 as part of the Luminous Red Galaxy sample, 437,664 as part of the Emission Line Galaxy sample, and 76,079 as part of the Quasar sample. In addition, the release includes spectral information from 137,148 objects that expand the scope beyond the primary samples as part of a series of secondary programs. Here, we describe the spectral data, data quality, data products, Large-Scale Structure science catalogs, access to the data, and references that provide relevant background for using these spectra.

Distilling particle knowledge for fast reconstruction at high-energy physics experiments

(2024)

Knowledge distillation is a form of model compression that allows artificial neural networks of different sizes to learn from one another. Its main application is the compactification of large deep neural networks to free up computational resources, in particular on edge devices. In this article, we consider proton-proton collisions at the High-Luminosity Large Hadron Collider (HL-LHC) and demonstrate a successful knowledge transfer from an event-level graph neural network (GNN) to a particle-level small deep neural network (DNN). Our algorithm, DistillNet, is a DNN trained to learn the provenance of particles, as provided by the soft labels that are the GNN outputs, and to predict whether or not a particle originates from the primary interaction vertex. The results indicate that for this problem, which is one of the main challenges at the HL-LHC, there is minimal loss during the transfer of knowledge to the small student network, while the computational resource needs are significantly reduced compared to the teacher. This is demonstrated for the distilled student network on a CPU, as well as for a quantized and pruned student network deployed on a field-programmable gate array. Our study demonstrates that knowledge transfer between networks of different complexity can be used for fast artificial intelligence (AI) in high-energy physics that improves the expressiveness of observables over non-AI-based reconstruction algorithms. Such an approach can become essential at the HL-LHC experiments, e.g. to comply with the resource budget of their trigger stages.
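
As a rough illustration of the soft-label distillation described above, the sketch below trains a small per-particle MLP to reproduce a teacher's per-particle probabilities with a binary cross-entropy loss. It assumes a PyTorch environment; the feature count, network size, and the random stand-in for the teacher's outputs are illustrative choices, not the published DistillNet configuration.

```python
# Minimal sketch of soft-label knowledge distillation for per-particle
# classification, loosely following the idea described above. All shapes,
# feature counts, and the stand-in teacher outputs are illustrative
# assumptions, not the published model.
import torch
import torch.nn as nn

class StudentDNN(nn.Module):
    """Small per-particle MLP that mimics the teacher GNN's soft labels."""
    def __init__(self, n_features: int = 16, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),  # logit: probability of primary-vertex origin
        )

    def forward(self, x):
        return self.net(x).squeeze(-1)

def distillation_step(student, optimizer, features, teacher_probs):
    """One training step: regress the teacher's soft labels with BCE."""
    optimizer.zero_grad()
    logits = student(features)
    # Soft targets in [0, 1] produced by the (frozen) teacher network.
    loss = nn.functional.binary_cross_entropy_with_logits(logits, teacher_probs)
    loss.backward()
    optimizer.step()
    return loss.item()

if __name__ == "__main__":
    torch.manual_seed(0)
    student = StudentDNN()
    opt = torch.optim.Adam(student.parameters(), lr=1e-3)
    # Stand-in batch: 512 particles with 16 features each, plus fake teacher outputs.
    feats = torch.randn(512, 16)
    teacher_out = torch.rand(512)
    for epoch in range(5):
        print(f"epoch {epoch}: loss = {distillation_step(student, opt, feats, teacher_out):.4f}")
```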

New constraints on ultraheavy dark matter from the LZ experiment

(2024)

Searches for dark matter with liquid xenon time projection chamber experiments have traditionally focused on the region of the parameter space that is characteristic of weakly interacting massive particles, ranging from a few GeV/c² to a few TeV/c². Models of dark matter with a mass much heavier than this are well motivated by early production mechanisms different from the standard thermal freeze-out, but they have generally been less explored experimentally. In this work, we present a reanalysis of the first science run of the LZ experiment, with an exposure of 0.9 tonne×yr, to search for ultraheavy particle dark matter. The signal topology consists of multiple energy deposits in the active region of the detector forming a straight line, from which the velocity of the incoming particle can be reconstructed on an event-by-event basis. Zero events with this topology were observed after applying the data selection calibrated on a simulated sample of signal-like events. New experimental constraints are derived, which rule out previously unexplored regions of the dark matter parameter space of spin-independent interactions beyond a mass of 10¹⁷ GeV/c².
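
To make the track reconstruction idea concrete, the following sketch fits a straight line through a set of energy deposits via a principal-component fit and estimates the particle's speed from the projected path length and the time span. The deposit positions, times, and units are invented for illustration and do not reflect the LZ analysis code.

```python
# Illustrative sketch of the event-level reconstruction described above:
# fit a straight line through multiple energy deposits and estimate the
# incoming particle's speed from their positions and times. The deposit
# values below are invented toy data.
import numpy as np

def fit_track_and_speed(positions, times):
    """positions: (N, 3) array in cm; times: (N,) array in microseconds.

    Returns the track direction (unit vector) and the mean speed in cm/us,
    using a principal-component straight-line fit.
    """
    centroid = positions.mean(axis=0)
    # Principal axis of the deposit cloud approximates the track direction.
    _, _, vt = np.linalg.svd(positions - centroid)
    direction = vt[0]
    # Project each deposit onto the track axis and order by time.
    order = np.argsort(times)
    s = (positions[order] - centroid) @ direction
    dt = times[order][-1] - times[order][0]
    path_length = abs(s[-1] - s[0])
    return direction, path_length / dt

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Fake track: deposits along a line with small position scatter.
    t = np.linspace(0.0, 2.0, 8)               # microseconds
    true_dir = np.array([0.6, 0.0, 0.8])
    pos = 50.0 * t[:, None] * true_dir + rng.normal(0, 0.2, (8, 3))
    direction, speed = fit_track_and_speed(pos, t)
    print("fitted direction:", np.round(direction, 3))
    print(f"estimated speed: {speed:.1f} cm/us")
```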

Real‐time XFEL data analysis at SLAC and NERSC: A trial run of nascent exascale experimental data analysis

(2024)

X-ray scattering experiments using Free Electron Lasers (XFELs) are a powerful tool to determine the molecular structure and function of unknown samples (such as COVID-19 viral proteins). XFEL experiments challenge computing in two ways: i) due to the high cost of running XFELs, a fast turnaround time from data acquisition to data analysis is essential to make informed decisions on experimental protocols; ii) data collection rates are growing exponentially, requiring new scalable algorithms. Here we report our experiences analyzing data from two experiments at the Linac Coherent Light Source (LCLS) during September 2020. Raw data were analyzed on NERSC's Cori XC40 system, using the Superfacility paradigm: our workflow automatically moves raw data between LCLS and NERSC, where they are analyzed using the software package CCTBX. We achieved real-time data analysis with a turnaround time from data acquisition to full molecular reconstruction of as little as 10 minutes -- sufficient for the experiment's operators to make informed decisions. By hosting the data analysis on Cori, and by automating LCLS-NERSC interoperability, we achieved a data analysis rate that matches the data acquisition rate. Completing data analysis within 10 minutes is a first for XFEL experiments and an important milestone if we are to keep up with data collection trends.
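
The automation pattern can be sketched as a simple polling loop: detect completed runs, transfer them to the analysis site, and launch processing asynchronously so results return within minutes. The directory paths, the DONE marker, the copy-based transfer, and the process_run.sh command in the sketch below are placeholders standing in for the actual LCLS/NERSC data movers and the CCTBX pipeline.

```python
# Rough sketch of the automation pattern described above: poll for newly
# completed runs at the light source, move them to the analysis site, and
# launch processing so results come back within minutes. Paths, the
# transfer call, and the processing command are placeholders, not the
# actual LCLS/NERSC tooling.
import shutil
import subprocess
import time
from pathlib import Path

RAW_DIR = Path("/data/lcls/raw")          # hypothetical staging area at the beamline
ANALYSIS_DIR = Path("/scratch/xfel/runs") # hypothetical analysis area at the HPC site
PROCESS_CMD = "process_run.sh"            # placeholder for the CCTBX-based pipeline

def completed_runs(seen: set[str]) -> list[Path]:
    """New run directories that contain a DONE marker written by the DAQ."""
    return [p for p in RAW_DIR.iterdir()
            if p.is_dir() and (p / "DONE").exists() and p.name not in seen]

def main(poll_seconds: float = 30.0) -> None:
    seen: set[str] = set()
    while True:
        for run in completed_runs(seen):
            dest = ANALYSIS_DIR / run.name
            # Stand-in for the real wide-area transfer between facilities.
            shutil.copytree(run, dest, dirs_exist_ok=True)
            # Launch analysis asynchronously so transfers and processing overlap.
            subprocess.Popen([PROCESS_CMD, str(dest)])
            seen.add(run.name)
        time.sleep(poll_seconds)

if __name__ == "__main__":
    main()
```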

HamPerf: A Hamiltonian-Oriented Approach to Quantum Benchmarking

(2024)

Quantum computing technologies are undergoing rapid development. The different qubit modalities being considered for quantum computing each have their strengths and weaknesses, making it challenging to compare their performance relative to each other and to the state of the art in classical high-performance computing. To better understand the utility of a given quantum processor and to assess when and how it will be able to advance the frontiers of computational science, researchers need a robust approach to quantum benchmarking. A variety of approaches have been proposed, many of which characterize the presence of noise in current quantum devices. These efforts include component-level performance metrics, such as randomized benchmarking and gate set tomography; high-level application-dependent metrics; and device-level metrics, such as the Quantum Volume. However, it remains unclear how low-level metrics, such as fidelities and decoherence times, and global device metrics, such as Quantum Volume, relate to the computational utility and practical limitations of quantum processors for solving useful problems. In this paper, we describe our Hamiltonian-oriented approach to quantum benchmarking, called HamPerf. Where previous application-dependent approaches specify a suite of benchmarking circuits inspired by applications, we place the problem Hamiltonian at the center. Our strategy allows us to probe the computational performance of a quantum processor on standardized and relevant problem sets, agnostic of the algorithms and hardware used to solve them; it also provides fundamental insights into how device characteristics correlate with computational utility.
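
A minimal numerical sketch of a Hamiltonian-centric benchmark is shown below, under assumptions of our own choosing: a small transverse-field Ising chain as the problem Hamiltonian, exact diagonalization as the reference, and relative ground-state-energy error as the score. It only illustrates the idea of scoring solvers against a fixed problem Hamiltonian and is not part of the HamPerf suite.

```python
# Minimal numerical sketch of a Hamiltonian-centric benchmark in the spirit
# described above: fix a problem Hamiltonian, obtain a reference ground-state
# energy, and score any solver by its relative energy error. The Hamiltonian,
# the solver stand-in, and the metric are illustrative choices only.
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])

def kron_chain(ops):
    out = np.array([[1.0]])
    for op in ops:
        out = np.kron(out, op)
    return out

def transverse_field_ising(n_sites: int, j: float = 1.0, h: float = 1.0):
    """Dense H = -J sum Z_i Z_{i+1} - h sum X_i on an open chain."""
    dim = 2 ** n_sites
    ham = np.zeros((dim, dim))
    for i in range(n_sites - 1):
        ops = [I2] * n_sites
        ops[i], ops[i + 1] = Z, Z
        ham -= j * kron_chain(ops)
    for i in range(n_sites):
        ops = [I2] * n_sites
        ops[i] = X
        ham -= h * kron_chain(ops)
    return ham

def benchmark(solver_energy: float, ham: np.ndarray) -> float:
    """Relative error of a solver's energy estimate against exact diagonalization."""
    exact = np.linalg.eigvalsh(ham)[0]
    return abs(solver_energy - exact) / abs(exact)

if __name__ == "__main__":
    H = transverse_field_ising(n_sites=6)
    # Stand-in "solver": the energy expectation value in the product state |+...+>.
    plus = np.full(2 ** 6, 1.0 / np.sqrt(2 ** 6))
    trial_energy = plus @ H @ plus
    print(f"relative energy error of trial solver: {benchmark(trial_energy, H):.3f}")
```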

Parallel Runtime Interface for Fortran (PRIF) Specification, Revision 0.3

(2024)

This document specifies an interface to support the parallel features of Fortran, named the Parallel Runtime Interface for Fortran (PRIF). PRIF is a proposed solution in which the runtime library is responsible for coarray allocation, deallocation and accesses, image synchronization, atomic operations, events, and teams. In this interface, the compiler is responsible for transforming the invocation of Fortran-level parallel features into procedure calls to the necessary PRIF procedures. The interface is designed for portability across shared- and distributed-memory machines, different operating systems, and multiple architectures. Implementations of this interface are intended as an augmentation for the compiler's own runtime library. With an implementation-agnostic interface, alternative parallel runtime libraries may be developed that support the same interface. One benefit of this approach is the ability to vary the communication substrate. A central aim of this document is to define a parallel runtime interface in standard Fortran syntax, which enables us to leverage Fortran to succinctly express various properties of the procedure interfaces, including argument attributes.

Comparison of point cloud and image-based models for calorimeter fast simulation

(2024)

Score-based generative models are a new class of generative models that have been shown to accurately generate high-dimensional calorimeter datasets. Recent advances in generative models have used images with 3D voxels to represent and model complex calorimeter showers. Point clouds, however, are likely a more natural representation of calorimeter showers, particularly in calorimeters with high granularity. Point clouds preserve all of the information of the original simulation, deal more naturally with sparse datasets, and can be implemented with more compact models and data files. In this work, two state-of-the-art score-based models are trained on the same set of calorimeter simulations and directly compared.
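
For readers unfamiliar with the training principle behind such models, the toy sketch below implements denoising score matching on random four-feature "points"; the network, noise schedule, and feature set are illustrative assumptions and do not correspond to either of the models compared in the paper.

```python
# Toy sketch of denoising score matching, the training principle behind
# score-based generative models. The data here are random 4-feature "points"
# (e.g. x, y, z, energy); the network, noise schedule, and feature set are
# illustrative assumptions, not either published model.
import torch
import torch.nn as nn

class ScoreNet(nn.Module):
    """MLP that predicts the score of noised points, conditioned on sigma."""
    def __init__(self, n_features: int = 4, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features + 1, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, n_features),
        )

    def forward(self, x, sigma):
        return self.net(torch.cat([x, sigma], dim=-1))

def dsm_loss(model, x, sigma_min=0.01, sigma_max=1.0):
    """Denoising score matching: the target score of the noised sample is -noise/sigma."""
    sigma = torch.empty(x.shape[0], 1).uniform_(sigma_min, sigma_max)
    noise = torch.randn_like(x)
    x_noisy = x + sigma * noise
    target = -noise / sigma
    pred = model(x_noisy, sigma)
    # Weight by sigma^2 so all noise levels contribute comparably.
    return ((sigma ** 2) * (pred - target) ** 2).mean()

if __name__ == "__main__":
    torch.manual_seed(0)
    model = ScoreNet()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    points = torch.randn(1024, 4)  # stand-in for calorimeter-hit point clouds
    for step in range(200):
        opt.zero_grad()
        loss = dsm_loss(model, points)
        loss.backward()
        opt.step()
    print(f"final DSM loss: {loss.item():.4f}")
```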

First constraints on WIMP-nucleon effective field theory couplings in an extended energy region from LUX-ZEPLIN

(2024)

Following the first science results of the LUX-ZEPLIN (LZ) experiment, a dual-phase xenon time projection chamber operating at the Sanford Underground Research Facility in Lead, South Dakota, USA, we report the initial limits on a model-independent nonrelativistic effective field theory describing the complete set of possible interactions of a weakly interacting massive particle (WIMP) with a nucleon. These results utilize the same 5.5 t fiducial mass and 60 live days of exposure collected for the LZ spin-independent and spin-dependent analyses, while extending the upper limit of the energy region of interest by a factor of 7.5, to 270 keV. No significant excess is observed in this high-energy region. Using a profile-likelihood ratio analysis, we report 90% confidence level exclusion limits on the coupling of each individual nonrelativistic WIMP-nucleon operator for both elastic and inelastic interactions in the isoscalar and isovector bases.
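
As a schematic of the limit-setting machinery, the sketch below profiles a Gaussian-constrained background in a single Poisson counting channel and finds the asymptotic 90% confidence level upper limit on the signal strength; the observed count, background model, and signal expectation are invented and unrelated to the LZ data.

```python
# Schematic single-bin example of profile-likelihood-ratio limit setting:
# one Poisson counting channel with a Gaussian-constrained background,
# profiled to obtain a 90% CL upper limit on the signal strength. The
# observed count, background model, and signal expectation are invented.
import numpy as np
from scipy import optimize, stats

N_OBS = 3                 # observed events (invented)
S_EXP = 5.0               # expected signal events at unit signal strength (invented)
B_EXP, B_SIG = 2.5, 0.5   # background expectation and its uncertainty (invented)

def nll(mu, b):
    """Negative log-likelihood: Poisson count term plus Gaussian background constraint."""
    lam = mu * S_EXP + b
    return -(stats.poisson.logpmf(N_OBS, lam) + stats.norm.logpdf(b, B_EXP, B_SIG))

def profiled_nll(mu):
    """Minimize over the nuisance background at fixed signal strength mu."""
    res = optimize.minimize_scalar(lambda b: nll(mu, b), bounds=(1e-6, 20.0), method="bounded")
    return res.fun

def upper_limit_90cl():
    # Global best fit over (mu, b), with mu constrained to be non-negative.
    best = optimize.minimize(lambda p: nll(p[0], p[1]), x0=[0.1, B_EXP],
                             bounds=[(0.0, 10.0), (1e-6, 20.0)])
    nll_min = best.fun
    # Asymptotic one-sided 90% CL threshold: q(mu) = (Phi^-1(0.90))^2 ~ 1.64.
    q = lambda mu: 2.0 * (profiled_nll(mu) - nll_min) - 1.64
    return optimize.brentq(q, best.x[0] + 1e-6, 10.0)

if __name__ == "__main__":
    print(f"90% CL upper limit on signal strength: {upper_limit_90cl():.2f}")
```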