CEED Annual Meeting
August 1-3, 2023
Lawrence Livermore National Lab + Virtual
CEED will hold its seventh and final annual meeting August 1-3, 2023 in a hybrid format: in-person at Lawrence Livermore National Laboratory and virtually using ECP Zoom for videoconferencing and Slack for side discussions. We strongly encourage you to join us in person if you can!
The goal of the meeting is to report on progress in the center, deepen existing and establish new connections with ECP hardware vendors, ECP software technology projects, and other collaborators, plan project activities, and brainstorm/work as a group to make technical progress. In addition to gathering together many of the CEED researchers, the meeting will include representatives of ECP management, hardware vendors, software technology, and other interested projects.
The meeting will take place at the University of California Livermore Collaboration Center (UCLCC), which is just outside LLNL's East Gate.
The meeting will include the following elements:
- Project review and updates from the CEED team
- Contributed talks from AD, ST, vendors, and external partners
- Technical discussions in small breakout groups
In addition to the in-person meeting, we will provide the following options for remote participants:
- Live presentations and wrap-up will be on ECP Zoom (link to be posted the week of the meeting).
- Side discussions and breakout sessions will be on the meeting Slack space -- Please [join in advance].
The meeting activities will take place 9:00am-5:00pm Pacific time.
Tuesday, August 1
Tim Germann (LANL, ECP Co-Design Lead)
|9:05-9:30||Welcome & CEED Overview
Tzanio Kolev (LLNL)
|9:30-10:00||Exascale Poisson Solvers for Incompressible Navier-Stokes
Paul Fischer (UIUC)
We will present user experiences on peta- and exascale platforms.
Tim Warburton (Virginia Tech)
We will describe the scope of the libParanumal project including new features targeting linear solver efficiency.
|11:00-11:30||Finite Element Thrust & MFEM Update
Veselin Dobrev (LLNL)
This talk will provide a summary of the Finite Element thrust activities during the last year of the CEED project with a main focus on new developments in the MFEM software library.
|11:30-12:00||Applications Thrust & Nek Update
Misun Min (ANL)
We will present challenging application problems and recent advances in performance and optimization on exascale architectures.
|1:00-1:30||How Numerical Simulations Helped to Achieve Breakeven in ICF Implosions on the National Ignition Facility
Marty Marinak (LLNL)
The inertial confinement fusion (ICF) program relies upon detailed simulations with radiation hydrodynamic codes to design targets and to interpret the experimental results. The simulations treat as much physics from first principles as is practical, including laser deposition, cross-beam energy transfer, x-ray production and transport, NLTE kinetics, thermal transport, hydrodynamic instabilities, thermonuclear burn, and transport of reaction products. These simulations were used to optimize the target designs, making use of increased laser energy, to give them greater robustness. Preshot predictions of the first experiment that surpassed breakeven yielded thermonuclear burn conditions that matched the experimental results reasonably well, with a target gain of ~1.5. We will cover the key developments in radiation hydrodynamic codes and modeling methodologies that enabled these simulations.
|1:30-2:00||ExaSMR Update: Exascale Multiphysics Reactor Simulations
Elia Merzari (Penn State)
ExaSMR integrates reliable numerical methods for modeling reactors using Monte Carlo transport for neutron flux distribution (OpenMC) and high-resolution computational fluid dynamics (NekRS) for thermal fluid heat transfer. It aims to run efficiently on exascale systems, benefiting nuclear vendors and the broader nuclear community by generating detailed virtual datasets. The exascale challenge problem involves predicting reactor conditions for a small modular reactor. This talk will discuss recent multiphysics simulations that achieved and surpassed the ECP targets. We will also discuss how these simulations are expected to impact the field of nuclear engineering and provide future perspectives.
|2:00-2:30||An Exascale Workflow to Connect Variations in Additive Manufacturing Process-Aware Built Microstructures to Variations in Part Scale Properties
Robert Carson (LLNL)
Additively manufactured (AM) metals exhibit complex grain morphologies that differ drastically from those found in traditional manufacturing techniques. In turn, the mechanical response can also vary substantially from traditional techniques, and from other AM builds, due to varying processing conditions. The ExaAM project is developing a number of high-fidelity simulation capabilities and workflows that go from the melt pool physics all the way up to the part-scale response. The aim of the ExaAM project is to provide predictive simulation results that can guide the AM design process and help accelerate part qualification. Within this study, we will examine the workflows necessary to connect process-aware built microstructures to macroscopic yield surfaces used in part-scale simulations using ExaConstit. Results from this study will be examined using simulations of the NIST IN625 AMB2018-01 part build and will utilize the new exascale computer, Frontier, located at ORNL.
|2:30-3:00||Conjectures in Economics for Fluids and Structures
Jed Brown (Boulder)
Data structures and algorithms have changed the relative costs, but few production pipelines have internalized the new economics of simulation. Meanwhile, there is frequent dispute among practitioners of when to apply linearizations and assumptions on physical regime, when to use structured vs unstructured grids, and many other important design choices in a simulation tool. We reflect on the computational cost, robustness, and user interface consequences of "simplifying" this decision landscape by embracing fully nonlinear formulations with unstructured meshes.
|3:00-3:30||Group Photo & Coffee Break
|3:30-4:00||Matrix-Free Preconditioners for High-Order H(div) Discretizations
Will Pazner (Portland State)
Many problems of physical relevance are posed in the Sobolev space H(div), including porous media flow, radiation diffusion, and magnetohydrodynamics. The iterative solution of the large linear systems that result from high-order finite element discretizations of problems in H(div) remains challenging. In this talk, I will describe the construction of "matrix-free" solvers for high-order grad-div and Darcy problems in H(div). These solvers are designed to work on meshes of high-order tensor-product elements (quadrilaterals and hexahedra), and make use of properties of one-dimensional interpolation and histopolation operators. We will show that the condition number of the preconditioned system is bounded independent of polynomial degree and mesh size, and so the number of iterations is asymptotically O(1), resulting in a solver with quasi-optimal computational complexity. High-performance and GPU-accelerated implementations will be discussed.
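As a rough illustration of the matrix-free idea underlying these solvers, the sketch below applies a 2D tensor-product operator by sum factorization instead of assembling its Kronecker product. This is a generic NumPy example with illustrative sizes, not the talk's actual H(div) interpolation and histopolation operators:

```python
import numpy as np

# Sum-factorized (matrix-free) application of a 2D tensor-product
# operator A = kron(B, B). B stands in for a 1D interpolation or
# derivative matrix; sizes are illustrative.
rng = np.random.default_rng(0)
p = 6
B = rng.standard_normal((p, p))
u = rng.standard_normal(p * p)

# Matrix-based application: O(p^4) storage and work in 2D.
v_explicit = np.kron(B, B) @ u

# Matrix-free application: reshape u to p x p and apply B along each
# dimension separately, O(p^3) work with no assembled matrix.
v_factored = (B @ u.reshape(p, p) @ B.T).reshape(-1)

assert np.allclose(v_explicit, v_factored)
```

The same factorization extends to 3D hexahedral elements, where the savings over assembled matrices grow with the polynomial degree.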
|4:00-4:30||Hardware Thrust & MAGMA Update
Stanimire Tomov and Natalie Beams (UTK)
In this presentation, we will provide an update on the main tasks and activities of the CEED Hardware thrust, including the latest developments in MAGMA and the MAGMA backend for libCEED. We will discuss optimizing the MAGMA Backend for Frontier/Aurora, with improvements in the batch matrix-vector multiplication kernel and utilizing Tensor Cores/Matrix Cores for enhanced GEMM performance. We will also cover porting the MAGMA Backend to Intel GPUs, mixed-precision optimizations in CEED, mini-application development, integration of MFEM-Ginkgo for distributed solvers, autotuning the MAGMA backend, exploring SYCL backends in libCEED, and ongoing work in porting the entire MAGMA library to SYCL/oneAPI.
|4:30-5:00||Tools for MFEM-Based, High-Order Adaptive Simulations with Application to RF Plasma Heaters in Tokamak Fusion Reactors
Aditya Joshi, Cameron Smith, Matthew McCall, and Mark Shephard (RPI)
This presentation will review recent developments of the conforming mesh adaptation procedures available to MFEM users. These developments include progress on supporting GPU-based conforming mesh adaptation on the DOE exascale machines and supporting curved mesh adaptation where the mesh entities on curved boundaries can be up to sixth order and interior mesh entities can use up to cubic geometry. The presentation will also overview the use of the developed capabilities in a complete simulation workflow of RF plasma heaters in tokamak fusion reactors. The steps in the simulation workflow include defeaturing of unneeded details from antenna CAD models; combining the antenna, reactor wall, and physics components into a single analysis model geometry; applying physical attributes to the analysis model; automatically generating a graded mesh; and executing an MFEM-based adaptive finite element analysis that includes iterations of finite element solve, a posteriori error estimation, and mesh enrichment.
|5:00||Day 1 Wrap-up|
Wednesday, August 2
|9:00-9:30||FUSE: A Face Upwinded Spectral Element Method for Conservation Laws
Per-Olof Persson (UC Berkeley)
We present a new high-order accurate discretization on unstructured quadrilateral meshes. Our Face Upwinded Spectral Element (FUSE) method uses the same node distribution as a high-order continuous Galerkin (CG) method, but with a particular choice of high-order node locations within each element and an upwinded stencil on the face nodes. This results in a number of benefits, including fewer degrees of freedom, straightforward integration with CG, and a highly sparse line-based connectivity pattern. We present the derivation of the scheme and the analysis of its properties, in particular stability and conservation for arbitrary polynomial degrees. We show extensive numerical evidence for its accuracy, efficiency, and high sparsity compared to traditional schemes, on multiple classes of problems including convection-dominated flows, the Euler equations, and the incompressible Navier-Stokes equations.
|9:30-10:00||On the Implementation of Small Dense Matrix Multiplications on Intel GPUs
Freddie Witherden (Texas A&M)
In this talk I will outline our experiences developing small dense matrix multiplication kernels for Intel GPUs. Particular attention will be paid to the microarchitectural differences between Intel GPUs and those from AMD and NVIDIA, especially the need to rely on cache in lieu of shared memory. As part of this I will describe the libysmm library, which is able to obtain up to 75% of peak for single-precision multiplications on consumer GPUs.
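The workload pattern at issue, many independent small dense multiplications, can be sketched in NumPy. This is a generic illustration of the batching idea, not libysmm's GPU kernels; the sizes are arbitrary:

```python
import numpy as np

# Many small GEMMs, as arise in per-element operator application.
rng = np.random.default_rng(1)
batch, m, k, n = 1000, 8, 8, 8
A = rng.standard_normal((batch, m, k)).astype(np.float32)
B = rng.standard_normal((batch, k, n)).astype(np.float32)

# Naive loop over small GEMMs: dominated by per-call overhead.
C_loop = np.stack([a @ b for a, b in zip(A, B)])

# One batched call amortizes overhead across the whole batch; on a GPU
# the analogous batched kernel also keeps the device fully occupied.
C_batched = np.einsum('bik,bkj->bij', A, B)

assert np.allclose(C_loop, C_batched, atol=1e-4)
```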
|10:30-11:00||Leveraging GPUs for Multi-Scale Modeling in Storm and Hurricane Simulations
Frank Giraldo (NPS)
This talk will describe our approach to leveraging GPUs to get around the cost of the large-eddy simulations (LES) required for capturing extreme events in high-resolution weather prediction. The NUMA and xNUMA models form the basis of our element-based Galerkin research codes used for this work. The goal of this work is to resolve hurricanes without paying the excessive price currently required to capture all spatial scales. We aim to do this by 1) using adaptive mesh refinement (AMR), 2) exploring a multi-scale modeling framework (MMF), 3) employing machine learning algorithms as proxies for the MMF approach, and 4) utilizing current computer hardware.
|11:00-11:30||Preparing Algebraic Multigrid Solvers in hypre for Exascale Computers
Rui Peng Li (LLNL)
The emerging exascale computers provide opportunities to perform much larger scale simulations to obtain more accurate solutions than ever before. The increasing complexities of heterogeneous accelerators on such platforms have made the development of sparse linear solvers challenging to achieve high performance. In this talk, we will discuss the porting strategies, new developments and performance optimizations of the Multigrid solvers in hypre in preparation for the exascale computers with the results from real application codes.
|11:30-12:00||Recent Development in PETSc GPU Support
Jacob Faibussowitsch and Junchao Zhang (ANL)
This talk includes two parts. In the first part, we will introduce some new features that were recently added in PETSc to support GPUs. We will give a status update on PETSc GPU support with CUDA, HIP and SYCL. In the second part, we provide a detailed discussion of a new transparently asynchronous programming model for use in PETSc and beyond. We begin with an overview of the model, discuss its implementation, and finally conclude with concrete performance results.
|1:00-1:30||Multiplicative Smoothers for Multi-Level Solvers: The Quadratic Eigenvalue Problem for ORAS
Stephen Thomas (AMD)
We have reformulated the Gram-Schmidt and GMRES algorithms, greatly enhancing their strong scaling properties, and extended and proved new backward error results by introducing the IGS-GMRES variant and now MGS-CGS GMRES. Based on these ideas, we have introduced composite smoothers (L1-Jacobi + Gauss-Seidel) for algebraic multigrid and optimized Schwarz, which lead to 3x faster solve times on GPU architectures. The Gershgorin circle theorem shows how these improve with the problem size. With Erika Strakova in Ostrava, we are making progress on optimized RAS by extending the work of St-Cyr, Gander, and Thomas (2007) to irregular domains. This work channels Barry Smith et al. (1996) by applying variational techniques to the FEM boundary conditions in integral form, and solves the quadratic eigenvalue problem for the optimal transmission conditions. Numerical experiments confirm the decrease in the number of GMRES iterations.
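For readers unfamiliar with the L1-Jacobi component of the composite smoothers above, here is a minimal standalone sketch in NumPy. It is illustrative only, not the AMD/hypre implementation; the function name and parameters are our own:

```python
import numpy as np

def l1_jacobi(A, b, x, sweeps=5000):
    """L1-Jacobi iteration: scale the residual by the l1 norms of the
    rows instead of the plain diagonal. Since the l1 row sum dominates
    |a_ii|, the iteration is convergent for SPD matrices without any
    extra damping parameter."""
    d = np.sum(np.abs(A), axis=1)      # l1 row sums >= |a_ii|
    for _ in range(sweeps):
        x = x + (b - A @ x) / d        # x <- x + D_l1^{-1} r
    return x

# 1D Poisson test matrix: tridiagonal (-1, 2, -1).
n = 20
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x = l1_jacobi(A, b, np.zeros(n))
assert np.linalg.norm(b - A @ x) < 1e-6
```

In practice such a smoother is used for only a few sweeps inside a multigrid cycle, not run to convergence as in this toy test.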
Group discussion on topics such as:
- CPUs vs GPUs, AMD and Intel GPUs, ARM processors
- preconditioning, strong scaling, meshing, visualization, etc.
|3:30-4:00||GPU-Capable Sparse Direct Solvers
Pieter Ghysels (LBNL)
We present recent progress in porting the sparse direct solver STRUMPACK to the modern GPU architectures found at today's leadership compute facilities: OLCF's Frontier and NERSC's Perlmutter. STRUMPACK also provides several preconditioners, based on sparse multifrontal factorization with rank-structured approximations of the frontal matrices, the dense sub-blocks in the sparse triangular factors. We present multi-GPU acceleration of the block low rank preconditioner, and show near-linear complexity of the hierarchically off-diagonal butterfly compression-based preconditioner for several PDEs including high frequency Helmholtz and a singularly perturbed reaction diffusion problem.
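The rank-structured idea behind these preconditioners can be illustrated with a toy example: off-diagonal blocks coupling well-separated parts of a PDE discretization are often numerically low rank and compress well under a truncated SVD. The kernel and sizes below are illustrative only, not STRUMPACK's actual BLR or butterfly algorithms:

```python
import numpy as np

# Smooth off-diagonal interaction block 1/(1 + |x_i - y_j|) between
# two well-separated clusters of points.
n = 200
x = np.linspace(0, 1, n)
y = np.linspace(2, 3, n)
K = 1.0 / (1.0 + np.abs(x[:, None] - y[None, :]))

# Compress by truncated SVD at a relative tolerance of 1e-10.
U, s, Vt = np.linalg.svd(K)
r = int(np.sum(s > 1e-10 * s[0]))     # numerical rank
K_r = (U[:, :r] * s[:r]) @ Vt[:r]     # rank-r approximation

assert r < n // 5                     # strongly compressible
assert np.linalg.norm(K - K_r) < 1e-8 * np.linalg.norm(K)
```

Storing the factors costs O(nr) instead of O(n^2), which is the source of the near-linear complexity mentioned in the abstract.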
|4:00-4:30||Hybrid p-Multigrid and Low-Order Refined Preconditioners for the High-Order Poisson Equation
Malachi Phillips (UIUC)
The solution of the Poisson equation arising from the spectral element discretization of the incompressible Navier-Stokes equations requires robust preconditioning strategies. Low-order refined preconditioners, however, offer constant, bounded condition numbers. We propose a hybrid p-multigrid and low-order refined preconditioner that improves the time-to-solution by as much as 86% compared to the low-order preconditioner. We demonstrate the effectiveness of this approach on a variety of problems arising from the spectral element discretization of the incompressible Navier-Stokes equations, on GPU architectures spanning up to P > 1024 NVIDIA V100 GPUs on Summit.
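For context, the structure of a multigrid cycle can be sketched with a generic two-level method for the 1D Poisson problem. This toy example uses h-coarsening with damped Jacobi smoothing; the talk's method instead uses p-coarsening on spectral elements with a low-order refined coarse solver:

```python
import numpy as np

def poisson(n):
    """1D Poisson stiffness matrix, tridiagonal (-1, 2, -1)."""
    return 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def two_grid(A, b, x, nu=3):
    """One two-level cycle: smooth, coarse-grid correct, smooth."""
    d = np.diag(A)
    for _ in range(nu):                       # pre-smooth (damped Jacobi)
        x = x + 0.6 * (b - A @ x) / d
    n = A.shape[0]
    nc = (n - 1) // 2
    P = np.zeros((n, nc))                     # linear-interpolation
    for j in range(nc):                       # prolongation operator
        i = 2 * j + 1
        P[i, j], P[i - 1, j], P[i + 1, j] = 1.0, 0.5, 0.5
    Ac = P.T @ A @ P                          # Galerkin coarse operator
    x = x + P @ np.linalg.solve(Ac, P.T @ (b - A @ x))
    for _ in range(nu):                       # post-smooth
        x = x + 0.6 * (b - A @ x) / d
    return x

n = 127
A, b = poisson(n), np.ones(n)
x = np.zeros(n)
for _ in range(20):
    x = two_grid(A, b, x)
assert np.linalg.norm(b - A @ x) < 1e-8 * np.linalg.norm(b)
```

The key property, which also motivates the hybrid preconditioner above, is that the convergence factor per cycle is bounded independent of the problem size.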
|4:30-5:00||MPICH for Low-Latency Communication on Exascale Systems
Thomas Gillis (ANL)
PDE solvers rely on repeated communication patterns that remain unchanged over many iterations. Further, the message sizes at play are usually small. In this context, communication is latency-bound, and latency plays a key role in performance at scale. In this presentation, we detail two mechanisms to reduce the latency of communication: MPIX_Stream (mpich only) and MPI-RMA. First, MPIX_Stream was recently introduced in mpich and allows the user to specify an MPI context ID, which is crucial for achieving performance when using multiple threads. Second, MPI-RMA provides a low-latency interface when used for certain communication patterns. Specifically, we will outline the MPI RMA semantics, compare them to the commonly used point-to-point semantics, and provide examples as well as expected performance gains in the context of distributed PDE solvers.
|5:00||Day 2 Wrap-Up|
Thursday, August 3
|9:00-9:30||El Capitan Update
David Richards (LLNL)
Preparations for El Capitan are accelerating as the delivery date approaches. This talk will review the status of El Capitan preparations including the hardware and system software as well as application readiness.
|9:30-10:00||Efficient High-Dimension High-Order Matrix-Free Discontinuous Galerkin Methods for High-Fidelity Physics
Yohann Dudouit (LLNL)
The computational modeling of high-fidelity physics problems in radiation, electron, and neutron transport for inertial confinement fusion often involves the solution of large-scale, high-dimensional partial differential equations. Traditional approaches using matrix-based discontinuous Galerkin methods can be limited by their computational complexity and memory requirements. To overcome these challenges, this presentation introduces a novel high-dimension high-order matrix-free discontinuous Galerkin method for deterministic transport problems. The method leverages our general discrete ordinates (GSN) framework, which is integrated on top of the matrix-free approach and uses a natural r-adaptivity with the advection vector to eliminate numerical error artifacts and significantly improve accuracy. This approach combines high-order accuracy with efficient computational and memory utilization, making it ideal for the simulation of complex high-dimensional problems in the field of inertial confinement fusion. The presentation will showcase the results of this matrix-free discontinuous Galerkin method with GSN framework in MFEM applied to high-dimensional radiation transport problems, highlighting its efficiency, accuracy, and potential for enabling high-fidelity simulations in this field.
Nicole Marsaglia (LLNL)
This presentation will focus on updates related to the flyweight in situ analysis and visualization infrastructure, Ascent. Topics include new capabilities to support HPC simulation codes, merging co-developed codes into Ascent to simplify development, necessary updates for executing on Frontier, as well as a look towards what is next.
|11:00-11:30||Visualization of Novel and Higher Order Discretizations using VTK and ParaView
David Thompson and Corey Wetterer-Nelson (Kitware)
Many visualization tools, including VTK and ParaView, have adopted data models that focus on discretizations that were in common use years ago. Advances in large-scale solvers have fueled a new class of discretizations that make different assumptions than prior data models. This makes accurate visualization a difficult exercise, requiring significantly more memory to faithfully represent new discretizations in the original data model. Recently, Sandia National Laboratories (SNL) and the Army Engineering Research and Development Center (ERDC) have funded efforts to modernize the data model in VTK and, by extension, ParaView. We will discuss and demonstrate these modernization efforts.
|11:30-12:00||A Geometry-Based Approach to Initializing High Order Volume Fractions in Multimaterial Simulations
Kenneth Weiss (LLNL)
Geometric setup for multimaterial simulations can be error prone and time consuming. While we would ideally like to conformally mesh the shapes for all materials, this is often infeasible or impractical. In this talk, we discuss workflows for describing and initializing high order volume fractions for materials with curved interfaces using Axom's Klee and Quest components.
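The basic notion of a volume fraction can be sketched by point sampling a curved interface (here a circle) on a Cartesian grid. This is a toy Python example of the concept only; Axom's Klee and Quest components compute these quantities robustly and at high order:

```python
import numpy as np

def circle_volume_fractions(nx, ny, cx=0.5, cy=0.5, r=0.3, samples=20):
    """Per-cell volume fractions of a disk on an nx-by-ny unit grid,
    estimated by midpoint sampling on a sub-grid in each cell."""
    vf = np.zeros((nx, ny))
    s = (np.arange(samples) + 0.5) / samples   # midpoint sample offsets
    for i in range(nx):
        for j in range(ny):
            X, Y = np.meshgrid((i + s) / nx, (j + s) / ny)
            # fraction of sample points inside the circle
            vf[i, j] = np.mean((X - cx)**2 + (Y - cy)**2 <= r * r)
    return vf

vf = circle_volume_fractions(32, 32)
# Summed material volume should approximate the disk area, pi*r^2.
area = vf.sum() / (32 * 32)
assert abs(area - np.pi * 0.3**2) < 5e-3
```

Cells fully inside the material get a fraction of 1, fully outside 0, and cut cells a value in between; higher-order methods recover the interface geometry in those cut cells more accurately than sampling.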
|12:00-5:00||Additional Meetings and Discussions|
Registration closed on July 17, 2023.
There are many hotels in Livermore, and others are available in Pleasanton and nearby cities. See LLNL's recommended list of area hotels or this Google Maps search. If you stay outside of Livermore, we recommend staying west of the city to have a reverse commute to the Lab.
About Livermore and LLNL
Founded in 1869, Livermore is California's oldest wine region, framed by award-winning wineries, farmlands, and ranches that mirror the valley's western heritage. As home to renowned science and technology centers, Lawrence Livermore and Sandia national labs, Livermore is a technological hub and an academically engaged community. It has become an integral part of the Bay Area, successfully competing in the global market powered by its wealth of research, technology, and innovation.
For more than 70 years, LLNL has applied science and technology to make the world a safer place. World-class facilities include the National Ignition Facility, the Advanced Manufacturing Laboratory, and the Livermore Computing Center hosting the Sierra supercomputer and home of the future exascale machine, El Capitan.
For questions, please contact the meeting organizers at email@example.com.