6th Nek5000 User Meeting held at U Florida
The 6th Nek5000 User/Developer Meeting was held in Tampa, FL, April 17-18, 2018. The event was hosted by the DOE Center for Multiphase Turbulence, headed by Prof. S. Balachandar at the University of Florida. Thirty-five researchers from industry, national labs, and American, Canadian, and European universities attended the event, which featured twenty-three presentations and extensive discussions of current and new trends in Nek5000 development. Next month will mark the 10th anniversary of Nek5000 going open source.
Software release: CEED v1.0
The CEED team released its first software distribution, CEED 1.0, consisting of 12 integrated Spack packages (libCEED, mfem, nek5000, nekcem, laghos, nekbone, hpgmg, occa, magma, gslib, petsc, and pumi) plus a new CEED meta-package.
With Spack, a user can install the whole CEED software stack simply with:
```sh
spack install ceed
```
For more information visit http://ceed.exascaleproject.org/ceed-1.0.
Software release: libCEED v0.2
libCEED is a lightweight portable library that allows, for the first time, a wide variety of applications (written in C, C++, Fortran) to share a wide variety of discretization kernels (CPU, GPU, OpenMP, OpenCL), including high-performance GPU kernels.
libCEED comes with several examples of its usage, ranging from standalone C codes in the
/examples/ceed directory to examples based on external packages such as MFEM, PETSc, and Nek5000. Below is an illustration of how libCEED enables these very different codes (C++, C, F77) to take advantage of a common set of GPU kernels (see also the CEED 1.0 GPU demo):
```sh
# libCEED examples on CPU and GPU
cd examples/ceed
make
./ex1 -ceed /cpu/self
./ex1 -ceed /gpu/occa
cd ../..

# MFEM+libCEED examples on CPU and GPU
cd examples/mfem
make
./bp1 -ceed /cpu/self -no-vis
./bp1 -ceed /gpu/occa -no-vis
cd ../..

# PETSc+libCEED examples on CPU and GPU
cd examples/petsc
make
./bp1 -ceed /cpu/self
./bp1 -ceed /gpu/occa
cd ../..

# Nek+libCEED examples on CPU and GPU
cd examples/nek5000
./make-nek-examples.sh
./run-nek-example.sh -ceed /cpu/self -b 3
./run-nek-example.sh -ceed /gpu/occa -b 3
cd ../..
```
For more information visit https://github.com/CEED/libCEED.
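To illustrate the kind of algebraic operator decomposition libCEED is built around, here is a small NumPy sketch of matrix-free operator application in the form A = G^T B^T D B G (element restriction, basis evaluation, pointwise quadrature data). This is not the libCEED API; all names and sizes are illustrative.

```python
import numpy as np

# Hypothetical 1D illustration of A = G^T B^T D B G, where G gathers
# element DOFs (restriction), B evaluates the basis at quadrature
# points, and D applies pointwise quadrature data.
nelem, p, q = 4, 2, 3            # elements, poly degree, quad points/elem
ndof = nelem * p + 1             # neighboring elements share one node

# Element restriction G: global DOF indices for each element
G = np.array([[e * p + i for i in range(p + 1)] for e in range(nelem)])

rng = np.random.default_rng(0)
B = rng.random((q, p + 1))       # basis evaluation matrix, q x (p+1)
D = rng.random((nelem, q))       # one datum per quadrature point

def apply_operator(u):
    """Matrix-free application of A = G^T B^T D B G."""
    u_e = u[G]                   # gather:   (nelem, p+1)
    u_q = u_e @ B.T              # basis:    (nelem, q)
    d_q = D * u_q                # pointwise scaling at quadrature points
    v_e = d_q @ B                # basis^T:  (nelem, p+1)
    v = np.zeros(ndof)
    np.add.at(v, G, v_e)         # scatter (G^T): sum shared nodes
    return v

u = rng.random(ndof)
v = apply_operator(u)

# The fully assembled operator must agree with the matrix-free apply
A = np.zeros((ndof, ndof))
for e in range(nelem):
    A[np.ix_(G[e], G[e])] += B.T @ (D[e, :, None] * B)
assert np.allclose(v, A @ u)
```

The same decomposition lets backends fuse or reorder the gather/basis/pointwise stages for a CPU or GPU without touching application code.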
Panayot Vassilevski named SIAM fellow
CEED researcher Panayot Vassilevski was named a SIAM Fellow. In CEED, Panayot is working on matrix-free preconditioners for high-order finite element discretizations.
Benchmark release by Paranumal team
The CEED group at Virginia Tech released standalone implementations of CEED's BP1.0, BP3.0, and BP3.5 benchmark problems. For results and discussion, see the [CEED Code Competition: VT software release](https://www.paranumal.com/single-post/2018/02/01/CEED-Code-Competition-bake-off-problems) entry in the Paranumal blog.
Workshop on Batched, Reproducible, and Reduced Precision BLAS
CEED's [UTK team](magma.md) organized a two-session minisymposium devoted to Batched BLAS standardization at the SIAM Conference on Parallel Processing and Scientific Computing in Tokyo, Japan, March 7-10, 2018.
The minisymposium is part of our efforts on standardization and co-design of exascale discretization APIs with application developers, hardware vendors and ECP software technologies projects. The goal is to extend the BLAS standard to include batched BLAS computational patterns/"application motifs" that are essential for representing and implementing tensor contractions.
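As a concrete illustration of such a pattern, the NumPy sketch below applies the same small 1D basis matrix to a batch of elements, which is exactly the shape of computation a batched GEMM routine would cover in one call. All names and sizes here are made up for illustration.

```python
import numpy as np

# "Batched GEMM" pattern behind high-order tensor contractions:
# the same small basis matrix is applied to many elements at once.
nelem, p1, q1 = 1000, 4, 5         # elements, 1D dofs, 1D quad points
rng = np.random.default_rng(0)
B = rng.random((q1, p1))           # 1D basis evaluation matrix
U = rng.random((nelem, p1, p1))    # 2D tensor-product element dofs

# Contract each tensor index with B: V[e] = B @ U[e] @ B^T,
# i.e. two batches of small matrix-matrix products per element.
V = np.einsum('qi,eij,rj->eqr', B, U, B)

# Same result as an explicit loop of small GEMMs (what a batched
# BLAS routine would execute in a single library call):
V_loop = np.stack([B @ U[e] @ B.T for e in range(nelem)])
assert np.allclose(V, V_loop)
```

The point of a batched BLAS standard is to let vendors optimize the whole batch (kernel launch, data layout, vectorization) rather than the individual small multiplies.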
Besides participants from the CEED project, stakeholders from ORNL, Sandia, NVIDIA, Intel, IBM, and universities were invited.
New website launched by the Virginia Tech CEED team
A new website was recently launched by the Parallel Numerical Algorithms (Paranumal) research group at Virginia Tech. The site includes a blog with practical computing tips on high-performance implementations of the finite element methods developed as part of the CEED project.
Initial release of libCEED: The CEED API library
The initial version of libCEED, the CEED API library, was released in December 2017.
libCEED is a high-order API library that, for the first time, provides a common operator description at the algebraic level, allowing a wide variety of applications to take advantage of the efficient operator evaluation algorithms in the different CEED packages (from a single source).
Our long-term vision for libCEED is to include a variety of backend implementations, ranging from simple reference kernels to highly optimized kernels targeting specific devices (e.g. GPUs) or specific polynomial orders.
For more information visit https://github.com/CEED/libCEED.
Software release: MFEM v3.3.2
Version 3.3.2 of MFEM was released on November 10, 2017. Some of the new additions in this release are:
- Support for high-order mesh optimization based on the target-matrix optimization paradigm from the ETHOS project.
- Implementation of the community policies in xSDK, the Extreme-scale Scientific Software Development Kit.
- Integration with the STRUMPACK parallel sparse direct solver and preconditioner.
- Several new linear interpolators, five new examples and miniapps.
- Various memory, performance, discretization and solver improvements, including physical-to-reference space mapping capabilities.
- Continuous integration testing on Linux, Mac and Windows.
CEED participates in xSDK and FASTMath
CEED software such as MFEM participates in the xSDK, the Extreme-scale Scientific Software Development Kit. MFEM and PUMI are also part of the FASTMath institute in the SciDAC program; see https://fastmath-scidac.llnl.gov/software-catalog.html.
Software release: Nek5000 v17.0
Nek5000 version 17.0, a major upgrade, has been released. Major feature improvements include:
- Refactored build system.
- New user-input parameter file format.
- Characteristics (large time-step) support for moving mesh problems.
- Moving mesh support for the $P_N$-$P_N$ formulation.
- Improved stability for $P_N$-$P_N$ with variable viscosity.
- Support for mixed
- New fast AMG setup tool based on HYPRE.
- New interface to libxsmm (fast matrix-multiply library).
- lowMach solver for time-varying thermodynamic pressure.
- Added DG for scalars.
- Reduced solver initialization time (parallel binary reader for all input files).
- Automatic general mesh-to-mesh transfer for restarts.
- Refactored support for overlapping domains (NekNek).
- Added high-pass filter relaxation (alternative to explicit filter).
- Refactored residual projection including support for coupled Helmholtz solves.
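The residual projection mentioned in the last item can be sketched as follows: for a sequence of solves with the same SPD matrix, project the new right-hand side onto (an A-orthogonalized basis of) previous solutions to form a good initial guess. This is a minimal NumPy illustration of the idea, not Nek5000's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)        # SPD system matrix

prev = [rng.standard_normal(n) for _ in range(5)]   # earlier solutions

# A-orthonormalize the stored solutions (modified Gram-Schmidt in the
# A-inner product <x, y>_A = x^T A y)
basis = []
for x in prev:
    for qv in basis:
        x = x - (qv @ (A @ x)) * qv
    x = x / np.sqrt(x @ (A @ x))
    basis.append(x)

b = rng.standard_normal(n)
# Since q^T b = q^T A x_true, this is the A-orthogonal projection of the
# (unknown) solution onto span(prev) -- computable without solving:
x0 = sum((qv @ b) * qv for qv in basis)

# The projected guess never increases the error in the A-norm; the
# iterative solver then starts from the residual b - A x0.
x_true = np.linalg.solve(A, b)
e = x_true - x0
assert e @ (A @ e) <= x_true @ (A @ x_true) + 1e-8
```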
Nekbone and Laghos join proxy app suites
The Nekbone and Laghos miniapps have joined the ECP proxy application suites. Both miniapps were also selected as CORAL-2 benchmarks.
Laghos was also selected as one of LLNL's ASC co-design miniapps.
6th Nek5000 User Meeting to be held at U Florida
The 6th Nek5000 User/Developer Meeting will be hosted by the DOE PSAAP-II Compressible Multiphase Turbulence (CMT) center in Tampa, FL, March 17-18, 2018.
Nek5000 hackathon at UIUC
The inaugural Nek5000 Hackathon was held at the NCSA Building, University of Illinois at Urbana-Champaign (UIUC), on Nov 12-14, 2017.
The event was attended by researchers and Nek5000 developers to promote the application of Nek5000 to new problems from industry, national laboratories, and academia. Twenty-five participants spent three days setting up new examples, developing new features, and helping one another get maximum performance on their applications. Some of the more prominent exchanges of ideas included standardization of synthetic turbulent inflow techniques, use of CVODE for pure advection-diffusion problems, and use of the characteristics method for moving geometry applications.
For more details, see the Nek5000 hackathon website.
CEED organizing minisymposium at ICOSAHOM 2018
CEED is organizing a minisymposium, Efficient High-Order Finite Element Discretizations at Large Scale, at the International Conference on Spectral and High-Order Methods (ICOSAHOM 2018) in London UK, Jul 9-13, 2018.
The goal of the minisymposium is to discuss next-generation high-order discretization algorithms and software, based on finite/spectral element approaches, that will enable a wide range of important scientific applications to run efficiently on future architectures.
Best Paper Award at NURETH-17
CEED researchers (P. Fischer, E. Merzari, A. Obabko) won a Best Paper Award at the 17th International Topical Meeting on Nuclear Reactor Thermal Hydraulics (NURETH-17), held in China in September 2017, with a paper entitled High-Fidelity Simulation of Flow Induced Vibrations in Helical Steam Generators for Small Modular Reactors.
CEED attending Cray, AMD and Intel Deep-Dives
- Cray deep-dive in Bloomington, MN on Oct 18-19
- AMD deep-dive in Austin, TX on Oct 24-25
- Intel deep-dive in Hudson, MA on Oct 21-Nov 2
Topics of discussion include advanced technology and memory design, strong scaling considerations and the porting and evaluation of CEED's bake-off problems and miniapps (Nekbone, Laghos, NekCEM CEEDling).
New Nekbench repository
Nekbench, a new repository of benchmarking scripts for Nek5000, is now available. The user provides ranges for important parameters (e.g., processor counts and local problem sizes) and a test type (e.g., a scaling or ping-pong test). Nekbench then runs the given test over that parameter space using a user-supplied Nek5000 case file (optional for ping-pong tests).
Nekbench is written in bash and runs on any Unix-like operating system that supports it. Scaling tests have been run successfully on Linux laptops/desktops, ALCF Theta, NERSC Cori (KNL and Haswell), and NERSC Edison.
Planned extensions for Nekbench include adding more machine types like ANL's Cetus, additional support for the ping-pong test type, and automated plot generation (e.g., scaling study graphs) for each test run.
GPU ports of Nek and Laghos
GPU acceleration is a main focus of the performance optimization efforts in CEED. Recent progress in this direction includes GPU ports of CEED's Nek5000 application and of the Nekbone and Laghos miniapps.
For Nek5000, an initial GPU-enabled version based on OpenACC has been developed. For Nekbone, both a pure OpenACC implementation and a hybrid OpenACC/CUDA implementation, with a CUDA kernel for matrix-vector multiplication, have been developed.
For Laghos, a GPU-enabled version has been released using the OCCA interface. With this approach, the user can run Laghos in a distributed setting with a different device type per MPI process: serial C++, OpenMP, or CUDA.
First CEED annual meeting held at LLNL
CEED held its first annual meeting in August, 2017 at the HPC Innovation Center of Lawrence Livermore National Laboratory.
The goal of the meeting was to report on the progress in the center, deepen existing and establish new connections with ECP hardware vendors, ECP software technologies projects and other collaborators, plan project activities and brainstorm/work as a group to make technical progress.
In addition to gathering together many of the CEED researchers, the meeting included representatives of the ECP management, hardware vendors, software technology and other interested projects.
CEED researchers at ATPESC17
Six CEED researchers presented at the 2017 edition of the Argonne Training Program on Extreme-Scale Computing, now part of the Exascale Computing Project.
The CEED presentations covered a wide variety of topics, from an overview of Theta to GPU programming, dense and sparse linear algebra, and high-order discretizations on unstructured meshes.
Videos of all 2017 talks are available on YouTube. CEED researchers have also participated in past editions of the meeting.
CEED BPs and benchmarks repository released
CEED released an initial set of bake-off problems (BPs), which are simple kernels designed to test and compare the performance of high-order codes, both internally in CEED and in the broader high-order community.
In addition to the benchmark descriptions on the CEED BPs page, a benchmarks repository is publicly available with several implementations of the CEED bake-off problems. Currently, MFEM, Nek5000 and deal.ii are included, see directories tests/mfem_bps, tests/nek5000_bps and tests/dealii_bps respectively.
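To make the benchmark setup concrete, here is a toy BP1-style harness, an illustration under assumed sizes rather than the official benchmark code: solve a mass-matrix-like SPD system with conjugate gradients, applying the operator matrix-free element by element, and report a throughput figure of the kind the bake-offs compare (DOFs per second).

```python
import numpy as np
import time

nelem, nd = 256, 8                       # toy sizes, not the BP specs
half = nd // 2
ndof = nelem * half + half               # overlapping 1D "elements"
Gmap = np.array([[e * half + i for i in range(nd)] for e in range(nelem)])
Me = np.eye(nd) + 0.1 / nd               # toy SPD element "mass" matrix

def mass_apply(u):
    """Matrix-free operator: gather, small per-element GEMM, scatter."""
    v = np.zeros(ndof)
    np.add.at(v, Gmap, u[Gmap] @ Me.T)
    return v

def cg(apply_A, b, tol=1e-8, maxit=500):
    """Plain conjugate gradients on an SPD operator."""
    x = np.zeros_like(b); r = b.copy(); p = r.copy(); rr = r @ r
    for _ in range(maxit):
        Ap = apply_A(p)
        alpha = rr / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rr_new = r @ r
        if np.sqrt(rr_new) < tol * np.linalg.norm(b):
            break
        p = r + (rr_new / rr) * p
        rr = rr_new
    return x

f = np.random.default_rng(1).standard_normal(ndof)
t0 = time.perf_counter()
u = cg(mass_apply, f)
dofs_per_sec = ndof / (time.perf_counter() - t0)
assert np.linalg.norm(mass_apply(u) - f) < 1e-6 * np.linalg.norm(f)
print(f"throughput: {dofs_per_sec:.2e} DOFs/s (toy sizes)")
```

The real BPs fix the discretization (mass or stiffness operator, given polynomial and quadrature orders) so that implementations differ only in how fast they apply the operator.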
New Laghos and NekCEM CEEDling miniapps released
Two new miniapps developed in CEED were released in June 2017: Laghos and NekCEM CEEDling.
Laghos (LAGrangian High-Order Solver) is a new miniapp developed in CEED that solves the time-dependent Euler equations of compressible gas dynamics in a moving Lagrangian frame using unstructured high-order finite element spatial discretization and explicit high-order time-stepping. In CEED, Laghos serves as a proxy for a sub-component of the MARBL/LLNLApp application.
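For reference, the system Laghos targets can be written in a standard Lagrangian form (a generic statement of the equations, not Laghos's exact semi-discrete scheme):

```latex
\rho \frac{d\mathbf{v}}{dt} = \nabla \cdot \boldsymbol{\sigma}, \qquad
\rho \frac{de}{dt} = \boldsymbol{\sigma} : \nabla \mathbf{v}, \qquad
\frac{d\mathbf{x}}{dt} = \mathbf{v},
```

where $\rho$ is the density, $\mathbf{v}$ the velocity, $e$ the specific internal energy, $\boldsymbol{\sigma}$ the stress tensor, and $\mathbf{x}$ the particle position, with mass conservation enforced pointwise through the deformation Jacobian, $\rho\,|J| = \rho_0\,|J_0|$.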
NekCEM CEEDling is a new NekCEM miniapp, solving the time-domain Maxwell equation for electromagnetic systems.
Paper with MPICH at SC17
A joint paper with the MPICH group, Why is MPI so Slow? Analyzing the Fundamental Limits in Implementing MPI-3.1, was accepted at Supercomputing 2017 (SC17). The paper provides an in-depth analysis of the software overheads in the MPI performance-critical path and exposes mandatory performance overheads that are unavoidable based on the MPI-3.1 specification.
STRUMPACK support in MFEM
Support for the sparse direct solver and preconditioner STRUMPACK has been integrated in MFEM.
STRUMPACK is being developed at LBNL and is part of the ECP project Factorization Based Sparse Solvers and Preconditioners (Xiaoye Sherry Li and Pieter Ghysels). The STRUMPACK solver is based on multifrontal sparse Gaussian elimination and uses hierarchically semi-separable matrices to compress fill-in. It can be used as an exact direct solver or as an algebraic, robust and parallel preconditioner for a range of discretized PDE problems.
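The idea behind the hierarchically semi-separable (HSS) compression can be illustrated in a few lines (a toy NumPy example of the low-rank property only, not STRUMPACK's algorithm): off-diagonal blocks coupling well-separated clusters through a smooth kernel are numerically low-rank and can be truncated via SVD.

```python
import numpy as np

n = 200
x = np.linspace(0.0, 1.0, n)
# Interactions between two clusters separated by distance 2 under a
# smooth kernel: entries A[i, j] = 1 / (1 + |x_i - (x_j + 2)|)
A_offdiag = 1.0 / (1.0 + np.abs(x[:, None] - (x[None, :] + 2.0)))

# Truncated SVD: keep only singular values above a relative tolerance
U, s, Vt = np.linalg.svd(A_offdiag)
rank = int(np.sum(s > 1e-10 * s[0]))          # numerical rank
A_compressed = U[:, :rank] * s[:rank] @ Vt[:rank]

# Far fewer than n terms reproduce the block to high accuracy,
# which is what makes compressing fill-in profitable.
assert rank < n // 4
err = np.linalg.norm(A_compressed - A_offdiag)
assert err <= 1e-8 * np.linalg.norm(A_offdiag)
```

Diagonal (self-interaction) blocks are not low-rank; hierarchical formats like HSS exploit this structure recursively on the off-diagonal parts.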
2017 PETSc User Meeting
Over 75 participants from all over the world attended the PETSc User Meeting, held June 14-16 in Boulder, CO. Hosted by the University of Colorado Boulder, the event consisted of a one-day tutorial on the solver library PETSc and showcased the latest research enabled by the functionality available in PETSc. The meeting agenda covered a total of 15 talks, four posters, and two panels.
Thanks to generous support from Intel and Tech-X, 22 students received travel grants and got to learn about the latest techniques on the large-scale numerical solution of partial differential equations.
PETSc is a suite of data structures and routines for the scalable (parallel) solution of scientific applications modeled by partial differential equations. It has become one of the most widely used numerical software packages of its kind and has users in application areas ranging from acoustics and arterial flow to seismology and semiconductors.
GPU Hackathon at BNL
The Nek/CEED team participated in the 2017 GPU Hackathon held at Brookhaven National Laboratory on June 5-9, 2017. The team focused on porting and tuning GPU-enabled versions of Nek5000/Nekbone/NekCEM on large-scale GPU systems for small modular reactor, thermal-fluids, and metamaterials modeling.
Workshop on Batched, Reproducible, and Reduced Precision BLAS
The second Workshop on Batched, Reproducible, and Reduced Precision BLAS was held in Atlanta, GA on February 23-25, 2017 including many members of the CEED MAGMA team.
The goal of this workshop was to discuss extending the Basic Linear Algebra Subprograms (BLAS). The existing BLAS have proven very effective in supporting portable, efficient software for sequential machines and for much of the current class of high-performance computers. New computational needs in many applications motivate extending the currently accepted standards to provide greater parallelism for small-size operations, reproducibility, and reduced-precision support.
Of particular interest to CEED is the use of batched BLAS for finite element tensor contractions, and thus our team is interested in the establishment of a batched BLAS standard, highly-optimized implementations, and support from vendors on various architectures.
This is the second workshop of an open forum to discuss and formalize details related to batched, reproducible, and reduced precision BLAS. The agenda and the talks from the first workshop are available on the workshop website.
Software release: MFEM v3.3
Version 3.3 of MFEM, a lightweight, general, scalable C++ library for finite element methods and a main partner in CEED, was released on January 28, 2017 at http://mfem.org.
The goal of MFEM is to enable high-performance scalable finite element discretization research and application development on a wide variety of platforms, ranging from laptops to exascale supercomputers.
It has many features, including:
- 2D and 3D, arbitrary order H1, H(curl), H(div), L2, NURBS elements.
- Parallel version scalable to hundreds of thousands of MPI cores.
- Conforming/nonconforming adaptive mesh refinement (AMR), including anisotropic refinement, derefinement and parallel load balancing.
- Galerkin, mixed, isogeometric, discontinuous Galerkin, hybridized, and DPG discretizations.
- Support for triangular, quadrilateral, tetrahedral and hexahedral elements, including arbitrary order curvilinear meshes.
- Scalable algebraic multigrid, time integrators, and eigensolvers.
- Lightweight interactive OpenGL visualization with the MFEM-based GLVis tool.
Some of the new additions in version 3.3 are:
- Comprehensive support for the linear and nonlinear solvers, preconditioners, time integrators and other features from the PETSc and SUNDIALS suites.
- Linear system interface for action-only linear operators including support for matrix-free preconditioning and low-order-refined spaces.
- General quadrature and nodal finite element basis types.
- Scalable parallel mesh format.
- Thirty-six new integrators for common families of operators.
- Sixteen new serial and parallel example codes.
- Support for CMake, on-the-fly compression of file streams, and HDF5-based output following the Conduit mesh blueprint specification.
CEED co-design center announced
The Exascale Computing Project (ECP) announced on November 11, 2016 its selection of four co-design centers, including CEED: the Center for Efficient Exascale Discretizations, a research partnership between Lawrence Livermore National Laboratory; Argonne National Laboratory; the University of Illinois Urbana-Champaign; Virginia Tech; the University of Tennessee, Knoxville; the University of Colorado Boulder; and Rensselaer Polytechnic Institute (RPI).
R&D 100 Award for NekCEM / Nek5000
NekCEM/Nek5000: Scalable High-Order Simulation Codes received a 2016 R&D 100 Award, given by R&D Magazine to 100 top new technologies for the year.
The R&D 100 citation reads: "NekCEM/Nek5000: Release 4.0: Scalable High-Order Simulation Codes is an open-source simulation-software package that delivers highly accurate solutions for a wide range of scientific applications including electromagnetics, quantum optics, fluid flow, thermal convection, combustion and magnetohydrodynamics. It features state-of-the-art, scalable, high-order algorithms that are fast and efficient on platforms ranging from laptops to the world’s fastest computers. The size of the physical phenomena that can be simulated with this package ranges from quantum dots for nanoscale devices to accretion disks surrounding black holes. NekCEM provides simulation capabilities for the analysis of electromagnetic and quantum optical devices, such as particle accelerators and solar cells. Nek5000 provides turbulent flow simulation capabilities for a variety of thermal-fluid problems including nuclear reactors, internal combustion engines, vascular flows, and ocean currents."
See the ANL press release for more information.