6th Nek5000 User Meeting held at U Florida

The 6th Nek5000 User/Developer Meeting was held in Tampa, FL, April 17-18, 2018. The event was hosted by the DOE Center for Multiphase Turbulence, which is headed by Prof. S. Balachandar at the University of Florida. Thirty-five researchers from industry, national labs, and American, Canadian, and European universities attended the event, which featured twenty-three presentations and extensive discussions about current and new trends in Nek5000 development. Next month will mark the 10th anniversary of Nek5000 going open source.

Nek5000 2018 meeting picture

Software release: CEED v1.0

The CEED team released its first software distribution, CEED 1.0, consisting of 12 integrated Spack packages for libCEED, mfem, nek5000, nekcem, laghos, nekbone, hpgmg, occa, magma, gslib, petsc and pumi, plus a new CEED meta-package.

With Spack, a user can install the whole CEED software stack simply with: spack install ceed.
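For readers who do not yet have Spack, the one-line install above might be preceded by a few setup steps. The following is a minimal sketch rather than the official CEED instructions. It assumes that Spack is cloned from its public GitHub repository and that the `ceed` package is still available in the Spack version you obtain. The `DRY_RUN` variable and the `run` and `install_ceed` helpers are hypothetical names introduced here so that the commands are only printed by default.

```shell
#!/usr/bin/env bash
# Sketch: install the CEED stack with a freshly cloned Spack.
# DRY_RUN, run and install_ceed are illustrative names, not part of Spack or CEED.

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = actually run them

run() {
    # Echo the command, then execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

install_ceed() {
    # Get Spack itself (a shallow clone keeps the download small).
    run git clone --depth=1 https://github.com/spack/spack.git
    # Build the CEED meta-package and all of its dependencies.
    run spack/bin/spack install ceed
    # List what was installed.
    run spack/bin/spack find ceed
}

install_ceed
```

Setting `DRY_RUN=0` performs the actual clone and build, which can take a long time because Spack compiles every package in the stack from source.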

As part of CEED 1.0, the team developed comprehensive documentation including software and compiler configurations for ALCF, OLCF, NERSC and LLNL:

| Platform | Architecture | Spack Configuration |
|---|---|---|
| Mac | darwin-x86_64 | packages |
| Linux (RHEL7) | linux-rhel7-x86_64 | packages |
| Cori (NERSC) | cray-CNL-haswell | packages |
| Edison (NERSC) | cray-CNL-ivybridge | packages |
| Theta (ALCF) | cray-CNL-mic_knl | packages |
| Titan (OLCF) | cray-CNL-interlagos | packages |
| CORAL-EA (LLNL) | blueos_3_ppc64le_ib | packages, compilers |
| TOSS3 (LLNL) | toss_3_x86_64_ib | packages, compilers |

For more information visit

Software release: libCEED v0.2

Version 0.2 of libCEED, the CEED API library, was released in March 2018 with major improvements in the OCCA backend.

libCEED is a lightweight portable library that allows, for the first time, a wide variety of applications (written in C, C++, Fortran) to share a wide variety of discretization kernels (CPU, GPU, OpenMP, OpenCL), including high-performance GPU kernels.

libCEED comes with several examples of its usage, ranging from standalone C codes in the /examples/ceed directory to examples based on external packages, such as MFEM, PETSc and Nek5000. Below is an illustration of how libCEED enables these very different codes (C++, C, F77) to take advantage of a common set of GPU kernels (see also the CEED 1.0 GPU demo):

# libCEED examples on CPU and GPU
cd examples/ceed
./ex1 -ceed /cpu/self
./ex1 -ceed /gpu/occa
cd ../..

# MFEM+libCEED examples on CPU and GPU
cd examples/mfem
./bp1 -ceed /cpu/self -no-vis
./bp1 -ceed /gpu/occa -no-vis
cd ../..

# PETSc+libCEED examples on CPU and GPU
cd examples/petsc
./bp1 -ceed /cpu/self
./bp1 -ceed /gpu/occa
cd ../..

# Nek+libCEED examples on CPU and GPU
cd examples/nek5000
./ -ceed /cpu/self -b 3
./ -ceed /gpu/occa -b 3
cd ../..

For more information visit

Panayot Vassilevski named SIAM fellow

Panayot Vassilevski, who is part of the CEED team at LLNL, was named a 2018 SIAM fellow for his work on designing algebraic approaches for creating and analyzing multilevel algorithms.


Panayot is the editor-in-chief for Numerical Linear Algebra with Applications and has published many papers and a monograph on Multilevel Block Factorization Preconditioners.

In CEED, Panayot is working on matrix-free preconditioners for high-order finite element discretizations.

Congratulations Panayot!

Benchmark release by Paranumal team

The CEED group at Virginia Tech released standalone implementations of CEED's BP1.0, BP3.0, and BP3.5 benchmark problems. For results and discussion, see the "CEED Code Competition: VT software release" entry in the Paranumal blog.

Workshop on Batched, Reproducible, and Reduced Precision BLAS

CEED's UTK team organized a two-session minisymposium devoted to Batched BLAS Standardization at the SIAM Conference on Parallel Processing for Scientific Computing, held in Tokyo, Japan, March 7-10, 2018.

The minisymposium is part of our efforts on standardization and co-design of exascale discretization APIs with application developers, hardware vendors and ECP software technologies projects. The goal is to extend the BLAS standard to include batched BLAS computational patterns/"application motifs" that are essential for representing and implementing tensor contractions.

In addition to participants from the CEED project, stakeholders from ORNL, Sandia, NVIDIA, Intel, IBM, and universities were invited.

New website launched by the Virginia Tech CEED team

A new website was recently launched by the Parallel Numerical Algorithms (Paranumal) research group at Virginia Tech. The site includes a blog with practical computing tips on high-performance implementations of finite element methods developed as part of the CEED project.

Initial release of libCEED: The CEED API library

The initial version of libCEED, the CEED API library, was released in December 2017.

libCEED is a high-order API library that, for the first time, provides a common operator description at the algebraic level. This allows a wide variety of applications to take advantage, from a single source, of the efficient operator evaluation algorithms in the different CEED packages.

Our long-term vision for libCEED is to include a variety of back-end implementations, ranging from simple reference kernels, to highly optimized kernels targeting specific devices (e.g. GPUs) or specific polynomial orders.

For more information visit

Software release: MFEM v3.3.2

Version 3.3.2 of MFEM was released on November 10, 2017. Some of the new additions in this release are:

For more details, see the interactive documentation and the full CHANGELOG at

CEED participates in xSDK and FASTMath

MFEM joined xSDK, the Extreme-scale Scientific Software Development Kit in ECP's software technologies focus area as of release xSDK-0.3.0, see

MFEM and PUMI are also part of the FASTMath institute in the SciDAC program, see

Software release: Nek5000 v17.0

Nek5000 version 17.0 was released as a major upgrade to Nek5000. Major feature improvements include:

Nekbone and Laghos join proxy app suites

The Nekbone and Laghos miniapps developed in CEED were selected to be part of ECP's initial Proxy Applications Suite.

Both miniapps were also picked to be CORAL-2 benchmarks.

Laghos was also selected as one of LLNL's ASC co-design miniapps.

6th Nek5000 User Meeting to be held at U Florida

The 6th Nek5000 User/Developer Meeting will be hosted by the DOE PSAAP-II Compressible Multiphase Turbulence (CMT) center in Tampa, FL, April 17-18, 2018.

Nek5000 hackathon at UIUC

The inaugural Nek5000 Hackathon was held at the NCSA Building, University of Illinois, Urbana-Champaign (UIUC), IL on Nov 12-14, 2017.

The event was attended by researchers and Nek5000 developers to promote the application of Nek5000 to new problems from industry, national laboratories, and academia. Twenty-five participants spent three days working on setting up new examples, developing new features, and helping one another to get maximum performance on their applications. Some of the more prominent exchanges of ideas included standardization of synthetic turbulent inflow techniques, use of CVODE for pure advection-diffusion problems, and the use of the characteristics methods for moving geometry applications.

For more details, see the Nek5000 hackathon website.

CEED organizing minisymposium at ICOSAHOM 2018

CEED is organizing a minisymposium, Efficient High-Order Finite Element Discretizations at Large Scale, at the International Conference on Spectral and High-Order Methods (ICOSAHOM 2018) in London, UK, Jul 9-13, 2018.

The goal of the minisymposium is to discuss next-generation high-order discretization algorithms and software, based on finite/spectral element approaches, that will enable a wide range of important scientific applications to run efficiently on future architectures.

Best Paper Award at NURETH-17

CEED researchers (P. Fischer, E. Merzari, A. Obabko) won a Best Paper Award at the 17th International Topical Meeting on Nuclear Reactor Thermal Hydraulics (NURETH-17), held in China in September 2017, with a paper entitled High-Fidelity Simulation of Flow Induced Vibrations in Helical Steam Generators for Small Modular Reactors.

CEED attending Cray, AMD and Intel Deep-Dives

CEED researchers and representatives of the Nek and MFEM teams will attend the October 2017 ECP vendor deep-dive meetings:

Topics of discussion include advanced technology and memory design, strong scaling considerations and the porting and evaluation of CEED's bake-off problems and miniapps (Nekbone, Laghos, NekCEM CEEDling).

New Nekbench repository

A new Nekbench repository has been released to provide scripts that simplify the benchmarking of Nek5000.

The user provides ranges for important parameters (e.g., processor counts and local problem sizes) and a test type (e.g., scaling or ping-pong test). Nekbench then runs the given test over that parameter space using a Nek5000 case file, which is also supplied by the user (for ping-pong tests, the case file is optional).
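The idea of sweeping a test over a parameter space can be sketched in a few lines of bash. This is only an illustration of the pattern and not Nekbench's actual interface: the variable names, the case name `mycase`, and the `sweep` function are all hypothetical, and the script prints each configuration instead of launching a run.

```shell
#!/usr/bin/env bash
# Sketch of a parameter sweep in the spirit of the description above.
# All names below are illustrative assumptions, not Nekbench options.

NP_LIST="64 128 256"   # processor counts to test
LELT_LIST="50 100"     # local problem sizes (elements per process)
TEST_TYPE="scaling"    # test type
CASE="mycase"          # placeholder name for a user-supplied Nek5000 case

sweep() {
    # Visit every (processor count, local size) pair in the parameter space.
    for np in $NP_LIST; do
        for lelt in $LELT_LIST; do
            # Total elements = processes * elements per process.
            echo "test=$TEST_TYPE case=$CASE np=$np lelt=$lelt nelt=$((np * lelt))"
        done
    done
}

sweep
```

With the values above, the sweep visits six configurations. A real driver would replace the `echo` with a job submission or an `mpirun` invocation appropriate to the target machine.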

Nekbench is written in bash and runs on any Unix-like operating system that supports bash. It has been successfully tested on Linux laptops/desktops, ALCF Theta, NERSC Cori (KNL and Haswell), and NERSC Edison machines for scaling tests.

Planned extensions for Nekbench include adding more machine types like ANL's Cetus, additional support for the ping-pong test type, and automated plot generation (e.g., scaling study graphs) for each test run.

GPU ports of Nek and Laghos

GPU acceleration is a main focus of the performance optimization efforts in CEED. Recent progress in this direction includes GPU ports of CEED's Nek5000 application and the Nekbone and Laghos miniapps.

For Nek5000, an initial GPU-enabled version has been developed based on OpenACC. For Nekbone, a pure OpenACC implementation as well as a hybrid OpenACC/CUDA implementation with a CUDA kernel for matrix-vector multiplication have been developed.

For Laghos, a GPU-enabled version has been released using the OCCA interface. With this approach, the user can run Laghos in a distributed fashion with a different device type per MPI process, whether serial C++, OpenMP, or CUDA.

First CEED annual meeting held at LLNL

CEED1AM picture

CEED held its first annual meeting in August 2017 at the HPC Innovation Center of Lawrence Livermore National Laboratory.

The goal of the meeting was to report on the progress in the center, deepen existing and establish new connections with ECP hardware vendors, ECP software technologies projects and other collaborators, plan project activities and brainstorm/work as a group to make technical progress.

In addition to gathering together many of the CEED researchers, the meeting included representatives of the ECP management, hardware vendors, software technology and other interested projects.

CEED researchers at ATPESC17

Six CEED researchers presented at the 2017 edition of the Argonne Training Program on Extreme-Scale Computing, now part of the Exascale Computing Project.

The CEED presentations covered a wide variety of topics, from an overview of Theta to GPU programming, dense and sparse linear algebra, and high-order discretizations on unstructured meshes.

Videos of all 2017 talks are available on YouTube. CEED researchers have also participated in past editions of the meeting.

CEED BPs and benchmarks repository released

CEED released an initial set of bake-off problems (BPs), which are simple kernels designed to test and compare the performance of high-order codes, both internally in CEED and in the broader high-order community.

In addition to the benchmark descriptions on the CEED BPs page, a benchmarks repository is publicly available with several implementations of the CEED bake-off problems. Currently, MFEM, Nek5000 and deal.II are included; see the directories tests/mfem_bps, tests/nek5000_bps and tests/dealii_bps, respectively.

New Laghos and NekCEM CEEDling miniapps released

Two new miniapps developed in CEED were released in June 2017: Laghos and NekCEM CEEDling.

Laghos (LAGrangian High-Order Solver) is a new miniapp developed in CEED that solves the time-dependent Euler equations of compressible gas dynamics in a moving Lagrangian frame using unstructured high-order finite element spatial discretization and explicit high-order time-stepping. In CEED, Laghos serves as a proxy for a sub-component of the MARBL/LLNLApp application.

NekCEM CEEDling is a new NekCEM miniapp, solving the time-domain Maxwell equation for electromagnetic systems.

For more details, see the CEED miniapps page and the Laghos and NekCEM CEEDling repositories on GitHub.

Paper with MPICH at SC17

A joint paper with the MPICH group, "Why is MPI so Slow? Analyzing the fundamental limits in implementing MPI-3.1", was accepted at Supercomputing 2017 (SC17). The paper provides an in-depth analysis of the software overheads in the MPI performance-critical path and exposes mandatory performance overheads that are unavoidable based on the MPI-3.1 specification.


Support for the sparse direct solver and preconditioner STRUMPACK has been integrated in MFEM.

STRUMPACK is being developed at LBNL and is part of the ECP project Factorization Based Sparse Solvers and Preconditioners (Xiaoye Sherry Li and Pieter Ghysels). The STRUMPACK solver is based on multifrontal sparse Gaussian elimination and uses hierarchically semi-separable matrices to compress fill-in. It can be used as an exact direct solver or as an algebraic, robust and parallel preconditioner for a range of discretized PDE problems.

2017 PETSc User Meeting

Over 75 participants from all over the world attended the PETSc User Meeting, held June 14-16 in Boulder, CO. Hosted by the University of Colorado Boulder, the event consisted of a one-day tutorial on the solver library PETSc and showcased the latest research enabled by the functionality available in PETSc. The meeting agenda covered a total of 15 talks, four posters, and two panels.

Thanks to generous support from Intel and Tech-X, 22 students received travel grants and learned about the latest techniques for the large-scale numerical solution of partial differential equations.

PETSc is a suite of data structures and routines for the scalable (parallel) solution of scientific applications modeled by partial differential equations. It has become one of the most widely used numerical software packages of its kind and has users in application areas ranging from acoustics and arterial flow to seismology and semiconductors.

GPU Hackathon at BNL

The Nek/CEED team participated in the GPU Hackathon 2017, held at Brookhaven National Laboratory on June 5-9, 2017. Our team focused on running and tuning GPU-enabled versions of Nek5000/Nekbone/NekCEM on large-scale GPU systems for small modular reactor, thermal fluids, and meta-materials modeling.

Workshop on Batched, Reproducible, and Reduced Precision BLAS

The second Workshop on Batched, Reproducible, and Reduced Precision BLAS was held in Atlanta, GA on February 23-25, 2017, with participation from many members of the CEED MAGMA team.

The goal of this workshop was to discuss extending the Basic Linear Algebra Subprograms (BLAS). The existing BLAS have proven very effective in supporting portable, efficient software for sequential computers and for some of the current class of high-performance computers. New computational needs in many applications have motivated investigating whether the currently accepted standards can be extended to provide greater parallelism for small-size operations, reproducibility, and reduced precision support.

Of particular interest to CEED is the use of batched BLAS for finite element tensor contractions, and thus our team is interested in the establishment of a batched BLAS standard, highly-optimized implementations, and support from vendors on various architectures.

This is the second workshop of an open forum to discuss and formalize details related to batched, reproducible, and reduced precision BLAS. The agenda and the talks from the first workshop can be found here.

Software release: MFEM v3.3

Version 3.3 of MFEM, a lightweight, general, scalable C++ library for finite element methods and a main partner in CEED, was released on January 28, 2017 at

The goal of MFEM is to enable high-performance scalable finite element discretization research and application development on a wide variety of platforms, ranging from laptops to exascale supercomputers.

It has many features, including:

Some of the new additions in version 3.3 are:

MFEM is being developed in CASC, LLNL and is freely available under LGPL 2.1. For more details, see the interactive documentation and the full CHANGELOG.

CEED co-design center announced

The Exascale Computing Project (ECP) announced on November 11, 2016 its selection of four co-design centers, including CEED: the Center for Efficient Exascale Discretizations, which is a research partnership between Lawrence Livermore National Laboratory; Argonne National Laboratory; the University of Illinois Urbana-Champaign; Virginia Tech; the University of Tennessee, Knoxville; the University of Colorado Boulder; and Rensselaer Polytechnic Institute (RPI).

Additional news coverage can be found in LLNL Newsline and the ANL press release.

R&D 100 Award for NekCEM / Nek5000

NekCEM/Nek5000: Scalable High-Order Simulation Codes received a 2016 R&D 100 Award, given by R&D Magazine to 100 top new technologies for the year.

The R&D 100 citation reads: "NekCEM/Nek5000: Release 4.0: Scalable High-Order Simulation Codes is an open-source simulation-software package that delivers highly accurate solutions for a wide range of scientific applications including electromagnetics, quantum optics, fluid flow, thermal convection, combustion and magnetohydrodynamics. It features state-of-the-art, scalable, high-order algorithms that are fast and efficient on platforms ranging from laptops to the world’s fastest computers. The size of the physical phenomena that can be simulated with this package ranges from quantum dots for nanoscale devices to accretion disks surrounding black holes. NekCEM provides simulation capabilities for the analysis of electromagnetic and quantum optical devices, such as particle accelerators and solar cells. Nek5000 provides turbulent flow simulation capabilities for a variety of thermal-fluid problems including nuclear reactors, internal combustion engines, vascular flows, and ocean currents."

See the ANL press release for more information.