Description
VPIC is being ported to and optimized for several modern architectures. These include Intel Knights Landing (KNL) processors, available on Trinity, Cori and Stampede2; Intel Skylake processors, available on MareNostrum and Stampede2; IBM POWER9 processors and NVIDIA Volta GPUs, available on Summit and Sierra; and ARM ThunderX2 processors, available on Astra at Sandia and on ARM clusters at Los Alamos National Laboratory. VPIC is in production on several of these systems. The architectures differ in many ways, including available memory bandwidth, vector length, threads per core, clock frequency and overall node architecture. This work focuses on single-node performance. Current optimization efforts are exploring changes to the data layout of key data structures, the use of performance portability frameworks such as Kokkos, and performance profiling with a variety of performance analysis tools. Results comparing the performance of VPIC across these architectures will be presented.
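As an illustration of what a data layout change can look like, the sketch below contrasts an array-of-structures (AoS) particle store with a structure-of-arrays (SoA) one. It is a minimal sketch under stated assumptions, not VPIC code: the names `ParticleAoS`, `ParticlesSoA`, `push_aos` and `push_soa` are hypothetical, and the update is a simplified free-streaming position push with no fields. In the SoA form each loop reads and writes contiguous arrays, which is generally the friendlier access pattern for wide vector units and GPUs.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical particle record (AoS): all fields of one particle are adjacent.
// This is an illustrative layout, not VPIC's actual particle definition.
struct ParticleAoS {
    float x, y, z;     // position
    float ux, uy, uz;  // momentum-like quantity used as a velocity here
    float w;           // statistical weight
};

// Hypothetical SoA container: each field is stored in its own contiguous array.
struct ParticlesSoA {
    std::vector<float> x, y, z, ux, uy, uz, w;

    explicit ParticlesSoA(std::size_t n)
        : x(n), y(n), z(n), ux(n), uy(n), uz(n), w(n) {}

    std::size_t size() const { return x.size(); }
};

// Simplified position update on the AoS layout: consecutive x values are
// sizeof(ParticleAoS) bytes apart, so the loop makes strided accesses.
void push_aos(std::vector<ParticleAoS>& p, float dt) {
    for (auto& q : p) {
        q.x += q.ux * dt;
        q.y += q.uy * dt;
        q.z += q.uz * dt;
    }
}

// Same update on the SoA layout: every array is accessed with unit stride.
void push_soa(ParticlesSoA& p, float dt) {
    const std::size_t n = p.size();
    for (std::size_t i = 0; i < n; ++i) {
        p.x[i] += p.ux[i] * dt;
        p.y[i] += p.uy[i] * dt;
        p.z[i] += p.uz[i] * dt;
    }
}
```

Both functions compute the same result; only the memory access pattern differs. Which layout performs better depends on the architecture and on the rest of the kernel, which is why it has to be measured on each platform.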
This work was supported by the U.S. Department of Energy through the Los Alamos National Laboratory. Los Alamos National Laboratory is operated by Triad National Security, LLC, for the National Nuclear Security Administration of the U.S. Department of Energy (Contract No. 89233218CNA000001).