Speaker
Description
For nearly two decades, the highly improved staggered quark (HISQ) discretization of the Dirac operator has enabled fast and accurate simulations of (2+1)- and (2+1+1)-flavor QCD, particularly at the physical point. Over this period, numerous code bases targeting both CPU and GPU architectures have implemented HISQ through a variety of methods that optimize for low communication overhead, efficient use of local storage, programmatic flexibility, and so forth. We discuss ongoing efforts targeting a single-source implementation of the HISQ smearing and derivatives within the Grid framework. We emphasize our use of Grid's GeneralLocalStencil and PaddedCell data structures for the purpose of minimizing both communication overhead and storage costs. Additionally, we discuss Grid-specific design choices that enhance the flexibility of our implementation of HISQ while improving Grid's present staggered infrastructure. We end with a timeline for production-readiness and future applications, such as domain-decomposed hybrid Monte Carlo.
| Parallel Session (for talks only) | Software development and machines |
|---|---|
