Description
Real-time track finding for displaced-muon signatures in the CMS Level-1 trigger must operate under strict fixed-latency constraints (12.5~$\mu$s) while processing high-throughput detector data. Graph neural networks (GNNs) provide a natural representation of sparse, irregular detector geometries; however, mapping message-passing models to FPGAs requires careful co-optimization of numerical formats, architectural parameters, and high-level synthesis (HLS) microarchitecture.
We present an end-to-end workflow bridging GNN training and FPGA prototyping for a GraphSAGE-based model targeting real-time inference. The pipeline integrates: (i) automated design-space exploration across model dimensions, fixed-point precision, and HLS parameters to expose accuracy--latency--resource trade-offs; (ii) an integer-only INT8 implementation with data-driven bit-width optimization, reducing accumulator and scaling widths while preserving numerical correctness; and (iii) modular C++ kernels synthesized with Vitis HLS and validated through bit-exact C-simulation against Python integer references.
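To make item (ii) concrete, the following is a minimal sketch of what a Python integer reference for one GraphSAGE-style layer might look like. It is not the authors' implementation: every function name and all numeric values are hypothetical, and it assumes sum aggregation over a fixed-size neighbour sample (so the 1/K of a mean can be folded into the rescaling multiplier), a ReLU activation, and round-half-up rescaling by `mult / 2**shift`. The `signed_bits_needed` helper illustrates one simple data-driven way to size an accumulator from observed values.

```python
# Illustrative integer-only reference for one GraphSAGE-style layer.
# All names and values are hypothetical; this is not the authors' code.

def saturate(x, bits=8):
    """Clip x to the signed two's-complement range of the given width."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, x))


def requantize(acc, mult, shift):
    """Rescale a wide accumulator by mult / 2**shift (round half up), then clip to INT8."""
    return saturate((acc * mult + (1 << (shift - 1))) >> shift)


def signed_bits_needed(v):
    """Smallest two's-complement width that can hold the integer v."""
    return (v.bit_length() if v >= 0 else (-v - 1).bit_length()) + 1


def matvec(w, x):
    """Integer matrix-vector product on nested lists."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def sage_layer_int(feats, neighbors, w_self, w_neigh, bias, mult, shift):
    """Return (INT8 outputs, widest final accumulator seen, in bits)."""
    outputs, acc_bits = [], 0
    for v, x_self in enumerate(feats):
        # Sum over a fixed-size neighbour sample; 1/K is assumed folded into mult.
        agg = [sum(feats[u][d] for u in neighbors[v]) for d in range(len(x_self))]
        acc = [s + n + b for s, n, b in
               zip(matvec(w_self, x_self), matvec(w_neigh, agg), bias)]
        # Only final accumulators are tracked here; partial sums may need more headroom.
        acc_bits = max(acc_bits, max(signed_bits_needed(a) for a in acc))
        # ReLU on the accumulator, then rescale back to INT8.
        outputs.append([requantize(max(a, 0), mult, shift) for a in acc])
    return outputs, acc_bits


# Toy graph: 3 nodes, 2 features each, 2 sampled neighbours per node.
feats = [[10, -3], [4, 7], [-8, 2]]
neighbors = [[1, 2], [0, 2], [0, 1]]
w_self = [[2, -1], [0, 3]]
w_neigh = [[1, 1], [-2, 1]]
bias = [5, -4]
outputs, acc_bits = sage_layer_int(feats, neighbors, w_self, w_neigh, bias,
                                   mult=77, shift=8)
```

Because every operation is an integer add, multiply, shift, or clip, the same input vectors could in principle be replayed through a C++ testbench in C-simulation and compared element by element, which is the sense of "bit-exact" used above.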
Preliminary validation on Cora, a standard citation-network node-classification benchmark used here in place of physics data, shows that post-training quantization keeps model accuracy within 0.1\% of the floating-point baseline while substantially reducing memory footprint and arithmetic complexity. Bit-exact agreement between the software and hardware models is achieved using optimized fixed-point scaling. Quantization-aware training and physics-driven datasets for displaced-muon reconstruction are currently under development.
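As a rough illustration of the two ingredients named here, the sketch below shows symmetric per-tensor post-training quantization to INT8 and the conversion of a real-valued rescale factor into an integer multiplier and shift. The abstract does not specify the quantization scheme, so the symmetric scheme, the 15-bit multiplier width, the helper names, and the example values (including the rescale factor 0.3) are all assumptions made for illustration.

```python
import math

# Illustrative post-training quantization helpers.
# Scheme, names, and values are assumptions, not the authors' method.

def round_half_up(x):
    """Round to nearest integer, ties toward +infinity (a shift-friendly convention)."""
    return math.floor(x + 0.5)


def quantize_symmetric(values, bits=8):
    """Map floats to signed integers with a single scale; assumes a nonzero tensor."""
    qmax = (1 << (bits - 1)) - 1
    scale = max(abs(v) for v in values) / qmax
    q = [max(-qmax, min(qmax, round_half_up(v / scale))) for v in values]
    return q, scale


def to_fixed_point(real_scale, mult_bits=15):
    """Approximate real_scale (0 < real_scale < 1) as mult / 2**shift, mult < 2**mult_bits."""
    shift = 0
    # Grow the shift while the scaled multiplier still fits in mult_bits.
    while real_scale * (1 << (shift + 1)) < (1 << mult_bits) and shift < 31:
        shift += 1
    mult = min(round_half_up(real_scale * (1 << shift)), (1 << mult_bits) - 1)
    return mult, shift


# Hypothetical weights and a hypothetical combined rescale factor
# (input scale * weight scale / output scale).
weights = [0.5, -1.27, 0.01, 1.0]
q_weights, w_scale = quantize_symmetric(weights)
mult, shift = to_fixed_point(0.3)

# Worst-case dequantization error is bounded by half a quantization step.
max_err = max(abs(q * w_scale - w) for q, w in zip(q_weights, weights))
```

Narrowing `mult_bits` shrinks the multiplier in hardware at the cost of a coarser approximation of the rescale factor, which is the kind of trade-off a data-driven bit-width search would explore.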
This work establishes a reproducible methodology for deploying message-passing GNNs on FPGAs under strict real-time constraints, providing a concrete path toward fixed-latency GNN-based track reconstruction in the CMS trigger system.
| Field | Response |
|---|---|
| Minioral | Yes |
| IEEE Member | No |
| Are you a student? | No |