2–4 Feb 2026
CIEMAT
Europe/Madrid timezone

Design-Space Exploration and Integer Quantization of Graph Neural Networks for Real-Time FPGA Track Finding

3 Feb 2026, 12:15
15m
Salón de Actos "Margarita Salas" (Edificio 1, Planta Baja) (CIEMAT)

Avenida Complutense, 40 28040 Madrid Spain
WG6 Electronics

Speaker

Pelayo Leguina (Universidad de Oviedo)

Description

Real-time track finding for displaced-muon signatures in the CMS Level-1 trigger must operate under strict fixed-latency constraints (12.5~$\mu$s) while processing high-throughput detector data. Graph neural networks (GNNs) provide a natural representation of sparse, irregular detector geometries; however, mapping message-passing models to FPGAs requires careful co-optimization of numerical formats, architectural parameters, and high-level synthesis (HLS) microarchitecture.
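For readers unfamiliar with message passing, the following is a minimal sketch of one mean-aggregation layer in the GraphSAGE style. It is an illustration only, not the model used in this work: the function name `sage_mean_layer`, the toy graph, and the weight values are all hypothetical, and the code runs in floating point with no latency or hardware constraints.

```python
def sage_mean_layer(features, neighbors, w_self, w_neigh):
    """One GraphSAGE-style layer with mean aggregation and ReLU (illustrative).

    features:  list of per-node feature vectors (each of length d_in)
    neighbors: list of neighbor-index lists, one per node
    w_self, w_neigh: weight matrices given as lists of rows (d_out x d_in)
    """
    d_in = len(features[0])
    out = []
    for v, h_v in enumerate(features):
        nbrs = neighbors[v]
        # Mean of the neighbor features; zeros for an isolated node.
        if nbrs:
            agg = [sum(features[u][k] for u in nbrs) / len(nbrs) for k in range(d_in)]
        else:
            agg = [0.0] * d_in
        # h_v' = ReLU(W_self h_v + W_neigh mean(h_u))
        row = []
        for ws, wn in zip(w_self, w_neigh):
            z = sum(a * b for a, b in zip(ws, h_v)) + sum(a * b for a, b in zip(wn, agg))
            row.append(max(0.0, z))
        out.append(row)
    return out


# Hypothetical toy graph: three nodes on a path 0 - 1 - 2, two features per node.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = [[1], [0, 2], [1]]
w_self = [[1.0, 0.0], [0.0, 1.0]]
w_neigh = [[0.5, 0.0], [0.0, -1.0]]
hidden = sage_mean_layer(features, neighbors, w_self, w_neigh)
```

The irregular neighbor lists are what makes the FPGA mapping non-trivial: the amount of work per node depends on the graph, whereas a fixed-latency design needs a bounded, known number of operations.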

We present an end-to-end workflow bridging GNN training and FPGA prototyping for a GraphSAGE-based model targeting real-time inference. The pipeline integrates: (i) automated design-space exploration across model dimensions, fixed-point precision, and HLS parameters to expose accuracy--latency--resource trade-offs; (ii) an integer-only INT8 implementation with data-driven bit-width optimization, reducing accumulator and scaling widths while preserving numerical correctness; and (iii) modular C++ kernels synthesized with Vitis HLS and validated through bit-exact C-simulation against Python integer references.
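Item (ii) can be illustrated with a small integer-only sketch. It assumes a common requantization scheme, in which the accumulator is rescaled by an integer multiplier followed by a right shift, and it sizes the accumulator from the range observed on calibration inputs. The helper names (`bits_signed`, `int8_matvec`, `requantize`), the multiplier 197, the shift 16, and the calibration vectors are hypothetical; the abstract does not specify the scheme actually used.

```python
def bits_signed(lo, hi):
    """Smallest two's-complement width that holds every integer in [lo, hi]."""
    n = 1
    while lo < -(1 << (n - 1)) or hi > (1 << (n - 1)) - 1:
        n += 1
    return n


def int8_matvec(w_q, x_q):
    """Integer-only matrix-vector product; returns the raw accumulators."""
    return [sum(w * x for w, x in zip(row, x_q)) for row in w_q]


def requantize(acc, mult, shift):
    """Rescale an accumulator to INT8 as (acc * mult) >> shift.

    Adds half an output step before the arithmetic shift (round half up),
    then saturates to the INT8 range. No floating point is involved.
    """
    y = (acc * mult + (1 << (shift - 1))) >> shift
    return max(-128, min(127, y))


# Hypothetical INT8 weights and calibration inputs.
w_q = [[127, -128, 64], [-5, 10, 20]]
calib = [[127, -128, 127], [-128, 127, -128], [10, 20, 30]]

# Data-driven sizing: record every accumulator seen during calibration
# and pick the narrowest signed width that covers the observed range.
accs = [a for x in calib for a in int8_matvec(w_q, x)]
acc_bits = bits_signed(min(accs), max(accs))

# Assumed scale of about 0.003, approximated as 197 / 2**16.
MULT, SHIFT = 197, 16
y_q = [requantize(a, MULT, SHIFT) for a in int8_matvec(w_q, calib[0])]
```

Because every operation here is an exact integer operation, a Python reference of this kind can be compared value for value against the output of an HLS C-simulation, which is the sense in which agreement can be called bit-exact. A width chosen from calibration data is only safe for inputs within the calibrated range, so a saturation or worst-case check is still advisable.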

Preliminary validation on the Cora citation-graph benchmark demonstrates that post-training quantization preserves model accuracy within 0.1\% of the floating-point baseline, while enabling substantial reductions in memory footprint and arithmetic complexity. Bit-exact agreement between software and hardware models is achieved using optimized fixed-point scaling. Quantization-aware training and physics-driven datasets for displaced-muon reconstruction are currently under development.
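As a rough illustration of post-training quantization, the sketch below applies symmetric per-tensor INT8 quantization to a handful of weights and measures the reconstruction error. The scheme, the function names, and the weight values are assumptions chosen for illustration; they are not taken from this work, and the example says nothing about the accuracy figures quoted above.

```python
def quantize_symmetric(values, n_bits=8):
    """Symmetric per-tensor quantization: q = round(v / scale), zero-point 0."""
    qmax = (1 << (n_bits - 1)) - 1          # 127 for INT8
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Python's round() rounds halves to even; clamp to the symmetric range.
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Map integer codes back to approximate real values."""
    return [qi * scale for qi in q]


# Hypothetical trained weights.
weights = [0.42, -1.27, 0.05, 0.9, -0.33]
q, scale = quantize_symmetric(weights)
recon = dequantize(q, scale)

# For values inside the clamping range, the error is at most half a step.
max_err = max(abs(a - b) for a, b in zip(weights, recon))
```

Each weight is stored in 8 bits instead of 32, a fourfold reduction in memory for that tensor, and the per-weight error is bounded by `scale / 2`. Whether such an error is tolerable for a given model has to be checked empirically on the task metric.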

This work establishes a reproducible methodology for deploying message-passing GNNs on FPGAs under strict real-time constraints, providing a concrete path toward fixed-latency GNN-based track reconstruction in the CMS trigger system.

Author

Pelayo Leguina (Universidad de Oviedo)

Co-author

Santiago Folgueras (Universidad de Oviedo (ES))
