8–10 Jul 2026
Europe/Zurich timezone

Towards Unified Power and Efficiency Monitoring Across the Worldwide LHC Computing Grid

9 Jul 2026, 14:45
15m
15 minute talk
Submitted talks

Speaker

Natalia Diana Szczepanek (CERN)

Description

The Worldwide LHC Computing Grid (WLCG) provides the distributed computing infrastructure required to support both LHC and non-LHC experiments. As the HL-LHC era approaches, the expected increase in computing demand makes power efficiency and sustainability increasingly important challenges for the HEP community. However, monitoring power consumption at the level of grid job slots remains a missing component of current workload management systems. Individual computing centres can monitor power consumption locally, but maintaining a consistent view across heterogeneous clusters and repeatedly benchmarking systems after hardware or configuration changes are time-consuming and often impractical for sites.

This work presents a lightweight and scalable framework for unified power and efficiency monitoring across WLCG sites by leveraging existing benchmarking and monitoring infrastructures already deployed at computing centres. The proposed approach minimizes operational overhead for site administrators while enabling continuous collection of power-related metrics from thousands of worker nodes across heterogeneous environments. Two deployment approaches are currently supported: a systemd-based collector and an implementation integrated with existing Prometheus infrastructures, allowing straightforward adoption across a broad range of sites.
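
The abstract does not say how either collector obtains its power readings. As a minimal sketch of one common pattern, assuming the node exposes a cumulative energy counter in microjoules that wraps at a known modulus, average power over a sampling interval can be derived from two readings. The function name and the `counter_modulus_uj` parameter are hypothetical and are not part of the framework described here.

```python
def average_power_watts(e_start_uj, e_end_uj, dt_seconds, counter_modulus_uj=None):
    """Average power (W) between two cumulative energy readings (microjoules).

    Assumption: the counter only increases, except when it wraps around at
    counter_modulus_uj, and it wraps at most once per sampling interval.
    """
    if dt_seconds <= 0:
        raise ValueError("dt_seconds must be positive")
    delta_uj = e_end_uj - e_start_uj
    if delta_uj < 0:
        if counter_modulus_uj is None:
            raise ValueError("counter decreased and no wrap modulus was given")
        # Undo a single wraparound of the counter.
        delta_uj %= counter_modulus_uj
    # Microjoules to joules, then joules per second gives watts.
    return delta_uj / 1e6 / dt_seconds
```

A collector would call a function like this once per sampling interval and export the result together with the node's utilization; the sampling interval and the export path are deployment choices that the abstract leaves open.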

The collected data enables large-scale analysis of dynamic power consumption under realistic workloads and varying utilization levels, providing load-aware power estimation and performance-per-watt characterization across different hardware architectures and sites. Initial studies demonstrate that the framework can support improved accounting of computing resources and more representative estimation of workload efficiency. In addition to direct measurements, this work introduces a power/core modeling approach that enables characterization of the broader WLCG infrastructure, including sites without direct power measurements.
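
To illustrate what load-aware power estimation and a per-core power figure could look like, the sketch below fits a linear model of node power against CPU utilization and derives power per core and performance per watt from it. The linear form, all function names, and the sample numbers in the usage note are assumptions made for illustration; the abstract does not specify the model actually used.

```python
def fit_linear_power_model(utilization, power_watts):
    """Least-squares fit of power = idle_watts + slope_watts * utilization.

    utilization: CPU utilization samples in [0, 1].
    power_watts: node power samples taken at the same times.
    Returns (idle_watts, slope_watts).
    """
    n = len(utilization)
    if n != len(power_watts) or n < 2:
        raise ValueError("need at least two paired samples")
    mean_u = sum(utilization) / n
    mean_p = sum(power_watts) / n
    var_u = sum((u - mean_u) ** 2 for u in utilization)
    if var_u == 0:
        raise ValueError("utilization samples must not all be equal")
    cov = sum((u - mean_u) * (p - mean_p)
              for u, p in zip(utilization, power_watts))
    slope_watts = cov / var_u
    idle_watts = mean_p - slope_watts * mean_u
    return idle_watts, slope_watts


def estimate_power(idle_watts, slope_watts, utilization):
    """Estimated node power (W) at the given utilization."""
    return idle_watts + slope_watts * utilization


def power_per_core(node_watts, n_cores):
    """Node power shared equally across its cores (W per core)."""
    if n_cores <= 0:
        raise ValueError("n_cores must be positive")
    return node_watts / n_cores


def performance_per_watt(benchmark_score, node_watts):
    """Benchmark score delivered per watt of node power."""
    if node_watts <= 0:
        raise ValueError("node_watts must be positive")
    return benchmark_score / node_watts
```

For example, with made-up samples of 100 W, 200 W and 300 W at utilizations of 0.0, 0.5 and 1.0, the fit gives an idle power of 100 W and a slope of 200 W, so the estimate at 75% utilization is 250 W. On a 50-core node this is 5 W per core, and a benchmark score of 1000 corresponds to 4 score units per watt. A fitted per-core figure of this kind is one way power could be estimated for sites without direct measurements, given their core counts and utilization.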
