(Bi-)Weekly meeting green compute team

Europe/London
Alessandra Forti (The University of Manchester (GB)), Caterina Doglioni (The University of Manchester (GB)), Michael Sparks (The University of Manchester (GB)), Robert Frank (University of Manchester), Tobias Fitschen (The University of Manchester (GB))
Description

Live notes: see Teams channel / on demand for non-UofM.

Zoom link:

https://cern.zoom.us/j/64447396002?pwd=M5MzzOTnDcdNDuNDPFY7slbvvhCMnt.1

In this meeting, we will discuss status of the joint work with Glasgow, and connections to the HEPScore/HEPBenchmarks work. 

# UofM/Glasgow/CERN/QMUL Green Compute Meeting
## 30/7/2025 1pm BST

Present: Michael, Rosie, Emanuele, Robert, Sakshi, Alessandra, Sudha


## Agenda 

13:00 BST / 14:00 CERN as usual.

1. Progress on Prometheus data processing (Sakshi, Myself)
* Introduction by Sudha
2. Progress on Glasgow Prometheus replication (Emanuele)
3. Brief discussion of RO Crates and standardised reporting, esp. relating to runs (Michael)
4. Who is around, and when / next meeting 
5. AOB

## Progress on Prometheus data processing (Sakshi, Myself)


### Discussion:

My notes:
* Created a notebook. Performed analysis on a week's data.
* The analysis is partial so far, but it is a start and has given
  some ideas about what would be useful to correlate power usage
  against.
* Would like guidance on what to analyse, and how.

Robert/Michael:
* Robert - the file is a list of the different metrics available.
* Michael - in effect, it's a collection of search terms.

Sakshi - how to progress?
Robert - need to select the relevant pieces carefully; the data may require summarisation.

Specific Actions:

* Sakshi - to select a collection of metrics
* Michael - to collect them for Sakshi
* Both - discussion of specific steps


### Summary (by GPT from Transcript)

Sakshi:

* Generated a notebook in the metrics exporter directory for initial data analysis (approx. one week of data).
* Analysed ~800 CSV files from Manchester Prometheus; only one unlabeled metric was present.
* Received an additional Prometheus file from Robert (1,000 lines with multiple metrics, some sparse).

* Next step: review the list of metrics to identify which should be extracted for further analysis.

Robert & Michael:

* Robert clarified that the file Sakshi analysed was a list of available metrics (snapshot, not full time-series).
* Michael to pull out selected metrics from Prometheus once Sakshi provides the list.
* Post-processing will be required (e.g. summing CPU core metrics).

* Key next step: Sakshi to select useful metrics → Michael to extract them → follow-up discussion.
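
As an illustration of the post-processing mentioned above (summing CPU core metrics), here is a minimal Python sketch. It assumes the data arrives in the `matrix` result format returned by the Prometheus HTTP API `query_range` endpoint, and that the metric is node_exporter's `node_cpu_seconds_total` with `cpu`, `instance` and `mode` labels. The notes do not say which metrics or export format will actually be used, so the function name `sum_over_cores`, the host name and the sample values are purely illustrative.

```python
from collections import defaultdict


def sum_over_cores(result):
    """Collapse per-core series into one series per (instance, mode).

    `result` is the list found under data.result in a Prometheus
    query_range response (resultType "matrix").
    """
    totals = defaultdict(lambda: defaultdict(float))
    for series in result:
        labels = series["metric"]
        # Drop the per-core "cpu" label by keying only on instance and mode.
        key = (labels.get("instance", ""), labels.get("mode", ""))
        for timestamp, value in series["values"]:
            # Prometheus encodes sample values as strings.
            totals[key][timestamp] += float(value)
    return {key: dict(sorted(points.items())) for key, points in totals.items()}


# Hand-made sample data (hypothetical host and values), two cores on one node.
sample = [
    {
        "metric": {"__name__": "node_cpu_seconds_total", "cpu": "0",
                   "instance": "node01:9100", "mode": "user"},
        "values": [[1753876800, "10.5"], [1753876860, "11.0"]],
    },
    {
        "metric": {"__name__": "node_cpu_seconds_total", "cpu": "1",
                   "instance": "node01:9100", "mode": "user"},
        "values": [[1753876800, "4.5"], [1753876860, "5.0"]],
    },
]

summed = sum_over_cores(sample)
for key, points in summed.items():
    print(key, points)
```

The same aggregation could instead be pushed into the query itself with a PromQL `sum by (instance, mode) (...)` expression; doing it in Python may be preferable only if the raw per-core data is also wanted.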


## Introduction by Sudha

- Hello/introductions.

Sudha introduced herself:

* New to Queen Mary University of London and GridPP (joined June).
* Experimental particle physicist with background in ATLAS and CMS trigger software.
* Will work on sustainability studies.


## Progress on Glasgow Prometheus replication (Emanuele)

Key points:

* Looking good, been fighting with replication + filtering, so compromising on some points.
* Will re-enable this afternoon.

* Replicating from private instance to shareable instance
* Replicating content between the two instances is made trickier by the need to filter and anonymise the data.
   - Decision around hostname/etc
       - Might be good for us to anonymise that when we extract
   - Variety of metrics added relating to Condor jobs
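
If hostnames are anonymised at extraction time, as suggested above, one possible approach is a keyed hash, so that the same host always maps to the same pseudonym without the original name appearing in shared data. The following is a minimal Python sketch, not an agreed procedure: the function names, the `node-` prefix, the label names `instance` and `hostname`, and the example host are all assumptions.

```python
import hashlib
import hmac


def pseudonymise_host(hostname, secret_key, length=12):
    """Return a stable pseudonym for a hostname using HMAC-SHA256.

    The secret key must be kept private; without it the mapping
    cannot be rebuilt by hashing a list of candidate hostnames.
    """
    digest = hmac.new(secret_key, hostname.encode("utf-8"), hashlib.sha256).hexdigest()
    return "node-" + digest[:length]


def pseudonymise_labels(labels, secret_key, host_labels=("instance", "hostname")):
    """Copy a label dict, replacing host-identifying labels with pseudonyms."""
    cleaned = dict(labels)
    for name in host_labels:
        if name in cleaned:
            # Keep any ":port" suffix, as in "host:9100".
            host, sep, port = cleaned[name].partition(":")
            cleaned[name] = pseudonymise_host(host, secret_key) + sep + port
    return cleaned


# Hypothetical key and labels, for illustration only.
key = b"replace-with-a-private-random-key"
original = {"instance": "wn042.example.ac.uk:9100", "mode": "user"}
anonymised = pseudonymise_labels(original, key)
print(anonymised)
```

Because the mapping is deterministic for a given key, time series from the same machine can still be grouped together after anonymisation.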

GPT Summary:

Emanuele:

* Significant challenges replicating and sanitising the Glasgow Prometheus database (~300 GB).
* Initially attempted to anonymise hostnames but Prometheus made this impractical.
* Decision: retain full hostnames (privacy concern deemed minimal).
* Will wipe and re-import the database, enabling external access again.
* Additional metrics to be exposed: RAM totals, CPU totals, etc. for better usage calculations.

Note:

* Timeline: expected completion by tonight or tomorrow.
* Will be away for 2–3 weeks starting next week (limited email availability).


## Who is around, and when 

Away for longer periods in August:

* Rosie: Last day mid-August
* Emanuele: Away: w/c 4th, 11th, 18th August 
* Michael: Away: w/c 11th, 18th, 25th August 
* Caterina: Away: w/c 18th, 25th Aug, 1st Sept
* Alessandra: Away 9–13 August and 21 Aug–4 Sept.

Generally around: 

* Sudha: Mostly available in August (occasional long weekends).
* Sakshi: Available in August; possible time off in September.
* Robert: Generally available; cannot guarantee.


## Brief discussion of RO Crates and standardised reporting, esp. relating to runs (Michael)

* Context: Desire for reproducible and standardised sharing of Prometheus-derived datasets.
  * Rosie: Has been scripting Prometheus data plots; aiming to replicate previous student’s Monte Carlo analysis.

* Proposal:

  * Use RO Crates (lightweight, machine-readable JSON-LD descriptions) for:
    * Capturing metrics datasets with metadata (anonymised where necessary).
    * Supporting reproducibility and easier sharing within the group and potentially externally.

  * Links to green metadata work we did with Loïc Lannelongue at CW25 (carbon usage reporting).

  * Motivation:
    - Captured and shared once (with Luis): the manual approach was fine.
    - Sharing and explaining a second time (with Rosie): it would have been useful to capture things better.
    - If we are likely to share again, some light-touch, automation-friendly changes would make it easier for the next person (another MPhys student or beyond) to pick up what has been captured.

* Next step: Michael to create an example RO Crate for group feedback (decide which information to include/anonymise).
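
To give a feel for what such an example might contain, here is a minimal Python sketch that builds an `ro-crate-metadata.json` document following the RO-Crate 1.1 layout (a metadata descriptor, a root dataset, and one entity per file). It is not the example Michael will produce: the function name, file names, description, date and licence below are placeholders, and which fields to include or anonymise is still to be decided by the group.

```python
import json


def build_crate_metadata(name, description, date_published, license_id, files):
    """Build a minimal RO-Crate 1.1 metadata document as a dict.

    `files` is a list of (path, label, media_type) tuples.
    """
    file_entities = [
        {"@id": path, "@type": "File", "name": label, "encodingFormat": media_type}
        for path, label, media_type in files
    ]
    descriptor = {
        "@id": "ro-crate-metadata.json",
        "@type": "CreativeWork",
        "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
        "about": {"@id": "./"},
    }
    root = {
        "@id": "./",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "datePublished": date_published,
        "license": {"@id": license_id},
        "hasPart": [{"@id": entity["@id"]} for entity in file_entities],
    }
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [descriptor, root, *file_entities],
    }


# Placeholder values throughout; none of these are agreed by the group.
crate = build_crate_metadata(
    name="Example Prometheus metrics extract",
    description="One week of selected node metrics (hostnames anonymised).",
    date_published="2025-07-30",
    license_id="https://creativecommons.org/licenses/by/4.0/",
    files=[("metrics/cpu_by_node.csv", "Summed CPU metrics per node", "text/csv")],
)
crate_json = json.dumps(crate, indent=2)
print(crate_json)
```

Writing `crate_json` to a file named `ro-crate-metadata.json` at the top of the dataset directory is what turns that directory into a crate.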


## Date/time of next meeting

* 13th August 1pm BST, 2pm CERN
* Likely to be around: Emanuele, Caterina, Sudha, Rosie, Sakshi

May need to defer detailed discussions to September (e.g. integrating Emanuele’s Condor exporters; see AOB).


## AOB

* Alessandra suggested reviewing Emanuele’s exporters (e.g. HTCondor) and exploring whether they can be applied in Manchester to better match jobs with machine status.

* Action to revisit this when all data/metrics are available (likely September).

    • 13:00 - 13:20
      Discussion of work with HEPBenchmarks CERN and Glasgow groups (20m)
      Speakers: Alessandra Forti (The University of Manchester (GB)), Caterina Doglioni (The University of Manchester (GB)), Domenico Giordano (CERN), Mr Michael Sparks (The University of Manchester (GB)), Tobias Fitschen (The University of Manchester (GB))
    • 13:20 - 13:40
      Student updates (20m)
      Speaker: Caterina Doglioni (The University of Manchester (GB))
    • 13:40 - 14:00
      Blackett team updates (20m)
      Speakers: Alessandra Forti (The University of Manchester (GB)), Mr Robert Frank (University of Manchester)