EOFS 2026 Workshop on Open Source Parallel Filesystems

Name: EOFS 2026 Workshop on Open Source Parallel Filesystems
Start: 2026-03-12T12:40:00+01:00
End: 2026-03-13T13:00:00+01:00
Location: Maison des Mines et des Ponts et Chaussées

Europe/Paris timezone

lustre-db: Scalable Metadata Analytics for Large-Scale Lustre Filesystems

13 Mar 2026, 09:25

25m

Maison des Mines et des Ponts et Chaussées

270 Rue Saint-Jacques, 75005 Paris

Presentation Session I: Operational Experiences and Aspects Session C

Janos Zimmermann (German Climate Computing Center)

We operate a 120 PiB Lustre filesystem at DKRZ with billions of inodes. At the same time, climate and Earth system workflows are increasingly moving toward chunked, object-style formats such as Zarr. While this shift enables scalable and cloud-aligned data access patterns, it also dramatically increases inode counts. As a result, traditional namespace traversals become slow, resource-intensive, and difficult to run continuously at scale.

We present lustre-db, a lightweight and scalable metadata analytics framework designed to persist and query the current state as well as the historical evolution of our Lustre filesystem. The system incrementally captures inode-level metadata changes and stores them in a columnar database (DuckDB), enabling efficient SQL-based analytics across billions of records.
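The idea of persisting both current state and history can be illustrated with an append-only table of metadata change events, where the state at any point in time is the newest event per inode up to that time. The sketch below is not the lustre-db data model: the table name, columns, FID strings, and helper functions are hypothetical, and it uses Python's built-in sqlite3 as a stand-in for DuckDB so that it runs without extra dependencies. The SQL is deliberately plain and should need little or no change to run on DuckDB.

```python
import sqlite3

# Hypothetical schema: one row per observed metadata change (append-only).
SCHEMA = """
CREATE TABLE inode_events (
    fid         TEXT    NOT NULL,  -- file identifier (illustrative values only)
    path        TEXT    NOT NULL,
    owner       TEXT    NOT NULL,
    size_bytes  INTEGER NOT NULL,
    observed_at INTEGER NOT NULL,  -- Unix timestamp of the observation
    deleted     INTEGER NOT NULL DEFAULT 0
)
"""

# State as of :ts = newest event per fid at or before :ts, minus deletions.
# Assumes at most one event per fid per timestamp.
STATE_AS_OF = """
SELECT e.owner, COUNT(*) AS inodes, SUM(e.size_bytes) AS bytes
FROM inode_events AS e
JOIN (
    SELECT fid, MAX(observed_at) AS latest
    FROM inode_events
    WHERE observed_at <= :ts
    GROUP BY fid
) AS m ON m.fid = e.fid AND m.latest = e.observed_at
WHERE e.deleted = 0
GROUP BY e.owner
ORDER BY e.owner
"""

# Made-up sample events: (fid, path, owner, size_bytes, observed_at, deleted)
EVENTS = [
    ("0x1", "/work/a/0.0", "alice", 100, 10, 0),
    ("0x2", "/work/a/0.1", "alice", 200, 10, 0),
    ("0x3", "/work/b/x", "bob", 50, 20, 0),
    ("0x1", "/work/a/0.0", "alice", 150, 30, 0),  # file grew
    ("0x2", "/work/a/0.1", "alice", 0, 40, 1),    # file was deleted
]


def open_db():
    """Create an in-memory database with the event table."""
    con = sqlite3.connect(":memory:")
    con.execute(SCHEMA)
    return con


def ingest(con, events):
    """Append a batch of change events; existing rows are never updated."""
    con.executemany(
        "INSERT INTO inode_events "
        "(fid, path, owner, size_bytes, observed_at, deleted) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        events,
    )


def usage_by_owner(con, as_of):
    """Return (owner, inode count, total bytes) as of the given timestamp."""
    return con.execute(STATE_AS_OF, {"ts": as_of}).fetchall()
```

Calling `usage_by_owner(con, 25)` after ingesting `EVENTS` reports the state before the later growth and deletion, while a later timestamp such as 50 reflects both changes. Because rows are only appended, the same table answers current-state and historical queries without a namespace traversal.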

This talk introduces the architecture, data model, ingestion strategy, and performance characteristics of lustre-db in production, along with practical lessons learned from operating it at large scale.

9_Zimmermann_lustre-db EOFS_2026.pdf
