EOFS 2026 Workshop on Open Source Parallel Filesystems

Europe/Paris
Maison des Mines et des Ponts et Chaussées

270 Rue Saint-Jacques, 75005 Paris
Frank Baetke (EOFS), Jean-Thomas Acquaviva, Tiziano Müller (European Open File System SCE)
Description

Following the “EOFS workshops on open source parallel filesystems” at DKRZ in Hamburg in 2022, at TU Dresden in 2024, and at the University of Mainz in 2025, participants suggested continuing in 2026 in a different European country while keeping the established format of two half-days and a joint dinner for all participants.

Fortunately, two EOFS members, DDN and HPE, have agreed to provide the infrastructure for the next workshop, which will take place on March 12 and March 13, including a joint dinner on the evening of the 12th. The venue will be in downtown Paris, easily accessible by public transport.

The focus will again be on all open source (parallel) file systems and will cover not only products currently in operation such as BeeGFS, CephFS, DAOS, and Lustre, but also experimental file systems such as GekkoFS and other ad hoc file systems. In addition to aspects of operation in the data center, computer science curricula will also be covered with regard to parallel file systems, I/O architectures, and operating systems.

The contributions (short presentations of 20 minutes) should thus address at least one of the following topics:

  • Status and development trends for BeeGFS, DAOS, CephFS and Lustre, as well as experimental architectures.
  • Experiences, development requests, and aspects of operation in the data center.
  • Problems and suggestions regarding computer science education in the field of parallel file systems, I/O architectures, and operating systems.

The conference language is English. The aim of the workshop is to exchange positive and negative experiences and concepts, assess development trends, formulate requests to the development teams, and expand personal contacts.

  • Thursday 12 March
    • 12:40 13:00
      Arrival 20m
    • 13:00 13:15
      EOFS Workshop 2026 – Welcome, objectives, and agenda 15m
      Speakers: Frank Baetke (EOFS), Jean-Thomas Acquaviva, Tiziano Müller (European Open File System SCE)
    • 13:15 15:35
      Session A
      • 13:15
        GekkoFS - Updates 25m

        The presentation will show the latest updates in GekkoFS: architecture, new backends, hardware support, and an exploration of new features developed under the umbrella of the Barcelona Zettascale Lab.

        • Syscall Intercept
        • LibC intercept
        • FUSE backend
        • Small file, data distribution
        • What worked and what did not.
        Speaker: Ramon Nou Castell (Barcelona Supercomputing Center)
      • 13:40
        Examining GekkoFS for use at TU Dresden HPC 25m

        Ad hoc filesystems across node-local SSDs can help alleviate the I/O bottleneck imposed by an HDD-backed shared filesystem, aiding efficient compute resource utilization. In an attempt to reduce strain on the global storage and make better use of node-local storage, GekkoFS is under examination for TU Dresden's HPC systems. I highlight our considerations for an ad hoc filesystem as well as the issues we encountered deploying GekkoFS.

        Speaker: Malte Christian Kuns (TU Dresden)
      • 14:05
        DAOS Update 25m

        This presentation will provide an update on activities in the DAOS Foundation including the upcoming DAOS 2.8 release, first experiences of deploying DAOS on 400Gbps fabrics (IB-NDR, Slingshot 400, OPA 400, RoCE), a preview of the HPE Cray Supercomputing Storage K3000 product based on DAOS, and other notable research and development topics from the DAOS community.

        Speaker: Michael Hennecke (HPE)
      • 14:30
        BeeGFS Auto-Tiering for Data-Heavy Environments 25m

        Data-heavy workloads demand flexible and efficient storage strategies. In this session, the BeeGFS VP of Engineering will present BeeGFS auto-tiering capabilities, demonstrating how data can be dynamically placed across different storage tiers, including internal BeeGFS systems and external S3-based storage, to balance performance, scalability, and cost.

        Speaker: Mr Philipp Falk
      • 14:55
        Discussion 40m
    • 15:35 16:05
      Break 30m
    • 16:05 18:15
      Session B
      • 16:05
        Lustre: Status and Path Forward 25m

        Lustre is the leading open-source and open-development file system for HPC. Around two thirds of the top 100 supercomputers use Lustre. It is a community-developed technology with contributors from around the world. Lustre currently supports many HPC infrastructures beyond scientific research, such as financial services, energy, manufacturing, and life sciences, and in recent years has been leveraged by cloud solutions to bring its performance benefits to a variety of new use cases (particularly relating to AI). This talk will reflect on the current state of the Lustre ecosystem and will also cover the latest news on Lustre community releases (LTS releases and major releases), the roadmap, and details of features under development.

        Speaker: Sebastien Buisson (DDN/Whamcloud)
      • 16:30
        Status of Nodemap, Erasure Coding, and Trashcan features for Lustre 25m

        This talk will present the current status of Lustre development, upcoming features, and roadmap. This will include topics such as:
        - the Lustre nodemap feature that was significantly extended recently;
        - the Erasure Coding effort status and next steps with Immediate Write Mirroring;
        - the Lustre Trashcan/undelete feature; and
        - the Lustre quota aggregation feature.

        Speaker: Dr Marc Vef (DDN/Whamcloud)
      • 16:55
        Training engineers at ensIIE for HPC and related domains 25m

        In this presentation, we will discuss the HPC training program at ensIIE (https://www.ensiie.fr/), a renowned engineering school specialising in computer science and located in Evry (France).

        With a "Chaire d'Enseignement" (teaching chair) supported in partnership with the CEA, the CIDM program trains engineers in the fields of high-performance computing, system administration, and the deployment of scientific applications on specific software and hardware environments.

        We will discuss:
        - the layout of the training programs (engineering degree, "Mastère Spécialisé") in HPC and the profile of their students;
        - the program track, including topics like filesystems, as well as the associated teaching resources; and
        - the future evolution of the program, which will lead into a discussion with the audience about their expectations of such a training program.

        Speaker: Valentin DELIS (ensIIE)
      • 17:20
        Discussion 40m
    • 19:30 23:00
      Dinner 3h 30m
  • Friday 13 March
    • 09:00 10:20
      Session C
      • 09:00
        RobinHood: storing and querying a filesystem's metadata for fast access 25m

        As supercomputers become faster, their data output grows as well. Since regularly accessed data must be stored and quickly available to users, it is important to put it on fast storage systems. However, these tend to have a low capacity, meaning we must be able to choose which data should remain on those types of storage systems and which can be placed on slower but higher-capacity systems. As such, it is important to be able to accurately know the state of a filesystem at any point, but using the conventional means provided by the operating system for this, for instance filesystem traversals, can be time-consuming if done regularly. Moreover, these operations impose a heavy load on the filesystem, making it slower. To counter these problems, we created a suite of tools called RobinHood that aims to mirror a filesystem in a database and use the latter to define policies that manage data placement according to usage.

        Speaker: Yoann Valeri (CEA)
      • 09:25
        lustre-db: Scalable Metadata Analytics for Large-Scale Lustre Filesystems 25m

        We operate a 120 PiB Lustre filesystem at DKRZ with billions of inodes. At the same time, climate and Earth system workflows are increasingly moving toward chunked, object-style formats such as Zarr. While this shift enables scalable and cloud-aligned data access patterns, it also dramatically increases inode counts. As a result, traditional namespace traversals become slow, resource-intensive, and difficult to run continuously at scale.

        We present lustre-db, a lightweight and scalable metadata analytics framework designed to persist and query the current state as well as the historical evolution of our Lustre filesystem. The system incrementally captures inode-level metadata changes and stores them in a columnar database (DuckDB), enabling efficient SQL-based analytics across billions of records.

        This talk introduces the architecture, data model, ingestion strategy, and performance characteristics of lustre-db in production, along with practical lessons learned from operating it at large scale.

        Speaker: Janos Zimmermann (German Climate Computing Center)
      • 09:50
        Discussion 30m
    • 10:20 10:50
      Break 30m
    • 10:50 12:10
      Session D
      • 10:50
        Opportunities and Challenges of HPC Storage for EDF 25m

        EDF has been using HPC for many years in its R&D and production activities. After a brief introduction to the challenges of HPC, we will provide an overview of our storage systems for computing and long-term storage. We will then present the difficulties encountered with our current configurations. We will conclude this presentation with new opportunities related to uncertainty calculations, big data analytics, resilience requirements, and more.

        Speaker: Cyril BAUDRY (EDF)
      • 11:15
        Phobos: A Flexible, Open-Source Tape Storage System for HPC and Beyond 25m

        In the era of exascale computing and data-intensive workflows, efficient tiered storage architectures are essential to balance performance, capacity, and cost. While parallel file systems like Lustre, BeeGFS, and DAOS excel at handling high-throughput I/O, the seamless integration of high-capacity, long-term storage solutions such as tape libraries remains a major challenge for long-term data retention and cost-effective archiving.

        This talk presents Phobos, an open-source storage system developed by CEA, specifically designed to address these challenges by providing a highly efficient, scalable, and vendor-neutral solution for managing tape-based archives and large robotic libraries. Built on the Linear Tape File System (LTFS), an open, standardized format, Phobos ensures interoperability, long-term data preservation, and independence from proprietary formats and software, making it a cornerstone for data sovereignty. Phobos offers multiple front-ends, including Lustre/HSM, S3, and iRODS, enabling seamless integration in diverse HPC, cloud, and data management environments.

        Speaker: Thomas LEIBOVICI (CEA)
      • 11:40
        Discussion 30m
    • 12:10 12:30
      Closing Remarks 20m
      Speakers: Frank Baetke (EOFS), Jean-Thomas Acquaviva, Tiziano Müller (European Open File System SCE)