Reproducible Machine Learning Workflows for Scientists Workshop Pilot 2025

US/Central
1145 (Discovery Building)

330 N. Orchard St Madison, WI 53715
Christopher Endemann (University of Wisconsin-Madison), Matthew Feickert (University of Wisconsin Madison (US)), Ryan Bemowski (University of Wisconsin-Madison)
Description

Workshop Pilot

Scientific researchers need reproducible software environments for complex applications that can run across heterogeneous computing platforms. Modern open source tools like Pixi provide automatic reproducibility for all dependencies while offering a high-level interface well suited to researchers.

This workshop will provide a practical introduction to using Pixi to easily create scientific and AI/ML environments that benefit from hardware acceleration, across multiple machines and platforms. The focus will be on applications using Python machine learning libraries with CUDA enabled, as well as deploying these environments to production settings in Linux container images. This workshop will not teach machine learning concepts, but will focus on the methodologies and tools to make existing machine learning workflows reproducible.

This workshop is a pilot for an upcoming national-level workshop.


Participant Information


No prior experience with these tools and technologies is required to participate in the workshop, though prior experience is welcome. Participants who have no experience with machine learning but are interested in general hardware acceleration for computing are also encouraged to register (the workshop focuses on making existing hardware-accelerated workflows reproducible, not on machine learning techniques). Basic experience with file systems and programming in Python (or a similar language) is expected. The workshop is free of charge. To participate, please register for the workshop.

This workshop is a pilot for an upcoming national-level workshop at UW-Madison that will cover the same materials. Attending this pilot does not prevent you from attending the national-level workshop as well.

Workshop resources and references

Materials and resources

References

Participants who feel they would benefit from reviewing some of the foundational material of the workshop might consider the following lessons from The Carpentries:


While no machine learning or deep learning background is assumed for this workshop, participants may also wish to explore these Carpentries resources in the future:

Instructor team

  • Matthew Feickert (UW-Madison, Data Science Institute)
  • Chris Endemann (UW-Madison, Data Science Hub)
  • Ryan Bemowski (UW-Madison, Data Science Hub)

Acknowledgements

This workshop is supported by the US Research Software Sustainability Institute (URSSI) via grant G-2022-19347 from the Sloan Foundation.

    • 09:00-12:30
      Day 1: Workshop Day 1
      Convener: Matthew Feickert (University of Wisconsin Madison (US))
    • 09:00-12:30
      Day 2: Workshop Day 2
      Convener: Matthew Feickert (University of Wisconsin Madison (US))