New RESIF 3 Framework
Since 2014, the scientific software stack on the ULHPC facility has been generated and deployed in an automated and consistent way through the RESIF framework (Revolutionary EB-based Software Installation Framework), a wrapper on top of EasyBuild and Lmod meant to efficiently handle user software generation. The main objectives of this project were to fully automate software builds and to support all available toolchains and software sets through a clean hierarchical modules layout that facilitates usage and gives users an intuitive interface. I also wanted to facilitate the reproducible and self-contained deployment of the complete software stack, coupled with a strong versioning policy between environments and (typically) yearly release cycles.
- The first version was the result of a master project I proposed to Maxime Schmitt in 2014-2015. It was used to produce the following ULHPC software environments:
  - 2013-2015 software set (377 software packages)
  - 2015-2017 software set (133 software packages)
- A large code refactoring (bringing RESIF 2) was performed in 2017 to better handle different software sets and roles across multiple clusters, all piloted through a dedicated control repository. Sarah Peter and Valentin Plugaru mainly took care of the updates at this level, producing the following ULHPC software environments:
  - 2017-2018 software set (165 software packages)
  - 2018-2019 software set (210 software packages)
  - 2019a software set (239 software packages)
Yet after these last three environment releases, the limitations induced by RESIF 2 were clear, and the corresponding workflow proved to be quite complex and hard to maintain. Furthermore, the broken compliance with mainstream EasyBuild developments led to an explosion of custom configurations.
With the advent of the new Aion supercomputer featuring a different CPU architecture (AMD Epyc instead of Intel Broadwell/Skylake), and to mitigate the identified limitations, I wanted to completely rethink the framework.
This resulted in a complete code refactoring, the RESIF 3.0 framework, presented in [1] at the ACM PEARC’21 conference on July 22, 2021.
- S. Varrette, E. Kieffer, F. Pinel, E. Krishnasamy, S. Peter, H. Cartiaux, and X. Besseron, “RESIF 3.0: Toward a Flexible & Automated Management of User Software Environment on HPC facility,” in ACM Practice and Experience in Advanced Research Computing (PEARC’21), Virtual Event, 2021.
OpenAccess ACM PEARC'21 Article
Validated against the 2019b and 2020a toolchains with the ULHPC team, it currently provides the User Software Environment on ULHPC systems. The ULHPC software modules are structured according to the organization depicted below, through module bundles (i.e., using the Bundle easyblock, or the Toolchain one, which is derived from the Bundle one) for the ULHPC environment.
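To make the notion of a module bundle more concrete, here is a minimal sketch of what an easyconfig relying on the generic Bundle easyblock can look like. Easyconfigs use Python syntax, so the sketch is a small Python helper that renders one as text. The bundle name `ULHPC-demo`, the placeholder homepage, the description and the dependency list are hypothetical and only serve the illustration; they are not taken from the actual ULHPC bundles.

```python
def render_bundle_easyconfig(name, version, dependencies, description):
    """Return the text of a minimal easyconfig based on the generic Bundle easyblock.

    `dependencies` is a list of (software_name, software_version) pairs.
    """
    dep_lines = "".join(f"    ({dep!r}, {ver!r}),\n" for dep, ver in dependencies)
    return (
        "easyblock = 'Bundle'\n\n"
        f"name = {name!r}\n"
        f"version = {version!r}\n\n"
        "homepage = 'https://example.org'  # placeholder URL (assumption)\n"
        f"description = {description!r}\n\n"
        "# SYSTEM is a constant that EasyBuild provides when parsing easyconfigs\n"
        "toolchain = SYSTEM\n\n"
        "# loading the bundle module loads all of its dependencies\n"
        "dependencies = [\n"
        f"{dep_lines}"
        "]\n\n"
        "moduleclass = 'system'\n"
    )


# Hypothetical bundle grouping two toolchains of the same release
text = render_bundle_easyconfig(
    name="ULHPC-demo",
    version="2020a",
    dependencies=[("foss", "2020a"), ("intel", "2020a")],
    description="Hypothetical bundle grouping the toolchains of one release",
)
print(text)
```

The printed text could be saved as `ULHPC-demo-2020a.eb` and handed to EasyBuild. A bundle installs no software of its own: the resulting module simply pulls in the modules listed as dependencies.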
The code base is publicly available on GitHub (see ULHPC/sw). It is synchronized from our internal repository piloting the deployment. This tool may thus help other HPC centres to consolidate their own software management stack.