MPI: OpenMPI¶
Introduction¶
This document outlines the supported OpenMPI versions in the Merlin7 cluster.
OpenMPI supported versions¶
The Merlin7 cluster supports OpenMPI versions across three distinct stages: stable, unstable, and deprecated. Below is an overview of each stage:
* Stable: Versions in the stable stage are fully functional, thoroughly tested, and officially supported by the Merlin administrators. These versions are available via PModules and Spack, ensuring compatibility and reliability for production use.
* Unstable: Versions in the unstable stage are available for testing and early access to new OpenMPI features. While these versions can be used, their compilation and configuration are subject to change before they are promoted to the stable stage. Administrators recommend caution when relying on unstable versions for critical workloads.
* Deprecated: Versions in the deprecated stage are no longer supported by the Merlin administrators. Typically, these include versions no longer supported by the official OpenMPI project, or problematic releases. While deprecated versions may still be available for use, their functionality cannot be guaranteed, and they will not receive updates or bug fixes.
Warning
Merlin7 runs as a PSI vCluster on the HPE Cray EX-based ALPS system. The native software environment on ALPS is generally centered around Cray MPICH and other system-integrated components. However, many PSI applications depend on OpenMPI, so we also provide and support OpenMPI builds on Merlin7.
Our goal is to keep these OpenMPI installations as independent as practical from site-specific system components, so that the same software stack remains usable for longer and requires fewer rebuilds. In practice, however, the full stack still involves several low-level components and runtime dependencies. As a result, occasional rebuilds with newer OpenMPI releases may still be necessary, for example to address compatibility issues, broken dependencies, or performance problems.
At present, we recommend using OpenMPI 5.0.8 or newer on Merlin7, built against libfabric/2.2.0-oss or later.
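As a minimal sketch, loading a recommended OpenMPI build via PModules might look like the following (the exact module names and version strings are illustrative; check the output of the listing command for what is actually installed):

```shell
# List the OpenMPI versions currently provided by PModules
module avail openmpi

# Load a stable OpenMPI build (version string is illustrative)
module load openmpi/5.0.8

# Verify that the compiler wrappers are now in the PATH
which mpicc
mpicc --version
```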
Running OpenMPI on Merlin7¶
In OpenMPI versions prior to 5.0.x, using srun for direct task launches was faster than mpirun.
Although this is no longer the case, srun remains the recommended method due to its simplicity and ease of use.
Key benefits of srun:
* Automatically handles task binding to cores.
* In general, requires less configuration compared to mpirun.
* Best suited for most users; mpirun is recommended only for advanced MPI configurations.
* Note, however, that mpirun has a much faster MPI initialization, which can be beneficial for short runs.
For any module-related issues, please contact the Merlin7 administrators.
Example Usage:
Tip
Always run OpenMPI applications with srun for a seamless experience.
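A minimal batch script launching an OpenMPI application with srun could look like the sketch below (the application name, module version, and resource counts are illustrative; adjust them to your job):

```shell
#!/bin/bash
#SBATCH --ntasks=8
#SBATCH --ntasks-per-node=4
#SBATCH --time=00:10:00

# Load the OpenMPI environment (module name is illustrative)
module load openmpi/5.0.8

# srun handles task placement and core binding automatically
srun ./my_mpi_app
```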
CXI vs LINKx provider¶
Open MPI on Merlin7 is built with libfabric support. By default, libfabric uses the CXI (Cassini) provider on Slingshot-based networks, which is the standard behavior on Merlin7. However, the default CXI provider does not always deliver the best performance for applications with significant inter-node communication.
To address this, a new provider plugin designed for Open MPI, LINKx, has been built on Merlin7 using
the -oss libfabric releases. This provider is still considered a preview feature and is therefore not enabled by default,
but it is strongly recommended.
In addition, some environment variables must be configured explicitly, in particular FI_LNX_PROV_LINKS, whose value
depends on the node type and on the resources assigned to the job. For this reason, if you want to use the LINKx
provider, you should set:
# CPU nodes
export FI_PROVIDER="lnx"
export FI_LNX_PROV_LINKS="shm+cxi:cxi0"
# GPU nodes
export FI_PROVIDER="lnx"
export FI_LNX_PROV_LINKS="shm+cxi:cxi0|shm+cxi:cxi1|shm+cxi:cxi2|shm+cxi:cxi3"
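To confirm that the LINKx provider is actually visible to your job environment before launching, you can query libfabric with the fi_info utility (assuming it is available in your PATH), for example on a CPU node:

```shell
# Select the LINKx provider (CPU-node example from above)
export FI_PROVIDER="lnx"
export FI_LNX_PROV_LINKS="shm+cxi:cxi0"

# List matching providers; the lnx provider should appear in the output
fi_info -p lnx
```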
PMIx Support in Merlin7¶
Merlin7's SLURM installation includes support for multiple PMI types, including pmix. To view the available options, use the following command:
🔥 [caubet_m@login001:~]# srun --mpi=list
MPI plugin types are...
none
pmix
pmi2
cray_shasta
specific pmix plugin versions available: pmix_v5,pmix_v4,pmix_v3,pmix_v2
Important Notes:
* For OpenMPI, always use pmix by specifying the appropriate version (pmix_$version).
When loading an OpenMPI module (via PModules or Spack), the corresponding PMIx version will be automatically loaded.
* Users do not need to manually manage PMIx compatibility, unless they compile their own OpenMPI release.
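For instance, launching an OpenMPI application with an explicit PMIx plugin might look like this (the application name and rank count are illustrative):

```shell
# Launch 4 MPI ranks using a specific PMIx plugin version
srun --mpi=pmix_v5 -n 4 ./my_mpi_app

# Or use the generic pmix type, which resolves to the default version
srun --mpi=pmix -n 4 ./my_mpi_app
```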
Warning
PMI-2 is not supported in OpenMPI 5.0.0 or later releases. Despite this, pmi2 remains the default SLURM PMI type in Merlin7 as it is the officially supported type and maintains compatibility with other MPI implementations.