MPI: MPICH¶
Overview¶
The PSI Pmodules environment provides several MPI implementations for HPC workloads on PSI clusters. On Cray-based systems (ALPS) the default MPI implementation is typically Cray MPICH, which is tightly integrated with the Cray networking stack.
In addition, Open Source MPICH (OSS MPICH) is provided as an alternative MPI implementation. This can be useful for users who require upstream MPICH behavior, improved portability across clusters, or the ability to experiment with different communication backends.
What is MPICH?¶
MPICH is one of the reference implementations of the MPI (Message Passing Interface) standard. Many vendor MPI implementations are derived from it.
Main characteristics:
- Full support for modern MPI standards
- High portability across architectures
- Modular communication layers
- Hydra process manager for launching MPI jobs
- Support for OFI-based networking via libfabric
MPICH provides standard MPI compiler wrappers such as:
mpicc, mpicxx, mpifort
and runtime tools such as mpiexec.
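A minimal sketch of using the wrappers (file and program names are illustrative; `mpicc -show` is a standard MPICH option that prints the underlying compiler invocation):

```shell
# Compile an MPI C program with the MPICH compiler wrapper
mpicc -O2 -o my_mpi_application my_mpi_application.c

# Inspect which compiler and flags the wrapper actually invokes
mpicc -show
```

The wrappers add the MPI include paths and link flags automatically, so no manual `-I`/`-L`/`-lmpi` options are needed.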
Loading MPICH¶
MPICH modules are available through the PSI module system.
Example:
# GNU toolchain
module purge
module load gcc/<version> mpich/<version>

# or, alternatively, Intel toolchain
module purge
module load intelcc/<version> mpich/<version>
Loading MPICH automatically pulls the required dependencies, such as libfabric.
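To confirm that the module and its dependencies loaded as expected, the following sketch can be used (`mpichversion` is a standard MPICH utility that reports the build's version and configure options):

```shell
module list          # libfabric should appear among the loaded modules
mpichversion         # print the MPICH version and build configuration
which mpicc srun     # verify the wrappers and launcher are on PATH
```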
libfabric Backend¶
MPICH uses libfabric (OFI) as the communication backend.
In the PSI module environment, libfabric is provided as a separate module dependency. This allows users to swap the networking library without rebuilding MPICH.
Example (module and version names illustrative):
module load gcc/<version> mpich/<version>
module swap libfabric libfabric/<other-version>
This feature can be useful for:
- testing newer OFI implementations
- debugging networking issues
- experimenting with different provider implementations
Note that unstable libfabric versions may include experimental features.
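For experimenting with providers, libfabric's standard tooling and environment variables can be used. A sketch (the provider name is illustrative and depends on the cluster's interconnect):

```shell
# List the libfabric providers available on this node
fi_info -l

# Force a specific provider for the next run (name illustrative)
export FI_PROVIDER=verbs

# Enable verbose libfabric logging when debugging networking issues
export FI_LOG_LEVEL=debug

srun ./my_mpi_application
```

`FI_PROVIDER` and `FI_LOG_LEVEL` are standard libfabric environment variables; unsetting them restores the default provider selection.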
Running MPI Jobs with MPICH¶
MPI jobs should normally be launched through Slurm. Example job script:
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks=64
#SBATCH --hint=nomultithread
#SBATCH --ntasks-per-node=32
module load gcc/<version> mpich/<version>
srun ./my_mpi_application
Tip
Although MPICH provides mpiexec through the Hydra process manager, srun is typically preferred on Slurm systems.
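If `mpiexec` is needed, for example to test Hydra-specific options, it can still be run inside a Slurm allocation. A sketch (node and task counts illustrative):

```shell
# Request an interactive allocation, then launch with Hydra instead of srun
salloc --nodes=2 --ntasks=64
mpiexec -n 64 ./my_mpi_application
```

Note that `mpiexec` bypasses Slurm's task binding and accounting for the individual ranks, which is one reason `srun` is the recommended launcher.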