FerroMPI
Safe, generic Rust bindings for MPI 4.x with persistent collectives support.
FerroMPI provides safe, generic Rust bindings to MPI through a thin C wrapper layer, enabling access to MPI 4.0+ features like persistent collectives that are not available in other Rust MPI bindings. All communication operations are generic over MpiDatatype, supporting f32, f64, i32, i64, u8, u32, and u64.
Features
- MPI 4.0+ support: Persistent collectives, large-count operations
- Lightweight: Minimal C wrapper (~2400 lines), focused API
- Safe: Rust-idiomatic API with proper error handling and RAII
- Flexible: Works with MPICH, OpenMPI, Intel MPI, and Cray MPI
- Fast: Zero-cost abstractions, direct FFI calls
- Generic: Type-safe API for all supported MPI datatypes
- Thread-safe: `Communicator` is `Send + Sync` for hybrid MPI+threads programs
- Shared memory: RMA windows with RAII lock guards (feature: `rma`)
- SLURM integration: Job topology helpers (feature: `numa`)
Why FerroMPI?
| Feature | FerroMPI | rsmpi |
|---|---|---|
| MPI Version | 4.1 | 3.1 |
| Persistent Collectives | ✅ | ❌ |
| Large Count (>2³¹) | ✅ | ❌ |
| Generic API | ✅ | ✅ |
| Shared Memory Windows | ✅ | ❌ |
| Thread Safety | `Send + Sync` | `!Send` |
| API Style | Minimal, focused | Comprehensive |
| C Wrapper | ~2400 lines | None (direct bindings) |
FerroMPI is ideal for:
- Iterative algorithms benefiting from persistent collectives (10-30% speedup)
- Applications with large data transfers (>2GB)
- Hybrid MPI+threads programs (OpenMP, Rayon, `std::thread`)
- Intra-node shared memory communication
- Users who want a simple, focused MPI API
Supported Types
All communication operations are generic over MpiDatatype:
| Rust Type | MPI Equivalent |
|---|---|
| `f32` | `MPI_FLOAT` |
| `f64` | `MPI_DOUBLE` |
| `i32` | `MPI_INT32_T` |
| `i64` | `MPI_INT64_T` |
| `u8` | `MPI_UINT8_T` |
| `u32` | `MPI_UINT32_T` |
| `u64` | `MPI_UINT64_T` |
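As an illustration of how a trait can tie Rust element types to the MPI datatypes in the table above, here is a minimal, self-contained sketch. The trait `DatatypeName`, its constant, and the helper `datatype_of` are hypothetical names invented for this example; they are not FerroMPI's actual `MpiDatatype` definition, which this README does not show.

```rust
/// Hypothetical stand-in for a trait like `MpiDatatype` (not the crate's real definition).
trait DatatypeName {
    /// Name of the corresponding MPI datatype, following the table above.
    const MPI_NAME: &'static str;
}

// Implement the trait for each supported element type in one place.
macro_rules! impl_datatype_name {
    ($($ty:ty => $name:literal),* $(,)?) => {
        $(impl DatatypeName for $ty { const MPI_NAME: &'static str = $name; })*
    };
}

impl_datatype_name! {
    f32 => "MPI_FLOAT",
    f64 => "MPI_DOUBLE",
    i32 => "MPI_INT32_T",
    i64 => "MPI_INT64_T",
    u8 => "MPI_UINT8_T",
    u32 => "MPI_UINT32_T",
    u64 => "MPI_UINT64_T",
}

/// A generic function can recover the datatype from the buffer's element type.
fn datatype_of<T: DatatypeName>(_buf: &[T]) -> &'static str {
    T::MPI_NAME
}

fn main() {
    let floats = [1.0f64, 2.0];
    let bytes = [0u8; 4];
    println!("f64 buffer maps to {}", datatype_of(&floats));
    println!("u8 buffer maps to {}", datatype_of(&bytes));
}
```

Because the mapping is resolved at compile time, passing a buffer of an unsupported element type is a type error rather than a runtime failure.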
Feature Flags
| Feature | Description | Dependencies |
|---|---|---|
| `rma` | RMA shared memory window operations | None |
| `numa` | NUMA-aware shared memory windows and SLURM helpers | `rma` |
Enable features in your Cargo.toml:
[dependencies]
ferrompi = { version = "0.2", features = ["rma"] }
Quick Start
Installation
Add to your Cargo.toml:
[dependencies]
ferrompi = "0.2"
Requirements
- Rust 1.74+
- MPICH 4.0+ (recommended) or OpenMPI 5.0+
Ubuntu/Debian:
macOS:
Hello World
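use ferrompi::Mpi;
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mpi = Mpi::init()?;
    let world = mpi.world();
    println!("Hello from rank {}", world.rank());
    Ok(())
}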
Examples
Blocking Collectives
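use ferrompi::{Mpi, ReduceOp};
let mpi = Mpi::init()?;
let world = mpi.world();
// Broadcast (generic: works with f64, i32, u8, etc.)
let mut data = vec![0.0f64; 100];
if world.rank() == 0 {
    data.fill(42.0);
}
world.broadcast(&mut data, 0)?;
// All-reduce
let send = vec![1.0f64; 100];
let mut recv = vec![0.0f64; 100];
world.allreduce(&send, &mut recv, ReduceOp::Sum)?;
// Gather
let my_data = vec![world.rank() as f64];
let mut gathered = vec![0.0f64; world.size() as usize];
world.gather(&my_data, &mut gathered, 0)?;
// Works with integers too!
let mut int_data = vec![0i32; 50];
world.broadcast(&mut int_data, 0)?;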
Nonblocking Collectives
use ferrompi::{Mpi, ReduceOp};
let mpi = Mpi::init()?;
let world = mpi.world();
let send = vec![1.0f64; 1000];
let mut recv = vec![0.0f64; 1000];
// Start nonblocking operation
let request = world.iallreduce(&send, &mut recv, ReduceOp::Sum)?;
// Do other work while communication proceeds...
expensive_computation();
// Wait for completion
request.wait()?;
// recv now contains the result
Persistent Collectives (MPI 4.0+)
use ferrompi::Mpi;
let mpi = Mpi::init()?;
let world = mpi.world();
// Buffer used for all iterations
let mut data = vec![0.0f64; 1000];
// Initialize ONCE
let mut persistent = world.bcast_init(&mut data, 0)?;
// Use MANY times: this amortizes the setup cost
for _iter in 0..10000 {
    persistent.start()?;
    persistent.wait()?;
}
// Cleanup on drop
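The lifecycle behind a persistent operation (set up once, then start and wait repeatedly) can be modeled without MPI. The sketch below is a toy state machine; `ToyPersistent` and its methods are hypothetical names for illustration only and say nothing about how `PersistentRequest` is implemented. It follows the MPI rules that starting an already active request is erroneous, while waiting on an inactive one returns immediately.

```rust
/// Toy model of a persistent request's lifecycle (no MPI involved).
struct ToyPersistent {
    active: bool,
    starts: u32,
}

impl ToyPersistent {
    /// Stands in for the one-time setup performed by an `*_init` call.
    fn init() -> Self {
        ToyPersistent { active: false, starts: 0 }
    }

    /// Begin one round; starting an already active request is an error.
    fn start(&mut self) -> Result<(), &'static str> {
        if self.active {
            return Err("request is already active");
        }
        self.active = true;
        self.starts += 1;
        Ok(())
    }

    /// Complete the current round; the request becomes inactive and reusable.
    /// Waiting on an inactive request is a no-op, as in MPI.
    fn wait(&mut self) {
        self.active = false;
    }
}

fn main() {
    let mut op = ToyPersistent::init(); // setup cost paid once
    for _ in 0..5 {
        op.start().expect("request was inactive");
        // ... computation could overlap with communication here ...
        op.wait();
    }
    println!("rounds started: {}", op.starts);
}
```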
Point-to-Point Communication
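use ferrompi::Mpi;
let mpi = Mpi::init()?;
let world = mpi.world();
if world.rank() == 0 {
    let data = vec![1.0f64, 2.0, 3.0];
    world.send(&data, 1, 0)?;
} else if world.rank() == 1 {
    let mut buf = vec![0.0f64; 3];
    let (source, tag, count) = world.recv(&mut buf, 0, 0)?;
}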
Available Examples
Run examples with mpiexec:
# Core examples
# Communicator management
# Scan and variable-length collectives
# Shared memory (requires --features rma)
# Hybrid MPI+threads
| Example | Description | Feature |
|---|---|---|
| `hello_world` | Basic MPI initialization and rank/size query | None |
| `ring` | Point-to-point ring communication pattern | None |
| `allreduce` | Blocking and nonblocking allreduce | None |
| `nonblocking` | Nonblocking collective operations | None |
| `persistent_bcast` | Persistent broadcast (MPI 4.0+) | None |
| `pi_monte_carlo` | Monte Carlo Pi estimation with reduce | None |
| `comm_split` | Communicator splitting and management | None |
| `scan` | Prefix scan and exclusive scan operations | None |
| `gatherv` | Variable-length gather (gatherv) | None |
| `shared_memory` | Shared memory windows with RAII lock guards | `rma` |
| `hybrid_openmp` | Hybrid MPI + threads with thread-level init | None |
API Reference
Core Types
| Type | Description |
|---|---|
| `Mpi` | MPI environment handle (init/finalize) |
| `Communicator` | MPI communicator wrapper |
| `Request` | Nonblocking operation handle |
| `PersistentRequest` | Persistent operation handle (MPI 4.0+) |
| `MpiDatatype` | Trait for types usable in MPI ops |
| `Status` | Message status (source, tag, count) |
| `Info` | MPI_Info object with RAII |
| `SharedWindow<T>` | Shared memory window (feature: `rma`) |
| `LockGuard` | RAII window lock (feature: `rma`) |
| `LockAllGuard` | RAII window lock-all (feature: `rma`) |
Collective Operations
| Operation | Blocking | Nonblocking | Persistent |
|---|---|---|---|
| Broadcast | `broadcast` | `ibroadcast` | `bcast_init` |
| Reduce | `reduce` | `ireduce` | `reduce_init` |
| Allreduce | `allreduce` | `iallreduce` | `allreduce_init` |
| Gather | `gather` | `igather` | `gather_init` |
| Allgather | `allgather` | `iallgather` | `allgather_init` |
| Scatter | `scatter` | `iscatter` | `scatter_init` |
| Alltoall | `alltoall` | `ialltoall` | `alltoall_init` |
| Scan | `scan` | `iscan` | `scan_init` |
| Exscan | `exscan` | `iexscan` | `exscan_init` |
| Reduce-scatter-block | `reduce_scatter_block` | `ireduce_scatter_block` | `reduce_scatter_block_init` |
| Barrier | `barrier` | `ibarrier` | N/A |
Additional scalar and in-place variants:
| Variant | Description |
|---|---|
| `reduce_scalar` | Reduce a single value (returns scalar on root) |
| `reduce_inplace` | In-place reduce (root's buffer is both send/recv) |
| `allreduce_scalar` | Allreduce a single value (returns scalar) |
| `allreduce_inplace` | In-place allreduce |
| `allreduce_init_inplace` | Persistent in-place allreduce |
| `scan_scalar` | Prefix scan on a single value |
| `exscan_scalar` | Exclusive prefix scan on a single value |
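The difference between an inclusive scan and an exclusive scan is easiest to see on concrete numbers. The sketch below simulates both with a sum over one value per rank, in plain Rust with no MPI calls; `simulate_scan` and `simulate_exscan` are hypothetical helper names, not FerroMPI functions. In MPI the exclusive-scan result on rank 0 is undefined, which is modeled here as `None`.

```rust
/// Inclusive prefix sum: entry `r` is what rank `r` would receive from a sum scan.
fn simulate_scan(per_rank: &[i64]) -> Vec<i64> {
    let mut running: i64 = 0;
    per_rank
        .iter()
        .map(|v| {
            running += v;
            running
        })
        .collect()
}

/// Exclusive prefix sum: rank `r` receives the sum over ranks `0..r`.
/// MPI leaves rank 0's result undefined, modeled here as `None`.
fn simulate_exscan(per_rank: &[i64]) -> Vec<Option<i64>> {
    let mut running: i64 = 0;
    per_rank
        .iter()
        .enumerate()
        .map(|(rank, v)| {
            let before = running;
            running += v;
            if rank == 0 { None } else { Some(before) }
        })
        .collect()
}

fn main() {
    let contributions = [1, 2, 3, 4]; // one value per rank
    println!("scan:   {:?}", simulate_scan(&contributions));
    println!("exscan: {:?}", simulate_exscan(&contributions));
}
```

An exclusive scan over per-rank element counts is a common way for each rank to compute its starting offset in a global array.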
Variable-length (V-variant) collectives:
| Operation | Blocking | Nonblocking | Persistent |
|---|---|---|---|
| Gatherv | `gatherv` | `igatherv` | `gatherv_init` |
| Scatterv | `scatterv` | `iscatterv` | `scatterv_init` |
| Allgatherv | `allgatherv` | `iallgatherv` | `allgatherv_init` |
| Alltoallv | `alltoallv` | `ialltoallv` | `alltoallv_init` |
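In MPI, the V-variant collectives describe each rank's contribution with a count and a displacement into the shared buffer. This README does not show the parameter lists FerroMPI uses for them, so the following is only a plain-Rust model of the bookkeeping; `displacements` and `simulate_gatherv` are hypothetical helpers, not crate functions.

```rust
/// Exclusive prefix sum of counts: where each rank's block starts in the receive buffer.
fn displacements(counts: &[i32]) -> Vec<i32> {
    let mut offset: i32 = 0;
    counts
        .iter()
        .map(|&c| {
            let start = offset;
            offset += c;
            start
        })
        .collect()
}

/// Model of what the root ends up with after a gatherv:
/// the variable-length contributions laid out back to back.
fn simulate_gatherv(contributions: &[Vec<f64>]) -> (Vec<f64>, Vec<i32>, Vec<i32>) {
    let counts: Vec<i32> = contributions.iter().map(|c| c.len() as i32).collect();
    let displs = displacements(&counts);
    let total: i32 = counts.iter().sum();
    let mut recv = vec![0.0; total as usize];
    for (rank, data) in contributions.iter().enumerate() {
        let start = displs[rank] as usize;
        recv[start..start + data.len()].copy_from_slice(data);
    }
    (recv, counts, displs)
}

fn main() {
    // Rank 0 sends two values, rank 1 sends none, rank 2 sends three.
    let per_rank = vec![vec![1.0, 2.0], vec![], vec![3.0, 4.0, 5.0]];
    let (recv, counts, displs) = simulate_gatherv(&per_rank);
    println!("counts = {counts:?}, displs = {displs:?}, recv = {recv:?}");
}
```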
Point-to-Point Operations
| Operation | Description |
|---|---|
| `send` | Blocking send |
| `recv` | Blocking receive (returns source, tag, count) |
| `isend` | Nonblocking send (returns `Request`) |
| `irecv` | Nonblocking receive (returns `Request`) |
| `sendrecv` | Simultaneous send and receive |
| `probe` | Blocking probe (returns `Status`) |
| `iprobe` | Nonblocking probe (returns `Option<Status>`) |
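A ring exchange, as in the `ring` example, has each rank send to its successor and receive from its predecessor, which is a natural fit for a combined send-and-receive. The neighbor arithmetic is independent of any MPI binding and can be sketched on its own; `ring_neighbors` is a hypothetical helper, not part of FerroMPI.

```rust
/// Neighbors of `rank` in a ring of `size` processes, as (previous, next).
/// Adding `size` before the modulo keeps the result non-negative for rank 0.
fn ring_neighbors(rank: i32, size: i32) -> (i32, i32) {
    let prev = (rank + size - 1) % size;
    let next = (rank + 1) % size;
    (prev, next)
}

fn main() {
    let size = 4;
    for rank in 0..size {
        let (prev, next) = ring_neighbors(rank, size);
        println!("rank {rank}: receive from {prev}, send to {next}");
    }
}
```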
Reduction Operations
Thread Safety
Communicator is Send + Sync, enabling hybrid MPI + threads programs where MPI handles inter-node communication and threads (via std::thread, Rayon, or OpenMP) handle intra-node parallelism.
The thread-safety guarantee depends on the level requested at initialization:
| Thread Level | Who can call MPI | Use case |
|---|---|---|
| `Single` | Main thread only | Pure MPI, no threads |
| `Funneled` | Main thread only | Threads compute, main calls MPI |
| `Serialized` | Any thread | User serializes MPI calls |
| `Multiple` | Any thread | Full concurrent MPI access |
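use ferrompi::{Mpi, ReduceOp, ThreadLevel};
// Request funneled support for hybrid MPI + threads
let mpi = Mpi::init_thread(ThreadLevel::Funneled)?;
assert!(mpi.thread_level() >= ThreadLevel::Funneled);
let world = mpi.world();
// Worker threads compute locally, main thread calls MPI
let local = 42.0_f64;
let global = world.allreduce_scalar(local, ReduceOp::Sum)?;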
See examples/hybrid_openmp.rs for a complete hybrid MPI + threads pattern.
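MPI may grant a different thread level than the one requested, so hybrid programs typically compare the two before spawning threads that touch MPI. The sketch below models the four levels as an ordered enum; `Level` and both helper functions are hypothetical and only mirror the table above, not FerroMPI's own types.

```rust
/// Hypothetical mirror of the four MPI thread-support levels, weakest to strongest.
/// Deriving `Ord` orders the variants by declaration order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Level {
    Single,
    Funneled,
    Serialized,
    Multiple,
}

/// True when the level granted by MPI is at least the level the program needs.
fn is_sufficient(provided: Level, required: Level) -> bool {
    provided >= required
}

/// Per the table above: may a thread other than the main thread issue MPI calls?
fn any_thread_may_call(level: Level) -> bool {
    level >= Level::Serialized
}

fn main() {
    let required = Level::Funneled;
    for provided in [Level::Single, Level::Funneled, Level::Serialized, Level::Multiple] {
        println!(
            "{provided:?}: sufficient for {required:?}? {}; any thread may call MPI? {}",
            is_sufficient(provided, required),
            any_thread_may_call(provided)
        );
    }
}
```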
SLURM Configuration
The numa feature flag enables the slurm module with helpers for reading SLURM job topology at runtime. These functions return None when not running under SLURM.
[dependencies]
ferrompi = { version = "0.2", features = ["numa"] }
| Function | SLURM Variable | Description |
|---|---|---|
| `is_slurm_job()` | `SLURM_JOB_ID` | Check if running under SLURM |
| `job_id()` | `SLURM_JOB_ID` | Unique job identifier |
| `local_rank()` | `SLURM_LOCALID` | Task ID relative to this node |
| `local_size()` | `SLURM_NTASKS_PER_NODE` | Number of tasks on this node |
| `num_nodes()` | `SLURM_NNODES` | Total number of allocated nodes |
| `cpus_per_task()` | `SLURM_CPUS_PER_TASK` | CPUs allocated per task |
| `node_name()` | `SLURM_NODENAME` | Name of this compute node |
| `node_list()` | `SLURM_NODELIST` | Compact list of allocated nodes |
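A helper that returns `None` outside of SLURM can be built from nothing more than an environment lookup and a parse. The following is a rough standard-library sketch of that idea, not the crate's `slurm` module; `parse_var` and `local_rank_from` are hypothetical names, and the lookup is passed in as a closure so the logic can be exercised without modifying the process environment.

```rust
use std::env;

/// Look up `name` through the supplied function and parse it as a number.
/// A missing or malformed variable yields `None`.
fn parse_var<F>(lookup: F, name: &str) -> Option<u32>
where
    F: Fn(&str) -> Option<String>,
{
    lookup(name)?.trim().parse().ok()
}

/// Hypothetical analogue of a `local_rank()` helper backed by `SLURM_LOCALID`.
fn local_rank_from<F>(lookup: F) -> Option<u32>
where
    F: Fn(&str) -> Option<String>,
{
    parse_var(lookup, "SLURM_LOCALID")
}

fn main() {
    // Real environment lookup: yields `None` when the variable is not set.
    let from_env = |name: &str| env::var(name).ok();
    match local_rank_from(from_env) {
        Some(rank) => println!("local rank on this node: {rank}"),
        None => println!("SLURM_LOCALID not set; probably not a SLURM job"),
    }
}
```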
Example SLURM batch script for hybrid MPI + threads:
#!/bin/bash
#SBATCH --ntasks-per-node=4 # MPI ranks per node
#SBATCH --cpus-per-task=8 # threads per rank
#SBATCH --bind-to core
RMA / Shared Memory Windows
The rma feature flag enables SharedWindow<T>, a safe wrapper around MPI_Win_allocate_shared with RAII lifecycle management. Shared memory windows allow processes on the same node to directly access each other's memory without message passing.
[dependencies]
ferrompi = { version = "0.2", features = ["rma"] }
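use ferrompi::{Mpi, SharedWindow};
let mpi = Mpi::init()?;
let world = mpi.world();
let node = world.split_shared()?;
// Each process allocates 100 f64s in shared memory
let mut win = SharedWindow::<f64>::allocate(&node, 100)?;
// Write to local portion
for (i, x) in win.local_slice_mut().iter_mut().enumerate() {
    *x = i as f64;
}
// Fence synchronization: all processes participate
win.fence()?;
// Read from any rank's memory (zero-copy!)
let remote = win.remote_slice(0)?;
println!("Rank 0's first value: {}", remote[0]);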
Synchronization modes:
- Active target (`fence`): Bulk-synchronous, all processes participate
- Passive target (`lock`/`lock_all`): Fine-grained one-sided access with RAII guards
See examples/shared_memory.rs for a complete shared memory example.
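The value of an RAII guard for passive-target access is that the unlock cannot be forgotten: it runs when the guard goes out of scope, including on early returns. The toy below shows the pattern with standard-library types only; `ToyWindow` and `ToyLockGuard` are hypothetical and do not reflect how `LockGuard` is implemented.

```rust
use std::cell::Cell;

/// Toy window that only tracks whether it is locked (no MPI involved).
struct ToyWindow {
    locked: Cell<bool>,
}

/// Guard whose `Drop` releases the lock, so an early return or `?` cannot leak it.
struct ToyLockGuard<'a> {
    win: &'a ToyWindow,
}

impl ToyWindow {
    fn new() -> Self {
        ToyWindow { locked: Cell::new(false) }
    }

    /// Acquire the lock and hand back a guard; fails if the lock is already held.
    fn lock(&self) -> Result<ToyLockGuard<'_>, &'static str> {
        // `replace` returns the previous state: `true` means it was already locked.
        if self.locked.replace(true) {
            return Err("window is already locked");
        }
        Ok(ToyLockGuard { win: self })
    }

    fn is_locked(&self) -> bool {
        self.locked.get()
    }
}

impl Drop for ToyLockGuard<'_> {
    fn drop(&mut self) {
        self.win.locked.set(false);
    }
}

fn main() {
    let win = ToyWindow::new();
    {
        let _guard = win.lock().expect("first lock succeeds");
        println!("inside scope, locked: {}", win.is_locked());
    } // `_guard` is dropped here, which releases the lock
    println!("after scope, locked: {}", win.is_locked());
}
```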
Running Tests
# Unit tests (no MPI required)
# MPI integration tests (requires mpiexec)
MPI_NP=8
# Build and run individual examples
Configuration
Environment Variables
| Variable | Description | Example |
|---|---|---|
| `MPI_PKG_CONFIG` | pkg-config name | `mpich`, `ompi` |
| `MPICC` | MPI compiler wrapper | `/opt/mpich/bin/mpicc` |
| `CRAY_MPICH_DIR` | Cray MPI installation | `/opt/cray/pe/mpich/8.1.25` |
Build Configuration
FerroMPI automatically detects MPI installations via:
1. `MPI_PKG_CONFIG` environment variable
2. pkg-config (`mpich`, `ompi`, `mpi`)
3. `mpicc -show` output
4. `CRAY_MPICH_DIR` (for Cray systems)
5. Common installation paths
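A detection chain like the one listed above amounts to trying strategies in priority order and stopping at the first hit. The build script itself is not shown in this README, so the sketch below only illustrates that control flow; `Probe`, `detect`, and `from_env_var` are hypothetical names, and only the environment-variable probe is filled in.

```rust
use std::env;

/// One detection strategy: a label plus a function that may find an installation.
struct Probe {
    name: &'static str,
    run: fn() -> Option<String>,
}

/// Try each probe in order and return the first result with the probe's label.
fn detect(probes: &[Probe]) -> Option<(&'static str, String)> {
    probes
        .iter()
        .find_map(|p| (p.run)().map(|found| (p.name, found)))
}

/// Highest-priority probe: an explicit override through the environment.
fn from_env_var() -> Option<String> {
    env::var("MPI_PKG_CONFIG").ok()
}

fn main() {
    // Further probes (pkg-config, `mpicc -show`, ...) would follow in priority order.
    let probes = [Probe { name: "MPI_PKG_CONFIG", run: from_env_var }];
    match detect(&probes) {
        Some((how, what)) => println!("found MPI via {how}: {what}"),
        None => println!("no probe succeeded"),
    }
}
```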
Troubleshooting
"Could not find MPI installation"
# Check if MPI is installed
# Set pkg-config name explicitly
"Persistent collectives not available"
Persistent collectives require MPI 4.0+. Check your MPI version:
# MPICH Version: 4.2.0 (OK)
# Open MPI 5.0.0 (OK)
# MPICH Version: 3.4.2 (too old)
macOS linking issues
Architecture
┌─────────────────────────┐
│   Rust Application      │
├─────────────────────────┤
│  ferrompi (Safe Rust)   │
├─────────────────────────┤
│  ffi.rs (bindings)      │
├─────────────────────────┤
│  ferrompi.c (C layer)   │ ← ~2400 lines
├─────────────────────────┤
│   MPICH / OpenMPI       │
└─────────────────────────┘
The C layer provides:
- Handle tables for MPI opaque objects (256 comms, 16384 requests, 256 windows, 64 infos)
- Automatic large-count operation selection
- Request management
- Graceful degradation for MPI <4.0
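Two of the ideas in this list, fixed-size handle tables and large-count selection, can be sketched briefly. The real layer is written in C and is not reproduced here; the Rust below is an assumption-laden illustration, with `HandleTable` and `needs_large_count` as hypothetical names. It relies only on the fact that classic MPI counts are C `int` values, so a buffer longer than `i32::MAX` elements needs the large-count interface.

```rust
/// Fixed-capacity table mapping small integer handles to stored objects.
struct HandleTable<T> {
    slots: Vec<Option<T>>,
}

impl<T> HandleTable<T> {
    fn with_capacity(capacity: usize) -> Self {
        HandleTable { slots: (0..capacity).map(|_| None).collect() }
    }

    /// Store `value` in the first free slot; `None` means the table is full.
    fn insert(&mut self, value: T) -> Option<usize> {
        let idx = self.slots.iter().position(|s| s.is_none())?;
        self.slots[idx] = Some(value);
        Some(idx)
    }

    /// Look up a handle; out-of-range or freed handles yield `None`.
    fn get(&self, handle: usize) -> Option<&T> {
        self.slots.get(handle)?.as_ref()
    }

    /// Free a slot so that its handle can be reused.
    fn remove(&mut self, handle: usize) -> Option<T> {
        self.slots.get_mut(handle)?.take()
    }
}

/// True when an element count does not fit in a C `int`.
fn needs_large_count(len: usize) -> bool {
    len > i32::MAX as usize
}

fn main() {
    let mut comms: HandleTable<&str> = HandleTable::with_capacity(256);
    let world = comms.insert("world").expect("table has room");
    println!("handle {world} -> {:?}", comms.get(world));
    println!("1000 elements need large count: {}", needs_large_count(1000));
}
```

Handing out small integers instead of raw MPI handles keeps the FFI boundary simple, at the price of a hard cap on live objects such as the limits listed above.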
License
Licensed under either of:
- MIT license (LICENSE-MIT)
- Apache License, Version 2.0 (LICENSE-APACHE)
at your option.
Contributing
Contributions welcome! Please ensure:
- All examples pass with `mpiexec -n 4`
- New features include tests and documentation
- Code follows Rust style guidelines (`cargo fmt`, `cargo clippy`)
Acknowledgments
FerroMPI was inspired by:
- rsmpi - Comprehensive MPI bindings for Rust
- The MPI Forum for the excellent MPI 4.0 specification