Crate ferrompi

§ferrompi

Safe, generic Rust bindings for MPI (Message Passing Interface).

This crate wraps MPI functionality through a thin C layer, providing:

  • Type-safe generic API for all MPI datatypes
  • Blocking, nonblocking, and persistent (MPI 4.0+) collectives
  • Communicator management (split, duplicate)
  • RMA shared memory windows (with rma feature)
  • SLURM environment helpers (with numa feature)
  • Large count support (MPI 4.0+): blocking and nonblocking collectives use the _c variants. Persistent collectives currently reject count > INT_MAX with MPI_ERR_COUNT; full _c dispatch for persistent operations is deferred.
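The large-count rule in the last bullet can be pictured as a small routing decision. The sketch below is illustrative only: `CollectiveKind`, `Dispatch`, and `route` are hypothetical names that are not part of ferrompi, and it assumes a C `int` is 32 bits wide so that INT_MAX equals `i32::MAX`.

```rust
/// Hypothetical stand-in for the three collective families (not a ferrompi type).
#[derive(Debug, Clone, Copy, PartialEq)]
enum CollectiveKind {
    Blocking,
    Nonblocking,
    Persistent,
}

/// Hypothetical outcome of routing a call by element count.
#[derive(Debug, PartialEq)]
enum Dispatch {
    /// Count fits in a C `int`; the classic entry point suffices.
    IntCount,
    /// Count exceeds INT_MAX; an MPI 4.0 `_c` entry point is needed.
    LargeCount,
    /// Count exceeds INT_MAX and no `_c` path is wired up (MPI_ERR_COUNT).
    ErrCount,
}

/// Assumes a 32-bit C `int`, so INT_MAX == i32::MAX.
fn route(kind: CollectiveKind, count: usize) -> Dispatch {
    let fits_in_int = count <= i32::MAX as usize;
    match (fits_in_int, kind) {
        (true, _) => Dispatch::IntCount,
        (false, CollectiveKind::Persistent) => Dispatch::ErrCount,
        (false, _) => Dispatch::LargeCount,
    }
}

fn main() {
    let big = i32::MAX as usize + 1;
    for kind in [
        CollectiveKind::Blocking,
        CollectiveKind::Nonblocking,
        CollectiveKind::Persistent,
    ] {
        println!("{:?} with count {} -> {:?}", kind, big, route(kind, big));
    }
}
```

In practice this means a caller with very large buffers may need to split the work into several persistent operations, or fall back to a blocking or nonblocking collective.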

§Supported Types

All communication operations are generic over MpiDatatype, which is implemented for f32, f64, i32, i64, u8, u32, and u64.
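To show how a closed set of element types can drive a generic API, here is a minimal sketch of the sealed-trait pattern. It is not ferrompi's actual definition: `Datatype`, `Tag`, `describe`, and the macro are hypothetical names chosen for illustration, and only the `Copy + Send + 'static` bound is taken from the thread-safety notes on this page.

```rust
mod sealed {
    /// Private supertrait: code outside this crate cannot implement it,
    /// so the set of `Datatype` implementors stays closed.
    pub trait Sealed {}
}

/// Hypothetical tag identifying the element type on the C side.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Tag {
    F32,
    F64,
    I32,
    I64,
    U8,
    U32,
    U64,
}

/// Hypothetical analogue of a sealed datatype trait.
trait Datatype: sealed::Sealed + Copy + Send + 'static {
    const TAG: Tag;
}

macro_rules! impl_datatype {
    ($($ty:ty => $tag:ident),* $(,)?) => {
        $(
            impl sealed::Sealed for $ty {}
            impl Datatype for $ty {
                const TAG: Tag = Tag::$tag;
            }
        )*
    };
}

impl_datatype!(f32 => F32, f64 => F64, i32 => I32, i64 => I64, u8 => U8, u32 => U32, u64 => U64);

/// A generic operation picks the tag at compile time from the element type.
fn describe<T: Datatype>(buf: &[T]) -> (Tag, usize) {
    (T::TAG, buf.len())
}

fn main() {
    println!("{:?}", describe(&[1.0f64, 2.0]));
    println!("{:?}", describe(&[0u8; 4]));
}
```

Because the tag is an associated constant, a call such as `describe(&[1i32])` needs no runtime type inspection.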

§Quick Start

use ferrompi::{Mpi, ReduceOp};

fn main() -> Result<(), ferrompi::Error> {
    let mpi = Mpi::init()?;
    let world = mpi.world();

    let rank = world.rank();
    let size = world.size();
    println!("Hello from rank {} of {}", rank, size);

    // Generic broadcast — works with any MpiDatatype
    let mut data = vec![0.0f64; 100];
    if rank == 0 {
        data.fill(42.0);
    }
    world.broadcast(&mut data, 0)?;

    // Generic all-reduce
    let sum = world.allreduce_scalar(rank as f64, ReduceOp::Sum)?;
    println!("Rank {rank}: sum of all ranks = {sum}");

    Ok(())
}

§Feature Flags

| Feature | Description | Dependencies |
|---------|-------------|--------------|
| rma | RMA shared memory window operations | |
| numa | NUMA-aware windows and SLURM helpers | rma |

§Capabilities

  • Generic API: All operations work with any MpiDatatype (f32, f64, i32, i64, u8, u32, u64)

  • Blocking collectives: barrier, broadcast, reduce, allreduce, gather, scatter, allgather, alltoall, scan, exscan, reduce_scatter_block, plus V-variants (gatherv, scatterv, allgatherv, alltoallv)

  • Nonblocking collectives: All 15 i-prefixed variants with Request handles

  • Persistent collectives (MPI 4.0+): All 15 _init variants with PersistentRequest handles

  • Scalar and in-place variants: reduce_scalar, allreduce_scalar, reduce_inplace, allreduce_inplace, scan_scalar, exscan_scalar

  • Point-to-point: send, recv, isend, irecv, sendrecv, probe, iprobe

  • Communicator management: split, split_type, split_shared, duplicate

  • Group operations: Group with incl/excl/union/intersection/difference, RankRange for range constructors, GroupComparison.

    Note: Mpi::create_from_group requires MPI 4.0+. Support is probed once and cached; see the function rustdoc for the cache invariant.

  • Custom datatypes: CustomDatatype (contiguous/vector/struct/resized) and StructField for struct-type builders.

  • User-defined reduction operations: UserOp wraps MPI_Op_create with safe closure storage and trampoline.

  • Distributed RMA windows (feature rma): Win<T> with WinFenceAssert, WinPscwAssert, WinLockGuard, and WinLockAllGuard RAII guards.

  • Info objects: Info for runtime hint passing to communicator, window, and operation constructors.

  • Persistent point-to-point: send_init, bsend_init, rsend_init, ssend_init, recv_init methods on Communicator, each returning a PersistentRequest.

  • Shared memory windows (feature rma): SharedWindow<T> with RAII lock guards for NUMA-aware intra-node shared memory (distinct from the distributed Win<T> windows above).

  • SLURM helpers (feature numa): Job topology queries via slurm module

  • Rich error handling: MpiErrorClass categorization with messages from the MPI runtime
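Among the collectives listed above, scan and exscan are easy to confuse. The following sketch simulates both locally with a sum reduction, where each slice element stands for one rank's contribution. It does not call MPI or ferrompi; `inclusive_scan` and `exclusive_scan` are hypothetical helper names, and `None` models the fact that MPI leaves rank 0's exscan result undefined.

```rust
/// Inclusive prefix sum: entry `r` combines the contributions of ranks 0..=r.
fn inclusive_scan(contrib: &[i64]) -> Vec<i64> {
    let mut acc = 0;
    contrib
        .iter()
        .map(|&x| {
            acc += x;
            acc
        })
        .collect()
}

/// Exclusive prefix sum: entry `r` combines the contributions of ranks 0..r.
/// Rank 0 has no predecessors, which is modeled here as `None`.
fn exclusive_scan(contrib: &[i64]) -> Vec<Option<i64>> {
    let mut acc = 0;
    contrib
        .iter()
        .enumerate()
        .map(|(rank, &x)| {
            let out = if rank == 0 { None } else { Some(acc) };
            acc += x;
            out
        })
        .collect()
}

fn main() {
    let contrib = [1, 2, 3, 4]; // one value per simulated rank
    println!("scan:   {:?}", inclusive_scan(&contrib));
    println!("exscan: {:?}", exclusive_scan(&contrib));
}
```

A common use of the exclusive form is computing per-rank offsets into a shared output, where each rank needs the total size of everything before it.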

§Thread Safety

Communicator is Send + Sync to support hybrid MPI + threads programs (e.g., MPI between nodes, std::thread::scope within a node).

The actual thread-safety guarantees depend on the thread level requested at initialization:

| Thread Level | Who can call MPI | Synchronization |
|--------------|------------------|-----------------|
| ThreadLevel::Single | Main thread only | N/A |
| ThreadLevel::Funneled | Main thread only | N/A |
| ThreadLevel::Serialized | Any thread | User must serialize |
| ThreadLevel::Multiple | Any thread | None needed |

use ferrompi::{Mpi, ThreadLevel};

// Request funneled thread support for hybrid MPI + threads
let mpi = Mpi::init_thread(ThreadLevel::Funneled).unwrap();
assert!(mpi.thread_level() >= ThreadLevel::Funneled);

Mpi itself is !Send + !Sync — MPI initialization and finalization must occur on the same thread. Only Communicator handles (and the operations on them) may cross thread boundaries.
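At the Serialized level, any thread may call MPI, but the program must ensure calls never overlap. One way to arrange that is a process-wide lock taken around every call. The sketch below uses only the standard library: `FakeComm` and `send_like` are hypothetical stand-ins for a `Send + Sync` communicator handle and an MPI call, so it demonstrates the locking discipline rather than ferrompi's API.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Mutex;
use std::thread;

/// Hypothetical stand-in for a communicator handle shared across threads.
struct FakeComm {
    calls: AtomicU32,
}

impl FakeComm {
    /// Pretend MPI call; it only counts invocations.
    fn send_like(&self) {
        self.calls.fetch_add(1, Ordering::Relaxed);
    }
}

/// Spawn `threads` scoped workers that each make `calls_per_thread` calls,
/// holding a shared lock so that no two calls overlap.
fn run_serialized(threads: u32, calls_per_thread: u32) -> u32 {
    let comm = FakeComm { calls: AtomicU32::new(0) };
    let mpi_lock = Mutex::new(());

    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for _ in 0..calls_per_thread {
                    // The guard is held for the duration of the call.
                    let _guard = mpi_lock.lock().unwrap();
                    comm.send_like();
                }
            });
        }
    });

    // All scoped threads have been joined at this point.
    comm.calls.load(Ordering::Relaxed)
}

fn main() {
    println!("total calls: {}", run_serialized(4, 100));
}
```

With ThreadLevel::Multiple the lock would be unnecessary; with ThreadLevel::Funneled the worker threads would not call into the communicator at all.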

§Send/Sync Status of Public Types

| Type | Send/Sync | Notes |
|------|-----------|-------|
| Communicator | Send + Sync | Explicit unsafe impl in src/comm/mod.rs; cross-thread use is the primary hybrid MPI use case. |
| Mpi | !Send + !Sync | PhantomData<*const ()> field; init and finalize must occur on the same thread. |
| Group | Send + Sync | Explicit unsafe impl in src/group.rs; handles are opaque integers, MPI-thread-safe under MPI_THREAD_MULTIPLE. |
| Request | Send + Sync | Auto-derived; i64 + bool fields. Cross-thread use requires MPI_THREAD_MULTIPLE. Buffer-lifetime invariant still applies. |
| PersistentRequest | Send + Sync | Auto-derived; same shape as Request. ADR-0004 §“Drop behavior” applies across thread boundaries. |
| Status | Send + Sync | POD wrapper; all fields are Copy. |
| CustomDatatype | Send + Sync | Explicit unsafe impl in src/datatype_builder.rs; handle is an opaque integer. |
| Info | Send + Sync | Auto-derived from i32 + bool fields; MPI info objects are thread-safe under MPI_THREAD_MULTIPLE. |
| UserOp<T> | Send + Sync (for T: MpiDatatype) | Auto-derived: fields are i32 + PhantomData<T>. The trait bound MpiDatatype: Copy + Send + 'static and the fact that all concrete MpiDatatype impls are also Sync give Send + Sync for UserOp<T>. The global closure registry uses internal unsafe impl Send/Sync on its slots; that is a separate object from UserOp<T> itself. |
| Win<T> (feature rma) | !Send + !Sync | NonNull<T> field suppresses auto-traits; RMA window’s local memory pointer is not safe to share across threads. |
| SharedWindow<T> (feature rma) | !Send + !Sync | NonNull<T> field; same rationale as Win<T>. |
| LockGuard<'a, T> (feature rma) | !Send + !Sync | Borrows SharedWindow<T>; inherits non-Send/Sync. |
| LockAllGuard<'a, T> (feature rma) | !Send + !Sync | Borrows SharedWindow<T>; inherits non-Send/Sync. |
| WinLockGuard<'g, 'a, T> (feature rma) | !Send + !Sync | Borrows Win<T>; inherits non-Send/Sync. |
| WinLockAllGuard<'g, 'a, T> (feature rma) | !Send + !Sync | Borrows Win<T>; inherits non-Send/Sync. |
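The table relies on two standard-library facts: a `PhantomData<*const ()>` field or a `NonNull<T>` field removes the automatic Send and Sync implementations, while plain integer and bool fields keep them. The sketch below checks this at compile time. `EnvHandle`, `IntHandle`, and `WindowLike` are hypothetical types that mirror the field shapes described in the table; they are not ferrompi's definitions.

```rust
use std::marker::PhantomData;
use std::ptr::NonNull;

/// Hypothetical environment handle pinned to its creating thread.
/// `*const ()` is neither Send nor Sync, so neither is this struct.
struct EnvHandle {
    _pinned: PhantomData<*const ()>,
}

/// Hypothetical opaque handle; integer and bool fields auto-derive Send + Sync.
struct IntHandle {
    id: i32,
    owned: bool,
}

/// Hypothetical window-like type; `NonNull<T>` is neither Send nor Sync.
struct WindowLike<T> {
    base: NonNull<T>,
}

/// Compiles only when `T` is both Send and Sync.
fn assert_send_sync<T: Send + Sync>() {}

fn main() {
    assert_send_sync::<IntHandle>();
    // Uncommenting either line below produces a compile error:
    // assert_send_sync::<EnvHandle>();
    // assert_send_sync::<WindowLike<f64>>();

    let env = EnvHandle { _pinned: PhantomData };
    let handle = IntHandle { id: 7, owned: true };
    let mut value = 1.5_f64;
    let window = WindowLike { base: NonNull::from(&mut value) };

    let _ = (&env, &window.base);
    println!("handle {} (owned: {}) is Send + Sync", handle.id, handle.owned);
}
```

A type that wraps such a field can still opt back in with an explicit `unsafe impl Send` or `unsafe impl Sync`, which is the route the table describes for Communicator, Group, and CustomDatatype.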

§Hybrid MPI+OpenMP

For hybrid parallelism, use Mpi::init_thread() with the appropriate level:

  • Funneled (recommended): Only the main thread makes MPI calls. OpenMP threads handle computation between MPI calls.
  • Serialized: Any thread can make MPI calls, but only one at a time.
  • Multiple: Full concurrent MPI from any thread (highest overhead).

use ferrompi::{Mpi, ThreadLevel, ReduceOp};

let mpi = Mpi::init_thread(ThreadLevel::Funneled).unwrap();
assert!(mpi.thread_level() >= ThreadLevel::Funneled);

let world = mpi.world();
// Worker threads compute locally, main thread calls MPI
let local = 42.0_f64;
let global = world.allreduce_scalar(local, ReduceOp::Sum).unwrap();
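The Funneled pattern separates threaded computation from the single-threaded MPI call. As a rough sketch using only the standard library, the code below has scoped worker threads compute a local partial result, after which the main thread alone performs the reduction step. `local_sum` and `fake_allreduce_sum` are hypothetical helpers; in a real program the latter would be the `world.allreduce_scalar(local, ReduceOp::Sum)` call shown above.

```rust
use std::thread;

/// Sum `data` using up to `workers` scoped threads. No MPI is involved here.
fn local_sum(data: &[f64], workers: usize) -> f64 {
    let workers = workers.max(1);
    // Ceiling division, clamped to 1 so `chunks` never receives a zero size.
    let chunk = ((data.len() + workers - 1) / workers).max(1);

    thread::scope(|s| {
        let handles: Vec<_> = data
            .chunks(chunk)
            .map(|part| s.spawn(move || part.iter().sum::<f64>()))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).sum::<f64>()
    })
}

/// Hypothetical stand-in for the one MPI call made by the main thread.
/// With a single simulated rank, the global sum equals the local sum.
fn fake_allreduce_sum(local: f64) -> f64 {
    local
}

fn main() {
    let data = vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0];

    // Phase 1: worker threads compute; none of them touches MPI.
    let local = local_sum(&data, 4);

    // Phase 2: only the main thread communicates.
    let global = fake_allreduce_sum(local);
    println!("local = {local}, global = {global}");
}
```

Because the workers are joined before the reduction, the main thread never races with them, which is the property the Funneled level requires.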

§SLURM Configuration

#SBATCH --ntasks-per-node=4        # MPI ranks per node
#SBATCH --cpus-per-task=8          # OpenMP threads per rank
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
srun --cpu-bind=cores ./my_program # Pin MPI ranks to cores

Use the slurm module (with numa feature) to read these values at runtime. See examples/hybrid_openmp.rs for the full pattern.
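For readers who want to see what reading these values involves, here is a minimal sketch that queries the environment directly with the standard library. It does not use the crate's slurm module, whose function names are not shown on this page; `parse_count`, `cpus_per_task`, and `ntasks_per_node` are hypothetical helpers. SLURM_CPUS_PER_TASK and SLURM_NTASKS_PER_NODE are set by Slurm only when the corresponding options are given, so the sketch treats them as optional.

```rust
use std::env;

/// Parse a positive integer from an optional string.
/// Returns `None` when the value is missing, malformed, or zero.
fn parse_count(raw: Option<&str>) -> Option<usize> {
    raw?.trim().parse::<usize>().ok().filter(|&n| n > 0)
}

/// Threads available to this rank; falls back to 1 outside a Slurm job.
fn cpus_per_task() -> usize {
    parse_count(env::var("SLURM_CPUS_PER_TASK").ok().as_deref()).unwrap_or(1)
}

/// Ranks per node, if the job specified them.
fn ntasks_per_node() -> Option<usize> {
    parse_count(env::var("SLURM_NTASKS_PER_NODE").ok().as_deref())
}

fn main() {
    println!("cpus per task: {}", cpus_per_task());
    match ntasks_per_node() {
        Some(n) => println!("tasks per node: {n}"),
        None => println!("tasks per node: not set"),
    }
}
```

Falling back to a single thread when the variable is absent keeps the program usable on a workstation without a scheduler.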

§Extended documentation

Long-form documentation artifacts are embedded in the doc module and render as individual pages in this rustdoc. The same content is available as plain Markdown in the docs/ directory.

| Module | Description |
|--------|-------------|
| doc::architecture | Six-layer stack, handle tables, thread-safety model, FFI/ABI invariants, and generic MpiDatatype design |
| doc::migrating_from_rsmpi | Function-for-function API mapping and migration cookbook from rsmpi |
| doc::mpi_compatibility | Compatibility matrix for MPICH, Open MPI, Intel MPI, and Cray MPI |
| doc::adr_0001_why_c_wrapper | ADR-0001: why a hand-written C wrapper is used instead of bindgen |
| doc::adr_0002_handle_tables | ADR-0002: C11 atomic CAS strategy for the request-table under MPI_THREAD_MULTIPLE |
| doc::adr_0003_generic_mpi_datatype | ADR-0003: sealed MpiDatatype trait family and DatatypeTag ABI contract |
| doc::adr_0004_persistent_collective_approach | ADR-0004: PersistentRequest lifecycle and buffer-lifetime invariants |
| doc::adr_0005_mpi_op_create | ADR-0005: MPI_Op_create closure storage, trampoline safety, and drop ordering |

Modules§

doc
Long-form documentation for ferrompi.

Structs§

Communicator
An MPI communicator.
CustomDatatype
A committed derived MPI datatype backed by the C-side datatype_table.
DoubleInt
Paired { f64 value; i32 index } — maps to MPI_DOUBLE_INT.
FloatInt
Paired { f32 value; i32 index } — maps to MPI_FLOAT_INT.
Group
An MPI group handle.
HostEntry
A single host and its assigned MPI ranks.
Info
An MPI info object for passing hints to MPI operations.
Int2
Paired { i32 value; i32 index } — maps to MPI_2INT.
LongDoubleInt
Paired { long double value; i32 index } — maps to MPI_LONG_DOUBLE_INT.
LongInt
Paired { i64 value; i32 index } — maps to MPI_LONG_INT.
Mpi
MPI environment handle.
PersistentRequest
A persistent MPI request handle.
RankRange
Inclusive rank range [first, last] with positive stride.
Request
A handle to a nonblocking MPI operation.
ShortInt
Paired { i16 value; i32 index } — maps to MPI_SHORT_INT.
Status
Information about a probed or received MPI message.
StructField
One field of a struct-type passed to CustomDatatype::create_struct.
TopologyInfo
MPI topology information gathered across all ranks in a communicator.
UserOp
A user-defined MPI reduction operation backed by a Rust closure.

Enums§

DatatypeTag
Tag values matching C-side FERROMPI_* defines.
Error
Error types for MPI operations.
GroupComparison
Outcome of Group::compare.
MpiErrorClass
MPI error class, categorizing the type of MPI error.
ReduceOp
Reduction operations
SplitType
Split types for Communicator::split_type.
ThreadLevel
MPI thread support levels

Traits§

BytePermutable
Trait for types whose byte representation is valid for use with MPI_BYTE-typed bitwise reductions.
MpiDatatype
Trait for types that can be used in MPI communication operations.
MpiIndexedDatatype
Trait for types that can be used with MPI_MAXLOC / MPI_MINLOC.
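The paired value/index structs exist because MPI_MAXLOC reduces pairs rather than plain numbers: it keeps the largest value and, on ties, the smallest index. The sketch below simulates that rule locally. `ValueIndex` and `maxloc` are hypothetical names, not ferrompi items, and NaN handling is deliberately left out.

```rust
/// Hypothetical value/index pair, similar in shape to an f64 + i32 pairing.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ValueIndex {
    value: f64,
    index: i32,
}

/// MAXLOC combiner: larger value wins; equal values keep the smaller index.
/// NaN inputs are not handled in this sketch.
fn maxloc(a: ValueIndex, b: ValueIndex) -> ValueIndex {
    if a.value > b.value {
        a
    } else if b.value > a.value {
        b
    } else {
        ValueIndex { value: a.value, index: a.index.min(b.index) }
    }
}

fn main() {
    // One contribution per simulated rank; the index is the rank number.
    let contributions = [
        ValueIndex { value: 3.0, index: 0 },
        ValueIndex { value: 7.5, index: 1 },
        ValueIndex { value: 7.5, index: 2 },
        ValueIndex { value: 1.0, index: 3 },
    ];

    let winner = contributions[1..]
        .iter()
        .fold(contributions[0], |acc, &next| maxloc(acc, next));

    println!("max {} first seen at rank {}", winner.value, winner.index);
}
```

MPI_MINLOC follows the same shape with the comparison reversed, still preferring the smaller index on ties.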

Type Aliases§

Result
Result type for MPI operations.