
Crate axonml_distributed


Distributed training for AxonML — data, model, pipeline, and tensor parallelism.

DDP: DistributedDataParallel with gradient bucketing.
FSDP: Fully Sharded Data Parallel (ZeRO-2/ZeRO-3, HybridShard, CPU offload).
Pipeline: GPipe, 1F1B, and interleaved microbatch scheduling.
Collective ops: all-reduce with 5 strategies, broadcast, all-gather, reduce-scatter, gather, scatter, reduce, send/recv, barrier.
ProcessGroup / World: process-group abstraction.
NcclBackend: dynamic libcudart/libnccl loading, multi-node init via NcclUniqueId.
MockBackend: shared-state in-process simulation for deterministic testing.
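To make the collective-op semantics concrete, here is a minimal std-only sketch of what an all-reduce (sum) computes: every rank contributes a local tensor and every rank ends up with the same elementwise reduction. This is an illustration of the operation, not the crate's `all_reduce_sum` API, and the `all_reduce_sum` function below is hypothetical.

```rust
// Conceptual all-reduce (sum): every simulated rank contributes a local
// tensor; the result each rank would observe is the elementwise sum.
fn all_reduce_sum(locals: &[Vec<f32>]) -> Vec<f32> {
    let len = locals[0].len();
    let mut out = vec![0.0f32; len];
    for local in locals {
        for (o, x) in out.iter_mut().zip(local) {
            *o += x;
        }
    }
    out
}

fn main() {
    // Four simulated ranks, each holding a local gradient tensor.
    let ranks = vec![
        vec![1.0, 2.0],
        vec![3.0, 4.0],
        vec![5.0, 6.0],
        vec![7.0, 8.0],
    ];
    let reduced = all_reduce_sum(&ranks);
    // After the all-reduce, every rank holds the same reduced tensor.
    assert_eq!(reduced, vec![16.0, 20.0]);
}
```

Dividing the summed result by the world size gives the mean variant, which is the usual choice for averaging gradients in data-parallel training.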

§File

crates/axonml-distributed/src/lib.rs

§Author

Andrew Jewell Sr. — AutomataNexus LLC (ORCID: 0009-0005-2158-7060)

§Updated

April 14, 2026 11:15 PM EST

§Disclaimer

Use at your own risk. This software is provided “as is”, without warranty of any kind, express or implied. The author and AutomataNexus shall not be held liable for any damages arising from the use of this software.

Re-exports§

pub use backend::Backend;
pub use backend::MockBackend;
pub use backend::ReduceOp;
pub use comm::all_gather;
pub use comm::all_reduce_max;
pub use comm::all_reduce_mean;
pub use comm::all_reduce_min;
pub use comm::all_reduce_product;
pub use comm::all_reduce_sum;
pub use comm::barrier;
pub use comm::broadcast;
pub use comm::broadcast_from;
pub use comm::gather_tensor;
pub use comm::is_main_process;
pub use comm::rank;
pub use comm::reduce_scatter_mean;
pub use comm::reduce_scatter_sum;
pub use comm::scatter_tensor;
pub use comm::sync_gradient;
pub use comm::sync_gradients;
pub use comm::world_size;
pub use ddp::DistributedDataParallel;
pub use ddp::GradSyncStrategy;
pub use ddp::GradientBucket;
pub use ddp::GradientSynchronizer;
pub use fsdp::CPUOffload;
pub use fsdp::ColumnParallelLinear;
pub use fsdp::FSDPMemoryStats;
pub use fsdp::FullyShardedDataParallel;
pub use fsdp::RowParallelLinear;
pub use fsdp::ShardingStrategy;
pub use pipeline::Pipeline;
pub use pipeline::PipelineMemoryStats;
pub use pipeline::PipelineSchedule;
pub use pipeline::PipelineStage;
pub use process_group::ProcessGroup;
pub use process_group::World;
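The `GradientBucket` re-export above refers to the bucketing used by DDP-style synchronizers: gradients are packed into fixed-capacity buckets, and a collective fires as soon as a bucket fills, overlapping communication with the rest of the backward pass. The following is a std-only sketch of that idea; the `GradientBucket` type here is illustrative and does not match the crate's actual struct.

```rust
// Illustrative gradient bucket: collects gradient values and hands back
// the full bucket contents (which a real DDP would all-reduce) once the
// capacity is reached.
struct GradientBucket {
    capacity: usize,
    grads: Vec<f32>,
}

impl GradientBucket {
    fn new(capacity: usize) -> Self {
        Self { capacity, grads: Vec::with_capacity(capacity) }
    }

    /// Push one gradient value; returns the flushed contents when full.
    fn push(&mut self, g: f32) -> Option<Vec<f32>> {
        self.grads.push(g);
        if self.grads.len() == self.capacity {
            Some(std::mem::take(&mut self.grads))
        } else {
            None
        }
    }
}

fn main() {
    let mut bucket = GradientBucket::new(3);
    let mut flushed = Vec::new();
    for g in [0.1, 0.2, 0.3, 0.4] {
        if let Some(full) = bucket.push(g) {
            flushed.push(full); // a real DDP would launch all-reduce here
        }
    }
    assert_eq!(flushed.len(), 1);
    assert_eq!(flushed[0], vec![0.1, 0.2, 0.3]);
    assert_eq!(bucket.grads, vec![0.4]); // remainder awaits a final flush
}
```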

Modules§

backend
Backend - Communication Backend Abstractions
comm
Communication - High-level Communication Utilities
ddp
DDP - Distributed Data Parallel
fsdp
FSDP - Fully Sharded Data Parallel
pipeline
Pipeline Parallelism
prelude
Common imports for distributed training.
process_group
ProcessGroup - Process Group Abstraction
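The pipeline module's schedules order forward and backward passes over microbatches. As a point of reference, a GPipe-style schedule runs all forward passes first and then all backward passes in reverse; 1F1B interleaves them to cut activation memory. Below is a std-only sketch of the GPipe ordering for one stage; the `Phase` enum and `gpipe_schedule` function are hypothetical, not the crate's `PipelineSchedule` API.

```rust
// GPipe-style schedule for a single pipeline stage: forwards for all
// microbatches, then backwards in reverse microbatch order.
#[derive(Debug, PartialEq)]
enum Phase {
    Forward(usize),
    Backward(usize),
}

fn gpipe_schedule(microbatches: usize) -> Vec<Phase> {
    let mut sched = Vec::new();
    for m in 0..microbatches {
        sched.push(Phase::Forward(m));
    }
    for m in (0..microbatches).rev() {
        sched.push(Phase::Backward(m));
    }
    sched
}

fn main() {
    let s = gpipe_schedule(3);
    assert_eq!(
        s,
        vec![
            Phase::Forward(0), Phase::Forward(1), Phase::Forward(2),
            Phase::Backward(2), Phase::Backward(1), Phase::Backward(0),
        ]
    );
}
```

Because all activations for every microbatch are live at the forward/backward boundary, GPipe's peak memory grows with the microbatch count, which is the pressure 1F1B scheduling relieves.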

Type Aliases§

DDP
Type alias for DistributedDataParallel.
FSDP
Type alias for FullyShardedDataParallel.
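The FSDP side of the crate centers on ZeRO-style parameter sharding: each rank stores only its shard of the flat parameters between steps, and an all-gather reconstructs the full tensor just before compute. Here is a std-only sketch of that round trip; the `shard` and `all_gather` helpers are illustrative stand-ins, not the crate's API.

```rust
// ZeRO-3-style sharding sketch: split a flat parameter vector into
// per-rank shards, then rebuild the full tensor with an all-gather.
fn shard(params: &[f32], world_size: usize, rank: usize) -> Vec<f32> {
    let chunk = (params.len() + world_size - 1) / world_size; // ceil div
    params.iter().skip(rank * chunk).take(chunk).copied().collect()
}

fn all_gather(shards: &[Vec<f32>]) -> Vec<f32> {
    shards.iter().flatten().copied().collect()
}

fn main() {
    let params = vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0];
    let world = 3;
    let shards: Vec<Vec<f32>> =
        (0..world).map(|r| shard(&params, world, r)).collect();
    // Between steps, each rank holds only its own shard...
    assert_eq!(shards[1], vec![3.0, 4.0]);
    // ...and an all-gather rebuilds the full parameters for compute.
    assert_eq!(all_gather(&shards), params);
}
```

Per-rank parameter memory drops roughly by a factor of the world size, which is the core trade (extra communication for less memory) behind the ZeRO-2/ZeRO-3 strategies the crate lists.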