Module zero

ZeRO (Zero Redundancy Optimizer) Implementation for TrustformeRS

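ZeRO is a memory-efficient technique for data-parallel training. Instead of replicating optimizer states, gradients, and parameters on every device, it partitions them across devices so that each device stores only its own shard, which reduces per-device memory usage while maintaining training efficiency.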

Implements three stages:

  • Stage 1: Partition optimizer states
  • Stage 2: Partition optimizer states + gradients
  • Stage 3: Partition optimizer states + gradients + parameters
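To make the difference between the stages concrete, the following is a minimal sketch of the per-device memory needed for model states. It assumes mixed-precision Adam (2 bytes per parameter for fp16 weights, 2 bytes for fp16 gradients, and 12 bytes for the fp32 master weights plus the two Adam moments), which is the accounting commonly used to describe ZeRO. `SketchStage` and `model_state_bytes_per_device` are hypothetical names for this illustration and are not part of this module's API.

```rust
/// Hypothetical stage selector for this sketch; it is NOT the crate's `ZeROStage`.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum SketchStage {
    /// Plain data parallelism: everything is replicated on every device.
    Baseline,
    /// Optimizer states are partitioned.
    Stage1,
    /// Optimizer states and gradients are partitioned.
    Stage2,
    /// Optimizer states, gradients, and parameters are partitioned.
    Stage3,
}

/// Approximate bytes of model state held by one device.
///
/// Assumes mixed-precision Adam; activations and temporary buffers are ignored.
pub fn model_state_bytes_per_device(
    num_params: u64,
    num_devices: u64,
    stage: SketchStage,
) -> u64 {
    assert!(num_devices > 0, "need at least one device");

    const PARAM_BYTES: u64 = 2; // fp16 parameters
    const GRAD_BYTES: u64 = 2; // fp16 gradients
    const OPTIM_BYTES: u64 = 12; // fp32 master weights + Adam momentum + variance

    // Replicated state costs the full amount on every device.
    let full = |bytes: u64| bytes * num_params;
    // Partitioned state is split evenly; round up for the largest shard.
    let sharded = |bytes: u64| (bytes * num_params + num_devices - 1) / num_devices;

    match stage {
        SketchStage::Baseline => full(PARAM_BYTES) + full(GRAD_BYTES) + full(OPTIM_BYTES),
        SketchStage::Stage1 => full(PARAM_BYTES) + full(GRAD_BYTES) + sharded(OPTIM_BYTES),
        SketchStage::Stage2 => full(PARAM_BYTES) + sharded(GRAD_BYTES) + sharded(OPTIM_BYTES),
        SketchStage::Stage3 => {
            sharded(PARAM_BYTES) + sharded(GRAD_BYTES) + sharded(OPTIM_BYTES)
        }
    }
}
```

Under these assumptions, a model with 1,000,000 parameters on 4 devices needs 16,000,000 bytes per device without partitioning, 7,000,000 at Stage 1, 5,500,000 at Stage 2, and 4,000,000 at Stage 3.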

Re-exports§

pub use zero_optimizer::ZeROConfig;
pub use zero_optimizer::ZeROOptimizer;
pub use zero_optimizer::ZeROStage;
pub use zero_stage1::ZeROStage1;
pub use zero_stage2::ZeROStage2;
pub use zero_stage3::ZeROStage3;
pub use zero_utils::all_gather_gradients;
pub use zero_utils::gather_parameters;
pub use zero_utils::partition_gradients;
pub use zero_utils::partition_parameters;
pub use zero_utils::reduce_scatter_gradients;
pub use zero_utils::GradientBuffer;
pub use zero_utils::ParameterGroup;
pub use zero_utils::ParameterPartition;
pub use zero_utils::ZeROState;
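
The names `partition_gradients` and `reduce_scatter_gradients` suggest the usual ZeRO communication pattern, but their signatures are not shown on this page, so the sketch below does not call them. It only illustrates the underlying idea in a single process: each rank owns one contiguous slice of a flat buffer, and a reduce-scatter leaves each rank with the summed gradients for its own slice. `shard_range` and `simulate_reduce_scatter` are hypothetical helpers written for this example.

```rust
/// Hypothetical helper: the contiguous index range of a flat buffer owned by `rank`.
///
/// Shard sizes differ by at most one element when `len` is not divisible
/// by `world_size`.
pub fn shard_range(len: usize, world_size: usize, rank: usize) -> std::ops::Range<usize> {
    assert!(world_size > 0, "need at least one rank");
    assert!(rank < world_size, "rank out of range");

    let base = len / world_size;
    let rem = len % world_size;
    // The first `rem` ranks each take one extra element.
    let start = rank * base + rank.min(rem);
    let end = start + base + usize::from(rank < rem);
    start..end
}

/// Hypothetical single-process simulation of a sum reduce-scatter.
///
/// `per_rank_grads[r]` is the full local gradient computed by rank `r`.
/// The result holds, for each rank, the elementwise sum over all ranks
/// restricted to that rank's shard.
pub fn simulate_reduce_scatter(per_rank_grads: &[Vec<f32>]) -> Vec<Vec<f32>> {
    let world_size = per_rank_grads.len();
    assert!(world_size > 0, "need at least one rank");
    let len = per_rank_grads[0].len();
    assert!(
        per_rank_grads.iter().all(|g| g.len() == len),
        "all ranks must hold gradients of the same length"
    );

    (0..world_size)
        .map(|rank| {
            shard_range(len, world_size, rank)
                .map(|i| per_rank_grads.iter().map(|g| g[i]).sum::<f32>())
                .collect()
        })
        .collect()
}
```

With two ranks holding `[1.0, 2.0, 3.0, 4.0]` and `[10.0, 20.0, 30.0, 40.0]`, rank 0 ends up with `[11.0, 22.0]` and rank 1 with `[33.0, 44.0]`. A real implementation would perform this with a collective over a process group rather than in local memory.
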

Modules§

zero_optimizer
Main ZeRO Optimizer Implementation
zero_stage1
ZeRO Stage 1: Optimizer State Partitioning
zero_stage2
ZeRO Stage 2: Optimizer State + Gradient Partitioning
zero_stage3
ZeRO Stage 3: Full Parameter Partitioning
zero_utils
Utility functions and data structures for ZeRO optimization

Structs§

ZeROMemoryStats
Memory statistics for ZeRO optimization

Enums§

ZeROImplementationStage
ZeRO optimization stages