Module zero_utils

Utility functions and data structures for ZeRO optimization

Structs

GradientBuffer
Gradient buffer for ZeRO Stage 2+
ParameterGroup
Parameter group for ZeRO optimization
ParameterPartition
Parameter partition for ZeRO Stage 3
PartitionInfo
Partition information for distributed parameters
ZeROState
ZeRO optimizer state management
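
The stage numbers in these descriptions refer to what ZeRO shards across devices: Stage 1 shards optimizer state, Stage 2 also shards gradients, and Stage 3 also shards the parameters themselves. As a rough illustration of why that matters, the sketch below estimates per-device model-state memory for mixed-precision Adam using the usual accounting from the ZeRO paper (2 bytes per parameter for fp16 weights, 2 for fp16 gradients, 12 for fp32 optimizer state). The function name `model_state_bytes` is hypothetical and is not part of this module; the crate may account for memory differently.

```rust
// Hypothetical helper (not part of zero_utils): approximate per-device bytes
// of model state under mixed-precision Adam for a given ZeRO stage.
// Assumes 2 B/param fp16 weights, 2 B/param fp16 gradients, and 12 B/param
// fp32 optimizer state (master weights, momentum, variance).
fn model_state_bytes(params: f64, world_size: f64, stage: u8) -> f64 {
    let (weights, grads, optim) = (2.0 * params, 2.0 * params, 12.0 * params);
    match stage {
        // No sharding: every device holds everything.
        0 => weights + grads + optim,
        // Stage 1: optimizer state is sharded.
        1 => weights + grads + optim / world_size,
        // Stage 2: gradients are sharded as well.
        2 => weights + (grads + optim) / world_size,
        // Stage 3: parameters are sharded as well.
        3 => (weights + grads + optim) / world_size,
        _ => panic!("ZeRO stage must be 0..=3"),
    }
}
```

For a 7.5-billion-parameter model on 64 devices, this estimate drops from 120 GB per device without sharding to just under 1.9 GB at Stage 3. Activations and temporary buffers are not included.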

Functions

all_gather_gradients
All-gather gradients from all devices
calculate_bucket_size
Calculate optimal bucket size for gradient communication
gather_parameters
Gather parameters from all devices
partition_gradients
Partition gradients across devices for ZeRO Stage 2+
partition_parameters
Partition parameters across devices for ZeRO Stage 3
reduce_scatter_gradients
Reduce-scatter gradients across devices
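
The signatures of these functions are not shown on this page, so the following is only a minimal, single-process sketch of the collective semantics they are named after. It does not call this module. All names in it (`shard_range`, `reduce_scatter_sum`, `all_gather`) are hypothetical, and real devices are simulated with one `Vec<f32>` per rank. A reduce-scatter leaves each rank with the summed values for its own shard only; an all-gather concatenates the shards back into the full buffer.

```rust
// Hypothetical helper (not part of zero_utils): the half-open index range
// [start, end) of a flat buffer of `len` elements owned by `rank`.
fn shard_range(len: usize, world_size: usize, rank: usize) -> (usize, usize) {
    assert!(world_size > 0 && rank < world_size);
    let base = len / world_size;
    let rem = len % world_size;
    // The first `rem` ranks take one extra element each.
    let start = rank * base + rank.min(rem);
    let end = start + base + usize::from(rank < rem);
    (start, end)
}

// Simulated reduce-scatter with a sum reduction. `grads[r]` is the full
// local gradient on rank r. The result holds one shard per rank, each being
// the element-wise sum over all ranks restricted to that rank's range.
fn reduce_scatter_sum(grads: &[Vec<f32>]) -> Vec<Vec<f32>> {
    let world_size = grads.len();
    assert!(world_size > 0);
    let len = grads[0].len();
    assert!(grads.iter().all(|g| g.len() == len));
    (0..world_size)
        .map(|rank| {
            let (start, end) = shard_range(len, world_size, rank);
            (start..end)
                .map(|i| grads.iter().map(|g| g[i]).sum::<f32>())
                .collect()
        })
        .collect()
}

// Simulated all-gather: concatenate the shards in rank order to rebuild
// the full buffer on every rank.
fn all_gather(shards: &[Vec<f32>]) -> Vec<f32> {
    shards.iter().flatten().copied().collect()
}
```

With two ranks holding `[1.0, 2.0, 3.0]` and `[10.0, 20.0, 30.0]`, `reduce_scatter_sum` leaves rank 0 with `[11.0, 22.0]` and rank 1 with `[33.0]`; `all_gather` over those shards returns `[11.0, 22.0, 33.0]`. Giving the remainder to the lowest ranks is one common choice; padding the buffer to a multiple of the world size is another.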