Distributed Training - Data Parallelism and Model Parallelism
This module provides utilities for distributed training across multiple nodes or processes.
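At its core, data parallelism replicates the model on every worker, feeds each replica a different shard of the global batch, and keeps the replicas in sync by applying the same averaged gradient everywhere. A minimal in-process sketch of one such step, using a toy linear model rather than this crate's API (all names here are illustrative, not part of the module):

```rust
// One simulated data-parallel step: a linear model y = w * x with squared-error
// loss, and two "workers" each computing a gradient on its own shard.
fn grad_on_shard(w: f32, shard: &[(f32, f32)]) -> f32 {
    // d/dw of mean squared error 0.5 * (w*x - y)^2 over the shard.
    shard.iter().map(|&(x, y)| (w * x - y) * x).sum::<f32>() / shard.len() as f32
}

fn main() {
    let w = 0.0_f32;
    let lr = 0.1_f32;
    // Global batch split into per-worker shards (true relationship: y = 2x).
    let shards = [
        vec![(1.0, 2.0), (2.0, 4.0)],
        vec![(3.0, 6.0), (4.0, 8.0)],
    ];
    // Each worker computes its local gradient...
    let grads: Vec<f32> = shards.iter().map(|s| grad_on_shard(w, s)).collect();
    // ...then every worker applies the same averaged gradient, so replicas stay identical.
    let avg = grads.iter().sum::<f32>() / grads.len() as f32;
    let w = w - lr * avg;
    println!("updated weight: {w}"); // 1.5, moving toward the true value 2.0
}
```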
Structs§
- DataParallelTrainer - Data parallel trainer
- DistributedConfig - Distributed trainer configuration
- DistributedDataLoader - Distributed data loader
- DistributedOptimizer - Distributed optimizer wrapper
- GradientCompression - Gradient compression for communication efficiency (see the top-k sketch after this list)
- RingAllReduce - Ring All-Reduce implementation (see the simulation after this list)
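Ring All-Reduce sums gradients bandwidth-optimally: each of the n ranks exchanges 1/n-sized chunks with its ring neighbor through a reduce-scatter phase followed by an all-gather phase, so each rank sends roughly 2(n-1)/n times the gradient size regardless of n. The single-process simulation below follows the standard ring schedule; it is a conceptual sketch, not the `RingAllReduce` struct's actual interface:

```rust
/// Single-process simulation of ring all-reduce (sum) over n ranks.
/// `bufs[r]` is rank r's local gradient; all share one length divisible by n.
fn ring_all_reduce(bufs: &mut [Vec<f32>]) {
    let n = bufs.len();
    assert!(n >= 1);
    assert_eq!(bufs[0].len() % n, 0, "pad gradients so the length divides evenly");
    let chunk = bufs[0].len() / n;

    // Phase 1: reduce-scatter. In step s, rank r sends chunk (r - s) mod n to
    // rank (r + 1) mod n, which accumulates it. After n-1 steps, rank r owns
    // the fully summed chunk (r + 1) mod n.
    for s in 0..n - 1 {
        let snap: Vec<Vec<f32>> = bufs.to_vec(); // stands in for the network
        for r in 0..n {
            let c = (r + n - s) % n;
            let dst = (r + 1) % n;
            for i in c * chunk..(c + 1) * chunk {
                bufs[dst][i] += snap[r][i];
            }
        }
    }

    // Phase 2: all-gather. In step s, rank r forwards chunk (r + 1 - s) mod n;
    // the receiver overwrites. After n-1 steps every rank holds every summed chunk.
    for s in 0..n - 1 {
        let snap: Vec<Vec<f32>> = bufs.to_vec();
        for r in 0..n {
            let c = (r + 1 + n - s) % n;
            let dst = (r + 1) % n;
            for i in c * chunk..(c + 1) * chunk {
                bufs[dst][i] = snap[r][i];
            }
        }
    }
}

fn main() {
    // Three "ranks", each with a 6-element gradient.
    let mut bufs = vec![vec![1.0; 6], vec![2.0; 6], vec![3.0; 6]];
    ring_all_reduce(&mut bufs);
    // Every rank now holds the elementwise sum: all entries equal 6.0.
    assert!(bufs.iter().all(|b| b.iter().all(|&x| (x - 6.0).abs() < 1e-6)));
    println!("{:?}", bufs[0]);
}
```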
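Gradient compression cuts communication volume by transmitting a lossy summary of each gradient instead of the dense vector. One common method is top-k sparsification, which sends only the k largest-magnitude entries as (index, value) pairs; a sketch under that assumption (the crate's actual compression methods may differ):

```rust
/// Top-k sparsification: keep only the k largest-magnitude gradient entries.
fn compress_top_k(grad: &[f32], k: usize) -> Vec<(usize, f32)> {
    let mut indexed: Vec<(usize, f32)> = grad.iter().copied().enumerate().collect();
    // Sort by descending absolute value, then keep the first k pairs.
    indexed.sort_by(|a, b| b.1.abs().partial_cmp(&a.1.abs()).unwrap());
    indexed.truncate(k);
    indexed
}

/// Reconstruct a dense gradient, with dropped entries treated as zero.
fn decompress_top_k(pairs: &[(usize, f32)], len: usize) -> Vec<f32> {
    let mut dense = vec![0.0; len];
    for &(i, v) in pairs {
        dense[i] = v;
    }
    dense
}

fn main() {
    let grad = vec![0.1, -3.0, 0.02, 2.5, -0.4];
    let sparse = compress_top_k(&grad, 2); // [(1, -3.0), (3, 2.5)]
    let restored = decompress_top_k(&sparse, grad.len());
    assert_eq!(restored, vec![0.0, -3.0, 0.0, 2.5, 0.0]);
}
```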
Enums§
- CommunicationBackend - Communication backend for distributed training
- CompressionMethod
- DistributedStrategy - Distributed training strategy
- GradientAggregation - Gradient aggregation method (hypothetical variant shapes are sketched after this list)
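The variant sets below are hypothetical stand-ins chosen to show how enums like these are typically consumed; the crate's real definitions almost certainly differ. The example dispatches on a guessed `GradientAggregation`:

```rust
// HYPOTHETICAL variant sets for illustration only; not the crate's actual enums.
#[allow(dead_code)]
enum CommunicationBackend { Tcp, SharedMemory }
#[allow(dead_code)]
enum CompressionMethod { None, TopK { k: usize }, Quantize { bits: u8 } }
#[allow(dead_code)]
enum DistributedStrategy { DataParallel, ModelParallel }
#[derive(Clone, Copy)]
enum GradientAggregation { Sum, Mean }

// Dispatch on the aggregation method: sum the per-worker gradients,
// then optionally divide by the worker count.
fn aggregate(grads: &[Vec<f32>], how: GradientAggregation) -> Vec<f32> {
    let mut out = vec![0.0f32; grads[0].len()];
    for g in grads {
        for (o, x) in out.iter_mut().zip(g) {
            *o += *x;
        }
    }
    if let GradientAggregation::Mean = how {
        let n = grads.len() as f32;
        for o in &mut out {
            *o /= n;
        }
    }
    out
}

fn main() {
    let grads = vec![vec![1.0, 2.0], vec![3.0, 4.0]];
    assert_eq!(aggregate(&grads, GradientAggregation::Sum), vec![4.0, 6.0]);
    assert_eq!(aggregate(&grads, GradientAggregation::Mean), vec![2.0, 3.0]);
}
```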