Module memory_efficient_optimizer


Memory-efficient optimizer operations

This module provides memory-efficient optimization for very large models through gradient accumulation, chunked processing, and memory usage estimation.

§Features

  • Gradient accumulation to reduce memory pressure
  • Chunked parameter processing for large models
  • Memory usage estimation and recommendations
  • Streaming gradient computation
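As an illustration of the gradient accumulation idea listed above, the following is a minimal standalone sketch. It does not use this module's `GradientAccumulator`, whose fields and methods are not shown on this page; the `MicroBatchAccumulator` type and its `accumulate` and `take_mean` methods are hypothetical names chosen for the example. It sums gradients from several small micro-batches into a single buffer, so only one micro-batch of activations has to be held in memory at a time.

```rust
/// Hypothetical accumulator for illustration only; this is NOT the
/// crate's `GradientAccumulator` API.
pub struct MicroBatchAccumulator {
    /// Running element-wise sum of all gradients seen since the last reset.
    sum: Vec<f32>,
    /// Number of micro-batches accumulated since the last reset.
    count: usize,
}

impl MicroBatchAccumulator {
    /// Create an accumulator for a flat parameter vector of `num_params` values.
    pub fn new(num_params: usize) -> Self {
        Self { sum: vec![0.0; num_params], count: 0 }
    }

    /// Add one micro-batch gradient into the running sum.
    pub fn accumulate(&mut self, grad: &[f32]) {
        assert_eq!(grad.len(), self.sum.len(), "gradient length mismatch");
        for (s, g) in self.sum.iter_mut().zip(grad) {
            *s += *g;
        }
        self.count += 1;
    }

    /// Return the mean gradient and reset the accumulator.
    /// Returns `None` if nothing has been accumulated yet.
    pub fn take_mean(&mut self) -> Option<Vec<f32>> {
        if self.count == 0 {
            return None;
        }
        let scale = 1.0 / self.count as f32;
        let mean: Vec<f32> = self.sum.iter().map(|s| s * scale).collect();
        // Clear the buffer in place so it can be reused for the next step.
        self.sum.iter_mut().for_each(|s| *s = 0.0);
        self.count = 0;
        Some(mean)
    }
}
```

In a training loop under these assumptions, `accumulate` would be called once per micro-batch and `take_mean` once per optimizer step, which gives the effect of a larger batch without holding that batch's activations in memory.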

§Performance

Designed to make optimization of models with billions of parameters feasible by reducing peak memory use through gradient accumulation and chunked processing.
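To give a sense of scale, the sketch below estimates training memory for a dense model. It is a back-of-the-envelope helper, not this module's `MemoryUsageEstimator`; the function name `estimate_training_bytes` and its parameters are assumptions made for the example. It counts one copy of the parameters, one copy of the gradients, and a configurable number of per-parameter optimizer state slots (for instance 2 for Adam's first and second moments, 0 for plain SGD). Activations and temporary buffers are deliberately ignored.

```rust
/// Rough estimate of bytes needed for parameters, gradients, and optimizer
/// state. Hypothetical helper for illustration; activations and temporary
/// buffers are not included.
///
/// * `num_params`      - number of scalar parameters in the model
/// * `bytes_per_value` - size of one scalar (4 for `f32`, 8 for `f64`)
/// * `state_slots`     - per-parameter optimizer state values
///                       (e.g. 2 for Adam, 1 for SGD with momentum)
pub fn estimate_training_bytes(num_params: u64, bytes_per_value: u64, state_slots: u64) -> u64 {
    // One copy for parameters, one for gradients, plus the optimizer state.
    num_params * bytes_per_value * (2 + state_slots)
}
```

Under these assumptions, a model with one billion `f32` parameters trained with an Adam-style optimizer needs about 16 GB (16,000,000,000 bytes) before any activations are counted.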

Structs§

ChunkedOptimizer
Chunked optimizer for processing large parameter arrays in chunks
GradientAccumulator
Gradient accumulator for memory-efficient training
MemoryUsageEstimator
Memory usage estimator for optimizers
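
The chunked processing idea behind `ChunkedOptimizer` can be sketched in isolation as follows. This is an assumption-laden illustration using a plain SGD update and a hypothetical free function `sgd_step_chunked`; the actual constructor, methods, and update rule of `ChunkedOptimizer` are not documented on this page. The parameter and gradient slices are walked in fixed-size chunks, so any per-chunk temporary work stays bounded by `chunk_size` rather than by the full model size.

```rust
/// Apply a plain SGD update `p -= lr * g` one chunk at a time.
/// Hypothetical helper for illustration; not the crate's `ChunkedOptimizer`.
pub fn sgd_step_chunked(params: &mut [f32], grads: &[f32], lr: f32, chunk_size: usize) {
    assert_eq!(params.len(), grads.len(), "parameter/gradient length mismatch");
    assert!(chunk_size > 0, "chunk_size must be non-zero");

    // `chunks_mut` and `chunks` yield matching sub-slices of at most
    // `chunk_size` elements; the final chunk may be shorter.
    for (p_chunk, g_chunk) in params.chunks_mut(chunk_size).zip(grads.chunks(chunk_size)) {
        for (p, g) in p_chunk.iter_mut().zip(g_chunk) {
            *p -= lr * *g;
        }
    }
}
```

Chunking an array that is already fully resident does not by itself save memory; the benefit appears when each chunk is streamed in from elsewhere or when the update needs scratch buffers proportional to the chunk length.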