§Gradient Processing Enhancements
This module provides advanced gradient processing techniques that can improve training stability, convergence speed, and final model performance.
§Available Techniques
- Gradient Centralization: Removes the mean of gradients to improve convergence
- Gradient Standardization: Normalizes gradients to unit variance
- Adaptive Gradient Clipping: Dynamically adjusts clipping based on gradient history
- Gradient Noise Injection: Adds controlled noise to escape local minima
- Gradient Smoothing: Applies an exponential moving average to gradients
- Hessian-based Preconditioning: Uses second-order information to precondition gradients
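As a rough illustration of three of the techniques listed above (centralization, standardization, and smoothing), here is a minimal, dependency-free sketch on flat `f64` slices. It is not this module's API: the names `centralize`, `standardize`, and `EmaSmoother`, the `eps` and `beta` parameters, and the choice to subtract the mean before scaling to unit variance are all assumptions made for the example.

```rust
/// Gradient centralization (hypothetical helper): subtract the mean of the
/// gradient entries so that the result has zero mean.
fn centralize(grad: &mut [f64]) {
    if grad.is_empty() {
        return;
    }
    let mean = grad.iter().sum::<f64>() / grad.len() as f64;
    for g in grad.iter_mut() {
        *g -= mean;
    }
}

/// Gradient standardization (hypothetical helper): shift to zero mean and
/// scale to unit variance. `eps` guards against division by zero and is an
/// assumption of this sketch.
fn standardize(grad: &mut [f64], eps: f64) {
    if grad.is_empty() {
        return;
    }
    let n = grad.len() as f64;
    let mean = grad.iter().sum::<f64>() / n;
    let var = grad.iter().map(|g| (g - mean).powi(2)).sum::<f64>() / n;
    let std = (var + eps).sqrt();
    for g in grad.iter_mut() {
        *g = (*g - mean) / std;
    }
}

/// Gradient smoothing (hypothetical type): keeps an exponential moving
/// average of the gradients seen so far.
struct EmaSmoother {
    beta: f64,               // weight given to the previous average
    state: Option<Vec<f64>>, // None until the first gradient arrives
}

impl EmaSmoother {
    fn new(beta: f64) -> Self {
        Self { beta, state: None }
    }

    /// Returns beta * previous + (1 - beta) * grad; the first call returns
    /// the gradient unchanged because there is no history yet.
    fn apply(&mut self, grad: &[f64]) -> Vec<f64> {
        let beta = self.beta;
        let smoothed = match self.state.take() {
            None => grad.to_vec(),
            Some(prev) => prev
                .iter()
                .zip(grad)
                .map(|(p, g)| beta * p + (1.0 - beta) * g)
                .collect(),
        };
        self.state = Some(smoothed.clone());
        smoothed
    }
}
```

In a training loop, helpers like these would typically run on each parameter's gradient after the backward pass and before the optimizer step; the order in which they are combined is a design choice, not something this sketch prescribes.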
Structs§
- AdaptiveClippingConfig: Configuration for adaptive gradient clipping.
- GradientProcessedOptimizer: Wrapper for optimizers that automatically applies gradient processing.
- GradientProcessingConfig: Configuration for gradient processing techniques.
- GradientProcessor: Gradient processor that applies various enhancement techniques.
- HessianPreconditioningConfig: Configuration for Hessian-based preconditioning.
- NoiseInjectionConfig: Configuration for gradient noise injection.
- SmoothingConfig: Configuration for gradient smoothing.
Enums§
- HessianApproximationType: Types of Hessian approximation methods.
- NoiseType: Types of noise for gradient noise injection.