Optimization algorithms for ToRSh
This crate provides PyTorch-compatible optimizers built on top of scirs2-optim.
§Features
- 80+ optimizers: Comprehensive collection including Adam, SGD, RAdam, Ranger, Lion, Sophia, and more
- Modern optimizers: Latest research including Schedule-Free AdamW and Prodigy
- Second-order methods: L-BFGS, Newton-CG, Trust Region, K-FAC, AdaHessian
- Learning rate schedulers: Step, exponential, cosine annealing, one-cycle, and more
- Mixed precision training: Full fp16/fp32 support with loss scaling
- Distributed optimization: AsyncSGD, Elastic Averaging, Federated Learning
- Advanced features: Gradient accumulation, fused kernels, memory-efficient implementations
- Research features: Quantum-inspired, neuromorphic, continual learning, green AI optimizers
§Quick Start
use torsh_optim::prelude::*;
use torsh_tensor::Tensor;
use std::sync::Arc;
use parking_lot::RwLock;
// Create parameters
let params = vec![Arc::new(RwLock::new(Tensor::scalar(1.0)?))];
// Create optimizer
let mut optimizer = Adam::new(params, Some(0.001), None, None, None, false);
// Training loop
for _ in 0..100 {
    // ... compute gradients ...
    optimizer.step()?;
    optimizer.zero_grad();
}
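The same step/zero_grad pattern can also be written generically against the base Optimizer trait. The sketch below is an illustration only: it assumes the trait has no required type parameters, that step and zero_grad are trait methods with the signatures shown above, and that OptimizerResult is generic over its success type; check the Optimizer trait and OptimizerResult docs before relying on it.
use torsh_optim::{Optimizer, OptimizerResult};
// Hypothetical sketch: one optimization step for any optimizer implementing
// the base Optimizer trait. Assumes step() returns an OptimizerResult and
// zero_grad() is infallible, as in the Quick Start above.
fn train_step<O: Optimizer>(optimizer: &mut O) -> OptimizerResult<()> {
    // ... backward pass populates the parameter gradients here ...
    optimizer.step()?;      // apply the optimizer's update rule
    optimizer.zero_grad();  // clear gradients before the next iteration
    Ok(())
}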
Re-exports§
- pub use adam::Adam;
- pub use adam::AdamW;
- pub use distributed::DistributedBackend;
- pub use distributed::DistributedConfig;
- pub use distributed::DistributedOptimizer;
- pub use distributed::SyncStrategy;
- pub use rmsprop::RMSprop;
- pub use sgd::SGD;
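Because these types are re-exported at the crate root (and pulled in by the prelude used in the Quick Start), they can be imported without spelling out the module paths:
use torsh_optim::{Adam, AdamW, RMSprop, SGD};
// Equivalent to the per-module paths, e.g.:
// use torsh_optim::adam::Adam;
// use torsh_optim::sgd::SGD;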
Modules§
- adabelief - AdaBelief optimizer implementation
- adabound - AdaBound optimizer implementation
- adadelta - AdaDelta optimizer
- adagrad - AdaGrad optimizer
- adahessian - AdaHessian optimizer
- adam - Adam and AdamW optimizers
- adamax - AdaMax optimizer implementation
- advanced - Advanced optimizers using SciRS2 optimization algorithms
- asgd - Averaged Stochastic Gradient Descent optimizer
- bayesian_optimization - Bayesian Optimization for Hyperparameter Tuning
- benchmarks - Comprehensive benchmarking suite for optimizers
- checkpointing
- composition - Optimizer composition tools
- continual_learning - Continual Learning Optimizers
- cross_framework_validation - Cross-framework validation tests for ToRSh optimizers
- debugging - Debugging and analysis tools for optimizers
- differential_privacy - Differential Privacy support for optimizers
- distributed - Distributed optimization for multi-process training
- evolutionary_strategies - Evolutionary Strategies for Optimization
- ftrl - FTRL (Follow-The-Regularized-Leader) optimizer
- fused_kernels - Fused optimizer kernels for improved performance
- grad_accumulation - Gradient accumulation utilities for optimizers
- gradient_free - Gradient-Free Optimization Methods
- green_ai - Green AI Optimizers
- hyperparameter_tuning - Automatic hyperparameter tuning for optimizers
- kfac - K-FAC (Kronecker-Factored Approximate Curvature) optimizer
- lamb - LAMB (Large Batch Optimization for Deep Learning) optimizer implementation
- lazy_updates
- lbfgs - Limited-memory BFGS optimizer
- lion - Lion (Evolved Sign Momentum) optimizer
- lookahead - Lookahead optimizer implementation
- low_precision
- lr_scheduler - Learning rate schedulers
- lr_scheduler_additional - Additional learning rate schedulers
- lr_scheduler_enhanced - Enhanced learning rate schedulers with advanced features
- memory_efficient - Memory-efficient optimizer implementations
- memory_mapped
- mixed_precision - Mixed precision support for optimizers
- nadam - NAdam optimizer implementation
- natural_gradient - Natural Gradient optimizer
- neural_optimizer - Neural Optimizer - Research Feature
- neuromorphic - Neuromorphic Optimization
- newton_cg - Newton-CG (Newton-Conjugate Gradient) optimizer
- numerical_stability_tests - Numerical stability tests for optimizers
- online_learning - Online learning optimizers and variance-reduced methods
- optimizer - Base optimizer implementation utilities
- prelude
- prodigy - Prodigy (An Adaptive Learning Rate Method) optimizer
- quantum_inspired - Quantum-Inspired Optimization Algorithms
- radam - Rectified Adam optimizer
- ranger - Ranger optimizer implementation
- rmsprop - RMSprop (Root Mean Square Propagation) optimizer
- robustness - Robustness features for optimizers
- rprop - Resilient Backpropagation optimizer
- schedule_free - Schedule-Free Optimizers
- sgd - Stochastic Gradient Descent (SGD) optimizer
- shampoo - Shampoo optimizer
- sophia - Sophia (Second-order Clipped Stochastic Optimization) optimizer
- sparse_adam - Sparse Adam optimizer
- sparse_updates
- state_dict_ops - Optimized state dict operations for efficient optimizer state management
- stress_tests - Stress tests for ToRSh optimizers
- trust_region - Trust Region optimization methods
- yellowfin - YellowFin optimizer
Macros§
- impl_base_scheduler_methods - Macro to implement common LRScheduler methods for schedulers with a base field
- impl_scheduler_with_state - Macro to implement common LRScheduler methods with custom state handling
Structs§
- OptimizerOptions - Common optimizer options
- OptimizerState - Optimizer state for serialization
- ParamGroup - Parameter group
- ParamGroupBuilder - Builder for creating parameter groups with various options (see the sketch after this list)
- ParamGroupState - Parameter group state
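As a purely illustrative sketch of how a parameter-group builder is typically used: the method names new(), params(), lr(), and build() below are assumptions, not taken from this page; consult the ParamGroupBuilder docs for the actual API.
use std::sync::Arc;
use parking_lot::RwLock;
use torsh_optim::ParamGroupBuilder;
use torsh_tensor::Tensor;
// Hypothetical builder usage: give one group of parameters its own learning
// rate that overrides the optimizer default. Method names are assumptions.
let head_params = vec![Arc::new(RwLock::new(Tensor::scalar(0.0)?))];
let head_group = ParamGroupBuilder::new()
    .params(head_params)
    .lr(1e-3)
    .build();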
Enums§
- OptimizerError - Optimizer-specific error type
Constants§
Traits§
- Optimizer - Base optimizer trait
Type Aliases§
- OptimizerResult - Result type for optimizer operations