Optimizers for training neural networks
Re-exports§
pub use dp::add_gaussian_noise;
pub use dp::clip_gradient;
pub use dp::estimate_noise_multiplier;
pub use dp::grad_norm;
pub use dp::privacy_cost_per_step;
pub use dp::DpError;
pub use dp::DpSgd;
pub use dp::DpSgdConfig;
pub use dp::PrivacyBudget;
pub use dp::RdpAccountant;
pub use hpo::AcquisitionFunction;
pub use hpo::GridSearch;
pub use hpo::HPOError;
pub use hpo::HyperbandScheduler;
pub use hpo::HyperparameterSpace;
pub use hpo::ParameterDomain;
pub use hpo::ParameterValue;
pub use hpo::SearchStrategy;
pub use hpo::SurrogateModel;
pub use hpo::TPEOptimizer;
pub use hpo::Trial;
pub use hpo::TrialStatus;
Modules§
Structs§
- Adam
- Adam optimizer (Adaptive Moment Estimation)
- AdamW
- AdamW optimizer
- CosineAnnealingLR
- Cosine Annealing Learning Rate Scheduler
- LinearWarmupLR
- Linear Warmup Learning Rate Scheduler
- SGD
- SGD optimizer with optional momentum
- StepDecayLR
- Step Decay Learning Rate Scheduler
- WarmupCosineDecayLR
- Warmup + Cosine Decay Learning Rate Scheduler
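The scheduler structs above are only named here, so the sketch below shows the textbook formulas that cosine annealing, linear warmup, step decay, and warmup + cosine decay usually refer to. It is not this crate's API: the function names, parameters, and the warmup convention (starting from zero) are assumptions made for illustration, and the actual structs may use different conventions.

```rust
use std::f64::consts::PI;

/// Cosine annealing (hypothetical helper): decays from `base_lr` at step 0
/// to `min_lr` at step `t_max`, then stays at `min_lr`.
fn cosine_annealing_lr(base_lr: f64, min_lr: f64, step: usize, t_max: usize) -> f64 {
    if t_max == 0 {
        return min_lr;
    }
    let progress = step.min(t_max) as f64 / t_max as f64;
    min_lr + 0.5 * (base_lr - min_lr) * (1.0 + (PI * progress).cos())
}

/// Linear warmup (hypothetical helper): ramps from 0 to `base_lr` over
/// `warmup_steps` steps, then holds `base_lr`.
fn linear_warmup_lr(base_lr: f64, step: usize, warmup_steps: usize) -> f64 {
    if warmup_steps == 0 || step >= warmup_steps {
        base_lr
    } else {
        base_lr * step as f64 / warmup_steps as f64
    }
}

/// Step decay (hypothetical helper): multiplies the rate by `gamma`
/// once every `step_size` steps.
fn step_decay_lr(base_lr: f64, gamma: f64, step_size: usize, step: usize) -> f64 {
    assert!(step_size > 0, "step_size must be positive");
    base_lr * gamma.powi((step / step_size) as i32)
}

/// Warmup followed by cosine decay (hypothetical helper): linear warmup for
/// `warmup_steps`, then cosine annealing over the remaining steps.
fn warmup_cosine_lr(
    base_lr: f64,
    min_lr: f64,
    step: usize,
    warmup_steps: usize,
    total_steps: usize,
) -> f64 {
    if step < warmup_steps {
        linear_warmup_lr(base_lr, step, warmup_steps)
    } else {
        let decay_steps = total_steps.saturating_sub(warmup_steps);
        cosine_annealing_lr(base_lr, min_lr, step - warmup_steps, decay_steps)
    }
}
```

Under these assumptions, a training loop would call one of these functions once per step and pass the result to the optimizer as its current learning rate.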
Traits§
- LRScheduler
- Learning rate scheduler trait
- Optimizer
- Trait for optimization algorithms
Functions§
- clip_grad_norm
- Clip gradients by global norm
- clip_grad_norm_refs
- Clip gradients by global norm on borrowed parameter references.
- simd_adam_update
- Fused Adam parameter update.
- simd_adamw_update
- Fused AdamW parameter update with decoupled weight decay.
- simd_axpy
- AXPY operation: y = a*x + y
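To make the operations named above concrete, here is a plain scalar sketch of AXPY, global-norm clipping, and a single Adam/AdamW step. None of these signatures come from this crate: the function names, argument order, and the use of `f32` slices are assumptions, and the real functions are described as fused/SIMD variants whose interfaces may differ.

```rust
/// AXPY (hypothetical scalar version): y = a*x + y, element-wise.
fn axpy(a: f32, x: &[f32], y: &mut [f32]) {
    assert_eq!(x.len(), y.len(), "x and y must have the same length");
    for (yi, xi) in y.iter_mut().zip(x) {
        *yi += a * *xi;
    }
}

/// Clip by global norm (hypothetical version): computes the L2 norm over all
/// gradient tensors together and rescales them if it exceeds `max_norm`.
/// Returns the norm measured before clipping.
fn clip_by_global_norm(grads: &mut [Vec<f32>], max_norm: f32) -> f32 {
    let total_norm = grads
        .iter()
        .flat_map(|g| g.iter())
        .map(|v| v * v)
        .sum::<f32>()
        .sqrt();
    if total_norm > max_norm {
        let scale = max_norm / total_norm;
        for g in grads.iter_mut() {
            for v in g.iter_mut() {
                *v *= scale;
            }
        }
    }
    total_norm
}

/// One Adam step (hypothetical version). With `weight_decay > 0.0` the decay is
/// applied directly to the parameter (decoupled, AdamW-style) rather than
/// being added to the gradient. `t` is the 1-based step count.
#[allow(clippy::too_many_arguments)]
fn adam_step(
    param: &mut [f32],
    grad: &[f32],
    m: &mut [f32],
    v: &mut [f32],
    lr: f32,
    beta1: f32,
    beta2: f32,
    eps: f32,
    weight_decay: f32,
    t: i32,
) {
    assert!(t >= 1, "t is 1-based");
    // Bias-correction factors for the first and second moment estimates.
    let bc1 = 1.0 - beta1.powi(t);
    let bc2 = 1.0 - beta2.powi(t);
    for i in 0..param.len() {
        m[i] = beta1 * m[i] + (1.0 - beta1) * grad[i];
        v[i] = beta2 * v[i] + (1.0 - beta2) * grad[i] * grad[i];
        let m_hat = m[i] / bc1;
        let v_hat = v[i] / bc2;
        // Decoupled weight decay, then the adaptive update.
        param[i] -= lr * weight_decay * param[i];
        param[i] -= lr * m_hat / (v_hat.sqrt() + eps);
    }
}
```

A fused implementation would typically perform the moment updates and the parameter write in one pass over memory, as the loop in `adam_step` does; the SIMD part is left out of this sketch.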