Optimizers for MLX training.
Tensors are immutable graph node handles — optimizers create new tensors for updated parameters rather than mutating in place.
Structs
- AdamW
- AdamW optimizer (Adam with decoupled weight decay).
- CosineAnnealingLR
- Cosine annealing learning rate scheduler.
- Sgd
- Stochastic Gradient Descent with optional momentum.
- StepLR
- Step-decay learning rate scheduler.
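The two schedulers above implement standard schedules: step decay multiplies the base rate by a factor every fixed number of epochs, while cosine annealing follows half a cosine wave from the base rate down to a minimum. A minimal sketch of both formulas as plain functions (the names, signatures, and parameters here are illustrative assumptions, not this crate's API):

```rust
use std::f64::consts::PI;

// Step decay: multiply `base_lr` by `gamma` once per `step_size` epochs.
// Illustrative stand-in for StepLR, not the crate's actual signature.
fn step_lr(base_lr: f64, gamma: f64, step_size: u32, epoch: u32) -> f64 {
    base_lr * gamma.powi((epoch / step_size) as i32)
}

// Cosine annealing: decay smoothly from `base_lr` to `min_lr` over `t_max` epochs.
// Illustrative stand-in for CosineAnnealingLR.
fn cosine_annealing_lr(base_lr: f64, min_lr: f64, t_max: u32, epoch: u32) -> f64 {
    min_lr + 0.5 * (base_lr - min_lr) * (1.0 + (PI * epoch as f64 / t_max as f64).cos())
}

fn main() {
    // Step decay: no decay before epoch 10, two decays by epoch 25.
    assert_eq!(step_lr(0.1, 0.5, 10, 0), 0.1);
    assert_eq!(step_lr(0.1, 0.5, 10, 25), 0.025);
    // Cosine annealing: starts at base_lr, reaches min_lr at t_max.
    assert!((cosine_annealing_lr(0.1, 0.0, 100, 0) - 0.1).abs() < 1e-12);
    assert!(cosine_annealing_lr(0.1, 0.0, 100, 100).abs() < 1e-12);
}
```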
Traits§
- LrScheduler
- Trait for learning rate schedulers.
- Optimizer
- Optimizer trait: apply one step, returning updated parameters.
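Because tensors are immutable handles, an optimizer's step is a pure-looking transformation: it consumes the current parameters and gradients and returns fresh parameters, updating only its own internal state (e.g. momentum buffers). A self-contained sketch of how the two traits might fit together, with `Vec<f32>` standing in for tensors; the trait shapes and the `Sgd` fields here are assumptions for illustration, not the crate's actual definitions:

```rust
// Hypothetical trait shapes; real tensors are replaced by Vec<f32> for this sketch.
trait Optimizer {
    /// Apply one update step, returning new parameters; inputs are untouched.
    fn step(&mut self, params: &[f32], grads: &[f32]) -> Vec<f32>;
}

trait LrScheduler {
    /// Learning rate to use at the given step.
    fn lr(&self, step: u32) -> f32;
}

// SGD with classical momentum: v <- momentum * v + g; p' = p - lr * v.
struct Sgd {
    lr: f32,
    momentum: f32,
    velocity: Vec<f32>, // optimizer-internal state, lazily initialized
}

impl Optimizer for Sgd {
    fn step(&mut self, params: &[f32], grads: &[f32]) -> Vec<f32> {
        if self.velocity.is_empty() {
            self.velocity = vec![0.0; params.len()];
        }
        for (v, g) in self.velocity.iter_mut().zip(grads) {
            *v = self.momentum * *v + g;
        }
        // New parameter values are collected into a fresh Vec; `params` is unchanged.
        params.iter().zip(&self.velocity).map(|(p, v)| p - self.lr * v).collect()
    }
}

fn main() {
    let mut opt = Sgd { lr: 0.5, momentum: 0.5, velocity: Vec::new() };
    let p0 = vec![1.0_f32];
    let p1 = opt.step(&p0, &[0.5]); // v = 0.5,  p = 1.0  - 0.25  = 0.75
    let p2 = opt.step(&p1, &[0.5]); // v = 0.75, p = 0.75 - 0.375 = 0.375
    assert_eq!(p1, vec![0.75]);
    assert_eq!(p2, vec![0.375]);
    assert_eq!(p0, vec![1.0]); // original parameters were never mutated
}
```

The functional shape mirrors the immutability note above: each call produces a new set of parameter values, so earlier versions remain valid handles.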