pub struct PagedAdamW { /* private fields */ }
Paged AdamW optimizer with CPU offloading.
Implements AdamW with optimizer state paging to CPU memory,
matching Python’s paged_adamw_32bit from bitsandbytes.
§Memory Behavior
- Optimizer states (exp_avg, exp_avg_sq) stored on CPU
- States paged to GPU only during parameter update
- Enables training 7B+ models on 24GB GPUs with QLoRA
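As a rough sizing sketch for the states listed above: assuming exp_avg and exp_avg_sq are each stored as one f32 per trainable parameter (suggested by the 32bit in paged_adamw_32bit, but not confirmed on this page), the memory that gets offloaded can be estimated as follows. The function and constant names are hypothetical and not part of this crate.

```rust
/// Assumed element size: one f32 per state entry (an assumption, see above).
pub const BYTES_PER_F32: usize = 4;

/// The two AdamW moment buffers: exp_avg and exp_avg_sq.
pub const STATES_PER_PARAM: usize = 2;

/// Estimated bytes of optimizer state for `trainable_params` parameters.
/// Hypothetical helper for illustration only.
pub fn optimizer_state_bytes(trainable_params: usize) -> usize {
    trainable_params * STATES_PER_PARAM * BYTES_PER_F32
}
```

Under this assumption, 1,000,000 trainable parameters correspond to 8,000,000 bytes of optimizer state held on the CPU between updates.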
Implementations§
impl PagedAdamW
pub fn new(
    lr: f64,
    weight_decay: f64,
    page_size: usize,
    max_gpu_memory: usize,
) -> PagedAdamW
Create a new paged AdamW optimizer.
§Arguments
- lr - Learning rate
- weight_decay - Weight decay coefficient
- page_size - Page size in bytes for CPU offloading
- max_gpu_memory - Maximum GPU memory for optimizer states (0 = unlimited)
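One way to read page_size and max_gpu_memory is sketched below. This is an interpretation of the argument descriptions, not the crate's internal logic: it assumes state is split into fixed-size pages and that a max_gpu_memory of 0 disables the limit. Both helper names are hypothetical.

```rust
/// Number of fixed-size pages needed to hold `state_bytes` of optimizer state.
/// The last page may be only partially filled, so round up.
pub fn pages_needed(state_bytes: usize, page_size: usize) -> usize {
    assert!(page_size > 0, "page_size must be non-zero");
    (state_bytes + page_size - 1) / page_size
}

/// Whether `resident_bytes` of state fits the GPU budget.
/// Mirrors the documented convention that 0 means unlimited.
pub fn within_gpu_budget(resident_bytes: usize, max_gpu_memory: usize) -> bool {
    max_gpu_memory == 0 || resident_bytes <= max_gpu_memory
}
```

With a page size of 4 bytes, 10 bytes of state would occupy 3 pages under this scheme, and any amount of state fits when max_gpu_memory is 0.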
pub fn with_betas(self, beta1: f64, beta2: f64) -> PagedAdamW
Set custom beta coefficients (β₁ for the first-moment estimate, β₂ for the second-moment estimate). Consumes the optimizer and returns it with the new values.
pub fn step_param(
    &mut self,
    name: &str,
    param: &mut Tensor,
    grad: &Tensor,
) -> Result<(), QLoraError>
Perform optimizer step for a single parameter.
Implements AdamW update with CPU paging:
m_t = β₁ * m_{t-1} + (1 - β₁) * g_t
v_t = β₂ * v_{t-1} + (1 - β₂) * g_t²
m̂_t = m_t / (1 - β₁^t)
v̂_t = v_t / (1 - β₂^t)
θ_t = θ_{t-1} - lr * (m̂_t / (√v̂_t + ε) + λ * θ_{t-1})
§Errors
Returns an error if tensor operations fail.
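The update rule given for step_param can be sketched on plain f64 slices, without tensors or CPU/GPU paging. This is a minimal illustration of the equations only, not the crate's implementation: AdamWConfig, adamw_step, and the 1-based step counter t are hypothetical names, with m stored in exp_avg, v in exp_avg_sq, and λ as weight_decay.

```rust
/// Hyperparameters for the update. Field names are illustrative,
/// not the private fields of PagedAdamW.
#[derive(Clone, Copy)]
pub struct AdamWConfig {
    pub lr: f64,
    pub beta1: f64,
    pub beta2: f64,
    pub eps: f64,
    pub weight_decay: f64,
}

/// Applies one AdamW update in place. `t` is the 1-based step count.
pub fn adamw_step(
    cfg: AdamWConfig,
    t: i32,
    param: &mut [f64],
    grad: &[f64],
    exp_avg: &mut [f64],
    exp_avg_sq: &mut [f64],
) {
    assert!(t >= 1, "step count is 1-based");
    assert_eq!(param.len(), grad.len());
    assert_eq!(param.len(), exp_avg.len());
    assert_eq!(param.len(), exp_avg_sq.len());

    // Bias-correction denominators: (1 - β₁^t) and (1 - β₂^t).
    let bias1 = 1.0 - cfg.beta1.powi(t);
    let bias2 = 1.0 - cfg.beta2.powi(t);

    for i in 0..param.len() {
        let g = grad[i];
        // m_t = β₁ * m_{t-1} + (1 - β₁) * g_t
        exp_avg[i] = cfg.beta1 * exp_avg[i] + (1.0 - cfg.beta1) * g;
        // v_t = β₂ * v_{t-1} + (1 - β₂) * g_t²
        exp_avg_sq[i] = cfg.beta2 * exp_avg_sq[i] + (1.0 - cfg.beta2) * g * g;
        // Bias-corrected estimates m̂_t and v̂_t.
        let m_hat = exp_avg[i] / bias1;
        let v_hat = exp_avg_sq[i] / bias2;
        // θ_t = θ_{t-1} - lr * (m̂_t / (√v̂_t + ε) + λ * θ_{t-1})
        param[i] -= cfg.lr * (m_hat / (v_hat.sqrt() + cfg.eps) + cfg.weight_decay * param[i]);
    }
}
```

On the first step with zero-initialized state, bias correction makes m̂ equal to the gradient and v̂ equal to its square, so the adaptive term is close to the sign of the gradient and the parameter moves by roughly lr, plus the weight decay contribution.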
pub fn memory_stats(&self) -> (usize, usize)
Get memory usage statistics.
Auto Trait Implementations§
impl Freeze for PagedAdamW
impl !RefUnwindSafe for PagedAdamW
impl Send for PagedAdamW
impl Sync for PagedAdamW
impl Unpin for PagedAdamW
impl !UnwindSafe for PagedAdamW
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.