pub enum SleepLevel {
    L1,
    L2,
    CudaSuspend,
    Checkpoint,
    Stop,
}
Sleep level for hibernating models
Variants
L1
Level 1: Offload weights to CPU RAM (faster wake)
L2
Level 2: Discard weights (slower wake, less RAM)
CudaSuspend
Level 3: CUDA suspend via cuda-checkpoint toggle (process stays alive, GPU freed, VRAM contents held in host RAM, full state preserved)
Checkpoint
Level 4: CRIU checkpoint (snapshot process to disk, frees all GPU/CPU memory, preserves full state including KV cache, CUDA graphs, and warmed allocator)
Stop
Level 5: Stop the vLLM process entirely (full restart on wake)
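As a rough illustration of the trade-offs listed above, the sketch below mirrors the enum locally so that it compiles on its own, then matches on each variant. The helpers `preserves_full_state` and `summary` are hypothetical names introduced here for illustration. They are not part of the documented API, and they encode only what the variant descriptions state.

```rust
// Local mirror of the documented enum so this sketch is self-contained.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum SleepLevel {
    L1,
    L2,
    CudaSuspend,
    Checkpoint,
    Stop,
}

/// Hypothetical helper: true for the variants whose description says the
/// full state is preserved across the sleep.
pub fn preserves_full_state(level: SleepLevel) -> bool {
    matches!(level, SleepLevel::CudaSuspend | SleepLevel::Checkpoint)
}

/// Hypothetical helper: a short summary taken from the variant docs.
pub fn summary(level: SleepLevel) -> &'static str {
    match level {
        SleepLevel::L1 => "offload weights to CPU RAM",
        SleepLevel::L2 => "discard weights",
        SleepLevel::CudaSuspend => "CUDA suspend, process stays alive",
        SleepLevel::Checkpoint => "CRIU checkpoint to disk",
        SleepLevel::Stop => "stop the vLLM process",
    }
}
```

The exhaustive `match` means the compiler flags any variant added later that this helper does not handle.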
Trait Implementations
impl Clone for SleepLevel
fn clone(&self) -> SleepLevel
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for SleepLevel
impl From<u8> for SleepLevel
impl PartialEq for SleepLevel
impl Copy for SleepLevel
impl Eq for SleepLevel
impl StructuralPartialEq for SleepLevel
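The traits listed above can be exercised together, as in the following sketch. It assumes a numeric mapping for `From<u8>` in which 1 through 4 follow declaration order and every other value falls back to `Stop`. This page does not show the actual conversion, so the mapping is a guess and may differ from the crate's. The enum is mirrored locally, and `demo` is a hypothetical function name.

```rust
// Local mirror of the documented enum with the same derived traits.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum SleepLevel {
    L1,
    L2,
    CudaSuspend,
    Checkpoint,
    Stop,
}

// ASSUMPTION: this mapping is illustrative only. The real `From<u8>`
// implementation is not shown in the documentation.
impl From<u8> for SleepLevel {
    fn from(n: u8) -> Self {
        match n {
            1 => SleepLevel::L1,
            2 => SleepLevel::L2,
            3 => SleepLevel::CudaSuspend,
            4 => SleepLevel::Checkpoint,
            _ => SleepLevel::Stop,
        }
    }
}

/// Hypothetical demo: converts a byte, copies the value, and formats it.
pub fn demo() -> (SleepLevel, SleepLevel, String) {
    let level = SleepLevel::from(3u8); // From<u8>
    let copy = level;                  // Copy: `level` stays usable
    let text = format!("{:?}", level); // Debug
    (level, copy, text)
}
```

Because the type is `Copy` and `Eq`, values can be passed by value and compared with `==` without borrowing.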
Auto Trait Implementations
impl Freeze for SleepLevel
impl RefUnwindSafe for SleepLevel
impl Send for SleepLevel
impl Sync for SleepLevel
impl Unpin for SleepLevel
impl UnwindSafe for SleepLevel
Blanket Implementations
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> CloneToUninit for T where T: Clone
impl<Q, K> Equivalent<K> for Q
fn equivalent(&self, key: &K) -> bool
Compare self to key and return true if they are equal.