pub enum OffloadPolicy {
    None,
    Budget {
        ram_bytes: u64,
    },
    PinnedHotSet {
        ram_bytes: u64,
        pin_embeddings: bool,
        pin_output_head: bool,
        pin_last_n_layers: usize,
        prefetch_n_ahead: usize,
    },
}
Configures which weights stay resident in RAM and which may be evicted to disk.
§Variants
None: all layer weights remain in RAM. This is the default and matches the classic llama.cpp in-memory-only behaviour.
Budget: evict weights until the resident set fits within ram_bytes. The LRU pager is activated; any tensor that is not pinned can be evicted.
PinnedHotSet: like Budget, but a named hot set (embeddings, output head, last N attention layers) is pinned and never evicted. Cold layers cycle in and out of RAM as they are needed.
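As a rough illustration of how the three variants might be constructed, the sketch below mirrors the enum locally so that it compiles on its own; real code would import `OffloadPolicy` from the crate. The `policy_for` helper, the `GIB` constant, and the concrete field values are assumptions made for the example, not part of the documented API.

```rust
/// Local mirror of the enum documented on this page, so the sketch is
/// self-contained. In real code, import `OffloadPolicy` from the crate.
#[derive(Clone, Debug, Default)]
pub enum OffloadPolicy {
    #[default]
    None,
    Budget {
        ram_bytes: u64,
    },
    PinnedHotSet {
        ram_bytes: u64,
        pin_embeddings: bool,
        pin_output_head: bool,
        pin_last_n_layers: usize,
        prefetch_n_ahead: usize,
    },
}

/// One gibibyte in bytes (illustrative constant, not from the crate).
pub const GIB: u64 = 1024 * 1024 * 1024;

/// Hypothetical helper: choose a policy from an optional RAM limit and a
/// flag saying whether the hot set should be pinned.
pub fn policy_for(ram_bytes: Option<u64>, pin_hot_set: bool) -> OffloadPolicy {
    match (ram_bytes, pin_hot_set) {
        // No limit requested: keep everything in RAM.
        (None, _) => OffloadPolicy::None,
        // A limit, but nothing pinned: plain budget eviction.
        (Some(ram_bytes), false) => OffloadPolicy::Budget { ram_bytes },
        // A limit plus a pinned hot set; the numbers are placeholders.
        (Some(ram_bytes), true) => OffloadPolicy::PinnedHotSet {
            ram_bytes,
            pin_embeddings: true,
            pin_output_head: true,
            pin_last_n_layers: 2,
            prefetch_n_ahead: 1,
        },
    }
}
```

Calling `policy_for(Some(4 * GIB), false)` would yield a `Budget` policy with a 4 GiB limit, while `policy_for(None, true)` falls back to `None` because no limit was given.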
Variants§
None
All layer weights remain in RAM (default, current behaviour).
Budget
Evict weights until resident set fits within ram_bytes.
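The budget rule can be pictured with a small least-recently-used tracker. This is only a sketch under stated assumptions: `ResidentSet`, `touch`, and `used_bytes` are hypothetical names, tensors are identified by plain strings, and the crate's actual pager may be organised quite differently.

```rust
use std::collections::VecDeque;

/// Hypothetical resident-set tracker (not the crate's pager).
pub struct ResidentSet {
    /// Budget in bytes, playing the role of `ram_bytes`.
    ram_bytes: u64,
    /// Bytes currently resident.
    used: u64,
    /// Front = least recently used, back = most recently used.
    lru: VecDeque<(String, u64)>,
}

impl ResidentSet {
    pub fn new(ram_bytes: u64) -> Self {
        ResidentSet { ram_bytes, used: 0, lru: VecDeque::new() }
    }

    /// Marks `name` as used, loading it if it is not resident, and returns
    /// the names evicted to bring the resident set back under the budget.
    pub fn touch(&mut self, name: &str, size: u64) -> Vec<String> {
        // Already resident: move it to the most-recently-used end.
        if let Some(pos) = self.lru.iter().position(|(n, _)| n == name) {
            let entry = self.lru.remove(pos).unwrap();
            self.lru.push_back(entry);
            return Vec::new();
        }

        // Not resident: load it, then evict from the cold end.
        self.lru.push_back((name.to_string(), size));
        self.used += size;

        let mut evicted = Vec::new();
        // Never evict the tensor that was just loaded (the last entry).
        while self.used > self.ram_bytes && self.lru.len() > 1 {
            let (victim, bytes) = self.lru.pop_front().unwrap();
            self.used -= bytes;
            evicted.push(victim);
        }
        evicted
    }

    pub fn used_bytes(&self) -> u64 {
        self.used
    }
}
```

With a 100-byte budget, loading `a` (40), `b` (40), touching `a` again, and then loading `c` (40) evicts `b`, since it is the least recently used entry at that point.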
PinnedHotSet
Pinned hot-set: evict cold layers but keep embeddings, output head, and the last N attention layers always resident.
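One way to read the hot-set rule is as a predicate over tensors: embeddings and the output head are pinned when their flags are set, and a layer is pinned when it falls within the last `pin_last_n_layers` layers. The sketch below assumes a hypothetical `TensorKind` classification and a known layer count `n_layers`; neither appears in the documented API, and `prefetch_n_ahead` is left out because its semantics are not described here.

```rust
/// Hypothetical tensor classification used only for this sketch.
#[derive(Clone, Copy, Debug)]
pub enum TensorKind {
    Embeddings,
    OutputHead,
    /// A transformer layer, identified by its zero-based index.
    Layer(usize),
}

/// Hypothetical view of the pinning fields of `PinnedHotSet`, plus the
/// model's layer count (an assumption needed to locate the last N layers).
pub struct HotSet {
    pub pin_embeddings: bool,
    pub pin_output_head: bool,
    pub pin_last_n_layers: usize,
    pub n_layers: usize,
}

impl HotSet {
    /// Returns true if the tensor belongs to the pinned hot set and so
    /// should never be evicted.
    pub fn is_pinned(&self, kind: TensorKind) -> bool {
        match kind {
            TensorKind::Embeddings => self.pin_embeddings,
            TensorKind::OutputHead => self.pin_output_head,
            TensorKind::Layer(i) => {
                // First pinned index; saturates to 0 if N exceeds the layer count.
                let first_pinned = self.n_layers.saturating_sub(self.pin_last_n_layers);
                i < self.n_layers && i >= first_pinned
            }
        }
    }
}
```

For a 32-layer model with `pin_last_n_layers` set to 4, layers 28 through 31 are pinned and layer 27 remains evictable.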
Implementations§
impl OffloadPolicy
pub fn ram_budget_bytes(&self) -> Option<u64>
Return the RAM budget in bytes, if any eviction limit is set.
Returns None for OffloadPolicy::None (unlimited RAM usage).
pub fn is_disabled(&self) -> bool
Returns true if offloading is disabled (i.e. the default in-RAM path).
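The two accessors could plausibly be written as simple matches over the variants. The sketch below is a guess at their shape based only on the descriptions above, using a local mirror of the enum; in particular, returning `Some(ram_bytes)` for both `Budget` and `PinnedHotSet` is an inference from "if any eviction limit is set", and `describe` is a hypothetical caller.

```rust
/// Local mirror of the documented enum, so the sketch compiles on its own.
#[derive(Clone, Debug, Default)]
pub enum OffloadPolicy {
    #[default]
    None,
    Budget {
        ram_bytes: u64,
    },
    PinnedHotSet {
        ram_bytes: u64,
        pin_embeddings: bool,
        pin_output_head: bool,
        pin_last_n_layers: usize,
        prefetch_n_ahead: usize,
    },
}

impl OffloadPolicy {
    /// Assumed behaviour: both limited variants expose their `ram_bytes`.
    pub fn ram_budget_bytes(&self) -> Option<u64> {
        match self {
            OffloadPolicy::None => None,
            OffloadPolicy::Budget { ram_bytes }
            | OffloadPolicy::PinnedHotSet { ram_bytes, .. } => Some(*ram_bytes),
        }
    }

    /// Assumed behaviour: only `None` counts as disabled.
    pub fn is_disabled(&self) -> bool {
        matches!(self, OffloadPolicy::None)
    }
}

/// Hypothetical caller that summarises what a loader would do.
pub fn describe(policy: &OffloadPolicy) -> String {
    match policy.ram_budget_bytes() {
        None => "offloading disabled, all weights stay in RAM".to_string(),
        Some(bytes) => format!("offloading enabled, budget {} MiB", bytes / (1024 * 1024)),
    }
}
```

A caller would typically check `is_disabled()` first to take the in-RAM fast path, and only consult `ram_budget_bytes()` when setting up the pager.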
Trait Implementations§
impl Clone for OffloadPolicy
fn clone(&self) -> OffloadPolicy
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for OffloadPolicy
impl Default for OffloadPolicy
fn default() -> OffloadPolicy
Returns the “default value” for a type.
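Because `None` is documented as the default, a caller can fall back to it when the user supplies no policy. The following is a minimal sketch using a local mirror of the enum with a derived `Default`; how the crate actually implements `Default` is not shown on this page, and `resolve` is a hypothetical helper.

```rust
/// Local mirror of the documented enum; `#[default]` marks `None` as the
/// value returned by `OffloadPolicy::default()` in this sketch.
#[derive(Clone, Debug, Default)]
pub enum OffloadPolicy {
    #[default]
    None,
    Budget {
        ram_bytes: u64,
    },
    PinnedHotSet {
        ram_bytes: u64,
        pin_embeddings: bool,
        pin_output_head: bool,
        pin_last_n_layers: usize,
        prefetch_n_ahead: usize,
    },
}

/// Hypothetical helper: use the caller's policy if present, otherwise the
/// default in-RAM behaviour.
pub fn resolve(user_choice: Option<OffloadPolicy>) -> OffloadPolicy {
    user_choice.unwrap_or_default()
}
```

`Option::unwrap_or_default` is a standard-library method, so no extra code is needed beyond the `Default` implementation itself.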
Auto Trait Implementations§
impl Freeze for OffloadPolicy
impl RefUnwindSafe for OffloadPolicy
impl Send for OffloadPolicy
impl Sync for OffloadPolicy
impl Unpin for OffloadPolicy
impl UnsafeUnpin for OffloadPolicy
impl UnwindSafe for OffloadPolicy
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.