pub struct EngineConfig {
    pub model_path: String,
    pub tokenizer_path: Option<String>,
    pub context_size: Option<usize>,
    pub num_threads: usize,
    pub sampler: SamplerConfig,
    pub prefill_chunk_size: usize,
    pub offload_policy: OffloadPolicy,
}
Configuration for the inference engine.
Fields§
model_path: String
Path to the GGUF model file.
tokenizer_path: Option<String>
Path to the tokenizer JSON file (if not embedded in GGUF).
context_size: Option<usize>
Context size override (None = use model default).
num_threads: usize
Number of threads for parallel computation.
sampler: SamplerConfig
Sampling configuration.
prefill_chunk_size: usize
Prefill chunk size: how many prompt tokens to process per forward call.
Set to 0 or usize::MAX to process the entire prompt in one batch.
Smaller values reduce peak memory usage for long prompts at the cost
of slightly higher overhead from multiple forward calls.
Default: 512.
offload_policy: OffloadPolicy
CPU/disk offload policy.
Controls which model weights are kept resident in RAM and which are evicted to disk and reloaded on demand.
Default: OffloadPolicy::None (all weights remain in RAM, matching
classic llama.cpp behaviour).
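The chunking rule described for prefill_chunk_size can be illustrated with a small standalone sketch. The helper name prefill_chunks is hypothetical and is not part of this crate; it only shows how a prompt of a given length might be split into per-forward-call ranges under the documented semantics (0 or usize::MAX means one batch). The engine's actual implementation may differ.

```rust
use std::ops::Range;

/// Hypothetical helper (not part of the documented API): returns the token
/// ranges that successive forward calls would cover for a prompt of
/// `prompt_len` tokens.
fn prefill_chunks(prompt_len: usize, prefill_chunk_size: usize) -> Vec<Range<usize>> {
    // Per the field docs, 0 and usize::MAX both mean "whole prompt in one batch".
    let chunk = if prefill_chunk_size == 0 || prefill_chunk_size == usize::MAX {
        prompt_len
    } else {
        prefill_chunk_size
    };

    let mut ranges = Vec::new();
    let mut start = 0;
    while start < prompt_len {
        // saturating_add guards against overflow for very large chunk sizes.
        let end = start.saturating_add(chunk).min(prompt_len);
        ranges.push(start..end);
        start = end;
    }
    ranges
}
```

With the default of 512, a 1200-token prompt would be covered by three forward calls (two full chunks and one of 176 tokens), while a setting of 0 covers it in a single call.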
Implementations§
impl EngineConfig
pub fn with_offload(self, policy: OffloadPolicy) -> Self
Set the CPU/disk offload policy, consuming self and returning the updated config (builder pattern).
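A minimal sketch of how the consuming builder method might be used. The types below are local stand-ins that mirror the documented fields so the example compiles on its own. SamplerConfig's contents, the DiskBackedPlaceholder variant, the file name model.gguf, and the body of with_offload are all assumptions made for illustration, not the crate's real definitions.

```rust
// Stand-in types; the real crate's definitions may differ.
#[derive(Clone, Debug, Default)]
struct SamplerConfig; // real fields unknown, left empty here

#[derive(Clone, Debug, PartialEq)]
enum OffloadPolicy {
    None,                  // documented default: all weights stay in RAM
    DiskBackedPlaceholder, // hypothetical variant, for illustration only
}

#[derive(Clone, Debug)]
struct EngineConfig {
    model_path: String,
    tokenizer_path: Option<String>,
    context_size: Option<usize>,
    num_threads: usize,
    sampler: SamplerConfig,
    prefill_chunk_size: usize,
    offload_policy: OffloadPolicy,
}

impl EngineConfig {
    // Assumed body: takes `self` by value, overwrites one field, returns it.
    fn with_offload(mut self, policy: OffloadPolicy) -> Self {
        self.offload_policy = policy;
        self
    }
}

/// Builds a base config with the documented defaults, then overrides the
/// offload policy through the builder-style method.
fn example_config() -> EngineConfig {
    let base = EngineConfig {
        model_path: "model.gguf".to_string(), // hypothetical path
        tokenizer_path: None,                 // tokenizer assumed embedded in GGUF
        context_size: None,                   // use the model default
        num_threads: 4,
        sampler: SamplerConfig::default(),
        prefill_chunk_size: 512,              // documented default
        offload_policy: OffloadPolicy::None,  // documented default
    };
    // `base` is moved into with_offload; clone it first if it is needed later.
    base.with_offload(OffloadPolicy::DiskBackedPlaceholder)
}
```

Because with_offload takes self by value, the original binding is consumed. Since EngineConfig implements Clone, calling base.clone().with_offload(...) keeps the original available.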
Trait Implementations§
impl Clone for EngineConfig
fn clone(&self) -> EngineConfig
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for EngineConfig
Auto Trait Implementations§
impl Freeze for EngineConfig
impl RefUnwindSafe for EngineConfig
impl Send for EngineConfig
impl Sync for EngineConfig
impl Unpin for EngineConfig
impl UnsafeUnpin for EngineConfig
impl UnwindSafe for EngineConfig
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true.
Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self> otherwise.