pub struct Engine { /* private fields */ }
High-level inference engine that wraps model loading, tokenization, and generation.
Engine is Send + Sync: the model and configuration are immutable and shared across threads, while the mutable inference context and sampler are created per call from internal state.
Implementations§
impl Engine
pub fn load(config: EngineConfig) -> Result<Self, EngineError>
Load a model and create an inference engine.
This opens the model file (GGUF or ONNX), loads the tokenizer and model weights, and selects the appropriate backend (CPU or GPU).
Format is auto-detected by file extension:
.gguf – GGUF format (default)
.onnx – ONNX format (requires the onnx feature; companion config.json + tokenizer.json)
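A minimal loading sketch (the model path and the `model_path` field shown here are illustrative assumptions, not a documented part of EngineConfig):

```rust
// Hypothetical sketch: load a GGUF model and inspect its configuration.
// The path and the `model_path` field are assumptions for illustration.
fn main() -> Result<(), EngineError> {
    let config = EngineConfig {
        model_path: "models/example.Q4_K_M.gguf".into(), // assumed field
        ..EngineConfig::default()                        // assumed Default impl
    };
    let engine = Engine::load(config)?;
    println!("model config: {:?}", engine.model_config());
    Ok(())
}
```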
pub fn select_gpu_backend(model: &LlamaModel) -> Arc<dyn Backend>
Select the best available GPU backend.
Priority: CUDA > Metal > DX12 > Vulkan > CPU fallback.
pub fn model_config(&self) -> &ModelConfig
Get the model configuration.
pub fn chat_template(&self) -> &ChatTemplate
Get the detected chat template.
pub fn gguf(&self) -> Option<&GgufFile>
Get the GGUF file metadata (None for ONNX-loaded models).
pub fn engine_config(&self) -> &EngineConfig
Get the engine configuration.
pub fn model(&self) -> &dyn Model
Get the underlying model (for advanced usage like perplexity computation).
pub fn create_inference_context(&self) -> InferenceContext
Create an InferenceContext respecting the configured KV cache type.
pub fn generate(
    &self,
    prompt: &str,
    max_tokens: usize,
) -> Result<String, EngineError>
Generate text from a prompt.
The prompt is automatically wrapped with the detected chat template unless it already contains chat formatting tokens.
Returns the generated text (not including the prompt).
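Assuming an engine already loaded via Engine::load, blocking generation is a single call (the prompt and token budget below are arbitrary):

```rust
// Sketch: one-shot generation; the chat template is applied automatically.
let reply = engine.generate("Explain KV caching in one sentence.", 128)?;
println!("{reply}"); // the returned text does not include the prompt
```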
pub fn generate_streaming(
    &self,
    prompt: &str,
    max_tokens: usize,
) -> GenerationStream<'_>
Generate text from a prompt, yielding tokens as they are produced.
Each item in the returned iterator is a Result<String, EngineError> containing
the decoded text of one or more tokens.
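A streaming sketch, assuming `engine` is an already-loaded Engine; each item is printed and flushed as it arrives:

```rust
use std::io::Write;

// Sketch: print tokens incrementally as the model produces them.
for chunk in engine.generate_streaming("Write a haiku about Rust.", 64) {
    let text = chunk?; // each item is Result<String, EngineError>
    print!("{text}");
    std::io::stdout().flush().ok();
}
println!();
```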
Auto Trait Implementations§
impl Freeze for Engine
impl !RefUnwindSafe for Engine
impl Send for Engine
impl Sync for Engine
impl Unpin for Engine
impl UnsafeUnpin for Engine
impl !UnwindSafe for Engine
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true.
Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self> otherwise.