Struct Model

pub struct Model { /* private fields */ }

High-level model wrapper with builder pattern for text generation.

Use this when you need to:

Generate multiple times with the same model
Use streaming callbacks
Maintain conversation history
Access model metadata

§Example

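use oxide_rs::Model;

// Configure the model first; nothing is loaded yet.
let mut model = Model::new("model.gguf")?
    .with_options(oxide_rs::GenerateOptions {
        max_tokens: 256,
        temperature: 0.7,
        ..Default::default()
    });

// load() takes &mut self and returns Result<(), _>, so it is called
// on the binding rather than at the end of the builder chain.
model.load()?;

let response = model.generate("Hello!")?;
println!("{}", response);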

Implementations§

impl Model

pub fn new<P: AsRef<Path>>(model_path: P) -> Result<Self, Box<dyn Error>>

Create a new Model instance.

This only creates the Model struct - use load() to actually load the model.

§Arguments

model_path - Path to a GGUF model file

§Example

let model = Model::new("model.gguf")?;

pub fn with_options(self, options: GenerateOptions) -> Self

Set generation options.

§Example

// new() returns a Result, so unwrap it with `?` before chaining.
let model = Model::new("model.gguf")?
    .with_options(GenerateOptions {
        max_tokens: 256,
        temperature: 0.8,
        ..Default::default()
    });
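
The `..Default::default()` line in the example above is Rust's struct update syntax: the fields listed explicitly are overridden and every other field takes its default value. A minimal sketch follows, using a hypothetical `SketchOptions` struct as a stand-in. Its `top_p` field and all of its default values are assumptions made for illustration and are not taken from `GenerateOptions`.

```rust
// Hypothetical stand-in for an options struct; NOT oxide_rs::GenerateOptions.
// The field `top_p` and the default values below are illustrative assumptions.
#[derive(Debug, Clone, PartialEq)]
struct SketchOptions {
    max_tokens: usize,
    temperature: f32,
    top_p: f32,
}

impl Default for SketchOptions {
    fn default() -> Self {
        SketchOptions {
            max_tokens: 128,
            temperature: 1.0,
            top_p: 0.9,
        }
    }
}

// Override two fields and take the rest from Default.
fn custom_options() -> SketchOptions {
    SketchOptions {
        max_tokens: 256,
        temperature: 0.8,
        ..Default::default()
    }
}
```

Because unspecified fields fall back to `Default`, code written this way keeps compiling if the struct later gains new fields, provided the struct stays constructible from outside its crate.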

pub fn with_tokenizer<P: AsRef<Path>>(self, tokenizer_path: P) -> Self

Set a custom tokenizer path.

If not provided, the tokenizer will be extracted from the GGUF file.

§Example

// new() returns a Result, so unwrap it with `?` before chaining.
let model = Model::new("model.gguf")?
    .with_tokenizer("tokenizer.json");

pub fn load(&mut self) -> Result<(), Box<dyn Error>>

Load the model into memory.

This must be called before generate().

§Example

let mut model = Model::new("model.gguf")?;
// load() returns Result<(), _>, so call it on the binding, not in the chain.
model.load()?;

pub fn generate(&mut self, prompt: &str) -> Result<String, Box<dyn Error>>

Generate text from a prompt.

Requires load() to be called first.

§Arguments

prompt - The input prompt

§Example

let response = model.generate("What is Rust?")?;
println!("{}", response);
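
The construct-then-load ordering can be illustrated with a small stand-in type. This is only a sketch: `TwoPhase` is hypothetical and is not the `oxide_rs` implementation. It also assumes that calling `generate` before `load` returns an `Err`; the documentation above only states that `load()` is required and does not say how a violation is reported.

```rust
use std::error::Error;

// Hypothetical stand-in illustrating a two-phase (construct, then load) type.
// It does not reflect how oxide_rs::Model is implemented internally.
struct TwoPhase {
    path: String,
    // None until load() succeeds.
    loaded: Option<String>,
}

impl TwoPhase {
    // Cheap constructor: records the path and loads nothing.
    fn new(path: &str) -> Self {
        TwoPhase {
            path: path.to_string(),
            loaded: None,
        }
    }

    // Simulates the expensive loading step.
    fn load(&mut self) -> Result<(), Box<dyn Error>> {
        self.loaded = Some(format!("weights from {}", self.path));
        Ok(())
    }

    // ASSUMPTION: report a missing load() as an error rather than panicking.
    fn generate(&mut self, prompt: &str) -> Result<String, Box<dyn Error>> {
        match &self.loaded {
            Some(_) => Ok(format!("echo: {prompt}")),
            None => Err("model not loaded; call load() first".into()),
        }
    }
}
```

With this stand-in, `generate` fails on a freshly constructed value and succeeds once `load` has returned `Ok(())`.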

pub fn generate_stream<F>(
    &mut self,
    prompt: &str,
    callback: F,
) -> Result<String, Box<dyn Error>>
where
    F: FnMut(String),

Generate text with streaming callback.

Tokens are passed to the callback as they’re generated, enabling real-time output display.

Requires load() to be called first.

§Arguments

prompt - The input prompt
callback - Function called for each generated token

§Example

model.generate_stream("Tell me a story", |token| {
    print!("{}", token);
})?;
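
Because the callback bound is `FnMut(String)`, the closure may mutate captured state, for example to collect tokens while they stream. The sketch below shows that pattern with a hypothetical `fake_generate_stream` function that replays a fixed token list; it only mirrors the shape of the signature above and is not part of `oxide_rs`.

```rust
// Hypothetical stand-in for a streaming generator. It feeds a fixed list of
// tokens to the callback and returns the concatenated text, mirroring the
// FnMut(String) callback plus String result of the documented signature.
fn fake_generate_stream<F>(tokens: &[&str], mut callback: F) -> String
where
    F: FnMut(String),
{
    let mut full = String::new();
    for t in tokens {
        full.push_str(t);
        callback(t.to_string());
    }
    full
}

// Collects every streamed token into a Vec while also keeping the full text.
fn collect_streamed(tokens: &[&str]) -> (Vec<String>, String) {
    let mut seen = Vec::new();
    // The closure mutably borrows `seen`, which FnMut permits.
    let full = fake_generate_stream(tokens, |tok| seen.push(tok));
    (seen, full)
}
```

The mutable borrow of `seen` ends when the streaming call returns, so the collected tokens can be used right afterwards.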

pub fn generate_batch(
    &mut self,
    prompts: Vec<String>,
) -> Result<Vec<String>, Box<dyn Error>>

Generate text from multiple prompts in batch.

Processes multiple prompts sequentially, sharing the loaded model for efficiency. Each prompt generates independently with its own output.

Requires load() to be called first.

§Arguments

prompts - Vector of input prompts

§Example

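// generate_batch takes Vec<String>, so convert the string literals.
let prompts = vec![
    "Hello!".to_string(),
    "How are you?".to_string(),
    "What's up?".to_string(),
];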
let results = model.generate_batch(prompts)?;
for result in results {
    println!("{}", result);
}
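
Sequential batch processing of this kind can be sketched as a map over the prompts that collects into a `Result`. The `batch_sequential` helper below is hypothetical: it takes the per-prompt generator as a closure so that it runs without a model. It also assumes the batch stops at the first failing prompt, which the description above does not specify.

```rust
use std::error::Error;

// Hypothetical helper, not part of oxide_rs: runs `generate` on each prompt
// in order and gathers the outputs.
// ASSUMPTION: the first error aborts the batch and is returned to the caller.
fn batch_sequential<G>(
    prompts: Vec<String>,
    mut generate: G,
) -> Result<Vec<String>, Box<dyn Error>>
where
    G: FnMut(&str) -> Result<String, Box<dyn Error>>,
{
    // Collecting an iterator of Result into Result<Vec<_>, _> short-circuits
    // on the first Err.
    prompts.iter().map(|p| generate(p.as_str())).collect()
}
```

Outputs come back in the same order as the input prompts, so results can be paired with their prompts by index.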

pub fn warmup(&mut self, num_tokens: usize) -> Result<(), Box<dyn Error>>

Pre-compile compute kernels for faster first-token generation.

Call this after load() to warm up the model before first use.

§Arguments

num_tokens - Number of tokens to use for warmup. The argument is required; the example below passes 128.

§Example

model.load()?;
model.warmup(128)?;
// First generation will be faster

pub fn clear_history(&mut self)

Clear conversation history.

Removes all previous messages from the conversation context.

§Example

model.generate("Hello")?;
model.clear_history();

pub fn metadata(&self) -> Option<&GgufMetadata>

Get model metadata.

Returns information about the loaded model including name, architecture, layer count, embedding size, etc.

§Example

if let Some(meta) = model.metadata() {
    println!("Model: {}", meta.name);
    println!("Architecture: {}", meta.architecture);
}

pub fn context_used(&self) -> Option<usize>

Get current context usage.

Returns the number of tokens currently in the context.

§Example

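// context_used() returns Option<usize>, which does not implement Display.
if let Some(used) = model.context_used() {
    println!("Using {} tokens", used);
}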

pub fn context_limit(&self) -> Option<usize>

Get context limit.

Returns the maximum context window size.

§Example

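// context_limit() returns Option<usize>, so unwrap it before printing.
if let Some(limit) = model.context_limit() {
    println!("Context limit: {} tokens", limit);
}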

pub fn context_percentage(&self) -> Option<f32>

Get context usage percentage.

Returns the percentage of context used (0.0 - 100.0).

§Example

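// context_percentage() returns Option<f32>, so unwrap it before formatting.
if let Some(pct) = model.context_percentage() {
    println!("{:.1}% context used", pct);
}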
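
The three context accessors all return `Option`, so callers need a fallback for the `None` case. The sketch below shows one way to relate and display the values. It assumes the percentage equals used / limit × 100, which is consistent with the documented 0.0 - 100.0 range but is not confirmed by the source; both helper functions are hypothetical.

```rust
// Hypothetical helper: derives a percentage from the used and limit counts.
// ASSUMPTION: percentage = used / limit * 100; the crate may compute it
// differently.
fn context_percentage(used: Option<usize>, limit: Option<usize>) -> Option<f32> {
    match (used, limit) {
        // Guard against division by zero for an empty context window.
        (Some(u), Some(l)) if l > 0 => Some(u as f32 / l as f32 * 100.0),
        _ => None,
    }
}

// Formats the percentage, with a fallback message when it is unavailable.
fn describe(pct: Option<f32>) -> String {
    match pct {
        Some(p) => format!("{:.1}% context used", p),
        None => "context usage unavailable".to_string(),
    }
}
```

For example, 512 tokens used out of a 2048-token limit is described as "25.0% context used".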

Auto Trait Implementations§

impl !UnwindSafe for Model

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.