pub struct ChatModel { /* private fields */ }
High-level model for token-level text generation.
Load a trained checkpoint and generate token IDs in a single call.
ChatModel automatically discovers the model config next to the
checkpoint file. Users bring their own tokenizer to encode/decode text.
§Example
use multiscreen_rs::prelude::*;

fn main() -> multiscreen_rs::Result<()> {
    let model = ChatModel::load("checkpoints/latest.mpk")?;
    let token_ids = model.generate(&[1, 2, 3], GenerationConfig::default())?;
    println!("generated tokens: {:?}", token_ids);
    Ok(())
}
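Since ChatModel only deals in token IDs, a typical round trip pairs it with a tokenizer of your choice. A minimal sketch, assuming a hypothetical Tokenizer trait that is not part of this crate:

use multiscreen_rs::prelude::*;

// Stand-in for whatever tokenizer you pair with the model; hypothetical,
// not provided by multiscreen_rs.
trait Tokenizer {
    fn encode(&self, text: &str) -> Vec<u32>;
    fn decode(&self, ids: &[u32]) -> String;
}

fn chat(model: &ChatModel, tok: &impl Tokenizer, prompt: &str) -> multiscreen_rs::Result<String> {
    // Text -> token IDs with your own tokenizer.
    let prompt_ids = tok.encode(prompt);
    // The result is the prompt IDs followed by the newly generated IDs.
    let output_ids = model.generate(&prompt_ids, GenerationConfig::default())?;
    // Token IDs -> text.
    Ok(tok.decode(&output_ids))
}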
§Implementations

impl ChatModel
pub fn load(path: impl AsRef<Path>) -> Result<Self>
Load a ChatModel from a checkpoint path.
path should point to a .mpk weights file (e.g.
"checkpoints/latest.mpk" or "runs/chat/checkpoints/latest.mpk").
The model architecture is read from a config.json located in the checkpoint’s
parent directory; if no config is found there, load falls back to the
Params10M defaults.
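Concretely, a checkpoint directory is expected to look roughly like this (paths taken from the examples above; config.json is optional):

runs/chat/checkpoints/
├── latest.mpk    <- weights file passed to ChatModel::load
└── config.json   <- model architecture; Params10M defaults if absent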
pub fn generate(&self, prompt: &[u32], config: GenerationConfig) -> Result<Vec<u32>>
Generate token IDs from a prompt token sequence.
Returns the full token sequence (prompt followed by the newly generated tokens) in one call.
For streaming / token-by-token output, use Self::generate_stream.
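Because the result begins with the prompt, the newly generated tokens are simply the tail of the returned Vec. A minimal sketch:

use multiscreen_rs::prelude::*;

fn main() -> multiscreen_rs::Result<()> {
    let model = ChatModel::load("checkpoints/latest.mpk")?;
    let prompt: &[u32] = &[1, 2, 3];
    let full = model.generate(prompt, GenerationConfig::default())?;
    // The output echoes the prompt; everything after it is newly generated.
    let new_tokens = &full[prompt.len()..];
    println!("{} new tokens: {:?}", new_tokens.len(), new_tokens);
    Ok(())
}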
pub fn generate_stream(
    &self,
    prompt: &[u32],
    config: GenerationConfig,
    on_token: impl FnMut(u32, usize) -> bool,
) -> Result<Vec<u32>>
Generate token IDs one at a time, invoking a callback for each newly produced token.
This enables streaming / word-by-word output similar to ChatGPT.
The callback receives (token_id, index) where index is the
zero-based position of the new token (0 = first generated token).
Return false from the callback to stop generation early.
Returns the full output sequence (prompt + generated tokens).
§Example
use multiscreen_rs::prelude::*;

fn main() -> multiscreen_rs::Result<()> {
    let model = ChatModel::load("checkpoints/latest.mpk")?;
    let prompt: &[u32] = &[1, 2, 3];
    let full_output = model.generate_stream(
        prompt,
        GenerationConfig::default(),
        |token_id, _index| {
            // Stream each token as it is produced.
            // Decode with YOUR tokenizer and print word-by-word.
            print!("{} ", token_id); // Replace with actual decoding
            true // return false to stop early
        },
    )?;
    println!("\nFull sequence: {:?}", full_output);
    Ok(())
}
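The callback’s return value can also implement early stopping, e.g. halting at an end-of-sequence token. A sketch, where EOS_ID is a hypothetical id that depends on your tokenizer:

use multiscreen_rs::prelude::*;

fn main() -> multiscreen_rs::Result<()> {
    let model = ChatModel::load("checkpoints/latest.mpk")?;
    // Hypothetical end-of-sequence id; use whatever your tokenizer reserves.
    const EOS_ID: u32 = 0;
    let output = model.generate_stream(
        &[1, 2, 3],
        GenerationConfig::default(),
        // Keep generating until EOS appears or 32 new tokens were emitted.
        |token_id, index| token_id != EOS_ID && index < 31,
    )?;
    println!("stopped with {} tokens in total", output.len());
    Ok(())
}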
pub fn predict_logits(&self, context: &[u32]) -> Result<Tensor<DefaultBackend, 3>>
Run a forward pass on the padded context and return logits.
Returns a tensor of shape [1, seq_len, vocab_size].
Use this for custom sampling strategies (top-k, temperature, etc.).
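For example, a greedy next-token step could look like the sketch below; the tensor calls (slice, argmax, into_scalar) assume a burn-style tensor API and should be checked against the Tensor type this crate re-exports:

use multiscreen_rs::prelude::*;

fn main() -> multiscreen_rs::Result<()> {
    let model = ChatModel::load("checkpoints/latest.mpk")?;
    let context: &[u32] = &[1, 2, 3];

    // Shape [1, seq_len, vocab_size]; seq_len may exceed context.len()
    // because the context is padded before the forward pass.
    let logits = model.predict_logits(context)?;

    // Greedy step (assumed burn-style tensor API): read the logits at the
    // last real context position and take the highest-scoring vocab id.
    let pos = context.len() - 1;
    let next = logits.slice([0..1, pos..pos + 1]).argmax(2).into_scalar();
    println!("greedy next token: {}", next);
    Ok(())
}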
pub fn model(&self) -> &MultiscreenModel<DefaultBackend>
Access the underlying neural model.
pub fn config(&self) -> &MultiscreenModelConfig
Access the model configuration.
pub fn device(&self) -> &<DefaultBackend as BackendTypes>::Device
Access the device the model is running on.