Trait LmRunner

pub trait LmRunner: Send {
    // Required methods
    fn family(&self) -> &'static str;
    fn vocab_size(&self) -> usize;
    fn predict_logits(&mut self, prompt_ids: &[u32]) -> Result<Vec<f32>>;

    // Provided methods
    fn generate(
        &mut self,
        prompt_ids: &[u32],
        n_new: usize,
        on_token: &mut dyn FnMut(u32) -> bool,
    ) -> Result<Vec<u32>> { ... }
    fn supports_multimodal(&self) -> bool { ... }
    fn generate_multimodal(
        &mut self,
        _prompt: &str,
        _rgb: &[u8],
        _img_w: usize,
        _img_h: usize,
        _tokenizer: Option<&Path>,
        _n_new: usize,
        _on_token: &mut dyn FnMut(u32) -> bool,
    ) -> Result<Vec<u32>> { ... }
}

Minimal per-family runner interface.

Implementations must be Send so the boxed trait object can move across threads (e.g. when skill runs inference on a worker pool). Sync is intentionally not required, because most runners hold mutable per-call compile and cache state.
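As a rough illustration of why the Send bound matters, the sketch below moves a boxed runner into a worker thread. It is not the crate's code: the trait is a trimmed stand-in with only the required methods, Result is assumed to be an alias over a String error, and ToyRunner and run_on_worker are hypothetical names invented for this example.

```rust
use std::thread;

// Assumption: stand-in for the crate's `Result` alias, which this page does not show.
type Result<T> = std::result::Result<T, String>;

// Trimmed stand-in for the trait: required methods only.
pub trait LmRunner: Send {
    fn family(&self) -> &'static str;
    fn vocab_size(&self) -> usize;
    fn predict_logits(&mut self, prompt_ids: &[u32]) -> Result<Vec<f32>>;
}

// Hypothetical runner, used only for illustration.
pub struct ToyRunner {
    // Mutable per-call state: the kind of field that makes `Sync` impractical.
    pub calls: usize,
}

impl LmRunner for ToyRunner {
    fn family(&self) -> &'static str {
        "toy"
    }
    fn vocab_size(&self) -> usize {
        4
    }
    fn predict_logits(&mut self, _prompt_ids: &[u32]) -> Result<Vec<f32>> {
        self.calls += 1;
        Ok(vec![0.0, 1.0, 0.0, 0.0])
    }
}

// Moves the boxed runner into a worker thread. This compiles only because
// `Send` is a supertrait, which makes `Box<dyn LmRunner>` itself `Send`.
pub fn run_on_worker(mut runner: Box<dyn LmRunner>, prompt: Vec<u32>) -> Result<Vec<f32>> {
    thread::spawn(move || runner.predict_logits(&prompt))
        .join()
        .map_err(|_| "worker thread panicked".to_string())?
}
```

Because the runner is moved rather than shared, only one thread touches it at a time, which is why exclusive (&mut self) methods are enough here.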

Required Methods

fn family(&self) -> &'static str

Short family identifier matching rlx-cli::arch_runner_name (e.g. "qwen3", "qwen35", "gemma", "llama32"). Useful for logging / metrics / per-family branches in the caller.

fn vocab_size(&self) -> usize

Vocabulary size of the LM head. Useful for callers that need to size a logit buffer or validate token ids before calling Self::predict_logits. See PLAN.md M9.
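One way a caller might validate token ids against the reported vocab size is sketched below. The helper name first_invalid_id is hypothetical and not part of the crate; the caller would pass runner.vocab_size() as the second argument.

```rust
// Hypothetical helper: returns the first token id that is out of range
// for a vocabulary of `vocab_size` entries, or `None` if all ids are valid.
pub fn first_invalid_id(prompt_ids: &[u32], vocab_size: usize) -> Option<u32> {
    prompt_ids
        .iter()
        .copied()
        .find(|&id| id as usize >= vocab_size)
}
```

Running a check like this before predict_logits lets the caller report a bad id directly instead of relying on whatever the runner does with it.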

fn predict_logits(&mut self, prompt_ids: &[u32]) -> Result<Vec<f32>>

Run prefill on prompt_ids and return the last-token logits over the full vocab. Mirrors the existing predict_logits method on every per-family runner.
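A caller that wants the greedy next token from the returned logits could take the argmax, roughly as follows. The function name argmax and its tie-breaking and NaN handling are choices made for this sketch, not behavior documented by the crate.

```rust
// Hypothetical helper: index of the largest logit, as a token id.
// NaN entries are skipped; on ties the lowest index wins.
// Returns `None` for an empty (or all-NaN) slice.
pub fn argmax(logits: &[f32]) -> Option<u32> {
    let mut best: Option<(usize, f32)> = None;
    for (i, &v) in logits.iter().enumerate() {
        if v.is_nan() {
            continue;
        }
        match best {
            // Keep the current best when it is at least as large.
            Some((_, b)) if b >= v => {}
            _ => best = Some((i, v)),
        }
    }
    best.map(|(i, _)| i as u32)
}
```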

Provided Methods

fn generate( &mut self, prompt_ids: &[u32], n_new: usize, on_token: &mut dyn FnMut(u32) -> bool, ) -> Result<Vec<u32>>

Generate up to n_new tokens after prompt_ids using greedy (argmax) sampling. on_token fires once per generated token and returns true to continue, false to stop. Returns the generated id sequence (excluding the prompt).

Stop-signal honoring varies by family (PLAN.md M9):

The default impl and Qwen35Runner honor the return value.
Qwen3Runner, GemmaRunner, and Llama32Runner call the callback but ignore its return value (their inherent generate doesn’t take a bool-returning callback). With these runners, pass an EOS-aware sampler in the caller, or check produced.last() after the call.
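For the families that ignore the stop signal, a caller could post-process the returned sequence instead. The sketch below trims everything after the first end-of-sequence token; truncate_at_eos and eos_id are hypothetical names, and the actual EOS id depends on the tokenizer in use.

```rust
// Hypothetical helper: cuts `produced` right after the first `eos_id`.
// Returns true if an EOS token was found (and is kept as the last element).
pub fn truncate_at_eos(produced: &mut Vec<u32>, eos_id: u32) -> bool {
    match produced.iter().position(|&t| t == eos_id) {
        Some(i) => {
            produced.truncate(i + 1);
            true
        }
        None => false,
    }
}
```

This does not save the compute spent on tokens past EOS; it only cleans up the result.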

The default impl is naive: it re-runs prefill on the full context at each step. Per-family runners override it with their cached decode fast path.
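The naive strategy described above can be sketched as a free function. This is an illustration under stated assumptions, not the crate's actual default impl: prefill is passed in as a closure, Result is assumed to be an alias over a String error, and the token that triggers a stop is assumed to be included in the output.

```rust
// Assumption: stand-in for the crate's `Result` alias.
type Result<T> = std::result::Result<T, String>;

// Hypothetical sketch of greedy generation with full re-prefill per step.
pub fn naive_generate(
    predict_logits: &mut dyn FnMut(&[u32]) -> Result<Vec<f32>>,
    prompt_ids: &[u32],
    n_new: usize,
    on_token: &mut dyn FnMut(u32) -> bool,
) -> Result<Vec<u32>> {
    let mut context = prompt_ids.to_vec();
    let mut produced = Vec::new();
    for _ in 0..n_new {
        // Re-run prefill over the whole context: O(context length) work per token.
        let logits = predict_logits(context.as_slice())?;
        if logits.is_empty() {
            return Err("predict_logits returned no logits".to_string());
        }
        // Greedy (argmax) sampling; the first maximum wins on ties.
        let mut best = 0usize;
        for (i, &v) in logits.iter().enumerate() {
            if v > logits[best] {
                best = i;
            }
        }
        let next = best as u32;
        context.push(next);
        produced.push(next);
        // Honor the stop signal: `false` ends generation early.
        if !on_token(next) {
            break;
        }
    }
    Ok(produced)
}
```

Since every step reprocesses the entire context, the total cost grows quadratically with the number of generated tokens, which is the motivation for the cached decode overrides.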

fn supports_multimodal(&self) -> bool

Whether this runner supports multimodal (image+text) generation via Self::generate_multimodal. Default false. Per-family runners that wire a vision encoder (e.g. Qwen35Runner with an mmproj path) override to true.
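A caller might gate the multimodal path on this flag before doing any image work. The sketch below uses a trimmed stand-in trait with only the two methods it needs; TextOnly, WithVision, and require_multimodal are hypothetical names, and the String error type is an assumption.

```rust
// Assumption: stand-in for the crate's `Result` alias.
type Result<T> = std::result::Result<T, String>;

// Trimmed stand-in for the trait, keeping the documented default of `false`.
pub trait LmRunner: Send {
    fn family(&self) -> &'static str;
    fn supports_multimodal(&self) -> bool {
        false
    }
}

// Hypothetical text-only runner: relies on the default.
pub struct TextOnly;
impl LmRunner for TextOnly {
    fn family(&self) -> &'static str {
        "toy-text"
    }
}

// Hypothetical runner with a vision encoder wired: overrides to `true`.
pub struct WithVision;
impl LmRunner for WithVision {
    fn family(&self) -> &'static str {
        "toy-vision"
    }
    fn supports_multimodal(&self) -> bool {
        true
    }
}

// Fails early, with the family name in the message, when no vision path exists.
pub fn require_multimodal(runner: &dyn LmRunner) -> Result<()> {
    if runner.supports_multimodal() {
        Ok(())
    } else {
        Err(format!(
            "runner family `{}` does not support multimodal generation",
            runner.family()
        ))
    }
}
```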

fn generate_multimodal( &mut self, _prompt: &str, _rgb: &[u8], _img_w: usize, _img_h: usize, _tokenizer: Option<&Path>, _n_new: usize, _on_token: &mut dyn FnMut(u32) -> bool, ) -> Result<Vec<u32>>

Multimodal text generation: prefill the trunk with prompt text where image markers are spliced with vision embeddings derived from rgb (raw RGB bytes, row-major [h, w, 3]). Streams one token per on_token call; returns the full produced sequence.

The default impl returns an error; only family runners that wire a vision encoder override it. The behavior is intended to match llama-cpp’s MtmdContext-based multimodal eval path.
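The row-major [h, w, 3] layout of rgb implies a buffer of exactly img_w * img_h * 3 bytes. A caller could check that before invoking the method, along the lines of this sketch. Both helper names are hypothetical, and the documentation does not say whether the runner performs such a check itself.

```rust
// Hypothetical helper: verifies that `rgb` holds exactly h * w * 3 bytes.
pub fn check_rgb_len(rgb: &[u8], img_w: usize, img_h: usize) -> Result<(), String> {
    let expected = img_w
        .checked_mul(img_h)
        .and_then(|n| n.checked_mul(3))
        .ok_or_else(|| "image dimensions overflow usize".to_string())?;
    if rgb.len() == expected {
        Ok(())
    } else {
        Err(format!("expected {} RGB bytes, got {}", expected, rgb.len()))
    }
}

// Hypothetical helper: reads the pixel at column `x`, row `y`.
// In row-major [h, w, 3], channel c of that pixel sits at (y * w + x) * 3 + c.
// Panics if the coordinates fall outside the buffer.
pub fn pixel_at(rgb: &[u8], img_w: usize, x: usize, y: usize) -> [u8; 3] {
    let i = (y * img_w + x) * 3;
    [rgb[i], rgb[i + 1], rgb[i + 2]]
}
```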

Dyn Compatibility

This trait is dyn compatible.

In older versions of Rust, dyn compatibility was called "object safety".
