Module lm_runner

Boxed-trait dispatch for LM runners (PLAN.md M3 + M8).

LmRunner is the minimal abstraction that rlx_models::run::auto_runner returns for a model path. M3 shipped the single-shot predict_logits method; M8 adds streaming generate(prompt_ids, n_new, on_token) with a sampler-agnostic greedy default that delegates to predict_logits on each step.
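The default described above can be sketched as follows. This is a minimal illustration, not the crate's actual definition: only the method names and the generate(prompt_ids, n_new, on_token) parameter order come from the text. The argument types, the return types, the Result alias, the argmax helper, and ToyRunner are all assumptions made for the example.

```rust
use std::error::Error;

/// Hypothetical result alias; the real crate's error type is not shown here.
type Result<T> = std::result::Result<T, Box<dyn Error>>;

/// Sketch of the trait shape; all signatures are assumed.
trait LmRunner {
    /// Single-shot: next-token logits for the given token history.
    fn predict_logits(&mut self, ids: &[u32]) -> Result<Vec<f32>>;

    /// Default streaming path: greedy argmax, calling predict_logits once per step.
    fn generate(
        &mut self,
        prompt_ids: &[u32],
        n_new: usize,
        on_token: &mut dyn FnMut(u32),
    ) -> Result<Vec<u32>> {
        let mut ids = prompt_ids.to_vec();
        let mut out = Vec::with_capacity(n_new);
        for _ in 0..n_new {
            let logits = self.predict_logits(&ids)?;
            let next = argmax(&logits).ok_or("empty logits")?;
            on_token(next); // stream the token to the caller as soon as it is chosen
            ids.push(next);
            out.push(next);
        }
        Ok(out)
    }
}

/// Index of the largest logit, or None for an empty slice.
fn argmax(logits: &[f32]) -> Option<u32> {
    logits
        .iter()
        .enumerate()
        .max_by(|a, b| a.1.total_cmp(b.1))
        .map(|(i, _)| i as u32)
}

/// Hypothetical toy model: always predicts (last token + 1) mod vocab.
struct ToyRunner {
    vocab: usize,
}

impl LmRunner for ToyRunner {
    fn predict_logits(&mut self, ids: &[u32]) -> Result<Vec<f32>> {
        let last = ids.last().copied().unwrap_or(0) as usize;
        let mut logits = vec![0.0; self.vocab];
        logits[(last + 1) % self.vocab] = 1.0;
        Ok(logits)
    }
}

/// Runs the default generate on the toy runner and returns the new tokens.
fn demo_default_generate() -> Vec<u32> {
    let mut runner = ToyRunner { vocab: 5 };
    let mut streamed = Vec::new();
    let out = runner
        .generate(&[1, 2], 4, &mut |t: u32| streamed.push(t))
        .expect("toy runner never fails");
    assert_eq!(out, streamed); // the callback saw every returned token, in order
    out
}
```

Taking on_token as &mut dyn FnMut(u32) rather than a generic parameter keeps the sketched trait usable behind Box<dyn LmRunner>; whether the real trait makes the same choice is not stated here.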

Per-family runners with a cached decode path (Qwen3Runner::generate, Qwen35Runner::generate_with_opts, GemmaRunner::generate, Llama32Runner::generate) should override this default with their fast path. The default exists so that auto_runner(path)?.generate(...) always works; it is not a recommended hot path.

Each per-family crate provides impl LmRunner for FooRunner so that rlx-models can hand back a Box<dyn LmRunner> from a single GGUF path without the caller knowing the family upfront.
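To make the dispatch and the override concrete, here is a hedged sketch with two made-up families behind one Box<dyn LmRunner>. Everything in it is hypothetical: PlainRunner, CachedRunner, the tokens_processed counter, and auto_runner_sketch, which matches on a string tag where the real auto_runner takes a GGUF path. The counter assumes that the default path re-reads the full history on every step, which is how this example models the cost of running without a cache.

```rust
use std::error::Error;

type Result<T> = std::result::Result<T, Box<dyn Error>>;

const VOCAB: usize = 8;

/// Simplified stand-in for the trait; signatures are assumptions.
trait LmRunner {
    fn predict_logits(&mut self, ids: &[u32]) -> Result<Vec<f32>>;

    /// Hypothetical counter, present only to compare the two paths.
    fn tokens_processed(&self) -> usize;

    /// Default: greedy decoding that calls predict_logits on every step.
    fn generate(
        &mut self,
        prompt_ids: &[u32],
        n_new: usize,
        on_token: &mut dyn FnMut(u32),
    ) -> Result<Vec<u32>> {
        let mut ids = prompt_ids.to_vec();
        for _ in 0..n_new {
            let next = greedy(&self.predict_logits(&ids)?);
            on_token(next);
            ids.push(next);
        }
        Ok(ids[prompt_ids.len()..].to_vec())
    }
}

/// Index of the first maximal logit (0 for an empty slice).
fn greedy(logits: &[f32]) -> u32 {
    let mut best = 0;
    for (i, v) in logits.iter().enumerate() {
        if *v > logits[best] {
            best = i;
        }
    }
    best as u32
}

/// Toy model shared by both families: predicts (last + 1) mod VOCAB.
fn toy_logits(last: u32) -> Vec<f32> {
    let mut logits = vec![0.0; VOCAB];
    logits[(last as usize + 1) % VOCAB] = 1.0;
    logits
}

/// Family A: relies on the default generate.
struct PlainRunner {
    processed: usize,
}

impl LmRunner for PlainRunner {
    fn predict_logits(&mut self, ids: &[u32]) -> Result<Vec<f32>> {
        self.processed += ids.len(); // whole history is re-read on each call
        Ok(toy_logits(ids.last().copied().unwrap_or(0)))
    }
    fn tokens_processed(&self) -> usize {
        self.processed
    }
}

/// Family B: overrides generate with a simulated cached decode path.
struct CachedRunner {
    processed: usize,
}

impl LmRunner for CachedRunner {
    fn predict_logits(&mut self, ids: &[u32]) -> Result<Vec<f32>> {
        self.processed += ids.len();
        Ok(toy_logits(ids.last().copied().unwrap_or(0)))
    }
    fn tokens_processed(&self) -> usize {
        self.processed
    }
    fn generate(
        &mut self,
        prompt_ids: &[u32],
        n_new: usize,
        on_token: &mut dyn FnMut(u32),
    ) -> Result<Vec<u32>> {
        self.processed += prompt_ids.len(); // prefill the prompt once
        let mut last = prompt_ids.last().copied().unwrap_or(0);
        let mut out = Vec::with_capacity(n_new);
        for step in 0..n_new {
            if step > 0 {
                self.processed += 1; // later steps feed only the newest token
            }
            last = greedy(&toy_logits(last));
            on_token(last);
            out.push(last);
        }
        Ok(out)
    }
}

/// Hypothetical factory: a string tag stands in for detecting the family from a file.
fn auto_runner_sketch(family: &str) -> Result<Box<dyn LmRunner>> {
    match family {
        "plain" => Ok(Box::new(PlainRunner { processed: 0 })),
        "cached" => Ok(Box::new(CachedRunner { processed: 0 })),
        other => Err(format!("unknown family: {other}").into()),
    }
}

/// Caller code is identical for every family.
fn run(family: &str) -> Result<(Vec<u32>, usize)> {
    let mut runner = auto_runner_sketch(family)?;
    let out = runner.generate(&[1, 2, 3], 4, &mut |_t: u32| {})?;
    Ok((out, runner.tokens_processed()))
}
```

In this toy setup both families emit the same tokens for the same prompt, but the default path touches 18 tokens where the overriding path touches 6, which illustrates why the default is a fallback and not the hot path.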

Traits

LmRunner
Minimal per-family runner interface.