Boxed-trait dispatch for LM runners (PLAN.md M3 + M8).
LmRunner is the minimal abstraction that rlx_models::run::auto_runner
returns for a model path. M3 shipped the single-shot predict_logits
method; M8 adds streaming generate(prompt_ids, n_new, on_token)
with a sampler-agnostic greedy default that calls predict_logits
once per generated token.
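As a minimal sketch of how such a default could look (not the crate's actual code): the trait below is a stand-in with assumed signatures, the u32 token type, the String error type, and the argmax helper are assumptions, and CountingRunner is a hypothetical toy model that always predicts "last token + 1". The callback is taken as &mut dyn FnMut(u32) so the trait stays usable behind a Box<dyn ...>.

```rust
/// Stand-in for the real trait; every signature here is an assumption.
pub trait LmRunner {
    /// Single-shot path: logits for the token that follows `token_ids`.
    fn predict_logits(&mut self, token_ids: &[u32]) -> Result<Vec<f32>, String>;

    /// Streaming path with a greedy default: each step re-runs
    /// `predict_logits` on the whole context, so cost grows with length.
    fn generate(
        &mut self,
        prompt_ids: &[u32],
        n_new: usize,
        on_token: &mut dyn FnMut(u32),
    ) -> Result<Vec<u32>, String> {
        let mut ctx = prompt_ids.to_vec();
        let mut out = Vec::with_capacity(n_new);
        for _ in 0..n_new {
            let logits = self.predict_logits(&ctx)?;
            let next = argmax(&logits).ok_or_else(|| "empty logits".to_string())?;
            on_token(next); // stream the token to the caller as soon as it is chosen
            ctx.push(next);
            out.push(next);
        }
        Ok(out)
    }
}

/// Index of the largest logit; the first index wins on ties.
fn argmax(logits: &[f32]) -> Option<u32> {
    let mut best: Option<(u32, f32)> = None;
    for (i, &v) in logits.iter().enumerate() {
        if best.map_or(true, |(_, b)| v > b) {
            best = Some((i as u32, v));
        }
    }
    best.map(|(i, _)| i)
}

/// Hypothetical toy runner: puts all mass on (last token + 1) mod vocab.
pub struct CountingRunner {
    pub vocab: usize,
}

impl LmRunner for CountingRunner {
    fn predict_logits(&mut self, token_ids: &[u32]) -> Result<Vec<f32>, String> {
        let next = token_ids.last().map_or(0, |&t| (t as usize + 1) % self.vocab);
        let mut logits = vec![0.0_f32; self.vocab];
        logits[next] = 1.0;
        Ok(logits)
    }
}
```

CountingRunner implements only predict_logits and inherits generate, which is the situation the default is meant to cover.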
Per-family runners with a cached decode path
(Qwen3Runner::generate, Qwen35Runner::generate_with_opts,
GemmaRunner::generate, Llama32Runner::generate) should
override this default with their fast path. The default exists
so that auto_runner(path)?.generate(...) always works; it is not
a recommended hot path.
Each per-family crate provides impl LmRunner for FooRunner so
that rlx-models can hand back a Box<dyn LmRunner> from a
single GGUF path without the caller knowing the family upfront.
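The boxed dispatch described above might be sketched as follows. Everything here is illustrative: the trimmed-down trait, the name() method, the toy FooRunner and BarRunner, and auto_runner_sketch are hypothetical. In particular, auto_runner_sketch picks a family from the file-name prefix purely for demonstration and does not reflect how the real auto_runner inspects a GGUF file. BarRunner overrides generate to stand in for a runner with a cached decode path.

```rust
use std::path::Path;

const VOCAB: u32 = 8; // toy vocabulary size (assumption)

/// Hypothetical, trimmed-down version of the trait.
pub trait LmRunner {
    fn name(&self) -> &'static str;
    fn predict_logits(&mut self, token_ids: &[u32]) -> Result<Vec<f32>, String>;

    /// Slow greedy default: one full `predict_logits` call per new token.
    fn generate(
        &mut self,
        prompt_ids: &[u32],
        n_new: usize,
        on_token: &mut dyn FnMut(u32),
    ) -> Result<Vec<u32>, String> {
        let mut ctx = prompt_ids.to_vec();
        let mut out = Vec::with_capacity(n_new);
        for _ in 0..n_new {
            let logits = self.predict_logits(&ctx)?;
            if logits.is_empty() {
                return Err("empty logits".to_string());
            }
            let mut next = 0usize;
            for (i, v) in logits.iter().enumerate() {
                if *v > logits[next] {
                    next = i;
                }
            }
            on_token(next as u32);
            ctx.push(next as u32);
            out.push(next as u32);
        }
        Ok(out)
    }
}

/// Toy model shared by both runners: the next token is (last + 1) mod VOCAB.
fn toy_next(last: Option<u32>) -> u32 {
    last.map_or(0, |t| (t + 1) % VOCAB)
}

fn toy_logits(token_ids: &[u32]) -> Vec<f32> {
    let mut logits = vec![0.0_f32; VOCAB as usize];
    logits[toy_next(token_ids.last().copied()) as usize] = 1.0;
    logits
}

/// Relies on the default `generate`.
pub struct FooRunner;

impl LmRunner for FooRunner {
    fn name(&self) -> &'static str { "foo" }
    fn predict_logits(&mut self, ids: &[u32]) -> Result<Vec<f32>, String> {
        Ok(toy_logits(ids))
    }
}

/// Overrides `generate`; carrying only `last` stands in for a cached decode state.
pub struct BarRunner;

impl LmRunner for BarRunner {
    fn name(&self) -> &'static str { "bar" }
    fn predict_logits(&mut self, ids: &[u32]) -> Result<Vec<f32>, String> {
        Ok(toy_logits(ids))
    }
    fn generate(
        &mut self,
        prompt_ids: &[u32],
        n_new: usize,
        on_token: &mut dyn FnMut(u32),
    ) -> Result<Vec<u32>, String> {
        let mut last = prompt_ids.last().copied();
        let mut out = Vec::with_capacity(n_new);
        for _ in 0..n_new {
            let next = toy_next(last); // no re-run over the full context
            on_token(next);
            out.push(next);
            last = Some(next);
        }
        Ok(out)
    }
}

/// Hypothetical stand-in for auto_runner: chooses by file-name prefix only.
pub fn auto_runner_sketch(path: &Path) -> Result<Box<dyn LmRunner>, String> {
    let stem = path.file_stem().and_then(|s| s.to_str()).unwrap_or("");
    if stem.starts_with("foo") {
        Ok(Box::new(FooRunner))
    } else if stem.starts_with("bar") {
        Ok(Box::new(BarRunner))
    } else {
        Err(format!("unknown family for {}", path.display()))
    }
}
```

The caller only ever sees Box<dyn LmRunner>: the same generate call reaches the default for FooRunner and the override for BarRunner, and both yield the same tokens for this toy model.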
Traits
- LmRunner: Minimal per-family runner interface.