Module llm_executor

LlmExecutor<M>: adapts a DecoderOnlyLLM to the ModelExecutor trait the engine scheduler calls.

This is the Model-as-Code equivalent of GenericModelExecutor: where GenericModelExecutor wraps a Box<dyn RunnerInterface> (legacy ModelRunner<B>), LlmExecutor wraps a Box<dyn DecoderOnlyLLM> (new-style per-model code such as Qwen3Model<B>).
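The adapter relationship described above can be sketched as follows. This is a minimal illustration, not the crate's actual code: the method names `forward` and `execute`, their signatures, the `ToyModel` type, and the `run_steps` helper are all hypothetical, and plain `Vec<f32>` stands in for the real tensor types. The real `DecoderOnlyLLM` and `ModelExecutor` traits almost certainly differ.

```rust
// Hypothetical, simplified stand-ins for the traits named in this module.
// Method names and signatures are assumptions made for illustration only.

/// Stand-in for the per-model trait (implemented by code such as Qwen3Model<B>).
trait DecoderOnlyLLM {
    /// Returns one row of logits for the position after `tokens`.
    fn forward(&mut self, tokens: &[u32]) -> Vec<f32>;
}

/// Stand-in for the trait the engine scheduler calls.
trait ModelExecutor {
    /// Runs one decode step and returns the greedily chosen next token.
    fn execute(&mut self, tokens: &[u32]) -> u32;
}

/// Adapter: owns a model and exposes it through `ModelExecutor`.
struct LlmExecutor<M: DecoderOnlyLLM> {
    model: M,
}

impl<M: DecoderOnlyLLM> LlmExecutor<M> {
    fn new(model: M) -> Self {
        Self { model }
    }
}

impl<M: DecoderOnlyLLM> ModelExecutor for LlmExecutor<M> {
    fn execute(&mut self, tokens: &[u32]) -> u32 {
        let logits = self.model.forward(tokens);
        // Greedy argmax over the logits; ties keep the lowest index.
        let mut best = 0usize;
        for (i, &v) in logits.iter().enumerate() {
            if v > logits[best] {
                best = i;
            }
        }
        best as u32
    }
}

/// Toy model (hypothetical): always predicts (last token + 1) mod vocab.
struct ToyModel {
    vocab: usize,
}

impl DecoderOnlyLLM for ToyModel {
    fn forward(&mut self, tokens: &[u32]) -> Vec<f32> {
        let mut logits = vec![0.0; self.vocab];
        let last = tokens.last().copied().unwrap_or(0) as usize;
        logits[(last + 1) % self.vocab] = 1.0;
        logits
    }
}

/// Drives an executor through a trait object, as a scheduler might.
fn run_steps(exec: &mut dyn ModelExecutor, prompt: &[u32], steps: usize) -> Vec<u32> {
    let mut tokens = prompt.to_vec();
    for _ in 0..steps {
        let next = exec.execute(&tokens);
        tokens.push(next);
    }
    tokens
}
```

In this sketch the scheduler-side code (`run_steps`) only sees `dyn ModelExecutor`, so swapping `ToyModel` for another `DecoderOnlyLLM` implementation requires no change on the caller's side, which is the point of the adapter.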

Tokens/logits are currently bridged through candle Tensor for TensorRef; Phase C will likely replace that with SmallTensor to drop candle from the hot path.

Structs

LlmExecutor