Module executor

Model executor implementations.

Decoder-only LLMs go through LlmExecutor, which wraps any Box<dyn DecoderOnlyLLM>. The per-modality executors (Bert, Clip, Whisper, Tts) remain separate because their forward contracts do not fit the prefill/decode interface.
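
As a rough illustration of the prefill/decode split described above, the sketch below wraps a boxed trait object in an executor that drives a generation loop. It is not the crate's actual code: the `DecoderOnlyLLM` method signatures, the `ToyLlm` model, and the `generate` helper are all assumptions made up for this example, and the real trait most likely works on tensors, KV caches, and batches rather than bare token ids.

```rust
/// Hypothetical stand-in for the crate's `DecoderOnlyLLM` trait.
/// The real trait's methods and types are not shown in this page.
trait DecoderOnlyLLM {
    /// Consume the whole prompt in one pass and return the next token id.
    fn prefill(&mut self, prompt: &[u32]) -> u32;
    /// Consume the most recently generated token and return the next one.
    fn decode(&mut self, last: u32) -> u32;
}

/// Toy model (assumption): next token = running sum of seen tokens, mod 100.
struct ToyLlm {
    sum: u32,
}

impl DecoderOnlyLLM for ToyLlm {
    fn prefill(&mut self, prompt: &[u32]) -> u32 {
        self.sum = prompt.iter().sum();
        self.sum % 100
    }

    fn decode(&mut self, last: u32) -> u32 {
        self.sum += last;
        self.sum % 100
    }
}

/// Sketch of an executor that owns any boxed decoder-only model.
struct LlmExecutor {
    model: Box<dyn DecoderOnlyLLM>,
}

impl LlmExecutor {
    fn new(model: Box<dyn DecoderOnlyLLM>) -> Self {
        Self { model }
    }

    /// Hypothetical helper: one prefill call, then repeated decode calls.
    fn generate(&mut self, prompt: &[u32], max_new: usize) -> Vec<u32> {
        let mut out = Vec::with_capacity(max_new);
        if max_new == 0 {
            return out;
        }
        let mut tok = self.model.prefill(prompt);
        out.push(tok);
        while out.len() < max_new {
            tok = self.model.decode(tok);
            out.push(tok);
        }
        out
    }
}
```

An encoder such as BERT has no equivalent of the second phase: it runs a single forward pass and returns an embedding, which is one plausible reading of why those executors stay outside this interface.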

Re-exports

pub use bert_executor::BertModelExecutor;
pub use clip_executor::ClipModelExecutor;
pub use llm_executor::LlmExecutor;
pub use stub_executor::StubModelExecutor;
pub use tts_executor::TtsModelExecutor;
pub use whisper_executor::WhisperModelExecutor;

Modules

bert_executor
BERT Model Executor for embeddings
clip_executor
CLIP Model Executor for multimodal embeddings.
common
Common executor utilities, extracted from code that was duplicated across the Qwen3, Qwen2, and Llama executors.
llm_executor
LlmExecutor<M>: adapts a DecoderOnlyLLM to the ModelExecutor trait that the engine scheduler calls.
stub_executor
Stub model executor for MVP testing and development
tts_executor
Qwen3-TTS Executor — text-to-speech pipeline wiring Talker LM + Vocoder.
whisper_executor
Whisper ASR Executor — full decode pipeline matching Python whisper.
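
The adapter role that the llm_executor entry describes can be sketched as follows. This is a minimal illustration under stated assumptions, not the crate's API: the `Step` enum, the `execute` method, the `Counter` model, the `FixedTokenStub` type, and the `run` driver are all hypothetical names invented here, and the real `ModelExecutor` and `DecoderOnlyLLM` traits may look quite different.

```rust
/// Hypothetical model-side trait (signatures are assumptions).
trait DecoderOnlyLLM {
    fn prefill(&mut self, prompt: &[u32]) -> u32;
    fn decode(&mut self, last: u32) -> u32;
}

/// Hypothetical unit of work handed over by a scheduler.
enum Step<'a> {
    Prefill(&'a [u32]),
    Decode(u32),
}

/// Hypothetical scheduler-facing trait.
trait ModelExecutor {
    fn execute(&mut self, step: Step<'_>) -> u32;
}

/// Generic adapter: any `M: DecoderOnlyLLM` becomes a `ModelExecutor`.
struct LlmExecutor<M> {
    model: M,
}

impl<M: DecoderOnlyLLM> ModelExecutor for LlmExecutor<M> {
    fn execute(&mut self, step: Step<'_>) -> u32 {
        // Route each scheduler step to the matching model phase.
        match step {
            Step::Prefill(prompt) => self.model.prefill(prompt),
            Step::Decode(last) => self.model.decode(last),
        }
    }
}

/// Toy model (assumption): prefill returns the prompt length,
/// decode returns the previous token plus one.
struct Counter;

impl DecoderOnlyLLM for Counter {
    fn prefill(&mut self, prompt: &[u32]) -> u32 {
        prompt.len() as u32
    }

    fn decode(&mut self, last: u32) -> u32 {
        last + 1
    }
}

/// Illustrative stub that ignores its input. The crate's real
/// stub executor may behave differently.
struct FixedTokenStub {
    token: u32,
}

impl ModelExecutor for FixedTokenStub {
    fn execute(&mut self, _step: Step<'_>) -> u32 {
        self.token
    }
}

/// Hypothetical driver: sees only the trait, never the concrete model.
fn run(exec: &mut dyn ModelExecutor, prompt: &[u32], decode_steps: usize) -> Vec<u32> {
    let mut out = vec![exec.execute(Step::Prefill(prompt))];
    for _ in 0..decode_steps {
        let last = *out.last().unwrap();
        out.push(exec.execute(Step::Decode(last)));
    }
    out
}
```

Because the driver depends only on the trait object, a stub can replace a real model in tests without touching scheduling code, which is presumably the appeal of keeping a stub executor alongside the real ones.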