Model executor implementations.
Decoder-only LLMs go through LlmExecutor (wrapping any
Box<dyn DecoderOnlyLLM>). Per-modality executors (Bert / Clip / Whisper /
Tts) remain separate — they have different forward contracts that don’t
fit the prefill/decode interface.
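The adapter relationship described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the crate's actual API: the method names `prefill`, `decode`, and `execute`, their signatures, the `SketchLlmExecutor` type, and the `CountingModel` toy are all hypothetical. Only the trait names `DecoderOnlyLLM` and `ModelExecutor` and the idea of wrapping a `Box<dyn DecoderOnlyLLM>` come from the documentation.

```rust
// Hypothetical sketch: every method signature here is an assumption made for
// illustration and may differ from the real traits in this crate.

/// Assumed decoder-only contract: one prefill pass over the prompt,
/// then one decode step per generated token.
trait DecoderOnlyLLM {
    /// Process the whole prompt and return the first generated token.
    fn prefill(&mut self, prompt: &[u32]) -> u32;
    /// Consume the previously generated token and return the next one.
    fn decode(&mut self, last: u32) -> u32;
}

/// Assumed scheduler-facing contract.
trait ModelExecutor {
    fn execute(&mut self, prompt: &[u32], max_new_tokens: usize) -> Vec<u32>;
}

/// Stand-in for the real executor: owns any boxed decoder-only model.
struct SketchLlmExecutor {
    model: Box<dyn DecoderOnlyLLM>,
}

impl ModelExecutor for SketchLlmExecutor {
    fn execute(&mut self, prompt: &[u32], max_new_tokens: usize) -> Vec<u32> {
        let mut out = Vec::with_capacity(max_new_tokens);
        if max_new_tokens == 0 {
            return out;
        }
        // Prefill runs once and yields the first token.
        let mut tok = self.model.prefill(prompt);
        out.push(tok);
        // Each decode step feeds the previous token back in.
        while out.len() < max_new_tokens {
            tok = self.model.decode(tok);
            out.push(tok);
        }
        out
    }
}

/// Toy model: prefill sums the prompt, decode increments the last token.
struct CountingModel;

impl DecoderOnlyLLM for CountingModel {
    fn prefill(&mut self, prompt: &[u32]) -> u32 {
        prompt.iter().sum()
    }
    fn decode(&mut self, last: u32) -> u32 {
        last + 1
    }
}

/// Drives the toy model through the executor.
fn run_demo() -> Vec<u32> {
    let mut exec = SketchLlmExecutor { model: Box::new(CountingModel) };
    exec.execute(&[1, 2, 3], 4)
}
```

Calling `run_demo()` pushes the toy model through one prefill and three decode steps. A model whose forward pass does not split into these two phases (an encoder such as BERT, for instance) has nothing to plug into this loop, which is consistent with the per-modality executors staying separate.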
Re-exports
pub use bert_executor::BertModelExecutor;
pub use clip_executor::ClipModelExecutor;
pub use llm_executor::LlmExecutor;
pub use stub_executor::StubModelExecutor;
pub use tts_executor::TtsModelExecutor;
pub use whisper_executor::WhisperModelExecutor;
Modules
- bert_executor - BERT model executor for embeddings.
- clip_executor - CLIP model executor for multimodal embeddings.
- common - Common executor utilities, extracted from code duplicated across the Qwen3, Qwen2, and Llama executors.
- llm_executor - LlmExecutor<M>: adapts a DecoderOnlyLLM to the ModelExecutor trait the engine scheduler calls.
- stub_executor - Stub model executor for MVP testing and development.
- tts_executor - Qwen3-TTS executor: text-to-speech pipeline wiring Talker LM + Vocoder.
- whisper_executor - Whisper ASR executor: full decode pipeline matching Python whisper.