Inference engine interfaces — split per modality.
Phase 5a step 2 splits the historical mega-trait (which mixed LLM generation, embedding, transcription, and TTS in a single trait) into a base lifecycle trait and four modality-specific traits. Each engine implementation now implements only the trait its modality needs, so the inert “unsupported” stubs are gone.
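A minimal sketch of how such a split might look, assuming the modality traits declare the base lifecycle trait as a supertrait. The method names (`name`, `is_ready`, `generate`, `embed`), the `ToyEmbedder` type, and the `describe` helper are hypothetical and chosen only for illustration; they are not taken from the crate.

```rust
// Hypothetical sketch: method names and signatures are assumptions,
// not the crate's real API.

/// Base lifecycle trait shared by every engine kind.
trait InferenceEngine {
    fn name(&self) -> &str;
    fn is_ready(&self) -> bool;
}

/// Text-generation trait; requires the base lifecycle trait.
trait LlmInferenceEngine: InferenceEngine {
    fn generate(&self, prompt: &str) -> String;
}

/// Embedding trait; requires the base lifecycle trait.
trait EmbedEngine: InferenceEngine {
    fn embed(&self, text: &str) -> Vec<f32>;
}

/// A toy embedding engine: it implements the base trait plus `EmbedEngine`
/// only, with no stub methods for generation, transcription, or TTS.
struct ToyEmbedder;

impl InferenceEngine for ToyEmbedder {
    fn name(&self) -> &str {
        "toy-embedder"
    }
    fn is_ready(&self) -> bool {
        true
    }
}

impl EmbedEngine for ToyEmbedder {
    fn embed(&self, text: &str) -> Vec<f32> {
        // Placeholder "embedding": the byte length as a single float.
        vec![text.len() as f32]
    }
}

/// Code that only needs lifecycle information can accept any engine kind.
fn describe(engine: &dyn InferenceEngine) -> String {
    format!("{} (ready: {})", engine.name(), engine.is_ready())
}
```

With this shape, asking `ToyEmbedder` to generate text is a compile-time error rather than a runtime "unsupported" result, which is the practical benefit of the split.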
Traits§
- AdvancedInferenceEngine - Advanced engine capabilities: an opt-in addition for LLM engines that support batching, speculation, runtime reconfiguration, or diagnostics.
- EmbedEngine - Embedding engine (CLIP, BERT, etc.).
- InferenceEngine - Lifecycle / status methods shared by every engine kind.
- LlmInferenceEngine - LLM text-generation engine.
- TranscribeEngine - Speech-to-text (Whisper) engine.
- TtsEngine - Text-to-speech (Qwen3-TTS, etc.) engine.
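The opt-in nature of the advanced trait can be sketched as follows. This is an illustration under assumptions: it assumes `AdvancedInferenceEngine` extends `LlmInferenceEngine`, and the methods `generate` and `generate_batch`, the types `BasicLlm` and `BatchingLlm`, and the helper `run_batch` are hypothetical names, not the crate's actual API.

```rust
// Hypothetical sketch: the trait methods below are assumed for illustration.

trait LlmInferenceEngine {
    fn generate(&self, prompt: &str) -> String;
}

/// Opt-in capability; assumed here to extend the LLM trait.
trait AdvancedInferenceEngine: LlmInferenceEngine {
    fn generate_batch(&self, prompts: &[&str]) -> Vec<String>;
}

/// An engine that supports plain generation only.
struct BasicLlm;

impl LlmInferenceEngine for BasicLlm {
    fn generate(&self, prompt: &str) -> String {
        format!("echo: {prompt}")
    }
}

/// An engine that additionally opts in to batching.
struct BatchingLlm;

impl LlmInferenceEngine for BatchingLlm {
    fn generate(&self, prompt: &str) -> String {
        format!("echo: {prompt}")
    }
}

impl AdvancedInferenceEngine for BatchingLlm {
    fn generate_batch(&self, prompts: &[&str]) -> Vec<String> {
        // Naive batching: run each prompt through `generate` in order.
        prompts.iter().map(|p| self.generate(p)).collect()
    }
}

/// Callers that need batching state it in the trait bound.
fn run_batch<E: AdvancedInferenceEngine>(engine: &E, prompts: &[&str]) -> Vec<String> {
    engine.generate_batch(prompts)
}
```

Here `run_batch(&BatchingLlm, ...)` compiles, while `run_batch(&BasicLlm, ...)` is rejected by the compiler because `BasicLlm` never opted in.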
Type Aliases§
- HardwareConstraints - Hardware constraints alias.
- LatencyRequirements - Latency requirements alias.
- RequestCharacteristics - Request characteristics alias.
- SpeculationConfig - Speculation configuration for speculative decoding.