Module engine

Inference engine interface with streaming and batch support

This module provides the top-level inference engine interface that orchestrates all other components: tokenizer, model executor, scheduler, and sampler.
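The orchestration described above can be illustrated with a minimal, self-contained sketch. Everything in it is an assumption made for illustration: `ToyEngine`, `Tokenizer`, `Executor`, `Sampler`, `Scheduler`, and their methods are hypothetical stand-ins, not the signatures of this module's `InferenceEngine` or `AdvancedInferenceEngine` traits, which this page does not show. The toy model simply predicts the next letter of the alphabet, so the flow of tokenizer, executor, sampler, and scheduler is easy to follow.

```rust
use std::collections::VecDeque;

// Hypothetical sketch: none of these names come from the module's real API.
const VOCAB: usize = 26; // toy vocabulary: 'a'..='z'

/// Toy tokenizer: one token per lowercase ASCII letter; other bytes are dropped.
struct Tokenizer;
impl Tokenizer {
    fn encode(&self, text: &str) -> Vec<usize> {
        text.bytes()
            .filter(|b| b.is_ascii_lowercase())
            .map(|b| (b - b'a') as usize)
            .collect()
    }
    fn decode(&self, token: usize) -> char {
        (b'a' + token as u8) as char
    }
}

/// Toy model executor: puts all logit mass on the successor of the last token.
struct Executor;
impl Executor {
    fn forward(&self, context: &[usize]) -> Vec<f32> {
        let mut logits = vec![0.0; VOCAB];
        let next = context.last().map_or(0, |&t| (t + 1) % VOCAB);
        logits[next] = 1.0;
        logits
    }
}

/// Greedy sampler: returns the index of the largest logit (first one on ties).
struct Sampler;
impl Sampler {
    fn sample(&self, logits: &[f32]) -> usize {
        let mut best = 0;
        for (i, &l) in logits.iter().enumerate() {
            if l > logits[best] {
                best = i;
            }
        }
        best
    }
}

/// FIFO scheduler: serves queued prompts in arrival order.
struct Scheduler {
    queue: VecDeque<String>,
}

/// Engine that wires the four components together.
struct ToyEngine {
    tokenizer: Tokenizer,
    executor: Executor,
    sampler: Sampler,
    scheduler: Scheduler,
}

impl ToyEngine {
    fn new() -> Self {
        ToyEngine {
            tokenizer: Tokenizer,
            executor: Executor,
            sampler: Sampler,
            scheduler: Scheduler { queue: VecDeque::new() },
        }
    }

    /// Streaming path: calls `on_token` as soon as each token is sampled.
    fn generate_stream(&self, prompt: &str, max_new_tokens: usize, mut on_token: impl FnMut(char)) {
        let mut context = self.tokenizer.encode(prompt);
        for _ in 0..max_new_tokens {
            let logits = self.executor.forward(&context);
            let token = self.sampler.sample(&logits);
            context.push(token);
            on_token(self.tokenizer.decode(token));
        }
    }

    /// Batch path: enqueues every prompt, drains the queue, returns one completion per prompt.
    fn generate_batch(&mut self, prompts: &[&str], max_new_tokens: usize) -> Vec<String> {
        for p in prompts {
            self.scheduler.queue.push_back(p.to_string());
        }
        let mut outputs = Vec::new();
        while let Some(prompt) = self.scheduler.queue.pop_front() {
            let mut text = String::new();
            self.generate_stream(&prompt, max_new_tokens, |c| text.push(c));
            outputs.push(text);
        }
        outputs
    }
}
```

In this sketch, streaming the prompt `"ab"` with `max_new_tokens = 3` delivers `c`, `d`, `e` to the callback one at a time, while `generate_batch` returns complete strings. A real engine would likely interleave requests inside the scheduler and use asynchronous streams rather than a callback; the sequential loop here is only a simplification.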

Traits

AdvancedInferenceEngine
Advanced engine capabilities
InferenceEngine
Core inference engine trait

Type Aliases

HardwareConstraints
Hardware constraints alias
LatencyRequirements
Latency requirements alias
RequestCharacteristics
Request characteristics alias
SpeculationConfig
Configuration for speculative decoding