Inference engine orchestrating model loading and generation.

The `InferenceEngine` is the main entry point for running inference. It owns the model, kernel dispatcher, and sampler, and provides both blocking (`InferenceEngine::generate`) and streaming (`InferenceEngine::generate_streaming`) generation APIs.
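
A minimal usage sketch of the two generation entry points. The constructor name, argument lists, return types, and crate path below are assumptions for illustration; this page only documents that `generate` blocks and `generate_streaming` streams:

```rust
use inference_engine::InferenceEngine; // crate path is an assumption

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load the model and build the engine (loading API is an assumption).
    let mut engine = InferenceEngine::load("model.safetensors")?;

    // Blocking generation: returns the full completion at once.
    let output = engine.generate("Explain borrowing in one sentence.", 128)?;
    println!("{output}");

    // Streaming generation: invokes the callback for each decoded token.
    engine.generate_streaming("Hello", 32, |token_text: &str| {
        print!("{token_text}");
    })?;
    Ok(())
}
```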
Structs

- `EngineStats` - Statistics about engine usage, accumulated over the engine’s lifetime.
- `InferenceEngine` - Top-level inference engine.
Constants

- `EOS_TOKEN_ID` - EOS token for Qwen3 models.
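
For illustration, a sketch of how a decode loop might use `EOS_TOKEN_ID` as a stop condition. The `sample_next` closure and the `u32` token type are assumptions, not part of the documented API:

```rust
use inference_engine::EOS_TOKEN_ID; // crate path is an assumption

/// Hypothetical decode loop: draws tokens from an assumed sampler closure
/// until the EOS token appears or the budget is exhausted.
fn decode(mut sample_next: impl FnMut() -> u32, max_tokens: usize) -> Vec<u32> {
    let mut tokens = Vec::with_capacity(max_tokens);
    for _ in 0..max_tokens {
        let token = sample_next();
        // Stop once the model emits the Qwen3 end-of-sequence token.
        if token == EOS_TOKEN_ID {
            break;
        }
        tokens.push(token);
    }
    tokens
}
```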