Trait InferenceEngine 

pub trait InferenceEngine: Send + Sync {
    // Required methods
    fn status<'life0, 'async_trait>(
        &'life0 self,
    ) -> Pin<Box<dyn Future<Output = EngineStatus> + Send + 'async_trait>>
       where Self: 'async_trait,
             'life0: 'async_trait;
    fn shutdown<'life0, 'async_trait>(
        &'life0 self,
    ) -> Pin<Box<dyn Future<Output = Result<()>> + Send + 'async_trait>>
       where Self: 'async_trait,
             'life0: 'async_trait;
    fn config(&self) -> &EngineConfig;
    fn metrics(&self) -> EngineMetrics;
    fn health_check<'life0, 'async_trait>(
        &'life0 self,
    ) -> Pin<Box<dyn Future<Output = HealthStatus> + Send + 'async_trait>>
       where Self: 'async_trait,
             'life0: 'async_trait;

    // Provided methods
    fn cache_metrics_snapshot(&self) -> Option<Value> { ... }
    fn lora_metrics_snapshot(&self) -> Option<Value> { ... }
}
Lifecycle / status methods shared by every engine kind.

LLM engines, embedders, transcribers, and TTS services all expose the same minimal status and metrics surface to the server and the CLI. The modality-specific traits extend this base trait.
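
The layering described above can be sketched with a supertrait bound. This is only an illustration under stated assumptions: the base trait is cut down to a single method, `EngineMetrics` is a stand-in with an invented field, and `Embedder`, `ToyEmbedder`, and `report` are hypothetical names that do not come from this crate.

```rust
// Hypothetical stand-in: the real `EngineMetrics` type is not shown on this page.
#[derive(Debug, Clone, PartialEq)]
pub struct EngineMetrics {
    pub requests_served: u64,
}

/// Cut-down base trait: only the synchronous `metrics` method is kept here.
pub trait InferenceEngine: Send + Sync {
    fn metrics(&self) -> EngineMetrics;
}

/// Hypothetical modality-specific trait layered on the base via a supertrait bound.
pub trait Embedder: InferenceEngine {
    fn embed(&self, text: &str) -> Vec<f32>;
}

pub struct ToyEmbedder;

impl InferenceEngine for ToyEmbedder {
    fn metrics(&self) -> EngineMetrics {
        EngineMetrics { requests_served: 0 }
    }
}

impl Embedder for ToyEmbedder {
    fn embed(&self, text: &str) -> Vec<f32> {
        // Toy embedding: a single dimension holding the byte length of the input.
        vec![text.len() as f32]
    }
}

/// Server or CLI code can depend on the base trait alone,
/// without knowing which modality it is talking to.
pub fn report(engine: &dyn InferenceEngine) -> u64 {
    engine.metrics().requests_served
}
```

Because `Embedder: InferenceEngine`, every embedder can be passed wherever the base trait is expected, which is what lets status and metrics code stay modality-agnostic.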

Required Methods

fn status<'life0, 'async_trait>( &'life0 self, ) -> Pin<Box<dyn Future<Output = EngineStatus> + Send + 'async_trait>>
where Self: 'async_trait, 'life0: 'async_trait,

Returns the current engine status.

fn shutdown<'life0, 'async_trait>( &'life0 self, ) -> Pin<Box<dyn Future<Output = Result<()>> + Send + 'async_trait>>
where Self: 'async_trait, 'life0: 'async_trait,

Shuts down the engine gracefully.

fn config(&self) -> &EngineConfig

Returns the engine configuration.

fn metrics(&self) -> EngineMetrics

Returns the engine metrics.

fn health_check<'life0, 'async_trait>( &'life0 self, ) -> Pin<Box<dyn Future<Output = HealthStatus> + Send + 'async_trait>>
where Self: 'async_trait, 'life0: 'async_trait,

Runs a health check and returns the resulting `HealthStatus`.
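
The `'life0` / `'async_trait` signatures above are the expanded form of `async fn` methods returning boxed futures. The sketch below implements two of them by hand and drives them with a tiny polling executor, using only the standard library. It rests on several assumptions: `EngineStatus` and `HealthStatus` are stand-in enums with invented variants, `StubEngine` and `block_on` are hypothetical, and the trait is cut down to two methods. In a real project the `#[async_trait]` macro would normally generate these signatures and a runtime such as Tokio would drive the futures.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical stand-ins for the crate's own status types.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum EngineStatus {
    Ready,
    Stopped,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum HealthStatus {
    Healthy,
}

/// Cut-down trait keeping two async methods in their expanded form.
pub trait InferenceEngine: Send + Sync {
    fn status<'life0, 'async_trait>(
        &'life0 self,
    ) -> Pin<Box<dyn Future<Output = EngineStatus> + Send + 'async_trait>>
    where
        Self: 'async_trait,
        'life0: 'async_trait;

    fn health_check<'life0, 'async_trait>(
        &'life0 self,
    ) -> Pin<Box<dyn Future<Output = HealthStatus> + Send + 'async_trait>>
    where
        Self: 'async_trait,
        'life0: 'async_trait;
}

pub struct StubEngine {
    pub running: bool,
}

impl InferenceEngine for StubEngine {
    fn status<'life0, 'async_trait>(
        &'life0 self,
    ) -> Pin<Box<dyn Future<Output = EngineStatus> + Send + 'async_trait>>
    where
        Self: 'async_trait,
        'life0: 'async_trait,
    {
        // The async block borrows `self`, which is why the future is tied to 'life0.
        Box::pin(async move {
            if self.running {
                EngineStatus::Ready
            } else {
                EngineStatus::Stopped
            }
        })
    }

    fn health_check<'life0, 'async_trait>(
        &'life0 self,
    ) -> Pin<Box<dyn Future<Output = HealthStatus> + Send + 'async_trait>>
    where
        Self: 'async_trait,
        'life0: 'async_trait,
    {
        Box::pin(async move { HealthStatus::Healthy })
    }
}

/// Waker that does nothing; sufficient for a busy-polling demo executor.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal executor for illustration only: polls one future until it completes.
pub fn block_on<F: Future + ?Sized>(mut fut: Pin<Box<F>>) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
        std::thread::yield_now();
    }
}
```

With these definitions, `block_on(engine.status())` resolves the boxed future to an `EngineStatus` value. The `Send` bound on each returned future is what allows a multi-threaded runtime to move the pending call between worker threads.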

Provided Methods

fn cache_metrics_snapshot(&self) -> Option<Value>

Optional cache metrics emitted by concrete LLM engines.

The default keeps non-LLM and stub engines source-compatible. Real engines can expose prefix/session cache counters without forcing those fields into every modality’s core metrics type.
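
A minimal sketch of how a provided method keeps existing implementors source-compatible. The definition of `Value` is not shown on this page, so a `BTreeMap<String, u64>` alias stands in for it here, and the default body is assumed to return `None`. `StubTranscriber`, `CachingLlm`, and the counter names are hypothetical.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the `Value` type used in the real signature.
pub type Value = BTreeMap<String, u64>;

/// Cut-down trait showing only the provided method.
pub trait InferenceEngine: Send + Sync {
    /// Assumed default: engines without a cache report nothing.
    fn cache_metrics_snapshot(&self) -> Option<Value> {
        None
    }
}

/// A non-LLM engine compiles without mentioning the cache at all.
pub struct StubTranscriber;

impl InferenceEngine for StubTranscriber {}

/// An LLM engine opts in by overriding the default.
pub struct CachingLlm {
    pub prefix_hits: u64,
    pub prefix_misses: u64,
}

impl InferenceEngine for CachingLlm {
    fn cache_metrics_snapshot(&self) -> Option<Value> {
        let mut snapshot = Value::new();
        snapshot.insert("prefix_hits".to_string(), self.prefix_hits);
        snapshot.insert("prefix_misses".to_string(), self.prefix_misses);
        Some(snapshot)
    }
}
```

Returning an untyped snapshot rather than adding fields to `EngineMetrics` means a new cache counter changes only the engines that emit it, while callers simply handle the `None` case for every other modality.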

fn lora_metrics_snapshot(&self) -> Option<Value>

Optional LoRA runtime metrics emitted by concrete LLM engines.

Dyn Compatibility

This trait is dyn compatible.

In older versions of Rust, dyn compatibility was called "object safety".
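
In practice, dyn compatibility means engines of different concrete types can sit behind one `dyn InferenceEngine` pointer. The sketch below assumes a cut-down trait with only `config`, a stand-in `EngineConfig` with an invented `model_name` field, and hypothetical `LlmEngine`, `TtsEngine`, `build_registry`, `model_names`, and `name_from_thread` items.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical stand-in: the real `EngineConfig` fields are not shown on this page.
#[derive(Debug, PartialEq)]
pub struct EngineConfig {
    pub model_name: String,
}

/// Cut-down trait keeping only the synchronous `config` accessor.
pub trait InferenceEngine: Send + Sync {
    fn config(&self) -> &EngineConfig;
}

pub struct LlmEngine {
    config: EngineConfig,
}

pub struct TtsEngine {
    config: EngineConfig,
}

impl InferenceEngine for LlmEngine {
    fn config(&self) -> &EngineConfig {
        &self.config
    }
}

impl InferenceEngine for TtsEngine {
    fn config(&self) -> &EngineConfig {
        &self.config
    }
}

/// Heterogeneous registry: both engine kinds are stored as trait objects.
pub fn build_registry() -> Vec<Arc<dyn InferenceEngine>> {
    let llm: Arc<dyn InferenceEngine> = Arc::new(LlmEngine {
        config: EngineConfig { model_name: "toy-llm".to_string() },
    });
    let tts: Arc<dyn InferenceEngine> = Arc::new(TtsEngine {
        config: EngineConfig { model_name: "toy-tts".to_string() },
    });
    vec![llm, tts]
}

/// Works on any mix of engines through dynamic dispatch.
pub fn model_names(engines: &[Arc<dyn InferenceEngine>]) -> Vec<String> {
    engines.iter().map(|e| e.config().model_name.clone()).collect()
}

/// The `Send + Sync` supertraits let a trait object cross thread boundaries.
pub fn name_from_thread(engine: Arc<dyn InferenceEngine>) -> String {
    thread::spawn(move || engine.config().model_name.clone())
        .join()
        .expect("worker thread panicked")
}
```

The async methods stay dyn compatible because they return `Pin<Box<dyn Future ...>>` rather than an opaque `impl Future`, so every method has a fixed, object-safe signature.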

Implementors