pub trait InferenceEngine: Send + Sync {
    // Required methods
    fn status<'life0, 'async_trait>(
        &'life0 self,
    ) -> Pin<Box<dyn Future<Output = EngineStatus> + Send + 'async_trait>>
    where
        Self: 'async_trait,
        'life0: 'async_trait;
    fn shutdown<'life0, 'async_trait>(
        &'life0 self,
    ) -> Pin<Box<dyn Future<Output = Result<()>> + Send + 'async_trait>>
    where
        Self: 'async_trait,
        'life0: 'async_trait;
    fn config(&self) -> &EngineConfig;
    fn metrics(&self) -> EngineMetrics;
    fn health_check<'life0, 'async_trait>(
        &'life0 self,
    ) -> Pin<Box<dyn Future<Output = HealthStatus> + Send + 'async_trait>>
    where
        Self: 'async_trait,
        'life0: 'async_trait;

    // Provided methods
    fn cache_metrics_snapshot(&self) -> Option<Value> { ... }
    fn lora_metrics_snapshot(&self) -> Option<Value> { ... }
}
Lifecycle / status methods shared by every engine kind.
LLM engines, embedders, transcribers, and TTS services all expose the same minimal status/metrics surface to the server / CLI. The modality-specific traits below extend this base.
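The "base trait plus modality-specific traits" layering can be sketched with a supertrait bound. This is a minimal illustration only: the `Embedder` trait, its `embed` method, the `ToyEmbedder` type, the `report` helper, and the single-field `EngineMetrics` are hypothetical stand-ins, and the base trait is reduced to one method.

```rust
// Simplified stand-in: the real EngineMetrics type is defined elsewhere in the crate.
#[derive(Debug, Clone, PartialEq)]
struct EngineMetrics {
    requests_total: u64,
}

// Reduced copy of the base trait, keeping only one synchronous method.
trait InferenceEngine: Send + Sync {
    fn metrics(&self) -> EngineMetrics;
}

// Hypothetical modality trait. The supertrait bound means every
// Embedder must also implement InferenceEngine.
trait Embedder: InferenceEngine {
    fn embed(&self, text: &str) -> Vec<f32>;
}

struct ToyEmbedder;

impl InferenceEngine for ToyEmbedder {
    fn metrics(&self) -> EngineMetrics {
        EngineMetrics { requests_total: 0 }
    }
}

impl Embedder for ToyEmbedder {
    // Toy embedding: a single component holding the byte length of the input.
    fn embed(&self, text: &str) -> Vec<f32> {
        vec![text.len() as f32]
    }
}

// Server or CLI code can be written against the base trait alone,
// without knowing which modality it is talking to.
fn report(engine: &dyn InferenceEngine) -> u64 {
    engine.metrics().requests_total
}

fn main() {
    let embedder = ToyEmbedder;
    println!("requests: {}", report(&embedder));
    println!("embedding: {:?}", embedder.embed("hello"));
}
```

With this shape, status and metrics endpoints only need `&dyn InferenceEngine`, while request handlers take the narrower modality trait.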
Required Methods
fn status<'life0, 'async_trait>(
    &'life0 self,
) -> Pin<Box<dyn Future<Output = EngineStatus> + Send + 'async_trait>>
where
    Self: 'async_trait,
    'life0: 'async_trait,
Get current engine status.
fn shutdown<'life0, 'async_trait>(
    &'life0 self,
) -> Pin<Box<dyn Future<Output = Result<()>> + Send + 'async_trait>>
where
    Self: 'async_trait,
    'life0: 'async_trait,
Shut down the engine gracefully.
fn config(&self) -> &EngineConfig
Get engine configuration.
fn metrics(&self) -> EngineMetrics
Get engine metrics.
fn health_check<'life0, 'async_trait>(
    &'life0 self,
) -> Pin<Box<dyn Future<Output = HealthStatus> + Send + 'async_trait>>
where
    Self: 'async_trait,
    'life0: 'async_trait,
Run a health check and return the resulting HealthStatus.
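The `'life0` / `'async_trait` signatures above have the shape produced by the `async-trait` macro, so implementors would normally write a plain `async fn` under `#[async_trait]`. As a rough, std-only sketch of what such a method looks like when written by hand, the following uses a reduced copy of the trait with only `health_check`. The `HealthStatus` variants, the `StubEngine` type, and the `block_on` helper are assumptions made for illustration; real code would drive the future with an async runtime instead.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Stand-in: the real HealthStatus is defined elsewhere and may look different.
#[derive(Debug, Clone, PartialEq)]
enum HealthStatus {
    Healthy,
    Unhealthy(String),
}

// Reduced copy of the trait with a single method in desugared form.
trait InferenceEngine: Send + Sync {
    fn health_check<'life0, 'async_trait>(
        &'life0 self,
    ) -> Pin<Box<dyn Future<Output = HealthStatus> + Send + 'async_trait>>
    where
        Self: 'async_trait,
        'life0: 'async_trait;
}

// Hypothetical engine that is healthy once its model is loaded.
struct StubEngine {
    loaded: bool,
}

impl InferenceEngine for StubEngine {
    fn health_check<'life0, 'async_trait>(
        &'life0 self,
    ) -> Pin<Box<dyn Future<Output = HealthStatus> + Send + 'async_trait>>
    where
        Self: 'async_trait,
        'life0: 'async_trait,
    {
        // The async block borrows `self` for 'life0 and is boxed and pinned.
        Box::pin(async move {
            if self.loaded {
                HealthStatus::Healthy
            } else {
                HealthStatus::Unhealthy("model not loaded".to_string())
            }
        })
    }
}

// Waker that does nothing; enough for futures that never wait on I/O.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Minimal busy-polling executor for demonstration purposes only.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
        std::thread::yield_now();
    }
}

fn main() {
    let engine = StubEngine { loaded: true };
    let status = block_on(engine.health_check());
    println!("{:?}", status);
}
```

The `Send` bound on the returned future is what lets a multi-threaded server move the pending call between worker threads.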
Provided Methods
fn cache_metrics_snapshot(&self) -> Option<Value>
Optional cache metrics emitted by concrete LLM engines.
The default keeps non-LLM and stub engines source-compatible. Real engines can expose prefix/session cache counters without forcing those fields into every modality’s core metrics type.
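A small sketch of how a provided method keeps existing implementors compiling while letting one engine opt in. Several things here are assumptions: the default body is elided in the signature above, so returning `None` is a guess; `Value` is replaced by a `String` alias because its real type is not shown; and `StubEngine`, `LlmEngine`, and the counter names are hypothetical.

```rust
// Stand-in for the `Value` type in the signature (assumption: the real
// type is some structured value; a String keeps this sketch std-only).
type Value = String;

// Reduced copy of the trait with only the provided method.
trait InferenceEngine {
    // Assumed default: engines without a cache report nothing.
    fn cache_metrics_snapshot(&self) -> Option<Value> {
        None
    }
}

// A stub or non-LLM engine implements nothing extra and inherits the default.
struct StubEngine;

impl InferenceEngine for StubEngine {}

// Hypothetical LLM engine that tracks prefix-cache counters.
struct LlmEngine {
    prefix_hits: u64,
    prefix_misses: u64,
}

impl InferenceEngine for LlmEngine {
    // Overrides the default to expose its counters as a JSON-like string.
    fn cache_metrics_snapshot(&self) -> Option<Value> {
        Some(format!(
            "{{\"prefix_hits\":{},\"prefix_misses\":{}}}",
            self.prefix_hits, self.prefix_misses
        ))
    }
}

fn main() {
    let stub = StubEngine;
    let llm = LlmEngine { prefix_hits: 3, prefix_misses: 1 };
    println!("stub: {:?}", stub.cache_metrics_snapshot());
    println!("llm:  {:?}", llm.cache_metrics_snapshot());
}
```

Callers can treat `None` as "this engine has no cache metrics" and skip the section in their output, so the core metrics type stays the same across modalities.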
fn lora_metrics_snapshot(&self) -> Option<Value>
Optional LoRA runtime metrics emitted by concrete LLM engines.
Dyn Compatibility
This trait is dyn compatible.
In older versions of Rust, dyn compatibility was called "object safety".
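In practice, dyn compatibility means different engine kinds can sit behind one trait-object type. The sketch below assumes a reduced trait with only `config`, a one-field `EngineConfig`, and hypothetical `LlmEngine` / `TtsEngine` types and `engine_names` helper; none of these reflect the crate's actual definitions.

```rust
use std::sync::Arc;
use std::thread;

// Stand-in: the real EngineConfig has its own fields.
struct EngineConfig {
    name: String,
}

// Reduced copy of the trait. The Send + Sync supertraits carry over
// to `dyn InferenceEngine`, so shared handles can cross threads.
trait InferenceEngine: Send + Sync {
    fn config(&self) -> &EngineConfig;
}

struct LlmEngine {
    cfg: EngineConfig,
}

struct TtsEngine {
    cfg: EngineConfig,
}

impl InferenceEngine for LlmEngine {
    fn config(&self) -> &EngineConfig {
        &self.cfg
    }
}

impl InferenceEngine for TtsEngine {
    fn config(&self) -> &EngineConfig {
        &self.cfg
    }
}

// Works on any mix of engine kinds through dynamic dispatch.
fn engine_names(engines: &[Arc<dyn InferenceEngine>]) -> Vec<String> {
    engines.iter().map(|e| e.config().name.clone()).collect()
}

fn main() {
    let llm: Arc<dyn InferenceEngine> = Arc::new(LlmEngine {
        cfg: EngineConfig { name: "llm".to_string() },
    });
    let tts: Arc<dyn InferenceEngine> = Arc::new(TtsEngine {
        cfg: EngineConfig { name: "tts".to_string() },
    });
    let engines = vec![llm, tts];

    // Cloning the Vec clones the Arc handles, not the engines themselves.
    let shared = engines.clone();
    let handle = thread::spawn(move || engine_names(&shared));
    let names = handle.join().unwrap();
    println!("{:?}", names);
}
```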