Trait InferenceBackend 

pub trait InferenceBackend: Send + Sync {
    // Required methods
    fn embed(&self, text: &str) -> Result<Vec<f32>>;
    fn chat(&self, prompt: &str) -> Result<String>;

    // Provided method
    fn attested_weights(&self) -> Option<AttestedWeights> { ... }
}

The unified inference surface. From v0.8, callers will hold a single Arc<dyn InferenceBackend> instead of separate embedder and LLM handles. At v0.7.0 the recall hot path still uses the legacy types directly, which avoids call-site churn during the v0.7.0 ship window; the trait is the seam through which the v0.8 GPU/MTP backend will be threaded.
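
As a rough illustration of the calling pattern described above, the sketch below holds one Arc<dyn InferenceBackend> and shares it with a worker thread. It is not taken from the crate: the Result alias, EchoBackend, Recall, and shared_across_threads are hypothetical names, and the trait is a trimmed local copy (required methods only) so the example compiles on its own.

```rust
use std::sync::Arc;
use std::thread;

// Assumption: stand-in for the crate's single-parameter `Result` alias.
type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;

// Trimmed local copy of the trait (required methods only).
trait InferenceBackend: Send + Sync {
    fn embed(&self, text: &str) -> Result<Vec<f32>>;
    fn chat(&self, prompt: &str) -> Result<String>;
}

// Hypothetical backend used only for this illustration.
struct EchoBackend;

impl InferenceBackend for EchoBackend {
    fn embed(&self, text: &str) -> Result<Vec<f32>> {
        // Toy "embedding": a single component holding the byte length.
        Ok(vec![text.len() as f32])
    }

    fn chat(&self, prompt: &str) -> Result<String> {
        Ok(format!("echo: {}", prompt))
    }
}

// Hypothetical caller: one handle replaces separate embedder and LLM handles.
struct Recall {
    backend: Arc<dyn InferenceBackend>,
}

impl Recall {
    fn answer(&self, query: &str) -> Result<(Vec<f32>, String)> {
        let vector = self.backend.embed(query)?;
        let reply = self.backend.chat(query)?;
        Ok((vector, reply))
    }
}

// The `Send + Sync` supertraits are what allow the handle to cross threads.
fn shared_across_threads() -> Result<String> {
    let backend: Arc<dyn InferenceBackend> = Arc::new(EchoBackend);
    let worker = Arc::clone(&backend);
    let handle = thread::spawn(move || worker.chat("hi"));
    handle.join().expect("worker thread panicked")
}
```

Because callers depend only on the trait object, swapping EchoBackend for another implementor would not change Recall.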

Required Methods§

fn embed(&self, text: &str) -> Result<Vec<f32>>

Produce a single embedding vector for text.

§Errors

Implementor-specific (model load failure, tokenisation error, device OOM, etc.). The GPU stub backend returns a "not implemented" error.

fn chat(&self, prompt: &str) -> Result<String>

Generate a chat completion for prompt. This method takes no system prompt: the default is None, so the implementor decides what to use. For explicit system-prompt support, use a concrete backend’s API.

§Errors

Implementor-specific (transport error, model unavailable, safety refusal, etc.).
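
Since both required methods return implementor-specific errors, callers generally need to handle Err rather than assume success. The sketch below shows one way a stub backend might report "not implemented" and how a caller could degrade gracefully. The names NotImplemented, StubBackend, and chat_or_fallback are hypothetical, as is the Result alias; the crate's real stub and error type may look different.

```rust
use std::fmt;

// Assumption: stand-in for the crate's single-parameter `Result` alias.
type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;

// Trimmed local copy of the trait (required methods only).
trait InferenceBackend: Send + Sync {
    fn embed(&self, text: &str) -> Result<Vec<f32>>;
    fn chat(&self, prompt: &str) -> Result<String>;
}

// Hypothetical error type carrying the name of the unsupported operation.
#[derive(Debug)]
struct NotImplemented(&'static str);

impl fmt::Display for NotImplemented {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "{} is not implemented", self.0)
    }
}

impl std::error::Error for NotImplemented {}

// Hypothetical stub: every call fails with `NotImplemented`.
struct StubBackend;

impl InferenceBackend for StubBackend {
    fn embed(&self, _text: &str) -> Result<Vec<f32>> {
        Err(NotImplemented("embed").into())
    }

    fn chat(&self, _prompt: &str) -> Result<String> {
        Err(NotImplemented("chat").into())
    }
}

// Caller-side handling: turn any backend error into a visible placeholder.
fn chat_or_fallback(backend: &dyn InferenceBackend, prompt: &str) -> String {
    match backend.chat(prompt) {
        Ok(reply) => reply,
        Err(err) => format!("(no reply: {})", err),
    }
}
```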

Provided Methods§

fn attested_weights(&self) -> Option<AttestedWeights>

Return the loaded model’s SHA-256 + optional signature for issue #654 supply-chain attestation. None if the backend has no on-disk weights to attest (e.g. a network-only client).
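
To illustrate how a provided method like this can be overridden, here is a minimal sketch under several assumptions: the fields of AttestedWeights are not shown on this page, so the struct below (a hex digest plus optional signature bytes) is a hypothetical shape, the default body is assumed to return None, and RemoteClient, LocalModel, and weights_match are invented for the example. The required methods are omitted from the local trait copy to keep it short.

```rust
// Hypothetical shape: the real `AttestedWeights` fields are not documented here.
#[derive(Debug, Clone, PartialEq)]
struct AttestedWeights {
    sha256: String,             // assumed: hex-encoded SHA-256 of the weights
    signature: Option<Vec<u8>>, // assumed: optional detached signature bytes
}

// Local copy of the trait with only the provided method.
trait InferenceBackend: Send + Sync {
    // Assumption: the elided default body returns `None`.
    fn attested_weights(&self) -> Option<AttestedWeights> {
        None
    }
}

// A network-only client has no on-disk weights, so it keeps the default.
struct RemoteClient;

impl InferenceBackend for RemoteClient {}

// A backend with local weights overrides the method.
struct LocalModel {
    digest: String,
}

impl InferenceBackend for LocalModel {
    fn attested_weights(&self) -> Option<AttestedWeights> {
        Some(AttestedWeights {
            sha256: self.digest.clone(),
            signature: None,
        })
    }
}

// Returns `None` when there is nothing to attest, otherwise whether the
// reported digest matches the expected one (hex case is ignored).
fn weights_match(backend: &dyn InferenceBackend, expected_sha256: &str) -> Option<bool> {
    backend
        .attested_weights()
        .map(|weights| weights.sha256.eq_ignore_ascii_case(expected_sha256))
}
```

Returning Option<bool> keeps "nothing to attest" distinct from "digest mismatch", which a supply-chain check would likely want to treat differently.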

Dyn Compatibility§

This trait is dyn compatible.

In older versions of Rust, dyn compatibility was called "object safety".

Implementors§