Trait InferenceHandle 

pub trait InferenceHandle: Send + Sync {
    // Required methods
    fn generate<'life0, 'async_trait>(
        &'life0 self,
        req: GenerateRequest,
    ) -> Pin<Box<dyn Future<Output = Result<String, InferenceError>> + Send + 'async_trait>>
       where Self: 'async_trait,
             'life0: 'async_trait;
    fn embed<'life0, 'async_trait>(
        &'life0 self,
        req: EmbedRequest,
    ) -> Pin<Box<dyn Future<Output = Result<Vec<Vec<f32>>, InferenceError>> + Send + 'async_trait>>
       where Self: 'async_trait,
             'life0: 'async_trait;
}
Inference operations memgine (and other embedders) need.

Implementations must be Send + Sync because memgine holds the handle in an Arc and reaches it from &self methods that may run inside tokio::spawn tasks during consolidation passes.
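
The effect of the Send + Sync bound can be sketched with the standard library alone. The trait below is a hypothetical stand-in (Handle, Stub, and label are not part of this crate), and std::thread::spawn is used in place of tokio::spawn so the sketch needs no dependencies; both impose the same Send + 'static requirement on what is moved into the task.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical stand-in for the real trait: only the supertrait bounds matter here.
pub trait Handle: Send + Sync {
    fn label(&self) -> String;
}

// Hypothetical implementor with no state.
pub struct Stub;

impl Handle for Stub {
    fn label(&self) -> String {
        "stub".to_string()
    }
}

// Compiles only for types that can be moved to and shared between threads.
fn assert_send_sync<T: Send + Sync + ?Sized>() {}

// Clones the Arc into each worker thread and collects results in spawn order.
pub fn labels_from_workers(handle: Arc<dyn Handle>, workers: usize) -> Vec<String> {
    // Holds because the trait declares Send + Sync as supertraits.
    assert_send_sync::<Arc<dyn Handle>>();

    let joins: Vec<_> = (0..workers)
        .map(|i| {
            let h = Arc::clone(&handle);
            thread::spawn(move || format!("{}-{}", h.label(), i))
        })
        .collect();

    joins.into_iter().map(|j| j.join().unwrap()).collect()
}
```

Without the supertrait bounds, Arc<dyn Handle> would be neither Send nor Sync, and the thread::spawn call above would fail to compile.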

Why these two methods. The CLI and memgine call sites only reach generate (for skill distillation, reasoning, and dream consolidation) and embed (for semantic similarity in retrieval and for the speculative summary pre-compute). Other engine methods (classification, routing, tokenization, image and video generation) are reached either through the daemon directly or via the concrete engine path. Adding them to the trait would widen the surface the daemon-proxy implementation has to cover without any benefit to memgine.

Required Methods

fn generate<'life0, 'async_trait>(
    &'life0 self,
    req: GenerateRequest,
) -> Pin<Box<dyn Future<Output = Result<String, InferenceError>> + Send + 'async_trait>>
where
    Self: 'async_trait,
    'life0: 'async_trait,

Run a generation request to completion. Same contract as InferenceEngine::generate: the caller passes a GenerateRequest (which may carry an explicit model, a routing hint, tools, or a thinking budget) and receives either the final text or an InferenceError.
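
A minimal sketch of implementing and calling generate, written in the desugared Pin<Box<dyn Future>> shape shown in the signature so it needs no external crates. Everything here is an assumption made for illustration: the GenerateRequest and InferenceError definitions are local stand-ins (the real fields are not shown on this page), the trait is a trimmed copy with only generate, and EchoHandle and block_on are hypothetical helpers.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical stand-ins: the real types live elsewhere and their fields are not shown here.
pub struct GenerateRequest {
    pub prompt: String,
}

#[derive(Debug, PartialEq)]
pub struct InferenceError(pub String);

type BoxFut<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

// Trimmed copy of the trait with only `generate`.
pub trait InferenceHandle: Send + Sync {
    fn generate<'a>(&'a self, req: GenerateRequest) -> BoxFut<'a, Result<String, InferenceError>>;
}

// Hypothetical implementor: echoes the prompt, rejects empty input.
pub struct EchoHandle;

impl InferenceHandle for EchoHandle {
    fn generate<'a>(&'a self, req: GenerateRequest) -> BoxFut<'a, Result<String, InferenceError>> {
        Box::pin(async move {
            if req.prompt.is_empty() {
                Err(InferenceError("empty prompt".to_string()))
            } else {
                Ok(format!("echo: {}", req.prompt))
            }
        })
    }
}

// Waker that does nothing; enough for futures that never wait on real I/O.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Tiny busy-polling executor so the sketch runs without an async runtime.
pub fn block_on<T>(mut fut: BoxFut<'_, T>) -> T {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
        std::thread::yield_now();
    }
}
```

In a real caller the future would be awaited inside an async runtime rather than driven by a hand-rolled block_on; the helper is only there to keep the example dependency-free.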

fn embed<'life0, 'async_trait>(
    &'life0 self,
    req: EmbedRequest,
) -> Pin<Box<dyn Future<Output = Result<Vec<Vec<f32>>, InferenceError>> + Send + 'async_trait>>
where
    Self: 'async_trait,
    'life0: 'async_trait,

Encode one or more texts as embedding vectors. Same contract as InferenceEngine::embed: returns one Vec<f32> per input text, in the same order as the inputs.
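
The one-vector-per-input, same-order contract can be illustrated with a toy implementor. This is a sketch under stated assumptions, not the crate's implementation: EmbedRequest and InferenceError are local stand-ins with invented fields, the trait is a trimmed copy with only embed, and ToyEmbedder, cosine, and block_on are hypothetical helpers. The "embedding" is just [byte length, vowel count], chosen so the output is easy to check by hand.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical stand-ins: the real types live elsewhere and their fields are not shown here.
pub struct EmbedRequest {
    pub texts: Vec<String>,
}

#[derive(Debug, PartialEq)]
pub struct InferenceError(pub String);

type BoxFut<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

// Trimmed copy of the trait with only `embed`.
pub trait InferenceHandle: Send + Sync {
    fn embed<'a>(&'a self, req: EmbedRequest) -> BoxFut<'a, Result<Vec<Vec<f32>>, InferenceError>>;
}

// Toy implementor: maps each text to [byte length, lowercase vowel count].
pub struct ToyEmbedder;

impl InferenceHandle for ToyEmbedder {
    fn embed<'a>(&'a self, req: EmbedRequest) -> BoxFut<'a, Result<Vec<Vec<f32>>, InferenceError>> {
        Box::pin(async move {
            // Iterating in input order keeps output[i] aligned with texts[i].
            let vectors: Vec<Vec<f32>> = req
                .texts
                .iter()
                .map(|t| {
                    let vowels = t.chars().filter(|c| "aeiou".contains(*c)).count();
                    vec![t.len() as f32, vowels as f32]
                })
                .collect();
            Ok::<_, InferenceError>(vectors)
        })
    }
}

// Cosine similarity, the kind of comparison a retrieval step might apply to the vectors.
pub fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

// Waker that does nothing; enough for futures that never wait on real I/O.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Tiny busy-polling executor so the sketch runs without an async runtime.
pub fn block_on<T>(mut fut: BoxFut<'_, T>) -> T {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
        std::thread::yield_now();
    }
}
```

Because the result is positional, callers can zip the returned vectors with their input texts without carrying any extra identifiers.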

Implementors