pub trait LlamaEngine: Send + Sync {
    // Required methods
    fn load_model(&self, spec: &ModelSpec) -> Result<ModelHandle>;
    fn tokenize(&self, text: &str) -> Result<Vec<TokenId>>;
    fn detokenize(&self, tokens: &[TokenId]) -> Result<String>;
    fn prefill(
        &self,
        session: &mut Session,
        tokens: &[TokenId],
    ) -> Result<PrefillResult>;
    fn decode(&self, session: &mut Session) -> Result<DecodeResult>;
    fn embed(&self, texts: &[&str]) -> Result<Vec<Vec<f32>>>;
}
The core engine trait — everything else plugs into this.
Implementations provide inference, tokenization, and embedding. Downstream crates such as oxidizedRAG and oxidizedgraph depend on engine behavior, not on implementation details, so CPU, Metal, or FFI backends can be swapped without changing application code.
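A minimal sketch of the backend-swap idea. The type names here (`ModelSpec`, `ModelHandle`, `TokenId`) are stand-ins inferred from the signatures above, and `ByteCpuEngine` is a hypothetical toy backend that "tokenizes" at the byte level; the real crate's definitions will differ:

```rust
// Stand-in types sketched from the trait signatures; not the crate's real definitions.
type TokenId = u32;
type Result<T> = std::result::Result<T, String>;

#[allow(dead_code)]
struct ModelSpec { path: String }
struct ModelHandle { id: u64 }

// A simplified slice of the trait, enough to show the abstraction boundary.
trait LlamaEngine: Send + Sync {
    fn load_model(&self, spec: &ModelSpec) -> Result<ModelHandle>;
    fn tokenize(&self, text: &str) -> Result<Vec<TokenId>>;
    fn detokenize(&self, tokens: &[TokenId]) -> Result<String>;
}

// Toy CPU backend: byte-level "tokenization" stands in for a real tokenizer.
struct ByteCpuEngine;

impl LlamaEngine for ByteCpuEngine {
    fn load_model(&self, _spec: &ModelSpec) -> Result<ModelHandle> {
        Ok(ModelHandle { id: 0 })
    }
    fn tokenize(&self, text: &str) -> Result<Vec<TokenId>> {
        Ok(text.bytes().map(TokenId::from).collect())
    }
    fn detokenize(&self, tokens: &[TokenId]) -> Result<String> {
        let bytes: Vec<u8> = tokens.iter().map(|&t| t as u8).collect();
        String::from_utf8(bytes).map_err(|e| e.to_string())
    }
}

fn main() {
    // Application code holds a trait object; swapping in a Metal or FFI
    // backend would change only this one line.
    let engine: Box<dyn LlamaEngine> = Box::new(ByteCpuEngine);
    let handle = engine
        .load_model(&ModelSpec { path: "model.gguf".into() })
        .unwrap();
    let tokens = engine.tokenize("hello").unwrap();
    // Round-trip invariant: detokenize(tokenize(s)) == s.
    assert_eq!(engine.detokenize(&tokens).unwrap(), "hello");
    println!("model {} loaded, {} tokens", handle.id, tokens.len());
}
```

Because application code only names `dyn LlamaEngine`, the backend choice is confined to the construction site.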
Required Methods
fn load_model(&self, spec: &ModelSpec) -> Result<ModelHandle>
Load a model from disk given a specification.
fn tokenize(&self, text: &str) -> Result<Vec<TokenId>>
Convert text into a sequence of token IDs.
fn detokenize(&self, tokens: &[TokenId]) -> Result<String>
Convert token IDs back into text.
fn prefill(
    &self,
    session: &mut Session,
    tokens: &[TokenId],
) -> Result<PrefillResult>
Run the prefill phase: process prompt tokens and populate the KV cache.
fn decode(&self, session: &mut Session) -> Result<DecodeResult>
Run the decode phase: produce the next token from the model.
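The prefill/decode split above is the usual two-phase generation flow: one batched pass over the prompt, then one token per `decode` call. A sketch of that loop shape, where `Session`, `PrefillResult`, `DecodeResult`, and `ToyEngine` are hypothetical toys standing in for the crate's real types:

```rust
// Hypothetical stand-in types; only the loop shape matches the trait's contract.
type TokenId = u32;

struct Session { cache: Vec<TokenId> }             // stands in for the KV cache
#[allow(dead_code)]
struct PrefillResult { last_token: TokenId }
struct DecodeResult { token: TokenId, is_eos: bool }

struct ToyEngine;

impl ToyEngine {
    // Prefill: process the whole prompt in one batch, populating the cache.
    fn prefill(&self, session: &mut Session, tokens: &[TokenId]) -> PrefillResult {
        session.cache.extend_from_slice(tokens);
        PrefillResult { last_token: *tokens.last().unwrap_or(&0) }
    }
    // Decode: emit one token per call; this toy signals EOS once the
    // cache holds 8 entries.
    fn decode(&self, session: &mut Session) -> DecodeResult {
        let next = session.cache.len() as TokenId;
        session.cache.push(next);
        DecodeResult { token: next, is_eos: session.cache.len() >= 8 }
    }
}

fn main() {
    let engine = ToyEngine;
    let mut session = Session { cache: Vec::new() };
    // Phase 1: prefill the prompt tokens in one batch.
    let _ = engine.prefill(&mut session, &[1, 2, 3]);
    // Phase 2: decode token-by-token until the engine reports EOS.
    let mut out = Vec::new();
    loop {
        let step = engine.decode(&mut session);
        out.push(step.token);
        if step.is_eos { break; }
    }
    assert_eq!(out, vec![3, 4, 5, 6, 7]);
    println!("generated {:?}", out);
}
```

The session carries the KV-cache state between the two phases, which is why both methods take `&mut Session`.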