Pluggable embedding backend abstraction.
§Overview
The EmbeddingBackend trait decouples embedding generation from the transport layer
(ONNX Runtime, Candle, static lookup, GGUF quantized). The engine selects a backend
at startup via environment variables:
| Env var | Backend chosen |
|---|---|
| DAKERA_BACKEND=onnx | OnnxBackend: production default (INT8 quantized) |
| DAKERA_BACKEND=candle | CandleBackend: pure-Rust Candle (requires candle feature) |
| DAKERA_BACKEND=static | StaticBackend: Model2Vec static lookup (~500× faster ingest) |
| DAKERA_BACKEND=gguf | GgufBackend: GGUF quantized via candle-gguf (requires candle feature) |
| (unset) | OnnxBackend: same as onnx |
All backends implement the same EmbeddingBackend trait so callers are fully backend-agnostic.
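The mapping from DAKERA_BACKEND to a backend can be sketched as a small parser. This is a minimal illustration and not the crate's actual code: the variant names of BackendKind, the helpers parse_backend and backend_from_env, and the choice to reject unknown values with an error are all assumptions, since the documentation only lists the accepted values.

```rust
/// Hypothetical mirror of the crate's `BackendKind`; the variant names are assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BackendKind {
    Onnx,
    Candle,
    Static,
    Gguf,
}

/// Map the raw value of `DAKERA_BACKEND` to a backend kind.
/// `None` models the variable being unset, which selects ONNX.
pub fn parse_backend(raw: Option<&str>) -> Result<BackendKind, String> {
    match raw {
        None | Some("onnx") => Ok(BackendKind::Onnx),
        Some("candle") => Ok(BackendKind::Candle),
        Some("static") => Ok(BackendKind::Static),
        Some("gguf") => Ok(BackendKind::Gguf),
        // Assumption: the docs do not say how unknown values are handled,
        // so this sketch reports them as errors.
        Some(other) => Err(format!("unknown DAKERA_BACKEND value: {other}")),
    }
}

/// Read the variable from the process environment and parse it.
pub fn backend_from_env() -> Result<BackendKind, String> {
    let raw = std::env::var("DAKERA_BACKEND").ok();
    parse_backend(raw.as_deref())
}
```

Keeping the parsing step separate from the environment lookup makes the table testable without mutating process-wide state.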
§Feature flags
- candle: enables CandleBackend and GgufBackend (requires the Hugging Face Candle crates)
The onnx and static backends are always compiled; they have no additional dependencies
beyond what the dakera-inference crate already requires.
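As a rough illustration of how a Cargo feature changes what a build contains, the sketch below lists the backends available under the current feature set. The helper compiled_backends is hypothetical and does not exist in the crate; the real code may instead gate whole items with `#[cfg(feature = "candle")]`.

```rust
/// Names of the backends compiled into this build (hypothetical helper).
pub fn compiled_backends() -> Vec<&'static str> {
    // onnx and static are always compiled, per the feature-flag notes.
    let mut names = vec!["onnx", "static"];
    // candle and gguf are present only when the `candle` feature is enabled.
    if cfg!(feature = "candle") {
        names.push("candle");
        names.push("gguf");
    }
    names
}
```

Building with `cargo build --features candle` would make the last two entries appear; a default build returns only the first two.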
Re-exports§
pub use onnx::OnnxBackend;
pub use static_backend::StaticBackend;
Modules§
- onnx - ONNX Runtime embedding backend.
- static_backend - Model2Vec static embedding backend.
Enums§
- BackendKind - Identifies which backend is active. Stored in HNSW node metadata so the background re-embed job knows which memories were indexed with a fast (static) embedding and need to be upgraded with the quality transformer.
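One way to picture the re-embed bookkeeping described above is a filter over per-node metadata. Everything here is an assumption made for illustration: the variant names, the needs_upgrade method, the NodeMeta struct, and the reembed_queue function are hypothetical, and the real metadata layout is not shown in these docs.

```rust
/// Hypothetical mirror of the crate's `BackendKind`; the variant names are assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BackendKind {
    Onnx,
    Candle,
    Static,
    Gguf,
}

impl BackendKind {
    /// Assumption: only static embeddings count as "fast" and get upgraded later.
    pub fn needs_upgrade(self) -> bool {
        matches!(self, BackendKind::Static)
    }
}

/// Hypothetical stand-in for the metadata stored on an HNSW node.
#[derive(Debug, Clone)]
pub struct NodeMeta {
    pub memory_id: u64,
    pub indexed_with: BackendKind,
}

/// Collect the memory ids a background re-embed job would revisit.
pub fn reembed_queue(nodes: &[NodeMeta]) -> Vec<u64> {
    nodes
        .iter()
        .filter(|node| node.indexed_with.needs_upgrade())
        .map(|node| node.memory_id)
        .collect()
}
```

Recording the backend per node lets the job skip memories that already carry a transformer embedding instead of re-embedding the whole index.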
Traits§
- EmbeddingBackend - Core embedding backend trait. All backends must be Send + Sync so they can be held behind an Arc and called from async tasks.
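The Send + Sync requirement can be illustrated with a shared trait object. This sketch uses plain OS threads rather than an async runtime to stay dependency-free; spawning onto a multi-threaded async executor typically imposes the same Send bound. The embed method signature, the toy LengthBackend, and the embed_concurrently helper are all hypothetical, as the trait's real method set is not shown in these docs.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical trait shape; the real method set is not shown in the docs.
pub trait EmbeddingBackend: Send + Sync {
    fn embed(&self, text: &str) -> Vec<f32>;
}

/// Toy backend used only for this illustration.
pub struct LengthBackend;

impl EmbeddingBackend for LengthBackend {
    fn embed(&self, text: &str) -> Vec<f32> {
        // One-dimensional "embedding": the byte length of the input.
        vec![text.len() as f32]
    }
}

/// Embed each input on its own thread while sharing a single backend.
/// Results are returned in the same order as the inputs.
pub fn embed_concurrently(
    backend: Arc<dyn EmbeddingBackend>,
    inputs: Vec<String>,
) -> Vec<Vec<f32>> {
    let handles: Vec<_> = inputs
        .into_iter()
        .map(|text| {
            // Each worker gets its own reference-counted handle to the backend.
            let backend = Arc::clone(&backend);
            thread::spawn(move || backend.embed(&text))
        })
        .collect();
    handles
        .into_iter()
        .map(|handle| handle.join().expect("worker thread panicked"))
        .collect()
}
```

Without the Send + Sync supertraits, `Arc<dyn EmbeddingBackend>` could not be moved into the spawned closures and this function would fail to compile.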
Functions§
- select_backend - Select an embedding backend based on environment variables.