
Enum Embedder 

pub enum Embedder {
    Local {
        model: Arc<BertModel>,
        tokenizer: Arc<Tokenizer>,
        device: Device,
    },
    Ollama {
        client: Arc<OllamaClient>,
        model_name: String,
        dim: usize,
        degraded: Arc<AtomicBool>,
    },
}

Semantic embedding engine supporting multiple backends.

  • Local (candle): all-MiniLM-L6-v2, 384-dim. Used at the semantic tier.
  • Ollama: nomic-embed-text-v1.5, 768-dim. Used at smart/autonomous tiers.

Variants§

Local

Candle-based local embedding (MiniLM-L6-v2, 384-dim).

v0.7.0 #1084 — model is Arc<BertModel> (no mutex). The pre-#1084 design held an Arc<Mutex<BertModel>> and locked the model across the full forward pass; on a multi-tenant HTTP daemon that serialised every embed call on a single global mutex. Candle’s BertModel::forward(&self, ...) is inference-only (weights are read-only mmap’d safetensors) so the mutex was unnecessary; parallel embed calls now run concurrently against the same weights.
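
As a rough illustration of why the mutex could be dropped, the sketch below shares one read-only value across threads through a plain Arc. ReadOnlyModel and run_parallel are hypothetical stand-ins, not the crate's types; the only property carried over from the text above is that the forward pass takes &self.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for an inference-only model: `forward` takes `&self`.
struct ReadOnlyModel {
    weights: Vec<f32>,
}

impl ReadOnlyModel {
    /// Toy forward pass: a dot product against the fixed weights.
    fn forward(&self, input: &[f32]) -> f32 {
        self.weights.iter().zip(input).map(|(w, x)| w * x).sum()
    }
}

/// Run one forward pass per input on its own thread, sharing the same weights.
fn run_parallel(model: Arc<ReadOnlyModel>, inputs: Vec<Vec<f32>>) -> Vec<f32> {
    let handles: Vec<_> = inputs
        .into_iter()
        .map(|input| {
            let model = Arc::clone(&model);
            thread::spawn(move || model.forward(&input))
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .collect()
}

fn main() {
    let model = Arc::new(ReadOnlyModel { weights: vec![1.0, 2.0, 3.0] });
    let out = run_parallel(model, vec![vec![1.0, 0.0, 0.0], vec![0.0, 1.0, 1.0]]);
    println!("{out:?}");
}
```

Because forward never needs &mut self, no lock is required and the worker threads do not serialise on each other.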

Fields

§model: Arc<BertModel>
§tokenizer: Arc<Tokenizer>
§device: Device

Ollama

Remote embed client — Ollama-native OR OpenAI-compatible wire shape (#1598). The historical variant name is preserved to avoid call-site churn; the carried crate::llm::OllamaClient routes /api/embed (Ollama) or /embeddings + Bearer (OpenAI-compatible) per its provider. dim is the model’s vector dimensionality (768 for the historical nomic default); degraded latches the outcome of the most recent embed call so the capabilities surface can report a dead remote endpoint truthfully (#1594).

Fields

§client: Arc<OllamaClient>
§model_name: String
§dim: usize
§degraded: Arc<AtomicBool>
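
The latching behaviour of degraded described above can be sketched roughly as follows. RemoteHandle and record are hypothetical names, and the Relaxed ordering is an assumption; only the idea that the flag mirrors the most recent embed outcome comes from the text.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical remote-embedder handle; only the latch logic is modelled here.
struct RemoteHandle {
    degraded: Arc<AtomicBool>,
}

impl RemoteHandle {
    /// The flag starts false, matching the documented initial state.
    fn new() -> Self {
        Self { degraded: Arc::new(AtomicBool::new(false)) }
    }

    /// Latch the outcome of one embed call, then pass the result through unchanged.
    fn record<T, E>(&self, result: Result<T, E>) -> Result<T, E> {
        self.degraded.store(result.is_err(), Ordering::Relaxed);
        result
    }

    /// True when the most recent recorded call failed.
    fn is_degraded(&self) -> bool {
        self.degraded.load(Ordering::Relaxed)
    }
}

fn main() {
    let handle = RemoteHandle::new();
    let _ = handle.record::<Vec<f32>, &str>(Err("connection refused"));
    println!("degraded after failure: {}", handle.is_degraded());
    let _ = handle.record::<Vec<f32>, &str>(Ok(vec![0.0; 768]));
    println!("degraded after success: {}", handle.is_degraded());
}
```

A later successful call clears the flag, so a reader of the flag sees the live posture rather than a sticky error.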

Implementations§

impl Embedder

pub fn new() -> Result<Self>

Create a new local (candle) embedder for MiniLM-L6-v2. Downloads the model if it is not already cached.

pub fn new_local() -> Result<Self>

Create a local candle embedder (MiniLM-L6-v2, 384-dim).

pub fn new_ollama(client: Arc<OllamaClient>) -> Self

Create an Ollama-based embedder for nomic-embed-text-v1.5 (768-dim).

Requires the Ollama client to already be connected and the model pulled.

pub fn new_remote( client: Arc<OllamaClient>, model_name: String, dim: usize, ) -> Self

#1598 — create a remote embedder for an arbitrary model + dim. client may speak either wire shape: Ollama-native (OllamaClient::new_with_url) or OpenAI-compatible (OllamaClient::new_openai_compatible — OpenRouter, HF TEI, vLLM, …). The degraded flag starts false and tracks the most recent embed outcome.

pub fn from_resolved( resolved: &ResolvedEmbeddings, tier_model: Option<EmbeddingModel>, ) -> Result<Option<Self>>

#1598 — single shared boot entry for both wiring sites (MCP stdio init + daemon_runtime::build_embedder). Consumes the canonical crate::config::AppConfig::resolve_embeddings output and the tier’s embedding-model gate:

  • tier_model = None (keyword tier) → Ok(None).
  • API backend (crate::config::is_api_embed_backend) → OpenAI-compatible remote client against resolved.url with the resolved Bearer key. Keyless self-hosted endpoints (HF TEI / vLLM) are legitimate: a missing key sends an empty Bearer value, which such servers ignore. Requires a known dim ([embeddings].dim override or the known-dims table) — bails otherwise so mismatched vectors never land silently.
  • Ollama backend → the historical Self::for_model path (MiniLM = local candle regardless; nomic = Ollama client at resolved.url). Client construction failure returns Err — callers fail closed to keyword recall (#1593), NEVER to the chat LLM client.
§Errors

Remote-client construction failure, an unknown vector dim for an API-backend model, or local model-load failure.
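
The "known dim or bail" rule for API backends might look something like the sketch below. known_dim, resolve_dim, and the table contents are hypothetical; the precedence of the explicit override over the table is also an assumption drawn from the word "override".

```rust
/// Hypothetical known-dims table; the real table's contents are not shown on this page.
fn known_dim(model: &str) -> Option<usize> {
    match model {
        "nomic-embed-text-v1.5" => Some(768),
        "all-MiniLM-L6-v2" => Some(384),
        _ => None,
    }
}

/// Resolve the vector dimensionality: explicit override first, then the table.
/// An unknown model with no override is an error, so mismatched vectors are never stored.
fn resolve_dim(model: &str, dim_override: Option<usize>) -> Result<usize, String> {
    dim_override
        .or_else(|| known_dim(model))
        .ok_or_else(|| format!("unknown embedding dim for model `{model}`; set an explicit dim"))
}

fn main() {
    println!("{:?}", resolve_dim("nomic-embed-text-v1.5", None));
    println!("{:?}", resolve_dim("some-api-model", Some(1024)));
    println!("{:?}", resolve_dim("some-api-model", None));
}
```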

pub fn for_model( model: EmbeddingModel, ollama_client: Option<Arc<OllamaClient>>, ) -> Result<Self>

Create an embedder for the specified model.

  • MiniLmL6V2 → local candle embedder
  • NomicEmbedV15 → Ollama-based (requires ollama_client)

pub fn dim(&self) -> usize

Embedding vector dimensionality for this embedder.

pub fn model_description(&self) -> String

Human-readable description of the active embedding model. #1598 — returns String (the remote variant reports its live model + dim, which may be any operator-picked API model id, not just the historical nomic default).

pub fn is_degraded(&self) -> bool

#1598 / #1594 — true when the most recent remote embed call failed (dead endpoint, auth rejection, …). The local candle embedder never degrades at runtime (weights are mmap’d at construction). Consumed by the capabilities surface so features.embedder_loaded / recall_mode_active report the LIVE posture rather than the boot-time one.

pub fn embed(&self, text: &str) -> Result<Vec<f32>>

Generate an embedding for a single text input indexed as a corpus document. Thin alias for Embedder::embed_with_role with EmbedRole::Document — the safe default for every write/index path and for symmetric comparisons.

pub fn embed_query(&self, text: &str) -> Result<Vec<f32>>

Generate an embedding for a text used as a search query. Thin alias for Embedder::embed_with_role with EmbedRole::Query. For the asymmetric Ollama nomic backend this applies the search_query: task prefix so query↔document cosine is meaningful (#1520); the symmetric local MiniLM backend ignores the role.

pub fn embed_with_role(&self, text: &str, role: EmbedRole) -> Result<Vec<f32>>

Generate an embedding for text under an explicit retrieval EmbedRole. The local candle MiniLM backend is symmetric and ignores the role; the Ollama nomic backend prepends the role-specific task-instruction prefix required by nomic-embed-text-v1.5 (#1520).
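
A minimal sketch of role-dependent prefixing, assuming the prefixes published for nomic-embed-text-v1.5 (search_document: for documents, search_query: for queries). Role, with_task_prefix, and the asymmetric flag are hypothetical names used only for illustration; the crate's EmbedRole may be shaped differently.

```rust
/// Hypothetical mirror of the Document / Query roles mentioned above.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Role {
    Document,
    Query,
}

/// Prepend the task prefix for asymmetric backends; symmetric backends get the text unchanged.
fn with_task_prefix(text: &str, role: Role, asymmetric: bool) -> String {
    if !asymmetric {
        return text.to_string();
    }
    let prefix = match role {
        Role::Document => "search_document: ",
        Role::Query => "search_query: ",
    };
    format!("{prefix}{text}")
}

fn main() {
    println!("{}", with_task_prefix("rust embeddings", Role::Query, true));
    println!("{}", with_task_prefix("rust embeddings", Role::Document, true));
    println!("{}", with_task_prefix("rust embeddings", Role::Query, false));
}
```

Queries and stored documents must both go through the matching prefix; comparing a prefixed query against unprefixed documents weakens the cosine signal on asymmetric models.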

pub fn embed_with_status(&self, text: &str) -> (Option<Vec<f32>>, EmbedStatus)

v0.7.0 F6 — generate an embedding and report the outcome.

Combines the existing Embedder::embed call with an EmbedStatus tag so the caller (HTTP store path, MCP store path, sync ingestion, …) can surface a structured signal on the response when the embedder skipped or errored. Behaviour:

  • Empty input → (None, Skipped("empty content"))
  • Input larger than EMBED_MAX_BYTES → (None, Skipped(reason))
  • Embedder errors → (None, Failed(reason))
  • Otherwise → (Some(vec), Indexed)

Callers that don’t care about the status keep using Embedder::embed; this is the new opt-in API.
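
The four outcomes listed above can be sketched as a small wrapper around any embed function. Status, MAX_BYTES, fake_ok, and fake_err are hypothetical; the real EmbedStatus type and the real EMBED_MAX_BYTES value are not shown on this page.

```rust
/// Outcome tag, mirroring the cases listed above.
#[derive(Debug, PartialEq)]
enum Status {
    Indexed,
    Skipped(String),
    Failed(String),
}

/// Hypothetical size cap; the real EMBED_MAX_BYTES value is an unknown here.
const MAX_BYTES: usize = 8 * 1024;

/// Wrap an embed function and tag the outcome instead of returning a bare error.
fn embed_with_status(
    text: &str,
    embed: impl Fn(&str) -> Result<Vec<f32>, String>,
) -> (Option<Vec<f32>>, Status) {
    if text.is_empty() {
        return (None, Status::Skipped("empty content".to_string()));
    }
    if text.len() > MAX_BYTES {
        let reason = format!("content is {} bytes, limit is {}", text.len(), MAX_BYTES);
        return (None, Status::Skipped(reason));
    }
    match embed(text) {
        Ok(vec) => (Some(vec), Status::Indexed),
        Err(reason) => (None, Status::Failed(reason)),
    }
}

/// Stand-in embedders for the demo.
fn fake_ok(_: &str) -> Result<Vec<f32>, String> {
    Ok(vec![0.5, 0.5])
}

fn fake_err(_: &str) -> Result<Vec<f32>, String> {
    Err("endpoint down".to_string())
}

fn main() {
    println!("{:?}", embed_with_status("hello", fake_ok));
    println!("{:?}", embed_with_status("hello", fake_err));
    println!("{:?}", embed_with_status("", fake_ok));
}
```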

pub fn embed_batch(&self, texts: &[&str]) -> Result<Vec<Vec<f32>>>

Generate embeddings for multiple texts in one call.

PERF-5 (FX-C4-batch2, 2026-05-26): true batched forward instead of the prior texts.iter().map(|t| self.embed(t)) fan-out. The Local arm tokenises every input, pads to the batch’s max sequence length, stacks to a (B, L) tensor, and runs BertModel::forward ONCE per batch. Candle’s per-call overhead dominates B=1 calls, so a true batch of 32 inputs is ~10-20× faster than 32 sequential calls.

The Ollama arm continues to dispatch one POST per text: the vendor wire shape for batched /api/embed differs across Ollama versions, and a wire-version probe would add the same per-call latency that batching is meant to save. The per-text loop stays in place while a LlmClient-side batched-embed API is staged.

Callers: multistep_ingest, atomisation, the periodic embedding-backfill sweep (AI_MEMORY_EMBED_BACKFILL_BATCH).
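
The pad-then-stack step of the Local arm can be pictured with plain vectors. This is only a sketch: pad_batch is a hypothetical helper, the attention mask is an assumption (it is the usual companion to padded BERT input), and the real code builds Candle tensors rather than nested Vecs.

```rust
/// Pad token-id sequences to the longest one in the batch.
/// Returns (ids, attention_mask), each shaped B x L as nested vectors.
fn pad_batch(seqs: &[Vec<u32>], pad_id: u32) -> (Vec<Vec<u32>>, Vec<Vec<u32>>) {
    let max_len = seqs.iter().map(Vec::len).max().unwrap_or(0);
    let mut ids = Vec::with_capacity(seqs.len());
    let mut mask = Vec::with_capacity(seqs.len());
    for seq in seqs {
        // Real tokens get mask 1, padding positions get mask 0.
        let mut row = seq.clone();
        let mut row_mask = vec![1u32; seq.len()];
        row.resize(max_len, pad_id);
        row_mask.resize(max_len, 0);
        ids.push(row);
        mask.push(row_mask);
    }
    (ids, mask)
}

fn main() {
    let (ids, mask) = pad_batch(&[vec![101, 7, 102], vec![101, 102]], 0);
    println!("ids  = {ids:?}");
    println!("mask = {mask:?}");
}
```

Every row ends up with the same length L, which is what allows the batch to be stacked into a single (B, L) tensor and pushed through one forward call.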

pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32

Compute cosine similarity between two embedding vectors.
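
For reference, cosine similarity is the dot product divided by the product of the two vector norms. The sketch below is one straightforward way to compute it, not the crate's source; returning 0.0 for mismatched lengths follows the behaviour this page attributes to cosine_similarity, while the zero-norm guard is an assumption.

```rust
/// Cosine similarity: dot(a, b) / (|a| * |b|).
/// Returns 0.0 for mismatched lengths, empty input, or a zero-norm vector.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() || a.is_empty() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a * norm_b)
}

fn main() {
    println!("{}", cosine_similarity(&[3.0, 4.0], &[6.0, 8.0]));
    println!("{}", cosine_similarity(&[1.0, 0.0], &[0.0, 1.0]));
    println!("{}", cosine_similarity(&[1.0, 2.0, 3.0], &[1.0, 2.0]));
}
```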

pub fn cosine_similarity_checked( query: &[f32], stored: &[f32], ) -> CosineComparison

v0.7.0 H7 — dimension-aware companion to Embedder::cosine_similarity.

Returns CosineComparison::DimensionMismatch instead of silently yielding 0.0 when the two vectors have different lengths, so the recall pipeline can report cross-model (embedder-switch) embeddings rather than dropping their semantic signal unseen. When the dimensions agree the result wraps the same value Embedder::cosine_similarity would return.
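
One possible shape for the dimension-aware variant is sketched below. Comparison stands in for CosineComparison; the Score variant name and the payload on DimensionMismatch are hypothetical, since only the DimensionMismatch name appears on this page.

```rust
/// Hypothetical stand-in for CosineComparison.
#[derive(Debug, PartialEq)]
enum Comparison {
    /// Dimensions agree; carries the plain cosine value.
    Score(f32),
    /// Dimensions differ, e.g. a stored 384-dim vector met by a 768-dim query.
    DimensionMismatch { query_dim: usize, stored_dim: usize },
}

/// Plain cosine for equal-length vectors (0.0 when either norm is zero).
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Report a mismatch explicitly instead of folding it into a 0.0 score.
fn cosine_checked(query: &[f32], stored: &[f32]) -> Comparison {
    if query.len() != stored.len() {
        return Comparison::DimensionMismatch {
            query_dim: query.len(),
            stored_dim: stored.len(),
        };
    }
    Comparison::Score(cosine(query, stored))
}

fn main() {
    println!("{:?}", cosine_checked(&[1.0, 0.0], &[1.0, 0.0]));
    println!("{:?}", cosine_checked(&[0.0; 768], &[0.0; 384]));
}
```

With a distinct variant, a recall pipeline can count or log cross-model rows rather than treating them as unrelated matches.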

pub fn fuse(primary: &[f32], secondary: &[f32], primary_weight: f32) -> Vec<f32>

Fuse a primary query embedding with a secondary context embedding via weighted linear combination (v0.6.0.0 contextual recall).

primary_weight is clamped to [0.0, 1.0]. The result is returned un-normalized: cosine_similarity divides out magnitudes, so the downstream signal is direction-only. Returns primary.to_vec() when dimensions differ (graceful fallback, same policy as cosine_similarity).
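
A minimal sketch of the fusion, assuming the secondary vector receives the complementary weight 1 - primary_weight (the page says only "weighted linear combination", so the exact formula is an assumption):

```rust
/// Weighted blend of two embeddings: w * primary + (1 - w) * secondary,
/// with w clamped to [0.0, 1.0]. Falls back to `primary` on a dimension mismatch.
fn fuse(primary: &[f32], secondary: &[f32], primary_weight: f32) -> Vec<f32> {
    if primary.len() != secondary.len() {
        return primary.to_vec();
    }
    let w = primary_weight.clamp(0.0, 1.0);
    primary
        .iter()
        .zip(secondary)
        .map(|(p, s)| w * p + (1.0 - w) * s)
        .collect()
}

fn main() {
    println!("{:?}", fuse(&[1.0, 0.0], &[0.0, 1.0], 0.5));
    println!("{:?}", fuse(&[1.0, 0.0], &[0.0, 1.0], 2.0));
    println!("{:?}", fuse(&[1.0, 0.0], &[1.0, 2.0, 3.0], 0.5));
}
```

Skipping normalisation is harmless as long as the fused vector is only ever compared by cosine, which ignores magnitude.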

Trait Implementations§

impl Clone for Embedder

fn clone(&self) -> Embedder

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Embed for Embedder

v0.7.0 L0.7 — Embed trait impl that delegates to the inherent Embedder::embed / Embedder::embed_batch methods. The inherent methods stay on Embedder verbatim so existing callers that hold a concrete &Embedder keep their fast path; the trait impl is purely additive and enables dyn Embed substitution for handler signatures (see Embed docs).

fn embed(&self, text: &str) -> Result<Vec<f32>>

Produce a single embedding vector for text.

fn embed_query(&self, text: &str) -> Result<Vec<f32>>

Produce a single embedding vector for text used as a search query. Default implementation delegates to Embed::embed, which is correct for symmetric embedders (and the test MockEmbedder); the production Embedder overrides it so the asymmetric Ollama nomic backend applies the search_query: task prefix (#1520).

fn embed_batch(&self, texts: &[&str]) -> Result<Vec<Vec<f32>>>

Produce embedding vectors for a batch of texts. Default implementation calls Embed::embed in a loop; implementors may override to do native batching.

fn is_degraded(&self) -> bool

#1598 / #1594 — true when the embedder’s most recent remote call failed (live-degraded posture). Default false (correct for local / mock embedders); the production Embedder overrides it for the remote variant so the capabilities surface reports a dead endpoint truthfully.
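
The trait defaults described above, and the dyn Embed substitution they enable, can be sketched as follows. This is a simplified stand-in, not the crate's definition: EmbedResult uses String errors, and MockEmbedder and index_all are hypothetical.

```rust
/// Simplified stand-in for the crate's Result alias.
type EmbedResult<T> = std::result::Result<T, String>;

/// Sketch of a trait with the default methods described above.
trait Embed {
    fn embed(&self, text: &str) -> EmbedResult<Vec<f32>>;

    /// Default: symmetric embedders treat queries like documents.
    fn embed_query(&self, text: &str) -> EmbedResult<Vec<f32>> {
        self.embed(text)
    }

    /// Default: loop over `embed`; implementors may override with native batching.
    fn embed_batch(&self, texts: &[&str]) -> EmbedResult<Vec<Vec<f32>>> {
        texts.iter().map(|t| self.embed(t)).collect()
    }

    /// Default: local and mock embedders never report a degraded remote.
    fn is_degraded(&self) -> bool {
        false
    }
}

/// Hypothetical test double that only implements the required method.
struct MockEmbedder;

impl Embed for MockEmbedder {
    fn embed(&self, text: &str) -> EmbedResult<Vec<f32>> {
        Ok(vec![text.len() as f32, 1.0])
    }
}

/// A handler written against `dyn Embed` accepts the mock or a real embedder.
fn index_all(embedder: &dyn Embed, texts: &[&str]) -> EmbedResult<usize> {
    Ok(embedder.embed_batch(texts)?.len())
}

fn main() {
    let mock = MockEmbedder;
    println!("{:?}", index_all(&mock, &["alpha", "beta"]));
    println!("{:?}", mock.embed_query("alpha"));
}
```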
