
Module embeddings


Enums

CosineComparison
v0.7.0 H7 — dimension-aware outcome of a recall-time cosine comparison between a live query embedding and a stored embedding whose producing model may have changed since the row was written.
EmbedRole
Retrieval role of a text handed to the embedder. Drives the asymmetric task-instruction prefix for backends that require one (Ollama nomic-embed-text-v1.5); symmetric backends (the in-process candle MiniLM-L6-v2) ignore it. See #1520.
EmbedStatus
v0.7.0 F6 — outcome of a single embedding call. Returned by Embedder::embed_with_status alongside the (possibly absent) embedding vector.
Embedder
Semantic embedding engine supporting multiple backends.
EmbeddingFormatError
Errors produced by the embedding BLOB codec. Distinguishes the three failure modes operators want to triage independently.
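
As an illustration of what a dimension-aware comparison such as CosineComparison might look like, here is a minimal sketch. The enum `CosineOutcome`, its variants, and the function `compare` are hypothetical names chosen for this example; the real variants are not shown on this page. The idea is only that a length mismatch between the query and stored vectors is reported as its own outcome instead of producing a meaningless score.

```rust
/// Hypothetical outcome type; the real `CosineComparison` variants are not
/// shown on this page.
#[derive(Debug, PartialEq)]
enum CosineOutcome {
    /// Both vectors have the same length, so the score is meaningful.
    Comparable(f32),
    /// The stored row was produced with a different embedding dimension.
    DimMismatch { query: usize, stored: usize },
}

/// Compare a live query embedding against a stored one (illustrative only).
fn compare(query: &[f32], stored: &[f32]) -> CosineOutcome {
    if query.len() != stored.len() {
        return CosineOutcome::DimMismatch {
            query: query.len(),
            stored: stored.len(),
        };
    }
    let dot: f32 = query.iter().zip(stored).map(|(a, b)| a * b).sum();
    let norm_q: f32 = query.iter().map(|a| a * a).sum::<f32>().sqrt();
    let norm_s: f32 = stored.iter().map(|a| a * a).sum::<f32>().sqrt();
    // A zero-norm vector has no direction; report 0.0 rather than NaN.
    if norm_q == 0.0 || norm_s == 0.0 {
        return CosineOutcome::Comparable(0.0);
    }
    CosineOutcome::Comparable(dot / (norm_q * norm_s))
}
```

With an outcome type like this, a caller can match on the result and skip or re-embed mismatched rows rather than ranking them.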

Constants

EMBEDDING_DIM
Constant for backward compatibility — dimension of the default (MiniLM) embedding.
EMBEDDING_HEADER_BE_F32
Magic byte declaring “big-endian f32” payload follows. Reserved — the reader rejects this until v0.7 adds endianness conversion.
EMBEDDING_HEADER_LE_F32
Magic byte declaring “little-endian f32” payload follows.
EMBED_MAX_BYTES
v0.7.0 F6 — soft cap on the input size handed to the embedder. 64 KiB matches the F10 store-path threshold so a single content blob that the embedder can’t realistically process is reported as Skipped("content > 65536 bytes") rather than blowing up the chat/embed RPC. Operators who want larger embeddings can grow this constant alongside the F10 HTTP threshold.
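
The soft cap described for EMBED_MAX_BYTES can be sketched as follows. This is an assumption-laden illustration, not the crate's code: the constant's value is taken from the "64 KiB" figure above, and the function name `oversize_reason` and its signature are hypothetical, modeled on the description of oversize_embed_reason.

```rust
/// Assumed value: the text above states the cap is 64 KiB.
const EMBED_MAX_BYTES: usize = 64 * 1024;

/// Hypothetical helper modeled on the description of `oversize_embed_reason`:
/// returns a skip reason when the input exceeds the cap, `None` otherwise.
fn oversize_reason(byte_len: usize) -> Option<String> {
    if byte_len > EMBED_MAX_BYTES {
        Some(format!("content > {} bytes", EMBED_MAX_BYTES))
    } else {
        None
    }
}
```

Input of exactly 65536 bytes passes in this sketch, because the check is "exceeds the cap", not "reaches the cap".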

Traits

Embed
v0.7.0 L0.7 — minimal dyn-compatible trait that abstracts “produces embedding vectors” away from the concrete Embedder enum.
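
To show why a dyn-compatible trait is useful here, the sketch below defines a stand-in trait and a toy backend. The method name `embed`, its signature, the `ByteHistogram` type, and `embed_all` are all hypothetical; the real Embed method set is not shown on this page. The point is only that callers can take `&dyn Embed` and stay independent of the concrete backend.

```rust
/// Hypothetical shape; the real `Embed` method set is not shown on this page.
trait Embed {
    fn embed(&self, text: &str) -> Option<Vec<f32>>;
}

/// Toy backend for illustration only: buckets input bytes into `dim` slots.
struct ByteHistogram {
    dim: usize,
}

impl Embed for ByteHistogram {
    fn embed(&self, text: &str) -> Option<Vec<f32>> {
        if self.dim == 0 {
            return None; // no meaningful vector can be produced
        }
        let mut v = vec![0.0_f32; self.dim];
        for b in text.bytes() {
            v[b as usize % self.dim] += 1.0;
        }
        Some(v)
    }
}

/// Callers depend on the trait object, not on a concrete backend.
fn embed_all(embedder: &dyn Embed, texts: &[&str]) -> Vec<Option<Vec<f32>>> {
    texts.iter().map(|t| embedder.embed(t)).collect()
}
```

A test double like `ByteHistogram` can then replace a real model in unit tests without touching the calling code.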

Functions

decode_embedding_blob
Decode an embedding BLOB back into Vec<f32>.
decoded_dim
Number of f32 elements encoded in bytes, regardless of header presence. Used by the dim_violations stats path to compute per-row dim without allocating a Vec<f32>.
embedding_document
#1558 batch 5 wave 2 — the canonical embedding/rerank document template: "{title} {content}".
encode_embedding_blob
Encode a [f32] slice as a length-prefixed BLOB suitable for the memories.embedding column.
oversize_embed_reason
#1595 — single source of the EMBED_MAX_BYTES oversize check + its human-readable skip reason. Some(reason) when byte_len exceeds the cap, None otherwise. Shared by Embedder::embed_with_status (store path) and the backfill / reembed sweeps so the client-side guard and its WARN text can never drift between the write-time and batch paths.
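
The codec functions listed above (encode_embedding_blob, decode_embedding_blob, decoded_dim) can be illustrated with a simplified round trip. This sketch assumes a one-byte header followed by little-endian f32 values; the header value `0x01`, the function names, and the `String` error type are hypothetical and do not reproduce the crate's actual on-disk layout or its EmbeddingFormatError type.

```rust
/// Hypothetical magic byte; the real value of `EMBEDDING_HEADER_LE_F32`
/// is not shown on this page.
const HEADER_LE_F32: u8 = 0x01;

/// Encode a slice as: one header byte, then each f32 in little-endian order.
fn encode(v: &[f32]) -> Vec<u8> {
    let mut out = Vec::with_capacity(1 + v.len() * 4);
    out.push(HEADER_LE_F32);
    for x in v {
        out.extend_from_slice(&x.to_le_bytes());
    }
    out
}

/// Decode a headered blob, rejecting unknown headers and ragged payloads.
fn decode(bytes: &[u8]) -> Result<Vec<f32>, String> {
    let payload = match bytes.split_first() {
        Some((&HEADER_LE_F32, rest)) => rest,
        Some((other, _)) => return Err(format!("unsupported header byte {other:#04x}")),
        None => return Err("empty blob".to_string()),
    };
    if payload.len() % 4 != 0 {
        return Err(format!("payload length {} is not a multiple of 4", payload.len()));
    }
    Ok(payload
        .chunks_exact(4)
        .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect())
}

/// Element count without allocating. Under the one-byte-header assumption,
/// integer division gives the same answer with or without the header.
fn dim(bytes: &[u8]) -> usize {
    bytes.len() / 4
}
```

Computing the dimension from the byte length alone is what lets a stats path count per-row dimensions without building a Vec<f32> for every row.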