Module embedder

Local embedding generation for the GraphRAG memory (LLM-only, one-shot per invocation).

v1.0.76: the default build is LLM-only — the binary does NOT bundle fastembed / ort / ndarray / tokenizers. All embeddings are produced by a headless invocation of claude code or codex (OAuth, no MCP, no hooks) and stored as a BLOB in memory_embeddings(memory_id, embedding, source). Vector similarity is computed in pure Rust at query time.
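To make the storage and query-time steps concrete, here is a minimal sketch of packing an f32 vector into bytes for a BLOB column and computing cosine similarity in plain Rust. The helper names (`vec_to_blob`, `blob_to_vec`, `cosine_similarity`) are hypothetical and are not the crate's own `f32_to_bytes` / `bytes_to_f32`, whose signatures this page does not show. Little-endian byte order and cosine as the similarity metric are also assumptions.

```rust
/// Hypothetical helper: serialise an f32 vector as little-endian bytes,
/// suitable for a BLOB column. The byte order is an assumption.
fn vec_to_blob(v: &[f32]) -> Vec<u8> {
    v.iter().flat_map(|x| x.to_le_bytes()).collect()
}

/// Hypothetical helper: decode a BLOB back into f32 values.
/// Returns None when the length is not a multiple of 4 bytes.
fn blob_to_vec(bytes: &[u8]) -> Option<Vec<f32>> {
    if bytes.len() % 4 != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}

/// Cosine similarity between two vectors of equal length.
/// Returns None on empty input, a length mismatch, or a zero-norm vector.
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.is_empty() || a.len() != b.len() {
        return None;
    }
    let (mut dot, mut norm_a, mut norm_b) = (0.0f32, 0.0f32, 0.0f32);
    for (x, y) in a.iter().zip(b) {
        dot += x * y;
        norm_a += x * x;
        norm_b += y * y;
    }
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a.sqrt() * norm_b.sqrt()))
}
```

In this sketch a query would decode each stored BLOB with `blob_to_vec` and rank rows by `cosine_similarity` against the query vector. Returning `Option` rather than panicking keeps a malformed row from aborting the whole search.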

The legacy fastembed pipeline is still available behind the opt-in embedding-legacy feature for the transition window. It is scheduled for removal in v1.1.0. New code MUST use the LLM path: embed_passage and embed_query in this module, both of which always call the LLM.

Functions§

bytes_to_f32
embed_passage
Embeds a single passage for storage. Delegates to the configured headless LLM (claude code / codex). Returns a 384-dim f32 vector.
embed_passage_local
embed_passages_controlled
Embeds a batch of passages with token-count-aware batching. The token_counts are still used to keep the LLM invocation under the per-call context budget, but the count is now an approximation (whitespace-split words) since the tokenizers crate was removed.
embed_passages_controlled_local
embed_query
Embeds a single query for similarity search. Uses the same model and dimensionality as embed_passage; the only difference is the LLM-side prompt prefix, which the headless invocation uses to distinguish queries from passages.
embed_query_local
embedding_dim
Returns the dimensionality of the embedding space. Used to validate LLM responses and to size the in-memory cache.
f32_to_bytes
get_embedder
Initialises the LLM-embedding client on first use and returns it.
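
The token-count-aware batching described for embed_passages_controlled can be sketched as follows. This is an illustration under stated assumptions, not the crate's implementation: `approx_token_count` and `plan_batches` are hypothetical names, the greedy grouping strategy is assumed, and the budget value is whatever the caller picks.

```rust
/// Approximates a token count as the number of whitespace-separated words,
/// mirroring the approximation described above.
fn approx_token_count(text: &str) -> usize {
    text.split_whitespace().count()
}

/// Hypothetical planner: greedily groups passage indices into batches whose
/// summed token counts stay within `budget`. A single passage that exceeds
/// the budget on its own is placed in a batch by itself.
fn plan_batches(token_counts: &[usize], budget: usize) -> Vec<Vec<usize>> {
    let mut batches = Vec::new();
    let mut current: Vec<usize> = Vec::new();
    let mut used = 0usize;

    for (idx, &count) in token_counts.iter().enumerate() {
        // Close the current batch if adding this passage would overflow it.
        if !current.is_empty() && used.saturating_add(count) > budget {
            batches.push(std::mem::take(&mut current));
            used = 0;
        }
        current.push(idx);
        used = used.saturating_add(count);
    }

    if !current.is_empty() {
        batches.push(current);
    }
    batches
}
```

Each inner vector holds the indices of the passages that would go into one LLM invocation. Because word counts only approximate real token counts, a caller following this sketch would likely set the budget conservatively below the actual context limit.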