Module llm_embedding

LLM-based embedding backend (v1.0.76 default; reworked in v1.0.79 G42).

LlmEmbedding is the production embedding client. It wraps headless invocations of the Claude Code (claude) or Codex (codex) CLI and returns f32 vectors of the active dimensionality (crate::constants::embedding_dim(), default 64).

v1.0.79 (G42) changes:

  • S1: the dimensionality is no longer hardcoded here — the single source of truth lives in crate::constants and the JSON schemas are generated dynamically.
  • S2: embed_batch embeds N numbered texts per LLM call with the {items:[{i,v}]} schema, collapsing 39 subprocess spawns into 4-5.
  • S4: the codex --output-schema file is a tempfile::NamedTempFile with a randomised name created once per client and shared across clones via Arc — no per-call write+delete, no PID-path races.
  • S5: the claude model honours SQLITE_GRAPHRAG_CLAUDE_EMBED_MODEL (symmetric to the codex env var). No model name is hardcoded without an env override.
  • S6: CLAUDE_CONFIG_DIR points at an empty managed directory BY DEFAULT, because --strict-mcp-config/--mcp-config '{}' are silently ignored upstream (anthropics/claude-code#10787) and a full ~/.claude costs ~223k cache-creation tokens per call.
  • S7: the codex request_user_input failure mode maps to an actionable error instead of an opaque exit 11.
  • BLOCO 4: every subprocess uses kill_on_drop(true) plus an explicit tokio::time::timeout, so cancellation never leaks a child and a hung LLM cannot stall the pipeline forever.
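
The batching described in S2 can be sketched with the standard library alone. This is an illustration, not the crate's code: build_batch_prompts, reassemble, the zero-based numbering, and the batch sizes are all assumptions. Only the {items:[{i,v}]} response shape and the 39-to-4-5 reduction come from the notes above. JSON parsing is left out, so reassemble takes already-decoded (i, v) pairs.

```rust
/// Hypothetical helper: renders one prompt per batch, numbering each text
/// with its zero-based index inside that batch (the numbering base is assumed).
fn build_batch_prompts(texts: &[&str], batch_size: usize) -> Vec<String> {
    assert!(batch_size > 0, "batch_size must be non-zero");
    texts
        .chunks(batch_size)
        .map(|batch| {
            batch
                .iter()
                .enumerate()
                .map(|(i, text)| format!("{i}: {text}"))
                .collect::<Vec<_>>()
                .join("\n")
        })
        .collect()
}

/// Hypothetical helper: puts decoded `{i, v}` items back into input order and
/// rejects out-of-range, duplicate, missing, or wrongly sized entries.
fn reassemble(
    items: Vec<(usize, Vec<f32>)>,
    expected: usize,
    dim: usize,
) -> Result<Vec<Vec<f32>>, String> {
    let mut slots: Vec<Option<Vec<f32>>> = vec![None; expected];
    for (i, v) in items {
        if i >= expected {
            return Err(format!("index {i} out of range"));
        }
        if v.len() != dim {
            return Err(format!("item {i} has {} dims, expected {dim}", v.len()));
        }
        if slots[i].is_some() {
            return Err(format!("duplicate index {i}"));
        }
        slots[i] = Some(v);
    }
    slots
        .into_iter()
        .enumerate()
        .map(|(i, slot)| slot.ok_or_else(|| format!("missing index {i}")))
        .collect()
}
```

With 39 texts, a batch size of 10 yields 4 prompts and a batch size of 8 yields 5, which is consistent with the "4-5" figure. The batch size the crate actually uses is not stated here.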

OAuth is the only supported credential path. The constructor rejects ANTHROPIC_API_KEY / OPENAI_API_KEY in the environment — see v1.0.69 (G31) OAuth-Only Enforcement.
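
A minimal sketch of what such a constructor-time check might look like. The function name reject_api_keys, the error wording, and the choice to treat any set value (even an empty one) as a violation are assumptions. Only the two variable names come from the text above. The environment lookup is passed in as a closure so the check can be exercised without mutating the process environment.

```rust
/// Environment variables the constructor is documented to reject.
const FORBIDDEN_KEYS: [&str; 2] = ["ANTHROPIC_API_KEY", "OPENAI_API_KEY"];

/// Hypothetical check: fails on the first forbidden variable that `lookup`
/// reports as set, and names that variable in the error message.
fn reject_api_keys<F>(lookup: F) -> Result<(), String>
where
    F: Fn(&str) -> Option<String>,
{
    for key in FORBIDDEN_KEYS {
        if lookup(key).is_some() {
            return Err(format!(
                "{key} is set; only OAuth credentials are supported, unset it and retry"
            ));
        }
    }
    Ok(())
}
```

In real use the closure would be something like `|k| std::env::var(k).ok()`.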

Structs§

LlmEmbedding

Enums§

EmbeddingFlavour

Functions§

resolve_real_binary
Follows symlinks and shell-script shim exec targets to find the real ELF binary. Shim wrappers (like ~/.graphrag-shim/codex) can strip hardening flags; bypassing them is a security requirement.
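
The resolution strategy can be sketched as follows, under stated assumptions: the names is_elf, shim_exec_target, and resolve_real_binary_sketch are hypothetical, the hop limit is invented for the example, and the shim parser only understands a literal `exec /path ...` line (no variable expansion, no relative-path handling beyond the current working directory). The actual function may differ in all of these respects.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// The four magic bytes that open every ELF file.
const ELF_MAGIC: &[u8] = b"\x7fELF";

fn is_elf(bytes: &[u8]) -> bool {
    bytes.starts_with(ELF_MAGIC)
}

/// Extracts the program path from the first `exec <path> ...` line of a
/// shell script. Shell variables in the path are not expanded.
fn shim_exec_target(script: &str) -> Option<PathBuf> {
    script.lines().map(str::trim).find_map(|line| {
        let rest = line.strip_prefix("exec ")?;
        let target = rest.split_whitespace().next()?;
        Some(PathBuf::from(target.trim_matches('"')))
    })
}

/// Resolves symlinks, then follows shim `exec` targets until an ELF file is
/// found or `max_hops` is exhausted. Reads whole files for simplicity.
fn resolve_real_binary_sketch(path: &Path, max_hops: usize) -> io::Result<PathBuf> {
    let mut current = fs::canonicalize(path)?; // resolves symlinks
    for _ in 0..max_hops {
        let bytes = fs::read(&current)?;
        if is_elf(&bytes) {
            return Ok(current);
        }
        let script = String::from_utf8_lossy(&bytes);
        match shim_exec_target(&script) {
            Some(next) => current = fs::canonicalize(next)?,
            None => {
                return Err(io::Error::new(
                    io::ErrorKind::InvalidData,
                    "not an ELF binary and no exec target found",
                ))
            }
        }
    }
    Err(io::Error::new(io::ErrorKind::Other, "too many shim hops"))
}
```

The hop limit guards against shims that exec each other in a loop.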