

Function set_embeddings_batch 

pub fn set_embeddings_batch(
    conn: &mut Connection,
    entries: &[(String, Vec<f32>)],
) -> Result<usize>

v0.7.0 Wave-2 A5 (issue #853) — batched embedding writer.

Writes a slice of (id, embedding) pairs inside a single SQLite transaction. Equivalent to calling set_embedding in a loop, but the N UPDATE round-trips (N implicit commits in autocommit mode) collapse into one transaction commit. That per-commit cost is the dominant cost on SQLite WAL once N grows past a handful of rows.

Dim-invariant policy matches set_embedding:

  • Empty embeddings are written as embedding_dim = NULL (legacy degenerate-case parity).
  • The per-namespace established dim is looked up once per namespace and cached for the rest of the batch. Any pair whose embedding length conflicts with it returns an EmbeddingDimMismatch error, and the whole transaction rolls back so callers never see a partial commit. The mismatch carries the FIRST offending pair’s namespace/established/attempted triple (consistent with the single-row path).
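
The policy above can be modelled without SQLite. What follows is a minimal sketch, not the crate's implementation: an in-memory HashMap stands in for the memories table, a cloned copy of it stands in for the open transaction, and Row, DimMismatch, established_dim and apply_batch are hypothetical names introduced for illustration. It also assumes that a namespace with no established dim accepts any embedding length, which the description above does not state.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for one row of the `memories` table.
#[derive(Debug, Clone, PartialEq)]
pub struct Row {
    pub namespace: String,
    pub embedding: Vec<f32>,
    pub embedding_dim: Option<usize>, // None models SQL NULL
}

/// Hypothetical mirror of the namespace/established/attempted triple.
#[derive(Debug, PartialEq)]
pub struct DimMismatch {
    pub namespace: String,
    pub established: usize,
    pub attempted: usize,
}

/// Established dim of a namespace: the dim of any existing row with a non-NULL dim.
fn established_dim(table: &HashMap<String, Row>, ns: &str) -> Option<usize> {
    table
        .values()
        .filter(|r| r.namespace == ns)
        .find_map(|r| r.embedding_dim)
}

/// All-or-nothing batch write against the in-memory model.
pub fn apply_batch(
    table: &mut HashMap<String, Row>,
    entries: &[(String, Vec<f32>)],
) -> Result<usize, DimMismatch> {
    // Per-namespace cache: each namespace's established dim is looked up once.
    let mut dims: HashMap<String, Option<usize>> = HashMap::new();
    // Staged copy models the open transaction; dropping it models a rollback.
    let mut staged = table.clone();
    let mut updated = 0;

    for (id, emb) in entries {
        // Unknown ids are skipped silently and do not count as updated.
        let Some(row) = staged.get(id) else { continue };
        let ns = row.namespace.clone();

        if !emb.is_empty() {
            let established = match dims.get(&ns).copied() {
                Some(d) => d,
                None => {
                    let d = established_dim(table, &ns);
                    dims.insert(ns.clone(), d);
                    d
                }
            };
            // Assumption: no established dim means any length is accepted.
            if let Some(established) = established {
                if established != emb.len() {
                    // First offending pair wins; `table` is left untouched.
                    return Err(DimMismatch {
                        namespace: ns,
                        established,
                        attempted: emb.len(),
                    });
                }
            }
        }

        let row = staged.get_mut(id).expect("id was found above");
        // Empty embeddings are stored with a NULL dim.
        row.embedding_dim = if emb.is_empty() { None } else { Some(emb.len()) };
        row.embedding = emb.clone();
        updated += 1;
    }

    *table = staged; // models the single commit
    Ok(updated)
}
```

In this model a batch containing one conflicting pair leaves the table exactly as it was, even if earlier pairs in the same batch were valid, and a successful batch returns the count of ids that were actually present.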

Returns the number of rows updated. Pairs whose id is not found in the memories table are silently skipped, as in set_embedding, where UPDATE … WHERE id = ? returns Ok(0) and the function still returns Ok(()).

Boot backfill use: crate::mcp::run_mcp_server calls this in fixed-size chunks (see DEFAULT_EMBED_BACKFILL_BATCH_SIZE) so the embedder produces vectors in parallel-friendly bursts and the SQLite commit cost amortises across the batch.
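
The chunked calling pattern described above can be sketched as follows. This is an illustration under stated assumptions, not the code in run_mcp_server: backfill_in_chunks is a hypothetical driver, and the embedder and the batch writer are passed in as closures so the sketch stays independent of the crate's types. The write_batch closure plays the role of set_embeddings_batch.

```rust
/// Hypothetical backfill driver: embeds `ids` in fixed-size chunks and hands
/// each chunk of (id, embedding) pairs to `write_batch` as one batch.
pub fn backfill_in_chunks<E, W, WErr>(
    ids: &[String],
    batch_size: usize,
    mut embed: E,
    mut write_batch: W,
) -> Result<usize, WErr>
where
    E: FnMut(&[String]) -> Vec<Vec<f32>>,
    W: FnMut(&[(String, Vec<f32>)]) -> Result<usize, WErr>,
{
    // `chunks` panics on a zero chunk size, so reject it up front.
    assert!(batch_size > 0, "batch_size must be non-zero");

    let mut total = 0;
    for chunk in ids.chunks(batch_size) {
        // One embedder call per chunk, producing one vector per id.
        let vectors = embed(chunk);
        let entries: Vec<(String, Vec<f32>)> =
            chunk.iter().cloned().zip(vectors).collect();
        // One batch write (one commit in the real function) per chunk;
        // the first failing chunk stops the backfill.
        total += write_batch(&entries)?;
    }
    Ok(total)
}
```

Because each chunk is its own batch, a failure in a later chunk does not undo chunks that were already written; only the failing chunk is rolled back.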

§Errors

  • Returns EmbeddingDimMismatch (boxed via anyhow) if any pair’s embedding dim disagrees with the namespace-established dim. The transaction is rolled back; no rows are mutated.
  • Returns the underlying SQLite error on transaction/prepare/execute failure.
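
A caller may want to tell the two error cases apart. The sketch below shows the general downcasting pattern using only the standard library's Box<dyn Error> as a stand-in for the anyhow-boxed error; anyhow::Error offers a downcast_ref method of similar shape. The EmbeddingDimMismatch struct here is a hypothetical stand-in whose fields are assumed from the namespace/established/attempted triple, and describe_failure is an illustrative helper, not part of the crate.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical stand-in; the crate's real error type may differ.
#[derive(Debug)]
pub struct EmbeddingDimMismatch {
    pub namespace: String,
    pub established: usize,
    pub attempted: usize,
}

impl fmt::Display for EmbeddingDimMismatch {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "namespace {} expects dim {}, got {}",
            self.namespace, self.established, self.attempted
        )
    }
}

impl Error for EmbeddingDimMismatch {}

/// Illustrative helper: distinguishes a dim mismatch from any other failure.
pub fn describe_failure(err: &(dyn Error + 'static)) -> String {
    match err.downcast_ref::<EmbeddingDimMismatch>() {
        // Dim mismatch: the batch was rolled back, so the caller can fix
        // the offending vectors and resubmit the whole batch.
        Some(m) => format!(
            "re-embed namespace '{}' at dim {} (got {})",
            m.namespace, m.established, m.attempted
        ),
        // Anything else is treated as a storage-level failure.
        None => format!("storage error: {}", err),
    }
}
```

Since either failure leaves no rows mutated, retrying the same batch after correcting the cause does not risk double-applying part of it.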