sqlite-graphrag 1.0.77

# sqlite-graphrag (v1.0.76 LLM-Only One-Shot)

> Persistent memory for 27 AI agents in a single 6 MB Rust binary (LLM-only, one-shot)

27 AI agents. One 6 MB binary. Zero model download. sqlite-graphrag v1.0.76 gives every AI coding assistant a local, fast, private memory layer backed by a single SQLite file. Every `remember` / `ingest` / `recall` / `hybrid-search` spawns a headless `claude code` or `codex` subprocess (OAuth) for embedding generation. No daemon, no ONNX runtime, no 1.1 GB `multilingual-e5-small` model to download. Sub-50ms recall (with the LLM subprocess spawn dominating the first call). Graph-native retrieval. Deterministic JSON output ready for pipeline orchestration.

- Read this document in [Portuguese (pt-BR)](llms.pt-BR.txt).
- Architecture rationale: [ADR-0019](docs/decisions/adr-0019-llm-only-one-shot.md) (LLM-only one-shot), [ADR-0020](docs/decisions/adr-0020-pure-rust-cosine.md) (pure-Rust cosine), [ADR-0021](docs/decisions/adr-0021-deprecate-daemon.md) (daemon deprecated), [ADR-0022](docs/decisions/adr-0022-blob-embeddings.md) (BLOB-backed embeddings), [ADR-0023](docs/decisions/adr-0023-remove-tokenizers.md) (tokenizers removed), [ADR-0024](docs/decisions/adr-0024-fts5-coarse-cosine-refine.md) (FTS5 coarse + cosine refine), [ADR-0025](docs/decisions/adr-0025-oauth-only-embedding.md) (OAuth-only LLM)


## v1.0.76 Architecture (LLM-Only)
- The default build is LLM-only and one-shot. There is no daemon, no ONNX runtime, no `libonnxruntime.so` to ship, and no `multilingual-e5-small` model to download. The LLM subprocess IS the model.
- Embedding generation: a headless `claude code` (Anthropic OAuth) or `codex` (ChatGPT Pro OAuth) subprocess is spawned per call. The CLI never holds an embedding model in memory.
- Vector search: cosine similarity is computed in pure Rust over the BLOB embeddings stored in `memory_embeddings`, `entity_embeddings`, and `chunk_embeddings`. The `sqlite-vec` extension is gone.
- Daemon: deprecated in v1.0.76 and slated for removal in v1.1.0. It still works for source compatibility but no longer offers a speedup.
- `embedding-legacy` feature: opt-in transition window that keeps the v1.0.74 fastembed + ort + tokenizers pipeline. It will be removed in v1.1.0. Install with `cargo install sqlite-graphrag --features embedding-legacy --locked`.
- Migration from v1.0.74/v1.0.75: `sqlite-graphrag migrate --to-llm-only --drop-vec-tables --db <path>` is the one-shot upgrade. It combines the checksum rewrite (`--rehash`) with the V013 vec-table drop. The BLOB-backed embedding tables stay empty until the next `remember` / `edit` / `ingest` re-embeds via the LLM.
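
To make the "pure-Rust cosine over BLOB embeddings" idea concrete, here is a minimal sketch of a brute-force scan. It assumes the BLOBs hold packed little-endian `f32` values, which this document does not state; the function names `decode_f32_le`, `cosine`, and `top_k` are hypothetical and are not the crate's API.

```rust
/// Decode a BLOB of packed little-endian f32 values (ASSUMED layout).
/// Returns None when the byte length is not a multiple of 4.
pub fn decode_f32_le(blob: &[u8]) -> Option<Vec<f32>> {
    if blob.len() % 4 != 0 {
        return None;
    }
    Some(
        blob.chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}

/// Cosine similarity. Returns None for mismatched lengths,
/// empty inputs, or zero-norm vectors.
pub fn cosine(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let (mut dot, mut na, mut nb) = (0.0f32, 0.0f32, 0.0f32);
    for (x, y) in a.iter().zip(b) {
        dot += x * y;
        na += x * x;
        nb += y * y;
    }
    if na == 0.0 || nb == 0.0 {
        return None;
    }
    Some(dot / (na.sqrt() * nb.sqrt()))
}

/// Brute-force top-k: score every (row id, BLOB) pair against the query,
/// sort by similarity descending, break ties by ascending row id.
pub fn top_k(query: &[f32], rows: &[(i64, Vec<u8>)], k: usize) -> Vec<(i64, f32)> {
    let mut scored: Vec<(i64, f32)> = rows
        .iter()
        .filter_map(|(id, blob)| {
            let v = decode_f32_le(blob)?;
            Some((*id, cosine(query, &v)?))
        })
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
    scored.truncate(k);
    scored
}
```

Rows whose BLOB cannot be decoded or compared are skipped rather than reported, which is a simplification chosen for the sketch.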


## Primary Documentation
### Canonical English sources for LLM ingestion
- [README](https://github.com/daniloaguiarbr/sqlite-graphrag/blob/main/README.md): full installation, command reference, integration table and FAQ
- [HOW_TO_USE](https://github.com/daniloaguiarbr/sqlite-graphrag/blob/main/docs/HOW_TO_USE.md): step-by-step walkthrough from install to first hybrid search in 60 seconds
- [COOKBOOK](https://github.com/daniloaguiarbr/sqlite-graphrag/blob/main/docs/COOKBOOK.md): 34+ recipes covering ingestion, retrieval, graph traversal, backup, audit, and the v1.0.76 LLM-only migration
- [AGENTS](https://github.com/daniloaguiarbr/sqlite-graphrag/blob/main/docs/AGENTS.md): persuasive guide for AI agent authors: economics, JSON contract, exit code routing
- [MIGRATION](https://github.com/daniloaguiarbr/sqlite-graphrag/blob/main/docs/MIGRATION.md): upgrade path v1.0.74/v1.0.75 → v1.0.76 → v1.1.0
- [INTEGRATIONS](https://github.com/daniloaguiarbr/sqlite-graphrag/blob/main/INTEGRATIONS.md): vendor-specific configuration for all 27 supported agents and IDEs
- [CHANGELOG](https://github.com/daniloaguiarbr/sqlite-graphrag/blob/main/CHANGELOG.md): full release history with migration notes
- [CONTRIBUTING](https://github.com/daniloaguiarbr/sqlite-graphrag/blob/main/CONTRIBUTING.md): pull request workflow and coding standards
- [SECURITY](https://github.com/daniloaguiarbr/sqlite-graphrag/blob/main/SECURITY.md): responsible disclosure policy and contact channel
- [CODE_OF_CONDUCT](https://github.com/daniloaguiarbr/sqlite-graphrag/blob/main/CODE_OF_CONDUCT.md): community standards


## Core Commands
### Subcommands grouped by lifecycle
- `init` initializes the SQLite database and validates that an LLM CLI (`claude` or `codex`) is reachable on `PATH`. No model download.
- `remember` saves a memory with name, type, description, body and optional entity graph, spawning an LLM subprocess per call for embedding generation
- `recall` performs pure-Rust cosine KNN similarity search over stored BLOB embeddings
- `hybrid-search` fuses FTS5 full-text and pure-Rust cosine KNN via Reciprocal Rank Fusion
- `deep-research` decomposes a query into up to 7 sub-queries and computes a separate embedding for each one via the LLM subprocess. Per sub-query it runs parallel vector + FTS hybrid search fused via RRF plus a 3-hop graph traversal, then deduplicates the results and assembles directed evidence chains
- `read` `list` `forget` `rename` `edit` `history` `restore` manage the memory lifecycle
- `remember-batch` ingests multiple memories from an NDJSON stream in a single invocation
- `completions` generates shell completion scripts for bash, zsh, fish, elvish and powershell
- `link` `unlink` `related` manage typed entity relationships for multi-hop traversal
- `health` `stats` `migrate` `vacuum` `optimize` `sync-safe-copy` manage the database
- `migrate --rehash` rewrites recorded migration checksums to match the current file content (one-shot fix for v1.0.74 → v1.0.76 V002 mismatch)
- `migrate --to-llm-only --drop-vec-tables` one-shot upgrade for v1.0.74/v1.0.75 databases: rehash + apply V013 + report vec-table state
- `backup` creates a consistent database backup using SQLite Online Backup API, safe with WAL mode
- `fts rebuild` `fts check` `fts stats` repair, verify, and inspect the FTS5 full-text index
- `memory-entities` lists all entities linked to a named memory
- `delete-entity` removes an entity and optionally cascades to its relationships
- `rename-entity` renames a graph entity preserving all FK-based relationships and re-embeds for vector search
- `reclassify` changes the `entity_type` of one entity or a whole category via `--batch`
- `merge-entities` merges duplicate entity nodes into a single canonical node
- `prune-ner` removes NER-generated bindings from one entity (`--entity`) or all (`--all --yes`)
- `remember --dry-run` previews parsing and graph extraction without writing to the database or spawning the LLM
- `cleanup-orphans` `prune-relations` remove orphaned entities and weak or unused relations from the graph
- `purge` `namespace-detect` handle maintenance and namespace resolution
- `reclassify-relation` renames relationship types across the graph
- `normalize-entities` normalizes all entity names in the namespace to lowercase kebab-case ASCII
- `enrich` LLM-augmented graph quality pipeline via `--mode claude-code` or `--mode codex`
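
The Reciprocal Rank Fusion step named for `hybrid-search` and `deep-research` can be sketched as follows. This is the textbook formula, not code taken from the crate: the function name `rrf_fuse` is hypothetical, and the constant `k` (commonly 60) is an assumption, since this document does not state which value the tool uses.

```rust
use std::collections::HashMap;

/// Fuse several ranked lists of ids with Reciprocal Rank Fusion.
/// score(id) = sum over lists of 1 / (k + rank), with rank starting at 1.
/// Output is sorted by score descending, ties broken by id ascending.
pub fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<&str, f64> = HashMap::new();
    for list in rankings {
        for (i, id) in list.iter().enumerate() {
            // `i` is 0-based, so the 1-based rank is i + 1.
            *scores.entry(*id).or_insert(0.0) += 1.0 / (k + (i as f64) + 1.0);
        }
    }
    let mut fused: Vec<(String, f64)> = scores
        .into_iter()
        .map(|(id, s)| (id.to_string(), s))
        .collect();
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

An id that appears near the top of both the FTS5 list and the cosine list outranks an id that is first in only one of them, which is the reason to fuse by rank rather than by raw scores on different scales.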


## Environment Variables
### Runtime configuration surface
- `SQLITE_GRAPHRAG_DB_PATH` explicit override for the SQLite database path
- Default behavior creates or opens `graphrag.sqlite` in the current working directory
- `SQLITE_GRAPHRAG_CACHE_DIR` directory for embedding model cache files
- `SQLITE_GRAPHRAG_LOG_LEVEL` tracing filter level with values `error` through `trace`
- `SQLITE_GRAPHRAG_LANG` CLI output language with values `en` or `pt`
- `SQLITE_GRAPHRAG_NAMESPACE` namespace override respected by all commands (fixed in v1.0.51)
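
A wrapper that needs to know which database file the CLI will open could mirror the rule above: use `SQLITE_GRAPHRAG_DB_PATH` when set, otherwise `graphrag.sqlite` in the current working directory. The sketch below is an assumption-laden illustration: the function names are hypothetical, treating an empty value as unset is a guess, and any precedence of CLI flags over the variable is not modelled.

```rust
use std::path::{Path, PathBuf};

/// Resolve the database path from an optional override and a working directory.
/// ASSUMPTION: an empty or whitespace-only override is treated as unset.
pub fn resolve_db_path(env_override: Option<&str>, cwd: &Path) -> PathBuf {
    match env_override {
        Some(p) if !p.trim().is_empty() => PathBuf::from(p),
        _ => cwd.join("graphrag.sqlite"),
    }
}

/// Convenience wrapper that reads the real process environment.
pub fn db_path_from_env() -> std::io::Result<PathBuf> {
    let value = std::env::var("SQLITE_GRAPHRAG_DB_PATH").ok();
    let cwd = std::env::current_dir()?;
    Ok(resolve_db_path(value.as_deref(), &cwd))
}
```

Keeping the pure function separate from the environment lookup makes the rule easy to test without mutating process state.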


## Graph Input
### Minimal payload contract for `remember`
- `--entities-file` expects a JSON array of entity objects
- Each entity must include `name` plus `entity_type` or alias `type`
- Valid `entity_type` values: `project`, `tool`, `person`, `file`, `concept`, `incident`, `decision`, `memory`, `dashboard`, `issue_tracker`, `organization`, `location`, `date`
- `--relationships-file` expects a JSON array of relationship objects
- Each relationship must include `source`/`from`, `target`/`to`, `relation`, and `strength`
- `strength` must be a float in `[0.0, 1.0]` and maps to `weight` in graph outputs
- File payload relations use underscore labels like `applies_to`, `depends_on`, and `tracked_in`; dashed aliases are normalized before storage
- CLI flags on `link` and `unlink` use dashed labels like `applies-to`, `depends-on`, and `tracked-in`
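
Before writing a payload file, a client may want to check the constraints listed above. The following sketch validates entity types, normalizes dashed relation labels to underscores, and checks the `strength` range. The struct and function names are hypothetical, and the CLI's own validation may differ in detail.

```rust
/// Entity types accepted according to the list above.
const ENTITY_TYPES: [&str; 13] = [
    "project", "tool", "person", "file", "concept", "incident", "decision",
    "memory", "dashboard", "issue_tracker", "organization", "location", "date",
];

/// Hypothetical in-memory form of one relationship object.
pub struct Relationship<'a> {
    pub source: &'a str,
    pub target: &'a str,
    pub relation: &'a str,
    pub strength: f64,
}

pub fn is_valid_entity_type(t: &str) -> bool {
    ENTITY_TYPES.contains(&t)
}

/// File payloads use underscore labels; dashed aliases are normalized.
pub fn normalize_relation(label: &str) -> String {
    label.replace('-', "_")
}

/// Returns the normalized relation label, or a description of the problem.
pub fn validate_relationship(r: &Relationship) -> Result<String, String> {
    if r.source.is_empty() || r.target.is_empty() || r.relation.is_empty() {
        return Err("source, target and relation are required".to_string());
    }
    // NaN fails this range check as well.
    if !(0.0..=1.0).contains(&r.strength) {
        return Err(format!("strength {} is outside [0.0, 1.0]", r.strength));
    }
    Ok(normalize_relation(r.relation))
}
```

Rejecting bad payloads on the client side avoids spending an LLM subprocess spawn on a `remember` call that would fail validation anyway.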


## NER Extraction
### GLiNER zero-shot entity extraction
- Pass `--enable-ner` or set `SQLITE_GRAPHRAG_ENABLE_NER=1` to activate on `remember` and `ingest`
- Works with `--graph-stdin`: pass `"entities": []` in JSON and GLiNER extracts automatically
- Model variants via `--gliner-variant`: `fp32` (1.1 GB), `fp16` (580 MB), `int8` (349 MB), `q4` (894 MB), `q4f16` (472 MB)
- Override model with `SQLITE_GRAPHRAG_GLINER_MODEL`; tune threshold with `SQLITE_GRAPHRAG_GLINER_THRESHOLD` (default `0.5`)
- Response field `extraction_method`: `gliner-<variant>+regex`, `regex-only`, or `none:extraction-failed`
- `--skip-extraction` is deprecated since v1.0.45; use `--enable-ner` instead
- `--max-rss-mb <MiB>` on `remember` and `ingest` aborts embedding when process RSS exceeds the threshold (default 8192 MiB)

## Ingest Modes (v1.0.62)
### Four extraction modes for bulk ingestion
- `--mode none` (default): body-only ingestion, no entity/relationship extraction
- `--mode gliner`: GLiNER NER extraction (requires `--enable-ner`, uses local ONNX model)
- `--mode claude-code`: LLM-curated extraction via locally installed Claude Code CLI binary
- Claude Code mode spawns `claude -p` headless with `--json-schema` for structured entity/relationship output
- Requires Claude Code >= 2.1.0 installed on the user's machine with active Pro/Max subscription
- Resumable via `--resume` flag with SQLite queue DB tracking per-file progress
- Per-file subprocess timeout via `--claude-timeout <S>` (default 300s); kills hung `claude -p` processes
- Budget control via `--max-cost-usd <N>` to limit cumulative API spend
- Rate limit handling with automatic exponential backoff (60s → 120s → 300s → 900s)
- Output is NDJSON: per-file events with `entities`, `rels`, `cost_usd` fields, followed by summary
- `--mode codex` spawns `codex exec --json` per file for LLM-curated extraction via OpenAI Codex CLI
- Requires Codex CLI installed locally; uses `--output-schema` for structured JSON output
- Codex-specific flags: `--codex-binary <PATH>`, `--codex-model <MODEL>`, `--codex-timeout <S>` (default 300)
- Environment variable `SQLITE_GRAPHRAG_CODEX_BINARY` overrides PATH lookup
- Token usage tracking via `input_tokens` and `output_tokens` fields (cost_usd unavailable from Codex CLI)
- Full embedding pipeline applied to Codex-ingested memories for recall and hybrid-search

Authentication: OAuth works out of the box for both modes, so no API key is needed.
- `--mode claude-code` reads OAuth from `~/.claude/.credentials.json` (Claude Pro/Max/Team)
- `--mode codex` reads device auth from `codex auth login` (OpenAI)
- API keys (`ANTHROPIC_API_KEY`, `OPENAI_API_KEY`) abort the spawn with `AppError::Validation`; OAuth is the only accepted credential flow
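
An orchestrator that wraps `ingest` might mirror two of the rules above: the rate-limit backoff schedule (60s, 120s, 300s, 900s) and the refusal to run when an API key variable is set. The sketch below is illustrative only. The function names are hypothetical, and clamping at 900s once the schedule is exhausted is an assumption, because the document does not say what happens after the fourth step.

```rust
use std::time::Duration;

/// Backoff steps listed in the rate-limit bullet, in seconds.
const BACKOFF_SECS: [u64; 4] = [60, 120, 300, 900];

/// Delay before retry number `attempt` (0-based).
/// ASSUMPTION: attempts past the last step stay at the final delay.
pub fn backoff_delay(attempt: usize) -> Duration {
    let idx = attempt.min(BACKOFF_SECS.len() - 1);
    Duration::from_secs(BACKOFF_SECS[idx])
}

/// Names of API key variables that are set according to `lookup`.
/// A non-empty result means the LLM spawn would be rejected.
pub fn api_keys_present(lookup: impl Fn(&str) -> Option<String>) -> Vec<&'static str> {
    ["ANTHROPIC_API_KEY", "OPENAI_API_KEY"]
        .iter()
        .copied()
        .filter(|name| lookup(*name).is_some())
        .collect()
}
```

In a real wrapper the lookup would typically be `|k| std::env::var(k).ok()`; passing it as a closure keeps the check testable without touching the process environment.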


## Exit Codes
### Deterministic status for pipeline routing
- `0` success: continue the agent loop
- `1` validation or runtime failure: log and surface to operator
- `2` invalid CLI argument: fix usage then retry
- `3` optimistic update conflict: re-read `updated_at` and retry
- `4` memory or entity not found: handle missing resource gracefully
- `5` namespace could not be resolved: pass `--namespace` explicitly
- `6` payload exceeded configured limits: split body into smaller chunks
- `9` duplicate memory detected: skip or use `--force-merge`
- `10` SQLite database error: run `health` to inspect integrity
- `11` embedding generation failed: check model files and retry
- `12` sqlite-vec extension failed to load: reinstall with bundled extension
- `13` batch partial failure: honor backoff and retry later
- `14` filesystem I/O error: for example, the cache directory is not writable or the ingest target directory does not exist
- `15` database busy after retries: wait and retry
- `20` internal or JSON serialization error
- `75` EX_TEMPFAIL: all concurrency slots busy OR job singleton locked, retry with backoff
- `77` available RAM below minimum required to load embedding model
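
The table above lends itself to a small routing function in an agent loop. This is one possible mapping, not a prescribed one: the `Action` enum and both function names are hypothetical, and grouping codes `1`, `10`, `11`, `12`, `14`, `20`, and `77` under a single "surface to operator" action is a simplification made for the sketch.

```rust
use std::process::Command;

/// Hypothetical actions an agent loop might take after a CLI call.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    Continue,         // 0
    FixUsage,         // 2
    RereadAndRetry,   // 3
    HandleMissing,    // 4
    PassNamespace,    // 5
    SplitPayload,     // 6
    SkipOrMerge,      // 9
    RetryWithBackoff, // 13, 15, 75
    Surface,          // everything else: log and hand to the operator
}

/// Map an exit code from the table above to an action.
pub fn route(code: i32) -> Action {
    match code {
        0 => Action::Continue,
        2 => Action::FixUsage,
        3 => Action::RereadAndRetry,
        4 => Action::HandleMissing,
        5 => Action::PassNamespace,
        6 => Action::SplitPayload,
        9 => Action::SkipOrMerge,
        13 | 15 | 75 => Action::RetryWithBackoff,
        _ => Action::Surface,
    }
}

/// Run the CLI with the given arguments and route on its exit status.
/// A process killed by a signal has no exit code and is surfaced.
pub fn run_and_route(args: &[&str]) -> std::io::Result<Action> {
    let status = Command::new("sqlite-graphrag").args(args).status()?;
    Ok(status.code().map_or(Action::Surface, route))
}
```

Keeping `route` free of I/O means the mapping can be unit-tested without the binary installed.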


## What Changed in v1.0.68
### Critical fixes (G28 + G29)
- v1.0.68 is the first release since v1.0.65 that compiles on Windows via `cargo install`. v1.0.66 and v1.0.67 failed with `error[E0308]` in `src/terminal.rs:29` because `HANDLE` in `windows-sys >= 0.59` is `*mut c_void` (it was `isize` in 0.48/0.52). The fix checks `!handle.is_null() && handle != INVALID_HANDLE_VALUE`, pins `windows-sys` to exactly `=0.59.0`, and adds the CI job `windows-build-check`.
- `enrich`, `ingest --mode claude-code`, and `ingest --mode codex` now acquire a per-namespace singleton via `lock::acquire_job_singleton(job_type, namespace, wait_seconds)`.  A second concurrent invocation on the same database returns `AppError::JobSingletonLocked { job_type, namespace }` (exit 75, retryable) instead of stacking 4 × N workers × 10 MCP servers.
- `claude_runner::build_claude_command` now respects `SQLITE_GRAPHRAG_CLAUDE_EMPTY_CONFIG_DIR` (opt-in). When set to an empty directory, the subprocess is spawned with `CLAUDE_CONFIG_DIR=<that dir>`, suppressing user-scoped MCP servers. This is the only mechanism Claude Code actually honours: `--strict-mcp-config` and `--mcp-config '{}'` are silently ignored per [anthropics/claude-code#10787].
- `enrich` emits a `tracing::warn!` when `--llm-parallelism > 4`, recommending the combination with `CLAUDE_CONFIG_DIR` override.
- `retry::CircuitBreaker` helper added with `AttemptOutcome::{Success, Transient, HardFailure}`.  Rate-limited and timeout errors are explicitly excluded from the failure count.
- 3 pre-existing test failures in `src/commands/{history,list,read}.rs` fixed (timezone-agnostic assertions).
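
The circuit-breaker behaviour described above, where rate-limited and timeout errors do not count as failures, can be illustrated with a small state machine. Only the names `CircuitBreaker` and `AttemptOutcome::{Success, Transient, HardFailure}` come from this document. The fields, the methods, the threshold, and the reset-on-success rule are all assumptions and do not reflect the actual `retry` module.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AttemptOutcome {
    Success,
    /// Rate limits and timeouts: retryable, never counted as failures.
    Transient,
    HardFailure,
}

/// Hypothetical breaker that opens after `threshold` consecutive hard failures.
pub struct CircuitBreaker {
    threshold: u32,
    consecutive_failures: u32,
}

impl CircuitBreaker {
    pub fn new(threshold: u32) -> Self {
        Self { threshold, consecutive_failures: 0 }
    }

    /// Record one attempt.
    /// ASSUMPTION: a success resets the failure count.
    pub fn record(&mut self, outcome: AttemptOutcome) {
        match outcome {
            AttemptOutcome::Success => self.consecutive_failures = 0,
            AttemptOutcome::Transient => {} // excluded from the failure count
            AttemptOutcome::HardFailure => self.consecutive_failures += 1,
        }
    }

    /// True once enough hard failures have accumulated to stop calling the LLM.
    pub fn is_open(&self) -> bool {
        self.consecutive_failures >= self.threshold
    }
}
```

Excluding transient outcomes keeps a burst of rate-limit responses from tripping the breaker while the backoff schedule is still doing its job.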

## Optional References
### Supplementary materials for deeper context
- [CLAUDE guidance](https://github.com/daniloaguiarbr/sqlite-graphrag/blob/main/docs/CLAUDE.md): Claude Code specific invocation patterns and session memory strategies
- [SKILL definitions](https://github.com/daniloaguiarbr/sqlite-graphrag/tree/main/skill/sqlite-graphrag-en): pre-built slash-command skills for Claude Code harness
- [crates.io package](https://crates.io/crates/sqlite-graphrag): published binary with semver and MSRV metadata
- [docs.rs API reference](https://docs.rs/sqlite-graphrag): rustdoc for library consumers


## Stable Facts
### Identity and version metadata
- Package name `sqlite-graphrag` published on crates.io under MIT OR Apache-2.0
- Current version 1.0.76 with MSRV Rust 1.88 declared in `Cargo.toml`
- Repository `https://github.com/daniloaguiarbr/sqlite-graphrag` with CI on push and tag
- Embedding model: delegated to `claude code` (Anthropic OAuth) or `codex` (ChatGPT Pro OAuth) headless subprocess per call; no in-process model
- Storage layer: `rusqlite` with bundled SQLite and FTS5 module. `sqlite-vec` extension REMOVED in v1.0.76; vector similarity is pure-Rust cosine over BLOB embeddings
- Up to four concurrent instances supported via counting semaphore with `fs4` advisory locks