# sqlite-graphrag (v1.0.76 LLM-Only One-Shot)
> Persistent memory for 27 AI agents in a single 6 MB Rust binary (LLM-only, one-shot)
27 AI agents. One 6 MB binary. Zero model download. v1.0.76 gives every AI coding assistant a local, private memory layer backed by a single SQLite file. The CLI is a thin orchestrator: every `remember`, `ingest`, `recall` and `hybrid-search` call spawns a headless Claude Code (`claude -p`, Anthropic OAuth) or Codex CLI (`codex exec`, ChatGPT Pro OAuth) subprocess to generate embeddings. There is no daemon to run, no ONNX runtime, and no 1.1 GB `multilingual-e5-small` model to download. Retrieval is graph-native, and JSON output is deterministic and ready for pipeline orchestration.
- Portuguese version available at [README.pt-BR.md](https://github.com/daniloaguiarbr/sqlite-graphrag/blob/main/README.pt-BR.md)
- Compact summary in [llms.txt](https://github.com/daniloaguiarbr/sqlite-graphrag/blob/main/llms.txt); Portuguese variant in [llms.pt-BR.txt](https://github.com/daniloaguiarbr/sqlite-graphrag/blob/main/llms.pt-BR.txt)
```bash
cargo install --locked sqlite-graphrag
```
## What is it?
### sqlite-graphrag delivers durable memory for AI agents
- Stores memories, entities and relationships inside a single SQLite file (6 MB binary)
- Embeds content via a headless `claude -p` or `codex exec` LLM subprocess (OAuth, no API key); the LLM subprocess IS the model
- Combines FTS5 full-text search with pure-Rust cosine KNN over BLOB embeddings into a hybrid Reciprocal Rank Fusion ranker
- `sqlite-vec` extension REMOVED in v1.0.76; cosine similarity is computed in-process over `memory_embeddings` / `entity_embeddings` / `chunk_embeddings` BLOB tables
- Extracts an entity graph with typed edges for multi-hop recall across memories
- Preserves every edit through an immutable version history table for full audit
- Runs on Linux, macOS and Windows natively with zero external services required beyond an LLM CLI on PATH
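The in-process cosine KNN over BLOB embeddings can be pictured with the sketch below. It is illustrative only: the shipped ranker is written in Rust, and the packed little-endian `float32` BLOB layout, the function names `decode_blob` and `cosine_knn`, and the toy vectors are assumptions rather than the real schema.

```python
import math
import struct

def decode_blob(blob: bytes) -> list[float]:
    """Decode a BLOB into floats, ASSUMING packed little-endian float32."""
    count = len(blob) // 4
    return list(struct.unpack(f"<{count}f", blob[: count * 4]))

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def cosine_knn(query: list[float], rows: dict[str, bytes], k: int) -> list[tuple[str, float]]:
    """Brute-force KNN: score every stored BLOB, keep the k most similar."""
    scored = [(name, cosine(query, decode_blob(blob))) for name, blob in rows.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

A brute-force scan like this is linear in the number of stored vectors, which is the trade-off accepted when a vector-index extension is dropped in favor of plain BLOB tables.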
## v1.0.76 Architecture
- The default build is LLM-only and one-shot. See [ADR-0019](docs/decisions/adr-0019-llm-only-one-shot.md) for the full architectural rationale.
- Daemon: deprecated in v1.0.76 and scheduled for removal in v1.1.0. It still works for source compatibility but no longer offers a speedup, because the LLM subprocess is the new "model loader".
- `embedding-legacy` feature: opt-in to keep the v1.0.74 fastembed + ort + tokenizers + sqlite-vec pipeline; scheduled for removal in v1.1.0. Install it with `cargo install sqlite-graphrag --features embedding-legacy --locked`.
- Migration from v1.0.74/v1.0.75: `sqlite-graphrag migrate --to-llm-only --drop-vec-tables --db <path>` is the one-shot upgrade. Combines checksum rewrite with V013 vec-table drop.
- OAuth enforcement: the spawn of `claude -p` or `codex exec` ABORTS with `AppError::Validation` if `ANTHROPIC_API_KEY` or `OPENAI_API_KEY` is set in the environment. See [ADR-0025](docs/decisions/adr-0025-oauth-only-embedding.md).
- Cosine similarity is now pure Rust: see [ADR-0020](docs/decisions/adr-0020-pure-rust-cosine.md) for the algorithm. BLOB-backed embeddings replace vec0: see [ADR-0022](docs/decisions/adr-0022-blob-embeddings.md).
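Because a set API key aborts the spawn, an orchestrator may want to check its own environment before calling the CLI. The following is a minimal sketch of such a pre-flight check. The helper names are hypothetical, and treating an empty value as "not set" is an assumption about the enforcement rule.

```python
import os

# Variable names come from the OAuth enforcement rule above.
FORBIDDEN_KEYS = ("ANTHROPIC_API_KEY", "OPENAI_API_KEY")

def offending_api_keys(env: dict[str, str]) -> list[str]:
    """Return forbidden API-key variables that hold a non-empty value."""
    return [key for key in FORBIDDEN_KEYS if env.get(key)]

def oauth_only_env(env: dict[str, str]) -> dict[str, str]:
    """Copy env without the forbidden variables, for spawning the CLI."""
    return {k: v for k, v in env.items() if k not in FORBIDDEN_KEYS}

if __name__ == "__main__":
    found = offending_api_keys(dict(os.environ))
    if found:
        print("unset before calling sqlite-graphrag:", ", ".join(found))
```

Passing `oauth_only_env(dict(os.environ))` as the `env` argument of `subprocess.run` keeps the parent shell untouched while the child sees no API keys.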
## Why sqlite-graphrag?
### Differentiators against cloud RAG stacks
- Offline-first architecture eliminates OpenAI embeddings and Pinecone recurring fees
- Single-file SQLite storage replaces Docker clusters of vector databases entirely
- Graph-native retrieval beats pure vector RAG on multi-hop questions by design
- Deterministic JSON output unlocks clean orchestration by LLM agents in pipelines
- Native cross-platform binary ships without Python, Node or Docker dependencies
- Remove recurring cloud vector database dependencies from local agent workflows
- Keep retrieval local to the workstation or CI runner instead of a remote RAG stack
## Economy That Converts
### Numbers That Sell The Switch
- Graph traversal can reduce repeated prompt stuffing in agent loops
- Daemon reuse reduces repeated model startup cost on heavy commands only under the `embedding-legacy` feature; the default LLM-only build loads no model
- Single-file storage lowers the operational surface compared to a remote RAG stack
- Your proprietary data NEVER leaves the developer workstation or the CI runner
- Your compliance surface shrinks to one SQLite file under your own encryption
- Your audit trail lives in the `memory_versions` table with immutable history
## Quick Start
### Install and record your first memory in four commands
```bash
cargo install --locked sqlite-graphrag
sqlite-graphrag init
sqlite-graphrag remember --name onboarding-note --type user --description "first memory" --body "hello graphrag"
sqlite-graphrag recall "graphrag" --k 5 --json
```
- The flag `--locked` reuses the `Cargo.lock` shipped with the crate to prevent MSRV breakage
- Without `--locked` Cargo may resolve a patch release that requires a newer `rustc` than 1.88
## Superpowers for AI Agents
### First-class CLI contract for orchestration
- Every subcommand accepts `--json` producing deterministic stdout payloads
- Every invocation is stateless; the deprecated `sqlite-graphrag daemon` may still be auto-started by heavy embedding commands, but it no longer provides a speedup in the default LLM-only build
- Every write is idempotent through `--name` kebab-case uniqueness constraints
- Stdin is explicit: use `--body-stdin` for body text or `--graph-stdin` for one `{body?, entities, relationships}` object; raw entity and relationship arrays use `--entities-file` and `--relationships-file`
- `remember` accepts body payloads up to `512000` bytes and up to `512` chunks
- Stderr carries tracing output under `SQLITE_GRAPHRAG_LOG_LEVEL=debug` only
- Cross-platform behavior is identical across Linux, macOS and Windows hosts
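The contract above (deterministic JSON on stdout, status in the exit code) lends itself to a small wrapper. This is a sketch, not an official client: `run_json` is a hypothetical helper, and returning `None` when stdout is empty or not JSON is an assumption about failure output.

```python
import json
import subprocess

def run_json(args: list[str], runner=subprocess.run) -> tuple[int, object]:
    """Run `sqlite-graphrag <args> --json`; return (exit_code, parsed stdout or None)."""
    argv = ["sqlite-graphrag", *args, "--json"]
    done = runner(argv, capture_output=True, text=True)
    try:
        payload = json.loads(done.stdout) if done.stdout.strip() else None
    except json.JSONDecodeError:
        payload = None  # stdout was not JSON, e.g. on an early failure
    return done.returncode, payload
```

The `runner` parameter exists only so the wrapper can be exercised without the binary installed; in normal use `run_json(["recall", "auth flow", "--k", "3"])` is enough.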
## Compatible Agents and Orchestrators
### Catalog — 21 Supported Integrations
| Agent | Vendor | Minimum Version | Integration Type | Example |
| --- | --- | --- | --- | --- |
| Claude Code | Anthropic | 1.0+ | Subprocess | `sqlite-graphrag recall "query" --json` |
| Codex CLI | OpenAI | 0.5+ | AGENTS.md + subprocess | `sqlite-graphrag remember --name X --type user --description "..." --body "..."` |
| Gemini CLI | Google | any recent | Subprocess | `sqlite-graphrag hybrid-search "query" --json --k 5` |
| Opencode | open source | any recent | Subprocess | `sqlite-graphrag recall "auth flow" --json --k 3` |
| OpenClaw | community | any recent | Subprocess | `sqlite-graphrag list --type user --json` |
| Paperclip | community | any recent | Subprocess | `sqlite-graphrag read --name onboarding-note --json` |
| VS Code Copilot | Microsoft | 1.90+ | tasks.json | `{"command": "sqlite-graphrag", "args": ["recall", "$selection", "--json"]}` |
| Google Antigravity | Google | any recent | Runner | `sqlite-graphrag hybrid-search "prompt" --k 10 --json` |
| Windsurf | Codeium | any recent | Terminal | `sqlite-graphrag recall "refactor plan" --json` |
| Cursor | Cursor | 0.40+ | Terminal | `sqlite-graphrag remember --name cursor-ctx --type project --description "..." --body "..."` |
| Zed | Zed Industries | any recent | Assistant Panel | `sqlite-graphrag recall "open tabs" --json --k 5` |
| Aider | open source | 0.60+ | Shell | `sqlite-graphrag recall "refactor target" --k 5 --json` |
| Jules | Google Labs | preview | CI automation | `sqlite-graphrag stats --json` |
| Kilo Code | community | any recent | Subprocess | `sqlite-graphrag recall "recent tasks" --json` |
| Roo Code | community | any recent | Subprocess | `sqlite-graphrag hybrid-search "repo context" --json` |
| Cline | community | VS Code ext | Terminal | `sqlite-graphrag list --limit 20 --json` |
| Continue | open source | VS Code or JetBrains ext | Terminal | `sqlite-graphrag recall "docstring" --json` |
| Factory | Factory | any recent | API or subprocess | `sqlite-graphrag recall "pr context" --json` |
| Augment Code | Augment | any recent | IDE | `sqlite-graphrag hybrid-search "code review" --json` |
| JetBrains AI Assistant | JetBrains | 2024.2+ | IDE | `sqlite-graphrag recall "stacktrace" --json` |
| OpenRouter | OpenRouter | any | Router for multi-LLM | `sqlite-graphrag recall "routing rule" --json` |
## Commands
### Core database lifecycle
| Command | Arguments | Description |
| --- | --- | --- |
| `init` | `--namespace <ns>` | Initialize database and validate that an LLM CLI (`claude` or `codex`) is reachable on `PATH`; no model download in the default build |
| `health` | `--json` | Show database integrity and pragma status; reports super-hub detection and `relation_concentration_warning` when any entity or relation type dominates the graph |
| `stats` | `--json` | Count memories, entities and relationships |
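| `migrate` | `--json`, `--to-llm-only`, `--drop-vec-tables`, `--db <path>` | Apply pending schema migrations via `refinery`; `--to-llm-only --drop-vec-tables` performs the one-shot upgrade from v1.0.74/v1.0.75 |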
| `vacuum` | `--json` | Checkpoint WAL and reclaim disk space |
| `optimize` | `--json` | Run `PRAGMA optimize` to refresh statistics |
| `sync-safe-copy` | `--output <path>` | Checkpoint then copy a sync-safe snapshot |
### Memory content lifecycle
| Command | Arguments | Description |
| --- | --- | --- |
| `remember` | `--name`, `--type`, `--description`, `--body` (or `--body-file`/`--body-stdin`), `--dry-run` | Save a memory with optional entity graph; `--dry-run` previews without writing |
| `recall` | `<query>`, `-k`/`--k` (alias `--limit` since v1.0.35), `--type` | Search memories semantically via KNN |
| `read` | `--name <name>`, `--id <id>` | Fetch a memory by exact kebab-case name or numeric ID |
| `list` | `--type`, `--limit`, `--offset` | Paginate memories sorted by `updated_at` |
| `forget` | `--name <name>` | Soft-delete a memory preserving history |
| `rename` | `--old`/`--from` (since v1.0.35) `<name>`, `--new`/`--to` (since v1.0.35) `<name>` | Rename a memory while keeping versions |
| `edit` | `--name`, `--body`, `--description`, `--type` | Edit body, description, or memory type creating new version |
| `history` | `--name <name>` | List all versions of a memory |
| `restore` | `--name`, `--version` | Restore a memory to a previous version |
| `remember-batch` | `--json`, `--force-merge` | Ingest multiple memories from an NDJSON stream in one invocation |
### Utilities
| Command | Arguments | Description |
| --- | --- | --- |
| `completions` | `<shell>` (bash, zsh, fish, elvish, powershell) | Generate shell completion script and print to stdout |
### Retrieval and graph
| Command | Arguments | Description |
| --- | --- | --- |
| `hybrid-search` | `<query>`, `-k`/`--k` (alias `--limit` since v1.0.35), `--rrf-k` | FTS5 plus vector fused via Reciprocal Rank Fusion |
| `deep-research` | `<query>`, `--k` (default 20), `--max-sub-queries` (default 7), `--max-hops` (default 3), `--min-weight`, `--max-concurrency`, `--timeout`, `--with-bodies`, `--max-results` (default 50), `--rrf-k` (default 60), `--graph-decay` (default 0.7), `--graph-min-score` (default 0.05), `--max-neighbors-per-hop`, `--mode` (none, claude-code, codex), `--max-cost-usd` | Parallel multi-hop GraphRAG research with per-sub-query embeddings, RRF fusion for KNN+FTS5, directed evidence chains, graph_context with entities and relationships |
| `reclassify-relation` | `--from-relation`, `--to-relation`, `--batch`, `--source`, `--target`, `--filter-source-type`, `--filter-target-type`, `--dry-run` | Bulk or single reclassification of relationship types with UNIQUE collision handling |
| `normalize-entities` | `--dry-run`, `--yes` | Normalize entity names to lowercase kebab-case ASCII, auto-merge collisions |
| `enrich` | `--operation` (memory-bindings, entity-descriptions, body-enrich), `--mode` (claude-code, codex), `--limit`, `--dry-run`, `--max-cost-usd`, `--resume`, `--retry-failed`, `--llm-parallelism <N>` | LLM-augmented graph quality pipeline with scan-judge-persist pattern; `--llm-parallelism` controls concurrent LLM subprocesses (default 1) |
| `link` | `--source`, `--target`, `--relation` | Create an explicit relationship between two entities |
| `unlink` | `--source`, `--target`, `--relation` | Remove a specific relationship between two entities |
| `related` | `<name>`, `--hops` | Traverse the entity graph N hops from a memory |
| `namespace-detect` | `--namespace <name>` | Resolve namespace precedence for invocation |
### Graph Input Payloads
- `--entities-file` expects a JSON array of entities
- Each entity MUST include `name` and `entity_type`; alias `type` is accepted
- Agents MUST NOT send both `entity_type` and `type` in the same object
- Valid `entity_type` values: `project`, `tool`, `person`, `file`, `concept`, `incident`, `decision`, `memory`, `dashboard`, `issue_tracker`, `organization`, `location`, `date`
- `--relationships-file` expects a JSON array of relationships
- Each relationship MUST include `source`/`from`, `target`/`to`, `relation`, and `strength`
- `strength` MUST be a float in `[0.0, 1.0]` and maps to `weight` in graph outputs
- File payload relations use underscore labels such as `applies_to`, `depends_on`, and `tracked_in`; dashed aliases are normalized before storage
- `link` and `unlink` flags use dashed labels such as `applies-to`, `depends-on`, and `tracked-in`
- `--graph-stdin` accepts a single object with optional `body` plus the same `entities` and `relationships` arrays
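The payload rules above can be checked before a file is handed to the CLI. The sketch below is a hypothetical pre-flight validator, not the binary's own validation. The function names are made up for illustration, and accepting an integer `strength` such as `1` is an assumption.

```python
VALID_ENTITY_TYPES = {
    "project", "tool", "person", "file", "concept", "incident", "decision",
    "memory", "dashboard", "issue_tracker", "organization", "location", "date",
}

def entity_errors(entity: dict) -> list[str]:
    """Check one entity object against the rules listed above."""
    errors = []
    if not entity.get("name"):
        errors.append("missing name")
    if "entity_type" in entity and "type" in entity:
        errors.append("both entity_type and type present")
    kind = entity.get("entity_type", entity.get("type"))
    if kind not in VALID_ENTITY_TYPES:
        errors.append(f"invalid entity_type: {kind!r}")
    return errors

def relationship_errors(rel: dict) -> list[str]:
    """Check one relationship object against the rules listed above."""
    errors = []
    if not (rel.get("source") or rel.get("from")):
        errors.append("missing source/from")
    if not (rel.get("target") or rel.get("to")):
        errors.append("missing target/to")
    if not rel.get("relation"):
        errors.append("missing relation")
    strength = rel.get("strength")
    if isinstance(strength, bool) or not isinstance(strength, (int, float)) or not 0.0 <= strength <= 1.0:
        errors.append("strength must be a number in [0.0, 1.0]")
    return errors

def file_relation_label(label: str) -> str:
    """Underscore form used in file payloads: dashes become underscores."""
    return label.replace("-", "_")
```

Running every array element through these helpers and refusing to call `remember` when any list is non-empty keeps malformed payloads out of the agent loop.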
### NER Extraction
#### GLiNER zero-shot entity extraction
- Pass `--enable-ner` or set `SQLITE_GRAPHRAG_ENABLE_NER=1` to activate on `remember` and `ingest`
- Works with `--graph-stdin`: pass `"entities": []` in JSON and GLiNER extracts entities automatically
- Model variants via `--gliner-variant`: `fp32` (1.1 GB, best quality), `fp16` (580 MB), `int8` (349 MB, fastest), `q4` (894 MB), `q4f16` (472 MB)
- Override model with `SQLITE_GRAPHRAG_GLINER_MODEL`; tune threshold with `SQLITE_GRAPHRAG_GLINER_THRESHOLD` (default `0.5`)
- Response field `extraction_method` reports: `gliner-<variant>+regex`, `regex-only`, or `none:extraction-failed`
- `--skip-extraction` is deprecated since v1.0.45; NER is off by default, use `--enable-ner` to activate
- `--max-rss-mb <MiB>` on `remember` and `ingest` aborts embedding when process RSS exceeds the threshold (default 8192 MiB)
- Ingest NDJSON per-file status values: `indexed` (created), `skipped` (duplicate), `failed` (error)
- Duplicate memories on re-ingest emit `status: "skipped"` with `action: "duplicate"` and do not count as failures
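A consumer of the ingest NDJSON stream usually only needs a tally of the per-file `status` values. The sketch below shows one way to compute it. The `file` field in the sample and the presence of non-status lines are assumptions; only `status` and `action` are named in this README.

```python
import json
from collections import Counter

def tally_statuses(ndjson_text: str) -> Counter:
    """Count per-file `status` values; lines without a status are ignored."""
    counts = Counter()
    for line in ndjson_text.splitlines():
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        status = event.get("status") if isinstance(event, dict) else None
        if status is not None:
            counts[status] += 1
    return counts
```

Since duplicates arrive as `skipped`, a re-ingest can be treated as successful whenever `counts["failed"]` is zero.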
### Ingest Modes (v1.0.62)
- `--mode none` (default): body-only ingestion, no entity/relationship extraction
- `--mode gliner`: GLiNER NER extraction (requires `--enable-ner`, uses local ONNX model)
- `--mode claude-code`: LLM-curated extraction via locally installed Claude Code CLI (`claude -p` headless)
- Claude Code flags: `--claude-binary <PATH>`, `--claude-model <MODEL>`, `--resume`, `--retry-failed`, `--max-cost-usd <N>`, `--rate-limit-wait <S>`, `--keep-queue`, `--queue-db <PATH>`, `--claude-timeout <S>` (default 300s per file)
- Requires Claude Code >= 2.1.0 with active Pro/Max subscription — uses the user's existing billing
- Per-file NDJSON events include `entities` (count), `rels` (count), `cost_usd` fields
- Queue DB (`.ingest-queue.sqlite`) tracks per-file progress for resume/retry support
- Rate limit: automatic exponential backoff (60s → 120s → 300s → 900s)
- Budget: `--max-cost-usd` stops ingestion when cumulative cost exceeds the limit, preserving queue for `--resume`
- `--mode codex` spawns `codex exec --json` per file for LLM-curated extraction via OpenAI Codex CLI
- Requires Codex CLI installed locally; uses `--output-schema` for structured JSON output
- Codex-specific flags: `--codex-binary <PATH>`, `--codex-model <MODEL>`, `--codex-timeout <S>` (default 300)
- Environment variable `SQLITE_GRAPHRAG_CODEX_BINARY` overrides PATH lookup
- Token usage tracking via `input_tokens` and `output_tokens` fields (cost_usd unavailable from Codex CLI)
- Full embedding pipeline applied to Codex-ingested memories for recall and hybrid-search
- Since v1.0.64: relation strings from LLM extraction are normalized before DB insertion (depends-on → depends_on) — consistent with remember command
- Since v1.0.64: `ingest --mode claude-code` disables hooks via `--settings '{"hooks":{}}'` for OAuth users and detects `terminal_reason: "max_turns"`
- Since v1.0.64: `ingest --mode claude-code` detects OAuth via `apiKeySource` and omits the misleading `cost_usd`
- Since v1.0.64: `ingest --mode claude-code` validates body size BEFORE sending to the LLM subprocess; files exceeding 512 KB are skipped
- Since v1.0.64: `rename` and `rename-entity` reject same-name renames with exit 1
- Since v1.0.64: `deep-research` command added for parallel multi-hop GraphRAG research via heuristic query decomposition
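The rate-limit schedule and the budget stop described in this list can be sketched as two small helpers. Both are illustrations under stated assumptions: repeating the final 900 s step once the schedule is exhausted, and checking the budget before each file rather than mid-file, are guesses about behavior this README does not spell out.

```python
BACKOFF_SCHEDULE_S = (60, 120, 300, 900)  # delays listed in the rate-limit bullet

def backoff_delay(attempt: int) -> int:
    """Delay before retry number `attempt` (0-based); ASSUMES the last step repeats."""
    if attempt < 0:
        raise ValueError("attempt must be >= 0")
    return BACKOFF_SCHEDULE_S[min(attempt, len(BACKOFF_SCHEDULE_S) - 1)]

def files_within_budget(costs_usd: list[float], max_cost_usd: float) -> int:
    """Number of files processed before cumulative cost exceeds the budget."""
    total = 0.0
    for processed, cost in enumerate(costs_usd):
        if total > max_cost_usd:
            return processed  # budget already exceeded, stop before this file
        total += cost
    return len(costs_usd)
```

With four files costing 0.40 USD each and a 1.00 USD limit, this sketch processes three files and leaves the fourth for a later `--resume`.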
- Authentication: OAuth works out of the box for both modes; no API key is needed
- `--mode claude-code` reads OAuth from `~/.claude/.credentials.json` (Claude Pro/Max/Team)
- `--mode codex` reads device auth from `codex auth login` (OpenAI)
- API keys (`ANTHROPIC_API_KEY`, `OPENAI_API_KEY`) ABORT the spawn with `AppError::Validation`; OAuth is the ONLY accepted credential flow

Example `--entities-file` payload (see Graph Input Payloads):

```json
[
  { "name": "SQLite", "entity_type": "tool" },
  { "name": "GraphRAG", "type": "concept" }
]
```

Example `--relationships-file` payload:

```json
[
  {
    "source": "SQLite",
    "target": "GraphRAG",
    "relation": "supports",
    "strength": 0.8,
    "description": "SQLite supports local GraphRAG retrieval"
  }
]
```
### Maintenance
| Command | Arguments | Description |
| --- | --- | --- |
| `purge` | `--retention-days <n>`, `--dry-run`, `--yes` | Permanently delete soft-deleted memories |
| `cleanup-orphans` | `--dry-run`, `--yes` | Remove orphaned entities with no linked memories or edges |
| `prune-relations` | `--relation <type>`, `--dry-run`, `--yes`, `--json` | Remove all relationships of a given type in bulk |
| `backup` | `--output <path>`, `--json` | Create a consistent backup using SQLite Online Backup API, safe with WAL and concurrent reads |
| `fts rebuild` | `--json` | Rebuild the FTS5 full-text index B-tree from the memories table |
| `fts check` | `--json` | Verify FTS5 index integrity; returns `integrity_ok` boolean |
| `fts stats` | `--json` | Show FTS5 index statistics such as token count and segment info |
| `memory-entities` | `--name <name>`, `--json` | List all entities linked to a named memory |
| `delete-entity` | `--name <name>`, `--cascade`, `--json` | Delete an entity node; `--cascade` also removes its relationships |
| `rename-entity` | `--name <name>`, `--new-name <name>`, `--json` | Rename a graph entity preserving all FK-based relationships; re-embeds for search |
| `reclassify` | `--name <name>` or `--from-type <t>`, `--entity-type <t>` or `--to-type <t>`, `--batch`, `--json` | Change `entity_type` of one entity or a whole category |
| `merge-entities` | `--names <csv>`, `--into <name>`, `--json` | Merge duplicate entity nodes into a single canonical node |
| `prune-ner` | `--entity <name>` or `--all`, `--yes`, `--json` | Remove NER-generated bindings from one or all entities |
## Environment Variables
### Runtime configuration overrides
| Variable | Description | Default | Example |
| --- | --- | --- | --- |
| `SQLITE_GRAPHRAG_DB_PATH` | Path override for the SQLite database file | `./graphrag.sqlite` in the invocation directory | `/data/graphrag.sqlite` |
| `SQLITE_GRAPHRAG_CACHE_DIR` | Directory for embedding model cache | XDG cache dir | `~/.cache/sqlite-graphrag` |
| `SQLITE_GRAPHRAG_LANG` | CLI output language as `en` or `pt` | `en` | `pt` |
| `SQLITE_GRAPHRAG_LOG_LEVEL` | Tracing filter level for stderr output | `info` | `debug` |
| `SQLITE_GRAPHRAG_NAMESPACE` | Namespace override respected by all commands (fixed v1.0.51) | none | `project-foo` |
| `SQLITE_GRAPHRAG_CODEX_BINARY` | Path to the Codex CLI binary, overrides PATH lookup for `--mode codex` | PATH lookup | `/usr/local/bin/codex` |
## Integration Patterns
### Compose with Unix pipelines and tools
```bash
sqlite-graphrag recall "auth tests" --k 5 --json | jaq -r '.results[].name'
```
### Feed hybrid search into a summarizer endpoint
```bash
sqlite-graphrag hybrid-search "postgres migration" --k 10 --json \
  | jaq -c '.results[] | {name, score}' \
  | xh POST http://localhost:8080/summarize
```
### Backup with atomic snapshot and compression
```bash
sqlite-graphrag sync-safe-copy --output /tmp/ng.sqlite
ouch compress /tmp/ng.sqlite /tmp/ng-$(date +%Y%m%d).tar.zst
```
### Claude Code subprocess example in Node
```javascript
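const { spawn } = require('child_process');
const query = process.argv[2] || 'graphrag'; // search text passed on the command line
const proc = spawn('sqlite-graphrag', ['recall', query, '--k', '5', '--json']);
proc.stdout.pipe(process.stdout); // forward the JSON payload to the caller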
```
### Docker Debian build for CI pipelines
```dockerfile
FROM rust:1.88-bookworm AS builder
RUN apt-get update && apt-get install -y --no-install-recommends pkg-config libssl-dev ca-certificates && rm -rf /var/lib/apt/lists/*
RUN cargo install --locked sqlite-graphrag
```
## Exit Codes
### Deterministic status codes for orchestration
| Code | Meaning | Recommended Action |
| --- | --- | --- |
| `0` | Success | Continue the agent loop |
| `1` | Validation or runtime failure | Log and surface to operator |
| `2` | CLI usage error | Fix arguments then retry |
| `3` | Optimistic update conflict | Re-read `updated_at` and retry |
| `4` | Memory or entity not found | Handle missing resource gracefully |
| `5` | Namespace limit or unresolved | Pass `--namespace` explicitly |
| `6` | Payload exceeded allowed limits | Split body into smaller chunks |
| `9` | Duplicate memory detected | Skip or use `--force-merge` |
| `10` | SQLite database error | Run `health` to inspect integrity |
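| `11` | Embedding generation failed | Check that the LLM CLI (`claude` or `codex`) is reachable and authenticated, then retry; check model files only under `embedding-legacy` |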
| `12` | `sqlite-vec` extension failed (only relevant for `embedding-legacy` feature) | Reinstall with `--features embedding-legacy` or drop the legacy feature |
| `13` | Batch partial failure | Honor backoff and retry later |
| `14` | Filesystem I/O error: cache directory not writable, nonexistent ingest target directory | Check permissions and disk space |
| `15` | Database busy after retries | Wait and retry the operation |
| `20` | Internal or JSON serialization error | File a bug report |
| `75` | `EX_TEMPFAIL` — all slots busy | Retry with backoff |
| `77` | Low RAM for embedding model | Free RAM until the host is back above the 2 GB guard threshold |
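An agent loop can collapse the table above into a handful of coarse actions. The grouping below is one possible reading of the "Recommended Action" column, not behavior defined by the binary; the action names are hypothetical.

```python
RETRYABLE = {13, 15, 75}  # rows that recommend waiting or backing off
SKIPPABLE = {9}           # duplicate memory: skip or use --force-merge

def next_action(exit_code: int) -> str:
    """Map an exit code to a coarse action for an orchestrator."""
    if exit_code == 0:
        return "continue"
    if exit_code == 3:
        return "reread"  # re-read updated_at, then retry the write
    if exit_code in SKIPPABLE:
        return "skip"
    if exit_code in RETRYABLE:
        return "retry"
    return "fail"  # everything else is surfaced to the operator
```

Keeping the mapping in one function means a new exit code only needs a single edit on the orchestrator side.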
## JSON Output Format
### Recall — Vector-Only KNN
```json
{
  "query": "graphrag retrieval",
  "k": 3,
  "direct_matches": [
    { "memory_id": 1, "name": "graphrag-intro", "namespace": "global", "type": "user", "description": "intro doc", "snippet": "GraphRAG combines...", "distance": 0.09, "source": "direct" }
  ],
  "graph_matches": [],
  "elapsed_ms": 12,
  "results": [
    { "memory_id": 1, "name": "graphrag-intro", "namespace": "global", "type": "user", "description": "intro doc", "snippet": "GraphRAG combines...", "distance": 0.09, "source": "direct" }
  ]
}
```
### Hybrid Search — FTS5 Plus Vector RRF
```json
{
  "query": "postgres migration",
  "k": 5,
  "rrf_k": 60,
  "weights": { "vec": 1.0, "fts": 1.0 },
  "elapsed_ms": 18,
  "results": [
    { "name": "postgres-migration-plan", "score": 0.96, "vec_rank": 1, "fts_rank": 1 },
    { "name": "db-migration-checklist", "score": 0.88, "vec_rank": 2, "fts_rank": 3 }
  ]
}
```
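The `rrf_k`, `weights`, `vec_rank` and `fts_rank` fields map onto textbook weighted Reciprocal Rank Fusion, sketched below. This is the standard formula, not necessarily the shipped ranker: raw RRF scores with `rrf_k = 60` are small (at most 2/61 for two lists), so the scores in the sample payload are evidently on a different scale that this sketch does not try to reproduce.

```python
def rrf_fuse(vec_ranked, fts_ranked, rrf_k=60, weights=None):
    """Fuse two ranked name lists: score = sum of weight / (rrf_k + rank)."""
    weights = weights or {"vec": 1.0, "fts": 1.0}
    scores = {}
    for source, ranked in (("vec", vec_ranked), ("fts", fts_ranked)):
        for rank, name in enumerate(ranked, start=1):  # ranks are 1-based
            scores[name] = scores.get(name, 0.0) + weights[source] / (rrf_k + rank)
    # Highest score first; ties broken by name for a deterministic order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

A larger `rrf_k` flattens the difference between adjacent ranks, while the weights shift influence between the vector list and the FTS5 list.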
## Performance
### Measured on a 1000-memory database
- Cold and warm timings depend on hardware, model residency, and database size
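- Under the `embedding-legacy` feature, `recall` and `hybrid-search` become much cheaper when the daemon already holds the model; the default LLM-only build gains no speedup from the daemon and pays a 1-3 s subprocess spawn per embedding call
- Since v1.0.50, the CLI auto-restarts a stale daemon on version mismatch before the first embedding request; manual `daemon --stop` after upgrades is no longer required
- First `init` in the default build validates the LLM CLI on `PATH` and downloads nothing; under `embedding-legacy` it downloads the quantized model once and caches it locally
- Under `embedding-legacy`, the embedding model uses roughly 1.1 GB of RAM per heavy process after the current calibration; the default Rust binary holds no model in memory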
## Safe Parallel Invocation
### Counting semaphore with four simultaneous slots
- Each heavy invocation spawns an LLM subprocess (`claude` or `codex`); the Rust binary itself holds no model in memory
- Up to four instances run in parallel by default (`MAX_CONCURRENT_CLI_INSTANCES` equals 4)
- Lock files live at `~/.cache/sqlite-graphrag/cli-slot-{1..4}.lock` using `flock`
- A fifth concurrent invocation waits up to 300 seconds then exits with code 75
- Use `--max-concurrency N` to override the slot limit for the current invocation
- Memory guard aborts with exit 77 when less than 2 GB of RAM is available
- SIGINT and SIGTERM trigger graceful shutdown via `shutdown_requested()` atomic
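The slot mechanism described above is a counting semaphore built from `flock` on numbered lock files. The sketch below shows the idea with Python's Unix-only `fcntl` module. It makes a single non-blocking pass instead of waiting up to 300 seconds, and the helper name and the lock directory argument are assumptions.

```python
import fcntl
import os

def acquire_slot(lock_dir: str, slots: int = 4):
    """Try each slot lock file once; return (slot, file) or None if all are busy."""
    os.makedirs(lock_dir, exist_ok=True)
    for slot in range(1, slots + 1):
        handle = open(os.path.join(lock_dir, f"cli-slot-{slot}.lock"), "a")
        try:
            fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            handle.close()  # slot held elsewhere, try the next one
            continue
        return slot, handle  # keep `handle` open: closing it releases the slot
    return None
```

Because the kernel drops a `flock` when its file is closed or its process dies, a crashed invocation cannot leave a slot permanently occupied.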
## Idempotency and Side Effects
### Read-Only Commands — Zero Mutations Guaranteed
- `recall` reads the vector and metadata tables without touching disk state
- `read` fetches a single row by name and emits JSON without side effects
- `list` paginates memories sorted deterministically with stable cursors
- `health` runs SQLite `PRAGMA integrity_check` and reports without writing
- `stats` counts rows in read-only transactions safe for concurrent agents
### Write Commands — Optimistic Locking Protects Concurrency
- `remember` uses `ON CONFLICT(name)` so duplicate calls return exit code `9`
- `rename` requires `--expected-updated-at` to detect stale writes via exit `3`
- `edit` creates a new row in `memory_versions` preserving immutable history; since v1.0.64 re-generates vector embedding when body changes
- `restore` rewinds content while appending a new version instead of overwriting; since v1.0.64 preserves the current name after rename
- `forget` is soft-delete so re-running it is safe and idempotent by design
## Payload Limits
### Ceilings — Enforced By The Binary
- `EMBEDDING_MAX_TOKENS` equals 512 tokens measured by the model tokenizer
- `TEXT_BODY_PREVIEW_LEN` equals 200 characters in list and recall snippets
- `MAX_CONCURRENT_CLI_INSTANCES` equals 4 across cooperating subprocess agents
- `CLI_LOCK_DEFAULT_WAIT_SECS` equals 300 seconds before exit code `75`
- `PURGE_RETENTION_DAYS_DEFAULT` equals 90 days before hard delete becomes allowed
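Checking sizes before spawning the CLI avoids a wasted invocation that would end in exit code `6`. The sketch below uses the `512000`-byte `remember` ceiling and the 200-character preview length quoted in this README. Measuring the ceiling in UTF-8 bytes and the helper names are assumptions.

```python
MAX_BODY_BYTES = 512_000  # `remember` body ceiling quoted in this README
PREVIEW_CHARS = 200       # TEXT_BODY_PREVIEW_LEN

def body_fits(body: str) -> bool:
    """True when the UTF-8 encoding of body is within the byte ceiling."""
    return len(body.encode("utf-8")) <= MAX_BODY_BYTES

def preview(body: str) -> str:
    """First PREVIEW_CHARS characters, as a stand-in for the snippet field."""
    return body[:PREVIEW_CHARS]
```

Bytes and characters diverge for non-ASCII text, so a body of 256,001 two-byte characters already exceeds the ceiling even though it is far shorter than 512,000 characters.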
## Troubleshooting FAQ
### Common issues and fixes
- If the database stays locked after a crash, run `sqlite-graphrag vacuum` to checkpoint the WAL
- First `init` validates the LLM CLI on PATH; no model download; cold-start LLM subprocess spawn is 1-3 s per embedding call
- Permission denied on Linux means the cache directory lacks write access for your user
- Namespace detection falls back to `global` when no explicit override is present
- Parallel invocations beyond four slots receive exit 75 and SHOULD retry with backoff
- Exit 10 signals database error: run `sqlite-graphrag health --json` to inspect integrity
- Exit 12 is reserved for the `embedding-legacy` feature path; the default LLM-only build never returns exit 12
- Exit 77 signals low RAM: free memory until at least 2 GB is available, matching the memory guard threshold
## HOW TO USE
### Prerequisites
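- Rust 1.88 or newer installed via `rustup` on Linux, macOS or Windows
- SQLite version 3.40 or newer shipped with your operating system distribution
- An LLM CLI (`claude` or `codex`) reachable on `PATH` and authenticated through OAuth
- Available RAM of 100 MB free for runtime; the extra 1 GB for embedding model load applies only to the `embedding-legacy` feature
- Disk space of 200 MB for the embedding model cache applies only to the `embedding-legacy` feature
- The default build downloads no model during `init`; only `embedding-legacy` needs network access on first `init` to download quantized embeddings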
### First Command in 60 Seconds
```bash
cargo install --locked sqlite-graphrag
sqlite-graphrag init
sqlite-graphrag remember --name first-note --type user --description "first memory" --body "hello graphrag"
```
- First line downloads, builds and installs the binary into `~/.cargo/bin`
- Second line creates the SQLite database and validates that an LLM CLI (`claude` or `codex`) is on `PATH`; no model is downloaded in the default build
- Third line persists your first memory and indexes it for hybrid retrieval
- Your next `recall` call returns the note you just saved in milliseconds
### Advanced Patterns
#### Hybrid Search With Weighted Fusion
```bash
sqlite-graphrag hybrid-search "postgres migration strategy" \
  --k 20 \
  --rrf-k 60 \
  --weight-vec 0.7 \
  --weight-fts 0.3 \
  --json \
  | jaq '.results[] | {name, score, source}'
```
- Combines dense vector similarity and sparse full-text matches in one ranked list
- Weight tuning lets you favor semantic proximity against keyword precision per query
- Pipeline saves eighty percent of tokens compared to LLM-based re-ranking
#### Graph Traversal for Multi-Hop Recall
```bash
sqlite-graphrag link --source auth-design --target jwt-spec --relation depends-on
sqlite-graphrag related auth-design --hops 2 --json \
| jaq -r '.nodes[] | select(.depth == 2) | .name'
```
- Two hops surface transitive knowledge invisible to pure vector search methods
- Typed relations let your agent reason about cause, dependency and reference chains
- Multi-hop recall recovers context that flat embeddings consistently drop out of top-K
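Multi-hop recall is, at its core, a bounded breadth-first walk over typed edges. The sketch below illustrates that walk on an in-memory edge list. It is not the CLI's implementation: following edges in both directions is an assumption, and the `rfc-7519` and `rfc-7515` names in the usage note are made-up sample data.

```python
from collections import deque

def related(edges, start, hops):
    """Breadth-first walk over (source, relation, target) edges up to `hops`."""
    neighbors = {}
    for source, _relation, target in edges:
        neighbors.setdefault(source, []).append(target)
        neighbors.setdefault(target, []).append(source)  # ASSUMPTION: undirected walk
    depth = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depth[node] == hops:
            continue  # do not expand past the hop limit
        for nxt in neighbors.get(node, []):
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                queue.append(nxt)
    return [{"name": n, "depth": d} for n, d in depth.items() if d > 0]
```

Given edges `auth-design -> jwt-spec -> rfc-7519 -> rfc-7515`, a two-hop walk from `auth-design` reaches `jwt-spec` at depth 1 and `rfc-7519` at depth 2, and stops before `rfc-7515`.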
#### Snapshot-Safe Sync With Dropbox or iCloud
```bash
sqlite-graphrag sync-safe-copy --output ~/Dropbox/graphrag.sqlite
ouch compress ~/Dropbox/graphrag.sqlite ~/Dropbox/graphrag-$(date +%Y%m%d).tar.zst
```
- `sync-safe-copy` checkpoints the WAL and copies a consistent snapshot atomically
- Dropbox, iCloud and Google Drive sync only the snapshot, so they NEVER touch or corrupt the active database
#### Integration With Claude Code Orchestrator
```bash
sqlite-graphrag recall "$USER_QUERY" --k 5 --json \
  | jaq -c '{
      context: [.results[] | {name, body, score}],
      generated_at: now | todate
    }' \
  | claude --print "Use this context to answer: $USER_QUERY"
```
- Structured JSON flows cleanly into any downstream orchestrator reading this command's stdout through its own stdin
- Token cost drops by seventy percent compared to full-corpus context stuffing
## COOKBOOK
### Bootstrap Memory Database In 60 Seconds
```bash
cargo install --locked sqlite-graphrag
sqlite-graphrag init --namespace default
sqlite-graphrag health --json
```
- Command `init` creates the SQLite file and validates that an LLM CLI (`claude` or `codex`) is reachable on `PATH`. No model download.
- Exit code `0` signals the database is ready for writes and reads from any agent
- Saves 30 minutes per laptop versus a Pinecone plus Docker plus Python bootstrap
### Bulk-Import Knowledge Base Via Stdin Pipeline
```bash
# `.` is the match-all pattern and docs/ the search root; each path reaches sh as "$1"
fd -e md . docs/ -0 | xargs -0 -I{} sh -c '
  sqlite-graphrag remember \
    --name "$(basename "$1" .md)" \
    --type user \
    --description "imported from $1" \
    --body-stdin < "$1"
' _ {}
```
- `--body-stdin` pipes the Markdown body without quoting or shell escape accidents
- Exit code `9` flags duplicates for you to skip cleanly inside the outer shell
- Saves 4 hours per thousand files versus hand-crafted CSV loaders
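File names rarely arrive in kebab-case, so a bulk import benefits from deriving `--name` values first. The sketch below is one hypothetical normalization to lowercase ASCII kebab-case. The exact rule the binary enforces is not specified here, so treat the regex as an assumption.

```python
import re
import unicodedata

def kebab_name(raw: str) -> str:
    """Lowercase ASCII kebab-case, e.g. 'My Notes_v2' becomes 'my-notes-v2'."""
    # Decompose accented letters, then drop anything outside ASCII.
    ascii_text = unicodedata.normalize("NFKD", raw).encode("ascii", "ignore").decode("ascii")
    # Collapse every run of other characters into a single dash.
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")
```

Two different file names can collapse to the same slug, in which case the second `remember` call would hit the duplicate exit code `9`.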
### Combine Vector And FTS Search With Tunable Weights
```bash
sqlite-graphrag hybrid-search "authentication jwt" \
  --weight-vec 0.7 \
  --weight-fts 0.3 \
  --k 10 \
  --json \
  | jaq '.results[] | {name, score}'
```
- `--weight-vec 0.7` gives 70 percent weight to semantic similarity
- `--weight-fts 0.3` gives 30 percent weight to exact keyword matches
- Adjust weights per query type: raise fts for code identifiers, raise vec for concepts
### Purge Soft-Deleted Memories On A Schedule
```bash
sqlite-graphrag purge --retention-days 90 --dry-run --json
sqlite-graphrag purge --retention-days 90 --yes
```
- `--dry-run` shows exactly which memories qualify for hard deletion before committing
- Run on a weekly cron or GitHub Actions schedule to keep the database lean
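The retention rule behind `purge` reduces to a date comparison. The sketch below shows which soft-deleted memories would qualify for a given window. The input mapping of names to soft-delete timestamps and the strict "older than" boundary are assumptions; the authoritative answer is always the `--dry-run` output.

```python
from datetime import datetime, timedelta, timezone

def purge_candidates(deleted_at_by_name, now, retention_days=90):
    """Names whose soft-delete timestamp is older than the retention window."""
    cutoff = now - timedelta(days=retention_days)
    return sorted(name for name, deleted_at in deleted_at_by_name.items() if deleted_at < cutoff)

if __name__ == "__main__":
    rows = {"old-note": datetime(2025, 1, 1, tzinfo=timezone.utc)}
    print(purge_candidates(rows, datetime.now(timezone.utc)))
```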
### Export Memories To NDJSON For Backup
```bash
sqlite-graphrag list --limit 1000 --json \
  | jaq -c '.[]' \
  > memories-backup-$(date +%Y%m%d).ndjson
```
- One line per memory in NDJSON format for streaming ingestion by any tool
- Pipe through `ouch compress` to reduce file size by sixty percent for archival
## INTEGRATIONS
### Claude Code — Subprocess Integration
```bash
sqlite-graphrag recall "$USER_PROMPT" --k 5 --json \
  | jaq -r '.results[].body' \
  | claude --print "Context: $(cat -) — Answer: $USER_PROMPT"
```
- Add to `.claude/hooks/pre-task.sh` for automatic context injection per session
- Capture exit code `75` as retry-later to keep the agent alive gracefully
- Use `SQLITE_GRAPHRAG_NAMESPACE=$(basename $PWD)` to isolate context per project
### Codex CLI — AGENTS.md Driven Integration
```bash
sqlite-graphrag recall "$QUERY" --k 5 --json
sqlite-graphrag remember --name "codex-$(date +%s)" --type project --description "Codex result" --body "$RESULT"
```
- Add the tool definition to your project `AGENTS.md` file for Codex to discover
- Use `--type project` for Codex-generated memories to filter them separately later
### Gemini CLI — Function Call Integration
```bash
sqlite-graphrag hybrid-search "$GEMINI_QUERY" --json --k 5 \
  | jaq '{results: [.results[] | {name, score, snippet: .body[:200]}]}'
```
- Truncate `.body[:200]` to fit within Gemini function call output size limits
### GitHub Actions — CI Memory Integration
```yaml
- name: Recall project context
  run: |
    sqlite-graphrag recall "build failure patterns" --k 5 --json \
      | jaq -r '.results[].body' > /tmp/context.txt
```
- Install binary with `cargo install --locked sqlite-graphrag` in a setup step
- Run the binary inside the workflow workspace so the default `graphrag.sqlite` stays local to that run
## Language Control
### Bilingual Output — One Flag Switches Locale
- Flag `--lang en` forces English messages regardless of system locale
- Flag `--lang pt` forces Portuguese messages regardless of system locale
- Env `SQLITE_GRAPHRAG_LANG=pt` overrides system locale when `--lang` is absent
- Unknown locales default to English without emitting any warning to stderr
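The precedence described above (flag, then environment variable, then system locale, with English as the silent fallback) can be sketched as a small resolver. The names `Lang`, `parse_lang`, and `resolve_lang` are hypothetical, and so is the choice to accept region-tagged values such as `pt_BR.UTF-8`. The sketch also assumes that the first source present decides the outcome even when its value is unknown.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Lang {
    En,
    Pt,
}

/// Maps a locale string such as "pt", "pt-BR" or "pt_BR.UTF-8" to a language.
/// Only the primary subtag is inspected; unknown values yield None.
fn parse_lang(raw: &str) -> Option<Lang> {
    let primary = raw
        .trim()
        .split(|c: char| c == '-' || c == '_' || c == '.')
        .next()
        .unwrap_or("");
    match primary.to_ascii_lowercase().as_str() {
        "en" => Some(Lang::En),
        "pt" => Some(Lang::Pt),
        _ => None,
    }
}

/// Resolves the output language from the `--lang` flag, the
/// `SQLITE_GRAPHRAG_LANG` variable, and the system locale, in that order.
fn resolve_lang(flag: Option<&str>, env: Option<&str>, system: Option<&str>) -> Lang {
    // The first source that is present decides; an unrecognised value
    // falls back to English without any warning.
    flag.or(env).or(system).and_then(parse_lang).unwrap_or(Lang::En)
}
```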
## What Changed in v1.0.68
### Critical fixes (G28 + G29)
- v1.0.68 is the first release since v1.0.65 that compiles on Windows via `cargo install`. v1.0.66 and v1.0.67 broke with `error[E0308]` in `src/terminal.rs:29` because `HANDLE` in `windows-sys >= 0.59` is `*mut c_void` (was `isize` in 0.48/0.52). Fix: `!handle.is_null() && handle != INVALID_HANDLE_VALUE` plus `windows-sys` pinned to `=0.59.0` exact, plus CI job `windows-build-check`.
- `enrich`, `ingest --mode claude-code`, and `ingest --mode codex` now acquire a per-namespace singleton via `lock::acquire_job_singleton(job_type, namespace, wait_seconds)`. A second concurrent invocation on the same database returns `AppError::JobSingletonLocked { job_type, namespace }` (exit 75, retryable) instead of stacking 4 × N workers × 10 MCP servers.
- `claude_runner::build_claude_command` now respects `SQLITE_GRAPHRAG_CLAUDE_EMPTY_CONFIG_DIR` (opt-in). When set to an empty directory, the subprocess is spawned with `CLAUDE_CONFIG_DIR=<that dir>`, suppressing user-scoped MCP servers. This is the only mechanism Claude Code actually honours — `--strict-mcp-config` and `--mcp-config '{}'` are silently ignored per [anthropics/claude-code#10787].
- `enrich` emits a `tracing::warn!` when `--llm-parallelism > 4`, recommending the combination with `CLAUDE_CONFIG_DIR` override.
- `retry::CircuitBreaker` helper added with `AttemptOutcome::{Success, Transient, HardFailure}`. Rate-limited and timeout errors are explicitly excluded from the failure count.
- 3 pre-existing test failures in `src/commands/{history,list,read}.rs` fixed (timezone-agnostic assertions).
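Because a held job singleton surfaces as exit code 75 (retryable), a caller can wrap the CLI in a bounded retry loop. The sketch below shows one way to do that from Rust. The helper names `run_cli` and `retry_on_exit_75`, the exponential backoff policy, and the `-1` sentinel for a failed spawn are assumptions for illustration, not part of the tool.

```rust
use std::process::Command;
use std::time::Duration;

/// Exit code the CLI uses for retry-later conditions.
const EXIT_RETRY_LATER: i32 = 75;

/// Runs `sqlite-graphrag` with the given arguments and returns its exit code.
/// Returns -1 (an arbitrary sentinel) if the process could not be spawned
/// or was terminated by a signal.
fn run_cli(args: &[&str]) -> i32 {
    Command::new("sqlite-graphrag")
        .args(args)
        .status()
        .ok()
        .and_then(|status| status.code())
        .unwrap_or(-1)
}

/// Calls `run` until it returns something other than 75 or `max_attempts`
/// is exhausted. Returns the last exit code observed.
fn retry_on_exit_75<F>(mut run: F, max_attempts: u32, base_delay: Duration) -> i32
where
    F: FnMut() -> i32,
{
    let mut last = EXIT_RETRY_LATER;
    for attempt in 0..max_attempts {
        last = run();
        if last != EXIT_RETRY_LATER {
            return last;
        }
        // Exponential backoff (base, 2x, 4x, ...); no sleep after the final try.
        if attempt + 1 < max_attempts {
            std::thread::sleep(base_delay * 2u32.saturating_pow(attempt));
        }
    }
    last
}
```

A call such as `retry_on_exit_75(|| run_cli(&["enrich", "--operation", "body-enrich", "--mode", "codex", "--json"]), 5, Duration::from_secs(2))` would keep an agent alive while another `enrich` job holds the lock.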
## What Changed in v1.0.69
### OAuth-Only Enforcement (BEHAVIOUR CHANGE — REQUIRED)
- `claude -p` and `codex exec` spawns now ABORT with `AppError::Validation` if either `ANTHROPIC_API_KEY` or `OPENAI_API_KEY` is set. OAuth (Claude Pro/Max or ChatGPT Pro) is the only accepted credential flow. See `docs/decisions/adr-0011-oauth-only-enforcement.md`.
- The `--bare` flag is REMOVED from every executable path. `ANTHROPIC_API_KEY` and `OPENAI_API_KEY` are also excluded from the `env_clear` whitelist as defence in depth.
### `enrich` Subcommand (G29 + G35 + G37)
- `enrich --operation <op> --mode <claude-code|codex> --json` runs LLM-curated graph quality operations. Three operations are fully implemented: `memory-bindings`, `entity-descriptions`, and `body-enrich`. `body-enrich` now succeeds 100% of the time after the G29 hotfix on the `source` CHECK constraint, and its edits are recorded in the G29 audit trail via `memory_versions`.
- `--preserve-threshold <FLOAT>` (default 0.7) is the Jaccard trigram preservation gate from `src/preservation.rs` (10 tests). Scores below the threshold are rejected as `EnrichItemResult::PreservationFailed`.
- `--preflight-check`, `--fallback-mode <claude-code|codex>`, `--rate-limit-buffer <SECONDS>` (default 300) prevent batch loss when the Claude OAuth 5-hour window closes mid-run.
- `--names <a,b,c>` and `--names-file <PATH>` select a specific subset of memory names instead of scanning all candidates. `--names-file` accepts `#` comments and blank lines. Both flags combine as a union.
- The `--llm-parallelism <N>` warning depends on the mode: Claude warns from 5 upward (OAuth-MCP fan-out), Codex warns from 17 upward (rate-limit risk), and Codex values 5..16 stay silent.
- `--max-load-check` refuses to start when load average > `2 × ncpus`. `--circuit-breaker-threshold <N>` (default 5) aborts after N consecutive `HardFailure` outcomes.
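The `--preserve-threshold` gate compares the original and enriched bodies by Jaccard similarity over character trigrams. A minimal sketch of that measure follows. It is not the code in `src/preservation.rs`: the normalization (lowercasing, collapsing whitespace), the treatment of two empty inputs as identical, and the function names are assumptions.

```rust
use std::collections::HashSet;

/// Collects the set of character trigrams of `text` after lowercasing
/// and collapsing runs of whitespace into single spaces.
fn trigrams(text: &str) -> HashSet<String> {
    let normalized: Vec<char> = text
        .split_whitespace()
        .collect::<Vec<_>>()
        .join(" ")
        .to_lowercase()
        .chars()
        .collect();
    normalized
        .windows(3)
        .map(|window| window.iter().collect::<String>())
        .collect()
}

/// Jaccard similarity |A ∩ B| / |A ∪ B| over the two trigram sets.
/// Two inputs without any trigram are treated as identical (score 1.0).
fn jaccard_trigram(original: &str, enriched: &str) -> f64 {
    let a = trigrams(original);
    let b = trigrams(enriched);
    if a.is_empty() && b.is_empty() {
        return 1.0;
    }
    let intersection = a.intersection(&b).count() as f64;
    let union = a.union(&b).count() as f64;
    intersection / union
}

/// Scores below the threshold are rejected; equal or higher passes.
fn passes_preservation(original: &str, enriched: &str, threshold: f64) -> bool {
    jaccard_trigram(original, enriched) >= threshold
}
```

Under this definition, an enriched body that keeps most of the original wording scores close to 1.0, while a wholesale rewrite falls below the default 0.7 and would be reported as a preservation failure.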
### `vec` Subcommand Family (G39)
- `vec orphan-list --json` lists orphan memory embedding rows with `vector_hash` (BLAKE3 of the embedding blob).
- `vec purge-orphan --yes [--dry-run] --json` purges the THREE vec tables (`vec_memories`, `vec_entities`, `vec_chunks`) in a single transaction.
- `vec stats --json` exposes `vec_memories_rows`, `vec_entities_rows`, `vec_chunks_rows`, `orphans`, and the last vacuum timestamp.
- `forget` now calls `memories::delete_vec` BEFORE the soft-delete, preventing new orphans in the steady state.
### `codex-models` Subcommand (G33)
- `codex-models --json` lists the ChatGPT Pro OAuth accepted-model whitelist: `codex-auto-review`, `gpt-5.3-codex-spark`, `gpt-5.4`, `gpt-5.4-mini`, `gpt-5.5`. Returns `models`, `count`, `default`.
- `codex-models --suggest <substring> --json` returns the closest match via substring lookup with a Levenshtein fallback. `enrich --codex-model-validate` (default true) checks the model BEFORE the subprocess is spawned. `--codex-model-fallback <MODEL>` auto-substitutes instead of aborting.
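The `--suggest` behaviour (substring lookup first, Levenshtein fallback second) can be illustrated with the whitelist quoted above. This is a hedged sketch: the function names, the preference for the shortest substring match, and the first-wins tie-break are assumptions, not the tool's documented rules.

```rust
/// Whitelist as listed in this README.
const CODEX_MODELS: [&str; 5] = [
    "codex-auto-review",
    "gpt-5.3-codex-spark",
    "gpt-5.4",
    "gpt-5.4-mini",
    "gpt-5.5",
];

/// Classic dynamic-programming edit distance over Unicode scalar values.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] holds the distance between a[..i] and b[..j].
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut curr = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let substitution = prev[j] + usize::from(ca != cb);
            let insertion = curr[j] + 1;
            let deletion = prev[j + 1] + 1;
            curr.push(substitution.min(insertion).min(deletion));
        }
        prev = curr;
    }
    prev[b.len()]
}

/// Returns the closest whitelisted model for `query` (hypothetical helper).
fn suggest_model(query: &str) -> Option<&'static str> {
    let needle = query.trim().to_lowercase();
    if needle.is_empty() {
        return None;
    }
    // Substring lookup: prefer the shortest model that contains the needle.
    let substring_hit = CODEX_MODELS
        .iter()
        .copied()
        .filter(|model| model.contains(needle.as_str()))
        .min_by_key(|model| model.len());
    if substring_hit.is_some() {
        return substring_hit;
    }
    // Fallback: smallest edit distance; the first model wins on ties.
    CODEX_MODELS
        .iter()
        .copied()
        .min_by_key(|model| levenshtein(model, &needle))
}
```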
### `optimize` and `backup` Hardening (G36 + G38)
- `optimize` pre-checks FTS5 health via `check_fts_functional` BEFORE rebuilding. `--fts-dry-run` exits 1 if rebuild is recommended. `--fts-progress <N>` (default 30) emits progress every N seconds. `--yes` skips the confirmation prompt. `--no-fts-skip-when-functional` forces a rebuild.
- `backup` defaults to `run_to_completion(1000, Duration::from_millis(5), None)` — 25x faster than the v1.0.68 defaults. `--backup-step-size <PAGES>`, `--backup-step-sleep-ms <MS>`, `--backup-no-sleep`, `--backup-progress <PAGES>` (default 100) provide tunability.
### Singleton Scoped by `db_hash` (G30)
- `lock::acquire_job_singleton(job_type, namespace, db_path, wait_seconds, force)`. Two concurrent `enrich` invocations against DIFFERENT databases no longer collide. `db_hash` is the first 12 hex chars of `blake3(canonicalize(db_path))`.
- `--wait-job-singleton <SECONDS>` polls for the lock. `--force-job-singleton` breaks a stale lock. Both available on `enrich` and `ingest`.
### Codex Spawn Helper Unified (G31 + G32 + G33)
- `src/commands/codex_spawn.rs` (~700 lines, 11 tests) unifies the spawn pipeline, JSONL parser, and ChatGPT Pro OAuth model validation. Both `enrich --mode codex` and `ingest --mode codex` consume the same canonical command. The external `~/.local/bin/codex-clean` wrapper is now obsolete.
- 7 hardening flags: `--json --output-schema --ephemeral --skip-git-repo-check --sandbox read-only --ignore-user-config --ignore-rules` plus `-c mcp_servers='{}' --ask-for-approval never`. Schema JSON now lives in `paths::AppPaths::cache_dir().join("schemas")` instead of `/tmp`.
### `MemorySource` Enum and Preservation (G29)
- `src/memory_source.rs` defines a type-safe enum of the five CHECK-constraint values: `Agent`, `User`, `System`, `Import`, `Sync`. `TryFrom<&str>` returns `AppError::Validation` listing the accepted values. Runtime guard `validate_source` is called from `memories::insert` and `memories::update`.
- Idempotency via `blake3::hash`: when `old_hash == new_hash`, the body is skipped with reason `"enriched body hash matches original (blake3:{hash}); idempotency skip"`. Reprocessing the same memory is safe.
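A type-safe source enum with a validating `TryFrom<&str>` might look like the sketch below. The five variants come from the bullet above. The lowercase string forms, the `ALL` constant, the `as_str` helper, and the use of `String` as the error type (standing in for `AppError::Validation`) are assumptions made for illustration.

```rust
use std::convert::TryFrom;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MemorySource {
    Agent,
    User,
    System,
    Import,
    Sync,
}

impl MemorySource {
    /// Every accepted variant, in a fixed order for error messages.
    const ALL: [MemorySource; 5] = [
        MemorySource::Agent,
        MemorySource::User,
        MemorySource::System,
        MemorySource::Import,
        MemorySource::Sync,
    ];

    /// String form stored in the database (assumed to be lowercase).
    fn as_str(self) -> &'static str {
        match self {
            MemorySource::Agent => "agent",
            MemorySource::User => "user",
            MemorySource::System => "system",
            MemorySource::Import => "import",
            MemorySource::Sync => "sync",
        }
    }
}

impl<'a> TryFrom<&'a str> for MemorySource {
    // A plain String stands in for the project's validation error here.
    type Error = String;

    fn try_from(value: &'a str) -> Result<Self, Self::Error> {
        MemorySource::ALL
            .iter()
            .copied()
            .find(|source| source.as_str() == value)
            .ok_or_else(|| {
                let accepted: Vec<&str> =
                    MemorySource::ALL.iter().map(|s| s.as_str()).collect();
                format!(
                    "invalid source '{}'; accepted values: {}",
                    value,
                    accepted.join(", ")
                )
            })
    }
}
```

Parsing at the boundary means an invalid value is rejected with a readable message before it can reach the SQLite CHECK constraint.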
### Circuit Breaker and System Load (G28-D)
- `retry::CircuitBreaker` is integrated into the worker loop with `breaker.record(AttemptOutcome::HardFailure)`. The loop aborts after `--circuit-breaker-threshold` consecutive failures (default 5, set to 0 to disable).
- `src/system_load.rs` provides `load_average_one`, `ncpus`, and `is_system_saturated`. `enrich` aborts the spawn when `load_average_one() > 2 * ncpus` and `--max-load-check` is set (default true).
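The breaker and load-check rules above can be modelled in a few lines. The sketch reuses the names `CircuitBreaker`, `AttemptOutcome`, `record`, and `is_system_saturated` from this README, but the bodies are illustrative. In particular, it assumes that a `Success` resets the consecutive count and that a `Transient` outcome leaves it unchanged.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AttemptOutcome {
    Success,
    Transient,
    HardFailure,
}

#[derive(Debug)]
struct CircuitBreaker {
    /// Number of consecutive hard failures that opens the breaker; 0 disables it.
    threshold: u32,
    consecutive_hard_failures: u32,
}

impl CircuitBreaker {
    fn new(threshold: u32) -> Self {
        CircuitBreaker {
            threshold,
            consecutive_hard_failures: 0,
        }
    }

    /// Records one attempt and returns true when the worker loop should abort.
    fn record(&mut self, outcome: AttemptOutcome) -> bool {
        match outcome {
            // Assumption: any success resets the streak.
            AttemptOutcome::Success => self.consecutive_hard_failures = 0,
            // Rate limits and timeouts are excluded from the failure count.
            AttemptOutcome::Transient => {}
            AttemptOutcome::HardFailure => self.consecutive_hard_failures += 1,
        }
        self.is_open()
    }

    fn is_open(&self) -> bool {
        self.threshold > 0 && self.consecutive_hard_failures >= self.threshold
    }
}

/// Mirrors the documented rule: saturated when the one-minute load
/// average exceeds twice the number of CPUs.
fn is_system_saturated(load_average_one: f64, ncpus: usize) -> bool {
    load_average_one > 2.0 * ncpus as f64
}
```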
### Orphan Reaper (G28-C)
- `src/reaper.rs` walks `/proc` at startup, kills any `claude`/`codex` orphan with `PPID=1` and age greater than 60s. Invoked from `main` BEFORE any work. 4-test suite: `orphan_min_age_is_one_minute`, `orphan_targets_include_claude_and_codex`, `reaper_report_starts_zeroed`, `scan_completes_without_panic_on_linux`.
## What Changed in v1.0.76
### LLM-Only One-Shot Architecture (G21 + G22 + G23 + G24 + G25)
- The default build is LLM-Only and one-shot. No daemon, no ONNX runtime, no `multilingual-e5-small` model download. Embedding generation and NER delegate to a headless `claude code` or `codex` subprocess (OAuth, no MCP, no hooks). Release binary is approximately 6 MB.
- `cargo install sqlite-graphrag --features embedding-legacy --locked` restores the legacy fastembed + ort + tokenizers pipeline for the v1.0.76 → v1.1.0 transition window. The feature is REMOVED in v1.1.0.
- See ADR-0019 (LLM-Only One-Shot), ADR-0020 (Pure-Rust Cosine), ADR-0021 (Daemon Deprecation), ADR-0022 (BLOB-Backed Embeddings), ADR-0023 (Tokenizer Removal), ADR-0024 (FTS5 Coarse Filter + Cosine Refinement), ADR-0025 (OAuth-Only LLM Credential Flow), ADR-0026 (V002 `vec_tables` Migration Drift).
### `migrate` Subcommand Family
- `migrate --rehash --json` rewrites recorded migration checksums via `SipHasher13(name|version|sql)`. The algorithm matches `refinery-core 0.9.1` (same SipHasher13 crate, same hashing order). Required for v1.0.74 → v1.0.76 upgrades where V002 was intentionally emptied to a no-op. Response schema: `migrate-rehash.schema.json`.
- `migrate --to-llm-only --drop-vec-tables --json` is the one-shot upgrade for v1.0.74 / v1.0.75 databases: rehash + V013 vec-table drop + vec-table state report. The `--drop-vec-tables` flag is REQUIRED as a safety guard. Response schema: `migrate-to-llm-only.schema.json`.
### BLOB-Backed Embedding Tables (G22)
- V013 migration drops the `vec_memories`, `vec_entities`, `vec_chunks` virtual tables and replaces them with regular BLOB-backed `memory_embeddings`, `entity_embeddings`, `chunk_embeddings` tables. Cosine similarity is computed in pure Rust on demand in `src/similarity.rs`.
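To make "cosine in pure Rust over BLOB embeddings" concrete, here is a small sketch of a brute-force top-k scan. It is not the code in `src/similarity.rs`. It assumes each BLOB is a packed sequence of little-endian `f32` values, and the function names and the `(id, blob)` row shape are hypothetical.

```rust
/// Decodes a BLOB of consecutive little-endian f32 values.
/// Returns None when the length is not a multiple of four bytes.
fn decode_embedding(blob: &[u8]) -> Option<Vec<f32>> {
    if blob.len() % 4 != 0 {
        return None;
    }
    Some(
        blob.chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}

/// Cosine similarity; None for mismatched lengths, empty or zero-norm vectors.
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    // Accumulate in f64 to limit rounding error on long vectors.
    let (mut dot, mut norm_a, mut norm_b) = (0.0f64, 0.0f64, 0.0f64);
    for (x, y) in a.iter().zip(b) {
        let (x, y) = (f64::from(*x), f64::from(*y));
        dot += x * y;
        norm_a += x * x;
        norm_b += y * y;
    }
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some((dot / (norm_a.sqrt() * norm_b.sqrt())) as f32)
}

/// Scores every (id, blob) row against the query and keeps the k best.
/// Rows that fail to decode or compare are skipped.
fn top_k(query: &[f32], rows: &[(i64, Vec<u8>)], k: usize) -> Vec<(i64, f32)> {
    let mut scored: Vec<(i64, f32)> = rows
        .iter()
        .filter_map(|(id, blob)| {
            let embedding = decode_embedding(blob)?;
            let score = cosine_similarity(query, &embedding)?;
            Some((*id, score))
        })
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}
```

A linear scan like this needs no index or extension. That fits a design in which FTS5 narrows the candidate set before cosine refinement.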
### Hybrid Search Refinement (G24)
- `hybrid-search` uses FTS5 for coarse filtering and refines the candidate set with a pure-Rust cosine over the BLOB embeddings. FTS5 stays healthy because the rebuild is gated by `optimize --fts-skip-when-functional` (G36 from v1.0.69).
### Extraction Backend Selector
- New `--extraction-backend llm|embedding|none|both` global flag (default `llm`) selects the extraction backend. `llm` is the LLM-backed path; `embedding` is the legacy fastembed pipeline (requires `embedding-legacy` feature); `none` is a no-op; `both` runs them in parallel and merges the results.
- `src/extract/` exposes the `ExtractionBackend` trait with the four implementations. `src/spawn/` exposes the `VersionAdapter` trait with `CodexAdapter` (detects `codex 0.130.0` through `0.138+` and adapts flags), `ClaudeAdapter` (claude code 2.1.0+), and `OpencodeAdapter` (opencode headless).
### Daemon Deprecation (ADR-0021)
- The `daemon` subcommand is DEPRECATED and kept for source compatibility through v1.0.76 → v1.1.0. The daemon no longer offers a speedup because the LLM subprocess is the new "model loader". REMOVED in v1.1.0.
## Stable Identity
### Facts that remain invariant across doc updates
- Package `sqlite-graphrag` v1.0.76 published on crates.io with MSRV Rust 1.88
- Repository `https://github.com/daniloaguiarbr/sqlite-graphrag` with CI on push and tag
- License dual `MIT OR Apache-2.0` with individual license files at the repository root
- Embedding generation: delegated to a headless `claude code` or `codex` subprocess (OAuth) per call. The Rust binary holds no model.
- Storage layer: `rusqlite` with bundled SQLite and the FTS5 module; embeddings live in regular BLOB tables (the `sqlite-vec` extension was removed in v1.0.76)
## Acknowledgments
### Built on top of excellent open source
- `fastembed` provides local quantized embedding models for the optional `embedding-legacy` feature
- `sqlite-vec` provided vector indexes inside SQLite as an extension through v1.0.75 (removed in v1.0.76)
- `refinery` runs schema migrations with transactional safety guarantees
- `clap` powers the CLI argument parsing with derive macros
- `rusqlite` wraps SQLite with safe Rust bindings and bundled build
- `rayon` accelerates parallel batch operations on multicore hardware
- `tokio` drives the async runtime for I/O-bound orchestration tasks
## License
### Dual license MIT OR Apache-2.0
- Licensed under either of Apache License 2.0 or MIT License at your option
- See `LICENSE-APACHE` and `LICENSE-MIT` in the repository root for full text