sqlite-graphrag
Persistent memory for AI agents in a single Rust binary with built-in GraphRAG. Current release: v1.0.79, LLM-only and one-shot. Every build embeds through claude -p or codex exec (OAuth, no MCP, no hooks). No daemon, no ONNX runtime, ~6 MB binary. The embedding-legacy feature was REMOVED in v1.0.79; there is no local-model build path.
- Read this document in Portuguese (pt-BR).
- Portuguese version available at README.pt-BR.md
- Public package and repository are live on GitHub and crates.io
- Install the latest published release with cargo install sqlite-graphrag --locked
- Upgrade an existing install with cargo install sqlite-graphrag --locked --force
- Verify the active binary with sqlite-graphrag --version
- See CHANGELOG.md for the full release history
- Release-grade validation includes the slow-tests contract suites documented in docs/TESTING.md
- Build directly from the local checkout with cargo install --path .
- Upgrading from v1.0.74 / v1.0.75? See docs/MIGRATION.md for the v1.0.76 / v1.0.77 / v1.0.78 migration procedure
What is it?
sqlite-graphrag delivers durable memory for AI agents
- Stores memories, entities and relationships inside a single SQLite file under 25 MB
- Build (v1.0.79): LLM-only and one-shot; embeddings are generated by spawning claude -p or codex exec with OAuth; no local model, no daemon, no ONNX runtime, ~6 MB binary
- Legacy build: REMOVED in v1.0.79; the embedding-legacy feature and the local fastembed/ONNX path no longer exist
- Combines FTS5 full-text search with pure-Rust cosine similarity into a hybrid Reciprocal Rank Fusion ranker
- Stores and traverses an explicit entity graph with typed edges for multi-hop recall across memories
- Preserves every edit through an immutable version history table for full audit
- Runs on Linux, macOS and Windows natively with zero external services required (default build needs the claude or codex CLI on PATH)
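The hybrid ranker mentioned in the list above can be pictured with a short Python sketch. It is an illustration only, not the crate's Rust code: the function names and sample memory names are hypothetical, and k = 60 is the constant commonly used for Reciprocal Rank Fusion, assumed here because this README does not state the tool's default.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of ids: score(id) = sum over lists of 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the output is deterministic.
    return sorted(scores, key=lambda item: (-scores[item], item))

# Made-up result lists: one from full-text search, one ordered by cosine similarity.
fts_hits = ["deploy-notes", "auth-design", "incident-42"]
vector_hits = ["auth-design", "incident-42", "deploy-notes"]
fused = reciprocal_rank_fusion([fts_hits, vector_hits])
```

Because RRF only looks at ranks, the full-text and vector scores never need to be put on a common scale before fusing.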
Why sqlite-graphrag?
Differentiators against cloud RAG stacks
- OAuth-only LLM flow: no API keys ever in the environment; the spawn ABORTS if ANTHROPIC_API_KEY or OPENAI_API_KEY is set (defence in depth since v1.0.69)
- No recurring embedding fees: embeddings come from your existing Claude Pro / Max or ChatGPT Pro subscription
- Single-file SQLite storage replaces Docker clusters of vector databases entirely
- Graph-native retrieval beats pure vector RAG on multi-hop questions by design
- Deterministic JSON output unlocks clean orchestration by LLM agents in pipelines
- Native cross-platform binary ships without Python, Node or Docker dependencies (default build needs only the claude or codex CLI)
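An orchestrating script can run the same check before it calls the binary, so a stray API key is reported early instead of as an aborted spawn. The sketch below is a hypothetical wrapper-side preflight in Python, not the tool's own code; it assumes that a variable counts as "set" whenever it is present in the environment, even with an empty value.

```python
import os

FORBIDDEN_KEYS = ("ANTHROPIC_API_KEY", "OPENAI_API_KEY")

def find_forbidden_keys(env=None):
    """Return the API-key variables present in the given environment mapping."""
    env = os.environ if env is None else env
    # Assumption: presence alone counts as "set", regardless of the value.
    return [name for name in FORBIDDEN_KEYS if name in env]

def check_oauth_only(env=None):
    """Raise before spawning anything if an API key variable is present."""
    offending = find_forbidden_keys(env)
    if offending:
        raise RuntimeError("unset before running sqlite-graphrag: " + ", ".join(offending))
```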
Superpowers for AI Agents
First-class CLI contract for orchestration
- Every subcommand accepts --json, producing deterministic stdout payloads
- v1.0.76 is one-shot by default: no background process; each embedding call spawns a fresh claude -p or codex exec
- Every write is idempotent through --name kebab-case uniqueness constraints
- Stdin is explicit: use --body-stdin for body text or --graph-stdin for one {body?, entities, relationships} object; raw entity and relationship arrays use --entities-file and --relationships-file
- remember accepts body payloads up to 512000 bytes and up to 512 chunks
- Relationship payloads use strength in [0.0, 1.0], mapped to weight in outputs
- Stderr carries tracing output under SQLITE_GRAPHRAG_LOG_LEVEL=debug only
- --help is English-first by design; use --lang for human-facing runtime messages, not static clap help text
- Cross-platform behavior is identical across Linux, macOS and Windows hosts
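As a rough illustration of the contract above, an agent could wrap remember in a small Python helper. This is a sketch under stated assumptions: the helper names are hypothetical, only flags listed in this README are used, and the shape of the returned JSON is not assumed beyond it being valid JSON on stdout.

```python
import json
import subprocess

MAX_BODY_BYTES = 512000  # body limit stated in this README

def build_remember_argv(name, memory_type, description):
    """Argument vector for `remember`: body read from stdin, JSON on stdout."""
    return [
        "sqlite-graphrag", "remember",
        "--name", name,
        "--type", memory_type,
        "--description", description,
        "--body-stdin",
        "--json",
    ]

def remember(name, memory_type, description, body):
    """Run the CLI once and return its parsed stdout payload."""
    payload = body.encode("utf-8")
    # Check the size locally so an oversized body never reaches the subprocess.
    if len(payload) > MAX_BODY_BYTES:
        raise ValueError(f"body is {len(payload)} bytes; limit is {MAX_BODY_BYTES}")
    result = subprocess.run(
        build_remember_argv(name, memory_type, description),
        input=payload,
        capture_output=True,
        check=True,  # a non-zero exit code raises CalledProcessError
    )
    return json.loads(result.stdout)
```

Passing the body through stdin keeps large text out of the argument vector, and check=True turns the documented exit codes into exceptions the caller can handle.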
Graph Schema
Entity types, relation labels and edge strength
- entity_type accepts exactly 13 values: project, tool, person, file, concept, incident, decision, memory, dashboard, issue_tracker, organization, location, date
- relation (CLI input) accepts any kebab-case or snake_case string. 12 canonical values are well-known: applies-to, uses, depends-on, causes, fixes, contradicts, supports, follows, related, mentions, replaces, tracked-in. Custom values (e.g., implements, tested-by, blocks) are accepted with a tracing::warn!. JSON output normalizes to underscores (e.g., applies_to).
- strength is a float in [0.0, 1.0] representing edge weight; mapped to weight in all read outputs
- Unlisted entity_type values are rejected at write time with exit code 1. Custom relation values are accepted since v1.0.49.
- Use sqlite-graphrag graph --format json to inspect the full stored graph at any time
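A caller that builds graph payloads can apply the same rules before writing. The following Python sketch mirrors the constraints listed above; the function names are hypothetical and the checks are a client-side approximation, not the binary's own validation.

```python
ENTITY_TYPES = {
    "project", "tool", "person", "file", "concept", "incident", "decision",
    "memory", "dashboard", "issue_tracker", "organization", "location", "date",
}
CANONICAL_RELATIONS = {
    "applies-to", "uses", "depends-on", "causes", "fixes", "contradicts",
    "supports", "follows", "related", "mentions", "replaces", "tracked-in",
}

def validate_entity_type(entity_type):
    """Reject any entity_type outside the 13 accepted values."""
    if entity_type not in ENTITY_TYPES:
        raise ValueError(f"unknown entity_type: {entity_type!r}")
    return entity_type

def validate_strength(strength):
    """Edge strength must lie in the closed interval [0.0, 1.0]."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError(f"strength out of range: {strength}")
    return strength

def is_canonical_relation(relation):
    """True for the 12 well-known relations, in kebab-case or snake_case spelling."""
    return relation.replace("_", "-") in CANONICAL_RELATIONS

def normalize_relation(relation):
    """Underscore form used in JSON output, e.g. applies-to becomes applies_to."""
    return relation.replace("-", "_")
```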
27 AI agents and IDEs supported out of the box (21 catalogued + 6 community)
| Agent | Vendor | Minimum version | Integration pattern |
|---|---|---|---|
| Claude Code | Anthropic | 1.0 | Subprocess with --json stdout |
| Codex | OpenAI | 1.0 | Tool call wrapping cargo run -- recall |
| Gemini CLI | Google | 1.0 | Function call returning JSON |
| Opencode | Opencode | 1.0 | Shell tool with hybrid-search --json |
| OpenClaw | Community | 0.1 | Subprocess pipe into jaq filters |
| Paperclip | Community | 0.1 | Direct CLI invocation per message |
| VS Code Copilot | Microsoft | 1.85 | Terminal subprocess via tasks |
| Google Antigravity | Google | 1.0 | Agent tool with structured JSON |
| Windsurf | Codeium | 1.0 | Custom command registration |
| Cursor | Anysphere | 0.42 | Terminal integration or MCP wrapper |
| Zed | Zed Industries | 0.160 | Extension wrapping subprocess |
| Aider | Paul Gauthier | 0.60 | Shell command hook per turn |
| Jules | Google Labs | 1.0 | Workspace shell integration |
| Kilo Code | Community | 1.0 | Subprocess invocation |
| Roo Code | Community | 1.0 | Custom command via CLI |
| Cline | Saoud Rizwan | 3.0 | Terminal tool registered manually |
| Continue | Continue Dev | 0.9 | Context provider via shell |
| Factory | Factory AI | 1.0 | Tool call with JSON response |
| Augment Code | Augment | 1.0 | Terminal command wrapping |
| JetBrains AI Assistant | JetBrains | 2024.3 | External tool per IDE |
| OpenRouter | OpenRouter | 1.0 | Function routing through shell |
| Minimax | Minimax | 1.0 | Subprocess invocation |
| Z.ai | Z.ai | 1.0 | Subprocess invocation |
| Ollama | Ollama | 0.1 | Subprocess invocation |
| Hermes Agent | Community | 1.0 | Subprocess invocation |
| LangChain | LangChain | 0.3 | Subprocess via tool |
| LangGraph | LangChain | 0.2 | Subprocess via node |
Quick Start
Install and record your first memory in four commands
Required flags for remember: --name, --type, --description. Body via --body "text", --body-file <path>, or --body-stdin (pipe from stdin). Body limit: 500 KB (512000 bytes). Larger inputs are rejected with exit code 6 (limit exceeded); split into multiple memories or trim before sending.
Windows users (G29): v1.0.68 is the first release since v1.0.65 that successfully compiles via cargo install on Windows. If you must stay on v1.0.66 or v1.0.67, see docs/CROSS_PLATFORM.md for the manual workaround.
- GraphRAG is enabled by default and runs automatically. Every subcommand auto-initializes graphrag.sqlite in the current working directory if it does not exist. Entity/relationship extraction comes from the LLM backend (--extraction-backend llm, the default) or from curated graph input (--graph-stdin, --entities-file).
Automatic extraction (--enable-ner)
- Pass --enable-ner or set SQLITE_GRAPHRAG_ENABLE_NER=1 to activate automatic extraction on remember and ingest
- Since v1.0.79 this runs URL-regex extraction ONLY; the local GLiNER zero-shot pipeline was removed together with the ner-legacy feature
- --gliner-variant, SQLITE_GRAPHRAG_GLINER_MODEL and SQLITE_GRAPHRAG_GLINER_THRESHOLD are still accepted for compatibility but have NO effect
- Response field extraction_method reports url-regex, regex-only, or none:extraction-failed
- For high-quality entity/relationship extraction prefer ingest --mode claude-code / --mode codex (LLM-curated) or pass curated entities via --graph-stdin
- --skip-extraction is deprecated since v1.0.45 and has no effect
- sqlite-graphrag init is OPTIONAL but recommended on first use because it creates the database, applies migrations and validates that a claude or codex CLI is reachable on PATH (there is no model download since v1.0.76; embeddings come from the LLM subprocess).
- graphrag.sqlite is created in the current working directory by default (override with --db <path> or SQLITE_GRAPHRAG_DB_PATH)
- For the local checkout, cargo install --path . is enough
- Re-run sqlite-graphrag --version after any upgrade to confirm the active binary
- After the public release, prefer --locked to preserve the tested MSRV dependency graph
Version Highlights
- v1.0.79: G42 closed. The LLM embedding pipeline is no longer slow, serialized or fragile. (S1) configurable embedding dimensionality, default 64 (--embedding-dim, SQLITE_GRAPHRAG_EMBEDDING_DIM, range [8, 4096]; precedence flag > env > schema_meta.dim > 64; existing 384-dim databases keep working unchanged, ZERO schema change). (S2) batched LLM calls ({items:[{i,v}]}; chunks at 8, entity names at 25 at dim 64, dim-adaptive via clamp(base×64/dim, 1, base) since G44; 39 spawns collapse into 4-5). (S3) real bounded parallelism via Semaphore + JoinSet with the new --llm-parallelism flag on remember (default 4), ingest (default 2) and edit; results stream through a bounded mpsc channel. (S4) codex schema tempfiles are RAII NamedTempFiles; the reaper also removes stale codex-home-{pid} dirs. (S5) SQLITE_GRAPHRAG_CLAUDE_EMBED_MODEL env override. (S6) empty CLAUDE_CONFIG_DIR by default on the embedding path (~40-50s → ~10-15s per call). (S7) actionable codex headless error. (S8) panic-free signal handler (second signal exits 130 with ZERO I/O). (S9) canonical re-embed: enrich --operation re-embed plus edit --force-reembed. (C5) validate_dim errors on divergent vectors instead of silently normalising. Every LLM subprocess uses kill_on_drop plus SQLITE_GRAPHRAG_EMBED_TIMEOUT_SECS (default 300s). Also REMOVED: the daemon infrastructure and the legacy embedding-legacy / ner-legacy / full features with the fastembed/ort/ndarray/tokenizers/hf-hub optional dependencies; every build is LLM-only.
- v1.0.78: G41 fix: migrate --rehash no longer inserts phantom rows for unapplied migrations (V013 was being registered without executing its SQL)
- v1.0.77: G40 fix: the run_rehash INSERT now writes applied_on (RFC3339); a NULL there blocked every subsequent migration
- v1.0.76: Breaking architectural change. The default build becomes LLM-only and one-shot: no daemon, no ONNX runtime, no local model download; embeddings/NER delegate to claude -p or codex exec headless (OAuth). Migration V013 drops the vec_* virtual tables in favour of BLOB-backed embedding tables with pure-Rust cosine similarity. New migrate --rehash and migrate --to-llm-only --drop-vec-tables upgrade paths. 7 new ADRs (0019-0025) plus ADR-0026 documenting the V002 drift root cause
- v1.0.75: new ExtractionBackend trait (G21) behind the global --extraction-backend llm|embedding|none|both flag; LLM-backed extraction becomes the default
- v1.0.74: --skip-extraction no-op compatibility restored (v1.0.45 promise honored); the hard validation error introduced in v1.0.67 reverted to tracing::warn!
- v1.0.73: CI fix: clang/mold/lld installed inside the cross container for aarch64-unknown-linux-gnu builds
- v1.0.72: CI fix: mold linker installed on ubuntu-latest runners (12+ jobs failed with invalid linker name in argument)
- v1.0.71: CI fix: Swatinem/rust-cache repinned from the non-existent v2.8 ref to v2.9.1 across 17 call-sites
- v1.0.70: i18n fix: manual POSIX locale precedence LC_ALL > LC_MESSAGES > LANG (the cached system locale ignored runtime env vars)
- v1.0.69: 12 gaps closed (G28-G39) with full OAuth-only enforcement. (OAuth-only behaviour change) claude -p and codex exec spawns now ABORT with AppError::Validation if ANTHROPIC_API_KEY or OPENAI_API_KEY are set; the --bare flag is REMOVED from all executable code. Operators using API keys MUST migrate to OAuth. (G28 CRITICAL) 4 reinforcing fixes for process proliferation: 7 flags hardening in claude_runner::build_claude_command (always passes --strict-mcp-config --mcp-config '{}' --settings '{"hooks":{}}' --dangerously-skip-permissions), SIGTERM on timeout, new src/reaper.rs walking /proc at startup, and src/system_load.rs plus CircuitBreaker integration. (G29) enrich --operation body-enrich now succeeds 100% (was 100% CHECK constraint failure), with audit trail via memory_versions, type-safe MemorySource enum, Jaccard preservation gate (10 tests, default 0.7), and blake3 idempotency skip. (G30) Singleton lock scoped per (job_type, namespace, db_hash) with new --wait-job-singleton and --force-job-singleton flags. (G31+G32+G33) New src/commands/codex_spawn.rs (~700 lines, 11 tests) unifies spawn pipeline, JSONL parser, and ChatGPT Pro OAuth model validation; enrich --mode codex and ingest --mode codex share the same canonical command (was divergent, motivated the ~/.local/bin/codex-clean wrapper). (G34) Worker warning is conditional on mode (Claude > 4, Codex > 16). (G35) --preflight-check, --fallback-mode, --rate-limit-buffer prevent batch loss on Claude rate limit. (G36) optimize pre-checks FTS5 health before rebuilding, plus new --fts-dry-run, --fts-progress, --yes. (G37) --names <NAME> and --names-file <PATH> for selective enrichment. (G38) Backup defaults 25x faster (1000/5ms vs 100/50ms) with 4 new tuning flags. (G39) New vec orphan-list / vec purge-orphan / vec stats subcommand family plus forget hook to prevent new orphans. +53 tests (692 → 745). 7 new ADRs (docs/decisions/adr-0011-0017-*.md) document every architectural decision.
- v1.0.68: 2 CRITICAL fixes for Windows + process proliferation. (G29) cargo install on Windows was breaking with error[E0308] in src/terminal.rs:29 because HANDLE in windows-sys >= 0.59 is *mut c_void (was isize in 0.48/0.52). Replaced with the type-safe idiom !handle.is_null() && handle != INVALID_HANDLE_VALUE, pinned windows-sys to =0.59.0 exact, and added CI job windows-build-check that runs cargo check --target x86_64-pc-windows-msvc on every push. (G28-B) Added lock::acquire_job_singleton per (job_type, namespace) so two parallel enrich / ingest --mode claude-code|codex invocations against the same database now fail fast with the new exit-75 AppError::JobSingletonLocked { job_type, namespace } instead of stacking 4 × N workers × 10 MCP processes (root cause of the 2026-06-03 276-load-average incident). (G28-A) claude_runner::build_claude_command now respects SQLITE_GRAPHRAG_CLAUDE_EMPTY_CONFIG_DIR: when set to an empty directory, the subprocess is spawned with CLAUDE_CONFIG_DIR=<that dir>, suppressing user-scoped MCP servers and their 8-10-process fan-out. Deliberately avoids --strict-mcp-config / --mcp-config '{}' because [anthropics/claude-code#10787] documents that Claude Code CLI ignores both flags. (G28-D) retry::CircuitBreaker helper plus a tracing::warn! when --llm-parallelism > 4 (combine with CLAUDE_CONFIG_DIR override to keep subprocess fan-out manageable). Also fixed 3 pre-existing test failures in src/commands/{history,list,read}.rs that were leaking the SQLITE_GRAPHRAG_DISPLAY_TZ env var between parallel tests.
- v1.0.67: 2 NEW commands: remember-batch (NDJSON batch memory creation with --transaction / --force-merge), completions (shell completions for Bash/Zsh/Fish/PowerShell/Elvish); read --id for direct memory_id lookup, enrich --llm-parallelism for parallel LLM workers, health super-hub detection (degree > 50), edit skip-embed optimization via body_hash comparison, rename ghost purge for soft-deleted name conflicts, flag validation in hybrid-search/recall/ingest, V012 relationship timestamps migration, 24 gap fixes total
- v1.0.66: 35 BUG/GAP fixes including 3 CRITICAL (reclassify-relation crash, evidence chain flooding, link weight), edit --type flag, graph_context in deep-research, LLM-friendly aliases for graph/list JSON, full doc audit
- v1.0.65: 3 NEW commands: reclassify-relation (bulk relationship type renames with UNIQUE collision handling), normalize-entities (normalize entity names to kebab-case with auto-merge), enrich (LLM-augmented graph quality: memory-bindings, entity-descriptions, body-enrich); CRITICAL deep-research fixes: per-sub-query embeddings (was sharing one), RRF fusion for KNN+FTS5 (was hardcoded 0.5), directed evidence chains (was flat global dump); new deep-research flags --rrf-k, --graph-decay, --graph-min-score, --max-neighbors-per-hop; entity name normalization on all write paths; health reports relation concentration; --max-entity-degree warning on link/remember
- v1.0.64: NEW deep-research command for parallel multi-hop GraphRAG research via query decomposition (up to 7 sub-queries) with bounded JoinSet + Semaphore fan-out and evidence chain assembly; ingest claude-code disables hooks via --settings for OAuth (was failing 65% of files), detects OAuth and omits misleading cost_usd, validates body size BEFORE LLM extraction (files >512 KB skipped); rename/rename-entity reject same-name with exit 1
- v1.0.63: restore preserves current name after rename (was reverting to version's original name), ingest claude-code/codex normalizes relation strings before DB insertion, edit re-generates vector embeddings when body changes, OAuth-first auth docs
- v1.0.62: 10 bug fixes for ingest --mode claude-code (G01 CRITICAL: recall now works), NEW --mode codex for OpenAI Codex CLI extraction, new flags --codex-binary/--codex-model/--codex-timeout
- v1.0.61: 15 bug fixes for ingest --mode claude-code (B00-B13), new --claude-timeout flag, wait-timeout subprocess management
- v1.0.60: NEW ingest --mode claude-code for LLM-curated extraction via Claude Code CLI, queue DB for resume/retry, 7 new ingest flags
- v1.0.59: rename-entity name validation, unlink schema fix, reclassify description_updated field, contract+schema tests for rename-entity, E2E entity validation tests, doc audit (6 files)
- v1.0.58: FTS5 sync fix (CRITICAL: remember --force-merge was silently corrupting FTS5 index), merge-entities UNIQUE fix for memory_entities, new rename-entity command, entity name validation, memory-entities --entity reverse lookup, reclassify --description, purge response action field, fts help EXAMPLES, health tracing
- v1.0.57: 16 fixes: merge-entities UNIQUE constraint, memory-entities column name, --clear-body validation, WAL checkpoint for fts rebuild/check, degree recalculation for delete-entity/merge-entities adjacents, atomic backup via tempfile-rename, 18 new contract+schema tests
- v1.0.56: 9 new commands (fts, backup, delete-entity, reclassify, merge-entities, memory-entities, prune-ner), 7 new flags, 19 new JSON fields, FTS5 graceful degradation, JSON error envelope
- v1.0.55: Full doc audit: export summary total → exported, list response fields corrected, --tz exit code 1→2, exit 2 added to exit code table, stats legacy aliases documented
- v1.0.54: WAL checkpoint for prune-relations (last missing command), --graph-stdin empty body validation, memory_type JSON field in list / export, Vec::with_capacity in 9 cold paths
- v1.0.53: WAL checkpoint TRUNCATE after every write command for Dropbox/cloud-sync safety, export --json contract fix, Vec::with_capacity in 12 hot paths
- v1.0.52: 12 gaps fixed, new export subcommand, exit code Duplicate 2→9 (breaking), forget not-found no JSON (breaking)
- v1.0.51: Namespace env var fix (8 commands), remember on soft-deleted fix, per-chunk RSS watchdog (--max-rss-mb), daemon test coverage
- v1.0.50: prune-relations subcommand, daemon auto-restart on version mismatch, V011 index, 37 doc gaps fixed
- v1.0.49: Extensible relation vocabulary, V010 migration, 15 doc updates
- v1.0.48: GLiNER NER functional, 5 bug fixes, full doc audit
- v1.0.47: Replace BERT NER with GLiNER zero-shot, 13 custom entity types, --gliner-variant flag
- v1.0.35: Flag aliases (--from / --to, --old / --new, --limit as alias of --k)
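The two numeric rules in the v1.0.79 entry, (S1) dimension precedence and (S2) dim-adaptive batch sizes, can be worked through in a few lines of Python. This is a sketch, not the crate's code: the function names are hypothetical, integer (floor) division is assumed for the clamp formula, and applying the [8, 4096] range to every source is also an assumption.

```python
DEFAULT_DIM = 64
MIN_DIM, MAX_DIM = 8, 4096

def resolve_embedding_dim(flag=None, env=None, schema_dim=None):
    """First defined source wins: flag > env > schema_meta.dim > 64."""
    dim = DEFAULT_DIM
    for candidate in (flag, env, schema_dim):
        if candidate is not None:
            dim = int(candidate)  # env values arrive as strings
            break
    if not MIN_DIM <= dim <= MAX_DIM:
        raise ValueError(f"embedding dim {dim} outside [{MIN_DIM}, {MAX_DIM}]")
    return dim

def batch_size(base, dim):
    """clamp(base * 64 / dim, 1, base), using floor division (assumed)."""
    return max(1, min(base, base * 64 // dim))

# Worked values: base 8 for chunks, base 25 for entity names.
chunks_at_64 = batch_size(8, 64)      # 8
names_at_64 = batch_size(25, 64)      # 25
chunks_at_384 = batch_size(8, 384)    # 1
names_at_384 = batch_size(25, 384)    # 4
```

Under these assumptions a legacy 384-dim database sends one chunk or four entity names per LLM call, while the default 64-dim setup keeps the full batch of 8 or 25.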
Memory Lifecycle
Runnable sequence: init → remember → recall → forget → purge
# 1. Initialize (once per database)
# 2. Store a memory
# 3. Retrieve by semantic similarity
# 4. Soft-delete (reversible)
# 5. Permanently remove soft-deleted memories older than 0 days
All five commands above are safe to run in sequence on a fresh database.
Installation
Minimum supported toolchain
- Rust 1.88 or newer (rust-version = "1.88" in Cargo.toml); older toolchains will fail with an MSRV error during cargo install.
Multiple distribution channels
- Install the latest published release with cargo install sqlite-graphrag --locked
- Upgrade an existing published binary with cargo install sqlite-graphrag --locked --force
- Pin to a specific version with cargo install sqlite-graphrag --version <X.Y.Z> --locked
- Install from the local checkout with cargo install --path .
- Build from the local checkout with cargo build --release
Usage
Initialize the database
- Without --db or SQLITE_GRAPHRAG_DB_PATH, every CRUD command in that directory uses ./graphrag.sqlite
Remember a memory with an optional explicit entity graph
- By default, remember does NOT run automatic URL extraction
- Pass --enable-ner to activate URL-regex extraction for that call, or set SQLITE_GRAPHRAG_ENABLE_NER=1 (the GLiNER pipeline was removed in v1.0.79)
- The remember JSON response includes urls_persisted (URLs routed to the memory_urls table) and relationships_truncated (bool, set when relationships were capped)
- URLs are stored in memory_urls via schema V007 and never pollute the entity graph
- Sample JSON output illustrating extracted entities and relationships:
Automatic extraction status (GLiNER removed in v1.0.79)
- The local GLiNER zero-shot NER pipeline was REMOVED in v1.0.79 with the ner-legacy feature; --enable-ner now performs URL-regex extraction only
- For LLM-curated entity/relationship extraction use ingest --mode claude-code or ingest --mode codex
- For exact control pass curated entities via --graph-stdin, --entities-file and --relationships-file
- The extraction_method field in the JSON response reports which path ran
Read, forget, edit and rename using positional name argument
- Positional name is equivalent to --name <name> for read, forget, history, edit and rename
Recall memories by semantic similarity
Hybrid search combining FTS5 and vector KNN
Deep research with parallel multi-hop query decomposition (v1.0.64)
- Decomposes the query into up to 7 sub-queries, runs them in parallel via bounded JoinSet + Semaphore, merges results with cross-query deduplication, and assembles evidence chains from graph traversal
- Defaults calibrated against NovelHopQA, StepChain, HopRAG benchmarks: --k 20, --max-sub-queries 7, --max-hops 3
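The bounded fan-out and cross-query deduplication described above follow a common pattern, sketched here with Python's asyncio rather than the Rust JoinSet + Semaphore the tool uses. Everything in the sketch is hypothetical: the function names, the fake search backend, and the concurrency limit of 4 (this README does not state the default for --max-concurrency).

```python
import asyncio

async def run_bounded(sub_queries, search, max_concurrency=4):
    """Run `search` for every sub-query with at most `max_concurrency` in flight,
    then merge the hits, keeping the first occurrence of each memory name."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(query):
        async with semaphore:  # blocks while the limit is reached
            return await search(query)

    # gather returns results in the order of sub_queries, not completion order.
    per_query = await asyncio.gather(*(guarded(q) for q in sub_queries))
    merged, seen = [], set()
    for hits in per_query:
        for name in hits:
            if name not in seen:
                seen.add(name)
                merged.append(name)
    return merged

async def fake_search(query):
    """Stand-in for a real retrieval call; returns made-up memory names."""
    await asyncio.sleep(0)
    return {"a": ["m1", "m2"], "b": ["m2", "m3"]}[query]

merged = asyncio.run(run_bounded(["a", "b"], fake_search, max_concurrency=1))
```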
Inspect database health and stats
Purge soft-deleted memories after retention period
Default retention: 90 days. To purge ALL forgotten memories regardless of age, pass --retention-days 0.
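The retention rule can be read as a simple cutoff on the soft-delete timestamp. The Python sketch below illustrates that reading; the function name and input shape are hypothetical, and treating a memory deleted exactly at the cutoff as eligible is an assumption.

```python
from datetime import datetime, timedelta, timezone

def purge_candidates(deleted_at_by_name, retention_days=90, now=None):
    """Names whose soft-delete timestamp is at least `retention_days` old."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    # With retention_days=0 the cutoff equals `now`, so every forgotten memory qualifies.
    return sorted(
        name for name, deleted_at in deleted_at_by_name.items() if deleted_at <= cutoff
    )
```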
Bulk-ingest every Markdown file under a directory
Bulk-ingest with low-memory mode (single worker)
# Force single-threaded ingest to reduce RSS pressure (recommended for <4 GB RAM
# environments and container/cgroup constraints). Trade-off: 3-4x longer wall time.
# Or via env var (CLI flag takes precedence):
SQLITE_GRAPHRAG_LOW_MEMORY=1
Bulk-ingest with LLM-curated entities via Claude Code (v1.0.61)
# Extract entities and relationships using locally installed Claude Code CLI
# Resume interrupted ingestion
# Set budget limit
# Extract entities and relationships using locally installed OpenAI Codex CLI
Authentication: OAuth is the ONLY accepted credential flow. API keys are PROHIBITED.
--mode claude-code reads OAuth from ~/.claude/.credentials.json (Claude Pro/Max/Team). --mode codex reads device auth from codex login (OpenAI ChatGPT). Defining ANTHROPIC_API_KEY or OPENAI_API_KEY in the environment ABORTS the spawn with AppError::Validation and exit code 1. The --bare flag (which would also demand an API key) is REMOVED from all executable code paths. See docs/decisions/adr-0011-oauth-only-enforcement.md for the full rationale.
ingest emits NDJSON on stdout: one JSON line per file, then a summary line. Per-file status values: indexed (created), skipped (duplicate or invalid name), failed (error). Duplicates emit status: "skipped" with action: "duplicate" and do not count as failures. Pass --dry-run to preview the name mapping (kebab-cased basenames) without writing anything to the database. Schema: docs/schemas/ingest-file-event.schema.json, docs/schemas/ingest-summary.schema.json.
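A consumer of the ingest stream might tally it as follows. This Python sketch relies only on the status and action fields named above and assumes the summary is the final non-empty line; the sample summary field is made up, and the real shapes live in the schema files under docs/schemas.

```python
import json
from collections import Counter

def tally_ingest_output(ndjson_text):
    """Split ingest stdout into per-file status counts, duplicate count and summary."""
    records = [json.loads(line) for line in ndjson_text.splitlines() if line.strip()]
    if not records:
        return Counter(), 0, None
    *events, summary = records  # assumption: the last line is the summary
    counts = Counter(event.get("status") for event in events)
    duplicates = sum(1 for event in events if event.get("action") == "duplicate")
    return counts, duplicates, summary

sample = "\n".join([
    '{"status": "indexed"}',
    '{"status": "skipped", "action": "duplicate"}',
    '{"status": "failed"}',
    '{"files_total": 3}',  # made-up summary field, for illustration only
])
counts, duplicates, summary = tally_ingest_output(sample)
```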
Rename a memory while keeping its version history
Edit a memory body or description (creates a new version)
Restore a memory to a previous version
Apply pending schema migrations
Resolve namespace precedence for the current invocation
Refresh SQLite query planner statistics
Reclaim disk space and checkpoint the WAL
Create a typed relationship between two entities
Remove a specific relationship between two entities
Traverse memories connected via the entity graph
Empty results are normal for memories without graph edges yet; extract entities first via remember or ingest. Edges form when ≥2 entities co-occur in the same memory body.
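The co-occurrence rule explains why a memory with a single entity yields no edges. A minimal Python sketch of the pairing step is below; the function name is hypothetical, and the relation label and strength given to such edges are not stated in this README, so the sketch produces only the entity pairs.

```python
from itertools import combinations

def cooccurrence_pairs(entities):
    """Every unordered pair of distinct entities found in one memory body."""
    unique = sorted(set(entities))
    return list(combinations(unique, 2))
```

One entity gives no pairs, two give one pair, and three give three pairs.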
Export a graph snapshot in json, dot or mermaid
Remove orphan entities with no memories and no relationships
Bulk-delete relationships by type
Clear cached embedding/NER models from the XDG cache
List every version of a memory
Commands
Core database lifecycle
| Command | Arguments | Description |
|---|---|---|
| init | --namespace <ns> | Initialize database, apply migrations and validate that a claude/codex CLI is reachable (no model download) |
| health | --json | Show database integrity, FTS5 functional check, sqlite version, super-hub detection (degree > 50) |
| stats | --json | Count memories, entities and relationships |
| migrate | --json | Apply pending schema migrations via refinery |
| vacuum | --json | Checkpoint WAL and reclaim disk space |
| optimize | --json, --skip-fts | Run PRAGMA optimize and rebuild FTS5 index (skip with --skip-fts) |
| backup | --output <path> | Back up the database using the SQLite Online Backup API |
| sync-safe-copy | --dest <path> (alias --output) | Checkpoint then copy a sync-safe snapshot |
Memory content lifecycle
| Command | Arguments | Description |
|---|---|---|
| remember | --name, --type, --description, --body (or --body-file/--body-stdin), --entities-file, --relationships-file, --graph-stdin, --llm-parallelism <N> (default 4), --enable-ner (URL-regex only since v1.0.79), --force-merge, --clear-body, --dry-run | Save a memory with optional entity graph; --type/--description optional with --force-merge (inherited from existing); --dry-run validates without persisting |
| remember-batch | --transaction, --force-merge, --fail-fast | Batch-create memories from NDJSON stdin; one invocation, one slot, one DB connection |
| recall | <query>, -k/--k (alias --limit), --type, --max-hops, --max-distance, --all-namespaces, --no-graph | Search memories semantically via KNN + graph traversal |
| read | [name] or --name <name>, --id <N>, --with-graph | Fetch a memory by exact name or integer memory_id; --with-graph includes linked entities and relationships |
| list | --type, --limit, --offset, --include-deleted | Paginate memories sorted by updated_at; default limit is all with --json, 50 for text; response includes total_count, truncated, body_length |
| forget | [name] or --name <name> | Soft-delete a memory preserving history |
| rename | [old], or --name/--old/--from <NAME>, --new-name/--new/--to <NAME> | Rename a memory while keeping versions |
| edit | [name] or --name, --body, --description, --type, --force-reembed, --llm-parallelism <N> | Edit body, description or memory type creating new version; skips re-embedding when body content is unchanged; --force-reembed (v1.0.79) regenerates the embedding without changing the body |
| history | [name] or --name <name>, --diff | List all versions of a memory; --diff includes character-level change summary |
| memory-entities | [name] or --name <name>, --entity <name> | List entities linked to a memory, or memories linked to an entity (reverse lookup via --entity) |
| restore | --name, --version | Restore a memory to a previous version |
| ingest | <DIR>, --type, --pattern <GLOB> (default *.md), --recursive, --mode (none/claude-code/codex; gliner accepted but URL-regex only since v1.0.79), --ingest-parallelism N, --llm-parallelism N (default 2, embedding workers), --low-memory, --enable-ner (URL-regex only since v1.0.79), --fail-fast, --dry-run, --claude-binary, --claude-model, --resume, --retry-failed, --max-cost-usd, --claude-timeout, --rate-limit-wait, --keep-queue, --queue-db | Bulk-ingest every matching file as a separate memory (NDJSON output); --mode claude-code uses locally installed Claude Code CLI for LLM-curated entity/relationship extraction; --dry-run previews name mapping without writing; --claude-timeout sets per-file subprocess timeout (default 300s) |
| export | --namespace, --type, --include-deleted, --limit, --offset | Export memories as NDJSON for backup or migration |
| cache clear-models | --yes | Remove model files cached by versions ≤ v1.0.75 from the XDG cache directory (no build downloads models since v1.0.76) |
Memory name validation. Names must match [a-z0-9-]+ (kebab-case, ASCII only). Unicode and uppercase are rejected with exit code 1. Names longer than 60 chars emitted by ingest are truncated to fit; review the WARN log to spot mangled names.
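Callers can pre-check names with the same pattern before invoking the CLI. In the Python sketch below, is_valid_name applies the documented pattern, while name_from_basename is a hypothetical approximation of how a file name might be kebab-cased and truncated; the exact algorithm used by ingest is not described here.

```python
import re

NAME_PATTERN = re.compile(r"[a-z0-9-]+")
MAX_NAME_LEN = 60  # truncation length mentioned for ingest-generated names

def is_valid_name(name):
    """True when the whole name matches [a-z0-9-]+ (ASCII kebab-case)."""
    return NAME_PATTERN.fullmatch(name) is not None

def name_from_basename(basename):
    """Hypothetical kebab-casing: drop the extension, lowercase, collapse
    every run of other characters into '-', then truncate to 60 chars."""
    stem = basename.rsplit(".", 1)[0] if "." in basename else basename
    slug = re.sub(r"[^a-z0-9]+", "-", stem.lower()).strip("-")
    return slug[:MAX_NAME_LEN].rstrip("-")
```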
Retrieval and graph
| Command | Arguments | Description |
|---|---|---|
| hybrid-search | <query>, --k, --rrf-k, --with-graph, --max-hops, --min-weight, --weight-vec, --weight-fts | FTS5 plus vector fused via Reciprocal Rank Fusion; graceful degradation when FTS5 is corrupted (fts_degraded, auto-rebuild); normalized_score for cross-method comparability |
| namespace-detect | --namespace <name> | Resolve namespace precedence for invocation |
| link | --from, --to, --relation, --weight, --create-missing, --entity-type, --strict-relations | Create a relationship; --strict-relations rejects non-canonical types; warnings in JSON for non-canonical |
| unlink | --from, --to, --relation, --entity, --all | Remove relationships; --relation now optional (removes all between pair); --entity X --all removes all edges of entity |
| related | --name, --limit, --hops | Traverse graph-connected memories from a seed memory |
| graph | --format, --output | Export a graph snapshot in json, dot or mermaid |
Breaking change in v1.0.44. The graph entities JSON output renamed its top-level array from items to entities. Update jaq/jq filters: .items[] becomes .entities[]. The list command still uses items.
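Scripts that must work against binaries on both sides of the rename can read either key. The helper below is a hypothetical Python sketch that relies only on the two top-level key names given above.

```python
def entity_rows(payload):
    """Return the entity array from parsed `graph entities` JSON output."""
    if "entities" in payload:        # v1.0.44 and later
        return payload["entities"]
    return payload.get("items", [])  # releases before v1.0.44
```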
Graph subcommands
| Subcommand | Description | Key flags |
|---|---|---|
| graph traverse --from <ENTITY> | Walk the entity graph from a starting node using BFS | --depth (default 2), --namespace |
| graph stats | Print graph statistics (node count, edge count, degree distribution) | --namespace |
| graph entities | List entities with degree count and sorting | --limit (default 50), --entity-type, --namespace, --sort-by (degree/name/created_at), --order (asc/desc) |
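The depth-limited BFS behind graph traverse can be sketched in a few lines of Python. This is an illustration of the technique only: the adjacency-map input and function name are hypothetical, and whether the real traversal follows edges in one or both directions is not specified here.

```python
from collections import deque

def traverse(adjacency, start, depth=2):
    """Breadth-first walk returning {entity: hops} for nodes within `depth` hops."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == depth:
            continue  # do not expand beyond the depth limit
        for neighbour in adjacency.get(node, ()):
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    return hops
```

For a chain a → b → c → d and the default depth of 2, the walk from a reaches b and c but stops before d.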
Maintenance
| Command | Arguments | Description |
|---|---|---|
| `purge` | `--retention-days <n>`, `--dry-run`, `--yes` | Permanently delete soft-deleted memories |
| `cleanup-orphans` | `--namespace`, `--dry-run`, `--yes` | Remove entities that have no memories and no relationships |
| `prune-relations` | `--relation <type>`, `--namespace`, `--dry-run`, `--yes`, `--show-entities` | Bulk-delete all relationships of a given type; `--show-entities` lists affected entities in the dry-run preview |
| `delete-entity` | `--name <entity>`, `--cascade` | Delete an entity and cascade-remove all its relationships and bindings |
| `rename-entity` | `--name <entity>`, `--new-name <name>` | Rename an entity preserving all relationships and memory bindings; re-embeds vector |
| `reclassify` | `--name <entity> --new-type <type>`, `--description <text>`, or `--from-type <old> --to-type <new> --batch` | Reclassify entity types individually or in bulk; `--description` updates entity description in single mode |
| `merge-entities` | `--names <a,b,c> --into <target>` | Merge source entities into target, moving all edges |
| `prune-ner` | `--entity <name>` or `--all`, `--dry-run`, `--yes` | Remove NER bindings from `memory_entities` table |
| `fts rebuild` | `--json` | Rebuild the FTS5 full-text search index from scratch |
| `fts check` | `--json` | Run FTS5 integrity-check without modifying the index |
| `fts stats` | `--json` | Show FTS5 index statistics (row count, shadow pages) |
| `completions` | `bash`, `zsh`, `fish`, `powershell`, `elvish` | Generate shell completions for the specified shell |
| `enrich` | `--operation <op>` (`memory-bindings`, `entity-descriptions`, `body-enrich`, `re-embed`, `weight-calibrate`, `relation-reclassify`, `entity-connect`, `entity-type-validate`, `description-enrich`, `cross-domain-bridges`, `domain-classify`, `graph-audit`, `deep-research-synth`, `body-extract`), `--mode <claude-code\|codex>`, `--llm-parallelism <N>`, `--preserve-threshold <FLOAT>`, `--preflight-check`, `--fallback-mode <mode>`, `--rate-limit-buffer <SECONDS>`, `--names <NAMES>`, `--names-file <PATH>`, `--max-load-check`, `--circuit-breaker-threshold <N>`, `--codex-model-validate`, `--codex-model-fallback <MODEL>`, `--resume`, `--retry-failed`, `--max-cost-usd <USD>`, `--claude-binary`/`--claude-model`/`--claude-timeout`, `--codex-binary`/`--codex-model`/`--codex-timeout`, `--db <DB>`, `--wait-job-singleton <SECONDS>`, `--force-job-singleton` | LLM-augmented graph quality pipeline (G29 + G35 + G37); three fully implemented operations and 11 scan-only operations; OAuth-only via `--mode claude-code` (Anthropic) or `--mode codex` (ChatGPT Pro) |
| `vec orphan-list` | `--json` | List orphan memory embedding rows (G39) with `vector_hash` for traceability |
| `vec purge-orphan` | `--yes`, `--dry-run`, `--json` | Delete orphan memory embedding rows from `vec_memories`, `vec_entities`, `vec_chunks` (G39); `--yes` required as safety guard |
| `vec stats` | `--json` | Show statistics for `vec_memories`, `vec_entities`, `vec_chunks` tables (G39) |
| `codex-models` | `--json`, `--suggest <substring>` | List the ChatGPT Pro OAuth accepted-model whitelist (G33) or return the closest match via substring + Levenshtein |
| `remember-batch` | `--json`, `--transaction`, `--force-merge`, `--fail-fast` | Batch-create memories from NDJSON stdin (one invocation, one slot, one DB connection) |
| `namespace-detect` | `--json`, `--namespace <name>` | Resolve namespace precedence for the current invocation |
| `deep-research` | `<query>`, `--k`, `--max-sub-queries`, `--max-hops`, `--min-weight`, `--max-results`, `--with-bodies`, `--max-concurrency`, `--timeout`, `--rrf-k`, `--graph-decay`, `--graph-min-score`, `--max-neighbors-per-hop`, `--json` | Parallel multi-hop GraphRAG research via query decomposition; returns `sub_queries[]`, `results[]`, `evidence_chains[]`, `graph_context?`, `stats` |
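The `--rrf-k` flag refers to the constant in Reciprocal Rank Fusion, where each document scores the sum of `1 / (k + rank)` over every ranked list it appears in. The sketch below shows the textbook formula under stated assumptions: `rrf_fuse` and the memory ids are hypothetical, `k = 60` is the value commonly used in the literature rather than a confirmed CLI default, and the crate's own ranker may differ.

```rust
use std::collections::HashMap;

/// Fuse several ranked lists with Reciprocal Rank Fusion.
/// Ranks are 1-based; `k` dampens the influence of top positions.
fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (idx, id) in ranking.iter().enumerate() {
            let rank = (idx + 1) as f64;
            *scores.entry(id.to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest score first; ties broken by id so the output is deterministic.
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap().then_with(|| a.0.cmp(&b.0)));
    fused
}

fn main() {
    // Hypothetical result lists: one from full-text search, one from vector search.
    let fts = vec!["m1", "m2", "m3"];
    let vector = vec!["m2", "m4", "m1"];
    for (id, score) in rrf_fuse(&[fts, vector], 60.0) {
        println!("{id} {score:.5}");
    }
}
```

Items that appear in both lists (`m1`, `m2`) outrank items found by only one retriever, which is the property that makes RRF useful for hybrid search.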
cache subcommands
| Subcommand | Description |
|---|---|
| `clear-models` | Remove cached embedding/NER model files (forces re-download on next `init`) |
Environment Variables
Runtime configuration overrides
| Variable | Description | Default | Example |
|---|---|---|---|
| `SQLITE_GRAPHRAG_DB_PATH` | Path to the SQLite database file override | `./graphrag.sqlite` in the invocation directory | `/data/graphrag.sqlite` |
| `SQLITE_GRAPHRAG_HOME` | Override base directory for `graphrag.sqlite` (used when `--db` and `SQLITE_GRAPHRAG_DB_PATH` are absent) | unset | `/var/lib/sqlite-graphrag` |
| `SQLITE_GRAPHRAG_CACHE_DIR` | Directory override for model cache and lock files | XDG cache dir | `~/.cache/sqlite-graphrag` |
| `SQLITE_GRAPHRAG_LANG` | CLI output language as `en` or `pt` (aliases: `pt-BR`, `portuguese`) | `en` | `pt` |
| `SQLITE_GRAPHRAG_LOG_LEVEL` | Tracing filter level for stderr output | `info` | `debug` |
| `SQLITE_GRAPHRAG_LOG_FORMAT` | Tracing output format on stderr (`pretty` or `json`) | `pretty` | `json` |
| `SQLITE_GRAPHRAG_NAMESPACE` | Namespace override bypassing detection | none | `project-foo` |
| `SQLITE_GRAPHRAG_DISPLAY_TZ` | IANA timezone for `*_iso` JSON fields | `UTC` | `America/Sao_Paulo` |
| `SQLITE_GRAPHRAG_EMBEDDING_DIM` | Embedding dimensionality override (v1.0.79); precedence: `--embedding-dim` flag > this env > `schema_meta.dim` > 64; range [8, 4096] | `64` (new databases) | `384` |
| `SQLITE_GRAPHRAG_CLAUDE_EMBED_MODEL` | Model override for `claude -p` embedding calls (v1.0.79, symmetric to the codex variable) | CLI default model | `claude-haiku-4-5-20251001` |
| `SQLITE_GRAPHRAG_EMBED_TIMEOUT_SECS` | Timeout per LLM embedding subprocess call (v1.0.79) | `300` | `600` |
| `SQLITE_GRAPHRAG_ENABLE_NER` | Enable automatic extraction on `remember`/`ingest`. Since v1.0.79 this runs URL-regex extraction only (the GLiNER pipeline was removed). Accepts `1`/`true`/`yes`/`on` | unset (off) | `1` |
| `SQLITE_GRAPHRAG_GLINER_VARIANT` | NO EFFECT since v1.0.79 (GLiNER removed); accepted for compatibility, ignored | n/a | n/a |
| `SQLITE_GRAPHRAG_GLINER_THRESHOLD` | NO EFFECT since v1.0.79 (GLiNER removed); accepted for compatibility, ignored | n/a | n/a |
| `SQLITE_GRAPHRAG_GLINER_MODEL` | NO EFFECT since v1.0.79 (GLiNER removed); accepted for compatibility, ignored | n/a | n/a |
| `SQLITE_GRAPHRAG_EXTRACTION_MAX_TOKENS` | Token budget for entity/relationship extraction per memory; values outside [512, 100 000] fall back to default | `5000` | `8000` |
| `SQLITE_GRAPHRAG_MAX_ENTITIES_PER_MEMORY` | Maximum distinct entities persisted per memory; values outside [1, 1 000] fall back to default. Note: the extraction pipeline internally caps candidates at 30 before deduplication, so the persistence cap (default 50) acts as a safety ceiling and is only reached when the extractor is extended or replaced. | `50` | `100` |
| `SQLITE_GRAPHRAG_MAX_RELATIONS_PER_MEMORY` | Maximum distinct relationships persisted per memory; values outside [1, 10 000] fall back to default | `50` | `200` |
| `SQLITE_GRAPHRAG_LOW_MEMORY` | Force single-threaded ingest to reduce RSS. Accepts `1`/`true`/`yes`/`on` (case-insensitive) | unset (multi-thread) | `1` |
| `SQLITE_GRAPHRAG_CLAUDE_BINARY` | Explicit path to the Claude Code binary for `ingest --mode claude-code` | `PATH` lookup | `/usr/local/bin/claude` |
| `SQLITE_GRAPHRAG_CODEX_BINARY` | Explicit path to the Codex CLI binary for `ingest --mode codex` | `PATH` lookup | `/usr/local/bin/codex` |
| `ORT_DYLIB_PATH` | HISTORICAL (≤ v1.0.75); no build loads ONNX since v1.0.76; the variable is ignored | n/a | n/a |
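The precedence chain documented for `SQLITE_GRAPHRAG_EMBEDDING_DIM` and the truthy values accepted by the boolean variables can be sketched as follows. Both helpers are hypothetical names written for illustration, not functions exported by the crate, and returning an error for an out-of-range dimension is an assumption: the table states the range but not how violations are handled.

```rust
/// Resolve the embedding dimensionality: flag > env var > stored schema value > 64.
/// Hypothetical helper; erroring on out-of-range values is an assumption.
fn resolve_embedding_dim(
    flag: Option<usize>,
    env: Option<&str>,
    schema_dim: Option<usize>,
) -> Result<usize, String> {
    let dim = match (flag, env) {
        (Some(d), _) => d,
        (None, Some(raw)) => raw
            .trim()
            .parse::<usize>()
            .map_err(|e| format!("invalid dimension {raw:?}: {e}"))?,
        (None, None) => schema_dim.unwrap_or(64),
    };
    if (8..=4096).contains(&dim) {
        Ok(dim)
    } else {
        Err(format!("embedding dim {dim} is outside [8, 4096]"))
    }
}

/// Boolean variables accept 1/true/yes/on; matching here is case-insensitive.
fn is_truthy(raw: &str) -> bool {
    matches!(
        raw.trim().to_ascii_lowercase().as_str(),
        "1" | "true" | "yes" | "on"
    )
}

fn main() {
    let env_dim = std::env::var("SQLITE_GRAPHRAG_EMBEDDING_DIM").ok();
    // Pretend the existing database recorded 384 dimensions and no flag was passed.
    println!("{:?}", resolve_embedding_dim(None, env_dim.as_deref(), Some(384)));

    let low_memory = std::env::var("SQLITE_GRAPHRAG_LOW_MEMORY")
        .map(|v| is_truthy(&v))
        .unwrap_or(false);
    println!("low memory mode: {low_memory}");
}
```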
Integration Patterns
Compose with Unix pipelines and tools
|
Feed hybrid search into a summarizer endpoint
| |
Backup with atomic snapshot and compression
Claude Code subprocess example in Node
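```javascript
const { spawn } = require('child_process');
// Illustrative arguments: substitute the subcommand and query you need.
const proc = spawn('sqlite-graphrag', ['recall', 'your query', '--json']);
```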
Docker Debian build for CI pipelines
```dockerfile
FROM rust:1.88-bookworm AS builder
RUN apt-get update && apt-get install -y --no-install-recommends pkg-config libssl-dev ca-certificates && rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY . .
RUN cargo install --path .
```
Exit Codes
Deterministic status codes for orchestration
| Code | Meaning | Possible Cause |
|---|---|---|
| `0` | Success | Command completed and JSON payload printed when requested |
| `1` | Validation error or runtime failure | Invalid `--type`, malformed `--relation` (empty or non-snake_case), kebab-case violation, generic `anyhow` error |
| `2` | CLI usage error | Invalid flag, missing required argument, invalid `--tz` timezone (Clap `FromStr` rejects before app code) |
| `9` | Duplicate detected | Existing `--name` without `--force-merge`; `ingest` skips the file and emits `status: "skipped"` with `action: "duplicate"` instead |
| `3` | Conflict during optimistic update | `edit` or `restore` raced against another writer |
| `4` | Memory or entity not found | `read`, `forget`, `edit`, `rename`, `restore` or `graph traverse` target missing |
| `5` | Namespace could not be resolved | No `SQLITE_GRAPHRAG_NAMESPACE`, no flag, no detected default |
| `6` | Payload exceeded configured limits | `--name` longer than 80 bytes, body over 512000 bytes, more than 512 chunks |
| `10` | SQLite database error | Corrupted file, schema mismatch, missing migration |
| `11` | Embedding generation failed | LLM subprocess error or model load failure |
| `12` | `sqlite-vec` extension failed to load | Missing native extension or unsupported SQLite build |
| `13` | Batch partial failure | `import`, `reindex` or stdin batch with at least one failing record |
| `14` | Filesystem I/O error | Cache or database directory not writable, nonexistent `ingest` target directory |
| `15` | Database busy after retries | WAL contention exceeded `with_busy_retry` budget |
| `20` | Internal or JSON serialization error | Unexpected serde failure or invariant violation |
| `75` | `EX_TEMPFAIL` lock timeout or all concurrency slots busy | Five-plus concurrent invocations or `flock` waited longer than 300s |
| `77` | Available RAM below minimum required | Less than 2 GB free RAM detected before model load |
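An orchestrator can branch on these codes instead of parsing stderr. The sketch below is one possible grouping, not part of the crate: the `Outcome` enum and `classify` function are hypothetical, and treating exit 15 as retryable alongside exit 75 is an assumption based on both describing transient contention.

```rust
/// Coarse reaction an orchestrator might take for a given exit code.
#[derive(Debug, PartialEq)]
enum Outcome {
    Success,
    Retry,    // transient contention: try again later
    NotFound, // target memory or entity is missing
    Fatal,    // everything else needs human or agent attention
}

/// Hypothetical mapping from the exit-code table to an `Outcome`.
fn classify(code: i32) -> Outcome {
    match code {
        0 => Outcome::Success,
        15 | 75 => Outcome::Retry, // assumption: busy database is as transient as busy slots
        4 => Outcome::NotFound,
        _ => Outcome::Fatal,
    }
}

fn main() {
    for code in [0, 4, 11, 15, 75, 77] {
        println!("exit {code}: {:?}", classify(code));
    }
}
```

In a real caller the code would come from `std::process::ExitStatus::code()`, which returns `None` when the process was killed by a signal on Unix.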
Performance
Measured on a 1000-memory database
- Embedding latency is dominated by the headless LLM round-trip (~1-3 s per batched call); pure reads (`read`, `list`, `graph`) stay in the low milliseconds
- Since v1.0.79 LLM calls are BATCHED (calibration bases of 8 chunks / 25 entity names at dim 64, dim-adaptive, G44) and PARALLEL (`--llm-parallelism`, bounded `Semaphore` + `JoinSet`), so a 39-item memory embeds in 4-5 calls instead of 39 serialized spawns
- `--embedding-dim 64` (the default) cuts the LLM output per vector ~6x compared to the old 384-dim payload
- `init` performs no model download; it only creates the database and validates that a `claude`/`codex` CLI is reachable
- Build (v1.0.79): each embedding call spawns `claude -p` or `codex exec`; RSS is ~350 MB per LLM worker (the 1100 MB ONNX model load no longer exists in any build)
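The batching arithmetic can be checked with a small sketch. It assumes the dim-64 calibration bases quoted above (8 chunks or 25 entity names per call) and ignores the dim-adaptive adjustment, whose formula is not documented here. The `batched_calls` helper and the chunk/entity splits of the 39 items are hypothetical.

```rust
// Calibration bases at dim 64, taken from the bullet list above.
const CHUNKS_PER_CALL: usize = 8;
const ENTITY_NAMES_PER_CALL: usize = 25;

/// Integer ceiling division without relying on newer std helpers.
fn ceil_div(n: usize, per: usize) -> usize {
    (n + per - 1) / per
}

/// Number of LLM calls when chunks and entity names are batched separately.
fn batched_calls(chunks: usize, entity_names: usize) -> usize {
    ceil_div(chunks, CHUNKS_PER_CALL) + ceil_div(entity_names, ENTITY_NAMES_PER_CALL)
}

fn main() {
    // Two hypothetical ways a 39-item memory could split into chunks and entities.
    for (chunks, entities) in [(24, 15), (32, 7)] {
        println!(
            "{chunks} chunks + {entities} entities: {} batched calls vs {} serial spawns",
            batched_calls(chunks, entities),
            chunks + entities
        );
    }
}
```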
Memory Requirements
Sizing RAM for ingest and recall workloads
- The CLI itself is lightweight (~6 MB binary); RAM is dominated by the LLM subprocesses at roughly 350 MB RSS per worker (`LLM_WORKER_RSS_MB`)
- Worker budget: effective parallelism is `min(--llm-parallelism, cpus, free_ram × 0.5 / 350 MB, 32)`; the concurrency gate adapts to available memory automatically
- Default parallelism increases RSS roughly linearly per worker (`--llm-parallelism 4` ≈ 4 × 350 MB of subprocess RSS on top of the CLI)
- Low-memory mode: pass `--low-memory` (or set `SQLITE_GRAPHRAG_LOW_MEMORY=1`) to force single-threaded ingest. Equivalent to `--ingest-parallelism 1` and overrides any explicit value, at the cost of 3-4x wall time.
- Container/cgroup users: budget `MemoryMax` for the CLI plus N × 350 MB LLM workers (the old 3 GB ONNX floor no longer exists)
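The worker-budget formula above translates directly into integer arithmetic. The function below is a hypothetical sketch, not the crate's code; in particular, flooring the result at one worker is an assumption, since the formula alone would yield zero on very small hosts.

```rust
const LLM_WORKER_RSS_MB: u64 = 350;
const HARD_CAP: u64 = 32;

/// min(requested, cpus, free_ram × 0.5 / 350 MB, 32), floored at 1 (assumption).
fn effective_parallelism(requested: u64, cpus: u64, free_ram_mb: u64) -> u64 {
    // Half of the free RAM, divided into 350 MB worker budgets.
    let by_ram = free_ram_mb / 2 / LLM_WORKER_RSS_MB;
    requested.min(cpus).min(by_ram).min(HARD_CAP).max(1)
}

fn main() {
    // 8 workers requested on a 16-core host with 4 GiB free: RAM is the limit.
    println!("{}", effective_parallelism(8, 16, 4096));
    // Plenty of RAM: the request itself is the limit.
    println!("{}", effective_parallelism(4, 16, 16384));
}
```

For container sizing, the inverse reading is that N workers need about N × 350 MB × 2 of free memory before the gate allows all of them.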
Storage Footprint
Expected DB size relative to ingested content
Expected overhead: roughly 8× the total ingested body size (e.g., 7.6 MB of text → ~62.9 MB DB). Overhead comes from float embeddings (default 64-dim since v1.0.79; pre-existing databases keep their recorded dimensionality, e.g. 384), the FTS5 full-text index, and the entities/relationships graph. Run `sqlite-graphrag vacuum --json` after bulk `forget` + `purge` cycles to reclaim freed space.
Safe Parallel Invocation
Counting semaphore with up to four simultaneous slots
- Each LLM embedding worker (`claude -p` / `codex exec` subprocess) consumes roughly 350 MB of RSS, the budget unit used by the concurrency gate since v1.0.79
- `MAX_CONCURRENT_CLI_INSTANCES` remains the hard ceiling at 4 cooperating subprocesses
- Heavy commands `init`, `remember`, `recall`, and `hybrid-search` are clamped lower dynamically when available RAM cannot sustain the requested parallelism safely
- Lock files live at `~/.cache/sqlite-graphrag/cli-slot-{1..4}.lock` using `flock`
- A fifth concurrent invocation waits up to 300 seconds then exits with code 75
- Use `--max-concurrency N` to request the slot limit for the current invocation; heavy commands may still be reduced automatically
- Memory guard aborts with exit 77 when less than 2 GB of RAM is available
- SIGINT and SIGTERM trigger graceful shutdown via `shutdown_requested()` atomic
- Exit code 130 when interrupted by SIGINT (Ctrl+C)
- Exit code 141 when SIGPIPE fires (stdout closed by downstream consumer in pipeline)
- Exit code 143 when terminated by SIGTERM
- Second signal forces immediate exit without waiting for current operation
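The three signal exit codes follow the common shell convention of 128 plus the signal number (SIGINT = 2, SIGPIPE = 13, SIGTERM = 15 on Linux and macOS). A supervisor can use that to tell an interrupted run from a failed one. Both helper names below are hypothetical.

```rust
/// Conventional exit code for a process that stopped because of `signo`.
fn signal_exit_code(signo: i32) -> i32 {
    128 + signo
}

/// Map an exit code back to the signal name documented above, if any.
fn interrupted_by(code: i32) -> Option<&'static str> {
    match code {
        130 => Some("SIGINT"),
        141 => Some("SIGPIPE"),
        143 => Some("SIGTERM"),
        _ => None,
    }
}

fn main() {
    for signo in [2, 13, 15] {
        let code = signal_exit_code(signo);
        println!("signal {signo} -> exit {code} ({:?})", interrupted_by(code));
    }
}
```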
Troubleshooting FAQ
Cloud sync safety (Dropbox, iCloud, OneDrive)
- sqlite-graphrag uses WAL mode by default for high-concurrency writes
- Since v1.0.54, every write command runs `PRAGMA wal_checkpoint(TRUNCATE)` after committing (v1.0.53 covered 11 of 12; v1.0.54 added the missing `prune-relations`)
- This ensures the `.sqlite` file is always self-contained when cloud sync tools read it
- If corruption occurs despite the checkpoint, recover with `sqlite3 broken.sqlite ".recover" | sqlite3 repaired.sqlite`
Common issues and fixes
- Default behavior always creates or opens `graphrag.sqlite` in the current working directory
- Database locked after crash requires `sqlite-graphrag vacuum` to checkpoint the WAL
- `init` is near-instant since v1.0.76 because there is no model download; if it fails, check that a `claude` or `codex` CLI is reachable on `PATH`
- Embedding calls failing with exit 11 usually mean the LLM CLI is missing, unauthenticated (OAuth required) or timing out; raise `SQLITE_GRAPHRAG_EMBED_TIMEOUT_SECS` (default 300) for slow links
- `ORT_DYLIB_PATH` / `libonnxruntime.so` guidance is HISTORICAL (≤ v1.0.75); no build loads ONNX since v1.0.76
- Permission denied on Linux means the cache directory lacks write access for your user
- Namespace detection falls back to `global` when no explicit override is present
- Parallel invocations that exceed the effective safe limit receive exit 75 and SHOULD retry with backoff; during audits start heavy commands with `--max-concurrency 1`
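The retry-with-backoff advice for exit 75 can be sketched as a small loop. Everything here is illustrative: `retry_on_busy` is a hypothetical helper, the attempt count and delays are arbitrary, and the closure stands in for spawning the binary with `std::process::Command` and reading its exit status.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Re-run `run` while it reports exit 75, doubling the delay after each attempt.
/// Returns the last exit code observed.
fn retry_on_busy<F>(mut run: F, max_attempts: u32, base_delay: Duration) -> i32
where
    F: FnMut() -> i32,
{
    let mut delay = base_delay;
    let mut attempt = 1;
    let mut code = run();
    while code == 75 && attempt < max_attempts {
        sleep(delay);
        delay *= 2; // exponential backoff
        code = run();
        attempt += 1;
    }
    code
}

fn main() {
    // Simulated command: busy twice, then succeeds.
    let mut calls = 0;
    let code = retry_on_busy(
        || {
            calls += 1;
            if calls < 3 { 75 } else { 0 }
        },
        5,
        Duration::from_millis(10),
    );
    println!("exit {code} after {calls} calls");
}
```

Adding random jitter to each delay is a common refinement when many agents retry at once; it is left out to keep the sketch dependency-free.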
Compatible Rust Crates
Invoke sqlite-graphrag from any Rust AI framework via subprocess
- Each crate calls the binary through `std::process::Command` with the `--json` flag
- No shared memory or FFI required: the contract is pure stdout JSON
- Pin the binary version in your `Cargo.toml` workspace for reproducible builds
- All 18 crates below work identically on Linux, Apple Silicon macOS and Windows
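rig-core
```rust
use std::process::Command;
// Illustrative arguments: substitute the subcommand and query you need.
let out = Command::new("sqlite-graphrag")
    .args(["recall", "your query", "--json"])
    .output()
    .unwrap();
```
swarms-rs
```rust
use std::process::Command;
// Illustrative arguments: substitute the subcommand and query you need.
let out = Command::new("sqlite-graphrag")
    .args(["recall", "your query", "--json"])
    .output()
    .unwrap();
```
autoagents
```rust
use std::process::Command;
// Illustrative arguments: substitute the subcommand and query you need.
let out = Command::new("sqlite-graphrag")
    .args(["recall", "your query", "--json"])
    .output()
    .unwrap();
```
graphbit
```rust
use std::process::Command;
// Illustrative arguments: substitute the subcommand and query you need.
let out = Command::new("sqlite-graphrag")
    .args(["recall", "your query", "--json"])
    .output()
    .unwrap();
```
agentai
```rust
use std::process::Command;
// Illustrative arguments: substitute the subcommand and query you need.
let out = Command::new("sqlite-graphrag")
    .args(["recall", "your query", "--json"])
    .output()
    .unwrap();
```
llm-agent-runtime
```rust
use std::process::Command;
// Illustrative arguments: substitute the subcommand and query you need.
let out = Command::new("sqlite-graphrag")
    .args(["recall", "your query", "--json"])
    .output()
    .unwrap();
```
anda
```rust
use std::process::Command;
// Illustrative arguments: substitute the subcommand and query you need.
let out = Command::new("sqlite-graphrag")
    .args(["recall", "your query", "--json"])
    .output()
    .unwrap();
```
adk-rust
```rust
use std::process::Command;
// Illustrative arguments: substitute the subcommand and query you need.
let out = Command::new("sqlite-graphrag")
    .args(["recall", "your query", "--json"])
    .output()
    .unwrap();
```
rs-graph-llm
```rust
use std::process::Command;
// Illustrative arguments: substitute the subcommand and query you need.
let out = Command::new("sqlite-graphrag")
    .args(["recall", "your query", "--json"])
    .output()
    .unwrap();
```
genai
```rust
use std::process::Command;
// Illustrative arguments: substitute the subcommand and query you need.
let out = Command::new("sqlite-graphrag")
    .args(["recall", "your query", "--json"])
    .output()
    .unwrap();
```
liter-llm
```rust
use std::process::Command;
// Illustrative arguments: substitute the subcommand and query you need.
let out = Command::new("sqlite-graphrag")
    .args(["recall", "your query", "--json"])
    .output()
    .unwrap();
```
llm-cascade
```rust
use std::process::Command;
// Illustrative arguments: substitute the subcommand and query you need.
let out = Command::new("sqlite-graphrag")
    .args(["recall", "your query", "--json"])
    .output()
    .unwrap();
```
async-openai
```rust
use std::process::Command;
// Illustrative arguments: substitute the subcommand and query you need.
let out = Command::new("sqlite-graphrag")
    .args(["recall", "your query", "--json"])
    .output()
    .unwrap();
```
async-llm
```rust
use std::process::Command;
// Illustrative arguments: substitute the subcommand and query you need.
let out = Command::new("sqlite-graphrag")
    .args(["recall", "your query", "--json"])
    .output()
    .unwrap();
```
anthropic-sdk
```rust
use std::process::Command;
// Illustrative arguments: substitute the subcommand and query you need.
let out = Command::new("sqlite-graphrag")
    .args(["recall", "your query", "--json"])
    .output()
    .unwrap();
```
ollama-rs
```rust
use std::process::Command;
// Illustrative arguments: substitute the subcommand and query you need.
let out = Command::new("sqlite-graphrag")
    .args(["recall", "your query", "--json"])
    .output()
    .unwrap();
```
mistral-rs
```rust
use std::process::Command;
// Illustrative arguments: substitute the subcommand and query you need.
let out = Command::new("sqlite-graphrag")
    .args(["recall", "your query", "--json"])
    .output()
    .unwrap();
```
llama-cpp-rs
```rust
use std::process::Command;
// Illustrative arguments: substitute the subcommand and query you need.
let out = Command::new("sqlite-graphrag")
    .args(["recall", "your query", "--json"])
    .output()
    .unwrap();
```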
Contributing
Pull requests are welcome
- Read the contribution guidelines in CONTRIBUTING.md
- Open issues at the GitHub repository for bugs or feature requests
- Follow the code of conduct described in CODE_OF_CONDUCT.md
Security
Responsible disclosure policy
- Security reports follow the policy described in SECURITY.md
- Contact the maintainer privately before disclosing vulnerabilities publicly
JSON Schemas
Canonical contracts for every subcommand response
- Authoritative JSON Schemas for every `--json` response live under `docs/schemas/` and are versioned alongside the crate
- 64 schemas cover `init`, `remember`, `remember-batch` (+ summary), `recall`, `hybrid-search`, `deep-research`, `list`, `read`, `forget`, `purge`, `rename`, `edit`, `history`, `restore`, `link`, `unlink`, `prune-relations`, `health`, `stats`, `migrate` (+ `migrate-rehash` + `migrate-to-llm-only`), `vacuum`, `optimize`, `cleanup-orphans`, `sync-safe-copy`, `backup`, `graph` (+ stats/traverse/entities), `related`, `namespace-detect`, `debug-schema`, `entities-input`, `relationships-input`, `ingest-file-event` (+ `ingest-summary`), `ingest-claude-phase` (+ file-event + summary), `export-memory-line` (+ summary), `enrich-phase` (+ item-event + summary), `fts rebuild` (+ `fts check` + `fts stats`), `vec orphan-list` (+ `vec purge-orphan` + `vec stats`), `codex-models`, `error-envelope`
- Treat these schemas as the agent contract; SKILL.md documents the same shapes in human-readable form
- Validate downstream consumers with any standard JSON Schema validator (e.g. `ajv`, `jsonschema`)
Changelog
Release history tracked separately
- PRD — Product Requirements Document (source of truth for the 31 behavioral contracts)
- Read the full release history in CHANGELOG.md
Acknowledgments
Built on top of excellent open source
- `fastembed` and `sqlite-vec` powered the local embedding pipeline up to v1.0.75 (removed since; embeddings now come from `claude`/`codex` subprocesses)
- `refinery` runs schema migrations with transactional safety guarantees
- `clap` powers the CLI argument parsing with derive macros
- `rusqlite` wraps SQLite with safe Rust bindings and bundled build
License
Dual license MIT OR Apache-2.0
- Licensed under either of Apache License 2.0 or MIT License at your option
- See `LICENSE-APACHE` and `LICENSE-MIT` in the repository root for full text