sqlite-graphrag

Your AI agents forget everything. Give any LLM agent a memory that survives restarts, cloud outages, and API bills. No cloud. No Python. No embeddings API. Still GraphRAG. This 25 MB binary gives them a brain.

Portuguese version available at README.pt-BR.md
Public package and repository are live on GitHub and crates.io
Install the current published release with cargo install sqlite-graphrag --version 1.0.5 --locked
Build directly from the local checkout with cargo install --path .

cargo install --path .

What is it?

sqlite-graphrag delivers durable memory for AI agents

Stores memories, entities and relationships inside a single SQLite file under 25 MB
Embeds content locally via fastembed with the multilingual-e5-small model
Combines FTS5 full-text search with sqlite-vec KNN into a hybrid Reciprocal Rank Fusion ranker
Stores and traverses an explicit entity graph with typed edges for multi-hop recall across memories
Preserves every edit through an immutable version history table for full audit
Runs on Linux, macOS and Windows natively with zero external services required
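
To make the hybrid ranker above more concrete, here is a minimal sketch of Reciprocal Rank Fusion over two ranked lists of memory names. It uses the common textbook form, where each list contributes 1 / (rrf_k + rank). The function name `rrf_fuse`, the sample names, and the value 60 are assumptions for illustration, not taken from the sqlite-graphrag source.

```rust
use std::collections::HashMap;

/// Fuse several ranked lists with Reciprocal Rank Fusion.
/// Each list adds 1 / (rrf_k + rank) to an item's score, with rank starting at 1.
/// Hypothetical helper: the real ranker may weight or normalize differently.
fn rrf_fuse(rankings: &[Vec<&str>], rrf_k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (idx, name) in ranking.iter().enumerate() {
            let rank = (idx + 1) as f64;
            *scores.entry((*name).to_string()).or_insert(0.0) += 1.0 / (rrf_k + rank);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest fused score first; ties are broken by name so output is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

An item that appears near the top of both the FTS5 list and the KNN list outranks an item that tops only one of them, which is the property that makes rank fusion useful when the two scoring scales are not comparable.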

Why sqlite-graphrag?

Differentiators against cloud RAG stacks

Offline-first architecture eliminates OpenAI embeddings and Pinecone recurring fees
Single-file SQLite storage replaces Docker clusters of vector databases entirely
Graph-native retrieval beats pure vector RAG on multi-hop questions by design
Deterministic JSON output unlocks clean orchestration by LLM agents in pipelines
Native cross-platform binary ships without Python, Node or Docker dependencies
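
The multi-hop claim above can be illustrated with a small breadth-first walk over typed, weighted edges. This is only a sketch under stated assumptions: the `Edge` struct, the `multi_hop` function, and the entity names are hypothetical and do not reflect the actual sqlite-graphrag schema or traversal code.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// A typed, weighted edge between two entity names (hypothetical shape).
struct Edge {
    from: &'static str,
    to: &'static str,
    relation: &'static str,
    weight: f64,
}

/// Breadth-first walk from `start`, following edges at most `max_hops` away
/// and skipping edges whose weight is below `min_weight`.
/// Returns (entity, relation used to reach it, hop distance) in discovery order.
fn multi_hop(
    edges: &[Edge],
    start: &str,
    max_hops: usize,
    min_weight: f64,
) -> Vec<(String, &'static str, usize)> {
    let mut adjacency: HashMap<&str, Vec<&Edge>> = HashMap::new();
    for edge in edges {
        adjacency.entry(edge.from).or_default().push(edge);
    }
    let mut seen: HashSet<&str> = HashSet::new();
    seen.insert(start);
    let mut queue: VecDeque<(&str, usize)> = VecDeque::new();
    queue.push_back((start, 0));
    let mut reached = Vec::new();
    while let Some((node, hops)) = queue.pop_front() {
        if hops == max_hops {
            continue; // do not expand beyond the hop budget
        }
        for edge in adjacency.get(node).into_iter().flatten() {
            // `insert` returns false for already visited entities, which avoids cycles.
            if edge.weight >= min_weight && seen.insert(edge.to) {
                reached.push((edge.to.to_string(), edge.relation, hops + 1));
                queue.push_back((edge.to, hops + 1));
            }
        }
    }
    reached
}
```

A pure vector search would need the query text to resemble every memory along the chain, while a walk like this reaches the second-hop entity through the explicit edge even when its text shares no vocabulary with the query.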

Superpowers for AI Agents

First-class CLI contract for orchestration

Every subcommand accepts --json producing deterministic stdout payloads
Every invocation is stateless with explicit exit codes for routing decisions
Note: because the CLI is stateless, each invocation reloads the embedding model (about 1 s); a daemon mode targeting latency under 50 ms is planned for v3.0.0
Every write is idempotent through --name kebab-case uniqueness constraints
Stdin accepts bodies or JSON payloads for entities and relationship batches
Relationship payloads use strength in [0.0, 1.0], mapped to weight in outputs
Stderr carries tracing output under SQLITE_GRAPHRAG_LOG_LEVEL=debug only
Cross-platform behavior is identical across Linux, macOS and Windows hosts
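
An orchestrator may want to validate inputs before spawning the binary, so that obvious mistakes never cost a model reload. The sketch below checks a kebab-case name and a relationship strength. The exact rules the CLI enforces are not documented here, so both helpers (`is_kebab_case`, `is_valid_strength`) encode assumed rules and are hypothetical.

```rust
/// Returns true when `name` is lowercase kebab-case: segments of ASCII
/// lowercase letters or digits separated by single hyphens.
/// ASSUMPTION: the CLI may accept a slightly different character set.
fn is_kebab_case(name: &str) -> bool {
    name.split('-').all(|segment| {
        !segment.is_empty()
            && segment
                .chars()
                .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit())
    })
}

/// Returns true when a relationship strength lies in the closed interval [0.0, 1.0].
/// NaN is rejected because it is not contained in any range.
fn is_valid_strength(strength: f64) -> bool {
    (0.0..=1.0).contains(&strength)
}
```

Rejecting a bad name or strength on the caller side is cheaper than waiting for the CLI to report a validation error.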

27 AI agents and IDEs supported out of the box

Agent	Vendor	Minimum version	Integration pattern
Claude Code	Anthropic	1.0	Subprocess with `--json` stdout
Codex	OpenAI	1.0	Tool call wrapping `cargo run -- recall`
Gemini CLI	Google	1.0	Function call returning JSON
Opencode	Opencode	1.0	Shell tool with `hybrid-search --json`
OpenClaw	Community	0.1	Subprocess pipe into `jaq` filters
Paperclip	Community	0.1	Direct CLI invocation per message
VS Code Copilot	Microsoft	1.85	Terminal subprocess via tasks
Google Antigravity	Google	1.0	Agent tool with structured JSON
Windsurf	Codeium	1.0	Custom command registration
Cursor	Anysphere	0.42	Terminal integration or MCP wrapper
Zed	Zed Industries	0.160	Extension wrapping subprocess
Aider	Paul Gauthier	0.60	Shell command hook per turn
Jules	Google Labs	1.0	Workspace shell integration
Kilo Code	Community	1.0	Subprocess invocation
Roo Code	Community	1.0	Custom command via CLI
Cline	Saoud Rizwan	3.0	Terminal tool registered manually
Continue	Continue Dev	0.9	Context provider via shell
Factory	Factory AI	1.0	Tool call with JSON response
Augment Code	Augment	1.0	Terminal command wrapping
JetBrains AI Assistant	JetBrains	2024.3	External tool per IDE
OpenRouter	OpenRouter	1.0	Function routing through shell
Minimax	Minimax	1.0	Subprocess invocation
Z.ai	Z.ai	1.0	Subprocess invocation
Ollama	Ollama	0.1	Subprocess invocation
Hermes Agent	Community	1.0	Subprocess invocation
LangChain	LangChain	0.3	Subprocess via tool
LangGraph	LangChain	0.2	Subprocess via node

Quick Start

Install and record your first memory in four commands

cargo install --path .
sqlite-graphrag init
sqlite-graphrag remember --name onboarding-note --type user --description "first memory" --body "hello graphrag"
sqlite-graphrag recall "graphrag" --k 5 --json

For the local checkout, cargo install --path . is enough
When installing the published release from crates.io, prefer --locked to preserve the tested MSRV dependency graph

Installation

Multiple distribution channels

Install from the local checkout with cargo install --path .
Build from the local checkout with cargo build --release
Homebrew formula is planned under brew install sqlite-graphrag
Scoop bucket is planned under scoop install sqlite-graphrag
Docker image planned as ghcr.io/daniloaguiarbr/sqlite-graphrag:1.0.3

Usage

Initialize the database

sqlite-graphrag init
sqlite-graphrag init --namespace project-foo

Remember a memory with an optional explicit entity graph

sqlite-graphrag remember \
  --name integration-tests-postgres \
  --type feedback \
  --description "prefer real Postgres over SQLite mocks" \
  --body "Integration tests must hit a real database."

Recall memories by semantic similarity

sqlite-graphrag recall "postgres integration tests" --k 3 --json

Hybrid search combining FTS5 and vector KNN

sqlite-graphrag hybrid-search "postgres migration rollback" --k 10 --json

Inspect database health and stats

sqlite-graphrag health --json
sqlite-graphrag stats --json

Purge soft-deleted memories after retention period

sqlite-graphrag purge --retention-days 90 --dry-run --json
sqlite-graphrag purge --retention-days 90 --yes

Commands

Core database lifecycle

Command	Arguments	Description
`init`	`--namespace <ns>`	Initialize database and download embedding model
`health`	`--json`	Show database integrity and pragma status
`stats`	`--json`	Count memories, entities and relationships
`migrate`	`--json`	Apply pending schema migrations via `refinery`
`vacuum`	`--json`	Checkpoint WAL and reclaim disk space
`optimize`	`--json`	Run `PRAGMA optimize` to refresh statistics
`sync-safe-copy`	`--dest <path>` (alias `--output`)	Checkpoint then copy a sync-safe snapshot

Memory content lifecycle

Command	Arguments	Description
`remember`	`--name`, `--type`, `--description`, `--body`	Save a memory with optional entity graph
`recall`	`<query>`, `--k`, `--type`	Search memories semantically via KNN
`read`	`--name <name>`	Fetch a memory by exact kebab-case name
`list`	`--type`, `--limit`, `--offset`	Paginate memories sorted by `updated_at`
`forget`	`--name <name>`	Soft-delete a memory preserving history
`rename`	`--old <name>`, `--new <name>`	Rename a memory while keeping versions
`edit`	`--name`, `--body`, `--description`	Edit body or description creating new version
`history`	`--name <name>`	List all versions of a memory
`restore`	`--name`, `--version`	Restore a memory to a previous version

Retrieval and graph

Command	Arguments	Description
`hybrid-search`	`<query>`, `--k`, `--rrf-k`	FTS5 plus vector fused via Reciprocal Rank Fusion
`namespace-detect`	`--namespace <name>`	Resolve namespace precedence for invocation

Maintenance

Command	Arguments	Description
`purge`	`--retention-days <n>`, `--dry-run`, `--yes`	Permanently delete soft-deleted memories

Environment Variables

Runtime configuration overrides

Variable	Description	Default	Example
`SQLITE_GRAPHRAG_DB_PATH`	Path to the SQLite database file override	`./graphrag.sqlite` in the invocation directory	`/data/graphrag.sqlite`
`SQLITE_GRAPHRAG_CACHE_DIR`	Directory override for model cache and lock files	XDG cache dir	`~/.cache/sqlite-graphrag`
`SQLITE_GRAPHRAG_LANG`	CLI output language as `en` or `pt`	`en`	`pt`
`SQLITE_GRAPHRAG_LOG_LEVEL`	Tracing filter level for stderr output	`info`	`debug`
`SQLITE_GRAPHRAG_NAMESPACE`	Namespace override bypassing detection	none	`project-foo`
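
A wrapper that needs to know which database file and namespace an invocation will use can mirror the defaults from the table above. The sketch below is a guess at that resolution logic, not the CLI's own code: the function names are hypothetical, and the precedence of an explicit flag over the environment variable is an assumption.

```rust
use std::path::{Path, PathBuf};

/// Treat unset and blank values the same way.
fn non_empty(value: Option<&str>) -> Option<&str> {
    value.filter(|v| !v.trim().is_empty())
}

/// Pick the database path: the SQLITE_GRAPHRAG_DB_PATH value when set,
/// otherwise graphrag.sqlite inside the invocation directory.
fn resolve_db_path(override_path: Option<&str>, cwd: &Path) -> PathBuf {
    match non_empty(override_path) {
        Some(path) => PathBuf::from(path),
        None => cwd.join("graphrag.sqlite"),
    }
}

/// Pick the namespace.
/// ASSUMPTION: explicit flag first, then SQLITE_GRAPHRAG_NAMESPACE, then "global".
fn resolve_namespace(flag: Option<&str>, env_value: Option<&str>) -> String {
    non_empty(flag)
        .or(non_empty(env_value))
        .unwrap_or("global")
        .to_string()
}

/// Convenience wrapper that reads the real process environment.
fn db_path_from_env() -> std::io::Result<PathBuf> {
    let override_path = std::env::var("SQLITE_GRAPHRAG_DB_PATH").ok();
    Ok(resolve_db_path(
        override_path.as_deref(),
        &std::env::current_dir()?,
    ))
}
```

Keeping the pure functions separate from the environment read makes the precedence rules easy to unit test without mutating process state.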

Integration Patterns

Compose with Unix pipelines and tools

sqlite-graphrag recall "auth tests" --k 5 --json | jaq -r '.results[].name'

Feed hybrid search into a summarizer endpoint

sqlite-graphrag hybrid-search "postgres migration" --k 10 --json \
  | jaq -c '.results[] | {name, combined_score}' \
  | xh POST http://localhost:8080/summarize

Backup with atomic snapshot and compression

sqlite-graphrag sync-safe-copy --dest /tmp/ng.sqlite
ouch compress /tmp/ng.sqlite /tmp/ng-$(date +%Y%m%d).tar.zst

Claude Code subprocess example in Node

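const { spawn } = require('child_process');
const query = 'auth tests'; // example query text
const proc = spawn('sqlite-graphrag', ['recall', query, '--k', '5', '--json']);
proc.stdout.on('data', (chunk) => process.stdout.write(chunk)); // forward the JSON payload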

Docker Alpine build for CI pipelines

FROM rust:1.88-alpine AS builder
RUN apk add musl-dev sqlite-dev
WORKDIR /app
COPY . .
RUN cargo install --path .

Exit Codes

Deterministic status codes for orchestration

Code	Meaning
`0`	Success
`1`	Validation error or runtime failure
`2`	Duplicate detected or invalid CLI argument
`3`	Conflict during optimistic update
`4`	Memory or entity not found
`5`	Namespace could not be resolved
`6`	Payload exceeded configured limits
`10`	SQLite database error
`11`	Embedding generation failed
`12`	`sqlite-vec` extension failed to load
`13`	Batch partial failure (import, reindex, stdin batch)
`14`	Filesystem I/O error
`15`	Database busy after retries (this was code 13 in the legacy release line)
`20`	Internal or JSON serialization error
`75`	`EX_TEMPFAIL`: all concurrency slots busy
`77`	Available RAM below minimum required to load the embedding model
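
One way an orchestrator might act on these codes is to collapse them into a few routing decisions. The grouping below is an illustrative policy only: the `Action` enum and the `route` function are hypothetical, and the CLI does not prescribe which codes are worth retrying.

```rust
/// What an orchestrator might do after an invocation finishes.
#[derive(Debug, PartialEq)]
enum Action {
    Proceed,
    RetryLater,
    FixInput,
    Escalate,
}

/// Map a documented exit code to a routing decision (assumed policy).
fn route(exit_code: i32) -> Action {
    match exit_code {
        0 => Action::Proceed,
        // Busy database or exhausted concurrency slots: likely transient.
        15 | 75 => Action::RetryLater,
        // Validation, duplicates, conflicts, missing items, namespace, payload size:
        // the caller should change the request before trying again.
        1..=6 => Action::FixInput,
        // Storage, embedding, extension, I/O, internal, low-memory and unknown codes.
        _ => Action::Escalate,
    }
}
```

A stricter policy could treat code 3 (optimistic update conflict) as retryable after re-reading the memory; the match arms are the only place that would change.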

Performance

Measured on a 1000-memory database

Stateless CLI invocations typically spend about one second reloading the embedding model per heavy command
Warm in-process recall, with the model already resident, stays well below that one-shot subprocess latency
First init downloads the quantized model once and caches it locally
Embedding model uses approximately 1100 MB of RAM per process instance after the v1.0.3 RSS calibration

Safe Parallel Invocation

Counting semaphore with up to four simultaneous slots

Each invocation loads multilingual-e5-small consuming roughly 1100 MB of RAM after the v1.0.3 measurement pass
MAX_CONCURRENT_CLI_INSTANCES remains the hard ceiling at 4 cooperating subprocesses
The heavy commands (init, remember, recall and hybrid-search) are clamped to fewer slots when available RAM cannot safely sustain the requested parallelism
Lock files live at ~/.cache/sqlite-graphrag/cli-slot-{1..4}.lock using flock
A fifth concurrent invocation waits up to 300 seconds then exits with code 75
Use --max-concurrency N to request the slot limit for the current invocation; heavy commands may still be reduced automatically
Memory guard aborts with exit 77 when less than 2 GB of RAM is available
SIGINT and SIGTERM trigger graceful shutdown via the shutdown_requested() atomic flag
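
The interaction between the requested slot count, the hard ceiling, and available RAM can be sketched as a single clamp. The formula below is an assumption made for illustration: the README states only that heavy commands are reduced dynamically, not how. `MODEL_RSS_MB`, `MIN_AVAILABLE_MB` (taking 2 GB as 2048 MB) and `effective_slots` are hypothetical names.

```rust
/// Hard ceiling on cooperating subprocesses, as documented above.
const MAX_CONCURRENT_CLI_INSTANCES: usize = 4;
/// Approximate resident memory per loaded model, in MB (documented figure).
const MODEL_RSS_MB: u64 = 1100;
/// Memory guard threshold, in MB (ASSUMPTION: 2 GB interpreted as 2048 MB).
const MIN_AVAILABLE_MB: u64 = 2048;

/// Returns the number of slots a heavy command could safely use,
/// or Err(77) when the memory guard would abort the invocation.
/// ASSUMPTION: one slot per MODEL_RSS_MB of available RAM.
fn effective_slots(requested: usize, available_mb: u64) -> Result<usize, i32> {
    if available_mb < MIN_AVAILABLE_MB {
        return Err(77);
    }
    let by_ram = (available_mb / MODEL_RSS_MB) as usize;
    Ok(requested
        .min(MAX_CONCURRENT_CLI_INSTANCES)
        .min(by_ram)
        .max(1))
}
```

Under these assumptions a host with about 3.4 GB free would run three heavy invocations at once even if four were requested, and a host below the guard threshold would run none.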

Troubleshooting FAQ

Common issues and fixes

Default behavior always creates or opens graphrag.sqlite in the current working directory
If the database stays locked after a crash, run sqlite-graphrag vacuum to checkpoint the WAL
First init takes roughly one minute while fastembed downloads the quantized model
Permission denied on Linux means the cache directory lacks write access for your user
Namespace detection falls back to global when no explicit override is present
Parallel invocations that exceed the effective safe limit receive exit 75 and SHOULD retry with backoff; during audits start heavy commands with --max-concurrency 1
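
The retry-with-backoff advice for exit 75 can be sketched as follows. The invocation is passed in as a closure that returns an exit code, so the retry logic stays independent of how the process is spawned. The names `backoff_delay` and `retry_on_tempfail`, and the choice of exponential doubling, are assumptions for illustration.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Exponential backoff: base * 2^attempt, never longer than `cap`.
fn backoff_delay(attempt: u32, base: Duration, cap: Duration) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    base.checked_mul(factor).map_or(cap, |delay| delay.min(cap))
}

/// Call `invoke` until it returns something other than 75 (EX_TEMPFAIL)
/// or until `max_attempts` calls have been made. It is always called at least once.
/// Returns the last exit code observed.
fn retry_on_tempfail<F: FnMut() -> i32>(
    mut invoke: F,
    max_attempts: u32,
    base: Duration,
    cap: Duration,
) -> i32 {
    let mut code = invoke();
    let mut attempt = 0;
    while code == 75 && attempt + 1 < max_attempts {
        sleep(backoff_delay(attempt, base, cap));
        attempt += 1;
        code = invoke();
    }
    code
}
```

In real use the closure would spawn the binary with std::process::Command, call status(), and map a missing exit code (for example, a process killed by a signal) to a non-retryable value. Adding random jitter to the delay would further reduce the chance of several agents retrying in lockstep.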

Compatible Rust Crates

Invoke sqlite-graphrag from any Rust AI framework via subprocess

Each crate calls the binary through std::process::Command with --json flag
No shared memory or FFI required: the contract is pure stdout JSON
Pin the binary version in your Cargo.toml workspace for reproducible builds
All 18 crates below work identically on Linux, macOS and Windows
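
Since the contract is stdout JSON plus an exit code, a caller would normally check the exit status before trusting the payload. Here is a minimal sketch of such a wrapper using only the standard library; `run_capture` is a hypothetical helper name, and parsing the returned text as JSON is left to whichever JSON crate the host framework already uses.

```rust
use std::process::Command;

/// Run a CLI program and return its stdout as UTF-8 text on success,
/// or a message describing the failure otherwise.
fn run_capture(program: &str, args: &[&str]) -> Result<String, String> {
    let output = Command::new(program)
        .args(args)
        .output()
        .map_err(|e| format!("failed to start {}: {}", program, e))?;
    if !output.status.success() {
        // `code()` is None when the process was terminated by a signal.
        return Err(format!(
            "{} exited with {:?}: {}",
            program,
            output.status.code(),
            String::from_utf8_lossy(&output.stderr).trim()
        ));
    }
    String::from_utf8(output.stdout).map_err(|e| format!("stdout was not UTF-8: {}", e))
}
```

A call such as run_capture("sqlite-graphrag", &["recall", "project goals", "--k", "5", "--json"]) would then yield either the JSON text or an error carrying the exit code, instead of panicking the way a bare unwrap() does when the binary is missing.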

rig-core

use std::process::Command;
let out = Command::new("sqlite-graphrag")
    .args(["recall", "project goals", "--k", "5", "--json"])
    .output().unwrap();

swarms-rs

use std::process::Command;
let out = Command::new("sqlite-graphrag")
    .args(["hybrid-search", "agent memory", "--k", "10", "--json"])
    .output().unwrap();

autoagents

use std::process::Command;
let out = Command::new("sqlite-graphrag")
    .args(["remember", "--name", "task-context", "--type", "project",
           "--description", "current sprint goal", "--body", "finish auth module"])
    .output().unwrap();

graphbit

use std::process::Command;
let out = Command::new("sqlite-graphrag")
    .args(["recall", "decision log", "--k", "3", "--json"])
    .output().unwrap();

agentai

use std::process::Command;
let out = Command::new("sqlite-graphrag")
    .args(["hybrid-search", "previous decisions", "--k", "5", "--json"])
    .output().unwrap();

llm-agent-runtime

use std::process::Command;
let out = Command::new("sqlite-graphrag")
    .args(["recall", "user preferences", "--k", "5", "--json"])
    .output().unwrap();

anda

use std::process::Command;
let out = Command::new("sqlite-graphrag")
    .args(["stats", "--json"])
    .output().unwrap();

adk-rust

use std::process::Command;
let out = Command::new("sqlite-graphrag")
    .args(["recall", "tool outputs", "--k", "5", "--json"])
    .output().unwrap();

rs-graph-llm

use std::process::Command;
let out = Command::new("sqlite-graphrag")
    .args(["hybrid-search", "graph relations", "--k", "10", "--json"])
    .output().unwrap();

genai

use std::process::Command;
let out = Command::new("sqlite-graphrag")
    .args(["recall", "model context", "--k", "5", "--json"])
    .output().unwrap();

liter-llm

use std::process::Command;
let out = Command::new("sqlite-graphrag")
    .args(["remember", "--name", "session-notes", "--type", "user",
           "--description", "session recap", "--body", "discussed architecture"])
    .output().unwrap();

llm-cascade

use std::process::Command;
let out = Command::new("sqlite-graphrag")
    .args(["recall", "fallback context", "--k", "3", "--json"])
    .output().unwrap();

async-openai

use std::process::Command;
let out = Command::new("sqlite-graphrag")
    .args(["recall", "system prompt history", "--k", "5", "--json"])
    .output().unwrap();

async-llm

use std::process::Command;
let out = Command::new("sqlite-graphrag")
    .args(["hybrid-search", "chat context", "--k", "5", "--json"])
    .output().unwrap();

anthropic-sdk

use std::process::Command;
let out = Command::new("sqlite-graphrag")
    .args(["recall", "tool use patterns", "--k", "5", "--json"])
    .output().unwrap();

ollama-rs

use std::process::Command;
let out = Command::new("sqlite-graphrag")
    .args(["recall", "local model outputs", "--k", "5", "--json"])
    .output().unwrap();

mistral-rs

use std::process::Command;
let out = Command::new("sqlite-graphrag")
    .args(["hybrid-search", "inference context", "--k", "10", "--json"])
    .output().unwrap();

llama-cpp-rs

use std::process::Command;
let out = Command::new("sqlite-graphrag")
    .args(["recall", "llama session context", "--k", "5", "--json"])
    .output().unwrap();

Contributing

Pull requests are welcome

Read the contribution guidelines in CONTRIBUTING.md
Open issues at the GitHub repository for bugs or feature requests
Follow the code of conduct described in CODE_OF_CONDUCT.md

Security

Responsible disclosure policy

Security reports follow the policy described in SECURITY.md
Contact the maintainer privately before disclosing vulnerabilities publicly

Changelog

Release history tracked separately

Read the full release history in CHANGELOG.md

Acknowledgments

Built on top of excellent open source

fastembed provides local quantized embedding models without ONNX hassle
sqlite-vec adds vector indexes directly inside SQLite as an extension
refinery runs schema migrations with transactional safety guarantees
clap powers the CLI argument parsing with derive macros
rusqlite wraps SQLite with safe Rust bindings and bundled build

License

Dual license MIT OR Apache-2.0

Licensed under either of Apache License 2.0 or MIT License at your option
See LICENSE-APACHE and LICENSE-MIT in the repository root for full text

sqlite-graphrag 1.0.5