sqlitegraph 3.2.5

Embedded graph database with full ACID transactions, HNSW vector search, dual backend support, and comprehensive graph algorithms library

SQLiteGraph


Embedded graph database with dual backend architecture, graph algorithms, Cypher-inspired queries, and HNSW vector search.

Positioning:

  • Single-binary embedded database (no server).
  • Persistent storage with atomic batch commits.
  • Graph algorithms + HNSW vector search in one engine.
  • SQLite: stable, mature, and easy to inspect with standard tooling.
  • Native V3: graph-oriented storage with cache, KV, pub/sub, and traversal features.

See the benchmarks below for workload-specific behavior.

What's New in the 3.2 Line

The current 3.2 release line builds on the 3.0 backend/model transition with concrete HNSW and traversal improvements:

  • HnswIndex::batch_insert_vectors() for lower-overhead bulk vector ingestion.
  • Transactional topology persistence for HNSW metadata and layers.
  • HnswIndexStats runtime counters for inserts, searches, vector-cache hits, and vector-cache misses.
  • Streaming traversal iterators for BFS, DFS, topological sort, and connected components, so callers can avoid materializing full Vecs when they only need incremental results.
  • parking_lot-based lock cleanup across the HNSW path and related hot locks.

See CHANGELOG for full history.
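
A minimal sketch of the streaming idea behind the traversal iterators listed above, written against a hypothetical in-memory adjacency list. `AdjGraph` and `Bfs` are illustrative names, not sqlitegraph types, and the crate's actual iterator API may differ. The point is only that a BFS exposed as an `Iterator` lets a caller stop early instead of building the full result `Vec`:

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Hypothetical adjacency-list graph used only for this sketch;
/// it is not sqlitegraph's storage type.
pub struct AdjGraph {
    edges: HashMap<u64, Vec<u64>>,
}

impl AdjGraph {
    pub fn new() -> Self {
        Self { edges: HashMap::new() }
    }

    pub fn add_edge(&mut self, from: u64, to: u64) {
        self.edges.entry(from).or_default().push(to);
    }

    /// Returns a lazy BFS iterator rooted at `start`.
    pub fn bfs(&self, start: u64) -> Bfs<'_> {
        let mut visited = HashSet::new();
        visited.insert(start);
        let mut queue = VecDeque::new();
        queue.push_back(start);
        Bfs { graph: self, queue, visited }
    }
}

/// BFS state: the frontier queue plus the set of nodes already enqueued.
pub struct Bfs<'a> {
    graph: &'a AdjGraph,
    queue: VecDeque<u64>,
    visited: HashSet<u64>,
}

impl<'a> Iterator for Bfs<'a> {
    type Item = u64;

    /// Pops one node, enqueues its unseen neighbors, and yields the node.
    fn next(&mut self) -> Option<u64> {
        let node = self.queue.pop_front()?;
        if let Some(neighbors) = self.graph.edges.get(&node) {
            for &n in neighbors {
                // `insert` returns true only the first time a node is seen.
                if self.visited.insert(n) {
                    self.queue.push_back(n);
                }
            }
        }
        Some(node)
    }
}
```

With this shape, `g.bfs(start).take(2)` expands only as much of the frontier as those two results require, while `.collect::<Vec<_>>()` still recovers the fully materialized result when it is needed.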

Backends

Feature           SQLite           Native V3
Status            Stable           Stable
Storage           .db file         .graph file
Capacity model    Storage-limited  Storage-limited
Graph algorithms  35+              35+
HNSW vectors      Yes              Yes
Pub/Sub           Yes              Yes
LRU Cache         No               Yes
Parallel BFS      No               Yes

Benchmarks

See Architecture for system design details and Benchmarking for methodology.

Representative clean samples from 2026-06-07 (AMD Ryzen 7 7800X3D, tmpfs, Rust 1.95.0):

Benchmark                                        SQLite      Native V3
Criterion bfs_traversal/small_random_1k_5k       2.3680 ms   3.3191 ms
Criterion bfs_traversal/medium_random_10k_50k    26.510 ms   56.240 ms
Release microbenchmark point lookup              3965 ns     146 ns

Use the release microbenchmark for fast sanity checks and the Criterion suites for workload comparisons.
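
To compare the two backends across rows with different units, it can help to reduce each row to a ratio. The sketch below is illustrative only: it hard-codes the three sample timings from the table above, and `sqlite_over_v3` and `print_ratios` are hypothetical helper names, not part of the crate or its benchmark scripts.

```rust
/// Sample timings copied from the table above:
/// (benchmark, SQLite time, Native V3 time), both in the same unit per row.
const SAMPLES: [(&str, f64, f64); 3] = [
    ("bfs_traversal/small_random_1k_5k", 2.3680, 3.3191),    // ms
    ("bfs_traversal/medium_random_10k_50k", 26.510, 56.240), // ms
    ("point lookup", 3965.0, 146.0),                         // ns
];

/// SQLite time divided by Native V3 time.
/// Above 1.0 means V3 was faster; below 1.0 means SQLite was faster.
fn sqlite_over_v3(sqlite: f64, v3: f64) -> f64 {
    sqlite / v3
}

/// Prints one ratio per sample row, rounded to two decimals.
fn print_ratios() {
    for &(name, sqlite, v3) in SAMPLES.iter() {
        println!("{}: SQLite/V3 = {:.2}", name, sqlite_over_v3(sqlite, v3));
    }
}
```

For these samples the ratios come out near 0.71 and 0.47 for the two BFS runs (SQLite faster) and near 27 for the point lookup (V3 faster). They are single samples from one machine, so treat them as indicative rather than general.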

Run benchmarks yourself:

git clone https://github.com/oldnordic/sqlitegraph.git
cd sqlitegraph
./scripts/run-curated-benchmarks.sh

cd sqlitegraph-core
cargo run --release --example test_performance_comparison --features native-v3
cargo bench --features native-v3 --bench backend_comparison
cargo bench --features native-v3 --bench sqlite_v3_curated

See examples/ for quick performance checks and docs/BENCHMARKING.md for the full workflow.

Quick Start

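[dependencies]
# SQLite backend (default)
sqlitegraph = "3.2"
# The example below builds its node payload with serde_json::json!
serde_json = "1"

# OR Native V3 backend (graph-oriented storage); use this line instead of the sqlitegraph line above
# sqlitegraph = { version = "3.2", features = ["native-v3"] }

use sqlitegraph_core::backend::{GraphBackend, NodeSpec};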
use sqlitegraph_core::backend::sqlite::SqliteGraphBackend;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let backend = SqliteGraphBackend::in_memory()?;

    let node_id = backend.insert_node(NodeSpec {
        kind: "User".to_string(),
        name: "Alice".to_string(),
        file_path: None,
        data: serde_json::json!({"age": 30}),
    })?;

    println!("Created node: {}", node_id);
    Ok(())
}

TypedDiGraph (In-Memory)

A lightweight in-memory directed graph with typed node and edge weights, independent of the GraphBackend persistence layer. Useful for build-system DAGs, dependency graphs, and analysis passes that don't need disk storage.

use sqlitegraph::typed_digraph::{TypedDiGraph, NodeIndex, Direction};
use sqlitegraph::typed_digraph::algo::{toposort, tarjan_scc, Dfs};

let mut g = TypedDiGraph::<&str, i32>::new();
let a = g.add_node("compile");
let b = g.add_node("link");
let c = g.add_node("run");
g.add_edge(a, b, 1);
g.add_edge(b, c, 2);

// Topological order
let order = toposort(&g).expect("acyclic");
assert_eq!(order, vec![a, b, c]);

// DFS traversal
let mut dfs = Dfs::new(&g, a);
assert_eq!(dfs.by_ref().collect::<Vec<_>>(), vec![a, b, c]);

Available in the current 3.x line.

CLI

cargo install sqlitegraph-cli

# Query
sqlitegraph --db graph.db query "MATCH (n:User) RETURN n.name"

# Algorithms
sqlitegraph --db graph.db bfs --start 1 --depth 3
sqlitegraph --db graph.db algo pagerank --iterations 100

Copy-Paste CLI Demo

rm -f /tmp/sqlitegraph-demo.db

sqlitegraph --db /tmp/sqlitegraph-demo.db --write insert --kind User --name Alice --data '{"age":30}'
sqlitegraph --db /tmp/sqlitegraph-demo.db --write insert --kind User --name Bob --data '{"age":31}'
sqlitegraph --db /tmp/sqlitegraph-demo.db --write query 'CREATE (1)-[:KNOWS]->(2)'

sqlitegraph --db /tmp/sqlitegraph-demo.db query 'MATCH (a:User)-[:KNOWS]->(b:User) RETURN a.name, b.name'
sqlitegraph --db /tmp/sqlitegraph-demo.db algo scc

Hybrid Runtime Demo

This crate includes a runnable demo that combines ordinary SQLite rows, Native V3 graph metadata, SQLite-backed HNSW vectors, and V3 pub/sub:

cargo run -p sqlitegraph --example hybrid_sqlite_v3_hnsw_pubsub --features native-v3

Safety Invariants

  • Orphan edges are detected by verifying every edge endpoint references a stored entity before any reasoning or subgraph extraction runs.
  • Duplicate edges (identical (from,to,type) tuples) are tallied so traversal/pipeline counts stay deterministic and regressions surface quickly.
  • Invalid label/property references (metadata rows pointing at missing entities) are reported by the safety-check helpers.
  • Integrity sweeps perform a deep table walk (entities/edges/labels/properties), verifying sorted IDs, valid JSON payloads, and metadata references before committing to pipelines or migrations.
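
The first two invariants above can be sketched in a few lines. This is an illustration under assumptions, not the crate's safety-check code: `Edge`, `orphan_edges`, and `duplicate_edge_count` are hypothetical names, and entities are reduced to a set of integer IDs.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical edge record for this sketch: a (from, to, type) tuple.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Edge {
    pub from: i64,
    pub to: i64,
    pub edge_type: String,
}

/// Returns every edge whose source or target is not a stored entity.
pub fn orphan_edges<'a>(entities: &HashSet<i64>, edges: &'a [Edge]) -> Vec<&'a Edge> {
    edges
        .iter()
        .filter(|e| !entities.contains(&e.from) || !entities.contains(&e.to))
        .collect()
}

/// Counts the surplus copies of identical (from, to, type) tuples:
/// an edge stored three times contributes two duplicates.
pub fn duplicate_edge_count(edges: &[Edge]) -> usize {
    let mut seen: HashMap<&Edge, usize> = HashMap::new();
    for e in edges {
        *seen.entry(e).or_insert(0) += 1;
    }
    seen.values().map(|&n| n - 1).sum()
}
```

Running both checks before any traversal keeps the traversal code free of missing-endpoint handling, and a stable duplicate tally makes unexpected changes in edge counts easy to spot in tests.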

DSL Constraints

  • Supported clauses are limited to deterministic pattern, k-hop, filter type=…, and score steps; ordering matters and only one filter clause is allowed.
  • Combination syntax (CALLS*2, CALLS->USES) must not introduce conflicting filters or unknown tokens. Ambiguous or unsupported input causes parser errors that are surfaced to the CLI and tests.

Performance & Instrumentation

Performance thresholds in sqlitegraph_bench.json gate releases. Benchmarks produce HTML reports under target/criterion. Use cargo bench --bench bench_insert (etc.) to isolate suites. For comparison work, prefer release-mode examples for quick sanity checks and Criterion suites for workload data.

Runtime instrumentation is exposed through the core APIs used by benchmarks and integration tests: prepare/execute counts, transaction begins/commits/rollbacks, and cache hits/misses can be captured while reproducing workloads.
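
As a rough illustration of how hit/miss counters of this kind are typically kept and read, the sketch below uses plain atomics. `CacheCounters` and its methods are hypothetical and are not the crate's instrumentation API; consult the crate documentation for the real counter types.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical counter pair for this sketch; not a sqlitegraph type.
#[derive(Default)]
pub struct CacheCounters {
    hits: AtomicU64,
    misses: AtomicU64,
}

impl CacheCounters {
    /// Records one cache hit. Relaxed ordering is enough for statistics.
    pub fn record_hit(&self) {
        self.hits.fetch_add(1, Ordering::Relaxed);
    }

    /// Records one cache miss.
    pub fn record_miss(&self) {
        self.misses.fetch_add(1, Ordering::Relaxed);
    }

    /// Hit rate in [0.0, 1.0], or `None` before any lookup was recorded.
    pub fn hit_rate(&self) -> Option<f64> {
        let hits = self.hits.load(Ordering::Relaxed);
        let total = hits + self.misses.load(Ordering::Relaxed);
        if total == 0 {
            None
        } else {
            Some(hits as f64 / total as f64)
        }
    }
}
```

Returning `None` for an empty sample avoids reporting a misleading 0% or a division by zero when a workload has not touched the cache yet.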

Schema Compatibility Matrix

Version  Description
1        Base tables (graph_entities, graph_edges, graph_labels, graph_properties) plus indexes and graph_meta.
2        Adds graph_meta_history rows so each migration application is recorded; exposed via run_pending_migrations / CLI migrate.
Future   The CLI refuses to open DBs whose version exceeds the compiled SCHEMA_VERSION.

Upgrade workflow:

  1. Inspect the database with sqlitegraph --db <path> status.
  2. Review pending migrations through the library migration helpers.
  3. Apply migrations atomically through the library helper; history entries are appended automatically.
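
The version rule from the matrix above, combined with this workflow, amounts to a three-way comparison. The sketch below is an assumption-laden illustration: the `SCHEMA_VERSION` value of 2, the `OpenDecision` enum, and `check_schema` are hypothetical and do not mirror the crate's migration helpers.

```rust
use std::cmp::Ordering;

/// Stand-in for the compiled-in schema version. The value 2 is assumed
/// here because it is the highest version listed in the matrix.
pub const SCHEMA_VERSION: u32 = 2;

/// Hypothetical outcome of inspecting a database's stored schema version.
#[derive(Debug, PartialEq, Eq)]
pub enum OpenDecision {
    /// Stored version matches the compiled version.
    UpToDate,
    /// Stored version is older; migrations are pending.
    NeedsMigration { from: u32, to: u32 },
    /// Stored version is newer than this build understands.
    Refuse { db_version: u32, supported: u32 },
}

/// Compares the stored version against the compiled one.
pub fn check_schema(db_version: u32) -> OpenDecision {
    match db_version.cmp(&SCHEMA_VERSION) {
        Ordering::Equal => OpenDecision::UpToDate,
        Ordering::Less => OpenDecision::NeedsMigration {
            from: db_version,
            to: SCHEMA_VERSION,
        },
        Ordering::Greater => OpenDecision::Refuse {
            db_version,
            supported: SCHEMA_VERSION,
        },
    }
}
```

Refusing newer databases outright, rather than opening them read-only, prevents an older binary from misinterpreting tables it was not built to understand.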

Ecosystem

Tools built on SQLiteGraph:

Tool      Purpose                                 Repository                     crates.io
Magellan  Code graph indexing, symbol navigation  github.com/oldnordic/magellan  crates.io/crates/magellan
llmgrep   Semantic code search                    github.com/oldnordic/llmgrep   crates.io/crates/llmgrep
Mirage    CFG analysis, path enumeration          github.com/oldnordic/mirage    crates.io/crates/mirage-analyzer
splice    Precision code editing                  github.com/oldnordic/splice    crates.io/crates/splice

Documentation

License

GPL-3.0-only