Crate sqlitegraph

SQLite-based graph database with unified backend support.

sqlitegraph provides a lightweight, deterministic graph database for embedded Rust applications. It supports both SQLite and Native storage backends through a unified API.

§Architecture

The crate is organized into focused modules:

sqlitegraph/
├── graph/           # Core graph database (SqliteGraph, GraphEntity, GraphEdge)
├── backend/         # Unified backend trait (GraphBackend, SqliteGraphBackend, NativeGraphBackend)
├── algo/            # Graph algorithms (PageRank, Betweenness, Louvain, Label Propagation)
├── hnsw/            # Vector similarity search (HNSW index, distance metrics)
├── cache/           # LRU-K adjacency cache for traversal optimization
├── introspection/   # Debugging and observability APIs
├── progress/        # Progress tracking for long-running operations
├── mvcc/            # MVCC-lite snapshot system
├── pattern_engine/  # Triple pattern matching
├── query/           # High-level query interface
└── recovery/        # Backup and restore utilities

§Features

  • Dual Backend Support: Choose between SQLite (feature-rich) and Native (performance-optimized) backends
  • Entity and Edge Storage: Rich metadata support with JSON serialization
  • Pattern Matching: Efficient triple pattern matching with cache-enabled fast-path
  • Traversal Algorithms: Built-in BFS, k-hop, and shortest path algorithms
  • Graph Algorithms: PageRank, Betweenness Centrality, Louvain, Label Propagation
  • Vector Search: HNSW approximate nearest neighbor search with persistence
  • MVCC Snapshots: Read isolation with snapshot consistency
  • Bulk Operations: High-performance batch insertions for large datasets
  • Introspection: Debugging APIs for cache stats, file sizes, edge counts
  • Progress Tracking: Callback-based progress for long-running algorithms
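
To make the combination of graph algorithms and callback-based progress more concrete, here is a minimal, self-contained sketch of power-iteration PageRank that reports each iteration to a progress hook. The `Progress` trait, `Recorder`, and `pagerank_sketch` are hypothetical names invented for this illustration; the crate's own `pagerank_with_progress` and `ProgressCallback` may have different signatures. The sketch assumes every edge target is a valid index into the adjacency list.

```rust
/// Hypothetical progress hook; the crate's own `ProgressCallback` may differ.
pub trait Progress {
    fn on_progress(&mut self, current: usize, total: usize);
}

/// Records every reported step so a caller can inspect them later.
pub struct Recorder(pub Vec<(usize, usize)>);

impl Progress for Recorder {
    fn on_progress(&mut self, current: usize, total: usize) {
        self.0.push((current, total));
    }
}

/// Power-iteration PageRank over an adjacency list (`out[u]` lists the targets of node u).
/// Nodes without outgoing edges spread their rank evenly over all nodes.
pub fn pagerank_sketch(
    out: &[Vec<usize>],
    damping: f64,
    iterations: usize,
    progress: &mut dyn Progress,
) -> Vec<f64> {
    let n = out.len();
    if n == 0 {
        return Vec::new();
    }
    let mut rank = vec![1.0 / n as f64; n];
    for iter in 0..iterations {
        // Total rank currently held by nodes without outgoing edges.
        let dangling: f64 = (0..n)
            .filter(|&u| out[u].is_empty())
            .map(|u| rank[u])
            .sum();
        let base = (1.0 - damping) / n as f64 + damping * dangling / n as f64;
        let mut next = vec![base; n];
        for (u, targets) in out.iter().enumerate() {
            if targets.is_empty() {
                continue;
            }
            // Each node splits its damped rank equally among its targets.
            let share = damping * rank[u] / targets.len() as f64;
            for &v in targets {
                next[v] += share;
            }
        }
        rank = next;
        // Report once per completed iteration.
        progress.on_progress(iter + 1, iterations);
    }
    rank
}
```

Passing the hook as a trait object keeps the algorithm independent of how progress is displayed; a no-op implementation would serve callers that do not need feedback.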

§Quick Start

use sqlitegraph::{open_graph, GraphConfig, BackendKind};

// Use SQLite backend (default)
let cfg = GraphConfig::sqlite();
let graph = open_graph("my_graph.db", &cfg)?;

// Or use Native backend
let cfg = GraphConfig::native();
let graph = open_graph("my_graph.db", &cfg)?;

// Both backends support the same operations
let node_id = graph.insert_node(/* node spec */)?;
let neighbor_ids = graph.neighbors(node_id, /* query */)?;
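
The idea that both backends support the same operations can be illustrated with a small trait and two interchangeable implementations. This is only a sketch of the pattern: `MiniBackend`, `MapBackend`, `EdgeListBackend`, and `build_and_query` are hypothetical names, and the real `GraphBackend` trait is likely richer and takes spec types such as `NodeSpec` and `NeighborQuery`.

```rust
use std::collections::HashMap;

/// Hypothetical, much smaller stand-in for a unified backend trait.
pub trait MiniBackend {
    fn insert_node(&mut self, name: &str) -> u64;
    fn insert_edge(&mut self, from: u64, to: u64);
    fn neighbors(&self, node: u64) -> Vec<u64>;
}

/// Backend A: adjacency stored in a hash map.
#[derive(Default)]
pub struct MapBackend {
    names: Vec<String>,
    adjacency: HashMap<u64, Vec<u64>>,
}

impl MiniBackend for MapBackend {
    fn insert_node(&mut self, name: &str) -> u64 {
        self.names.push(name.to_string());
        self.names.len() as u64 // ids start at 1
    }
    fn insert_edge(&mut self, from: u64, to: u64) {
        self.adjacency.entry(from).or_default().push(to);
    }
    fn neighbors(&self, node: u64) -> Vec<u64> {
        self.adjacency.get(&node).cloned().unwrap_or_default()
    }
}

/// Backend B: a flat edge list that is scanned on each query.
#[derive(Default)]
pub struct EdgeListBackend {
    node_count: u64,
    edges: Vec<(u64, u64)>,
}

impl MiniBackend for EdgeListBackend {
    fn insert_node(&mut self, _name: &str) -> u64 {
        self.node_count += 1;
        self.node_count
    }
    fn insert_edge(&mut self, from: u64, to: u64) {
        self.edges.push((from, to));
    }
    fn neighbors(&self, node: u64) -> Vec<u64> {
        self.edges
            .iter()
            .filter(|(f, _)| *f == node)
            .map(|(_, t)| *t)
            .collect()
    }
}

/// Caller code depends only on the trait, not on the concrete backend.
pub fn build_and_query(backend: &mut dyn MiniBackend) -> Vec<u64> {
    let a = backend.insert_node("a");
    let b = backend.insert_node("b");
    let c = backend.insert_node("c");
    backend.insert_edge(a, b);
    backend.insert_edge(a, c);
    backend.neighbors(a)
}
```

Because `build_and_query` only sees the trait, swapping the storage strategy requires no change to the calling code, which is the property the Quick Start relies on.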

§Backend Selection

§Feature Matrix

Feature            | SQLite Backend      | Native Backend
-------------------|---------------------|--------------------
ACID Transactions  | ✅ Full             | ✅ WAL-based
Graph Algorithms   | ✅ Full support     | ✅ Full support
HNSW Vector Search | ✅ With persistence | ✅ In-memory
MVCC Snapshots     |                     |
Pattern Matching   |                     |
Raw SQL Access     | ✅ Native           | ❌ Not supported
File Format        | SQLite DB           | Custom binary
Startup Time       | Fast                | Faster
Dependencies       | libsqlite3          | None (pure Rust)
Write Performance  | Good                | Better
Query Performance  | Good                | Better

§When to Use SQLite Backend

Choose SQLite backend when:

  • ACID guarantees are critical for your application
  • Raw SQL access needed for complex queries or joins
  • Database compatibility with SQLite tools (sqlite3, DB Browser)
  • Mature ecosystem with third-party tooling
  • HNSW persistence required (vectors survive restarts)

§When to Use Native Backend

Choose Native backend when:

  • Performance is critical (faster reads/writes)
  • No external dependencies desired (pure Rust)
  • Fast startup with large datasets
  • Custom binary format acceptable
  • In-memory HNSW index acceptable (vectors persist in a separate file)

§Thread Safety

§SqliteGraph is NOT Thread-Safe

SqliteGraph uses interior mutability (RefCell) and is not Sync:

use sqlitegraph::SqliteGraph;
use std::thread;

let graph = SqliteGraph::open("test.db")?;

// ❌ WRONG: Sharing the graph across threads for writes.
// SqliteGraph is not Sync, so the compiler rejects borrowing it from another thread.
thread::scope(|s| {
    s.spawn(|| graph.insert_node(...)); // compile error: RefCell is not Sync
});

// ✅ CORRECT: Use snapshots for concurrent reads
let snapshot = graph.snapshot()?;
thread::spawn(move || {
    let neighbors = snapshot.neighbors(node_id); // Thread-safe
});

§Concurrent Read Access

Use GraphSnapshot for thread-safe concurrent reads:

use sqlitegraph::{GraphSnapshot, SqliteGraph};

let graph = SqliteGraph::open("my_graph.db")?;

// Create multiple snapshots for concurrent reads
let snapshot1 = graph.snapshot()?;
let snapshot2 = graph.snapshot()?;

// Both snapshots can be used concurrently (thread-safe)
let handle1 = std::thread::spawn(move || {
    snapshot1.neighbors(node_id)
});

let handle2 = std::thread::spawn(move || {
    snapshot2.neighbors(node_id)
});
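
The snapshot pattern above can be sketched with the standard library alone: a single owner mutates the graph, and readers receive an immutable, reference-counted copy that is safe to share across threads. `OwnerGraph`, `Adjacency`, and `concurrent_neighbor_counts` are hypothetical names for this illustration; how `GraphSnapshot` is implemented internally is not described here, and a real implementation may well avoid the full copy this sketch makes.

```rust
use std::collections::HashMap;
use std::sync::Arc;
use std::thread;

pub type Adjacency = HashMap<u64, Vec<u64>>;

/// Hypothetical single-owner graph; `snapshot` hands out an immutable copy.
pub struct OwnerGraph {
    adjacency: Adjacency,
}

impl OwnerGraph {
    pub fn new() -> Self {
        Self { adjacency: HashMap::new() }
    }

    pub fn insert_edge(&mut self, from: u64, to: u64) {
        self.adjacency.entry(from).or_default().push(to);
    }

    /// Freeze the current state; later writes do not affect the snapshot.
    pub fn snapshot(&self) -> Arc<Adjacency> {
        Arc::new(self.adjacency.clone())
    }
}

/// Count the neighbors of each node in `nodes`, using one reader thread per node.
pub fn concurrent_neighbor_counts(snapshot: &Arc<Adjacency>, nodes: &[u64]) -> Vec<usize> {
    let handles: Vec<_> = nodes
        .iter()
        .map(|&node| {
            // Each thread gets its own handle to the same immutable data.
            let snap = Arc::clone(snapshot);
            thread::spawn(move || snap.get(&node).map_or(0, |v| v.len()))
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("reader thread panicked"))
        .collect()
}
```

Readers never observe a write made after their snapshot was taken, which is the read-isolation property the section describes.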

§Write Serialization

All writes must be serialized:

use std::sync::{Arc, Mutex};
use std::thread;

// ✅ CORRECT: Single thread for all writes
let graph = SqliteGraph::open("my_graph.db")?;
for _ in 0..1000 {
    graph.insert_node(...)?;
    graph.insert_edge(...)?;
}

// ❌ WRONG: Concurrent writes
let graph = Arc::new(Mutex::new(graph));
let g1 = Arc::clone(&graph);
let handle1 = thread::spawn(move || {
    let g = g1.lock().unwrap();
    g.insert_node(...)
});
let g2 = Arc::clone(&graph);
let handle2 = thread::spawn(move || {
    let g = g2.lock().unwrap();
    g.insert_node(...)
});
// Even with Mutex, this can cause issues due to RefCell
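
One common way to serialize writes when several threads produce data is to funnel every write through a channel to a single writer thread. The sketch below shows that pattern with `std::sync::mpsc`; `WriteOp` and `run_single_writer` are hypothetical, and the in-memory vectors stand in for whatever store the writer thread would own (for example, a graph handle opened on that thread).

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical write request; a real application would carry node and edge specs.
pub enum WriteOp {
    InsertNode(String),
    InsertEdge(usize, usize),
}

/// Spawns `producers` threads that each send `per_producer` node and edge inserts
/// to one writer thread. Returns how many nodes and edges the writer applied.
pub fn run_single_writer(producers: usize, per_producer: usize) -> (usize, usize) {
    let (tx, rx) = mpsc::channel::<WriteOp>();

    // The writer thread is the only place where the store is mutated.
    let writer = thread::spawn(move || {
        let mut nodes = Vec::new();
        let mut edges = Vec::new();
        // The loop ends once every sender has been dropped.
        for op in rx {
            match op {
                WriteOp::InsertNode(name) => nodes.push(name),
                WriteOp::InsertEdge(from, to) => edges.push((from, to)),
            }
        }
        (nodes.len(), edges.len())
    });

    let handles: Vec<_> = (0..producers)
        .map(|p| {
            let tx = tx.clone();
            thread::spawn(move || {
                for i in 0..per_producer {
                    tx.send(WriteOp::InsertNode(format!("p{p}-n{i}")))
                        .expect("writer thread is alive");
                    tx.send(WriteOp::InsertEdge(p, i))
                        .expect("writer thread is alive");
                }
            })
        })
        .collect();

    // Drop the original sender so the writer can finish after the producers do.
    drop(tx);
    for handle in handles {
        handle.join().expect("producer thread panicked");
    }
    writer.join().expect("writer thread panicked")
}
```

With this layout the non-thread-safe handle never leaves the writer thread, so no lock around it is needed.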

§Error Handling

All operations return Result<T, SqliteGraphError>:

use sqlitegraph::{SqliteGraph, SqliteGraphError};

let graph = SqliteGraph::open("my_graph.db")?;

match graph.insert_node(node_spec) {
    Ok(node_id) => println!("Created node {}", node_id),
    Err(SqliteGraphError::EntityNotFound) => {
        println!("Node not found");
    }
    Err(SqliteGraphError::DatabaseError(e)) => {
        eprintln!("Database error: {}", e);
    }
    Err(e) => {
        eprintln!("Other error: {}", e);
    }
}
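
For readers less familiar with enum-based errors, the following self-contained sketch shows the general shape of such a type and how `?` propagates it. `MiniGraphError`, `node_name`, and `edge_label` are hypothetical; only the two variant names are borrowed from the example above, and the real `SqliteGraphError` may carry different payloads and more variants.

```rust
use std::collections::HashMap;
use std::fmt;

/// Hypothetical error type; only the two variant names come from the example above.
#[derive(Debug, PartialEq)]
pub enum MiniGraphError {
    EntityNotFound,
    DatabaseError(String),
}

impl fmt::Display for MiniGraphError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            MiniGraphError::EntityNotFound => write!(f, "entity not found"),
            MiniGraphError::DatabaseError(msg) => write!(f, "database error: {msg}"),
        }
    }
}

impl std::error::Error for MiniGraphError {}

/// Look up a node name, failing with `EntityNotFound` for unknown ids.
pub fn node_name(nodes: &HashMap<u64, String>, id: u64) -> Result<&str, MiniGraphError> {
    nodes
        .get(&id)
        .map(String::as_str)
        .ok_or(MiniGraphError::EntityNotFound)
}

/// `?` forwards the first failure to the caller unchanged.
pub fn edge_label(
    nodes: &HashMap<u64, String>,
    from: u64,
    to: u64,
) -> Result<String, MiniGraphError> {
    let a = node_name(nodes, from)?;
    let b = node_name(nodes, to)?;
    Ok(format!("{a} -> {b}"))
}
```

Matching on specific variants, as in the example above, is then reserved for the call sites that can actually recover; everything else can simply use `?`.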

§Performance Comparison

§Read Performance

  • SQLite Backend: 10-100μs per neighbor lookup (cached: ~100ns)
  • Native Backend: 1-10μs per neighbor lookup (cached: ~100ns)
  • Cache hit ratio: 80-95% for traversal workloads
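
The figures above can be combined into a rough expected latency per lookup. The helper below is a back-of-the-envelope sketch, not a benchmark: it assumes the uncached figures are the cost of a cache miss, and the function name `expected_lookup_ns` is made up for this illustration.

```rust
/// Expected per-lookup latency in nanoseconds for a given cache hit ratio:
/// hit_ratio * hit_ns + (1 - hit_ratio) * miss_ns.
pub fn expected_lookup_ns(hit_ratio: f64, hit_ns: f64, miss_ns: f64) -> f64 {
    hit_ratio * hit_ns + (1.0 - hit_ratio) * miss_ns
}
```

For example, with a 90% hit ratio, a 100 ns hit, and an assumed 50 μs miss on the SQLite backend, the estimate is 0.9 × 100 + 0.1 × 50,000 = 5,090 ns (about 5.1 μs). With an assumed 5 μs miss on the Native backend it is 0.9 × 100 + 0.1 × 5,000 = 590 ns. In both cases the miss term dominates, so the hit ratio matters more than the cached lookup cost.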

§Write Performance

  • SQLite Backend: 100-500μs per insert (transaction-batched)
  • Native Backend: 10-100μs per insert (transaction-batched)
  • Bulk insert: 10-100x faster with bulk_insert_entities()

§Memory Usage

  • Base overhead: O(V + E) for graph storage
  • Cache overhead: 10-20% additional memory
  • HNSW index: 2-3x vector data size

§Public API Organization

This crate exports a clean, stable public API organized as follows:

§Core Types

§Configuration

§Operations

  • insert_node(), insert_edge() - Single entity/edge insertion
  • bulk_insert_entities(), bulk_insert_edges() - Batch operations
  • neighbors() - Direct neighbor queries
  • bfs(), k_hop(), shortest_path() - Graph traversal algorithms
  • pattern_engine - Pattern matching and triple storage
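
As an illustration of what a k-hop traversal computes, here is a minimal breadth-first sketch over a plain adjacency map. `k_hop_sketch` is a hypothetical helper written for this page; the crate's `bfs()` and `k_hop()` operate on a graph handle and their signatures are not shown here.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Nodes reachable from `start` in at most `k` hops, paired with their hop
/// distance, in breadth-first order. `start` itself is not included.
pub fn k_hop_sketch(adj: &HashMap<u64, Vec<u64>>, start: u64, k: usize) -> Vec<(u64, usize)> {
    let mut visited = HashSet::from([start]);
    let mut queue = VecDeque::from([(start, 0usize)]);
    let mut found = Vec::new();

    while let Some((node, depth)) = queue.pop_front() {
        // Do not expand nodes that already sit at the hop limit.
        if depth == k {
            continue;
        }
        if let Some(neighbors) = adj.get(&node) {
            for &next in neighbors {
                // `insert` returns false for nodes that were seen before.
                if visited.insert(next) {
                    found.push((next, depth + 1));
                    queue.push_back((next, depth + 1));
                }
            }
        }
    }
    found
}
```

Because breadth-first search visits nodes in order of increasing distance, the hop count recorded for each node is also its shortest-path length in an unweighted graph.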

§Graph Algorithms

§Utilities

Re-exports§

pub use graph_opt::GraphEdgeCreate;
pub use graph_opt::GraphEntityCreate;
pub use graph_opt::bulk_insert_edges;
pub use graph_opt::bulk_insert_entities;
pub use graph_opt::cache_stats;
pub use index::add_label;
pub use index::add_property;
pub use mvcc::GraphSnapshot;
pub use mvcc::SnapshotState;
pub use pattern_engine::PatternTriple;
pub use pattern_engine::TripleMatch;
pub use pattern_engine::match_triples;
pub use query::GraphQuery;
pub use recovery::dump_graph_to_path;
pub use recovery::load_graph_from_path;
pub use recovery::load_graph_from_reader;
pub use snapshot::SnapshotId;
pub use backend::BackendDirection;
pub use backend::ChainStep;
pub use backend::GraphBackend;
pub use backend::BackupResult;
pub use backend::EdgeSpec;
pub use backend::NativeGraphBackend;
pub use backend::NeighborQuery;
pub use backend::NodeSpec;
pub use backend::SqliteGraphBackend;
pub use config::BackendKind;
pub use config::GraphConfig;
pub use config::NativeConfig;
pub use config::SqliteConfig;
pub use config::open_graph;
pub use errors::SqliteGraphError;
pub use graph::GraphEdge;
pub use graph::GraphEntity;
pub use graph::SqliteGraph;
pub use algo::betweenness_centrality;
pub use algo::betweenness_centrality_with_progress;
pub use algo::label_propagation;
pub use algo::louvain_communities;
pub use algo::louvain_communities_with_progress;
pub use algo::pagerank;
pub use algo::pagerank_with_progress;
pub use progress::ConsoleProgress;
pub use progress::NoProgress;
pub use progress::ProgressCallback;
pub use progress::ProgressState;
pub use introspection::EdgeCount;
pub use introspection::GraphIntrospection;
pub use introspection::IntrospectError;
pub use cache::CacheStats;

Modules§

algo
Graph algorithms for centrality, community detection, and structure analysis.
backend
Backend trait bridging sqlitegraph with higher-level graph consumers.
backend_selector
bench_gates
bench_meta
bench_regression
bench_utils
bfs
cache
LRU-K adjacency cache for graph traversal optimization.
config
Configuration for backend selection and backend-specific options.
debug
Centralized debug logging with feature flag control
dsl
errors
graph
SQLite-backed graph database implementation.
graph_opt
hnsw
Hierarchical Navigable Small World (HNSW) Vector Search
index
introspection
Graph introspection APIs for debugging and observability.
multi_hop
mvcc
MVCC-lite snapshot system for SQLiteGraph
pattern
pattern_engine
Lightweight triple pattern matcher for SQLiteGraph.
progress
Progress tracking for long-running operations.
query
query_cache
High-level query cache layer for SQLiteGraph.
recovery
schema
snapshot
Snapshot isolation for ACID compliance

Macros§

checkpoint_error
Macro for creating checkpoint errors with context
recovery_error

Structs§

Label
NodeId
PropertyKey
PropertyValue

Functions§

match_triples_fast
Execute cache-enabled fast-path pattern matching.
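
To give a feel for triple pattern matching in general, the sketch below filters (source, edge type, target) triples against a pattern whose unset fields act as wildcards. `Triple`, `Pattern`, and `match_pattern` are hypothetical and deliberately simple; they are not the crate's `PatternTriple` or `match_triples_fast`, and the sketch has no cache-backed fast path.

```rust
/// Hypothetical edge triple: (source node, edge type, target node).
pub type Triple = (u64, &'static str, u64);

/// Pattern with optional constraints; `None` acts as a wildcard.
#[derive(Default)]
pub struct Pattern {
    pub source: Option<u64>,
    pub edge_type: Option<&'static str>,
    pub target: Option<u64>,
}

/// Return every triple that satisfies all constraints set in `pattern`,
/// preserving the input order.
pub fn match_pattern(triples: &[Triple], pattern: &Pattern) -> Vec<Triple> {
    triples
        .iter()
        .copied()
        .filter(|&(source, edge_type, target)| {
            pattern.source.map_or(true, |want| want == source)
                && pattern.edge_type.map_or(true, |want| want == edge_type)
                && pattern.target.map_or(true, |want| want == target)
        })
        .collect()
}
```

This linear scan is the baseline that an index or adjacency cache would improve on by narrowing the candidate set before the constraints are checked.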