SQLite-based graph database with unified backend support.
sqlitegraph provides a lightweight, deterministic graph database for embedded Rust applications.
It supports both SQLite and Native storage backends through a unified API.
§Architecture
The crate is organized into focused modules:
sqlitegraph/
├── graph/ # Core graph database (SqliteGraph, GraphEntity, GraphEdge)
├── backend/ # Unified backend trait (GraphBackend, SqliteGraphBackend, NativeGraphBackend)
├── algo/ # Graph algorithms (PageRank, Betweenness, Louvain, Label Propagation)
├── hnsw/ # Vector similarity search (HNSW index, distance metrics)
├── cache/ # LRU-K adjacency cache for traversal optimization
├── introspection/ # Debugging and observability APIs
├── progress/ # Progress tracking for long-running operations
├── mvcc/ # MVCC-lite snapshot system
├── pattern_engine/ # Triple pattern matching
├── query/ # High-level query interface
└── recovery/ # Backup and restore utilities
§Features
- Dual Backend Support: Choose between SQLite (feature-rich) and Native (performance-optimized) backends
- Entity and Edge Storage: Rich metadata support with JSON serialization
- Pattern Matching: Efficient triple pattern matching with cache-enabled fast-path
- Traversal Algorithms: Built-in BFS, k-hop, and shortest path algorithms
- Graph Algorithms: PageRank, Betweenness Centrality, Louvain, Label Propagation
- Vector Search: HNSW approximate nearest neighbor search with persistence
- MVCC Snapshots: Read isolation with snapshot consistency
- Bulk Operations: High-performance batch insertions for large datasets
- Introspection: Debugging APIs for cache stats, file sizes, edge counts
- Progress Tracking: Callback-based progress for long-running algorithms
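To make the traversal feature concrete, here is a minimal sketch of a k-hop breadth-first search over a plain in-memory adjacency map. It does not use the crate's API: the `Adjacency` alias and the `k_hop` function shown here are hypothetical names chosen for illustration, and the crate's own traversal functions may have different signatures and semantics.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Hypothetical in-memory adjacency list (assumption: not the crate's storage format).
pub type Adjacency = HashMap<u64, Vec<u64>>;

/// Returns every node reachable from `start` in at most `k` hops,
/// excluding `start` itself, in breadth-first discovery order.
pub fn k_hop(adj: &Adjacency, start: u64, k: usize) -> Vec<u64> {
    let mut visited: HashSet<u64> = HashSet::from([start]);
    let mut queue: VecDeque<(u64, usize)> = VecDeque::from([(start, 0)]);
    let mut reached = Vec::new();

    while let Some((node, depth)) = queue.pop_front() {
        if depth == k {
            continue; // hop limit reached: do not expand this node
        }
        if let Some(neighbors) = adj.get(&node) {
            for &next in neighbors {
                // `insert` returns true only the first time a node is seen
                if visited.insert(next) {
                    reached.push(next);
                    queue.push_back((next, depth + 1));
                }
            }
        }
    }
    reached
}
```

Tracking the depth alongside each queued node is what turns an ordinary BFS into a k-hop query; with `k` set to the graph diameter it degenerates to a full reachability search.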
§Quick Start
use sqlitegraph::{open_graph, GraphConfig, BackendKind};
// Use SQLite backend (default)
let cfg = GraphConfig::sqlite();
let graph = open_graph("my_graph.db", &cfg)?;
// Or use Native backend
let cfg = GraphConfig::native();
let graph = open_graph("my_graph.db", &cfg)?;
// Both backends support the same operations
let node_id = graph.insert_node(/* node spec */)?;
let neighbor_ids = graph.neighbors(node_id, /* query */)?;
§Backend Selection
§Feature Matrix
| Feature | SQLite Backend | Native Backend |
|---|---|---|
| ACID Transactions | ✅ Full | ✅ WAL-based |
| Graph Algorithms | ✅ Full support | ✅ Full support |
| HNSW Vector Search | ✅ With persistence | ✅ In-memory |
| MVCC Snapshots | ✅ | ✅ |
| Pattern Matching | ✅ | ✅ |
| Raw SQL Access | ✅ Native | ❌ Not supported |
| File Format | SQLite DB | Custom binary |
| Startup Time | Fast | Faster |
| Dependencies | libsqlite3 | None (pure Rust) |
| Write Performance | Good | Better |
| Query Performance | Good | Better |
§When to Use SQLite Backend
Choose SQLite backend when:
- ACID guarantees are critical for your application
- Raw SQL access needed for complex queries or joins
- Database compatibility with SQLite tools (sqlite3, DB Browser)
- Mature ecosystem with third-party tooling
- HNSW persistence required (vectors survive restarts)
§When to Use Native Backend
Choose Native backend when:
- Performance is critical (faster reads/writes)
- No external dependencies desired (pure Rust)
- Fast startup with large datasets
- Custom binary format acceptable
- HNSW in-memory only (vectors persist in separate file)
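The idea of choosing a backend at runtime behind one API can be sketched with a trait object. Everything below is a hypothetical stand-in: `Backend`, `SqlLike`, `NativeLike`, `Kind`, and `open` are illustrative names, both "backends" are simple in-memory structures, and none of this reflects how `GraphBackend`, `BackendKind`, or `open_graph` are actually implemented.

```rust
use std::collections::HashMap;

/// Hypothetical unified interface (assumption: not the crate's `GraphBackend`).
pub trait Backend {
    fn name(&self) -> &'static str;
    fn insert_node(&mut self, label: &str) -> u64;
    fn node_label(&self, id: u64) -> Option<&str>;
}

/// Stand-in for a row-oriented store: ids are 1-based row numbers.
#[derive(Default)]
pub struct SqlLike {
    rows: Vec<String>,
}

impl Backend for SqlLike {
    fn name(&self) -> &'static str {
        "sql-like"
    }
    fn insert_node(&mut self, label: &str) -> u64 {
        self.rows.push(label.to_string());
        self.rows.len() as u64
    }
    fn node_label(&self, id: u64) -> Option<&str> {
        let index = id.checked_sub(1)? as usize; // id 0 is never valid
        self.rows.get(index).map(String::as_str)
    }
}

/// Stand-in for a custom store keyed directly by node id.
#[derive(Default)]
pub struct NativeLike {
    nodes: HashMap<u64, String>,
    next_id: u64,
}

impl Backend for NativeLike {
    fn name(&self) -> &'static str {
        "native-like"
    }
    fn insert_node(&mut self, label: &str) -> u64 {
        self.next_id += 1;
        self.nodes.insert(self.next_id, label.to_string());
        self.next_id
    }
    fn node_label(&self, id: u64) -> Option<&str> {
        self.nodes.get(&id).map(String::as_str)
    }
}

/// Hypothetical runtime selector, analogous in spirit to a backend-kind enum.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Kind {
    Sql,
    Native,
}

/// Factory returning whichever backend was requested behind the shared trait.
pub fn open(kind: Kind) -> Box<dyn Backend> {
    match kind {
        Kind::Sql => Box::new(SqlLike::default()),
        Kind::Native => Box::new(NativeLike::default()),
    }
}
```

Calling code that only touches `Box<dyn Backend>` does not change when the `Kind` passed to `open` changes, which is the property the unified API is meant to provide.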
§Thread Safety
§SqliteGraph is NOT Thread-Safe
SqliteGraph uses interior mutability (RefCell) and is not Sync:
use sqlitegraph::SqliteGraph;
use std::thread;
let graph = SqliteGraph::open("test.db")?;
// ✅ CORRECT: Use snapshots for concurrent reads
let snapshot = graph.snapshot()?;
thread::spawn(move || {
    let neighbors = snapshot.neighbors(node_id)?; // Thread-safe
});
// ❌ WRONG: Sharing graph across threads for writes
// (shown last because the move below consumes `graph`)
let graph_clone = graph;
thread::spawn(move || {
    graph_clone.insert_node(...)?; // DATA RACE!
});
§Concurrent Read Access
Use GraphSnapshot for thread-safe concurrent reads:
use sqlitegraph::{GraphSnapshot, SqliteGraph};
let graph = SqliteGraph::open("my_graph.db")?;
// Create multiple snapshots for concurrent reads
let snapshot1 = graph.snapshot()?;
let snapshot2 = graph.snapshot()?;
// Both snapshots can be used concurrently (thread-safe)
let handle1 = std::thread::spawn(move || {
    snapshot1.neighbors(node_id)
});
let handle2 = std::thread::spawn(move || {
    snapshot2.neighbors(node_id)
});
§Write Serialization
All writes must be serialized:
use sqlitegraph::SqliteGraph;
use std::sync::{Arc, Mutex};
use std::thread;
// ✅ CORRECT: Single thread for all writes
let graph = SqliteGraph::open("my_graph.db")?;
for _ in 0..1000 {
    graph.insert_node(...)?;
    graph.insert_edge(...)?;
}
// ❌ WRONG: Concurrent writes
let graph = Arc::new(Mutex::new(graph));
let g1 = Arc::clone(&graph);
let handle1 = thread::spawn(move || {
    let g = g1.lock().unwrap();
    g.insert_node(...)
});
let g2 = Arc::clone(&graph);
let handle2 = thread::spawn(move || {
    let g = g2.lock().unwrap();
    g.insert_node(...)
});
// Even with Mutex, this can cause issues due to RefCell
§Error Handling
All operations return Result<T, SqliteGraphError>:
use sqlitegraph::{SqliteGraph, SqliteGraphError};
let graph = SqliteGraph::open("my_graph.db")?;
match graph.insert_node(node_spec) {
    Ok(node_id) => println!("Created node {}", node_id),
    Err(SqliteGraphError::EntityNotFound) => {
        println!("Node not found");
    }
    Err(SqliteGraphError::DatabaseError(e)) => {
        eprintln!("Database error: {}", e);
    }
    Err(e) => {
        eprintln!("Other error: {}", e);
    }
}
§Performance Comparison
§Read Performance
- SQLite Backend: 10-100μs per neighbor lookup (cached: ~100ns)
- Native Backend: 1-10μs per neighbor lookup (cached: ~100ns)
- Cache hit ratio: 80-95% for traversal workloads
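To see how the cache hit ratio combines with the per-lookup figures above, the expected latency is a weighted average of the cached and uncached costs. The helper below is a hypothetical illustration (the name `expected_lookup_ns` is not part of the crate), and the inputs in the usage note are simply values picked from the ranges quoted above, not measurements.

```rust
/// Expected per-lookup latency in nanoseconds for a given cache hit ratio.
///
/// `hit_ratio` must be in `0.0..=1.0`; `cached_ns` and `uncached_ns` are the
/// costs of a cache hit and a cache miss respectively.
pub fn expected_lookup_ns(hit_ratio: f64, cached_ns: f64, uncached_ns: f64) -> f64 {
    assert!(
        (0.0..=1.0).contains(&hit_ratio),
        "hit_ratio must be between 0 and 1"
    );
    hit_ratio * cached_ns + (1.0 - hit_ratio) * uncached_ns
}
```

For example, with a 90% hit ratio, 100 ns per cached lookup, and 10 μs (10,000 ns) per uncached lookup, the weighted average is 0.9 × 100 + 0.1 × 10,000 = 1,090 ns, so misses still dominate the expected cost even though they are the minority of lookups.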
§Write Performance
- SQLite Backend: 100-500μs per insert (transaction-batched)
- Native Backend: 10-100μs per insert (transaction-batched)
- Bulk insert: 10-100x faster with bulk_insert_entities()
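One common reason batch insertion is faster is that a fixed per-commit cost is paid once per batch instead of once per row. The sketch below models only that effect with a hypothetical `CountingStore` that counts commits; it is an assumption about the general mechanism, not a description of how `bulk_insert_entities()` works internally.

```rust
/// Hypothetical store that records how many commits were issued.
#[derive(Default)]
pub struct CountingStore {
    pub rows: Vec<u64>,
    pub commits: usize,
}

impl CountingStore {
    /// Inserts a single row and commits immediately (one commit per row).
    pub fn insert_one(&mut self, row: u64) {
        self.rows.push(row);
        self.commits += 1;
    }

    /// Inserts a whole batch under a single commit. Empty batches are a no-op.
    pub fn insert_batch(&mut self, rows: &[u64]) {
        if rows.is_empty() {
            return;
        }
        self.rows.extend_from_slice(rows);
        self.commits += 1;
    }
}

/// Loads `rows` in chunks of at most `batch_size`, committing once per chunk.
pub fn load_in_batches(store: &mut CountingStore, rows: &[u64], batch_size: usize) {
    assert!(batch_size > 0, "batch_size must be positive");
    for chunk in rows.chunks(batch_size) {
        store.insert_batch(chunk);
    }
}
```

Loading 1,000 rows with a batch size of 100 issues 10 commits in this model, versus 1,000 commits when each row is inserted on its own.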
§Memory Usage
- Base overhead: O(V + E) for graph storage
- Cache overhead: 10-20% additional memory
- HNSW index: 2-3x vector data size
§Public API Organization
This crate exports a clean, stable public API organized as follows:
§Core Types
- GraphEntity - Graph node/vertex representation
- GraphEdge - Graph edge/relationship representation
- GraphBackend - Unified trait for backend implementations
- SqliteGraphBackend - SQLite backend implementation
- NativeGraphBackend - Native backend implementation
§Configuration
- BackendKind - Runtime backend selection enum
- GraphConfig - Unified configuration for both backends
- SqliteConfig - SQLite-specific options
- NativeConfig - Native-specific options
- open_graph() - Unified factory function
§Operations
- insert_node(), insert_edge() - Single entity/edge insertion
- bulk_insert_entities(), bulk_insert_edges() - Batch operations
- neighbors() - Direct neighbor queries
- bfs(), k_hop(), shortest_path() - Graph traversal algorithms
- pattern_engine - Pattern matching and triple storage
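As a rough picture of what triple pattern matching means, the sketch below filters (subject, predicate, object) triples against a pattern in which any position may be a wildcard. The `Triple`, `Pattern`, and `match_pattern` names are hypothetical and deliberately distinct from the crate's `PatternTriple`, `TripleMatch`, and `match_triples`; this linear scan also ignores the cache-enabled fast path the crate describes.

```rust
/// Hypothetical triple of plain strings (assumption: not the crate's type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Triple {
    pub subject: String,
    pub predicate: String,
    pub object: String,
}

/// A pattern where `None` in any position acts as a wildcard.
#[derive(Debug, Clone, Copy, Default)]
pub struct Pattern<'a> {
    pub subject: Option<&'a str>,
    pub predicate: Option<&'a str>,
    pub object: Option<&'a str>,
}

/// True when the pattern position is a wildcard or equals the actual value.
fn field_matches(want: Option<&str>, got: &str) -> bool {
    want.map_or(true, |w| w == got)
}

/// Returns references to every triple matching `pattern`, in input order.
pub fn match_pattern<'t>(triples: &'t [Triple], pattern: &Pattern<'_>) -> Vec<&'t Triple> {
    triples
        .iter()
        .filter(|t| {
            field_matches(pattern.subject, &t.subject)
                && field_matches(pattern.predicate, &t.predicate)
                && field_matches(pattern.object, &t.object)
        })
        .collect()
}
```

A pattern with only `predicate: Some("knows")` set returns every "knows" edge regardless of endpoints, while filling in more positions narrows the result.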
§Graph Algorithms
- pagerank - PageRank centrality
- betweenness_centrality - Betweenness centrality
- louvain_communities - Louvain community detection
- label_propagation - Label propagation algorithm
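For readers unfamiliar with PageRank, the following is a minimal power-iteration sketch over an index-based adjacency list. It is a textbook formulation written for illustration; the function name `pagerank_sketch`, its parameters, and its handling of dangling nodes are assumptions and may differ from the crate's `pagerank` function.

```rust
/// Textbook PageRank by power iteration.
///
/// `out_edges[i]` lists the targets of node `i`; every target must be a valid
/// index into `out_edges`. `damping` is typically around 0.85. Rank held by
/// nodes with no outgoing edges is redistributed uniformly each round.
pub fn pagerank_sketch(out_edges: &[Vec<usize>], damping: f64, iterations: usize) -> Vec<f64> {
    let n = out_edges.len();
    if n == 0 {
        return Vec::new();
    }
    let n_f = n as f64;
    let mut rank = vec![1.0 / n_f; n];

    for _ in 0..iterations {
        // Every node starts the round with the "teleport" share.
        let mut next = vec![(1.0 - damping) / n_f; n];
        let mut dangling = 0.0;

        for (src, targets) in out_edges.iter().enumerate() {
            if targets.is_empty() {
                dangling += rank[src];
            } else {
                let share = damping * rank[src] / targets.len() as f64;
                for &dst in targets {
                    next[dst] += share;
                }
            }
        }

        // Spread the rank of dangling nodes evenly so the total stays 1.
        let dangling_share = damping * dangling / n_f;
        for r in next.iter_mut() {
            *r += dangling_share;
        }
        rank = next;
    }
    rank
}
```

On a directed 3-cycle every node ends up with rank 1/3, while on a graph where two nodes both point at a third, the third node receives the highest rank.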
§Vector Search
- hnsw::HnswIndex - HNSW vector search index
- hnsw::HnswConfig - HNSW configuration
- hnsw::DistanceMetric - Distance metrics (Cosine, Euclidean, etc.)
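The two distance metrics named above can be written in a few lines. The functions below are standalone illustrations using the standard definitions; the names `euclidean` and `cosine_distance` and the choice to return 1.0 for zero-length vectors are assumptions, not the behavior of `hnsw::DistanceMetric`.

```rust
/// Euclidean (L2) distance between two equal-length vectors.
pub fn euclidean(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y) * (x - y))
        .sum::<f32>()
        .sqrt()
}

/// Cosine distance, defined here as `1 - cosine_similarity`.
/// Returns 1.0 when either vector has zero norm (a convention chosen for this sketch).
pub fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 1.0;
    }
    1.0 - dot / (norm_a * norm_b)
}
```

Euclidean distance is sensitive to vector magnitude, whereas cosine distance depends only on direction, so `[1, 0]` and `[2, 0]` are 1 apart under the former and 0 apart under the latter.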
§Utilities
- SqliteGraphError - Comprehensive error handling
- GraphSnapshot - MVCC snapshot system
- GraphIntrospection - Introspection and debugging APIs
- ProgressCallback - Algorithm progress tracking
- recovery - Database backup and restore utilities
Re-exports§
pub use graph_opt::GraphEdgeCreate;
pub use graph_opt::GraphEntityCreate;
pub use graph_opt::bulk_insert_edges;
pub use graph_opt::bulk_insert_entities;
pub use graph_opt::cache_stats;
pub use index::add_label;
pub use index::add_property;
pub use mvcc::GraphSnapshot;
pub use mvcc::SnapshotState;
pub use pattern_engine::PatternTriple;
pub use pattern_engine::TripleMatch;
pub use pattern_engine::match_triples;
pub use query::GraphQuery;
pub use recovery::dump_graph_to_path;
pub use recovery::load_graph_from_path;
pub use recovery::load_graph_from_reader;
pub use snapshot::SnapshotId;
pub use backend::BackendDirection;
pub use backend::ChainStep;
pub use backend::GraphBackend;
pub use backend::BackupResult;
pub use backend::EdgeSpec;
pub use backend::NativeGraphBackend;
pub use backend::NeighborQuery;
pub use backend::NodeSpec;
pub use backend::SqliteGraphBackend;
pub use config::BackendKind;
pub use config::GraphConfig;
pub use config::NativeConfig;
pub use config::SqliteConfig;
pub use config::open_graph;
pub use errors::SqliteGraphError;
pub use graph::GraphEdge;
pub use graph::GraphEntity;
pub use graph::SqliteGraph;
pub use algo::betweenness_centrality;
pub use algo::betweenness_centrality_with_progress;
pub use algo::label_propagation;
pub use algo::louvain_communities;
pub use algo::louvain_communities_with_progress;
pub use algo::pagerank;
pub use algo::pagerank_with_progress;
pub use progress::ConsoleProgress;
pub use progress::NoProgress;
pub use progress::ProgressCallback;
pub use progress::ProgressState;
pub use introspection::EdgeCount;
pub use introspection::GraphIntrospection;
pub use introspection::IntrospectError;
pub use cache::CacheStats;
Modules§
- algo - Graph algorithms for centrality, community detection, and structure analysis.
- backend - Backend trait bridging sqlitegraph with higher-level graph consumers.
- backend_selector
- bench_gates
- bench_meta
- bench_regression
- bench_utils
- bfs
- cache - LRU-K adjacency cache for graph traversal optimization.
- config - Configuration for backend selection and backend-specific options.
- debug - Centralized debug logging with feature flag control
- dsl
- errors
- graph - SQLite-backed graph database implementation.
- graph_opt
- hnsw - Hierarchical Navigable Small World (HNSW) Vector Search
- index
- introspection - Graph introspection APIs for debugging and observability.
- multi_hop
- mvcc - MVCC-lite snapshot system for SQLiteGraph
- pattern
- pattern_engine - Lightweight triple pattern matcher for SQLiteGraph.
- progress - Progress tracking for long-running operations.
- query
- query_cache - High-level query cache layer for SQLiteGraph.
- recovery
- schema
- snapshot - Snapshot isolation for ACID compliance
Macros§
- checkpoint_error - Macro for creating checkpoint errors with context
- recovery_error
Structs§
Functions§
- match_triples_fast - Execute cache-enabled fast-path pattern matching.