SQLiteGraph
Deterministic, embedded graph database with SQLite and high-performance Native V2 backends.
Current Status: Active Development
SQLiteGraph provides two backend options for different use cases:
- SQLite Backend: Mature, ACID-compliant, WAL mode enabled by default for concurrent performance (file-based databases)
- Native V2 Backend: High-performance, clustered adjacency with complete WAL implementation for transactional durability (direct file I/O)
Features
Dual Backend Architecture
- SQLite Backend: Traditional SQLite storage with full ACID transactions and WAL mode for concurrency
- Native V2 Backend: Custom binary format with clustered adjacency for maximum performance
- Backend Abstraction: Unified API works with either backend
- Easy Migration: Switch backends with configuration changes
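The backend abstraction described above can be pictured as a trait that calling code depends on, with each backend supplying its own implementation. The following is a minimal sketch of that pattern only: `GraphStore`, `MemoryStore`, and `build_chain` are hypothetical names made up for illustration and are not the crate's actual API.

```rust
use std::collections::BTreeMap;

/// Hypothetical storage interface (an assumption, not the crate's real trait).
pub trait GraphStore {
    /// Stores a node and returns its newly assigned id.
    fn insert_node(&mut self, name: &str) -> u64;
    /// Looks up the name stored for a node id.
    fn node_name(&self, id: u64) -> Option<String>;
    /// Stores a directed edge between two node ids.
    fn insert_edge(&mut self, from: u64, to: u64);
    /// Returns outgoing neighbor ids in ascending order.
    fn neighbors(&self, id: u64) -> Vec<u64>;
}

/// Toy in-memory implementation standing in for a real backend.
#[derive(Default)]
pub struct MemoryStore {
    names: BTreeMap<u64, String>,
    out_edges: BTreeMap<u64, Vec<u64>>,
    next_id: u64,
}

impl GraphStore for MemoryStore {
    fn insert_node(&mut self, name: &str) -> u64 {
        self.next_id += 1;
        self.names.insert(self.next_id, name.to_string());
        self.next_id
    }

    fn node_name(&self, id: u64) -> Option<String> {
        self.names.get(&id).cloned()
    }

    fn insert_edge(&mut self, from: u64, to: u64) {
        self.out_edges.entry(from).or_default().push(to);
    }

    fn neighbors(&self, id: u64) -> Vec<u64> {
        let mut out = self.out_edges.get(&id).cloned().unwrap_or_default();
        out.sort_unstable();
        out
    }
}

/// Caller code sees only the trait, so the backend behind it can be swapped.
pub fn build_chain(store: &mut dyn GraphStore, names: &[&str]) -> Vec<u64> {
    let ids: Vec<u64> = names.iter().map(|n| store.insert_node(n)).collect();
    for pair in ids.windows(2) {
        store.insert_edge(pair[0], pair[1]);
    }
    ids
}
```

Because `build_chain` takes `&mut dyn GraphStore`, a second implementation backed by a different storage engine could be passed in without changing the caller.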
WAL Mode (Both Backends)
- SQLite Backend: WAL mode enabled by default for all file-based databases
- Native V2 Backend: Complete WAL implementation with cluster-affinity logging
- Automatic Enablement: WAL enabled by default (SQLite) or available via config (Native V2)
- Concurrent Performance: 30-50% improvement for concurrent read/write workloads
- Crash Recovery: Full transaction recovery from WAL logs
- ACID Compliance: Full transaction support with rollback capabilities
- Automatic File Management: WAL and SHM files created and managed automatically
- Network Filesystem Support: Graceful fallback to DELETE mode when WAL is unsupported
Core Graph Operations
- Entity Management: Insert, update, retrieve, delete graph entities
- Edge Management: Create and manage relationships between entities
- JSON Data Storage: Arbitrary JSON metadata with entities and edges
- Deterministic Operations: Consistent ordering and behavior
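To illustrate what deterministic ordering means for entity CRUD, here is a small standalone sketch that keeps entities in a `BTreeMap`, so iteration order depends only on the ids and not on insertion order. The `Entity` and `EntityTable` types are hypothetical, and the JSON payload is held as raw text to keep the example dependency-free; the crate's real types may look quite different.

```rust
use std::collections::BTreeMap;

/// Hypothetical entity record; `data` holds a JSON document as raw text.
#[derive(Debug, Clone, PartialEq)]
pub struct Entity {
    pub id: u64,
    pub kind: String,
    pub data: String,
}

/// Ordered table of entities keyed by id.
#[derive(Default)]
pub struct EntityTable {
    rows: BTreeMap<u64, Entity>,
}

impl EntityTable {
    /// Inserts or replaces an entity, returning the previous value if any.
    pub fn upsert(&mut self, entity: Entity) -> Option<Entity> {
        self.rows.insert(entity.id, entity)
    }

    /// Retrieves an entity by id.
    pub fn get(&self, id: u64) -> Option<&Entity> {
        self.rows.get(&id)
    }

    /// Deletes an entity; returns true if it existed.
    pub fn delete(&mut self, id: u64) -> bool {
        self.rows.remove(&id).is_some()
    }

    /// Ids in ascending order, independent of insertion order.
    pub fn ids(&self) -> Vec<u64> {
        self.rows.keys().copied().collect()
    }
}
```

A hash map would give the same CRUD behavior but an unspecified iteration order, which is the property an ordered map avoids.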
Traversal & Querying
- Neighbor Queries: Get incoming/outgoing connections
- Pattern Matching: Advanced graph pattern queries with fast-path caching
- Traversal Algorithms: BFS, shortest path, connected components
- Query Cache: Cached K-hop and shortest path queries
- Reasoning Pipelines: Multi-step analysis with filtering and scoring
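As a rough picture of the BFS and shortest-path traversals listed above, the sketch below runs a breadth-first search over a plain adjacency map and reconstructs the path from parent links. It is a generic textbook BFS for unweighted directed graphs, not the crate's implementation; `Adjacency` and `shortest_path` are illustrative names.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

/// Outgoing neighbors per node id (illustrative representation).
pub type Adjacency = BTreeMap<u64, Vec<u64>>;

/// Returns a shortest path (fewest edges) from `start` to `goal`, if one exists.
pub fn shortest_path(graph: &Adjacency, start: u64, goal: u64) -> Option<Vec<u64>> {
    let mut parent: BTreeMap<u64, u64> = BTreeMap::new();
    let mut seen: BTreeSet<u64> = BTreeSet::new();
    let mut queue: VecDeque<u64> = VecDeque::new();

    seen.insert(start);
    queue.push_back(start);

    while let Some(node) = queue.pop_front() {
        if node == goal {
            // Walk parent links back to the start, then reverse.
            let mut path = vec![goal];
            let mut cur = goal;
            while let Some(&p) = parent.get(&cur) {
                path.push(p);
                cur = p;
            }
            path.reverse();
            return Some(path);
        }
        if let Some(next) = graph.get(&node) {
            for &n in next {
                // `insert` returns true only the first time a node is seen.
                if seen.insert(n) {
                    parent.insert(n, node);
                    queue.push_back(n);
                }
            }
        }
    }
    None
}
```

Neighbors are visited in the order they are stored, so for a fixed adjacency map the returned path is the same on every run.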
HNSW Vector Search
- Approximate Nearest Neighbor: O(log N) search complexity
- High Performance: In-memory vector index with 95%+ recall
- Multiple Distance Metrics: Cosine, Euclidean, Dot Product, Manhattan
- SIMD Optimized: AVX2/AVX-512 support for distance calculations
- Dynamic Updates: Insert and delete vectors without full rebuilds
- Configuration: Flexible HNSW parameters for accuracy/speed tradeoffs
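The four distance metrics can be written out directly. The sketch below is a plain scalar version with an exact linear-scan search, intended only to show what each metric computes; it is not HNSW, uses no SIMD, and the names `Metric`, `distance`, and `nearest` are assumptions rather than the crate's API. Dot product is negated so that, as with the other metrics, a smaller value means a closer match.

```rust
/// Illustrative metric selector (hypothetical, not the crate's enum).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Metric {
    Cosine,
    Euclidean,
    DotProduct,
    Manhattan,
}

/// Distance between two equal-length vectors; smaller means closer.
pub fn distance(metric: Metric, a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum::<f32>();
    match metric {
        Metric::Cosine => {
            let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
            let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
            // Treat a zero vector as maximally dissimilar to avoid dividing by zero.
            if na == 0.0 || nb == 0.0 {
                1.0
            } else {
                1.0 - dot / (na * nb)
            }
        }
        Metric::Euclidean => a
            .iter()
            .zip(b)
            .map(|(x, y)| (x - y) * (x - y))
            .sum::<f32>()
            .sqrt(),
        // Negated so that a larger dot product gives a smaller distance.
        Metric::DotProduct => -dot,
        Metric::Manhattan => a.iter().zip(b).map(|(x, y)| (x - y).abs()).sum::<f32>(),
    }
}

/// Exact nearest neighbor by linear scan: the O(N) baseline that an
/// approximate index such as HNSW is designed to avoid.
pub fn nearest(metric: Metric, query: &[f32], items: &[Vec<f32>]) -> Option<usize> {
    items
        .iter()
        .enumerate()
        .map(|(i, v)| (i, distance(metric, query, v)))
        .min_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(i, _)| i)
}
```

A linear scan like `nearest` is also a convenient ground truth when measuring the recall of an approximate index.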
Bulk Operations & Snapshots
- Bulk Insert: High-performance batch entity and edge insertion
- Snapshot Export: Atomic graph snapshot creation with 70%+ storage efficiency
- Snapshot Import: Fast graph restoration from snapshots
- Cross-Platform: Platform-independent binary snapshot format
MVCC & Transactions
- MVCC Snapshots: Read isolation with snapshot consistency
- Transaction Support: Full ACID transactions (SQLite) or WAL transactions (Native V2)
- Rollback: Complete transaction rollback capabilities
Performance & Safety
- Benchmark Gates: Automated performance regression prevention
- Safety Tools: Orphan edge detection, integrity validation
- Memory Management: Configurable caching and buffer management
- Error Handling: Comprehensive error reporting and recovery
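Orphan edge detection, one of the safety tools mentioned above, amounts to finding edges whose source or target no longer exists. The following standalone sketch shows the idea over in-memory data; the `Edge` struct and `find_orphan_edges` function are hypothetical and do not reflect how the crate stores or scans edges.

```rust
use std::collections::BTreeSet;

/// Illustrative edge record (hypothetical layout).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Edge {
    pub id: u64,
    pub from: u64,
    pub to: u64,
}

/// Returns the ids of edges that reference at least one missing node,
/// sorted ascending so the report is stable between runs.
pub fn find_orphan_edges(node_ids: &BTreeSet<u64>, edges: &[Edge]) -> Vec<u64> {
    let mut orphans: Vec<u64> = edges
        .iter()
        .filter(|e| !node_ids.contains(&e.from) || !node_ids.contains(&e.to))
        .map(|e| e.id)
        .collect();
    orphans.sort_unstable();
    orphans
}
```

An empty result means every edge endpoint resolves to a known node.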
Quick Start
Add to your Cargo.toml:
[dependencies]
sqlitegraph = "0.2.1"
SQLite Backend (Default)
use ;
Native V2 Backend (High Performance)
Enable the Native V2 backend in your Cargo.toml:
[dependencies]
sqlitegraph = { version = "0.2.1", features = ["native-v2"] }
use ;
Testing
# Run all tests
# Test specific backend
# Run benchmarks
# Run working examples
Current Capabilities
What Works Today
Core Operations:
- Entity CRUD operations with JSON metadata
- Edge creation and management
- In-memory and persistent storage
- Both backends fully functional
Performance:
- Native V2: 50K-100K operations/second (benchmarked)
- SQLite: Standard SQLite performance with optimizations
- Deterministic behavior across platforms
Data Integrity:
- ACID transactions (SQLite backend)
- Corruption prevention in V2 backend
- Comprehensive safety checks
- Benchmark regression gates
Current Limitations
Scope:
- Focused on embedded use cases (not distributed)
- Single-machine graph processing
- No built-in clustering or replication
API Surface:
- Concentrated on core graph operations
- No built-in machine learning or advanced analytics
- Limited visualization capabilities
Performance Characteristics:
- Native V2 optimized for read-heavy workloads
- Write performance varies by workload pattern
- Large graphs (>1M edges) may need tuning
Documentation
- Manual - Detailed operator guide
- API Documentation - Complete API reference
- Examples - Working code examples
- CHANGELOG - Version history and changes
License
GPL-3.0-only - see LICENSE for details.
Development Notes
V2 Architecture Status
Native V2 Backend Status
- All V1 legacy code removed
- Clustered adjacency storage implemented
- Corruption prevention active
- Comprehensive test coverage
- Experimental high-performance features
Performance Benchmarks
Current performance characteristics (Native V2):
- Node insertion: ~50K ops/sec
- Edge insertion: ~100K ops/sec
- Traversal: Varies by graph structure
- Memory usage: Optimized with configurable buffers
Known Limitations
- Compilation Warnings: ~50 warnings (non-critical, mostly unused code paths)
- Single Machine: No built-in distributed capabilities
- Memory Usage: Large graphs may require buffer tuning
- Documentation: The API is still evolving as new features are added