Sombra - High-Performance Graph Database
⚠️ Alpha Software: Sombra is under active development. APIs may change, and the project is not yet recommended for production use. Feedback and contributions are welcome!
Sombra is a file-based graph database inspired by SQLite's single-file architecture. It is built in Rust with a focus on reliability, performance, and ACID transactions.
Features
Core Features
- Property Graph Model: Nodes, edges, and flexible properties
- Single File Storage: SQLite-style database files
- ACID Transactions: Full transactional support with rollback
- Write-Ahead Logging: Crash-safe operations
- Page-Based Storage: Efficient memory-mapped I/O
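To make the property graph model concrete, here is a minimal in-memory sketch of nodes, edges, and flexible properties. Every name in it (`PropertyValue`, `Node`, `Edge`, `Graph`, `add_node`, `add_edge`, `neighbors`) is hypothetical and chosen for illustration; it is not Sombra's actual API or on-disk layout.

```rust
use std::collections::HashMap;

/// Hypothetical property value type; Sombra's real type may differ.
#[derive(Debug, Clone, PartialEq)]
enum PropertyValue {
    Str(String),
    Int(i64),
    Float(f64),
    Bool(bool),
}

/// A node carries any number of labels and key/value properties.
#[derive(Debug, Clone)]
struct Node {
    id: u64,
    labels: Vec<String>,
    properties: HashMap<String, PropertyValue>,
}

/// An edge is directed, typed, and may carry its own properties.
#[derive(Debug, Clone)]
struct Edge {
    id: u64,
    source: u64,
    target: u64,
    type_name: String,
    properties: HashMap<String, PropertyValue>,
}

/// Illustrative in-memory container (assumption: no persistence at all).
#[derive(Default)]
struct Graph {
    nodes: HashMap<u64, Node>,
    edges: Vec<Edge>,
    next_id: u64,
}

impl Graph {
    /// Stores a node and returns its freshly assigned id.
    fn add_node(&mut self, labels: &[&str], properties: HashMap<String, PropertyValue>) -> u64 {
        self.next_id += 1;
        let id = self.next_id;
        let labels = labels.iter().map(|l| l.to_string()).collect();
        self.nodes.insert(id, Node { id, labels, properties });
        id
    }

    /// Adds a directed edge; returns None if either endpoint is missing.
    fn add_edge(&mut self, source: u64, target: u64, type_name: &str) -> Option<u64> {
        if !self.nodes.contains_key(&source) || !self.nodes.contains_key(&target) {
            return None;
        }
        self.next_id += 1;
        let id = self.next_id;
        self.edges.push(Edge {
            id,
            source,
            target,
            type_name: type_name.to_string(),
            properties: HashMap::new(),
        });
        Some(id)
    }

    /// Targets of all outgoing edges of `node`, in insertion order.
    fn neighbors(&self, node: u64) -> Vec<u64> {
        self.edges
            .iter()
            .filter(|e| e.source == node)
            .map(|e| e.target)
            .collect()
    }
}
```

The linear scan in `neighbors` keeps the sketch short; a real engine would keep an adjacency index instead.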
Performance Features ✨ NEW
- Label Index: Fast label-based queries with O(1) lookup
- LRU Node Cache: 90% hit rate for repeated reads
- B-tree Primary Index: 25-40% memory reduction, better cache locality
- Optimized Graph Traversals: 18-23x faster than SQLite for graph operations
- Performance Metrics: Real-time monitoring of cache, queries, and traversals
- Scalability Testing: Validated for 100K+ node graphs
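As an illustration of how a label index can give constant-time lookup on average, the sketch below maps each label to the set of node IDs that carry it. The `LabelIndex` type and its methods are assumptions made for this example, not Sombra's internal structures.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical label index: label -> sorted set of node ids.
/// Finding the set for a label is a single hash lookup (O(1) on average).
#[derive(Default)]
struct LabelIndex {
    by_label: HashMap<String, BTreeSet<u64>>,
}

impl LabelIndex {
    /// Records that `node_id` carries `label`.
    fn insert(&mut self, label: &str, node_id: u64) {
        self.by_label
            .entry(label.to_string())
            .or_default()
            .insert(node_id);
    }

    /// Removes the association and drops the label entry once it is empty.
    fn remove(&mut self, label: &str, node_id: u64) {
        let now_empty = match self.by_label.get_mut(label) {
            Some(ids) => {
                ids.remove(&node_id);
                ids.is_empty()
            }
            None => false,
        };
        if now_empty {
            self.by_label.remove(label);
        }
    }

    /// Returns the ids of all nodes with `label`, in ascending order.
    fn nodes_with_label(&self, label: &str) -> Vec<u64> {
        self.by_label
            .get(label)
            .map(|ids| ids.iter().copied().collect())
            .unwrap_or_default()
    }
}
```

The hash lookup finds the ID set in constant time on average; copying the IDs out is still proportional to the number of matching nodes.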
Language Support
- Rust API: Core library with full feature support
- TypeScript/Node.js API: Complete NAPI bindings for JavaScript/TypeScript
- Python API: PyO3 bindings with native performance (build with maturin -F python)
- Cross-Platform: Linux, macOS, and Windows support
Reliability Features
- Comprehensive Error Handling: All errors handled gracefully with Result types
- Corruption Resistance: Safe deserialization with comprehensive validation
- Structured Logging: Full tracing support with the tracing crate
- Health Monitoring: Built-in health checks and extended metrics
- Graceful Shutdown: Clean database closure with WAL checkpoint
- Resource Limits: Configurable limits for database size, WAL, and transactions
- Database Inspector: CLI tools for inspection and repair
Testing & Quality
- 100+ Comprehensive Tests: Unit, integration, stress, and fuzz tests
- Corruption Fuzzing: 10,000+ scenarios tested without crashes
- Multi-Platform CI: Linux, macOS, Windows with full test coverage
- Zero Clippy Warnings: Strict linting with -D warnings
- Benchmark Suite: Performance regression testing
Quick Start
Rust API
use *;
// Open or create a database
let mut db = open?;
// Use transactions for safe operations
let mut tx = db.begin_transaction?;
// Add nodes and edges
let user = tx.add_node?;
let post = tx.add_node?;
tx.add_edge?;
// Commit to make changes permanent
tx.commit?;
// Query the graph
let neighbors = db.get_neighbors?;
println!;
// Create property indexes for fast queries
db.create_property_index?;
let users_age_30 = db.find_nodes_by_property?;
println!;
TypeScript/Node.js API
import { SombraDB, SombraPropertyValue } from 'sombradb';
const db = new SombraDB('./my_graph.db');
const createProp = (type: 'string' | 'int' | 'float' | 'bool', value: any): SombraPropertyValue => ({
type,
value
});
const alice = db.addNode(['Person'], {
name: createProp('string', 'Alice'),
age: createProp('int', 30)
});
const bob = db.addNode(['Person'], {
name: createProp('string', 'Bob'),
age: createProp('int', 25)
});
const knows = db.addEdge(alice, bob, 'KNOWS', {
since: createProp('int', 2020)
});
const aliceNode = db.getNode(alice);
console.log('Alice:', aliceNode);
const neighbors = db.getNeighbors(alice);
console.log(`Alice has ${neighbors.length} connections`);
const bfsResults = db.bfsTraversal(alice, 3);
console.log('BFS traversal:', bfsResults);
const tx = db.beginTransaction();
try {
const charlie = tx.addNode(['Person'], {
name: createProp('string', 'Charlie')
});
tx.addEdge(alice, charlie, 'KNOWS');
tx.commit();
} catch (error) {
tx.rollback();
throw error;
}
db.flush();
db.checkpoint();
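A call such as `bfsTraversal(alice, 3)` in the example above suggests a breadth-first traversal bounded by a maximum depth. The Rust sketch below shows one way such a traversal can work over a plain adjacency map. The function name, its signature, and the choice to return `(node, depth)` pairs with the start node included are assumptions for illustration; the real binding may behave differently.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Depth-limited BFS over an adjacency map (hypothetical helper).
/// Returns (node_id, depth) pairs in visit order; `start` is at depth 0.
fn bfs_traversal(
    adjacency: &HashMap<u64, Vec<u64>>,
    start: u64,
    max_depth: usize,
) -> Vec<(u64, usize)> {
    let mut visited = HashSet::from([start]);
    let mut queue = VecDeque::from([(start, 0usize)]);
    let mut order = Vec::new();

    while let Some((node, depth)) = queue.pop_front() {
        order.push((node, depth));
        // Do not expand nodes that already sit at the depth limit.
        if depth == max_depth {
            continue;
        }
        // Nodes without outgoing edges simply have no adjacency entry.
        for &next in adjacency.get(&node).into_iter().flatten() {
            // `insert` returns false for nodes seen before, which
            // prevents revisiting nodes on cyclic graphs.
            if visited.insert(next) {
                queue.push_back((next, depth + 1));
            }
        }
    }
    order
}
```

Marking nodes as visited when they are enqueued, rather than when they are dequeued, keeps each node in the queue at most once.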
Python API
Installation
Rust
TypeScript/Node.js
Python
# Install from PyPI (coming soon)
# Or build from source
CLI Tools
Install the unified CLI for database inspection, repair, and verification:
# Via Cargo (recommended)
# The 'sombra' command will be available system-wide
The CLI is also bundled with npm and pip installations:
# Via npm
# Via pip
See the CLI documentation for a complete usage guide.
Architecture
Sombra is built in layers:
- Storage Layer: Page-based file storage with 8KB pages
- Pager Layer: In-memory caching and dirty page tracking
- WAL Layer: Write-ahead logging for crash safety
- Transaction Layer: ACID transaction support
- Graph API: High-level graph operations
- NAPI Bindings: TypeScript/Node.js interface layer
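The storage and pager layers listed above can be pictured with a small sketch: fixed 8KB pages addressed by page ID, an in-memory cache, and a set of dirty pages waiting to be written back. The `Pager` type, `page_offset`, and the assumption that page 0 starts at byte 0 of the file are illustrative only; Sombra's actual file layout, header handling, and WAL interaction are not shown here.

```rust
use std::collections::{BTreeSet, HashMap};

/// Page size stated in the architecture overview: 8KB.
const PAGE_SIZE: usize = 8 * 1024;

/// Byte offset of a page in the file (assumption: page 0 starts at byte 0).
fn page_offset(page_id: u64) -> u64 {
    page_id * PAGE_SIZE as u64
}

/// Hypothetical pager: caches pages in memory and tracks modified ones.
struct Pager {
    cache: HashMap<u64, Vec<u8>>,
    dirty: BTreeSet<u64>,
}

impl Pager {
    fn new() -> Self {
        Pager {
            cache: HashMap::new(),
            dirty: BTreeSet::new(),
        }
    }

    /// Returns a writable page, creating a zeroed one on first access,
    /// and marks it dirty so it is included in the next flush.
    fn page_mut(&mut self, page_id: u64) -> &mut Vec<u8> {
        self.dirty.insert(page_id);
        self.cache
            .entry(page_id)
            .or_insert_with(|| vec![0u8; PAGE_SIZE])
    }

    /// Drains the dirty set and returns (page_id, file offset) pairs in
    /// ascending page order, i.e. what a flush would have to write.
    fn take_dirty(&mut self) -> Vec<(u64, u64)> {
        let ids = std::mem::take(&mut self.dirty);
        ids.into_iter().map(|id| (id, page_offset(id))).collect()
    }
}
```

In a design with write-ahead logging, the pages returned by `take_dirty` would typically be appended to the log before being written to their offsets in the main file.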
Documentation
Getting Started
- Getting Started Guide - Quick start tutorial
- Configuration Guide - Configuration options and tuning
- Operations Guide - Production deployment and monitoring
- Migration Guide - Upgrading from 0.1.x to 0.2.0
Language-Specific Guides
- Python Guide - Using Sombra from Python
- Node.js Guide - Using Sombra from TypeScript/JavaScript
- CLI Tools - Command-line tools for inspection, repair, and verification
Technical Documentation
- Architecture - System architecture and design
- Transaction Design - ACID transaction implementation
- Data Model - Graph data structure details
- B-tree Index Implementation - Primary index details
- Performance Metrics - Monitoring and observability
Development
- Contributing - Development guidelines
- Roadmap - Future development plans
- API Documentation - Complete API reference
Testing
# Run all tests
# Run transaction tests specifically
# Run smoke tests
# Run stress tests
Performance
Phase 1 Optimizations ✅ COMPLETE
Phase 1 of the performance work is complete. Sombra now includes the following optimizations:
| Optimization | Improvement | Status |
|---|---|---|
| Label Index | Fast O(1) label queries | ✅ Complete |
| Node Cache | 90% hit rate for repeated reads | ✅ Complete |
| B-tree Index | 25-40% memory reduction | ✅ Complete |
| Metrics System | Real-time monitoring | ✅ Complete |
Benchmark Results (100K nodes):
Node Lookups: ~1.5M ops/sec
Neighbor Queries: ~9.9M ops/sec
Index Memory: 25% reduction (3.2MB → 2.4MB)
Cache Hit Rate: 90% after warmup
Graph Traversal Performance (vs SQLite):
- Medium Dataset: 7,778 ops/sec vs 452 ops/sec (18x faster)
- Large Dataset: 1,092 ops/sec vs 48 ops/sec (23x faster)
Running Benchmarks
# Index performance comparison
# BFS traversal performance
# Scalability testing (50K-500K nodes)
# Performance metrics demo
Current Status
Version 0.3.2 - Alpha
Core Features:
- Core graph operations (nodes, edges, properties)
- Page-based storage with B-tree indexing (25-40% memory savings)
- Write-ahead logging (WAL) for crash recovery
- ACID transactions with rollback support
- Label secondary index with O(1) lookup
- LRU node cache (90% hit rate)
- Adjacency indexing for fast traversals (18-23x faster than SQLite)
- Property-based indexes for O(log n) queries
- Multi-reader concurrency support (100+ concurrent readers)
Quality & Reliability:
- ✅ Comprehensive error handling with graceful degradation
- ✅ Corruption resistance - 10,000+ fuzzing scenarios
- ✅ Structured logging - Full tracing support
- ✅ Health monitoring - Extended metrics and health checks
- ✅ Graceful shutdown - Clean close() method
- ✅ Resource limits - Configurable size/timeout limits
- ✅ CLI tools - Inspector and repair utilities
- ✅ 100+ tests passing - Unit, integration, stress, fuzz
- ✅ Complete documentation - API docs, guides, examples
- ✅ Multi-platform CI - Linux, macOS, Windows
Language Bindings:
- ✅ Rust API (native)
- ✅ Python bindings (PyO3)
- ✅ TypeScript/Node.js bindings (NAPI)
🚀 Roadmap to Production (v1.0)
In Progress:
- Real-world testing and feedback collection
- API stabilization and versioning strategy
- Performance optimization and profiling
Planned Features:
- Page-level checksums for data integrity validation
- MVCC for improved concurrency
- Query planner with cost-based optimization
- Replication and high availability
- Backup/restore utilities
- Performance dashboard
- Production deployment case studies
See CHANGELOG.md for detailed release notes and docs/roadmap.md for future plans.
Examples
See the tests/ directory for comprehensive examples:
- tests/smoke.rs - Basic usage patterns
- tests/stress.rs - Performance and scalability
- tests/transactions.rs - Transaction usage examples
License
This project is open source. See LICENSE for details.
Contributing
See Contributing Guidelines for information on how to contribute to Sombra.