§SynaDB
An AI-native embedded database written in Rust.
SynaDB synthesizes the embedded simplicity of SQLite, the columnar analytical speed of DuckDB, and the schema flexibility of MongoDB. It exposes a C-ABI for polyglot integration and is optimized for AI/ML workloads including vector search, tensor operations, model versioning, and experiment tracking.
§Features
§Core Database
- Append-only log structure - Fast sequential writes, immutable history
- Schema-free - Store heterogeneous data types without migrations
- Delta & LZ4 compression - Minimize storage for time-series data
- Crash recovery - Automatic index rebuild on open
- Thread-safe - Concurrent read/write access with mutex-protected writes
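The delta-compression idea above can be illustrated with a stdlib-only sketch. This is not SynaDB's actual codec, just the underlying technique: store the first sample verbatim, then only the difference from the previous sample. Slowly-changing time-series produce runs of small deltas, which compress well under LZ4.

```rust
// Illustrative delta encoding for time-series values (not SynaDB's
// actual codec). The first element is stored as-is; every later
// element is stored as its difference from the previous one.
fn delta_encode(samples: &[i64]) -> Vec<i64> {
    let mut out = Vec::with_capacity(samples.len());
    let mut prev = 0i64;
    for &s in samples {
        out.push(s - prev); // first sample: prev = 0, so stored verbatim
        prev = s;
    }
    out
}

// Decoding is a running sum over the deltas.
fn delta_decode(deltas: &[i64]) -> Vec<i64> {
    let mut out = Vec::with_capacity(deltas.len());
    let mut acc = 0i64;
    for &d in deltas {
        acc += d;
        out.push(acc);
    }
    out
}

fn main() {
    let series = vec![1000, 1002, 1003, 1003, 1007];
    let deltas = delta_encode(&series);
    assert_eq!(deltas, vec![1000, 2, 1, 0, 4]); // small, repetitive values
    assert_eq!(delta_decode(&deltas), series);  // lossless round-trip
    println!("deltas: {:?}", deltas);
}
```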
§Vector Search
- `vector::VectorStore` - Embedding storage with similarity search
- `mmap_vector::MmapVectorStore` - Ultra-high-throughput vector storage (7x faster than `VectorStore`)
- `hnsw::HnswIndex` - O(log N) approximate nearest neighbor search
- `gwi::GravityWellIndex` - Novel index with O(N) build time (builds faster than HNSW)
- `cascade::CascadeIndex` - Three-stage hybrid index (LSH + bucket tree + graph)
- `sparse_vector_store::SparseVectorStore` - Inverted index for lexical embeddings (SPLADE, BM25, TF-IDF)
- `distance` - Cosine, Euclidean, and Dot Product metrics
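For intuition, here is a stdlib-only sketch of the cosine metric the `distance` module exposes, used in a brute-force top-k scan. A flat scan like this is exactly what the index structures above (HNSW, GWI, Cascade) exist to avoid at scale; it is an illustration of the metric, not SynaDB's search implementation.

```rust
// Cosine similarity: dot product of the vectors divided by the
// product of their magnitudes. Ranges from -1.0 to 1.0.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (na * nb)
}

// Brute-force top-k: score every document, sort descending, truncate.
// O(N * d) per query, which is why sub-linear indexes matter.
fn top_k<'a>(query: &[f32], docs: &'a [(&'a str, Vec<f32>)], k: usize) -> Vec<(&'a str, f32)> {
    let mut scored: Vec<(&str, f32)> = docs
        .iter()
        .map(|(key, v)| (*key, cosine(query, v)))
        .collect();
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    scored.truncate(k);
    scored
}

fn main() {
    let docs = vec![
        ("doc1", vec![1.0, 0.0]),
        ("doc2", vec![0.0, 1.0]),
        ("doc3", vec![0.7, 0.7]),
    ];
    let results = top_k(&[1.0, 0.0], &docs, 2);
    assert_eq!(results[0].0, "doc1"); // identical direction scores highest
    println!("{:?}", results);
}
```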
§AI/ML Platform
- `tensor::TensorEngine` - Batch tensor operations with chunked storage
- `model_registry::ModelRegistry` - Version models with SHA-256 checksum verification
- `experiment::ExperimentTracker` - Log parameters, metrics, and artifacts
§Performance
- `mmap::MmapReader` - Memory-mapped I/O for zero-copy reads
- `gpu::GpuContext` - GPU Direct memory access (optional `gpu` feature)
- `faiss_index` - FAISS integration for billion-scale search (optional `faiss` feature)
§FFI
- `ffi` - C-ABI interface for Python, Node.js, C++, or any FFI-capable language
§Architecture Philosophy
SynaDB uses a modular architecture where each component is optimized for its workload:
| Component | Purpose |
|---|---|
| `SynaDB` | Core key-value store with history |
| `vector::VectorStore` | Embedding storage with HNSW search |
| `mmap_vector::MmapVectorStore` | High-throughput vector ingestion |
| `gwi::GravityWellIndex` | Fast-build vector index |
| `cascade::CascadeIndex` | Hybrid three-stage index |
| `sparse_vector_store::SparseVectorStore` | Lexical search (SPLADE, BM25) |
| `tensor::TensorEngine` | Batch tensor operations |
| `model_registry::ModelRegistry` | Model versioning with checksums |
| `experiment::ExperimentTracker` | Experiment tracking |
§Quick Start
§Core Database
```rust
use synadb::{SynaDB, Atom, Result};

fn main() -> Result<()> {
    let mut db = SynaDB::new("my_data.db")?;

    // Write different data types
    db.append("temperature", Atom::Float(23.5))?;
    db.append("count", Atom::Int(42))?;
    db.append("name", Atom::Text("sensor-1".to_string()))?;

    // Read values back
    if let Some(temp) = db.get("temperature")? {
        println!("Temperature: {:?}", temp);
    }

    // Build history for ML
    db.append("temperature", Atom::Float(24.1))?;
    db.append("temperature", Atom::Float(24.8))?;

    // Extract as tensor
    let history = db.get_history_floats("temperature")?;
    println!("History: {:?}", history); // [23.5, 24.1, 24.8]

    Ok(())
}
```
§Vector Store
```rust
use synadb::vector::{VectorStore, VectorConfig, SearchResult};
use synadb::distance::DistanceMetric;

fn main() -> synadb::Result<()> {
    let config = VectorConfig {
        dimensions: 768,
        metric: DistanceMetric::Cosine,
        ..Default::default()
    };
    let mut store = VectorStore::new("vectors.db", config)?;

    // Insert embeddings
    let embedding = vec![0.1f32; 768];
    store.insert("doc1", &embedding)?;

    // Search for similar vectors
    let query = vec![0.1f32; 768];
    let results = store.search(&query, 10)?;
    for r in results {
        println!("{}: {:.4}", r.key, r.score);
    }
    Ok(())
}
```
§High-Throughput Vector Ingestion
```rust
use synadb::mmap_vector::{MmapVectorStore, MmapVectorConfig};
use synadb::distance::DistanceMetric;

fn main() -> synadb::Result<()> {
    let config = MmapVectorConfig {
        dimensions: 768,
        metric: DistanceMetric::Cosine,
        initial_capacity: 100_000,
        ..Default::default()
    };
    let mut store = MmapVectorStore::new("vectors.mmap", config)?;

    // Insert vectors one at a time or in batches
    let embedding = vec![0.1f32; 768];
    store.insert("doc_0", &embedding)?;

    // Batch insert (7x faster than VectorStore)
    let keys: Vec<&str> = vec!["doc_1", "doc_2", "doc_3"];
    let vecs: Vec<Vec<f32>> = vec![vec![0.1f32; 768]; 3];
    let vec_refs: Vec<&[f32]> = vecs.iter().map(|v| v.as_slice()).collect();
    store.insert_batch(&keys, &vec_refs)?;

    // Build index for fast search
    store.build_index()?;
    Ok(())
}
```
§Model Registry
```rust
use synadb::model_registry::{ModelRegistry, ModelStage};
use std::collections::HashMap;

fn main() -> synadb::Result<()> {
    let mut registry = ModelRegistry::new("models.db")?;

    // Save model with metadata
    let model_data = vec![0u8; 1024];
    let mut metadata = HashMap::new();
    metadata.insert("accuracy".to_string(), "0.95".to_string());
    let version = registry.save_model("classifier", &model_data, metadata)?;
    println!("Saved v{} with checksum {}", version.version, version.checksum);

    // Load with checksum verification
    let (data, info) = registry.load_model("classifier", None)?;

    // Promote to production
    registry.set_stage("classifier", version.version, ModelStage::Production)?;
    Ok(())
}
```
§Experiment Tracking
```rust
use synadb::experiment::{ExperimentTracker, RunStatus};

fn main() -> synadb::Result<()> {
    let mut tracker = ExperimentTracker::new("experiments.db")?;

    // Start a run
    let run_id = tracker.start_run("mnist", vec!["baseline".to_string()])?;

    // Log parameters and metrics
    tracker.log_param(&run_id, "learning_rate", "0.001")?;
    for epoch in 0..100 {
        let loss = 1.0 / (epoch + 1) as f64;
        tracker.log_metric(&run_id, "loss", loss, Some(epoch as u64))?;
    }

    // End run
    tracker.end_run(&run_id, RunStatus::Completed)?;
    Ok(())
}
```
§Feature Flags
| Feature | Description |
|---|---|
| `gpu` | Enable GPU Direct memory access (requires CUDA) |
| `faiss` | Enable FAISS integration for billion-scale search |
| `async` | Enable async runtime for parallel operations |
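Features are opted into through Cargo in the usual way. A minimal sketch, assuming the crate is published under the name `synadb` (the version below is a placeholder; pin whichever release you actually use):

```toml
[dependencies]
synadb = { version = "*", features = ["gpu", "faiss"] }
```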
§Storage Architecture
SynaDB uses an append-only log structure. Each entry consists of:
- A fixed-size `LogHeader` (15 bytes) with timestamp, lengths, and flags
- The key as UTF-8 bytes
- The value serialized with bincode
An in-memory index maps keys to file offsets for O(1) lookups.
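The layout above can be illustrated with a stdlib-only sketch. The field widths here are simplified (the real `LogHeader` packs a timestamp, lengths, and flag bits into 15 bytes, and values are bincode-serialized rather than raw bytes), but the mechanics are the same: entries are only ever appended, and an in-memory map from key to offset makes the latest value an O(1) lookup while history stays in the log.

```rust
use std::collections::HashMap;

// Toy append-only log: a Vec<u8> stands in for the on-disk file.
struct Log {
    data: Vec<u8>,
    index: HashMap<String, usize>, // key -> offset of latest entry
}

impl Log {
    fn new() -> Self {
        Log { data: Vec::new(), index: HashMap::new() }
    }

    fn append(&mut self, key: &str, value: &[u8]) {
        let offset = self.data.len();
        // Simplified header: key length + value length as u32s. A real
        // header would also carry a timestamp and flags (tombstone, etc.).
        self.data.extend_from_slice(&(key.len() as u32).to_le_bytes());
        self.data.extend_from_slice(&(value.len() as u32).to_le_bytes());
        self.data.extend_from_slice(key.as_bytes());
        self.data.extend_from_slice(value);
        // Later appends for the same key shadow earlier entries in the
        // index; the old entries remain in the log as history.
        self.index.insert(key.to_string(), offset);
    }

    fn get(&self, key: &str) -> Option<&[u8]> {
        let off = *self.index.get(key)?;
        let klen = u32::from_le_bytes(self.data[off..off + 4].try_into().unwrap()) as usize;
        let vlen = u32::from_le_bytes(self.data[off + 4..off + 8].try_into().unwrap()) as usize;
        let vstart = off + 8 + klen;
        Some(&self.data[vstart..vstart + vlen])
    }
}

fn main() {
    let mut log = Log::new();
    log.append("temperature", b"23.5");
    log.append("temperature", b"24.1"); // new version appended, old kept
    assert_eq!(log.get("temperature"), Some(&b"24.1"[..]));
    println!("latest: {:?}", log.get("temperature"));
}
```

Crash recovery falls out of this design: the index is derivable from the log alone, so it can be rebuilt by scanning the file on open.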
§Re-exports
pub use engine::close_db;
pub use engine::free_tensor;
pub use engine::open_db;
pub use engine::with_db;
pub use engine::DbConfig;
pub use engine::SynaDB;
pub use error::Result;
pub use error::SynaError;
pub use types::Atom;
pub use types::LogHeader;
pub use types::HEADER_SIZE;
pub use types::IS_COMPRESSED;
pub use types::IS_DELTA;
pub use types::IS_TOMBSTONE;
pub use tensor::optimal_chunk_size;
pub use tensor::DType;
pub use tensor::MmapTensorMeta;
pub use tensor::MmapTensorRef;
pub use tensor::TensorEngine;
pub use tensor::TensorMeta;
pub use tensor::CHUNK_SIZE_LARGE;
pub use tensor::CHUNK_SIZE_MEDIUM;
pub use tensor::CHUNK_SIZE_SMALL;
pub use tensor::DEFAULT_CHUNK_SIZE;
pub use mmap::MmapReader;
pub use mmap_vector::MmapSearchResult;
pub use mmap_vector::MmapVectorConfig;
pub use mmap_vector::MmapVectorStore;
pub use gpu::GpuContext;
pub use gpu::GpuTensor;
pub use cascade::CascadeConfig;
pub use cascade::CascadeIndex;
pub use vector::SearchResult;
pub use vector::VectorConfig;
pub use vector::VectorStore;
pub use hnsw::HnswConfig;
pub use hnsw::HnswIndex;
pub use gwi::GravityWellIndex;
pub use sparse_vector::SparseVector;
pub use sparse_vector_store::SparseIndexStats;
pub use sparse_vector_store::SparseSearchResult;
pub use sparse_vector_store::SparseVectorStore;
pub use model_registry::ModelRegistry;
pub use model_registry::ModelStage;
pub use model_registry::ModelVersion;
pub use experiment::ExperimentTracker;
pub use experiment::Run;
pub use experiment::RunStatus;
pub use distance::DistanceMetric;
§Modules
- `arch` - Hybrid Hot/Cold Vector Architecture
- `cascade` - Cascade Index: fast vector index with O(N) build time
- `compression` - Compression utilities for Syna database
- `distance` - Distance metrics for vector similarity search
- `engine` - Core database engine for Syna
- `error` - Error types for Syna database operations
- `experiment` - Experiment tracking for ML workflows
- `ffi` - C-ABI Foreign Function Interface for Syna database
- `ffi_sparse` - FFI layer for the sparse vector store
- `gpu` - GPU Direct memory access (optional feature)
- `gwi` - Gravity Well Index (GWI): append-only vector indexing
- `hnsw` - Hierarchical Navigable Small World (HNSW) index for approximate nearest neighbor search
- `mmap` - Memory-mapped file access for zero-copy reads
- `mmap_vector` - Memory-mapped vector store for ultra-high-throughput embedding storage
- `model_registry` - Model registry for versioned artifact storage
- `sparse_vector` - Sparse vector type for lexical embeddings
- `sparse_vector_store` - Sparse vector store with inverted index
- `tensor` - Tensor engine for batch operations on numerical data
- `types` - Core data types for Syna database
- `vector` - Vector store for embedding storage and similarity search