Sekejap-DB
A graph-first, embedded multi-model database engine for Rust.
Part 1: General Description
Sekejap-DB is a graph-native database where relationships are first-class citizens. Multi-model data (vectors, geo, text) attaches to graph nodes: the graph is the core, and everything else enhances it.
What makes it special?
- Graph-First: Built for relationship-heavy workloads (RCA, knowledge graphs, agentic AI)
- Multi-Model Nodes: Attach vectors, geo, and text to graph nodes
- Embedded: No server needed, runs in your Rust application
- Causal Queries: Native backward traversal for Root Cause Analysis
- Fast: Optimized for high-velocity writes and low-latency queries
- MVCC: Multi-Version Concurrency Control with soft deletes
Graph-First Philosophy
┌─────────────────────────────────────────────────────────┐
│                       Sekejap-DB                        │
│                   Graph-First Design                    │
├─────────────────────────────────────────────────────────┤
│                                                         │
│   ┌───────────┐    causal_edge     ┌───────────┐        │
│   │   Node    │◄──────────────────►│   Node    │        │
│   │ (vector)  │       (0.85)       │   (geo)   │        │
│   └─────┬─────┘                    └─────┬─────┘        │
│         │                                │              │
│  ┌──────┴───────┐               ┌────────┴────────┐     │
│  │   Vectors    │               │    Geo data     │     │
│  │ (embeddings) │               │ (Point/Polygon) │     │
│  └──────────────┘               └─────────────────┘     │
│                                                         │
│  → Graph is the CORE                                    │
│  → Vectors/Geo are ATTRIBUTES on nodes                  │
│  → Queries traverse RELATIONSHIPS                       │
└─────────────────────────────────────────────────────────┘
Quick Example
use SekejapDB;
let mut db = new?;
// Write a document with vector embedding
db.write_json?;
// Find similar documents using vector search
let results = db.query
.vector_search
.execute?;
// Add causal relationship
db.add_edge?;
// Root cause analysis
let causes = db.traverse?;
Part 2: Main Usage
2.1 Basic CRUD Operations
Write Data
use ;
let mut db = new?;
// Simple write (goes to Tier 1 staging, promoted later)
db.write?;
// Immediate write to Tier 2
db.write_with_options?;
// Batch write
db.write_many?;
Read Data
use ;
// Read from Tier 2 only (validated data)
if let Some = db.read?
// Read including staged Tier 1 data
let event = db.read_with_options?;
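The write and read paths above distinguish staged Tier 1 data from promoted Tier 2 data. As a rough illustration of that idea only, here is a minimal standalone sketch using two in-memory maps. `TieredStore`, `write_immediate`, `promote`, and the `include_staged` flag are hypothetical names chosen for this example; they are not Sekejap-DB's API, and the real promotion logic may differ.

```rust
use std::collections::HashMap;

/// Toy two-tier store. Tier 1 stages writes; Tier 2 serves promoted data.
/// All names here are illustrative assumptions, not Sekejap-DB's API.
#[derive(Default)]
pub struct TieredStore {
    staging: HashMap<String, String>, // Tier 1: ingestion buffer
    serving: HashMap<String, String>, // Tier 2: promoted data
}

impl TieredStore {
    /// Default write path: stage the payload in Tier 1.
    pub fn write(&mut self, slug: &str, payload: &str) {
        self.staging.insert(slug.to_string(), payload.to_string());
    }

    /// Immediate write: bypass staging and publish straight to Tier 2.
    pub fn write_immediate(&mut self, slug: &str, payload: &str) {
        self.staging.remove(slug); // drop any older staged value for this slug
        self.serving.insert(slug.to_string(), payload.to_string());
    }

    /// Moves every staged entry into Tier 2 and returns how many were promoted.
    pub fn promote(&mut self) -> usize {
        let promoted = self.staging.len();
        for (slug, payload) in self.staging.drain() {
            self.serving.insert(slug, payload);
        }
        promoted
    }

    /// Reads from Tier 2, optionally preferring a newer staged value.
    pub fn read(&self, slug: &str, include_staged: bool) -> Option<&str> {
        let staged = if include_staged { self.staging.get(slug) } else { None };
        staged.or_else(|| self.serving.get(slug)).map(String::as_str)
    }
}
```

In this sketch a staged write stays invisible to a Tier 2 read until `promote` runs, which mirrors the "validated data" versus "including staged data" distinction in the comments above.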
Delete Data
use ;
// Cascade delete (removes edges too)
db.delete?;
// Keep edges for audit trail
db.delete_with_options?;
2.2 Multi-Model Data (Vectors, Geo, Text)
Adding Vectors
use ;
// Option 1: WriteOptions with vector
let embedding = vec!;
db.write_with_options?;
// Option 2: JSON with vectors
db.write_json?;
Adding Geo Data
use ;
// Point coordinates
db.write_with_options?;
// Polygon geometry
let polygon = Polygon;
db.write_with_options?;
// Or use JSON
db.write_json?;
Querying Multi-Model Data
// Vector similarity search (requires "vector" feature)
let query_vec = vec!;
let similar = db.query
.vector_search
.execute?;
// Spatial radius search (requires "spatial" feature)
let nearby = db.query
.spatial?
.execute?;
// Combined multi-model query
let results = db.query
.spatial?
.limit
.execute?;
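To make the two query kinds above concrete, the following is a brute-force sketch of what a vector similarity search and a spatial radius search compute. It deliberately uses a linear scan with cosine similarity and the haversine formula instead of the HNSW and R-tree indexes the crate's features refer to, and every function name here is a hypothetical helper for this example, not part of Sekejap-DB.

```rust
/// Cosine similarity; returns 0.0 for mismatched lengths or zero vectors.
pub fn cosine(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Brute-force top-k: score every item, sort by descending similarity.
pub fn top_k<'a>(query: &[f32], items: &[(&'a str, Vec<f32>)], k: usize) -> Vec<(&'a str, f32)> {
    let mut scored: Vec<(&str, f32)> =
        items.iter().map(|(id, v)| (*id, cosine(query, v))).collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}

/// Great-circle distance in kilometres (haversine, mean Earth radius 6371 km).
pub fn haversine_km(lat1: f64, lon1: f64, lat2: f64, lon2: f64) -> f64 {
    let (p1, p2) = (lat1.to_radians(), lat2.to_radians());
    let dphi = (lat2 - lat1).to_radians();
    let dlambda = (lon2 - lon1).to_radians();
    let a = (dphi / 2.0).sin().powi(2) + p1.cos() * p2.cos() * (dlambda / 2.0).sin().powi(2);
    2.0 * 6371.0 * a.sqrt().asin()
}

/// Linear-scan radius filter over (id, lat, lon) points.
pub fn within_radius<'a>(
    center: (f64, f64),
    radius_km: f64,
    points: &[(&'a str, f64, f64)],
) -> Vec<&'a str> {
    points
        .iter()
        .filter(|(_, lat, lon)| haversine_km(center.0, center.1, *lat, *lon) <= radius_km)
        .map(|(id, _, _)| *id)
        .collect()
}
```

A combined multi-model query can be thought of as intersecting the two result sets; an index-backed engine would avoid the full scans shown here.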
2.3 Collections and Schema
Identity System
Sekejap-DB uses ArangoDB-style identifiers of the form "collection/key":
use EntityId;
// EntityId format: "collection/key"
let entity = new;
assert_eq!;
assert_eq!;
assert_eq!;
// Parse from string
let entity = parse?;
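The "collection/key" format can be illustrated with a small standalone type. This is a sketch under assumptions: `SketchEntityId` is a hypothetical stand-in, not the crate's `EntityId`, and the validation rules (split on the first `/`, reject empty parts) are guesses about reasonable behaviour.

```rust
use std::fmt;

/// Illustrative stand-in for an ArangoDB-style identifier ("collection/key").
/// NOT the crate's `EntityId`; names and error handling are assumptions.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct SketchEntityId {
    pub collection: String,
    pub key: String,
}

impl SketchEntityId {
    pub fn new(collection: &str, key: &str) -> Self {
        Self { collection: collection.to_string(), key: key.to_string() }
    }

    /// Splits on the first '/', rejecting an empty collection or key.
    pub fn parse(s: &str) -> Result<Self, String> {
        match s.split_once('/') {
            Some((c, k)) if !c.is_empty() && !k.is_empty() => Ok(Self::new(c, k)),
            _ => Err(format!("expected \"collection/key\", got {s:?}")),
        }
    }
}

impl fmt::Display for SketchEntityId {
    /// Renders the identifier back into "collection/key" form.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}/{}", self.collection, self.key)
    }
}
```

Deriving `Hash` and `Eq` lets such an identifier serve directly as a map key, which is what an edge index keyed by entity would need.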
Collections via JSON
Define collections and their schemas in JSON using the define_collection() API:
use SekejapDB;
let mut db = new?;
// Define collections with their indexing schemas
db.define_collection?;
// List registered collections
let collections = db.list_collections;
println!;
// Check if a collection exists
if db.has_collection
Schema Definition via Rust Code
use ;
// Create collection
let mut collection = new;
// Define schema programmatically
let mut schema = new;
// Add vector channels
// Add spatial fields
let spatial_schema = schema.add_spatial;
spatial_schema.index_rtree = true;
// Set hot fields for query optimization
let mut hot = new;
hot.add_vector_field;
hot.add_spatial_field;
hot.add_fulltext_field;
hot.add_fulltext_field;
collection.set_schema;
Schema Modes
| Mode | Description | Use Case |
|---|---|---|
| Flex Mode | No schema defined | Rapid prototyping, flexible data |
| Schema Mode | Full JSON/Rust schema | Production, deterministic indexing |
2.4 Causal Graph and Traversal
Adding Edges
use SekejapDB;
// Create edge with weight
db.add_edge?;
// Multiple edge types
db.add_edge?;
db.add_edge?;
db.add_edge?;
Root Cause Analysis
// Backward BFS traversal from effect to causes
let results = db.traverse?;
println!;
println!;
for edge in &results.edges
2.5 Query Builder
use SekejapDB;
// Simple slug query
let results = db.query
.by_slug
.execute?;
// Multi-model query
let results = db.query
.spatial?
.limit
.execute?;
// With edge filter
let results = db.query
.has_edge_from
.limit
.execute?;
Part 3: Technical Architecture
3.1 Graph-First Architecture
┌─────────────────────────────────────────────────────────────────────┐
│ Sekejap-DB │
│ Graph-Native Design │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ TIER 1: Ingestion Buffer (Writes) │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ HashMap<slug_hash, NodeHeader> │ │
│ │ └─ Fast staging for high-velocity writes │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ TIER 2: Serving Layer (Nodes) │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ head_index: HashMap<slug_hash, HeadPointer> │ │
│ │ node_store: HashMap<(node_id, rev), NodeHeader> │ │
│ │ blob_store: Large payload storage (JSON, vectors, geo) │ │
│ │ └─ MVCC with revisions, tombstones │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ TIER 3: Knowledge Graph (Edges) ⭐ CORE │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ outgoing: HashMap<EntityId, Vec<WeightedEdge>> │ │
│ │ incoming: HashMap<EntityId, Vec<EntityId>> │ │
│ │ └─ Forward & reverse indexes for O(1) edge lookup │ │
│ │ │ │
│ │ CSR Sparse Matrix (optional): │ │
│ │ └─ 10-100x memory reduction for sparse graphs │ │
│ │ │ │
│ │ Bloom Filter (optional): │ │
│ │ └─ Fast "edge exists?" checks, no false negatives │ │
│ │ │ │
│ │ Bitmap Traversal (optional): │ │
│ │ └─ O(1) set operations for BFS/DFS frontier │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │
│ ⭐ GRAPH IS THE CORE - Nodes and multi-model data enhance it │
└─────────────────────────────────────────────────────────────────────┘
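The optional Bloom filter in Tier 3 answers "might this edge exist?" without touching the edge lists: a negative answer is definitive, a positive one may be a false positive. The sketch below shows the general technique with the standard library only. `EdgeBloom`, its sizing parameters, and the double-hashing scheme are assumptions made for this example and say nothing about how Sekejap-DB implements its filter.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Minimal Bloom filter over (from, to) edge pairs. Illustrative only.
pub struct EdgeBloom {
    bits: Vec<u64>,
    num_bits: u64,
    num_hashes: u64,
}

impl EdgeBloom {
    /// `num_bits` is rounded up to a multiple of 64; at least one hash is used.
    pub fn new(num_bits: usize, num_hashes: u32) -> Self {
        let words = (num_bits.max(64) + 63) / 64;
        Self {
            bits: vec![0; words],
            num_bits: (words * 64) as u64,
            num_hashes: u64::from(num_hashes.max(1)),
        }
    }

    /// Derives all probe positions from two base hashes (double hashing).
    fn positions(&self, from: &str, to: &str) -> Vec<u64> {
        let mut h1 = DefaultHasher::new();
        (from, to).hash(&mut h1);
        let a = h1.finish();
        let mut h2 = DefaultHasher::new();
        (to, from, 0x9e37_79b9u32).hash(&mut h2);
        let b = h2.finish() | 1; // force odd so successive probes differ
        (0..self.num_hashes)
            .map(|i| a.wrapping_add(i.wrapping_mul(b)) % self.num_bits)
            .collect()
    }

    /// Sets every probe bit for the edge.
    pub fn insert(&mut self, from: &str, to: &str) {
        for p in self.positions(from, to) {
            self.bits[(p / 64) as usize] |= 1u64 << (p % 64);
        }
    }

    /// `false` means the edge was never inserted; `true` means "possibly present".
    pub fn may_contain(&self, from: &str, to: &str) -> bool {
        self.positions(from, to)
            .into_iter()
            .all(|p| self.bits[(p / 64) as usize] & (1u64 << (p % 64)) != 0)
    }
}
```

A caller would consult the filter first and fall back to the real edge index only when `may_contain` returns `true`.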
3.2 Graph Query Excellence
Why Graph Queries Excel
┌─────────────────────────────────────────────────────────────────────┐
│ Graph Query Performance │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ EDGE LOOKUP: O(1) with HashMap │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ HashMap<EntityId, Vec<WeightedEdge>> │ │
│ │ │ │ │
│ │ │ Hash(entity_id) │ │
│ │ ▼ │ │
│ │ O(1) access to all outgoing/incoming edges │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │
│ BACKWARD TRAVERSAL: O(E) where E = edges traversed │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ traverse("crime", max_hops=5, threshold=0.3) │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ 1. Find starting node (O(1)) │ │
│ │ 2. Get incoming edges (O(1) via incoming index) │ │
│ │ 3. Filter by weight_threshold (O(degree)) │ │
│ │ 4. Queue unique predecessors (O(1) with HashSet) │ │
│ │ 5. Repeat until max_hops or empty queue │ │
│ │ │ │
│ │ Total: O(total_edges_traversed) │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │
│ EDGE FILTERS: Applied during traversal (no extra lookups) │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ - weight_threshold: Skip edges below threshold │ │
│ │ - edge_type: Filter by "_type" (causal, influences...) │ │
│ │ - time_window: Valid range [start, end] │ │
│ │ - decay: Effective weight with temporal decay │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │
│ CSR SPARSE OPTIMIZATION: 10-100x memory reduction │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ Graph: 1M nodes, 10M edges (0.001% dense) │ │
│ │ ├─ Adjacency List: ~800MB (each edge ~80 bytes) │ │
│ │ └─ CSR Matrix: ~8MB (compressed) │ │
│ │ │ │
│ │ Benefit: Fits in L2/L3 cache, faster iteration │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────┘
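The CSR (Compressed Sparse Row) layout mentioned above replaces per-node edge vectors with three flat arrays. The following sketch shows the standard construction from an edge list, assuming nodes have already been mapped to dense `u32` indices. The `Csr` type is hypothetical and uncompressed: it stores 8 bytes per edge plus 4 bytes per node, so any further compression is outside this example.

```rust
/// Compressed Sparse Row adjacency. The neighbors of node `i` live in
/// `col_indices[row_offsets[i]..row_offsets[i + 1]]`, with matching `weights`.
pub struct Csr {
    pub row_offsets: Vec<u32>,
    pub col_indices: Vec<u32>,
    pub weights: Vec<f32>,
}

impl Csr {
    /// Builds the CSR arrays from (source, target, weight) triples.
    /// Panics if a source index is >= `num_nodes`.
    pub fn from_edges(num_nodes: usize, edges: &[(u32, u32, f32)]) -> Self {
        // Pass 1: count the out-degree of every source node.
        let mut row_offsets = vec![0u32; num_nodes + 1];
        for &(src, _, _) in edges {
            row_offsets[src as usize + 1] += 1;
        }
        // Prefix sum turns the counts into start offsets.
        for i in 0..num_nodes {
            row_offsets[i + 1] += row_offsets[i];
        }
        // Pass 2: scatter each edge into its row, tracking a write cursor per row.
        let mut cursor: Vec<u32> = row_offsets[..num_nodes].to_vec();
        let mut col_indices = vec![0u32; edges.len()];
        let mut weights = vec![0f32; edges.len()];
        for &(src, dst, w) in edges {
            let slot = cursor[src as usize] as usize;
            col_indices[slot] = dst;
            weights[slot] = w;
            cursor[src as usize] += 1;
        }
        Self { row_offsets, col_indices, weights }
    }

    /// Iterates the (target, weight) pairs of one node as a contiguous slice scan.
    pub fn neighbors(&self, node: u32) -> impl Iterator<Item = (u32, f32)> + '_ {
        let start = self.row_offsets[node as usize] as usize;
        let end = self.row_offsets[node as usize + 1] as usize;
        self.col_indices[start..end]
            .iter()
            .copied()
            .zip(self.weights[start..end].iter().copied())
    }
}
```

Because each node's neighbors are contiguous in memory, iteration is a linear scan with no pointer chasing, which is where the cache benefit comes from.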
Traversal Flow Diagram
traverse(slug, max_hops=3, weight_threshold=0.3)
═══════════════════════════════════════════════════════════════════════
START: "crime-001"
│
▼
LOOKUP HEAD INDEX (O(1))
│
▼
GET NODE → EntityId("crime-001")
│
▼
BACKWARD BFS ITERATION
│
├── HOPS 0: ["crime-001"] (starting node)
│
├── LOOKUP INCOMING INDEX (O(1))
│ │
│ └── incoming["crime-001"] → ["poverty", "economic-slump"]
│
├── FILTER: weight >= 0.3?
│ ├── poverty → crime-001 (weight: 0.8) ✓
│ └── economic-slump → crime-001 (weight: 0.9) ✓
│
├── ADD TO QUEUE: ["poverty", "economic-slump"]
│ MARK VISITED: {"crime-001", "poverty", "economic-slump"}
│
├── HOPS 1: "poverty"
│ │
│ └── incoming["poverty"] → ["unemployment"]
│ unemployment → poverty (weight: 0.85) ✓
│ ADD TO QUEUE: ["economic-slump", "unemployment"]
│
├── HOPS 2: "economic-slump"
│ │
│ └── incoming["economic-slump"] → ["regulation"]
│ regulation → economic-slump (weight: 0.7) ✓
│ ADD TO QUEUE: ["unemployment", "regulation"]
│
├── HOPS 3: "unemployment" (max_hops reached, stop)
│
▼
RESULT:
path: ["poverty", "economic-slump", "unemployment", "regulation"]
edges: [4 weighted edges]
total_weight: 3.25
═══════════════════════════════════════════════════════════════════════
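The flow above can be approximated with a short backward BFS over a reverse edge index. This is a minimal sketch, not the crate's `traverse` implementation: `Incoming`, `Traversal`, and `backward_bfs` are hypothetical names, string slugs stand in for entity identifiers, and `max_hops` is interpreted here as BFS depth from the starting node, which is an assumption.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Reverse index: effect -> list of (cause, weight).
pub type Incoming = HashMap<String, Vec<(String, f32)>>;

#[derive(Debug, Default)]
pub struct Traversal {
    pub path: Vec<String>,                 // causes in BFS discovery order
    pub edges: Vec<(String, String, f32)>, // (cause, effect, weight)
    pub total_weight: f32,                 // sum of accepted edge weights
}

/// Walks from `start` toward its causes, skipping edges below `threshold`
/// and never expanding nodes that sit `max_hops` away from the start.
pub fn backward_bfs(incoming: &Incoming, start: &str, max_hops: usize, threshold: f32) -> Traversal {
    let mut out = Traversal::default();
    let mut visited: HashSet<String> = HashSet::new();
    visited.insert(start.to_string());
    let mut queue: VecDeque<(String, usize)> = VecDeque::new();
    queue.push_back((start.to_string(), 0));

    while let Some((node, depth)) = queue.pop_front() {
        if depth == max_hops {
            continue; // hop budget exhausted for this branch
        }
        if let Some(preds) = incoming.get(&node) {
            for (cause, weight) in preds {
                if *weight < threshold {
                    continue; // weak evidence: filtered during traversal
                }
                out.edges.push((cause.clone(), node.clone(), *weight));
                out.total_weight += *weight;
                // `insert` returns true only the first time a node is seen.
                if visited.insert(cause.clone()) {
                    out.path.push(cause.clone());
                    queue.push_back((cause.clone(), depth + 1));
                }
            }
        }
    }
    out
}
```

Run on the example graph from the diagram with `max_hops = 3` and a threshold of `0.3`, this sketch discovers `poverty`, `economic-slump`, `unemployment`, and `regulation` in that order, with four edges whose weights sum to 3.25.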
3.3 MVCC (Multi-Version Concurrency Control)
// Each update creates a new revision
// Old versions are preserved for historical queries
// Version 0
db.write?; // rev = 0
// Version 1 (creates new revision)
db.write?; // rev = 1
// Version 2 (creates new revision)
db.write?; // rev = 2
// All versions are preserved
let v0 = storage.get_by_id?;
let v1 = storage.get_by_id?;
let current = storage.get_by_slug?; // rev = 2
3.4 Tombstones (Soft Deletes)
// Delete creates a tombstone (doesn't physically remove data)
db.delete?;
// Tombstone stores:
// - deleted_at: timestamp
// - reason: optional deletion reason
// get_by_slug returns None for deleted nodes
assert!;
// But historical versions still accessible
let old = storage.get_by_id?;
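The revision and tombstone behaviour from sections 3.3 and 3.4 can be modelled with an append-only history per slug. The sketch below is a simplified illustration: `RevisionStore`, `Version`, and `get_by_rev` are hypothetical names, and recording the tombstone as its own revision is an assumption made for this example, not a statement about Sekejap-DB's storage format.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub struct Version {
    pub rev: u64,
    pub payload: String,
    pub deleted: bool, // true marks a tombstone
}

/// Toy MVCC store: writes append a version, deletes append a tombstone.
#[derive(Default)]
pub struct RevisionStore {
    versions: HashMap<String, Vec<Version>>, // slug -> history ordered by rev
}

impl RevisionStore {
    /// Appends a new revision and returns its number (0 for the first write).
    pub fn write(&mut self, slug: &str, payload: &str) -> u64 {
        let history = self.versions.entry(slug.to_string()).or_default();
        let rev = history.len() as u64;
        history.push(Version { rev, payload: payload.to_string(), deleted: false });
        rev
    }

    /// Soft delete. Returns None if the slug is unknown or already deleted.
    pub fn delete(&mut self, slug: &str) -> Option<u64> {
        let history = self.versions.get_mut(slug)?;
        if history.last()?.deleted {
            return None;
        }
        let rev = history.len() as u64;
        history.push(Version { rev, payload: String::new(), deleted: true });
        Some(rev)
    }

    /// Latest live version; a tombstone at the head hides the node.
    pub fn get_by_slug(&self, slug: &str) -> Option<&Version> {
        self.versions.get(slug)?.last().filter(|v| !v.deleted)
    }

    /// Historical lookup: older revisions stay readable after a delete.
    pub fn get_by_rev(&self, slug: &str, rev: u64) -> Option<&Version> {
        self.versions.get(slug)?.get(rev as usize)
    }
}
```

Nothing is ever removed in this model, so a real engine would pair it with some form of compaction or retention policy.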
3.5 Key Data Structures
NodeHeader {
node_id: u128, // Unique identifier
slug_hash: u64, // Hashed slug for fast lookup
rev: u64, // Revision number (MVCC)
payload_ptr: BlobPtr, // Pointer to blob store
vector_ptr: Option<BlobPtr>, // Vector embedding
deleted: bool, // Tombstone flag
tombstone: Option<Tombstone>,
entity_id: Option<EntityId>,
}
WeightedEdge {
_from: EntityId, // Source entity
_to: EntityId, // Target entity
weight: f32, // Evidence strength (0-1)
_type: String, // User-defined edge type
payload: Option<EdgePayload>,
}
EntityId {
collection: CollectionId, // e.g., "news"
key: String, // e.g., "flood-2026"
}
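To show how an edge record like the one above might feed the forward and reverse indexes described in section 3.1, here is a small standalone sketch. It uses plain `String` identifiers and renamed fields (`from`, `to`, `edge_type`) for brevity; `EdgeIndex`, `successors`, and `predecessors` are hypothetical names for this example and not the crate's graph API.

```rust
use std::collections::HashMap;

/// Simplified edge record; field names are shortened for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct Edge {
    pub from: String,
    pub to: String,
    pub weight: f32,
    pub edge_type: String,
}

/// Forward and reverse adjacency maps kept in sync on every insert.
#[derive(Default)]
pub struct EdgeIndex {
    outgoing: HashMap<String, Vec<Edge>>,   // source -> full edge records
    incoming: HashMap<String, Vec<String>>, // target -> source identifiers
}

impl EdgeIndex {
    /// Records the edge under its source and registers the source under its target.
    pub fn add_edge(&mut self, from: &str, to: &str, weight: f32, edge_type: &str) {
        self.outgoing.entry(from.to_string()).or_default().push(Edge {
            from: from.to_string(),
            to: to.to_string(),
            weight,
            edge_type: edge_type.to_string(),
        });
        self.incoming.entry(to.to_string()).or_default().push(from.to_string());
    }

    /// One hash lookup, then a slice of all outgoing edges (empty if none).
    pub fn successors(&self, id: &str) -> &[Edge] {
        self.outgoing.get(id).map(Vec::as_slice).unwrap_or(&[])
    }

    /// One hash lookup, then a slice of all direct causes (empty if none).
    pub fn predecessors(&self, id: &str) -> &[String] {
        self.incoming.get(id).map(Vec::as_slice).unwrap_or(&[])
    }
}
```

The hash lookup itself is constant time on average; iterating the returned slice is proportional to the node's degree.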
3.6 Feature Flags
| Feature | Description | Enables |
|---|---|---|
| vector | Vector similarity search | .vector_search(), HNSW index |
| spatial | Geo queries | .spatial(), R-tree index |
| fulltext | Full-text search | .fulltext(), Tantivy index |
| all | All features | vector + spatial + fulltext |
# Build with all features
cargo build --features all
# Build with specific features
cargo build --features "vector,spatial"
3.7 File Structure
src/
├── lib.rs # Main API (SekejapDB struct)
├── types/
│ ├── mod.rs # Type exports and options
│ ├── node.rs # NodeHeader, NodePayload
│ ├── edge.rs # WeightedEdge, EdgePayload
│ ├── blob.rs # BlobStore for large payloads
│ ├── geometry.rs # Point, Polygon, Polyline
│ ├── collection.rs # EntityId, CollectionId
│ ├── schema.rs # CollectionSchema, VectorSchema
│ └── ...
├── storage/
│ ├── single.rs # MVCC storage (SingleStorage)
│ ├── ingestion.rs # Tier 1 buffer
│ └── promote.rs # Auto-promotion
├── graph/
│ ├── mod.rs # CausalGraph
│ ├── concurrent.rs # Thread-safe graph
│ └── ...
├── index/
│ ├── mod.rs # SlugIndex, SpatialIndex
│ └── spatial.rs # R-tree geo index
├── vectors/
│ ├── ops.rs # Vector operations
│ └── index.rs # Vector index
├── query.rs # Query builder
├── atoms.rs # Atomic operations
└── sekejapql.rs # JSON query language
Building and Testing
# Run tests
cargo test
# Build with all features
cargo build --features all
# Run specific example
cargo run --example <example_name>
# Check for errors
cargo check
License
MIT