Sekejap-DB
A graph-first, embedded multi-model database engine for Rust and Python.
1: Overview
Sekejap-DB is a graph-native database designed for high-performance, relationship-heavy workloads such as Root Cause Analysis (RCA), retrieval-augmented generation (RAG), and agentic AI.
It unifies Graph, Vector, Spatial, and Full-Text search into a single, cohesive engine where the Graph acts as the primary structure and other models serve as attributes or filters.
Think of it as "Data Legos": a transparent engine that replaces hidden black-box behavior with explicit connections, so you can see, build, and navigate your own logic brick by brick.
1.1 Features
- HNSW Engine: Custom, SIMD-accelerated HNSW implementation (AVX2/FMA) for panic-free, high-concurrency vector search.
- Graph-First: Relationships are first-class citizens. Queries traverse edges to prune the search space before applying expensive vector or text filters.
- Hybrid Querying: Native index intersection combines Graph, Vector, Spatial, and Text conditions in a single query.
- Embedded: Runs directly in your application process (Rust/Python). Zero network overhead.
- Atomic Primitives: Exposes low-level "atoms" (traverse, search_vector) for building complex custom query logic.
Graph-First Philosophy
┌─────────────────────────────────────────────────────────┐
│                       Sekejap-DB                        │
│                   Graph-First Design                    │
├─────────────────────────────────────────────────────────┤
│                                                         │
│   ┌──────────┐      causal_edge       ┌──────────┐      │
│   │   Node   │◄──────────────────────►│   Node   │      │
│   │ (vector) │         (0.85)         │  (geo)   │      │
│   └────┬─────┘                        └────┬─────┘      │
│        │                                   │            │
│ ┌──────┴───────┐                  ┌────────┴────────┐   │
│ │   Vectors    │                  │    Geo data     │   │
│ │ (embeddings) │                  │ (Point/Polygon) │   │
│ └──────────────┘                  └─────────────────┘   │
│                                                         │
│   → Graph is the CORE                                   │
│   → Vectors/Geo are ATTRIBUTES on nodes                 │
│   → Queries traverse RELATIONSHIPS                      │
└─────────────────────────────────────────────────────────┘
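The design in the diagram can be sketched in plain Rust as nodes that carry optional vector and geo attributes, connected by weighted, typed edges. This is a minimal illustration using only the standard library; `Node`, `Edge`, `Graph`, and `neighbours_with_geo` are hypothetical names chosen for this sketch and are not Sekejap-DB's actual types.

```rust
use std::collections::HashMap;

/// Hypothetical node: the graph is the core, vector/geo are optional attributes.
#[allow(dead_code)]
#[derive(Debug, Clone, Default)]
struct Node {
    slug: String,
    vector: Option<Vec<f32>>, // embedding, if any
    geo: Option<(f64, f64)>,  // (lat, lon) point, if any
}

/// Hypothetical weighted, typed edge, e.g. a causal edge with weight 0.85.
#[derive(Debug, Clone)]
struct Edge {
    to: String,
    edge_type: String,
    weight: f32,
}

#[derive(Default)]
struct Graph {
    nodes: HashMap<String, Node>,
    forward: HashMap<String, Vec<Edge>>,
}

impl Graph {
    fn add_node(&mut self, node: Node) {
        self.nodes.insert(node.slug.clone(), node);
    }

    fn add_edge(&mut self, from: &str, to: &str, edge_type: &str, weight: f32) {
        self.forward.entry(from.to_string()).or_default().push(Edge {
            to: to.to_string(),
            edge_type: edge_type.to_string(),
            weight,
        });
    }

    /// Graph-first lookup: follow matching edges first, then inspect
    /// attributes only on the neighbours that were actually reached.
    fn neighbours_with_geo(&self, from: &str, edge_type: &str, min_weight: f32) -> Vec<&Node> {
        self.forward
            .get(from)
            .into_iter()
            .flatten()
            .filter(|e| e.edge_type == edge_type && e.weight >= min_weight)
            .filter_map(|e| self.nodes.get(&e.to))
            .filter(|n| n.geo.is_some())
            .collect()
    }
}
```

The point of the sketch is the order of operations: the edge filter runs before any attribute is read, so attribute checks only touch nodes the traversal has already selected.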
2: Main Usage (Rust & Python)
2.1 Basic CRUD Operations
Write Data
Rust:
use SekejapDB;
let mut db = new?;
// Simple write
db.write?;
// JSON with Vector & Geo
db.write_json?;
Python:
=
# Simple write
# JSON with Vector & Geo
Read Data
Rust:
if let Some = db.read?
Python:
=
Delete Data
Rust:
// Cascade delete (removes edges too)
db.delete?;
// Keep edges for audit trail
db.delete_with_options?;
Python:
# Cascade delete (removes edges too)
# Keep edges for audit trail
2.2 Defining Schema & Collections
To enable Hybrid Search, you must define which fields are indexed. This tells Sekejap-DB:
- Which fields are Vectors (and what HNSW model to use).
- Which fields are Spatial (Point vs Polygon).
- Which fields are Full-Text searchable.
Rust:
db.define_collection?;
Python:
# Define schema for 'news' collection
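To make the three kinds of indexed fields concrete, here is a small, self-contained sketch of what a collection schema could look like as data. Everything in it (`IndexKind`, `CollectionSchema`, `index`, `kind_of`, and the `dims` parameter) is hypothetical and written for illustration only; it does not reproduce the arguments of `define_collection`.

```rust
/// Hypothetical index kinds mirroring the vector / spatial / full-text split.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum IndexKind {
    Vector { dims: usize },
    SpatialPoint,
    SpatialPolygon,
    FullText,
}

/// Hypothetical collection schema: an ordered list of indexed fields.
#[derive(Debug, Default)]
struct CollectionSchema {
    name: String,
    fields: Vec<(String, IndexKind)>,
}

impl CollectionSchema {
    fn new(name: &str) -> Self {
        Self { name: name.to_string(), fields: Vec::new() }
    }

    /// Registers `field` under `kind`; rejects a field declared twice.
    fn index(mut self, field: &str, kind: IndexKind) -> Result<Self, String> {
        if self.fields.iter().any(|(f, _)| f == field) {
            return Err(format!("field `{field}` is already indexed in `{}`", self.name));
        }
        self.fields.push((field.to_string(), kind));
        Ok(self)
    }

    /// Looks up how a field is indexed, if at all.
    fn kind_of(&self, field: &str) -> Option<&IndexKind> {
        self.fields.iter().find(|(f, _)| f == field).map(|(_, k)| k)
    }
}
```

A schema for the 'news' collection might then chain `.index("embedding", IndexKind::Vector { dims: 384 })`, `.index("location", IndexKind::SpatialPoint)`, and `.index("title", IndexKind::FullText)`; the field names and dimension here are made up.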
2.3 Hybrid Query (Graph + Vector + Spatial + Text)
Scenario: Find events caused by "Heavy Rain" (Graph), in "South Jakarta" (Spatial), matching "Accident" (Text), and similar to a "Severe Crash" vector.
Rust:
// Use the Query Builder for automatic intersection
let results = db.query
.has_edge_from // Graph
.spatial? // Spatial (5km)
.fulltext? // Text
.vector_search // Vector
.execute?;
Python:
# Use the Query Builder for automatic intersection
= \
\
\
\
\
2.4 Graph Traversal & Aggregation
Scenario: Count events rolling up to a District (Hierarchy: Event -> SubDistrict -> District).
Rust:
// Traverse 2 hops: Event -> SubDistrict -> District
// traverse_forward(slug, hops, min_weight, edge_type, time_window)
let results = db.traverse_forward?;
// Logic to check if District node is in results.path...
Python:
# Traverse 2 hops: Event -> SubDistrict -> District
# traverse_forward(slug, hops, min_weight, edge_type)
=
# Logic to check if District node is in results.path...
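The roll-up logic described in this scenario can be sketched independently of the database as a hop-bounded breadth-first walk. The function below borrows the parameter names from the comment above (slug, hops, min_weight, edge_type) but is a standalone illustration over an in-memory adjacency map, not Sekejap-DB's implementation; the `located_in` edge type and the `Adjacency` alias are assumptions.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

struct Edge {
    to: &'static str,
    edge_type: &'static str,
    weight: f32,
}

type Adjacency = HashMap<&'static str, Vec<Edge>>;

/// Breadth-first walk of at most `hops` edges from `start`, following only
/// edges of `edge_type` with weight >= `min_weight`.
/// Returns every node reached, excluding `start` itself.
fn traverse_forward(
    adj: &Adjacency,
    start: &'static str,
    hops: usize,
    min_weight: f32,
    edge_type: &str,
) -> HashSet<&'static str> {
    let mut seen = HashSet::from([start]);
    let mut reached = HashSet::new();
    let mut queue = VecDeque::from([(start, 0usize)]);
    while let Some((node, depth)) = queue.pop_front() {
        if depth == hops {
            continue; // hop budget exhausted on this branch
        }
        for e in adj.get(node).into_iter().flatten() {
            if e.edge_type == edge_type && e.weight >= min_weight && seen.insert(e.to) {
                reached.insert(e.to);
                queue.push_back((e.to, depth + 1));
            }
        }
    }
    reached
}

/// Counts how many of `events` reach `district` within two hops
/// (Event -> SubDistrict -> District).
fn count_events_in(adj: &Adjacency, events: &[&'static str], district: &str) -> usize {
    events
        .iter()
        .filter(|&&ev| traverse_forward(adj, ev, 2, 0.0, "located_in").contains(district))
        .count()
}
```

Running one traversal per event is the simplest formulation; for large event sets, a single backward walk from the district over reverse edges would visit each node only once.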
2.5 Causal Root Cause Analysis
Scenario: Find the root causes of a specific crime event (backward traversal).
Rust:
// traverse(slug, hops, min_weight, edge_type)
let results = db.traverse?;
for edge in &results.edges
Python:
# traverse(slug, hops, min_weight, edge_type)
=
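Backward traversal for root cause analysis can be illustrated with a reverse adjacency map, where each effect points to its candidate causes. The sketch below is an assumption-laden stand-in, not the library's `traverse`: `ReverseAdj`, `traverse_backward`, and `root_causes` are hypothetical names, and a "root cause" is taken here to mean a reached cause with no recorded incoming edge of its own.

```rust
use std::collections::{HashMap, HashSet};

/// reverse[effect] = list of (cause, weight) pairs.
type ReverseAdj = HashMap<&'static str, Vec<(&'static str, f32)>>;

/// Walks backwards from `effect` for at most `hops` steps, keeping only
/// edges with weight >= `min_weight`.
/// Returns the visited edges as (cause, effect, weight) triples.
fn traverse_backward(
    reverse: &ReverseAdj,
    effect: &'static str,
    hops: usize,
    min_weight: f32,
) -> Vec<(&'static str, &'static str, f32)> {
    let mut edges = Vec::new();
    let mut seen = HashSet::from([effect]);
    let mut frontier = vec![effect];
    for _ in 0..hops {
        let mut next = Vec::new();
        for node in frontier {
            for &(cause, w) in reverse.get(node).into_iter().flatten() {
                if w >= min_weight {
                    edges.push((cause, node, w));
                    if seen.insert(cause) {
                        next.push(cause);
                    }
                }
            }
        }
        frontier = next;
    }
    edges
}

/// Causes that appear in `edges` but have no recorded incoming edge themselves.
fn root_causes(
    reverse: &ReverseAdj,
    edges: &[(&'static str, &'static str, f32)],
) -> Vec<&'static str> {
    let mut roots: Vec<_> = edges
        .iter()
        .map(|&(cause, _, _)| cause)
        .filter(|cause| !reverse.contains_key(cause))
        .collect();
    roots.sort();
    roots.dedup();
    roots
}
```

The `min_weight` threshold is what keeps weak correlations out of the causal chain: in a graph where "theft" has causes "blackout" (0.8) and "crowd" (0.2), a threshold of 0.5 follows only the blackout branch.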
3: Architecture & Performance
3.1 Multi-Tier Storage Architecture
Sekejap-DB employs a unique three-tier storage design to balance write throughput, read latency, and graph traversability.
- Tier 1: Ingestion Buffer (LSM-Tree)
  - Purpose: High-velocity write staging.
  - Behavior: Accepts writes immediately. Data is "staged" and eventually promoted.
- Tier 2: Serving Layer (CoW B+Tree)
  - Purpose: Low-latency reads and persistence.
  - Behavior: Stores the canonical version of Nodes and Blobs. Optimized for random access by ID.
- Tier 3: Knowledge Graph (Adjacency)
  - Purpose: Relationship traversal and RCA.
  - Structure: In-memory adjacency lists (forward/reverse) backed by concurrent maps.
  - Behavior: Edges connect Nodes across Tiers 1 and 2. Traversal algorithms run here.
Data Flow: Write -> Tier 1 -> (Async Promotion) -> Tier 2 -> (Graph Indexing) -> Tier 3.
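The data flow above can be modeled in a few lines with standard collections. This is a toy, single-threaded sketch: a `Vec` stands in for the LSM-tree buffer, a `BTreeMap` for the CoW B+Tree, and promotion is an explicit call rather than an asynchronous task. The `Store` type and its methods are hypothetical, and the choice to let reads consult the staging buffer first is an assumption of this sketch.

```rust
use std::collections::{BTreeMap, HashMap};

#[derive(Default)]
struct Store {
    staging: Vec<(String, String)>,        // Tier 1: append-only write buffer
    serving: BTreeMap<String, String>,     // Tier 2: canonical slug -> payload
    forward: HashMap<String, Vec<String>>, // Tier 3: forward adjacency
    reverse: HashMap<String, Vec<String>>, // Tier 3: reverse adjacency
}

impl Store {
    /// Writes land in the staging buffer immediately.
    fn write(&mut self, slug: &str, payload: &str) {
        self.staging.push((slug.to_string(), payload.to_string()));
    }

    /// Assumption: reads check staging first (newest write wins), then Tier 2.
    fn read(&self, slug: &str) -> Option<&str> {
        self.staging
            .iter()
            .rev()
            .find(|(k, _)| k == slug)
            .map(|(_, v)| v.as_str())
            .or_else(|| self.serving.get(slug).map(String::as_str))
    }

    /// Moves all staged writes into the serving layer; returns how many moved.
    fn promote(&mut self) -> usize {
        let promoted = self.staging.len();
        for (k, v) in self.staging.drain(..) {
            self.serving.insert(k, v);
        }
        promoted
    }

    /// Indexes an edge in both directions so traversal can run either way.
    fn link(&mut self, from: &str, to: &str) {
        self.forward.entry(from.to_string()).or_default().push(to.to_string());
        self.reverse.entry(to.to_string()).or_default().push(from.to_string());
    }
}
```

Keeping both forward and reverse lists doubles edge storage but makes backward traversal, as used for RCA, as cheap as forward traversal.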
3.2 Query Execution: Index Intersection
Unlike PostgreSQL (cost-based planner) or ArangoDB (inverted index), Sekejap-DB uses explicit set intersection to keep query performance deterministic.
- Parallel Drivers: Enabled searchers (Vector, Spatial, Graph) run independently to fetch candidate Node IDs.
- Bitwise Intersection: Candidate sets are intersected (HashSet).
- Deterministic Latency: Performance is predictable and scales with the selectivity of the strongest filter.
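The intersection step described above can be sketched with `std::collections::HashSet`. The function name `intersect_candidates` and the use of `u64` node IDs are assumptions for illustration. Starting from the smallest set means the work is bounded by the most selective filter, which matches the latency claim in the list.

```rust
use std::collections::HashSet;

/// Intersects candidate ID sets, smallest first, so the number of membership
/// checks is bounded by the size of the most selective set.
/// An empty list of sets yields an empty result.
fn intersect_candidates(mut sets: Vec<HashSet<u64>>) -> HashSet<u64> {
    sets.sort_by_key(|s| s.len());
    let mut iter = sets.into_iter();
    let mut acc = match iter.next() {
        Some(first) => first,
        None => return HashSet::new(),
    };
    for other in iter {
        acc.retain(|id| other.contains(id));
        if acc.is_empty() {
            break; // nothing can survive further intersections
        }
    }
    acc
}
```

Returning an empty set for "no filters" is a design choice of this sketch; a real query builder might instead treat that case as "match everything" or reject the query.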
3.3 Vector Engine (Hyper-Sekejap HNSW)
Sekejap-DB implements a custom HNSW engine from scratch to resolve stability issues found in other libraries.
- SIMD Acceleration: AVX2/FMA optimized distance kernels for x86_64.
- Zero-Panic: Built with crossbeam-epoch for safe, lock-free concurrency.
- Dynamic Mmap: Automatically expands storage files as data grows.
- Contiguous Layout: Vectors stored in aligned, memory-mapped buffers for cache efficiency.
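To illustrate the contiguous layout idea, the sketch below packs fixed-dimension vectors back to back in one `Vec<f32>` and computes squared L2 distance with a scalar loop. It is deliberately simplified: there is no memory mapping, no explicit AVX2/FMA code, and no HNSW graph, and `VectorArena`, `l2_squared`, and `nearest` are hypothetical names. The exhaustive `nearest` scan is exactly the work an HNSW index is meant to avoid.

```rust
/// Fixed-dimension vectors stored back to back in a single buffer.
struct VectorArena {
    dims: usize,
    data: Vec<f32>,
}

impl VectorArena {
    fn new(dims: usize) -> Self {
        assert!(dims > 0, "dims must be non-zero");
        Self { dims, data: Vec::new() }
    }

    /// Appends a vector and returns its index in the arena.
    fn push(&mut self, v: &[f32]) -> usize {
        assert_eq!(v.len(), self.dims, "dimension mismatch");
        self.data.extend_from_slice(v);
        self.len() - 1
    }

    fn len(&self) -> usize {
        self.data.len() / self.dims
    }

    /// Borrows vector `idx` as a slice into the shared buffer (no copy).
    fn get(&self, idx: usize) -> &[f32] {
        let start = idx * self.dims;
        &self.data[start..start + self.dims]
    }

    /// Exhaustive nearest neighbour by squared L2 distance.
    fn nearest(&self, query: &[f32]) -> Option<(usize, f32)> {
        (0..self.len())
            .map(|i| (i, l2_squared(self.get(i), query)))
            .min_by(|a, b| a.1.total_cmp(&b.1))
    }
}

/// Scalar squared Euclidean distance; both slices must have equal length.
fn l2_squared(a: &[f32], b: &[f32]) -> f32 {
    debug_assert_eq!(a.len(), b.len());
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}
```

Because every vector lives at `idx * dims` in one allocation, a scan walks memory sequentially, which is the cache-efficiency benefit the list refers to.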
3.4 Spatial & Full-Text Indexing
- Spatial (R-Tree): Uses rstar for O(log n) point-in-radius and polygon intersection queries. Spatial keys map directly to Node IDs.
- Full-Text (Tantivy): Integrates Tantivy for schema-aware lexical search. Writes are staged in Tier 1 before being committed to the Tantivy index, ensuring near-real-time visibility.
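As a reference for what a point-in-radius query computes, here is a brute-force version using the haversine formula. It scans every point in O(n), so it is the baseline an R-tree improves on, not a substitute for one. The function names, the `(lat, lon)` tuple convention, and the mean Earth radius of 6371 km are assumptions of this sketch.

```rust
const EARTH_RADIUS_KM: f64 = 6371.0;

/// Great-circle distance in kilometres between two (lat, lon) points in degrees.
fn haversine_km(a: (f64, f64), b: (f64, f64)) -> f64 {
    let (lat1, lon1) = (a.0.to_radians(), a.1.to_radians());
    let (lat2, lon2) = (b.0.to_radians(), b.1.to_radians());
    let dlat = lat2 - lat1;
    let dlon = lon2 - lon1;
    let h = (dlat / 2.0).sin().powi(2)
        + lat1.cos() * lat2.cos() * (dlon / 2.0).sin().powi(2);
    2.0 * EARTH_RADIUS_KM * h.sqrt().asin()
}

/// Returns the IDs of all points within `radius_km` of `center`, in input order.
fn within_radius(points: &[(u64, (f64, f64))], center: (f64, f64), radius_km: f64) -> Vec<u64> {
    points
        .iter()
        .filter(|(_, p)| haversine_km(*p, center) <= radius_km)
        .map(|(id, _)| *id)
        .collect()
}
```

The returned IDs are the kind of candidate set that would then feed into the index intersection step described in section 3.2.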
Installation
Rust
Add to Cargo.toml:
[dependencies]
= { version = "0.1.0", features = ["fulltext", "vector", "spatial"] }
Python
Building and Testing
# Run tests
# Run benchmarks
# Check for errors
License
MIT