OverGraph is a graph database that runs inside your process. No server, no network calls, no Docker containers. You open a directory, and you have a full graph database with temporal edges, weighted relationships, and sub-microsecond lookups.
I built it because I wanted a graph database that was genuinely fast — not "fast for a database," but fast enough that you forget it's there. Node lookups in 200 nanoseconds. Neighbor traversals in 2 microseconds. Batch writes at 600K+ nodes per second. And I wanted it to work everywhere I work: Rust, Node.js, and Python, all backed by the same engine.
It's written entirely in Rust and it ships native connectors for Node.js (napi-rs) and Python (PyO3) so you can use it from whatever you're building in.
What makes it different
- Truly embedded. No separate process. No socket. Your database is a folder on disk. Copy it, move it, back it up with
cp -r. - Rich graph primitives. Weighted nodes and edges, temporal validity windows, exponential decay scoring, automatic retention policies. Model relationships that evolve over time, and let the graph clean up what's no longer relevant.
- Fast where it matters. Node lookups in ~200ns. Neighbor traversal in ~2μs. Batch writes at 600K+ nodes/sec. The storage engine is a log-structured merge tree with mmap'd immutable segments, so reads never block writes.
- Three languages, one engine. Rust core with native bindings for Node.js (napi-rs) and Python (PyO3). Not a wrapper around a REST API. Actual FFI into the same Rust engine, with <3x overhead on most operations.
- No query language. Just functions.
upsert_node,neighbors,find_nodes. If you can call a function, you can use OverGraph.
Performance
All numbers from a real benchmark suite running on the Rust core (group-commit durability mode, small profile: 10K nodes / 50K edges). Full methodology and reproducibility guide in docs/04-quality/Benchmark-Methodology.md.
| Operation | Latency | Throughput |
|---|---|---|
get_node |
0.2 μs | 5.5M ops/s |
upsert_node |
2.2 μs | 807K ops/s |
neighbors (1-hop) |
2.1 μs | 541K ops/s |
batch_upsert_nodes (100) |
236 μs | 625K nodes/s |
top_k_neighbors |
17.5 μs | 81K ops/s |
personalized_pagerank |
254 μs | 4.3K ops/s |
Node.js and Python connectors add roughly 1.5-3x overhead depending on the operation. Batch operations are nearly 1:1 because they amortize the FFI boundary cost. Full cross-language comparison in the launch benchmark pack.
Install
Rust
# Cargo.toml
[]
= "0.1"
Node.js
Python
Note: Prebuilt binaries are coming soon. For now, you'll need the Rust toolchain installed to compile the native extension during
npm installorpip install.
Quick start
Rust
use ;
use BTreeMap;
use Path;
Node.js
import from 'overgraph';
const db = await ;
// Create nodes
const = await db.;
// Connect them
await db.;
// Traverse
const neighbors = await db.;
console.log;
await db.;
Python
# Create nodes
=
=
# Connect them
# Traverse
=
Or async:
= await
= await
Features
Core graph operations
- Upsert semantics. Nodes are keyed by
(type_id, key). Upsert the same key twice and you get an update, not a duplicate. Edges can optionally enforce uniqueness on(from, to, type_id). - Batch operations.
batch_upsert_nodesandbatch_upsert_edgesamortize WAL and memtable overhead. There's also a packed binary format for maximum throughput. - Atomic graph patch.
graph_patchlets you upsert nodes, upsert edges, delete nodes, delete edges, and invalidate edges in a single atomic operation.
Temporal edges
- Validity windows. Edges have optional
valid_fromandvalid_totimestamps. Query at any point in time with theat_epochparameter and only see edges that were valid at that moment. - Edge invalidation. Mark an edge as no longer valid without deleting it. The history is preserved.
- Decay scoring. Pass a
decay_lambdato neighbor queries and edge weights are automatically scaled byexp(-lambda * age_hours). Recent connections matter more.
Queries and traversal
- 1-hop and 2-hop neighbors. With optional edge type filters, direction control, and limit.
- Constrained 2-hop. Traverse specific edge types in the first hop, then filter target nodes by type in the second hop. Built for "find all X related to Y through Z" patterns.
- Top-K neighbors. Get the K highest-scoring neighbors by weight, recency, or decay-adjusted score.
- Personalized PageRank. Run PPR from seed nodes to find the most relevant nodes in the graph. Useful for context retrieval in RAG pipelines.
- Subgraph extraction. Pull out a connected subgraph up to N hops deep. Good for building local context windows.
- Property search. Find nodes by
type_id+ property equality. Hash-indexed for O(1) lookup. - Time-range queries. Find nodes created or updated within a time window. Sorted timestamp index for efficient range scans.
Pagination
Every collection-returning API supports keyset pagination with limit and after parameters. Cursors are stable across concurrent writes. No offset-based pagination, no skipping, no missed records.
Retention and pruning
- Manual prune. Drop nodes older than X, below weight Y, or matching type Z. Incident edges cascade automatically.
- Named prune policies. Register policies like
"short_term_memory"that run automatically during compaction. Nodes matching any policy are invisible to reads immediately (lazy expiration) and cleaned up during the next compaction pass.
Storage engine
- Write-ahead log. Every mutation hits the WAL before the memtable. Crash recovery replays the WAL on startup.
- Configurable durability.
Immediatemode fsyncs every write for maximum safety.GroupCommitmode (default) batches fsyncs on a 10ms timer for ~20x better write throughput with at most one timer interval of data at risk. - Background compaction. Segments are merged automatically when thresholds are met. Compaction runs on a background thread and never blocks reads or writes. Uses metadata sidecars for fast filtered merging without full record decoding.
- mmap'd reads. Immutable segments are memory-mapped. The OS page cache handles caching. Reads never block writes.
- Portable databases. Each database is a self-contained directory.
cp -r ./my-db /backup/my-dband you're done.
How it works
OverGraph uses a log-structured merge tree (LSM) storage engine, similar in spirit to how LevelDB and RocksDB work, but purpose-built from scratch in pure Rust.
Write path: Mutations are appended to a write-ahead log and applied to an in-memory table (the memtable). When the memtable gets large enough, it's frozen and flushed to disk as an immutable segment with pre-built indexes.
Read path: Queries check the memtable first (freshest data), then scan immutable segments from newest to oldest. A K-way merge combines results across all sources with early termination for paginated queries.
Compaction: A background thread periodically merges older segments together, applying tombstones, prune policies, and deduplication. The compaction path uses metadata sidecars to plan merges and raw-copies winning records without deserializing them, then rebuilds all indexes from metadata. This keeps compaction fast even on large datasets.
On-disk layout:
my-graph/
manifest.current # atomic checkpoint (JSON)
data.wal # append-only write-ahead log
segments/
seg_0001/
nodes.dat # node records
edges.dat # edge records
adj_out.idx # outgoing adjacency index
adj_in.idx # incoming adjacency index
key_index.dat # (type_id, key) -> node_id
type_index.dat # type_id -> [id...]
tombstones.dat # deleted IDs
seg_0002/
...
For a deeper dive, see the architecture overview.
API reference
- Rust: Run
cargo doc --opento browse the full rustdoc. All public types and methods are documented. - Node.js: See the TypeScript declarations for the complete API surface. Every method has a sync and async variant.
- Python: See the type stubs for the complete API surface. Both sync (
OverGraph) and async (AsyncOverGraph) classes are available.
Running the benchmarks
# Rust
# Node.js
# Python
Benchmark methodology, FAQ, and reproducibility instructions are in docs/04-quality/.
Building from source
# Rust core
# Node.js connector
# Python connector
Contributing
See CONTRIBUTING.md for build instructions, coding conventions, and how to submit a pull request.
License
Licensed under either of
at your option.