kyu-graph
High-performance embedded property graph database for Rust.
KyuGraph is a pure-Rust embedded graph database implementing the openCypher query language. It uses columnar storage, vectorized execution, and optional Cranelift JIT compilation for analytical graph workloads.
Features
- openCypher queries: MATCH, CREATE, SET, DELETE, MERGE, WITH, ORDER BY, COPY FROM, and more
- In-memory or persistent: zero-config in-memory mode, or durable on-disk storage with WAL and automatic checkpointing
- Parameterized queries: safe $param placeholders resolved at bind time, with JSON construction helpers
- Columnar engine: cache-friendly node-group layout with selection-vector filtering and morsel-driven pipelines
- Cranelift JIT: filter predicates and projections compiled to native code for up to 22x speedup
- Bulk ingestion: COPY FROM for CSV, Parquet, and Arrow IPC
- Extension system: pluggable graph algorithms, full-text search, and vector similarity
- Arrow Flight: gRPC interface for remote clients and BI tools
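To make the selection-vector filtering mentioned in the feature list more concrete, here is a minimal sketch of the general technique. It is illustrative only and is not KyuGraph's implementation; the helper names `filter_column` and `refine` are hypothetical.

```rust
/// Hypothetical helper: scan a column once and return the indices of the rows
/// that satisfy `pred`. The column data itself is never copied or compacted.
fn filter_column<T>(column: &[T], pred: impl Fn(&T) -> bool) -> Vec<u32> {
    column
        .iter()
        .enumerate()
        .filter(|&(_, v)| pred(v))
        .map(|(i, _)| i as u32)
        .collect()
}

/// Hypothetical helper: apply a second predicate to another column, visiting
/// only the row positions that survived the previous filter.
fn refine<T>(selection: &[u32], column: &[T], pred: impl Fn(&T) -> bool) -> Vec<u32> {
    selection
        .iter()
        .copied()
        .filter(|&i| pred(&column[i as usize]))
        .collect()
}
```

Chaining `filter_column` on one column and `refine` on another evaluates a conjunction of predicates while touching later columns only at the surviving positions, which is the cache-friendly property the bullet refers to.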
Quick Start
Add kyu-graph to your Cargo.toml:
[dependencies]
kyu-graph = "0.1"
In-Memory Database
use ;
let db = in_memory;
let conn = db.connect;
// Create schema
conn.query
.unwrap;
conn.query
.unwrap;
// Insert data
conn.query.unwrap;
conn.query.unwrap;
conn.query
.unwrap;
// Query
let result = conn.query.unwrap;
for row in result.iter_rows
Persistent Database
use Database;
let db = open.unwrap;
let conn = db.connect;
conn.query.unwrap;
// Data is checkpointed to disk automatically.
Parameterized Queries
Pass dynamic values safely via $param placeholders instead of string interpolation:
use ;
let db = in_memory;
let conn = db.connect;
conn.query
.unwrap;
conn.query.unwrap;
// From the json! macro
let ctx = with_params_json;
// From a JSON string (useful for config files or HTTP request bodies)
let ctx = with_params_str.unwrap;
// Or use HashMap<String, TypedValue> directly
use HashMap;
use TypedValue;
let mut params = new;
params.insert;
let result = conn.query_with_params.unwrap;
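As a rough illustration of why bind-time placeholders are safer than string interpolation, the sketch below resolves `$name` tokens against a parameter map without ever rewriting the query text. It is not KyuGraph's binder: the `Param` enum and the `placeholders` and `bind` functions are hypothetical, and the scanner makes the simplifying assumption that `$` never appears inside a string literal.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a typed parameter value.
#[derive(Debug, Clone, PartialEq)]
enum Param {
    Int(i64),
    Text(String),
}

/// Collect `$name` placeholders in order of appearance.
/// Simplification: `$` inside string literals is not special-cased.
fn placeholders(query: &str) -> Vec<String> {
    let mut names = Vec::new();
    let mut chars = query.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '$' {
            let mut name = String::new();
            while let Some(&n) = chars.peek() {
                if n.is_alphanumeric() || n == '_' {
                    name.push(n);
                    chars.next();
                } else {
                    break;
                }
            }
            if !name.is_empty() {
                names.push(name);
            }
        }
    }
    names
}

/// Resolve every placeholder to its value. The query string is only read,
/// so a value can never change the structure of the statement.
fn bind(query: &str, params: &HashMap<String, Param>) -> Result<Vec<(String, Param)>, String> {
    placeholders(query)
        .into_iter()
        .map(|name| match params.get(&name) {
            Some(value) => Ok((name, value.clone())),
            None => Err(format!("missing parameter ${}", name)),
        })
        .collect()
}
```

Because values travel next to the query rather than inside it, a string such as `O'Brien" OR 1=1` is bound as plain data, and a missing parameter is reported as an error instead of producing a malformed statement.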
Environment Bindings
For automation pipelines, env() lookups resolve from an environment map at bind time:
use ;
use HashMap;
use TypedValue;
let db = in_memory;
let conn = db.connect;
conn.query.unwrap;
let params = new;
let mut env = new;
env.insert;
let result = conn.execute;
Bulk Data Import
use Database;
let db = in_memory;
let conn = db.connect;
conn.query.unwrap;
conn.query.unwrap;
Delta Fast Path
For high-throughput ingestion (agentic code graphs, document pipelines), apply_delta provides conflict-free, idempotent upserts that bypass optimistic concurrency control (OCC):
use ;
let db = in_memory;
let conn = db.connect;
conn.query.unwrap;
conn.query.unwrap;
let batch = new
.upsert_node
.upsert_node
.upsert_edge
.build;
let stats = conn.apply_delta.unwrap;
println!; // nodes: +2/~0/-0, edges: +1/~0/-0, ...
Semantics: last-write-wins on timestamp. Replaying the same batch is a no-op (idempotent). Use DeltaBatch::with_vector_clock() for causal ordering between workers. See kyu-delta for full API.
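The last-write-wins and idempotency semantics can be sketched in a few lines. This is a toy model, not the kyu-delta implementation: `Versioned`, `Stats`, and `apply_batch` are hypothetical names, and it assumes that an incoming write wins only when its timestamp is strictly newer, which is what makes a replay a no-op.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Hypothetical per-batch counters, loosely mirroring the +added/~updated output.
#[derive(Debug, Default, PartialEq)]
struct Stats {
    added: usize,
    updated: usize,
    unchanged: usize,
}

/// Hypothetical stored value tagged with the timestamp of its last write.
#[derive(Debug, Clone, PartialEq)]
struct Versioned {
    timestamp: u64,
    value: String,
}

/// Apply a batch of upserts with last-write-wins on timestamp.
/// Assumption: ties keep the stored value, so replaying a batch changes nothing.
fn apply_batch(store: &mut HashMap<String, Versioned>, batch: &[(String, Versioned)]) -> Stats {
    let mut stats = Stats::default();
    for (key, incoming) in batch {
        match store.entry(key.clone()) {
            Entry::Vacant(slot) => {
                slot.insert(incoming.clone());
                stats.added += 1;
            }
            Entry::Occupied(mut slot) => {
                if incoming.timestamp > slot.get().timestamp {
                    slot.insert(incoming.clone());
                    stats.updated += 1;
                } else {
                    stats.unchanged += 1;
                }
            }
        }
    }
    stats
}
```

Since the outcome depends only on the key and the timestamp, two workers can apply the same batch in either order, or more than once, and converge on the same state without coordinating.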
Extensions
KyuGraph ships with pluggable extensions. Register them on the database before creating connections:
use kyu_graph::Database;
let mut db = Database::in_memory();
// db.register_extension(Box::new(ext_algo::AlgoExtension));
let conn = db.connect();
Procedures are invoked with CALL:
CALL algo.pageRank(0.85, 20, 0.000001)
CALL fts.search('graph database', 5)
ext-algo — Graph Algorithms
| Call | Returns | Description |
|---|---|---|
| CALL algo.pageRank(damping, max_iter, tol) | node_id INT64, rank DOUBLE | PageRank centrality (defaults: 0.85, 20, 1e-6) |
| CALL algo.wcc() | node_id INT64, component INT64 | Weakly connected components |
| CALL algo.betweenness() | node_id INT64, centrality DOUBLE | Betweenness centrality |
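To show what the three pageRank arguments control, here is a small power-iteration sketch. It is a textbook formulation and not the ext-algo source: the function `page_rank` and its edge-list input are hypothetical, and spreading the rank of dangling nodes evenly across the graph is an assumption, since the extension's handling of that case is not documented here.

```rust
/// Textbook PageRank by power iteration over an edge list of (src, dst) pairs.
/// Node ids must be in 0..n. Stops after `max_iter` rounds or once the L1
/// change between rounds drops below `tol`.
fn page_rank(
    n: usize,
    edges: &[(usize, usize)],
    damping: f64,
    max_iter: usize,
    tol: f64,
) -> Vec<f64> {
    if n == 0 {
        return Vec::new();
    }
    let mut out_degree = vec![0usize; n];
    for &(src, _) in edges {
        out_degree[src] += 1;
    }
    let mut rank = vec![1.0 / n as f64; n];
    for _ in 0..max_iter {
        // Assumption: rank held by nodes with no outgoing edges is shared evenly.
        let dangling: f64 = (0..n)
            .filter(|&i| out_degree[i] == 0)
            .map(|i| rank[i])
            .sum();
        let base = (1.0 - damping) / n as f64 + damping * dangling / n as f64;
        let mut next = vec![base; n];
        for &(src, dst) in edges {
            next[dst] += damping * rank[src] / out_degree[src] as f64;
        }
        let delta: f64 = next.iter().zip(&rank).map(|(a, b)| (a - b).abs()).sum();
        rank = next;
        if delta < tol {
            break;
        }
    }
    rank
}
```

With the documented defaults this would be called as `page_rank(n, &edges, 0.85, 20, 1e-6)`: a higher damping factor gives more weight to link structure, while `max_iter` and `tol` trade accuracy for runtime.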
ext-fts — Full-Text Search
Built on Tantivy with BM25 ranking.
| Call | Returns | Description |
|---|---|---|
| CALL fts.add(content) | doc_id INT64 | Index a document |
| CALL fts.search(query, limit) | doc_id INT64, score DOUBLE, snippet STRING | BM25-ranked search (default limit=10) |
| CALL fts.clear() | status STRING | Reset the index |
ext-vector — Vector Similarity Search
HNSW index with SIMD-accelerated distance computation (NEON on ARM, AVX2 on x86).
| Call | Returns | Description |
|---|---|---|
| CALL vector.build(dim, metric) | status STRING | Create an index ('l2' or 'cosine') |
| CALL vector.add(id, vector_csv) | status STRING | Insert a vector (comma-separated floats) |
| CALL vector.search(query_csv, k) | id INT64, distance DOUBLE | Approximate k-nearest-neighbor search |
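For intuition about the two metrics and the comma-separated input format, the sketch below implements an exact brute-force nearest-neighbor search. It deliberately swaps the HNSW index for a linear scan, so it is a reference baseline rather than the extension's algorithm; all function names are hypothetical, and whether the extension reports L2 distance with or without the square root is not stated in this document.

```rust
/// Parse a comma-separated list of floats such as "0.1, 0.2, 0.3".
fn parse_csv(vector_csv: &str) -> Result<Vec<f32>, std::num::ParseFloatError> {
    vector_csv
        .split(',')
        .map(|s| s.trim().parse::<f32>())
        .collect()
}

/// Euclidean (L2) distance. Assumption: the square root is applied.
fn l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y) * (x - y))
        .sum::<f32>()
        .sqrt()
}

/// Cosine distance, defined here as 1 - cosine similarity.
/// Both vectors are assumed to be non-zero.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    1.0 - dot / (norm_a * norm_b)
}

/// Exact k-nearest neighbors by linear scan, closest first.
fn knn(
    items: &[(i64, Vec<f32>)],
    query: &[f32],
    k: usize,
    dist: fn(&[f32], &[f32]) -> f32,
) -> Vec<(i64, f32)> {
    let mut scored: Vec<(i64, f32)> = items
        .iter()
        .map(|(id, v)| (*id, dist(v, query)))
        .collect();
    scored.sort_by(|a, b| a.1.total_cmp(&b.1));
    scored.truncate(k);
    scored
}
```

A scan like this is a handy way to check the recall of an approximate index on a small sample: L2 ranks by absolute position, whereas cosine ignores vector length and ranks by direction only.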
ext-json — JSON Functions
SIMD-accelerated JSON scalar functions (powered by simd-json) usable in any expression context:
RETURN json_extract(doc.data, '$.name') AS name
| Function | Signature | Description |
|---|---|---|
| json_extract | (json STRING, path STRING) -> value | Extract a value by JSONPath |
| json_valid | (json STRING) -> BOOL | Check if a string is valid JSON |
| json_type | (json STRING) -> STRING | Return the JSON type name |
| json_keys | (json STRING) -> LIST[STRING] | Extract top-level object keys |
| json_array_length | (json STRING) -> INT64 | Count elements in a JSON array |
| json_contains | (json STRING, value STRING) -> BOOL | Check if a value exists in the JSON |
| json_set | (json STRING, path STRING, value STRING) -> STRING | Set a value at a JSONPath |
API Overview
| Method | Description |
|---|---|
| Database::in_memory() | Create an in-memory database |
| Database::open(path) | Open or create a persistent database |
| db.connect() | Create a connection |
| conn.query(cypher) | Execute a Cypher statement |
| conn.query_with_params(cypher, params) | Execute with $param bindings |
| conn.execute(cypher, params, env) | Execute with params and env() bindings |
Type System
KyuGraph maps Cypher values to Rust via TypedValue:
| Cypher | Rust (TypedValue) |
|---|---|
| INT64 | TypedValue::Int64(i64) |
| DOUBLE | TypedValue::Double(f64) |
| BOOL | TypedValue::Bool(bool) |
| STRING | TypedValue::String(SmolStr) |
| DATE | TypedValue::Date(i32) |
| TIMESTAMP | TypedValue::Timestamp(i64) |
| INTERVAL | TypedValue::Interval(Interval) |
| LIST | TypedValue::List(Vec<TypedValue>) |
| MAP | TypedValue::Map(Vec<(SmolStr, TypedValue)>) |
| NULL | TypedValue::Null |
Bidirectional conversion with serde_json::Value is supported via From impls.
Multi-Tenant Coordination
For cloud or multi-tenant deployments, the kyu-coord crate provides infrastructure primitives that sit above the database layer:
Application / Agents
│ submit Task
▼
kyu-coord (Coordinator)
│ routes by availability + tenant
┌────┼────┐
Worker Worker Worker ← stateless, scale by adding more
└────┼────┘
│ DeltaBatch / QueryResult
▼
kyu-api (per-tenant Database)
│
kyu-storage / kyu-index / kyu-executor …
[]
= "0.1"
= "0.1"
use ;
use Arc;
use PathBuf;
// Register tenants with isolated configs
let registry = new;
registry.register;
// Shared priority task queue
let queue = new;
queue.push;
// Worker pool pulls tasks and executes them
let pool = new;
pool.shutdown; // graceful drain
Key Types
| Type | Description |
|---|---|
| TenantRegistry | Thread-safe registry mapping tenant IDs to configs (lock-free reads) |
| TenantConfig | Per-tenant S3 bucket, cache directory, memory budget, connection limit |
| TaskQueue | Priority queue with blocking pop_blocking() for workers |
| Task | Unit of work: tenant ID, query string, priority, timestamp |
| WorkerPool | Spawns N threads pulling from a shared TaskQueue |
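The interaction between a shared priority queue and blocking workers can be sketched with the standard library alone. This is not the kyu-coord source: the `Task` fields follow the table above but their types are guesses, and the ordering rule (higher priority first, then older timestamp first) is an assumption made for the example.

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;
use std::sync::{Condvar, Mutex};

/// Hypothetical task shape; field types are illustrative guesses.
#[derive(Debug, PartialEq, Eq)]
struct Task {
    tenant_id: String,
    query: String,
    priority: u8,
    timestamp: u64,
}

impl Ord for Task {
    fn cmp(&self, other: &Self) -> Ordering {
        // Assumption: higher priority wins, then the older task goes first.
        self.priority
            .cmp(&other.priority)
            .then_with(|| other.timestamp.cmp(&self.timestamp))
            // Remaining fields keep the ordering consistent with `Eq`.
            .then_with(|| self.tenant_id.cmp(&other.tenant_id))
            .then_with(|| self.query.cmp(&other.query))
    }
}

impl PartialOrd for Task {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

/// Max-heap guarded by a mutex, with a condition variable to wake idle workers.
#[derive(Default)]
struct TaskQueue {
    heap: Mutex<BinaryHeap<Task>>,
    ready: Condvar,
}

impl TaskQueue {
    fn push(&self, task: Task) {
        self.heap.lock().unwrap().push(task);
        self.ready.notify_one();
    }

    /// Block the calling thread until a task is available, then return the
    /// highest-ranked one. The loop guards against spurious wakeups.
    fn pop_blocking(&self) -> Task {
        let mut heap = self.heap.lock().unwrap();
        loop {
            if let Some(task) = heap.pop() {
                return task;
            }
            heap = self.ready.wait(heap).unwrap();
        }
    }
}
```

A worker pool in this style would wrap the queue in an `Arc`, spawn N threads that loop on `pop_blocking()`, and route each task to the database of its `tenant_id`; a graceful shutdown additionally needs a stop signal, which is omitted here.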
JIT on Apple Silicon
The Cranelift JIT feature is available but not enabled by default. The official Cranelift releases have an aarch64 PLT relocation bug that causes JIT-compiled code to crash on Apple Silicon. To enable JIT on ARM Macs, add a [patch.crates-io] override pointing to the patched fork:
[dependencies]
kyu-graph = { version = "0.1", features = ["jit"] }
[patch.crates-io]
cranelift-jit = { git = "https://github.com/darmie/wasmtime", branch = "fix-plt-aarch64", package = "cranelift-jit" }
cranelift-module = { git = "https://github.com/darmie/wasmtime", branch = "fix-plt-aarch64", package = "cranelift-module" }
cranelift-codegen = { git = "https://github.com/darmie/wasmtime", branch = "fix-plt-aarch64", package = "cranelift-codegen" }
cranelift-frontend = { git = "https://github.com/darmie/wasmtime", branch = "fix-plt-aarch64", package = "cranelift-frontend" }
cranelift-native = { git = "https://github.com/darmie/wasmtime", branch = "fix-plt-aarch64", package = "cranelift-native" }
On x86_64 Linux/macOS/Windows, the stock crates.io Cranelift releases work fine — just enable the feature:
[dependencies]
kyu-graph = { version = "0.1", features = ["jit"] }
License
MIT — see LICENSE for details.