# Grust
Grust is a modern property graph API for Rust.
It gives Rust applications one small, backend-neutral way to build, validate,
traverse, and eventually persist graph data. The core model is intentionally
plain:
```text
Graph = nodes + edges
Node = id + label + properties
Edge = optional id + from + to + label + properties
```
That shape is expressive enough for persistent graph databases such as
SurrealDB and HelixDB, but small enough to use in tests, import/export tools,
scrapers, knowledge-graph pipelines, and local in-memory workflows.
Grust is early, but the direction is deliberate: keep graph construction and
domain modeling independent from database query languages. Application code
should build a `grust::Graph`; backend crates should decide how to write or
query that graph.
## Why Grust?
Rust has excellent in-memory graph libraries, especially `petgraph`, but many
applications need a property graph abstraction that maps naturally to graph
databases:
- stable application IDs
- node labels and edge labels
- typed node and edge properties
- backend-neutral graph construction
- optional schema metadata
- traversal expressed as an IR rather than a database query string
- an async store trait for persistence backends
Grust focuses on that persistent property-graph layer. It is not trying to
replace `petgraph` for graph algorithms. A Grust memory backend can use simple
maps today and could use `petgraph` internally later where that helps.
## Current Workspace
```text
crates/
  grust/            Public facade package (`grust-graph`) and prelude
  grust-cocoindex/  CocoIndex-style graph target-state export adapter
  grust-core/       Core model, builder, schema, traversal IR, GraphStore trait
  grust-falkor/     FalkorDB writer using Redis GRAPH.QUERY
  grust-helix/      HelixDB writer using HTTP or the Rust SDK
  grust-lancedb/    LanceDB store using the Rust SDK
  grust-memory/     Deterministic in-memory store for tests and local use
  grust-pggraph/    PostgreSQL/pgGraph store over universal graph tables
  grust-sail/       Sail SparkConnect backend using Spark DataFrames
  grust-surreal/    SurrealDB writer using HTTP or the Rust SDK
```
As the backend crates mature, they expose reads and traversal through the same
`GraphStore` API rather than leaking backend query languages into application
code.
`grust-cocoindex` is intentionally different: it exports Grust graphs as
CocoIndex-style node and relationship target state so an incremental indexing
flow can propagate changes into a downstream graph or table backend.
## Core Model
The core types live in `grust-core` and are re-exported by `grust`.
```rust
use grust::prelude::*;

pub struct Graph {
    pub nodes: Vec<Node>,
    pub edges: Vec<Edge>,
}

pub struct Node {
    pub id: NodeId,
    pub label: Label,
    pub props: Props,
}

pub struct Edge {
    pub id: Option<EdgeId>,
    pub from: NodeId,
    pub to: NodeId,
    pub label: Label,
    pub props: Props,
}
```
Properties are a map of string keys to typed values:
```rust
pub type Props = std::collections::BTreeMap<String, Value>;

pub enum Value {
    Null,
    Bool(bool),
    Int(i64),
    Float(f64),
    String(String),
    StringArray(Vec<String>),
    Json(serde_json::Value),
}
```
Edge properties are first-class. This matters because modern graph databases
usually store data on relationships as well as on nodes.
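The builder examples in this README pass plain Rust values to `.prop(...)`. One way an API like that can work is through `From` conversions into `Value`. The sketch below is an assumption about the shape, not the crate's actual code: it redefines a stand-in `Value` without the `Json` variant so it needs no external crates, and `with_prop` and `example_props` are hypothetical helpers.

```rust
use std::collections::BTreeMap;

/// Stand-in for `Value`, minus `Json`, so the sketch is dependency-free.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Bool(bool),
    Int(i64),
    Float(f64),
    String(String),
    StringArray(Vec<String>),
}

pub type Props = BTreeMap<String, Value>;

impl From<bool> for Value {
    fn from(v: bool) -> Self {
        Value::Bool(v)
    }
}

impl From<i64> for Value {
    fn from(v: i64) -> Self {
        Value::Int(v)
    }
}

impl From<f64> for Value {
    fn from(v: f64) -> Self {
        Value::Float(v)
    }
}

impl From<&str> for Value {
    fn from(v: &str) -> Self {
        Value::String(v.to_string())
    }
}

impl From<Vec<String>> for Value {
    fn from(v: Vec<String>) -> Self {
        Value::StringArray(v)
    }
}

/// A missing optional value becomes `Value::Null`.
impl<T: Into<Value>> From<Option<T>> for Value {
    fn from(v: Option<T>) -> Self {
        v.map_or(Value::Null, Into::into)
    }
}

/// Hypothetical helper: insert any convertible value under `key`.
pub fn with_prop(mut props: Props, key: &str, value: impl Into<Value>) -> Props {
    props.insert(key.to_string(), value.into());
    props
}

/// Builds a small property map from mixed plain values.
pub fn example_props() -> Props {
    let props = with_prop(Props::new(), "title", "A Modern Graph API for Rust");
    let props = with_prop(props, "duration_minutes", 30_i64);
    with_prop(props, "recorded", None::<bool>)
}
```

Because `Props` is a `BTreeMap`, iteration order is sorted by key, which keeps serialized output and test assertions deterministic.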
## Quick Start
Use the prelude for the common graph-building API:
```rust
use grust::prelude::*;

let mut graph = GraphBuilder::new();

let talk = graph
    .node("Talk", "talk:rust-graph-api")
    .prop("title", "A Modern Graph API for Rust")
    .prop("abstract", "Building backend-neutral property graphs in Rust.")
    .finish();

let speaker = graph
    .node("Person", "person:ada")
    .prop("name", "Ada Example")
    .prop("organization", "Graph Systems Lab")
    .finish();

graph
    .edge("PRESENTED_BY", &talk, &speaker)
    .prop("source", "conference-schedule")
    .finish();

let graph = graph.build();
```
The builder deduplicates nodes by `NodeId` and, by default, deduplicates edges
by `(from, label, to)`. If your domain needs multi-edges, use
`EdgePolicy::AllowDuplicates`.
```rust
let mut graph = GraphBuilder::new().edge_policy(EdgePolicy::AllowDuplicates);
```
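The deduplication rule described above can be sketched with a set of seen `(from, label, to)` triples. This is a minimal illustration under stated assumptions, not the `GraphBuilder` implementation: `SketchBuilder` is hypothetical, IDs and labels are plain strings, and the name of the default variant (`Deduplicate` here) is invented, since only `AllowDuplicates` is named in this README.

```rust
use std::collections::{BTreeMap, BTreeSet};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EdgePolicy {
    /// Keep one edge per `(from, label, to)` triple. Hypothetical variant name.
    Deduplicate,
    /// Keep every edge, so parallel edges survive.
    AllowDuplicates,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Edge {
    pub from: String,
    pub to: String,
    pub label: String,
}

/// Hypothetical stand-in for a graph builder.
pub struct SketchBuilder {
    policy: EdgePolicy,
    nodes: BTreeMap<String, String>, // node id -> label
    edges: Vec<Edge>,
    seen: BTreeSet<(String, String, String)>,
}

impl SketchBuilder {
    pub fn new(policy: EdgePolicy) -> Self {
        SketchBuilder {
            policy,
            nodes: BTreeMap::new(),
            edges: Vec::new(),
            seen: BTreeSet::new(),
        }
    }

    /// Nodes are keyed by id. The first label wins in this sketch; a real
    /// builder might merge properties instead.
    pub fn node(&mut self, label: &str, id: &str) {
        self.nodes
            .entry(id.to_string())
            .or_insert_with(|| label.to_string());
    }

    /// Under `Deduplicate`, a repeated triple is silently dropped.
    pub fn edge(&mut self, label: &str, from: &str, to: &str) {
        if self.policy == EdgePolicy::Deduplicate {
            let key = (from.to_string(), label.to_string(), to.to_string());
            if !self.seen.insert(key) {
                return; // triple already recorded
            }
        }
        self.edges.push(Edge {
            from: from.to_string(),
            to: to.to_string(),
            label: label.to_string(),
        });
    }

    /// Returns `(node_count, edge_count)`.
    pub fn counts(&self) -> (usize, usize) {
        (self.nodes.len(), self.edges.len())
    }
}
```

Adding the same node twice and the same edge twice yields two nodes and one edge under the deduplicating policy, and two nodes and two edges under `AllowDuplicates`.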
## In-Memory Store
Enable the `memory` feature to use `MemoryGraphStore` from the public facade:
```toml
[dependencies]
grust = { package = "grust-graph", version = "0.4.0", features = ["memory"] }
```
Then load and traverse a graph:
```rust
use grust::prelude::*;

# async fn example() -> grust::Result<()> {
let mut builder = GraphBuilder::new();
let talk = builder.node("Talk", "talk:rust-graph-api").finish();
let speaker = builder.node("Person", "person:ada").finish();
builder.edge("PRESENTED_BY", &talk, &speaker).finish();
let graph = builder.build();

let store = MemoryGraphStore::new();
store.put_graph(&graph).await?;

let speakers = store
    .traverse(
        Traversal::from_node("talk:rust-graph-api")
            .out("PRESENTED_BY")
            .to("Person"),
    )
    .await?;

assert_eq!(speakers.len(), 1);
# Ok(())
# }
```
## GraphStore
Backends implement `GraphStore`:
```rust
#[async_trait::async_trait]
pub trait GraphStore: Send + Sync {
    async fn apply_schema(&self, schema: &GraphSchema) -> Result<()>;
    async fn put_node(&self, node: &Node) -> Result<NodeId>;
    async fn put_edge(&self, edge: &Edge) -> Result<Option<EdgeId>>;
    async fn put_graph(&self, graph: &Graph) -> Result<LoadReport>;
    async fn put_typed_graph(&self, schema: &GraphSchema, graph: &Graph) -> Result<LoadReport>;
    async fn get_node(&self, id: &NodeId) -> Result<Option<Node>>;
    async fn get_edges(&self, query: EdgeQuery) -> Result<Vec<Edge>>;
    async fn traverse(&self, traversal: Traversal) -> Result<Vec<Node>>;
}
```
`put_graph` borrows the graph instead of consuming it, so the same `Graph`
value can be retried, validated, compared, or loaded into several backends.

`put_typed_graph` validates a graph against `GraphSchema`, applies that schema
to the backend, and then writes the graph.

Backends that support setup and replacement workflows can also implement
`GraphAdminStore`:
```rust
#[async_trait::async_trait]
pub trait GraphAdminStore: GraphStore {
    async fn bootstrap(&self) -> Result<()> {
        Ok(())
    }

    async fn clear(&self) -> Result<()>;
}
```
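To show why a borrowed graph and a default `bootstrap` are convenient, here is a synchronous, dependency-free sketch of a replacement workflow across several stores. Everything in it is a simplified stand-in: the traits mirror only the relevant methods without `async`, `SketchStore`, `node_count`, and `replace_all` are hypothetical, and the `nodes_written` field is an assumed shape for `LoadReport`.

```rust
use std::collections::BTreeMap;
use std::sync::Mutex;

#[derive(Debug, Clone, PartialEq)]
pub struct Node {
    pub id: String,
    pub label: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Graph {
    pub nodes: Vec<Node>,
}

/// Assumed report shape; the real `LoadReport` may differ.
#[derive(Debug, PartialEq)]
pub struct LoadReport {
    pub nodes_written: usize,
}

pub type Result<T> = std::result::Result<T, String>;

/// Synchronous stand-in for the write half of `GraphStore`.
pub trait GraphStore {
    fn put_graph(&self, graph: &Graph) -> Result<LoadReport>;
}

/// Synchronous stand-in for `GraphAdminStore`.
pub trait GraphAdminStore: GraphStore {
    /// Stores with nothing to set up inherit this no-op.
    fn bootstrap(&self) -> Result<()> {
        Ok(())
    }
    fn clear(&self) -> Result<()>;
}

/// Hypothetical map-backed store; `Mutex` allows writes through `&self`.
#[derive(Default)]
pub struct SketchStore {
    nodes: Mutex<BTreeMap<String, Node>>,
}

impl SketchStore {
    pub fn node_count(&self) -> usize {
        self.nodes.lock().map(|n| n.len()).unwrap_or(0)
    }
}

impl GraphStore for SketchStore {
    fn put_graph(&self, graph: &Graph) -> Result<LoadReport> {
        let mut nodes = self.nodes.lock().map_err(|e| e.to_string())?;
        for node in &graph.nodes {
            nodes.insert(node.id.clone(), node.clone()); // upsert by id
        }
        Ok(LoadReport { nodes_written: graph.nodes.len() })
    }
}

impl GraphAdminStore for SketchStore {
    fn clear(&self) -> Result<()> {
        self.nodes.lock().map_err(|e| e.to_string())?.clear();
        Ok(())
    }
}

/// Replaces the contents of every store with the same borrowed graph and
/// returns the total number of nodes written.
pub fn replace_all(stores: &[&dyn GraphAdminStore], graph: &Graph) -> Result<usize> {
    let mut written = 0;
    for store in stores {
        store.bootstrap()?;
        store.clear()?;
        written += store.put_graph(graph)?.nodes_written;
    }
    Ok(written)
}
```

Because `replace_all` only borrows the graph, the caller still owns it afterwards and can rerun the load or compare it against what a store returns.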
## Backend Stores
Backend crates are exposed as optional features of the facade crate:
```toml
[dependencies]
grust = { package = "grust-graph", version = "0.4.0", features = ["falkor", "helix", "lancedb", "pggraph", "sail", "surreal"] }
```
`grust-falkor` writes nodes and edges through Redis/FalkorDB Cypher queries and
supports graph replacement with `GRAPH.DELETE`.

`grust-helix` provides both `HelixHttpGraphStore` and `HelixSdkGraphStore`.
Both stores batch node and edge writes and use configured labels for
replacement.

`grust-cocoindex` converts `Graph` values into serializable node and
relationship states with stable keys, endpoint labels, and plain JSON
properties. It is a sync/export adapter rather than a `GraphStore`.

`grust-lancedb` stores graphs in LanceDB tables using the official Rust SDK,
upserts nodes and edges with `merge_insert`, supports backend-neutral reads and
bounded traversal over universal node/edge tables, and can mirror schema-labeled
nodes and edges into typed Arrow tables.

`grust-pggraph` stores Grust graphs in universal PostgreSQL tables, registers
those tables with the pgGraph extension, supports SQL-backed reads/traversal,
can build a pgGraph projection for graph-index experiments, and lowers
`GraphSchema` into typed label views and expression indexes.

`grust-sail` stores graphs as Spark DataFrames through Sail's SparkConnect
server, lowers traversal IR to Spark SQL joins, and can mirror schema-labeled
rows into typed Delta tables.

`grust-surreal` provides both `SurrealHttpGraphStore` and
`SurrealSdkGraphStore`. It bootstraps namespaces/databases, maps labels and
relationships to Surreal tables, upserts nodes, and relates edges through
relation tables. `GraphSchema` lowers to Surreal `DEFINE TABLE` and
`DEFINE FIELD` statements.
## Traversal IR
Grust does not expose SurrealQL, HQL, Cypher, or SQL in the common layer. It
uses a small traversal IR:
```rust
let traversal = Traversal::from_node("talk:rust-graph-api")
    .out("PRESENTED_BY")
    .to("Person")
    .limit(10);
```
Backends are responsible for lowering that IR into their native query language
or SDK calls.
Conceptually:
```text
Grust: talk -[PRESENTED_BY]-> Person
Surreal: talk:id->presented_by->person
Helix: N<Talk>(id)::Out<PresentedBy>
pgGraph: SQL over grust_nodes/grust_edges, optionally graph.build()
Sail: Spark SQL joins over grust_nodes/grust_edges
LanceDB: SDK table filters over grust_nodes/grust_edges
Memory: adjacency-map lookup
```
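As an illustration of the "adjacency-map lookup" row, the sketch below models a traversal as a list of steps and evaluates it against in-memory maps. It is not the `grust-memory` implementation: the `Step` enum, `AdjacencyIndex`, and the string-based IDs are assumptions made for the example, and only the fluent method names follow the README.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// One step of a hypothetical traversal IR.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Step {
    Out(String),  // follow outgoing edges with this label
    To(String),   // keep only nodes with this label
    Limit(usize), // keep at most this many nodes
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Traversal {
    pub start: String,
    pub steps: Vec<Step>,
}

impl Traversal {
    pub fn from_node(id: &str) -> Self {
        Traversal { start: id.to_string(), steps: Vec::new() }
    }
    pub fn out(mut self, label: &str) -> Self {
        self.steps.push(Step::Out(label.to_string()));
        self
    }
    pub fn to(mut self, label: &str) -> Self {
        self.steps.push(Step::To(label.to_string()));
        self
    }
    pub fn limit(mut self, n: usize) -> Self {
        self.steps.push(Step::Limit(n));
        self
    }
}

/// Hypothetical in-memory index: node labels plus outgoing adjacency.
#[derive(Default)]
pub struct AdjacencyIndex {
    labels: BTreeMap<String, String>,
    outgoing: BTreeMap<(String, String), BTreeSet<String>>,
}

impl AdjacencyIndex {
    pub fn add_node(&mut self, label: &str, id: &str) {
        self.labels.insert(id.to_string(), label.to_string());
    }

    pub fn add_edge(&mut self, label: &str, from: &str, to: &str) {
        self.outgoing
            .entry((from.to_string(), label.to_string()))
            .or_default()
            .insert(to.to_string());
    }

    /// Evaluates the steps in order and returns the resulting node ids.
    /// Sorted sets keep the output deterministic.
    pub fn run(&self, traversal: &Traversal) -> Vec<String> {
        let mut frontier = vec![traversal.start.clone()];
        for step in &traversal.steps {
            frontier = match step {
                Step::Out(label) => {
                    let mut next = BTreeSet::new();
                    for id in &frontier {
                        if let Some(targets) = self.outgoing.get(&(id.clone(), label.clone())) {
                            next.extend(targets.iter().cloned());
                        }
                    }
                    next.into_iter().collect()
                }
                Step::To(label) => frontier
                    .into_iter()
                    .filter(|id| self.labels.get(id) == Some(label))
                    .collect(),
                Step::Limit(n) => frontier.into_iter().take(*n).collect(),
            };
        }
        frontier
    }
}
```

Keeping the traversal as plain data is what lets a different backend walk the same `steps` and emit SQL, SDK filters, or a native graph query instead of map lookups.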
## Schema Layer
The schema model is optional. It exists for backends that benefit from
declarations, type generation, indexes, or validation:
```rust
pub struct GraphSchema {
    pub nodes: Vec<NodeType>,
    pub edges: Vec<EdgeType>,
}

pub struct NodeType {
    pub label: Label,
    pub fields: Vec<Field>,
}

pub struct EdgeType {
    pub label: Label,
    pub from: Vec<Label>,
    pub to: Vec<Label>,
    pub fields: Vec<Field>,
    pub directed: bool,
    pub uniqueness: EdgeUniqueness,
}
```
`GraphSchema::builder()` and `Field::required` / `Field::optional` provide a
compact way to declare this structure:
```rust
let schema = GraphSchema::builder()
    .node(
        "Person",
        vec![
            Field::required("name", FieldType::String),
            Field::optional("age", FieldType::Int),
        ],
    )
    .edge(
        "WORKS_ON",
        vec![Label::new("Person")],
        vec![Label::new("Project")],
        vec![Field::required("role", FieldType::String)],
    )
    .build();
```
The current backends use schema differently:
- SurrealDB can run schemaless, but schema can define record tables, relation
tables, and typed fields.
- HelixDB validates schema names through the dynamic-query backend while future
schema-file generation remains backend-specific.
- pgGraph keeps universal tables while exposing typed label views and indexes.
- Sail keeps universal DataFrames while mirroring rows into typed Delta tables.
- LanceDB keeps universal tables while mirroring rows into typed Arrow tables.
- FalkorDB uses schema declarations to create label/property indexes.
- Memory uses schema for validation tests and local conformance.
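To make the validation use case concrete, here is a minimal sketch of checking a graph against a schema: required node fields must be present, and edge endpoints must carry one of the labels the edge type allows. The types are simplified stand-ins rather than the `grust-core` ones (string labels, nodes that only record property names), `validate` and `endpoint_ok` are hypothetical, and treating undeclared labels as violations is an assumption made for the example.

```rust
use std::collections::{BTreeMap, BTreeSet};

pub struct NodeType {
    pub label: String,
    pub required: Vec<String>,
}

pub struct EdgeType {
    pub label: String,
    pub from: Vec<String>,
    pub to: Vec<String>,
}

pub struct GraphSchema {
    pub nodes: Vec<NodeType>,
    pub edges: Vec<EdgeType>,
}

pub struct Node {
    pub id: String,
    pub label: String,
    pub prop_names: BTreeSet<String>,
}

pub struct Edge {
    pub from: String,
    pub to: String,
    pub label: String,
}

pub struct Graph {
    pub nodes: Vec<Node>,
    pub edges: Vec<Edge>,
}

/// True when the node `id` exists and its label is in `allowed`.
fn endpoint_ok(labels: &BTreeMap<&str, &str>, id: &str, allowed: &[String]) -> bool {
    labels
        .get(id)
        .map_or(false, |label| allowed.iter().any(|l| l.as_str() == *label))
}

/// Returns one message per violation; an empty vector means the graph is valid.
pub fn validate(schema: &GraphSchema, graph: &Graph) -> Vec<String> {
    let mut problems = Vec::new();
    // Node id -> node label, used to check edge endpoints.
    let labels: BTreeMap<&str, &str> = graph
        .nodes
        .iter()
        .map(|n| (n.id.as_str(), n.label.as_str()))
        .collect();

    for node in &graph.nodes {
        match schema.nodes.iter().find(|t| t.label == node.label) {
            None => problems.push(format!("node {}: undeclared label {}", node.id, node.label)),
            Some(node_type) => {
                for field in &node_type.required {
                    if !node.prop_names.contains(field) {
                        problems.push(format!("node {}: missing field {}", node.id, field));
                    }
                }
            }
        }
    }

    for edge in &graph.edges {
        match schema.edges.iter().find(|t| t.label == edge.label) {
            None => problems.push(format!("edge {}: undeclared label", edge.label)),
            Some(edge_type) => {
                if !endpoint_ok(&labels, &edge.from, &edge_type.from) {
                    problems.push(format!("edge {}: bad source {}", edge.label, edge.from));
                }
                if !endpoint_ok(&labels, &edge.to, &edge_type.to) {
                    problems.push(format!("edge {}: bad target {}", edge.label, edge.to));
                }
            }
        }
    }
    problems
}
```

Collecting every violation instead of stopping at the first one tends to be friendlier for import pipelines, where a single pass should report all bad rows.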
## Backend Mapping
### SurrealDB
SurrealDB maps naturally to Grust's model:
```text
Node label -> table
Node id -> record id or stored property
Edge label -> relation table
Edge properties -> relation record fields
Traversal -> arrow traversal
```
Example conceptual write:
```text
RELATE talk:rust_graph_api->presented_by->person:ada CONTENT {
source: "conference-schedule"
}
```
### HelixDB
HelixDB is schema- and query-oriented:
```text
Node label -> node type
Edge label -> edge type
Node properties -> node fields/properties
Edge properties -> edge Properties block
Traversal -> typed Out/In traversal
```
The Helix backend should hide generated or named queries behind `GraphStore`
so application code remains backend-neutral.
### pgGraph
pgGraph keeps PostgreSQL as the source of truth and builds a derived graph
projection for bounded traversal. The Grust backend starts with universal
tables:
```text
grust_nodes(id, label, props)
grust_edges(id, from_id, to_id, label, props)
```
`PgGraphStore` implements ordinary reads and Grust traversal with SQL over
those tables. `GraphAdminStore::bootstrap()` creates the tables, installs the
`graph` extension, and registers the universal edge table with pgGraph using
the edge `label` column as the dynamic relationship type.
### Sail / SparkConnect
Sail maps Grust's model to two Delta Lake tables and lowers the traversal IR
to multi-JOIN Spark SQL:
```text
Node id / label / props -> row in grust_nodes
Edge endpoints / type -> row in grust_edges (with src_label, dst_label)
put_node / put_edge -> MERGE INTO (Delta upsert)
get_node -> SELECT … WHERE id = ? LIMIT 1
traverse -> multi-JOIN Spark SQL, one JOIN pair per step
```
Example traversal SQL for
`Traversal::from_node("talk:rust-graph-api").out("PRESENTED_BY").to("Person")`:
```text
SELECT n1.id, n1.label, n1.props
FROM grust_nodes n0
JOIN grust_edges e0 ON e0.src_id = n0.id
                   AND e0.edge_type = 'PRESENTED_BY'
JOIN grust_nodes n1 ON n1.id = e0.dst_id
                   AND n1.label = 'Person'
WHERE n0.id = 'talk:rust-graph-api'
```
`GraphAdminStore::bootstrap()` creates the tables with `USING delta`.
`clear()` issues `DELETE FROM` on both tables.
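The "one JOIN pair per step" lowering can be sketched as a small string generator. This is an illustration under stated assumptions, not the `grust-sail` code: the `Step` enum and `lower_to_sql` are hypothetical, the column names (`src_id`, `dst_id`, `edge_type`) are taken from the example SQL above, and label filters are emitted in the `WHERE` clause, which is equivalent for inner joins. The sketch inlines quoted literals for readability; a real backend would likely prefer bound parameters where the engine supports them.

```rust
/// Steps of a hypothetical traversal IR that this sketch knows how to lower.
pub enum Step {
    Out(String), // follow outgoing edges with this label
    To(String),  // require this label on the current node
}

/// Quotes a SQL string literal, doubling embedded single quotes.
pub fn quote(s: &str) -> String {
    format!("'{}'", s.replace('\'', "''"))
}

/// Lowers a start node plus steps into one SELECT with a JOIN pair per `Out`.
pub fn lower_to_sql(start: &str, steps: &[Step]) -> String {
    let mut joins = String::new();
    let mut filters = vec![format!("n0.id = {}", quote(start))];
    let mut hop = 0usize; // alias index of the current node: n0, n1, ...

    for step in steps {
        match step {
            Step::Out(label) => {
                // Edge alias e{hop} connects node n{hop} to node n{hop + 1}.
                joins.push_str(&format!(
                    "JOIN grust_edges e{0} ON e{0}.src_id = n{0}.id AND e{0}.edge_type = {2}\n\
                     JOIN grust_nodes n{1} ON n{1}.id = e{0}.dst_id\n",
                    hop,
                    hop + 1,
                    quote(label)
                ));
                hop += 1;
            }
            Step::To(label) => {
                filters.push(format!("n{}.label = {}", hop, quote(label)));
            }
        }
    }

    format!(
        "SELECT n{0}.id, n{0}.label, n{0}.props\nFROM grust_nodes n0\n{1}WHERE {2}",
        hop,
        joins,
        filters.join(" AND ")
    )
}
```

Each additional `Out` step adds one edge alias and one node alias, so a two-hop traversal selects from `n2` and joins through `e0` and `e1`.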
### LanceDB
LanceDB maps Grust's graph model to two Lance tables using Arrow batches and
the Rust SDK:
```text
Node id / label / props -> row in grust_nodes
Edge key / endpoints -> row in grust_edges
put_node / put_edge -> merge_insert upsert
get_node / get_edges -> SDK query filters
traverse -> repeated edge/node filters per IR step
```
`LanceDbGraphStore::connect()` opens a local or remote LanceDB URI,
`GraphAdminStore::bootstrap()` creates empty universal tables when needed, and
`clear()` drops and recreates them. Node IDs are the node upsert key. Edges use
an explicit edge ID when present and otherwise use `(from, label, to)` as a
stable key. Properties are stored as JSON text for backend-neutral reads today;
typed property columns and vector indexes can be layered on through schema and
backend-specific extension traits later.
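The edge upsert key rule above can be sketched as a single function: prefer the explicit edge ID, otherwise derive a key from `(from, label, to)`. The encoding shown here is an assumption for illustration, not what `grust-lancedb` stores; `edge_key` and `part` are hypothetical names, and the length prefixes exist only so that IDs containing the separator cannot collide.

```rust
/// Simplified edge with string IDs and labels.
pub struct Edge {
    pub id: Option<String>,
    pub from: String,
    pub to: String,
    pub label: String,
}

/// Length-prefixes a component so that concatenated keys stay unambiguous.
fn part(s: &str) -> String {
    format!("{}:{}", s.len(), s)
}

/// Returns a stable upsert key for an edge.
/// Explicit ids and derived triples live in separate `e/` and `t/` namespaces.
pub fn edge_key(edge: &Edge) -> String {
    match &edge.id {
        Some(id) => format!("e/{}", id),
        None => format!(
            "t/{}/{}/{}",
            part(&edge.from),
            part(&edge.label),
            part(&edge.to)
        ),
    }
}
```

With a derived key, writing the same `(from, label, to)` edge twice updates one row, which matches the builder's default edge deduplication; graphs that need parallel edges would have to supply explicit edge IDs.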
## Design Principles
- Keep graph data independent from database query languages.
- Make IDs explicit and stable.
- Treat edge properties as first-class data.
- Prefer typed values over ad hoc JSON strings.
- Keep schema optional.
- Keep traversal backend-neutral.
- Keep backend-specific capabilities as extension traits when they appear.
- Make the in-memory backend deterministic and boring, especially for tests.
## Status
Grust is pre-release.
Implemented:
- core property graph model
- typed IDs and labels
- typed property values
- graph builder
- schema structs
- traversal structs and fluent helpers
- async `GraphStore` trait
- CocoIndex-style graph export adapter
- in-memory backend
- FalkorDB, HelixDB, LanceDB, pgGraph, Sail, and SurrealDB backend crates
Planned:
- richer validation in `GraphBuilder`
- import/export helpers
- backend-specific schema lowering
- more traversal result shapes
- query and index helpers
## Development
Run the full test suite:
```sh
cargo test
```
Format the workspace:
```sh
cargo fmt
```
Run checks for all crates:
```sh
cargo check --workspace --all-targets
```
## License
Grust is dual-licensed under either of:
- Apache License, Version 2.0
- MIT license
Choose either license when using, modifying, or distributing Grust.