# mindgraph
[crates.io](https://crates.io/crates/mindgraph) | [docs.rs](https://docs.rs/mindgraph) | [License](LICENSE) | [CI](https://github.com/shuruheel/mindgraph-rs/actions/workflows/ci.yml)
A structured semantic memory graph for agentic systems, built in Rust with [CozoDB](https://www.cozodb.org/) as the embedded Datalog storage engine.
## Overview
`mindgraph` provides a typed, versioned knowledge graph organized into six conceptual layers:
| Layer | Purpose | Example node types |
|-------|---------|--------------------|
| **Reality** | Raw observations & sources | Source, Snippet, Entity, Observation |
| **Epistemic** | Reasoning & knowledge | Claim, Evidence, Hypothesis, Theory, Concept |
| **Intent** | Goals & decisions | Goal, Project, Decision, Option, Constraint |
| **Action** | Affordances & workflows | Affordance, Flow, FlowStep, Control |
| **Memory** | Persistence & recall | Session, Trace, Summary, Preference |
| **Agent** | Control plane | Agent, Task, Plan, Approval, Policy |
The graph supports **48 built-in node types** and **70 built-in edge types**, each with type-safe property structs. User-defined custom types are also supported via the `CustomNodeType` trait.
## Features
- **Type-safe schema** -- 48 node types and 70 edge types as Rust enums with typed props, plus extensible `Custom(String)` variants
- **CozoDB storage** -- Embedded Datalog database with SQLite persistence or in-memory mode
- **Full-text search** -- FTS indices on node labels and summaries with scoring and type/layer filters
- **Structured filtering** -- `NodeFilter` builder for type (single or multi-type), layer, label substring, prop value, and confidence range queries
- **Graph traversal** -- Optimized 2-query BFS, reasoning chains, neighborhoods, path finding, subgraph extraction, weight threshold filtering
- **Builder pattern** -- Ergonomic fluent API for node and edge updates
- **Pagination** -- Bounded result sets with `has_more` detection for production use
- **Batch operations** -- Multi-row inserts (chunked at 100) and `GraphOp`-based batch apply
- **Versioning** -- Append-only version history for both nodes and edges, with point-in-time snapshots
- **Tombstone cascade** -- Soft-delete a node and all connected edges in one call
- **Data lifecycle** -- `purge_tombstoned()` for hard-deleting old data; `export()`/`import()` for graph snapshots; `backup()`/`restore_backup()` for file-level backups
- **Provenance tracking** -- Link extracted knowledge to its sources
- **Entity resolution** -- Alias table, fuzzy matching, `merge_entities()` for deduplication
- **Multi-agent support** -- `AgentHandle` provides scoped per-agent identity; all mutations auto-set `changed_by`, with `sub_agent()` for hierarchical agents
- **Custom node/edge types** -- `CustomNodeType` trait for compile-time registration of user-defined types with typed ser/de
- **Default agent identity** -- `set_default_agent()` reduces boilerplate in builder patterns
- **Confidence & salience** -- Validated 0.0-1.0 scores on all nodes and edges
- **Thread safety** -- `MindGraph` is `Send + Sync`, safe to share via `Arc<MindGraph>` or `into_shared()`
- **Async support** -- Optional `AsyncMindGraph` wrapper for tokio runtimes (feature flag: `async`) exposing async variants of all `MindGraph` methods
- **Server-side query filtering** -- Query patterns push filtering into CozoDB Datalog for efficient large-graph queries
- **Embedding/vector search** -- Pluggable `EmbeddingProvider` (sync) and `AsyncEmbeddingProvider` (native async) traits, CozoDB HNSW indices, `semantic_search()` with cosine distance
- **Salience decay** -- Exponential decay with configurable half-life via `decay_salience()`, plus `auto_tombstone()` for cleanup
- **Event subscriptions** -- `on_change()` callbacks, `on_change_filtered()` with `EventFilter`, and `watch()` async streaming via broadcast channels
- **Convenience constructors** -- `add_claim()`, `add_entity()`, `add_goal()`, `add_observation()`, `add_session()`, `add_preference()`, `add_summary()`, `add_link()`
- **Graph statistics** -- `stats()` returns comprehensive `GraphStats` with counts by type/layer
- **Enhanced query composition** -- OR filters, time ranges, salience ranges, prop conditions, graph-aware `connected_to` filter
- **Typed export/import** -- `export_typed()` / `import_typed()` with `TypedSnapshot` for structured graph transfer
- **Validated batch** -- `validate_batch()` pre-validates operations before `apply_validated_batch()`
- **OpenAI embeddings** -- Optional `openai` feature flag for `OpenAIEmbeddings` provider via `ureq`
- **Tracing integration** -- Optional `tracing` feature flag for observability instrumentation on key graph methods
- **Production-safe async** -- `AsyncMindGraph` returns `Error::TaskJoin` instead of panicking on spawn failures
## Quick Start
```rust
use mindgraph::*;

fn main() -> Result<()> {
    // Open a persistent graph (SQLite-backed)
    let graph = MindGraph::open("my_graph.db")?;
    // Or in-memory for testing:
    // let graph = MindGraph::open_in_memory()?;

    // Add a claim node
    let claim = graph.add_node(
        CreateNode::new("Rust is memory safe", NodeProps::Claim(ClaimProps {
            content: "Rust is memory safe".into(),
            claim_type: Some("factual".into()),
            ..Default::default()
        }))
        .confidence(Confidence::new(0.95)?)
    )?;

    // Add supporting evidence
    let evidence = graph.add_node(
        CreateNode::new("Borrow checker", NodeProps::Evidence(EvidenceProps {
            description: "Borrow checker prevents dangling pointers".into(),
            ..Default::default()
        }))
    )?;

    // Connect with a typed edge (evidence supports claim)
    graph.add_edge(CreateEdge::new(
        evidence.uid.clone(),
        claim.uid.clone(),
        EdgeProps::Supports { strength: Some(0.9), support_type: Some("empirical".into()) },
    ))?;

    // Update using the builder pattern
    graph.update(&claim.uid)
        .confidence(Confidence::new(0.99)?)
        .changed_by("agent-1")
        .reason("strong supporting evidence")
        .apply()?;

    // Traverse the reasoning chain (includes start node at depth 0)
    let chain = graph.reasoning_chain(&claim.uid, 5)?;
    assert_eq!(chain[0].node_uid, claim.uid); // start node
    assert_eq!(chain[0].depth, 0);

    Ok(())
}
```
## Async Usage
Enable the `async` feature for tokio integration:
```toml
[dependencies]
mindgraph = { version = "0.6", features = ["async"] }
```
```rust
use mindgraph::*;

#[tokio::main]
async fn main() -> Result<()> {
    let graph = AsyncMindGraph::open_in_memory().await?;

    let node = graph.add_node(
        CreateNode::new("Async claim", NodeProps::Claim(ClaimProps {
            content: "Works in async contexts".into(),
            ..Default::default()
        }))
    ).await?;

    // AsyncMindGraph is Clone (wraps Arc<MindGraph>),
    // so it can be shared across tasks
    let g = graph.clone();
    let handle = tokio::spawn(async move {
        g.count_nodes(NodeType::Claim).await
    });
    assert_eq!(handle.await.unwrap()?, 1);

    // For updates, use update_node/update_edge directly
    // (builder types hold references and can't cross await points)
    graph.update_node(
        node.uid,
        Some("Updated claim".into()),
        None, None, None, None,
        "agent".into(), "async update".into(),
    ).await?;

    Ok(())
}
```
## Custom Types
Define your own node types without forking the crate:
```rust
use mindgraph::*;
use serde::{Serialize, Deserialize};

#[derive(Debug, Clone, Serialize, Deserialize)]
struct CodeSnippet {
    language: String,
    source: String,
}

impl CustomNodeType for CodeSnippet {
    fn type_name() -> &'static str { "CodeSnippet" }
    fn layer() -> Layer { Layer::Reality }
}

let graph = MindGraph::open_in_memory().unwrap();
let node = graph.add_custom_node("hello.rs", CodeSnippet {
    language: "rust".into(),
    source: "fn main() {}".into(),
}).unwrap();

// Type-safe deserialization
let snippet: CodeSnippet = node.custom_props().unwrap();
assert_eq!(snippet.language, "rust");
```
**Breaking change in v0.6:** `NodeType` and `EdgeType` no longer implement `Copy` (they implement `Clone`). Add `.clone()` where needed.
## Multi-Agent Support
Use `AgentHandle` to scope operations to a specific agent identity:
```rust
use std::sync::Arc;
use mindgraph::*;
let graph = Arc::new(MindGraph::open_in_memory().unwrap());
let alice = graph.agent("alice");
// All mutations auto-set changed_by to "alice"
let node = alice.add_entity("My Entity", "test").unwrap();
let my_nodes = alice.my_nodes().unwrap();
assert_eq!(my_nodes.len(), 1);
// Sub-agents for hierarchical systems
let sub = alice.sub_agent("alice-summarizer");
assert_eq!(sub.parent_agent(), Some("alice"));
```
## Event Streaming
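Filter graph events with `EventFilter`. Filtered callbacks are synchronous; `watch()` streaming requires the `async` feature:
```rust
use mindgraph::*;
let graph = MindGraph::open_in_memory().unwrap();
// Sync filtered callback (closure signature assumed; check docs.rs for the exact type)
let filter = EventFilter::new().event_kinds(vec![EventKind::NodeAdded]);
let _subscription = graph.on_change_filtered(filter, |_event| {
    // Runs for each NodeAdded event
});
```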
With async streaming:
```rust
// AsyncMindGraph::watch() returns a WatchStream
let stream = async_graph.watch(
EventFilter::new()
.event_kinds(vec![EventKind::NodeAdded])
.layers(vec![Layer::Epistemic])
);
// stream.recv().await returns filtered events
```
## Tracing
Enable the `tracing` feature for observability:
```toml
[dependencies]
mindgraph = { version = "0.6", features = ["tracing"] }
```
Key methods (`add_node`, `search`, `find_nodes`, `reachable`, `stats`, etc.) are instrumented with `tracing::instrument`. Combine with `tracing-subscriber` to get structured logs.
## API Reference
### MindGraph
The main entry point. All operations go through this struct. It is `Send + Sync` and can be shared across threads via `Arc<MindGraph>`.
**Construction:**
| Method | Description |
|--------|-------------|
| `MindGraph::open(path)` | Open a persistent SQLite-backed graph |
| `MindGraph::open_in_memory()` | Create an in-memory graph (for testing) |
| `into_shared()` | Wrap in `Arc<MindGraph>` for sharing across threads |
| `set_default_agent(name)` | Set default agent identity for builder fallbacks |
| `default_agent()` | Get current default agent identity |
| `storage()` | Access the underlying `CozoStorage` for advanced Datalog queries |
| `agent(name)` | Create a scoped `AgentHandle` (requires `Arc<MindGraph>`) |
| `nodes_by_agent(agent_id)` | Get all live nodes created by a specific agent |
**Convenience constructors:**
| Method | Description |
|--------|-------------|
| `add_claim(label, content, confidence)` | Add a Claim node with defaults |
| `add_entity(label, entity_type)` | Add an Entity node with defaults |
| `add_goal(label, priority)` | Add a Goal node with defaults |
| `add_observation(label, description)` | Add an Observation node with defaults |
| `add_session(label, focus)` | Add a Session node with defaults |
| `add_preference(label, key, value)` | Add a Preference node with defaults |
| `add_summary(label, content)` | Add a Summary node with defaults |
| `add_memory(label, content)` | **Deprecated** -- use `add_session()` instead |
| `add_link(from, to, edge_type)` | Add an edge with default props for the edge type |
| `add_custom_node::<T>(label, props)` | Add a node with a user-defined custom type |
**Node operations:**
| Method | Description |
|--------|-------------|
| `add_node(CreateNode)` | Add a new node (auto-assigns UID, version 1) |
| `add_nodes_batch(Vec<CreateNode>)` | Bulk insert multiple nodes (multi-row, chunked at 100) |
| `get_node(uid)` | Get a node by UID, returns `None` if not found |
| `get_live_node(uid)` | Get a node, errors if not found or tombstoned |
| `update_node(uid, ...)` | Update fields directly (increments version) |
| `update(uid)` | Begin a builder-pattern update, finalize with `.apply()` |
| `node_exists(uid)` | Check if a live node exists (O(1), no deserialization) |
| `count_nodes(node_type)` | Count live nodes of a given type |
| `count_nodes_in_layer(layer)` | Count live nodes in a given layer |
**Edge operations:**
| Method | Description |
|--------|-------------|
| `add_edge(CreateEdge)` | Add a new edge (validates both endpoints are live) |
| `add_edges_batch(Vec<CreateEdge>)` | Bulk insert edges (validates all endpoints first) |
| `get_edge(uid)` | Get an edge by UID, returns `None` if not found |
| `get_live_edge(uid)` | Get an edge, errors if not found or tombstoned |
| `update_edge(uid, ...)` | Update fields directly (increments version) |
| `update_edge_builder(uid)` | Begin a builder-pattern update, finalize with `.apply()` |
| `edges_from(uid, edge_type?)` | Get all live edges from a node, optionally filtered by type |
| `edges_to(uid, edge_type?)` | Get all live edges to a node, optionally filtered by type |
| `count_edges(edge_type)` | Count live edges of a given type |
| `get_edge_between(from, to, edge_type?)` | Find edges between two nodes, optionally by type |
**Traversal:**
| Method | Description |
|--------|-------------|
| `reachable(uid, opts)` | BFS to find all nodes reachable through filtered edge types |
| `reasoning_chain(uid, max_depth)` | Traverse epistemic edges; returns start node at depth 0 |
| `neighborhood(uid, depth)` | Get all nodes within `depth` hops in any direction |
| `find_path(from, to, opts)` | Find the actual shortest path between two nodes |
| `subgraph(uid, opts)` | Extract all reachable nodes and their interconnecting edges |
**Tombstone operations:**
| Method | Description |
|--------|-------------|
| `tombstone(uid, reason, by)` | Soft-delete a node with audit trail |
| `restore(uid)` | Restore a tombstoned node |
| `tombstone_edge(uid, reason, by)` | Soft-delete an edge with audit trail |
| `restore_edge(uid)` | Restore a tombstoned edge |
| `tombstone_cascade(uid, reason, by)` | Tombstone a node **and** all connected edges |
**Version history:**
| Method | Description |
|--------|-------------|
| `node_history(uid)` | Get full version history (create, updates, tombstone) |
| `edge_history(uid)` | Get full version history for an edge |
| `node_at_version(uid, version)` | Get the JSON snapshot at a specific version number |
**Search & filtering:**
| Method | Description |
|--------|-------------|
| `search(query, opts)` | Full-text search across labels/summaries with FTS scoring |
| `find_nodes(filter)` | Structured filtering by type, layer, label, props, confidence |
| `find_nodes_paginated(filter)` | Same as above with `Page<GraphNode>` pagination metadata |
**Data lifecycle:**
| Method | Description |
|--------|-------------|
| `purge_tombstoned(older_than)` | Hard-delete tombstoned data (and associated versions/aliases/provenance) |
| `export()` | Export entire graph as a `GraphSnapshot` |
| `import(snapshot)` | Import a graph snapshot (additive merge) |
| `backup(path)` | Backup database to a file |
| `restore_backup(path)` | Restore database from a backup file |
**Provenance & entity resolution:**
| Method | Description |
|--------|-------------|
| `add_provenance(record)` | Link a node to its extraction source |
| `add_alias(text, canonical_uid, score)` | Register an alias for entity resolution |
| `resolve_alias(text)` | Resolve text to a canonical entity UID |
| `aliases_for(uid)` | List all aliases for a canonical entity, sorted by score |
| `merge_entities(keep, merge, reason, by)` | Merge two entities: retarget edges/aliases, tombstone duplicate |
| `fuzzy_resolve(text, limit)` | Substring match on alias text |
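The alias methods above can be pictured as a small lookup table. The following is a minimal std-only sketch, not the crate's implementation: `AliasTable` and its fields are hypothetical, exact resolution is assumed to be a case-sensitive lookup, and fuzzy resolution is assumed to be a substring match ranked by score.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the alias relation: alias text -> (canonical UID, score).
#[derive(Default)]
pub struct AliasTable {
    entries: HashMap<String, (String, f64)>,
}

impl AliasTable {
    /// Register an alias for a canonical entity.
    pub fn add_alias(&mut self, text: &str, canonical_uid: &str, score: f64) {
        self.entries
            .insert(text.to_string(), (canonical_uid.to_string(), score));
    }

    /// Exact (case-sensitive) lookup of alias text.
    pub fn resolve_alias(&self, text: &str) -> Option<&str> {
        self.entries.get(text).map(|(uid, _)| uid.as_str())
    }

    /// Substring match on alias text, best score first, at most `limit` hits.
    pub fn fuzzy_resolve(&self, text: &str, limit: usize) -> Vec<(String, f64)> {
        let mut hits: Vec<(String, f64)> = self
            .entries
            .iter()
            .filter(|(alias, _)| alias.contains(text))
            .map(|(_, (uid, score))| (uid.clone(), *score))
            .collect();
        hits.sort_by(|a, b| b.1.total_cmp(&a.1));
        hits.truncate(limit);
        hits
    }
}
```

In this sketch, registering "NYC" and "New York City" for the same UID lets either string resolve to one canonical entity, which is the situation `merge_entities()` is meant to clean up after the fact.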
**Embedding/vector search:**
| Method | Description |
|--------|-------------|
| `configure_embeddings(dimension)` | Initialize HNSW index for semantic search |
| `embedding_dimension()` | Get configured embedding dimension (None if not configured) |
| `set_embedding(uid, vec)` | Store an embedding vector for a node |
| `get_embedding(uid)` | Retrieve a node's embedding vector |
| `delete_embedding(uid)` | Remove a node's embedding |
| `semantic_search(query_vec, k)` | Find k nearest neighbors by cosine distance (auto-compensates for tombstoned nodes) |
| `embed_node(uid, provider)` | Generate and store embedding via `EmbeddingProvider` |
| `embed_nodes(uids, provider)` | Bulk embed multiple nodes via `embed_batch()`, skips tombstoned |
| `semantic_search_text(query, k, provider)` | Embed query text and search |
**Salience decay:**
| Method | Description |
|--------|-------------|
| `decay_salience(half_life_secs)` | Apply exponential decay to all live nodes |
| `auto_tombstone(min_salience, min_age_secs)` | Tombstone old nodes below salience threshold |
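To make the half-life parameter concrete, here is a minimal sketch of the standard exponential half-life formula, `salience * 0.5^(elapsed / half_life)`. It assumes elapsed time is measured in seconds since the node was last touched; the function names are hypothetical and the crate's exact bookkeeping may differ.

```rust
/// Exponential decay: salience halves every `half_life_secs` seconds.
pub fn decayed_salience(salience: f64, elapsed_secs: f64, half_life_secs: f64) -> f64 {
    assert!(half_life_secs > 0.0, "half-life must be positive");
    let factor = 0.5_f64.powf(elapsed_secs / half_life_secs);
    // Keep the result inside the validated 0.0-1.0 range.
    (salience * factor).clamp(0.0, 1.0)
}

/// Hypothetical cleanup rule in the spirit of `auto_tombstone(min_salience, min_age_secs)`:
/// a node qualifies only if it is both below the threshold and old enough.
pub fn should_tombstone(
    salience: f64,
    age_secs: f64,
    min_salience: f64,
    min_age_secs: f64,
) -> bool {
    salience < min_salience && age_secs >= min_age_secs
}
```

With a one-hour half-life, a node at salience 0.8 drops to 0.4 after one hour and 0.2 after two, so the age guard keeps freshly created low-salience nodes from being tombstoned immediately.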
**Event subscriptions:**
| Method | Description |
|--------|-------------|
| `on_change(callback)` | Subscribe to graph mutation events, returns `SubscriptionId` |
| `on_change_filtered(filter, callback)` | Subscribe with `EventFilter` for selective events |
| `watch(filter)` | (async feature) Create a `WatchStream` for async event streaming |
| `unsubscribe(id)` | Remove a subscription |
**Statistics:**
| Method | Description |
|--------|-------------|
| `stats()` | Get comprehensive `GraphStats` (counts by type, layer, embeddings, etc.) |
**Utility:**
| Method | Description |
|--------|-------------|
| `list_nodes(pagination)` | List all live nodes with pagination |
| `clear()` | Delete all data from all relations (for testing/reset) |
**Typed export/import:**
| Method | Description |
|--------|-------------|
| `export_typed()` | Export live graph as `TypedSnapshot` with structured nodes/edges/embeddings |
| `import_typed(snapshot)` | Import a typed snapshot (additive merge, skips existing UIDs, restores embeddings) |
**Batch operations (GraphOp):**
| Method | Description |
|--------|-------------|
| `batch_apply(ops)` | Execute a batch of AddNode/AddEdge/Tombstone operations |
| `validate_batch(ops)` | Pre-validate a batch (auto-assigns UIDs, tracks cross-refs), returns `ValidatedBatch` |
| `apply_validated_batch(batch)` | Apply a pre-validated batch |
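The cross-reference tracking that pre-validation performs can be sketched without the crate. This is an illustrative model only: `Op` is a hypothetical, stripped-down stand-in for `GraphOp`, UID auto-assignment is omitted, and the error type is a plain `String`.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified batch operation (the crate's `GraphOp` carries full payloads).
pub enum Op {
    AddNode { uid: String },
    AddEdge { from: String, to: String },
    Tombstone { uid: String },
}

/// Check that every edge and tombstone refers to a node that is either already
/// live in the graph or added earlier in the same batch.
pub fn validate_batch(live: &HashSet<String>, ops: &[Op]) -> Result<(), String> {
    let mut known: HashSet<&str> = live.iter().map(String::as_str).collect();
    for (i, op) in ops.iter().enumerate() {
        match op {
            Op::AddNode { uid } => {
                if !known.insert(uid.as_str()) {
                    return Err(format!("op {}: duplicate uid {}", i, uid));
                }
            }
            Op::AddEdge { from, to } => {
                for end in [from, to] {
                    if !known.contains(end.as_str()) {
                        return Err(format!("op {}: unknown endpoint {}", i, end));
                    }
                }
            }
            Op::Tombstone { uid } => {
                if !known.remove(uid.as_str()) {
                    return Err(format!("op {}: cannot tombstone unknown uid {}", i, uid));
                }
            }
        }
    }
    Ok(())
}
```

Validating the whole batch before writing anything means a bad reference in the last operation is caught before the first one touches storage.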
**Query patterns (server-side filtered via CozoDB Datalog):**
| Method | Description |
|--------|-------------|
| `active_goals()` | Goals with `status == "active"`, ranked by priority |
| `pending_approvals()` | Approvals with `status == "pending"`, sorted by requested_at |
| `unresolved_contradictions()` | CONTRADICTS edges with `resolution_status == "unresolved"` |
| `open_decisions()` | Decisions with status `"open"` or `"deliberating"` |
| `open_questions()` | OpenQuestions with status `"open"` or `"partially_addressed"` |
| `weak_claims(threshold)` | Claims with `confidence < threshold`, sorted ascending |
| `nodes_in_layer(layer)` | All live nodes in a given layer |
**Paginated variants:**
| Method | Description |
|--------|-------------|
| `nodes_in_layer_paginated(layer, page)` | Paginated nodes in a layer |
| `edges_from_paginated(uid, edge_type?, page)` | Paginated edges from a node |
| `edges_to_paginated(uid, edge_type?, page)` | Paginated edges to a node |
| `weak_claims_paginated(threshold, page)` | Paginated weak claims |
| `active_goals_paginated(page)` | Paginated active goals, sorted by priority in DB |
### AsyncMindGraph
Available behind the `async` feature flag. Wraps `Arc<MindGraph>` and exposes async versions of all methods via `tokio::task::spawn_blocking`.
| Method | Description |
|--------|-------------|
| `AsyncMindGraph::open(path)` | Async open |
| `AsyncMindGraph::open_in_memory()` | Async in-memory open |
| `AsyncMindGraph::from_sync(graph)` | Wrap an existing `MindGraph` |
| `inner()` | Access the underlying `&MindGraph` |
`AsyncMindGraph` is `Clone` and can be shared across tokio tasks. All methods from `MindGraph` are available as async variants, taking owned arguments instead of references.
**Note:** The builder types (`NodeUpdate`, `EdgeUpdate`) hold references and cannot cross `.await` points. Use `update_node()` / `update_edge()` directly in async code.
### Builders
**CreateNode** -- built with `CreateNode::new(label, props)`, with optional chained methods:
- `.summary(text)` -- set the node summary
- `.confidence(Confidence)` -- set epistemic certainty (default 1.0)
- `.salience(Salience)` -- set contextual relevance (default 0.5)
- `.privacy(PrivacyLevel)` -- set privacy level (default Private)
- `.with_uid(Uid)` -- pre-assign a UID (for cross-referencing in `validate_batch`)
**CreateEdge** -- built with `CreateEdge::new(from_uid, to_uid, props)`, with optional chained methods:
- `.confidence(Confidence)` -- set edge confidence (default 1.0)
- `.weight(f64)` -- set edge weight (default 0.5)
**NodeUpdate** -- started with `graph.update(uid)`:
```rust
graph.update(&uid)
    .label("Updated label")
    .summary("New summary")
    .confidence(Confidence::new(0.9)?)
    .salience(Salience::new(0.8)?)
    .changed_by("agent-1")
    .reason("new evidence")
    .apply()?;
```
**EdgeUpdate** -- started with `graph.update_edge_builder(uid)`:
```rust
graph.update_edge_builder(&edge_uid)
    .weight(0.95)
    .confidence(Confidence::new(0.9)?)
    .changed_by("agent-2")
    .reason("re-evaluated")
    .apply()?;
```
### Traversal
Control traversal behavior with `TraversalOptions`:
```rust
use mindgraph::*;

let opts = TraversalOptions {
    direction: Direction::Both,      // Outgoing, Incoming, or Both
    edge_types: Some(vec![           // None = follow all edge types
        EdgeType::Supports,
        EdgeType::Refutes,
    ]),
    max_depth: 5,                    // BFS depth limit
    weight_threshold: Some(0.5),     // None = no weight filter
};
let steps = graph.reachable(&start_uid, &opts)?;
for step in &steps {
    // node_type is NodeType enum, edge_type is Option<EdgeType>
    println!("depth {}: {} ({:?}) via {:?}, parent: {:?}",
        step.depth, step.label, step.node_type, step.edge_type, step.parent_uid);
}
```
`PathStep` includes `parent_uid` for backtracking. `find_path` uses this to return only the nodes on the actual shortest path (not all reachable nodes).
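The parent-pointer idea behind `find_path` can be shown with a plain breadth-first search. The sketch below is std-only and illustrative: it follows outgoing edges only, ignores edge types and weights, and `shortest_path` is a hypothetical helper rather than part of the crate.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Breadth-first search over an in-memory edge list, recording each node's
/// parent so the shortest path can be rebuilt by walking backwards from `goal`.
pub fn shortest_path(edges: &[(&str, &str)], start: &str, goal: &str) -> Option<Vec<String>> {
    // Build outgoing adjacency lists from the edge list.
    let mut adjacency: HashMap<&str, Vec<&str>> = HashMap::new();
    for &(from, to) in edges {
        adjacency.entry(from).or_default().push(to);
    }

    let mut parent: HashMap<&str, &str> = HashMap::new();
    let mut visited: HashSet<&str> = HashSet::new();
    let mut queue: VecDeque<&str> = VecDeque::new();
    visited.insert(start);
    queue.push_back(start);

    while let Some(node) = queue.pop_front() {
        if node == goal {
            // Walk parent links back to the start, then reverse.
            let mut path = vec![node.to_string()];
            let mut current = node;
            while let Some(&p) = parent.get(current) {
                path.push(p.to_string());
                current = p;
            }
            path.reverse();
            return Some(path);
        }
        for &next in adjacency.get(node).into_iter().flatten() {
            // `insert` returns false for nodes that were already discovered.
            if visited.insert(next) {
                parent.insert(next, node);
                queue.push_back(next);
            }
        }
    }
    None
}
```

Because BFS discovers nodes in order of increasing depth, the first time the goal is dequeued its parent chain is a shortest path, and nodes that were reachable but off that path are never returned.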
### Pagination
Use `Pagination` for bounded result sets:
```rust
use mindgraph::*;

// First page of 10 items
let page1 = graph.nodes_in_layer_paginated(Layer::Epistemic, Pagination::first(10))?;
assert!(page1.items.len() <= 10);

// Next page
if page1.has_more {
    let page2 = graph.nodes_in_layer_paginated(
        Layer::Epistemic,
        Pagination { limit: 10, offset: 10 },
    )?;
}
```
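One common way to compute a `has_more` flag without a separate count query is to fetch one row more than requested. The sketch below illustrates that technique over a slice; whether the crate uses this exact approach is an assumption, and this `Page<T>` is a hypothetical mirror of the crate's type.

```rust
/// Hypothetical mirror of a page of results: items plus a `has_more` flag.
pub struct Page<T> {
    pub items: Vec<T>,
    pub has_more: bool,
}

/// Take `limit + 1` rows starting at `offset`; the extra row, if present, only
/// signals that another page exists and is dropped before returning.
pub fn paginate<T: Clone>(rows: &[T], limit: usize, offset: usize) -> Page<T> {
    let mut items: Vec<T> = rows
        .iter()
        .skip(offset)
        .take(limit.saturating_add(1))
        .cloned()
        .collect();
    let has_more = items.len() > limit;
    items.truncate(limit);
    Page { items, has_more }
}
```

With 25 rows and a limit of 10, the first two pages report `has_more == true` and the third returns the remaining 5 rows with `has_more == false`.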
### Core Types
| Type | Description |
|------|-------------|
| `Uid` | UUID v4 identifier for nodes and edges (inner field is private) |
| `Confidence` | Validated f64 in 0.0-1.0 (epistemic certainty) |
| `Salience` | Validated f64 in 0.0-1.0 (contextual relevance, decays over time) |
| `PrivacyLevel` | `Private`, `Shared`, or `Public` |
| `Timestamp` | Unix timestamp as f64 |
| `NodeProps` | Discriminated union of all 48 node property structs |
| `EdgeProps` | Discriminated union of all 70 edge property structs |
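The "validated f64" types in the table follow the familiar newtype pattern: the constructor is the only way in, so every value that exists is already in range. A minimal sketch of that pattern, using a hypothetical `Score` type and a `String` error rather than the crate's own `Confidence` and `Error`:

```rust
/// Hypothetical re-creation of a validated score newtype such as `Confidence`.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Score(f64);

impl Score {
    /// Accept only finite values in the closed range 0.0..=1.0.
    pub fn new(value: f64) -> Result<Self, String> {
        if value.is_finite() && (0.0..=1.0).contains(&value) {
            Ok(Score(value))
        } else {
            Err(format!("score must be within 0.0..=1.0, got {}", value))
        }
    }

    /// Read the validated value back out.
    pub fn value(self) -> f64 {
        self.0
    }
}
```

This is why the examples write `Confidence::new(0.95)?`: construction can fail, and the `?` surfaces an out-of-range value at the call site instead of letting it reach storage.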
### Schema
**48 node types** across 6 layers:
| Layer | Node types |
|-------|------------|
| Reality (4) | Source, Snippet, Entity, Observation |
| Epistemic (24) | Claim, Evidence, Warrant, Argument, Hypothesis, Theory, Paradigm, Anomaly, Method, Experiment, Concept, Assumption, Question, OpenQuestion, Analogy, Pattern, Mechanism, Model, ModelEvaluation, InferenceChain, SensitivityAnalysis, ReasoningStrategy, Theorem, Equation |
| Intent (6) | Goal, Project, Decision, Option, Constraint, Milestone |
| Action (5) | Affordance, Flow, FlowStep, Control, RiskAssessment |
| Memory (5) | Session, Trace, Summary, Preference, MemoryPolicy |
| Agent (8) | Agent, Task, Plan, PlanStep, Approval, Policy, Execution, SafetyBudget |
**70 edge types** across categories:
| Category | Edge types |
|----------|------------|
| Structural (5) | ExtractedFrom, PartOf, HasPart, InstanceOf, Contains |
| Epistemic (31) | Supports, Refutes, Justifies, HasPremise, HasConclusion, HasWarrant, Rebuts, Assumes, Tests, Produces, UsesMethod, Addresses, Generates, Extends, Supersedes, Contradicts, AnomalousTo, AnalogousTo, Instantiates, TransfersTo, Evaluates, Outperforms, FailsOn, HasChainStep, PropagatesUncertaintyTo, SensitiveTo, RobustAcross, Describes, DerivedFrom, ReliesOn, ProvenBy |
| Provenance (5) | ProposedBy, AuthoredBy, CitedBy, BelievedBy, ConsensusIn |
| Intent (9) | DecomposesInto, MotivatedBy, HasOption, DecidedOn, ConstrainedBy, Blocks, Informs, RelevantTo, DependsOn |
| Action (5) | AvailableOn, ComposedOf, StepUses, RiskAssessedBy, Controls |
| Memory (5) | CapturedIn, TraceEntry, Summarizes, Recalls, GovernedBy |
| Agent (10) | AssignedTo, PlannedBy, HasStep, Targets, RequiresApproval, ExecutedBy, ExecutionOf, ProducesNode, GovernedByPolicy, BudgetFor |
## Architecture
```
mindgraph
├── graph.rs -- MindGraph: the main public API + NodeUpdate/EdgeUpdate builders
├── async_graph.rs -- AsyncMindGraph: tokio wrapper (behind "async" feature)
├── storage/
│ ├── cozo.rs -- CozoStorage: CozoDB CRUD, traversal, pagination, batch ops
│ └── migrations.rs -- Schema DDL (CozoDB :create statements + indices)
├── schema/
│ ├── mod.rs -- Layer, NodeType (48), EdgeType (70) enums
│ ├── node.rs -- GraphNode, CreateNode
│ ├── edge.rs -- GraphEdge, CreateEdge
│ ├── node_props.rs -- NodeProps discriminated union
│ ├── edge_props.rs -- EdgeProps discriminated union
│ └── props/ -- Per-layer property structs
│ ├── reality.rs (4 structs)
│ ├── epistemic.rs (24 structs)
│ ├── intent.rs (6 structs)
│ ├── action.rs (5 structs)
│ ├── memory.rs (5 structs)
│ └── agent.rs (8 structs)
├── traversal.rs -- Direction, TraversalOptions, PathStep
├── query.rs -- Pagination, Page<T>, GraphStats, DecayResult, TypedSnapshot, etc.
├── types.rs -- Uid, Confidence, Salience, PrivacyLevel, Timestamp
├── provenance.rs -- ProvenanceRecord, ExtractionMethod
├── embeddings.rs -- EmbeddingProvider (sync) + AsyncEmbeddingProvider traits
├── events.rs -- GraphEvent, EventKind, EventFilter, SubscriptionId
├── watch.rs -- WatchStream (async filtered event stream, behind "async")
├── agent.rs -- AgentHandle (scoped per-agent graph access)
├── openai.rs -- OpenAIEmbeddings (behind "openai" feature)
└── error.rs -- Error types + Result alias
```
## Storage
CozoDB is used as the embedded storage engine. It runs Datalog queries over relations stored in SQLite (persistent) or in-memory (testing). The schema defines six core relations, plus two auxiliary relations (`mg_meta` and `node_embedding`):
| Relation | Description | Key |
|----------|-------------|-----|
| `node` | All graph nodes with universal metadata | `uid` |
| `edge` | All graph edges with typed properties | `uid` |
| `node_version` | Append-only node version snapshots | `(node_uid, version)` |
| `edge_version` | Append-only edge version snapshots | `(edge_uid, version)` |
| `provenance` | Extraction lineage records | `(node_uid, source_uid)` |
| `alias` | Entity resolution mappings | `(alias_text, canonical_uid)` |
| `mg_meta` | Key-value config store (e.g., embedding dimension) | `key` |
| `node_embedding` | Vector embeddings with HNSW index (created on demand) | `uid` |
Indices are created for edge traversal (`from_uid`, `to_uid`), node lookup (`node_type`, `layer`), provenance queries, and alias resolution.
## Design Decisions
- **Props as JSON columns** -- Node and edge properties are stored as JSON in CozoDB, with `NodeProps`/`EdgeProps` Rust enums providing type safety at the API boundary. This allows CozoDB Datalog to filter on props fields using `get(props, 'field', default)` without schema migration.
- **Tombstoning over deletion** -- Soft-delete preserves audit trails. Tombstoned entities are excluded from live queries but remain accessible for forensic review. `tombstone_cascade` tombstones a node and all its connected edges atomically.
- **Append-only versioning** -- Every mutation to a node or edge creates a new version snapshot, enabling full history reconstruction and point-in-time queries via `node_at_version`.
- **2-query BFS traversal** -- Graph traversal fetches all live edges in one query, runs BFS in-memory, then batch-fetches node metadata in a second query. This reduces traversal from O(N) database queries to exactly 2, regardless of graph size. Recursive CozoDB Datalog was tested but found unreliable across versions.
- **Server-side filtering** -- Query patterns like `active_goals()` and `weak_claims()` push filtering into CozoDB Datalog rather than loading all nodes into memory. Paginated variants (e.g., `active_goals_paginated`) sort in the database before applying `:limit`/`:offset`.
- **Tombstone sentinel** -- `tombstone_at` uses `0.0` as the sentinel value for "not tombstoned" since CozoDB columns use fixed types. All live-query filters check `tombstone_at == 0.0`.
- **Thread safety** -- `MindGraph` is `Send + Sync`. CozoDB's `DbInstance` uses internal locking, so `Arc<MindGraph>` works safely across threads.
- **Async via spawn_blocking** -- `AsyncMindGraph` wraps `Arc<MindGraph>` and delegates all operations to `tokio::task::spawn_blocking`. This avoids blocking the tokio runtime while leveraging CozoDB's synchronous API.
- **Private Uid inner field** -- `Uid(String)` keeps its inner field private to prevent accidental construction of invalid UIDs. Construct values with `Uid::new()` or `Uid::from()`, and read them with `Uid::as_str()`.
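The tombstone sentinel and append-only versioning decisions fit together, and a small in-memory model may help show how. This is a sketch under assumptions: `VersionedNode` is hypothetical, snapshots store only the label rather than full JSON, and it assumes a tombstone also appends a version entry.

```rust
/// Hypothetical in-memory model of one node's row plus its version log.
pub struct VersionedNode {
    pub label: String,
    pub version: u32,
    /// 0.0 means "live"; any other value is the tombstone timestamp.
    pub tombstone_at: f64,
    /// Append-only snapshots: (version, label at that version).
    history: Vec<(u32, String)>,
}

impl VersionedNode {
    pub fn new(label: &str) -> Self {
        VersionedNode {
            label: label.to_string(),
            version: 1,
            tombstone_at: 0.0,
            history: vec![(1, label.to_string())],
        }
    }

    /// Every mutation bumps the version and appends a snapshot; nothing is overwritten.
    pub fn update_label(&mut self, label: &str) {
        self.version += 1;
        self.label = label.to_string();
        self.history.push((self.version, self.label.clone()));
    }

    /// Soft delete: record the timestamp, keep the row and its history.
    pub fn tombstone(&mut self, now: f64) {
        self.version += 1;
        self.tombstone_at = now;
        self.history.push((self.version, self.label.clone()));
    }

    /// Live-query filter: the sentinel comparison described above.
    pub fn is_live(&self) -> bool {
        self.tombstone_at == 0.0
    }

    /// Point-in-time lookup, analogous to `node_at_version`.
    pub fn label_at_version(&self, version: u32) -> Option<&str> {
        self.history
            .iter()
            .find(|(v, _)| *v == version)
            .map(|(_, label)| label.as_str())
    }
}
```

Since the history is only ever appended to, a tombstoned node still answers point-in-time lookups, which is what makes forensic review possible after a soft delete.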
## License
MIT