# indra_db

A content-addressed graph database for versioned thoughts. Think git for knowledge graphs.
## Why?
Most agent memory systems are state-based: here's what I know now. But understanding isn't a snapshot—it's a trajectory. When an agent rewrites a note, it loses:
- Why the understanding changed
- What the previous understanding was
- The branching paths not taken
- The confidence evolution
indra_db solves this by combining:
- Git-like versioning: Content-addressed storage, commits, branches
- Graph semantics: Thoughts as nodes, typed/weighted relationships as edges
- Semantic search: Embeddings stored with nodes, vector similarity queries
## Installation
### As a Rust library
```toml
[dependencies]
indra_db = "0.1"

# Optional: Enable embedding features
indra_db = { version = "0.1", features = ["hf-embeddings"] }                    # Local models
indra_db = { version = "0.1", features = ["api-embeddings"] }                   # API providers
indra_db = { version = "0.1", features = ["hf-embeddings", "api-embeddings"] }  # Both
```
### As a CLI
Via cargo:

```sh
cargo install indra_db
```
Via prebuilt binary:

Download the latest release for your platform from GitHub Releases and add it to your `PATH`.
Binaries are available for:
- macOS (Intel + Apple Silicon)
- Linux (x86_64, ARM64, ARMv7, musl variants)
- Windows (x86_64 + ARM64)
## Quick Start
### CLI Usage
```sh
# Initialize a new database
indra init

# Create thoughts
indra create "Cats are furry" --id cats
indra create "Mammals are warm-blooded" --id mammals

# Create relationships
indra relate cats mammals relates_to

# Search semantically
indra search "feline animals"

# View neighbors
indra neighbors cats

# View history
indra log

# Branch for experimentation
indra branch experiment
indra checkout experiment
```

Run `indra --help` for the full set of flags.
### Library Usage

```rust
// A minimal sketch — the type and method names below are illustrative,
// not the exact API; see the crate docs.
use indra_db::Database;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut db = Database::open("thoughts.indra")?;
    let cats = db.create_thought("Cats are furry")?;
    let mammals = db.create_thought("Mammals are warm-blooded")?;
    db.relate(&cats, &mammals, "relates_to", 1.0)?;
    db.commit("initial thoughts")?;
    Ok(())
}
```
### With HuggingFace Models (Local)

```rust
// Requires the `hf-embeddings` feature. The constructor and model name
// shown are illustrative — see EMBEDDINGS.md for the exact configuration.
use indra_db::HFEmbedder;

async fn embedder() -> Result<HFEmbedder, Box<dyn std::error::Error>> {
    let embedder = HFEmbedder::new("sentence-transformers/all-MiniLM-L6-v2").await?;
    Ok(embedder)
}
```
### With OpenAI API

```rust
// Requires the `api-embeddings` feature. Names shown are illustrative —
// see EMBEDDINGS.md for the exact configuration.
use indra_db::ApiEmbedder;

let embedder = ApiEmbedder::openai(std::env::var("OPENAI_API_KEY")?);
```
See EMBEDDINGS.md for detailed embedding configuration.
## CLI Reference
```
indra [OPTIONS] <COMMAND>

Commands:
  init       Initialize a new database
  create     Create a new thought
  get        Get a thought by ID
  update     Update a thought's content
  delete     Delete a thought
  list       List all thoughts
  relate     Create a relationship between thoughts
  unrelate   Remove a relationship
  neighbors  Get neighbors of a thought
  search     Search thoughts by semantic similarity
  commit     Commit current changes
  log        Show commit history
  branch     Create a new branch
  checkout   Switch to a branch
  branches   List all branches
  diff       Show diff between commits
  status     Show database status

Options:
  -d, --database <DATABASE>  Path to database file [default: thoughts.indra]
  -f, --format <FORMAT>      Output format: json or text [default: json]
      --no-auto-commit       Disable auto-commit (for batch operations)
  -h, --help                 Print help
  -V, --version              Print version
```
### Examples

```sh
# JSON output (default) - great for scripting
indra list
# {"count":3,"thoughts":[{"id":"cats","content":"Cats are furry",...}]}

# Pretty-printed output
indra -f text list

# Custom database path
indra -d my-agent.indra list

# Batch operations without auto-commit
indra --no-auto-commit create "First thought"
indra --no-auto-commit create "Second thought"
indra commit
```
## Using with MCP (Model Context Protocol)
The CLI outputs JSON by default, making it easy to wrap as an MCP server. A typical TypeScript wrapper would look like:
```typescript
import { Server } from "@modelcontextprotocol/server";
import { spawn } from "child_process";

async function indra(args: string[]): Promise<any> {
  return new Promise((resolve, reject) => {
    const proc = spawn("indra", ["-d", "agent.indra", ...args]);
    let stdout = "";
    proc.stdout.on("data", (d) => (stdout += d));
    proc.on("close", (code) => {
      if (code === 0) resolve(JSON.parse(stdout));
      else reject(new Error(`indra exited with ${code}`));
    });
  });
}

const server = new Server({ name: "indra-mcp", version: "0.1.0" });

server.tool("create_thought", { content: "string", id: "string?" }, async ({ content, id }) => {
  const args = ["create", content];
  if (id) args.push("--id", id);
  return await indra(args);
});

server.tool("search_thoughts", { query: "string", limit: "number?" }, async ({ query, limit }) => {
  return await indra(["search", query, "-l", String(limit ?? 10)]);
});

// ... more tools
```
An MCP server implementation is planned as a separate npm package.
## Architecture
```
thoughts.indra (single file)
├── Header (64 bytes)
│   ├── Magic: "INDRA_DB"
│   ├── Version, flags
│   ├── Object count, index offset
│   └── Refs offset, HEAD
├── Objects (content-addressed, zstd compressed)
│   ├── Thoughts (id, content, embedding, metadata)
│   ├── Edges (source, target, type, weight)
│   ├── Commits (tree hash, parents, message)
│   └── Tree nodes (merkle trie)
├── Index (hash → offset mapping)
└── Refs (branch names → commit hashes)
```
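The 64-byte header above could be modeled as a fixed-layout struct. The field widths below are assumptions for illustration, not the actual on-disk format:

```rust
/// Illustrative 64-byte header layout. Field widths and the trailing
/// padding are assumptions, not the actual on-disk format.
#[repr(C)]
struct Header {
    magic: [u8; 8],      // "INDRA_DB"
    version: u32,
    flags: u32,
    object_count: u64,
    index_offset: u64,
    refs_offset: u64,
    head: u64,           // HEAD reference
    _reserved: [u8; 16], // pad to 64 bytes
}
```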
Key design decisions:
- BLAKE3 for content hashing (fast, secure)
- Merkle trie for structural sharing across commits
- Edges float to latest node version (not pinned to hashes)
- Embeddings stored with nodes (content-addressed, deduplicated)
- Pluggable embedder trait (bring your own model)
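The content-addressing decision above is what makes deduplication automatic: identical content hashes to the same key and is stored once. A toy sketch (using std's `DefaultHasher` as a stand-in for BLAKE3; types are illustrative, not the crate's internals):

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Toy content-addressed object store: the key is a hash of the content,
/// so putting identical content twice stores it only once.
/// DefaultHasher stands in for BLAKE3 here.
struct Store {
    objects: HashMap<u64, String>,
}

impl Store {
    fn new() -> Self {
        Store { objects: HashMap::new() }
    }

    fn put(&mut self, content: &str) -> u64 {
        let mut h = DefaultHasher::new();
        content.hash(&mut h);
        let key = h.finish();
        // Only insert if this content hasn't been stored yet (deduplication).
        self.objects.entry(key).or_insert_with(|| content.to_string());
        key
    }
}
```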
## Edge Types
Built-in edge type constants:
- `relates_to` - General relationship
- `supports` - Evidence/support
- `contradicts` - Contradiction
- `derives_from` - Derivation
- `part_of` - Hierarchy
- `similar_to` - Similarity
- `causes` - Causation
- `precedes` - Temporal ordering
Custom types are strings—use whatever makes sense for your domain.
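Per the architecture section, an edge carries a source, a target, a type string, and a weight. A minimal sketch (field names are assumptions, not the crate's actual definition):

```rust
/// Minimal edge shape matching the architecture section
/// (field names are illustrative, not the crate's actual definition).
struct Edge {
    source: String,
    target: String,
    edge_type: String, // any string: a built-in constant or your own
    weight: f32,
}
```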
## Embeddings
indra_db uses a pluggable embedding system. The built-in `MockEmbedder` generates deterministic embeddings from text hashes (good for testing). For production, implement the `Embedder` trait:

```rust
// The trait signature shown is illustrative — see the crate docs for
// the exact one.
use indra_db::Embedder;

struct MyEmbedder;

impl Embedder for MyEmbedder {
    fn embed(&self, text: &str) -> Vec<f32> {
        // Call your model or API here.
        todo!()
    }
}
```
## Performance
Current implementation uses brute-force vector search, which is fine for <10k thoughts (~10-50ms). For larger graphs, HNSW indexing is on the roadmap.
| Operation | ~1k thoughts | ~10k thoughts |
|---|---|---|
| Create | <1ms | <1ms |
| Search | ~5ms | ~50ms |
| Commit | ~10ms | ~50ms |
| Get by ID | <1ms | <1ms |
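Brute-force vector search as described above is just a linear scan scoring every stored embedding against the query. A self-contained sketch of the technique (not the crate's internal code):

```rust
/// Linear-scan top-k by cosine similarity — the brute-force strategy
/// described above (illustrative, not the crate's internals).
fn top_k(query: &[f32], nodes: &[(String, Vec<f32>)], k: usize) -> Vec<(String, f32)> {
    fn cosine(a: &[f32], b: &[f32]) -> f32 {
        let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
        let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
        let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
        dot / (na * nb)
    }
    // Score every node against the query — O(n) in the number of thoughts,
    // which is why HNSW indexing is on the roadmap for larger graphs.
    let mut scored: Vec<(String, f32)> = nodes
        .iter()
        .map(|(id, emb)| (id.clone(), cosine(query, emb)))
        .collect();
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    scored.truncate(k);
    scored
}
```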
## Roadmap
- HNSW index for HEAD (faster search at scale)
- Merge operations (three-way merge for branches)
- Export/import (JSON, GEXF)
- Python bindings (PyO3)
- Remote embedder support (OpenAI, Cohere, etc.)
## License
MIT
## Etymology
Named after Indra's net, a Buddhist metaphor for the interconnectedness of all phenomena—a net of jewels where each jewel reflects all others.