vaultdb
Markdown vaults, queryable everywhere you want to use them. vaultdb is a Rust library for treating folders of .md files with YAML frontmatter as a queryable database, plus the frontends that sit on it: a CLI (vaultdb), an MCP server for LLM agents (vaultdb-mcp), and a stable library API (vaultdb-core) that any markdown-vault tool can build on.
The thesis: a markdown vault is both a relational table (frontmatter is rows × columns) and a graph ([[wikilinks]] are edges). vaultdb's query AST treats both as first-class — you filter records by frontmatter, by graph predicates ("links to anything tagged X"), or by any combination.
use ;
let vault = discover?;
let records = vault.query?;
Or from the CLI:
$ vaultdb query 3-Notes --where "tags contains topic/ai" --select "_name,_backlink_count" --sort _backlink_count --desc --limit 5
+-----------------------------+-----------------+
| _name                       | _backlink_count |
+=============================+=================+
| BERT                        | 43              |
| Machine Learning            | 39              |
| Transformer Architecture    | 38              |
| Natural Language Processing | 35              |
| Deep Learning               | 29              |
+-----------------------------+-----------------+
What it does
- Treats folders of .md files as database tables
- YAML frontmatter fields are queryable columns
- [[wiki-links]] form a citation graph with backlink tracking
- Supports relational joins across the link graph
- Graph traversal (BFS) with depth limits and filtering
- Bulk mutations (set fields, add/remove tags) with --dry-run safety
- Rename with automatic wiki-link updates across the vault
- Schema inference and validation
- Library + CLI + MCP server in one workspace — pick whichever fits
No daemon, no cache, no state files. Every command reads the current .md files directly. Edit in Obsidian, query with vaultdb — they coexist without conflict.
Workspace
| Crate | What it is | Use for |
|---|---|---|
| vaultdb-core | Library: parse, query, link graph, mutation builders | Building a markdown-vault tool in Rust |
| vaultdb | CLI binary (this is what cargo install vaultdb ships) | Command-line use over an existing vault |
| vaultdb-mcp | Model Context Protocol server (stdio) | Letting LLM agents (Claude, Cursor, etc.) query a vault |
See ARCHITECTURE.md for the design rules these frontends follow (library scope discipline, state boundaries, public API contract).
Install
# From crates.io
# Or from source
Requires Rust 1.75+. Published at https://crates.io/crates/vaultdb.
Quick start
# Auto-detects vault root by finding .obsidian/ directory
# Or specify explicitly
Data model
Folder = Database / Table
.md file = Record / Row
Frontmatter fields = Columns
[[wiki-links]] = Relations / Edges
Every record automatically has virtual fields:
| Field | Description |
|---|---|
| _name | Filename without .md |
| _path | Relative path from vault root |
| _folder | Parent folder name |
| _modified | File modification time |
| _created | File creation time |
| _links | Outgoing wiki-link targets |
| _link_count | Number of outgoing links |
| _backlinks | Notes that link to this note |
| _backlink_count | Number of incoming links |
| _body | The full body text (everything after the closing --- of the frontmatter); use with contains, matches, etc. for body search |
| _length | Total file size in bytes |
| _body_length | Body length in bytes (excluding frontmatter) |
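To make the link-derived virtual fields concrete, here is a minimal sketch of how _links and _backlink_count could be computed from note bodies. It is not vaultdb's implementation: extract_links and backlink_counts are hypothetical names, and the handling of [[Target|alias]] and [[Target#heading]] follows Obsidian's link syntax as an assumption about what a vault contains.

```rust
use std::collections::BTreeMap;

/// Extract wiki-link targets from a note body (hypothetical helper).
/// `[[Target]]`, `[[Target|alias]]` and `[[Target#heading]]` all yield "Target".
fn extract_links(body: &str) -> Vec<String> {
    let mut links = Vec::new();
    let mut rest = body;
    while let Some(start) = rest.find("[[") {
        let after = &rest[start + 2..];
        // Stop at an unclosed `[[`.
        let Some(end) = after.find("]]") else { break };
        let inner = &after[..end];
        // Drop an alias (`|...`) or heading anchor (`#...`), whichever comes first.
        let target = inner
            .split(|c: char| c == '|' || c == '#')
            .next()
            .unwrap_or("")
            .trim();
        if !target.is_empty() {
            links.push(target.to_string());
        }
        rest = &after[end + 2..];
    }
    links
}

/// Count incoming links per note, given (name, body) pairs.
/// Assumption: every link occurrence counts, and links to unknown notes are ignored.
fn backlink_counts(notes: &[(&str, &str)]) -> BTreeMap<String, usize> {
    let mut counts: BTreeMap<String, usize> =
        notes.iter().map(|(name, _)| (name.to_string(), 0)).collect();
    for (_, body) in notes {
        for target in extract_links(body) {
            if let Some(count) = counts.get_mut(&target) {
                *count += 1;
            }
        }
    }
    counts
}
```

In this sketch the length of extract_links(body) plays the role of _link_count, and the map returned by backlink_counts plays the role of _backlink_count.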
Commands
Query
# Basic query with filtering, sorting, limiting
# Multiple --where flags are AND-ed
# OR within a single --where using ||
# NOT with ! prefix
# Output formats: table (default), json, csv, yaml
Where expression syntax
FIELD = VALUE # exact match
FIELD != VALUE # not equal
FIELD > VALUE # numeric/string comparison
FIELD < VALUE
FIELD >= VALUE
FIELD <= VALUE
FIELD contains VALUE # list membership or substring
FIELD !contains VALUE # negated
FIELD startswith VALUE
FIELD endswith VALUE
FIELD matches REGEX # regex match
FIELD IN (a, b, c) # SQL-style list membership
FIELD NOT IN (a, b, c) # negated
FIELD IS NULL # alias for `missing`
FIELD IS NOT NULL # alias for `exists`
FIELD exists # field is present and non-null
FIELD missing # field is absent or null
FIELD !exists # negated exists (same as missing)
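The semantics of contains, exists and missing listed above can be sketched as follows. This is an illustration under stated assumptions, not vaultdb's evaluator: FieldValue is a hypothetical stand-in for a frontmatter value, and a field that is absent from the frontmatter is modelled as None.

```rust
/// Hypothetical stand-in for a frontmatter value.
#[derive(Debug, Clone, PartialEq)]
enum FieldValue {
    Null,
    Text(String),
    List(Vec<String>),
}

/// `contains`: list membership for lists, substring search for text.
fn contains(field: Option<&FieldValue>, needle: &str) -> bool {
    match field {
        Some(FieldValue::List(items)) => items.iter().any(|item| item == needle),
        Some(FieldValue::Text(text)) => text.contains(needle),
        _ => false,
    }
}

/// `exists`: the field is present and non-null.
fn exists(field: Option<&FieldValue>) -> bool {
    !matches!(field, None | Some(FieldValue::Null))
}

/// `missing`: the field is absent or null, i.e. the negation of `exists`.
fn missing(field: Option<&FieldValue>) -> bool {
    !exists(field)
}
```

Note that in this sketch list membership is an exact match on an element, so a tags list holding topic/ai does not match the needle topic.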
Boolean composition inside a single --where:
# AND with && (SQL-conventional: binds tighter than ||)
# OR with ||
# Mixed: AND binds tighter, so a || b && c parses as a || (b && c)
# Parenthesised grouping for explicit precedence
# Word-prefix NOT for negating a whole sub-expression
# Quoted string values for needles with spaces or special chars
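The precedence rule (&& binds tighter than ||, parentheses override) is the classic two-level recursive-descent layout. The sketch below only demonstrates that rule on pre-split tokens and prints the implied grouping; it is not vaultdb's parser, and parenthesize and its helpers are hypothetical names.

```rust
/// Render `tokens` with explicit parentheses, `&&` binding tighter than `||`.
fn parenthesize(tokens: &[&str]) -> String {
    let mut pos = 0;
    parse_or(tokens, &mut pos)
}

/// Lowest precedence level: a chain of `||` over AND-groups.
fn parse_or(tokens: &[&str], pos: &mut usize) -> String {
    let mut left = parse_and(tokens, pos);
    while tokens.get(*pos) == Some(&"||") {
        *pos += 1;
        let right = parse_and(tokens, pos);
        left = format!("({left} || {right})");
    }
    left
}

/// Higher precedence level: a chain of `&&` over atoms.
fn parse_and(tokens: &[&str], pos: &mut usize) -> String {
    let mut left = parse_atom(tokens, pos);
    while tokens.get(*pos) == Some(&"&&") {
        *pos += 1;
        let right = parse_atom(tokens, pos);
        left = format!("({left} && {right})");
    }
    left
}

/// An atom is a single token or a parenthesised sub-expression.
fn parse_atom(tokens: &[&str], pos: &mut usize) -> String {
    match tokens.get(*pos) {
        Some(&"(") => {
            *pos += 1;
            let inner = parse_or(tokens, pos);
            if tokens.get(*pos) == Some(&")") {
                *pos += 1;
            }
            inner
        }
        Some(tok) => {
            *pos += 1;
            tok.to_string()
        }
        None => String::new(),
    }
}
```

With this layout, the tokens a || b && c group as (a || (b && c)), while ( a || b ) && c groups as ((a || b) && c).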
Multiple --where flags are also AND-ed:
Body search
Use the _body virtual field to search inside note bodies (the text after the frontmatter):
# Find every note that mentions "Stanford" in its body
# Combined with frontmatter filtering — runs through the streaming
# query path, so it's cheap on large vaults.
# Regex on the body
Body content is loaded only when a body predicate is referenced; queries that don't need it stay on the fast frontmatter-only path.
Create
# Create a note from a template
# Create without a template (minimal frontmatter)
# Batch create from unresolved links
| | while ; do
done
The --template path is relative to vault root. Any .md file works as a template — vaultdb reads it, applies --set overrides to frontmatter, and writes the result.
Count, Fields, Tags
# Count matching records
# List all frontmatter fields with types and frequencies
# List all tags with usage counts
Graph: Links, Traverse, Unresolved
# Show outgoing and incoming links for a note
# Find the most referenced notes
# Find orphan notes (no links in or out)
# BFS traversal from a starting note
# Filter traversal results
# Find [[wiki-links]] pointing to non-existent files
# Scoped to a neighborhood
# Verbose: show which notes reference each unresolved link
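For readers unfamiliar with depth-limited BFS, the traversal described above can be sketched over an in-memory adjacency map. This is an assumption-laden illustration rather than vaultdb's code: traverse is a hypothetical function, the graph maps a note name to its outgoing link targets, and depth 0 is the starting note itself.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

/// Breadth-first traversal over outgoing links. Returns (note, depth) pairs
/// for every note reachable within `max_depth` hops of `start`, in visit order.
fn traverse(
    graph: &BTreeMap<&str, Vec<&str>>,
    start: &str,
    max_depth: usize,
) -> Vec<(String, usize)> {
    let mut seen: BTreeSet<String> = BTreeSet::new();
    let mut queue: VecDeque<(String, usize)> = VecDeque::new();
    let mut visited = Vec::new();

    seen.insert(start.to_string());
    queue.push_back((start.to_string(), 0));

    while let Some((node, depth)) = queue.pop_front() {
        visited.push((node.clone(), depth));
        // Do not expand notes that already sit at the depth limit.
        if depth == max_depth {
            continue;
        }
        for next in graph.get(node.as_str()).into_iter().flatten() {
            // `insert` returns false for notes that were already queued.
            if seen.insert(next.to_string()) {
                queue.push_back((next.to_string(), depth + 1));
            }
        }
    }
    visited
}
```

Filtering traversal results then amounts to applying a predicate to the returned pairs, which keeps the walk itself independent of the filter.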
Relational joins
# Notes that link to React
# Notes that React links to
# Notes linking to ANY note tagged topic/ai (the join)
# Notes linked from any movie note
# Notes linking to both React AND Node.js
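The "links to ANY note tagged X" join is a two-step query: first collect the notes carrying the tag, then keep the notes with at least one outgoing link into that set. A minimal in-memory sketch, where the Note struct and links_to_tagged are hypothetical and not part of vaultdb-core:

```rust
use std::collections::BTreeSet;

/// Hypothetical in-memory note: name, tags and outgoing link targets.
struct Note<'a> {
    name: &'a str,
    tags: Vec<&'a str>,
    links: Vec<&'a str>,
}

/// Names of notes that link to at least one note carrying `tag`.
fn links_to_tagged<'a>(notes: &[Note<'a>], tag: &str) -> Vec<&'a str> {
    // Inner query: the set of note names carrying the tag.
    let tagged: BTreeSet<&str> = notes
        .iter()
        .filter(|note| note.tags.iter().any(|t| *t == tag))
        .map(|note| note.name)
        .collect();
    // Outer query: notes with at least one outgoing link into that set.
    notes
        .iter()
        .filter(|note| note.links.iter().any(|link| tagged.contains(link)))
        .map(|note| note.name)
        .collect()
}
```

Replacing any with all over a fixed list of targets gives the "links to both React AND Node.js" variant.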
Mutations
All write operations support --dry-run to preview changes without writing.
# Set a field
# Add/remove tags
# Remove a field
# Move files
# Delete (moves to .trash/ by default, --force for permanent)
# Rename with automatic wiki-link updates across the vault
Mutations require at least one --where condition to prevent accidental bulk changes.
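The link-updating half of a rename can be pictured as a scan that rewrites only links whose target equals the old name. The sketch below is not vaultdb's rename logic: rewrite_links is a hypothetical helper, and preserving |alias and #heading suffixes is an assumption based on Obsidian's link syntax.

```rust
/// Rewrite wiki-links targeting `old` so they target `new`,
/// keeping any `#heading` or `|alias` suffix intact.
fn rewrite_links(body: &str, old: &str, new: &str) -> String {
    let mut out = String::with_capacity(body.len());
    let mut rest = body;
    while let Some(start) = rest.find("[[") {
        let after = &rest[start + 2..];
        // An unclosed `[[` is copied through untouched by the final push below.
        let Some(end) = after.find("]]") else { break };
        let inner = &after[..end];
        // Split the target from its optional suffix.
        let cut = inner
            .find(|c: char| c == '|' || c == '#')
            .unwrap_or(inner.len());
        let (target, suffix) = inner.split_at(cut);
        out.push_str(&rest[..start]);
        out.push_str("[[");
        out.push_str(if target.trim() == old { new } else { target });
        out.push_str(suffix);
        out.push_str("]]");
        rest = &after[end + 2..];
    }
    out.push_str(rest);
    out
}
```

Matching on the whole target rather than on a substring is what keeps a link such as [[Reactive]] untouched when React is renamed.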
Schema
# Infer a schema from existing data
# Validate records against a schema file (vaultdb-schema.yaml)
# Show the current schema
Performance
No caching, no indexing — reads files fresh on every command.
Numbers below are best-of-3 from cargo run --release --example bench -- <N>,
measured on an Intel i7-14700K desktop (full host details and
methodology in BENCHMARKS.md):
| Scale | Frontmatter query | Graph query | link_graph(All) |
|---|---|---|---|
| 1 000 notes | 5 ms | 7 ms | 6 ms |
| 10 000 notes | 59 ms | 88 ms | 70 ms |
| 100 000 notes | 651 ms | 1 032 ms | 819 ms |
Scaling is close to linear in vault size: by the rounded figures above, 10× the records costs about 11–13× the time, with no superlinear cliff up through 100k. Every operation finishes in under 1.1 seconds at 100k notes.
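The per-step multipliers can be checked directly against the table. The snippet below copies the rounded millisecond figures as constants and divides each column by its predecessor; because the 1 000-note timings are rounded to whole milliseconds, the first ratio in each column is only approximate. The constant and function names are illustrative.

```rust
/// Timings in milliseconds at 1 000, 10 000 and 100 000 notes, copied from the table.
const FRONTMATTER_MS: [f64; 3] = [5.0, 59.0, 651.0];
const GRAPH_MS: [f64; 3] = [7.0, 88.0, 1032.0];
const LINK_GRAPH_MS: [f64; 3] = [6.0, 70.0, 819.0];

/// Cost multiplier for each 10x step in vault size.
fn scaling_ratios(ms: [f64; 3]) -> [f64; 2] {
    [ms[1] / ms[0], ms[2] / ms[1]]
}
```

For the frontmatter column, for instance, 59 / 5 = 11.8 and 651 / 59 is roughly 11.0.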
Reproduce with:
Two-parser architecture: serde_yaml for fast reads, line-by-line string manipulation for formatting-preserving writes.
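To show what a formatting-preserving, line-level write can look like, here is a sketch that sets one scalar field and leaves every other line untouched. It is an illustration, not vaultdb's writer: set_scalar is a hypothetical function, it assumes LF line endings and top-level keys at column 0, and it refuses values it cannot rewrite safely on a single line (flow-style lists, block scalars, block lists).

```rust
/// Set `key: value` in the frontmatter by editing lines in place.
fn set_scalar(text: &str, key: &str, value: &str) -> Result<String, String> {
    let mut lines: Vec<String> = text.lines().map(String::from).collect();
    if lines.first().map(String::as_str) != Some("---") {
        return Err("no frontmatter block".into());
    }
    // Index of the closing `---`.
    let close = lines
        .iter()
        .skip(1)
        .position(|line| line == "---")
        .ok_or("unterminated frontmatter")?
        + 1;

    let prefix = format!("{key}:");
    let new_line = format!("{key}: {value}");
    match lines[1..close].iter().position(|line| line.starts_with(&prefix)) {
        Some(i) => {
            let old = lines[i + 1][prefix.len()..].trim();
            // Refuse values a single-line edit cannot rewrite safely.
            if old.is_empty()
                || old.starts_with('[')
                || old.starts_with('|')
                || old.starts_with('>')
            {
                return Err(format!("refusing to rewrite non-scalar value for `{key}`"));
            }
            lines[i + 1] = new_line;
        }
        // New fields are appended just before the closing `---`.
        None => lines.insert(close, new_line),
    }

    let mut out = lines.join("\n");
    if text.ends_with('\n') {
        out.push('\n');
    }
    Ok(out)
}
```

Because only the matched line is replaced, comments, key order and quoting elsewhere in the frontmatter survive the write, which a parse-and-reserialize round trip would not guarantee.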
Safety
- --dry-run previews all mutations before writing
- update, move, delete refuse to run without --where
- delete warns about dangling wiki-links before proceeding
- delete moves to .trash/ by default (with collision-safe naming)
- rename auto-updates all [[wiki-links]] across the vault
- Writer detects and refuses to modify flow-style YAML ([a, b]) or multiline scalars (|, >)
- Files without frontmatter are loaded with empty fields (queryable by virtual fields, never silently skipped)
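Collision-safe naming for the trash folder can be sketched as "keep the name if it is free, otherwise append a counter". The exact scheme vaultdb uses is not stated here, so the "Name (n).md" pattern and the trash_name function below are assumptions made purely for illustration.

```rust
use std::collections::BTreeSet;

/// Pick a file name for the trash folder that does not clash with `existing`.
/// Assumed scheme: "Note.md", then "Note (1).md", "Note (2).md", and so on.
fn trash_name(existing: &BTreeSet<String>, file_name: &str) -> String {
    if !existing.contains(file_name) {
        return file_name.to_string();
    }
    // Split off the extension so the counter lands before it.
    let (stem, ext) = match file_name.rsplit_once('.') {
        Some((stem, ext)) => (stem, format!(".{ext}")),
        None => (file_name, String::new()),
    };
    (1..)
        .map(|n| format!("{stem} ({n}){ext}"))
        .find(|candidate| !existing.contains(candidate))
        .expect("an unbounded counter always finds a free name")
}
```

A real implementation would check the file system rather than a set and would have to cope with races between the check and the move; the set keeps the sketch deterministic.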
Library usage (vaultdb-core)
Add to your Cargo.toml:
[dependencies]
vaultdb-core = { git = "https://github.com/rusenbb/vaultdb" }
The full public surface lives at the crate root: Vault, Record, Value, Query, Expr, Predicate, LinkPredicate, LinkGraph, GraphScope, Direction, UpdateBuilder, DeleteBuilder, MoveBuilder, RenameBuilder, MutationReport, LoadResult, ParseError, VaultdbError. All public data types are Serialize/Deserialize-able.
use ;
let vault = discover?;
// Records that link to anything tagged topic/ai
let q = Query ;
let hits = vault.query?;
// Plan-only mutation: see what would change without writing
let filter = parse?;
let plan = new
.set
.plan?;
for change in &plan.changes
Every mutation builder exposes a plan(&vault) and an execute(self, &vault). plan is read-only; execute runs the same computation and writes the result. The CLI's --dry-run flag is just plan() + render.
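The plan/execute split can be illustrated with a toy mutation over an in-memory store. Everything in this sketch is hypothetical (Store, Change, SetField) and deliberately avoids vaultdb-core's real types; it only shows the shape of the pattern, where execute reuses plan so that a preview and a real run cannot disagree.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory store: note name -> frontmatter fields.
type Store = BTreeMap<String, BTreeMap<String, String>>;

/// One planned change to one note.
#[derive(Debug, PartialEq)]
struct Change {
    note: String,
    field: String,
    before: Option<String>,
    after: String,
}

/// Toy mutation: set `field` to `value` on every note.
struct SetField {
    field: String,
    value: String,
}

impl SetField {
    /// Read-only: compute the changes without touching the store.
    fn plan(&self, store: &Store) -> Vec<Change> {
        store
            .iter()
            // Notes that already hold the value produce no change.
            .filter(|(_, fields)| fields.get(&self.field) != Some(&self.value))
            .map(|(name, fields)| Change {
                note: name.clone(),
                field: self.field.clone(),
                before: fields.get(&self.field).cloned(),
                after: self.value.clone(),
            })
            .collect()
    }

    /// Run the same computation as `plan`, then apply it.
    fn execute(self, store: &mut Store) -> Vec<Change> {
        let changes = self.plan(store);
        for change in &changes {
            if let Some(fields) = store.get_mut(&change.note) {
                fields.insert(change.field.clone(), change.after.clone());
            }
        }
        changes
    }
}
```

A dry run in this model is simply a call to plan followed by rendering the returned changes.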
MCP server (vaultdb-mcp)
vaultdb-mcp exposes the library as a Model Context Protocol server over stdio, so Claude, Cursor, and other MCP-aware clients can query and reason about a vault.
Wire it into Claude Desktop's config (~/.config/claude/claude_desktop_config.json on Linux, ~/Library/Application Support/Claude/claude_desktop_config.json on macOS):
Tools exposed: query, find_by_name, list_folders, links, traverse, unresolved, schema_show, schema_infer, plus four plan-only mutation tools (plan_update, plan_delete, plan_move, plan_rename) that show what a change would do without writing — agents propose, you (or the host) decide whether to apply.
There are intentionally no execute_* tools. Mutations go through the CLI or your own application code, with you in the loop.
Claude Code integration
vaultdb ships with a Claude Code skill so LLM agents can use it directly. To install:
# Copy the skill to your personal skills directory
Then in any Claude Code session, the agent can invoke /vaultdb or use it automatically when you ask about your vault.
Not Obsidian-specific
Despite being designed for Obsidian vaults, vaultdb works with any folder of .md files with YAML frontmatter. Hugo, Jekyll, Astro, Zola, or any static site generator's content directory is a valid target.
License
MIT