mdvs — Markdown Validation & Search
:x: A Document Database
:white_check_mark: A Database for Documents
mdvs infers a schema from your frontmatter, validates it, and gives you semantic search with SQL filtering. Single binary, no cloud, no setup.
Install
Prebuilt binary (macOS / Linux)
From crates.io
From source
Quick Start
# Initialize: scans your files, infers a schema, builds a search index
# Search with natural language
# Filter results with SQL on frontmatter fields
# Validate frontmatter against the inferred schema
That's it. No config files to write, no models to download manually, no services to start.
Features
Schema inference
mdvs scans your markdown files and infers a typed schema from frontmatter — field names, types (boolean, integer, float, string, arrays, nested objects), which directories they appear in, and which ones are required. The schema is written to mdvs.toml and can be customized.
# Discovered 10 fields across 496 files
# tags String[] (required in ["**"])
# draft Boolean (allowed in ["blog/**"])
# year Integer (required in ["articles/**"])
# ...
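To make the inference step concrete, here is a minimal Python sketch of the idea. It is not the mdvs implementation: the function names, the `Object` and `Unknown` labels, and the rule "required means present in every file" are assumptions made for illustration, and it ignores the per-directory glob patterns shown above.

```python
from collections import defaultdict

def type_name(value):
    """Map a parsed frontmatter value to a schema type label (labels are illustrative)."""
    # bool must be tested before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "Boolean"
    if isinstance(value, int):
        return "Integer"
    if isinstance(value, float):
        return "Float"
    if isinstance(value, str):
        return "String"
    if isinstance(value, list):
        inner = sorted({type_name(v) for v in value}) or ["Unknown"]
        return "|".join(inner) + "[]"
    if isinstance(value, dict):
        return "Object"
    return "Unknown"

def infer_schema(files):
    """files maps a path to its parsed frontmatter dict."""
    seen = defaultdict(lambda: {"types": set(), "count": 0})
    for frontmatter in files.values():
        for field, value in frontmatter.items():
            seen[field]["types"].add(type_name(value))
            seen[field]["count"] += 1
    return {
        field: {
            "type": "|".join(sorted(info["types"])),
            # Assumption: a field is required only if every file has it.
            "required": info["count"] == len(files),
        }
        for field, info in seen.items()
    }

files = {
    "blog/a.md": {"tags": ["rust"], "draft": True},
    "articles/b.md": {"tags": ["search", "sql"], "year": 2024},
}
schema = infer_schema(files)
```

In this toy input, `tags` appears in both files and comes out as a required `String[]`, while `draft` and `year` appear once each and come out as optional.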
Frontmatter validation
Check your files against the schema — catch missing required fields, wrong types, and fields that appear where they shouldn't.
# blog/draft.md: missing required field 'tags'
# blog/old-post.md: field 'year' expected Integer, got String
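The three kinds of errors described above can be sketched in a few lines of Python. This is only an illustration under assumed names: the `schema` layout, the `TYPE_CHECKS` table, and the `validate` function are hypothetical and do not reflect how mdvs stores or checks its schema.

```python
# Hypothetical type predicates keyed by schema type label.
TYPE_CHECKS = {
    "Boolean": lambda v: isinstance(v, bool),
    "Integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "String": lambda v: isinstance(v, str),
    "String[]": lambda v: isinstance(v, list) and all(isinstance(x, str) for x in v),
}

def validate(path, frontmatter, schema):
    """Return a list of human-readable problems for one file."""
    errors = []
    for field, rule in schema.items():
        if field not in frontmatter:
            if rule["required"]:
                errors.append(f"{path}: missing required field '{field}'")
            continue
        value = frontmatter[field]
        if not TYPE_CHECKS[rule["type"]](value):
            # Reports the Python type name, not an mdvs type label.
            got = type(value).__name__
            errors.append(f"{path}: field '{field}' expected {rule['type']}, got {got}")
    for field in frontmatter:
        if field not in schema:
            errors.append(f"{path}: field '{field}' is not allowed here")
    return errors

schema = {
    "tags": {"type": "String[]", "required": True},
    "year": {"type": "Integer", "required": False},
}
errors = validate("blog/draft.md", {"year": "2019", "mood": "ok"}, schema)
```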
Semantic search
Instant vector search using lightweight static embeddings (Model2Vec). The default model is 8MB — no GPU, no API keys, no network access needed at query time.
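As a rough illustration of why static embeddings are cheap, the sketch below embeds text by averaging per-token vectors from a lookup table and ranks chunks by cosine similarity. The three-dimensional `TOKEN_VECTORS` table and the helper names are invented for the example; a real model ships a far larger table, and mdvs's actual tokenization and scoring may differ.

```python
import math

# Toy token vectors, made up for this example.
TOKEN_VECTORS = {
    "rust": [1.0, 0.0, 0.0],
    "cargo": [0.9, 0.1, 0.0],
    "sql": [0.0, 1.0, 0.0],
    "query": [0.1, 0.9, 0.0],
    "recipe": [0.0, 0.0, 1.0],
}

def embed(text):
    """Average the vectors of known tokens; no neural network is run."""
    vectors = [TOKEN_VECTORS[t] for t in text.lower().split() if t in TOKEN_VECTORS]
    if not vectors:
        return [0.0, 0.0, 0.0]
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def cosine(a, b):
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0

def search(query, chunks, limit=2):
    """Return the top (score, path) pairs, best first."""
    q = embed(query)
    scored = [(cosine(q, embed(text)), path) for path, text in chunks.items()]
    return sorted(scored, reverse=True)[:limit]

chunks = {"a.md": "rust cargo", "b.md": "sql query", "c.md": "recipe"}
results = search("cargo", chunks)
```

Because embedding is a table lookup plus an average, it runs on a CPU without any network call.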
All commands support --output json for scripting and pipelines.
SQL filtering
Filter search results on any frontmatter field using SQL syntax, powered by DataFusion.
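The shape of such a query can be tried out with Python's built-in sqlite3 module. SQLite stands in for DataFusion here purely for convenience, and the `hits` table, its columns, and the `similarity` values are invented; the exact table and column names that mdvs exposes are not shown in this README.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical layout: one row per search hit, with frontmatter fields as columns.
conn.execute(
    "CREATE TABLE hits (path TEXT, draft INTEGER, year INTEGER, similarity REAL)"
)
conn.executemany(
    "INSERT INTO hits VALUES (?, ?, ?, ?)",
    [
        ("blog/a.md", 0, 2024, 0.91),
        ("blog/b.md", 1, 2024, 0.88),
        ("articles/c.md", 0, 2019, 0.75),
        ("blog/d.md", 0, 2021, 0.95),
    ],
)

# Keep published posts from 2020 onward, most similar first.
rows = conn.execute(
    "SELECT path FROM hits WHERE draft = 0 AND year >= 2020 "
    "ORDER BY similarity DESC"
).fetchall()
paths = [row[0] for row in rows]
conn.close()
```

The filter removes the draft and the 2019 article, and the remaining hits stay ordered by similarity.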
Incremental builds
Only changed files are re-embedded. Unchanged files keep their existing chunks and embeddings. If nothing changed, the model isn't even loaded.
# Built index: 3 new, 1 edited, 492 unchanged, 0 removed (4 files embedded)
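One common way to get this behavior is to compare content hashes between builds. The sketch below assumes that approach; the README does not say how mdvs detects changes, and `plan_build` and its return shape are hypothetical.

```python
import hashlib

def digest(text):
    """Stable fingerprint of a file's content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def plan_build(previous, current_files):
    """previous maps path -> hash from the last build; current_files maps path -> content."""
    current = {path: digest(text) for path, text in current_files.items()}
    plan = {"new": [], "edited": [], "unchanged": [], "removed": []}
    for path, file_hash in current.items():
        if path not in previous:
            plan["new"].append(path)
        elif previous[path] != file_hash:
            plan["edited"].append(path)
        else:
            plan["unchanged"].append(path)
    plan["removed"] = [path for path in previous if path not in current]
    # Only new and edited files need embedding; if this list is empty,
    # the model never has to be loaded.
    plan["to_embed"] = plan["new"] + plan["edited"]
    return plan, current

previous = {"a.md": digest("alpha"), "b.md": digest("beta"), "old.md": digest("gone")}
current_files = {"a.md": "alpha", "b.md": "beta v2", "c.md": "gamma"}
plan, new_state = plan_build(previous, current_files)
```

The returned `new_state` would be saved alongside the index so the next build can compare against it.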
Commands
| Command | Description |
|---|---|
| init | Scan files, infer schema, write mdvs.toml, optionally build index |
| check | Validate frontmatter against schema |
| update | Re-scan and update field definitions |
| build | Validate + embed + write search index |
| search | Semantic search with optional SQL filtering |
| info | Show config and index status |
| clean | Delete search index |
How it works
mdvs treats your markdown directory like a database:
- init scans your files and infers a schema from frontmatter, like CREATE TABLE
- check validates every file against that schema, like constraint checking
- update detects new fields as your files evolve, like ALTER TABLE
- build chunks and embeds your content into a local Parquet index
- search queries that index with SQL filtering on metadata, like SELECT ... WHERE ... ORDER BY similarity
Two artifacts: mdvs.toml (committed, your schema) and .mdvs/ (gitignored, the search index).
Documentation
Full documentation at edochi.github.io/mdvs.