MDQL
A database where every entry is a markdown file and every change is a readable diff.
MDQL turns folders of markdown files into a schema-validated, queryable database. Frontmatter fields are metadata columns. H2 sections are content columns. The files are the database — there is nothing else. Every file reads like a normal markdown document, but you get full SQL: SELECT, INSERT, UPDATE, DELETE, JOINs across multiple tables, ORDER BY, and aggregation.
Your database lives in git. Every insert, update, and migration is a readable diff. Branching, merging, and rollback come free.
Quick start
# All 100 files valid in table 'strategies'
title composite
------------------------------------------------------------------- ---------
Bridge Inflow to Destination Chain → DEX Liquidity Pressure 500
DeFi Protocol TVL Step-Change → Governance Token Repricing Lag 500
Lending Protocol Daily Interest Accrual Liquidation Threshold Creep 500
USDC Circle Business-Day Redemption Queue — Weekend Premium Decay 490
Cascading Liquidation Chain — Second-Order Collateral Asset Short 480
Why MDQL
- Zero infrastructure. No server, no Docker, no connection strings.
git cloneand you have the database.rm -rfand it's gone. - Data review via pull requests. Data changes go through the same PR review process as code. A reviewer reads the diff of an INSERT the way they read a code change.
- Branch-level isolation. An agent works on a feature branch, inserts and updates entries freely, and the main database is untouched until merge. Multiple agents work in parallel without coordination.
- No serialization boundary. The storage format is the readable format. An LLM sees a well-structured markdown document, not a JSON blob or SQL dump.
- Graceful degradation. If you stop using MDQL tomorrow, you still have a folder of valid markdown files. No proprietary format to export from.
- Section-level content columns. Long-form structured prose — a hypothesis, a methodology, kill criteria — is a first-class queryable column.
SELECT Hypothesis FROM strategies WHERE status = 'LIVE'. - Every unix tool still works.
grep -r "funding" strategies/works.wc -l strategies/*.mdworks.diffworks. - Self-documenting schemas. The schema file is a markdown document. Its body explains the fields, conventions, and rationale. An LLM reading
_mdql.mdgets both the machine-readable schema and the human context for why fields exist. - Schema migrations are diffs.
ALTER TABLE RENAME FIELDrewrites every file. The migration shows up as a git diff. - Audit trail for free.
git blame strategies/bad-debt-socialization-event-token-short.mdtells you who changed what and when.
Directory structure
my-project/
_mdql.md # type: database — config + foreign keys
strategies/
_mdql.md # type: schema — table schema + docs
bad-debt-socialization-event-token-short.md
aave-utilization-kink-rate-spike-borrow-unwind-short.md
...
backtests/
_mdql.md # type: schema
bt-bad-debt-socialization-binance.md
...
src/ # no _mdql.md — invisible to MDQL
docs/ # no _mdql.md — invisible to MDQL
A _mdql.md file marks a directory as part of an MDQL database. The type field in frontmatter determines what it is — database at the root, schema in each table folder. Directories without _mdql.md are ignored, so MDQL coexists with any project structure.
How it works
One folder = one table. One markdown file = one row.
A row file looks like this:
title: "Bad Debt Socialization Event — Token Short"
status: HYPOTHESIS
mechanism: 7
categories:
- -
When an on-chain lending protocol accumulates bad debt that exceeds
its reserve buffer, the smart contract mints governance tokens...
The protocol's shortfall module triggers an auction...
- YAML frontmatter fields are metadata columns (
title,status,mechanism, ...) - H2 sections are content columns (
Hypothesis,Structural Mechanism, ...) - The
path(filename) is the implicit primary key createdandmodifiedare reserved timestamp fields, auto-managed bymdql stamp- All columns are queryable with SQL
_mdql.md files
Every MDQL-managed directory has a _mdql.md file. The type field in frontmatter says what kind.
Table schema (type: schema)
type: schema
table: strategies
primary_key: path
frontmatter:
title:
mechanism:
categories:
h1:
required: false
sections: {}
rules:
reject_unknown_frontmatter: true
reject_unknown_sections: false
reject_duplicate_sections: true
Documentation about this table goes here.
Supported types: string, int, float, bool, date, string[]
Database config (type: database)
type: database
name: zunid
foreign_keys:
-
Trading strategy research database.
The markdown body in both cases is documentation — ignored by the engine, useful for humans and LLMs.
Foreign key validation
Foreign keys defined in the database config are validated automatically. No setup required.
At load time: Every call to load_database() checks all FK constraints. If backtests.strategy references a file that does not exist in strategies.path, the error is returned alongside the data. CLI commands (query, validate, repl) print FK warnings to stderr.
In the REPL: A filesystem watcher runs in the background. If you rename or delete a file in another terminal, the REPL detects the change within 500ms and prints any new FK violations.
In the web UI: Same filesystem watcher runs as a background task. FK errors are available at GET /api/fk-errors.
With mdql validate: When pointed at a database directory (not just a single table), reports per-table schema validation summaries followed by FK violations:
Table 'strategies': 100 files valid
Table 'backtests': 18 files valid
Foreign key violations:
backtests/bt-broken.md: strategy = 'nonexistent.md' not found in strategies
NULL FK values are not violations — a backtest with no strategy set is valid.
Python API
Database and Table
=
=
SELECT with JOINs
Database.query() runs SQL across all tables in the database, including multi-table JOINs.
, =
# rows: list of dicts, one per result row
# columns: list of column names
Single-table queries
Table.query() runs a SELECT query on one table and returns structured results.
, =
# rows: list of dicts
# columns: list of column names
Load rows with filtering
Table.load() returns all rows, optionally filtered by a dict of field values.
# All rows
, =
# Filtered by dict — equality matching
, =
# Filtered by SQL WHERE string — full operator support
, =
, =
The where parameter accepts a dict (equality matching) or a SQL WHERE string (supports =, !=, <, >, <=, >=, LIKE, IN, IS NULL, AND, OR). errors contains any schema validation issues found during loading.
INSERT
# Create a new row — filename derived from title
# Returns: Path to created file (e.g. my-new-strategy.md)
# created/modified timestamps set automatically
# required sections scaffolded as empty ## headings
# validated against schema before writing
# With pre-formatted body (e.g. from Claude output)
# Overwrite existing file, preserve created timestamp
UPDATE
# Partial merge — only the fields you pass are changed
# Update body only
Bulk UPDATE
Table.update_many() updates the same fields across multiple files.
=
# Returns: list of paths that were updated
DELETE
Schema operations
=
# section or frontmatter
# section or frontmatter
# sections only
Validation
=
# Returns: list of validation errors (schema + FK)
All writes are validated against the schema and rolled back on failure. The created timestamp is always preserved on replace and update; modified is always set to today.
CLI commands
mdql query <folder> "<sql>"
Run SQL against a table or database. Supports SELECT, INSERT INTO, UPDATE SET, DELETE FROM, ALTER TABLE, and JOIN.
# Filter and sort
# Query section content
# Category search (LIKE works on arrays)
# Output as JSON
Supported WHERE operators: =, !=, <, >, <=, >=, LIKE, IN, IS NULL, IS NOT NULL, AND, OR
Column names with spaces use backticks: SELECT `Structural Mechanism` FROM strategies
JOINs
Point at the database directory (parent of table folders) for cross-table queries. Supports two or more tables:
# Two-table JOIN
# Multi-table JOIN
SQL write operations
# INSERT
# UPDATE
# DELETE
For string[] columns, pass comma-separated values in a single string: 'funding-rates,defi'.
ALTER TABLE — field migrations
Rename, drop, or merge fields across all files in a table. Works for both frontmatter fields and sections. The schema _mdql.md is updated automatically.
# ALTER TABLE — renamed 'Summary' to 'Overview' in 42 files
Field names can be single-quoted ('Name'), backtick-quoted (`Name With Spaces`), or bare identifiers.
mdql rename <db-folder> <table> <old-name> <new-name>
Rename a file within a table. Automatically updates all foreign key references in other tables that point to the old filename.
# Renamed strategies/bad-debt-socialization-event-token-short.md → bad-debt-token-short.md
# Updated 3 references in backtests
mdql create <folder> --set key=value
Create a new row file. Field types are coerced from the schema (e.g. --set mechanism=5 becomes int).
For string[] fields, use comma-separated values: -s 'categories=funding-rates,defi'
mdql validate <folder>
Validate all markdown files against the schema. Works on a single table or a database directory.
# All 100 files valid in table 'strategies'
Invalid files get clear error messages:
missing-field.md: Missing required frontmatter field 'count'
wrong-type-date.md: Field 'created' expected date, got string 'yesterday'
duplicate-section.md: Duplicate section 'Body' (appears 2 times)
When pointed at a database directory, also reports foreign key violations (see Foreign key validation).
mdql inspect <folder>
Show normalized rows.
mdql stamp <folder>
Add or update created and modified timestamps in all data files.
# Stamped 100 files: 0 created set, 100 modified updated
createdis set to today's date if missing, never overwrittenmodifiedis always updated to today's date- Both are ISO date strings (
"YYYY-MM-DD") in frontmatter - These fields are reserved — schemas don't need to declare them, and they are never rejected as unknown fields
mdql schema <folder>
Print the effective schema. Works on a single table or the whole database.
mdql repl <folder>
Open an interactive REPL for running queries. Supports tab completion for table names, column names, and SQL keywords.
When pointed at a database directory, runs a background filesystem watcher that prints FK violations to stderr if files change on disk while the REPL is open.
mdql client <folder>
Open a browser-based UI for running queries. Starts a local web server with a query editor.
The web server exposes a REST API:
POST /api/query— execute SQLGET /api/fk-errors— current foreign key violations (updated by background watcher)
Pandas integration
One-liner
=
Two-step (when you already have rows)
, , =
=
Schema types map to pandas dtypes:
| MDQL type | pandas dtype |
|---|---|
string |
string |
int |
Int64 (nullable) |
float |
Float64 (nullable) |
bool |
boolean (nullable) |
date |
datetime64[ns] |
string[] |
Python lists |
Validation errors are handled via the errors parameter: "warn" (default), "raise", or "ignore".
ACID compliance
All write operations are process-safe. Three layers of protection:
Atomic writes. Every file write goes through a temp-file-then-rename path. If the process crashes mid-write, the original file is untouched.
Table locking. Write operations acquire an exclusive fcntl.flock per table. Two processes writing to the same table serialize rather than corrupt each other's files.
Write-ahead journal. Multi-file operations (ALTER TABLE, batch UPDATE/DELETE, stamp) write a journal before making changes. If the process crashes mid-operation, the next Table() construction detects the journal and rolls back all partial changes automatically.
# Safe even if the process is killed mid-way:
# touches 100 files + schema
# On crash: next Table("strategies/") auto-recovers from journal
Running tests
# Rust tests
# Python tests (requires maturin develop first)
Project structure
crates/
mdql-core/ # core library: parser, schema, validator, query engine,
# indexes, caching, full-text search, ACID transactions,
# FK validation, filesystem watcher
mdql/ # CLI binary: validate, query, create, inspect, schema,
# stamp, rename, repl (with autocomplete), client (web UI)
mdql-web/ # browser UI: axum REST server + embedded SPA
python/
src/lib.rs # PyO3 bindings (Rust → Python)
mdql/ # Python wrapper package (thin layer over Rust)
tests/ # Python test suite
examples/ # example data (strategies, backtests)
License
AGPL-3.0. Commercial licenses available — see LICENSE.md.