Markdown Base CLI (markbase)
A high-performance CLI tool for indexing and querying Markdown files, built for AI agents. Obsidian-compatible.
Installation
From Source
# Clone the repository
# Build release binary
# The binary will be at target/release/markbase
Prerequisites
- Rust 1.85+ (2024 edition)
- DuckDB (bundled with the `duckdb` crate)
Quick Start
# Index notes
# Query notes
Properties
Every indexed markdown file has two types of properties: native file metadata and frontmatter properties.
Field Resolution: Reserved fields are checked first, then frontmatter properties.
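The resolution order above can be sketched in a few lines of Rust. This is a minimal illustration, not markbase's actual code: `resolve_field` and the `Resolved` enum are hypothetical names, and frontmatter values are simplified to strings.

```rust
use std::collections::HashMap;

/// Reserved (native) field names, as listed in this README.
const RESERVED_FIELDS: [&str; 12] = [
    "path", "folder", "name", "ext", "size", "ctime",
    "mtime", "content", "tags", "links", "backlinks", "embeds",
];

/// Where a field name used in a query resolves to.
#[derive(Debug, PartialEq)]
enum Resolved<'a> {
    /// A native column of the indexed file.
    Reserved(&'a str),
    /// A value found in the note's YAML frontmatter.
    Frontmatter(&'a str),
    /// Neither a reserved field nor a frontmatter key.
    Missing,
}

/// Hypothetical helper: check reserved fields first, then frontmatter.
fn resolve_field<'a>(name: &'a str, frontmatter: &'a HashMap<String, String>) -> Resolved<'a> {
    if RESERVED_FIELDS.contains(&name) {
        return Resolved::Reserved(name);
    }
    match frontmatter.get(name) {
        Some(value) => Resolved::Frontmatter(value.as_str()),
        None => Resolved::Missing,
    }
}
```

In this sketch a frontmatter key called `name` never shadows the reserved `name` field, because the reserved list is consulted first.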
Reserved Fields (Native Properties)
| Field | Type | Description |
|---|---|---|
| `path` | TEXT | File path relative to base-dir (primary key) |
| `folder` | TEXT | Directory path relative to base-dir |
| `name` | TEXT | File name (without extension) |
| `ext` | TEXT | File extension (e.g., md) |
| `size` | INTEGER | File size in bytes |
| `ctime` | TIMESTAMP | Created time |
| `mtime` | TIMESTAMP | Modified time |
| `content` | TEXT | Full file content |
| `tags` | VARCHAR[] | Array of #tags (from content AND frontmatter) |
| `links` | VARCHAR[] | Array of [[wiki-links]] |
| `backlinks` | VARCHAR[] | Files linking to this file |
| `embeds` | VARCHAR[] | Array of ![[embeds]] |
# Query reserved fields
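To make the `links` and `embeds` fields concrete, here is a rough sketch of how `[[wiki-links]]` and `![[embeds]]` might be pulled out of note content. It is an assumption-laden illustration: the README states that the project uses regex for this, while the sketch uses plain string scanning to stay dependency-free. `extract_links` and `Extracted` are hypothetical names, and trimming aliases (`[[target|label]]`) and heading anchors (`[[target#heading]]`) down to the bare target is an assumed behaviour.

```rust
/// Wiki-link and embed targets found in one note.
#[derive(Debug, Default, PartialEq)]
struct Extracted {
    links: Vec<String>,
    embeds: Vec<String>,
}

/// Hypothetical helper: scan `content` for `[[target]]` and `![[target]]`.
/// Assumption: embeds are reported separately and are not also counted as links.
fn extract_links(content: &str) -> Extracted {
    let mut out = Extracted::default();
    let mut pos = 0;
    while let Some(open) = content[pos..].find("[[") {
        let start = pos + open; // byte index of the opening "[["
        let body_start = start + 2;
        let close = match content[body_start..].find("]]") {
            Some(offset) => offset,
            None => break, // unterminated link: stop scanning
        };
        let inner = &content[body_start..body_start + close];
        // Keep only the part before an alias ('|') or heading anchor ('#').
        let target = inner
            .split(|c: char| c == '|' || c == '#')
            .next()
            .unwrap_or("")
            .trim();
        // An embed is a wiki-link immediately preceded by '!'.
        let is_embed = content[..start].ends_with('!');
        if !target.is_empty() {
            if is_embed {
                out.embeds.push(target.to_string());
            } else {
                out.links.push(target.to_string());
            }
        }
        pos = body_start + close + 2; // continue after the closing "]]"
    }
    out
}
```

For example, `extract_links("See [[Project Plan|the plan]] and ![[diagram.png]]")` yields one link (`Project Plan`) and one embed (`diagram.png`).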
Frontmatter Properties
Properties defined in YAML frontmatter are also available:
---
title: My Note
author: John
category: project
status: in-progress
tags:
date:
---
# Query frontmatter properties (resolved automatically)
Note: If a frontmatter field conflicts with a reserved field (except tags), a warning will be shown during indexing and the frontmatter value will be ignored.
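The conflict rule in the note above could look roughly like this during indexing. This is only a sketch under stated assumptions: `filter_frontmatter` is a hypothetical function, the warning text is invented for illustration, and frontmatter values are simplified to strings.

```rust
use std::collections::BTreeMap;

/// Reserved (native) field names, as listed in this README.
const RESERVED_FIELDS: [&str; 12] = [
    "path", "folder", "name", "ext", "size", "ctime",
    "mtime", "content", "tags", "links", "backlinks", "embeds",
];

/// Hypothetical helper: drop frontmatter keys that collide with reserved fields.
/// `tags` is exempt because tags come from both content and frontmatter.
/// Returns the kept properties plus one warning message per dropped key.
fn filter_frontmatter(
    frontmatter: BTreeMap<String, String>,
) -> (BTreeMap<String, String>, Vec<String>) {
    let mut kept = BTreeMap::new();
    let mut warnings = Vec::new();
    for (key, value) in frontmatter {
        if key != "tags" && RESERVED_FIELDS.contains(&key.as_str()) {
            // Illustrative wording only; the real warning text is not documented here.
            warnings.push(format!(
                "frontmatter field '{key}' conflicts with a reserved field and is ignored"
            ));
        } else {
            kept.insert(key, value);
        }
    }
    (kept, warnings)
}
```

With frontmatter containing `author`, `path`, and `tags`, this sketch keeps `author` and `tags` and emits a single warning for `path`.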
Property Types
| Frontmatter Type | Query Example |
|---|---|
| String | author == 'John' |
| Number | year >= 2024 |
| Boolean | published == true |
| Array | has(tags, 'design') |
| Date | date > '2024-01-01' |
| Exists | exists(author) |
Commands
index
Scans Markdown files and indexes them into DuckDB.
query
Query indexed files with SQL-like expressions.
# Query reserved fields
# Query frontmatter properties
# Nested properties
# Output formats
# Query results include count in output (table/list) or metadata (JSON)
# Select fields (default: path, mtime)
new
Create a new Markdown note, optionally from a template.
With a template, the command returns `path:` and `content:` fields for agent workflow integration:
path: /home/user/notes/today.md
content: ---
date: ""
mood: ""
summary: ""
tags: []
---
## Today's Log
---
template
Manage templates (MKS schema-based templates).
MKS (Markdown Knowledge Schema) is a protocol for connecting unstructured conversation flow with structured knowledge bases. See spec/schema.md for the complete specification.
Note: Templates are expected in the templates/ directory under base-dir. Default fields shown: name, _schema.description, path.
Fields: Reserved fields (path, folder, name, ext, size, ctime, mtime, content, tags, links, backlinks, embeds) and frontmatter properties (e.g., author, category). Nested properties supported (e.g., _schema.strict).
Operators: ==, !=, >, <, >=, <=, =~ (LIKE), and, or
Functions: has(field, value) - array containment | exists(field) - property existence check
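As a rough illustration of how the operators listed above might be compiled to SQL, the sketch below maps each operator token to a SQL spelling and renders a single comparison. Only the `=~` to `LIKE` mapping is stated in this README; the other SQL spellings, the names `Op`, `parse_op`, `to_sql`, and `compile_comparison`, and the choice to treat every literal as a quoted string are assumptions made for the example.

```rust
/// Operators accepted by the query language, per this README.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Op { Eq, Ne, Gt, Lt, Ge, Le, Like, And, Or }

/// Hypothetical helper: recognise an operator token.
fn parse_op(token: &str) -> Option<Op> {
    match token {
        "==" => Some(Op::Eq),
        "!=" => Some(Op::Ne),
        ">" => Some(Op::Gt),
        "<" => Some(Op::Lt),
        ">=" => Some(Op::Ge),
        "<=" => Some(Op::Le),
        "=~" => Some(Op::Like),
        "and" => Some(Op::And),
        "or" => Some(Op::Or),
        _ => None,
    }
}

/// Assumed SQL spelling for each operator.
fn to_sql(op: Op) -> &'static str {
    match op {
        Op::Eq => "=",
        Op::Ne => "<>",
        Op::Gt => ">",
        Op::Lt => "<",
        Op::Ge => ">=",
        Op::Le => "<=",
        Op::Like => "LIKE",
        Op::And => "AND",
        Op::Or => "OR",
    }
}

/// Hypothetical helper: render `field <op> 'literal'` for comparison operators.
/// Simplification: the literal is always emitted as a quoted SQL string.
fn compile_comparison(field: &str, op_token: &str, literal: &str) -> Option<String> {
    let op = parse_op(op_token)?;
    if matches!(op, Op::And | Op::Or) {
        return None; // logical operators join comparisons; they are not comparisons
    }
    // Escape single quotes by doubling them, as SQL string literals require.
    let escaped = literal.replace('\'', "''");
    Some(format!("{field} {} '{escaped}'", to_sql(op)))
}
```

Under these assumptions, `compile_comparison("name", "=~", "%plan%")` produces `name LIKE '%plan%'`.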
Note: Reserved fields are checked first, then frontmatter properties. If a frontmatter field conflicts with a reserved field (except tags), a warning will be shown during indexing.
Note: Timestamps are displayed in human-readable format (YYYY-MM-DD HH:MM:SS)
Environment Variables
| Variable | Description | Default |
|---|---|---|
| `MARKBASE_BASE_DIR` | Base directory for indexing | `.` |
| `MARKBASE_OUTPUT_FORMAT` | Output format for query and template list | `table` |
Priority: CLI arguments > Environment variables > Defaults
# Set environment variables
# Use environment variables
# CLI arguments override environment variables
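The priority rule (CLI arguments, then environment variables, then defaults) can be expressed as a small fallback chain. This is a minimal sketch rather than the tool's actual implementation: `resolve_setting` and `output_format` are hypothetical names, while `MARKBASE_OUTPUT_FORMAT` and the `table` default come from the table above.

```rust
/// Hypothetical helper: the CLI value wins, then the environment value, then the default.
fn resolve_setting(cli: Option<&str>, env: Option<&str>, default: &str) -> String {
    cli.or(env).unwrap_or(default).to_string()
}

/// Reads MARKBASE_OUTPUT_FORMAT from the process environment and applies the rule.
fn output_format(cli: Option<&str>) -> String {
    // `.ok()` turns an unset or non-UTF-8 variable into `None`.
    let env = std::env::var("MARKBASE_OUTPUT_FORMAT").ok();
    resolve_setting(cli, env.as_deref(), "table")
}
```

Keeping the environment lookup separate from `resolve_setting` makes the priority logic testable without touching the process environment.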
Features
- Fast indexing with DuckDB
- SQL-like query language
- Obsidian support (wiki-links, embeds, frontmatter, tags)
- Incremental updates
- File watching mode for auto-reindexing
- Multiple output formats (table, json, list)
- Human-readable timestamps
- Shorthand field notation for conciseness
- Note creation with templates
- Template listing with MKS schema support
Development
# Build debug version
# Run in development
# Run tests
# Run tests with output
# Build release
# Run with verbose output
Testing
The project includes comprehensive unit tests covering all major components:
- 127 total tests across all modules
- Query System: Tokenizer, parser, compiler, and SQL generation
- Content Extraction: Frontmatter, tags, wiki-links, embeds
- Database: CRUD operations, queries, and filtering
- Scanner: File discovery, indexing, and backlink tracking
- Watcher: File monitoring and incremental indexing
- Output: Table, JSON, and list formatting
Run tests with: cargo test
Tech Stack
- Language: Rust 1.85+ (2024 edition)
- CLI Framework: clap v4.5 (derive feature)
- Database: DuckDB via `duckdb` crate (bundled feature)
- File Discovery: walkdir v2.5
- Parser: gray_matter (frontmatter), regex (wiki-links/tags)
- Serialization: serde, serde_json
Project Structure
markbase/
├── Cargo.toml # Rust dependencies and metadata
├── Cargo.lock # Dependency lock file
├── README.md # User documentation
├── AGENTS.md # Agent specification
├── src/
│ ├── main.rs # CLI entry point with clap
│ ├── db.rs # DuckDB database operations
│ ├── scanner.rs # File discovery and indexing
│ ├── extractor.rs # Markdown content extraction
│ ├── creator.rs # Note creation with templates
│ ├── describe.rs # Template description
│ ├── lib.rs # Library exports
│ └── query/ # Query system
│   ├── mod.rs # Output formatting (table/json/list)
│   ├── tokenizer.rs # Query tokenization
│   ├── parser.rs # AST parsing
│   └── compiler.rs # SQL compilation
└── target/ # Build output
License
MIT