markbase 0.1.0-alpha.2

A high-performance CLI tool for indexing and querying Markdown files, built for AI agents. Obsidian-compatible.

Markdown Base CLI (markbase)

Installation

From Source

# Clone the repository
git clone <repository-url>
cd markbase

# Build release binary
cargo build --release

# The binary will be at target/release/markbase
./target/release/markbase --help

Prerequisites

  • Rust 1.85+ (2024 edition)
  • DuckDB (bundled with the duckdb crate)

Quick Start

# Index notes
markbase index --base-dir ./my-notes

# Query notes
markbase query "has(tags, 'todo')"

Properties

Every indexed markdown file has two types of properties: native file metadata and frontmatter properties.

Field Resolution: Reserved fields are checked first, then frontmatter properties.
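
The lookup order described above can be pictured with a minimal Rust sketch. The `Note` struct, the `resolve_field` function, and the string-only values are hypothetical simplifications for illustration, not markbase's actual internals.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified view of one indexed note (illustration only).
pub struct Note {
    /// Reserved (native) fields such as `path`, `name`, or `ext`.
    pub reserved: HashMap<String, String>,
    /// Properties parsed from the YAML frontmatter.
    pub frontmatter: HashMap<String, String>,
}

/// Look a field up in the documented order: reserved fields first,
/// then frontmatter properties. Returns `None` if neither defines it.
pub fn resolve_field<'a>(note: &'a Note, field: &str) -> Option<&'a str> {
    note.reserved
        .get(field)
        .or_else(|| note.frontmatter.get(field))
        .map(String::as_str)
}

/// Build a small sample note for experimentation.
pub fn sample_note() -> Note {
    let mut reserved = HashMap::new();
    reserved.insert("name".to_string(), "readme".to_string());
    let mut frontmatter = HashMap::new();
    // A conflicting `name` in frontmatter is shadowed by the reserved field.
    frontmatter.insert("name".to_string(), "ignored".to_string());
    frontmatter.insert("author".to_string(), "John".to_string());
    Note { reserved, frontmatter }
}
```

With the sample note, `resolve_field(&note, "name")` yields the reserved value `readme`, while `author` falls through to the frontmatter.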

Reserved Fields (Native Properties)

Field      Type       Description
path       TEXT       File path relative to base-dir (primary key)
folder     TEXT       Directory path relative to base-dir
name       TEXT       File name (without extension)
ext        TEXT       File extension (e.g., md)
size       INTEGER    File size in bytes
ctime      TIMESTAMP  Created time
mtime      TIMESTAMP  Modified time
content    TEXT       Full file content
tags       VARCHAR[]  Array of #tags (from content AND frontmatter)
links      VARCHAR[]  Array of [[wiki-links]]
backlinks  VARCHAR[]  Files linking to this file
embeds     VARCHAR[]  Array of ![[embeds]]

# Query reserved fields
markbase query "folder == './notes'"
markbase query "mtime > '2024-01-01'"
markbase query "size > 10000"
markbase query "has(tags, 'todo')"
markbase query "has(links, 'target-page')"
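
To make the links, embeds, and backlinks fields more concrete, here is a rough Rust sketch of how they could be derived. It uses plain string scanning from the standard library rather than the regex crate listed in the tech stack. The function names are hypothetical, and stripping `|alias` and `#heading` suffixes is an assumption based on Obsidian's link syntax, not confirmed markbase behavior.

```rust
use std::collections::BTreeMap;

/// Scan `content` for `[[wiki-links]]` and `![[embeds]]`.
/// Returns `(links, embeds)`. Hypothetical helper, not markbase's extractor.
pub fn extract_links_and_embeds(content: &str) -> (Vec<String>, Vec<String>) {
    let mut links = Vec::new();
    let mut embeds = Vec::new();
    let mut pos = 0;
    while let Some(start) = content[pos..].find("[[") {
        let open = pos + start;
        let body_start = open + 2;
        // Stop at an unterminated `[[`.
        let Some(len) = content[body_start..].find("]]") else { break };
        let inner = &content[body_start..body_start + len];
        // Assumption: keep only the target, dropping `|alias` and `#heading`.
        let target = inner
            .split(|c: char| c == '|' || c == '#')
            .next()
            .unwrap_or("")
            .trim();
        if !target.is_empty() {
            // A `!` directly before `[[` marks an embed.
            if content[..open].ends_with('!') {
                embeds.push(target.to_string());
            } else {
                links.push(target.to_string());
            }
        }
        pos = body_start + len + 2;
    }
    (links, embeds)
}

/// Invert outgoing links (file -> targets) into backlinks (target -> files).
pub fn build_backlinks(
    outgoing: &BTreeMap<String, Vec<String>>,
) -> BTreeMap<String, Vec<String>> {
    let mut incoming: BTreeMap<String, Vec<String>> = BTreeMap::new();
    for (source, targets) in outgoing {
        for target in targets {
            incoming.entry(target.clone()).or_default().push(source.clone());
        }
    }
    incoming
}
```

Backlinks are simply the outgoing-link map inverted, which is why they can only be complete after every file has been scanned.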

Frontmatter Properties

Properties defined in YAML frontmatter are also available:

---
title: My Note
author: John
category: project
status: in-progress
tags: [design, research]
date: 2024-01-15
---

# Query frontmatter properties (resolved automatically)
markbase query "author == 'John'"
markbase query "category == 'project'"
markbase query "status == 'in-progress'"
markbase query "has(tags, 'design')"

Note: If a frontmatter field conflicts with a reserved field (except tags), a warning will be shown during indexing and the frontmatter value will be ignored.
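
Since tags is the one reserved field that also accepts frontmatter values, a small Rust sketch may help show how the two sources could be combined. The function names, the tokenization rule for inline `#tags`, and the "frontmatter first, then body, without duplicates" ordering are all assumptions made for illustration. markbase's real extractor may differ.

```rust
/// True for characters this sketch allows inside a tag (assumed rule).
fn is_tag_char(c: char) -> bool {
    c.is_alphanumeric() || matches!(c, '_' | '-' | '/')
}

/// Collect inline `#tags` from the note body. Simplified rule: a tag is a
/// whitespace-separated token that starts with `#`, with trailing
/// punctuation removed. Headings such as `# Title` are skipped because
/// the `#` token carries no tag text.
pub fn inline_tags(body: &str) -> Vec<String> {
    body.split_whitespace()
        .filter_map(|token| token.strip_prefix('#'))
        .map(|rest| rest.trim_end_matches(|c: char| !is_tag_char(c)))
        .filter(|tag| !tag.is_empty() && tag.chars().all(is_tag_char))
        .map(str::to_string)
        .collect()
}

/// Merge frontmatter tags with inline tags, keeping first occurrences only.
pub fn merge_tags(frontmatter_tags: &[&str], body: &str) -> Vec<String> {
    let mut merged: Vec<String> =
        frontmatter_tags.iter().map(|t| t.to_string()).collect();
    for tag in inline_tags(body) {
        if !merged.contains(&tag) {
            merged.push(tag);
        }
    }
    merged
}
```

For a note with `tags: [design, research]` in frontmatter and `#todo` in the body, this sketch produces `design`, `research`, `todo`, so `has(tags, 'todo')` and `has(tags, 'design')` would both match.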

Property Types

Frontmatter Type  Query Example
String            author == 'John'
Number            year >= 2024
Boolean           published == true
Array             has(tags, 'design')
Date              date > '2024-01-01'
Exists            exists(author)

Commands

index

Scans Markdown files and indexes them into DuckDB.

markbase index --base-dir ./notes           # Index base directory
markbase index --base-dir ./notes --force   # Force re-index
markbase index --base-dir ./notes -v        # Verbose

query

Query indexed files with SQL-like expressions.

# Query reserved fields
markbase query "has(tags, 'project')"
markbase query "folder =~ '%projects%'"
markbase query "mtime > '2024-01-01'"
markbase query "size > 1000"

# Query frontmatter properties
markbase query "category == 'work'"
markbase query "author == 'John'"

# Nested properties
markbase query "_schema.strict == 'true'"

# Output formats
markbase query "has(tags, 'todo')" -o json
markbase query "has(tags, 'todo')" -o list
# Query results include count in output (table/list) or metadata (JSON)

# Select fields (default: path, mtime)
markbase query "name == 'readme'" -F "path,name,size"
markbase query "category == 'project'" -F "path,author,category"

new

Create a new Markdown note, optionally from a template.

markbase new my-note                    # Create note in base-dir
markbase new notes/my-note              # Create in subdirectory
markbase new my-note --template daily   # Create with template (outputs path + content)

With a template: the command prints the new note's path: followed by its content:, so an agent can pick both up directly:

path: /home/user/notes/today.md
content: ---
date: ""
mood: ""
summary: ""
tags: []
---

## Today's Log

---
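
An agent consuming this output needs to split it into the two parts. The following Rust sketch shows one way to do that, assuming the layout is exactly as in the sample: a first line starting with `path: `, then `content: ` followed by the rest of the note. The function name `parse_new_output` is hypothetical, and the exact output format should be checked against the actual tool.

```rust
/// Split the output of `markbase new ... --template ...` into
/// `(path, content)`. Returns `None` if the expected prefixes are missing.
/// Assumes the two-part layout shown in the sample output.
pub fn parse_new_output(output: &str) -> Option<(String, String)> {
    // The first line carries the path; everything after it is content.
    let (first_line, rest) = output.split_once('\n')?;
    let path = first_line.strip_prefix("path: ")?.trim();
    let content = rest.strip_prefix("content: ")?;
    Some((path.to_string(), content.to_string()))
}
```

The content part is returned untouched, including its frontmatter delimiters, so it can be edited and written back to the returned path.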

template

Manage templates (MKS schema-based templates).

MKS (Markdown Knowledge Schema) is a protocol for connecting unstructured conversation flow with structured knowledge bases. See spec/schema.md for the complete specification.

markbase template list                  # List all templates (default: table format)
markbase template list -o json          # List in JSON format
markbase template list -o list          # List in list format
markbase template list -F "tags,type"   # List with additional fields
markbase template describe daily        # Show template content

Note: Templates are expected in the templates/ directory under base-dir. Default fields shown: name, _schema.description, path.

Fields: Reserved fields (path, folder, name, ext, size, ctime, mtime, content, tags, links, backlinks, embeds) and frontmatter properties (e.g., author, category). Nested properties supported (e.g., _schema.strict).

Operators: ==, !=, >, <, >=, <=, =~ (LIKE), and, or

Functions: has(field, value) - array containment | exists(field) - property existence check
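
As a rough illustration of how these operators and functions could map onto SQL, here is a Rust sketch of a tiny expression compiler. The `Expr` type and `to_sql` function are hypothetical and do not mirror markbase's parser or compiler. The sketch treats every field as a plain column and every value as a string literal, maps `has` to DuckDB's `list_contains`, and maps `exists` to `IS NOT NULL`, all of which are assumptions.

```rust
/// Hypothetical AST for a small subset of the query language.
pub enum Expr {
    /// `field <op> 'value'`, with op one of ==, !=, >, <, >=, <=, =~
    Compare { field: String, op: String, value: String },
    /// `has(field, 'value')`
    Has { field: String, value: String },
    /// `exists(field)`
    Exists { field: String },
    And(Box<Expr>, Box<Expr>),
    Or(Box<Expr>, Box<Expr>),
}

/// Quote a string literal for SQL by doubling embedded single quotes.
fn quote(value: &str) -> String {
    format!("'{}'", value.replace('\'', "''"))
}

/// Compile an expression into a SQL condition. Field names are assumed to
/// be validated identifiers already; this sketch does not check them.
pub fn to_sql(expr: &Expr) -> Result<String, String> {
    match expr {
        Expr::Compare { field, op, value } => {
            let sql_op = match op.as_str() {
                "==" => "=",
                "!=" => "<>",
                "=~" => "LIKE",
                ">" | "<" | ">=" | "<=" => op.as_str(),
                other => return Err(format!("unsupported operator: {other}")),
            };
            Ok(format!("{field} {sql_op} {}", quote(value)))
        }
        Expr::Has { field, value } => {
            Ok(format!("list_contains({field}, {})", quote(value)))
        }
        Expr::Exists { field } => Ok(format!("{field} IS NOT NULL")),
        Expr::And(l, r) => Ok(format!("({} AND {})", to_sql(l)?, to_sql(r)?)),
        Expr::Or(l, r) => Ok(format!("({} OR {})", to_sql(l)?, to_sql(r)?)),
    }
}
```

Under these assumptions, the query `has(tags, 'todo') and folder =~ '%projects%'` would compile to `(list_contains(tags, 'todo') AND folder LIKE '%projects%')`.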

Note: Reserved fields are checked first, then frontmatter properties. If a frontmatter field conflicts with a reserved field (except tags), a warning will be shown during indexing.

Note: Timestamps are displayed in human-readable format (YYYY-MM-DD HH:MM:SS)
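
For reference, the YYYY-MM-DD HH:MM:SS layout can be produced from a Unix timestamp with the standard library alone. The Rust sketch below is hypothetical and renders UTC. Whether markbase displays UTC or local time is not stated here, so treat the time zone as an assumption.

```rust
/// Convert days since 1970-01-01 into `(year, month, day)` in the
/// proleptic Gregorian calendar, using a standard era-based calculation.
fn civil_from_days(days: i64) -> (i64, i64, i64) {
    let z = days + 719_468; // shift the epoch to 0000-03-01
    let era = z.div_euclid(146_097); // 400-year cycles
    let doe = z.rem_euclid(146_097); // day within the cycle
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365;
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of year, March-based
    let mp = (5 * doy + 2) / 153; // month index, March = 0
    let day = doy - (153 * mp + 2) / 5 + 1;
    let month = if mp < 10 { mp + 3 } else { mp - 9 };
    // January and February belong to the following civil year.
    let year = yoe + era * 400 + i64::from(month <= 2);
    (year, month, day)
}

/// Format Unix seconds as `YYYY-MM-DD HH:MM:SS` (UTC).
pub fn format_timestamp(unix_secs: i64) -> String {
    let days = unix_secs.div_euclid(86_400);
    let secs = unix_secs.rem_euclid(86_400);
    let (year, month, day) = civil_from_days(days);
    format!(
        "{:04}-{:02}-{:02} {:02}:{:02}:{:02}",
        year,
        month,
        day,
        secs / 3_600,
        (secs % 3_600) / 60,
        secs % 60
    )
}
```

For example, `format_timestamp(1_705_280_461)` gives `2024-01-15 01:01:01`.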

Environment Variables

Variable                Description                                 Default
MARKBASE_BASE_DIR       Base directory for indexing                 .
MARKBASE_OUTPUT_FORMAT  Output format for query and template list   table

Priority: CLI arguments > Environment variables > Defaults

# Set environment variables
export MARKBASE_BASE_DIR=/path/to/notes
export MARKBASE_OUTPUT_FORMAT=json

# Use environment variables
markbase query "has(tags, 'design')"

# CLI arguments override environment variables
markbase index --base-dir /other/dir
markbase --output-format json query "..."
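
The priority rule (CLI arguments, then environment variables, then defaults) amounts to a simple fallback chain. The Rust sketch below illustrates the idea. The function names are hypothetical, and markbase itself may implement this through its CLI framework instead.

```rust
/// Resolve a setting using the documented priority:
/// CLI argument > environment variable > default.
pub fn resolve_setting(
    cli_value: Option<&str>,
    env_value: Option<&str>,
    default: &str,
) -> String {
    cli_value.or(env_value).unwrap_or(default).to_string()
}

/// Resolve the base directory from an optional `--base-dir` value and the
/// `MARKBASE_BASE_DIR` environment variable, falling back to `.`.
pub fn resolve_base_dir(cli_base_dir: Option<&str>) -> String {
    let env_value = std::env::var("MARKBASE_BASE_DIR").ok();
    resolve_setting(cli_base_dir, env_value.as_deref(), ".")
}
```

Keeping the fallback logic in a pure function such as `resolve_setting` makes it easy to test without touching the process environment.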

Features

  • Fast indexing with DuckDB
  • SQL-like query language
  • Obsidian support (wiki-links, embeds, frontmatter, tags)
  • Incremental updates
  • File watching mode for auto-reindexing
  • Multiple output formats (table, json, list)
  • Human-readable timestamps
  • Shorthand field notation for conciseness
  • Note creation with templates
  • Template listing with MKS schema support

Development

# Build debug version
cargo build

# Run in development
cargo run -- index --base-dir ./notes
cargo run -- query "file.name == 'readme'"

# Run tests
cargo test

# Run tests with output
cargo test -- --nocapture

# Build release
cargo build --release

# Run with verbose output
cargo run -- index --base-dir ./notes -v

Testing

The project includes comprehensive unit tests covering all major components:

  • 127 total tests across all modules
  • Query System: Tokenizer, parser, compiler, and SQL generation
  • Content Extraction: Frontmatter, tags, wiki-links, embeds
  • Database: CRUD operations, queries, and filtering
  • Scanner: File discovery, indexing, and backlink tracking
  • Watcher: File monitoring and incremental indexing
  • Output: Table, JSON, and list formatting

Run tests with: cargo test

Tech Stack

  • Language: Rust 1.85+ (2024 edition)
  • CLI Framework: clap v4.5 (derive feature)
  • Database: DuckDB via duckdb crate (bundled feature)
  • File Discovery: walkdir v2.5
  • Parser: gray_matter (frontmatter), regex (wiki-links/tags)
  • Serialization: serde, serde_json

Project Structure

markbase/
├── Cargo.toml           # Rust dependencies and metadata
├── Cargo.lock           # Dependency lock file
├── README.md            # User documentation
├── AGENTS.md            # Agent specification
├── src/
│   ├── main.rs          # CLI entry point with clap
│   ├── db.rs            # DuckDB database operations
│   ├── scanner.rs       # File discovery and indexing
│   ├── extractor.rs     # Markdown content extraction
│   ├── creator.rs       # Note creation with templates
│   ├── describe.rs      # Template description
│   ├── lib.rs           # Library exports
│   └── query/           # Query system
│       ├── mod.rs       # Output formatting (table/json/list)
│       ├── tokenizer.rs # Query tokenization
│       ├── parser.rs    # AST parsing
│       └── compiler.rs  # SQL compilation
└── target/              # Build output

License

MIT