# docbert
A blazing-fast semantic search CLI for your documents. Combines BM25 full-text search with ColBERT neural reranking to find exactly what you're looking for.
## Features
- Two-stage search pipeline: Fast BM25 retrieval followed by ColBERT semantic reranking
- Semantic-only search: ColBERT-only full scan when you want pure semantic ranking
- Collection-based organization: Group documents into named collections
- Incremental indexing: Only re-index changed files
- Multiple output formats: Human-readable, JSON, or plain file paths
- GPU acceleration: CUDA and Metal support for faster embeddings
- Fuzzy matching: Tolerates typos in search queries
- Zero configuration: Just point it at a directory and search
## Quick Start

```sh
# Add a collection of markdown notes
# Synchronize indexes and embeddings
# Search across all collections
# Search with semantic understanding (default)
# Semantic-only full scan (ColBERT only)
# Fast BM25-only search (no neural reranking)
# Output as JSON for scripting
# Get file paths for piping to other tools
```
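The commands for the block above appear to have been stripped, leaving only the comments. A plausible sketch is below; apart from `ssearch` and `--bm25-only`, which appear elsewhere in this README, every subcommand and flag name here is an assumption, and the collection name and queries are examples.

```sh
docbert add notes ~/notes                     # Add a collection of markdown notes
docbert sync                                  # Synchronize indexes and embeddings
docbert search "error handling"               # Search across all collections (semantic by default)
docbert ssearch "error handling"              # Semantic-only full scan (ColBERT only)
docbert search --bm25-only "error handling"   # Fast BM25-only search
docbert search --json "error handling"        # Output as JSON for scripting
docbert search --paths "error handling" | xargs grep -l panic   # Paths for piping
```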
## MCP Server

docbert exposes an MCP (Model Context Protocol) server for AI agent integrations.

Tools exposed:

- `docbert_search` - Keyword + semantic search (supports collection filters)
- `semantic_search` - Semantic-only search across all documents
- `docbert_get` - Retrieve a document by path or `#doc_id`
- `docbert_multi_get` - Retrieve multiple documents by glob pattern
- `docbert_status` - Index health and collection summary
Claude Desktop configuration (~/Library/Application Support/Claude/claude_desktop_config.json):
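The configuration snippet itself seems to be missing from this copy. A typical stdio MCP-server entry would look roughly like this, assuming a hypothetical `docbert mcp` subcommand starts the server:

```json
{
  "mcpServers": {
    "docbert": {
      "command": "docbert",
      "args": ["mcp"]
    }
  }
}
```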
Claude Code (~/.claude/settings.json):
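This snippet is missing as well; it would presumably mirror the Claude Desktop entry (same assumption about a hypothetical `docbert mcp` subcommand):

```json
{
  "mcpServers": {
    "docbert": {
      "command": "docbert",
      "args": ["mcp"]
    }
  }
}
```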
## Installation

### With Nix (recommended)

```sh
# CPU version
# CUDA version (for NVIDIA GPUs)
```
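The install commands are missing from the block above. If the project is packaged as a Nix flake, installation would look roughly like this (`OWNER/REPO` is a placeholder for the actual repository, and the `docbert-cuda` output name is an assumption):

```sh
# CPU version
nix profile install github:OWNER/REPO

# CUDA version (for NVIDIA GPUs)
nix profile install github:OWNER/REPO#docbert-cuda
```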
Shell completions for bash, zsh, and fish are automatically installed with the Nix package.
### From source

```sh
# CPU build
# With CUDA support
# With Metal support (macOS)
```
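Since Tantivy and pylate-rs imply a Rust codebase, the missing build commands were probably `cargo` invocations along these lines (the `cuda` and `metal` feature names are assumptions):

```sh
# CPU build
cargo install --path .

# With CUDA support
cargo install --path . --features cuda

# With Metal support (macOS)
cargo install --path . --features metal
```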
## Usage

### Managing Collections

```sh
# Add a directory as a collection
# List all collections
# Remove a collection
```
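The commands themselves are missing from the block above; a sketch with assumed subcommand names and an example collection:

```sh
docbert add notes ~/notes    # Add a directory as a collection
docbert list                 # List all collections
docbert remove notes         # Remove a collection
```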
### Searching

```sh
# Basic search (returns top 10 results)
# More results
# Search specific collection
# All results above a score threshold
# Disable fuzzy matching for exact searches
# Semantic-only full scan (no BM25 or fuzzy matching, slower for large corpora)
```
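A sketch of what the invocations above might look like. Only `ssearch` and `--bm25-only` are confirmed elsewhere in this README; the other flags are assumptions:

```sh
docbert search "async runtime"                      # Basic search (top 10 results)
docbert search -n 25 "async runtime"                # More results
docbert search --collection notes "async runtime"   # Search specific collection
docbert search --min-score 0.5 "async runtime"      # Results above a score threshold
docbert search --no-fuzzy "tokio::select!"          # Exact search, no fuzzy matching
docbert ssearch "waiting on several futures"        # Semantic-only full scan
```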
### Retrieving Documents

```sh
# Get document by collection:path
# Get by document ID
# Output with metadata
# Get multiple documents with glob patterns
```
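The retrieval commands are missing from the block above; a sketch with assumed subcommand names, flags, and example paths:

```sh
docbert get notes:rust/async.md              # Get document by collection:path
docbert get '#42'                            # Get by document ID
docbert get --metadata notes:rust/async.md   # Output with metadata
docbert multi-get 'notes:rust/*.md'          # Get multiple documents with a glob
```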
### Maintenance

```sh
# Show system status
# Sync changes incrementally (fast, only processes changed files)
# Sync specific collection
# Full rebuild (deletes everything and re-indexes from scratch)
# Rebuild specific collection
```
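The maintenance commands are missing from the block above; a sketch with assumed subcommand names:

```sh
docbert status          # Show system status
docbert sync            # Incremental sync (only changed files)
docbert sync notes      # Sync one collection
docbert rebuild         # Full rebuild from scratch
docbert rebuild notes   # Rebuild one collection
```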
## How It Works

docbert uses a two-stage retrieval pipeline:

1. **BM25 Retrieval (via Tantivy)**: Fast full-text search with fuzzy matching retrieves the top 1000 candidates
2. **ColBERT Reranking (via pylate-rs)**: Neural semantic scoring reranks candidates using ColBERT-Zero
This approach gives you the speed of traditional search with the semantic understanding of neural models.
For cases where you want pure semantic ranking, `docbert ssearch` (and the MCP `semantic_search` tool) skips Tantivy entirely and scores every stored embedding. This is slower but avoids any BM25 or fuzzy-matching influence.
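To make the two-stage idea concrete, here is a toy, self-contained sketch of the pipeline in Python. It is not docbert's actual code: stage 1 is the textbook BM25 formula, and the "ColBERT" stage stands in for real token embeddings with a crude token-similarity MaxSim.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Stage 1: classic BM25 over whitespace tokens."""
    tokenized = [d.lower().split() for d in docs]
    avgdl = sum(len(t) for t in tokenized) / len(tokenized)
    n = len(docs)
    df = Counter()                      # document frequency per term
    for toks in tokenized:
        df.update(set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for q in query.lower().split():
            if tf[q] == 0:
                continue
            idf = math.log(1 + (n - df[q] + 0.5) / (df[q] + 0.5))
            norm = tf[q] * (k1 + 1) / (tf[q] + k1 * (1 - b + b * len(toks) / avgdl))
            score += idf * norm
        scores.append(score)
    return scores

def maxsim_rerank(query, candidates):
    """Stage 2: ColBERT-style late interaction. For every query token, take
    its best similarity against the document's tokens, then sum. Real ColBERT
    uses dot products between token embeddings; here an exact match scores
    1.0 and a shared 3-character prefix scores 0.5."""
    def sim(a, b):
        if a == b:
            return 1.0
        return 0.5 if len(a) >= 3 and a[:3] == b[:3] else 0.0
    scored = []
    for doc in candidates:
        dtoks = doc.lower().split()
        s = sum(max((sim(q, d) for d in dtoks), default=0.0)
                for q in query.lower().split())
        scored.append((s, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored]

def search(query, docs, pool=1000):
    # Stage 1 cheaply narrows the corpus to a candidate pool (top 1000 in
    # docbert); stage 2 applies the expensive scorer only to that pool.
    ranked = sorted(zip(bm25_scores(query, docs), docs), reverse=True)
    candidates = [d for s, d in ranked[:pool] if s > 0]
    return maxsim_rerank(query, candidates)
```

The split is what buys the speed: BM25 touches an inverted index, so only the small candidate pool ever reaches the quadratic token-by-token MaxSim scoring.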
Configuration
docbert stores its data in ~/.local/share/docbert/ (or $XDG_DATA_HOME/docbert/).
Override with --data-dir:
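The example command is missing here; it presumably looked something like this (the subcommand is illustrative):

```sh
docbert --data-dir /tmp/docbert-scratch search "query"
```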
Data directory resolution order:

1. `--data-dir` CLI flag
2. `DOCBERT_DATA_DIR` environment variable
3. XDG default (`$XDG_DATA_HOME/docbert/` or `~/.local/share/docbert/`)
### Environment Variables

- `DOCBERT_DATA_DIR`: Override the data directory (lower priority than `--data-dir`)
- `DOCBERT_MODEL`: Override the ColBERT model (default: `lightonai/ColBERT-Zero`)
- `DOCBERT_LOG`: Set log level (e.g., `debug`, `info`, `warn`)
### Model Selection

Persist a default model in `config.db`:

Override per command:
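Both commands are missing from this copy; they presumably looked something like the following. The `config` subcommand and `--model` flag are guesses; only the `DOCBERT_MODEL` variable is documented above.

```sh
# Persist a default model in config.db (hypothetical subcommand)
docbert config set model lightonai/ColBERT-Zero

# Override per command (hypothetical flag)
docbert search --model lightonai/ColBERT-Zero "query"
```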
### Alternative Models

The default model (`lightonai/ColBERT-Zero`) works out of the box. To use a different pylate-rs-compatible model, set `DOCBERT_MODEL` to a model ID or a local path:

```sh
DOCBERT_MODEL=/path/to/model
```
## Supported File Types

- Markdown (`.md`)
- Plain text (`.txt`)
## Performance Tips

- Use `--bm25-only` for fast searches when semantic understanding isn't needed
- The ColBERT model is lazy-loaded on first semantic search
- GPU acceleration significantly speeds up embedding computation
- Incremental indexing means only changed files are re-processed
## License
MIT OR Apache-2.0