rlm-rs
Recursive Language Model (RLM) CLI for Claude Code - handles long-context tasks via chunking and recursive sub-LLM calls.
Based on the RLM pattern from arXiv:2512.24601, enabling analysis of documents up to 100x larger than typical context windows.
Features
- Hybrid Semantic Search: Combined semantic + BM25 search with RRF fusion
- Auto-Embedding: Embeddings generated automatically during load (BGE-M3 model)
- Pass-by-Reference: Retrieve chunks by ID for efficient subagent processing
- Multiple Chunking Strategies: Fixed, semantic, code-aware, and parallel chunking
- Code-Aware Chunking: Language-aware chunking at function/class boundaries
- HNSW Vector Index: Optional scalable approximate nearest neighbor search
- Incremental Embedding: Efficient partial re-embedding for updated content
- Agentic Workflow Support: dispatch/aggregate commands for parallel subagent processing
- SQLite State Persistence: Reliable buffer management across sessions
- Regex Search: Fast content search with context windows
- Memory-Mapped I/O: Efficient handling of large files
- JSON/NDJSON Output: Machine-readable output for integration
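The hybrid search feature above merges a semantic ranking and a BM25 ranking with RRF (Reciprocal Rank Fusion). As a rough illustration of the idea, and not the project's actual implementation, the sketch below fuses ranked lists of chunk IDs. The function name `rrf_fuse`, the `u64` ID type, and the constant `k = 60.0` (a value commonly used for RRF) are assumptions made for this example.

```rust
use std::collections::HashMap;

/// Fuse several ranked lists of chunk IDs with Reciprocal Rank Fusion.
/// Each list is ordered best-first. `k` dampens the influence of top ranks.
/// NOTE: hypothetical helper for illustration, not part of the rlm-cli API.
fn rrf_fuse(rankings: &[Vec<u64>], k: f64) -> Vec<(u64, f64)> {
    let mut scores: HashMap<u64, f64> = HashMap::new();
    for ranking in rankings {
        for (i, id) in ranking.iter().enumerate() {
            // Ranks are 1-based, so the top hit contributes 1 / (k + 1).
            *scores.entry(*id).or_insert(0.0) += 1.0 / (k + (i as f64) + 1.0);
        }
    }
    let mut fused: Vec<(u64, f64)> = scores.into_iter().collect();
    // Highest score first; ties are broken by ID so the output is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
    fused
}
```

Calling `rrf_fuse(&[semantic_ids, bm25_ids], 60.0)` returns chunk IDs ordered by fused score. A chunk that appears near the top of both lists outranks one that appears high in only one of them.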
How It Works
Installation
Via Cargo (Recommended)
Via Homebrew
From Source
Verify a Release Binary
Every release binary carries SLSA build provenance, and releases only publish after fail-closed attestation verification. To verify a download yourself:
See SECURITY.md for details.
Quick Start
# Initialize the database
# Load a large document (auto-generates embeddings)
# Search with hybrid semantic + BM25
# Retrieve chunk by ID (pass-by-reference)
# Check status
# Regex search content
# View content slice
Commands
| Command | Description |
|---|---|
| `init` | Initialize the RLM database |
| `status` | Show current state (buffers, chunks, DB info) |
| `load` | Load a file into a buffer with chunking (auto-embeds) |
| `search` | Hybrid semantic + BM25 search across chunks |
| `update-buffer` | Update buffer content with re-chunking |
| `dispatch` | Split chunks into batches for parallel subagent processing |
| `aggregate` | Combine findings from analyst subagents |
| `chunk get` | Retrieve chunk by ID (pass-by-reference) |
| `chunk list` | List chunks for a buffer |
| `chunk embed` | Generate embeddings (or re-embed with `--force`) |
| `chunk status` | Show embedding status |
| `list` | List all buffers |
| `show` | Show buffer details |
| `delete` | Delete a buffer |
| `peek` | View a slice of buffer content |
| `grep` | Search buffer content with regex |
| `write-chunks` | Write chunks to individual files |
| `add-buffer` | Add text to a new buffer |
| `export-buffers` | Export all buffers to JSON |
| `var` | Get/set context variables |
| `global` | Get/set global variables |
| `reset` | Delete all RLM state |
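The `dispatch` command splits chunks into batches so that subagents can work in parallel. The sketch below shows one simple way to group chunk IDs into batches. The function name `batch_chunk_ids`, the `u64` ID type, and the size-based grouping are assumptions for illustration, and the real command may batch differently.

```rust
/// Split chunk IDs into batches of at most `batch_size` IDs each.
/// The last batch may be shorter than the others.
/// NOTE: hypothetical helper for illustration, not part of the rlm-cli API.
fn batch_chunk_ids(chunk_ids: &[u64], batch_size: usize) -> Vec<Vec<u64>> {
    assert!(batch_size > 0, "batch_size must be positive");
    chunk_ids
        .chunks(batch_size) // standard-library slice method
        .map(|batch| batch.to_vec())
        .collect()
}
```

Each batch could then be handed to one subagent, which retrieves its chunks by ID rather than receiving the full text up front.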
Chunking Strategies
| Strategy | Best For | Description |
|---|---|---|
| `semantic` | Markdown, prose | Splits at natural boundaries (headings, paragraphs) |
| `code` | Source code | Language-aware chunking at function/class boundaries |
| `fixed` | Logs, plain text | Splits at exact byte boundaries |
| `parallel` | Large files (>10MB) | Multi-threaded fixed chunking |
# Semantic chunking (default)
# Code-aware chunking for source files
# Fixed chunking with overlap
# Parallel chunking for speed
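To make fixed chunking with overlap concrete, here is a minimal sketch of splitting a byte buffer into fixed-size windows that share a few bytes with their neighbors. The function name `fixed_chunks` and its parameters are hypothetical, and the project's own chunker may handle edge cases differently, for example when a boundary falls inside a multi-byte UTF-8 character.

```rust
/// Split `data` into windows of up to `size` bytes. Each window starts
/// `size - overlap` bytes after the previous one, so consecutive windows
/// share `overlap` bytes. The final window may be shorter than `size`.
/// NOTE: hypothetical helper for illustration, not part of the rlm-cli API.
fn fixed_chunks(data: &[u8], size: usize, overlap: usize) -> Vec<&[u8]> {
    assert!(size > 0 && overlap < size, "need size > 0 and overlap < size");
    let step = size - overlap;
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < data.len() {
        let end = (start + size).min(data.len());
        chunks.push(&data[start..end]);
        if end == data.len() {
            break; // the last window already reaches the end of the buffer
        }
        start += step;
    }
    chunks
}
```

With a 10-byte input, `size = 4`, and `overlap = 1`, the windows cover bytes 0..4, 3..7, and 6..10. The shared byte at each boundary gives a downstream reader some context from the previous chunk.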
Supported Languages (Code Chunker)
Rust, Python, JavaScript, TypeScript, Go, Java, C/C++, Ruby, PHP
Claude Code Integration
rlm-cli is designed to work with the rlm-rs Claude Code plugin, implementing the RLM architecture:
| RLM Concept | Implementation |
|---|---|
| Root LLM | Main Claude Code conversation (Opus/Sonnet) |
| Sub-LLM | rlm-subcall agent (Haiku) |
| External Environment | rlm-cli CLI + SQLite |
Development
Prerequisites
- Rust 1.95+ (2024 edition)
- cargo-deny for supply chain security
Build
# Using Makefile
# Or using Cargo directly
Project Structure
src/
├── lib.rs # Library entry point
├── main.rs # CLI entry point
├── error.rs # Error types
├── core/ # Core types (Buffer, Chunk, Variable)
├── chunking/ # Chunking strategies
├── storage/ # SQLite persistence
├── io/ # File I/O with mmap
└── cli/ # Command implementations
tests/
└── integration_test.rs
MSRV Policy
The Minimum Supported Rust Version (MSRV) is 1.95.
License
MIT License - see LICENSE for details.
Documentation
📚 Complete Documentation - Full documentation hub with tutorials, guides, and reference materials
Quick Links
- Getting Started - 5-minute tutorial for new users
- Examples - Practical examples and workflows
- CLI Reference - Complete command documentation
- FAQ - Frequently asked questions
- Troubleshooting - Common issues and solutions
- Glossary - RLM and chunking terminology
Advanced Topics
- Features Guide - Feature flags and build options
- Plugin Integration - Integration with Claude Code
- Architecture - Internal design and architecture
- RLM-Inspired Design - Connection to RLM paper
- API Reference - Rust library documentation
- ADRs - Architectural Decision Records
Citing This Project
If you use rlm-cli in your research or projects, please cite it. You can use GitHub's built-in "Cite this repository" button, or use the following BibTeX:
Acknowledgments
This project builds on prior work in recursive language model architectures:
- claude_code_RLM - Original Python RLM implementation by Brainqub3, which inspired this project
- RLM Paper (arXiv:2512.24601) - Recursive Language Model pattern by Zhang, Kraska, and Khattab (MIT CSAIL)
- Claude Code - AI-powered development environment
Adeojo, John. claude_code_RLM. GitHub, 2026. https://github.com/brainqub3/claude_code_RLM
Zhang, Alex L., Tim Kraska, and Omar Khattab. "Recursive Language Models." arXiv:2512.24601, 2025. MIT CSAIL. https://arxiv.org/abs/2512.24601