rlm-cli 1.2.4

Recursive Language Model (RLM) REPL for Claude Code - handles long-context tasks via chunking and recursive sub-LLM calls
# Plugin Integration Guide

This guide explains how to integrate rlm-rs with AI coding assistants through plugins, skills, and commands. While the examples focus on Claude Code, the patterns apply to any AI assistant that can execute shell commands.

## Overview

rlm-rs is designed as a **CLI-first tool** that AI assistants invoke via shell execution. This architecture enables:

- **Universal Compatibility**: Any assistant with shell access can use rlm-rs
- **No Custom APIs**: Standard stdin/stdout/stderr communication
- **JSON Output**: Machine-readable format for programmatic integration
- **Stateless Commands**: Each invocation is independent (state lives in SQLite)

---

## Claude Code Integration

### Plugin Architecture

The [rlm-rs Claude Code plugin](https://github.com/zircote/rlm-plugin) implements the RLM pattern:

![Claude Code Integration](images/claude-code-integration.png)

### Components

#### 1. Slash Commands (Skills)

User-invocable shortcuts for common operations:

| Command | Description | Maps To |
|---------|-------------|---------|
| `/rlm-load` | Load file into RLM | `rlm-rs load <file>` |
| `/rlm-search` | Search loaded content | `rlm-rs search <query>` |
| `/rlm-status` | Show RLM state | `rlm-rs status` |
| `/rlm-analyze` | Full RLM analysis workflow | Orchestrated multi-step |

**Example Skill Definition** (`.claude/skills/rlm-load.md`):

```markdown
---
name: rlm-load
description: Load a file or directory into RLM for analysis
arguments:
  - name: path
    description: File or directory to load
    required: true
  - name: name
    description: Buffer name (defaults to filename)
    required: false
---

Load content into RLM for semantic search and chunk-based analysis.

## Workflow

1. Check if rlm-rs is installed: `which rlm-rs`
2. Initialize if needed: `rlm-rs init`
3. Load the content: `rlm-rs load {{path}} --name {{name}} --chunker semantic`
4. Report status: `rlm-rs status --format json`

## Output

Report the number of chunks created and confirm embeddings were generated.
```

#### 2. Subagents

Specialized agents for chunk-level processing:

**rlm-subcall Agent** (`.claude/agents/rlm-subcall.md`):

````markdown
---
name: rlm-subcall
model: haiku
description: Efficient chunk-level analysis for RLM workflow
tools:
  - Bash
  - Read
---

You are a focused analysis agent processing individual chunks from large documents.

## Instructions

1. Retrieve the chunk: `rlm-rs chunk get <chunk_id>`
2. Analyze according to the prompt
3. Return structured JSON findings:

```json
{
  "chunk_id": <id>,
  "findings": [...],
  "relevance": "high|medium|low",
  "summary": "Brief summary"
}
```

Keep responses concise. You're part of a larger workflow.
````

**rlm-synthesizer Agent** (`.claude/agents/rlm-synthesizer.md`):

```markdown
---
name: rlm-synthesizer
model: sonnet
description: Synthesize findings from multiple chunk analyses
tools:
  - Read
  - Bash
---

You aggregate results from multiple rlm-subcall analyses.

## Instructions

1. Review all chunk findings
2. Identify patterns and connections
3. Synthesize into coherent narrative
4. Highlight key insights and recommendations
```

#### 3. Hooks

Automated triggers for RLM operations:

**Auto-load on large files** (`.claude/hooks/large-file-rlm.md`):

```markdown
---
event: PostToolUse
tool: Read
---

If the file read was larger than 50KB, suggest loading it into RLM:

"This is a large file. Consider using `/rlm-load {{file_path}}` for semantic search."
```

### Typical Workflow

![RLM Workflow](images/rlm-workflow.png)

---

## Portable Integration Patterns

### Generic CLI Integration

Any AI assistant can integrate with rlm-rs using these patterns:

#### Pattern 1: Search-Then-Retrieve

```bash
# 1. Load content (one-time setup)
rlm-rs load large-document.md --name docs

# 2. Search for relevant chunks
RESULTS=$(rlm-rs --format json search "your query" --top-k 5)

# 3. Extract chunk IDs
CHUNK_IDS=$(echo "$RESULTS" | jq -r '.results[].chunk_id')

# 4. Retrieve and process each chunk
for ID in $CHUNK_IDS; do
    CONTENT=$(rlm-rs chunk get $ID)
    # Process $CONTENT...
done
```

#### Pattern 2: Grep-Based Analysis

```bash
# Find specific patterns
rlm-rs grep docs "TODO|FIXME|HACK" --format json --max-matches 50

# Get context around matches
rlm-rs grep docs "error.*handling" --window 200
```

#### Pattern 3: Progressive Refinement

```bash
# Broad search first
rlm-rs search "authentication" --top-k 20

# Narrow down
rlm-rs search "JWT token validation" --top-k 5 --mode semantic

# Exact match
rlm-rs search "validateToken function" --mode bm25
```

### JSON Output Schema

All commands with `--format json` return structured data:

**Search Results**:
```json
{
  "count": 3,
  "mode": "hybrid",
  "query": "authentication",
  "results": [
    {
      "chunk_id": 42,
      "buffer_id": 1,
      "buffer_name": "auth.rs",
      "score": 0.0328,
      "semantic_score": 0.0499,
      "bm25_score": 0.0000016
    }
  ]
}
```
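Consuming this shape is a one-liner with jq; the sample below mirrors the schema (field values are illustrative, not real output):

```shell
# Sample response matching the search schema above (scores are illustrative).
RESPONSE='{"count":2,"mode":"hybrid","query":"authentication","results":[{"chunk_id":42,"buffer_id":1,"buffer_name":"auth.rs","score":0.0328},{"chunk_id":7,"buffer_id":1,"buffer_name":"auth.rs","score":0.0121}]}'

# Chunk IDs in descending score order, one per line.
echo "$RESPONSE" | jq -r '.results | sort_by(-.score) | .[].chunk_id'
```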

**Status**:
```json
{
  "initialized": true,
  "db_path": ".rlm/rlm-state.db",
  "db_size_bytes": 245760,
  "buffer_count": 3,
  "chunk_count": 42,
  "total_content_bytes": 125000,
  "embeddings_count": 42
}
```

**Chunk**:
```json
{
  "id": 42,
  "buffer_id": 1,
  "buffer_name": "auth.rs",
  "index": 3,
  "byte_range": [12000, 15000],
  "size": 3000,
  "content": "...",
  "has_embedding": true
}
```

---

## Platform-Specific Notes

### GitHub Copilot

Copilot can invoke rlm-rs through its terminal integration:

```
@terminal rlm-rs load src/ --name code
@terminal rlm-rs search "error handling"
```

### Codex CLI

Codex can execute rlm-rs commands directly:

```bash
codex "Load the documentation and find sections about API authentication"
# Codex runs: rlm-rs load docs/ && rlm-rs search "API authentication"
```

### OpenCode / Aider

These tools can use rlm-rs as an external helper:

```yaml
# In .aider.conf.yml or similar
tools:
  - name: rlm-search
    command: rlm-rs --format json search "$QUERY"
```

### VS Code Extensions

Extensions should use `execFile` instead of `exec` for security (avoids shell injection):

```typescript
import { execFile } from 'child_process';
import { promisify } from 'util';

const execFileAsync = promisify(execFile);

interface SearchResult {
    chunk_id: number;
    score: number;
}

interface SearchResponse {
    results: SearchResult[];
}

async function searchRLM(query: string): Promise<SearchResult[]> {
    // Using execFile (not exec) prevents shell injection
    const { stdout } = await execFileAsync('rlm-rs', [
        '--format', 'json',
        'search', query
    ]);
    const response: SearchResponse = JSON.parse(stdout);
    return response.results;
}
```

---

## Best Practices

### 1. Use Semantic Chunking for Code

```bash
rlm-rs load src/ --chunker semantic --chunk-size 3000
```

Semantic chunking respects function and class boundaries.

### 2. Name Buffers Meaningfully

```bash
rlm-rs load src/auth/ --name auth-module
rlm-rs load src/api/ --name api-handlers
rlm-rs load docs/ --name documentation
```

This makes search results more interpretable.
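Meaningful names also let you slice search output per buffer; a sketch against the JSON schema shown earlier (sample data, not live output):

```shell
# Sample search output with named buffers (values are illustrative).
RESULTS='{"results":[{"chunk_id":3,"buffer_name":"auth-module","score":0.9},{"chunk_id":8,"buffer_name":"api-handlers","score":0.7}]}'

# Keep only hits from the auth-module buffer.
AUTH_HITS=$(echo "$RESULTS" | jq '[.results[] | select(.buffer_name == "auth-module")]')
echo "$AUTH_HITS"
```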

### 3. Use Hybrid Search by Default

```bash
rlm-rs search "query" --mode hybrid
```

Hybrid combines semantic understanding with keyword matching.

### 4. Batch Subagent Calls

Instead of sequential calls, dispatch subagents in parallel (in Claude Code, issue all `Task` calls in a single message):

```
# Good: parallel — all Task calls dispatched together
Task(rlm-subcall, chunk 12) || Task(rlm-subcall, chunk 27) || Task(rlm-subcall, chunk 33)

# Avoid: sequential — each Task waits for the previous one
Task(rlm-subcall, chunk 12)
Task(rlm-subcall, chunk 27)
Task(rlm-subcall, chunk 33)
```

### 5. Store Intermediate Results

```bash
# After subcall analysis
rlm-rs add-buffer auth-analysis "$(cat subcall-results.json)"

# Later retrieval
rlm-rs show auth-analysis
```

---

## Error Handling for AI Assistants

When integrating rlm-rs into AI workflows, proper error handling ensures graceful recovery and a good user experience. This section provides structured patterns for handling common errors.

### Error Detection

All rlm-rs commands return:
- **Exit code 0**: Success
- **Exit code 1**: Error (details in stderr)

With JSON format, errors are structured:

```json
{
  "error": "storage error: RLM not initialized. Run: rlm-rs init",
  "code": "NOT_INITIALIZED"
}
```

### Common Errors and Recovery Strategies

| Error Message | Cause | Recovery Strategy |
|---------------|-------|-------------------|
| `RLM not initialized` | Database not created | Run `rlm-rs init` |
| `buffer not found: <name>` | Buffer doesn't exist | Run `rlm-rs list` to verify |
| `chunk not found: <id>` | Invalid chunk ID | Re-run search to get valid IDs |
| `No results found` | Query too specific | Broaden query or lower threshold |
| `embedding error` | Model loading issue | Check disk space, retry once |
| `file not found` | Invalid path | Verify path exists before load |

### Structured Error Handling Pattern

```bash
# Robust error handling for AI assistants
RESULT=$(rlm-rs --format json search "$QUERY" 2>&1)
EXIT_CODE=$?

if [ $EXIT_CODE -ne 0 ]; then
    # Parse error
    ERROR=$(echo "$RESULT" | jq -r '.error // empty' 2>/dev/null)

    case "$ERROR" in
        *"not initialized"*)
            rlm-rs init
            # Retry original command
            RESULT=$(rlm-rs --format json search "$QUERY")
            ;;
        *"buffer not found"*)
            echo "Buffer not found. Available buffers:"
            rlm-rs list
            ;;
        *"No results"*)
            echo "No results. Try broader query or: --threshold 0.1"
            ;;
        *)
            echo "Error: $ERROR"
            ;;
    esac
fi
```

### Retry Logic

For transient errors (embedding model loading, database locks):

```bash
MAX_RETRIES=3
RETRY_DELAY=1

for i in $(seq 1 $MAX_RETRIES); do
    RESULT=$(rlm-rs --format json chunk embed "$BUFFER" 2>&1)
    if [ $? -eq 0 ]; then
        break
    fi

    if [ $i -lt $MAX_RETRIES ]; then
        sleep $RETRY_DELAY
        RETRY_DELAY=$((RETRY_DELAY * 2))  # Exponential backoff
    fi
done
```

### Pre-flight Checks

Before complex workflows, verify prerequisites:

```bash
# Check 1: rlm-rs is installed
if ! command -v rlm-rs &> /dev/null; then
    echo "rlm-rs not found. Install with: cargo install rlm-rs"
    exit 1
fi

# Check 2: Database is initialized
if ! rlm-rs status &> /dev/null; then
    rlm-rs init
fi

# Check 3: Content is loaded
BUFFER_COUNT=$(rlm-rs --format json status | jq '.buffer_count')
if [ "$BUFFER_COUNT" -eq 0 ]; then
    echo "No content loaded. Use: rlm-rs load <file>"
    exit 1
fi

# Check 4: Embeddings exist for semantic search
EMBED_COUNT=$(rlm-rs --format json chunk status | jq '.embedded_chunks')
if [ "$EMBED_COUNT" -eq 0 ]; then
    echo "No embeddings. Generating..."
    rlm-rs chunk embed --all
fi
```

### Graceful Degradation

When semantic search fails, fall back to BM25:

```bash
# Try semantic first
RESULT=$(rlm-rs --format json search "$QUERY" --mode semantic 2>&1)

if echo "$RESULT" | jq -e '.error' > /dev/null 2>&1; then
    # Fall back to BM25 (keyword search, no embeddings required)
    RESULT=$(rlm-rs --format json search "$QUERY" --mode bm25)
fi
```

### Error Messages for Users

When reporting errors to users, provide actionable guidance:

```markdown
**Good**: "Buffer 'config' not found. Available buffers: main, auth. Did you mean one of these?"

**Bad**: "Error: buffer not found: config"
```
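A sketch of assembling the good form from the raw error; the buffer list is stubbed here, since the exact shape of `rlm-rs list` output is an assumption:

```shell
# Raw error text as it appears on stderr.
ERROR='buffer not found: config'

# Stub standing in for `rlm-rs list` (one buffer name per line is an assumption).
BUFFERS='main
auth'

MSG="Error: $ERROR"    # fallback: echo the raw error
case "$ERROR" in
    *"buffer not found"*)
        MISSING=${ERROR##*: }                                    # name after the last ": "
        AVAILABLE=$(printf '%s' "$BUFFERS" | tr '\n' ',' | sed 's/,/, /g')
        MSG="Buffer '$MISSING' not found. Available buffers: $AVAILABLE. Did you mean one of these?"
        ;;
esac
echo "$MSG"
```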

---

## Troubleshooting

### Command Not Found

```bash
# Check installation
which rlm-rs

# Install if missing
cargo install rlm-rs
# or
brew install zircote/tap/rlm-rs
```

### Database Not Initialized

```bash
rlm-rs init
```

### No Search Results

1. Check if content is loaded: `rlm-rs list`
2. Verify embeddings exist: `rlm-rs chunk status`
3. Try broader query or lower threshold: `--threshold 0.1`

### JSON Parsing Errors

Ensure you're using `--format json`:

```bash
rlm-rs --format json search "query"  # Correct
rlm-rs search "query" --format json  # Also correct
```
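If parsing still fails, validate the output before handing it to jq filters; `jq -e .` exits nonzero on malformed input:

```shell
# Stand-in for captured rlm-rs stdout.
OUTPUT='{"count": 0, "results": []}'

if echo "$OUTPUT" | jq -e . > /dev/null 2>&1; then
    STATE=valid      # safe to run jq filters on $OUTPUT
else
    STATE=invalid    # likely plain-text output or an error banner
fi
echo "$STATE"
```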

---

## System Prompt Templates

Ready-to-use system prompts for AI assistants integrating with rlm-rs are available in the `prompts/` directory:

| Template | Purpose | Recommended Model |
|----------|---------|-------------------|
| [rlm-orchestrator.md](prompts/rlm-orchestrator.md) | Coordinates search, dispatch, and synthesis | sonnet |
| [rlm-analyst.md](prompts/rlm-analyst.md) | Analyzes individual chunks | haiku |
| [rlm-synthesizer.md](prompts/rlm-synthesizer.md) | Aggregates analyst findings | sonnet |

### Quick Start

1. **Orchestrator** receives user request and searches for relevant chunks
2. **Analysts** (parallel) process individual chunks and return structured findings
3. **Synthesizer** aggregates findings into a coherent report

```
User Request
     │
     ▼
┌──────────────┐
│ Orchestrator │──▶ rlm-rs search "query"
└──────────────┘
     │ dispatch
     ▼
┌─────────────────────────────────────┐
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ │
│ │Analyst 1│ │Analyst 2│ │Analyst N│ │  (parallel)
│ └─────────┘ └─────────┘ └─────────┘ │
└─────────────────────────────────────┘
     │ collect
     ▼
┌──────────────┐
│ Synthesizer  │──▶ Final Report
└──────────────┘
```
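The pipeline above can be skeletonized in shell. Everything external is stubbed: the search response is canned rather than a live `rlm-rs` call, and `analyze_chunk` is a placeholder for a subagent dispatch:

```shell
# Stage 1 (orchestrator): canned stand-in for `rlm-rs --format json search ...`
SEARCH='{"results":[{"chunk_id":12},{"chunk_id":27}]}'
CHUNK_IDS=$(echo "$SEARCH" | jq -r '.results[].chunk_id')

# Stage 2 (analysts): stub for one subagent call per chunk
analyze_chunk() {
    printf '{"chunk_id": %s, "relevance": "high"}\n' "$1"
}
FINDINGS=$(for ID in $CHUNK_IDS; do analyze_chunk "$ID"; done)

# Stage 3 (synthesizer): aggregate findings into a single report input
REPORT=$(echo "$FINDINGS" | jq -s 'map(.chunk_id)')
echo "$REPORT"
```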

---

## See Also

- [RLM-Inspired Design](rlm-inspired-design.md) - Architectural philosophy
- [CLI Reference](cli-reference.md) - Complete command documentation
- [Architecture](architecture.md) - Internal implementation details
- [Prompt Templates](prompts/) - System prompts for AI integration