catenary-mcp 1.3.5

A high-performance multiplexing bridge between MCP (Model Context Protocol) and LSP (Language Server Protocol). Enables LLMs to access IDE-grade code intelligence across multiple languages simultaneously with smart routing and UTF-8 accuracy.
# Archive: CLI Design

> **Status: Abandoned (2026-02-06)**
>
> This design was abandoned because subscription plans ($20/month Pro tier) are
> tied to official CLI tools (Claude Code, Gemini CLI). A custom CLI would
> require pay-per-token API access — wrong billing model for individual
> developers.
>
> See [CLI Integration](../cli-integration.md) for the current approach: disable
> built-in tools in existing CLIs, replace with catenary-mcp.

---

Original design document for `catenary-cli` — an AI coding assistant that owns
the model interaction loop.

## Problem

Existing AI coding tools (Claude Code, Gemini CLI) provide LSP tools but models
bypass them. They default to grep/read patterns from training data. Writes are
silent — no immediate feedback on errors.

The tools exist. Models don't use them.

**Root cause:** MCP tools are opt-in. The model chooses whether to use them.
Nothing enforces efficient patterns.

**Secondary issue:** These tools are built by companies that bill by usage.
Efficiency isn't incentivized.

## Solution

Catenary owns the outer loop. The model can't skip the feedback loop because
catenary-cli controls what tools exist and what results come back.

```
User → catenary-cli → Model API
            ↓
            Tool execution (LSP-first)
            ↓
            Feedback to model
```

## Design Principles

### Simple

One loop. No orchestrated modes. No sub-agents created and disposed
automatically. No "planning mode" that creates fresh contexts and forces
re-reading everything when it ends.

Planning happens in conversation — like any terminal session. The tool doesn't
impose structure.

### Fast

Execute immediately. Stream output. No artificial delays.

### Minimal

Expose tools. Let the model work. We control what tools exist and what feedback
comes back — not the model's reasoning process.

### Efficient

- LSP-first: hover instead of file read, symbols instead of grep
- Diagnostics on write: catch errors immediately, not 5 requests later
- Every token counts — users are on Pro tier ($20/month), not unlimited
- No throwaway contexts that need to be rebuilt

## Architecture

```
catenary-core/
├── LSP client management
├── Tool implementations
└── MCP type definitions (schema, not transport)

catenary-mcp/
└── MCP transport wrapper (JSON-RPC, stdio)

catenary-cli/
├── REPL loop
├── Model API client
└── Tool dispatch (calls core directly)
```

**MCP types as interface:** Core exposes tools using MCP type definitions. This
means:

- catenary-mcp wraps them for MCP transport
- catenary-cli uses them directly (no serialization overhead)
- Future tools just implement the MCP interface

**Open/closed:** Open to extension, closed to modification. Want a new tool?
Add it via MCP types. Core doesn't change.
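
A minimal sketch of what "tools behind one interface, dispatched directly" could look like. The `Tool` trait, `Registry`, and `ReadFile` names are hypothetical and stand in for whatever catenary-core actually exposes; real MCP type definitions would carry a schema rather than a plain string argument.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for an MCP tool definition plus its behavior.
trait Tool {
    fn name(&self) -> &'static str;
    fn call(&self, args: &str) -> Result<String, String>;
}

struct ReadFile;

impl Tool for ReadFile {
    fn name(&self) -> &'static str {
        "read_file"
    }
    fn call(&self, args: &str) -> Result<String, String> {
        // A real tool would read from disk; the sketch only echoes the path.
        Ok(format!("contents of {}", args))
    }
}

/// Core owns the registry; both front ends would dispatch through it.
#[derive(Default)]
struct Registry {
    tools: BTreeMap<&'static str, Box<dyn Tool>>,
}

impl Registry {
    /// Adding a tool extends the registry without touching dispatch.
    fn register(&mut self, tool: Box<dyn Tool>) {
        self.tools.insert(tool.name(), tool);
    }

    /// Direct in-process call, with no JSON-RPC round trip.
    fn dispatch(&self, name: &str, args: &str) -> Result<String, String> {
        match self.tools.get(name) {
            Some(tool) => tool.call(args),
            None => Err(format!("unknown tool: {}", name)),
        }
    }
}
```

In this sketch a tool that was never registered (for example `shell`) simply does not exist from the model's point of view; dispatch returns an error instead of falling back to anything else.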

## MVP Requirements

### REPL Loop

```
┌─────────────────────────────────────┐
│ catenary-cli (claude-sonnet-4-...)  │
├─────────────────────────────────────┤
│ > user prompt                       │
│                                     │
│ [model streaming response...]       │
│                                     │
│ Tool: write_file                    │
│ Path: src/main.rs                   │
│ ┌─────────────────────────────────┐ │
│ │ - old line                      │ │
│ │ + new line                      │ │
│ └─────────────────────────────────┘ │
│ Allow? [y/n/e]:                     │
│                                     │
│ > _                                 │
└─────────────────────────────────────┘
```

**Core loop:**

1. Read user input
2. Send to model (stream response)
3. On tool call:
   - Display tool + args (diff for write/edit)
   - Await approval (single keypress)
   - Execute via catenary-core
   - Return result to model
   - Repeat if more tool calls
4. Display final response
5. Return to prompt
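
The loop above can be sketched with closures standing in for the three moving parts. This is an illustration under assumptions, not the planned implementation: `ModelStep` and `run_turn` are hypothetical names, the `model` closure represents steps 1 and 2 (the prompt has already been sent), and streaming is left out.

```rust
/// What the model asks for on each step (hypothetical shape).
enum ModelStep {
    ToolCall { name: String, args: String },
    Final(String),
}

/// Runs one user turn: keep feeding results back until the model
/// stops calling tools, then return its final response.
fn run_turn<M, A, E>(mut model: M, mut approve: A, mut execute: E) -> String
where
    M: FnMut(Option<String>) -> ModelStep,
    A: FnMut(&str, &str) -> bool,
    E: FnMut(&str, &str) -> String,
{
    let mut feedback: Option<String> = None;
    loop {
        match model(feedback.take()) {
            // Step 4: no more tool calls, hand the text back for display.
            ModelStep::Final(text) => return text,
            ModelStep::ToolCall { name, args } => {
                // Step 3: show the call, wait for approval, then execute.
                let result = if approve(&name, &args) {
                    execute(&name, &args)
                } else {
                    "rejected by user".to_string()
                };
                // The result (or the rejection) always goes back to the model.
                feedback = Some(result);
            }
        }
    }
}
```

Because every tool result passes through `feedback`, the model cannot continue without seeing what the previous call returned.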

### Tool Approval

Every tool call requires explicit approval. No auto-approve mode.

- `y` — approve and execute
- `n` — reject, return rejection to model
- `e` — edit (for write/edit: open diff in $EDITOR)
- `?` — show explanation of what tool will do

**Why no auto-approve:** It's a trap. Models burn through tokens when
unchecked — reading 10 files when 1 would do, trying 5 command variants when
the first failed. The approval gate is a rate limiter and course-correction
point.
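
A small sketch of the key handling, assuming the four keys listed above are the only accepted input. `Approval` and `parse_key` are illustrative names.

```rust
#[derive(Debug, PartialEq)]
enum Approval {
    Approve,
    Reject,
    Edit,
    Explain,
}

/// Maps a single keypress to a decision. Anything else returns `None`,
/// so the caller keeps waiting: there is no default answer and no
/// auto-approve path.
fn parse_key(key: char) -> Option<Approval> {
    match key {
        'y' => Some(Approval::Approve),
        'n' => Some(Approval::Reject),
        'e' => Some(Approval::Edit),
        '?' => Some(Approval::Explain),
        _ => None,
    }
}
```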

### Interrupt Handling

Ctrl+C cancels the in-flight API request and returns cleanly to the prompt.

### Minimum Tools

| Tool | Behavior |
|------|----------|
| `read_file` | Read file contents |
| `write_file` | Write + return diagnostic summary |
| `edit_file` | Edit + return diagnostic summary |
| `search` | LSP-backed, grep fallback (see below) |
| `build` | Run project build command |
| `test` | Run project tests |
| `git` | Status, diff, commit, push |
| `web_search` | Search the web |

**Write/edit feedback:** No silent writes. Every write returns diagnostic
summary (errors, warnings). The model can't proceed unaware that it broke
something.
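
One possible shape for the summary a write returns, as a sketch. The `Diagnostic` struct and the output format are assumptions made for illustration; in practice the diagnostics would come from the language server after the write.

```rust
#[derive(Clone, Copy, PartialEq)]
enum Severity {
    Error,
    Warning,
}

struct Diagnostic {
    severity: Severity,
    line: u32,
    message: String,
}

/// Builds the text returned to the model after a write or edit:
/// a count line first, then one line per diagnostic.
fn summarize(path: &str, diagnostics: &[Diagnostic]) -> String {
    let errors = diagnostics
        .iter()
        .filter(|d| d.severity == Severity::Error)
        .count();
    let warnings = diagnostics.len() - errors;
    let mut out = format!("wrote {}: {} error(s), {} warning(s)", path, errors, warnings);
    for d in diagnostics {
        let tag = match d.severity {
            Severity::Error => "error",
            Severity::Warning => "warning",
        };
        out.push_str(&format!("\n  {}:{}: {}: {}", path, d.line, tag, d.message));
    }
    out
}
```

Even a clean write produces the count line, so the model always gets an explicit "0 error(s)" rather than silence.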

### No Arbitrary Shell

No `shell` tool. Every action goes through a targeted MCP tool.

**Why:**

- Model can't bypass `search` with raw `grep`
- Model can't `cat` files instead of using `read_file`
- No accidental `rm -rf` or destructive commands
- Every action is intentional and auditable
- Token efficient — no parsing noisy shell output

**What shell typically does → MCP alternative:**

| Shell use case | MCP tool |
|----------------|----------|
| Build/compile | `build()` |
| Run tests | `test()` |
| Git operations | `git()` |
| Package install | `add_dependency()` |
| Run scripts | `run_script(path)` — curated list |
| File ops (mkdir, mv) | `mkdir()`, `move()`, `delete()` |
| Docker/k8s | User-configured MCP |
| Ansible | User-configured MCP |

**The long tail:** Users configure additional MCP tools for their workflow
(post-MVP scope). Model uses what's available, can't escape to raw shell.

The "limitation" is the feature. Intentionality over flexibility.

**Enforces good practices:**

Without shell, model can't run one-off validation scripts. It has to write
proper tests.

Old pattern (with shell):
1. Model writes code
2. Model runs `python test_quick.py` to validate
3. Model deletes `test_quick.py`
4. No trace, not repeatable

New pattern (no shell):
1. Model writes code
2. Model can only run `test()` — needs actual tests
3. Model writes proper test in test suite
4. Test is permanent, documented, repeatable

**Denial as teaching:**

```
Tool: delete("test_quick.py")
Allow? [y/n/e]: n

> Refactor this into a proper test

Model: "I'll add this to the test suite..."
```

User guides model toward better practices in real-time. The tool approval
isn't just safety — it's a feedback loop.

### Smart Search

`search(path, query)` — one tool, catenary handles routing.

**When LSP available:**

```
search("src/", "parse_config")
→ Results (via rust-analyzer):
  src/config.rs:42 — fn parse_config()  [definition]
```

Pinpoint accuracy. Definition vs usage distinguished.

**When LSP unavailable:**

```
search("src/", "parse_config")
→ Results (via grep — LSP unavailable):
  Note: grep cannot distinguish definition from usage.
  Results may include call sites. Definition may be in
  files outside search path.

  src/config.rs:42: fn parse_config()
  src/main.rs:15: parse_config()
  src/main.rs:89: parse_config()
  ...
```

Model sees the degradation, knows results are noisy. No silent fallback.
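
The routing and the visible degradation notice could be sketched as follows. `Backend`, `Hit`, `choose_backend`, and `render_results` are hypothetical names, and the actual lookup (LSP request or grep run) is assumed to have happened already.

```rust
struct Hit {
    file: String,
    line: u32,
    text: String,
}

enum Backend<'a> {
    Lsp { server: &'a str },
    Grep,
}

/// Route to the language server when one is running, otherwise grep.
fn choose_backend<'a>(lsp_server: Option<&'a str>) -> Backend<'a> {
    match lsp_server {
        Some(server) => Backend::Lsp { server },
        None => Backend::Grep,
    }
}

/// The header names the backend, so a grep fallback is never silent.
fn render_results(backend: &Backend<'_>, hits: &[Hit]) -> String {
    let mut out = match backend {
        Backend::Lsp { server } => format!("Results (via {}):\n", server),
        Backend::Grep => String::from(
            "Results (via grep, LSP unavailable):\n\
             Note: grep cannot distinguish definition from usage.\n",
        ),
    };
    for h in hits {
        out.push_str(&format!("  {}:{}: {}\n", h.file, h.line, h.text));
    }
    out
}
```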

### LSP Monitoring

LSP session monitoring in MVP — essential for debugging when LSPs crash or
return unexpected results.

**Subcommands:**

```bash
catenary list      # show active LSP sessions
catenary monitor   # real-time event stream
```

**TUI integration:**

- `Ctrl+L` — toggle LSP monitor panel
- Status bar shows active LSP count/status
- See requests/responses in real-time

**Implementation:** Monitoring logic lives in catenary-core. Both CLI and MCP
binaries expose it. Core already has event broadcasting from Phase 4.5.

### LSP Recovery

User controls LSP failure recovery — no automatic retry loops.

**Crash during tool call:**

```
┌─────────────────────────────────────┐
│ ⚠ rust-analyzer crashed             │
│ [r]estart  [d]isable                │
└─────────────────────────────────────┘
```

- **Restart** — catenary restarts LSP, retries tool
- **Disable** — LSP disabled for session

**Background crash:**

- Status bar shows crash
- Non-blocking notification
- User addresses when ready

**Fallback mode (break glass):**

When model calls an LSP tool and LSP is unavailable:

1. **Skip user approval** — don't prompt for a broken tool
2. Return error immediately to model:
   ```
   LSP unavailable for rust. Use grep/glob for text search.
   Write/edit will work but diagnostics unavailable.
   ```
3. Model self-corrects and reaches for available tools

No silent tool swapping. No wasted user prompts. Model sees the limitation,
adapts its approach. Tool behavior stays consistent throughout session.
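
A sketch of the gate in front of LSP-backed tools, assuming a three-state view of each server. `LspState`, `Outcome`, and `gate_lsp_tool` are illustrative names; treating a crashed-but-not-yet-restarted server the same as a disabled one is an assumption of this sketch.

```rust
#[derive(Debug, PartialEq)]
enum Outcome {
    /// LSP is up: show the call and wait for approval as usual.
    AskUser,
    /// LSP is down: skip the prompt and tell the model right away.
    ImmediateError(String),
}

enum LspState {
    Running,
    Crashed,
    Disabled,
}

/// Decides whether an LSP tool call reaches the approval prompt at all.
fn gate_lsp_tool(language: &str, state: &LspState) -> Outcome {
    match state {
        LspState::Running => Outcome::AskUser,
        LspState::Crashed | LspState::Disabled => Outcome::ImmediateError(format!(
            "LSP unavailable for {}. Use grep/glob for text search.\n\
             Write/edit will work but diagnostics unavailable.",
            language
        )),
    }
}
```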

### Editor Integration

Full `$EDITOR` integration (neovim, vim, etc.) — no janky "vim mode" emulation.

**For prompt input:**

`Ctrl+G` opens `$EDITOR` with current input. User writes prompt with full editor
power, saves/quits, content returns to input box.

**For diff editing:**

`e` during tool approval opens `$EDITOR` with proposed changes. User edits,
saves/quits, edited content becomes the approved change.

**Implementation pattern:**

```
1. Write current content to temp file
2. Suspend TUI (LeaveAlternateScreen)
3. Spawn $EDITOR with temp file
4. Wait for editor to exit
5. Resume TUI (EnterAlternateScreen)
6. Read temp file, use as new content
```

Your editor, your config, your plugins.
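
The six steps might look roughly like this using only the standard library. This is a sketch under assumptions: `edit_in_editor` is a hypothetical helper, the `suspend_tui` and `resume_tui` closures stand in for the crossterm `LeaveAlternateScreen` and `EnterAlternateScreen` calls, the caller supplies the temp path, and the editor value is treated as a bare program name (an `$EDITOR` containing extra arguments would need splitting first).

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::process::Command;

fn edit_in_editor<S, R>(
    editor: &str,
    temp_path: &Path,
    current: &str,
    suspend_tui: S,
    resume_tui: R,
) -> io::Result<String>
where
    S: FnOnce() -> io::Result<()>,
    R: FnOnce() -> io::Result<()>,
{
    // 1. Write current content to the temp file.
    fs::write(temp_path, current)?;
    // 2. Suspend the TUI so the editor gets the normal screen.
    suspend_tui()?;
    // 3 + 4. Spawn the editor and block until it exits.
    let status = Command::new(editor).arg(temp_path).status();
    // 5. Resume the TUI even if the editor failed to start.
    resume_tui()?;
    let status = status?;
    if !status.success() {
        return Err(io::Error::new(
            io::ErrorKind::Other,
            "editor exited with an error",
        ));
    }
    // 6. Whatever is in the file now becomes the new content.
    fs::read_to_string(temp_path)
}
```

Resuming before inspecting the editor's exit status keeps the terminal usable on the error path as well.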

### Display Requirements

- Show which model is active (in header/prompt)
- Show diff for write/edit before approval
- Stream model output as it arrives

## Future Scope (Post-MVP)

### Token/Request Monitoring

Real-time display of token usage and request count. Helps users stay within
tier limits.

### Additional MCP Tools

Allow configuration of external MCP servers for extended functionality.

### Context Management

When context window fills:

- Summarize conversation history
- Compact context
- Use local model (ollama/llama.cpp) for this — no API cost

**Model routing consideration:** Claude and Gemini can't share a context; each
provider has to be sent the conversation separately, so parallel contexts would
double cost. If we add model routing, local models handle the context bridge.

### Local Model Integration

Local models for supporting roles — not primary reasoning:

**Use cases:**

- **Embeddings** — semantic search over codebase
- **Context compression** — summarize history before API call
- **Context sanitization** — strip noise/secrets before sending to API

**Requirements:**

- **Transparent** — user sees when local compute is running, not hidden
- **Optional** — user can disable local compute entirely
- **Configurable** — works with 70B models (64GB RAM) or 300M models (8GB RAM)
- **Graceful degradation** — if no local model, skip the stage

```
User prompt
    ↓
    [Local: sanitize/compress] ← optional, visible
    ↓
    Claude API ← sees clean/small context
    ↓
    Tool calls via catenary-core
    ↓
    [Local: embed for search] ← optional, visible
```

Not everyone has 64GB unified memory. The tool works without local models but
benefits from them when available.

### Model Routing

Different models for different tasks:

- Claude: complex reasoning
- Gemini Flash: fast execution

Requires local model for context management. Not MVP scope.

## Implementation

### TUI Framework

**ratatui** — immediate-mode terminal UI framework.

- Widget-based: composable, reusable components
- Immediate-mode rendering: redraw from state each frame, no buffer accumulation
- Avoids the lag problem (Claude Code gets slow with long history)
- Already have `crossterm` in deps; ratatui uses it as backend

### Widgets (MVP)

| Widget | Purpose |
|--------|---------|
| Input | User prompt entry, Ctrl+G to $EDITOR |
| Conversation | Scrollable message history |
| Diff | Unified diff for write/edit approval |
| Tool approval | Tool name, args, y/n/e/? prompt |
| Status bar | Model name, connection status |

**Layout:**

```
┌─────────────────────────────────────┐
│ Status: claude-sonnet-4-...         │
├─────────────────────────────────────┤
│                                     │
│ [conversation / streaming output]   │
│                                     │
├─────────────────────────────────────┤
│ > user input                        │
└─────────────────────────────────────┘
```

**Tool approval replaces main area:**

```
┌─────────────────────────────────────┐
│ Tool: write_file                    │
│ Path: src/main.rs                   │
├─────────────────────────────────────┤
│ - fn old()                          │
│ + fn new()                          │
├─────────────────────────────────────┤
│ [y]es [n]o [e]dit [?]help           │
└─────────────────────────────────────┘
```

### Markdown Rendering

**tui-markdown** — converts markdown to ratatui `Text` type.

- Model outputs plain markdown
- `tui-markdown` parses and styles (headers, code blocks, bold, etc.)
- Includes `syntect` for code syntax highlighting
- Render result in `Paragraph` widget

### Alternate Screen Buffer

Use `crossterm::terminal::{EnterAlternateScreen, LeaveAlternateScreen}`.

- Like vim/less — enter alternate buffer, exit cleanly
- Shell history untouched
- Suspend for $EDITOR, resume after

### Session Logging

```
~/.local/state/catenary/
├── sessions/
│   ├── 2026-02-06_103045.jsonl
│   └── 2026-02-06_142312.jsonl
└── current -> sessions/...
```

- XDG-compliant (`~/.local/state/`)
- JSONL format: one JSON object per message, easy to parse
- Full history in logs, viewport shows recent context
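
A sketch of resolving the state directory and producing one JSONL record by hand. The helper names and the `role`/`content` field names are assumptions; a real implementation would more likely use a JSON crate, and the XDG rule that relative `XDG_STATE_HOME` values are ignored is not handled here.

```rust
use std::path::PathBuf;

/// `$XDG_STATE_HOME/catenary` when set, else `~/.local/state/catenary`.
/// The caller passes in the environment values to keep this testable.
fn state_dir(xdg_state_home: Option<&str>, home: &str) -> PathBuf {
    match xdg_state_home {
        Some(dir) if !dir.is_empty() => PathBuf::from(dir).join("catenary"),
        _ => PathBuf::from(home).join(".local/state/catenary"),
    }
}

/// Minimal JSON string escaping: quotes, backslashes, control characters.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// One message becomes exactly one line, so the log can be read
/// line by line without a streaming JSON parser.
fn jsonl_line(role: &str, content: &str) -> String {
    format!(
        "{{\"role\":\"{}\",\"content\":\"{}\"}}\n",
        json_escape(role),
        json_escape(content)
    )
}
```

Escaping newlines inside `content` is what keeps the one-object-per-line property intact for multi-line messages.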

## Dependencies

**Required:**

- `ratatui` — TUI framework (MIT)
- `tui-markdown` — markdown to ratatui (MIT, includes syntect)
- `crossterm` — terminal backend (already in catenary)
- `reqwest` — HTTP client for model APIs
- `similar` or `diffy` — diff generation

**Future:**

- `ollama` client — local model management (MIT)
- Or `llama.cpp` bindings — raw inference (MIT)

## Open Decisions

Design questions to resolve before implementation.

### Model API

- [ ] Which model provider first? (Claude, Gemini, OpenAI)
- [ ] Use SDK crate or raw reqwest?
- [ ] Streaming response handling approach

### Authentication

- [ ] Where do API keys live? (env var, config file, keyring)
- [ ] Support multiple providers simultaneously?

### Configuration

- [ ] Config file location (`~/.config/catenary/cli.toml`?)
- [ ] What's user-configurable? (model, keybindings, theme)
- [ ] Runtime config changes or restart required?

### System Prompt

- [ ] Hardcoded base prompt?
- [ ] User-configurable additions?
- [ ] Per-session overrides?

### Context Management

- [ ] When to truncate conversation? (token limit)
- [ ] MVP: simple truncation or summarization?
- [ ] How to handle tool results in context?

### Diff Display

- [ ] Unified or side-by-side format?
- [ ] Which diff library? (`similar`, `diffy`)
- [ ] Syntax highlighting in diffs?

### Keybindings

- [ ] Fixed keybindings or customizable?
- [ ] Vim-style navigation in conversation?
- [ ] Document default keybindings

### Error Handling

- [ ] Network/API errors: inline, modal, or status bar?
- [ ] Tool execution errors: how to display?
- [ ] Retry logic for transient failures?

### Tool Interface

- [ ] How does catenary-core expose tools to CLI?
- [ ] Tool result format (structured or text?)
- [ ] Timeout handling for long-running tools

## Prototype

Validate the concept before building catenary-cli. Zero new code.

### Stack

```
mcphost (MIT)
├── disable built-in tools (omit from config)
├── catenary-mcp (already exists)
└── gemini-flash-lite (cheap, fast)
```

### Configuration

```json
{
  "mcpServers": {
    "catenary": {
      "command": "catenary-mcp"
    }
  }
}
```

No `fs`, no `bash`, no `http`. Model only has catenary tools.

### What We're Testing

- [ ] Model can only use catenary tools (no escape)
- [ ] Search uses LSP when available
- [ ] Search falls back to grep with degradation notice
- [ ] Write returns diagnostics
- [ ] Model adapts when LSP unavailable
- [ ] No shell bypass attempts

### Run It

```bash
mcphost --config catenary-only.json -- gemini-flash-lite
```

Give it a coding task. Watch behavior. Does it work? Does it try to escape?
Does it adapt?

### Success Criteria

If the model:
1. Uses catenary tools for file/search operations
2. Receives LSP-backed results (or graceful degradation)
3. Can't bypass to raw shell/grep
4. Completes coding tasks successfully

Then catenary-cli is just a polished TUI on top of this pattern.

### Why gemini-flash-lite

- Cheap (test iterations without cost concern)
- Fast (quick feedback loop)
- "Doer not thinker" — executes without overthinking
- If it works with flash-lite, it works with better models

## Non-Goals

- Pretty UI/animations
- Auto-approve mode
- Orchestrated modes (planning mode, proposal mode) that create/dispose contexts
- Automatic sub-agents that run in fresh contexts
- VSCode integration
- Mac-first design

This is a terminal tool for terminal users. Planning happens in conversation,
not in a special mode.