koda-cli 0.1.3

# Koda 🐻

A high-performance AI coding agent built in Rust.

Single compiled binary. Multi-provider LLM support. Zero runtime dependencies.

## Philosophy

**Koda is a personal AI assistant.** Coding is the starting point, but the platform
will expand to support email, calendar, knowledge management, and more, all
powered by the same engine. This focus drives every design decision:

- **Everything just works.** `cargo install koda-cli` and you're done.
  No Node.js, no Python, no Docker. Core tools (file ops, search, shell, web
  fetch, memory, agents) are compiled in: always available, zero config.
- **Auto-provisioned capabilities.** Beyond the core, Koda auto-installs
  MCP servers on demand. Ask about your email? Koda installs the email
  integration transparently. You never configure plumbing.
- **MCP is the extension model.** Need GitHub API, databases, Slack? Connect
  external MCP servers via `.mcp.json`, or let Koda auto-discover them.
  Koda stays lean; the ecosystem handles the long tail.
- **Ask Koda what it can do.** Just ask "what can you do?" Koda's
  capabilities are embedded in its system prompt, so it can always describe
  its own tools, commands, and features accurately.

## Install

```bash
# From crates.io
cargo install koda-cli

# From source
git clone https://github.com/lijunzh/koda.git
cd koda && cargo build --release -p koda-cli
# Binary is at target/release/koda
```

On first run, an onboarding wizard guides you through provider and API key setup.

## Quick Start

```bash
koda                              # Interactive REPL (auto-detects LM Studio)
koda --provider anthropic         # Use a cloud provider
koda --model-tier strong          # Force a specific tier (usually auto-adapts)
koda -p "fix the bug in auth.rs"  # Headless one-shot
echo "explain this" | koda        # Piped input
```

## Features

- **20+ built-in tools**: file ops, search, shell, web fetch, memory, agents, AST analysis, email, context recall
- **MCP support**: connect to any [MCP server](https://modelcontextprotocol.io) via `.mcp.json` (same format as Claude Code / Cursor)
- **14 LLM providers**: LM Studio, OpenAI, Anthropic, Gemini, Groq, Grok, Ollama, DeepSeek, Mistral, MiniMax, OpenRouter, Together, Fireworks, vLLM
- **6 built-in agents**: default, test writer, release engineer, codebase scout, planner, verifier
- **Model-adaptive**: starts all models at Standard tier, then promotes to Strong or demotes to Lite based on observed tool-use quality
- **Lazy tool loading**: Strong models get 9 core tools and discover more on demand via `DiscoverTools`
- **Smart context**: queries the context window from the provider API at startup (falls back to a lookup table), retries rate-limited requests with backoff, and auto-compacts
- **Approval modes**: auto (default) / strict (confirm writes) / safe (read-only) via `Shift+Tab`
- **Diff preview**: see exactly what changes before approving Edit, Write, Delete
- **Loop detection**: catches repeated tool calls with configurable iteration caps
- **Parallel execution**: concurrent tool calls and sub-agent orchestration
- **Extended thinking**: structured thinking block display with configurable budgets
- **Image analysis**: `@image.png` or drag-and-drop for multi-modal input
- **Git integration**: `/diff` review, commit message generation
- **Headless mode**: `koda -p "prompt"` with JSON output for CI/CD
- **Persistent memory**: project (`MEMORY.md`) and global (`~/.config/koda/memory.md`)
- **Cost tracking**: per-turn and per-session cost estimation including thinking tokens
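
The approval modes in the list above can be pictured as a small state machine. The following is a minimal sketch, not Koda's actual code: the `ApprovalMode` and `Decision` types are hypothetical, and the policy (reads always run, auto runs writes without asking, the cycle wraps from safe back to auto) is an assumption based on the short descriptions in the list.

```rust
/// The three approval modes named in the feature list.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ApprovalMode {
    Auto,
    Strict,
    Safe,
}

/// What to do with a tool call (hypothetical type, for illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Decision {
    Run,
    AskUser,
    Refuse,
}

impl ApprovalMode {
    /// Shift+Tab order: auto -> strict -> safe, then (assumed) back to auto.
    pub fn next(self) -> Self {
        match self {
            ApprovalMode::Auto => ApprovalMode::Strict,
            ApprovalMode::Strict => ApprovalMode::Safe,
            ApprovalMode::Safe => ApprovalMode::Auto,
        }
    }

    /// Assumed policy: reads always run; writes depend on the mode.
    pub fn decide(self, is_write: bool) -> Decision {
        match (self, is_write) {
            (_, false) => Decision::Run,
            (ApprovalMode::Auto, true) => Decision::Run,
            (ApprovalMode::Strict, true) => Decision::AskUser,
            (ApprovalMode::Safe, true) => Decision::Refuse,
        }
    }
}
```

Under these assumptions, `ApprovalMode::Strict.decide(true)` yields `Decision::AskUser`, which is the point where a diff preview would be shown.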

### 🌳 AST Code Analysis

Koda natively understands the structure of your codebase using embedded `tree-sitter` parsers.
- **Built-in languages:** Rust, Python, JavaScript, TypeScript, with instant function/class extraction and call graphs.
- **Extending with MCP:** Need Go, C++, or Java? Connect a community Tree-sitter MCP server via `.mcp.json`.

## REPL Commands

| Command | Description |
|---------|-------------|
| `/help` | Command palette (select & execute) |
| `/agent` | List available sub-agents |
| `/compact` | Summarize conversation to reclaim context |
| `/cost` | Show token usage for this session |
| `/diff` | Show/review uncommitted changes |
| `/mcp` | MCP servers: status, add, remove, restart |
| `/memory` | View/save project & global memory |
| `/model` | Pick a model (↑↓ arrow keys) |
| `/provider` | Switch LLM provider |
| `/sessions` | List, resume, or delete sessions |
| `/exit` | Quit Koda |

**Tips:** `@file` to attach context · Tab to autocomplete · `Shift+Tab` to cycle mode · `Alt+Enter` for multi-line

### Keyboard Shortcuts

| Key | Context | Action |
|-----|---------|--------|
| **Tab** | At prompt | Autocomplete (`/commands`, `@files`, `/model names`) |
| **Alt+Enter** | At prompt | Insert newline (multi-line input) |
| **Ctrl+C** | During inference | Cancel the current turn |
| **Ctrl+C ×2** | During inference | Force quit Koda |
| **Ctrl+C** | At prompt (with text) | Clear the line |
| **Esc** | At prompt | Clear the line |
| **Shift+Tab** | At prompt | Cycle mode (auto → strict → safe) |
| **Ctrl+D** | At prompt (empty) | Exit Koda |
| **↑/↓** | At prompt | Browse command history |

## MCP (Model Context Protocol)

Koda connects to external [MCP servers](https://modelcontextprotocol.io) for additional tools.
Create a `.mcp.json` in your project root (same format as Claude Code / Cursor):

```json
{
  "mcpServers": {
    "context7": {
      "command": "npx",
      "args": ["-y", "@upstash/context7-mcp"]
    },
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": { "GITHUB_TOKEN": "$GITHUB_TOKEN" }
    }
  }
}
```

Servers auto-connect on startup. MCP tools appear alongside built-in tools with
namespaced names (e.g. `github.create_issue`). Manage at runtime with `/mcp`.

User-level servers go in `~/.config/koda/mcp.json` (merged, project overrides).
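
As a rough illustration of the merge rule and the namespacing described above, here is a minimal sketch using only the standard library. `ServerSpec`, `merge_servers`, and `namespaced` are hypothetical names, not Koda's API, and JSON parsing is left out on purpose.

```rust
use std::collections::HashMap;

/// One server entry, mirroring the `command` / `args` fields shown in `.mcp.json`.
#[derive(Debug, Clone, PartialEq)]
pub struct ServerSpec {
    pub command: String,
    pub args: Vec<String>,
}

/// Merge user-level and project-level servers; on a name clash the project entry wins.
pub fn merge_servers(
    user: HashMap<String, ServerSpec>,
    project: HashMap<String, ServerSpec>,
) -> HashMap<String, ServerSpec> {
    let mut merged = user;
    // `extend` overwrites existing keys, so project entries replace user entries.
    merged.extend(project);
    merged
}

/// Build a namespaced tool name such as `github.create_issue`.
pub fn namespaced(server: &str, tool: &str) -> String {
    format!("{server}.{tool}")
}
```

With this rule, a `github` server defined in both files resolves to the project definition, while servers defined only at user level remain available.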

## Architecture

Koda is a Cargo workspace with four crates:

```
koda/
├── koda-core/     # Engine library (providers, tools, inference, DB), zero terminal deps
├── koda-cli/      # CLI binary (REPL, display, approval UI)
├── koda-ast/      # MCP server: tree-sitter AST analysis
└── koda-email/    # MCP server: email via IMAP/SMTP
```

The engine communicates through `EngineEvent` (output) and `EngineCommand` (input) enums
over async channels. See [DESIGN.md](DESIGN.md) for architectural decisions.
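
To make the command/event split concrete, the sketch below wires two enums through channels. It is an illustration under stated assumptions: the variant names are invented, and it swaps in blocking `std::sync::mpsc` channels plus a thread in place of the async channels the engine is described as using, so that it runs without any external crate.

```rust
use std::sync::mpsc;
use std::thread;

/// Input to the engine. Variant names are hypothetical.
#[derive(Debug, PartialEq)]
pub enum EngineCommand {
    UserPrompt(String),
    Shutdown,
}

/// Output from the engine. Variant names are hypothetical.
#[derive(Debug, PartialEq)]
pub enum EngineEvent {
    TextDelta(String),
    TurnComplete,
}

/// Spawn a toy engine that echoes each prompt and stops on `Shutdown`.
pub fn spawn_engine() -> (mpsc::Sender<EngineCommand>, mpsc::Receiver<EngineEvent>) {
    let (cmd_tx, cmd_rx) = mpsc::channel::<EngineCommand>();
    let (evt_tx, evt_rx) = mpsc::channel::<EngineEvent>();
    thread::spawn(move || {
        for cmd in cmd_rx {
            match cmd {
                EngineCommand::UserPrompt(text) => {
                    // A real engine would run inference here.
                    let _ = evt_tx.send(EngineEvent::TextDelta(format!("echo: {text}")));
                    let _ = evt_tx.send(EngineEvent::TurnComplete);
                }
                EngineCommand::Shutdown => break,
            }
        }
        // Dropping `evt_tx` here closes the event stream for the front end.
    });
    (cmd_tx, evt_rx)
}
```

A front end only needs the sender and the receiver, which is one way an engine crate can stay free of terminal dependencies.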

### Model-Adaptive Architecture

Koda auto-detects your model's capabilities and adapts its behavior:

| Tier | Assigned when | Behavior |
|------|--------|----------|
| **Strong** | Promoted at runtime after 3 successful tool-use turns | Minimal prompts, lazy tool loading, parallel execution |
| **Standard** | Default for all models | Full prompts, all tools, balanced |
| **Lite** | Demoted at runtime after 2+ hallucinated/malformed tool calls | Verbose prompts, step-by-step guidance |

Tier is observed at runtime, not guessed from model names. Override with `--model-tier strong|standard|lite` or `"model_tier": "strong"` in agent config.
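
The promotion and demotion rules in the table can be sketched as a small tracker. This is a hypothetical illustration, not Koda's implementation: it assumes the counts are cumulative rather than consecutive, that demotion takes precedence over promotion, that the caller only records turns containing tool calls, and that a forced tier (as with `--model-tier`) is never changed.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tier {
    Lite,
    Standard,
    Strong,
}

/// Tracks observed tool-use quality and derives the current tier.
pub struct TierTracker {
    tier: Tier,
    clean_turns: u32,
    bad_calls: u32,
    pinned: bool,
}

impl TierTracker {
    /// `forced` models an explicit override; `None` starts at Standard.
    pub fn new(forced: Option<Tier>) -> Self {
        TierTracker {
            tier: forced.unwrap_or(Tier::Standard),
            clean_turns: 0,
            bad_calls: 0,
            pinned: forced.is_some(),
        }
    }

    pub fn tier(&self) -> Tier {
        self.tier
    }

    /// Record one tool-use turn. `bad_calls_in_turn` counts hallucinated
    /// tool names or malformed JSON arguments seen in that turn.
    pub fn record_turn(&mut self, bad_calls_in_turn: u32) {
        if self.pinned {
            return;
        }
        if bad_calls_in_turn == 0 {
            self.clean_turns += 1;
        } else {
            self.bad_calls += bad_calls_in_turn;
        }
        if self.bad_calls >= 2 {
            self.tier = Tier::Lite;
        } else if self.clean_turns >= 3 {
            self.tier = Tier::Strong;
        }
    }
}
```

Three calls to `record_turn(0)` on a fresh tracker move it from `Standard` to `Strong`; a single `record_turn(2)` moves it to `Lite`.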

## Getting the Most Out of Koda

### Model tiers adapt automatically

Koda starts every model at Standard and adapts based on observed behavior:
- **Promotion to Strong**: after 3 turns of valid tool calls (correct names, parseable JSON)
- **Demotion to Lite**: if 2+ tool calls hallucinate names or send malformed JSON

You can force a tier if needed:

```bash
koda --model-tier strong    # Minimal prompts, lazy tools (saves ~57% token overhead)
koda --model-tier lite      # Verbose prompts, step-by-step guidance for small models
```

The status bar shows your current tier: `claude-sonnet-4-6 [Standard]` (then `[Strong]` after promotion).

### Delegate with sub-agents

Koda ships with specialized agents. Use them for focused tasks:

| Agent | Purpose | Tools |
|-------|---------|-------|
| **scout** | Codebase exploration (read-only) | Read, List, Grep, Glob |
| **testgen** | Test generation | All tools |
| **planner** | Task decomposition (read-only) | Read, List, Grep, Glob |
| **verifier** | Quality verification | Read, Grep, Bash |
| **releaser** | Release engineering | All tools |

Koda's intent classifier suggests agents automatically: "find all uses of X" → scout, "write tests" → testgen.

Sub-agents can run on different models for cost optimization. For example, `agents/scout.json` can point exploration at a cheaper model:
```json
{
  "name": "scout",
  "provider": "gemini",
  "model": "gemini-2.5-flash",
  "allowed_tools": ["Read", "List", "Grep", "Glob"],
  "max_iterations": 10
}
```
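
One way to read the `allowed_tools` and `max_iterations` fields is as two guards applied to every sub-agent step. The sketch below is an assumption about their meaning, not Koda's code: `AgentConfig` here is a hypothetical struct, and how an "All tools" agent is represented is not covered.

```rust
/// A subset of the agent config fields shown in the JSON example.
pub struct AgentConfig {
    pub name: String,
    pub allowed_tools: Vec<String>,
    pub max_iterations: u32,
}

impl AgentConfig {
    /// Assumed rule: a tool may be called only if it is listed by name.
    pub fn allows(&self, tool: &str) -> bool {
        self.allowed_tools.iter().any(|t| t == tool)
    }

    /// Assumed rule: the agent loop stops once the iteration cap is reached.
    pub fn may_continue(&self, iterations_done: u32) -> bool {
        iterations_done < self.max_iterations
    }
}
```

Under this reading, the scout configuration would reject a `Write` call outright and stop after ten iterations.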

### Context window management

Koda auto-detects your model's context window and manages it:

| Model | Context | Auto-compact at |
|-------|---------|----------------|
| Claude Opus/Sonnet | 200K tokens | 90% (Strong) |
| Gemini 2.5 | 1M tokens | 80% (Standard) |
| GPT-4o | 128K tokens | 90% (Strong) |
| Local models | 4K–128K | 70% (Lite) |

Use `/compact` manually, or let auto-compact handle it. The `/cost` command shows token usage and estimated cost.
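
The thresholds in the table reduce to a simple comparison. The following sketch assumes the percentage depends only on the tier, as the table suggests; the function names are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tier {
    Lite,
    Standard,
    Strong,
}

/// Auto-compact threshold per tier, taken from the table above.
pub fn compact_threshold_percent(tier: Tier) -> u64 {
    match tier {
        Tier::Strong => 90,
        Tier::Standard => 80,
        Tier::Lite => 70,
    }
}

/// True once used tokens reach the tier's share of the context window.
pub fn should_compact(used_tokens: u64, context_window: u64, tier: Tier) -> bool {
    // Integer form of used / window >= percent / 100, avoiding float rounding.
    used_tokens * 100 >= context_window * compact_threshold_percent(tier)
}
```

For a 200K-token window at Strong tier this triggers at 180,000 tokens; for a 1M-token window at Standard tier it triggers at 800,000.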

### Save tokens with DiscoverTools

Strong-tier models load only core tools (Read, Write, Edit, etc.) by default. When the model needs agents, skills, or other capabilities, it calls `DiscoverTools` to load them on demand, saving ~57% of per-turn tool overhead.
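
Lazy loading of this kind can be sketched as two sets: tools already advertised to the model and tools held back until requested. The code below is only an illustration; `LazyTools`, the substring-matching query, and any tool names beyond Read, Write, and Edit are hypothetical and do not describe the real `DiscoverTools` interface.

```rust
use std::collections::BTreeSet;

/// Tools advertised to the model now, plus tools held back until requested.
pub struct LazyTools {
    loaded: BTreeSet<String>,
    deferred: BTreeSet<String>,
}

impl LazyTools {
    pub fn new(core: &[&str], deferred: &[&str]) -> Self {
        LazyTools {
            loaded: core.iter().map(|s| s.to_string()).collect(),
            deferred: deferred.iter().map(|s| s.to_string()).collect(),
        }
    }

    /// Number of tool definitions currently sent with each turn.
    pub fn loaded_count(&self) -> usize {
        self.loaded.len()
    }

    /// Simulates a discovery call: load every deferred tool whose name
    /// contains `query` (case-insensitive) and return the newly loaded names.
    pub fn discover(&mut self, query: &str) -> Vec<String> {
        let needle = query.to_lowercase();
        let hits: Vec<String> = self
            .deferred
            .iter()
            .filter(|name| name.to_lowercase().contains(&needle))
            .cloned()
            .collect();
        for name in &hits {
            self.deferred.remove(name);
            self.loaded.insert(name.clone());
        }
        hits
    }
}
```

The saving comes from `loaded_count` staying small until a discovery call is actually needed.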

### Recall older context

If context was dropped from the sliding window, the model can use `RecallContext` to search or retrieve specific turns from conversation history.
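
The two operations mentioned here, searching history and retrieving a specific turn, might look like the sketch below. The `Turn` struct and both function names are hypothetical, and plain keyword matching is an assumption; the actual `RecallContext` tool may work differently.

```rust
/// One stored conversation turn (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct Turn {
    pub index: usize,
    pub role: String,
    pub text: String,
}

/// Case-insensitive keyword search over the full history, newest first.
pub fn search_turns<'a>(history: &'a [Turn], query: &str) -> Vec<&'a Turn> {
    let needle = query.to_lowercase();
    history
        .iter()
        .rev()
        .filter(|turn| turn.text.to_lowercase().contains(&needle))
        .collect()
}

/// Fetch one specific turn by its index, if it exists.
pub fn get_turn(history: &[Turn], index: usize) -> Option<&Turn> {
    history.iter().find(|turn| turn.index == index)
}
```

Because the search runs over stored history rather than the live window, turns that were compacted away remain reachable.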

## Documentation

- **[DESIGN.md](DESIGN.md)**: Design decisions and rationale
- **[CHANGELOG.md](CHANGELOG.md)**: Release history
- **[CLAUDE.md](CLAUDE.md)**: Developer guide for AI assistants
- **[GitHub Issues](https://github.com/lijunzh/koda/issues)**: Roadmap and release tracking

## Development

```bash
cargo test --workspace --features koda-core/test-support  # Run all 489 tests
cargo clippy --workspace      # Lint
cargo run -p koda-cli         # Run locally
```

## License

MIT