# Koda 🐻
A high-performance AI coding agent built in Rust.
Single compiled binary. Multi-provider LLM support. Zero runtime dependencies.
## Philosophy
**Koda is a personal AI assistant.** Coding is the starting point, but the platform
will expand to support email, calendar, knowledge management, and more – all
powered by the same engine. This focus drives every design decision:
- **Everything just works.** `cargo install koda-cli` and you're done.
No Node.js, no Python, no Docker. Core tools (file ops, search, shell, web
fetch, memory, agents) are compiled in – always available, zero config.
- **Auto-provisioned capabilities.** Beyond the core, koda auto-installs
MCP servers on demand. Ask about your email? Koda installs the email
integration transparently. You never configure plumbing.
- **MCP is the extension model.** Need GitHub API, databases, Slack? Connect
external MCP servers via `.mcp.json`, or let koda auto-discover them.
Koda stays lean; the ecosystem handles the long tail.
- **Ask Koda what it can do.** Just ask: "what can you do?" Koda's
capabilities are embedded in its system prompt, so it can always describe
its own tools, commands, and features accurately.
## Install
```bash
# From crates.io
cargo install koda-cli
# From source
git clone https://github.com/lijunzh/koda.git
cd koda && cargo build --release -p koda-cli
# Binary is at target/release/koda
```
On first run, an onboarding wizard guides you through provider and API key setup.
## Quick Start
```bash
koda # Interactive REPL (auto-detects LM Studio)
koda --provider anthropic # Use a cloud provider
koda --model-tier strong # Force a specific tier (usually auto-adapts)
koda -p "fix the bug in auth.rs" # Headless one-shot
```
## Features
- **20+ built-in tools** – file ops, search, shell, web fetch, memory, agents, AST analysis, email, context recall
- **MCP support** – connect to any [MCP server](https://modelcontextprotocol.io) via `.mcp.json` (same format as Claude Code / Cursor)
- **14 LLM providers** – LM Studio, OpenAI, Anthropic, Gemini, Groq, Grok, Ollama, DeepSeek, Mistral, MiniMax, OpenRouter, Together, Fireworks, vLLM
- **6 built-in agents** – default, test writer, release engineer, codebase scout, planner, verifier
- **Model-adaptive** – starts all models at Standard tier, then promotes to Strong or demotes to Lite based on observed tool-use quality
- **Lazy tool loading** – Strong models get 9 core tools; discover more on demand via `DiscoverTools`
- **Smart context** – queries context window from provider API at startup (falls back to lookup table), rate-limit retry with backoff, auto-compact
- **Approval modes** – auto (default) / strict (confirm writes) / safe (read-only) via `Shift+Tab`
- **Phase-aware gating** – six-phase state machine (Understanding → Executing → Verifying) gates write approval based on where the agent is in its process
- **Folder-scoped permissions** – writes outside `project_root` always require confirmation; bash commands with path escapes are flagged
- **Learned autonomy** – `InterventionObserver` tracks human override patterns at phase gates and adapts autonomy over time
- **Diff preview** – see exactly what changes before approving Edit, Write, Delete
- **Loop detection** – catches repeated tool calls with configurable iteration caps
- **Parallel execution** – concurrent tool calls and sub-agent orchestration
- **Extended thinking** – structured thinking-block display with configurable budgets
- **Image analysis** – `@image.png` or drag-and-drop for multi-modal input
- **Git integration** – `/diff` review, commit message generation
- **Headless mode** – `koda -p "prompt"` with JSON output for CI/CD
- **Persistent memory** – project (`MEMORY.md`) and global (`~/.config/koda/memory.md`)
- **Cost tracking** – per-turn and per-session cost estimation, including thinking tokens
### 🌳 AST Code Analysis
Koda natively understands the structure of your codebase using embedded `tree-sitter` parsers.
- **Built-in languages:** Rust, Python, JavaScript, TypeScript – instant function/class extraction and call graphs.
- **Extending with MCP:** Need Go, C++, or Java? Connect a community Tree-sitter MCP server via `.mcp.json`.
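For example, an extra Tree-sitter server could be wired in like this (the server name and npm package here are purely illustrative, not a real published package):

```json
{
  "mcpServers": {
    "treesitter-extra": {
      "command": "npx",
      "args": ["-y", "example-tree-sitter-mcp"]
    }
  }
}
```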
## REPL Commands
| Command | Description |
|---|---|
| `/help` | Command palette (select & execute) |
| `/agent` | List available sub-agents |
| `/compact` | Summarize conversation to reclaim context |
| `/cost` | Show token usage for this session |
| `/diff` | Show/review uncommitted changes |
| `/mcp` | MCP servers: status, add, remove, restart |
| `/memory` | View/save project & global memory |
| `/model` | Pick a model (↑↓ arrow keys) |
| `/provider` | Switch LLM provider |
| `/sessions` | List, resume, or delete sessions |
| `/exit` | Quit Koda |
**Tips:** `@file` to attach context · Tab to autocomplete · `Shift+Tab` to cycle mode · `Alt+Enter` for multi-line
### Keyboard Shortcuts
| Key | Context | Action |
|---|---|---|
| **Tab** | At prompt | Autocomplete (`/commands`, `@files`, `/model names`) |
| **Alt+Enter** | At prompt | Insert newline (multi-line input) |
| **Ctrl+C** | During inference | Cancel the current turn |
| **Ctrl+C ×2** | During inference | Force quit Koda |
| **Ctrl+C** | At prompt (with text) | Clear the line |
| **Esc** | At prompt | Clear the line |
| **Shift+Tab** | At prompt | Cycle mode (auto → strict → safe) |
| **Ctrl+D** | At prompt (empty) | Exit Koda |
| **↑/↓** | At prompt | Browse command history |
## MCP (Model Context Protocol)
Koda connects to external [MCP servers](https://modelcontextprotocol.io) for additional tools.
Create a `.mcp.json` in your project root (same format as Claude Code / Cursor):
```json
{
"mcpServers": {
"context7": {
"command": "npx",
"args": ["-y", "@upstash/context7-mcp"]
},
"github": {
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-github"],
"env": { "GITHUB_TOKEN": "$GITHUB_TOKEN" }
}
}
}
```
Servers auto-connect on startup. MCP tools appear alongside built-in tools with
namespaced names (e.g. `github.create_issue`). Manage at runtime with `/mcp`.
User-level servers go in `~/.config/koda/mcp.json` (merged, project overrides).
## Architecture
Koda is a Cargo workspace with four crates:
```
koda/
├── koda-core/ # Engine library (providers, tools, inference, DB) – zero terminal deps
├── koda-cli/ # CLI binary (REPL, display, approval UI)
├── koda-ast/ # MCP server: tree-sitter AST analysis
└── koda-email/ # MCP server: email via IMAP/SMTP
```
The engine communicates through `EngineEvent` (output) and `EngineCommand` (input) enums
over async channels. See [DESIGN.md](DESIGN.md) for architectural decisions.
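The command-in / event-out channel pattern can be sketched as follows. This is a deliberately simplified, hypothetical illustration using blocking `std::sync::mpsc` channels and made-up enum variants – the real `EngineEvent` and `EngineCommand` types in koda-core are richer and run over async channels:

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical, much-simplified stand-ins for koda-core's enums.
#[derive(Debug, PartialEq)]
enum EngineCommand {
    Prompt(String),
    Shutdown,
}

#[derive(Debug, PartialEq)]
enum EngineEvent {
    Output(String),
    Done,
}

// The engine loop: consume commands, emit events.
fn run_engine(cmds: mpsc::Receiver<EngineCommand>, events: mpsc::Sender<EngineEvent>) {
    for cmd in cmds {
        match cmd {
            EngineCommand::Prompt(p) => {
                // A real engine would run inference and tool calls here.
                events.send(EngineEvent::Output(format!("echo: {p}"))).unwrap();
            }
            EngineCommand::Shutdown => {
                events.send(EngineEvent::Done).unwrap();
                break;
            }
        }
    }
}

fn main() {
    let (cmd_tx, cmd_rx) = mpsc::channel();
    let (evt_tx, evt_rx) = mpsc::channel();
    let engine = thread::spawn(move || run_engine(cmd_rx, evt_tx));

    cmd_tx.send(EngineCommand::Prompt("fix the bug".into())).unwrap();
    cmd_tx.send(EngineCommand::Shutdown).unwrap();

    // Iteration ends when the engine drops its sender.
    for event in evt_rx {
        println!("{event:?}");
    }
    engine.join().unwrap();
}
```

Keeping the engine behind two typed channels is what lets koda-core stay free of terminal dependencies: the CLI crate only ever sees events and sends commands.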
### Model-Adaptive Architecture
Koda auto-detects your model's capabilities and adapts its behavior:
| Tier | When | Behavior |
|---|---|---|
| **Strong** | Promoted at runtime after 3 successful tool-use turns | Minimal prompts, lazy tool loading, parallel execution |
| **Standard** | Default for all models | Full prompts, all tools, balanced |
| **Lite** | Demoted at runtime after 2+ hallucinated/malformed tool calls | Verbose prompts, step-by-step guidance |
Tier is observed at runtime, not guessed from model names. Override with `--model-tier strong|standard|lite` or `"model_tier": "strong"` in agent config.
## Getting the Most Out of Koda
### Model tiers adapt automatically
Koda starts every model at Standard and adapts based on observed behavior:
- **Promotion to Strong** – after 3 turns of valid tool calls (correct names, parseable JSON)
- **Demotion to Lite** – if 2+ tool calls hallucinate names or send malformed JSON
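The adaptation loop above can be sketched as a small state machine. The type and method names here are hypothetical (the thresholds mirror the text, but this is not koda-core's actual API):

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum ModelTier {
    Lite,
    Standard,
    Strong,
}

// Hypothetical observer tracking tool-use quality per session.
struct TierObserver {
    tier: ModelTier,
    good_turns: u32,
    bad_calls: u32,
}

impl TierObserver {
    fn new() -> Self {
        // Every model starts at Standard, regardless of name.
        Self { tier: ModelTier::Standard, good_turns: 0, bad_calls: 0 }
    }

    // A turn whose tool calls all used correct names and parseable JSON.
    fn record_valid_turn(&mut self) {
        self.good_turns += 1;
        if self.tier == ModelTier::Standard && self.good_turns >= 3 {
            self.tier = ModelTier::Strong; // promote after 3 clean turns
        }
    }

    // A hallucinated tool name or a malformed JSON payload.
    fn record_bad_call(&mut self) {
        self.bad_calls += 1;
        if self.bad_calls >= 2 {
            self.tier = ModelTier::Lite; // demote after 2+ bad calls
        }
    }
}

fn main() {
    let mut obs = TierObserver::new();
    for _ in 0..3 {
        obs.record_valid_turn();
    }
    println!("{:?}", obs.tier); // Strong
}
```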
You can force a tier if needed:
```bash
koda --model-tier strong # Minimal prompts, lazy tools (saves ~57% token overhead)
koda --model-tier lite # Verbose prompts, step-by-step guidance for small models
```
The status bar shows your current tier: `claude-sonnet-4-6 [Standard]` (then `[Strong]` after promotion).
### Delegate with sub-agents
Koda ships with specialized agents. Use them for focused tasks:
| Agent | Purpose | Tools |
|---|---|---|
| **scout** | Codebase exploration (read-only) | Read, List, Grep, Glob |
| **testgen** | Test generation | All tools |
| **planner** | Task decomposition (read-only) | Read, List, Grep, Glob |
| **verifier** | Quality verification | Read, Grep, Bash |
| **releaser** | Release engineering | All tools |
Koda's intent classifier suggests agents automatically: "find all uses of X" → scout, "write tests" → testgen.
Sub-agents can run on different models for cost optimization:
```json
// agents/scout.json – use a cheap model for exploration
{
"name": "scout",
"provider": "gemini",
"model": "gemini-2.5-flash",
"allowed_tools": ["Read", "List", "Grep", "Glob"],
"max_iterations": 10
}
```
### Context window management
Koda auto-detects your model's context window and manages it:
| Model family | Context window | Compact threshold (tier) |
|---|---|---|
| Claude Opus/Sonnet | 200K tokens | 90% (Strong) |
| Gemini 2.5 | 1M tokens | 80% (Standard) |
| GPT-4o | 128K tokens | 90% (Strong) |
| Local models | 4K–128K | 70% (Lite) |
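The trigger condition for auto-compact reduces to one comparison. A minimal sketch, assuming usage and window sizes measured in tokens and the percentage thresholds from the table above (function name is illustrative, not koda-core's):

```rust
// True when token usage crosses the tier-dependent threshold,
// e.g. 90% for Strong, 80% for Standard, 70% for Lite.
fn should_compact(used_tokens: u64, window: u64, threshold_pct: u64) -> bool {
    // Integer arithmetic avoids float rounding at the boundary.
    used_tokens * 100 >= window * threshold_pct
}

fn main() {
    // A 200K-token window at the 90% Strong-tier threshold:
    println!("{}", should_compact(185_000, 200_000, 90)); // true
}
```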
Use `/compact` manually, or let auto-compact handle it. The `/cost` command shows token usage and estimated cost.
### Save tokens with DiscoverTools
Strong-tier models load only core tools (Read, Write, Edit, etc.) by default. When the model needs agents, skills, or other capabilities, it calls `DiscoverTools` to load them on demand – saving ~57% per-turn tool overhead.
### Recall older context
If context was dropped from the sliding window, the model can use `RecallContext` to search or retrieve specific turns from conversation history.
## Documentation
- **[DESIGN.md](DESIGN.md)** – Design decisions and rationale
- **[CHANGELOG.md](CHANGELOG.md)** – Release history
- **[CLAUDE.md](CLAUDE.md)** – Developer guide for AI assistants
- **[GitHub Issues](https://github.com/lijunzh/koda/issues)** – Roadmap and release tracking
## Development
```bash
cargo test --workspace --features koda-core/test-support # Run all 432 tests
cargo clippy --workspace # Lint
cargo run -p koda-cli # Run locally
```
## License
MIT