Mermaid
An open-source AI coding assistant with computer use for the terminal. Multi-provider — Ollama (local), Anthropic, Gemini, OpenAI, Groq, OpenRouter, and any OpenAI-compatible endpoint — with native tool calling, subagents, desktop control, and a clean TUI.
Features
- Multi-Provider — Ollama (local/cloud), Anthropic Claude, Google Gemini, OpenAI, Groq, OpenRouter, Cerebras, DeepInfra, Together, plus fully-custom OpenAI-compatible endpoints
- Native Tool Calling — read, write, edit, execute commands, search the web, manage MCP servers
- Computer Use — screenshot, click, type, scroll — full desktop control via vision models
- Subagents — spawn parallel autonomous agents for independent tasks
- Agent Loop — model calls tools autonomously, sees results, and continues until done
- Image Paste — Ctrl+V to attach images for vision models (X11/Wayland/macOS/Windows)
- Reasoning Levels — seven tiers (none/minimal/low/medium/high/xhigh/max); cycle with Alt+T or set via /reasoning; persisted per-model
- MERMAID.md — auto-loaded project-level instructions; edits take effect on the next turn
- MCP Servers — stdio JSON-RPC client with a built-in registry of 16 popular servers (mermaid add <name>)
- Session Persistence — conversations auto-save and resume with --continue
- Message Queuing — type while the model generates, messages send in order
- Non-Interactive Mode — script with mermaid run "prompt" for CI/automation
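The Agent Loop feature above can be pictured with a small sketch. This is not Mermaid's actual implementation: the `Model` trait, `ModelReply`, `run_agent`, the `word_count` tool, and the step cap are all hypothetical names chosen for illustration. It only shows the general shape of "call the model, run the requested tool, feed the result back, repeat until done".

```rust
use std::collections::HashMap;

/// What the model asks for on each turn (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
enum ModelReply {
    /// The model wants a tool run with a single string argument.
    ToolCall { name: String, arg: String },
    /// The model is done and returns its final answer.
    Final(String),
}

/// Minimal stand-in for a chat model: sees the transcript, returns a reply.
trait Model {
    fn next(&mut self, transcript: &[String]) -> ModelReply;
}

/// A tool is a plain function from an argument string to a result string.
type Tool = fn(&str) -> String;

/// Example tool (hypothetical): count whitespace-separated words.
fn word_count(text: &str) -> String {
    text.split_whitespace().count().to_string()
}

/// Call the model, execute any requested tool, append the result to the
/// transcript, and repeat until a final answer arrives or the cap is hit.
fn run_agent(
    model: &mut dyn Model,
    tools: &HashMap<&str, Tool>,
    prompt: &str,
    max_steps: usize,
) -> Option<String> {
    let mut transcript = vec![format!("user: {prompt}")];
    for _ in 0..max_steps {
        match model.next(&transcript) {
            ModelReply::Final(text) => return Some(text),
            ModelReply::ToolCall { name, arg } => {
                let result = match tools.get(name.as_str()) {
                    Some(tool) => tool(&arg),
                    None => format!("error: unknown tool {name}"),
                };
                // The model sees this line on its next turn.
                transcript.push(format!("tool {name}: {result}"));
            }
        }
    }
    None // step cap reached without a final answer
}

/// Scripted model used only to exercise the loop: replays queued replies,
/// then finishes by echoing the last transcript line.
struct Scripted {
    replies: Vec<ModelReply>,
}

impl Model for Scripted {
    fn next(&mut self, transcript: &[String]) -> ModelReply {
        if self.replies.is_empty() {
            ModelReply::Final(transcript.last().cloned().unwrap_or_default())
        } else {
            self.replies.remove(0)
        }
    }
}
```

The step cap is a design choice made for this sketch so a misbehaving model cannot loop forever; the README does not say whether Mermaid enforces one.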
Quick Start
# Install from crates.io
# Or from source
Local inference requires Ollama (models auto-pull if not found locally). Cloud providers are optional — see Remote Providers below.
Computer Use Dependencies (optional)
For desktop control via screenshot/click/type tools:
# Linux / X11
# Linux / Wayland
# Screenshot downscaling (optional, for high-res displays)
macOS and Windows are supported through screencapture/pngpaste and PowerShell respectively. See src/agents/computer_use.rs for the full platform matrix.
Usage
mermaid add <name> resolves the name through a built-in registry of 16 popular MCP servers (context7, playwright, memory, git, fetch, time, filesystem, notion, slack, postgres, brave-search, supabase, perplexity, docker, sequential-thinking, everything), prompts for any required env vars, validates by spawning the server, and saves it to ~/.config/mermaid/config.toml.
Keyboard Shortcuts
| Key | Action |
|---|---|
| Enter | Send message (or queue while the model is generating) |
| Esc | Stop generation / clear input / dismiss command palette |
| Ctrl+C | Quit (auto-saves the session) |
| Alt+T | Cycle reasoning level: None → Low → Medium → High → Max → None |
| Ctrl+V | Paste image or text from clipboard |
| Ctrl+O | Preview attached image |
| Ctrl+Click | Open image from chat history |
| / | Open slash-command palette (filter-as-you-type) |
| Tab | In palette: complete highlighted command name |
| Up/Down | Navigate input history; palette navigation; scroll chat |
| Page Up/Down | Scroll chat |
| Mouse Wheel | Scroll chat |
Slash Commands
Type / to open the command palette (shows all commands with live filter); type /<name> to invoke directly.
| Command | Description |
|---|---|
| /model <name> | Switch model; auto-pulls Ollama models if needed |
| /reasoning <level> | Set reasoning: none, minimal, low, medium, high, xhigh, max |
| /clear | Clear chat history and model context for this session |
| /save [name] | Save the current conversation |
| /load [id] | Load a saved conversation by id |
| /list | List saved conversations |
| /cloud-setup | Show Ollama Cloud API-key setup instructions |
| /help (/h) | Show all commands |
| /quit (/q) | Exit |
Reasoning choices persist per model: if you set /reasoning high on Claude Opus 4.7 and /reasoning low on an Ollama model, both settings are remembered across sessions.
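As a rough illustration of per-model reasoning preferences, the sketch below parses the seven level names accepted by /reasoning and keeps one level per model id with a fallback. The `ReasoningLevel` enum, the `ReasoningPrefs` struct, and its methods are assumptions made for this example and are not taken from Mermaid's source.

```rust
use std::collections::HashMap;
use std::str::FromStr;

/// The seven tiers named in this README. `Off` stands for "none"
/// (renamed here only to avoid confusion with `Option::None`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ReasoningLevel {
    Off,
    Minimal,
    Low,
    Medium,
    High,
    XHigh,
    Max,
}

impl FromStr for ReasoningLevel {
    type Err = String;

    /// Parse a level name as typed after `/reasoning`, case-insensitively.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim().to_ascii_lowercase().as_str() {
            "none" => Ok(Self::Off),
            "minimal" => Ok(Self::Minimal),
            "low" => Ok(Self::Low),
            "medium" => Ok(Self::Medium),
            "high" => Ok(Self::High),
            "xhigh" => Ok(Self::XHigh),
            "max" => Ok(Self::Max),
            other => Err(format!("unknown reasoning level: {other}")),
        }
    }
}

/// Per-model preferences with a global fallback (hypothetical structure).
struct ReasoningPrefs {
    fallback: ReasoningLevel,
    per_model: HashMap<String, ReasoningLevel>,
}

impl ReasoningPrefs {
    fn new(fallback: ReasoningLevel) -> Self {
        Self { fallback, per_model: HashMap::new() }
    }

    /// Remember the level chosen for one model id.
    fn set(&mut self, model: &str, level: ReasoningLevel) {
        self.per_model.insert(model.to_string(), level);
    }

    /// Look up the level for a model, using the fallback if none was set.
    fn get(&self, model: &str) -> ReasoningLevel {
        self.per_model.get(model).copied().unwrap_or(self.fallback)
    }
}
```

Persisting the map across sessions (for example by writing it to the config file) is left out of the sketch.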
Tools
The model uses these autonomously via native tool calling:
| Tool | Description |
|---|---|
| read_file | Read files (text, PDF, images) |
| write_file | Create or overwrite files (timestamped backup if file exists) |
| edit_file | Targeted text replacement with diff |
| delete_file | Delete files (timestamped backup) |
| create_directory | Create directories |
| execute_command | Run any command — terminal, GUI apps, scripts, servers |
| web_search | Search the web (Ollama Cloud) |
| web_fetch | Fetch URL content as markdown (Ollama Cloud) |
| agent | Spawn autonomous sub-agent for parallel tasks |
| screenshot | Capture the screen (fullscreen, focused window, monitor, region, or window by title) |
| list_windows | List visible window titles (discovery for window-mode screenshots) |
| click | Click at screen coordinates (auto-screenshot after) |
| type_text | Type text at cursor position (auto-screenshot after) |
| press_key | Press key combos (ctrl+s, alt+tab, etc.) |
| scroll | Scroll up or down |
| mouse_move | Move mouse cursor without clicking |
MCP servers contribute additional tools under the mcp__<server>__<tool> prefix when configured.
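The mcp__<server>__<tool> naming scheme can be illustrated with two small helpers. These function names are hypothetical, and the sketch assumes that server names never contain a double underscore; the README does not state how Mermaid handles that case. The tool name `lookup` used in the comments is also made up.

```rust
/// Build the namespaced tool name for an MCP-provided tool,
/// e.g. ("context7", "lookup") -> "mcp__context7__lookup".
fn mcp_tool_name(server: &str, tool: &str) -> String {
    format!("mcp__{server}__{tool}")
}

/// Split a namespaced name back into (server, tool).
/// Returns None for built-in tools such as `read_file` and for
/// malformed names with an empty server or tool part.
fn parse_mcp_tool_name(name: &str) -> Option<(&str, &str)> {
    let rest = name.strip_prefix("mcp__")?;
    // Split on the first "__" only, so the tool part may itself contain "__".
    let (server, tool) = rest.split_once("__")?;
    if server.is_empty() || tool.is_empty() {
        return None;
    }
    Some((server, tool))
}
```

A dispatcher could use the parsed server name to route the call to the right MCP process and fall back to the built-in tools when parsing returns None.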
Project Instructions (MERMAID.md)
Create a MERMAID.md at your project root with conventions, tool versions, naming patterns, and run commands. Mermaid loads it automatically at session start and reloads it when the file changes (one stat per turn, no filesystem watcher). The directory walk that looks for the file stops at the .git root or $HOME.
File size is capped at ~10k tokens; oversized content is truncated with a marker so the model knows context was elided.
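The loading behaviour described in this section (an upward walk bounded by the .git root or $HOME, a cheap per-turn change check, and truncation with a marker) might look roughly like the sketch below. Everything here is an assumption for illustration: the function names, the choice to return the first file found, the use of the modification time as the change signal, the character-based cap standing in for the ~10k-token limit, and the marker text.

```rust
use std::fs;
use std::path::{Path, PathBuf};
use std::time::SystemTime;

const FILE_NAME: &str = "MERMAID.md";

/// Walk upward from `start` and return the first MERMAID.md found.
/// The walk ends after checking a directory that contains `.git`
/// or that equals `home`.
fn find_instructions(start: &Path, home: &Path) -> Option<PathBuf> {
    let mut dir = Some(start);
    while let Some(d) = dir {
        let candidate = d.join(FILE_NAME);
        if candidate.is_file() {
            return Some(candidate);
        }
        if d.join(".git").exists() || d == home {
            return None;
        }
        dir = d.parent();
    }
    None
}

/// One stat call: the file's modification time, if it can be read.
/// A caller can compare this with the value seen on the previous turn
/// and re-read the file only when it differs.
fn current_mtime(path: &Path) -> Option<SystemTime> {
    fs::metadata(path).and_then(|m| m.modified()).ok()
}

/// Keep at most `max_chars` characters and append a marker so the
/// model can tell that content was elided.
fn truncate_with_marker(text: &str, max_chars: usize) -> String {
    if text.chars().count() <= max_chars {
        return text.to_string();
    }
    let kept: String = text.chars().take(max_chars).collect();
    format!("{kept}\n[... truncated ...]")
}
```

Comparing modification times once per turn avoids a background watcher thread, at the cost of only noticing edits when the next turn starts.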
Configuration
Config file: ~/.config/mermaid/config.toml (Linux) or the platform equivalent, resolved via the directories crate.
Run mermaid init to create a default config. The full configuration surface:
# Last model picked via `--model` — used by bare `mermaid` on next start
= "ollama/qwen3-coder:30b"
[]
= "ollama"
= "qwen3-coder:30b"
= 0.7
= 4096
= "medium" # none | minimal | low | medium | high | xhigh | max
[]
= "localhost"
= 11434
# cloud_api_key = "your-key" # for :cloud models + web_search/web_fetch
# num_gpu = 10
# num_ctx = 8192
[]
= "text"
= 4096
= false
# Per-model reasoning preferences (remembered across sessions)
[]
= "high"
= "low"
# Remote providers — override env-var name, base URL, or extra headers
[]
# api_key_env = "MY_ANTHROPIC_KEY" # default: ANTHROPIC_API_KEY
[]
# api_key_env = "MY_GOOGLE_KEY" # default: GOOGLE_API_KEY
[]
# api_key_env = "MY_GROQ_KEY" # default: GROQ_API_KEY
# Custom OpenAI-compatible provider (e.g., self-hosted vLLM)
[]
= "http://192.168.1.42:8000/v1"
= "VLLM_KEY"
= "openai-effort" # openai | openai-effort | openrouter
# default_model = "Qwen/Qwen2.5-Coder-32B-Instruct"
# MCP servers — usually managed via `mermaid add <name>`
[]
= "npx"
= ["-y", "@upstash/context7-mcp"]
Remote Providers
Set the appropriate environment variable (or override via [providers.<name>].api_key_env in config):
| Provider | Env var | Example model |
|---|---|---|
| Anthropic | ANTHROPIC_API_KEY | anthropic/claude-opus-4-7 |
| Google Gemini | GOOGLE_API_KEY | gemini/gemini-3.1-pro-preview |
| OpenAI | OPENAI_API_KEY | openai/gpt-5 |
| Groq | GROQ_API_KEY | groq/qwen-qwq-32b |
| OpenRouter | OPENROUTER_API_KEY | openrouter/anthropic/claude-3.7-sonnet |
| Cerebras | CEREBRAS_API_KEY | cerebras/gpt-oss-120b |
| DeepInfra | DEEPINFRA_API_KEY | deepinfra/deepseek-ai/DeepSeek-R1 |
| Together | TOGETHER_API_KEY | together/deepseek-ai/DeepSeek-R1 |
| Ollama Cloud | OLLAMA_API_KEY | ollama/kimi-k2-thinking:cloud |
Web search and web fetch tools require an Ollama Cloud API key — set OLLAMA_API_KEY or cloud_api_key under [ollama]. Use /cloud-setup in the TUI for the full instructions.
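To make the provider/model convention in the table concrete, here is a minimal sketch of splitting a model id on its first slash and choosing which environment variable holds the key. The helper names are hypothetical, the default names are copied from the table above, and treating an api_key_env override as taking precedence is an assumption based on the config comments rather than on Mermaid's source.

```rust
/// Default API-key environment variables, copied from the provider table.
/// The `ollama` entry only matters for Ollama Cloud models and web tools.
const DEFAULT_KEY_ENV: &[(&str, &str)] = &[
    ("anthropic", "ANTHROPIC_API_KEY"),
    ("gemini", "GOOGLE_API_KEY"),
    ("openai", "OPENAI_API_KEY"),
    ("groq", "GROQ_API_KEY"),
    ("openrouter", "OPENROUTER_API_KEY"),
    ("cerebras", "CEREBRAS_API_KEY"),
    ("deepinfra", "DEEPINFRA_API_KEY"),
    ("together", "TOGETHER_API_KEY"),
    ("ollama", "OLLAMA_API_KEY"),
];

/// Split "provider/model" on the first slash only, so model names that
/// contain slashes (as on OpenRouter or DeepInfra) stay intact.
fn split_model_id(id: &str) -> Option<(&str, &str)> {
    let (provider, model) = id.split_once('/')?;
    if provider.is_empty() || model.is_empty() {
        return None;
    }
    Some((provider, model))
}

/// Pick the environment variable name for a provider: an explicit
/// override (such as api_key_env from the config) wins, otherwise the
/// table default is used. Unknown providers without an override get None.
fn key_env_for<'a>(provider: &str, override_env: Option<&'a str>) -> Option<&'a str> {
    override_env.or_else(|| {
        DEFAULT_KEY_ENV
            .iter()
            .find(|(name, _)| *name == provider)
            .map(|(_, env)| *env)
    })
}
```

The returned name could then be passed to `std::env::var` to read the key itself; that step is omitted so the sketch does not depend on the local environment.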
License
MIT OR Apache-2.0
Built with Ratatui and Ollama. Inspired by Aider and Claude Code.