do_it 0.3.2

Autonomous coding agent powered by local LLMs via Ollama. Cross-platform, no shell dependency, no cloud APIs required.
Documentation

do_it

Crates.io License: MIT

An autonomous coding agent powered by local or cloud LLMs. Reads, writes, and fixes code in your repositories. Works on Windows and Linux with no shell dependency, no Python.

Supports Ollama (local), OpenAI-compatible, and Anthropic-compatible backends — including self-hosted services and third-party providers such as MiniMax.

Inspired by mini-swe-agent — a minimal, transparent approach to software engineering agents.

do_it extends that foundation with persistent memory, multi-role orchestration, sub-agents, a live terminal UI, and an optional tool surface controlled per role.

Most of the new features were designed and implemented by Claude Sonnet 4.6.


Features

  • Pluggable LLM backends — Ollama (local), OpenAI-compatible, Anthropic-compatible; configure per project or per action type
  • Local-first option — runs entirely on your machine via Ollama, no cloud required
  • Cross-platform — Windows (MSVC) and Linux, no shell operators, no Python
  • Agent roles — focused tool sets and prompts per task type: boss, research, developer, navigator, qa, reviewer, memory
  • Role budgets — each named role has ≤ 12–14 core tools; smaller models stay focused and produce better output
  • Optional tool groupsbrowser, background, github added only when configured in config.toml
  • Sub-agent orchestrationboss delegates to specialised sub-agents via spawn_agent / spawn_agents; results flow through shared .ai/knowledge/ memory
  • Live terminal UI — three-panel Ratatui TUI: progress stats, scrollable step log, status bar; falls back to plain text in CI
  • Persistent memory.ai/ hierarchy: session notes, task plan, knowledge base, architectural decisions, lessons learned
  • Session artifacts — markdown session reports plus structured session-NNN.trace.json traces for lightweight replay, inspection, and safety diagnostics; sensitive tokens in task/summary text and write-tool output are redacted before any artifact is written
  • Telegram integrationask_human for blocking questions (TUI suspends cleanly), notify for non-blocking updates
  • GitHub integrationgithub_api tool for issues, PRs, branches, commits (optional group)
  • Browser toolsscreenshot, browser_get_text, browser_action, browser_navigate via CDP (optional group)
  • Sandboxed scripting — experimental run_script tool for quick parsing, JSON inspection, and lightweight automation
  • Model routing — different models per action type (thinking, coding, search, execution, vision)
  • Vision support — pass an image as --task for visual debugging
  • Agent self-improvement — Boss records missing capabilities to ~/.do_it/tool_wishlist.md

For Those Who Like Surprises

This is a program that writes itself.

At the beginning, it needed help — ideas and the first steps. Once everything started working, the model began improving and evolving the system on its own.

All you need is proper configuration and a sufficiently capable model, for example qwen3.5:cloud. And of course — constraints and oversight.

Just configure it and tell it what you would like to add, improve, or change.

And don't forget to check tool_wishlist.md.


Quick Start

# 1. Install
cargo install do_it

# 2. Initialise a project (interactive — choose backend, URL, model, API key)
cd /path/to/project
do_it init

# 3. Run
do_it run --task "Find and fix the bug in src/parser.rs"

# With a role (recommended for smaller models)
do_it run --task "Add input validation to handlers.rs" --role developer

# Orchestrate a complex task with sub-agents
do_it run --task "Add OAuth2 login to the API" --role boss --max-steps 80

For Ollama specifically:

ollama pull qwen3.5:cloud
do_it init --backend ollama --model qwen3.5:cloud --yes

For OpenAI or compatible service:

do_it init --backend openai --llm-url https://api.openai.com --model gpt-4o
# prompts for API key interactively

Roles

Each role restricts the agent to a focused set of tools and a role-specific system prompt. This is critical for smaller models — 12 tools instead of 30+ significantly improves output quality and reduces hallucinations.

Role Purpose Core tools
default No restrictions all tools
boss Orchestration — plans, delegates, never writes code directly memory, tree, project_map, web_search, ask_human, notify, spawn_agent/s, tool_request, capability_gap
research Information gathering web_search, fetch_url, memory, ask_human
developer Write and run code — uses navigator sub-agent for exploration read_file, write_file, str_replace, apply_patch_preview, run_command, run_targeted_test, format_changed_files_only, run_script, git_*, memory, notify
navigator Explore codebase structure — read-only read_file, list_dir, find_files, search_in_files, tree, get_symbols, outline, find_references, project_map, trace_call_path, memory
qa Run tests, coverage, check diffs, find regressions read_file, search_in_files, run_command, run_script, test_coverage, diff_repo, read_test_failure, git_*, memory, notify
reviewer Static code review — no execution read_file, search_in_files, diff_repo, git_log, get_symbols, outline, get_signature, find_references, ask_human, memory
memory Managing .ai/ state memory_read, memory_write, memory_delete
do_it roles   # list all roles and their tool allowlists

Optional tool groups

Enable additional tool sets in config.toml:

tool_groups = ["browser", "github"]   # add browser tools and GitHub API
# tool_groups = ["browser", "background", "github"]  # all optional groups
Group Tools added Roles
browser screenshot, browser_get_text, browser_action, browser_navigate boss, developer, qa, reviewer
background run_background, process_status, process_list, process_kill boss, developer
github github_api developer, qa

Sub-agent Orchestration

The boss role delegates all technical work to specialised sub-agents:

do_it run --task "Add OAuth2 login" --role boss --max-steps 80
boss: reads last_session, plan, decisions, user_profile
  │
  ├─ spawn_agents([
  │    { role: "research",  task: "find best OAuth crates for Axum",  key: "knowledge/oauth"     }
  │    { role: "navigator", task: "locate existing auth middleware",   key: "knowledge/structure" }
  │  ])                                              ← parallel, independent
  │
  ├─ spawn_agent("developer", "implement OAuth per the plan",          key: "knowledge/impl")
  ├─ spawn_agent("reviewer",  "review the OAuth implementation",       key: "knowledge/review")
  ├─ spawn_agent("qa",        "verify all tests pass",                 key: "knowledge/qa")
  └─ notify("OAuth complete") → finish

Sub-agents run in-process with isolated history and tool allowlists. The boss only reads results — it never writes code directly.


Live TUI

When running in an interactive terminal, do_it shows a three-panel live view:

┌─────────────────────────────────────────────────────────────┐
│  do_it   task: "Add OAuth2 login"   role: boss   0:03:21    │
├──────────────────────┬──────────────────────────────────────┤
│  PROGRESS            │  STEP LOG                            │
│  Step:  7 / 50  ████ │  step 1  ✓  project_map  → found    │
│  Role:  boss         │  step 2  ✓  spawn_agents → started  │
│  Elapsed: 0:03:21    │  step 3  ✓  memory_read  → loaded   │
│  ETA:   ~0:08:00     │  step 4  ·  spawn_agent  running... │
│  Tokens step:        │                                      │
│    in  2,847 out 312 │                                      │
│  Tokens total:       │                                      │
│    in 18,442 out 2k  │                                      │
├──────────────────────┴──────────────────────────────────────┤
│  boss → spawning developer for OAuth implementation  q=quit │
└─────────────────────────────────────────────────────────────┘

Keys: q / Ctrl-C = graceful stop, ↑↓ = scroll step log. Falls back to plain text output in CI or when stdout is not a TTY.


Persistent Memory

.ai/
├── project.toml           ← auto-scaffolded, edit freely
├── prompts/               ← custom role prompt overrides per project
├── state/
│   ├── current_plan.md
│   ├── last_session.md    ← agent reads this on startup
│   ├── task_state.json    ← structured working memory, survives interruption
│   └── external_messages.md  ← external inbox, cleared on startup
├── logs/
│   ├── history.md
│   ├── session-NNN.md         ← per-session markdown report: steps, tools, outcome, path-sensitivity summary
│   └── session-NNN.trace.json ← structured session trace: start, turns, finish, path-sensitivity diagnostics
└── knowledge/
    ├── lessons_learned.md
    ├── decisions.md
    └── qa_report.md

do_it status surfaces these artifacts directly:

  • session report count and latest report filenames
  • structured trace count and latest trace path
  • compact path-sensitivity summary from the latest trace, when present
  • last session note, current plan, wishlist summary, and knowledge keys

Final plain-text session close-out also shows a compact Safety : ... line when the session performed path-sensitive writes.

Global memory in ~/.do_it/ persists across all projects:

File Purpose
user_profile.md Your preferences: language, stack, workflow style
boss_notes.md Cross-project insights accumulated by Boss
tool_wishlist.md Missing capabilities recorded via tool_request / capability_gap

Configuration

# config.toml

# ── LLM backend ───────────────────────────────────────────────────────────────
# llm_backend: "ollama" | "openai" | "anthropic"
llm_backend      = "ollama"
llm_url          = "http://localhost:11434"
# llm_api_key    = ""          # or set LLM_API_KEY env var

model            = "qwen3.5:cloud"
temperature      = 0.0
max_tokens       = 4096
history_window   = 8
max_output_chars = 6000
log_level        = "info"    # error | warn | info | debug | trace
log_format       = "text"    # text | json

# Optional: enable additional tool groups
# tool_groups = ["browser", "github"]

# Optional: different models per action type
[models]
coding    = "qwen3-coder-next:cloud"
search    = "qwen3.5:9b"
execution = "qwen3.5:9b"

# Optional: Telegram
# telegram_token   = "..."
# telegram_chat_id = "..."

# Optional: browser backend
# [browser]
# cdp_url = "ws://127.0.0.1:9222"

Config priority: --config./config.toml~/.do_it/config.toml → built-in defaults.

The llm_api_key field can also be supplied via the LLM_API_KEY environment variable — useful for CI or when you don't want keys in config files.

do_it config   # show resolved config
do_it roles    # list roles and their tool counts

CLI

do_it run  --task <text|file|image>
           --repo <path>           (default: .)
           --role <role>           (default: unrestricted)
           --config <path>
           --system-prompt <text|file>
           --max-steps <n>         (default: 30)

do_it config [--config <path>]
do_it roles
do_it status [--repo <path>]

Current Status

Version: 0.3.2

Real sub-agent delegation, live TUI, role budgets, optional tool groups, and pluggable LLM backends are all working. Extended testing in progress before next crates.io publish.

Recent hardening (2026-03-28):

  • Pluggable LLM backendllm_backend = "ollama" | "openai" | "anthropic" in config.toml; API key via llm_api_key or LLM_API_KEY env var; compatible with any OpenAI- or Anthropic-compatible service
  • do_it init updated — now prompts for backend, URL, model, and API key; generates a correct config.toml for any backend
  • boss role now reliably orchestrates multi-step tasks — loop detection threshold raised
  • UTF-8 safe throughout — no panics on non-ASCII task text or file contents
  • memory_write supports append=true and namespaced keys (knowledge/decisions etc.)
  • memory_delete added
  • HTTP timeouts on all LLM backends — a hung model no longer blocks the agent forever
  • run_command output always includes stderr — cargo warnings and rustfmt diagnostics are visible
  • TUI panic hook — terminal is restored cleanly on crash

Planned next

  • More iteration helpers around the Rust-first edit/test loop (smarter diff helpers, patch-shaping)
  • Node/Python fallback branches for helper tools after the Rust-first path is solid

run_command accepts only bare executable names from PATH, enforces timeout/arg/env limits, and blocks risky environment overrides.