do_it 0.2.0

Autonomous coding agent powered by local LLMs via Ollama. Cross-platform, no shell dependency, no cloud APIs required.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
# do_it — Documentation

An autonomous coding agent powered by local LLMs via [Ollama](https://ollama.com). Reads, writes, and fixes code in your repositories. Runs on Windows and Linux with no shell dependency, no Python, no cloud APIs.

---

## Table of Contents

1. [Quick Start]#quick-start
2. [Installation]#installation
3. [Configuration]#configuration
4. [CLI Reference]#cli-reference
5. [Agent Roles]#agent-roles
6. [Tools]#tools
7. [Telegram — ask_human]#telegram--ask_human
8. [How the Agent Works]#how-the-agent-works
9. [Model Selection and Routing]#model-selection-and-routing
10. [Sub-agent Architecture (Roadmap)]#sub-agent-architecture-roadmap
11. [Limitations]#limitations
12. [Tips and Recommendations]#tips-and-recommendations
13. [Troubleshooting]#troubleshooting
14. [Project Structure]#project-structure

---

## Quick Start

```bash
# 1. Pull a model
ollama pull qwen3.5:9b

# 2. Build
cargo build --release

# 3. Run
./target/release/do_it run \
  --task "Find and fix the bug in src/parser.rs" \
  --repo /path/to/project

# With a role (recommended for smaller models)
./target/release/do_it run \
  --task "Add input validation to handlers.rs" \
  --role developer
```

Windows:
```powershell
.\target\release\do_it.exe run `
  --task "Find and fix the bug in src/parser.rs" `
  --repo C:\Projects\my-project `
  --role developer
```

---

## Installation

### Requirements

- [Rust]https://rustup.rs/ 1.85+ (edition 2024)
- [Ollama]https://ollama.com running locally

### Build from source

```bash
git clone https://github.com/oleksandrpublic/doit
cd doit
cargo build --release
```

Binary: `target/release/do_it` (Linux) / `target\release\do_it.exe` (Windows).

### Install from crates.io

```bash
cargo install do_it
```

### Ollama setup

```bash
ollama pull qwen3.5:9b   # recommended default
ollama list              # verify installed models
curl http://localhost:11434/api/tags  # verify Ollama is running
```

---

## Configuration

All configuration lives in `config.toml`. If the file is not found, the agent runs with built-in defaults.

```toml
# Ollama endpoint
ollama_base_url  = "http://localhost:11434"

# Default model — used when no role-specific override is set
model            = "qwen3.5:9b"

# Sampling temperature: 0.0 = deterministic
temperature      = 0.0

# Max tokens per LLM response
max_tokens       = 4096

# Number of recent steps to include in full in context; older steps are collapsed to one line
history_window   = 8

# Max characters in tool output before truncation
max_output_chars = 6000

# System prompt (overridden by --role and --system-prompt)
system_prompt = """..."""

# Optional: Telegram for ask_human / notifications
# telegram_token   = "1234567890:ABCdef..."
# telegram_chat_id = "123456789"

# Optional: per-role model overrides
[models]
# thinking  = "qwen3.5:9b"
# coding    = "qwen3-coder-next"
# search    = "qwen3.5:4b"
# execution = "qwen3.5:4b"
# vision    = "qwen3.5:9b"
```

### Defaults

| Field | Default |
|---|---|
| `ollama_base_url` | `http://localhost:11434` |
| `model` | `qwen3.5:9b` |
| `temperature` | `0.0` |
| `max_tokens` | `4096` |
| `history_window` | `8` |
| `max_output_chars` | `6000` |

```bash
do_it config              # print resolved config
do_it config --config custom.toml
```

---

## CLI Reference

```
do_it run     Run the agent on a task
do_it config  Print resolved config and exit
do_it roles   List all roles with their tool allowlists
```

### `run` arguments

```
--task, -t        Task text, path to a .md file, or path to an image
--repo, -r        Repository / working directory (default: .)
--config, -c      Path to config.toml (default: ./config.toml)
--role            Agent role: boss | research | developer | navigator | qa | memory
--system-prompt   Override system prompt: inline text or path to a file
--max-steps       Maximum agent steps (default: 30)
```

### Examples

```bash
# Plain task
do_it run --task "Refactor the database module"

# With role (recommended)
do_it run --task "Add rate limiting" --role developer
do_it run --task "What does this project do?" --role navigator
do_it run --task "Find docs for tower-http middleware" --role research

# Task from file
do_it run --task tasks/issue-42.md --role developer --repo ~/projects/my-app

# Screenshot with an error (vision mode)
do_it run --task error-screenshot.png --role developer

# Custom prompt for a specific task
do_it run --task "Review only, do not edit" --system-prompt prompts/reviewer.md

# More steps for complex tasks
do_it run --task "Refactor auth module" --role developer --max-steps 50
```

`--task` and `--system-prompt`: if the value is a path to an existing file, its contents are read; if the extension is an image (`.png`, `.jpg`, `.webp`, etc.), vision mode is activated.

### Logging

```bash
RUST_LOG=info  do_it run --task "..."   # default
RUST_LOG=debug do_it run --task "..."   # includes raw LLM responses
RUST_LOG=error do_it run --task "..."   # errors only
```

---

## Agent Roles

Roles restrict the agent to a focused tool set and apply a role-specific system prompt. This is critical for smaller models — 6–8 tools instead of 20+ significantly improves output quality and reduces hallucinations.

```bash
do_it roles   # print all roles and their allowlists
```

### Role table

| Role | Purpose | Key tools |
|---|---|---|
| `default` | No restrictions | all tools |
| `boss` | Orchestration and planning | `memory_read/write`, `tree`, `web_search`, `ask_human` |
| `research` | Information gathering | `web_search`, `fetch_url`, `memory_read/write`, `ask_human` |
| `developer` | Reading and writing code | `read/write_file`, `str_replace`, `run_command`, `diff_repo`, `git_*`, AST tools |
| `navigator` | Exploring codebase structure | `tree`, `list_dir`, `find_files`, `search_in_files`, `find_references`, AST tools |
| `qa` | Testing and verification | `run_command`, `read_file`, `search_in_files`, `diff_repo`, `git_status`, `git_log` |
| `memory` | Managing `.ai/` state | `memory_read`, `memory_write` |

### System prompt priority

```
--system-prompt  (highest)
  ↓
--role           → looks for .ai/prompts/<role>.md, falls back to built-in
  ↓
system_prompt in config.toml
  ↓
built-in DEFAULT_SYSTEM_PROMPT  (lowest)
```

### Custom role prompts

Create `.ai/prompts/<role>.md` in the repository root to override the built-in prompt for that role:

```bash
mkdir -p .ai/prompts
cat > .ai/prompts/developer.md << 'EOF'
You are a Rust expert working on this codebase.
Always run `cargo clippy -- -D warnings` after edits.
Use thiserror for error types, never anyhow in library code.
All public functions require doc comments.
EOF
```

The file is picked up automatically — no restart needed.

---

## Tools

All tools are implemented in native Rust (`src/tools.rs`) with no shell dependency.

### Filesystem

| Tool | Arguments | Description |
|---|---|---|
| `read_file` | `path`, `start_line?`, `end_line?` | Read file with line numbers (default: first 100 lines) |
| `write_file` | `path`, `content` | Overwrite file, create directories if needed |
| `str_replace` | `path`, `old_str`, `new_str` | Replace a unique string in a file |
| `list_dir` | `path?` | List directory contents (one level) |
| `find_files` | `pattern`, `dir?` | Find files by name: `*.rs`, `test*`, substring |
| `search_in_files` | `pattern`, `dir?`, `ext?` | Regex search across file contents |
| `tree` | `dir?`, `depth?`, `ignore?` | Recursive directory tree (ignores `target`, `.git`, etc. by default) |

### Execution

| Tool | Arguments | Description |
|---|---|---|
| `run_command` | `program`, `args[]`, `cwd?` | Run a program with explicit args array (no shell) |
| `diff_repo` | `base?`, `staged?`, `stat?` | Git diff vs HEAD or any ref |

### Git

| Tool | Arguments | Description |
|---|---|---|
| `git_status` | `short?` | Working tree status and branch info |
| `git_commit` | `message`, `files?`, `allow_empty?` | Stage files and commit |
| `git_log` | `n?`, `path?`, `oneline?` | Commit history, optionally filtered by path |
| `git_stash` | `action`, `message?`, `index?` | Stash management: `push`, `pop`, `list`, `drop`, `show` |

### Internet

| Tool | Arguments | Description |
|---|---|---|
| `web_search` | `query`, `max_results?` | Search via DuckDuckGo (no API key required) |
| `fetch_url` | `url`, `selector?` | Fetch a web page and return readable text |

### Code Intelligence

Regex-based, supports Rust, TypeScript/JavaScript, Python, C++, Kotlin. Detected by file extension.

| Tool | Arguments | Description |
|---|---|---|
| `get_symbols` | `path`, `kinds?` | List all symbols: fn, struct, class, impl, enum, trait, type, const |
| `outline` | `path` | Structural overview with signatures and line numbers |
| `get_signature` | `path`, `name`, `lines?` | Full signature + doc comment for a named symbol |
| `find_references` | `name`, `dir?`, `ext?` | All usages of a symbol across the codebase |

`kinds` filter examples: `"fn,struct"`, `"class,interface"`, `"fn,method"`.

### Memory (`.ai/` hierarchy)

| Tool | Arguments | Description |
|---|---|---|
| `memory_read` | `key` | Read a memory entry |
| `memory_write` | `key`, `content`, `append?` | Write or append to a memory entry |

**Logical key mapping:**

| Key | File |
|---|---|
| `plan` | `.ai/state/current_plan.md` |
| `last_session` | `.ai/state/last_session.md` |
| `session_counter` | `.ai/state/session_counter.txt` |
| `external` | `.ai/state/external_messages.md` |
| `history` | `.ai/logs/history.md` |
| `knowledge/<n>` | `.ai/knowledge/<n>.md` |
| `prompts/<n>` | `.ai/prompts/<n>.md` |
| any other key | `.ai/knowledge/<key>.md` |

### Communication

| Tool | Arguments | Description |
|---|---|---|
| `ask_human` | `question` | Send a question via Telegram (if configured) or console |
| `finish` | `summary`, `success` | Signal task completion |

---

## Telegram — ask_human

### Setup

1. Create a bot via [@BotFather]https://t.me/BotFather → copy the token
2. Send `/start` to your bot
3. Find your `chat_id` via [@userinfobot]https://t.me/userinfobot
4. Add to `config.toml`:

```toml
telegram_token   = "1234567890:ABCdef-ghijklmnop"
telegram_chat_id = "123456789"
```

### How it works

When the agent calls `ask_human`:
1. Sends the question to Telegram
2. Polls every 5 seconds, waits up to 5 minutes for a reply
3. Your reply is returned to the agent as the tool result
4. If Telegram is not configured or unreachable — falls back to console stdin

---

## How the Agent Works

### Main loop (`src/agent.rs`)

```
session_init():
  → increment .ai/state/session_counter.txt
  → read last_session.md → inject as step 0 in history

if --task is an image:
  → vision model describes it → description becomes effective_task

for each step 1..max_steps:
  1. thinking model → JSON { thought, tool, args }
  2. check role tool allowlist (if role != Default)
  3. if specialist model differs → re-call with specialist
  4. execute tool
  5. record in history
  6. if tool == "finish" → done
```

### Context window (`src/history.rs`)

- Last `history_window` steps in full
- Older steps collapsed to one line: `step N ✓ [tool] → first line of output`

### LLM response parsing

Finds the first `{` and last `}` — handles models that add prose before or after JSON. Strips ` ```json ` fences automatically.

### Tool error handling

A tool error does not stop the agent — it is recorded in history as a failed step. The agent can recover and try a different approach on the next step.

---

## Model Selection and Routing

### Recommended models (qwen3.5 family)

| Model | Size | Use case |
|---|---|---|
| `qwen3.5:4b` | 3.4 GB | Roles with ≤8 tools (navigator, research) |
| `qwen3.5:9b` | 6.6 GB | Default — good balance |
| `qwen3.5:27b` | 17 GB | Complex multi-step tasks |
| `qwen3-coder-next` | ~52 GB | Developer role, best code quality |

### Per-role model routing

```toml
model = "qwen3.5:9b"   # fallback for all roles

[models]
coding    = "qwen3-coder-next"   # used for write_file, str_replace
search    = "qwen3.5:4b"        # used for read_file, find_files, etc.
execution = "qwen3.5:4b"        # used for run_command
```

### Remote Ollama

```toml
ollama_base_url = "http://192.168.1.100:11434"
```

---

## Sub-agent Architecture (Roadmap)

The current role system is the foundation for full sub-agent orchestration.

### Planned: `spawn_agent(role, task)`

A new tool that lets the `boss` agent delegate subtasks to specialised sub-agents. Each sub-agent runs with its own history, role prompt, and tool allowlist. The result is returned as a string to the boss's history.

### Execution flow (planned)

```
do_it run --task "Add OAuth" --role boss
  │
  ├─ boss: reads plan, decomposes task
  │
  ├─ spawn_agent(role="research", task="find best OAuth crates for Axum")
  │    └─ research agent → memory_write("knowledge/oauth_crates", ...)
  │
  ├─ spawn_agent(role="navigator", task="find current auth middleware location")
  │    └─ navigator agent → returns structure summary
  │
  ├─ spawn_agent(role="developer", task="implement OAuth per the plan")
  │    └─ developer agent → writes code, runs tests
  │
  └─ spawn_agent(role="qa", task="verify all tests pass")
       └─ qa agent → runs tests → writes report
```

### `.ai/` memory hierarchy (already implemented)

```
.ai/
├── prompts/               ← custom role prompts (override built-ins per project)
│   ├── boss.md
│   ├── developer.md
│   └── qa.md
├── state/
│   ├── current_plan.md    ← boss writes the task plan here
│   ├── last_session.md    ← read on startup, written at end of session
│   ├── session_counter.txt
│   └── external_messages.md
├── logs/
│   └── history.md
└── knowledge/             ← agent-written notes about the project
```

---

## Limitations

**Context window** — for long sessions reduce `history_window` to 4–5 and `max_output_chars` to 3000.

**`find_files`** — simple patterns only: `*.rs`, `test*`, substring. No `**`, `{a,b}`, or `?`.

**`run_command`** — no `|`, `&&`, `>`. Each command is a separate call with an explicit args array.

**`fetch_url`** — public URLs only, no authentication, no JavaScript rendering. For GitHub use `raw.githubusercontent.com`.

**`web_search`** — DuckDuckGo HTML endpoint, no API key required. Rate limiting possible under heavy use.

**Code intelligence** — regex-based, covers ~95% of real-world cases. Does not handle macros, conditional compilation, or dynamically generated code.

**Vision** — qwen3.5 supports images, but current GGUF files in Ollama may not include the mmproj component. Use llama.cpp directly for guaranteed vision support.

---

## Tips and Recommendations

### Choose the right role

```bash
do_it run --task "What does this project do?"            --role navigator
do_it run --task "Find examples of using axum extractors" --role research
do_it run --task "Add validation to src/handlers.rs"     --role developer
do_it run --task "Do all tests pass?"                    --role qa
do_it run --task "Plan the auth module refactor"         --role boss
```

### Before running

```bash
git status           # start from a clean working tree
ollama list          # verify required models are available
```

### Write a good task description

```bash
# Vague — bad
do_it run --task "Improve the code"

# Specific — good
do_it run --task "In src/lexer.rs the tokenize function returns an empty Vec \
  for input containing only spaces. Fix it and add a regression test." --role developer

# From a file with full context
do_it run --task tasks/issue-42.md --role developer
```

A task file can include reproduction steps, logs, expected vs actual behaviour, and any relevant constraints.

### Project-specific role prompts

```bash
mkdir -p .ai/prompts
cat > .ai/prompts/developer.md << 'EOF'
You are a Rust expert on this codebase.
Always run `cargo clippy -- -D warnings` after edits.
Always run `cargo test` after edits.
Use thiserror for error types in library code.
All public items require doc comments.
EOF
```

---

## Troubleshooting

**`Tool 'X' is not allowed for role 'Y'`** — the model tried to use a tool outside the role's allowlist. Either switch to `--role default` or add the tool to `.ai/prompts/<role>.md` with an explicit mention.

**`Cannot reach Ollama at http://localhost:11434`**
```bash
ollama serve
curl http://localhost:11434/api/tags
```

**`Model 'X' not found`**
```bash
ollama pull qwen3.5:9b
ollama list
```

**`LLM response has no JSON`** — the model responded outside of JSON format. Try a larger model, set `temperature = 0.0`, or use a role to reduce the tool count. Add `IMPORTANT: respond ONLY with a JSON object.` to the system prompt.

**`str_replace: old_str found N times`** — provide more context to make `old_str` unique:
```json
{ "old_str": "fn process(x: i32) -> i32 {\n    x + 1", "new_str": "..." }
```

**`ask_human via Telegram: no reply received within 5 minutes`** — verify the bot token and chat_id, and that you have sent `/start` to the bot. The agent continues with an error and falls back to console.

**`fetch_url: HTTP 403` or empty result** — the site blocks bots or requires JavaScript. Use direct API endpoints instead: `raw.githubusercontent.com` instead of `github.com`.

**Agent is looping** — reduce `history_window` to 4, restart with a more specific task, or use a larger model.

---

## Project Structure

```
do_it/
├── Cargo.toml           name="do_it", edition="2024", version="0.2.0"
├── config.toml          runtime configuration (models, Telegram, prompt)
├── README.md            project overview
├── DOCS.md              this file
├── LICENSE              MIT
├── .gitignore
├── .ai/                 agent memory (created automatically, gitignored)
│   ├── prompts/         custom role prompts
│   ├── state/           plan, last_session, session_counter
│   ├── logs/            history.md
│   └── knowledge/       agent-written project notes
└── src/
    ├── main.rs          CLI: run | config | roles; --role flag; image detection
    ├── agent.rs         main loop, role enforcement, tool allowlist, session_init
    ├── config.rs        AgentConfig, Role enum + allowlists + built-in prompts, ModelRouter
    ├── history.rs       sliding window context manager
    ├── shell.rs         Ollama HTTP client: chat, chat_with_image, check_models
    └── tools.rs         all 19 ACI tools; 6-language AST parsers
```

### Dependencies

| Crate | Purpose |
|---|---|
| `tokio` | async runtime |
| `reqwest` | HTTP — Ollama, fetch_url, web_search, Telegram API |
| `serde` / `serde_json` | JSON serialization |
| `toml` | config.toml parsing |
| `clap` | CLI argument parsing |
| `walkdir` | recursive filesystem traversal |
| `regex` | search_in_files, find_references, AST parsers |
| `base64` | image encoding for vision |
| `anyhow` | error handling |
| `tracing` / `tracing-subscriber` | structured logging |