xcodeai 2.0.1

Autonomous AI coding agent — zero human intervention, sbox sandboxed, OpenAI-compatible
# xcodeai

[![Crates.io](https://img.shields.io/crates/v/xcodeai.svg)](https://crates.io/crates/xcodeai)
[![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)

Fully autonomous AI coding agent in Rust. Give it a task — it writes the code, runs the tools, and finishes without asking for permission.

Built as a complete, lightweight replacement for [opencode](https://opencode.ai): no TUI dependency, works on Termux, and controllable via HTTP API for chat-interface integrations (e.g. vibe coding from 企业微信/WeChat Work).

---

## Install

```bash
cargo install xcodeai
```

Requires Rust 1.75+. The binary is placed at `~/.cargo/bin/xcodeai`.

---

## Quick Start

```bash
export XCODE_API_KEY=sk-your-key

xcodeai run "Create a hello world HTTP server in main.rs" --project ./myproject --no-sandbox
```

That's it. The agent loops until the task is done.

---

## Configuration

xcodeai looks for a config file at `~/.config/xcode/config.json`. On first run, a default template is created automatically.

```json
{
  "provider": {
    "api_base": "https://api.openai.com/v1",
    "api_key": "sk-..."
  },
  "model": "gpt-4o",
  "sandbox": {
    "enabled": false
  },
  "agent": {
    "max_iterations": 25,
    "max_tool_calls_per_response": 10
  }
}
```

### Environment Variables

| Variable | Description | Example |
|---|---|---|
| `XCODE_API_KEY` | API key | `sk-abc123` |
| `XCODE_API_BASE` | Provider base URL | `https://api.deepseek.com/v1` |
| `XCODE_MODEL` | Model name | `deepseek-chat` |

### Precedence (low → high)

```
defaults → config file → env vars → CLI flags
```

CLI flags always win.
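
The layering amounts to a last-writer-wins merge. A minimal Python sketch (illustrative only — the real resolution lives in the Rust config code):

```python
def resolve_config(defaults, config_file, env_vars, cli_flags):
    """Merge configuration layers; later (higher-precedence) layers win."""
    merged = dict(defaults)
    for layer in (config_file, env_vars, cli_flags):  # low -> high
        # Unset values (None) in a layer do not override lower layers.
        merged.update({k: v for k, v in layer.items() if v is not None})
    return merged

cfg = resolve_config(
    defaults={"model": "gpt-4o", "api_base": "https://api.openai.com/v1"},
    config_file={"model": "deepseek-chat", "api_base": "https://api.deepseek.com/v1"},
    env_vars={"model": None},                  # XCODE_MODEL unset
    cli_flags={"model": "deepseek-reasoner"},  # --model wins
)
print(cfg["model"])     # -> deepseek-reasoner
print(cfg["api_base"])  # -> https://api.deepseek.com/v1
```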

---

## Usage

### Interactive REPL (default)

Just run `xcodeai` with no arguments to enter the interactive loop:

```bash
xcodeai
```

```
  ✦ xcodeai v2.0.1  ·  gpt-4o  ·  /home/user/myproject  ·  no auth
  Type your task. /help for commands. Ctrl-D to exit.
──────────────────────────────────────────────────────────────────
xcodeai› Add error handling to all functions in lib.rs
(agent runs autonomously...)
──────────────────────────────────────────────────────────────────
  ✓ done · 3 iterations · 7 tool calls
──────────────────────────────────────────────────────────────────

xcodeai› Now write tests for the new error handling
(agent runs...)
```

Each session is saved automatically — use `/session` to see the session ID.

You can pass the same flags as `run`:

```bash
xcodeai --project ./mylib --model deepseek-chat --no-sandbox
```

Pass `--no-markdown` to disable terminal markdown rendering:

```bash
xcodeai --no-markdown
```

#### REPL special commands

| Command | Effect |
|---|---|
| `/plan` | Switch to **Plan mode** — discuss & clarify your task with the LLM (no file writes) |
| `/act` | Switch back to **Act mode** — full tool execution |
| `/undo` | Undo the last Act-mode run (restores git state via `git stash pop`) |
| `/undo N` | Undo the last N runs |
| `/undo list` | Show the undo history for this session |
| `/login` | GitHub Copilot device-code OAuth (browser + code) |
| `/logout` | Remove saved Copilot credentials |
| `/connect` | Interactive provider selector — pick from built-in presets |
| `/model [name]` | Show current model or switch immediately (`/model gpt-4o`) |
| `/session` | Browse history or start a new session |
| `/clear` | Start a fresh session (same as "New session" in `/session`) |
| `/compact` | Summarise conversation history to reduce token usage |
| `/help` | Show all commands + current mode |
| `/exit` / `/quit` / `/q` | Exit xcodeai |
| `Ctrl+C` | Clear current input line |
| `Ctrl+D` | Exit xcodeai |

#### Plan Mode

Plan mode lets you have a free-form discussion with the LLM to clarify your task before executing anything.

```
xcodeai› /plan
  ⟳ Switched to Plan mode — discuss your task freely. /act to execute.

[plan] xcodeai› I want to refactor the database module but I'm not sure whether to
              use the repository pattern or keep it procedural.

(LLM discusses tradeoffs, asks clarifying questions, produces a plan…)

[plan] xcodeai› Let's go with the repository pattern. Generate the plan.

(LLM outlines exact steps…)

[plan] xcodeai› /act
  ⟳ Switched to Act mode — ready to execute.

xcodeai› Go ahead and implement the plan.

(agent executes autonomously, with context from the discussion above)
```

#### Multi-Step Undo

xcodeai records a git stash entry for every Act-mode agent run. You can rewind multiple steps:

```
xcodeai› /undo          # undo the most recent run
xcodeai› /undo 3        # undo the last 3 runs
xcodeai› /undo list     # see undo history (up to 10 entries)
```

Undo requires the project directory to be a git repository.

### Run a coding task

```bash
# Basic — uses config file for provider settings
xcodeai run "Add error handling to all functions in lib.rs" --project ./mylib

# Override provider inline
xcodeai run "Write tests for src/parser.rs" \
  --project . \
  --provider-url https://api.deepseek.com/v1 \
  --api-key sk-xxx \
  --model deepseek-chat

# Skip sandbox (direct execution)
xcodeai run "Refactor the database module" --project . --no-sandbox

# Disable markdown rendering in output
xcodeai run "Summarise all TODOs" --project . --no-markdown

# All flags
xcodeai run --help
```

### HTTP API Server

Start xcodeai as an HTTP server — useful for chat-interface integrations (企业微信/WeChat Work, web UIs, scripts):

```bash
xcodeai serve                    # listens on 0.0.0.0:8080 (default)
xcodeai serve --addr 127.0.0.1:9090
```

#### Endpoints

| Method | Path | Description |
|---|---|---|
| `POST` | `/sessions` | Create a new session |
| `GET` | `/sessions` | List recent sessions (latest 50) |
| `GET` | `/sessions/:id` | Get one session with its message history |
| `DELETE` | `/sessions/:id` | Delete a session |
| `POST` | `/sessions/:id/messages` | Send a message and stream agent output via SSE |

#### SSE Event Types

`POST /sessions/:id/messages` returns a `text/event-stream` response. Each event has a named type:

| Event name | Data fields | Meaning |
|---|---|---|
| `status` | `{"msg": "..."}` | Agent progress update or final response text |
| `tool_call` | `{"name": "...", "args": "..."}` | Agent is about to call a tool |
| `tool_result` | `{"preview": "...", "is_error": bool}` | Result of a tool call |
| `error` | `{"msg": "..."}` | Agent-level error (not a tool error) |
| `complete` | `{}` | Agent finished; stream ends |
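
A stream shaped like the table above takes only a few lines to parse on the client side. A sketch (the sample payload below is illustrative, not captured output):

```python
import json

def parse_sse_events(raw: str):
    """Split a text/event-stream body into (event_name, data) pairs."""
    events = []
    for block in raw.strip().split("\n\n"):
        name, data = "message", "{}"          # SSE defaults
        for line in block.splitlines():
            if line.startswith("event:"):
                name = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data = line[len("data:"):].strip()
        events.append((name, json.loads(data)))
    return events

sample = """\
event: status
data: {"msg": "thinking..."}

event: tool_call
data: {"name": "file_write", "args": "{}"}

event: tool_result
data: {"preview": "wrote fib.rs", "is_error": false}

event: complete
data: {}
"""

for name, data in parse_sse_events(sample):
    print(name, data)
```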

#### Example curl session

```bash
# Create a session
SESSION=$(curl -s -X POST http://localhost:8080/sessions \
  -H 'Content-Type: application/json' \
  -d '{"title":"my task"}' | jq -r .session_id)

# Run an agent task, streaming output
curl -N http://localhost:8080/sessions/$SESSION/messages \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{"content":"Create a Fibonacci function in fib.rs"}'

# List all sessions
curl http://localhost:8080/sessions

# Get session with history
curl http://localhost:8080/sessions/$SESSION

# Delete it
curl -X DELETE http://localhost:8080/sessions/$SESSION
```

#### Image Attachments

Send image files alongside a message using the `images` field:

```bash
curl -X POST http://localhost:8080/sessions/$SESSION/messages \
  -H 'Content-Type: application/json' \
  -d '{"content":"Implement the UI shown in this screenshot","images":["/path/to/screenshot.png"]}'
```

Images are read from disk and base64-encoded before being sent to the LLM. All three built-in providers (OpenAI, Anthropic, Gemini) support multimodal image input.
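
The server-side handling of each `images` entry boils down to read-then-encode. A rough sketch (hypothetical helper, not the actual Rust code):

```python
import base64
import mimetypes

def encode_image(path: str) -> str:
    """Read an image from disk and base64-encode it as a data URL,
    roughly what the server does before forwarding to the LLM."""
    mime = mimetypes.guess_type(path)[0] or "image/png"
    with open(path, "rb") as f:
        payload = base64.b64encode(f.read()).decode("ascii")
    return f"data:{mime};base64,{payload}"
```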

### Session management

```bash
# List recent sessions
xcodeai session list

# Show full conversation for a session
xcodeai session show <session-id>
```

---

## Supported Providers

Any OpenAI-compatible API endpoint works, plus native Anthropic and Gemini support:

| Provider | api_base | Notes |
|---|---|---|
| OpenAI | `https://api.openai.com/v1` | GPT-4o, o1, etc. |
| Anthropic | `https://api.anthropic.com` | Claude 3.x — native, not OpenAI-compat |
| Gemini | `https://generativelanguage.googleapis.com` | Gemini 1.5+ — native |
| DeepSeek | `https://api.deepseek.com/v1` | OpenAI-compat |
| Qwen (Alibaba Cloud) | `https://dashscope.aliyuncs.com/compatible-mode/v1` | OpenAI-compat |
| GLM (Zhipu AI) | `https://open.bigmodel.cn/api/paas/v4` | OpenAI-compat |
| Local (Ollama) | `http://localhost:11434/v1` | OpenAI-compat |
| **GitHub Copilot** | `copilot` (special sentinel) | Device-code OAuth, no API key needed |

### GitHub Copilot Authentication

xcodeai supports your GitHub Copilot subscription via device-code OAuth — no separate API key needed.

```bash
# 1. Start xcodeai
xcodeai --provider-url copilot

# 2. In the REPL, authenticate:
xcodeai› /login

# GitHub shows a code and URL:
#   Visit: https://github.com/login/device
#   Enter code:  XXXX-XXXX
#
# After you approve in the browser:
#   ✓ Logged in to GitHub Copilot.

# 3. Now run tasks normally
xcodeai› Write a Fibonacci function in main.rs
```

The OAuth token is saved to `~/.config/xcode/copilot_auth.json`. Future sessions authenticate automatically. The short-lived Copilot API token (~25 min TTL) is refreshed transparently.
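
The transparent refresh is a check-before-use pattern, roughly like this sketch (hypothetical names; the real client lives in `src/auth/`):

```python
import time

class ShortLivedToken:
    """Cache a short-lived API token and refresh it transparently on expiry."""

    def __init__(self, fetch, ttl_seconds=25 * 60):
        self._fetch = fetch        # exchanges the saved OAuth token for an API token
        self._ttl = ttl_seconds
        self._token = None
        self._expires_at = 0.0

    def get(self, now=None):
        now = time.time() if now is None else now
        if self._token is None or now >= self._expires_at:
            self._token = self._fetch()        # refresh only when expired
            self._expires_at = now + self._ttl
        return self._token
```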

```bash
# Remove saved credentials
xcodeai› /logout
```

---

## Tools

The agent has access to built-in tools plus optional Git, LSP, MCP, and orchestration tools:

### Built-in Tools

| Tool | Description | Key Parameters |
|---|---|---|
| `file_read` | Read file content with line numbers | `path`, `offset`, `limit` |
| `file_write` | Write or create a file | `path`, `content` |
| `file_edit` | Replace a string in a file | `path`, `old_string`, `new_string` |
| `bash` | Execute a shell command | `command`, `timeout` (default 120s) |
| `glob_search` | Find files by glob pattern | `pattern`, `path` (max 100 results) |
| `grep_search` | Search file contents by regex | `pattern`, `path`, `include` (max 200 matches) |
| `question` | Ask the user a clarifying question | `question` |

### Git Tools

| Tool | Description |
|---|---|
| `git_status` | Show working tree status |
| `git_diff` | Show staged/unstaged changes |
| `git_log` | Show recent commit history |
| `git_commit` | Stage all changes and create a commit |
| `git_checkout` | Switch branches or restore files |

### LSP Tools

| Tool | Description |
|---|---|
| `lsp_hover` | Get type info / docs at a position |
| `lsp_definition` | Jump to symbol definition |
| `lsp_references` | Find all usages of a symbol |
| `lsp_diagnostics` | Get errors and warnings from the language server |
| `lsp_rename` | Rename a symbol across the whole project |

### Orchestration Tools (spawn_task)

The `spawn_task` tool lets the agent delegate sub-tasks to child agents — enabling multi-agent workflows with up to 3 levels of nesting.

```
Parent agent
  └── spawn_task("Write all unit tests")
        └── Child agent (full tool access)
```
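
The nesting limit can be enforced with a simple depth counter, roughly (illustrative sketch, not the real implementation):

```python
MAX_DEPTH = 3  # up to 3 levels of nesting

def run_agent(task, depth=1):
    """Run one agent; its spawn_task tool delegates to a child at depth + 1."""
    def spawn_task(subtask):
        if depth >= MAX_DEPTH:
            return "error: spawn_task nesting limit reached"
        return run_agent(subtask, depth + 1)

    # Stand-in for the real LLM <-> tool loop: delegate once, then report.
    child_report = spawn_task(f"sub-task of: {task}")
    return f"[depth {depth}] {task} | child: {child_report}"

print(run_agent("write all unit tests"))
```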

### MCP Tools

xcodeai can connect to any [Model Context Protocol](https://modelcontextprotocol.io) server, automatically registering all tools the server exposes.

Configure in `~/.config/xcode/config.json`:
```json
{
  "mcp": {
    "servers": [
      { "name": "my-server", "command": "npx", "args": ["-y", "@my/mcp-server"] }
    ]
  }
}
```

---

## Agent Architecture

```
xcodeai run "task"
        │
        ▼
    Director
        │
        ▼
      Coder ◄────────────────────┐
        │                        │
        ▼                        │
  LLM call (streaming SSE)       │
        │                        │
        ▼                        │
  tool_calls?                    │
  ├── yes → execute tools ───────┘
  └── no  → task complete
```

- **Director** — entry point, creates the CoderAgent and executes the task
- **Coder** — runs the LLM ↔ tool loop until no more tool calls or `max_iterations` reached
- **Context management** — keeps system prompt + last N messages when approaching the context window limit
- **Compact mode** — `/compact` summarises conversation history to reduce token usage
- **Session persistence** — every run is stored in SQLite at `~/.local/share/xcode/sessions.db`
- **Token tracking** — prompt/completion/total tokens displayed after each run
- **AGENTS.md** — place an `AGENTS.md` file in your project root to inject project-specific instructions into the system prompt
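
The Director/Coder loop above reduces to a few lines of pseudocode. A Python sketch with a fake provider (names and message shapes are illustrative, not the actual Rust internals):

```python
def run_agent(llm, tools, messages, max_iterations=25):
    """Minimal sketch of the LLM <-> tool loop: call, execute tools, repeat."""
    for _ in range(max_iterations):
        reply = llm(messages)  # one (streaming) LLM call
        messages.append({"role": "assistant", "content": reply["content"]})
        if not reply.get("tool_calls"):        # no tool calls => task complete
            return reply["content"]
        for call in reply["tool_calls"]:       # execute each requested tool
            result = tools[call["name"]](**call["args"])
            messages.append({"role": "tool", "content": result})
    return "stopped: max_iterations reached"

# Fake provider: first asks for a file write, then finishes.
script = iter([
    {"content": "", "tool_calls": [{"name": "file_write",
                                    "args": {"path": "fib.rs", "content": "fn fib() {}"}}]},
    {"content": "done", "tool_calls": []},
])
tools = {"file_write": lambda path, content: f"wrote {path}"}
print(run_agent(lambda msgs: next(script), tools, []))  # -> done
```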

---

## Sandboxing

By default, xcodeai runs tools directly in the project directory. Optionally, install [sbox](https://github.com/CVPaul/sbox) for rootless user-space session isolation:

```json
{ "sandbox": { "enabled": true, "sbox_path": "/usr/local/bin/sbox" } }
```

Or disable per-run:

```bash
xcodeai run "task" --no-sandbox
```

---

## Development

```bash
git clone <repo>
cd xcode

# Build
export PATH="$HOME/.cargo/bin:$PATH"
cargo build

# Run tests (671 total)
cargo test

# Release binary
cargo build --release
./target/release/xcodeai --help

# Lint (zero warnings enforced)
cargo clippy -- -D warnings
cargo fmt --check
```

### Project Structure

```
src/
├── main.rs            CLI entry point (clap) + serve_command()
├── lib.rs             Public API surface for integration tests
├── config.rs          Config loading with env/CLI overrides
├── context.rs         AgentContext — shared agent state
├── agent/             Director + CoderAgent loop + AGENTS.md loader
├── auth/              GitHub Copilot device-code OAuth
├── http/              HTTP API server (axum) — serve subcommand
│   ├── mod.rs         AppState + start_server()
│   └── routes.rs      REST + SSE route handlers
├── io/                AgentIO trait for pluggable output
│   ├── mod.rs         AgentIO trait + NullIO + AutoApproveIO
│   ├── terminal.rs    TerminalIO — REPL/run output with markdown rendering
│   └── http.rs        HttpIO — SSE event channel for HTTP API
├── llm/               LLM providers + streaming
│   ├── mod.rs         LlmProvider trait, Message, ContentPart (multimodal)
│   ├── openai.rs      OpenAI / OpenAI-compat SSE client
│   ├── anthropic.rs   Anthropic native SSE client
│   ├── gemini.rs      Gemini native SSE client
│   ├── registry.rs    ProviderRegistry — select provider by URL
│   └── retry.rs       RetryingLlmProvider — exponential backoff
├── tools/             Tool trait + registry + all tools
│   ├── mod.rs         ToolRegistry, ToolContext
│   ├── bash.rs, file_*.rs, glob_search.rs, grep_search.rs, question.rs
│   ├── git/           Git tools (status, diff, log, commit, checkout)
│   ├── lsp/           LSP tools (hover, definition, references, diagnostics, rename)
│   ├── mcp_resource.rs MCP resource-read tool
│   └── spawn_task.rs  Multi-agent orchestration tool
├── session/           Session types + SQLite store + undo history
├── sandbox/           SboxSession + NoSandbox implementations
├── repl/              Interactive REPL loop + slash command dispatch
├── lsp/               LSP client (JSON-RPC 2.0 over stdio)
├── mcp/               MCP client (JSON-RPC 2.0 over stdio)
├── orchestrator/      Multi-step task graph executor
├── tracking.rs        Token usage tracking
└── ui.rs              Console styling helpers
tests/
├── mock_llm_server.rs   axum mock SSE server for integration tests
├── helpers.rs           Shared test utilities
├── e2e_run.rs           End-to-end integration tests
└── http_integration.rs  HTTP API integration tests
```

---

## License

MIT — see [LICENSE](LICENSE).