# xcodeai
[](https://crates.io/crates/xcodeai)
[](LICENSE)
Fully autonomous AI coding agent in Rust. Give it a task — it writes the code, runs the tools, and finishes without asking for permission.
Built as a complete, lightweight replacement for [opencode](https://opencode.ai): no TUI dependency, works on Termux, controllable via HTTP API for chat-interface integrations (e.g. 企业微信 vibe coding).
---
## Install
```bash
cargo install xcodeai
```
Requires Rust 1.75+. The binary is placed at `~/.cargo/bin/xcodeai`.
---
## Quick Start
```bash
export XCODE_API_KEY=sk-your-key
xcodeai run "Create a hello world HTTP server in main.rs" --project ./myproject --no-sandbox
```
That's it. The agent loops until the task is done.
---
## Configuration
xcodeai looks for a config file at `~/.config/xcode/config.json`. On first run, a default template is created automatically.
```json
{
"provider": {
"api_base": "https://api.openai.com/v1",
"api_key": "sk-..."
},
"model": "gpt-4o",
"sandbox": {
"enabled": false
},
"agent": {
"max_iterations": 25,
"max_tool_calls_per_response": 10
}
}
```
### Environment Variables
| `XCODE_API_KEY` | API key | `sk-abc123` |
| `XCODE_API_BASE` | Provider base URL | `https://api.deepseek.com/v1` |
| `XCODE_MODEL` | Model name | `deepseek-chat` |
### Precedence (low → high)
```
defaults → config file → env vars → CLI flags
```
CLI flags always win.
---
## Usage
### Interactive REPL (default)
Just run `xcodeai` with no arguments to enter the interactive loop:
```bash
xcodeai
```
```
✦ xcodeai v2.0.0 · gpt-4o · /home/user/myproject · no auth
Type your task. /help for commands. Ctrl-D to exit.
──────────────────────────────────────────────────────────────────
xcodeai› Add error handling to all functions in lib.rs
(agent runs autonomously...)
──────────────────────────────────────────────────────────────────
✓ done · 3 iterations · 7 tool calls
──────────────────────────────────────────────────────────────────
xcodeai› Now write tests for the new error handling
(agent runs...)
```
Each session is saved automatically — use `/session` to see the session ID.
You can pass the same flags as `run`:
```bash
xcodeai --project ./mylib --model deepseek-chat --no-sandbox
```
Pass `--no-markdown` to disable terminal markdown rendering:
```bash
xcodeai --no-markdown
```
#### REPL special commands
| `/plan` | Switch to **Plan mode** — discuss & clarify your task with the LLM (no file writes) |
| `/act` | Switch back to **Act mode** — full tool execution |
| `/undo` | Undo the last Act-mode run (restores git state via `git stash pop`) |
| `/undo N` | Undo the last N runs |
| `/undo list` | Show the undo history for this session |
| `/login` | GitHub Copilot device-code OAuth (browser + code) |
| `/logout` | Remove saved Copilot credentials |
| `/connect` | Interactive provider selector — pick from built-in presets |
| `/model [name]` | Show current model or switch immediately (`/model gpt-4o`) |
| `/session` | Browse history or start a new session |
| `/clear` | Start a fresh session (same as "New session" in `/session`) |
| `/compact` | Summarise conversation history to reduce token usage |
| `/help` | Show all commands + current mode |
| `/exit` / `/quit` / `/q` | Exit xcodeai |
| `Ctrl+C` | Clear current input line |
| `Ctrl+D` | Exit xcodeai |
#### Plan Mode
Plan mode lets you have a free-form discussion with the LLM to clarify your task before executing anything.
```
xcodeai› /plan
⟳ Switched to Plan mode — discuss your task freely. /act to execute.
[plan] xcodeai› I want to refactor the database module but I'm not sure whether to
use the repository pattern or keep it procedural.
(LLM discusses tradeoffs, asks clarifying questions, produces a plan…)
[plan] xcodeai› Let's go with the repository pattern. Generate the plan.
(LLM outlines exact steps…)
[plan] xcodeai› /act
⟳ Switched to Act mode — ready to execute.
xcodeai› Go ahead and implement the plan.
(agent executes autonomously, with context from the discussion above)
```
#### Multi-Step Undo
xcodeai records a git stash entry for every Act-mode agent run. You can rewind multiple steps:
```
xcodeai› /undo # undo the most recent run
xcodeai› /undo 3 # undo the last 3 runs
xcodeai› /undo list # see undo history (up to 10 entries)
```
Undo requires the project directory to be a git repository.
### Run a coding task
```bash
# Basic — uses config file for provider settings
xcodeai run "Add error handling to all functions in lib.rs" --project ./mylib
# Override provider inline
xcodeai run "Write tests for src/parser.rs" \
--project . \
--provider-url https://api.deepseek.com/v1 \
--api-key sk-xxx \
--model deepseek-chat
# Skip sandbox (direct execution)
xcodeai run "Refactor the database module" --project . --no-sandbox
# Disable markdown rendering in output
xcodeai run "Summarise all TODOs" --project . --no-markdown
# All flags
xcodeai run --help
```
### HTTP API Server
Start xcodeai as an HTTP server — useful for chat-interface integrations (企业微信, web UIs, scripts):
```bash
xcodeai serve # listens on 0.0.0.0:8080 (default)
xcodeai serve --addr 127.0.0.1:9090
```
#### Endpoints
| `POST` | `/sessions` | Create a new session |
| `GET` | `/sessions` | List recent sessions (latest 50) |
| `GET` | `/sessions/:id` | Get one session with its message history |
| `DELETE` | `/sessions/:id` | Delete a session |
| `POST` | `/sessions/:id/messages` | Send a message and stream agent output via SSE |
#### SSE Event Types
`POST /sessions/:id/messages` returns a `text/event-stream` response. Each event has a named type:
| `status` | `{"msg": "..."}` | Agent progress update or final response text |
| `tool_call` | `{"name": "...", "args": "..."}` | Agent is about to call a tool |
| `tool_result` | `{"preview": "...", "is_error": bool}` | Result of a tool call |
| `error` | `{"msg": "..."}` | Agent-level error (not a tool error) |
| `complete` | `{}` | Agent finished; stream ends |
#### Example curl session
```bash
# Create a session
SESSION=$(curl -s -X POST http://localhost:8080/sessions \
-H 'Content-type: application/json' \
# Run an agent task, streaming output
curl -N http://localhost:8080/sessions/$SESSION/messages \
-X POST \
-H 'Content-type: application/json' \
-d '{"content":"Create a Fibonacci function in fib.rs"}'
# List all sessions
curl http://localhost:8080/sessions
# Get session with history
curl http://localhost:8080/sessions/$SESSION
# Delete it
curl -X DELETE http://localhost:8080/sessions/$SESSION
```
#### Image Attachments
Send image files alongside a message using the `images` field:
```bash
curl -X POST http://localhost:8080/sessions/$SESSION/messages \
-H 'Content-type: application/json' \
-d '{"content":"Implement the UI shown in this screenshot","images":["/path/to/screenshot.png"]}'
```
Images are read from disk and base64-encoded before being sent to the LLM. All three built-in providers (OpenAI, Anthropic, Gemini) support multimodal image input.
### Session management
```bash
# List recent sessions
xcodeai session list
# Show full conversation for a session
xcodeai session show <session-id>
```
---
## Supported Providers
Any OpenAI-compatible API endpoint works, plus native Anthropic and Gemini support:
| OpenAI | `https://api.openai.com/v1` | GPT-4o, o1, etc. |
| Anthropic | `https://api.anthropic.com` | Claude 3.x — native, not OpenAI-compat |
| Gemini | `https://generativelanguage.googleapis.com` | Gemini 1.5+ — native |
| DeepSeek | `https://api.deepseek.com/v1` | OpenAI-compat |
| Qwen (Alibaba Cloud) | `https://dashscope.aliyuncs.com/compatible-mode/v1` | OpenAI-compat |
| GLM (Zhipu AI) | `https://open.bigmodel.cn/api/paas/v4` | OpenAI-compat |
| Local (Ollama) | `http://localhost:11434/v1` | OpenAI-compat |
| **GitHub Copilot** | `copilot` (special sentinel) | Device-code OAuth, no API key needed |
### GitHub Copilot Authentication
xcodeai supports your GitHub Copilot subscription via device-code OAuth — no separate API key needed.
```bash
# 1. Start xcodeai
xcodeai --provider-url copilot
# 2. In the REPL, authenticate:
xcodeai› /login
# GitHub shows a code and URL:
# Visit: https://github.com/login/device
# Enter code: XXXX-XXXX
#
# After you approve in the browser:
# ✓ Logged in to GitHub Copilot.
# 3. Now run tasks normally
xcodeai> Write a Fibonacci function in main.rs
```
The OAuth token is saved to `~/.config/xcode/copilot_auth.json`. Future sessions authenticate automatically. The short-lived Copilot API token (~25 min TTL) is refreshed transparently.
```bash
# Remove saved credentials
xcodeai› /logout
```
---
## Tools
The agent has access to built-in tools plus optional Git, LSP, MCP, and orchestration tools:
### Built-in Tools
| `file_read` | Read file content with line numbers | `path`, `offset`, `limit` |
| `file_write` | Write or create a file | `path`, `content` |
| `file_edit` | Replace a string in a file | `path`, `old_string`, `new_string` |
| `bash` | Execute a shell command | `command`, `timeout` (default 120s) |
| `glob_search` | Find files by glob pattern | `pattern`, `path` (max 100 results) |
| `grep_search` | Search file contents by regex | `pattern`, `path`, `include` (max 200 matches) |
| `question` | Ask the user a clarifying question | `question` |
### Git Tools
| `git_status` | Show working tree status |
| `git_diff` | Show staged/unstaged changes |
| `git_log` | Show recent commit history |
| `git_commit` | Stage all changes and create a commit |
| `git_checkout` | Switch branches or restore files |
### LSP Tools
| `lsp_hover` | Get type info / docs at a position |
| `lsp_definition` | Jump to symbol definition |
| `lsp_references` | Find all usages of a symbol |
| `lsp_diagnostics` | Get errors and warnings from the language server |
| `lsp_rename` | Rename a symbol across the whole project |
### Orchestration Tools (spawn_task)
The `spawn_task` tool lets the agent delegate sub-tasks to child agents — enabling multi-agent workflows with up to 3 levels of nesting.
```
Parent agent
└── spawn_task("Write all unit tests")
└── Child agent (full tool access)
```
### MCP Tools
xcodeai can connect to any [Model Context Protocol](https://modelcontextprotocol.io) server, automatically registering all tools the server exposes.
Configure in `~/.config/xcode/config.json`:
```json
{
"mcp": {
"servers": [
{ "name": "my-server", "command": "npx", "args": ["-y", "@my/mcp-server"] }
]
}
}
```
---
## Agent Architecture
```
xcodeai run "task"
│
▼
Director
│
▼
Coder ◄───────────────────────────┤
│ │
▼ │
LLM call (streaming SSE) │
│ │
▼ │
tool_calls? │
├── yes → execute tools ───────┘
└── no → task complete
```
- **Director** — entry point, creates the CoderAgent and executes the task
- **Coder** — runs the LLM ↔ tool loop until no more tool calls or `max_iterations` reached
- **Context management** — keeps system prompt + last N messages when approaching the context window limit
- **Compact mode** — `/compact` summarises conversation history to reduce token usage
- **Session persistence** — every run is stored in SQLite at `~/.local/share/xcode/sessions.db`
- **Token tracking** — prompt/completion/total tokens displayed after each run
- **AGENTS.md** — place an `AGENTS.md` file in your project root to inject project-specific instructions into the system prompt
---
## Sandboxing
By default, xcodeai runs tools directly in the project directory. Optionally, install [sbox](https://github.com/CVPaul/sbox) for rootless user-space session isolation:
```json
{ "sandbox": { "enabled": true, "sbox_path": "/usr/local/bin/sbox" } }
```
Or disable per-run:
```bash
xcodeai run "task" --no-sandbox
```
---
## Development
```bash
git clone <repo>
cd xcode
# Build
export PATH="$HOME/.cargo/bin:$PATH"
cargo build
# Run tests (671 total)
cargo test
# Release binary
cargo build --release
./target/release/xcodeai --help
# Lint (zero warnings enforced)
cargo clippy -- -D warnings
cargo fmt --check
```
### Project Structure
```
src/
├── main.rs CLI entry point (clap) + serve_command()
├── lib.rs Public API surface for integration tests
├── config.rs Config loading with env/CLI overrides
├── context.rs AgentContext — shared agent state
├── agent/ Director + CoderAgent loop + AGENTS.md loader
├── auth/ GitHub Copilot device-code OAuth
├── http/ HTTP API server (axum) — serve subcommand
│ ├── mod.rs AppState + start_server()
│ └── routes.rs REST + SSE route handlers
├── io/ AgentIO trait for pluggable output
│ ├── mod.rs AgentIO trait + NullIO + AutoApproveIO
│ ├── terminal.rs TerminalIO — REPL/run output with markdown rendering
│ └── http.rs HttpIO — SSE event channel for HTTP API
├── llm/ LLM providers + streaming
│ ├── mod.rs LlmProvider trait, Message, ContentPart (multimodal)
│ ├── openai.rs OpenAI / OpenAI-compat SSE client
│ ├── anthropic.rs Anthropic native SSE client
│ ├── gemini.rs Gemini native SSE client
│ ├── registry.rs ProviderRegistry — select provider by URL
│ └── retry.rs RetryingLlmProvider — exponential backoff
├── tools/ Tool trait + registry + all tools
│ ├── mod.rs ToolRegistry, ToolContext
│ ├── bash.rs, file_*.rs, glob_search.rs, grep_search.rs, question.rs
│ ├── git/ Git tools (status, diff, log, commit, checkout)
│ ├── lsp/ LSP tools (hover, definition, references, diagnostics, rename)
│ ├── mcp_resource.rs MCP resource-read tool
│ └── spawn_task.rs Multi-agent orchestration tool
├── session/ Session types + SQLite store + undo history
├── sandbox/ SboxSession + NoSandbox implementations
├── repl/ Interactive REPL loop + slash command dispatch
├── lsp/ LSP client (JSON-RPC 2.0 over stdio)
├── mcp/ MCP client (JSON-RPC 2.0 over stdio)
├── orchestrator/ Multi-step task graph executor
├── tracking.rs Token usage tracking
└── ui.rs Console styling helpers
tests/
├── mock_llm_server.rs axum mock SSE server for integration tests
├── helpers.rs Shared test utilities
├── e2e_run.rs End-to-end integration tests
└── http_integration.rs HTTP API integration tests
```
---
## License
MIT — see [LICENSE](LICENSE).