# agent-teams: A Rust Framework for Orchestrating Heterogeneous AI Agent Teams
> Build multi-agent systems where Claude Code, Codex, and Gemini CLI collaborate as teammates — with type-safe traits, file-based coordination, and reduced backend coupling.
## The Problem: AI Agents Work Alone
Today's AI coding agents are powerful, but they work in silos. Claude Code, Codex, Gemini CLI — each has its own process model, protocol, and strengths:
- **Claude Code** excels at multi-turn, tool-rich coding sessions via its interactive SDK
- **Codex** brings persistent threads with JSON-RPC and a dedicated code-review mode
- **Gemini CLI** offers fast, stateless single-turn analysis with Google's latest models
What if you could combine them into a **team** — a lead orchestrator assigning tasks to heterogeneous agents, each running on the backend best suited to their role? A Claude Code agent writes the implementation, a Gemini agent reviews it, and a Codex agent validates the tests — all coordinated through a shared task list and inbox system.
This is what **`agent-teams`** does.
## Getting Started
Before diving into the architecture, here is what you need installed and a quick taste of the API:
### Prerequisites
You'll need at least one of the following CLI tools installed and authenticated:
| Backend | Install | Authentication |
|---|---|---|
| Claude Code | `npm install -g @anthropic-ai/claude-code` | `claude` (interactive login) |
| Codex | `npm install -g @openai/codex` | `OPENAI_API_KEY` or interactive login |
| Gemini CLI | `npm install -g @google/gemini-cli` or Homebrew | `gemini` (interactive login) |
### Minimal Example
```toml
[dependencies]
agent-teams = "0.1"
tokio = { version = "1", features = ["full"] }
```
```rust
use agent_teams::prelude::*;

#[tokio::main]
async fn main() -> agent_teams::Result<()> {
    let orch = TeamOrchestrator::builder()
        .with_gemini_cli(GeminiCliBackend::new()?)
        .build()?;

    orch.create_team("my-team", None).await?;

    let cfg = SpawnConfig::new("assistant", "You are a helpful coding assistant.");
    orch.spawn_teammate("my-team", cfg, BackendType::GeminiCli).await?;
    orch.send_input("my-team", "assistant", "What is the fastest sorting algorithm?").await?;

    // take_output_receiver() returns an Option: None if already taken (take-once semantics)
    let mut rx = orch
        .take_output_receiver("my-team", "assistant")
        .await?
        .expect("receiver not yet taken");

    // Always use a timeout to avoid indefinite waits
    let timeout = tokio::time::timeout(std::time::Duration::from_secs(60), async {
        while let Some(output) = rx.recv().await {
            match output {
                AgentOutput::Delta(text) => print!("{text}"),
                AgentOutput::TurnComplete => { println!(); break; }
                AgentOutput::Error(e) => { eprintln!("Agent error: {e}"); break; }
                _ => {}
            }
        }
    }).await;
    if timeout.is_err() {
        eprintln!("Timed out waiting for agent response");
    }

    orch.shutdown_teammate("my-team", "assistant").await?;
    orch.delete_team("my-team").await?;
    Ok(())
}
```
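The take-once behaviour mentioned in the comment on `take_output_receiver()` can be pictured with `Option::take`. The sketch below is a standard-library analogue, not the crate's code: `Session`, `take_receiver`, and `emit` are hypothetical names, and it uses `std::sync::mpsc` instead of Tokio's channel.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical stand-in for a session that owns its output receiver
/// until the first caller takes it.
pub struct Session {
    tx: Sender<String>,
    rx: Option<Receiver<String>>,
}

impl Session {
    pub fn new() -> Self {
        let (tx, rx) = channel();
        Session { tx, rx: Some(rx) }
    }

    /// Moves the receiver out on the first call; later calls get `None`.
    pub fn take_receiver(&mut self) -> Option<Receiver<String>> {
        self.rx.take()
    }

    /// Emits one line of output to whoever holds the receiver.
    pub fn emit(&self, line: &str) {
        // A send error only means the receiver was dropped; ignore it here.
        let _ = self.tx.send(line.to_string());
    }
}
```

If several code paths might try to read the same agent, matching on the returned `Option` is safer than calling `expect`.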
## Architecture Overview
```
                    ┌─────────────────────────┐
                    │    TeamOrchestrator     │
                    │  (single entry point)   │
                    └─────┬──────┬──────┬─────┘
                          │      │      │
              ┌───────────┘      │      └───────────┐
              ▼                  ▼                  ▼
       ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
       │  FileTeam   │    │  FileTask   │    │  FileInbox  │
       │   Manager   │    │   Manager   │    │   Manager   │
       └──────┬──────┘    └──────┬──────┘    └──────┬──────┘
              │                  │                  │
              ▼                  ▼                  ▼
      ~/.claude/teams/   ~/.claude/tasks/    inboxes/*.json
     {team}/config.json  {team}/{id}.json

                   ┌───────────────────────────┐
                   │    Backend Abstraction    │
                   │  AgentBackend (factory)   │
                   │  AgentSession (handle)    │
                   └──────┬──────┬──────┬──────┘
                          │      │      │
              ┌───────────┘      │      └───────────┐
              ▼                  ▼                  ▼
         ClaudeCode            Codex            GeminiCli
          (cc-sdk)          (JSON-RPC)       (one-shot CLI)
```
The framework has four distinct layers:
1. **Foundation** (`error.rs`, `models/`, `util/`) — Data types, error handling, atomic file I/O
2. **Managers** (`team/`, `task/`, `messaging/`) — Trait-based managers for teams, tasks, and messaging
3. **Backend** (`backend/`) — The `AgentBackend`/`AgentSession` trait pair with three implementations
4. **Orchestrator** (`orchestrator/`) — `TeamOrchestrator` composes everything into a single API
## The Backend Abstraction: Two Traits, Three Worlds
The core insight of `agent-teams` is unifying vastly different agent runtimes behind two simple traits:
```rust
// Uses: async_trait, tokio::sync::mpsc::Receiver, and the crate's own Result type alias.

/// Factory trait: creates agent sessions for a specific backend.
#[async_trait]
pub trait AgentBackend: Send + Sync {
    fn backend_type(&self) -> BackendType;
    async fn spawn(&self, config: SpawnConfig) -> Result<Box<dyn AgentSession>>;
}

/// A running agent session that can receive input and emit output.
#[async_trait]
pub trait AgentSession: Send + Sync {
    fn name(&self) -> &str;
    async fn send_input(&mut self, input: &str) -> Result<()>;
    fn output_receiver(&mut self) -> Option<Receiver<AgentOutput>>;
    async fn is_alive(&self) -> bool;
    async fn shutdown(&mut self) -> Result<()>;
    async fn force_kill(&mut self) -> Result<()>;
}
```
This design separates **creation** (factory) from **interaction** (session), allowing the orchestrator to be completely backend-agnostic. A `SpawnConfig` captures all the common parameters:
```rust
let config = SpawnConfig {
    name: "reviewer".into(),
    prompt: "You are a senior Rust code reviewer.".into(),
    model: Some("gemini-2.5-pro".into()),
    cwd: Some("/path/to/project".into()),
    permission_mode: Some("bypassPermissions".into()),
    ..Default::default()
};
```
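To make the factory/session split concrete, here is a simplified, synchronous analogue that uses only the standard library. `Backend`, `Session`, `EchoBackend`, `EchoSession`, and `run_turn` are hypothetical names chosen for illustration. The crate's real traits are async and have more methods, so treat this as a sketch of the shape, not of the actual API.

```rust
/// Simplified factory trait: knows how to start sessions.
pub trait Backend {
    fn backend_name(&self) -> &'static str;
    fn spawn(&self, agent_name: &str) -> Box<dyn Session>;
}

/// Simplified session trait: a handle to one running agent.
pub trait Session {
    fn name(&self) -> &str;
    fn send_input(&mut self, input: &str) -> String;
}

/// Toy backend whose sessions echo their input back.
pub struct EchoBackend;

pub struct EchoSession {
    name: String,
    turns: usize,
}

impl Backend for EchoBackend {
    fn backend_name(&self) -> &'static str {
        "echo"
    }

    fn spawn(&self, agent_name: &str) -> Box<dyn Session> {
        Box::new(EchoSession { name: agent_name.to_string(), turns: 0 })
    }
}

impl Session for EchoSession {
    fn name(&self) -> &str {
        &self.name
    }

    fn send_input(&mut self, input: &str) -> String {
        // Per-session state lives in the handle, not in the factory.
        self.turns += 1;
        format!("[{} turn {}] {}", self.name, self.turns, input)
    }
}

/// Caller code depends only on the traits, never on a concrete backend.
pub fn run_turn(backend: &dyn Backend, agent_name: &str, input: &str) -> String {
    let mut session = backend.spawn(agent_name);
    session.send_input(input)
}
```

Because `run_turn` takes `&dyn Backend`, swapping in another implementation changes only the value passed at the call site.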
### Three Process Models, One Interface
Each backend maps `SpawnConfig` to its native execution model:
| Backend | Process model | Communication | Conversation state |
|---|---|---|---|
| **ClaudeCode** | Long-lived session task via `cc-sdk` | Command channel → session task → output channel | Multi-turn (SDK manages) |
| **Codex** | Persistent `codex app-server` subprocess | JSON-RPC over stdin/stdout (initialize → thread/start → turn/start) | Multi-turn (thread-based) |
| **Gemini CLI** | Ephemeral process per turn | Pipe stdin → read stdout line-by-line | Stateless (system prompt re-injected via `-p`) |
> **Note:** These process models reflect the current behavior of each tool's CLI. Upstream changes may alter these semantics.
A key design challenge was ensuring **output channel reuse**. All three backends create a single `mpsc::Sender<AgentOutput>` at `spawn()` time. For Claude Code and Codex, this sender is passed to a long-lived reader task. For Gemini CLI, the sender is **cloned** to each new ephemeral reader task, so the orchestrator's receiver stays valid across process lifetimes:
```
spawn()                    → process 1 → reader task 1 → output_tx.clone()
send_input() → kill proc 1 → process 2 → reader task 2 → output_tx.clone()
send_input() → kill proc 2 → process 3 → reader task 3 → output_tx.clone()
                                                                 ↓
                                                     orchestrator's output_rx
                                                    (valid for entire session)
```
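The sender-cloning idea for the ephemeral-process case can be sketched with the standard library alone. This is an analogue under stated assumptions, not the Gemini backend itself: `OneShotSession` is a hypothetical type, a short-lived thread stands in for each per-turn process and reader task, and `std::sync::mpsc` replaces Tokio's channel.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

/// Hypothetical session that starts a fresh worker per turn but keeps
/// one long-lived output channel.
pub struct OneShotSession {
    output_tx: Sender<String>,
}

impl OneShotSession {
    /// Creates the channel once; the receiver outlives every worker.
    pub fn spawn() -> (Self, Receiver<String>) {
        let (output_tx, output_rx) = channel();
        (OneShotSession { output_tx }, output_rx)
    }

    /// Each turn runs in a new short-lived thread that gets its own
    /// clone of the sender.
    pub fn send_input(&self, input: &str) {
        let tx = self.output_tx.clone();
        let reply = format!("reply to: {input}");
        let worker = thread::spawn(move || {
            // The clone is dropped when the worker exits; the channel stays
            // open because the session still holds the original sender.
            let _ = tx.send(reply);
        });
        // Joining keeps this sketch deterministic; a real reader task
        // would run concurrently with the caller.
        worker.join().expect("worker panicked");
    }
}
```

The receiver only observes a closed channel once the session itself is dropped, which is the property that keeps the orchestrator's handle valid between turns.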
### The Output Event Protocol
All backends emit the same `AgentOutput` enum:
```rust
pub enum AgentOutput {
    Message(String), // Complete text message
    Delta(String),   // Streaming text delta
    TurnComplete,    // Agent finished a turn
    Idle,            // Agent is idle / waiting
    Error(String),   // Error occurred
}
```
A critical implementation detail addresses channel backpressure: the shared `send_agent_output` helper differentiates **control events** from **data events**:
- `TurnComplete`, `Error`, `Idle` → use `send().await` (guaranteed delivery — dropping these would cause the orchestrator to hang)
- `Delta`, `Message` → use `try_send()` (acceptable to drop under backpressure — text loss is tolerable, deadlocks are not)
The default channel capacity is 256 events. This is sufficient for most use cases; if you need to tune it, the constant `OUTPUT_CHANNEL_SIZE` is defined in each backend module.
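The control-versus-data split can be illustrated with a bounded standard-library channel. This is a minimal synchronous sketch, not the crate's `send_agent_output`: `Event` and `send_event` are hypothetical names, and a blocking `send` stands in for `send().await`.

```rust
use std::sync::mpsc::SyncSender;

/// Hypothetical, reduced event type for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    Delta(String),
    TurnComplete,
    Error(String),
}

/// Returns `true` if the event was queued, `false` if it was dropped
/// or the receiver is gone.
pub fn send_event(tx: &SyncSender<Event>, event: Event) -> bool {
    if matches!(event, Event::Delta(_)) {
        // Data event: never block; drop it when the queue is full.
        tx.try_send(event).is_ok()
    } else {
        // Control event: block until the consumer makes room.
        tx.send(event).is_ok()
    }
}
```

With a capacity of 1, a second `Delta` is dropped while the queue is full, whereas `TurnComplete` waits for the consumer and is always delivered as long as the receiver is alive.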
## The Orchestrator: Composing Everything
`TeamOrchestrator` is the user-facing entry point. It composes all managers with pluggable backends via a builder pattern:
```rust
use agent_teams::backend::claude_code::ClaudeCodeBackend;
use agent_teams::backend::codex::CodexBackend;
use agent_teams::backend::gemini::GeminiCliBackend;

let orchestrator = TeamOrchestrator::builder()
    .teams_base("/path/to/teams")
    .tasks_base("/path/to/tasks")
    .with_claude_code(ClaudeCodeBackend::new())
    .with_codex(CodexBackend::new()?)
    .with_gemini_cli(GeminiCliBackend::new()?)
    .build()?;
```
From here, the full lifecycle is straightforward:
```rust
// Create a team
orchestrator.create_team("review-team", Some("Code review squad")).await?;

// Spawn heterogeneous teammates
let claude_cfg = SpawnConfig::new("implementer", "You write Rust code.");
orchestrator.spawn_teammate("review-team", claude_cfg, BackendType::ClaudeCode).await?;

let gemini_cfg = SpawnConfig {
    name: "reviewer".into(),
    prompt: "You review Rust code for correctness and style.".into(),
    model: Some("gemini-2.5-pro".into()),
    ..Default::default()
};
orchestrator.spawn_teammate("review-team", gemini_cfg, BackendType::GeminiCli).await?;

// Create and assign tasks
let task = orchestrator.create_task("review-team", CreateTaskRequest {
    subject: "Review authentication module".into(),
    description: Some("Check for SQL injection and auth bypass.".into()),
    ..Default::default()
}).await?;
orchestrator.assign_task("review-team", &task.id, "reviewer").await?;

// Send input to a specific agent
orchestrator.send_input("review-team", "reviewer", "Please review src/auth.rs").await?;

// Read output. The timeout is omitted here for brevity; wrap this loop in
// tokio::time::timeout as in the minimal example to avoid indefinite waits.
let mut rx = orchestrator
    .take_output_receiver("review-team", "reviewer")
    .await?
    .expect("receiver not yet taken");
while let Some(output) = rx.recv().await {
    match output {
        AgentOutput::Delta(text) => print!("{text}"),
        AgentOutput::Error(e) => { eprintln!("Error: {e}"); break; }
        AgentOutput::TurnComplete => break,
        _ => {}
    }
}
```
## File-Based Coordination: Claude Code Compatible
The framework uses JSON files on disk for all coordination state — teams, tasks, and inboxes. This design is intentionally compatible with Claude Code's own agent teams format:
```
~/.claude/
├── teams/
│   └── review-team/
│       ├── config.json            # Team config + member list
│       └── inboxes/
│           ├── implementer.json   # Per-agent inbox
│           └── reviewer.json
└── tasks/
    └── review-team/
        ├── 1.json                 # Task files
        └── 2.json
```
All file operations use **atomic writes** (write to a temporary file, then rename it over the target) for crash safety, and **advisory file locking** for safe concurrent access. Locking relies on `flock(2)`, so the framework currently runs only on Unix-like platforms, not Windows.
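The write-then-rename pattern can be sketched with the standard library. This is a minimal illustration, not the crate's helper: `write_atomic` and the `.tmp` suffix are assumptions, and locking is left out, so two concurrent writers would still need the advisory lock described above.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Writes `contents` to `path` by first writing a sibling temporary file
/// and then renaming it over the target.
pub fn write_atomic(path: &Path, contents: &[u8]) -> io::Result<()> {
    // Assumption: the temp file sits in the same directory as the target,
    // so the rename never crosses a filesystem boundary.
    let mut tmp_name = path
        .file_name()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "path has no file name"))?
        .to_os_string();
    tmp_name.push(".tmp");
    let tmp_path = path.with_file_name(tmp_name);

    let mut file = File::create(&tmp_path)?;
    file.write_all(contents)?;
    // Flush data to disk before the rename makes the new contents visible.
    file.sync_all()?;
    drop(file);

    // On Unix, rename replaces an existing target in a single step, so
    // readers see either the old file or the new one, never a partial write.
    fs::rename(&tmp_path, path)
}
```

A crash before the rename leaves the previous file untouched and at worst a stray `.tmp` file behind.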
### Task Dependency Graph
Tasks support `blocks` / `blockedBy` dependencies with **cycle detection** (DFS-based) and **automatic cascade**: when a task is completed, it's automatically removed from the `blockedBy` lists of dependent tasks, potentially unblocking them:
```rust
// Task 2 depends on Task 1
orchestrator.update_task("team", "2", TaskUpdate {
    add_blocked_by: Some(vec!["1".into()]),
    ..Default::default()
}).await?;

// Completing Task 1 automatically unblocks Task 2
orchestrator.update_task("team", "1", TaskUpdate {
    status: Some(TaskStatus::Completed),
    ..Default::default()
}).await?;
```
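For readers who want to see the two mechanisms in isolation, the following in-memory sketch shows DFS-based cycle rejection and the completion cascade. `TaskGraph` and its methods are hypothetical and the graph lives in a `HashMap`, whereas the framework stores tasks as JSON files; only the logic is meant to carry over.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical in-memory task graph: task id -> ids it is blocked by.
#[derive(Default)]
pub struct TaskGraph {
    blocked_by: HashMap<String, HashSet<String>>,
}

impl TaskGraph {
    /// Returns true if `target` is reachable from `start` by following
    /// `blocked_by` edges (iterative depth-first search).
    fn depends_on(&self, start: &str, target: &str) -> bool {
        let mut stack = vec![start.to_string()];
        let mut seen = HashSet::new();
        while let Some(id) = stack.pop() {
            if id == target {
                return true;
            }
            if seen.insert(id.clone()) {
                if let Some(deps) = self.blocked_by.get(&id) {
                    stack.extend(deps.iter().cloned());
                }
            }
        }
        false
    }

    /// Records that `task` is blocked by `blocker`, rejecting the edge if
    /// it would close a cycle (including a task blocking itself).
    pub fn add_blocked_by(&mut self, task: &str, blocker: &str) -> Result<(), String> {
        if self.depends_on(blocker, task) {
            return Err(format!("{task} blocked by {blocker} would create a cycle"));
        }
        self.blocked_by
            .entry(task.to_string())
            .or_default()
            .insert(blocker.to_string());
        Ok(())
    }

    /// Cascade: removes `done` from every `blocked_by` set and returns the
    /// tasks that became unblocked as a result, sorted by id.
    pub fn complete(&mut self, done: &str) -> Vec<String> {
        let mut unblocked = Vec::new();
        for (task, deps) in self.blocked_by.iter_mut() {
            if deps.remove(done) && deps.is_empty() {
                unblocked.push(task.clone());
            }
        }
        unblocked.sort();
        unblocked
    }

    /// True while the task still has at least one unfinished blocker.
    pub fn is_blocked(&self, task: &str) -> bool {
        self.blocked_by.get(task).map_or(false, |deps| !deps.is_empty())
    }
}
```

Checking reachability before inserting the edge means the stored graph is always acyclic, so the cascade never has to guard against loops.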
### Structured Messaging
The inbox system supports both plain-text messages and structured protocol messages:
```rust
// Plain message
orchestrator.send_message("team", "lead", "worker", "Please start task 3").await?;
// Structured messages (auto-generated by assign_task, send_shutdown_request, etc.)
// TaskAssignment, ShutdownRequest, ShutdownApproved,
// IdleNotification, PlanApprovalRequest, PlanApprovalResponse
```
## Why Three Backends?
Each backend has distinct strengths that make it well suited to particular roles:
| Role | Recommended backend | Why |
|---|---|---|
| **Implementation** | Claude Code | Rich tool access (file edit, shell, web search), multi-turn state |
| **Code Review** | Gemini CLI | Fast single-turn analysis, large context window, no tool overhead |
| **Testing** | Codex | Persistent thread with dedicated code-review mode |
| **Quick Analysis** | Gemini CLI (flash) | Lowest response latency |
| **Complex Debugging** | Claude Code | Extended thinking, interactive tool use |
| **Parallel Validation** | All three | Cross-model agreement increases confidence |
## Numbers
| Metric | Value |
|---|---|
| Source lines | 5,718 |
| Source files | 21 |
| Unit tests | 86 |
| Integration tests | 19 |
| Backends | 3 (Claude Code, Codex, Gemini CLI) |
| Dependencies | 14 (tokio, serde, thiserror, cc-sdk, etc.) |
| Rust edition | 2024 (`edition = "2024"` in Cargo.toml) |
## Design Principles
1. **Trait-first abstraction**: Backend differences are hidden behind `AgentBackend` + `AgentSession`. Adding a fourth backend (e.g., Aider, Cursor) requires implementing just two traits.
2. **File-based coordination**: No database, no server. JSON files with atomic writes and flock — simple, debuggable, and compatible with Claude Code's native format.
3. **Control events never drop**: The `send_agent_output` helper guarantees delivery of `TurnComplete`, `Error`, and `Idle` events while dropping text events (`Delta`, `Message`) under backpressure. This prevents deadlocks without losing liveness signals.
4. **Defensive resource cleanup**: Every backend implements `Drop` to abort reader tasks, uses `kill_on_drop(true)` for child processes, and provides both `shutdown()` (graceful) and `force_kill()` (immediate) paths.
5. **Reduced backend coupling**: The orchestrator doesn't know or care which backend runs which agent. You can swap a Claude Code agent for a Gemini agent by changing one line — the `BackendType` parameter in `spawn_teammate()`. Note that you still depend on the respective CLI tools being installed and authenticated.
## Known Limitations
- **Unix-only**: File locking uses `flock(2)`, which is not available on Windows.
- **Single receiver ownership**: `output_receiver()` uses take-once semantics — only one consumer can read an agent's output stream.
- **External CLI dependency**: Each backend requires its respective CLI tool to be installed, authenticated, and on `$PATH`.
- **No built-in retry**: If a backend process crashes, the orchestrator does not automatically restart it. The caller is responsible for re-spawning.
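Since re-spawning is left to the caller, a small retry helper is one way to handle it. The sketch below is a generic, synchronous illustration: `retry_with_backoff` is a hypothetical function that is not part of the crate, and in async code the `thread::sleep` call would be replaced by an async sleep such as `tokio::time::sleep`.

```rust
use std::thread;
use std::time::Duration;

/// Calls `op` up to `max_attempts` times, doubling the delay after each
/// failure. Returns the first success or the last error.
/// `op` receives the 1-based attempt number.
pub fn retry_with_backoff<T, E>(
    max_attempts: u32,
    base_delay: Duration,
    mut op: impl FnMut(u32) -> Result<T, E>,
) -> Result<T, E> {
    assert!(max_attempts > 0, "max_attempts must be at least 1");
    let mut delay = base_delay;
    let mut attempt = 1;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the most recent error.
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => {
                thread::sleep(delay);
                delay = delay.saturating_mul(2);
                attempt += 1;
            }
        }
    }
}
```

In practice the closure would wrap the re-spawn call, and the attempt limit keeps a persistently failing backend from retrying forever.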
## What's Next
- **Streaming event adapter**: wrap the output receiver in `tokio_stream::wrappers::ReceiverStream` so callers can use `Stream` combinators and `while let Some(..) = stream.next().await` loops
- **Session resume**: Persist and resume Codex threads across orchestrator restarts
- **Dynamic routing**: Route tasks to the optimal backend based on cost, latency, and capability
- **Web dashboard**: Real-time visualization of team state, task progress, and agent output
- **Cross-platform locking**: Windows-compatible file locking via `fs2` or similar
---
*`agent-teams` is open-source under the MIT license. Contributions welcome.*