
# aca (Automatic Coding Agent) - Design Document

## Overview

aca is a Rust-based agentic tool that automates coding tasks using multiple LLM providers. It features dynamic task trees, comprehensive session persistence, and full resumability for long-running automated coding sessions. It supports Claude (CLI/API), OpenAI, and local models, with intelligent task parsing and execution planning.

## Deliverable Documents

This design has been broken down into focused deliverable documents:

- **[1.1 Architecture Overview](1.1-architecture-overview.md)** - Comprehensive system architecture, dual-mode design, component interfaces, and resource management
- **[1.2 Task Management System](1.2-task-management.md)** - Task tree architecture, scheduling algorithms, and dynamic task management
- **[1.3 Session Persistence System](1.3-session-persistence.md)** - State management, persistence formats, and recovery mechanisms
- **[1.4 Claude Code Integration](1.4-claude-integration.md)** - Claude Code SDK integration, rate limiting, and conversation management
- **[1.5 LLM Provider Abstraction](1.5-llm-provider-abstraction.md)** - Multi-provider support, unified interface, and provider capabilities
- **[1.6 Configuration & Security](1.6-configuration-security.md)** - Configuration management, security controls, and operational monitoring

## Architecture Overview

```
┌─────────────────────────────────────────────────────────────┐
│                        aca System                           │
│  ┌─────────────────────────────────────────────────────────┐ │
│  │                   CLI Frontend                          │ │
│  │   - Argument parsing (clap)                            │ │
│  │   - Task file loading (Markdown/TOML)                   │ │
│  │   - Execution mode selection                            │ │
│  │   - Plan dumping and loading                            │ │
│  └─────────────────────────────────────────────────────────┘ │
│                             │                               │
│                             ▼                               │
│  ┌─────────────────────────────────────────────────────────┐ │
│  │           Intelligent Task Parser (Optional)            │ │
│  │   - LLM-based task analysis                            │ │
│  │   - Dependency detection                                │ │
│  │   - Priority/complexity estimation                      │ │
│  │   - Execution strategy planning                         │ │
│  └─────────────────────────────────────────────────────────┘ │
│                             │                               │
│                             ▼                               │
│  ┌─────────────────────────────────────────────────────────┐ │
│  │              Agent Integration Layer                    │ │
│  │  ┌─────────────┐  ┌──────────────┐  ┌────────────────┐ │ │
│  │  │Task Manager │  │LLM Providers │  │Session Manager │ │ │
│  │  │- Task tree  │  │- Claude CLI  │  │- Checkpoints   │ │ │
│  │  │- Scheduler  │  │- Claude API  │  │- Persistence   │ │ │
│  │  │- Execution  │  │- OpenAI      │  │- Recovery      │ │ │
│  │  │- Progress   │  │- Local/Ollama│  │- State mgmt    │ │ │
│  │  └─────────────┘  └──────────────┘  └────────────────┘ │ │
│  └─────────────────────────────────────────────────────────┘ │
│                                                             │
│  Working Directory:                                         │
│  .aca/                 - Session metadata and checkpoints   │
│  .aca/sessions/        - Per-session state                  │
│  logs/                 - Execution logs                     │
└─────────────────────────────────────────────────────────────┘
```

## Core Components

### 1. CLI Frontend

**Responsibilities:**

- Parse command-line arguments using clap
- Load task files (Markdown or TOML formats)
- Initialize session state or resume from checkpoint
- Configure LLM provider mode (CLI/API)
- Handle execution plan dumping and loading
- Provide progress monitoring and reporting

**Key Operations:**

- Task file parsing (structured TOML or intelligent Markdown parsing)
- Execution plan analysis and review workflow
- Session checkpoint management
- Provider configuration and selection

### 2. Intelligent Task Parser

**Responsibilities:**

- Analyze unstructured task descriptions using LLM
- Detect hierarchical task relationships
- Identify task dependencies
- Estimate priority and complexity
- Determine optimal execution strategies
- Support markdown file references in task descriptions

### 3. Agent Integration Layer

**Responsibilities:**

- Execute task automation logic
- Interface with configured LLM providers
- Manage the dynamic task tree during execution
- Handle provider-specific rate limiting
- Maintain session context and conversation state
- Persist state for resumability

## Task Management System

### Task Tree Structure

```rust
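// Imports assumed by the definitions below (chrono and serde crates);
// `TaskId` is defined elsewhere in the crate.
use chrono::{DateTime, Utc};
use serde::{Deserialize, Serialize};
use std::path::PathBuf;

#[derive(Serialize, Deserialize, Clone)]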
pub struct Task {
    pub id: TaskId,
    pub title: String,
    pub description: String,
    pub status: TaskStatus,
    pub parent_id: Option<TaskId>,
    pub children: Vec<TaskId>,
    pub dependencies: Vec<TaskId>,
    pub created_at: DateTime<Utc>,
    pub updated_at: DateTime<Utc>,
    pub metadata: TaskMetadata,
}

#[derive(Serialize, Deserialize, Clone)]
pub enum TaskStatus {
    Pending,
    InProgress,
    Blocked(String),
    Completed,
    Failed(String),
    Skipped(String),
}

#[derive(Serialize, Deserialize, Clone)]
pub struct TaskMetadata {
    pub priority: u8,
    pub estimated_complexity: Option<u8>,
    pub repository_refs: Vec<String>,
    pub file_refs: Vec<PathBuf>,
    pub tags: Vec<String>,
}
```

### Task Tree Operations

- **Dynamic Subtask Creation**: Agent can break down complex tasks into smaller subtasks
- **Dependency Resolution**: Automatic handling of task dependencies
- **Context Inheritance**: Subtasks inherit relevant context from parent tasks
- **Progress Tracking**: Real-time status updates throughout the tree
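
As a rough illustration of subtask creation, context inheritance, and dependency resolution, the sketch below uses a simplified in-memory tree. `TaskTree`, `Node`, `add_subtask`, and `is_ready` are hypothetical names, `TaskId` is assumed to be a plain integer, and tags stand in for whatever context the real `Task` inherits.

```rust
use std::collections::HashMap;

pub type TaskId = u64; // assumption: the real TaskId may be a UUID or a newtype

#[derive(Debug, Clone, PartialEq)]
pub enum Status {
    Pending,
    Completed,
}

#[derive(Debug, Clone)]
pub struct Node {
    pub title: String,
    pub status: Status,
    pub parent_id: Option<TaskId>,
    pub children: Vec<TaskId>,
    pub dependencies: Vec<TaskId>,
    pub tags: Vec<String>,
}

#[derive(Default)]
pub struct TaskTree {
    pub nodes: HashMap<TaskId, Node>,
    next_id: TaskId,
}

impl TaskTree {
    /// Adds a top-level task and returns its id.
    pub fn add_root(&mut self, title: &str, tags: &[&str]) -> TaskId {
        let tags = tags.iter().map(|t| t.to_string()).collect();
        self.insert(title, None, Vec::new(), tags)
    }

    /// Adds a subtask under `parent`; it inherits the parent's tags.
    /// Returns None if the parent does not exist.
    pub fn add_subtask(&mut self, parent: TaskId, title: &str, deps: Vec<TaskId>) -> Option<TaskId> {
        let tags = self.nodes.get(&parent)?.tags.clone();
        let id = self.insert(title, Some(parent), deps, tags);
        self.nodes.get_mut(&parent)?.children.push(id);
        Some(id)
    }

    fn insert(
        &mut self,
        title: &str,
        parent_id: Option<TaskId>,
        dependencies: Vec<TaskId>,
        tags: Vec<String>,
    ) -> TaskId {
        let id = self.next_id;
        self.next_id += 1;
        let node = Node {
            title: title.to_string(),
            status: Status::Pending,
            parent_id,
            children: Vec::new(),
            dependencies,
            tags,
        };
        self.nodes.insert(id, node);
        id
    }

    /// A task is ready when it is pending and every dependency is completed.
    pub fn is_ready(&self, id: TaskId) -> bool {
        let Some(node) = self.nodes.get(&id) else { return false };
        node.status == Status::Pending
            && node.dependencies.iter().all(|dep| {
                self.nodes.get(dep).map_or(false, |n| n.status == Status::Completed)
            })
    }

    /// Marks a task as completed so that its dependents can become ready.
    pub fn complete(&mut self, id: TaskId) {
        if let Some(node) = self.nodes.get_mut(&id) {
            node.status = Status::Completed;
        }
    }
}
```

Storing ids rather than references keeps the tree easy to serialize, which matches the `children: Vec<TaskId>` field in the `Task` struct above.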

## Session Persistence

### State Components

1. **Task Tree State**

   - Complete task hierarchy with status
   - Inter-task dependencies and relationships
   - Progress metrics and timing data

2. **Claude Code Context**

   - Conversation history and context
   - Session configuration and preferences
   - Model usage and rate limiting state

3. **File System State**

   - Modified files and their change history
   - Build artifacts and compilation results
   - Workspace directory structure

4. **Execution Logs**
   - Structured task execution logs
   - Claude Code interaction traces
   - Error logs and debugging information

### Persistence Format

```
.aca/sessions/{session_id}/
├── meta/
│   └── session.json        # Session metadata and task hierarchy
├── claude/                 # Claude Code conversation state
│   ├── messages.json
│   ├── session_config.json
│   └── rate_limit_state.json
├── file_changes/           # File modification tracking
│   ├── change_log.json
│   └── snapshots/
└── execution_logs/         # Structured execution logs
    ├── task_logs/
    └── system_logs/
```
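
The layout above could be captured in a small path helper so that the rest of the code never builds these paths by hand. This is only a sketch: `SessionPaths` and its methods are hypothetical names, while the directory and file names are taken from the tree above.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Paths for a single session, rooted at `.aca/sessions/{session_id}`.
pub struct SessionPaths {
    pub root: PathBuf,
}

impl SessionPaths {
    pub fn new(working_dir: &Path, session_id: &str) -> Self {
        Self {
            root: working_dir.join(".aca").join("sessions").join(session_id),
        }
    }

    pub fn session_meta(&self) -> PathBuf {
        self.root.join("meta").join("session.json")
    }

    pub fn claude_messages(&self) -> PathBuf {
        self.root.join("claude").join("messages.json")
    }

    pub fn change_log(&self) -> PathBuf {
        self.root.join("file_changes").join("change_log.json")
    }

    /// Every directory that must exist before the session starts writing.
    pub fn directories(&self) -> Vec<PathBuf> {
        vec![
            self.root.join("meta"),
            self.root.join("claude"),
            self.root.join("file_changes").join("snapshots"),
            self.root.join("execution_logs").join("task_logs"),
            self.root.join("execution_logs").join("system_logs"),
        ]
    }

    /// Creates the directory skeleton; calling it again on resume is harmless.
    pub fn create_all(&self) -> io::Result<()> {
        for dir in self.directories() {
            fs::create_dir_all(dir)?;
        }
        Ok(())
    }
}
```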

## LLM Provider Integration

### Provider Abstraction Layer

The system supports multiple LLM providers through a unified async interface:

```rust
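// Assumes the boxed-future alias from the `futures` crate;
// `LlmError` is the provider error type defined elsewhere in the crate.
use futures::future::BoxFuture;

/// Unified interface implemented by every provider backend.
pub trait LlmProvider: Send + Sync {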
    fn send_message<'a>(
        &'a mut self,
        prompt: &'a str,
        system_message: Option<&'a str>,
    ) -> BoxFuture<'a, Result<String, LlmError>>;
}
```
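
To show how a backend might plug into this trait, here is a minimal sketch of a test double. It is not one of the real providers: `EchoProvider`, `run_ready`, and the `LlmError` tuple struct are hypothetical, and `BoxFuture` is redefined locally so the example needs only the standard library. `run_ready` polls a future once and is only suitable for futures that finish without real I/O; actual providers would run on an async runtime.

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

/// Local stand-in for `futures::future::BoxFuture`.
pub type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

/// Placeholder error type (assumption: the real `LlmError` is richer).
#[derive(Debug, PartialEq)]
pub struct LlmError(pub String);

pub trait LlmProvider: Send + Sync {
    fn send_message<'a>(
        &'a mut self,
        prompt: &'a str,
        system_message: Option<&'a str>,
    ) -> BoxFuture<'a, Result<String, LlmError>>;
}

/// Test double: echoes the prompt back and counts calls.
pub struct EchoProvider {
    pub calls: usize,
}

impl LlmProvider for EchoProvider {
    fn send_message<'a>(
        &'a mut self,
        prompt: &'a str,
        system_message: Option<&'a str>,
    ) -> BoxFuture<'a, Result<String, LlmError>> {
        Box::pin(async move {
            self.calls += 1;
            if prompt.is_empty() {
                return Err(LlmError("empty prompt".to_string()));
            }
            Ok(match system_message {
                Some(sys) => format!("[{sys}] {prompt}"),
                None => prompt.to_string(),
            })
        })
    }
}

fn noop_raw_waker() -> RawWaker {
    fn clone(_: *const ()) -> RawWaker {
        noop_raw_waker()
    }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    RawWaker::new(std::ptr::null(), &VTABLE)
}

/// Polls a future once; returns None if it is not immediately ready.
pub fn run_ready<T>(mut fut: BoxFuture<'_, T>) -> Option<T> {
    // SAFETY: the vtable functions never dereference the data pointer.
    let waker = unsafe { Waker::from_raw(noop_raw_waker()) };
    let mut cx = Context::from_waker(&waker);
    match fut.as_mut().poll(&mut cx) {
        Poll::Ready(value) => Some(value),
        Poll::Pending => None,
    }
}
```

Because `send_message` takes `&mut self`, a provider can keep conversation state between calls, which is what the call counter stands in for here.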

### Supported Providers

1. **Claude CLI** (Default)
   - Uses `claude` command-line tool
   - JSON output format for structured responses
   - Subprocess-based execution with output parsing
   - Conversational state persistence

2. **Claude API**
   - Direct Anthropic API integration
   - Message-based conversation history
   - Streaming support (future)
   - Token usage tracking

3. **OpenAI**
   - OpenAI API compatibility
   - GPT-4 and other models
   - Standard chat completion interface

4. **Local Models (Ollama)**
   - Local model execution
   - Privacy-focused option
   - No external API dependencies

### Provider Mode Configuration

- **CLI Mode**: Default, uses subprocess execution
- **API Mode**: Direct API calls with credentials
- Configurable via command-line flags or config file

## Current Implementation Status

### ✅ Implemented Features
- CLI frontend with clap argument parsing
- Intelligent task parser with LLM-based analysis
- Task management system with dynamic tree structure
- Session persistence with checkpoint/resume
- LLM provider abstraction (Claude CLI/API, OpenAI, Ollama)
- Execution plan dumping and loading
- Markdown file reference resolution
- Dependency mapping and detection
- `.aca` directory structure for session state
- TOML configuration support

### 🚧 Planned Features
- Docker containerization for isolated execution
- Headless Claude Code SDK integration via WebSocket
- Multi-container distributed execution
- Advanced rate limiting with usage tracking
- Web dashboard for real-time monitoring
- Plugin system for extensible task handlers

## Intelligent Task Parsing (Implemented)

### LLM-Based Analysis

The intelligent parser uses LLM capabilities to analyze unstructured task descriptions:

**Features:**
- Hierarchical task structure detection
- Automatic dependency identification
- Priority and complexity estimation
- Execution strategy determination (Sequential/Parallel/Intelligent)
- File reference resolution (markdown links to actual files)
- Custom system prompt support via `--append-system-prompt`

### Execution Plan Workflow

1. **Analyze**: `aca --task-file tasks.md --dry-run --dump-plan plan.json`
2. **Review**: Examine and modify `plan.json` as needed
3. **Execute**: `aca --execution-plan plan.json`

This allows for human review and modification before execution.

## Agent Execution Flow

### 1. Initialization Phase

1. Load or create session state from `.aca/sessions/{session_id}`
2. Parse task file (structured TOML or intelligent Markdown parsing)
3. Build task tree from parsed tasks or execution plan
4. Initialize LLM provider (CLI/API mode)
5. Set up session logging and progress tracking

### 2. Task Execution Loop

```
while has_pending_tasks() {
    task = select_next_task()

    match execute_task(task) {
        Success(result) => {
            update_task_status(task, Completed)
            process_result(result)
            create_subtasks_if_needed(result)
        }
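        Failure(error) => {
            // Failed carries the error message, as in TaskStatus::Failed(String)
            update_task_status(task, Failed(error.to_string()))
            handle_error(error)
            maybe_create_retry_task(task)
        }
        Blocked(reason) => {
            // Blocked records why the task cannot proceed yet
            update_task_status(task, Blocked(reason))
            schedule_retry_or_skip(task)
        }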
    }

    persist_session_state()

    if should_checkpoint() {
        create_checkpoint()
    }
}
```

### 3. Task Selection Algorithm

- **Priority-based**: Higher priority tasks selected first
- **Dependency-aware**: Only select tasks whose dependencies are met
- **Context-optimized**: Prefer tasks that share context with recent work
- **Load-balanced**: Distribute work across different repositories/areas
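
The first three criteria can be sketched as a single ranking over runnable tasks. The code below is an illustration under stated assumptions, not the crate's scheduler: `Candidate` is a hypothetical trimmed-down task, a larger `priority` number is assumed to mean "more urgent", and shared tags are used as a stand-in for shared context. Load balancing is left out.

```rust
use std::cmp::Reverse;
use std::collections::HashSet;

#[derive(Debug, Clone, PartialEq)]
pub enum Status {
    Pending,
    InProgress,
    Completed,
}

#[derive(Debug, Clone)]
pub struct Candidate {
    pub id: u32,
    pub priority: u8,
    pub status: Status,
    pub dependencies: Vec<u32>,
    pub tags: Vec<String>,
}

/// Returns the id of the next task to run, or None if nothing is runnable.
pub fn select_next_task(tasks: &[Candidate], recent_tags: &[String]) -> Option<u32> {
    let completed: HashSet<u32> = tasks
        .iter()
        .filter(|t| t.status == Status::Completed)
        .map(|t| t.id)
        .collect();

    tasks
        .iter()
        // Only pending tasks are candidates.
        .filter(|t| t.status == Status::Pending)
        // Dependency-aware: every dependency must already be completed.
        .filter(|t| t.dependencies.iter().all(|d| completed.contains(d)))
        .max_by_key(|t| {
            // Context-optimized: count tags shared with recent work.
            let shared = t.tags.iter().filter(|tag| recent_tags.contains(tag)).count();
            // Rank by priority, then shared context; the lowest id wins ties.
            (t.priority, shared, Reverse(t.id))
        })
        .map(|t| t.id)
}
```

Including the id in the sort key makes selection deterministic, which helps when a resumed session has to reproduce the same schedule.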

## Error Handling & Recovery

### Error Categories

1. **Transient Errors**: Network issues, rate limits, temporary API failures
2. **Task Errors**: Code compilation failures, test failures, logical errors
3. **System Errors**: File system issues, Docker problems, resource exhaustion

### Recovery Strategies

- **Automatic Retry**: Exponential backoff for transient errors
- **Task Decomposition**: Break down failed tasks into smaller subtasks
- **Context Reset**: Clear and rebuild context when corrupted
- **Manual Intervention**: Flag tasks requiring human input
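
One way to connect the error categories to these strategies is a small decision function. The sketch below is hypothetical: `ErrorCategory`, `RecoveryAction`, and both functions are illustrative names, the 1-second base delay is an assumption, and the multiplier of 2.0 and 300-second cap simply mirror the rate limiting configuration shown later in this document.

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ErrorCategory {
    Transient,
    Task,
    System,
}

#[derive(Debug, PartialEq)]
pub enum RecoveryAction {
    RetryAfter(Duration),
    Decompose,
    ManualIntervention,
}

/// Delay before retry number `attempt` (0-based):
/// base * multiplier^attempt, capped at `max`.
pub fn backoff_delay(attempt: u32, base: Duration, multiplier: f64, max: Duration) -> Duration {
    // Clamp the exponent so the cast to i32 can never wrap.
    let exponent = attempt.min(64) as i32;
    let secs = base.as_secs_f64() * multiplier.powi(exponent);
    Duration::from_secs_f64(secs.min(max.as_secs_f64()))
}

/// Maps an error category to a recovery strategy.
pub fn recovery_action(category: ErrorCategory, attempt: u32, max_retries: u32) -> RecoveryAction {
    match category {
        // Transient errors are retried with exponential backoff.
        ErrorCategory::Transient if attempt < max_retries => RecoveryAction::RetryAfter(
            backoff_delay(attempt, Duration::from_secs(1), 2.0, Duration::from_secs(300)),
        ),
        // Task errors are broken down into smaller subtasks.
        ErrorCategory::Task => RecoveryAction::Decompose,
        // System errors and exhausted retries need a human.
        _ => RecoveryAction::ManualIntervention,
    }
}
```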

## Configuration Management

### Agent Configuration

```toml
[session]
max_duration_hours = 8
checkpoint_interval_minutes = 30
max_concurrent_tasks = 3

[claude_code]
model = "claude-sonnet-4-20250514"
max_tokens = 4000
temperature = 0.1

[rate_limiting]
requests_per_minute = 30
tokens_per_minute = 100000
burst_allowance = 5

[docker]
image = "claude-code-agent:latest"
cpu_limit = "2.0"
memory_limit = "8g"
network_mode = "bridge"

[logging]
level = "info"
structured = true
include_claude_traces = true
```
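
On the Rust side, the `[session]` table might map to a struct like the one below. This is a sketch only: `SessionConfig` and its methods are hypothetical, the defaults are assumed to equal the example values above, and the TOML deserialization itself (for instance through serde) is omitted to keep the example dependency-free.

```rust
use std::time::Duration;

/// Mirrors the `[session]` table; field names follow the TOML keys.
#[derive(Debug, Clone, PartialEq)]
pub struct SessionConfig {
    pub max_duration_hours: u64,
    pub checkpoint_interval_minutes: u64,
    pub max_concurrent_tasks: usize,
}

impl Default for SessionConfig {
    fn default() -> Self {
        Self {
            max_duration_hours: 8,
            checkpoint_interval_minutes: 30,
            max_concurrent_tasks: 3,
        }
    }
}

impl SessionConfig {
    /// Rejects values that would make the execution loop meaningless.
    pub fn validate(&self) -> Result<(), String> {
        if self.checkpoint_interval_minutes == 0 {
            return Err("checkpoint_interval_minutes must be greater than 0".to_string());
        }
        if self.max_concurrent_tasks == 0 {
            return Err("max_concurrent_tasks must be at least 1".to_string());
        }
        Ok(())
    }

    /// True once the checkpoint interval has elapsed since the last checkpoint.
    pub fn should_checkpoint(&self, since_last_checkpoint: Duration) -> bool {
        since_last_checkpoint >= Duration::from_secs(self.checkpoint_interval_minutes * 60)
    }

    /// True once the session has run for its maximum duration.
    pub fn is_expired(&self, elapsed: Duration) -> bool {
        elapsed >= Duration::from_secs(self.max_duration_hours * 3600)
    }
}
```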

## Performance Considerations

### Optimization Strategies

1. **Context Reuse**: Maintain Claude Code session across related tasks
2. **Batching**: Group related file operations and API calls
3. **Caching**: Cache compilation results and analysis outputs
4. **Resource Management**: Monitor and limit container resource usage

### Monitoring Metrics

- Task completion rate and average time
- Claude Code API usage and costs
- Container resource utilization
- Error rates by category
- Session checkpoint and resume success rates
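
A few of these metrics can be derived from simple running totals. The following sketch assumes a hypothetical `SessionMetrics` struct with string category labels; cost and container metrics are not covered.

```rust
use std::collections::HashMap;
use std::time::Duration;

/// Running totals for one session; all names here are illustrative.
#[derive(Debug, Default)]
pub struct SessionMetrics {
    pub completed: u32,
    pub failed: u32,
    pub total_task_time: Duration,
    pub errors_by_category: HashMap<String, u32>,
}

impl SessionMetrics {
    pub fn record_success(&mut self, took: Duration) {
        self.completed += 1;
        self.total_task_time += took;
    }

    pub fn record_failure(&mut self, category: &str, took: Duration) {
        self.failed += 1;
        self.total_task_time += took;
        *self.errors_by_category.entry(category.to_string()).or_insert(0) += 1;
    }

    fn finished(&self) -> u32 {
        self.completed + self.failed
    }

    /// Fraction of finished tasks that completed; None before any task finishes.
    pub fn completion_rate(&self) -> Option<f64> {
        match self.finished() {
            0 => None,
            n => Some(f64::from(self.completed) / f64::from(n)),
        }
    }

    /// Mean wall-clock time per finished task, successful or not.
    pub fn average_task_time(&self) -> Option<Duration> {
        match self.finished() {
            0 => None,
            n => Some(self.total_task_time / n),
        }
    }
}
```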

## Security Considerations

### Container Security

- **Read-only repositories**: Prevent accidental source code modification
- **Isolated workspace**: Sandbox for all build and test operations
- **Resource limits**: Prevent resource exhaustion attacks
- **Network restrictions**: Limit container network access

### Data Security

- **Session encryption**: Encrypt sensitive session data at rest
- **Access logging**: Audit all file system and API access
- **Credential management**: Secure handling of API keys and tokens

## Deployment & Distribution

### Binary Distribution

```bash
# Build optimized release binary
cargo build --release

# The binary is self-contained with no Docker dependencies
# Requires only a configured LLM provider (claude CLI or API keys) at run time;
# the Rust toolchain is needed only to build
```

### Usage Examples

```bash
# Structured TOML task file
aca --task-file tasks.toml

# Intelligent Markdown parsing
aca --task-file tasks.md --use-intelligent-parser \
    --context "project context" \
    --append-system-prompt "additional instructions"

# Execution plan workflow
aca --task-file tasks.md --dry-run --dump-plan plan.json
# Review and edit plan.json
aca --execution-plan plan.json

# Resume from checkpoint
aca resume <checkpoint-id>

# List available checkpoints
aca --list-checkpoints
```

## Implementation References

This section consolidates key external references and examples that inform the implementation of specific components.

### Core References

| Component             | Reference                                                                                                    | Purpose                                          |
| --------------------- | ------------------------------------------------------------------------------------------------------------ | ------------------------------------------------ |
| **Headless SDK**      | [Claude Code SDK Documentation](https://docs.claude.com/en/docs/claude-code/sdk/sdk-headless)                | Primary API for programmatic Claude Code control |
| **Rate Limiting**     | [ccusage Session Blocks](https://github.com/ryoppippi/ccusage/blob/main/apps/ccusage/src/_session-blocks.ts) | Production-tested rate limiting patterns         |
| **Project Structure** | [ccusage Repository](https://github.com/ryoppippi/ccusage/tree/main)                                         | Overall architecture and organization patterns   |

### Component-Specific Implementation Guides

#### 1. Claude Code Integration

- **SDK Initialization**: Reference headless SDK docs for WebSocket connection setup
- **Message Protocol**: Use SDK documentation for structured request/response formats
- **Session Management**: Follow SDK patterns for maintaining persistent sessions

#### 2. Rate Limiting Implementation

From the ccusage codebase, key patterns to adapt:

```typescript
// Reference pattern from ccusage/_session-blocks.ts
interface SessionBlock {
  startTime: number;
  endTime: number;
  tokenUsage: number;
  requestCount: number;
}
```

- Session-based usage tracking
- Proactive blocking before hitting limits
- Exponential backoff with configurable multipliers
- Usage projection and early warning systems
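
The TypeScript shape above might translate to Rust roughly as follows. Field names are snake_case versions of the ones in the interface; `would_exceed` and `projected_usage` are hypothetical helpers, and the linear projection is an assumption made for illustration rather than ccusage's actual method.

```rust
/// Rust counterpart of the `SessionBlock` shape (times in epoch milliseconds).
#[derive(Debug, Clone)]
pub struct SessionBlock {
    pub start_time: u64,
    pub end_time: u64,
    pub token_usage: u64,
    pub request_count: u64,
}

impl SessionBlock {
    pub fn new(start_time: u64, duration_ms: u64) -> Self {
        Self {
            start_time,
            end_time: start_time.saturating_add(duration_ms),
            token_usage: 0,
            request_count: 0,
        }
    }

    /// Records one finished request and the tokens it consumed.
    pub fn record(&mut self, tokens: u64) {
        self.token_usage = self.token_usage.saturating_add(tokens);
        self.request_count += 1;
    }

    /// Proactive check: would this request push the block over its token budget?
    pub fn would_exceed(&self, estimated_tokens: u64, max_tokens: u64) -> bool {
        self.token_usage.saturating_add(estimated_tokens) > max_tokens
    }

    /// Linear projection of token usage at the end of the block,
    /// based on the burn rate observed so far.
    pub fn projected_usage(&self, now: u64) -> u64 {
        let total = self.end_time.saturating_sub(self.start_time);
        let elapsed = now.saturating_sub(self.start_time).min(total);
        if elapsed == 0 {
            return self.token_usage;
        }
        // u128 arithmetic avoids overflow in the intermediate product.
        (self.token_usage as u128 * total as u128 / elapsed as u128) as u64
    }
}
```

Checking `would_exceed` before sending a request is what makes the blocking proactive: the agent can wait for the next block instead of reacting to a rejected call.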

#### 3. Docker Container Management

- **Volume Mounting**: Use Docker API best practices for read-only repository mounts
- **Resource Limits**: Reference Docker documentation for CPU/memory constraints
- **Network Isolation**: Implement restricted networking for security

#### 4. Session Persistence

- **State Serialization**: Use Rust serde patterns for JSON-based persistence
- **Atomic Updates**: Implement write-ahead logging for session state changes
- **Recovery Logic**: Design idempotent operations for crash recovery
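
As a simpler stand-in for full write-ahead logging, session files can be replaced with a write-to-temporary-file-then-rename step, so a crash leaves either the old or the new file in place. The sketch below uses only the standard library; `write_atomic` is a hypothetical helper, and it assumes the temporary file lives on the same filesystem as the target.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Writes `contents` to `path` by way of a sibling `.tmp` file.
/// Re-running it after a crash is safe: the call is idempotent.
pub fn write_atomic(path: &Path, contents: &[u8]) -> io::Result<()> {
    let tmp = path.with_extension("tmp");
    {
        let mut file = File::create(&tmp)?;
        file.write_all(contents)?;
        // Flush to disk before the rename makes the new data visible.
        file.sync_all()?;
    }
    // Replaces the target in one step; readers never see a partial file.
    fs::rename(&tmp, path)
}
```

The serialized bytes would typically come from a serde-based JSON encoder, as the State Serialization bullet suggests.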

### Configuration References

#### Docker Configuration

```yaml
# Reference Docker Compose patterns for volume mounting
services:
  agent:
    volumes:
      - "${REPOS_PATH}:/repos:ro"
      - "${WORKSPACE_PATH}:/workspace:rw"
      - "${SESSION_PATH}:/session:rw"
    deploy:
      resources:
        limits:
          cpus: "2.0"
          memory: 8G
```

#### Rate Limiting Configuration

Based on ccusage patterns:

```toml
[rate_limiting]
# Adapt from ccusage session management
session_duration_minutes = 60
max_tokens_per_session = 100000
max_requests_per_minute = 30
backoff_multiplier = 2.0
max_backoff_seconds = 300
```

## Future Enhancements (Planned)

### Docker Containerization
- Isolated execution environments
- Read-only repository mounts
- Separate workspace for modifications
- Resource limits (CPU, memory, network)
- Container lifecycle management

### Advanced Features
1. **Multi-model Support**: Support for different Claude models based on task complexity
2. **Distributed Execution**: Run multiple agent containers in parallel
3. **Web Dashboard**: Real-time monitoring and control interface
4. **Plugin System**: Extensible task handlers for specific domains
5. **Integration APIs**: REST/GraphQL APIs for external system integration
6. **Headless Claude Code SDK**: Direct WebSocket integration for better control

### Scalability Considerations
- **Horizontal scaling**: Multiple agent containers with shared session state
- **Resource optimization**: Dynamic container sizing based on workload
- **State sharding**: Distribute large session states across multiple storage backends

---

This design provides a robust foundation for building a sophisticated agentic coding assistant that can handle complex, long-running development tasks with full persistence and resumability.

---

## Implementation Roadmap

The implementation should proceed through the following phases:

### Phase 1: Core Foundation (Deliverable 1.1)
- Basic CLI and configuration system
- Docker container management
- Session initialization and cleanup

### Phase 2: Task Management (Deliverable 1.2)
- Task tree data structure and operations
- Basic task scheduling and execution
- Task persistence and state management

### Phase 3: Session Persistence (Deliverable 1.3)
- Comprehensive state persistence system
- Checkpoint and recovery mechanisms
- Data integrity and validation

### Phase 4: Claude Integration (Deliverable 1.4)
- Headless SDK integration
- Rate limiting and usage management
- Context optimization and error recovery

### Phase 5: Production Deployment (Deliverable 1.5)
- Advanced container orchestration
- Volume management and security
- Resource monitoring and optimization

### Phase 6: Security & Operations (Deliverable 1.6)
- Security controls and compliance
- Monitoring and alerting systems
- Performance optimization and analytics

Each phase builds upon the previous ones, allowing for incremental development and testing while maintaining a clear path toward the complete system.