prodigy 0.1.6

Turn ad-hoc Claude sessions into reproducible development pipelines with parallel AI agents
Documentation
# Architecture Assessment: Workflow Execution Paths

## Current State Analysis

### Problem: Multiple Execution Paths

The current implementation has **divergent execution paths** for workflows depending on how they're invoked:

1. **Standard Workflow Path** (`execute_workflow`)
   - Uses `WorkflowExecutor` directly
   - Properly handles validation, on_failure, on_success, etc.
   - Full feature support

2. **Args/Map Path** (`execute_workflow_with_args`) 
   - Was converting `WorkflowCommand``Command` (losing features)
   - Direct command execution without WorkflowExecutor
   - **Lost validation configuration** and other workflow features
   - Now fixed but still represents a separate code path

3. **MapReduce Path** (`execute_mapreduce_workflow`)
   - Separate implementation for parallel execution
   - Different from sequential workflows

### Root Cause

The architecture evolved organically with features being added incrementally:
- Initial implementation supported simple sequential workflows
- `--args` and `--map` were added as special cases
- MapReduce was added as a separate execution mode
- Validation was added to WorkflowStep but not propagated to all paths

This led to **feature inconsistency** where new workflow features (like validation) only worked in certain execution paths.

## Issues Identified

### 1. Code Duplication
- Command execution logic duplicated between orchestrator and executor
- Variable substitution logic in multiple places
- Git verification logic repeated

### 2. Inconsistent Feature Support
- Validation only worked in standard workflow path
- `on_failure`, `on_success` handlers inconsistently available
- Capture output behavior varies by path

### 3. Testing Complexity
- Need to test same features across multiple paths
- Mock setup is complex due to many dependencies
- Easy to miss edge cases (as evidenced by validation bug)

### 4. Maintenance Burden
- New features must be added to multiple code paths
- Bug fixes may need to be applied in multiple places
- Risk of regression when modifying one path

## Recommended Architecture

### Short-term Fix (Implemented)
✅ Modified `execute_workflow_with_args` to use WorkflowExecutor
✅ Preserved validation configuration in command conversion
✅ Added test coverage for validation in all paths

### Long-term Refactoring

#### 1. Unified Execution Model

```rust
// Single workflow executor that handles all modes
pub struct UnifiedWorkflowExecutor {
    mode: ExecutionMode,
    executor: Arc<dyn CommandExecutor>,
}

pub enum ExecutionMode {
    Sequential,
    Parallel { max_workers: usize },
    MapReduce { map: Phase, reduce: Phase },
}

impl UnifiedWorkflowExecutor {
    pub async fn execute(&self, workflow: Workflow, inputs: Vec<Input>) -> Result<()> {
        match self.mode {
            Sequential => self.execute_sequential(workflow, inputs).await,
            Parallel { max_workers } => self.execute_parallel(workflow, inputs, max_workers).await,
            MapReduce { map, reduce } => self.execute_mapreduce(map, reduce, inputs).await,
        }
    }
}
```

#### 2. Input Abstraction

```rust
pub enum WorkflowInput {
    Single,                    // No args/map
    Args(Vec<String>),        // Direct arguments
    Map(Vec<PathBuf>),        // File patterns
    MapReduce(MapReduceConfig), // Complex parallel execution
}

// Workflow always gets inputs the same way
impl Workflow {
    pub fn with_inputs(self, inputs: WorkflowInput) -> ExecutableWorkflow {
        // Convert to standard format
    }
}
```

#### 3. Single Command Execution Path

```rust
// All commands go through one executor
pub trait CommandExecutor {
    async fn execute(&self, command: Command, context: Context) -> Result<Output>;
}

// WorkflowStep is the canonical representation
pub struct WorkflowStep {
    command: CommandSpec,
    validation: Option<ValidationConfig>,
    handlers: Handlers,
    // ... all features in one place
}
```

## Benefits of Refactoring

### 1. Consistency
- All features available in all execution modes
- Single source of truth for command execution
- Predictable behavior across invocation methods

### 2. Testability
- Test features once, work everywhere
- Simpler mock setup
- Better coverage with fewer tests

### 3. Maintainability
- New features added in one place
- Bug fixes apply universally
- Clear separation of concerns

### 4. Performance
- Opportunity to optimize single code path
- Better caching and reuse
- Cleaner async execution

## Migration Strategy

### Phase 1: Consolidate Execution (Current)
- [x] Fix immediate validation bug
- [x] Add comprehensive tests
- [ ] Document current architecture

### Phase 2: Refactor Core Execution
- [ ] Create unified CommandExecutor trait
- [ ] Implement WorkflowStep as canonical representation
- [ ] Migrate all paths to use WorkflowExecutor

### Phase 3: Simplify Orchestration
- [ ] Remove duplicate execution methods
- [ ] Consolidate variable substitution
- [ ] Unify error handling

### Phase 4: Enhance Testing
- [ ] Create integration test suite
- [ ] Add property-based tests
- [ ] Benchmark performance

## Conclusion

The current architecture works but has accumulated technical debt through organic growth. The validation bug exposed a fundamental issue: **features aren't consistently available across execution paths**.

### Immediate Recommendations
1. **Document the current execution paths** clearly
2. **Add integration tests** that verify features work with `--args`, `--map`, and standard execution
3. **Create a feature matrix** showing which features work in which modes

### Long-term Recommendations
1. **Refactor to a unified execution model** where all workflows go through the same pipeline
2. **Treat WorkflowStep as the canonical representation** and convert all inputs to this format early
3. **Separate concerns** - orchestration (what to run) from execution (how to run it)

This refactoring would prevent bugs like the validation issue and make the codebase more maintainable and extensible.