# Sprint 64 Kickoff: Mutation Testing - Testing, Examples, and Documentation

**Sprint**: Sprint 64
**Target Version**: v2.177.0
**Duration**: 3 days
**Start Date**: October 27, 2025
**Focus**: Comprehensive testing infrastructure, example projects, CI/CD integration, and documentation

---

## Context

### Previous Sprint Completion
- **Sprint 63**: ✅ Complete - Multi-Language Mutation Testing Support (v2.176.0)
  - Centralized language detection (Language enum)
  - 6 languages supported (Rust, Python, TypeScript, JavaScript, Go, C++)
  - 19 comprehensive tests (100% passing)
  - Published to crates.io on October 27, 2025
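
As a rough illustration of what centralized detection can look like, the sketch below maps file extensions to a `Language` enum covering the six languages listed above. The variant names, the `from_path` function, and the extension table are assumptions made for illustration; the actual v2.176.0 implementation may differ.

```rust
use std::path::Path;

/// Hypothetical stand-in for the centralized `Language` enum from Sprint 63.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Language {
    Rust,
    Python,
    TypeScript,
    JavaScript,
    Go,
    Cpp,
}

impl Language {
    /// Map a file extension to a language.
    /// The extension table is an assumption, not PMAT's actual table.
    pub fn from_path(path: &Path) -> Option<Self> {
        let ext = path.extension()?.to_str()?.to_ascii_lowercase();
        match ext.as_str() {
            "rs" => Some(Self::Rust),
            "py" => Some(Self::Python),
            "ts" | "tsx" => Some(Self::TypeScript),
            "js" | "jsx" | "mjs" => Some(Self::JavaScript),
            "go" => Some(Self::Go),
            "cpp" | "cc" | "cxx" | "hpp" => Some(Self::Cpp),
            _ => None,
        }
    }
}

#[test]
fn detects_language_from_extension() {
    assert_eq!(Language::from_path(Path::new("src/lib.rs")), Some(Language::Rust));
    assert_eq!(Language::from_path(Path::new("README.md")), None);
}
```

Returning `Option` keeps unsupported files out of the mutation pipeline without treating them as errors.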

### Current State
- **Mutation Testing Feature**: Core functionality implemented and released
- **Language Support**: 6 languages with centralized detection
- **Test Coverage**: Basic tests exist, but comprehensive test suite needed
- **Documentation**: Feature documented in CHANGELOG, needs user guides
- **Examples**: No example projects exist yet
- **CI/CD**: No integration guides available

---

## Sprint 64 Objectives

### Primary Goals
1. **Build Comprehensive Test Suite** for mutation testing feature
2. **Create Example Projects** demonstrating mutation testing in 3 languages
3. **Develop CI/CD Integration Guides** for popular platforms
4. **Establish Performance Benchmarks** and optimization targets
5. **Write User Documentation** and best practices guides

### Success Criteria
- ✅ Test coverage >85% for mutation feature
- ✅ 3 example projects created (Rust, Python, TypeScript)
- ✅ 3 CI/CD integration guides written (GitHub Actions, GitLab CI, Jenkins)
- ✅ Performance benchmarks established
- ✅ Mutation score badge generation implemented
- ✅ User guide and best practices documentation complete

---

## Day 1: Testing Infrastructure

### Objectives
- Implement comprehensive test suite for mutation testing
- Achieve >85% test coverage for mutation feature
- Establish testing patterns for future development

### Tasks

#### 1. Unit Tests for Mutation Handler (~50 tests)
**File**: `server/src/cli/handlers/mutate.rs`

**Test Categories**:
- Command argument parsing
- Output format selection (text, JSON, markdown)
- Failures-only filtering
- Color coding logic
- Error handling
- Progress indicator functionality

**Example Tests**:
```rust
#[test]
fn test_mutate_handler_text_output_format() { }

#[test]
fn test_mutate_handler_json_output_format() { }

#[test]
fn test_mutate_handler_markdown_output_format() { }

#[test]
fn test_mutate_handler_failures_only_filter() { }

#[test]
fn test_mutate_handler_invalid_target_error() { }

#[test]
fn test_mutate_handler_timeout_configuration() { }
```
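
To give the format-selection and error-handling categories a concrete shape, here is a minimal sketch of a filled-in test pair. It assumes the output format is parsed from a string via `FromStr`; the `OutputFormat` variants, the accepted spellings, and the error type are hypothetical and may not match the handler in `mutate.rs`.

```rust
use std::str::FromStr;

/// Hypothetical output format type; the real handler's type may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OutputFormat {
    Text,
    Json,
    Markdown,
}

impl FromStr for OutputFormat {
    type Err = String;

    /// Case-insensitive parse; unknown values produce a descriptive error.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim().to_ascii_lowercase().as_str() {
            "text" => Ok(Self::Text),
            "json" => Ok(Self::Json),
            "markdown" | "md" => Ok(Self::Markdown),
            other => Err(format!("unsupported output format: {}", other)),
        }
    }
}

#[test]
fn test_output_format_parses_known_values() {
    assert_eq!("json".parse::<OutputFormat>(), Ok(OutputFormat::Json));
    assert_eq!("Markdown".parse::<OutputFormat>(), Ok(OutputFormat::Markdown));
}

#[test]
fn test_output_format_rejects_unknown_value() {
    assert!("yaml".parse::<OutputFormat>().is_err());
}
```

Pairing each accepted value with at least one rejected value keeps the error path covered alongside the happy path.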

#### 2. Integration Tests for Full Workflow (~20 tests)
**File**: `server/tests/mutation_integration_tests.rs`

**Test Scenarios**:
- End-to-end mutation testing for each supported language
- Multi-file mutation testing
- Large file handling (>1000 lines)
- Concurrent mutation execution
- Error recovery and resilience

**Example Tests**:
```rust
#[test]
fn test_rust_mutation_full_workflow() { }

#[test]
fn test_python_mutation_full_workflow() { }

#[test]
fn test_typescript_mutation_full_workflow() { }

#[test]
fn test_multi_file_mutation_testing() { }

#[test]
fn test_large_file_mutation_performance() { }
```

#### 3. Property-Based Tests with proptest (~10 tests)
**File**: `server/tests/mutation_property_tests.rs`

**Properties to Test**:
- Mutant generation is deterministic for same input
- All generated mutants are syntactically valid
- Mutation score is always between 0.0 and 1.0
- Killed mutant count ≤ total mutant count
- Output format consistency across languages

**Example Tests**:
```rust
proptest! {
    #[test]
    fn test_mutant_generation_deterministic(code: String) { }

    #[test]
    fn test_mutation_score_bounded(mutants: Vec<Mutant>) { }

    #[test]
    fn test_output_format_consistency(format: OutputFormat) { }
}
```
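
The two numeric invariants (score within 0.0 to 1.0, killed count no greater than total) can be sketched without any external crate. The `MutationSummary` type below is hypothetical, and the exhaustive loop over small inputs stands in for proptest-generated values; treating an empty run as a score of 0.0 is also an assumption.

```rust
/// Hypothetical summary of a mutation run.
#[derive(Debug, Clone, Copy)]
pub struct MutationSummary {
    pub killed: usize,
    pub total: usize,
}

impl MutationSummary {
    /// Returns `None` when `killed > total`, which would violate the invariant.
    pub fn new(killed: usize, total: usize) -> Option<Self> {
        if killed <= total {
            Some(Self { killed, total })
        } else {
            None
        }
    }

    /// Fraction of mutants killed; 0.0 when there are no mutants (an assumption).
    pub fn score(&self) -> f64 {
        if self.total == 0 {
            0.0
        } else {
            self.killed as f64 / self.total as f64
        }
    }
}

#[test]
fn test_mutation_score_bounded_small_inputs() {
    // Exhaustive check over a small range, standing in for generated inputs.
    for total in 0..=50usize {
        for killed in 0..=total {
            let score = MutationSummary::new(killed, total).unwrap().score();
            assert!((0.0..=1.0).contains(&score));
        }
        assert!(MutationSummary::new(total + 1, total).is_none());
    }
}
```

Enforcing `killed <= total` in the constructor means the score bound follows from the type rather than from each caller.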

### Deliverables
- [ ] 50+ unit tests for mutation handler
- [ ] 20+ integration tests for full workflow
- [ ] 10+ property-based tests
- [ ] Test coverage report showing >85% coverage
- [ ] CI integration for automated test execution

---

## Day 2: Example Projects and CI/CD Integration

### Objectives
- Create working example projects for 3 languages
- Develop CI/CD integration guides for 3 platforms
- Demonstrate real-world mutation testing usage

### Tasks

#### 1. Rust Example Project
**Directory**: `examples/rust-mutation-testing/`

**Structure**:
```
examples/rust-mutation-testing/
├── Cargo.toml
├── README.md
├── src/
│   ├── lib.rs (calculator library with tests)
│   └── validator.rs (input validation)
├── tests/
│   └── integration_tests.rs
└── .github/
    └── workflows/
        └── mutation-testing.yml
```

**Features**:
- Simple library with 5-10 functions
- Comprehensive unit tests
- Integration tests
- GitHub Actions workflow for mutation testing
- Mutation score badge in README

#### 2. Python Example Project
**Directory**: `examples/python-mutation-testing/`

**Structure**:
```
examples/python-mutation-testing/
├── pyproject.toml
├── README.md
├── src/
│   ├── calculator.py
│   └── validator.py
├── tests/
│   ├── test_calculator.py
│   └── test_validator.py
└── .github/
    └── workflows/
        └── mutation-testing.yml
```

**Features**:
- Python package with pytest
- Type hints
- GitHub Actions workflow
- Mutation score tracking

#### 3. TypeScript Example Project
**Directory**: `examples/typescript-mutation-testing/`

**Structure**:
```
examples/typescript-mutation-testing/
├── package.json
├── tsconfig.json
├── README.md
├── src/
│   ├── calculator.ts
│   └── validator.ts
├── tests/
│   ├── calculator.test.ts
│   └── validator.test.ts
└── .github/
    └── workflows/
        └── mutation-testing.yml
```

**Features**:
- TypeScript project with Jest
- Type-safe implementation
- GitHub Actions workflow
- npm package structure

#### 4. CI/CD Integration Guides

**Guide 1: GitHub Actions** (`docs/guides/mutation-testing-github-actions.md`)
```yaml
name: Mutation Testing

on: [push, pull_request]

jobs:
  mutation-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install pmat
        run: cargo install pmat
      - name: Run mutation tests
        # Write JSON to a file so the upload step has something to publish
        run: pmat mutate --target src/ --failures-only --output-format json > mutation_results.json
      - name: Upload results
        uses: actions/upload-artifact@v4
        with:
          name: mutation-results
          path: mutation_results.json
```

**Guide 2: GitLab CI** (`docs/guides/mutation-testing-gitlab-ci.md`)
```yaml
mutation-testing:
  image: rust:latest
  stage: test
  script:
    - cargo install pmat
    - pmat mutate --target src/ --output-format json > mutation_results.json
  artifacts:
    paths:
      - mutation_results.json
```

**Guide 3: Jenkins** (`docs/guides/mutation-testing-jenkins.md`)
```groovy
pipeline {
    agent any
    stages {
        stage('Mutation Testing') {
            steps {
                sh 'cargo install pmat'
                sh 'pmat mutate --target src/ --failures-only'
            }
        }
    }
}
```

### Deliverables
- [ ] 3 complete example projects (Rust, Python, TypeScript)
- [ ] Each example project has README with setup instructions
- [ ] 3 CI/CD integration guides (GitHub Actions, GitLab CI, Jenkins)
- [ ] Each guide includes badge generation
- [ ] Examples demonstrate best practices

---

## Day 3: Performance Benchmarking and Documentation

### Objectives
- Establish performance benchmarks
- Implement mutation score badge generation
- Write comprehensive user documentation

### Tasks

#### 1. Performance Benchmarking
**File**: `server/benches/mutation_benchmarks.rs`

**Benchmarks**:
- Mutant generation speed (mutants/second)
- Large file processing (>1000 lines)
- Multi-file project analysis
- Language-specific performance comparisons
- Memory usage profiling

**Target Metrics**:
- Rust: >100 mutants/second
- Python: >50 mutants/second
- TypeScript: >50 mutants/second
- Memory: <500MB for 1000+ mutants

**Implementation**:
```rust
use criterion::{black_box, criterion_group, criterion_main, Criterion};

fn benchmark_rust_mutation(c: &mut Criterion) {
    c.bench_function("rust mutant generation", |b| {
        b.iter(|| {
            // Benchmark mutant generation for Rust code
        });
    });
}

criterion_group!(benches, benchmark_rust_mutation);
criterion_main!(benches);
```
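
Independently of criterion, the mutants/second targets can be checked with a small std-only helper. This is a sketch under stated assumptions: `mutants_per_second`, `measure`, and `meets_target` are hypothetical names, and the closure passed to `measure` is a placeholder for whatever actually generates mutants.

```rust
use std::time::{Duration, Instant};

/// Mutants generated per second for a measured run.
pub fn mutants_per_second(mutants: usize, elapsed: Duration) -> f64 {
    let secs = elapsed.as_secs_f64();
    if secs == 0.0 {
        f64::INFINITY
    } else {
        mutants as f64 / secs
    }
}

/// Times `generate` once and returns (mutant count, throughput).
/// `generate` is a placeholder for the real mutant generator.
pub fn measure<F: FnOnce() -> usize>(generate: F) -> (usize, f64) {
    let start = Instant::now();
    let count = generate();
    (count, mutants_per_second(count, start.elapsed()))
}

/// The targets are stated as strict lower bounds (">100", ">50"),
/// so equality does not count as meeting the target.
pub fn meets_target(throughput: f64, target: f64) -> bool {
    throughput > target
}
```

A single timed run is noisy; criterion's repeated sampling remains the better source for the baseline numbers, with a helper like this serving only as a coarse pass/fail gate.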

#### 2. Mutation Score Badge Generation
**Feature**: Generate SVG badges for mutation scores

**Implementation**:
```rust
// server/src/services/mutation/badge_generator.rs

pub struct BadgeGenerator;

impl BadgeGenerator {
    pub fn generate_svg(score: f64) -> String {
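        // Hex values follow the shields.io "brightgreen", "yellow", and "red" palette.
        let color = match score {
            s if s >= 0.8 => "#4c1",     // ≥80% = green
            s if s >= 0.6 => "#dfb317",  // 60-80% = yellow
            _ => "#e05d44",              // <60% = red
        };

        // Fill the badge background with the color for this score band.
        format!(
            r#"<svg xmlns="http://www.w3.org/2000/svg" width="140" height="20">
                <rect width="140" height="20" fill="{}"/>
                <text x="10" y="14">Mutation Score: {:.1}%</text>
            </svg>"#,
            color,
            score * 100.0
        )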
    }
}
```
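
The score bands are easiest to get wrong at their edges, so a boundary check is worth sketching. The helper names `badge_color` and `normalized` are hypothetical, and clamping out-of-range or NaN scores is an assumption the plan does not specify.

```rust
/// Color bands as stated in the plan: >=80% green, 60-80% yellow, below 60% red.
pub fn badge_color(score: f64) -> &'static str {
    if score >= 0.8 {
        "brightgreen"
    } else if score >= 0.6 {
        "yellow"
    } else {
        "red"
    }
}

/// Clamp a score into 0.0..=1.0 and map NaN to 0.0 (an assumption).
pub fn normalized(score: f64) -> f64 {
    if score.is_nan() {
        0.0
    } else {
        score.clamp(0.0, 1.0)
    }
}

#[test]
fn test_badge_color_boundaries() {
    assert_eq!(badge_color(0.8), "brightgreen");
    assert_eq!(badge_color(0.6), "yellow");
    assert_eq!(badge_color(0.59), "red");
}
```

Normalizing before choosing a color avoids rendering a badge such as "150.0%" if an upstream calculation misbehaves.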

**CLI Integration**:
<!-- pmat:ignore-link -->
```bash
# Generate badge SVG
pmat mutate --target src/ --output-badge mutation-score.svg

# Reference the badge from README.md with this Markdown line:
#   ![Mutation Score](./mutation-score.svg)
```

#### 3. User Documentation

**Guide 1**: `docs/guides/mutation-testing.md`
- Introduction to mutation testing
- How PMAT mutation testing works
- Command reference
- Output formats
- Best practices
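
For the introductory section, a toy mutation operator may help readers see what a mutant is. The sketch below is not PMAT's engine: `arithmetic_mutants` is a hypothetical function that works on raw text, whereas a real engine would typically operate on a syntax tree.

```rust
/// A toy mutation operator: replace each `+` with `-`, one occurrence per mutant.
/// Works on raw text for illustration only.
pub fn arithmetic_mutants(source: &str) -> Vec<String> {
    source
        .char_indices()
        .filter(|&(_, c)| c == '+')
        .map(|(i, _)| {
            // `+` is a single byte, so `i + 1` is always a valid char boundary.
            let mut mutant = String::with_capacity(source.len());
            mutant.push_str(&source[..i]);
            mutant.push('-');
            mutant.push_str(&source[i + 1..]);
            mutant
        })
        .collect()
}

#[test]
fn test_arithmetic_mutants_one_per_operator() {
    let mutants = arithmetic_mutants("a + b + c");
    assert_eq!(mutants.len(), 2);
    assert_eq!(mutants[0], "a - b + c");
}
```

A test suite "kills" a mutant when at least one test fails against the mutated source; mutants that survive point at behavior the tests do not pin down.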

**Guide 2**: `docs/guides/mutation-testing-best-practices.md`
- Writing testable code
- Interpreting mutation scores
- Common pitfalls
- Performance optimization
- CI/CD integration strategies

**Guide 3**: `examples/mutation_testing_workflow.md`
- Step-by-step workflow
- Real-world scenarios
- Troubleshooting guide
- FAQ

### Deliverables
- [ ] Performance benchmark suite
- [ ] Baseline performance metrics documented
- [ ] Badge generation feature implemented
- [ ] User guide (mutation-testing.md)
- [ ] Best practices guide
- [ ] Workflow examples

---

## Testing Strategy

### Unit Testing
- Focus on individual components
- Mock external dependencies
- Test edge cases and error conditions
- Aim for >90% code coverage at the unit level (the sprint-wide gate is >85%)

### Integration Testing
- Test end-to-end workflows
- Use real file system (temporary directories)
- Test all supported languages
- Validate output formats
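
A temporary-directory fixture for these tests might look like the following. It is a std-only stand-in for the `tempfile` crate listed under dependencies; `ScratchDir` and its methods are hypothetical names, and the directory naming scheme is an assumption.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::process;
use std::time::{SystemTime, UNIX_EPOCH};

/// Minimal temporary directory that is removed when dropped.
pub struct ScratchDir {
    path: PathBuf,
}

impl ScratchDir {
    /// Create a uniquely named directory under the system temp dir.
    pub fn new(label: &str) -> io::Result<Self> {
        let nanos = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .map(|d| d.as_nanos())
            .unwrap_or(0);
        let name = format!("{}-{}-{}", label, process::id(), nanos);
        let path = std::env::temp_dir().join(name);
        fs::create_dir_all(&path)?;
        Ok(Self { path })
    }

    pub fn path(&self) -> &Path {
        &self.path
    }

    /// Write a fixture file at a path relative to the scratch directory.
    pub fn write(&self, rel: &str, contents: &str) -> io::Result<PathBuf> {
        let file = self.path.join(rel);
        if let Some(parent) = file.parent() {
            fs::create_dir_all(parent)?;
        }
        fs::write(&file, contents)?;
        Ok(file)
    }
}

impl Drop for ScratchDir {
    fn drop(&mut self) {
        // Best-effort cleanup; errors are ignored so drop never panics.
        let _ = fs::remove_dir_all(&self.path);
    }
}
```

Each integration test can create its own `ScratchDir`, write source fixtures for the language under test, and rely on drop for cleanup even when an assertion fails.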

### Property-Based Testing
- Use proptest for invariant testing
- Generate random test inputs
- Verify mathematical properties
- Ensure deterministic behavior

### Performance Testing
- Use criterion for benchmarking
- Establish baseline metrics
- Track performance regressions
- Profile memory usage

---

## Quality Gates

### Before Each Commit
1. ✅ Run clippy: `cargo clippy --all-targets --all-features`
2. ✅ Run tests: `cargo test --all-features`
3. ✅ Run benchmarks: `cargo bench` (Day 3)
4. ✅ Check test coverage: `cargo llvm-cov --all-features`

### Before Release (v2.177.0)
1. ✅ All tests passing (unit, integration, property-based)
2. ✅ Test coverage >85%
3. ✅ All example projects working
4. ✅ All CI/CD guides tested
5. ✅ Documentation reviewed and complete
6. ✅ Performance benchmarks meet targets
7. ✅ Update CHANGELOG.md
8. ✅ Update version in Cargo.toml

---

## Success Metrics

### Testing
- [ ] 50+ unit tests for mutation handler
- [ ] 20+ integration tests for workflows
- [ ] 10+ property-based tests
- [ ] >85% test coverage

### Examples
- [ ] 3 example projects (Rust, Python, TypeScript)
- [ ] Each example has working CI/CD workflow
- [ ] Each example includes mutation score badge

### Documentation
- [ ] 3 CI/CD integration guides
- [ ] User guide and best practices
- [ ] API documentation complete

### Performance
- [ ] Benchmarks establish baseline metrics
- [ ] Performance competitive with existing tools
- [ ] Memory usage <500MB for large projects

---

## Risk Mitigation

### Risk 1: Test Implementation Complexity
**Mitigation**: Start with simple unit tests, progressively add complexity

### Risk 2: Example Project Scope Creep
**Mitigation**: Keep examples simple (5-10 functions each), focus on clarity

### Risk 3: Performance Benchmarking Variability
**Mitigation**: Run benchmarks multiple times, use statistical analysis

### Risk 4: Documentation Completeness
**Mitigation**: Follow template structure, peer review before completion

---

## Dependencies

### External Dependencies
- `proptest` - Property-based testing framework
- `criterion` - Benchmarking framework
- `tempfile` - Temporary file/directory creation for tests

### Internal Dependencies
- Mutation engine (existing)
- Language detection (v2.176.0)
- Output formatters (existing)

---

## Resources

### Code References
- **Mutation Handler**: `server/src/cli/handlers/mutate.rs`
- **Mutation Engine**: `server/src/services/mutation/engine.rs`
- **Language Detector**: `server/src/services/mutation/language_detector.rs`
- **Mutation Types**: `server/src/services/mutation/types.rs`

### Documentation
- **Sprint 62-64 Roadmap**: `docs/execution/SPRINT-62-64-ROADMAP.md`
- **Sprint 63 Kickoff**: `docs/execution/SPRINT-63-KICKOFF.md`
- **NEXT-STEPS**: `NEXT-STEPS.md`

---

## Daily Checklist

### Day 1
- [ ] Create test files for unit tests
- [ ] Implement 50+ unit tests
- [ ] Create integration test suite
- [ ] Implement 20+ integration tests
- [ ] Create property test suite
- [ ] Implement 10+ property tests
- [ ] Run coverage analysis
- [ ] Document test patterns

### Day 2
- [ ] Create Rust example project
- [ ] Create Python example project
- [ ] Create TypeScript example project
- [ ] Write GitHub Actions guide
- [ ] Write GitLab CI guide
- [ ] Write Jenkins guide
- [ ] Test all examples with CI/CD workflows

### Day 3
- [ ] Implement benchmark suite
- [ ] Run baseline benchmarks
- [ ] Implement badge generation
- [ ] Write user guide
- [ ] Write best practices guide
- [ ] Write workflow examples
- [ ] Final quality gate checks

---

## Contact

**Project Maintainer**: Noah Gift (@noahgift)
**Repository**: https://github.com/paiml/paiml-mcp-agent-toolkit
**Issues**: https://github.com/paiml/paiml-mcp-agent-toolkit/issues

---

**Created**: October 27, 2025
**Sprint Duration**: 3 days
**Target Version**: v2.177.0
**Previous Sprint**: Sprint 63 (v2.176.0 - Multi-Language Mutation Testing Support)