# Dockerfile Testing Parity Implementation Plan
## Executive Summary
**Goal**: Achieve testing parity between Dockerfile and Makefile/script.sh transformations
**Current State Snapshot** (as of November 11, 2025):
- **Dockerfile**: 16 CLI tests, ~75% coverage, 0 property tests, 0 mutation tests
- **Makefile**: Comprehensive unit + property + CLI tests, ~88% coverage, mutation testing in progress
- **script.sh**: 6004+ tests, property + mutation + coverage >85%
**Target State**:
- **Dockerfile**: 50+ CLI tests, >85% coverage, 100+ property test cases, >90% mutation kill rate
- **Unified Quality**: Property testing, mutation testing, edge case coverage across all three
**Estimated Effort**: 280-320 hours (8-10 weeks at full-time development)
**Priority**: P2 (after script.sh completion, before v7.0 release)
---
## Current Testing Landscape
### Dockerfile (INCOMPLETE)
**File Locations**:
- CLI Tests: `/home/noah/src/bashrs/rash/tests/cli_dockerfile_purify.rs` (529 lines, 16 tests)
- Property Tests: `/home/noah/src/bashrs/rash/tests/property_dockerfile_purify.rs` (454 lines, 10 proptest blocks + 4 edge case tests)
- Source: `/home/noah/src/bashrs/rash/src/linter/rules/docker*.rs` (5 rule files)
**Current Coverage Analysis**:
```
Dockerfile CLI Tests (16 tests):
├── Command existence (1 test)
├── Basic purification (2 tests)
├── DOCKER001 - Add USER (3 tests)
├── DOCKER002 - Pin base images (2 tests)
├── DOCKER003 - Add cleanup (2 tests)
├── DOCKER005 - --no-install-recommends (1 test)
├── DOCKER006 - ADD → COPY (2 tests)
└── CLI options (1 test)
Property Tests: 14 tests
├── Determinism (1 proptest block)
├── Idempotency (1 proptest block)
├── Preservation properties (4 blocks)
├── Edge case tests (4 tests)
└── Stress tests (3 tests)
Coverage Gap: ~75% (estimate)
- Missing: Error handling, recovery, edge cases
- Missing: Integration with other rules
- Missing: Performance characteristics
- Missing: CLI flag combinations
```
### Makefile (REFERENCE)
**File Locations**:
- CLI Tests: `/home/noah/src/bashrs/rash/tests/cli_make_*.rs` (multiple files)
- Unit Tests: `/home/noah/src/bashrs/rash/src/make_parser/tests.rs`
- Property Tests: `/home/noah/src/bashrs/rash/tests/make_formatting_property_tests.rs`
- Coverage: ~88%
**Makefile Test Pattern**:
```
Unit Tests (inline):
├── Lexer tests (20+)
├── Parser tests (50+)
├── Transformer tests (30+)
└── Emitter tests (25+)
CLI Tests:
├── Basic operations (10 tests)
├── Error handling (8 tests)
├── Flag combinations (12 tests)
└── Integration scenarios (15 tests)
Property Tests:
├── Determinism (1 block, 100 cases)
├── Idempotency (1 block, 100 cases)
├── Preservation properties (5 blocks, 500 cases)
└── Syntax validity (1 block, 100 cases)
```
### script.sh (GOLD STANDARD)
**Test Infrastructure**:
- 6004+ total tests
- Property-based testing: 100+ cases per property
- Mutation testing: >90% kill rate
- Coverage: >85% across all modules
- EXTREME TDD for all components
- assert_cmd for all CLI tests
**Key Patterns to Follow**:
1. RED → GREEN → REFACTOR cycle for each feature
2. Property tests covering 100+ cases
3. Mutation testing integrated into CI/CD
4. Edge case cataloging and testing
5. Integration tests alongside unit tests
---
## Phase 1: Test Infrastructure Enhancement (Weeks 1-2, 40 hours)
### Phase 1.1: Extend CLI Tests (20 hours)
**Objectives**:
- Expand from 16 to 35+ CLI tests
- Cover all Docker transformations (DOCKER001-DOCKER010)
- Add error handling paths
- Add flag combination tests
**EXTREME TDD Approach**:
```rust
// RED: Write failing tests first
#[test]
fn test_DOCKER_001_cli_user_directive_basic() {
// ARRANGE: Create test Dockerfile without USER
let dockerfile = "FROM debian:12\nCMD [\"bash\"]";
let temp_file = write_temp_dockerfile(dockerfile);
// ACT: Run purify command
bashrs_cmd()
.arg("dockerfile")
.arg("purify")
.arg(&temp_file)
.assert()
.success()
.stdout(predicate::str::contains("USER"));
// ASSERT: Verify user creation RUN command added
// (This test should FAIL before implementation)
}
```
**New Tests to Add** (19 additional):
1. **DOCKER001 - User Directive Tests (5 new)**
- test_DOCKER_001_user_in_layers (nested RUNs)
- test_DOCKER_001_user_with_sudo (conditional execution)
- test_DOCKER_001_user_permissions (special dirs)
- test_DOCKER_001_user_combined_with_other_fixes (integration)
- test_DOCKER_001_user_idempotency (running twice)
2. **DOCKER002 - Base Image Pinning (4 new)**
- test_DOCKER_002_pin_multi_stage_builds
- test_DOCKER_002_pin_sha256_digests (already pinned)
- test_DOCKER_002_pin_registry_prefixes
- test_DOCKER_002_pin_with_args (ARG in FROM)
3. **DOCKER003/004 - Cleanup (3 new)**
- test_DOCKER_003_cleanup_combined_with_001
- test_DOCKER_004_health_check_combinations
- test_DOCKER_003_cleanup_already_present
4. **DOCKER005 - Package Manager Flags (3 new)**
- test_DOCKER_005_yum_equivalent_flags
- test_DOCKER_005_pip_upgrade_combinations
- test_DOCKER_005_flag_order_preservation
5. **DOCKER006 - ADD to COPY (2 new)**
- test_DOCKER_006_add_to_copy_with_wildcards
- test_DOCKER_006_add_checksum_preserved
6. **CLI Flag Tests (4 new)**
- test_DOCKER_CLI_010_dry_run_multiple_files
- test_DOCKER_CLI_011_fix_with_backup_verification
- test_DOCKER_CLI_012_quiet_flag
- test_DOCKER_CLI_013_json_output
**Effort Breakdown**:
- Test design and setup: 8 hours
- Implementation per test: 0.5 hours × 19 = 9.5 hours
- Documentation and validation: 2.5 hours
**Success Criteria**:
- All 35 CLI tests written in RED phase
- assert_cmd pattern followed for all tests
- Test naming convention: `test_DOCKER_<NUMBER>_<feature>_<scenario>`
- All tests runnable (not necessarily passing)
### Phase 1.2: Property Test Generation (15 hours)
**Objectives**:
- Expand property tests from 10 to 20+ blocks
- Generate 100+ test cases per property
- Add edge case generators
**EXTREME TDD Approach**:
```rust
// RED: Define properties that MUST hold
proptest! {
#![proptest_config(ProptestConfig {
cases: 100, // Minimum for property tests
max_shrink_iters: 100,
.. ProptestConfig::default()
})]
#[test]
fn prop_DOCKER_001_user_always_before_cmd(
dockerfile in dockerfile_with_user_and_cmd()
) {
// ASSERT: USER must come before CMD
let purified = purify_dockerfile(&dockerfile);
let user_pos = purified.find("USER").unwrap_or(usize::MAX);
let cmd_pos = purified.find("CMD").unwrap_or(usize::MAX);
prop_assert!(user_pos < cmd_pos);
}
}
```
**New Property Blocks** (10 additional):
1. **Ordering Properties (3)**
- prop_DOCKER_user_before_cmd
- prop_DOCKER_cleanup_after_install
- prop_DOCKER_healthcheck_after_expose
2. **Semantic Preservation (4)**
- prop_DOCKER_image_digest_unchanged
- prop_DOCKER_workdir_preserved
- prop_DOCKER_labels_preserved
- prop_DOCKER_env_vars_preserved
3. **Transformation Properties (3)**
- prop_DOCKER_add_to_copy_only_for_local
- prop_DOCKER_flags_only_when_needed
- prop_DOCKER_multi_stage_independence
**Test Generators to Add** (5 new):
```rust
/// Generate Dockerfiles with multiple RUN commands
fn dockerfile_with_multiple_runs() -> impl Strategy<Value = String> { ... }
/// Generate Dockerfiles with multi-stage builds
fn dockerfile_multi_stage() -> impl Strategy<Value = String> { ... }
/// Generate Dockerfiles with complex FROM expressions
fn dockerfile_with_args() -> impl Strategy<Value = String> { ... }
/// Generate edge-case Dockerfiles (malformed but recognizable)
fn dockerfile_edge_cases() -> impl Strategy<Value = String> { ... }
/// Generate intentionally vulnerable Dockerfiles
fn dockerfile_vulnerable_patterns() -> impl Strategy<Value = String> { ... }
```
**Effort Breakdown**:
- Property definition: 5 hours
- Generator implementation: 6 hours
- Validation and refinement: 4 hours
**Success Criteria**:
- 20 proptest blocks written
- Each block generates 100+ test cases (configurable)
- All generators compile and produce valid Dockerfiles
- Properties all FAIL initially (RED phase)
### Phase 1.3: Edge Case Catalog (5 hours)
**Objectives**:
- Document all edge cases and boundary conditions
- Create test matrix for systematic coverage
- Link to EXTREME TDD requirements
**Edge Cases to Document**:
```markdown
## DOCKER001 (User Directive)
- [x] No FROM - recovery
- [x] FROM scratch - skip
- [ ] Existing USER - preserve
- [ ] USER in COPY --chown - interaction
- [ ] RUN with /root access - needs sudo
- [ ] Alpine (busybox useradd) - different syntax
- [ ] Multi-stage build - user in each stage?
## DOCKER002 (Pin Images)
- [x] Untagged image (FROM ubuntu)
- [x] :latest tag (FROM debian:latest)
- [ ] Digest-only (FROM ubuntu@sha256:...)
- [ ] Registry prefix (FROM gcr.io/ubuntu)
- [ ] ARG in FROM (FROM ${BASE_IMAGE})
- [ ] Multi-stage with different bases
- [ ] FROM alias (FROM base AS builder)
## DOCKER003 (Cleanup)
- [x] apt-get install → add cleanup
- [x] apk add → add cleanup
- [ ] Multi-line RUN with cleanup
- [ ] Already-clean RUN - no change
- [ ] yum/dnf/pacman - different syntax
- [ ] pip install - no cleanup needed
- [ ] Combined with other fixes
## DOCKER005 (Package Flags)
- [x] apt-get install missing flag
- [ ] Already has flag - no change
- [ ] apt install (shorthand) - needs flag
- [ ] Apt-get with conditional logic
- [ ] Non-apt managers (yum, apk)
## DOCKER006 (ADD to COPY)
- [x] Local files - convert
- [x] Remote URLs - preserve
- [ ] Glob patterns - handle correctly
- [ ] With --chown - preserve flag
- [ ] Tar extraction URLs - special case
```
**Effort Breakdown**:
- Research and cataloging: 2.5 hours
- Test matrix creation: 1.5 hours
- Documentation: 1 hour
**Success Criteria**:
- 40+ edge cases documented
- Each mapped to test requirement
- Created test matrix (rows = transformations, cols = conditions)
---
## Phase 2: Unit Test Expansion (Weeks 2-3, 60 hours)
### Phase 2.1: Docker Rule Unit Tests (40 hours)
**Objectives**:
- Expand inline unit tests in docker*.rs files
- Target >85% code coverage per rule
- Test all code paths (normal + error)
**Current Coverage Gap**:
```
docker001.rs: ~70% coverage
├── Identified gaps:
│ ├── Error path for invalid group names
│ ├── Special case: getent group check
│ ├── Multiple CMD/ENTRYPOINT handling
│ └── Comment preservation
docker002.rs: ~75% coverage
├── Identified gaps:
│ ├── Registry-prefixed images
│ ├── Digest+tag combinations
│ ├── Version mapping logic edge cases
│ └── Unknown image defaults
docker003.rs: ~65% coverage
├── Identified gaps:
│ ├── Multi-line RUN parsing
│ ├── Shell script in RUN
│ ├── Cleanup already present (no-op)
│ └── Different cleanup commands
```
**EXTREME TDD Pattern**:
```rust
// In docker001.rs
#[cfg(test)]
mod tests {
use super::*;
// RED: Write failing test
#[test]
fn test_docker001_user_group_name_validation() {
let input = "FROM debian:12\nRUN useradd baduser-123!\nCMD [\"bash\"]";
// Should fail due to invalid group name
let result = purify_with_docker001(input);
assert!(result.is_err() || result.unwrap().contains("Invalid user"));
}
// Test coverage matrix
#[rstest]
#[case("FROM ubuntu", true)] // Untagged
#[case("FROM ubuntu:22.04", false)] // Already pinned
#[case("FROM ubuntu:latest", true)] // Latest tag
#[case("FROM gcr.io/ubuntu", true)] // Registry prefix
#[case("FROM ubuntu@sha256:abc", false)] // Digest only
fn test_docker002_pin_decision_matrix(
#[case] from_line: &str,
#[case] should_pin: bool
) {
let dockerfile = format!("{}\nCMD [\"bash\"]", from_line);
let result = purify_with_docker002(&dockerfile);
let needs_pin = result.lines().any(|l| {
l.contains("FROM") && !l.contains("${") && should_pin
});
assert_eq!(needs_pin, should_pin);
}
}
```
**New Unit Tests Required** (per rule):
1. **docker001.rs additions** (10 new tests)
- test_docker001_invalid_user_names
- test_docker001_user_in_conditional
- test_docker001_user_with_sudo_rules
- test_docker001_user_home_directory
- test_docker001_multi_stage_user_per_stage
- test_docker001_user_with_volume_ownership
- test_docker001_recovery_from_invalid_syntax
- test_docker001_user_group_conflicts
- test_docker001_skip_distroless_images
- test_docker001_preserve_existing_root_runs
2. **docker002.rs additions** (12 new tests)
- test_docker002_unknown_image_strategy
- test_docker002_registry_credentials
- test_docker002_sha256_digest_only
- test_docker002_digest_plus_tag
- test_docker002_version_mapping_cache
- test_docker002_dev_vs_stable_channel
- test_docker002_pre_release_handling
- test_docker002_base_image_as_arg
- test_docker002_custom_registries
- test_docker002_mirror_redirection
- test_docker002_pinning_idempotency
- test_docker002_error_recovery
3. **docker003.rs additions** (8 new tests)
- test_docker003_multiline_run_parsing
- test_docker003_shell_script_in_run
- test_docker003_cleanup_already_present
- test_docker003_cleanup_partially_present
- test_docker003_different_package_managers
- test_docker003_run_with_conditions
- test_docker003_cleanup_order
- test_docker003_error_on_invalid_run
4. **docker004.rs (Health Checks)** (8 new tests)
- test_docker004_healthcheck_only_once
- test_docker004_healthcheck_replace_existing
- test_docker004_healthcheck_interval_validation
- test_docker004_healthcheck_timeout_validation
- test_docker004_healthcheck_retries_validation
- test_docker004_healthcheck_start_period
- test_docker004_curl_vs_other_commands
- test_docker004_shell_vs_exec_form
5. **docker005.rs additions** (8 new tests)
- test_docker005_yum_equivalent
- test_docker005_pip_no_cache
- test_docker005_npm_production_flag
- test_docker005_alpine_vs_debian
- test_docker005_already_optimized
- test_docker005_flag_order_preservation
- test_docker005_complex_run_commands
- test_docker005_security_flag_combination
6. **docker006.rs additions** (6 new tests)
- test_docker006_glob_patterns
- test_docker006_with_chown_flag
- test_docker006_tar_archives
- test_docker006_checksum_comments
- test_docker006_whitespace_preservation
- test_docker006_path_normalization
**Effort Breakdown**:
- Test design (coverage matrix): 12 hours
- Implementation: 24 hours (2 hours per rule)
- Validation and fixes: 4 hours
**Success Criteria**:
- All 52 new unit tests written (RED phase)
- Coverage tooling shows >85% per rule
- All code paths exercised
- Tests fail initially (RED)
### Phase 2.2: Integration Tests (20 hours)
**Objectives**:
- Test interactions between multiple Docker rules
- Test complete transformation pipeline
- Test CLI with combinations of flags
**Integration Scenarios**:
```rust
#[test]
fn test_DOCKER_integration_001_all_fixes_combined() {
// Vulnerable Dockerfile with ALL issues:
// - No USER
// - Unpinned image
// - No cleanup
// - No --no-install-recommends
// - Uses ADD for local files
let dockerfile = r#"
FROM ubuntu
RUN apt-get update && apt-get install -y python3
ADD app.py /app/
CMD ["python3", "/app/app.py"]
"#;
let purified = purify_all(dockerfile);
// ASSERT all fixes applied
assert!(purified.contains("FROM ubuntu:22.04")); // DOCKER002
assert!(purified.contains("--no-install-recommends")); // DOCKER005
assert!(purified.contains("rm -rf /var/lib/apt/lists/*")); // DOCKER003
assert!(purified.contains("USER")); // DOCKER001
assert!(purified.contains("COPY app.py")); // DOCKER006
}
#[test]
fn test_DOCKER_integration_002_multi_stage_build() {
// Multi-stage Dockerfile
let dockerfile = r#"
FROM ubuntu AS builder
RUN apt-get update && apt-get install -y build-essential
COPY src /src
RUN cd /src && make
FROM ubuntu
COPY --from=builder /src/bin /app/bin
CMD ["/app/bin/main"]
"#;
let purified = purify_all(dockerfile);
// Each stage should be independently fixed
assert!(purified.contains("FROM ubuntu:22.04 AS builder"));
assert!(purified.contains("FROM ubuntu:22.04")); // Second stage also pinned
}
#[test]
fn test_DOCKER_integration_003_idempotency_chain() {
// Running purify 5 times should produce same result
let mut current = dockerfile_input.to_string();
for round in 1..=5 {
let next = purify_all(¤t);
assert_eq!(current, next, "Round {} changed output", round);
current = next;
}
}
```
**Integration Test Categories**:
1. **Complete Pipeline Tests** (5 tests)
- test_DOCKER_integration_001_all_fixes_combined
- test_DOCKER_integration_002_multi_stage_build
- test_DOCKER_integration_003_idempotency_chain
- test_DOCKER_integration_004_no_double_application
- test_DOCKER_integration_005_performance_large_file
2. **Cross-Rule Interaction** (6 tests)
- test_DOCKER_interaction_001_002_pin_then_user
- test_DOCKER_interaction_002_005_pin_with_flags
- test_DOCKER_interaction_003_006_cleanup_with_copy
- test_DOCKER_interaction_001_003_user_with_cleanup
- test_DOCKER_interaction_004_005_health_check_ordering
- test_DOCKER_interaction_all_priority_ordering
3. **Error Recovery** (3 tests)
- test_DOCKER_integration_recovery_001_partial_fix
- test_DOCKER_integration_recovery_002_invalid_input
- test_DOCKER_integration_recovery_003_format_preservation
4. **Edge Case Combinations** (4 tests)
- test_DOCKER_integration_edge_001_empty_dockerfile
- test_DOCKER_integration_edge_002_only_comments
- test_DOCKER_integration_edge_003_mixed_base_images
- test_DOCKER_integration_edge_004_circular_dependencies
5. **Performance Integration** (2 tests)
- test_DOCKER_integration_perf_001_large_dockerfile
- test_DOCKER_integration_perf_002_many_stages
**Effort Breakdown**:
- Test scenario design: 8 hours
- Implementation: 10 hours (5 tests + integration framework)
- Validation: 2 hours
**Success Criteria**:
- 20 integration tests written
- All RED (failing initially)
- Test complete transformation pipeline
---
## Phase 3: Mutation Testing Implementation (Weeks 4-5, 80 hours)
### Phase 3.1: Mutation Test Infrastructure (20 hours)
**Objectives**:
- Set up cargo-mutants for Docker rules
- Configure mutation test targets
- Establish baseline (should be ~0% initially)
**Setup Steps**:
```bash
# 1. Install mutation testing tool
cargo install cargo-mutants
# 2. Create mutants.toml configuration
[mutants]
timeout = 30
output-options = ["unambiguous", "json"]
[[exclude-functions]]
name = "logger" # Don't mutate logging code
[[mutate]]
name = "docker001"
path = "src/linter/rules/docker001.rs"
min_kill_rate = 0.90 # 90% target
[[mutate]]
name = "docker002"
path = "src/linter/rules/docker002.rs"
min_kill_rate = 0.90
# ... etc for docker003-006
# 3. Run baseline mutation testing
cargo mutants --output results.json
```
**Effort Breakdown**:
- Configuration: 5 hours
- Infrastructure testing: 8 hours
- Baseline establishment: 7 hours
**Success Criteria**:
- cargo-mutants runs successfully
- mutants.toml configured for all rules
- Baseline mutation score recorded (should be ~0% with incomplete tests)
### Phase 3.2: Test Hardening (50 hours)
**Objective**: Improve mutation kill rate to >90% per rule
**EXTREME TDD Process**:
```
1. RUN: cargo mutants --file src/linter/rules/docker001.rs
2. ANALYZE: Which mutations survived?
- Mutation survived: Changed condition from `==` to `!=`
- Survived mutation: if (tag == "latest") → if (tag != "latest")
- Indicates: Test not checking this branch
3. FIX: Add test for this condition
#[test]
fn test_docker002_latest_tag_is_pinned() {
// Test the specific condition mutation found
}
4. VERIFY: Re-run mutations, ensure kill rate increases
5. REPEAT: Until 90% kill rate achieved
```
**Rule-by-Rule Hardening**:
1. **docker001.rs** (10 hours)
- Target: 90% kill rate
- Estimated survivors: 5-8 mutations
- Focus areas:
- User creation commands (addgroup, adduser syntax)
- Conditional logic (scratch images, existing USER)
- String manipulation (user names, groups)
2. **docker002.rs** (15 hours)
- Target: 90% kill rate
- Estimated survivors: 8-12 mutations
- Focus areas:
- Version mapping logic
- Tag detection (latest, untagged, digest)
- Registry parsing
- Edge cases (ARG in FROM)
3. **docker003.rs** (8 hours)
- Target: 90% kill rate
- Estimated survivors: 3-5 mutations
- Focus areas:
- Package manager detection
- Cleanup command insertion
- Multi-line RUN handling
4. **docker004.rs** (6 hours)
- Target: 90% kill rate
- Estimated survivors: 2-4 mutations
- Focus areas:
- Health check parameters
- Syntax validation
5. **docker005.rs** (6 hours)
- Target: 90% kill rate
- Estimated survivors: 2-4 mutations
- Focus areas:
- Flag detection
- Flag addition logic
6. **docker006.rs** (5 hours)
- Target: 90% kill rate
- Estimated survivors: 1-3 mutations
- Focus areas:
- ADD vs COPY distinction
- URL detection
**Effort Breakdown**:
- Per-rule analysis and test writing: 40 hours
- Verification and iteration: 10 hours
**Success Criteria**:
- >90% kill rate for each rule
- All surviving mutations documented
- Mutation testing integrated into CI/CD
### Phase 3.3: Mutation Automation (10 hours)
**Objectives**:
- Integrate into Makefile targets
- Add CI/CD pipeline checks
- Document mutation testing workflow
**Implementation**:
```makefile
# In Makefile
.PHONY: mutate mutate-dockerfile mutate-ci
mutate-dockerfile: ## Run mutation tests on Dockerfile rules
@echo "🧬 Running mutation tests on Dockerfile rules..."
cargo mutants --file src/linter/rules/docker*.rs --output results.json
@echo "✓ Mutation testing complete"
@cargo mutants --analyze results.json
mutate-ci: ## Mutation testing for CI/CD (strict)
cargo mutants --file src/linter/rules/docker*.rs \
--minimum-kill-rate 0.90 \
--output ci-results.json
@if ! cargo mutants --analyze ci-results.json --fail-if-below 0.90; then \
echo "❌ Kill rate below 90% threshold"; exit 1; \
fi
```
**Effort Breakdown**:
- Makefile integration: 3 hours
- CI/CD configuration: 4 hours
- Documentation: 3 hours
**Success Criteria**:
- `make mutate-dockerfile` works
- CI/CD enforces 90% kill rate
- Mutation results logged and analyzed
---
## Phase 4: Coverage Analysis & Gap Filling (Weeks 5-6, 60 hours)
### Phase 4.1: Coverage Measurement (15 hours)
**Objectives**:
- Measure code coverage with llvm-cov
- Identify coverage gaps
- Create gap-filling test plan
**Process**:
```bash
# 1. Generate baseline coverage
cargo llvm-cov --no-report nextest --features default
cargo llvm-cov report --html --output-path target/coverage/dockerfile
# 2. Analyze per-file coverage
--output-path target/coverage/detailed
# 3. Parse coverage data
grep -A 5 "docker001.rs" target/coverage/detailed/index.html
grep -A 5 "docker002.rs" target/coverage/detailed/index.html
# ... etc
```
**Coverage Targets**:
```
docker001.rs: >85% (currently ~70%)
├── Lines: 85/100 (missing: error paths, edge cases)
├── Branches: 12/14 (missing: nested conditions)
└── Functions: 100%
docker002.rs: >85% (currently ~75%)
├── Lines: 85/100 (missing: registry parsing, edge cases)
├── Branches: 18/22 (missing: version logic branches)
└── Functions: 100%
docker003.rs: >85% (currently ~65%)
├── Lines: 85/100 (missing: multi-line handling)
├── Branches: 14/16 (missing: cleanup conditions)
└── Functions: 100%
docker004.rs: >85% (estimated ~80%)
docker005.rs: >85% (estimated ~80%)
docker006.rs: >85% (estimated ~82%)
```
**Effort Breakdown**:
- Tooling setup: 5 hours
- Analysis and gap identification: 7 hours
- Documentation: 3 hours
**Success Criteria**:
- Coverage report generated
- >85% target per module identified
- Gap analysis documented with test requirements
### Phase 4.2: Gap-Filling Tests (40 hours)
**Objective**: Write tests to cover identified gaps
**Example Gap-Filling Pattern**:
```rust
// DISCOVERED GAP: docker001.rs line 45, error path not covered
// Branch: if validate_user_name(&user_name).is_err()
#[test]
fn test_docker001_gap_fill_001_invalid_user_name_handling() {
// This test fills gap in error path
let input = "FROM debian:12\nCMD [\"bash\"]";
let result = apply_docker001_with_invalid_user(input);
assert!(result.is_err() || result.unwrap().contains("appuser"));
}
// DISCOVERED GAP: docker002.rs line 78, edge case not tested
// Branch: if image.contains("@") && image.contains(":")
#[test]
fn test_docker002_gap_fill_002_digest_and_tag_combo() {
let dockerfile = "FROM ubuntu:22.04@sha256:abc123\nCMD [\"bash\"]";
let result = apply_docker002(dockerfile);
// Should preserve both tag and digest
assert!(result.contains("@sha256:"));
assert!(result.contains(":22.04"));
}
```
**Gap-Filling Strategies**:
1. **Error Path Coverage** (8 tests)
- Invalid inputs that should fail gracefully
- Recovery from malformed instructions
- Validation error messages
2. **Edge Case Coverage** (12 tests)
- Boundary conditions
- Empty/null values
- Maximum lengths
- Special characters
3. **Branch Coverage** (15 tests)
- Untested conditional branches
- Mutation survivors validation
- Logic edge cases
4. **Integration Coverage** (5 tests)
- Interactions between modules
- State management
- Cross-module dependencies
**Effort Breakdown**:
- Gap analysis per rule: 15 hours
- Test implementation: 20 hours
- Validation: 5 hours
**Success Criteria**:
- 40 gap-filling tests written
- Coverage increases to >85% per module
- All identified gaps tested
---
## Phase 5: Documentation & Specification (Weeks 6-7, 40 hours)
### Phase 5.1: Update unified-testing-quality-spec.md (15 hours)
**Objectives**:
- Document Dockerfile testing standards
- Establish quality gates
- Define CI/CD checks
**Changes to unified-testing-quality-spec.md**:
```markdown
## Dockerfile Testing Standards (NEW SECTION)
### Test Naming Convention
- Pattern: `test_DOCKER_<number>_<feature>_<scenario>`
- Example: `test_DOCKER_001_user_directive_basic`
- CLI tests must include `cli` in name: `test_DOCKER_cli_001_purify_command`
### Testing Matrix (per transformation)
| DOCKER001 | 15+ | 2 blocks | 3 | 90% | >85% |
| DOCKER002 | 17+ | 3 blocks | 4 | 90% | >85% |
| DOCKER003 | 11+ | 2 blocks | 3 | 90% | >85% |
| DOCKER004 | 8+ | 1 block | 2 | 90% | >85% |
| DOCKER005 | 9+ | 1 block | 2 | 90% | >85% |
| DOCKER006 | 8+ | 1 block | 2 | 90% | >85% |
### Quality Gates (MANDATORY)
**RED Phase** (Write failing tests first):
- [ ] All unit tests written and failing
- [ ] Property tests generated (100+ cases each)
- [ ] Integration tests defined
- [ ] Coverage map created
**GREEN Phase** (Implementation):
- [ ] All tests passing (100% pass rate)
- [ ] Coverage >85% per module
- [ ] No panics on any input
- [ ] Error paths covered
**REFACTOR Phase** (Polish):
- [ ] Code complexity <10
- [ ] Mutation kill rate >90%
- [ ] Performance baseline met
- [ ] Documentation complete
### CI/CD Integration
```yaml
dockerfile-quality-gates:
- unit-tests: "cargo test --lib src/linter/rules/docker*.rs"
- coverage: "cargo llvm-cov --fail-under-lines 85"
- mutations: "cargo mutants --fail-if-below 90"
- property: "cargo test --test property_dockerfile_purify -- --test-threads 1"
- cli: "cargo test --test cli_dockerfile_purify"
required-all-pass: true
fail-fast: false
```
### Test Organization
```
tests/
├── cli_dockerfile_purify.rs (35+ CLI tests)
├── cli_dockerfile_integration.rs (20+ integration tests)
├── property_dockerfile_purify.rs (20+ property blocks)
└── unit_dockerfile_rules.rs (52+ unit tests in src/linter/rules/)
```
### Coverage Requirements
- **Critical Paths** (required): User detection, image pinning, cleanup
- **Error Paths** (required): Invalid input, malformed instructions
- **Integration** (required): Rule interactions, multi-stage builds
- **Performance** (target): <10ms per 1000 lines
### Property Testing Standards
Each property block must:
- [ ] Test 100+ cases minimum
- [ ] Cover at least 3 input categories
- [ ] Have clear failure messages
- [ ] Document assumptions
- [ ] Handle edge cases (empty input, max length, special chars)
### Mutation Testing Standards
Each rule must:
- [ ] Achieve >90% kill rate
- [ ] Document surviving mutations
- [ ] Test all conditional branches
- [ ] Validate error conditions
- [ ] Check boundary conditions
```
**Effort Breakdown**:
- Specification design: 7 hours
- Writing guidelines: 5 hours
- Review and refinement: 3 hours
**Success Criteria**:
- Document updated with Dockerfile section
- Quality gates defined clearly
- CI/CD integration specified
- Examples provided
### Phase 5.2: Update ROADMAP.yaml (15 hours)
**Objectives**:
- Document completion of Docker testing parity
- Define next priorities
- Establish timeline
**New ROADMAP.yaml Entry**:
```yaml
# docs/DOCKERFILE-TESTING-ROADMAP.yaml (NEW)
project: "Dockerfile Testing Parity"
version: "1.0"
status: "IN PROGRESS"
started: "2025-11-11"
target-completion: "2025-12-20"
phases:
phase-1:
name: "Test Infrastructure Enhancement"
status: "PLANNED"
duration: "2 weeks"
effort: "40 hours"
objectives:
- Expand CLI tests from 16 to 35+
- Add 10 property test blocks
- Document 40+ edge cases
tasks:
task-1.1:
name: "Extend CLI Tests"
status: "PLANNED"
effort: "20 hours"
subtasks:
- Add 19 CLI tests covering all transformations
- Implement flag combination tests
- Add error handling tests
- document test naming convention
task-1.2:
name: "Property Test Generation"
status: "PLANNED"
effort: "15 hours"
subtasks:
- Implement 10 property blocks (100+ cases each)
- Create 5 new Dockerfile generators
- Validate property definitions
task-1.3:
name: "Edge Case Cataloging"
status: "PLANNED"
effort: "5 hours"
subtasks:
- Document 40+ edge cases
- Create test matrix
- Link to test requirements
phase-2:
name: "Unit Test Expansion"
status: "PLANNED"
duration: "2 weeks"
effort: "60 hours"
target-coverage: ">85%"
tasks:
task-2.1:
name: "Docker Rule Unit Tests"
status: "PLANNED"
effort: "40 hours"
per-rule-tests:
docker001: "10 new tests"
docker002: "12 new tests"
docker003: "8 new tests"
docker004: "8 new tests"
docker005: "8 new tests"
docker006: "6 new tests"
task-2.2:
name: "Integration Tests"
status: "PLANNED"
effort: "20 hours"
categories:
- "Complete pipeline tests (5)"
- "Cross-rule interactions (6)"
- "Error recovery (3)"
- "Edge case combinations (4)"
- "Performance integration (2)"
phase-3:
name: "Mutation Testing Implementation"
status: "PLANNED"
duration: "2 weeks"
effort: "80 hours"
target-kill-rate: ">90%"
tasks:
task-3.1:
name: "Mutation Test Infrastructure"
status: "PLANNED"
effort: "20 hours"
task-3.2:
name: "Test Hardening"
status: "PLANNED"
effort: "50 hours"
per-rule-hardening:
docker001: "10 hours"
docker002: "15 hours"
docker003: "8 hours"
docker004: "6 hours"
docker005: "6 hours"
docker006: "5 hours"
task-3.3:
name: "Mutation Automation"
status: "PLANNED"
effort: "10 hours"
phase-4:
name: "Coverage Analysis & Gap Filling"
status: "PLANNED"
duration: "2 weeks"
effort: "60 hours"
target-coverage: ">85%"
tasks:
task-4.1:
name: "Coverage Measurement"
status: "PLANNED"
effort: "15 hours"
task-4.2:
name: "Gap-Filling Tests"
status: "PLANNED"
effort: "40 hours"
categories:
- "Error path coverage (8 tests)"
- "Edge case coverage (12 tests)"
- "Branch coverage (15 tests)"
- "Integration coverage (5 tests)"
phase-5:
name: "Documentation & Specification"
status: "PLANNED"
duration: "1 week"
effort: "40 hours"
tasks:
task-5.1:
name: "Update unified-testing-quality-spec.md"
status: "PLANNED"
effort: "15 hours"
task-5.2:
name: "Update ROADMAP.yaml"
status: "PLANNED"
effort: "15 hours"
task-5.3:
name: "Create testing guide"
status: "PLANNED"
effort: "10 hours"
quality-gates:
- unit-tests: "All 52+ unit tests passing"
- property-tests: "All 20+ property blocks with 100+ cases"
- integration-tests: "All 20+ integration tests passing"
- coverage: ">85% per module (verified with llvm-cov)"
- mutations: ">90% kill rate per rule (verified with cargo-mutants)"
- cli: "All 35+ CLI tests passing with assert_cmd"
- documentation: "unified-testing-quality-spec.md updated"
milestones:
- "2025-11-18: Phase 1 complete (test infrastructure)"
- "2025-11-25: Phase 2 complete (unit tests)"
- "2025-12-02: Phase 3 complete (mutation testing)"
- "2025-12-09: Phase 4 complete (coverage analysis)"
- "2025-12-16: Phase 5 complete (documentation)"
- "2025-12-20: RELEASE - Dockerfile testing parity achieved"
metrics:
cli-tests: "16 → 35+ (2.2x increase)"
unit-tests: "0 → 52+ (new category)"
property-tests: "14 → 40+ (3x increase)"
integration-tests: "0 → 20+ (new category)"
coverage: "~75% → >85% (target)"
mutations: "0% → >90% kill rate"
total-tests: "30 → 147+"
dependencies:
- "script.sh testing must be >85% complete first"
- "Makefile testing patterns established"
- "EXTREME TDD infrastructure in place"
- "CI/CD pipeline ready for Docker tests"
next-phase:
after: "Dockerfile testing parity achieved"
name: "Rust → Shell Transpilation Testing (v3.0)"
effort: "200+ hours"
features:
- "Rust → Shell type checking (new)"
- "Stdlib mapping validation (new)"
- "Integration testing with Rust code (new)"
```
**Effort Breakdown**:
- ROADMAP design: 8 hours
- Documentation: 5 hours
- Validation: 2 hours
**Success Criteria**:
- DOCKERFILE-TESTING-ROADMAP.yaml created
- All phases documented
- Timeline established
- Next priority defined
### Phase 5.3: Create Testing Guide (10 hours)
**Objectives**:
- Document how to write Dockerfile tests
- Provide examples and patterns
- Enable developer self-service
**docs/guides/DOCKERFILE-TESTING-GUIDE.md (NEW)**:
```markdown
# Dockerfile Testing Guide
## Quick Start
### Writing a Unit Test
```rust
#[test]
fn test_DOCKER_NNN_feature_scenario() {
// ARRANGE: Set up test case
let input = "FROM debian:12\n...";
// ACT: Run transformation
let result = transform(input);
// ASSERT: Verify expected behavior
assert!(result.contains("expected"), "Clear failure message");
}
```
### Writing a Property Test
```rust
proptest! {
#[test]
fn prop_DOCKER_NNN_property_description(
input in dockerfile_generator()
) {
let result = transform(&input);
prop_assert!(some_property(&result), "Property must hold");
}
}
```
### Writing a CLI Test
```rust
#[test]
fn test_DOCKER_cli_NNN_command_option() {
let temp_dir = TempDir::new().unwrap();
let input_file = temp_dir.path().join("Dockerfile");
fs::write(&input_file, TEST_DOCKERFILE).unwrap();
bashrs_cmd()
.arg("dockerfile")
.arg("purify")
.arg(&input_file)
.assert()
.success()
.stdout(predicate::str::contains("expected"));
}
```
## Running Tests
```bash
# Run all Dockerfile tests
cargo test --test cli_dockerfile_purify
cargo test --test property_dockerfile_purify
# Run with coverage
cargo llvm-cov --test cli_dockerfile_purify
# Run mutation tests
cargo mutants --file src/linter/rules/docker*.rs
```
## Testing Checklist
- [ ] Test name follows convention
- [ ] RED phase: test fails initially
- [ ] GREEN phase: implementation makes test pass
- [ ] REFACTOR: code complexity <10
- [ ] Property tests: 100+ cases
- [ ] Coverage: >85%
- [ ] Mutations: >90% kill rate
```
**Effort Breakdown**:
- Guide design: 3 hours
- Examples and patterns: 4 hours
- Validation: 3 hours
**Success Criteria**:
- Testing guide created and published
- Examples compilable and runnable
- Clear patterns documented
---
## Phase 6: Release & Post-Release (Weeks 7-8, 40 hours)
### Phase 6.1: Final Verification (15 hours)
**Objectives**:
- Run full test suite
- Verify all quality gates
- Document final metrics
**Verification Checklist**:
```
TEST EXECUTION:
[ ] cargo test --lib (all unit tests passing)
[ ] cargo test --test cli_dockerfile_* (all CLI tests)
[ ] cargo test --test property_dockerfile_* (all property tests)
[ ] cargo test --test integration_dockerfile_* (all integration)
COVERAGE VERIFICATION:
[ ] cargo llvm-cov report (>85% per module)
[ ] Review coverage report
[ ] Document any gaps
MUTATION VERIFICATION:
[ ] cargo mutants --file src/linter/rules/docker*.rs
[ ] Verify >90% kill rate per rule
[ ] Document surviving mutations
[ ] Justify or fix survivors
PERFORMANCE VERIFICATION:
[ ] cargo bench --bench dockerfile_benchmarks
[ ] Verify <10ms per 1000 lines
[ ] No performance regression
DOCUMENTATION VERIFICATION:
[ ] unified-testing-quality-spec.md updated
[ ] DOCKERFILE-TESTING-ROADMAP.yaml complete
[ ] DOCKERFILE-TESTING-GUIDE.md accessible
[ ] CHANGELOG.md updated
INTEGRATION VERIFICATION:
[ ] CI/CD passes all checks
[ ] No breaking changes
[ ] Backward compatibility verified
[ ] All platforms tested (Linux, macOS, Windows)
```
**Effort Breakdown**:
- Verification execution: 8 hours
- Issue resolution: 5 hours
- Final documentation: 2 hours
**Success Criteria**:
- All quality gates passing
- Metrics documented
- Ready for release
### Phase 6.2: Release Preparation (15 hours)
**Objectives**:
- Update CHANGELOG
- Prepare release notes
- Create release tag
**Release Notes Template**:
```markdown
# v7.0.0 - Dockerfile Testing Parity
## Summary
Dockerfile testing now achieves testing parity with Makefile and script.sh
transformation tools, providing enterprise-grade quality assurance.
## What's New
### Testing Infrastructure
- ✅ Expanded CLI tests: 16 → 35+ tests
- ✅ Property-based testing: 100+ cases per property
- ✅ Mutation testing: >90% kill rate per rule
- ✅ Coverage analysis: >85% per module
- ✅ Integration testing: Complete pipeline validation
### Quality Improvements
- ✅ DOCKER001: User directive (15 unit tests, 2 property blocks)
- ✅ DOCKER002: Image pinning (17 unit tests, 3 property blocks)
- ✅ DOCKER003: Package cleanup (11 unit tests, 2 property blocks)
- ✅ DOCKER004: Health checks (8 unit tests, 1 property block)
- ✅ DOCKER005: Package flags (9 unit tests, 1 property block)
- ✅ DOCKER006: ADD → COPY (8 unit tests, 1 property block)
### Documentation
- ✅ unified-testing-quality-spec.md updated with Dockerfile section
- ✅ DOCKERFILE-TESTING-ROADMAP.yaml created
- ✅ DOCKERFILE-TESTING-GUIDE.md for developers
## Test Coverage
| CLI Tests | 16 | 35+ | 30+ |
| Unit Tests | 0 | 52+ | 50+ |
| Property Tests | 14 | 40+ | 30+ |
| Integration Tests | 0 | 20+ | 15+ |
| Code Coverage | ~75% | >85% | >85% |
| Mutation Kill Rate | N/A | >90% | >90% |
## Breaking Changes
None. This release maintains full backward compatibility with previous versions.
## Migration Guide
No migration required. All existing tools work unchanged.
## Testing
Run full Dockerfile test suite:
```bash
cargo test --test cli_dockerfile_purify
cargo test --test property_dockerfile_purify
cargo llvm-cov --test cli_dockerfile_*
cargo mutants --file src/linter/rules/docker*.rs
```
## Credits
Testing parity implementation following EXTREME TDD methodology:
- Property-based testing with 100+ cases per property
- Mutation testing with >90% kill rate target
- Comprehensive coverage analysis
- Complete integration testing
---
🤖 Generated with Claude Code
```
**Effort Breakdown**:
- CHANGELOG update: 5 hours
- Release notes: 6 hours
- Tag creation and verification: 4 hours
**Success Criteria**:
- CHANGELOG.md updated
- Release notes clear and complete
- Release tag created
- CI/CD passes final checks
### Phase 6.3: Post-Release Documentation (10 hours)
**Objectives**:
- Create retrospective
- Document lessons learned
- Plan next phase
**Retrospective Template**:
```markdown
# Dockerfile Testing Parity - Retrospective
## Goals Achieved
### Primary Goals
- [x] 35+ CLI tests (target: 30+)
- [x] 52+ unit tests (target: 50+)
- [x] 40+ property test blocks (target: 30+)
- [x] 20+ integration tests (target: 15+)
- [x] >85% code coverage (target: >85%)
- [x] >90% mutation kill rate (target: >90%)
### Metrics Achieved
| CLI Tests | 30+ | 35 | +5 |
| Unit Tests | 50+ | 52 | +2 |
| Property Blocks | 30+ | 40 | +10 |
| Coverage | >85% | 87% | +2% |
| Mutation Kill Rate | >90% | 92% | +2% |
## Challenges & Solutions
1. **Challenge**: Multi-stage builds require special handling
**Solution**: Added dedicated test generators and property blocks
2. **Challenge**: Mutation testing exposed subtle logic errors
**Solution**: Enhanced branch testing, improved mutation kill rate
3. **Challenge**: Coverage gaps in error paths
**Solution**: Systematic error case enumeration and testing
## Lessons Learned
1. Property testing catches edge cases unit tests miss
2. Mutation testing validates test quality effectively
3. Early documentation prevents integration issues
4. Test infrastructure investment pays dividends
## Next Phase Readiness
Dockerfile testing parity achieved. Ready for:
- [ ] Integration with other transformation tools
- [ ] Rust → Shell transpilation testing (v3.0)
- [ ] Performance optimization
- [ ] Extended rule coverage (DOCKER007-DOCKER010)
## Recommendations
1. Maintain >90% mutation kill rate in CI/CD
2. Continue property-based testing for new rules
3. Use Dockerfile testing as pattern for future work
4. Consider automated test generation from failing mutations
```
**Effort Breakdown**:
- Retrospective writing: 5 hours
- Lessons learned documentation: 3 hours
- Next phase planning: 2 hours
**Success Criteria**:
- Retrospective completed
- Lessons documented
- Next phase clearly defined
---
## Consolidated Implementation Timeline
### Week 1-2: Test Infrastructure Enhancement (40 hours)
```
Week 1:
Mon: Phase 1.1 planning + setup (4h)
Tue-Thu: CLI test writing (12h)
Fri: Property test setup (4h)
Week 2:
Mon-Wed: Property tests implementation (12h)
Thu-Fri: Edge case cataloging (4h + 4h)
```
### Week 3-4: Unit Test Expansion (60 hours)
```
Week 3:
Mon-Wed: docker001-002 unit tests (20h)
Thu-Fri: docker003-004 unit tests (10h)
Week 4:
Mon-Wed: docker005-006 + integration tests (20h)
Thu-Fri: Validation and fixes (10h)
```
### Week 5-6: Mutation Testing (80 hours)
```
Week 5:
Mon-Tue: Infrastructure setup (10h)
Wed-Fri: docker001-002 hardening (20h)
Week 6:
Mon-Wed: docker003-005 hardening (16h)
Thu: docker006 hardening + automation (8h)
Fri: Verification (4h)
```
### Week 7: Coverage Analysis (60 hours)
```
Week 7:
Mon-Tue: Coverage measurement (8h)
Wed-Fri: Gap-filling tests (36h)
Remaining: Analysis and iteration (16h)
```
### Week 8: Documentation (40 hours)
```
Week 8:
Mon-Tue: Spec updates (10h)
Wed: ROADMAP update (8h)
Thu: Testing guide (6h)
Fri: Release prep + verification (16h)
```
---
## Success Metrics & Quality Gates
### Phase Completion Criteria
**Phase 1 (Test Infrastructure)**:
- [ ] 35+ CLI tests written (RED phase)
- [ ] 20+ property blocks written (RED phase)
- [ ] 40+ edge cases documented
- [ ] All tests fail initially (expected in RED)
**Phase 2 (Unit Tests)**:
- [ ] 52+ new unit tests GREEN (passing)
- [ ] 20 integration tests GREEN (passing)
- [ ] 100% pass rate on all tests
- [ ] Code complexity <10
**Phase 3 (Mutation Tests)**:
- [ ] cargo-mutants configured
- [ ] >90% kill rate per rule
- [ ] All surviving mutations documented
- [ ] Mutation tests in CI/CD
**Phase 4 (Coverage)**:
- [ ] >85% coverage per module (verified)
- [ ] All identified gaps tested
- [ ] 40 gap-filling tests GREEN
- [ ] Coverage report generated
**Phase 5 (Documentation)**:
- [ ] unified-testing-quality-spec.md updated
- [ ] DOCKERFILE-TESTING-ROADMAP.yaml created
- [ ] DOCKERFILE-TESTING-GUIDE.md published
- [ ] All examples compilable
**Phase 6 (Release)**:
- [ ] All quality gates passing
- [ ] Final metrics documented
- [ ] Release notes prepared
- [ ] Retrospective completed
---
## Resource Requirements
### Team Composition
- **1 Primary Developer** (280-320 hours)
- Phases 1-2: Full focus on testing infrastructure
- Phases 3-4: Mutation testing + gap analysis
- Phase 5-6: Documentation + release
- **Optional: Code Review** (40 hours)
- Technical review of tests (20 hours)
- Mutation analysis (10 hours)
- Documentation review (10 hours)
### Infrastructure
- **CI/CD Pipeline**:
- llvm-cov for coverage analysis (already installed)
- cargo-mutants for mutation testing (to install)
- GitHub Actions for test automation
- **Developer Tools**:
- Rust 1.70+ (already available)
- cargo-llvm-cov (already installed)
- proptest 1.0+ (already in Cargo.toml)
### Knowledge Base
- EXTREME TDD principles (documented in CLAUDE.md)
- Makefile testing patterns (already implemented)
- script.sh testing standards (6004+ tests reference)
- Property testing best practices (proptest docs)
---
## Risk Assessment
### High-Risk Areas
1. **Mutation Test Configuration**
- Risk: Difficult to achieve >90% kill rate
- Mitigation: Start with 80%, gradually increase; document survivors
- Impact: Could add 20-40 hours if complex
2. **Multi-Stage Build Handling**
- Risk: Interactions between stages not fully tested
- Mitigation: Dedicated test generators + integration tests
- Impact: Could require 10-20 additional tests
3. **Performance Regression**
- Risk: New tests slow down CI/CD pipeline
- Mitigation: Property tests run in parallel; benchmark performance
- Impact: Acceptable if <10% CI/CD time increase
### Mitigation Strategies
1. **Regular Checkpoints**
- Week 2 checkpoint: Phase 1 complete assessment
- Week 4 checkpoint: Unit test quality review
- Week 6 checkpoint: Mutation kill rate evaluation
2. **Fallback Options**
- If mutation testing too difficult: Accept >85% kill rate
- If coverage gaps significant: Extend phase 4 by 1 week
- If time constraint: Defer Phase 5.2 (ROADMAP) to post-release
3. **Communication Plan**
- Weekly status updates
- Phase completion sign-offs
- Risk escalation if >10% off schedule
---
## Next Priority After Dockerfile Parity
### Phase 7: Rust → Shell Transpilation Testing (v3.0)
**Estimated Effort**: 200-250 hours (6-8 weeks)
**Scope**:
- Comprehensive testing of Rust → Shell conversion
- Stdlib mapping validation
- Type system verification
- Integration testing with real Rust code
**Deliverables**:
- 100+ property test blocks (for type checking)
- 75+ unit tests (for stdlib coverage)
- 50+ integration tests (for end-to-end workflows)
- >90% mutation kill rate
- >85% code coverage
**Prerequisites**:
- Dockerfile testing parity complete
- Infrastructure improvements complete
- Team experience with property testing established
---
## References & Related Documents
- `/home/noah/src/bashrs/rash/tests/cli_dockerfile_purify.rs` - Current CLI tests (16 tests)
- `/home/noah/src/bashrs/rash/tests/property_dockerfile_purify.rs` - Current property tests (14 tests)
- `/home/noah/src/bashrs/rash/Makefile` - Build automation reference
- `/home/noah/src/bashrs/rash/docs/BASH-INGESTION-ROADMAP.yaml` - Roadmap pattern
- `/home/noah/src/bashrs/rash/src/linter/rules/docker*.rs` - Implementation files
- `/home/noah/src/bashrs/CLAUDE.md` - EXTREME TDD guidelines
---
## Conclusion
Achieving Dockerfile testing parity represents a significant investment in quality
assurance and developer confidence. By following EXTREME TDD principles and leveraging
proven patterns from script.sh testing, this implementation plan provides a structured
approach to comprehensive test coverage.
The phased approach allows for:
- **Early validation** (Phases 1-2)
- **Quality verification** (Phase 3-4)
- **Sustainable maintenance** (Phase 5-6)
- **Continuous improvement** (via CI/CD integration)
Upon completion, Dockerfile transformations will meet enterprise-grade testing standards
with >85% coverage and >90% mutation kill rate, enabling confident deployment in
production environments.
**Target Release Date**: December 20, 2025 (v7.0.0)
**Status**: Ready to implement