# REPL EXTREME Quality Refactoring Plan
## Current State Analysis (UNACCEPTABLE)
- **File Size**: 10,908 lines (VIOLATION: Should be <500 lines)
- **Functions**: 546 functions in one file (VIOLATION: Should be <20)
- **Complexity**: Unknown but guaranteed >100 (VIOLATION: Must be <10)
- **Coverage**: 18.95% (VIOLATION: Must be >90%)
- **TDG Grade**: Likely F (VIOLATION: Must be A+)
## Toyota Way Root Cause Analysis (5 Whys)
1. **Why is coverage so low?** → File is too complex to test
2. **Why is file too complex?** → Everything is in one monolithic file
3. **Why is everything in one file?** → No modular design was enforced
4. **Why was no modular design enforced?** → Quality gates were not in place
5. **Why were quality gates not in place?** → Development prioritized features over quality
## EXTREME Quality Refactoring Strategy
### Phase 1: Component Extraction (Complexity <10 per component)
```rust
src/runtime/repl/
├── mod.rs // Main REPL orchestrator (<100 lines)
├── commands/
│ ├── mod.rs // Command registry (<50 lines)
│ ├── help.rs // Help command (<50 lines)
│ ├── history.rs // History commands (<50 lines)
│ ├── inspection.rs // :ast, :tokens, :type (<100 lines)
│ ├── session.rs // :save, :load, :clear (<100 lines)
│ └── mode.rs // :debug, :transpile, etc (<100 lines)
├── evaluation/
│ ├── mod.rs // Evaluation orchestrator (<100 lines)
│ ├── expression.rs // Expression evaluator (<200 lines)
│ ├── statement.rs // Statement executor (<200 lines)
│ └── multiline.rs // Multiline handling (<100 lines)
├── completion/
│ ├── mod.rs // Tab completion engine (<100 lines)
│ ├── keywords.rs // Keyword completions (<50 lines)
│ ├── variables.rs // Variable completions (<50 lines)
│ └── functions.rs // Function completions (<50 lines)
├── formatting/
│ ├── mod.rs // Output formatting (<100 lines)
│ ├── values.rs // Value pretty-printing (<100 lines)
│ ├── errors.rs // Error formatting (<100 lines)
│ └── colors.rs // Terminal colors (<50 lines)
├── state/
│ ├── mod.rs // REPL state management (<100 lines)
│ ├── environment.rs // Variable environment (<200 lines)
│ ├── history.rs // Command history (<100 lines)
│ └── settings.rs // User settings (<50 lines)
└── tests/
├── mod.rs // Test utilities (<100 lines)
└── fixtures.rs // Test fixtures (<100 lines)
```
### Phase 2: TDD Implementation (Write tests FIRST)
For each component:
1. Write comprehensive unit tests (minimum 20 tests per component)
2. Write property tests (minimum 5 properties per component)
3. Write integration tests (minimum 10 scenarios)
4. Measure complexity BEFORE implementation
5. Implement with complexity <10 per function
6. Verify 90% coverage per component
### Phase 3: Complexity Enforcement
Every function MUST meet:
- **Cyclomatic Complexity**: ≤10 (measured by PMAT)
- **Cognitive Complexity**: ≤10 (measured by PMAT)
- **Lines of Code**: ≤30 per function
- **Parameters**: ≤4 per function
- **Nesting Depth**: ≤3 levels
### Phase 4: Quality Gates (MANDATORY)
```bash
# Pre-commit hook for REPL module
pmat tdg src/runtime/repl --min-grade A- --fail-on-violation
pmat analyze complexity src/runtime/repl --max-cyclomatic 10 --fail-on-violation
cargo llvm-cov --lib --html -- runtime::repl
# Coverage must be >90%
```
### Phase 5: User Experience Improvements
1. **Response Time**: <50ms for simple expressions
2. **Error Messages**: Clear, actionable, with suggestions
3. **Tab Completion**: <10ms response time
4. **History**: Persistent across sessions
5. **Multiline**: Intuitive with proper indentation
6. **Colors**: Semantic highlighting for all output
## Implementation Order (Sprint v3.22.0)
### Day 1: Command System Extraction
- [ ] Extract command registry (TDD)
- [ ] Extract help system (TDD)
- [ ] Extract history management (TDD)
- [ ] Verify complexity <10 for all
- [ ] Achieve 95% coverage
### Day 2: Evaluation System
- [ ] Extract expression evaluator (TDD)
- [ ] Extract statement executor (TDD)
- [ ] Extract multiline handler (TDD)
- [ ] Verify complexity <10 for all
- [ ] Achieve 95% coverage
### Day 3: Completion & Formatting
- [ ] Extract tab completion (TDD)
- [ ] Extract output formatting (TDD)
- [ ] Extract error formatting (TDD)
- [ ] Verify complexity <10 for all
- [ ] Achieve 95% coverage
### Day 4: State Management
- [ ] Extract environment management (TDD)
- [ ] Extract settings system (TDD)
- [ ] Extract session persistence (TDD)
- [ ] Verify complexity <10 for all
- [ ] Achieve 95% coverage
### Day 5: Integration & Polish
- [ ] Integration tests (100+ scenarios)
- [ ] Performance benchmarks
- [ ] Documentation with examples
- [ ] Verify TDG A+ grade
- [ ] Achieve 90% overall coverage
## Success Metrics (MANDATORY)
- **Coverage**: ≥90% (from 18.95%)
- **Complexity**: ≤10 for ALL functions
- **TDG Grade**: A+ (≥95 points)
- **Response Time**: <50ms (p99)
- **Test Count**: 500+ tests
- **File Size**: <500 lines per file
- **Zero Technical Debt**: No TODOs, FIXMEs, or HACKs
## PMAT Enforcement
```bash
# Run every hour during development
watch -n 3600 'pmat tdg src/runtime/repl --min-grade A- --fail-on-violation'
# Run before every commit
pmat quality-gate --fail-on-violation
```
## Toyota Way Principles Applied
1. **Jidoka**: Quality built into every component
2. **Genchi Genbutsu**: Direct observation via comprehensive testing
3. **Kaizen**: Continuous improvement via metrics
4. **Poka-Yoke**: Error prevention via type system
5. **Stop the Line**: Halt on ANY quality violation
## NO COMPROMISES
- NO function over 10 complexity
- NO file over 500 lines
- NO component under 90% coverage
- NO technical debt
- NO exceptions
This is EXTREME quality - we stop for NOTHING less than perfection.