# CLAUDE.md - Ruchy Compiler Implementation Protocol
## Prime Directive
**Generate correct code that compiles on first attempt. Quality is built-in, not bolted-on.**
## Implementation Hierarchy
```yaml
Navigation:
1. SPECIFICATION.md # What to build (reference)
2. ROADMAP.md # Strategic priorities
3. docs/execution/ # Tactical work breakdown
├── roadmap.yaml # PDMT task generation template
├── roadmap.md # Generated execution DAG
└── velocity.json # Empirical performance data
```
## Task Execution Protocol
### Pre-Implementation Verification
```rust
// HALT. Before implementing ANY feature:
□ Locate specification section in SPECIFICATION.md
□ Find task ID in docs/execution/roadmap.md
□ Verify dependencies completed via DAG
□ Check existing patterns in codebase
□ Confirm complexity budget (<10 cognitive)
```
### Task Code Format
```bash
# Every commit references execution task:
git commit -m "DF-P-001: Implement DataFrame literal parsing
Validates: SPECIFICATION.md Section 3.1
Performance: 65MB/s parsing throughput
Coverage: 94% on parser/expr.rs"
```
## Compiler Architecture Patterns
### Parser Pattern - Pratt with Error Recovery
```rust
impl Parser {
fn parse_expr(&mut self, min_bp: u8) -> Result<Expr, ParseError> {
let mut lhs = self.parse_prefix()?;
while let Some(&op) = self.peek() {
let (l_bp, r_bp) = op.binding_power();
if l_bp < min_bp { break; }
self.advance();
let rhs = self.parse_expr(r_bp)?;
lhs = Expr::binary(op, lhs, rhs, self.span());
}
Ok(lhs)
}
}
```
### Type Inference - Bidirectional Checking
```rust
impl TypeChecker {
fn check(&mut self, expr: &Expr, expected: Type) -> Result<(), TypeError> {
match (&expr.kind, expected) {
(ExprKind::Lambda(params, body), Type::Function(arg_tys, ret_ty)) => {
self.check_params(params, arg_tys)?;
self.check(body, *ret_ty)
}
_ => {
let inferred = self.infer(expr)?;
self.unify(inferred, expected)
}
}
}
fn infer(&mut self, expr: &Expr) -> Result<Type, TypeError> {
match &expr.kind {
ExprKind::Variable(name) => self.env.lookup(name),
ExprKind::Apply(func, arg) => {
let func_ty = self.infer(func)?;
let (param_ty, ret_ty) = self.expect_function(func_ty)?;
self.check(arg, param_ty)?;
Ok(ret_ty)
}
_ => self.infer_primitive(expr)
}
}
}
```
### Zero-Cost Transpilation
```rust
impl Transpiler {
fn transpile_dataframe(&self, df: &DataFrameExpr) -> TokenStream {
// Direct Polars LogicalPlan generation - no intermediate allocation
match &df.kind {
DfKind::Literal(cols) => {
let columns = cols.iter().map(|c| self.transpile_column(c));
quote! { ::polars::prelude::df![#(#columns),*] }
}
DfKind::Pipeline(source, ops) => {
let base = self.transpile(source);
ops.iter().fold(base, |acc, op| {
self.apply_operation(acc, op)
})
}
}
}
fn apply_operation(&self, frame: TokenStream, op: &DfOp) -> TokenStream {
// Zero-copy operation chaining
match op {
DfOp::Filter(predicate) => {
let pred = self.transpile(predicate);
quote! { #frame.lazy().filter(#pred) }
}
DfOp::GroupBy(keys) => {
let key_exprs = self.transpile_keys(keys);
quote! { #frame.groupby([#(#key_exprs),*]) }
}
}
}
}
```
## Memory Management Patterns
### Arena Allocation for AST
```rust
pub struct AstArena {
nodes: TypedArena<Expr>,
strings: StringInterner,
}
impl AstArena {
pub fn alloc(&self, expr: Expr) -> &Expr {
// O(1) allocation, bulk deallocation
self.nodes.alloc(expr)
}
pub fn intern(&mut self, s: &str) -> InternedString {
// String deduplication
self.strings.get_or_intern(s)
}
}
```
### Session-Scoped Resources
```rust
struct CompilerSession {
arena: AstArena,
type_cache: FxHashMap<TypeId, Type>,
symbol_table: SymbolTable,
}
impl Drop for CompilerSession {
fn drop(&mut self) {
// Bulk deallocation - no per-node overhead
}
}
```
## Performance Invariants
### Parsing Throughput
```rust
#[bench]
fn bench_parse_throughput(b: &mut Bencher) {
let input = include_str!("../corpus/large.ruchy"); // 10K LOC
b.iter(|| {
let mut parser = Parser::new(input);
parser.parse_module()
});
// Invariant: >50MB/s
assert!(b.bytes_per_second() > 50_000_000);
}
```
### Type Inference Latency
```rust
#[bench]
fn bench_type_inference(b: &mut Bencher) {
let ast = test_ast();
b.iter(|| {
let mut checker = TypeChecker::new();
checker.infer_module(&ast)
});
// Invariant: <15ms for typical program
assert!(b.ns_per_iter() < 15_000_000);
}
```
## Error Diagnostics
### Elm-Level Error Quality
```rust
impl Diagnostic {
fn render(&self) -> String {
let mut output = String::new();
// Source context with highlighting
writeln!(output, "{}", self.source_snippet());
writeln!(output, "{}", "^".repeat(self.span.len()).red());
// Primary message
writeln!(output, "\n{}: {}", "Error".red().bold(), self.message);
// Actionable suggestion
if let Some(suggestion) = &self.suggestion {
writeln!(output, "\n{}: {}", "Hint".green(), suggestion);
}
output
}
}
```
## PMAT MCP Quality Proxy (Real-Time Enforcement)
### MCP Server Configuration
```json
// ~/.config/claude/mcps/pmat.json
{
"mcpServers": {
"pmat": {
"command": "~/.local/bin/pmat",
"args": ["serve", "--mode", "mcp", "--config", "~/ruchy/pmat.toml"],
"env": {
"PMAT_PROJECT_ROOT": "~/ruchy",
"PMAT_QUALITY_MODE": "strict"
}
}
}
}
```
### PMAT Thresholds (Enforced in Real-Time)
```toml
# pmat.toml - MCP proxy blocks violations instantly
[thresholds]
cyclomatic_complexity = 10 # Blocks at write-time
cognitive_complexity = 15 # No mental overload
halstead_effort = 5000 # Computational limits
maintainability_index = 70 # Minimum maintainability
test_coverage = 80 # Coverage gate
satd_comments = 0 # Zero technical debt
mutation_score = 75 # Mutation testing gate
```
### MCP Quality Proxy Tools
```bash
# PMAT exposes these MCP tools to Claude:
pmat_analyze_code # Real-time complexity analysis
pmat_check_coverage # Test coverage verification
pmat_detect_smells # Code smell detection
pmat_suggest_refactor # Automated refactoring hints
pmat_mutation_test # Mutation testing on-demand
pmat_quality_gate # Full quality check
# These run automatically as you code via MCP
```
### Live Quality Feedback Pattern
```rust
// As you type, PMAT MCP provides instant feedback:
fn process_data(data: &Data) -> Result<(), Error> {
// PMAT: Complexity 3/10 ✅
validate(data)?;
if data.complex { // PMAT: +1 complexity (4/10)
for item in &data.items { // PMAT: +2 (6/10)
if item.check() { // PMAT: +3 nested (9/10) ⚠️
// PMAT WARNING: Approaching complexity limit
process_item(item)?;
}
}
}
Ok(())
}
// PMAT automatically suggests:
// "Extract loop to process_items() function"
```
### Zero-SATD Enforcement
```rust
// PMAT MCP blocks these in real-time:
fn temporary_hack() {
// TODO: Fix this later // ❌ PMAT: BLOCKED - SATD detected
// FIXME: Memory leak // ❌ PMAT: BLOCKED - SATD detected
// HACK: Works for now // ❌ PMAT: BLOCKED - SATD detected
// Instead, PMAT forces:
// Track in GitHub Issues, not code comments
}
```
## The Make Lint Contract (Zero Warnings Allowed)
```bash
# make lint command from Makefile:
cargo clippy --all-targets --all-features -- -D warnings
```
**Critical**: The `-D warnings` flag treats EVERY clippy warning as a hard error. This is more rigid than standard clippy and creates a quality backlog.
### What This Means for Your Code
```rust
// Standard clippy: These would be warnings
x.to_string(); // Warning: could use .into()
&vec![1, 2, 3]; // Warning: could use slice
if x == true { } // Warning: could omit == true
// With make lint: These FAIL the build
x.to_string(); // ERROR - build fails
&vec![1, 2, 3]; // ERROR - build fails
if x == true { } // ERROR - build fails
```
### Surviving -D warnings
```rust
// Write defensive code from the start:
x.into(); // Prefer into() over to_string()
&[1, 2, 3]; // Use slice literals
if x { } // Omit redundant comparisons
// For unavoidable warnings, be explicit:
#[allow(clippy::specific_lint)] // Document why
fn special_case() { }
```
## Canonical Implementation Order
Based on dependency analysis and critical path:
1. **DataFrame Support** (Blocking all examples)
- Parser extensions for df literals
- Type system DataFrame/Series types
- Transpiler Polars generation
2. **Result Type** (Error handling foundation)
- Parser ? operator precedence
- Type inference for Result<T, E>
- Error propagation in transpiler
3. **Actor System** (Concurrency model)
- Parser actor/receive syntax
- Type system message types
- Runtime mailbox implementation
Each phase validates against SPECIFICATION.md sections and maintains performance invariants.
## Architectural Decisions
### Why Pratt Parsing?
Operator precedence without grammar ambiguity. O(n) time, O(1) space per precedence level.
### Why Bidirectional Type Checking?
Combines inference power with predictable checking. Lambda parameters inferred from context.
### Why Direct Polars Transpilation?
Zero intermediate representation. DataFrame operations compile to optimal LogicalPlan.
### Why Arena Allocation?
Bulk deallocation. No per-node overhead. Cache-friendly traversal.
## The Development Flow
```
1. LOCATE specification section
2. IDENTIFY task in execution roadmap
3. VERIFY dependencies complete
4. IMPLEMENT with <10 complexity
5. VALIDATE performance invariants
6. COMMIT with task reference
```
## Sprint Hygiene Protocol
### Pre-Sprint Cleanup (MANDATORY)
```bash
# Remove all debug binaries before starting sprint
rm -f test_* debug_*
find . -type f -executable -not -path "./target/*" -not -path "./.git/*" -delete
# Verify no large files
find . -type f -size +100M -not -path "./target/*" -not -path "./.git/*"
# Clean build artifacts
cargo clean
```
### Post-Sprint Checklist
```bash
# 1. Remove debug artifacts
rm -f test_* debug_* *.o *.a
# 2. Update tracking
git add docs/execution/velocity.json docs/execution/roadmap.md
# 3. Verify no cruft
git status --ignored
# 4. Push with clean history
git push origin main
```
### .gitignore Requirements
Always maintain in .gitignore:
- `test_*` (debug test binaries)
- `debug_*` (debug executables)
- `!*.rs` (except rust source files)
- `!*.toml` (except config files)
---
**Remember**: Compiler engineering is about systematic transformation, not clever hacks. Every abstraction must have zero runtime cost. Every error must be actionable. Every line of code must justify its complexity budget.