
# testwall

CLI tool that enforces test immutability for agentic TDD workflows. Prevents implementing agents from cheating test gates by snapshotting test files, locking them read-only, and verifying integrity before accepting implementation results.

## Background

LLM coding agents routinely cheat test gates by weakening assertions, deleting failing tests, modifying test config, or special-casing test inputs. Research (ImpossibleBench, arXiv:2510.20270) reports that frontier models exploit test cases 76% of the time when given write access, but that cheating drops to near zero when tests are hidden or read-only. testwall enforces that boundary.

## Architecture

Single-binary Rust CLI. No runtime dependencies. The core mechanism:

1. **Snapshot** test files + compute SHA-256 checksums → `.testwall/manifest.json` + `.testwall/snapshot/`
2. **Lock** test files read-only (`chmod 444`) in the working tree
3. **Run** always restores test files from the snapshot before executing the test runner, so the real tests execute even if the agent bypassed file permissions
4. **Verify** compares current checksums against the manifest and exits nonzero on any mismatch
5. **Accept** is the merge gate: verify, unlock, and clean up the snapshot
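
The comparison in step 4 can be sketched as a pure function over two maps. This is a minimal illustration, not the code in `src/main.rs`: the names `Mismatch`, `verify`, and `exit_code` are hypothetical, and the digests are assumed to be hex strings already computed elsewhere (SHA-256 is not in the Rust standard library, so hashing is left out here).

```rust
use std::collections::BTreeMap;

/// How a tracked file differs from its manifest entry (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum Mismatch {
    /// Listed in the manifest but absent from the working tree.
    Missing(String),
    /// Present, but its digest differs from the recorded one.
    Modified(String),
}

/// Compare recorded digests (path -> hex digest) with freshly computed ones.
/// An empty result means every manifest entry still matches.
pub fn verify(
    manifest: &BTreeMap<String, String>,
    current: &BTreeMap<String, String>,
) -> Vec<Mismatch> {
    let mut mismatches = Vec::new();
    // BTreeMap iterates in sorted key order, so the report is deterministic.
    for (path, recorded) in manifest {
        match current.get(path) {
            None => mismatches.push(Mismatch::Missing(path.clone())),
            Some(digest) if digest != recorded => {
                mismatches.push(Mismatch::Modified(path.clone()))
            }
            Some(_) => {}
        }
    }
    mismatches
}

/// Map the result to a process exit code: nonzero on any mismatch.
pub fn exit_code(mismatches: &[Mismatch]) -> i32 {
    if mismatches.is_empty() { 0 } else { 1 }
}
```

This sketch ignores files that exist in the working tree but are not listed in the manifest; whether those should count as tampering is a separate policy choice.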

## Current State

- **Rust source** (`src/main.rs`): Complete, compiles, includes unit tests for SHA-256, glob matching, and file permissions. ~550 lines, single-file.
- **Python reference** (`testwall.py`): Functionally identical implementation used for integration testing. The Rust version has reached full parity, so the Python version can be removed, or kept only as a test oracle if useful.
- **Integration tested**: Full workflow validated — init, lock, verify (clean), tamper simulation, verify (catches it), accept (rejects), run (restores from snapshot), accept (passes).
- **Not yet done**: See roadmap below.

## Commands

```
testwall init [-p PATTERN...] [-c CMD]   # Snapshot test files, record checksums
testwall lock                             # Set test files read-only
testwall unlock                           # Restore write permissions
testwall run [-c CMD] [-- extra args]     # Restore from snapshot + execute tests
testwall verify [--report-only]           # Check checksums, exit 1 on mismatch
testwall accept                           # Verify + unlock + clean snapshot
testwall status                           # Show current testwall state
```

## Roadmap (priority order)

### 1. Git hook integration
Add `testwall install-hooks`, which drops a `pre-commit` hook that runs `testwall verify`. This blocks commits of tampered tests even if the agent bypassed file permissions during its session. Note that `git commit --no-verify` skips the hook, so this is a safeguard rather than a hard guarantee.
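
One possible shape for the installer, as a sketch only: the function name `install_pre_commit_hook` and the hook script contents are assumptions, since `install-hooks` is not implemented yet. It also assumes a plain repository where `.git` is a directory and `core.hooksPath` is not set.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical hook body: the commit is aborted when `testwall verify` exits nonzero.
const PRE_COMMIT_HOOK: &str =
    "#!/bin/sh\n# Installed by testwall (illustrative sketch)\nexec testwall verify\n";

/// Write the hook to `<repo_root>/.git/hooks/pre-commit` and mark it executable.
/// Refuses to overwrite a hook that already exists.
pub fn install_pre_commit_hook(repo_root: &Path) -> io::Result<PathBuf> {
    let hooks_dir = repo_root.join(".git").join("hooks");
    fs::create_dir_all(&hooks_dir)?;
    let hook_path = hooks_dir.join("pre-commit");
    if hook_path.exists() {
        return Err(io::Error::new(
            io::ErrorKind::AlreadyExists,
            "pre-commit hook already exists",
        ));
    }
    fs::write(&hook_path, PRE_COMMIT_HOOK)?;
    // Git only runs hooks that are executable; on non-Unix targets this step is skipped.
    #[cfg(unix)]
    {
        use std::os::unix::fs::PermissionsExt;
        fs::set_permissions(&hook_path, fs::Permissions::from_mode(0o755))?;
    }
    Ok(hook_path)
}
```

Refusing to overwrite keeps the sketch from clobbering a hook the user already maintains; chaining onto an existing hook would need extra handling.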

### 2. Config hardening (`--strict` mode)
Extend `init` to also snapshot test runner config that agents use to cheat without touching test files directly:
- `conftest.py`, `pytest.ini`, `setup.cfg`, `tox.ini`
- `jest.config.*`, `vitest.config.*`, `.babelrc`
- `Makefile`, `justfile` (if they contain test targets)
- `.cargo/config.toml`
- CI config (`.github/workflows/`, `.gitlab-ci.yml`)

Some of these are already in the default patterns, but `--strict` should be explicit and aggressive.

### 3. Multi-agent session orchestration
The novel feature. Formalize the two-agent workflow:
```
testwall session new --tests-from <branch>   # Pull tests, lock, create worktree
testwall session submit                       # Verify + PR/merge implementation
```
Agent A writes tests on a branch. Agent B implements in an isolated worktree where test files are immutable. `session submit` is the gate.

### 4. Publishing
- `cargo publish` to crates.io (name `testwall` is available)
- `pip install testwall` via pyproject.toml entry point (if keeping Python version)
- `npm` wrapper package that downloads the binary (like `esbuild` does)
- GitHub Actions workflow for cross-platform release binaries

### 5. Polish
- `testwall diff` — show what the agent changed in test files (before/after from snapshot)
- `testwall restore` — restore test files from snapshot without running tests
- `--watch` mode for `verify` — continuous integrity monitoring during agent sessions
- JSON output mode (`--json`) for CI integration
- Configurable exclusion patterns (`--exclude`)
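
A `restore` command would presumably share its core with the restore step that `run` already performs. The following is a rough sketch of that step under stated assumptions: `restore_from_snapshot` is a hypothetical name, and the list of tracked files is passed in directly rather than read from `.testwall/manifest.json`.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Copy each tracked file from the snapshot directory back over the working tree.
/// `files` holds paths relative to both directories (hypothetical signature).
pub fn restore_from_snapshot(
    snapshot_dir: &Path,
    work_dir: &Path,
    files: &[&str],
) -> io::Result<()> {
    for rel in files {
        let src = snapshot_dir.join(rel);
        let dst = work_dir.join(rel);
        if let Some(parent) = dst.parent() {
            fs::create_dir_all(parent)?;
        }
        // A locked (read-only) destination cannot be overwritten in place,
        // so clear the flag and delete the file before copying.
        if dst.exists() {
            let mut perms = fs::metadata(&dst)?.permissions();
            perms.set_readonly(false);
            fs::set_permissions(&dst, perms)?;
            fs::remove_file(&dst)?;
        }
        // fs::copy also carries over the source file's permission bits.
        fs::copy(&src, &dst)?;
    }
    Ok(())
}
```

Files the agent deleted are recreated as well, because the copy does not depend on the destination existing.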

## Development

```bash
cargo build              # Build
cargo test               # Run unit tests
cargo run -- init        # Run locally
```

## Design Decisions

- **Single file**: Kept everything in `src/main.rs` intentionally. Modularize when it gets past ~800 lines.
- **BTreeMap for files**: Deterministic ordering in manifest JSON for clean diffs.
- **Custom glob matching**: Avoids pulling in the `glob` crate for a small set of patterns. Supports `*`, `?`, `**` prefix, and `/**/` middle. If this gets more complex, switch to the `globset` crate.
- **Snapshot dir in .gitignore**: The snapshot contains copies of test files — it's ephemeral working state, not source of truth.
- **Manifest kept after accept**: Audit trail. You can see what was locked and when.
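
To illustrate the custom glob matching decision, here is a small segment-based matcher that handles `*`, `?`, a `**` prefix, and `/**/` in the middle. It is a sketch written for this README, not the matcher in `src/main.rs`; the function names are hypothetical, and it assumes `*` and `?` never match across `/`.

```rust
/// Match a single path segment against a pattern segment containing `*` and `?`.
fn segment_match(pattern: &[char], text: &[char]) -> bool {
    match pattern.split_first() {
        None => text.is_empty(),
        // `*` matches any run of characters within the segment, including none.
        Some((&'*', rest)) => (0..=text.len()).any(|i| segment_match(rest, &text[i..])),
        // `?` matches exactly one character.
        Some((&'?', rest)) => !text.is_empty() && segment_match(rest, &text[1..]),
        Some((&c, rest)) => text.first() == Some(&c) && segment_match(rest, &text[1..]),
    }
}

/// Match whole paths segment by segment; a `**` segment spans zero or more directories.
fn segments_match(pattern: &[&str], path: &[&str]) -> bool {
    match pattern.split_first() {
        None => path.is_empty(),
        Some((seg, rest)) if *seg == "**" => {
            (0..=path.len()).any(|i| segments_match(rest, &path[i..]))
        }
        Some((seg, rest)) => match path.split_first() {
            Some((first, tail)) => {
                let p: Vec<char> = seg.chars().collect();
                let t: Vec<char> = first.chars().collect();
                segment_match(&p, &t) && segments_match(rest, tail)
            }
            None => false,
        },
    }
}

/// Entry point: both pattern and path use `/` as the separator.
pub fn glob_match(pattern: &str, path: &str) -> bool {
    let p: Vec<&str> = pattern.split('/').collect();
    let t: Vec<&str> = path.split('/').collect();
    segments_match(&p, &t)
}
```

For example, `glob_match("**/test_*.py", "pkg/test_a.py")` matches, while `glob_match("*.rs", "src/main.rs")` does not because `*` stops at `/`. The backtracking is exponential in the worst case, which is acceptable for a handful of short patterns but is another reason to move to `globset` if the pattern set grows.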