Expand description
Eval / regression corpus runner (idea 4).
Replays a labeled corpus through the REAL pure enforcement functions and scores a confusion matrix per layer:
- detection —
classify_command+extract_file_path: which file a bash command reads (the read gate’s first stage); - decision —
evaluate(): what enforcement does given a file/gotcha state (Allow / Advisory / Deny / …).
Ground truth is independent of current behavior; cases the engine currently
mishandles are tracked in baseline.json. The gate asserts each layer’s
failing set equals its baseline exactly — a new miss is a regression, a
fixed gap forces a baseline update (ratcheting recall up). That makes the
“how do I know it doesn’t miss?” number a measured, regression-gated fact.
The corpus + baseline are embedded at compile time so mati eval runs the
identical corpus in a shipped binary. Pure — no store, daemon, or network;
the eval path stays inside mati’s zero-network invariant.
Structs§
- Eval
Report - Layer
Report - Per-layer confusion matrix and baseline comparison.
Functions§
- run
- Run the embedded corpus through the real enforcement functions and score it.