Skip to main content

Module eval

Module eval 

Source
Expand description

Eval / regression corpus runner (idea 4).

Replays a labeled corpus through the REAL pure enforcement functions and scores a confusion matrix per layer:

  • detectionclassify_command + extract_file_path: which file a bash command reads (the read gate’s first stage);
  • decisionevaluate(): what enforcement does given a file/gotcha state (Allow / Advisory / Deny / …).

Ground truth is independent of current behavior; cases the engine currently mishandles are tracked in baseline.json. The gate asserts each layer’s failing set equals its baseline exactly — a new miss is a regression, a fixed gap forces a baseline update (ratcheting recall up). That makes the “how do I know it doesn’t miss?” number a measured, regression-gated fact.

The corpus + baseline are embedded at compile time so mati eval runs the identical corpus in a shipped binary. Pure — no store, daemon, or network; the eval path stays inside mati’s zero-network invariant.

Structs§

EvalReport
LayerReport
Per-layer confusion matrix and baseline comparison.

Functions§

run
Run the embedded corpus through the real enforcement functions and score it.