Module evaluate


In-process corpus evaluation.

Equivalent to aperion-shield --check, but stripped down for the diff use case:

  • Memory and the burst detector are disabled (the equivalents of --no-memory and --no-burst). Both are stateful, so enabling them would make the second engine’s evaluation depend on the first engine’s history and produce non-reproducible diffs. The Python prototype does the same.
  • Workspace context is still computed once and shared between the two runs (it’s a function of --workspace, not the rules).
  • Output is in-process structs, not serialised JSON, so no parse round-trip cost on big corpora.
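The design above can be illustrated with a minimal sketch. Every name here (WorkspaceContextSketch, StatelessEngine, diff_sketch) is hypothetical and chosen only for illustration; the real engine and context types are not shown on this page. The sketch assumes a decision depends only on the rules, the shared context, and the current event, which is what disabling memory and the burst detector is meant to guarantee.

```rust
/// Hypothetical workspace context: a function of the workspace only,
/// so it can be computed once and shared by both runs.
pub struct WorkspaceContextSketch {
    pub root: String,
}

/// Hypothetical stateless engine standing in for one rules file.
/// It holds no memory and no burst detector.
pub struct StatelessEngine {
    pub blocked_word: String,
}

impl StatelessEngine {
    /// The decision depends only on (rules, context, event),
    /// never on events seen earlier.
    pub fn decide(&self, ctx: &WorkspaceContextSketch, event: &str) -> &'static str {
        let touches_workspace = event.contains(ctx.root.as_str());
        if touches_workspace && event.contains(self.blocked_word.as_str()) {
            "block"
        } else {
            "allow"
        }
    }
}

/// Run both engines over the same events with the same context and
/// return (index, before, after) for every event whose decision changed.
pub fn diff_sketch(
    ctx: &WorkspaceContextSketch,
    before: &StatelessEngine,
    after: &StatelessEngine,
    events: &[&str],
) -> Vec<(usize, &'static str, &'static str)> {
    let run = |engine: &StatelessEngine| -> Vec<&'static str> {
        events.iter().map(|e| engine.decide(ctx, e)).collect()
    };
    let before_out = run(before);
    let after_out = run(after);
    before_out
        .into_iter()
        .zip(after_out)
        .enumerate()
        .filter(|(_, (b, a))| b != a)
        .map(|(i, (b, a))| (i, b, a))
        .collect()
}
```

Because neither engine carries state between events, calling diff_sketch twice on the same inputs yields the same result, and the results stay in memory as plain values rather than passing through JSON.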

Output schema mirrors the JSON emitted by --check for the fields the diff explainer actually consumes. If new fields are added to --check’s JSON output, mirror them here only if the diff explainer needs them; otherwise we accumulate stale fields that confuse readers.

Structs§

DecisionLine
One evaluation result, mirroring the JSON shape --check writes per line. Names are kept stable with the Python prototype’s DecisionLine so the JSON output schema stays source-compatible.
EvalOptions
Options that apply equally to both engine runs (before / after).

Functions§

ensure_rules_exists
Validate the rules path early so we can fail with a clearer error than “reading shieldset failed”. Used by run_diff_mode, which checks both rules paths up front.
evaluate_corpus
Run the engine at rules_path over the JSON-Lines corpus, returning one DecisionLine per non-blank, non-comment input line in order. Invalid JSON lines map to an “allow” decision with a sentinel reason so the index pairing in the diff stays aligned with the corpus.
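
The line-handling contract of evaluate_corpus can be sketched as follows. This is not the real signature: SketchDecision, INVALID_JSON_REASON, and evaluate_corpus_sketch are hypothetical names, the parser and engine are passed in as closures to keep the example free of external crates, and treating a leading '#' as the comment marker is an assumption, since this page does not say which marker is used.

```rust
/// Hypothetical, pared-down stand-in for the real DecisionLine.
#[derive(Debug, Clone, PartialEq)]
pub struct SketchDecision {
    pub decision: String,
    pub reason: String,
}

/// Sentinel reason for unparseable lines (the actual value is an assumption).
pub const INVALID_JSON_REASON: &str = "invalid-json";

/// Produce one decision per non-blank, non-comment line, in corpus order.
/// `parse` returns None for invalid JSON; `engine` decides on a parsed event.
pub fn evaluate_corpus_sketch<T, P, E>(corpus: &str, parse: P, engine: E) -> Vec<SketchDecision>
where
    P: Fn(&str) -> Option<T>,
    E: Fn(&T) -> SketchDecision,
{
    corpus
        .lines()
        .map(str::trim)
        // Assumption: blank lines and lines starting with '#' are skipped.
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .map(|line| match parse(line) {
            Some(event) => engine(&event),
            // Invalid lines still emit a result, so index i in the "before"
            // output and index i in the "after" output refer to the same line.
            None => SketchDecision {
                decision: "allow".to_string(),
                reason: INVALID_JSON_REASON.to_string(),
            },
        })
        .collect()
}
```

Emitting a placeholder decision instead of dropping the bad line is what keeps the two result vectors the same length, so the diff can pair them by position without carrying line numbers around.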