Expand description
Compute IR metrics from pre-judged results (no API calls needed)
Reads judgments JSONL (produced by Claude Code or external judge) and retrieval results JSONL, computes standard IR metrics.
Functionsยง
- compute_
metrics_ from_ judgments - Compute metrics from pre-judged results
- format_
metrics_ summary - Format metrics summary to stdout