Module judge


Axis 8: LLM-judge (user-supplied rubric).

This module defines the Judge trait, which users implement (usually in Python; see python/src/shadow/llm/). The Rust side provides only the trait and the aggregation logic. It ships no default evaluator, because calling an LLM from Rust is out of scope for v0.1 (SDKs are Python-first per CONTRIBUTING.md).

Traits

Judge
User-supplied evaluator that scores a single (baseline, candidate) response pair. Scores lie in [0.0, 1.0], where 1.0 means “candidate is at least as good as baseline.”
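
To make the scoring contract concrete, here is a minimal sketch of what an implementer might look like. The method name `score`, its `&str` parameters, the toy `ExactMatchJudge`, and the `clamped_score` helper are all assumptions made for illustration; the trait's real signature is not shown on this page and may differ.

```rust
/// Hypothetical shape of the trait (assumption: the real signature may differ).
pub trait Judge {
    /// Returns a score in [0.0, 1.0]; 1.0 means the candidate is
    /// at least as good as the baseline.
    fn score(&self, baseline: &str, candidate: &str) -> f64;
}

/// Toy judge for illustration only: 1.0 when the two responses match
/// after trimming whitespace, 0.0 otherwise. A real judge would apply
/// the user's rubric, typically by calling an LLM.
pub struct ExactMatchJudge;

impl Judge for ExactMatchJudge {
    fn score(&self, baseline: &str, candidate: &str) -> f64 {
        if baseline.trim() == candidate.trim() {
            1.0
        } else {
            0.0
        }
    }
}

/// Hypothetical helper that enforces the documented range, mapping
/// NaN to 0.0 and clamping everything else into [0.0, 1.0].
pub fn clamped_score<J: Judge>(judge: &J, baseline: &str, candidate: &str) -> f64 {
    let s = judge.score(baseline, candidate);
    if s.is_nan() {
        0.0
    } else {
        s.clamp(0.0, 1.0)
    }
}
```

Clamping at the call site is one possible design choice: it keeps a misbehaving judge from pushing out-of-range values into later aggregation. Whether the crate does this is not stated here.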

Functions

compute
Aggregate scores from a user-supplied judge into an AxisStat.
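
The aggregation step could be sketched roughly as follows. Everything here is an assumption for illustration: the fields of `AxisStat` are not documented on this page, so the sketch uses a stand-in `AxisStatSketch` with a mean, a minimum, and a count, and `compute_sketch` is a hypothetical name rather than the crate's `compute`.

```rust
/// Hypothetical trait shape (assumption: the real signature may differ).
pub trait Judge {
    fn score(&self, baseline: &str, candidate: &str) -> f64;
}

/// Stand-in for AxisStat; the real struct's fields are not shown in
/// the source, so these three are illustrative only.
#[derive(Debug, Clone, PartialEq)]
pub struct AxisStatSketch {
    pub mean: f64,
    pub min: f64,
    pub n: usize,
}

/// Toy judge that returns the same score for every pair.
pub struct FixedJudge(pub f64);

impl Judge for FixedJudge {
    fn score(&self, _baseline: &str, _candidate: &str) -> f64 {
        self.0
    }
}

/// Scores every (baseline, candidate) pair and summarizes the results.
/// Returns None for an empty input, since a mean is undefined there.
pub fn compute_sketch<J: Judge>(judge: &J, pairs: &[(&str, &str)]) -> Option<AxisStatSketch> {
    if pairs.is_empty() {
        return None;
    }
    let mut sum = 0.0;
    let mut min = f64::INFINITY;
    for &(baseline, candidate) in pairs {
        let raw = judge.score(baseline, candidate);
        // Treat NaN as the worst score, then clamp into [0.0, 1.0].
        let s = if raw.is_nan() { 0.0 } else { raw.clamp(0.0, 1.0) };
        sum += s;
        min = min.min(s);
    }
    let n = pairs.len();
    Some(AxisStatSketch {
        mean: sum / n as f64,
        min,
        n,
    })
}
```

Returning `Option` for empty input is one way to avoid a division by zero; the actual function may handle that case differently.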