
Module eval



Natural-language evaluations. An eval states a criterion in plain English and asks the provider’s judge to score the transcript against it. The judge returns either a boolean (an assertion) or a number (a score), and a numeric score is then compared against a threshold.
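As a rough illustration of the two eval kinds, the sketch below models an assertion and a thresholded score and decides whether a judged value passes. The variant and field names (`Assertion`, `Score`, `AtLeast`, `AtMost`, `criterion`, `threshold`) and the `passes` helper are assumptions made for this example; this page does not show the actual definitions.

```rust
/// Hypothetical comparator variants; the real `Comparator` may differ.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Comparator {
    AtLeast,
    AtMost,
}

/// Hypothetical shape of an eval specification.
#[derive(Debug, Clone, PartialEq)]
pub enum Eval {
    /// The judge answers true or false for the criterion.
    Assertion { criterion: String },
    /// The judge returns a number that is compared against `threshold`.
    Score {
        criterion: String,
        comparator: Comparator,
        threshold: f64,
    },
}

/// Hypothetical raw value returned by the judge.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum JudgeValue {
    Bool(bool),
    Number(f64),
}

/// Returns `Some(passed)` when the judge's value matches the eval kind,
/// and `None` when the kinds disagree (e.g. a boolean for a score eval).
pub fn passes(eval: &Eval, value: JudgeValue) -> Option<bool> {
    match (eval, value) {
        (Eval::Assertion { .. }, JudgeValue::Bool(b)) => Some(b),
        (
            Eval::Score {
                comparator,
                threshold,
                ..
            },
            JudgeValue::Number(n),
        ) => Some(match comparator {
            Comparator::AtLeast => n >= *threshold,
            Comparator::AtMost => n <= *threshold,
        }),
        _ => None,
    }
}
```

Under these assumptions, a score eval with `Comparator::AtLeast` and a threshold of `0.8` passes for a judged value of `0.9` and fails for `0.5`, while a boolean returned for a score eval is reported as a mismatch rather than a failure.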

Structs

EvalOutcome
The result of running one eval against a transcript.

Enums

Comparator
How a numeric score is compared to its threshold.
Eval
An eval specification, as written in a test case’s YAML.
EvalDetail
The kind-specific detail of an eval outcome, for reporting.
JudgeValue
The raw value a judge returns: either a boolean or a number, matching the eval kind. Deserialized untagged from the provider’s value field.
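To illustrate what "deserialized untagged" means for `JudgeValue`, here is a standard-library-only sketch. An untagged enum carries no discriminator field, so the deserializer tries each variant in declaration order and keeps the first one that fits. The variant names and the `parse_judge_value` helper are hypothetical, and the helper only approximates this behavior for the text of a bare scalar; it is not a JSON parser and is not the crate's implementation.

```rust
/// Stand-in for the module's `JudgeValue`; the variant names are assumptions.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum JudgeValue {
    Bool(bool),
    Number(f64),
}

/// Hypothetical helper: interprets the raw text of a scalar `value` field,
/// trying each variant in declaration order, as an untagged enum would.
pub fn parse_judge_value(raw: &str) -> Option<JudgeValue> {
    let raw = raw.trim();
    // First variant: a boolean literal, `true` or `false`.
    if let Ok(b) = raw.parse::<bool>() {
        return Some(JudgeValue::Bool(b));
    }
    // Second variant: a finite number. NaN and infinities are rejected
    // because JSON cannot represent them.
    raw.parse::<f64>()
        .ok()
        .filter(|n| n.is_finite())
        .map(JudgeValue::Number)
}
```

In this sketch, `true` maps to the boolean variant, `0.85` maps to the numeric variant, and anything else (such as a quoted string) is rejected, which is where a caller could report that the judge's value does not match the eval kind.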