Module eval

Structs

DimensionScores
Per-dimension score breakdown.
EvalRun
An evaluation run of an agent version against a scenario set.
EvalRunRequest
Request to start a new eval run.
EvalWeights
Configurable weights for the 6 evaluation dimensions.
FailureClusterSummary
Summary of failure clusters across a run.
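The summaries above do not say how EvalWeights and DimensionScores are combined, or what the six dimensions are called. As a rough illustration only, the sketch below assumes a weighted mean over six per-dimension scores. The names SketchScores, SketchWeights, and weighted_score are hypothetical stand-ins, not part of this module's API, and the real field layout may differ.

```rust
/// Number of evaluation dimensions, per the `EvalWeights` summary.
const DIMENSIONS: usize = 6;

/// Hypothetical stand-in for `DimensionScores`: one score per dimension.
/// The real struct's fields are not shown on this page.
#[derive(Debug, Clone, Copy)]
struct SketchScores([f64; DIMENSIONS]);

/// Hypothetical stand-in for `EvalWeights`: one weight per dimension,
/// assumed here to be non-negative.
#[derive(Debug, Clone, Copy)]
struct SketchWeights([f64; DIMENSIONS]);

/// Weighted mean of the per-dimension scores (an assumed aggregation rule).
/// Returns `None` when the weights sum to zero or less, because the mean
/// is undefined in that case.
fn weighted_score(scores: &SketchScores, weights: &SketchWeights) -> Option<f64> {
    let total: f64 = weights.0.iter().sum();
    if total <= 0.0 {
        return None;
    }
    // Dot product of scores and weights, normalised by the weight total.
    let dot: f64 = scores
        .0
        .iter()
        .zip(weights.0.iter())
        .map(|(s, w)| s * w)
        .sum();
    Some(dot / total)
}
```

Under this assumption, equal weights reduce to a plain average of the six scores, and setting a weight to zero drops that dimension from the overall score.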

Enums

EvalRunStatus
FailureCluster
Root cause categories for trace failures.