Structs
- DimensionScores - Per-dimension score breakdown.
- EvalRun - An evaluation run of an agent version against a scenario set.
- EvalRunRequest - Request to start a new eval run.
- EvalWeights - Configurable weights for the 6 evaluation dimensions.
- FailureClusterSummary - Summary of failure clusters across a run.
Enums
- EvalRunStatus
- FailureCluster - Root cause categories for trace failures.