Module eval

Available on crate feature eval only.

Agent evaluation framework.

Test and validate agent behavior.

Available with feature: eval
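
To make the idea of validating agent behavior concrete, here is a minimal, self-contained sketch of one kind of check the item list below names: tool trajectory matching. Everything in it is an assumption made for illustration. `ToolCall`, `trajectory_score`, and `passes` are hypothetical stand-ins and are not the definitions or signatures of this module's `ToolUse`, `ToolTrajectoryScorer`, or `ToolTrajectoryConfig`; consult those items for the real API.

```rust
/// Hypothetical stand-in for a recorded tool call.
/// This is NOT the crate's `ToolUse` type.
#[derive(Debug, Clone, PartialEq)]
struct ToolCall {
    name: String,
    /// Arguments kept as a raw string to keep the sketch dependency-free.
    args: String,
}

impl ToolCall {
    fn new(name: &str, args: &str) -> Self {
        ToolCall {
            name: name.to_string(),
            args: args.to_string(),
        }
    }
}

/// Score an actual tool trajectory against the expected one.
///
/// Calls are compared position by position (name and arguments must both
/// match). The score is the number of matching positions divided by the
/// length of the longer trajectory, so missing and extra calls both lower it.
fn trajectory_score(expected: &[ToolCall], actual: &[ToolCall]) -> f64 {
    let longest = expected.len().max(actual.len());
    if longest == 0 {
        // Nothing was expected and nothing was called: a perfect match.
        return 1.0;
    }
    let matched = expected
        .iter()
        .zip(actual.iter())
        .filter(|(e, a)| e == a)
        .count();
    matched as f64 / longest as f64
}

/// A case passes when its score reaches the (assumed) threshold.
fn passes(score: f64, threshold: f64) -> bool {
    score >= threshold
}
```

With an expected trajectory of two calls and an actual trajectory containing only the first, `trajectory_score` returns 0.5, which fails a threshold of 1.0. A strict positional comparison is only one possible design; order-insensitive or subset matching would be equally reasonable choices for a sketch like this.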

Modules

criteria
Evaluation criteria definitions
error
Error types for the evaluation framework
evaluator
Core evaluator implementation
llm_judge
LLM-based evaluation scoring
prelude
Prelude for convenient imports
report
Evaluation result reporting
schema
Test file schema definitions
scoring
Scoring implementations for evaluation criteria

Structs

EvalCase
A single evaluation case (test case)
EvalSet
An eval set references multiple test files
EvaluationConfig
Configuration for the evaluator
EvaluationCriteria
Collection of evaluation criteria
EvaluationReport
Complete evaluation report for a test file or eval set
EvaluationResult
Result for a single test case
Evaluator
The main evaluator struct
Failure
A single failure in evaluation
IntermediateData
Intermediate data during a turn (tool calls, etc.)
LlmJudge
LLM-based judge for semantic evaluation
LlmJudgeConfig
Configuration for the LLM judge
ResponseMatchConfig
Configuration for response matching
ResponseScorer
Scorer for response text similarity
Rubric
A single rubric for quality assessment
RubricConfig
Configuration for rubric-based evaluation
RubricEvaluationResult
Result of rubric-based evaluation
RubricScore
Score for a single rubric
SemanticMatchResult
Result of semantic similarity evaluation
SessionInput
Session input configuration
TestFile
A complete test file containing multiple evaluation cases
ToolTrajectoryConfig
Configuration for tool trajectory matching
ToolTrajectoryScorer
Scorer for tool trajectory matching
ToolUse
A tool use (function call)
Turn
A single turn in a conversation

Enums

EvalError
Errors that can occur during evaluation

Type Aliases

Result
Result type alias for evaluation operations
TestCaseResult
Result for a single test case (alias for backward compatibility)