Core types for the evaluation framework
Structs
- AggregateMetrics - Aggregate metrics across queries
- ChunkJudgment - Judgment for a single retrieved chunk
- EvalConfig - Configuration for eval operations
- EvalOutput - Eval output with full results
- EvalRunConfig - Config recorded in eval output
- GroundTruthEntry - A single ground truth entry (query paired with its source chunk)
- JudgeCache - Persistent cache for LLM judge verdicts
- JudgeCacheEntry - Single cache entry
- JudgeVerdict - LLM judge verdict for a (query, chunk) pair
- JudgmentEntry - A single judgment entry (written by Claude Code or external judge)
- QueryResult - Per-query result with judge details
- RetrievalResultEntry - Raw retrieval results for a single query
- RetrievedChunk - A single retrieved chunk from a query
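
The listing above gives only names and one-line summaries, so the following is a rough sketch of how per-chunk judgments might roll up into per-query results and then into aggregate metrics. Every type, field, and function here (`ChunkJudgmentSketch`, `QueryResultSketch`, `AggregateMetricsSketch`, `precision`, `aggregate`, and the metrics chosen) is a hypothetical stand-in, not the crate's actual definitions, which are not shown on this page.

```rust
// Hypothetical stand-ins: the real structs' fields are not shown in this listing.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
struct ChunkJudgmentSketch {
    chunk_id: String, // assumed identifier for the retrieved chunk
    relevant: bool,   // assumed binary verdict from the judge
}

#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
struct QueryResultSketch {
    query: String,
    judgments: Vec<ChunkJudgmentSketch>, // one judgment per retrieved chunk
}

#[derive(Debug, Clone, PartialEq)]
struct AggregateMetricsSketch {
    num_queries: usize,
    mean_precision: f64, // mean fraction of retrieved chunks judged relevant
    hit_rate: f64,       // fraction of queries with at least one relevant chunk
}

/// Fraction of a query's retrieved chunks that were judged relevant.
/// Returns 0.0 when nothing was retrieved.
fn precision(result: &QueryResultSketch) -> f64 {
    if result.judgments.is_empty() {
        return 0.0;
    }
    let relevant = result.judgments.iter().filter(|j| j.relevant).count();
    relevant as f64 / result.judgments.len() as f64
}

/// Roll per-query results up into aggregate metrics.
/// An empty input yields zeroed metrics rather than NaN.
fn aggregate(results: &[QueryResultSketch]) -> AggregateMetricsSketch {
    let n = results.len();
    if n == 0 {
        return AggregateMetricsSketch {
            num_queries: 0,
            mean_precision: 0.0,
            hit_rate: 0.0,
        };
    }
    let mean_precision = results.iter().map(precision).sum::<f64>() / n as f64;
    let hits = results
        .iter()
        .filter(|r| r.judgments.iter().any(|j| j.relevant))
        .count();
    AggregateMetricsSketch {
        num_queries: n,
        mean_precision,
        hit_rate: hits as f64 / n as f64,
    }
}
```

Keeping the per-query computation (`precision`) separate from the roll-up (`aggregate`) mirrors the split the listing suggests between per-query results and aggregate metrics, but the actual metrics the framework reports may differ.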