# serdes-ai-evals

Evaluation framework for testing and benchmarking serdes-ai agents.
This crate provides evaluation and testing capabilities for SerdesAI:
- Test case definitions
- Evaluation metrics (accuracy, latency, cost)
- Benchmark harness
- Regression testing
- LLM-as-judge evaluators
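
To make these pieces concrete, here is a rough sketch of a test case paired with an LLM-as-judge evaluator. The `TestCase` and `LlmJudge` names are illustrative assumptions, not the crate's confirmed API:

```rust
// Illustrative only: `TestCase` and `LlmJudge` are assumed names,
// not confirmed API. Check the crate docs for the real types.
use serdes_ai_evals::{TestCase, LlmJudge};

// A test case pairs an agent input with the output we expect.
let case = TestCase::new("capital-of-france")
    .input("What is the capital of France?")
    .expected("Paris");

// An LLM-as-judge evaluator grades free-form output against a rubric
// instead of requiring an exact string match.
let judge = LlmJudge::new("Is the answer factually correct and concise?");
```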
## Installation

Add the crate to your `Cargo.toml`:

```toml
[dependencies]
serdes-ai-evals = "0.1"
```
## Usage

```rust
// `EvalSuite` is an illustrative name; check the crate docs for the exact type.
use serdes_ai_evals::EvalSuite;

// Build a suite from individual test cases.
let suite = EvalSuite::new("my-suite")
    .case(/* first test case */)
    .case(/* second test case */);

// Run every case and collect the results.
let results = suite.run().await?;
println!("{results:?}");
```
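
The results can then be inspected against the metrics listed above. The field names here (`accuracy`, `mean_latency`, `total_cost`) are assumptions for illustration, not the crate's confirmed result type:

```rust
// Field names below are hypothetical; consult the crate docs for the real result type.
println!("accuracy:     {:.1}%", results.accuracy * 100.0);
println!("mean latency: {:?}", results.mean_latency);
println!("total cost:   ${:.4}", results.total_cost);
```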
## Part of SerdesAI

This crate is part of the SerdesAI workspace. For most use cases, you should use the main `serdes-ai` crate, which re-exports these types.
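
If the re-exports live under an `evals` module (an assumption about the workspace layout), the import would look something like:

```rust
// Hypothetical re-export path; the actual module name may differ.
use serdes_ai::evals::EvalSuite;
```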
## License
MIT License - see LICENSE for details.