serdes-ai-evals

Evaluation framework for testing and benchmarking serdes-ai agents

This crate provides evaluation and testing capabilities for SerdesAI:

  • Test case definitions
  • Evaluation metrics (accuracy, latency, cost)
  • Benchmark harness
  • Regression testing
  • LLM-as-judge evaluators (see the sketch just below this list)
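
The sketch below roughly illustrates the LLM-as-judge idea from the last item. Only TestCase::new and .input come from this README; LlmJudge and with_evaluator are assumed, illustrative names rather than confirmed API, so check the crate documentation for the real types.

use serdes_ai_evals::TestCase;

// Hypothetical sketch: `LlmJudge` and `with_evaluator` are illustrative names
// for the LLM-as-judge feature, not confirmed serdes-ai-evals API.
let case = TestCase::new("summary-quality")
    .input("Summarize: Rust is a systems programming language.")
    .with_evaluator(LlmJudge::new("Is the summary faithful and concise?"));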

Installation

[dependencies]
serdes-ai-evals = "0.1"

Usage

use serdes_ai_evals::{EvalSuite, TestCase};

// Build a small suite of test cases, each with an input and an expected substring.
let suite = EvalSuite::new("my-agent-tests")
    .case(TestCase::new("greeting")
        .input("Hello!")
        .expected_contains("Hello"))
    .case(TestCase::new("math")
        .input("What is 2+2?")
        .expected_contains("4"));

// Run the suite against an agent built with the main serdes-ai crate.
// `suite.run` is async, so this must be called from an async context
// (e.g. inside a tokio runtime).
let results = suite.run(&agent).await?;
println!("Pass rate: {:.1}%", results.pass_rate() * 100.0);
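
Because the feature list above includes regression testing, one common pattern is to gate CI on the pass rate. The sketch below reuses only EvalSuite, TestCase, run, and pass_rate from the example above; the #[tokio::test] attribute and the build_test_agent helper are assumptions standing in for however your project constructs its agent with serdes-ai.

use serdes_ai_evals::{EvalSuite, TestCase};

// Sketch of a CI regression gate. `build_test_agent` is a hypothetical helper:
// build the agent with the main serdes-ai crate however your project does.
// The tokio test runtime is likewise an assumption, not required by this crate.
#[tokio::test]
async fn agent_does_not_regress() -> Result<(), Box<dyn std::error::Error>> {
    let agent = build_test_agent();
    let suite = EvalSuite::new("regression")
        .case(TestCase::new("math")
            .input("What is 2+2?")
            .expected_contains("4"));

    let results = suite.run(&agent).await?;
    // Fail the build if the pass rate drops below the agreed threshold.
    assert!(results.pass_rate() >= 0.9,
        "pass rate regressed: {:.1}%", results.pass_rate() * 100.0);
    Ok(())
}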

Part of SerdesAI

This crate is part of the SerdesAI workspace.

For most use cases, you should depend on the main serdes-ai crate, which re-exports these types.
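
If you depend on serdes-ai directly, the types would then be imported through it rather than from this crate; the evals module path below is an assumption, so check the serdes-ai documentation for the exact re-export location.

// Assumed re-export path; the `evals` module name is illustrative only.
use serdes_ai::evals::{EvalSuite, TestCase};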

License

MIT License - see LICENSE for details.