cargo-evals 0.2.0

Cargo subcommand for listing and running typed agent eval suites

cargo-evals

cargo-evals is a Cargo subcommand for listing and running typed eval suites for agents.

It can:

  • initialize evals.toml
  • discover suites generated by evals::build()
  • list suites and evals
  • run filtered evals
  • emit inline terminal output or JSON events

Commands

cargo evals init
cargo evals list
cargo evals models
cargo evals run
cargo evals preserves

Minimal Setup

Your crate needs:

  • evals::build()? in build.rs
  • evals::setup!(); in src/lib.rs
  • suites under evals/**/*.rs

With that in place, cargo evals will discover and run the generated registry automatically.

External Project Example

build.rs:

fn main() -> anyhow::Result<()> {
    evals::build()?;
    Ok(())
}

src/lib.rs:

evals::setup!();

Generate evals.toml:

cargo evals init

The generated file includes:

  • a working local Ollama target
  • a default timeout and output dir
  • commented examples for OpenAI, Anthropic, OpenRouter, Workers AI, and LM Studio

You can also write evals.toml yourself. A minimal version is:

[evals]

[[evals.targets]]
provider = "ollama"
model = "llama3.2:3b"
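Multiple targets can be declared by repeating the [[evals.targets]] table. The sketch below extends the minimal file with a second, commented-out target; the provider name "openai" and the model string are illustrative (the generated file's comments mention OpenAI among other providers), and the timeout/output-dir keys are hypothetical placeholders -- the file produced by cargo evals init is authoritative for the exact key names.

```toml
[evals]
# Hypothetical key names for the defaults mentioned above --
# check the file generated by `cargo evals init` for the real ones.
# timeout = "60s"
# output_dir = "target/evals"

# Local Ollama target (same as the minimal example).
[[evals.targets]]
provider = "ollama"
model = "llama3.2:3b"

# A second target, commented out; uncomment and adjust to use it.
# [[evals.targets]]
# provider = "openai"
# model = "gpt-4o-mini"
```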

Then:

cargo build
cargo evals list
cargo evals run