agentcarousel
agentcarousel is a Rust CLI (and library) for working with fixture files: schema checks, scenario runs (mock or live), and SQLite history.
Evaluate agent behavior and skills with reproducible fixtures, scored checks, and exportable evidence.
Build
Configuration
The CLI reads agentcarousel.toml from the repository root or
~/.config/agentcarousel/config.toml. Use --config <path> to override.
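The lookup described above can be sketched as a small pure function. This is an illustration only, not the CLI's actual code: the name resolve_config, the injected exists check, and the precedence order (explicit --config first, then the repository root, then the user config directory) are all assumptions.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: choose which config file to load.
/// Assumed precedence: explicit override, repo root, user config dir.
/// `exists` is injected so the logic can be exercised without touching disk.
fn resolve_config(
    override_path: Option<&Path>,
    repo_root: &Path,
    home: &Path,
    exists: impl Fn(&Path) -> bool,
) -> Option<PathBuf> {
    // An explicit --config path wins and is returned as given.
    if let Some(p) = override_path {
        return Some(p.to_path_buf());
    }
    // Otherwise return the first candidate that exists, in order.
    let candidates = vec![
        repo_root.join("agentcarousel.toml"),
        home.join(".config").join("agentcarousel").join("config.toml"),
    ];
    candidates.into_iter().find(|p| exists(p.as_path()))
}
```

In a real binary the exists argument would likely be something like |p| p.is_file(), and a None result would mean falling back to built-in defaults.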
Run history defaults to:
- macOS: ~/Library/Application Support/agentcarousel/history.db
- Linux: ~/.local/share/agentcarousel/history.db
Set AGENTCAROUSEL_HISTORY_DB to override this location.
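A minimal sketch of how the history path might be chosen, assuming that AGENTCAROUSEL_HISTORY_DB takes priority over the per-platform defaults. The function name history_db_path and the choice to return None on other platforms are hypothetical; the two default paths are the ones listed above.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: pick the SQLite history file.
/// `env_override` is the value of AGENTCAROUSEL_HISTORY_DB, if set.
/// `os` uses the same strings as std::env::consts::OS ("macos", "linux").
fn history_db_path(env_override: Option<&str>, os: &str, home: &Path) -> Option<PathBuf> {
    // Assumption: the environment variable overrides the default.
    if let Some(p) = env_override {
        return Some(PathBuf::from(p));
    }
    match os {
        "macos" => Some(home.join("Library/Application Support/agentcarousel/history.db")),
        "linux" => Some(home.join(".local/share/agentcarousel/history.db")),
        // Other platforms are not covered by the documented defaults.
        _ => None,
    }
}
```

A caller could pass std::env::var("AGENTCAROUSEL_HISTORY_DB").ok().as_deref() and std::env::consts::OS as the first two arguments.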
Live generation and judge evaluation are supported with Gemini, OpenAI, Anthropic, and OpenRouter models.
Quickstart (5 minutes)
# 1) Build
# 2) Export a provider key (examples)
# or export OPENAI_API_KEY=your_key_here
# or export OPENROUTER_API_KEY=your_key_here
# 3) Run a live evaluation (Gemini example)
Provider recipes
# Budget: OpenRouter free tier generator + Gemini judge
# Budget: OpenRouter free tier generator + OpenRouter judge
# Balanced: OpenAI generator + OpenAI judge
# Premium: Gemini generator + Gemini judge
For more examples and troubleshooting, see [docs/quickstart.md](docs/quickstart.md).
Common commands
# Validate fixtures against schema (paths required)
# Run tests from fixture paths (defaults to fixtures/)
# Evaluation pass (see docs/evaluator-contract.md)
# Report on stored runs
# Scaffold a new fixture YAML
# Bundle pack/verify (M3)
# Registry workflow (single publish command)
# Publish all matching local runs for that bundle (newest first)
# Export an evidence pack for a run id
# Check registry trust state (online-first)
# Optional offline minisign verification with local attestation
Internal modules
CLI, core, fixtures, runner, evaluators, and reporters live as submodules under crates/agentcarousel/src/ in one Cargo package.
Documentation
See the GitHub repository.
ATF / trust: AgentCarousel maps to the Agentic Trust Framework as an evidence + CI gates implementation.