agentcarousel
Testing for AI agents
agentcarousel is a CLI for people who want confidence in AI agent behavior before shipping. It starts simple (validate + test locally) and scales to live evaluation, evidence export, and trust workflows.
Why people use this
- Verify agent quality quickly with schema + rule checks.
- Run deterministic offline tests in CI.
- Store run history and compare results over time.
- Export evidence bundles for audits and share benchmarks.
- Publish bundles/runs to a registry and verify trust state.
Start Here
If you just want value fast, do these three commands.
1) Install
|
Notes:
- Installer supports Linux and macOS.
- On Windows, download the
.ziprelease asset from GitHub. - The installer offers an
agcalias for convenience.
2) Validate fixtures
A fixture is a YAML file that defines test cases (and optional mocks) for an agent or skill so the CLI can validate, run offline tests, or evaluate behavior.
With no paths, validate scans the current directory for fixture files.
3) Run offline tests
This is the safest default for CI and public fixture repos.
Common Commands
# Help
# Validate fixture files (schema + rules)
# Run fixtures with mock generation
# Evaluate (mock or live)
# List and inspect stored runs
# Scaffold a new fixture
# Export evidence for one run (or newest N runs)
Configuration
Config file lookup order:
--config <path>(explicit)./agentcarousel.toml(project)~/.config/agentcarousel/config.toml(user)
History database defaults:
- macOS:
~/Library/Application Support/agentcarousel/history.db - Linux:
~/.local/share/agentcarousel/history.db
Override history path with:
Live evaluation with judge models
Supported providers for live generation and judging include Gemini, OpenAI, Anthropic, and OpenRouter.
More recipes and troubleshooting: see docs/fixture-development-process.md and the Configuration section above for live eval env vars.
Bundle workflows
A versioned package of fixtures, described by bundle.manifest.json with content hashes, that can be packed, verified, and published.
# Create a distributable bundle archive
# Verify bundle integrity and structure
# Pull bundle manifest + artifacts from a registry (API details at agentcarousel.com when published)
Publish to registry
# Publish bundle + evidence in one flow
# Publish multiple matching local runs (newest first)
Trust checks
The registry has the published assurance state for a bundle (queryable via trust-check), optionally backed by a signed attestation (e.g. minisign) that you verify.
# Registry trust-state check
# Optional offline attestation verification
Build From Source
Prerequisites:
- Rust 1.95+
# Build package and binaries
# Run from source (explicit binary)
Binaries provided by this package:
agentcarouselagc
OSS and Contributions
- Start here:
CONTRIBUTING.md - Security policy:
SECURITY.md - Changelog:
CHANGELOG.md
For fixture contributions, we prefer opening an issue before implementation.
Releases and crates.io
- Rust crate: crates.io/crates/agentcarousel