AgentCarousel
Unit tests for AI agents. Define behavior in YAML, run offline tests, export signed evidence bundles your reviewers will accept.
Why AgentCarousel
- Deterministic by default - Offline runs with mocks mean same inputs → same outputs, every time.
- Built for evidence - Every run produces a signed artifact (.tar.gz + minisign attestation) you can hand to an auditor, a reviewer, or your customer's security team.
- Live evals when you want them - Plug in OpenAI, Anthropic, Gemini, or OpenRouter as generator and judge. Diff runs. Catch regressions.
- Compliance-aware fixtures - Risk tier, data handling, certification track — the metadata your governance program already tracks, baked into the test format.
Install
# Install (Linux — Windows: download .zip from Releases)
# Homebrew (macOS)
# Cargo (Rust)
Quickstart
# Scaffold a fixture
# Run it offline — no API keys needed
# Validate
# Eval
# Export evidence bundle
Live Eval with LLM-as-a-judge
Bundle workflows
# Create a distributable bundle archive
# Verify bundle integrity and structure
# Pull bundle manifest + artifacts from the registry
Publish to registry
# Publish bundle + evidence in one flow
# Publish multiple matching local runs (newest first)
Trust checks
# Registry trust-state check
# Optional offline attestation verification
Configuration
Config file lookup order:
1. --config <path> (explicit)
2. ./agentcarousel.toml (project)
3. ~/.config/agentcarousel/config.toml (user)
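To make the precedence concrete, here is a minimal Python sketch of the lookup order listed above. It is an illustration only, not AgentCarousel's actual implementation: the function name `resolve_config` and its parameters are hypothetical, and it assumes an explicit path is taken as given while the project and user files are used only if they exist.

```python
from pathlib import Path
from typing import Optional


def resolve_config(
    explicit: Optional[str] = None,
    cwd: Optional[Path] = None,
    home: Optional[Path] = None,
) -> Optional[Path]:
    """Return the config file chosen by the documented lookup order.

    Hypothetical helper for illustration; not part of the agentcarousel CLI.
    """
    cwd = cwd or Path.cwd()
    home = home or Path.home()

    # 1. An explicit --config path wins.
    #    Assumption: it is used as given, without an existence check.
    if explicit is not None:
        return Path(explicit)

    # 2. Project file, then 3. user file; the first one that exists is used.
    candidates = [
        cwd / "agentcarousel.toml",
        home / ".config" / "agentcarousel" / "config.toml",
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate

    # Nothing found: the caller would fall back to built-in defaults.
    return None
```

With this ordering, a project-level `agentcarousel.toml` shadows the user-level file, and `--config` shadows both.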
Database defaults:
- macOS: ~/Library/Application Support/agentcarousel/history.db
- Linux: ~/.local/share/agentcarousel/history.db
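If a script needs to locate the history database, the defaults above can be mirrored with a small helper. This is a sketch under stated assumptions: `default_history_db` is a hypothetical name, not an AgentCarousel API, and it covers only the two platforms documented here.

```python
import sys
from pathlib import Path
from typing import Optional


def default_history_db(
    platform: str = sys.platform, home: Optional[Path] = None
) -> Path:
    """Return the documented default history.db location for a platform.

    Hypothetical helper for illustration; only macOS and Linux are documented.
    """
    home = home or Path.home()
    if platform == "darwin":
        return home / "Library" / "Application Support" / "agentcarousel" / "history.db"
    if platform.startswith("linux"):
        return home / ".local" / "share" / "agentcarousel" / "history.db"
    # No default is documented for other platforms, so refuse to guess.
    raise ValueError(f"no documented default history path for {platform!r}")
```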
Override history path with:
Contributions
- Start here: CONTRIBUTING.md
- Security policy: SECURITY.md
- Changelog: CHANGELOG.md
For fixture contributions, open an issue before implementation.