assay-core 2.12.0

High-performance evaluation framework for LLM agents (Core)
Documentation

Assay

Crates.io CI License Open Core

Policy-as-Code for AI Agents. Deterministic testing, runtime enforcement, and verifiable evidence for the Model Context Protocol.

Open Core: Engine + baseline packs are open source (MIT/Apache-2.0). Enterprise packs and managed workflows are commercial. See ADR-016 for details.

Install

curl -fsSL https://getassay.dev/install.sh | sh

Or via Cargo:

cargo install assay-cli

Core Workflow

1. Record → Replay → Validate

Record agent behavior once, replay deterministically in CI. No LLM calls, no flakiness.

# Capture traces from your agent
assay import --format mcp-inspector session.json --out trace.jsonl

# Validate against policy (milliseconds, $0 cost)
assay validate --config assay.yaml --trace-file trace.jsonl

# CI gate with SARIF output
assay run --config assay.yaml --format sarif

2. Generate Policies from Behavior

# Single trace → policy
assay generate -i trace.jsonl --heuristics

# Multi-run profiling for stable policies
assay profile init --output profile.yaml --name my-app
assay profile update --profile profile.yaml -i trace.jsonl --run-id ci-123
assay generate --profile profile.yaml --min-stability 0.8

3. Evidence Bundles

Tamper-evident bundles with content-addressed IDs. CloudEvents v1.0 format.

# Export evidence
assay evidence export --profile profile.yaml --out bundle.tar.gz

# Verify integrity
assay evidence verify bundle.tar.gz

# Lint for security issues (SARIF output)
assay evidence lint bundle.tar.gz --format sarif

# Lint with compliance pack
assay evidence lint --pack eu-ai-act-baseline bundle.tar.gz

# Compare runs
assay evidence diff baseline.tar.gz current.tar.gz

4. Compliance Packs

Built-in rule packs for regulatory compliance. Article-referenced, auditor-friendly.

# EU AI Act Article 12 (logging requirements)
assay evidence lint --pack eu-ai-act-baseline bundle.tar.gz

# Multiple packs
assay evidence lint --pack eu-ai-act-baseline,soc2-baseline bundle.tar.gz

# Custom pack
assay evidence lint --pack ./my-org-rules.yaml bundle.tar.gz

SARIF output includes article references for audit trails.

5. Tool Signing

Cryptographic signatures for tool definitions. Ed25519 + DSSE.

# Generate keypair
assay tool keygen --out ~/.assay/keys/

# Sign tool definition
assay tool sign tool.json --key priv.pem --out signed.json

# Verify signature
assay tool verify signed.json --pubkey pub.pem

6. BYOS (Bring Your Own Storage)

Push evidence to your own S3-compatible storage. No vendor lock-in.

# Push bundle
assay evidence push bundle.tar.gz --store s3://my-bucket/evidence

# Pull by ID
assay evidence pull --bundle-id sha256:abc... --store s3://my-bucket/evidence

# List bundles
assay evidence list --store s3://my-bucket/evidence

Supports: AWS S3, Backblaze B2, Cloudflare R2, MinIO, Azure Blob, GCS.

Runtime Enforcement

MCP Server Proxy

# Start policy enforcement proxy
assay mcp-server --policy policy.yaml

Kernel-Level Sandbox (Linux)

# Landlock isolation (rootless)
assay sandbox --policy policy.yaml -- python agent.py

# eBPF/LSM enforcement (requires capabilities)
sudo assay monitor --policy policy.yaml --pid <agent-pid>

GitHub Action

- uses: Rul1an/assay-action@v2

Zero-config evidence verification. SARIF integration with GitHub Security tab.

See GitHub Marketplace.

Configuration

assay.yaml:

version: "2.0"
name: "mcp-default-gate"

allow: ["*"]

deny:
  - "exec*"
  - "shell*"

constraints:
  - tool: "read_file"
    params:
      path:
        matches: "^/app/.*|^/data/.*"

Python SDK

pip install assay
from assay import AssayClient, validate

# Record traces
client = AssayClient("traces.jsonl")
client.record_trace(tool_call)

# Validate
result = validate("policy.yaml", traces)
assert result["passed"]

Pytest plugin for automatic trace capture:

@pytest.mark.assay(trace_file="test_traces.jsonl")
def test_agent():
    pass

Documentation

Contributing

cargo test --workspace

See CONTRIBUTING.md.

License

MIT