assay-core 2.9.0

High-performance evaluation framework for LLM agents (Core)

Assay


Policy-as-Code for AI Agents. Deterministic testing, runtime enforcement, and verifiable evidence for the Model Context Protocol.

Install

curl -fsSL https://getassay.dev/install.sh | sh

Or via Cargo:

cargo install assay-cli

Core Workflow

1. Record → Replay → Validate

Record agent behavior once, then replay it deterministically in CI, with no LLM calls and no flakiness.

# Capture traces from your agent
assay import --format mcp-inspector session.json --out trace.jsonl

# Validate against policy (milliseconds, $0 cost)
assay validate --config assay.yaml --trace-file trace.jsonl

# CI gate with SARIF output
assay run --config assay.yaml --format sarif

2. Generate Policies from Behavior

# Single trace → policy
assay generate -i trace.jsonl --heuristics

# Multi-run profiling for stable policies
assay profile init --output profile.yaml --name my-app
assay profile update --profile profile.yaml -i trace.jsonl --run-id ci-123
assay generate --profile profile.yaml --min-stability 0.8
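
The idea behind a stability threshold can be sketched in a few lines of Python. This is only an illustration, not Assay's implementation: it assumes that "stability" means the fraction of profiled runs in which a tool appears, and the `stable_tools` helper and the sample run data are hypothetical.

```python
from collections import Counter


def stable_tools(runs, min_stability=0.8):
    """Return tools seen in at least `min_stability` of the runs.

    Assumption: stability = (runs containing the tool) / (total runs).
    """
    if not runs:
        return []
    counts = Counter()
    for tools in runs:
        counts.update(set(tools))  # count each tool at most once per run
    total = len(runs)
    return sorted(t for t, c in counts.items() if c / total >= min_stability)


# Hypothetical tool names observed in five recorded runs
runs = [
    ["read_file", "search"],
    ["read_file", "search", "write_file"],
    ["read_file", "search"],
    ["read_file"],
    ["read_file", "search"],
]
print(stable_tools(runs))  # ['read_file', 'search']
```

Under this definition a tool seen in only one of five runs (`write_file`) is left out of the generated policy, which is the sense in which more runs give a more stable result.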

3. Evidence Bundles (Audit/Compliance)

Tamper-evident bundles with content-addressed IDs. CloudEvents v1.0 format.

# Export evidence
assay evidence export --profile profile.yaml --out bundle.tar.gz

# Verify integrity
assay evidence verify bundle.tar.gz

# Lint for security issues (SARIF)
assay evidence lint bundle.tar.gz --format sarif

# Compare runs
assay evidence diff baseline.tar.gz current.tar.gz
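
To make "content-addressed" concrete, the following Python sketch derives an ID from the bytes of a record, so that any change to the record changes the ID. It is not Assay's bundle format: the canonical JSON encoding, the `sha256:` prefix, the `content_id` helper, and the event fields are all assumptions chosen for illustration.

```python
import hashlib
import json


def content_id(event: dict) -> str:
    """Hash a canonical JSON encoding of the event (hypothetical scheme)."""
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical event; the field names are illustrative only
event = {"type": "tool.call", "tool": "read_file", "path": "/app/config.json"}
reordered = dict(sorted(event.items()))
tampered = {**event, "path": "/etc/passwd"}

print(content_id(event) == content_id(reordered))  # True: key order is normalized
print(content_id(event) == content_id(tampered))   # False: any edit yields a new ID
```

Verification then amounts to recomputing the ID from the stored content and comparing it with the recorded one.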

Runtime Enforcement

MCP Server Proxy

# Start policy enforcement proxy
assay mcp-server --policy policy.yaml

Kernel-Level Sandbox (Linux)

# Landlock isolation (rootless)
assay sandbox --policy policy.yaml -- python agent.py

# eBPF/LSM enforcement (requires capabilities)
sudo assay monitor --policy policy.yaml --pid <agent-pid>

Configuration

assay.yaml:

version: "2.0"
name: "mcp-default-gate"

allow: ["*"]

deny:
  - "exec*"
  - "shell*"

constraints:
  - tool: "read_file"
    params:
      path:
        matches: "^/app/.*|^/data/.*"
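
One way to read the configuration above is sketched below in Python. The semantics are assumptions, not documented Assay behavior: `deny` is taken to override `allow`, tool patterns are treated as shell-style globs, and `matches` is treated as a regular expression searched against the parameter value. The `is_allowed` helper is hypothetical.

```python
import re
from fnmatch import fnmatchcase

# The assay.yaml example above, written out as a Python dict
POLICY = {
    "allow": ["*"],
    "deny": ["exec*", "shell*"],
    "constraints": [
        {"tool": "read_file",
         "params": {"path": {"matches": "^/app/.*|^/data/.*"}}},
    ],
}


def is_allowed(policy, tool, args):
    """Decide whether a tool call passes the policy (assumed semantics)."""
    # Assumption: a deny match always wins over an allow match
    if any(fnmatchcase(tool, p) for p in policy.get("deny", [])):
        return False
    if not any(fnmatchcase(tool, p) for p in policy.get("allow", [])):
        return False
    # Every constrained parameter must be a string matching its regex
    for constraint in policy.get("constraints", []):
        if constraint["tool"] != tool:
            continue
        for name, rule in constraint.get("params", {}).items():
            value = args.get(name)
            if not isinstance(value, str) or re.search(rule["matches"], value) is None:
                return False
    return True


print(is_allowed(POLICY, "read_file", {"path": "/app/config.json"}))  # True
print(is_allowed(POLICY, "read_file", {"path": "/etc/passwd"}))       # False
print(is_allowed(POLICY, "exec_command", {"cmd": "ls"}))              # False
```

Note that a prefix-style regex like this one also accepts `/app/../etc/passwd` in the sketch, so path normalization would have to happen before matching if traversal is a concern.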

Python SDK

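pip install assay

from assay import AssayClient, validate

# Record traces (`tool_call` is a placeholder for a tool call captured from your agent)
client = AssayClient("traces.jsonl")
client.record_trace(tool_call)

# Validate (`traces` is a placeholder for the recorded traces to check)
result = validate("policy.yaml", traces)
assert result["passed"]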

Pytest plugin for automatic trace capture:

import pytest

@pytest.mark.assay(trace_file="test_traces.jsonl")
def test_agent():
    pass  # exercise your agent here

Contributing

cargo test --workspace

See CONTRIBUTING.md.

License

MIT