assay-cli-3.0.0 is not a library.

Assay

Policy-as-Code for AI agents.

Deterministic MCP governance, CI gates, and verifiable evidence bundles. Runs offline-first with no required hosted backend.

Assay validates tool-call behavior against explicit policy, records auditable decisions, and produces replayable evidence. It is built for teams that want hard gates and reviewable artifacts.

Why Assay

Deterministic gates for MCP-compatible agents in local runs and CI
Auditable evidence with export, verify, lint, diff, and replay flows
Runtime control on the tool-call path via assay mcp wrap
Offline-first workflow with portable outputs
DX-first CLI with SARIF, JUnit, PR-comment, and markdown outputs

Security Model (Bounded Claims)

Assay’s strongest wedge is deterministic governance on the tool-call route.

In the MCP fragmented-IPI (indirect prompt injection) experiment line, a stateful sequence policy remained effective across payload fragmentation, tool-hopping, sink-failure pressure, and delayed cross-session sink attempts, whereas wrap-only lexical checks failed.

Assay does not claim to solve semantic hijacking in general, and it does not claim to block raw outbound network bytes by itself. The bounded claim is narrower: Assay governs sink-call routes with explicit policy decisions, audit-grade evidence, and low single-digit millisecond overhead in the published experiment line.
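
The difference between a per-call lexical check and a stateful sequence policy can be sketched in a few lines of Python. This is an illustration of the idea only, not Assay's engine: the tool names `read_secret` and `http_post`, the `SECRET` marker, and the taint rule are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical tool names, chosen only for illustration.
SENSITIVE_SOURCES = {"read_secret"}
SINKS = {"http_post"}


def lexical_check(call: dict) -> bool:
    """Per-call check: allow unless this single call's arguments contain a marker."""
    return "SECRET" not in str(call.get("args", ""))


@dataclass
class SequencePolicy:
    """Stateful check: once a sensitive source has been read, sink calls are denied."""
    tainted: bool = False

    def allow(self, call: dict) -> bool:
        if call["tool"] in SENSITIVE_SOURCES:
            self.tainted = True
            return True
        return not (self.tainted and call["tool"] in SINKS)


# A fragmented payload: no single call carries the full marker string.
calls = [
    {"tool": "read_secret", "args": {"name": "api_key"}},
    {"tool": "http_post", "args": {"body": "SEC"}},
    {"tool": "http_post", "args": {"body": "RET"}},
]

policy = SequencePolicy()
lexical_results = [lexical_check(c) for c in calls]
sequence_results = [policy.allow(c) for c in calls]
```

The lexical check passes every fragment because it inspects each call in isolation. The sequence policy denies both sink calls because its decision depends on what happened earlier on the tool-call route, not on the bytes of any one payload.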

Results and rerun docs: see docs/ops/.

Open Core Boundary

Open core covers the engine, CLI, runtime governance, evidence flows, and baseline packs.

Compliance packs and organization-specific governance packs can be commercial. See ADR-016.

Quickstart

Install

cargo install assay-cli

From scratch

# Scaffold config + policy + CI
assay init --ci

# Run an offline smoke gate
assay ci --config eval.yaml --trace-file traces/hello.jsonl

From an existing trace

# Generate policy from recorded behavior
assay init --from-trace trace.jsonl

# Validate trace against config + policy
assay validate --config eval.yaml --trace-file trace.jsonl

From an MCP Inspector session

# Import Inspector session to Assay trace format
assay import --format inspector session.json --out-trace traces/session.jsonl

# Run policy checks
assay run --config eval.yaml --trace-file traces/session.jsonl

Demo

make demo   # full break/fix walkthrough
make test   # safe trace (PASS)
make fail   # unsafe trace (FAIL)

Core Commands

Testing and validation

Command	What it does
`assay run`	Execute a test suite against a trace and write run outputs.
`assay ci`	CI-mode run with SARIF, JUnit, and PR-comment outputs.
`assay validate`	Stateless policy validation with text, JSON, or SARIF output.
`assay replay`	Replay from a self-contained offline bundle.

Policy and config

Command	What it does
`assay init`	Scaffold policy, config, and CI workflow.
`assay generate`	Generate policy from traces or profiles.
`assay profile`	Multi-run stability profiling.
`assay doctor`	Diagnose config, trace, baseline, and runtime issues.
`assay explain`	Explain policy behavior against a trace.

Evidence and compliance

Command	What it does
`assay evidence export`	Create an evidence bundle.
`assay evidence verify`	Verify bundle integrity.
`assay evidence lint`	Lint evidence with optional packs and SARIF output.
`assay evidence diff`	Diff two verified bundles.
`assay evidence push/pull/list`	BYOS object storage flows.

Runtime

Command	What it does
`assay mcp wrap`	Wrap an MCP process with policy enforcement.
`assay sandbox`	Rootless Landlock sandbox execution on Linux.
`assay monitor`	eBPF/LSM runtime enforcement on Linux.

Misc

Command	What it does
`assay import`	Import traces from Inspector or JSON-RPC logs.
`assay tool sign/verify/keygen`	Local-key tool signing and verification.
`assay fix`	Interactive policy fix suggestions.

CI Integration

GitHub Actions

name: Assay Gate
on: [push, pull_request]

permissions:
  contents: read
  pull-requests: write
  security-events: write

jobs:
  assay:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@<PINNED_SHA>
      - uses: Rul1an/assay-action@v2

Assay Action installs Assay, runs the gate, uploads SARIF, and can publish PR-friendly outputs.

You can also generate a starter workflow:

assay init --ci github
assay init --ci gitlab

Or run manually:

assay ci \
  --config eval.yaml \
  --trace-file traces/golden.jsonl \
  --sarif reports/sarif.json \
  --junit reports/junit.xml \
  --pr-comment reports/pr-comment.md \
  --replay-strict

Exit codes:

0 pass
1 test failure
2 config or measurement error
3 infra error
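
A CI wrapper script can branch on these codes. The following is a minimal sketch: the meanings are copied from the list above, while the helper names and the retry rule (retry only on infra errors) are assumptions made for illustration.

```python
import subprocess
import sys

# Meanings copied from the exit-code list above.
EXIT_MEANINGS = {
    0: "pass",
    1: "test failure",
    2: "config or measurement error",
    3: "infra error",
}


def describe_exit(code: int) -> str:
    """Translate an exit code into the meaning documented above."""
    return EXIT_MEANINGS.get(code, f"unknown exit code {code}")


def should_retry(code: int) -> bool:
    """Assumption: only infra errors (3) are worth retrying in CI."""
    return code == 3


def run_gate(argv: list[str]) -> int:
    """Run a command, report its exit code in words, and return the code."""
    result = subprocess.run(argv)
    print(f"{argv[0]}: {describe_exit(result.returncode)}", file=sys.stderr)
    return result.returncode
```

For example, `run_gate(["assay", "ci", "--config", "eval.yaml", "--trace-file", "traces/golden.jsonl"])` uses only flags shown in the manual run above and returns the gate's exit code unchanged, so the calling CI step still fails when the gate fails.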

Configuration

Assay usually works with two files:

eval.yaml for the test suite
policy.yaml for the allowed behavior

eval.yaml:

version: 1
suite: "my_agent"
model: "trace"
tests:
  - id: "deploy_args"
    input:
      prompt: "deploy_staging"
    expected:
      type: args_valid
      schema:
        deploy_service:
          type: object
          required: [env]
          properties:
            env:
              type: string
              enum: [staging, prod]
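
To make the `args_valid` expectation concrete, here is a rough Python sketch of what checking `deploy_service` arguments against that schema could look like. It assumes the schema follows JSON Schema conventions (`type`, `required`, `properties`, `enum`) and handles only those four keywords. It is a hand-rolled illustration, not Assay's validator.

```python
# The deploy_service schema from eval.yaml above, as a Python dict.
SCHEMA = {
    "type": "object",
    "required": ["env"],
    "properties": {
        "env": {"type": "string", "enum": ["staging", "prod"]},
    },
}

# Only the two types used in the example schema are mapped.
TYPE_MAP = {"object": dict, "string": str}


def validate(value, schema, path="args"):
    """Return a list of error strings; an empty list means the value is valid."""
    errors = []
    expected = TYPE_MAP.get(schema.get("type"))
    if expected is not None and not isinstance(value, expected):
        return [f"{path}: expected {schema['type']}"]
    if "enum" in schema and value not in schema["enum"]:
        errors.append(f"{path}: {value!r} not in {schema['enum']}")
    if isinstance(value, dict):
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}.{key}: required")
        for key, sub in schema.get("properties", {}).items():
            if key in value:
                errors.extend(validate(value[key], sub, f"{path}.{key}"))
    return errors
```

With this sketch, `validate({"env": "staging"}, SCHEMA)` returns an empty list, while a call with a missing `env` or with `env: dev` returns one error naming the offending field.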

policy.yaml:

version: "1.0"
name: "my-policy"
allow: ["*"]
deny:
  - "exec"
  - "shell"
  - "bash"
constraints:
  - tool: "read_file"
    params:
      path:
        matches: "^/app/.*|^/data/.*"
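
One way to read this policy is as a small decision function. The Python sketch below is an interpretation, not Assay's implementation: it assumes that `deny` takes precedence over `allow`, that `"*"` allows any tool, and that `matches` is a regular expression searched against the parameter value.

```python
import re

# The policy.yaml above, as a Python dict.
POLICY = {
    "allow": ["*"],
    "deny": ["exec", "shell", "bash"],
    "constraints": [
        {"tool": "read_file", "params": {"path": {"matches": r"^/app/.*|^/data/.*"}}},
    ],
}


def decide(policy, tool, args):
    """Return (allowed, reason). Assumes deny wins over allow; Assay may differ."""
    if tool in policy["deny"]:
        return False, f"tool '{tool}' is denied"
    if "*" not in policy["allow"] and tool not in policy["allow"]:
        return False, f"tool '{tool}' is not allowed"
    for constraint in policy["constraints"]:
        if constraint["tool"] != tool:
            continue
        for param, rule in constraint["params"].items():
            value = str(args.get(param, ""))
            if not re.search(rule["matches"], value):
                return False, f"param '{param}' does not match {rule['matches']}"
    return True, "ok"
```

Under these assumptions, `bash` is denied outright, `read_file` on `/app/config.yaml` is allowed, and `read_file` on `/etc/passwd` is denied. A plain regex match works on the raw string, so in this sketch a path such as `/app/../etc/passwd` would still pass; if paths are not normalized before matching, the pattern may need to be stricter.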

Starter presets:

assay init --preset default
assay init --preset hardened
assay init --preset dev

Evidence Bundles

Assay produces tamper-evident .tar.gz bundles with manifests, hashes, and event streams.

assay evidence export --profile profile.yaml --out bundle.tar.gz
assay evidence verify bundle.tar.gz
assay evidence lint --pack eu-ai-act-baseline bundle.tar.gz
assay evidence diff baseline.tar.gz current.tar.gz
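
The tamper-evidence idea behind a manifest of hashes can be sketched as follows. The real bundle layout and hash algorithm are not described here, so this example assumes SHA-256 and uses a made-up file name (`events.jsonl`); it shows the general technique rather than what `assay evidence verify` does internally.

```python
import hashlib


def build_manifest(files: dict[str, bytes]) -> dict[str, str]:
    """Map each file name to the SHA-256 hex digest of its content."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in files.items()}


def verify(files: dict[str, bytes], manifest: dict[str, str]) -> list[str]:
    """Return a list of problems: missing, modified, or unexpected files."""
    problems = []
    for name, digest in manifest.items():
        if name not in files:
            problems.append(f"missing: {name}")
        elif hashlib.sha256(files[name]).hexdigest() != digest:
            problems.append(f"modified: {name}")
    for name in files:
        if name not in manifest:
            problems.append(f"unexpected: {name}")
    return problems


# Hypothetical bundle content; the file name is made up for this example.
events = {"events.jsonl": b'{"tool": "read_file", "decision": "allow"}\n'}
manifest = build_manifest(events)
tampered = {"events.jsonl": b'{"tool": "read_file", "decision": "deny"}\n'}
```

Here `verify(events, manifest)` returns no problems, while `verify(tampered, manifest)` reports the event stream as modified. A hash manifest only detects changes if the manifest itself is trusted, for example because its own digest is recorded somewhere the bundle's author cannot rewrite.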

Python Package

The Python package is published as assay-it:

pip install assay-it

Standards and Related Projects

Assay is easier to evaluate when mapped to established specs and ecosystems:

These are interoperability references, not claims of full feature parity with each project.

Documentation

Getting started: docs/getting-started/quickstart.md
CI guide: docs/guides/github-action.md
MCP quickstart: docs/mcp/quickstart.md
Use cases: docs/use-cases/index.md
Experiment runbooks/results: docs/ops/
Architecture index: docs/architecture/index.md
ADR index: docs/architecture/adrs.md
Roadmap: docs/ROADMAP.md
Contributing docs: docs/contributing/index.md

Contributing

cargo test --workspace
cargo clippy --workspace --all-targets -- -D warnings

See CONTRIBUTING.md.

License

MIT
