hen 0.15.0

Run protocol-aware API request collections from the command line or through MCP.
# Roadmap

Hen already has the right core shape: a file-based request DSL, request dependencies, captures and assertions, parallel execution, machine-readable reporting, and an MCP server for agent-driven workflows.

The next stage is not about adding syntax for its own sake. The goal is to make Hen the best tool for three concrete jobs:

1. Writing API tests as source-controlled assets.
2. Running those tests reliably in CI.
3. Using the same collections from editors, agents, and automation without translation.

This roadmap is ordered by leverage. Earlier phases remove the most shell glue, reduce the most manual test authoring, and improve the highest-friction production workflows first.

## Product Direction

Hen should become a contract-aware API testing framework with a small, readable authoring model and a strong non-interactive automation story.

That means prioritizing:

- Contract-first workflows over handwritten boilerplate.
- Built-in auth, environments, and secrets over shell-script workarounds.
- Reliable async and CI behavior over one-off local ergonomics.
- Better failure forensics over more syntax surface area.
- Shared execution semantics across CLI, MCP, and containerized CI.

## Current Strengths

These are already meaningful differentiators and should be preserved:

- File-based collections that are easy to review and version.
- Request dependencies modeled as a DAG.
- Response captures, assertions, conditional guards, and callbacks.
- Typed, structural, and schema-backed assertions with machine-readable mismatch output.
- Request fan-out through array-backed map execution.
- Non-interactive CLI paths with structured JSON, NDJSON, and JUnit output.
- Parallel execution with dependency awareness.
- Protocol-neutral execution shared across HTTP, GraphQL, MCP, SSE, and WebSocket.
- MCP integration that reuses the same execution engine as the CLI.
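
To make "parallel execution with dependency awareness" concrete, here is a minimal sketch of how a request DAG can be grouped into waves, where every request in a wave depends only on requests from earlier waves. It uses a Kahn-style topological pass. The function name `execution_waves` and the map-of-names input are assumptions for illustration and are not taken from Hen's planner.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Groups requests into waves: every request in a wave depends only on
/// requests from earlier waves, so a wave could be executed in parallel.
/// Returns Err with the names that can never run (a cycle, or a dependency
/// on a request that does not exist).
/// Hypothetical helper for illustration; not Hen's actual planner API.
pub fn execution_waves(
    deps: &BTreeMap<&str, Vec<&str>>,
) -> Result<Vec<Vec<String>>, Vec<String>> {
    // Remaining unmet dependencies per request.
    let mut remaining: BTreeMap<&str, BTreeSet<&str>> = deps
        .iter()
        .map(|(name, d)| (*name, d.iter().copied().collect()))
        .collect();
    let mut waves = Vec::new();

    while !remaining.is_empty() {
        // Requests whose dependencies have all completed.
        let ready: Vec<&str> = remaining
            .iter()
            .filter(|(_, d)| d.is_empty())
            .map(|(name, _)| *name)
            .collect();
        if ready.is_empty() {
            // Nothing can make progress: cycle or unknown dependency.
            return Err(remaining.keys().map(|n| n.to_string()).collect());
        }
        for name in &ready {
            remaining.remove(name);
        }
        // Mark the finished wave as satisfied for everything still waiting.
        for d in remaining.values_mut() {
            for name in &ready {
                d.remove(name);
            }
        }
        waves.push(ready.iter().map(|n| n.to_string()).collect());
    }
    Ok(waves)
}
```

Sorted maps keep the wave contents in a stable order, which matches the roadmap's emphasis on deterministic behavior in parallel runs.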

## Phase 1: Contract-First Testing

This is the highest-leverage gap between Hen and gold-standard API tooling.

### Contract Outcomes

- Generate starter collections from OpenAPI documents.
- Validate requests and responses against OpenAPI schemas.
- Detect spec drift between committed collections and current API contracts.
- Make contract validation available from both CLI and MCP.

### Why First

- It dramatically reduces the cost of onboarding new APIs.
- It moves Hen from request runner to contract-aware test framework.
- It creates a more defensible testing story for teams adopting Hen in CI.

### Contract Features

- `hen import openapi <spec>` to generate starter `.hen` collections.
- Schema assertions such as response body validation against a schema or operation.
- Optional request validation before execution.
- Contract drift reports in JSON and JUnit-friendly forms.
- Operation-aware examples that preserve tags, summaries, and auth hints from the spec.
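
One way to think about a contract drift report is as a set comparison between the operations a spec declares and the operations a collection exercises. The sketch below assumes both sides have already been reduced to `(method, path)` pairs; the `DriftReport` type and the `detect_drift` and `op` functions are hypothetical names for illustration, not part of Hen.

```rust
use std::collections::BTreeSet;

/// An API operation identified by HTTP method and path template.
pub type Operation = (String, String);

/// Builds an operation with a normalized (uppercase) method.
pub fn op(method: &str, path: &str) -> Operation {
    (method.to_ascii_uppercase(), path.to_string())
}

/// Hypothetical drift report shape, for illustration only.
#[derive(Debug, PartialEq)]
pub struct DriftReport {
    /// Declared by the spec but not exercised by any request.
    pub uncovered: Vec<Operation>,
    /// Exercised by the collection but no longer present in the spec.
    pub stale: Vec<Operation>,
}

/// Compares spec operations with collection operations in both directions.
pub fn detect_drift(
    spec: &BTreeSet<Operation>,
    collection: &BTreeSet<Operation>,
) -> DriftReport {
    DriftReport {
        uncovered: spec.difference(collection).cloned().collect(),
        stale: collection.difference(spec).cloned().collect(),
    }
}
```

A real implementation would also need to compare schemas, not just operation identity, but the two-directional difference is the part that maps directly onto a JSON or JUnit report.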

### Contract Implementation Anchors

- CLI entry points in `src/main.rs`.
- MCP tool surface in `src/mcp.rs`.
- Assertion model in `src/request/assertion.rs`.
- Reporting and machine-readable output in `src/report.rs`.
- Collection parsing and validation in `src/parser/` and `src/collection/`.

### Contract Exit Criteria

- A user can point Hen at an OpenAPI spec and get a runnable starter collection.
- A collection can validate a response body against a contract without shelling out.
- Contract failures appear cleanly in JSON and JUnit outputs.

## Phase 2: Auth, Environments, and Secrets

Hen currently supports flexible variable resolution, but gold-standard API testing needs first-class primitives for common authentication and configuration patterns.

### Auth Outcomes

- Reduce callback and shell-script usage for standard auth flows.
- Make collections portable across local, CI, and MCP contexts.
- Prevent accidental secret leakage in logs and structured output.

### Auth Features

- Named environments with layered overrides, for example local, staging, and prod.
- Secret providers for env vars, files, and external secret backends.
- Redaction policies for headers, bodies, captures, and report outputs.
- Built-in OAuth2 and OIDC token acquisition flows.
- Cookie jar persistence across dependent requests.
- Support for mTLS client authentication and common request-signing mechanisms such as HMAC and AWS SigV4.
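
Layered overrides and redaction are both simple to state precisely. The following is a minimal sketch, assuming environments are flat string maps and that redaction applies to a configured list of header names; `resolve_layers`, `redact_headers`, and the `[REDACTED]` placeholder are illustrative assumptions rather than Hen's actual design.

```rust
use std::collections::BTreeMap;

/// Merges environment layers in order; later layers override earlier ones.
/// Hypothetical helper for illustration.
pub fn resolve_layers(layers: &[BTreeMap<String, String>]) -> BTreeMap<String, String> {
    let mut resolved = BTreeMap::new();
    for layer in layers {
        for (key, value) in layer {
            resolved.insert(key.clone(), value.clone());
        }
    }
    resolved
}

/// Replaces the values of sensitive headers with a fixed placeholder.
/// Header names are matched case-insensitively.
pub fn redact_headers(
    headers: &[(String, String)],
    sensitive: &[&str],
) -> Vec<(String, String)> {
    headers
        .iter()
        .map(|(name, value)| {
            let hit = sensitive.iter().any(|s| s.eq_ignore_ascii_case(name));
            let shown = if hit {
                "[REDACTED]".to_string()
            } else {
                value.clone()
            };
            (name.clone(), shown)
        })
        .collect()
}
```

Applying redaction at the point where report data is assembled, rather than at each output format, is one way to keep human and machine-readable outputs consistent.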

### Auth Implementation Anchors

- Variable and template resolution in `src/request/template.rs`.
- Request execution and context propagation in `src/request/mod.rs` and `src/request/runner.rs`.
- CLI and MCP input handling in `src/main.rs` and `src/mcp.rs`.
- Structured reporting in `src/report.rs`.

### Auth Exit Criteria

- A typical login-and-refresh workflow can be modeled without shell callbacks.
- The same collection can be executed against multiple environments with explicit overrides.
- Sensitive values are redacted consistently in human and machine-readable outputs.

## Phase 3: Reliability for Real Test Suites

Hen already runs collections well. The next step is making large suites robust under eventual consistency, background processing, and flaky networks.

### Reliability Outcomes

- Support async API workflows without forcing ad hoc shell loops.
- Give CI owners explicit control over retries, timeouts, and failure policy.
- Preserve deterministic reporting even when behavior is retried or delayed.

### Reliability Features

- Poll-until assertions with timeout, interval, and backoff.
- Per-request timeout and retry policy configuration.
- Failure-classification support for transport, assertion, timeout, and dependency failures.
- Request-level idempotency hints for safer retry behavior.
- Suite-level controls for retrying flaky requests separately from hard assertion failures.
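
The poll-until behavior with timeout, interval, and backoff can be sketched as a small loop. This version takes the check and the sleep as closures so it can be exercised deterministically; `PollPolicy`, `PollOutcome`, and `poll_until` are hypothetical names, and the time budget here counts only sleep time, not the time spent inside the check.

```rust
use std::time::Duration;

/// Hypothetical polling policy, for illustration only.
pub struct PollPolicy {
    pub timeout: Duration,
    pub interval: Duration,
    /// Multiplier applied to the delay after each failed attempt.
    pub backoff: u32,
    pub max_interval: Duration,
}

#[derive(Debug, PartialEq)]
pub enum PollOutcome {
    Passed { attempts: u32 },
    TimedOut { attempts: u32 },
}

/// Runs `check` until it passes or the next sleep would exceed the timeout.
pub fn poll_until(
    policy: &PollPolicy,
    mut check: impl FnMut() -> bool,
    mut sleep: impl FnMut(Duration),
) -> PollOutcome {
    let mut waited = Duration::ZERO;
    // Guard against a zero interval or zero multiplier looping forever.
    let mut delay = policy.interval.max(Duration::from_millis(1));
    let factor = policy.backoff.max(1);
    let mut attempts = 0;
    loop {
        attempts += 1;
        if check() {
            return PollOutcome::Passed { attempts };
        }
        if waited + delay > policy.timeout {
            return PollOutcome::TimedOut { attempts };
        }
        sleep(delay);
        waited += delay;
        delay = (delay * factor).min(policy.max_interval);
    }
}
```

In production the `sleep` argument would be `std::thread::sleep` or an async equivalent. Returning the attempt count keeps retries visible in structured output instead of hiding them behind a single pass or fail.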

### Reliability Implementation Anchors

- Request execution flow in `src/request/mod.rs`.
- Plan and dependency orchestration in `src/request/planner.rs` and `src/request/runner.rs`.
- Failure reporting in `src/report.rs`.

### Reliability Exit Criteria

- A user can model eventual-consistency checks directly in the DSL.
- Retries and polling are visible in structured output and do not obscure root-cause failures.
- CI users can set policy without wrapping Hen in external scripts.

## Phase 4: Debugging and Failure Forensics

Status: in progress, with the core failure-detail foundation already landed.

A strong testing tool is judged heavily by how fast it explains failures. Hen already ships useful building blocks here: per-request start time and total duration in structured output, CLI elapsed and slowest-request summaries, truncated body previews, interruption-aware partial reports, and structured assertion mismatches with exact paths, types, and reasons. The remaining Phase 4 work is deeper trace export, artifact retention, and better visualization of multi-step runs.

### Forensics Outcomes

- Make failed runs diagnosable from retained artifacts and machine-readable reports alone.
- Reduce the need to rerun with ad hoc logging once deeper trace export exists.
- Improve multi-step workflow visibility across request dependencies and session-backed flows.

### Forensics Features

- Rich timing breakdowns for DNS, connect, TLS, first byte, and total duration where available. Total request duration is already exposed; the missing work is phase-level breakdown.
- Request and response artifact export suitable for retention outside the process. Normalized execution artifacts and transcripts already exist internally, but they are not yet first-class report outputs.
- HAR export or equivalent HTTP trace output.
- Dependency-aware execution trace reports for multi-step runs.
- Human-readable diffs between expected and actual payload fragments beyond the current path-aware mismatch summaries.
- Request-graph visualization for collections with dependencies.
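
As an illustration of what a human-readable payload diff could look like on top of path-aware mismatches, the sketch below compares two payloads that are assumed to be already flattened into `path -> rendered value` maps. The flattening step, the `$.`-style paths, and the `render_diff` function are assumptions for this example and do not describe Hen's current output.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Renders unified-style diff lines for two flattened payloads.
/// Lines starting with "-" come from the expected payload and lines
/// starting with "+" come from the actual payload. Equal paths are omitted.
/// Hypothetical helper for illustration.
pub fn render_diff(
    expected: &BTreeMap<String, String>,
    actual: &BTreeMap<String, String>,
) -> Vec<String> {
    // Visit every path from either side in sorted order.
    let paths: BTreeSet<&String> = expected.keys().chain(actual.keys()).collect();
    let mut lines = Vec::new();
    for path in paths {
        match (expected.get(path), actual.get(path)) {
            (Some(e), Some(a)) if e == a => {}
            (Some(e), Some(a)) => {
                lines.push(format!("- {}: {}", path, e));
                lines.push(format!("+ {}: {}", path, a));
            }
            (Some(e), None) => lines.push(format!("- {}: {}", path, e)),
            (None, Some(a)) => lines.push(format!("+ {}: {}", path, a)),
            (None, None) => {}
        }
    }
    lines
}
```

Because the diff is derived from the same per-path data as the structured mismatch output, the human-readable and machine-readable views stay in agreement.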

### Forensics Implementation Anchors

- Execution snapshots in `src/request/response_capture.rs` and `src/request/mod.rs`.
- CLI output and previews in `src/main.rs`.
- Machine-readable reporting in `src/report.rs`.

### Forensics Exit Criteria

- A failed CI run can produce enough retained trace data to debug without reproducing locally.
- Multi-step collections expose a clear execution breadcrumb trail instead of only per-request summaries.
- Payload mismatches are presented as readable diffs when broad fixture or large-payload comparisons fail, not just as path-aware mismatch summaries.

Detailed plan: see [PHASE_4.md](PHASE_4.md).

## Phase 5: Snapshot and Fixture Workflows

Status: in progress, but most of the original Phase 5 assertion and data-modeling scope is already complete.

Hen already ships typed assertion evaluation, structural JSON matching, filtered selectors, schema-backed validation, and structured mismatch reporting. The remaining Phase 5 work is snapshot-style assertions for larger stable payloads and the reviewable fixture workflows that go with them.

- Add snapshot assertions with explicit update flows.
- Support redaction and normalization so fixtures stay reviewable and safe for source control.
- Preserve readable diffs plus structured mismatch output for automation.
- Keep fixture comparison explicit rather than turning it into the default assertion style.
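
The combination of normalization and an explicit update flow can be sketched as follows. Payloads are assumed to be flattened `path -> value` maps, volatile paths are replaced with a placeholder before comparison, and a snapshot is only written when the caller opts in. All names here (`normalize`, `check_snapshot`, `SnapshotResult`) are hypothetical.

```rust
use std::collections::BTreeMap;

pub type Snapshot = BTreeMap<String, String>;

#[derive(Debug, PartialEq)]
pub enum SnapshotResult {
    Match,
    Mismatch,
    Written,
}

/// Replaces values at volatile paths (timestamps, generated IDs) with a
/// placeholder so the stored fixture stays stable and reviewable.
pub fn normalize(payload: &Snapshot, volatile: &[&str]) -> Snapshot {
    payload
        .iter()
        .map(|(path, value)| {
            let value = if volatile.contains(&path.as_str()) {
                "<normalized>".to_string()
            } else {
                value.clone()
            };
            (path.clone(), value)
        })
        .collect()
}

/// Compares against the stored snapshot. A missing or different snapshot is
/// a mismatch unless `update` is set, in which case it is (re)written.
pub fn check_snapshot(
    stored: &mut Option<Snapshot>,
    actual: Snapshot,
    update: bool,
) -> SnapshotResult {
    if stored.as_ref() == Some(&actual) {
        return SnapshotResult::Match;
    }
    if update {
        *stored = Some(actual);
        SnapshotResult::Written
    } else {
        SnapshotResult::Mismatch
    }
}
```

Treating a missing snapshot as a failure unless an update is requested keeps CI runs from silently creating fixtures.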

Detailed plan: see [ROADMAP_ASSERTIONS_AND_EXTENSIBILITY.md](ROADMAP_ASSERTIONS_AND_EXTENSIBILITY.md).

## Phase 6: Finish Protocol Extensibility and Plugin Hooks

Status: in progress, but most of the built-in transport expansion is already landed.

Hen already has the transport-neutral execution boundary plus built-in GraphQL, MCP-over-HTTP target testing, SSE, and WebSocket support. The remaining Phase 6 work is to finish the highest-leverage ergonomic follow-ups and expose a narrow plugin model without widening the built-in protocol list again.

- Keep planning, execution, assertions, and reporting stable across the shipped protocol families.
- Finish the most valuable MCP follow-ups, especially where authors still need shell glue for discovery, auth, or session ergonomics.
- Expose a narrow plugin model only after the current seams remain stable under real use.
- Treat gRPC as a likely proving transport for the plugin model rather than a required built-in protocol milestone.
- Keep MCP-as-target work separate from changes to Hen's own MCP server interface.

Detailed plan: see [PROTOCOL_EXTENSION.md](PROTOCOL_EXTENSION.md).

## Phase 7: Editor and Authoring Experience

Hen's DSL is one of its strengths. Better authoring support will directly affect adoption.

### Authoring Outcomes

- Make `.hen` files easier to write correctly the first time.
- Reduce context switching between docs and editor.
- Improve discoverability of advanced features such as dependencies and contract validation.

### Authoring Features

- Linting and diagnostics for common authoring mistakes.
- Autocomplete for directives, operators, captures, and dependency names.
- Jump-to-definition for fragments and dependency references.
- Inline request graph visualization.
- Import from curl and import from OpenAPI inside editor workflows.
- Collection templates for common auth and test patterns.
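
A lint pass for dependency references is one of the simpler diagnostics to picture. The sketch below flags dependencies on unknown request names and self-dependencies, assuming the collection has already been parsed into a map of request names to dependency names. The `Diagnostic` type and `lint_dependencies` function are hypothetical and only illustrate the kind of check an editor integration could surface.

```rust
use std::collections::BTreeMap;

/// Hypothetical diagnostic shape, for illustration only.
#[derive(Debug, PartialEq)]
pub struct Diagnostic {
    pub request: String,
    pub message: String,
}

/// Reports self-dependencies and dependencies on undefined requests.
pub fn lint_dependencies(deps: &BTreeMap<&str, Vec<&str>>) -> Vec<Diagnostic> {
    let mut out = Vec::new();
    for (name, requires) in deps {
        for dep in requires {
            if dep == name {
                out.push(Diagnostic {
                    request: name.to_string(),
                    message: "request depends on itself".to_string(),
                });
            } else if !deps.contains_key(dep) {
                out.push(Diagnostic {
                    request: name.to_string(),
                    message: format!("unknown dependency `{}`", dep),
                });
            }
        }
    }
    out
}
```

The same list of known request names could back autocomplete and jump-to-definition for dependency references.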

### Authoring Exit Criteria

- New users can author useful collections with minimal documentation lookup.
- The editor can catch common collection mistakes before execution.
- OpenAPI-driven and hand-authored workflows feel equally first-class.

## Cross-Cutting Engineering Work

These tracks should accompany every phase rather than waiting for the end.

- Keep CLI and MCP semantics aligned so features do not drift.
- Execute the repository modularization plan in [MODULARIZATION_PLAN.md](MODULARIZATION_PLAN.md) so core modules stay reviewable as features land.
- Preserve machine-readable outputs as stable automation contracts.
- Expand fixture-based integration coverage under `tests/` for every new DSL or reporting feature.
- Add focused examples under `examples/` for each major feature as it lands.
- Document every new feature in `README.md` and `syntax-reference.md` at ship time.
- Maintain deterministic behavior in parallel and dependency-driven runs.

## Suggested Execution Order

If only a few large workstreams can be funded, this is the right order:

1. Contract-first testing.
2. Auth, environments, and secrets.
3. Reliability features for async and CI-heavy suites.
4. Debugging and failure forensics.
5. Snapshot and fixture workflows.
6. Finish protocol extensibility and plugin hooks.
7. Editor and authoring experience.

## Explicit Non-Goals for Now

These may become useful later, but they should not displace the higher-leverage work above.

- Large visual dashboards as a primary interface.
- Embedded test data stores or broad test orchestration outside API workflows.
- Heavy DSL expansion that duplicates shell scripting without improving readability.
- Features that diverge between CLI and MCP execution semantics.

## Success Measures

The roadmap is working if it produces measurable changes in how Hen is adopted:

- Fewer shell callbacks in example and production collections.
- Faster onboarding from API spec to runnable test suite.
- More actionable CI artifacts when tests fail.
- Broader protocol and auth coverage without bloating basic request authoring.
- Stable automation usage through CLI, containers, and MCP integrations.