agent-first-http 0.4.2

Persistent HTTP client for AI agents — one request, one JSON line
# Agent-First HTTP — Design

## Problem

AI agents call HTTP APIs through bash tool calls. With curl, every request spawns a new process, pays a full TCP+TLS handshake, and returns human-readable text that must be parsed. Agents need structured JSON output and — when making multiple calls — connection reuse.

### The cost of curl-per-request

```
Agent                       curl process              Server
  │                            │                        │
  ├─ spawn curl ──────────────→│                        │
  │                            ├─ TCP handshake ───────→│
  │                            ├─ TLS handshake ───────→│
  │                            ├─ HTTP request ────────→│
  │                            │←──── HTTP response ────┤
  │←── stdout (text) ─────────┤                        │
  │                            ╳ (process exits)        │
  │                                                     │
  ├─ spawn curl ──────────────→│                        │  ← another process
  │                            ├─ TCP handshake ───────→│  ← another handshake
  │                            ├─ TLS handshake ───────→│  ← another TLS
  │                            ├─ HTTP request ────────→│
  │                            │←──── HTTP response ────┤
  │←── stdout (text) ─────────┤                        │
  │                            ╳ (process exits)        │
```

Ten requests to the same host mean 10 TCP handshakes and 10 TLS negotiations. On a 200 ms RTT link, assuming one round trip each for TCP and TLS 1.3, that is 10 × 2 × 200 ms = 4 seconds of pure overhead.
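The overhead figure is plain arithmetic, sketched below. It assumes each fresh connection costs two round trips (one for TCP, one for TLS 1.3; TLS 1.2 would need more). The function name and parameters are illustrative and not part of `afhttp`.

```rust
/// Handshake overhead in milliseconds for `connections` fresh connections.
/// `round_trips_per_connection` is an assumption: 2 = TCP (1) + TLS 1.3 (1).
fn handshake_overhead_ms(connections: u64, rtt_ms: u64, round_trips_per_connection: u64) -> u64 {
    connections * round_trips_per_connection * rtt_ms
}
```

With one curl process per request, `handshake_overhead_ms(10, 200, 2)` gives 4000 ms. With a single reused connection, the same ten requests pay the handshake once, or 400 ms.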

## Two Modes

### CLI mode (default)

One bash tool call, one request, one JSON response, process exits:

```
Agent ──→ afhttp GET https://api.example.com/users ──→ JSON stdout ──→ Agent
```

Default output: `response` or `error` — one JSON line, process exits. For streaming: `chunk_start` → `chunk_data...` → `chunk_end`. Use `--verbose` for diagnostic output (startup, request, progress, retry, redirect).

This is how most agent tool calls work — fire a request, read the result, move on.

### Pipe mode (`--mode pipe`)

For workflows that benefit from connection reuse, concurrent requests, or WebSocket:

```
Agent ──→ afhttp --mode pipe (stdin JSONL ←→ stdout JSONL) ──→ Agent
```

A long-lived process. The agent sends `request`/`config`/`send`/`cancel`/`close` commands as JSONL to stdin and reads responses from stdout. Connections stay open between requests, and multiple requests can be in flight simultaneously. `close` triggers shutdown by cancelling active work, waiting briefly for terminal events, then emitting a final `close` acknowledgement.

## Architecture

```
CLI mode:                           Pipe mode:

  argv ──→ parse_args()              stdin ──────────→ Request Parser (JSONL)
             │                                           │
             ▼                                           ▼
        ┌────────────┐              ┌─────────────────────────────┐
        │  reqwest    │              │  Connection Pool Manager     │
        │  Client     │              │  pool[host1] ─→ conn(h2)    │──→ host1:443
        │  (single    │──→ server    │  pool[host2] ─→ conn(h2)    │──→ host2:443
        │   request)  │              │  pool[host3] ─→ conn(h1)    │──→ host3:80
        └────────────┘              └─────────────────────────────┘
             │                                           │
             ▼                                           ▼
        stdout (JSON)                stdout ←──── Response Writer (JSONL)
```

All runtime protocol output goes to stdout as JSON. stderr is not a protocol channel.

### Shared core

Both modes share the same handler, chunked streaming, and WebSocket code. CLI mode builds a single request from argv, sends it through the same `execute_request()` path, and collects output via the same `mpsc` channel — just stripping `id`/`tag` fields before writing.
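The field stripping described above might look like the following sketch. An event is modelled here as an ordered list of string pairs purely for illustration; the real implementation presumably works on JSON values, and `strip_correlation_fields` is a hypothetical name.

```rust
/// Drop the pipe-mode correlation fields before writing CLI output.
/// Events are modelled as ordered (key, value) pairs for illustration only.
fn strip_correlation_fields(event: Vec<(String, String)>) -> Vec<(String, String)> {
    event
        .into_iter()
        .filter(|(key, _)| key != "id" && key != "tag")
        .collect()
}
```

All other fields pass through untouched and in their original order, so CLI output differs from pipe output only by the missing `id` and `tag`.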

### Concurrency (pipe mode)

```
stdin reader (main task)
  │
  ├─ parse request 1 ──→ spawn tokio task ──→ client.send() ──→ write stdout
  ├─ parse request 2 ──→ spawn tokio task ──→ client.send() ──→ write stdout
  ├─ parse request 3 ──→ spawn tokio task ──→ client.send() ──→ write stdout
  │
  └─ (continues reading stdin without blocking)
```

Each request is an independent tokio task. The stdin reader never blocks on HTTP I/O. Responses are written to stdout as they complete, identified by `id`.
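The dispatch pattern can be sketched with the standard library alone. This version swaps tokio tasks for OS threads and uses `std::sync::mpsc` so that it runs without external crates; a sleep stands in for HTTP I/O. The function name and the `(id, latency)` input shape are assumptions made for the example.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Simulate in-flight requests given as (id, simulated latency in ms).
/// Returns the ids in the order the workers finished.
fn run_concurrent(requests: Vec<(u32, u64)>) -> Vec<u32> {
    let (tx, rx) = mpsc::channel();
    for (id, latency_ms) in requests {
        let tx = tx.clone();
        // One worker per request, so the dispatch loop never waits on I/O.
        thread::spawn(move || {
            thread::sleep(Duration::from_millis(latency_ms));
            // A send error only means the receiver is gone; ignore it.
            let _ = tx.send(id);
        });
    }
    // Drop the original sender so the receive loop ends once all workers finish.
    drop(tx);
    rx.iter().collect()
}
```

Because results arrive in completion order rather than submission order, the consumer has to correlate each one by `id`, which is why every pipe-mode response carries it.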

## Design Principles

### Server errors are errors

If the server violates the HTTP protocol (e.g. sends non-ASCII bytes in a header), `afhttp` surfaces this as `code: "error"` with `error_code: "invalid_response"`. No silent patching, no lossy fallbacks. The agent receives accurate information and decides how to react.

### Errors are structured, not human text

Every error carries `error_code` (machine-readable, stable), `error` (human-readable detail), and `retryable` (bool). Agents match on `error_code` rather than string-parsing `error`.
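On the agent side, handling such an error could look like this sketch. The struct mirrors the three fields named above. The `Action` enum, the retry policy, and the `decide` function are hypothetical, and `invalid_response` is the only `error_code` value taken from this document.

```rust
/// Illustrative mirror of the three error fields.
struct AgentError {
    error_code: String,
    error: String,
    retryable: bool,
}

#[derive(Debug, PartialEq)]
enum Action {
    Retry,
    ReportToUser(String),
    GiveUp,
}

/// Hypothetical agent policy: branch on `error_code`, never on the text of `error`.
fn decide(err: &AgentError, attempts_left: u32) -> Action {
    match err.error_code.as_str() {
        // The server broke the protocol; surface the detail instead of retrying blindly.
        "invalid_response" => Action::ReportToUser(err.error.clone()),
        _ if err.retryable && attempts_left > 0 => Action::Retry,
        _ => Action::GiveUp,
    }
}
```

The human-readable `error` string is only ever forwarded for display, so its wording can change without breaking the agent.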

### Secret fields are redacted in config echo

All stdout lines go through `agent_first_data::output_json_with()` for consistent single-line JSON formatting with an explicit redaction policy. Config/log output (`startup`, `config`, `log`) uses full `_secret` redaction so `key_pem_secret` never appears in plain text in config echo.

Server response data (response bodies, headers, WebSocket messages) is passed through unmodified. Redaction does not apply to server-originated content.
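Suffix-based redaction can be illustrated with a small sketch. It is not the `agent_first_data` implementation: the `***` placeholder and the flat key/value representation are assumptions, and only the `_secret` suffix rule comes from the text above.

```rust
/// Placeholder is an assumption; the real redaction marker may differ.
const REDACTED: &str = "***";

/// Replace the value of every field whose name ends in `_secret`.
fn redact_config(fields: &[(&str, &str)]) -> Vec<(String, String)> {
    fields
        .iter()
        .map(|(key, value)| {
            let shown = if key.ends_with("_secret") { REDACTED } else { *value };
            (key.to_string(), shown.to_string())
        })
        .collect()
}
```

A function like this would be applied only to config and log output. Server response payloads bypass it entirely, in line with the pass-through rule above.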

### Header scope safety boundary

`defaults.headers_for_any_hosts` is global and applies to every outbound host. It is restricted to non-sensitive public headers only (for example `User-Agent`, `Accept`).

Any credential material (`Authorization`, API keys, cookies, bearer tokens) must be scoped with `host_defaults[host].headers` so secrets cannot be sent to unrelated domains.
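The scoping rule can be sketched as a header merge. The `Config` struct below reuses the field names from the text, but its shape, the override order (host-scoped values replace global ones on conflict), and the deny-list in `is_allowed_globally` are all assumptions made for illustration.

```rust
use std::collections::BTreeMap;

type Headers = BTreeMap<String, String>;

/// Illustrative config shape; field names follow the text above.
struct Config {
    headers_for_any_hosts: Headers,
    host_defaults: BTreeMap<String, Headers>,
}

/// Headers sent to `host`: global ones first, then host-scoped ones on top.
fn effective_headers(config: &Config, host: &str) -> Headers {
    let mut merged = config.headers_for_any_hosts.clone();
    if let Some(scoped) = config.host_defaults.get(host) {
        for (name, value) in scoped {
            merged.insert(name.clone(), value.clone());
        }
    }
    merged
}

/// Assumed deny-list; the real set of rejected header names may differ.
const SENSITIVE: [&str; 3] = ["authorization", "cookie", "x-api-key"];

/// Would this header name be acceptable in the global section?
fn is_allowed_globally(name: &str) -> bool {
    let lower = name.to_ascii_lowercase();
    !SENSITIVE.contains(&lower.as_str())
}
```

Since credentials exist only under a specific host key, a request to any other host never sees them, even after a redirect to a different domain.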

### Agent-First Data naming conventions for fields

Field names carry meaning through suffixes:

| Suffix | Meaning | Example |
|--------|---------|---------|
| `_ms` | milliseconds | `duration_ms`, `retry_base_delay_ms` |
| `_s` | seconds | `timeout_connect_s`, `timeout_idle_s` |
| `_bytes` | byte count | `response_save_above_bytes`, `received_bytes` |
| `_file` | file path | `body_file`, `cacert_file`, `key_file` |
| `_base64` | base64-encoded bytes | `body_base64`, `data_base64` |
| `_pem` | inline PEM-format text | `cacert_pem`, `cert_pem`, `key_pem_secret` |
| `_secret` (at end) | sensitive value — auto-redacted in output | `key_pem_secret` |

Inline and file-path variants are mutually exclusive per slot: setting one clears the other in stored config. Inline takes precedence when both are present in a patch.
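The mutual-exclusion rule for one slot can be sketched as follows. `CaCertSlot` and `apply_patch` are hypothetical names; the field names come from the table, and `None` stands for a field that is absent from the patch.

```rust
/// One certificate "slot" with its inline and file-path variants.
#[derive(Debug, Default, PartialEq)]
struct CaCertSlot {
    cacert_pem: Option<String>,
    cacert_file: Option<String>,
}

impl CaCertSlot {
    /// Apply a patch; `None` means "field absent from the patch".
    fn apply_patch(&mut self, pem: Option<String>, file: Option<String>) {
        match (pem, file) {
            // Inline wins when both are present, and the file path is cleared.
            (Some(pem), _) => {
                self.cacert_pem = Some(pem);
                self.cacert_file = None;
            }
            // Only the file path was given: store it and clear the inline value.
            (None, Some(file)) => {
                self.cacert_file = Some(file);
                self.cacert_pem = None;
            }
            // Neither field was in the patch: leave the slot unchanged.
            (None, None) => {}
        }
    }
}
```

After any patch, at most one of the two fields is set, so the code that builds the TLS configuration never has to choose between them.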

### CLI flags: long only, no abbreviations

CLI flags use long form only (`--header`, `--body`, `--timeout-idle-s`). No single-letter short flags (`-H`, `-b`). This is deliberate — agents read and write flags by name, not by memorized shortcuts. Long flags are self-describing and less error-prone in generated commands.

CLI flag names correspond to JSON field names with hyphens replacing underscores (e.g. JSON `timeout_idle_s` → CLI `--timeout-idle-s`, JSON `body_base64` → CLI `--body-base64`).

Boolean flags that default to false are bare flags (`--verbose`, `--chunked`, `--tls-insecure`). Boolean flags that default to true take an explicit value (`--response-parse-json false`, `--response-decompress false`).
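Both naming rules are mechanical enough for an agent to generate flags from field names. The helper names below are hypothetical, and omitting a flag whose value equals the default is a choice made for this sketch.

```rust
/// JSON field name to long CLI flag: underscores become hyphens.
fn field_to_flag(field: &str) -> String {
    format!("--{}", field.replace('_', "-"))
}

/// Render a boolean option following the bare-flag / explicit-value rule.
fn bool_flag_args(field: &str, default: bool, value: bool) -> Vec<String> {
    if value == default {
        return Vec::new(); // nothing to pass, the default already applies
    }
    let flag = field_to_flag(field);
    if default {
        // Defaults to true: the flag takes an explicit value.
        vec![flag, value.to_string()]
    } else {
        // Defaults to false: a bare flag switches it on.
        vec![flag]
    }
}
```

For example, `bool_flag_args("response_parse_json", true, false)` yields `--response-parse-json false`, while `bool_flag_args("verbose", false, true)` yields the bare `--verbose`.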

### Output formats via `--output`

CLI mode supports three output formats via `--output json|yaml|plain`:

- **json** (default): Single-line JSON via `agent_first_data::output_json_with()`. Config/log fields use full `_secret` redaction; server response payload fields remain raw.
- **yaml**: Multi-line YAML via `agent_first_data::output_yaml()`. Field name suffixes are stripped (`duration_ms` → `duration`) and values are formatted (`10485760` → `"10.0MB"`).
- **plain**: Logfmt via `agent_first_data::output_plain()`. Same suffix stripping and value formatting as YAML but single-line.

**Server response body is never modified.** Non-string body values (parsed JSON objects/arrays) are serialized to a JSON string before passing to yaml/plain formatters, so the formatters treat them as opaque strings. This ensures the agent receives exact server data regardless of output format.
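To give a feel for the yaml/plain transformations, here is a rough sketch that reproduces only the two examples quoted in the list above. It is not the `agent_first_data` formatter: the binary unit scale, the unit labels other than `MB`, and the set of stripped suffixes are assumptions.

```rust
/// Human-readable size using binary units (1 MB = 1024 * 1024 bytes here),
/// matching the `10485760` to `"10.0MB"` example; other units are assumed.
fn format_bytes(bytes: u64) -> String {
    const UNITS: [&str; 4] = ["B", "KB", "MB", "GB"];
    let mut value = bytes as f64;
    let mut unit = 0;
    while value >= 1024.0 && unit < UNITS.len() - 1 {
        value /= 1024.0;
        unit += 1;
    }
    format!("{:.1}{}", value, UNITS[unit])
}

/// Strip a unit suffix from a field name, e.g. `duration_ms` to `duration`.
/// The suffix list is an assumption based on the naming table.
fn strip_unit_suffix(field: &str) -> &str {
    for suffix in ["_ms", "_bytes", "_s"] {
        if let Some(stem) = field.strip_suffix(suffix) {
            return stem;
        }
    }
    field
}
```

Transformations like these apply to `afhttp`'s own metadata fields. The response body is handed to the formatter as an opaque string, so it is never reformatted.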

### No `unwrap` / `expect` / `panic` anywhere in the codebase

`#![deny(clippy::unwrap_used, clippy::expect_used, clippy::panic)]` is enforced at crate level. Every error case is handled explicitly — either propagated as a structured `error` output to the agent, or (for truly impossible cases) handled with a hardcoded fallback string rather than a panic.
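In practice this style means every fallible step returns a `Result`, and the last-resort path is a constant. The sketch below shows both halves. The function names, the `internal` error code, and the wording of the fallback line are hypothetical.

```rust
/// Hardcoded last-resort line, used if building the normal output fails.
/// The `internal` error code and message text are assumptions.
const FALLBACK_ERROR_LINE: &str =
    r#"{"code":"error","error_code":"internal","error":"failed to serialize output","retryable":false}"#;

/// Parse a timeout value without `unwrap`: bad input becomes an error message.
fn parse_timeout_s(raw: &str) -> Result<u64, String> {
    raw.trim()
        .parse::<u64>()
        .map_err(|e| format!("invalid timeout {:?}: {}", raw, e))
}

/// Stand-in for a serializer that could fail.
fn render_line(serialized: Result<String, String>) -> String {
    match serialized {
        Ok(line) => line,
        // "Impossible" failure: fall back to a fixed string instead of panicking.
        Err(_) => FALLBACK_ERROR_LINE.to_string(),
    }
}
```

The agent therefore always receives one well-formed JSON line, even when the failure occurs inside the code that produces that line.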

## Dependencies

| Crate | Purpose |
|-------|---------|
| `tokio` | Async runtime, stdin reader, task spawning |
| `reqwest` | HTTP client with connection pooling and HTTP/2 |
| `tokio-tungstenite` | WebSocket client (upgrade handshake, framed read/write) |
| `clap` | CLI argument parsing (derive API) |
| `agent-first-data` | Agent-First Data output serialization with automatic `_secret` redaction |
| `serde_json` | JSON parsing and serialization |
| `base64` | Body encoding/decoding |
| `uuid` | Process-unique download directory |

## Future

- **HTTP/3 (QUIC)** — eliminates TCP head-of-line blocking, 0-RTT reconnection. Waiting for `hyper-h3` stabilization.
- **WebSocket TLS config** — apply custom TLS settings to WebSocket connections (currently uses system root store only).
- **Request pipelines** — declare request dependencies (`"after": "req-1"`) for sequential workflows.
- **Response caching** — optional ETag/Last-Modified caching per URL.