rivet-cli 0.16.0

# CLI Reference

## Global

```
rivet [--json-errors] [COMMAND] [OPTIONS]
```

```bash
rivet --version       # print version
rivet --help          # show help
```

| Flag | Description |
|------|-------------|
| `--json-errors` | Output errors as `{"error":"..."}` JSON to stderr instead of plain text. Applies to all subcommands. Useful for machine-readable orchestration and CI pipelines. |

```bash
rivet --json-errors run --config rivet.yaml
rivet run --config rivet.yaml --json-errors   # global flag accepted in any position
```
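For orchestration scripts, the JSON error line can be decoded by capturing stderr separately from stdout. The following is a minimal sketch, not part of Rivet: `extract_error_message` and `run_and_report` are hypothetical helpers, the error is assumed to arrive as a single line in the `{"error":"..."}` form described above, and the `sed` pattern does not handle escaped quotes inside the message (a real JSON parser is the safer choice there).

```bash
# Hypothetical helper: pull the message out of a {"error":"..."} line.
# Assumes one JSON object per line and no escaped quotes in the message;
# any other stderr lines are dropped.
extract_error_message() {
  sed -n 's/^{"error":"\(.*\)"}$/\1/p'
}

# Hypothetical wrapper: run a command, leave its stdout alone, and on failure
# print only the decoded error message on stderr. Returns the command's status.
run_and_report() {
  err_file=$(mktemp)
  status=0
  "$@" 2>"$err_file" || status=$?
  if [ "$status" -ne 0 ]; then
    extract_error_message <"$err_file" >&2
  fi
  rm -f "$err_file"
  return "$status"
}
```

A call would look like `run_and_report rivet --json-errors run --config rivet.yaml`.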

---

## `rivet run`

Run export jobs defined in a config file.

```bash
rivet run --config <PATH> [OPTIONS]
```

| Flag | Short | Type | Description |
|------|-------|------|-------------|
| `--config` | `-c` | string | Path to YAML config file **(required)** |
| `--export` | `-e` | string | Run only a specific export by name |
| `--validate` | | bool | Validate output file row count after writing |
| `--reconcile` | | bool | Run `COUNT(*)` on source query and compare with exported rows |
| `--resume` | | bool | Resume an in-progress chunked export. Exits non-zero with an actionable message if no in-progress checkpoint exists — run without `--resume` to start fresh, or `rivet state reset-chunks` to clear a stuck run |
| `--force` | | bool | Override safety gates that would otherwise refuse the run. Today: with `--resume`, allows starting against a destination prefix whose `_SUCCESS` marker is already present (ADR-0012 M8). Without it, resume against a complete run refuses so an operator cannot accidentally re-export over a verified dataset |
| `--parallel-exports` | | bool | Run all exports concurrently (ignored with `--export`) |
| `--parallel-export-processes` | | bool | Run each export as a separate child process |
| `--summary-output` | | PATH | Write run aggregate to this file as JSON |
| `--json` | | bool | Print run aggregate to stdout as JSON after the run |
| `--param` | `-p` | KEY=VALUE | Query parameter (repeatable). Substitutes `${key}` in queries |
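To make the `--param` behaviour concrete, here is a sketch of plain textual `${key}` substitution. It only illustrates the idea: `substitute_param` and `render_query` are hypothetical helpers, and whether Rivet quotes, escapes, or validates values is not covered by this reference.

```bash
# Replace every literal ${key} in a query string with a value.
# $1 = query text, $2 = key, $3 = value. Uses awk's index/substr so the
# placeholder is matched literally, not as a regex. Note that awk -v
# interprets backslash escapes, so this sketch suits simple values only.
substitute_param() {
  awk -v q="$1" -v key="$2" -v val="$3" 'BEGIN {
    ph = "${" key "}"; out = ""
    while ((i = index(q, ph)) > 0) {
      out = out substr(q, 1, i - 1) val
      q = substr(q, i + length(ph))
    }
    print out q
  }'
}

# Apply a list of KEY=VALUE pairs in order, like repeated -p flags.
render_query() {
  query=$1
  shift
  for kv in "$@"; do
    query=$(substitute_param "$query" "${kv%%=*}" "${kv#*=}")
  done
  printf '%s\n' "$query"
}
```

For instance, `render_query 'SELECT * FROM orders WHERE year = ${year}' year=2026` prints the query with `${year}` replaced by `2026`.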

### Examples

```bash
# Basic run
rivet run -c my_export.yaml

# Run with validation and reconciliation
rivet run -c my_export.yaml --validate --reconcile

# Run a single export
rivet run -c my_export.yaml -e orders_daily

# Resume interrupted chunked export
rivet run -c my_export.yaml -e big_table --resume

# Parameterized query
rivet run -c my_export.yaml -p region=us-east -p year=2026

# Parallel exports (all at once)
rivet run -c my_export.yaml --parallel-exports

# Parallel exports — one OS process per export, parent-side cards UI
rivet run -c my_export.yaml --parallel-export-processes
```

### `--parallel-export-processes` — one card per export

`--parallel-exports` runs every export in the same Rivet process on a separate
thread. That keeps logs simple, but every export shares the same source
connection pool / global allocator, and a panic in one export tears the whole
run down.

`--parallel-export-processes` instead spawns one `rivet` *child process* per
export — full memory and connection isolation, no shared allocator. The parent
process owns the screen and renders one **card** per export with a live
progress bar, ETA, row count, and elapsed time. When a child finishes, the
progress bar is replaced *in place* with the export's final metrics, so the
on-screen card becomes a self-contained per-export summary; below the cards a
single aggregated `Run summary` block prints once for the whole run.

![Parallel cards UI](../gifs/parallel-cards.gif)

```text
── orders ──────────────────────────────────────────────
  run_id:    orders_20260427T120000.069
  status:    running
  mode:      chunked
  tuning:    profile=balanced (default)
  batch_size: 1,000
  [====================>--------------] 11/20 chunks | 1.1M rows | 00:02:06 | ETA 66s
```

Children emit structured NDJSON events (`Started`, `ProgressInit`,
`Progress`, `Finished`) on stdout when the `RIVET_IPC_EVENTS=1` env var is
set; the parent multiplexes them into the cards UI. If a child crashes without a
`Finished` event, its card is marked `failed` with a synthetic warning so a
silent crash never leaves the run looking healthy.
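The "no `Finished` event means failed" rule can be sketched as a small filter over a child's event stream. This is an illustration only: the field layout of the events is not documented here, so the sketch assumes each NDJSON line contains the event name as a quoted string, and both the function name and the `finished` label are hypothetical.

```bash
# Decide a card's final status from a child's NDJSON event stream on stdin.
# Assumption: each line is one JSON object carrying the event name as a
# quoted string such as "Finished".
final_card_status() {
  if grep -q '"Finished"'; then
    echo "finished"
  else
    echo "failed"   # stream ended without a Finished event: treat as a crash
  fi
}
```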

---

## `rivet plan`

![Plan / apply walkthrough](../gifs/plan-apply.gif)

Generate a sealed execution plan artifact — no data is exported.

`rivet plan` runs preflight analysis (row estimate, index check, sparsity), computes chunk boundaries for chunked exports, snapshots the current cursor for incremental exports, and writes everything to a `PlanArtifact` JSON file. The artifact can be reviewed, committed, stored as a CI artifact, or passed to `rivet apply`.

```bash
rivet plan --config <PATH> [OPTIONS]
```

| Flag | Short | Type | Default | Description |
|------|-------|------|---------|-------------|
| `--config` | `-c` | string | — | Path to YAML config file **(required)** |
| `--export` | `-e` | string | all | Plan only a specific export |
| `--param` | `-p` | KEY=VALUE | — | Query parameter (repeatable) |
| `--output` | `-o` | string | stdout | Write plan JSON to this file |
| `--format` | | `pretty`\|`json` | `pretty` | `pretty` prints a human summary; `json` writes the full artifact |

### Examples

```bash
# Human-readable summary (no file written)
rivet plan -c rivet.yaml

# Write full JSON artifact to a file
rivet plan -c rivet.yaml --format json --output plan.json

# Plan a single export
rivet plan -c rivet.yaml -e orders --format json -o orders_plan.json
```

### Pretty output (example)

```
  Plan ID  : a1b2c3d4e5f6...
  Created  : 2026-04-14 10:00:00 UTC
  Expires  : 2026-04-15 10:00:00 UTC
  Export   : orders
  Strategy : chunked
  Chunks   : 42
  Row est. : ~2,100,000
  Verdict  : Acceptable
  Profile  : balanced
  Warnings :
    • sparse id range: ~12% fill
  Resources:
    Batch size   :  10,000 rows
    Batch memory : ~2 MB (narrow) – ~95 MB (wide)
    RSS guard    : 4,096 MB
    Throttle     : 50 ms between batches
  Output   : local → ./out
  Format   : parquet + zstd
```

The **Resources** section shows:

| Line | Meaning |
|---|---|
| `Batch size` | Rows fetched per query. `adaptive` if `batch_size_memory_mb` is set. |
| `Batch memory` | Estimated range: narrow (~200 B/row) to wide (~10 KB/row) tables. |
| `RSS guard` | Process-level RSS threshold. Fetching pauses if exceeded (0 = disabled). |
| `Throttle` | Delay between batches to reduce source load (omitted when 0). |
| `⚠ Wide tables may use…` | Shown when the upper bound exceeds 128 MB/batch — consider `batch_size_memory_mb` or a lower `batch_size`. |

### Memory estimate methodology — advisory only

> **The memory estimate in `rivet plan` is a heuristic, not a guarantee.** Treat it as a planning signal, not a hard prediction.

`rivet plan` does not sample the table. It computes the batch memory range using two fixed assumptions:

- **Narrow bound** — 200 B per row (all INTEGER / BIGINT / TIMESTAMPTZ columns)
- **Wide bound** — 10 KB per row (all TEXT / JSONB / BYTEA columns)

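For most real tables the actual per-row size falls between these two bounds. Neither bound is a hard limit: the narrow bound approximates a floor for numeric-heavy schemas and the wide bound approximates a ceiling for text-heavy schemas, and both can be exceeded by the factors listed under "What the estimate does not capture".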
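The range is simple arithmetic on the batch size, which the sketch below reproduces. Two details are assumptions inferred from the sample output rather than stated by the source: "10 KB" is taken as 10,000 bytes and "MB" as 1,048,576 bytes. The function name is hypothetical.

```bash
# Reproduce the advisory batch-memory range for a given batch size.
# Assumptions (inferred, not confirmed): 10 KB = 10,000 bytes and
# 1 MB = 1,048,576 bytes.
batch_memory_range_mb() {
  batch_size=$1
  narrow_bytes=$((batch_size * 200))     # narrow bound: 200 B per row
  wide_bytes=$((batch_size * 10000))     # wide bound: about 10 KB per row
  mib=1048576
  # Integer division, rounded to the nearest MB.
  narrow_mb=$(( (narrow_bytes + mib / 2) / mib ))
  wide_mb=$(( (wide_bytes + mib / 2) / mib ))
  echo "~${narrow_mb} MB (narrow) to ~${wide_mb} MB (wide)"
}
```

With a batch size of 10,000 this yields roughly 2 MB to 95 MB, matching the `Batch memory` line in the sample output. Under the same assumptions, a batch size of 14,000 gives a wide bound of about 134 MB, which is above the 128 MB warning threshold.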

**What the estimate does not capture:**

| Factor | Effect on actual RSS |
|--------|---------------------|
| Highly variable TEXT/BLOB values | Actual batches can be 2–10× the wide estimate |
| Sparse nullable columns | Actual batches will be below the narrow estimate |
| Compression buffers in the Parquet writer | Adds 50–200 MB on top of the Arrow batch size |
| Tokio runtime, connection pool, jemalloc | Adds 50–150 MB baseline overhead |

**How to get a precise number:** run `rivet run` once with `RUST_LOG=info` against a representative sample, then check the `peak_rss` in the logged summary or in `rivet metrics`. That measured value from your actual data is more reliable than any pre-run estimate.

**Planned enhancement:** a future `rivet plan --sample N` flag will query up to N rows to compute a data-driven row-width estimate. This will narrow the uncertainty for variable-width schemas without a full table scan.

### Plan artifact structure

The JSON artifact (`--format json`) contains:

```json
{
  "rivet_version": "0.4.0",
  "plan_id": "a1b2c3d4...",
  "created_at": "2026-04-14T10:00:00Z",
  "expires_at": "2026-04-15T10:00:00Z",
  "export_name": "orders",
  "strategy": "chunked",
  "plan_fingerprint": "0123456789abcdef",
  "resolved_plan": { ... },
  "computed": {
    "chunk_ranges": [[1, 50000], [50001, 100000], "..."],
    "chunk_count": 42,
    "cursor_snapshot": null,
    "row_estimate": 2100000
  },
  "diagnostics": {
    "verdict": "Acceptable",
    "warnings": ["sparse id range: ~12% fill"],
    "recommended_profile": "balanced"
  }
}
```

> **Security note**: `resolved_plan` embeds the full source connection config including credentials. Treat plan files with the same care as your rivet config file.

---

## `rivet apply`

Execute a sealed plan artifact, **or** run a config's exports wave-by-wave. The mode is chosen by the path's extension:

- **`.json`** → a sealed [`PlanArtifact`](#rivet-plan): deserialize, validate staleness + cursor integrity, then execute the single export using the artifact's pre-computed chunk boundaries — no `SELECT min/max` queries against the source.
- **`.yaml` / `.yml`** → a config: run every export **wave by wave** in ascending `wave:` order (the wave each export was assigned by `rivet plan`). See [Wave-ordered execution](#wave-ordered-execution-yaml-config) below.

```bash
rivet apply <PLAN_FILE | CONFIG> [OPTIONS]
```

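| Argument/Flag | Type | Description |
|---|---|---|
| `PLAN_FILE` / `CONFIG` | string | Path to a plan JSON artifact, or a YAML config for wave-ordered execution **(required)** |
| `--force` | bool | Skip the staleness check (allow plans older than 24 h). JSON-artifact mode only |
| `--parallel-export-processes` | bool | In YAML-config mode, run the `parallel_safe` exports within a wave concurrently as separate processes. See [Wave-ordered execution](#wave-ordered-execution-yaml-config) |
| `--resume` | bool | When applying a YAML config, skip exports whose destination already carries a `_SUCCESS` marker and continue incomplete chunked exports from their checkpoint |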

### Staleness rules

| Plan age | Behavior |
|----------|----------|
| < 1 hour | Proceeds silently |
| 1–24 hours | Warns and proceeds |
| > 24 hours | Rejects — use `--force` to override |
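The table translates into a small decision function. This is a sketch under stated assumptions: the function name and result labels are hypothetical, the age is passed in seconds, and the handling of the exact 1 h and 24 h boundaries (both treated as "warn") is a guess, since the table does not specify them.

```bash
# Classify a plan by its age in seconds, following the staleness table.
# Pass "force" as the second argument to mimic --force.
classify_plan_age() {
  age_seconds=$1
  force=${2:-}
  if [ "$age_seconds" -lt 3600 ]; then
    echo "proceed"            # younger than 1 hour
  elif [ "$age_seconds" -le 86400 ]; then
    echo "warn"               # between 1 and 24 hours
  elif [ "$force" = "force" ]; then
    echo "proceed-forced"     # older than 24 hours, overridden
  else
    echo "reject"             # older than 24 hours
  fi
}
```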

### Cursor drift (incremental exports)

If another `rivet run` completed after the plan was generated, the cursor will have advanced. `rivet apply` detects this and rejects the artifact to prevent re-exporting already-exported rows. Regenerate with `rivet plan`.

### Examples

```bash
# Apply the plan
rivet apply plan.json

# Apply an old plan (override staleness check)
rivet apply plan.json --force
```

### What apply does NOT do

In JSON-artifact mode, `rivet apply`:

- Does not re-read the config file
- Does not re-run preflight queries
- Does not recompute chunk boundaries (uses pre-computed ranges from the artifact)
- Does not enforce preflight verdict (diagnostics are advisory — see ADR-0005)

### State location

`rivet apply` opens `.rivet_state.db` from the directory containing the plan file. Place the plan file in the same directory as the config file so that the correct state database is used.

### Wave-ordered execution (YAML config)

`rivet apply <config>.yaml` runs every export in the config **wave by wave**, lowest `wave:` first, with a barrier between waves — every export in wave 1 finishes before wave 2 starts. Exports with no `wave:` run last. `rivet plan` writes the `wave:` and `parallel_safe:` fields onto each export (you can hand-edit them; apply respects your order).

**Within-wave parallelism.** With `parallel_export_processes: true` in the config (or `rivet apply --parallel-export-processes`), the **cheap** exports within a wave — those `rivet plan` marked `parallel_safe: true` (cost class `Low`, < ~100K rows) — run concurrently as separate processes. A heavier export already chunk-parallelizes its own ranges internally, so it runs **alone** in its wave; two large tables at once would multiply load on the source. Each child still self-throttles via the adaptive governor. Without the flag, every export runs sequentially. `parallel_safe` also respects the campaign's `isolate_on_source` — a cheap export on a contended shared source still runs alone.

```bash
# plan assigns waves → you review/edit → apply executes them, lowest wave first
rivet plan  -c rivet.yaml
rivet apply rivet.yaml
```
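The ordering rule (lowest wave first, exports without a `wave:` last) can be sketched as a sort. The input format here is hypothetical, since Rivet reads `wave:` from the YAML config, and keeping config order within a wave is an assumption made for the sake of a deterministic example.

```bash
# Order exports for wave-by-wave execution. Input on stdin: one export per
# line as "<name> <wave>", with the wave omitted for exports that have none.
order_by_wave() {
  # Missing waves get a large sentinel so they sort last; NR keeps the
  # original order within a wave.
  awk '{ wave = (NF >= 2) ? $2 : 999999; print wave, NR, $1 }' |
    sort -k1,1n -k2,2n |
    awk '{ print $3 }'
}
```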

A failing export does not stop its wave-mates: failures are collected and the run exits non-zero with the most stop-worthy error (data-integrity > schema-drift > retryable).
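Choosing the most stop-worthy error from the collected failures amounts to a ranked maximum. A minimal sketch, assuming the three categories are spelled as below (the labels and the function name are hypothetical):

```bash
# Pick the most stop-worthy error class from a list of failures, using the
# priority data-integrity > schema-drift > retryable.
most_stop_worthy() {
  best=""
  best_rank=0
  for class in "$@"; do
    case $class in
      data-integrity) rank=3 ;;
      schema-drift)   rank=2 ;;
      retryable)      rank=1 ;;
      *)              rank=0 ;;   # unrecognized classes never win
    esac
    if [ "$rank" -gt "$best_rank" ]; then
      best=$class
      best_rank=$rank
    fi
  done
  echo "$best"
}
```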

**Resuming after a partial failure.** Re-run with `rivet apply <config>.yaml --resume`: exports a prior run already completed (their destination carries a `_SUCCESS` marker) are **skipped**, and an incomplete chunked export continues from its checkpoint — so recovering a run that failed mid-way does not redo the tables that already succeeded. Without `--resume`, a re-run re-exports everything.

`partition_by` exports are not expanded in this path yet — use `rivet run` for those.

---

## `rivet validate`

Re-run manifest-aware verification against an existing destination — **no extraction**.

```bash
rivet validate --config <PATH> [OPTIONS]
```

`rivet validate` runs the same M5/M6 checks that `rivet run --validate` performs at end-of-run, exposed as a standalone command for between-run polling and triage. It reads `manifest.json` + `_SUCCESS` at the destination and head-checks every committed part for presence and recorded `size_bytes`. The **source is not queried** (use `rivet reconcile` for that). See [ADR-0013](../adr/0013-trust-flag-contract.md) §"Subcommand carveouts" and [ADR-0012](../adr/0012-cloud-manifest-contract.md) M5/M6.

By default `validate` resolves the destination prefix the same way `run` does (`{date}` becomes today's UTC date). Use `--date`, `--run-id`, or `--prefix` to point at a prior run instead.

| Flag | Short | Type | Description |
|------|-------|------|-------------|
| `--config` | `-c` | string | Path to YAML config file **(required)** |
| `--export` | `-e` | string | Validate only a specific export by name |
| `--format` | | string | Override the format used to resolve the part layout |
| `--output` | `-o` | PATH | Write the verification report to this file as JSON |
| `--date` | | YYYY-MM-DD | Resolve `{date}` to this date instead of today (UTC) |
| `--run-id` | | string | Point at a prior run's prefix by run id |
| `--prefix` | | string | Point at an explicit destination prefix |

Exits non-zero when the manifest references a part that is missing or whose size does not match. A legacy prefix (no manifest) falls back to the M6 reduced-guarantee path and is labelled `legacy_run: true`.
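For a local destination, the per-part check (present, and the same size as recorded) can be approximated by hand. This is only a sketch of the idea, not Rivet's implementation: `check_part` and its messages are hypothetical, and extracting `size_bytes` from `manifest.json` is left to the caller because the manifest layout is not described here.

```bash
# Check one part: it must exist and its size must equal the recorded size.
# $1 = path to a local part file, $2 = recorded size in bytes.
check_part() {
  path=$1
  recorded_size=$2
  if [ ! -f "$path" ]; then
    echo "missing: $path"
    return 1
  fi
  actual_size=$(( $(wc -c <"$path") ))   # arithmetic strips wc's padding
  if [ "$actual_size" -ne "$recorded_size" ]; then
    echo "size mismatch: $path (recorded $recorded_size, actual $actual_size)"
    return 1
  fi
  echo "ok: $path"
}
```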

### Examples

```bash
# Verify today's run at the configured destination
rivet validate -c my_export.yaml

# Verify a prior run by id, JSON report to a file
rivet validate -c my_export.yaml --run-id orders_20260521T120000 -o verdict.json
```

---

## `rivet reconcile`

![Chunked + reconcile + repair walkthrough](../gifs/reconcile-repair.gif)

Partition/window reconciliation — re-runs per-chunk `COUNT(*)` on the source and compares with the stored per-chunk row counts from the last run. Surfaces **matches**, **mismatches**, and **repair candidates** without re-exporting data (Epic F).

```bash
rivet reconcile --config <PATH> --export <NAME> [OPTIONS]
```

| Flag | Short | Type | Description |
|---|---|---|---|
| `--config` | `-c` | string | Path to YAML config file **(required)** |
| `--export` | `-e` | string | Export name to reconcile **(required)** |
| `--format` | | `pretty` \| `json` | Output format (default `pretty`) |
| `--output` | `-o` | string | Write JSON report to this file (use with `--format json`) |
| `--param` | `-p` | KEY=VALUE | Query parameter (repeatable) |

### Scope (v1)

- **Chunked exports** — supported. Requires a previous run with `chunk_checkpoint: true` so per-chunk ranges and row counts are persisted in `.rivet_state.db`.
- **Time-window** — not supported; returns an error that points you to chunked mode with `chunk_by_days` for partition reconcile.
- **Snapshot / Incremental** — no natural partitions; use `rivet run --reconcile` for a whole-export count check.

### What it does

For each completed chunk task from the latest chunk run:

1. Rebuilds the exact chunk query the pipeline used (same `WHERE` predicate, same dense/range shape — `build_chunk_query_sql`).
2. Runs `SELECT COUNT(*) FROM (<chunk_query>) AS _rc`.
3. Compares the source count with the stored `rows_written` for that chunk.

Each partition is classified as:

- `match` — source and exported counts are equal.
- `mismatch` — counts differ; partition is a repair candidate (note includes `diff`).
- `unknown` — one of the counts is unavailable (chunk never completed, unparseable chunk keys); also a repair candidate.
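The three-way classification can be written down as a short function. A sketch under assumptions: an empty argument stands for an unavailable count, the sign convention of `diff` (source minus exported) is a guess, and the function name is hypothetical.

```bash
# Classify one partition from its source count and stored rows_written.
# $1 = source COUNT(*), $2 = exported row count; empty means unavailable.
classify_partition() {
  source_count=$1
  exported_count=$2
  if [ -z "$source_count" ] || [ -z "$exported_count" ]; then
    echo "unknown"     # repair candidate
  elif [ "$source_count" -eq "$exported_count" ]; then
    echo "match"
  else
    echo "mismatch diff=$((source_count - exported_count))"   # repair candidate
  fi
}
```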

### Examples

```bash
# Human-readable summary
rivet reconcile -c my_export.yaml -e orders

# JSON report to file
rivet reconcile -c my_export.yaml -e orders --format json -o reconcile.json
```

Reports are **advisory** — same policy as prioritization (ADR-0006) and plan artifacts (ADR-0005). They surface what needs repair; they do not re-export on their own.

### Verification strategy tradeoffs

Rivet has three verification mechanisms at different cost/precision tradeoffs:

| Mechanism | What it checks | Cost | When to use |
|---|---|---|---|
| `rivet run --reconcile` | `COUNT(*)` source vs exported rows for the whole export | 1 extra query | Snapshot / incremental exports; cheap sanity check after every run |
| `rivet reconcile` | Per-chunk `COUNT(*)` source vs stored chunk row counts | 1 query per chunk | Chunked exports with `chunk_checkpoint: true`; catches partial writes in individual chunks |
| `rivet check --type-report` | Column type fidelity + warehouse compatibility | 1 LIMIT-0 probe | Before first export of a new table; after source schema changes |

**Rule of thumb:**
- Always use `rivet run --reconcile` for snapshot/incremental exports; the cost is negligible.
- Use `rivet reconcile` for chunked exports if data correctness is critical or the source is volatile.
- Use `rivet repair` only when `rivet reconcile` surfaces mismatches — it re-exports only the flagged chunks.

---

## `rivet repair`

Targeted repair of chunks flagged by reconcile. Prints a `RepairPlan` by default; with `--execute`, re-exports only the flagged chunk ranges (Epic H, [ADR-0009 RR1–RR8](../adr/0009-reconcile-and-repair-contracts.md)).

```bash
rivet repair --config <PATH> --export <NAME> [OPTIONS]
```

| Flag | Short | Type | Description |
|------|-------|------|-------------|
| `--config` | `-c` | string | Path to YAML config file **(required)** |
| `--export` | `-e` | string | Export name to repair (must be `mode: chunked`) **(required)** |
| `--report` | | path | Path to a reconcile JSON report (from `rivet reconcile --format json`). Omit to run reconcile in-process against the latest chunk run |
| `--execute` | | bool | Actually re-export the flagged chunk ranges. Without this flag, the plan is printed and nothing is executed (RR2) |
| `--format` | | `pretty` \| `json` | Output format for the plan / post-execute report (default `pretty`) |
| `--output` | `-o` | string | Write plan / report JSON to this file (with `--format json`) |
| `--param` | `-p` | KEY=VALUE | Query parameter (repeatable) |

### Examples

```bash
# Dry run from the latest reconcile — prints the plan, nothing executes
rivet repair -c my_export.yaml -e orders

# Dry run from a saved reconcile report
rivet repair -c my_export.yaml -e orders --report reconcile.json

# Execute — re-runs only the flagged chunks
rivet repair -c my_export.yaml -e orders --report reconcile.json --execute
```

### What `--execute` does and does not do

- Re-runs only the flagged chunk ranges via `ChunkSource::Precomputed` — same SQL shape as extraction and reconcile (RR3).
- Writes **new** files alongside originals with `<export>_<ts>_chunk<idx>.<ext>` (RR5). Rivet does **not** delete or overwrite prior files. Downstream deduplication (or a versioned output prefix) is the operator's responsibility.
- Leaves `last_committed_*` untouched (RR4): repair is corrective, not a commitment. `last_verified_*` advances again only after a subsequent `rivet reconcile` completes cleanly.

---

## `rivet check`

Preflight analysis: diagnose source health, estimate row counts, check indexes, recommend tuning. With `--type-report`, also introspects column types and validates them against a target warehouse.

```bash
rivet check --config <PATH> [OPTIONS]
```

| Flag | Short | Type | Description |
|------|-------|------|-------------|
| `--config` | `-c` | string | Path to YAML config file **(required)** |
| `--export` | `-e` | string | Check only a specific export |
| `--param` | `-p` | KEY=VALUE | Query parameter (repeatable) |
| `--type-report` | | bool | Run a type fidelity report: show each column's source type, Rivet type, Arrow type, and fidelity |
| `--strict` | | bool | Exit non-zero if any column mapping is lossy or unsupported (use with `--type-report`) |
| `--json` | | bool | Emit type report as newline-delimited JSON instead of a table |
| `--target` | | string | Validate types against a warehouse target. Currently supported: `bigquery` |

### Examples

```bash
# Standard preflight check
rivet check -c my_export.yaml

# Type fidelity report (human-readable table)
rivet check -c my_export.yaml --type-report

# Type report with BigQuery compatibility column
rivet check -c my_export.yaml --type-report --target bigquery

# Type report as JSON — pipe-friendly, one object per export
rivet check -c my_export.yaml --type-report --json

# Strict mode — exits 1 if any lossy or unsupported mapping exists
rivet check -c my_export.yaml --type-report --strict
```

### Type report output

```
Export: orders  [target: bigquery]

  Column        Source type        Rivet type       Arrow type            Fidelity        Target type   Status
  ----------    ----------------   ---------------  --------------------  --------------  -----------   ------
  id            int4               int4             Int32                 exact           INT64         ok
  amount        numeric(15,4)      decimal(15,4)    Decimal128(15, 4)     exact           NUMERIC       ok
  created_at    timestamptz        timestamp_tz     Timestamp(us, UTC)    exact           TIMESTAMP     ok
  metadata      jsonb              json             Utf8                  logical_string  STRING        ok ~
  tags          text[]             list<text>       List(Utf8)            exact           REPEATED…     ok
```

Fidelity levels:

| Level | Meaning |
|---|---|
| `exact` | Round-trips without loss |
| `compatible` | Structurally compatible; minor representation difference |
| `logical_string` | Serialized to STRING/text (no native Arrow type) |
| `lossy` | Precision or range reduction |
| `unsupported` | No mapping available; column is skipped |

The standard preflight output includes table existence, estimated row count, index analysis, and a tuning recommendation.

---

## `rivet doctor`

Verify source and destination connectivity/auth before running exports.

```bash
rivet doctor --config <PATH>
```

| Flag | Short | Type | Description |
|------|-------|------|-------------|
| `--config` | `-c` | string | Path to YAML config file **(required)** |

### Example

```bash
rivet doctor -c my_export.yaml
```

Output:

```
rivet doctor: verifying auth for config 'my_export.yaml'

[OK]  Config parsed successfully
[OK]  Source auth (Postgres)
[OK]  Destination Local(./output)

All checks passed.
```

When `tls:` is omitted from `source:`, an extra line appears before the source check: `[WARN] source: TLS is not enforced — credentials and result rows cross the network in plaintext.` Silence it by adding `tls: { mode: disable }` (local dev only) or fix it for prod with `tls: { mode: verify-full }` — see [reference/config.md § TLS](config.md#tls).

---

## `rivet init`

Generate a YAML config scaffold (or a machine-readable discovery artifact) by connecting to PostgreSQL or MySQL and introspecting tables (read-only). Does **not** run an export. YAML scaffolds include **`meta_columns`** (`exported_at` / `row_hash` **on** by default); scaffolds with heuristic **`mode: chunked`** also include **`chunk_checkpoint: true`** — see [init.md](init.md).

```bash
rivet init (--source <URL> | --source-env <ENV_VAR> | --source-file <PATH>)
           [--table <NAME>] [--schema <NAME>] [-o <PATH>] [--discover]
```

Exactly one of `--source`, `--source-env`, `--source-file` must be provided (enforced by the argument group).

| Flag | Short | Type | Description |
|------|-------|------|-------------|
| `--source` | | string | Connection URL: `postgresql://` or `mysql://`. **Visible in shell history / `ps`** — avoid in production |
| `--source-env` | | env var name | Name of an env var that holds the URL (e.g. `DATABASE_URL`). URL never hits the command line. **Recommended.** |
| `--source-file` | | path | Path to a file containing just the URL on one line. Credentials stay in the file and off the command line |
| `--table` | | string | Single table; optional `schema.table` on PostgreSQL. Omit to scaffold **all** tables/views in a Postgres schema or MySQL database |
| `--schema` | | string | **PostgreSQL:** schema to list (default `public`). **MySQL:** database name if not in the URL, or override URL database |
| `--output` | `-o` | string | Write output to file (default: print to stdout) |
| `--discover` | | bool | Emit a machine-readable JSON discovery artifact instead of YAML — includes ranked cursor/chunk candidates, row estimates, on-disk sizes, and coalesce-fallback hints |

**Examples**

```bash
# One table → one export block
rivet init --source-env DATABASE_URL --table orders -o rivet.yaml

# PostgreSQL: entire schema (default public)
rivet init --source-env DATABASE_URL --schema public -o all_public.yaml

# MySQL: entire database from URL path
rivet init --source-file /run/secrets/mysql_url -o all_mydb.yaml

# JSON discovery artifact — ranked cursor/chunk candidates per table
rivet init --source-env DATABASE_URL --schema public --discover -o discovery.json
```

Narrative guide, heuristics, and Docker Compose examples: **[init.md](init.md)**.

---

## `rivet metrics`

Show export run history (duration, row count, file size, status).

```bash
rivet metrics --config <PATH> [OPTIONS]
```

| Flag | Short | Type | Default | Description |
|------|-------|------|---------|-------------|
| `--config` | `-c` | string | — | Config file **(required)** |
| `--export` | `-e` | string | all | Filter by export name |
| `--last` | `-l` | integer | 20 | Number of recent runs to show |

### Example

```bash
rivet metrics -c my_export.yaml --last 10
rivet metrics -c my_export.yaml -e orders_daily
```

---

## `rivet journal`

Inspect the structured run journal for an export — per-run event log with status, file/row/byte summary, retries, quality issues, schema changes, and the first error line.

```bash
rivet journal --config <PATH> --export <NAME> [OPTIONS]
```

| Flag | Short | Type | Default | Description |
|------|-------|------|---------|-------------|
| `--config` | `-c` | string | — | Path to YAML config file **(required)** |
| `--export` | `-e` | string | — | Export name to inspect **(required)** |
| `--last` | `-l` | integer | 5 | Number of recent runs to show |
| `--run-id` | | string | — | Show a single specific run by ID |

### Examples

```bash
# Last 5 runs for the orders export
rivet journal -c my_export.yaml -e orders

# Last 10 runs
rivet journal -c my_export.yaml -e orders --last 10

# Single run by ID
rivet journal -c my_export.yaml -e orders --run-id orders_20260513T120000.123
```

### Output

Each run is shown as a block:

```
✓ orders_20260513T120000.123  succeeded  12.3s
  files: 3  rows: 150,000  bytes: 4.2 MB
```

`✓` = succeeded · `✗` = failed · `•` = partial / unknown.
Retries, quality issues, schema changes, and first-line error text are appended when present.

Journal entries are persisted to `.rivet_state.db` (SQLite, migration v7) at the end of every run. An empty result means the export has not run yet in this state DB, or `--run-id` does not match any stored run.

---

## `rivet state`

Manage export state (cursors, file manifests, chunk checkpoints).

### `rivet state show`

Show current cursor state for all incremental exports.

```bash
rivet state show --config <PATH>
```

### `rivet state reset`

Reset the cursor for a specific export (next run will re-export all rows).

```bash
rivet state reset --config <PATH> --export <NAME>
```

### `rivet state files`

List files produced by exports.

```bash
rivet state files --config <PATH> [--export <NAME>] [--last <N>]
```

| Flag | Short | Default | Description |
|------|-------|---------|-------------|
| `--export` | `-e` | all | Filter by export name |
| `--last` | `-l` | 50 | Number of recent files |

### `rivet state chunks`

Show chunk checkpoint status for a chunked export.

```bash
rivet state chunks --config <PATH> --export <NAME>
```

### `rivet state reset-chunks`

Clear persisted chunk checkpoint rows (`chunk_run` / `chunk_task`) so the next chunked run starts a fresh plan.

**One export** — same as targeting a single table name:

```bash
rivet state reset-chunks --config <PATH> --export <NAME>
```

**Every “stuck” export in this config** — resets checkpoints only when `chunk_run.status` is still `'in_progress'` (process killed mid-run, concurrent worker left state behind, etc.). Exports whose chunk run already finished normally (`completed`) are skipped. Names that appear in state but were removed from the YAML are skipped with a printed note.

```bash
rivet state reset-chunks --config <PATH> --stuck-checkpoints
```

Alias (same semantics — **checkpoint stuck**, not “last metric row failed”):

```bash
rivet state reset-chunks --config <PATH> --failed
```

Then start a fresh run with `rivet run --config <PATH>`. Use `--resume` only for exports that still have an in-progress checkpoint, since `--resume` exits non-zero when none exists.

### `rivet state progression`

Show explicit **committed** and **verified** export boundaries (Epic G / [ADR-0008](../adr/0008-export-progression.md)).

```bash
rivet state progression --config <PATH> [--export <NAME>]
```

| Column | Meaning |
|---|---|
| `COMM MODE` / `COMMITTED` | Strategy (`incremental` / `chunked`) and boundary value (cursor string or `chunk #N`) durably committed to the destination |
| `COMMITTED AT` | UTC timestamp of the committing run |
| `VERI MODE` / `VERIFIED` | Same shape, but only advanced by a full-match `rivet reconcile` (zero mismatches, zero unknowns) |

The progression table is **advisory**: it does not gate `rivet run`, `rivet apply`, or `rivet reconcile`. Consumers are operators and external monitoring.

---

## `rivet completions`

Generate shell completion scripts.

```bash
rivet completions <SHELL>
```

| Shell | Command |
|-------|---------|
| Bash | `rivet completions bash > ~/.local/share/bash-completion/completions/rivet` |
| Zsh | `rivet completions zsh > ~/.zfunc/_rivet` |
| Fish | `rivet completions fish > ~/.config/fish/completions/rivet.fish` |
| PowerShell | `rivet completions powershell > _rivet.ps1` |

---

## `rivet schema`

Emit machine-readable schemas for Rivet's data contracts.

```bash
rivet schema config
```

Today `rivet schema config` prints the JSON Schema for the `rivet.yaml` config to stdout. The schema is generated from the running binary's Rust types, so it always matches the config grammar this version accepts. Pipe it to a file and reference it via a `# yaml-language-server: $schema=…` header so VS Code / Neovim's YAML language server highlights invalid keys, suggests enum values, and surfaces required fields as you edit:

```bash
rivet schema config > rivet.schema.json
# then, at the top of rivet.yaml:
# yaml-language-server: $schema=./rivet.schema.json
```

---

## State backend

By default Rivet keeps all run state (cursors, metrics, manifests, chunk checkpoints, schema drift, run journal, progression) in a SQLite file — `.rivet_state.db` — placed next to the config file. This works for local and single-node deployments.

For **stateless containers / Kubernetes** where the rivet pod is ephemeral or replicated, set `RIVET_STATE_URL` to a PostgreSQL connection string:

```bash
export RIVET_STATE_URL=postgresql://rivet:rivet@localhost:5433/rivet_state
rivet run --config rivet.yaml
```

Rivet creates all state tables automatically on first connect (migrations `v1`–`v7`, same schema version sequence as SQLite). No manual DDL required.

### Docker Compose (local dev)

`docker-compose.yaml` includes a dedicated `postgres-state` service on port **5433** (separate from the source `postgres` service on port 5432 so data and state never mix):

```bash
docker compose up -d postgres-state
export RIVET_STATE_URL=postgresql://rivet:rivet@localhost:5433/rivet_state
rivet run --config pilot.yaml
```

### Security

- Passwords are **redacted** from all log and error messages: `postgresql://user:***@host/db`.
- A `WARN` is emitted when connecting to a non-localhost host **without TLS**. For production, use a URL with `sslmode=require`:

```bash
export RIVET_STATE_URL="postgresql://rivet:secret@db.internal/rivet_state?sslmode=require"
```

- The `RIVET_STATE_URL` value is **not** embedded in plan artifacts or config files. It is resolved from the environment at runtime.
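If your own wrapper scripts echo the state URL, the same redacted form can be produced before logging. This helper is hypothetical and not part of Rivet; it assumes the userinfo part has the shape `user:password@` and that the password contains no `@`.

```bash
# Mask the password in a connection URL, yielding user:***@host.
# URLs without a password are returned unchanged.
redact_url() {
  printf '%s\n' "$1" | sed -E 's#(://[^:/@]+):[^@]*@#\1:***@#'
}
```

For example, `redact_url "$RIVET_STATE_URL"` can be used in a log line in place of the raw value.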

---

## Environment variables

| Variable | Description |
|----------|-------------|
| `RUST_LOG` | Log level: `error`, `warn`, `info`, `debug`, `trace` |
| `DATABASE_URL` | Commonly used with `url_env: DATABASE_URL` in source config |
| `RIVET_STATE_URL` | PostgreSQL URL for the state backend. When set (and starts with `postgres`), activates the PG backend instead of the default SQLite file. Example: `postgresql://rivet:rivet@localhost:5433/rivet_state` |

### Example: verbose logging

```bash
RUST_LOG=debug rivet run -c my_export.yaml
```

### Example: PostgreSQL state backend

```bash
export RIVET_STATE_URL=postgresql://rivet:rivet@localhost:5433/rivet_state
RUST_LOG=info rivet run -c my_export.yaml
```

---

## Exit codes

| Code | Meaning |
|------|---------|
| 0 | All exports succeeded |
| 1 | One or more exports failed |
| 2 | Config parsing / validation error |
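
In CI, the exit code can drive the next step. The wrapper below is a hypothetical sketch: only the meanings of codes 0, 1, and 2 come from the table above, while the function name and messages are illustrative.

```bash
# Run a command and translate the documented exit codes into a message.
# Returns the command's own exit code so callers can still branch on it.
run_with_exit_handling() {
  code=0
  "$@" || code=$?
  case $code in
    0) echo "ok: all exports succeeded" ;;
    1) echo "failed: one or more exports failed" ;;
    2) echo "config error: fix the config before retrying" ;;
    *) echo "unexpected exit code: $code" ;;
  esac
  return "$code"
}
```

A call would look like `run_with_exit_handling rivet run -c my_export.yaml`.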