rivet-cli 0.9.4

Rivet: PostgreSQL/MySQL/SQL Server → Parquet/CSV (local, S3, GCS, Azure). Crate name rivet-cli; binary rivet.
Documentation
# CLI Reference

## Global

```
rivet [--json-errors] [COMMAND] [OPTIONS]
```

```bash
rivet --version       # print version
rivet --help          # show help
```

| Flag | Description |
|------|-------------|
| `--json-errors` | Output errors as `{"error":"..."}` JSON to stderr instead of plain text. Applies to all subcommands. Useful for machine-readable orchestration and CI pipelines. |

```bash
rivet --json-errors run --config rivet.yaml
rivet run --config rivet.yaml --json-errors   # global flag accepted in any position
```
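
For orchestration scripts, the `{"error":"..."}` payload can be unpacked with standard tools. The sketch below is a minimal example, assuming the payload is a single-line object with only an `error` key and no escaped quotes in the message. `extract_error` is a hypothetical helper name, not part of Rivet, and a real pipeline would likely use a proper JSON parser instead.

```bash
# Hypothetical helper: pull the message out of a {"error":"..."} line.
# Assumes a single-line payload without escaped quotes in the message.
extract_error() {
  printf '%s\n' "$1" | sed -n 's/^{"error":"\(.*\)"}$/\1/p'
}

# Usage sketch (assumes `rivet` is on PATH; captures stderr only):
#   msg=$(rivet --json-errors run --config rivet.yaml 2>&1 >/dev/null) \
#     || extract_error "$msg"

# Sample payload for illustration, not real Rivet output.
extract_error '{"error":"example failure message"}'
```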

---

## `rivet run`

Run export jobs defined in a config file.

```bash
rivet run --config <PATH> [OPTIONS]
```

| Flag | Short | Type | Description |
|------|-------|------|-------------|
| `--config` | `-c` | string | Path to YAML config file **(required)** |
| `--export` | `-e` | string | Run only a specific export by name |
| `--validate` | | bool | Validate output file row count after writing |
| `--reconcile` | | bool | Run `COUNT(*)` on source query and compare with exported rows |
| `--resume` | | bool | Resume an in-progress chunked export. Exits non-zero with an actionable message if no in-progress checkpoint exists — run without `--resume` to start fresh, or `rivet state reset-chunks` to clear a stuck run |
| `--force` | | bool | Override safety gates that would otherwise refuse the run. Today: with `--resume`, allows starting against a destination prefix whose `_SUCCESS` marker is already present (ADR-0012 M8). Without it, resume against a complete run refuses so an operator cannot accidentally re-export over a verified dataset |
| `--parallel-exports` | | bool | Run all exports concurrently (ignored with `--export`) |
| `--parallel-export-processes` | | bool | Run each export as a separate child process |
| `--summary-output` | | PATH | Write run aggregate to this file as JSON |
| `--json` | | bool | Print run aggregate to stdout as JSON after the run |
| `--param` | `-p` | KEY=VALUE | Query parameter (repeatable). Substitutes `${key}` in queries |

### Examples

```bash
# Basic run
rivet run -c my_export.yaml

# Run with validation and reconciliation
rivet run -c my_export.yaml --validate --reconcile

# Run a single export
rivet run -c my_export.yaml -e orders_daily

# Resume interrupted chunked export
rivet run -c my_export.yaml -e big_table --resume

# Parameterized query
rivet run -c my_export.yaml -p region=us-east -p year=2026

# Parallel exports (all at once)
rivet run -c my_export.yaml --parallel-exports

# Parallel exports — one OS process per export, parent-side cards UI
rivet run -c my_export.yaml --parallel-export-processes
```

### `--parallel-export-processes` — one card per export

`--parallel-exports` runs every export in the same Rivet process on a separate
thread. That keeps logs simple, but every export shares the same source
connection pool / global allocator, and a panic in one export tears the whole
run down.

`--parallel-export-processes` instead spawns one `rivet` *child process* per
export — full memory and connection isolation, no shared allocator. The parent
process owns the screen and renders one **card** per export with a live
progress bar, ETA, row count, and elapsed time. When a child finishes, the
progress bar is replaced *in place* with the export's final metrics, so the
on-screen card becomes a self-contained per-export summary; below the cards a
single aggregated `Run summary` block prints once for the whole run.

![Parallel cards UI](../gifs/parallel-cards.gif)

```text
── orders ──────────────────────────────────────────────
  run_id:    orders_20260427T120000.069
  status:    running
  mode:      chunked
  tuning:    profile=balanced (default)
  batch_size: 1,000
  [====================>--------------] 11/20 chunks | 1.1M rows | 00:02:06 | ETA 66s
```

Children emit structured NDJSON events (`Started`, `ProgressInit`,
`Progress`, `Finished`) on stdout via the `RIVET_IPC_EVENTS=1` env var; the
parent multiplexes them into the cards UI. If a child crashes without a
`Finished` event, its card is marked `failed` with a synthetic warning so a
silent crash never leaves the run looking healthy.

---

## `rivet plan`

![Plan / apply walkthrough](../gifs/plan-apply.gif)

Generate a sealed execution plan artifact — no data is exported.

`rivet plan` runs preflight analysis (row estimate, index check, sparsity), computes chunk boundaries for chunked exports, snapshots the current cursor for incremental exports, and writes everything to a `PlanArtifact` JSON file. The artifact can be reviewed, committed, stored as a CI artifact, or passed to `rivet apply`.

```bash
rivet plan --config <PATH> [OPTIONS]
```

| Flag | Short | Type | Default | Description |
|------|-------|------|---------|-------------|
| `--config` | `-c` | string | | Path to YAML config file **(required)** |
| `--export` | `-e` | string | all | Plan only a specific export |
| `--param` | `-p` | KEY=VALUE | | Query parameter (repeatable) |
| `--output` | `-o` | string | stdout | Write plan JSON to this file |
| `--format` | | `pretty`\|`json` | `pretty` | `pretty` prints a human summary; `json` writes the full artifact |

### Examples

```bash
# Human-readable summary (no file written)
rivet plan -c rivet.yaml

# Write full JSON artifact to a file
rivet plan -c rivet.yaml --format json --output plan.json

# Plan a single export
rivet plan -c rivet.yaml -e orders --format json -o orders_plan.json
```

### Pretty output (example)

```
  Plan ID  : a1b2c3d4e5f6...
  Created  : 2026-04-14 10:00:00 UTC
  Expires  : 2026-04-15 10:00:00 UTC
  Export   : orders
  Strategy : chunked
  Chunks   : 42
  Row est. : ~2,100,000
  Verdict  : Acceptable
  Profile  : balanced
  Warnings :
    • sparse id range: ~12% fill
  Resources:
    Batch size   :  10,000 rows
    Batch memory : ~2 MB (narrow) – ~95 MB (wide)
    RSS guard    : 4,096 MB
    Throttle     : 50 ms between batches
  Output   : local → ./out
  Format   : parquet + zstd
```

The **Resources** section shows:

| Line | Meaning |
|---|---|
| `Batch size` | Rows fetched per query. `adaptive` if `batch_size_memory_mb` is set. |
| `Batch memory` | Estimated range: narrow (~200 B/row) to wide (~10 KB/row) tables. |
| `RSS guard` | Process-level RSS threshold. Fetching pauses if exceeded (0 = disabled). |
| `Throttle` | Delay between batches to reduce source load (omitted when 0). |
| `⚠ Wide tables may use…` | Shown when the upper bound exceeds 128 MB/batch — consider `batch_size_memory_mb` or a lower `batch_size`. |

### Memory estimate methodology — advisory only

> **The memory estimate in `rivet plan` is a heuristic, not a guarantee.** Treat it as a planning signal, not a hard prediction.

`rivet plan` does not sample the table. It computes the batch memory range using two fixed assumptions:

- **Narrow bound** — 200 B per row (all INTEGER / BIGINT / TIMESTAMPTZ columns)
- **Wide bound** — 10 KB per row (all TEXT / JSONB / BYTEA columns)

For most real tables the actual per-row size falls between these two bounds. The narrow bound approximates a floor for numeric-heavy schemas and the wide bound approximates a ceiling for text-heavy schemas, but neither is guaranteed: the factors below can push real batches outside the range.
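
The arithmetic behind the range can be sketched in a few lines of shell. This is an illustration only: `estimate_batch_mb` is a hypothetical helper, and it assumes that "10 KB" means 10,000 bytes and that "MB" means 2^20 bytes, which is the reading that reproduces the `~2 MB (narrow) – ~95 MB (wide)` figures from the pretty-output example for a batch size of 10,000 rows.

```bash
# Sketch of the fixed-assumption estimate: rows x bytes-per-row.
# Assumption (not confirmed by the source): 10 KB = 10,000 bytes, MB = 2^20 bytes.
estimate_batch_mb() {
  rows=$1
  narrow_bytes=$((rows * 200))      # narrow bound: 200 B per row
  wide_bytes=$((rows * 10000))      # wide bound: ~10 KB per row
  mib=1048576
  # Round to the nearest whole MB using integer arithmetic.
  echo "$(( (narrow_bytes + mib / 2) / mib )) $(( (wide_bytes + mib / 2) / mib ))"
}

estimate_batch_mb 10000   # prints "2 95" (narrow and wide bound in MB)
```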

**What the estimate does not capture:**

| Factor | Effect on actual RSS |
|--------|---------------------|
| Highly variable TEXT/BLOB values | Actual batches can be 2–10× the wide estimate |
| Sparse nullable columns | Actual batches will be below the narrow estimate |
| Compression buffers in the Parquet writer | Adds 50–200 MB on top of the Arrow batch size |
| Tokio runtime, connection pool, jemalloc | Adds 50–150 MB baseline overhead |

**How to get a precise number:** run `rivet run` once with `RUST_LOG=info` against a representative sample, then check the `peak_rss` in the logged summary or in `rivet metrics`. That measured value from your actual data is more reliable than any pre-run estimate.

**Planned enhancement:** a future `rivet plan --sample N` flag will query up to N rows to compute a data-driven row-width estimate. This will narrow the uncertainty for variable-width schemas without a full table scan.

### Plan artifact structure

The JSON artifact (`--format json`) contains:

```json
{
  "rivet_version": "0.4.0",
  "plan_id": "a1b2c3d4...",
  "created_at": "2026-04-14T10:00:00Z",
  "expires_at": "2026-04-15T10:00:00Z",
  "export_name": "orders",
  "strategy": "chunked",
  "plan_fingerprint": "0123456789abcdef",
  "resolved_plan": { ... },
  "computed": {
    "chunk_ranges": [[1, 50000], [50001, 100000], "..."],
    "chunk_count": 42,
    "cursor_snapshot": null,
    "row_estimate": 2100000
  },
  "diagnostics": {
    "verdict": "Acceptable",
    "warnings": ["sparse id range: ~12% fill"],
    "recommended_profile": "balanced"
  }
}
```

> **Security note**: `resolved_plan` embeds the full source connection config including credentials. Treat plan files with the same care as your rivet config file.

---

## `rivet apply`

Execute a previously-generated plan artifact.

`rivet apply` deserializes the artifact, validates staleness and cursor integrity, then executes the export using the pre-computed chunk boundaries from the artifact — no `SELECT min/max` queries are run against the source.

```bash
rivet apply <PLAN_FILE> [OPTIONS]
```

| Argument/Flag | Type | Description |
|---|---|---|
| `PLAN_FILE` | string | Path to plan JSON file **(required)** |
| `--force` | bool | Skip staleness check (allow plans older than 24 h) |

### Staleness rules

| Plan age | Behavior |
|----------|----------|
| < 1 hour | Proceeds silently |
| 1–24 hours | Warns and proceeds |
| > 24 hours | Rejects — use `--force` to override |
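
The staleness table can be read as a three-way decision on the plan's age. The sketch below is a hedged illustration: `classify_plan_age` is a hypothetical helper that takes the age in seconds, and the handling of the exact 1 h and 24 h boundaries is an assumption, since the table does not spell it out.

```bash
# Hypothetical helper mirroring the staleness table.
# Boundary handling at exactly 1 h / 24 h is an assumption.
classify_plan_age() {
  age=$1
  if [ "$age" -lt 3600 ]; then
    echo proceed
  elif [ "$age" -le 86400 ]; then
    echo warn
  else
    echo reject    # `rivet apply --force` overrides the rejection
  fi
}

classify_plan_age 7200   # a two-hour-old plan falls in the "warn" band
```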

### Cursor drift (Incremental exports)

If another `rivet run` completed after the plan was generated, the cursor will have advanced. `rivet apply` detects this and rejects the artifact to prevent re-exporting already-exported rows. Regenerate with `rivet plan`.

### Examples

```bash
# Apply the plan
rivet apply plan.json

# Apply an old plan (override staleness check)
rivet apply plan.json --force
```

### What apply does NOT do

- Does not re-read the config file
- Does not re-run preflight queries
- Does not recompute chunk boundaries (uses pre-computed ranges from the artifact)
- Does not enforce preflight verdict (diagnostics are advisory — see ADR-0005)

### State location

`rivet apply` opens `.rivet_state.db` from the directory containing the plan file. Place the plan file in the same directory as the config file so that the correct state database is used.

---

## `rivet validate`

Re-run manifest-aware verification against an existing destination — **no extraction**.

```bash
rivet validate --config <PATH> [OPTIONS]
```

This command runs the same M5/M6 checks that `rivet run --validate` performs at the end of a run, exposed as a standalone command for between-run polling and triage. It reads `manifest.json` + `_SUCCESS` at the destination and head-checks every committed part for presence and recorded `size_bytes`. The **source is not queried** (use `rivet reconcile` for that). See [ADR-0013](../adr/0013-trust-flag-contract.md) §"Subcommand carveouts" and [ADR-0012](../adr/0012-cloud-manifest-contract.md) M5/M6.

By default `validate` resolves the destination prefix the same way `run` does (`{date}` becomes today's UTC date). Use `--date`, `--run-id`, or `--prefix` to point at a prior run instead.

| Flag | Short | Type | Description |
|------|-------|------|-------------|
| `--config` | `-c` | string | Path to YAML config file **(required)** |
| `--export` | `-e` | string | Validate only a specific export by name |
| `--format` | | string | Override the format used to resolve the part layout |
| `--output` | `-o` | PATH | Write the verification report to this file as JSON |
| `--date` | | YYYY-MM-DD | Resolve `{date}` to this date instead of today (UTC) |
| `--run-id` | | string | Point at a prior run's prefix by run id |
| `--prefix` | | string | Point at an explicit destination prefix |

Exits non-zero when the manifest references a part that is missing or whose size does not match. A legacy prefix (no manifest) falls back to the M6 reduced-guarantee path and is labelled `legacy_run: true`.
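
To make the presence-and-size check concrete, here is a local-filesystem analogue. It is a sketch under stated assumptions: `check_part` is a hypothetical helper, the expected size is passed in by hand, and Rivet's own check takes the recorded sizes from `manifest.json`, whose layout is not shown in this reference.

```bash
# Local-filesystem analogue of a per-part presence and size check.
# `check_part` is hypothetical; it is not how Rivet reads its manifest.
check_part() {
  path=$1
  expected=$2
  if [ ! -f "$path" ]; then
    echo missing
    return 1
  fi
  actual=$(( $(wc -c < "$path") ))   # arithmetic strips any padding from wc
  if [ "$actual" -ne "$expected" ]; then
    echo size_mismatch
    return 1
  fi
  echo ok
}

# Usage sketch: check_part ./out/part-0001.parquet 4200000
```

A non-zero return from the helper mirrors the non-zero exit that `rivet validate` uses for a missing or mismatched part.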

### Examples

```bash
# Verify today's run at the configured destination
rivet validate -c my_export.yaml

# Verify a prior run by id, JSON report to a file
rivet validate -c my_export.yaml --run-id orders_20260521T120000 -o verdict.json
```

---

## `rivet reconcile`

![Chunked + reconcile + repair walkthrough](../gifs/reconcile-repair.gif)

Partition/window reconciliation — re-runs per-chunk `COUNT(*)` on the source and compares with the stored per-chunk row counts from the last run. Surfaces **matches**, **mismatches**, and **repair candidates** without re-exporting data (Epic F).

```bash
rivet reconcile --config <PATH> --export <NAME> [OPTIONS]
```

| Flag | Short | Type | Description |
|---|---|---|---|
| `--config` | `-c` | string | Path to YAML config file **(required)** |
| `--export` | `-e` | string | Export name to reconcile **(required)** |
| `--format` | | `pretty` \| `json` | Output format (default `pretty`) |
| `--output` | `-o` | string | Write JSON report to this file (use with `--format json`) |
| `--param` | `-p` | KEY=VALUE | Query parameter (repeatable) |

### Scope (v1)

- **Chunked exports** — supported. Requires a previous run with `chunk_checkpoint: true` so per-chunk ranges and row counts are persisted in `.rivet_state.db`.
- **Time-window** — returns an error that directs you to chunked mode with `chunk_by_days` for partition reconcile.
- **Snapshot / Incremental** — no natural partitions; use `rivet run --reconcile` for a whole-export count check.

### What it does

For each completed chunk task from the latest chunk run:

1. Rebuilds the exact chunk query the pipeline used (same `WHERE` predicate, same dense/range shape — `build_chunk_query_sql`).
2. Runs `SELECT COUNT(*) FROM (<chunk_query>) AS _rc`.
3. Compares the source count with the stored `rows_written` for that chunk.

Each partition is classified as:

- `match` — source and exported counts are equal.
- `mismatch` — counts differ; partition is a repair candidate (note includes `diff`).
- `unknown` — one of the counts is unavailable (chunk never completed, unparseable chunk keys); also a repair candidate.
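
The classification rules above amount to a small comparison. The following sketch is illustrative only: `classify_partition` is a hypothetical helper, an empty argument stands in for an unavailable count, and the sign convention of `diff` (source minus exported) is an assumption.

```bash
# Hypothetical helper mirroring the match / mismatch / unknown rules.
# An empty argument stands in for a count that is unavailable.
classify_partition() {
  source_count=$1
  exported_count=$2
  if [ -z "$source_count" ] || [ -z "$exported_count" ]; then
    echo unknown                       # also a repair candidate
  elif [ "$source_count" -eq "$exported_count" ]; then
    echo match
  else
    # Assumed convention: diff = source count minus exported count.
    echo "mismatch diff=$((source_count - exported_count))"
  fi
}

classify_partition 120 100   # prints "mismatch diff=20"
```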

### Examples

```bash
# Human-readable summary
rivet reconcile -c my_export.yaml -e orders

# JSON report to file
rivet reconcile -c my_export.yaml -e orders --format json -o reconcile.json
```

Reports are **advisory** — same policy as prioritization (ADR-0006) and plan artifacts (ADR-0005). They surface what needs repair; they do not re-export on their own.

### Verification strategy tradeoffs

Rivet has three verification mechanisms at different cost/precision tradeoffs:

| Mechanism | What it checks | Cost | When to use |
|---|---|---|---|
| `rivet run --reconcile` | `COUNT(*)` source vs exported rows for the whole export | 1 extra query | Snapshot / incremental exports; cheap sanity check after every run |
| `rivet reconcile` | Per-chunk `COUNT(*)` source vs stored chunk row counts | 1 query per chunk | Chunked exports with `chunk_checkpoint: true`; catches partial writes in individual chunks |
| `rivet check --type-report` | Column type fidelity + warehouse compatibility | 1 LIMIT-0 probe | Before first export of a new table; after source schema changes |

**Rule of thumb:**
- Use `--reconcile` always for snapshot/incremental exports — cost is negligible.
- Use `rivet reconcile` for chunked exports if data correctness is critical or the source is volatile.
- Use `rivet repair` only when `rivet reconcile` surfaces mismatches — it re-exports only the flagged chunks.

---

## `rivet repair`

Targeted repair of chunks flagged by reconcile. Prints a `RepairPlan` by default; with `--execute`, re-exports only the flagged chunk ranges (Epic H, [ADR-0009 RR1–RR8](../adr/0009-reconcile-and-repair-contracts.md)).

```bash
rivet repair --config <PATH> --export <NAME> [OPTIONS]
```

| Flag | Short | Type | Description |
|------|-------|------|-------------|
| `--config` | `-c` | string | Path to YAML config file **(required)** |
| `--export` | `-e` | string | Export name to repair (must be `mode: chunked`) **(required)** |
| `--report` | | path | Path to a reconcile JSON report (from `rivet reconcile --format json`). Omit to run reconcile in-process against the latest chunk run |
| `--execute` | | bool | Actually re-export the flagged chunk ranges. Without this flag, the plan is printed and nothing is executed (RR2) |
| `--format` | | `pretty` \| `json` | Output format for the plan / post-execute report (default `pretty`) |
| `--output` | `-o` | string | Write plan / report JSON to this file (with `--format json`) |
| `--param` | `-p` | KEY=VALUE | Query parameter (repeatable) |

### Examples

```bash
# Dry run from the latest reconcile — prints the plan, nothing executes
rivet repair -c my_export.yaml -e orders

# Dry run from a saved reconcile report
rivet repair -c my_export.yaml -e orders --report reconcile.json

# Execute — re-runs only the flagged chunks
rivet repair -c my_export.yaml -e orders --report reconcile.json --execute
```

### What `--execute` does and does not do

- Re-runs only the flagged chunk ranges via `ChunkSource::Precomputed` — same SQL shape as extraction and reconcile (RR3).
- Writes **new** files alongside originals with `<export>_<ts>_chunk<idx>.<ext>` (RR5). Rivet does **not** delete or overwrite prior files. Downstream deduplication (or a versioned output prefix) is the operator's responsibility.
- Leaves `last_committed_*` untouched (RR4) — repair is corrective, not commitment. `last_verified_*` re-advances only if a subsequent clean `rivet reconcile` runs.

---

## `rivet check`

Preflight analysis: diagnose source health, estimate row counts, check indexes, recommend tuning. With `--type-report`, also introspects column types and validates them against a target warehouse.

```bash
rivet check --config <PATH> [OPTIONS]
```

| Flag | Short | Type | Description |
|------|-------|------|-------------|
| `--config` | `-c` | string | Path to YAML config file **(required)** |
| `--export` | `-e` | string | Check only a specific export |
| `--param` | `-p` | KEY=VALUE | Query parameter (repeatable) |
| `--type-report` | | bool | Run a type fidelity report: show each column's source type, Rivet type, Arrow type, and fidelity |
| `--strict` | | bool | Exit non-zero if any column mapping is lossy or unsupported (use with `--type-report`) |
| `--json` | | bool | Emit type report as newline-delimited JSON instead of a table |
| `--target` | | string | Validate types against a warehouse target. Currently supported: `bigquery` |

### Examples

```bash
# Standard preflight check
rivet check -c my_export.yaml

# Type fidelity report (human-readable table)
rivet check -c my_export.yaml --type-report

# Type report with BigQuery compatibility column
rivet check -c my_export.yaml --type-report --target bigquery

# Type report as JSON — pipe-friendly, one object per export
rivet check -c my_export.yaml --type-report --json

# Strict mode — exits 1 if any lossy or unsupported mapping exists
rivet check -c my_export.yaml --type-report --strict
```

### Type report output

```
Export: orders  [target: bigquery]

  Column        Source type        Rivet type       Arrow type            Fidelity        Target type   Status
  ----------    ----------------   ---------------  --------------------  --------------  -----------   ------
  id            int4               int4             Int32                 exact           INT64         ok
  amount        numeric(15,4)      decimal(15,4)    Decimal128(15, 4)     exact           NUMERIC       ok
  created_at    timestamptz        timestamp_tz     Timestamp(us, UTC)    exact           TIMESTAMP     ok
  metadata      jsonb              json             Utf8                  logical_string  STRING        ok ~
  tags          text[]             list<text>       List(Utf8)            exact           REPEATED…     ok
```

Fidelity levels:

| Level | Meaning |
|---|---|
| `exact` | Round-trips without loss |
| `compatible` | Structurally compatible; minor representation difference |
| `logical_string` | Serialized to STRING/text (no native Arrow type) |
| `lossy` | Precision or range reduction |
| `unsupported` | No mapping available; column is skipped |
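
The `--strict` rule (non-zero exit when any mapping is `lossy` or `unsupported`) can be sketched as a check over a list of fidelity levels. `strict_verdict` is a hypothetical helper for illustration and does not parse real `rivet check` output.

```bash
# Hypothetical helper: returns 1 if any given fidelity level would fail
# under --strict (lossy or unsupported), 0 otherwise.
strict_verdict() {
  for level in "$@"; do
    case "$level" in
      lossy|unsupported) return 1 ;;
    esac
  done
  return 0
}

# Levels taken from the example report above: all pass under --strict.
strict_verdict exact exact exact logical_string exact && echo "strict: pass"
```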

The standard preflight output includes table existence, estimated row count, index analysis, and a tuning recommendation.

---

## `rivet doctor`

Verify source and destination connectivity/auth before running exports.

```bash
rivet doctor --config <PATH>
```

| Flag | Short | Type | Description |
|------|-------|------|-------------|
| `--config` | `-c` | string | Path to YAML config file **(required)** |

### Example

```bash
rivet doctor -c my_export.yaml
```

Output:

```
rivet doctor: verifying auth for config 'my_export.yaml'

[OK]  Config parsed successfully
[OK]  Source auth (Postgres)
[OK]  Destination Local(./output)

All checks passed.
```

When `tls:` is omitted from `source:`, an extra line appears before the source check: `[WARN] source: TLS is not enforced — credentials and result rows cross the network in plaintext.` Silence it by adding `tls: { mode: disable }` (local dev only) or fix it for prod with `tls: { mode: verify-full }` — see [reference/config.md § TLS](config.md#tls).

---

## `rivet init`

Generate a YAML config scaffold (or a machine-readable discovery artifact) by connecting to PostgreSQL or MySQL and introspecting tables (read-only). Does **not** run an export. YAML scaffolds include **`meta_columns`** (`exported_at` / `row_hash` **on** by default); scaffolds with heuristic **`mode: chunked`** also include **`chunk_checkpoint: true`** — see [init.md](init.md).

```bash
rivet init (--source <URL> | --source-env <ENV_VAR> | --source-file <PATH>)
           [--table <NAME>] [--schema <NAME>] [-o <PATH>] [--discover]
```

Exactly one of `--source`, `--source-env`, `--source-file` must be provided (enforced by the argument group).

| Flag | Short | Type | Description |
|------|-------|------|-------------|
| `--source` | | string | Connection URL: `postgresql://` or `mysql://`. **Visible in shell history / `ps`** — avoid in production |
| `--source-env` | | env var name | Name of an env var that holds the URL (e.g. `DATABASE_URL`). URL never hits the command line. **Recommended.** |
| `--source-file` | | path | Path to a file containing just the URL on one line. Credentials stay on disk |
| `--table` | | string | Single table; optional `schema.table` on PostgreSQL. Omit to scaffold **all** tables/views in a Postgres schema or MySQL database |
| `--schema` | | string | **PostgreSQL:** schema to list (default `public`). **MySQL:** database name if not in the URL, or override URL database |
| `--output` | `-o` | string | Write output to file (default: print to stdout) |
| `--discover` | | bool | Emit a machine-readable JSON discovery artifact instead of YAML — includes ranked cursor/chunk candidates, row estimates, on-disk sizes, and coalesce-fallback hints |

### Examples

```bash
# One table → one export block
rivet init --source-env DATABASE_URL --table orders -o rivet.yaml

# PostgreSQL: entire schema (default public)
rivet init --source-env DATABASE_URL --schema public -o all_public.yaml

# MySQL: entire database from URL path
rivet init --source-file /run/secrets/mysql_url -o all_mydb.yaml

# JSON discovery artifact — ranked cursor/chunk candidates per table
rivet init --source-env DATABASE_URL --schema public --discover -o discovery.json
```

Narrative guide, heuristics, and Docker Compose examples: **[init.md](init.md)**.

---

## `rivet metrics`

Show export run history (duration, row count, file size, status).

```bash
rivet metrics --config <PATH> [OPTIONS]
```

| Flag | Short | Type | Default | Description |
|------|-------|------|---------|-------------|
| `--config` | `-c` | string | | Config file **(required)** |
| `--export` | `-e` | string | all | Filter by export name |
| `--last` | `-l` | integer | 20 | Number of recent runs to show |

### Example

```bash
rivet metrics -c my_export.yaml --last 10
rivet metrics -c my_export.yaml -e orders_daily
```

---

## `rivet journal`

Inspect the structured run journal for an export — per-run event log with status, file/row/byte summary, retries, quality issues, schema changes, and the first error line.

```bash
rivet journal --config <PATH> --export <NAME> [OPTIONS]
```

| Flag | Short | Type | Default | Description |
|------|-------|------|---------|-------------|
| `--config` | `-c` | string | | Path to YAML config file **(required)** |
| `--export` | `-e` | string | | Export name to inspect **(required)** |
| `--last` | `-l` | integer | 5 | Number of recent runs to show |
| `--run-id` | | string | | Show a single specific run by ID |

### Examples

```bash
# Last 5 runs for the orders export
rivet journal -c my_export.yaml -e orders

# Last 10 runs
rivet journal -c my_export.yaml -e orders --last 10

# Single run by ID
rivet journal -c my_export.yaml -e orders --run-id orders_20260513T120000.123
```

### Output

Each run is shown as a block:

```
✓ orders_20260513T120000.123  succeeded  12.3s
  files: 3  rows: 150,000  bytes: 4.2 MB
```

`✓` = succeeded · `✗` = failed · `•` = partial / unknown.
Retries, quality issues, schema changes, and first-line error text are appended when present.

Journal entries are persisted to `.rivet_state.db` (SQLite, migration v7) at the end of every run. An empty result means the export has not run yet in this state DB, or `--run-id` does not match any stored run.

---

## `rivet state`

Manage export state (cursors, file manifests, chunk checkpoints).

### `rivet state show`

Show current cursor state for all incremental exports.

```bash
rivet state show --config <PATH>
```

### `rivet state reset`

Reset the cursor for a specific export (next run will re-export all rows).

```bash
rivet state reset --config <PATH> --export <NAME>
```

### `rivet state files`

List files produced by exports.

```bash
rivet state files --config <PATH> [--export <NAME>] [--last <N>]
```

| Flag | Short | Default | Description |
|------|-------|---------|-------------|
| `--export` | `-e` | all | Filter by export name |
| `--last` | `-l` | 50 | Number of recent files |

### `rivet state chunks`

Show chunk checkpoint status for a chunked export.

```bash
rivet state chunks --config <PATH> --export <NAME>
```

### `rivet state reset-chunks`

Clear persisted chunk checkpoint rows (`chunk_run` / `chunk_task`) so the next chunked run starts a fresh plan.

**One export** — same as targeting a single table name:

```bash
rivet state reset-chunks --config <PATH> --export <NAME>
```

**Every “stuck” export in this config** — resets checkpoints only when `chunk_run.status` is still `'in_progress'` (process killed mid-run, concurrent worker left state behind, etc.). Exports whose chunk run already finished normally (`completed`) are skipped. Names that appear in state but were removed from the YAML are skipped with a printed note.

```bash
rivet state reset-chunks --config <PATH> --stuck-checkpoints
```

Alias (same semantics — **checkpoint stuck**, not “last metric row failed”):

```bash
rivet state reset-chunks --config <PATH> --failed
```

Then run `rivet run --config <PATH> --resume` (or a normal run without `--resume`) as needed.

### `rivet state progression`

Show explicit **committed** and **verified** export boundaries (Epic G / [ADR-0008](../adr/0008-export-progression.md)).

```bash
rivet state progression --config <PATH> [--export <NAME>]
```

| Column | Meaning |
|---|---|
| `COMM MODE` / `COMMITTED` | Strategy (`incremental` / `chunked`) and boundary value (cursor string or `chunk #N`) durably committed to the destination |
| `COMMITTED AT` | UTC timestamp of the committing run |
| `VERI MODE` / `VERIFIED` | Same shape, but only advanced by a full-match `rivet reconcile` (zero mismatches, zero unknowns) |

The progression table is **advisory**: it does not gate `rivet run`, `rivet apply`, or `rivet reconcile`. Consumers are operators and external monitoring.

---

## `rivet completions`

Generate shell completion scripts.

```bash
rivet completions <SHELL>
```

| Shell | Command |
|-------|---------|
| Bash | `rivet completions bash > ~/.local/share/bash-completion/completions/rivet` |
| Zsh | `rivet completions zsh > ~/.zfunc/_rivet` |
| Fish | `rivet completions fish > ~/.config/fish/completions/rivet.fish` |
| PowerShell | `rivet completions powershell > _rivet.ps1` |

---

## `rivet schema`

Emit machine-readable schemas for Rivet's data contracts.

```bash
rivet schema config
```

Today `rivet schema config` prints the JSON Schema for the `rivet.yaml` config to stdout. The schema is generated from the running binary's Rust types, so it always matches the config grammar this version accepts. Pipe it to a file and reference it via a `# yaml-language-server: $schema=…` header so VS Code / Neovim's YAML language server highlights invalid keys, suggests enum values, and surfaces required fields as you edit:

```bash
rivet schema config > rivet.schema.json
# then, at the top of rivet.yaml:
# yaml-language-server: $schema=./rivet.schema.json
```

---

## State backend

By default Rivet keeps all run state (cursors, metrics, manifests, chunk checkpoints, schema drift, run journal, progression) in a SQLite file — `.rivet_state.db` — placed next to the config file. This works for local and single-node deployments.

For **stateless containers / Kubernetes** where the rivet pod is ephemeral or replicated, set `RIVET_STATE_URL` to a PostgreSQL connection string:

```bash
export RIVET_STATE_URL=postgresql://rivet:rivet@localhost:5433/rivet_state
rivet run --config rivet.yaml
```

Rivet creates all state tables automatically on first connect (migrations `v1`–`v7`, same schema version sequence as SQLite). No manual DDL required.

### Docker Compose (local dev)

`docker-compose.yaml` includes a dedicated `postgres-state` service on port **5433** (separate from the source `postgres` service on port 5432 so data and state never mix):

```bash
docker compose up -d postgres-state
export RIVET_STATE_URL=postgresql://rivet:rivet@localhost:5433/rivet_state
rivet run --config pilot.yaml
```

### Security

- Passwords are **redacted** from all log and error messages: `postgresql://user:***@host/db`.
- A `WARN` is emitted when connecting to a non-localhost host **without TLS**. For production use a `sslmode=require` URL:

```bash
export RIVET_STATE_URL="postgresql://rivet:secret@db.internal/rivet_state?sslmode=require"
```

- The `RIVET_STATE_URL` value is **not** embedded in plan artifacts or config files. It is resolved from the environment at runtime.
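
If wrapper scripts log the state URL themselves, the same redaction shape can be reproduced with `sed`. This is a sketch, not Rivet's implementation: `redact_url` is a hypothetical helper, and it assumes the URL has the `user:password@host` form with no `@` inside the password.

```bash
# Hypothetical helper: mask the password portion of a connection URL so it
# matches the postgresql://user:***@host/db shape shown above.
# Assumes user:password@host form and no "@" inside the password.
redact_url() {
  printf '%s\n' "$1" | sed 's#\(://[^:/@]*\):[^@]*@#\1:***@#'
}

redact_url 'postgresql://rivet:secret@db.internal/rivet_state'
# prints: postgresql://rivet:***@db.internal/rivet_state
```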

---

## Environment variables

| Variable | Description |
|----------|-------------|
| `RUST_LOG` | Log level: `error`, `warn`, `info`, `debug`, `trace` |
| `DATABASE_URL` | Commonly used with `url_env: DATABASE_URL` in source config |
| `RIVET_STATE_URL` | PostgreSQL URL for the state backend. When set (and starts with `postgres`), activates the PG backend instead of the default SQLite file. Example: `postgresql://rivet:rivet@localhost:5433/rivet_state` |
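
The selection rule in the `RIVET_STATE_URL` row can be sketched as a small shell function, for example to log which backend a wrapper script expects. `state_backend` is a hypothetical name, and the prefix test is one reading of "starts with `postgres`", not Rivet's actual code.

```bash
# Hypothetical helper: report which state backend a given RIVET_STATE_URL
# value would select, based on the "starts with postgres" rule.
state_backend() {
  case "${1-}" in
    postgres*) echo postgres ;;
    *)         echo sqlite ;;     # unset, empty, or any other value
  esac
}

state_backend "${RIVET_STATE_URL-}"
```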

### Example: verbose logging

```bash
RUST_LOG=debug rivet run -c my_export.yaml
```

### Example: PostgreSQL state backend

```bash
export RIVET_STATE_URL=postgresql://rivet:rivet@localhost:5433/rivet_state
RUST_LOG=info rivet run -c my_export.yaml
```

---

## Exit codes

| Code | Meaning |
|------|---------|
| 0 | All exports succeeded |
| 1 | One or more exports failed |
| 2 | Config parsing / validation error |
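
In a CI wrapper, these codes can be turned into log messages with a simple `case` statement. The sketch below is illustrative; `describe_exit` is a hypothetical helper, and any code outside the table is reported as undocumented rather than guessed at.

```bash
# Hypothetical helper: map a rivet exit code to the meaning in the table.
describe_exit() {
  case "$1" in
    0) echo "all exports succeeded" ;;
    1) echo "one or more exports failed" ;;
    2) echo "config parsing / validation error" ;;
    *) echo "undocumented exit code $1" ;;
  esac
}

# Usage sketch (assumes `rivet` is on PATH):
#   rivet run -c my_export.yaml; describe_exit $?
describe_exit 2
```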