rivet-cli 0.16.3

Rivet: PostgreSQL/MySQL/SQL Server → Parquet/CSV (local, S3, GCS, Azure). Crate name rivet-cli; binary rivet.
# Row-size RSS sweep — rivet vs ingestr

Generated by `harness/sweep.py` (driven by `../scenarios.yaml`). The point isn't a
single throughput number — it's the **shape of the memory curve** as the row count
grows, measured the same way on the same machine.

## Method
- Fixture: the wide 20-column `content_items`, sliced to each scale with `CREATE
  TABLE … AS SELECT … LIMIT n`.
- Tools: **rivet 0.14.0** (release binary, `mode: full` → local Parquet/snappy) vs
  **ingestr 1.0.43** (`postgres → parquet`, its default 100k-row Arrow batches).
- Per (scale, tool): **1 warmup run discarded, then the median of 3 measured runs.**
- Wall + peak RSS via `/usr/bin/time -l` (external, not the tool's self-report).
- Box: macOS arm64 (a head-to-head — same machine, not the published Linux bench).
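
The measurement loop described in this list could look roughly like the sketch below. It is not the code in `harness/sweep.py`; `parse_time_l` and `measure` are hypothetical names. It assumes macOS `/usr/bin/time -l` writes its report to stderr after the child's own output and gives `maximum resident set size` in bytes, and it converts to decimal MB, which may not be the convention used in the table.

```python
import re
import statistics
import subprocess


def parse_time_l(stderr_text):
    """Extract (wall_seconds, peak_rss_bytes) from `/usr/bin/time -l` output.

    The last match is used, on the assumption that time's report comes
    after anything the measured tool printed to stderr itself.
    """
    wall = re.findall(r"([\d.]+)\s+real\b", stderr_text)
    rss = re.findall(r"(\d+)\s+maximum resident set size", stderr_text)
    if not wall or not rss:
        raise ValueError("unrecognized /usr/bin/time -l output")
    return float(wall[-1]), int(rss[-1])


def measure(cmd, warmups=1, runs=3):
    """Run cmd under /usr/bin/time -l; discard warmups, return medians.

    Returns (median wall seconds, median peak RSS in decimal MB).
    """
    samples = []
    for i in range(warmups + runs):
        proc = subprocess.run(
            ["/usr/bin/time", "-l", *cmd],
            capture_output=True, text=True, check=True,
        )
        if i >= warmups:  # warmup runs are executed but not recorded
            samples.append(parse_time_l(proc.stderr))
    wall = statistics.median(s[0] for s in samples)
    rss_mb = statistics.median(s[1] for s in samples) / 1e6
    return wall, rss_mb
```

Calling `measure` with the argument list for either tool returns one (wall, peak RSS) pair per (scale, tool) cell. Taking the median of wall and RSS independently is a design choice of this sketch, not something the source specifies.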

## Result

| scale | tool | wall (s) | peak RSS (MB) | RSS vs rivet |
| --- | --- | --- | --- | --- |
| 100,000 | rivet | 3.0 | **47** ||
| 100,000 | ingestr | 3.1 | 882 | **19×** |
| 500,000 | rivet | 14.4 | **56** ||
| 500,000 | ingestr | 10.3 | 1322 | **24×** |
| 1,000,000 | rivet | 30.1 | **70** ||
| 1,000,000 | ingestr | 19.6 | 1261 | **18×** |
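
The ratio column can be rechecked from the raw numbers. The sketch below copies the table values into a dict (the `results` and `ratios` names are illustrative only) and recomputes the RSS ratio plus the wall-time ratio, which the table does not list.

```python
# (scale, tool) -> (wall seconds, peak RSS MB), copied from the table above
results = {
    (100_000, "rivet"): (3.0, 47),
    (100_000, "ingestr"): (3.1, 882),
    (500_000, "rivet"): (14.4, 56),
    (500_000, "ingestr"): (10.3, 1322),
    (1_000_000, "rivet"): (30.1, 70),
    (1_000_000, "ingestr"): (19.6, 1261),
}


def ratios(results):
    """Per scale: (ingestr RSS / rivet RSS, rivet wall / ingestr wall)."""
    out = {}
    for scale in sorted({s for s, _ in results}):
        r_wall, r_rss = results[(scale, "rivet")]
        i_wall, i_rss = results[(scale, "ingestr")]
        out[scale] = (round(i_rss / r_rss), round(r_wall / i_wall, 2))
    return out


for scale, (rss_x, wall_x) in ratios(results).items():
    print(f"{scale:>9,}: ingestr RSS {rss_x}x rivet, rivet wall {wall_x}x ingestr")
```

The RSS ratios come out as 19, 24 and 18, matching the table; the wall ratios are about 0.97, 1.40 and 1.54.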

## Read it honestly

- **Memory: rivet's peak RSS is ~18–24× lower at every scale.** Both curves are roughly
  *flat* with row count: rows grow 10× while rivet goes 47 → 70 MB and ingestr
  882 → ~1,300 MB. rivet stays flat because it works to a byte budget, ingestr because
  its peak is one fixed 100k-row batch. So the gap is **structural and roughly
  constant**, not a small-data artifact: ~19× at 100k *and* ~18× at 1M.
- **This sweep does NOT show ingestr "climbing"** — and it shouldn't. ingestr's RSS
  scales with row **width × batch_rows**, not row count. The follow-up that shows it
  *diverging* is a **width sweep** (narrow → wide fixtures), not this row-count one.
- **Wall: ingestr is ~1.4–1.5× faster at 500k / 1M** (10.3 s vs 14.4 s and 19.6 s vs
  30.1 s; native pgx + big Arrow batches) and tied at 100k. We don't hide it. It is a
  different trade-off: rivet spends ~1.5× the wall time to hold RAM ~20× lower and
  width-independent.
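
The width argument can be made concrete with a toy model. The sketch below compares the payload of one batch under a fixed row count with one under a byte budget. Everything in it is an assumption for illustration: the function names are hypothetical, the 32 MiB budget is not rivet's actual setting, and the ~3,400 bytes/row figure is only the caveat's "~34 GB" for 10,000,000 rows divided out. It models buffered row bytes only, not full process RSS, so it will not reproduce the measured numbers.

```python
def fixed_rows_peak_bytes(row_bytes, batch_rows=100_000):
    """Bytes held by one batch when the batch is a fixed number of rows."""
    return row_bytes * batch_rows


def byte_budget_rows(row_bytes, budget_bytes):
    """Rows that fit in one batch when the batch is capped in bytes."""
    return max(1, budget_bytes // row_bytes)


def byte_budget_peak_bytes(row_bytes, budget_bytes):
    """Bytes held by one byte-budgeted batch."""
    return byte_budget_rows(row_bytes, budget_bytes) * row_bytes


BUDGET = 32 * 1024 * 1024  # hypothetical budget, not rivet's real default

for row_bytes in (100, 1_000, 3_400):  # narrow -> wide (3,400 ~ 34 GB / 10M rows)
    fixed = fixed_rows_peak_bytes(row_bytes)
    capped = byte_budget_peak_bytes(row_bytes, BUDGET)
    print(f"{row_bytes:>5} B/row: fixed-rows {fixed / 1e6:7.1f} MB, "
          f"byte-budget {capped / 1e6:5.1f} MB")
```

In this model the fixed-row batch grows linearly with row width, while the byte-budgeted batch stays at or below the budget regardless of width. Neither depends on the total row count, which is why a row-count sweep shows two flat curves.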

## Caveats
macOS, single box (not the Linux bench machine). rivet uses `mode: full` here for a
single-file, apples-to-apples comparison with ingestr; `mode: chunked` (byte budget)
lands a touch lower still. To extend the curve, re-run with `10_000_000` in
`scenarios.yaml` on a box with the disk for it (~34 GB of wide rows).