kobold-csv 0.1.0

# kobold-csv

**Forensic CSV/delimited evidence for COBOL record migration, reconciliation, ETL, and analyst review.**
Export raw COBOL records + their copybook into deterministic delimited text (CSV / `|` / TSV) that preserves
**both the semantic value AND the storage truth** — `pic` / `offset` / `length` / `raw_hex` /
`copybook_hash` / `record_hash` / `findings` — parse a Compact extract back into the **exact** record bytes,
prove the round trip, and compute the batch **control totals** banks reconcile on. **Fail-closed.**

> [!IMPORTANT]
> **kobold-csv is clean-room and zero-libcob.** It links no COBOL runtime, contains no libcob-derived or
> GnuCOBOL-derived code, and depends on **no crate at all** (`std`-only) — in particular not on
> `gnucobol-rs`, `gnucobol-rs-json`, `kobold-json`, or `kobold-xml`. It is **NOT a generic CSV library** and
> **NOT a GnuCOBOL 3.2 parity claim.** It answers *"what should forensic delimited-file evidence for a COBOL
> record migration / reconciliation look like?"* Evidence here is the `KOBOLD.CSV.*` court namespace.

## Why

A batch migration must be **auditable**: when a COBOL file becomes CSV — for an analyst, an ETL pipeline, or
a target-system reload — you must be able to prove (a) what each field *means*, (b) that no byte was silently
lost, coerced, or truncated, and (c) that the money still balances. kobold-csv carries the raw bytes
alongside the decoded value and **emits a `Finding` instead of silently coercing** when a numeric field is
not valid digits, when a value overflows its PIC, when a quoted cell is malformed, or when a header does not
match the copybook.

## Export modes (`KOBOLD.CSV.EXPORT.1`)

| Mode       | Shape | Header |
|------------|-------|--------|
| `Compact`  | one row **per record** | field names — the classic `ACCOUNT_NO,BALANCE,STATUS` extract |
| `Audit`    | **tall**, one row per (record, field) | `field,value,pic,offset,length,raw_hex,findings` |
| `Evidence` | tall + custody hashes | `record_hash,copybook_hash,field,value,pic,offset,length,raw_hex,findings` |

A wide row cannot carry per-field metadata, so custody data goes **tall**. Hashes are `sha256:`-prefixed.
Numeric rendering: leading zeros stripped, decimal point at the implied `scale`, leading `-` if negative
(zoned sign overpunch recognized); `raw_hex` is lowercase hex of the exact field bytes. A non-digit numeric
field emits a `NUMERIC_NONDIGIT` finding — never a silent coercion — and `raw_hex` preserves the truth.

## Quoting (`KOBOLD.CSV.ESCAPE`)

RFC-4180 style: a field is quoted when it contains the delimiter, the quote char, CR, or LF; embedded quotes
are doubled (`"` → `""`). The writer (`write_field`) and the **fail-closed** reader (`parse_row`) are exact
inverses — an unterminated quote or stray text after a closing quote is a `Finding`, never a guess. Dialects:
`Dialect::csv()` (comma), `Dialect::pipe()` (`|`), `Dialect::tab()` (TSV).

## Courts

- `KOBOLD.CSV.EXPORT.1` — records + copybook → delimited evidence (Compact / Audit / Evidence).
- `KOBOLD.CSV.PARSE.1` — a Compact extract + copybook → reconstructed record bytes, **fail-closed** (a value
  too long for its field, a non-numeric value into a numeric field, a header mismatch, or a wrong column
  count yields a `Finding`, not bytes).
- `KOBOLD.CSV.ROUNDTRIP.1` — records → Compact CSV → `parse_into` → **identical** bytes (a value-only extract
  that cannot preserve a non-canonical stored form is reported honestly, not faked).
- `KOBOLD.CSV.DIFF.1` — row/field-wise differences between a source table and a target table (columns aligned
  by name); a `DiffEntry { row, field, source, target }` per differing cell.
- `KOBOLD.CSV.CONTROLTOTAL.1` — exact **integer-scaled** field sums (no float drift) + record count: the
  batch control totals a reconciliation balances on.

## Dependency-free

No crate dependencies, `std`-only. SHA-256 (for `copybook_hash` / `record_hash`) is a small pure-Rust
implementation in `src/sha256.rs`, tested against published vectors. `#![forbid(unsafe_code)]`.

License: Apache-2.0.