Module structured_compact

Lossless compaction of structured data (JSON / JSON Lines).

Pretty-printed JSON is whitespace-heavy: indentation, the spaces after colons and commas, and newlines can account for 20-50% of the bytes. The code read modes (map, signatures) do not apply to data files, so JSON previously fell through to the line-based aggressive path, which saved roughly 0% when measured.

This module strips only insignificant whitespace, that is, the whitespace bytes that sit outside string literals. The result is lossless:

  • key order is preserved (we operate on the original text, not a parsed serde_json::Value, which sorts keys unless the preserve_order feature is enabled);
  • number formatting is preserved (e.g. 1.0, 1e3, trailing zeros);
  • string contents (including any whitespace inside them) are untouched.
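The three guarantees above follow from scanning the original text rather than re-serializing it. A minimal sketch of such a scanner, assuming the input has already been validated as JSON; the function name strip_insignificant_whitespace is hypothetical and is not taken from this module's source:

```rust
/// Removes JSON whitespace (space, tab, LF, CR) that sits outside string
/// literals. Hypothetical helper: it assumes `input` is already known to be
/// valid JSON and does no validation of its own.
fn strip_insignificant_whitespace(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut in_string = false; // currently between an opening and closing quote
    let mut escaped = false; // previous char inside the string was a backslash

    for ch in input.chars() {
        if in_string {
            // Inside a string literal every character is copied verbatim.
            out.push(ch);
            if escaped {
                escaped = false;
            } else if ch == '\\' {
                escaped = true;
            } else if ch == '"' {
                in_string = false;
            }
        } else if ch == '"' {
            in_string = true;
            out.push(ch);
        } else if !matches!(ch, ' ' | '\t' | '\n' | '\r') {
            // Structural characters, numbers, and literals pass through as-is,
            // so key order and number formatting are never changed.
            out.push(ch);
        }
    }
    out
}
```

Because nothing is parsed into a value and printed again, tokens such as 1.0 or 1e3 are emitted exactly as they were written.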

We validate that the input parses as JSON before touching it, so malformed data is never altered, and we only return output that is strictly smaller.
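The validate-first, smaller-only contract can be sketched as a small wrapper. This is an illustration under stated assumptions, not the module's code: compact_if_smaller is a hypothetical name, and the validator and stripper are passed in as closures so the sketch stays self-contained.

```rust
/// Hypothetical wrapper showing the two guards described above.
/// `is_valid_json` stands in for a real parse check (for example,
/// `serde_json::from_str::<serde_json::Value>(s).is_ok()`); `strip` stands in
/// for the whitespace stripper.
fn compact_if_smaller(
    input: &str,
    is_valid_json: impl Fn(&str) -> bool,
    strip: impl Fn(&str) -> String,
) -> Option<String> {
    // Guard 1: malformed input is never altered.
    if !is_valid_json(input) {
        return None;
    }
    let out = strip(input);
    // Guard 2: only hand back output that is strictly smaller in bytes.
    if out.len() < input.len() {
        Some(out)
    } else {
        None
    }
}
```

Returning None in both failure cases lets a caller fall back to the original text without distinguishing between "invalid" and "already compact".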

Functions

compact_json
Compacts a single JSON document by removing insignificant whitespace.
compact_jsonl
Compacts JSON Lines (one JSON value per line). Returns Some only when every non-empty line is valid JSON and the joined result is strictly smaller.
compact_structured
Best-effort lossless compaction selected by file extension.
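The sketch below illustrates how extension-based selection and the JSON Lines rule could fit together. Everything in it is an assumption made for illustration: the names StructuredKind, kind_for_path, and compact_jsonl_sketch are hypothetical, the set of recognized extensions is a guess, and the handling of blank lines (kept as empty lines here) is not specified by the documentation above.

```rust
use std::path::Path;

/// Hypothetical classification of a data file by its extension.
#[derive(Debug, PartialEq)]
enum StructuredKind {
    Json,
    JsonLines,
}

/// Picks a format from the file extension (case-insensitive).
/// Assumption: the real module may recognize a different set of extensions.
fn kind_for_path(path: &Path) -> Option<StructuredKind> {
    let ext = path.extension()?.to_str()?.to_ascii_lowercase();
    match ext.as_str() {
        "json" => Some(StructuredKind::Json),
        "jsonl" | "ndjson" => Some(StructuredKind::JsonLines),
        _ => None,
    }
}

/// Compacts JSON Lines with a caller-supplied per-line compactor that returns
/// `None` for invalid JSON. Returns `Some` only when every non-empty line
/// compacts successfully and the joined result is strictly smaller.
fn compact_jsonl_sketch(
    input: &str,
    compact_line: impl Fn(&str) -> Option<String>,
) -> Option<String> {
    let mut lines = Vec::new();
    for line in input.lines() {
        if line.trim().is_empty() {
            // Assumption: blank lines are kept as empty lines.
            lines.push(String::new());
        } else {
            // A single invalid line aborts the whole compaction.
            lines.push(compact_line(line)?);
        }
    }
    let mut joined = lines.join("\n");
    if input.ends_with('\n') {
        joined.push('\n'); // keep the trailing newline if there was one
    }
    (joined.len() < input.len()).then_some(joined)
}
```

A dispatcher in the spirit of compact_structured would call kind_for_path first and return None for any file it does not recognize, which keeps the operation best-effort.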