NSV (Newline-Separated Values) format implementation for Rust
Fast implementation using memchr, with optional parallel parsing via rayon. See https://nsv-format.org for the specification.
§Parallel Parsing Strategy
For files larger than 64KB, we use a chunked parallel approach:
- Pick N evenly-spaced byte positions (one per CPU core)
- For each, scan forward to the nearest \n\n row boundary, which costs O(avg_row_len)
- Each worker independently parses its chunk (boundary scan + cell split + unescape)
This works because literal 0x0A bytes in NSV are always structural (never escaped),
so row alignment recovery from any byte position is a trivial forward scan.
The sequential phase performs only N short boundary scans, so it is O(N) rather than O(input_len); all real work is parallel.
For smaller files, we use a sequential fast path to avoid thread overhead.
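The chunking steps above can be sketched with the standard library alone. This is a minimal illustration, not the crate's implementation: the names next_row_boundary, chunk_starts, parse_chunk and parse_parallel are hypothetical, std::thread::scope stands in for rayon, a plain window scan stands in for memchr, and cells are left escaped.

```rust
use std::thread;

/// Offset just past the first "\n\n" at or after `from`,
/// or `data.len()` if no row boundary follows.
fn next_row_boundary(data: &[u8], from: usize) -> usize {
    data[from..]
        .windows(2)
        .position(|w| w == b"\n\n")
        .map(|i| from + i + 2)
        .unwrap_or(data.len())
}

/// Pick `n` evenly spaced positions and snap each forward to a row boundary.
/// Duplicate boundaries and boundaries at end-of-input are dropped.
fn chunk_starts(data: &[u8], n: usize) -> Vec<usize> {
    let mut starts = vec![0];
    for k in 1..n {
        let guess = data.len() * k / n;
        let boundary = next_row_boundary(data, guess);
        if boundary < data.len() && boundary > *starts.last().unwrap() {
            starts.push(boundary);
        }
    }
    starts
}

/// Split one chunk into rows of cells. Every "\n" terminates a line;
/// an empty line ends the current row. Cells are NOT unescaped here.
fn parse_chunk(chunk: &[u8]) -> Vec<Vec<Vec<u8>>> {
    let mut rows = Vec::new();
    let mut row: Vec<Vec<u8>> = Vec::new();
    let mut rest = chunk;
    while let Some(i) = rest.iter().position(|&b| b == b'\n') {
        let line = &rest[..i];
        rest = &rest[i + 1..];
        if line.is_empty() {
            rows.push(std::mem::take(&mut row));
        } else {
            row.push(line.to_vec());
        }
    }
    // Tolerate an unterminated trailing cell or row.
    if !rest.is_empty() {
        row.push(rest.to_vec());
    }
    if !row.is_empty() {
        rows.push(row);
    }
    rows
}

/// Parse each chunk on its own scoped thread and concatenate the rows in order.
fn parse_parallel(data: &[u8], workers: usize) -> Vec<Vec<Vec<u8>>> {
    let starts = chunk_starts(data, workers.max(1));
    let mut ends: Vec<usize> = starts[1..].to_vec();
    ends.push(data.len());
    thread::scope(|s| {
        let handles: Vec<_> = starts
            .iter()
            .zip(&ends)
            .map(|(&a, &b)| {
                let chunk = &data[a..b];
                s.spawn(move || parse_chunk(chunk))
            })
            .collect();
        handles
            .into_iter()
            .flat_map(|h| h.join().unwrap())
            .collect()
    })
}
```

Because every chunk starts just after a \n\n pair, each worker sees only whole rows, so concatenating the per-chunk results in order should match a single sequential pass over the same input.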
Modules§
- util
- Utility functions for NSV structural operations.
Structs§
- Reader
- Streaming NSV reader. Yields one complete row of byte vectors at a time.
- Warning
- A single warning found during validation.
- Writer
- Streaming NSV writer. Wraps any W: Write and writes one row at a time.
Enums§
Constants§
Functions§
- check
- Report edge cases in raw NSV input without altering parsing behavior.
- decode
- Decode an NSV string into a seqseq.
- decode_bytes
- Decode raw bytes into a seqseq of byte slices. No encoding assumption; works with any ASCII-compatible encoding.
- decode_bytes_projected
- Decode only the specified columns from raw bytes.
- encode
- Encode a seqseq into an NSV string.
- encode_bytes
- Encode a seqseq of byte vectors into raw NSV bytes.
- escape
- Escape a single NSV cell.
- escape_bytes
- Escape a single raw cell (byte-level).
- unescape
- Unescape a single NSV cell.
- unescape_bytes
- Unescape a single raw cell (byte-level).
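To illustrate what cell-level escaping has to guarantee (an escaped cell never contains a literal 0x0A byte), here is a rough standalone sketch. It assumes the escaping rules are: backslash written as \\, line feed written as \n, and an empty cell written as a lone backslash. Those rules are an assumption made for this example, and the specification at https://nsv-format.org is authoritative. The names sketch_escape and sketch_unescape are hypothetical and are not the crate's escape and unescape functions, whose signatures are not shown on this page.

```rust
/// Hypothetical helper: escape one cell so it contains no literal newline.
/// Assumed rules: "" -> "\", '\\' -> "\\\\", '\n' -> "\\n".
fn sketch_escape(cell: &str) -> String {
    if cell.is_empty() {
        return "\\".to_string();
    }
    let mut out = String::with_capacity(cell.len());
    for c in cell.chars() {
        match c {
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            other => out.push(other),
        }
    }
    out
}

/// Hypothetical helper: inverse of `sketch_escape` under the same assumed rules.
/// Unknown escape sequences and a dangling backslash are passed through unchanged.
fn sketch_unescape(escaped: &str) -> String {
    if escaped == "\\" {
        return String::new();
    }
    let mut out = String::with_capacity(escaped.len());
    let mut chars = escaped.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        match chars.next() {
            Some('n') => out.push('\n'),
            Some('\\') => out.push('\\'),
            Some(other) => {
                out.push('\\');
                out.push(other);
            }
            None => out.push('\\'),
        }
    }
    out
}
```

Under these assumed rules the two helpers round-trip any string, and the escaped form is always a single non-empty line, which is what lets a parser treat every literal newline as structural.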