Available on crate feature parallel only.
Parallel multi-document YAML parsing via Rayon. Gated by the
parallel feature.
Parallel multi-document YAML parsing — the “MapReduce” path.
For massive multi-document streams (telemetry logs, audit
exports, Kubernetes-resource snapshots, anything emitting documents
separated by --- markers at scale), even the fastest single-threaded
parser is bounded by one CPU core. This module pre-scans the
input on the main thread, splits it into per-document slices,
then dispatches each document to a Rayon worker.
Gated behind the parallel Cargo feature.
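The dispatch step described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not this module's implementation: it swaps Rayon for std::thread::scope, and parse_all, parse_id, and the workers parameter are hypothetical names introduced only for this example. Documents are assumed to be already split into slices.

```rust
use std::thread;

/// Hypothetical stand-in for a real per-document parser: extracts the
/// value of the first `id: N` line, if any.
fn parse_id(doc: &str) -> Option<u32> {
    doc.lines()
        .find_map(|line| line.strip_prefix("id:"))
        .and_then(|rest| rest.trim().parse().ok())
}

/// Parse pre-split documents on up to `workers` scoped threads.
/// Results come back in the same order as `docs`.
/// NOTE: uses std::thread::scope instead of Rayon (a swap made for this sketch).
fn parse_all<T, F>(docs: &[&str], workers: usize, parse_one: F) -> Vec<T>
where
    T: Send,
    F: Fn(&str) -> T + Sync,
{
    let workers = workers.max(1);
    // Ceiling division; never zero, because `chunks(0)` would panic.
    let chunk_len = ((docs.len() + workers - 1) / workers).max(1);
    let parse_one = &parse_one;

    thread::scope(|scope| {
        // Spawn every worker first (collecting forces the lazy iterator)...
        let handles: Vec<_> = docs
            .chunks(chunk_len)
            .map(|chunk| {
                scope.spawn(move || {
                    chunk.iter().map(|&doc| parse_one(doc)).collect::<Vec<T>>()
                })
            })
            .collect();

        // ...then join in spawn order so the output order matches the input.
        handles
            .into_iter()
            .flat_map(|handle| handle.join().expect("worker thread panicked"))
            .collect::<Vec<T>>()
    })
}
```

Calling parse_all(&docs, 2, parse_id) on three single-field documents would split them into chunks of two and one. A work-stealing pool such as Rayon's balances uneven document sizes better than the fixed chunking used here.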
§Linear scaling
The pre-scan runs in O(input_len) with no allocation; the
parse-per-document work is the dominant cost and parallelises
naturally across cores. Expect near-linear speedup with the
number of cores up to the point where document size starts to
dominate (very large single documents see less benefit because
one document still parses on one thread).
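A rough way to see why large documents cap the speedup is sketched below. The cost model is an assumption of this example, not a measurement: parse time is taken as proportional to document size in bytes, and pre-scan and scheduling overhead are ignored. The function name speedup_upper_bound is hypothetical.

```rust
/// Upper bound on parallel speedup under a simplified cost model
/// (assumption: parse cost is proportional to document bytes).
/// Wall time cannot drop below the largest single document, because
/// one document still parses on one thread.
fn speedup_upper_bound(doc_sizes: &[usize], cores: usize) -> f64 {
    let total: usize = doc_sizes.iter().sum();
    let largest = doc_sizes.iter().copied().max().unwrap_or(0);
    if total == 0 || cores == 0 {
        return 1.0; // nothing to parallelise
    }
    // Perfectly balanced work per core.
    let ideal_parallel = total as f64 / cores as f64;
    // The slowest unavoidable piece of work.
    let critical_path = ideal_parallel.max(largest as f64);
    total as f64 / critical_path
}
```

Under this model, eight equally sized documents on eight cores can reach the full eightfold speedup, while a stream dominated by one very large document stays close to single-threaded performance no matter how many cores are available.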
§Document-boundary contract
The pre-scanner recognises --- document-start markers that
begin at column 0 and are followed by \n, \r, a space, \t, or
end-of-input. This matches the YAML 1.2.2 §9.1.2 grammar for
c-directives-end. The scanner does not recognise:
- --- inside a literal (|) or folded (>) block scalar that is column-0-aligned (extremely rare in practice; the YAML spec does not actually permit such a literal because block scalars must indent past the parent).
- ... document-end markers: they are advisory in YAML 1.2, and the document-start scan picks up the next document anyway.
Inputs that violate the column-0 rule fall back to the
conservative single-document slice (everything before the next
valid --- is treated as one document).
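The boundary rule above can be illustrated with a small standalone scanner. This is a minimal sketch, not this module's code: split_documents and is_document_start are hypothetical names, the sketch collects offsets into a Vec for simplicity (unlike the allocation-free pre-scan described earlier), and it treats only \n as a line terminator when locating column 0.

```rust
/// True if `line` begins with `---` followed by '\n', '\r', ' ', '\t',
/// or end-of-input. `line` must start at column 0.
fn is_document_start(line: &[u8]) -> bool {
    line.starts_with(b"---")
        && matches!(
            line.get(3).copied(),
            None | Some(b'\n' | b'\r' | b' ' | b'\t')
        )
}

/// Hypothetical boundary scan: one slice per document, each beginning at
/// a valid `---` marker (or at offset 0 for content before the first one).
fn split_documents(input: &str) -> Vec<&str> {
    let bytes = input.as_bytes();
    let mut starts: Vec<usize> = Vec::new();
    let mut pos = 0; // always the byte offset of a line start

    while pos < bytes.len() {
        if is_document_start(&bytes[pos..]) {
            starts.push(pos);
        }
        // Jump to the byte after the next '\n', or to end-of-input.
        pos = match bytes[pos..].iter().position(|&b| b == b'\n') {
            Some(offset) => pos + offset + 1,
            None => bytes.len(),
        };
    }

    // Anything before the first marker is kept as its own leading slice.
    if !input.is_empty() && starts.first() != Some(&0) {
        starts.insert(0, 0);
    }

    // Each document runs from its start to the next start (or end-of-input).
    starts
        .iter()
        .enumerate()
        .map(|(i, &start)| {
            let end = starts.get(i + 1).copied().unwrap_or(input.len());
            &input[start..end]
        })
        .collect()
}
```

In this sketch, a line such as ---more or an indented --- is not a boundary, so it stays inside the current slice until the next valid marker, which mirrors the conservative fallback described above.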
§Examples
let yaml = "---\nid: 1\n---\nid: 2\n---\nid: 3\n";
#[derive(serde::Deserialize, Debug)]
struct Record { id: u32 }
let records: Vec<Record> = noyalib::parallel::parse(yaml).unwrap();
assert_eq!(records.len(), 3);
§API shape
- parse: typed deserialise into Vec<T>.
- values: dynamic-tree variant returning Vec<Value>.
- split: standalone document-boundary pre-scanner for callers driving their own concurrency primitives.
Names are kept short on purpose — the parallel namespace
already encodes the concurrency contract, so the function
verb stays single-word: parallel::parse reads as one
sentence.
Functions§
- parse
  Deserialise every YAML document in input into T, parsing in parallel via Rayon’s global thread pool.
- split
  Split input into per-document byte slices on YAML 1.2 --- markers. Single-pass O(input.len()). Public so callers that drive their own concurrency primitives (async tasks, custom thread pools) can reuse the same boundary scan.
- values
  Dynamic-tree variant of parse: returns a Vec<crate::Value>. Use when the caller wants to route documents to different typed handlers post-parse.