Stream-decode from a reader to a writer. Used for stdin processing.
Fused single-pass: read chunk -> strip whitespace -> decode immediately.
Uses a 16MB read buffer for maximum pipe throughput. read_full retries
until the buffer is full or EOF, so the entire 10MB benchmark input is
read in a single buffer fill, minimizing per-chunk overhead.
memchr2-based SIMD whitespace stripping handles the common case efficiently.
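
The read-and-strip half of that loop can be sketched as follows. This is a minimal, std-only illustration and not the actual implementation: the body of read_full is an assumption about what "retries to fill the entire buffer" means, strip_whitespace is a scalar stand-in for the memchr2-based path, and read_and_strip is a hypothetical name. A real stream decoder would decode each stripped chunk immediately instead of collecting it.

```rust
use std::io::{self, Read};

/// Assumed behavior of `read_full`: keep calling `read` until `buf` is full
/// or EOF is reached. Returns the number of bytes placed in `buf`.
fn read_full<R: Read>(reader: &mut R, buf: &mut [u8]) -> io::Result<usize> {
    let mut filled = 0;
    while filled < buf.len() {
        match reader.read(&mut buf[filled..]) {
            Ok(0) => break, // EOF
            Ok(n) => filled += n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(filled)
}

/// Append `chunk` to `out` with ASCII whitespace removed.
/// Scalar stand-in for the memchr2-based stripping (hypothetical helper).
fn strip_whitespace(chunk: &[u8], out: &mut Vec<u8>) {
    out.extend(chunk.iter().copied().filter(|b| !b.is_ascii_whitespace()));
}

/// Hypothetical driver: read `reader` in `buf_size`-byte fills and return
/// the stripped bytes. Decoding is left out to keep the sketch std-only.
fn read_and_strip<R: Read>(reader: &mut R, buf_size: usize) -> io::Result<Vec<u8>> {
    let mut buf = vec![0u8; buf_size];
    let mut clean = Vec::new();
    loop {
        let n = read_full(reader, &mut buf)?;
        if n == 0 {
            break;
        }
        strip_whitespace(&buf[..n], &mut clean);
        if n < buf.len() {
            break; // a short fill means EOF was reached
        }
    }
    Ok(clean)
}
```

With a buffer larger than the input, the loop body runs once, which is the situation the 16MB buffer is sized for.
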
Decode base64 data and write the result to the output. Borrows the input
and allocates a separate buffer for the cleaned data.
When ignore_garbage is true, strip all non-base64 characters.
When false, only strip whitespace (standard behavior).
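
The two cleaning modes can be illustrated with a small sketch. The names clean_input and is_base64_byte are hypothetical, and the sketch assumes the standard base64 alphabet (A-Z, a-z, 0-9, '+', '/') plus '=' padding; the real function may clean and decode in one pass.

```rust
/// True for bytes in the standard base64 alphabet, including '=' padding.
fn is_base64_byte(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'+' || b == b'/' || b == b'='
}

/// Hypothetical helper: build the cleaned buffer a decoder would consume.
/// The input is only borrowed; the cleaned copy is a fresh allocation.
fn clean_input(data: &[u8], ignore_garbage: bool) -> Vec<u8> {
    let mut clean = Vec::with_capacity(data.len());
    for &b in data {
        let keep = if ignore_garbage {
            is_base64_byte(b) // drop everything outside the alphabet
        } else {
            !b.is_ascii_whitespace() // drop whitespace only
        };
        if keep {
            clean.push(b);
        }
    }
    clean
}
```

In the default mode a stray byte such as '*' survives cleaning, so it is left for the decoder to reject; with ignore_garbage it is dropped before decoding.
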
Stream-encode from a reader to a writer. Used for stdin processing.
Dispatches to specialized paths for wrap_col=0 (no wrap) and wrap_col>0 (wrapping).
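
The effect of the wrap_col dispatch on output layout can be sketched as below. wrap_encoded is a hypothetical helper that works on text that is already encoded; the actual paths presumably encode and wrap while streaming. The trailing newline after a final partial line is an assumption modeled on common base64 tools, not something the comment states.

```rust
/// Hypothetical helper: lay out already-encoded base64 text.
/// wrap_col == 0 copies the text unchanged; otherwise a newline follows
/// every `wrap_col` bytes, including a final partial line (assumed).
fn wrap_encoded(encoded: &[u8], wrap_col: usize) -> Vec<u8> {
    if wrap_col == 0 {
        // No-wrap path: nothing to insert.
        return encoded.to_vec();
    }
    // Wrapping path: one extra byte per output line.
    let mut out = Vec::with_capacity(encoded.len() + encoded.len() / wrap_col + 1);
    for line in encoded.chunks(wrap_col) {
        out.extend_from_slice(line);
        out.push(b'\n');
    }
    out
}
```

Keeping the no-wrap case on its own branch avoids the per-line bookkeeping entirely when no newlines are needed.
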