shuflr — streaming shuffled JSONL.
See docs/design/002-revised-plan.md for the authoritative v1 specification,
plus amendments 003-compression-formats.md, 004-convert-subcommand.md,
and 005-serve-multi-transport.md.
Re-exports
pub use error::Error;
pub use error::Result;
pub use framing::OnError;
pub use framing::Stats;
pub use index::Fingerprint;
pub use index::IndexFile;
pub use sampling::SamplingReader;
pub use seed::Seed;
Modules
- analyze
- shuflr analyze: detect source-order locality in a seekable-zstd file that would make --shuffle=chunk-shuffled a bad choice (ML review 02 §1, 002 §6.4).
- error
- Library-wide error type, per 002 §10.3.
- framing
- Record-framing primitives: how bytes become lines, and what to do on mishaps.
- index
- .shuflr-idx byte-offset index for --shuffle=index-perm (002 §2.2).
- io
- Input sources.
- json_validate
- Minimal JSON syntactic validator used by shuflr verify --deep.
- pipeline
- Engine pipelines. Each module here is a complete shuffle mode or orchestrated flow. v1 modes arrive in the order documented by 002 §2 and 004 §9.
- sampling
- Record-level sampling transforms that wrap any Read and re-expose a Read with filtered contents. Three orthogonal modes, composable:
- seed
- PRF hierarchy rooted at a master seed (002 §3).
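The sampling module is described as wrapping any Read and re-exposing a Read with filtered contents. The following is a minimal sketch of that adapter pattern using only the standard library. The `EveryNth` type, its keep-every-n-th rule, and the `sample_every_nth` helper are hypothetical illustrations; they are not `SamplingReader` and are not necessarily one of the modes shuflr ships.

```rust
use std::io::{self, BufRead, Cursor, Read};

/// Hypothetical adapter (not part of shuflr): keeps every `n`-th
/// newline-delimited record of the inner reader and exposes the
/// surviving bytes through `Read`.
struct EveryNth<R: BufRead> {
    inner: R,
    n: usize,
    seen: usize,      // records consumed from `inner` so far
    pending: Vec<u8>, // bytes of the record currently being handed out
    pos: usize,       // how many bytes of `pending` were already returned
}

impl<R: BufRead> EveryNth<R> {
    fn new(inner: R, n: usize) -> Self {
        assert!(n > 0, "n must be at least 1");
        Self { inner, n, seen: 0, pending: Vec::new(), pos: 0 }
    }
}

impl<R: BufRead> Read for EveryNth<R> {
    fn read(&mut self, out: &mut [u8]) -> io::Result<usize> {
        // Refill `pending` with the next kept record once it is drained.
        while self.pos == self.pending.len() {
            self.pending.clear();
            self.pos = 0;
            if self.inner.read_until(b'\n', &mut self.pending)? == 0 {
                return Ok(0); // inner reader is exhausted
            }
            let keep = self.seen % self.n == 0;
            self.seen += 1;
            if !keep {
                self.pending.clear(); // drop the record and try the next one
            }
        }
        let take = out.len().min(self.pending.len() - self.pos);
        out[..take].copy_from_slice(&self.pending[self.pos..self.pos + take]);
        self.pos += take;
        Ok(take)
    }
}

/// Convenience wrapper for in-memory input, used for demonstration only.
fn sample_every_nth(input: &str, n: usize) -> String {
    let mut out = String::new();
    EveryNth::new(Cursor::new(input.as_bytes()), n)
        .read_to_string(&mut out)
        .expect("in-memory read cannot fail");
    out
}

fn main() {
    let jsonl = "{\"id\":0}\n{\"id\":1}\n{\"id\":2}\n{\"id\":3}\n";
    print!("{}", sample_every_nth(jsonl, 2));
}
```

Because the adapter itself implements Read, it can be passed to any consumer that accepts a reader, and adapters of this shape can be stacked on top of each other.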
Functions
- physical_cores
- Physical CPU core count (not logical/SMT). Defaults to 1 on systems where detection fails. Preferred over std::thread::available_parallelism for compute-heavy workloads like zstd compression; see docs/bench/001-edgar-31gb-gzip.md §thread-scaling.
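The fallback behaviour described for physical_cores can be sketched with the standard library alone. The helper names `cores_or_default` and `logical_cores` are hypothetical, and treating a zero result as a failed detection is an assumption; how shuflr actually probes the hardware is not shown on this page.

```rust
use std::thread;

/// Hypothetical helper: apply the documented fallback, so a missing
/// (or, by assumption, zero) detection result becomes 1.
fn cores_or_default(detected: Option<usize>) -> usize {
    detected.filter(|&n| n > 0).unwrap_or(1)
}

/// Logical CPU count as reported by the standard library, or 1 on error.
/// On SMT machines this is typically larger than the physical core count.
fn logical_cores() -> usize {
    thread::available_parallelism().map(|n| n.get()).unwrap_or(1)
}

fn main() {
    // `None` stands in for a platform where physical-core detection fails.
    println!("fallback when detection fails: {}", cores_or_default(None));
    println!("logical cores on this machine: {}", logical_cores());
}
```

A caller sizing a compression thread pool would pass the result of the real probe in place of `None`.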