randstream: Reproducible Random Stream Generator and Validator
randstream is a high-performance command-line utility for creating and
validating reproducible, pseudo-random data streams. It is designed for use
cases such as verifying storage integrity, benchmarking I/O performance, or
generating large, arbitrary datasets for testing.
The utility uses a seed to ensure that the generated data is reproducible. So that a stream can be validated without regenerating the data, it is processed in chunks (32 KB by default), and each chunk ends with a 4-byte checksum for integrity verification. Chunks are generated in parallel to maximize throughput on modern hardware, and the output is identical regardless of the number of parallel tasks.
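The chunk layout described above can be illustrated with a minimal Python sketch. This is not randstream's actual algorithm: the generator (Python's `random`), the checksum (CRC32), the way the per-chunk seed is derived, and the names `make_chunk`, `chunk_is_intact`, and `generate` are all assumptions chosen for illustration. It also assumes the 4-byte checksum is counted inside the 32 KB chunk. The point it shows is that when each chunk depends only on the seed and its own index, the output does not depend on how many workers produce it.

```python
import random
import struct
import zlib
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 32 * 1024  # default chunk size from the text (assumed to include the checksum)
CHECKSUM_SIZE = 4       # trailing checksum length from the text


def make_chunk(seed: int, index: int, size: int = CHUNK_SIZE) -> bytes:
    """Build chunk `index`: pseudo-random payload followed by a 4-byte CRC32."""
    # Assumption: the generator state is derived from (seed, index) only,
    # so a chunk never depends on the chunks before it.
    rng = random.Random((seed << 64) | index)
    payload = rng.randbytes(size - CHECKSUM_SIZE)
    return payload + struct.pack("<I", zlib.crc32(payload))


def chunk_is_intact(chunk: bytes) -> bool:
    """Check a chunk against its own trailing checksum, without the seed."""
    payload, stored = chunk[:-CHECKSUM_SIZE], chunk[-CHECKSUM_SIZE:]
    return struct.pack("<I", zlib.crc32(payload)) == stored


def generate(seed: int, n_chunks: int, jobs: int) -> bytes:
    """Generate `n_chunks` chunks using `jobs` workers, joined in index order."""
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # map() yields results in input order, whatever order they finish in.
        return b"".join(pool.map(lambda i: make_chunk(seed, i), range(n_chunks)))
```

In this sketch, `generate(42, 8, jobs=1)` and `generate(42, 8, jobs=4)` return the same bytes, and `chunk_is_intact` can flag a damaged chunk without knowing the seed.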
Installation
Download the archive for your platform from the releases page.
Or install the binary with cargo-binstall:
Or install from source:
Usage
randstream has two main commands: generate and validate.
Generating a Random Stream (generate)
Use the generate command to create a reproducible stream of pseudo-random data.
Validating a Random Stream (validate)
Use the validate command to verify that an existing stream has not been
corrupted or altered. The validation process will re-generate the data
internally using the same seed and compare it byte-for-byte with the input
stream.
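The regenerate-and-compare approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `expected_chunk` is a hypothetical stand-in for the tool's generator (it uses Python's `random` and omits the per-chunk checksum), and `first_mismatch` is an invented helper name, not part of randstream.

```python
import io
import random

CHUNK_SIZE = 32 * 1024  # default chunk size from the text


def expected_chunk(seed: int, index: int) -> bytes:
    """Hypothetical stand-in for the generator: chunk `index` for `seed`."""
    return random.Random((seed << 64) | index).randbytes(CHUNK_SIZE)


def first_mismatch(stream, seed: int):
    """Return the byte offset of the first difference, or None if none is found.

    Assumes `stream.read(n)` returns n bytes except at end of stream,
    which holds for regular files and io.BytesIO.
    """
    index = 0
    while True:
        actual = stream.read(CHUNK_SIZE)
        if not actual:
            return None  # end of stream reached without a difference
        # Regenerate the full chunk and trim it, so a short final chunk
        # is still compared against the right bytes.
        expected = expected_chunk(seed, index)[:len(actual)]
        if actual != expected:
            offset = next(i for i, (x, y) in enumerate(zip(actual, expected)) if x != y)
            return index * CHUNK_SIZE + offset
        index += 1
```

For example, `first_mismatch(io.BytesIO(data), seed)` returns `None` when `data` was produced from the same seed, and the offset of the first altered byte otherwise.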
Examples
Fill a whole block device:
Generate a 100 GB file using a specific seed and 2 parallel tasks:
Validate a previously generated stream: