sample-lines 1.0.0

Command-line tool to sample lines from a file or stdin without replacement. It uses reservoir sampling, so it makes a single pass over the input and never reads the whole input into memory.

sample

sample is a fast, reliable command-line tool to randomly sample lines from a file or standard input using reservoir sampling. It samples without replacement.
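To illustrate the idea, here is a minimal sketch of reservoir sampling (the classic "Algorithm R"). It is not taken from the sample source code: the function name `reservoir_sample`, the `SplitMix64` helper, and the `u64` seed type are assumptions made for this example, and the real tool may use a different random number generator or algorithm variant.

```rust
use std::io::{self, BufRead};

/// Tiny SplitMix64 generator, used only to keep this sketch dependency-free.
/// (Hypothetical stand-in; the real tool may use a different RNG.)
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Roughly uniform integer in 0..bound (modulo bias ignored for brevity).
    /// `bound` must be non-zero.
    fn below(&mut self, bound: u64) -> u64 {
        self.next_u64() % bound
    }
}

/// Algorithm R: keep the first `n` lines, then for the line at 0-based
/// index `i`, overwrite a random reservoir slot with probability n / (i + 1).
/// Only `n` lines are held in memory at any time.
fn reservoir_sample<R: BufRead>(input: R, n: usize, seed: u64) -> io::Result<Vec<String>> {
    let mut rng = SplitMix64(seed);
    let mut reservoir: Vec<String> = Vec::with_capacity(n);
    for (i, line) in input.lines().enumerate() {
        let line = line?;
        if reservoir.len() < n {
            // Still filling the reservoir: keep every line.
            reservoir.push(line);
        } else {
            // Pick j uniformly in 0..=i; the line survives only if j < n.
            let j = rng.below(i as u64 + 1) as usize;
            if j < n {
                reservoir[j] = line;
            }
        }
    }
    Ok(reservoir)
}
```

Calling `reservoir_sample(std::io::stdin().lock(), 10, 17)` would return up to 10 lines from standard input. Because each input line occupies at most one slot, the result is a sample without replacement, and the same seed on the same input gives the same lines.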

Good for:

  • Downsampling large datasets
  • Sampling logs for debugging
  • Creating reproducible random subsets of data

📦 Installation

If you have Rust installed, you can install sample with:

cargo install sample-lines

Or build from source:

git clone https://github.com/stringertheory/sample.git
cd sample
cargo build --release

🚀 Usage

sample -n <NUM> [--seed <SEED>] [FILE]

Here are a few examples:

sample --help
cat data.txt | sample -n 10
sample -n 10 data.txt
sample -n 10 < data.txt
sample -n 10 --seed 17 < data.txt

Options

Option           Description
-n <NUM>         Number of lines to sample (required)
--seed <SEED>    Optional seed for reproducible sampling
-h, --help       Show help message
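For readers curious how options like these map onto code, the following is a rough, dependency-free sketch of parsing `-n <NUM> [--seed <SEED>] [FILE]`. It is an illustration only: the `Config` struct, the `parse_args` function, the `u64` seed type, and the error messages are all assumptions, and the actual tool may well use an argument-parsing crate instead. Handling of `-h, --help` is left out for brevity.

```rust
/// Parsed command-line options (hypothetical struct, named for this sketch).
#[derive(Debug, PartialEq)]
struct Config {
    num: usize,           // value of -n (required)
    seed: Option<u64>,    // value of --seed, if given (u64 is an assumption)
    file: Option<String>, // input path; None means read standard input
}

/// Parse the arguments that follow the program name.
fn parse_args<I: IntoIterator<Item = String>>(args: I) -> Result<Config, String> {
    let mut num = None;
    let mut seed = None;
    let mut file = None;
    let mut iter = args.into_iter();
    while let Some(arg) = iter.next() {
        match arg.as_str() {
            "-n" => {
                let v = iter.next().ok_or("-n requires a value")?;
                let n = v
                    .parse::<usize>()
                    .map_err(|e| format!("invalid -n value: {}", e))?;
                num = Some(n);
            }
            "--seed" => {
                let v = iter.next().ok_or("--seed requires a value")?;
                let s = v
                    .parse::<u64>()
                    .map_err(|e| format!("invalid --seed value: {}", e))?;
                seed = Some(s);
            }
            // The first bare argument is treated as the input file.
            other if file.is_none() => file = Some(other.to_string()),
            other => return Err(format!("unexpected argument: {}", other)),
        }
    }
    // -n is documented as required, so its absence is an error.
    let num = num.ok_or("-n <NUM> is required")?;
    Ok(Config { num, seed, file })
}
```

In a real binary this would be called as `parse_args(std::env::args().skip(1))`, with a missing `file` meaning "read from standard input" and a missing `seed` meaning "seed the generator randomly".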

🧪 Testing

This project includes unit and integration tests:

cargo test

📝 License

Licensed under the MIT License.

🤝 Contributing

Issues and pull requests are welcome! If you have an idea, a feature request, or a bug report, feel free to open an issue or PR.