sample-lines
samp is a fast command-line tool that randomly samples lines from a file or standard input using reservoir sampling. It samples uniformly without replacement: every line is equally likely to be chosen, and no line is chosen more than once.
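To make the idea concrete, here is a minimal sketch of reservoir sampling (the classic "Algorithm R") in Rust. It is an illustration under stated assumptions, not samp's actual source: the names `SplitMix64` and `reservoir_sample` are hypothetical, and a tiny hand-rolled generator stands in for whatever random number generator the tool really uses, so that the example depends only on the standard library.

```rust
/// Small SplitMix64 generator (an assumption for this sketch; samp may use
/// a different RNG). The same seed always yields the same sequence.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Returns a value in 0..bound (bound must be non-zero).
    /// Plain modulo carries a tiny bias, which is acceptable in a sketch.
    fn below(&mut self, bound: u64) -> u64 {
        self.next_u64() % bound
    }
}

/// Algorithm R: keep the first `k` lines, then let line `i` (0-based)
/// overwrite a random reservoir slot with probability k / (i + 1).
/// Only `k` lines are held in memory, however long the input is.
fn reservoir_sample<I>(lines: I, k: usize, seed: u64) -> Vec<String>
where
    I: IntoIterator<Item = String>,
{
    let mut rng = SplitMix64(seed);
    let mut reservoir: Vec<String> = Vec::with_capacity(k);
    for (i, line) in lines.into_iter().enumerate() {
        if reservoir.len() < k {
            // Fill phase: the first k lines are always kept.
            reservoir.push(line);
        } else {
            // j is drawn from 0..=i, so it falls below k with probability k / (i + 1).
            let j = rng.below(i as u64 + 1) as usize;
            if j < k {
                reservoir[j] = line;
            }
        }
    }
    reservoir
}
```

Because the input is consumed in a single pass, this approach works equally well on a file and on a stream whose length is not known in advance. If the input has fewer than `k` lines, every line is returned.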
Good for:
- Downsampling large datasets
- Sampling logs for debugging
- Creating reproducible random subsets of data
Installation
If you have Rust installed, you can install samp with:
Or build it from source:
Usage
Here are a few examples:
Options
| Option | Description |
|---|---|
| `-n <NUM>` | Number of lines to sample (required) |
| `--seed <SEED>` | Optional seed for reproducible sampling |
| `-p, --preserve-headers [N]` | Preserve the first N lines as headers (default: 1 if flag is used) |
| `-h, --help` | Show help message |
| `--version` | Show the version number |
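As a rough illustration of what preserving headers means, the sketch below sets aside the first N lines, passes only the remaining lines to a sampler, and emits the headers first. The function names `split_headers` and `sample_with_headers` are hypothetical and the sampler is supplied as a closure; this shows the described behavior in outline and is not taken from samp's implementation.

```rust
/// Splits off the first `n` lines as headers and returns (headers, body).
/// If the input has fewer than `n` lines, all of them count as headers.
fn split_headers(mut lines: Vec<String>, n: usize) -> (Vec<String>, Vec<String>) {
    let cut = n.min(lines.len());
    let body = lines.split_off(cut);
    (lines, body)
}

/// Emits the headers unchanged, followed by whatever `sample` selects
/// from the body. Header lines never compete for a place in the sample.
fn sample_with_headers<F>(lines: Vec<String>, n_headers: usize, sample: F) -> Vec<String>
where
    F: FnOnce(Vec<String>) -> Vec<String>,
{
    let (mut out, body) = split_headers(lines, n_headers);
    out.extend(sample(body));
    out
}
```

For a CSV file with a single header row, this keeps the column names at the top of the output while only the data rows are sampled.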
Testing
License
Licensed under the MIT License.
Contributing
Issues and pull requests welcome! If you have an idea, a feature request, or a bug report, feel free to open an issue or PR.