disty
disty is a quick CLI for getting an idea of the distribution of a list of numbers.
disty is a rewrite of Michael Knyszek's distx, which is itself an extension of Austin Clements' dist. I use distx for all sorts of quick numeric checks, from checking the distribution of database segments to analyzing request latency. I ran into performance issues when processing large lists (>1M records), which inspired the rewrite.
Compared to distx, this version:
- Is very fast: ~55x faster in local testing. This boils down to parallelizing the KDE plotting, using `mmap` to parallelize parsing the number list, and reducing unnecessary copying.
- Has marginally better plotting, which mostly comes down to setting a higher resolution than distx uses by default.
- Has fewer features. I haven't ported the output options or alternative plotting (CDFs), because I don't really use them.
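To give a feel for the parallel parsing idea mentioned above, here is a minimal sketch using only the standard library. It is not disty's actual code: the function names (`split_at_newlines`, `parse_chunk`, `parse_parallel`) are hypothetical, and an in-memory byte slice stands in for a memory-mapped file. The point is that once the input is one contiguous byte buffer, it can be cut at newline boundaries and each piece parsed on its own thread.

```rust
use std::thread;

/// Split `data` into roughly `n` chunks, each ending on a newline boundary
/// (or at the end of input), so that no number is cut in half.
fn split_at_newlines(data: &[u8], n: usize) -> Vec<&[u8]> {
    let mut chunks = Vec::new();
    let target = (data.len() / n.max(1)).max(1);
    let mut start = 0;
    while start < data.len() {
        let mut end = (start + target).min(data.len());
        // Extend the chunk until it ends just after a newline.
        while end < data.len() && data[end - 1] != b'\n' {
            end += 1;
        }
        chunks.push(&data[start..end]);
        start = end;
    }
    chunks
}

/// Parse every whitespace-separated token that is a valid f64.
/// Assumption for this sketch: unparsable tokens are skipped, and a chunk
/// that is not valid UTF-8 contributes no numbers.
fn parse_chunk(chunk: &[u8]) -> Vec<f64> {
    std::str::from_utf8(chunk)
        .unwrap_or("")
        .split_whitespace()
        .filter_map(|tok| tok.parse::<f64>().ok())
        .collect()
}

/// Parse the buffer on scoped threads, one per chunk, preserving input order.
fn parse_parallel(data: &[u8], threads: usize) -> Vec<f64> {
    let chunks = split_at_newlines(data, threads);
    thread::scope(|s| {
        let handles: Vec<_> = chunks
            .iter()
            .map(|&chunk| s.spawn(move || parse_chunk(chunk)))
            .collect();
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("parser thread panicked"))
            .collect()
    })
}

fn main() {
    // Stand-in for the bytes of a memory-mapped input file.
    let input = b"1.5\n2\n3.25\nnot-a-number\n4\n";
    let values = parse_parallel(input, 2);
    println!("{values:?}");
}
```

Scoped threads let each worker borrow its slice of the buffer directly, so nothing is copied before parsing. With a real memory map the same borrowing pattern would apply to the mapped bytes.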
Installing
Install via Cargo.
# From Crates.io
# From source
Usage
Development
Running Tests
Running Benchmarks
The project includes criterion benchmarks for parsing, statistics computation, and KDE evaluation:
# Run all benchmarks
# Run specific benchmark suite
# Run benchmarks with different sample sizes
Benchmarks run at several input sizes (1K, 10K, 100K, 1M elements) to show how performance scales.