Powerlaw
A Rust library and command-line tool for analyzing power-law distributions in empirical data.
Overview
powerlaw is a high-performance Rust library for parameter estimation and hypothesis testing on power-law distributed data. Such distributions arise in many fields, from the natural sciences to the social sciences.
The methodology closely follows the techniques and statistical framework described in the paper 'Power-Law Distributions in Empirical Data' by Aaron Clauset, Cosma Rohilla Shalizi, and M. E. J. Newman (SIAM Review, 2009).
Features
- Parameter Estimation: Estimates the parameters (x_min, alpha) of a power-law distribution from data.
- Goodness-of-Fit: Uses the Kolmogorov-Smirnov (KS) statistic to find the best-fitting parameters.
- Hypothesis Testing: Performs a hypothesis test to determine if the power-law model is a plausible fit for the data.
- High Performance: Computationally intensive tasks are parallelized using Rayon for significant speedups.
- Dual Use: Can be used as a simple command-line tool or as a library in other Rust projects.
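As an illustration of the estimation and goodness-of-fit steps, here is a minimal sketch of the continuous-data procedure described by Clauset et al.: for each candidate x_min, estimate alpha by maximum likelihood, measure the KS distance between the tail and the fitted model, and keep the candidate with the smallest distance. This is not the library's API. The names `alpha_mle`, `ks_statistic`, and `fit_power_law` are hypothetical, and the sketch is sequential, while the library parallelizes its heavy work with Rayon.

```rust
/// Maximum-likelihood estimate of the exponent for a continuous power law:
/// alpha = 1 + n / sum(ln(x_i / x_min)), taken over the tail x_i >= x_min.
fn alpha_mle(tail: &[f64], x_min: f64) -> f64 {
    let log_sum: f64 = tail.iter().map(|&x| (x / x_min).ln()).sum();
    1.0 + tail.len() as f64 / log_sum
}

/// KS distance between the empirical CDF of the sorted tail and the fitted
/// model CDF F(x) = 1 - (x / x_min)^(1 - alpha).
fn ks_statistic(sorted_tail: &[f64], x_min: f64, alpha: f64) -> f64 {
    let n = sorted_tail.len() as f64;
    sorted_tail
        .iter()
        .enumerate()
        .map(|(i, &x)| {
            let model = 1.0 - (x / x_min).powf(1.0 - alpha);
            // The empirical CDF jumps from i/n to (i + 1)/n at this point.
            let below = (model - i as f64 / n).abs();
            let above = ((i + 1) as f64 / n - model).abs();
            below.max(above)
        })
        .fold(0.0_f64, f64::max)
}

/// Scans every distinct value as a candidate x_min and returns the
/// (x_min, alpha, ks) triple with the smallest KS distance, if any.
fn fit_power_law(data: &[f64]) -> Option<(f64, f64, f64)> {
    let mut sorted: Vec<f64> = data
        .iter()
        .copied()
        .filter(|x| x.is_finite() && *x > 0.0)
        .collect();
    sorted.sort_by(|a, b| a.total_cmp(b));

    let mut best: Option<(f64, f64, f64)> = None;
    for start in 0..sorted.len() {
        // Evaluate each distinct value only once.
        if start > 0 && sorted[start] == sorted[start - 1] {
            continue;
        }
        let tail = &sorted[start..];
        let x_min = sorted[start];
        // The estimate needs at least two points that are not all equal.
        if tail.len() < 2 || tail[tail.len() - 1] <= x_min {
            continue;
        }
        let alpha = alpha_mle(tail, x_min);
        let ks = ks_statistic(tail, x_min, alpha);
        if best.map_or(true, |(_, _, best_ks)| ks < best_ks) {
            best = Some((x_min, alpha, ks));
        }
    }
    best
}
```

Calling `fit_power_law(&data)` on a slice of positive values returns `None` when no candidate tail is usable. The alpha returned here is the generic power-law exponent in Cx^(-alpha).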
Requirements
Rust 2021 edition or later.
Installation
You can install the CLI tool directly from the Git repository:
Or from crates.io:
CLI Usage
The powerlaw CLI provides two main subcommands: fit and test.
fit subcommand
Use fit to perform the initial analysis, finding the maximum likelihood estimates for the x_min and alpha parameters. This command does not perform the computationally intensive hypothesis test.
Command:
Example:
$ powerlaw fit Data/reference_data/blackouts.txt
Data: Data/reference_data/blackouts.txt
n: 211
Pareto Type I parameters - alpha: 1.2726372198302858 x_min 230000.0 KS stat: 0.06067379629443781 tail length: 59
Generic Power-Law [Cx^(-alpha)] parameters - alpha: 2.272637219830286 x_min 230000.0 KS stat: 0.06067379629443781 tail length: 59
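The two parameter lines in the output describe the same fit in two parameterizations: the generic alpha is the Pareto Type I alpha plus one, and x_min, the KS statistic, and the tail length are identical. The sketch below writes out both densities in their textbook forms to show that they coincide under that shift. The function names are hypothetical and are not part of the powerlaw crate.

```rust
/// Pareto Type I density with shape `shape` and scale `x_min`:
/// f(x) = shape * x_min^shape / x^(shape + 1) for x >= x_min.
fn pareto_pdf(x: f64, x_min: f64, shape: f64) -> f64 {
    if x < x_min {
        return 0.0;
    }
    shape * x_min.powf(shape) / x.powf(shape + 1.0)
}

/// Generic power-law density p(x) = C * x^(-alpha) for x >= x_min,
/// with normalizing constant C = (alpha - 1) * x_min^(alpha - 1).
fn power_law_pdf(x: f64, x_min: f64, alpha: f64) -> f64 {
    if x < x_min {
        return 0.0;
    }
    let c = (alpha - 1.0) * x_min.powf(alpha - 1.0);
    c * x.powf(-alpha)
}

/// Converts a Pareto Type I shape into the generic power-law exponent.
fn pareto_shape_to_alpha(shape: f64) -> f64 {
    shape + 1.0
}
```

With the values from the example, `pareto_shape_to_alpha(1.2726372198302858)` gives the generic exponent of about 2.2726, and the two density functions agree at any x at or above x_min.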
test subcommand
Use test to perform the full analysis, including the hypothesis test to determine if the data is plausibly drawn from a power-law distribution. This command requires a --precision argument for the p-value calculation.
Caution: This command can be very slow, depending on the size of the data set and the requested precision.
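Much of the cost comes from the number of synthetic data sets that the test has to generate and refit. Clauset et al. suggest roughly 1/(4 * precision^2) data sets for a p-value accurate to within the given precision. For a precision of 0.01 that is 2500, which matches the M = 2500 reported in the example output. A minimal sketch of that rule, assuming the CLI follows it (the function name `num_simulations` is hypothetical, and rounding to the nearest integer is an assumption):

```rust
/// Number of synthetic data sets suggested by Clauset et al. for a p-value
/// accurate to within `precision`: M = 1 / (4 * precision^2).
/// Rounding to the nearest integer is an assumption of this sketch.
fn num_simulations(precision: f64) -> usize {
    (0.25 / (precision * precision)).round() as usize
}
```

Because M grows with the inverse square of the precision, halving the precision value roughly quadruples the number of simulated data sets.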
Command:
Example:
$ powerlaw test Data/reference_data/blackouts.txt --precision 0.01
Data: Data/reference_data/blackouts.txt
Precision: 0.01
n: 211
Pareto Type I parameters - alpha: 1.2726372198302858 x_min 230000.0 KS stat: 0.06067379629443781 tail length: 59
Generic Power-Law [Cx^(-alpha)] parameters - alpha: 2.272637219830286 x_min 230000.0 KS stat: 0.06067379629443781 tail length: 59
Calculating the degree of uncertainty of the parameters...
x_min std: 75388.370780452 alpha std: 0.2543727775138083
Testing the null hypothesis H0 that a Power-Law is a plausible fit to the data...
Generating M = 2500 simulated datasets of length n = 211 with tail size 59 and probability of the tail P(tail|data) = 0.2796208530805687
Qty of simulations with KS statistic > empirical data = 1941
p-value: 0.7764
Fail to reject the null H0: Power-Law distribution is a plausible fit to the data.
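The figures in this output are consistent with the semi-parametric bootstrap described by Clauset et al. Each synthetic point is drawn from the fitted power law with probability P(tail|data) = 59/211, and is otherwise resampled from the observed values below x_min. The p-value is the fraction of synthetic data sets whose KS statistic exceeds the empirical one, here 1941/2500 = 0.7764. The sketch below illustrates that procedure under those assumptions and is not the crate's implementation. All names are hypothetical, and the uniform random source is passed in as a closure so that the example needs no external crates.

```rust
/// Inverse-transform sample from a continuous power law with exponent
/// `alpha` above `x_min`, given a uniform draw `u` in [0, 1).
fn sample_power_law(u: f64, x_min: f64, alpha: f64) -> f64 {
    x_min * (1.0 - u).powf(-1.0 / (alpha - 1.0))
}

/// Builds one synthetic data set of length `n`. `body` holds the observed
/// values below x_min; `uniform` must return draws in [0, 1).
fn synthetic_dataset(
    body: &[f64],
    n: usize,
    p_tail: f64,
    x_min: f64,
    alpha: f64,
    uniform: &mut impl FnMut() -> f64,
) -> Vec<f64> {
    let mut out = Vec::with_capacity(n);
    for _ in 0..n {
        if body.is_empty() || uniform() < p_tail {
            // Tail point: draw from the fitted power law.
            out.push(sample_power_law(uniform(), x_min, alpha));
        } else {
            // Body point: pick one observed value uniformly at random.
            let idx = (uniform() * body.len() as f64) as usize;
            out.push(body[idx.min(body.len() - 1)]);
        }
    }
    out
}

/// Fraction of simulations whose KS statistic exceeded the empirical one.
fn p_value(exceed_count: usize, simulations: usize) -> f64 {
    exceed_count as f64 / simulations as f64
}
```

Clauset et al. rule out the power law when the p-value is at most 0.1. Whether the CLI applies the same threshold is not stated here, but a p-value of 0.7764 is far above it, in line with the conclusion printed above.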
Getting Help
# General help
# Help for a specific subcommand
Library Usage
You can also use powerlaw as a library in your own Rust projects.
1. Add to Cargo.toml:
[]
= "0.0.3" # Or the version you need
2. Example:
use ;
Building from Source
# Clone the repository
# Build in release mode
cargo build --release
# Run tests
cargo test
# Run benchmarks
cargo bench
License
This project is licensed under either of
- Apache License, Version 2.0, (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.