# Powerlaw
A Rust library and command-line tool for analyzing power-law distributions in empirical data.
## Overview
`powerlaw` is a high-performance Rust library for estimating the parameters of power-law distributed data and testing whether a power law is a plausible model for that data. Such distributions arise in many fields, from the natural to the social sciences.
The methodology closely follows the techniques and statistical framework described in the paper ['Power-Law Distributions in Empirical Data'](https://doi.org/10.1137/070710111) by Aaron Clauset, Cosma Rohilla Shalizi, and M. E. J. Newman.
## Features
- **Parameter Estimation**: Estimates the parameters (`x_min`, `alpha`) of a power-law distribution from data.
- **Goodness-of-Fit**: Uses the Kolmogorov-Smirnov (KS) statistic to select the best-fitting (`x_min`, `alpha`) pair.
- **Hypothesis Testing**: Performs a hypothesis test to determine whether the power-law model is a plausible fit for the data.
- **High Performance**: Computationally intensive tasks are parallelized using Rayon for significant speedups.
- **Dual Use**: Can be used as a simple command-line tool or as a library in other Rust projects.
## Installation
You can install the CLI tool directly from the Git repository:
```bash
cargo install --git https://github.com/aulichny3/powerlaw.git
```
Once published, it can be installed from [crates.io](https://crates.io):
```bash
cargo install powerlaw
```
## CLI Usage
The `powerlaw` CLI provides two main subcommands: `fit` and `test`.
### `fit` subcommand
Use `fit` to perform the initial analysis: it computes the maximum likelihood estimate of `alpha` for each candidate `x_min` and reports the (`x_min`, `alpha`) pair with the smallest KS statistic. This command does not perform the computationally intensive hypothesis test.
**Command:**
```bash
powerlaw fit <FILEPATH>
```
**Example:**
```
$ powerlaw fit Data/reference_data/blackouts.txt
Data: Data/reference_data/blackouts.txt
n: 211
Pareto Type I parameters - alpha: 1.2726372198302858 x_min 230000.0 KS stat: 0.06067379629443781 tail length: 59
Generic Power-Law [Cx^(-alpha)] parameters - alpha: 2.272637219830286 x_min 230000.0 KS stat: 0.06067379629443781 tail length: 59
```
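The two parameter lines describe the same fit under two conventions. The Pareto Type I `alpha` is the exponent of the survival function, while the generic form `Cx^(-alpha)` uses the exponent of the density, which is larger by exactly one (here 1.2726... versus 2.2726...). The sketch below illustrates this with the continuous maximum likelihood estimator from Clauset et al. for a fixed `x_min`. The function names `pareto_shape_mle` and `generic_exponent` are illustrative assumptions and are not part of the crate's API.

```rust
/// Continuous maximum likelihood estimate of the Pareto Type I shape
/// for a fixed x_min: n / sum(ln(x_i / x_min)) over the tail x_i >= x_min.
/// Hypothetical helper for illustration, not the crate's API.
fn pareto_shape_mle(data: &[f64], x_min: f64) -> Option<f64> {
    let tail: Vec<f64> = data.iter().copied().filter(|&x| x >= x_min).collect();
    let log_sum: f64 = tail.iter().map(|&x| (x / x_min).ln()).sum();
    if tail.is_empty() || log_sum <= 0.0 {
        return None; // the estimate is undefined without spread in the tail
    }
    Some(tail.len() as f64 / log_sum)
}

/// Exponent of the density C * x^(-alpha): the Pareto Type I shape plus one.
fn generic_exponent(pareto_shape: f64) -> f64 {
    pareto_shape + 1.0
}

fn main() {
    // Made-up values; 0.5 lies below x_min and is excluded from the tail.
    let data = [1.0, 2.0, 4.0, 8.0, 0.5];
    let shape = pareto_shape_mle(&data, 1.0).expect("tail must have spread");
    println!("Pareto Type I alpha: {shape}");
    println!("Generic power-law alpha: {}", generic_exponent(shape));
}
```

Both lines of the `fit` output share the same `x_min`, KS statistic, and tail length because only the parameterization of the exponent differs.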
### `test` subcommand
Use `test` to perform the full analysis, including the hypothesis test that determines whether the data is plausibly drawn from a power-law distribution. This command requires a `--precision` argument, which sets the desired accuracy of the p-value.
**Caution:** This command can be very slow, depending on the size of the data and the requested precision.
**Command:**
```bash
powerlaw test <FILEPATH> --precision <VALUE>
```
**Example:**
```
$ powerlaw test Data/reference_data/blackouts.txt --precision 0.01
Data: Data/reference_data/blackouts.txt
Precision: 0.01
n: 211
Pareto Type I parameters - alpha: 1.2726372198302858 x_min 230000.0 KS stat: 0.06067379629443781 tail length: 59
Generic Power-Law [Cx^(-alpha)] parameters - alpha: 2.272637219830286 x_min 230000.0 KS stat: 0.06067379629443781 tail length: 59
Calculating the level of uncertainty of the parameters...
x_min std: 83965.71686981615 alpha std: 0.25568337126755664
Testing the hypothesis that a powerlaw is a plausible fit to the data...
Generating 2500 simulations of length 211 for size 527500
SimParams { num_sims_m: 2500, sim_len_n: 211, n_tail: 58, p_tail: 0.27488151658767773 }
Qty of sims with KS statistic > empirical data 1958
Total sims 2500
p-value: 0.7832
Powerlaw distribution is a plausible fit to the data.
```
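The figures in this output are consistent with the rule of thumb from Clauset et al. that roughly `1 / (4 * precision^2)` synthetic datasets are needed for a p-value accurate to within `precision`. A precision of 0.01 gives 2500 simulations, each of length n = 211, and the p-value is the fraction of simulations whose KS statistic exceeds the empirical one (1958 / 2500 = 0.7832). The following is a minimal sketch of that arithmetic. It assumes the crate follows this rule, which is inferred from the sample output rather than from the source, and the function names are hypothetical.

```rust
/// Number of synthetic datasets suggested by the 1 / (4 * eps^2) rule of thumb.
/// Assumption: inferred from the sample output, not taken from the crate's source.
fn num_simulations(precision: f64) -> usize {
    // Rounding guards against floating-point error in the division.
    (1.0 / (4.0 * precision * precision)).round() as usize
}

/// p-value as the fraction of simulations whose KS statistic exceeds
/// the KS statistic of the empirical data.
fn p_value(sims_exceeding: usize, total_sims: usize) -> f64 {
    sims_exceeding as f64 / total_sims as f64
}

fn main() {
    let sims = num_simulations(0.01);
    let sample_len = 211; // n from the example output
    println!("simulations: {sims}");
    println!("total simulated values: {}", sims * sample_len);
    println!("p-value: {}", p_value(1958, sims));
}
```

Under this rule, halving the precision value quadruples the number of simulations, which is why small precision values make the command slow.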
### Getting Help
```bash
# General help
powerlaw --help
# Help for a specific subcommand
powerlaw test --help
```
## Library Usage
You can also use `powerlaw` as a library in your own Rust projects.
**1. Add to `Cargo.toml`:**
```toml
[dependencies]
powerlaw = "0.0.1" # Or the version you need
```
**2. Example:**
```rust
use powerlaw::{dist, util};

fn main() {
    // 1. Read your data into a Vec<f64>
    let mut data = util::read_csv("path/to/your/data.csv").unwrap();

    // 2. Find the MLE alphas for all potential x_mins
    let alphas = dist::pareto::find_alphas_fast(&mut data);

    // 3. Find the best fit (x_min, alpha) pair based on the KS statistic
    let best_fit = dist::pareto::gof(&data, &alphas.0, &alphas.1);
    println!("Best fit found: x_min = {}, alpha = {}", best_fit.x_min, best_fit.alpha);

    // 4. Optionally, run the hypothesis test
    let precision = 0.01;
    let h_0 = dist::pareto::hypothesis_test(
        data,
        precision,
        best_fit.alpha,
        best_fit.x_min,
        best_fit.D,
    );
    println!("Hypothesis test p-value: {}", h_0.p);

    if h_0.p > 0.1 {
        println!("The power-law model is a plausible fit.");
    } else {
        println!("The power-law model is not a plausible fit.");
    }
}
```
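Steps 2 and 3 correspond to the `x_min` selection procedure described by Clauset et al.: for each candidate `x_min`, estimate `alpha` by maximum likelihood on the tail, measure the KS distance between the tail's empirical CDF and the fitted Pareto CDF, and keep the pair with the smallest distance. The standalone sketch below shows that idea using only the standard library. It is an illustration under stated assumptions, not the crate's implementation: the names `Fit`, `ks_distance`, and `scan_x_min` are hypothetical, and it makes no attempt at the optimizations or parallelism the crate provides.

```rust
/// Result of fitting one candidate x_min (hypothetical type, not the crate's).
#[derive(Debug, Clone, Copy)]
struct Fit {
    x_min: f64,
    alpha: f64, // Pareto Type I shape
    ks: f64,
}

/// KS distance between the empirical CDF of a sorted tail and the
/// Pareto Type I CDF F(x) = 1 - (x_min / x)^alpha.
fn ks_distance(sorted_tail: &[f64], x_min: f64, alpha: f64) -> f64 {
    let n = sorted_tail.len() as f64;
    sorted_tail
        .iter()
        .enumerate()
        .map(|(i, &x)| {
            let model = 1.0 - (x_min / x).powf(alpha);
            // Compare against the empirical CDF just before and just after x.
            let below = (model - i as f64 / n).abs();
            let above = ((i + 1) as f64 / n - model).abs();
            below.max(above)
        })
        .fold(0.0, f64::max)
}

/// Try every distinct data value as x_min and keep the fit with the
/// smallest KS distance.
fn scan_x_min(data: &[f64]) -> Option<Fit> {
    let mut sorted: Vec<f64> = data.iter().copied().filter(|x| *x > 0.0).collect();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mut best: Option<Fit> = None;
    for start in 0..sorted.len() {
        if start > 0 && sorted[start] == sorted[start - 1] {
            continue; // same candidate as the previous iteration
        }
        let x_min = sorted[start];
        let tail = &sorted[start..];
        let log_sum: f64 = tail.iter().map(|&x| (x / x_min).ln()).sum();
        if log_sum <= 0.0 {
            continue; // MLE is undefined when every tail value equals x_min
        }
        let alpha = tail.len() as f64 / log_sum;
        let ks = ks_distance(tail, x_min, alpha);
        if best.map_or(true, |b| ks < b.ks) {
            best = Some(Fit { x_min, alpha, ks });
        }
    }
    best
}

fn main() {
    // Made-up values for illustration only.
    let data = [1.0, 1.5, 2.0, 3.0, 5.0, 8.0, 13.0, 40.0];
    match scan_x_min(&data) {
        Some(fit) => println!("{fit:?}"),
        None => println!("not enough distinct values to fit"),
    }
}
```

This sketch recomputes the log sum for every candidate, so its cost grows quadratically with the number of observations. It also accepts very short tails, which can produce a small KS distance by chance, so treat it as a reading aid rather than a replacement for the library functions.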
## Building from Source
```bash
# Clone the repository
git clone https://github.com/<your-username>/powerlaw.git
cd powerlaw
# Build in release mode
cargo build --release
# Run tests
cargo test
# Run benchmarks
cargo bench
```
## License
This project is licensed under either of
- Apache License, Version 2.0, ([LICENSE-APACHE](./LICENSE-APACHE) or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license ([LICENSE-MIT](./LICENSE-MIT) or http://opensource.org/licenses/MIT)
at your option.