# lawkit
[日本語](README.ja.md)
Statistical law analysis toolkit. Checks data against Benford's law, the Pareto principle, Zipf's law, and the normal and Poisson distributions to detect anomalies and assess data quality.
## Installation
```bash
cargo install lawkit
```
## Supported Laws
### Benford's Law (Fraud Detection)
```bash
$ lawkit benf financial_data.csv
Benford Law Analysis Results
Dataset: financial_data.csv
Numbers analyzed: 1000
[LOW] Dataset analysis
First Digit Distribution:
1: ███████████████┃░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 30.1% (expected: 30.1%)
2: █████████┃░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 17.6% (expected: 17.6%)
3: ██████┃░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 12.5% (expected: 12.5%)
...
```
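The expected percentages in the output above match the standard Benford formula, P(d) = log10(1 + 1/d). The following Python sketch reproduces those values; the function names are illustrative and are not part of lawkit.

```python
import math
from decimal import Decimal


def benford_expected(digit: int) -> float:
    """Expected probability that the first significant digit equals `digit`."""
    if not 1 <= digit <= 9:
        raise ValueError("first digit must be between 1 and 9")
    return math.log10(1 + 1 / digit)


def first_digit(value: float) -> int:
    """First significant digit of a non-zero number (illustrative helper)."""
    if value == 0:
        raise ValueError("zero has no first significant digit")
    # Decimal drops leading zeros, so digits[0] is the leading digit.
    return Decimal(str(abs(value))).as_tuple().digits[0]


if __name__ == "__main__":
    for d in range(1, 10):
        print(f"{d}: {benford_expected(d) * 100:.1f}%")
```

Comparing these expected shares with the observed first-digit counts of a dataset is the basic idea behind a Benford check; how lawkit scores the deviation is not shown here.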
### Pareto Principle (80/20 Rule)
```bash
$ lawkit pareto sales.csv
Pareto Principle (80/20 Rule) Analysis Results
Dataset: sales.csv
Numbers analyzed: 500
[LOW] Dataset analysis
Lorenz Curve (Cumulative Distribution):
20%: ███████████████████████████████████████┃░░░░░░░░░░ 79.2% cumulative (80/20 point)
40%: █████████████████████████████████████████████░░░░░ 91.5% cumulative
...
80/20 Rule: Top 20% owns 79.2% of total wealth (Ideal: 80.0%, Ratio: 0.99)
```
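The "80/20 point" in the output above is the share of the total held by the largest 20% of values. A minimal Python sketch of that calculation follows; `top_share` is a hypothetical helper, and lawkit's own rounding and tie handling may differ.

```python
def top_share(values, top_fraction=0.2):
    """Share of the total contributed by the largest `top_fraction` of values."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("total must be positive")
    # Always include at least one item so tiny datasets still yield a result.
    top_count = max(1, round(len(ordered) * top_fraction))
    return sum(ordered[:top_count]) / total


if __name__ == "__main__":
    sales = [80, 5, 5, 5, 5]
    print(f"Top 20% share: {top_share(sales):.1%}")
```

The reported ratio of 0.99 is consistent with dividing the observed share (79.2%) by the ideal 80.0%.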
### Zipf's Law (Frequency Distribution)
```bash
$ lawkit zipf word_frequencies.csv
Zipf Law Analysis Results
Dataset: word_frequencies.csv
Numbers analyzed: 1000
[LOW] Dataset analysis
Rank-Frequency Distribution:
# 1: █████████████████████████████████████████████████┃ 11.50% (expected: 11.50%)
# 2: █████████████████████████┃░░░░░░░░░░░░░░░░░░░░░░░ 5.75% (expected: 5.75%)
# 3: █████████████████┃░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 3.83% (expected: 3.83%)
...
Zipf Exponent: 1.02 (ideal: 1.0), Correlation: 0.998
```
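Under Zipf's law with exponent s, the frequency at rank r is proportional to 1/r^s, which is why the expected values above halve at rank 2 and drop to a third at rank 3. The Python sketch below shows that relation and one common way to estimate the exponent (a least-squares fit on log-log data). Both function names are illustrative, and the fitting method is an assumption rather than lawkit's documented algorithm.

```python
import math


def zipf_expected(top_frequency: float, rank: int, exponent: float = 1.0) -> float:
    """Expected frequency at `rank`, given the frequency observed at rank 1."""
    if rank < 1:
        raise ValueError("rank starts at 1")
    return top_frequency / rank ** exponent


def estimate_exponent(frequencies) -> float:
    """Estimate the Zipf exponent as the negated slope of log(freq) vs log(rank)."""
    if len(frequencies) < 2:
        raise ValueError("need at least two ranks")
    xs = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log(freq) for freq in frequencies]  # frequencies must be > 0
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return -sxy / sxx


if __name__ == "__main__":
    for rank in (1, 2, 3):
        print(f"#{rank}: {zipf_expected(11.5, rank):.2f}%")
```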
### Normal Distribution (Quality Control)
```bash
$ lawkit normal measurements.csv
Normal Distribution Analysis Results
Dataset: measurements.csv
Numbers analyzed: 200
Quality Level: High
Distribution Histogram:
-2.50- -1.89: ██████┃░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 11.5%
-1.89- -1.28: █████████████████┃░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 34.0%
-1.28- -0.67: ███████████████████████████████████┃░░░░░░░░░░░░░░ 69.8%
...
Distribution: μ=0.02, σ=1.01, Range: [-2.89, 3.12]
1σ: 68.5%, 2σ: 95.5%, 3σ: 99.7%
```
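The 1σ/2σ/3σ line above can be read against the theoretical coverage of a normal distribution, roughly 68.3%, 95.4%, and 99.7%. The Python sketch below computes both the theoretical and the empirical coverage. The helper names are hypothetical, and using the population standard deviation is an assumption made for this example.

```python
import math
import statistics


def normal_coverage(k: float) -> float:
    """Theoretical fraction of a normal distribution within k standard deviations."""
    return math.erf(k / math.sqrt(2))


def empirical_coverage(values, k: float) -> float:
    """Observed fraction of `values` within k standard deviations of their mean."""
    mean = statistics.fmean(values)
    sigma = statistics.pstdev(values)  # population std dev (assumption)
    inside = sum(1 for v in values if abs(v - mean) <= k * sigma)
    return inside / len(values)


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(f"{k} sigma: {normal_coverage(k) * 100:.1f}% expected")
```

A dataset whose empirical coverage stays close to these theoretical values at each level behaves approximately normally, at least by this measure.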
### Poisson Distribution (Rare Events)
```bash
$ lawkit poisson events.csv
Poisson Distribution Analysis Results
Dataset: events.csv
Numbers analyzed: 100
Quality Level: High
Probability Distribution:
P(X= 0): ██████████████████┃░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0.095
P(X= 1): ███████████████████████████████████████████┃░░░░░░ 0.224
P(X= 2): █████████████████████████████████████████████████┃ 0.263
...
λ=2.35, Variance/Mean=1.02 (ideal: 1.0), Fit Score=0.95
```
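The probabilities shown above agree with the Poisson probability mass function, P(X = k) = e^(-λ) λ^k / k!, evaluated at λ = 2.35. The Python sketch below reproduces them and also computes the variance-to-mean ratio, which is 1.0 for an ideal Poisson process. The function names are illustrative, and the choice of sample variance is an assumption.

```python
import math
import statistics


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of observing exactly k events when the mean rate is lam."""
    if k < 0:
        raise ValueError("k must be non-negative")
    return math.exp(-lam) * lam ** k / math.factorial(k)


def dispersion_index(counts) -> float:
    """Variance divided by mean; values near 1.0 suggest Poisson-like data."""
    return statistics.variance(counts) / statistics.mean(counts)


if __name__ == "__main__":
    for k in range(3):
        print(f"P(X={k}) = {poisson_pmf(k, 2.35):.3f}")
```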
## Usage
```bash
# Single law analysis
lawkit benf data.csv # Benford's law
lawkit pareto data.csv # Pareto principle
lawkit zipf data.csv # Zipf's law
lawkit normal data.csv # Normal distribution
lawkit poisson data.csv # Poisson distribution
# Multi-law analysis
lawkit analyze data.csv # Run all applicable laws
lawkit validate data.csv # Data validation
lawkit diagnose data.csv # Detailed diagnostics
# Generate test data
lawkit generate benf -s 1000
lawkit generate pareto -s 500
# Utility
lawkit list # List available laws
lawkit selftest # Run self-test
```
## Input Sources
```bash
lawkit benf data.csv # File
lawkit benf https://example.com/data.json # URL
```
Formats: CSV, JSON, YAML, plain text (one number per line)
## Main Options
```bash
-f, --format <FORMAT> # Output: text, csv, json, yaml, toml, xml
-q, --quiet # Minimal output
-v, --verbose # Detailed output
--filter <RANGE> # Filter numbers (e.g., >=100, <1000, 50-500)
-c, --min-count <N> # Minimum data count (default: 10)
--no-color # Disable colors
```
## Risk Levels & Exit Codes
| Level | Description | Exit Code |
|-------|-------------|-----------|
| LOW | Data conforms to expected distribution | 0 |
| MEDIUM | Minor deviation, likely normal | 0 |
| HIGH | Significant deviation (p ≤ 0.05) | 10 |
| CRITICAL | Severe anomaly (p ≤ 0.01) | 11 |
## CI/CD Usage
```bash
# Detect anomalies in financial data
if ! lawkit benf transactions.csv --quiet; then
echo "Anomaly detected"
lawkit benf transactions.csv --format json > report.json
fi
# Validate distribution
lawkit validate data.csv --cross-validation
```
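For pipelines driven from a script rather than a shell, the exit codes listed under Risk Levels & Exit Codes can be mapped back to risk levels. The Python sketch below is one way to do that. `interpret_exit_code` and `run_benford_check` are hypothetical helpers, and any exit code not listed in the table is reported as unknown rather than guessed.

```python
import subprocess

# Exit codes as documented in the Risk Levels & Exit Codes section.
EXIT_CODE_LEVELS = {
    0: "LOW or MEDIUM",
    10: "HIGH",
    11: "CRITICAL",
}


def interpret_exit_code(code: int) -> str:
    """Translate a lawkit exit code into its documented risk level."""
    return EXIT_CODE_LEVELS.get(code, f"unknown (exit code {code})")


def run_benford_check(path: str) -> str:
    """Run `lawkit benf <path> --quiet` and return the risk level.

    Requires lawkit to be installed and on PATH.
    """
    result = subprocess.run(["lawkit", "benf", path, "--quiet"])
    return interpret_exit_code(result.returncode)


if __name__ == "__main__":
    print(interpret_exit_code(10))
```

Because LOW and MEDIUM both exit with 0, distinguishing them requires reading the report output rather than the exit code.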
## Standalone Tools
For focused single-law analysis:
- [benf](https://crates.io/crates/benf) - Benford's Law only
- [pareto](https://crates.io/crates/pareto) - Pareto Principle only
## Examples
See [lawkit-cli/tests/cmd/](lawkit-cli/tests/cmd/) for executable examples:
- [Benford's Law](lawkit-cli/tests/cmd/benford.md)
- [Pareto Principle](lawkit-cli/tests/cmd/pareto.md)
- [Zipf's Law](lawkit-cli/tests/cmd/zipf.md)
- [Normal Distribution](lawkit-cli/tests/cmd/normal.md)
- [Poisson Distribution](lawkit-cli/tests/cmd/poisson.md)
- [Output formats](lawkit-cli/tests/cmd/output.md)
- [Options](lawkit-cli/tests/cmd/options.md)
- [Test data generation](lawkit-cli/tests/cmd/generate.md)
## Documentation
- [CLI Specification](docs/specs/cli.md)
- [Core API Specification](docs/specs/core.md)
## License
MIT