Statify
A lightweight and versatile statistics library for Rust that provides essential statistical functions for data analysis.
Features
- Descriptive Statistics: Mean, median, mode, variance, standard deviation (both sample and population)
- Distribution Metrics: Percentiles, quartiles, interquartile range (IQR)
- Range Statistics: Min, max, range, sum
- Correlation Analysis: Pearson correlation coefficient and covariance
- Normalization: Min-max normalization, standard normalization, custom range scaling
- Linear Regression: Simple linear regression with slope, intercept, R², and predictions
- Normal Distribution: Probability density function (PDF) and cumulative distribution function (CDF)
- Advanced Metrics: Skewness, kurtosis, coefficient of variation, standard error
- Standardization: Z-scores for individual values or entire datasets
- Type Support: Works with both f64 and f32 floating-point types
- Error Handling: Robust error handling with descriptive error types
Installation
Add this to your Cargo.toml:
[dependencies]
= "0.1.0"
Usage
The library extends Vec<f64> and Vec<f32> with the Stats trait, making it simple to calculate statistics on your data:
use Stats;
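To illustrate how a trait can add statistics methods to Vec<f64>, here is a minimal sketch of the extension-trait pattern. The names MeanExt and mean_sketch are hypothetical and chosen so they do not clash with Statify's own Stats trait; the real trait returns StatsResult<T> rather than Option.

```rust
/// Hypothetical extension trait (not Statify's `Stats` trait).
trait MeanExt {
    /// Arithmetic mean, or `None` when the collection is empty.
    fn mean_sketch(&self) -> Option<f64>;
}

impl MeanExt for Vec<f64> {
    fn mean_sketch(&self) -> Option<f64> {
        if self.is_empty() {
            return None;
        }
        Some(self.iter().sum::<f64>() / self.len() as f64)
    }
}
```

Once such a trait is in scope, the method can be called directly on any Vec<f64>, which is the ergonomics the Stats trait is described as providing.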
Correlation and Covariance
use ;
let x = vec!;
let y = vec!;
let corr = correlation.unwrap;
let cov = covariance.unwrap;
println!;
println!;
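For readers who want to see what these two quantities compute, the following is a self-contained sketch of the textbook formulas. It is not Statify's implementation: the function names are hypothetical, and it assumes the sample convention (dividing by n - 1), which the library may or may not use.

```rust
/// Sample covariance (divides by n - 1).
/// Returns `None` if the lengths differ or there are fewer than two points.
fn covariance_sketch(x: &[f64], y: &[f64]) -> Option<f64> {
    let n = x.len();
    if n != y.len() || n < 2 {
        return None;
    }
    let mx = x.iter().sum::<f64>() / n as f64;
    let my = y.iter().sum::<f64>() / n as f64;
    let s: f64 = x.iter().zip(y).map(|(a, b)| (a - mx) * (b - my)).sum();
    Some(s / (n as f64 - 1.0))
}

/// Pearson correlation: cov(x, y) / (sd(x) * sd(y)).
/// Returns `None` when either input has zero variance.
fn correlation_sketch(x: &[f64], y: &[f64]) -> Option<f64> {
    let cov = covariance_sketch(x, y)?;
    let sx = covariance_sketch(x, x)?.sqrt();
    let sy = covariance_sketch(y, y)?.sqrt();
    if sx == 0.0 || sy == 0.0 {
        return None;
    }
    Some(cov / (sx * sy))
}
```

The correlation does not depend on the n versus n - 1 choice, because the same divisor appears in the numerator and the denominator.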
Z-Scores
use ;
// Single value z-score
let score = z_score.unwrap;
println!;
// Z-scores for entire dataset
let data = vec!;
let scores = z_scores.unwrap;
println!;
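As a reference for the arithmetic behind a z-score, here is a small standalone sketch. The function names are hypothetical, and the dataset version assumes the sample standard deviation (n - 1 divisor); Statify's own choice is not stated here.

```rust
/// z = (value - mean) / std_dev; `None` when std_dev is zero.
fn z_score_sketch(value: f64, mean: f64, std_dev: f64) -> Option<f64> {
    if std_dev == 0.0 {
        return None;
    }
    Some((value - mean) / std_dev)
}

/// Z-scores for every element, using the sample standard deviation.
/// Returns `None` for fewer than two points or constant data.
fn z_scores_sketch(data: &[f64]) -> Option<Vec<f64>> {
    let n = data.len();
    if n < 2 {
        return None;
    }
    let mean = data.iter().sum::<f64>() / n as f64;
    let var = data.iter().map(|v| (v - mean).powi(2)).sum::<f64>() / (n as f64 - 1.0);
    let sd = var.sqrt();
    data.iter().map(|&v| z_score_sketch(v, mean, sd)).collect()
}
```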
Normalization
use ;
let data = vec!;
// Min-max normalization (0 to 1)
let normalized = normalize_min_max.unwrap;
// Standard normalization (z-scores)
let standardized = normalize_standard.unwrap;
// Custom range normalization (-1 to 1)
let custom = normalize_range.unwrap;
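The rescaling behind min-max and custom-range normalization can be sketched as follows. This is an illustration of the formula only, with hypothetical function names; how Statify treats empty or constant input is an assumption here (this sketch returns None).

```rust
/// Linearly maps data from [min, max] onto [new_min, new_max].
/// Returns `None` for empty input or when all values are equal.
fn normalize_range_sketch(data: &[f64], new_min: f64, new_max: f64) -> Option<Vec<f64>> {
    let min = data.iter().copied().fold(f64::INFINITY, f64::min);
    let max = data.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    if data.is_empty() || max == min {
        return None;
    }
    Some(
        data.iter()
            .map(|v| new_min + (v - min) / (max - min) * (new_max - new_min))
            .collect(),
    )
}

/// Min-max normalization is the special case with target range [0, 1].
fn normalize_min_max_sketch(data: &[f64]) -> Option<Vec<f64>> {
    normalize_range_sketch(data, 0.0, 1.0)
}
```

Standard normalization is the z-score transform, so it needs the mean and standard deviation rather than the minimum and maximum.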
Linear Regression
use linear_regression;
let x = vec!;
let y = vec!;
let result = linear_regression.unwrap;
println!;
println!;
println!;
// Make predictions
let prediction = result.predict;
println!;
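To make the slope, intercept, and R² values concrete, here is a self-contained ordinary least squares sketch. LineFit and fit_line are hypothetical names rather than Statify's LinearRegressionResult and linear_regression, and the edge-case handling (None for mismatched lengths or constant x) is an assumption.

```rust
/// Hypothetical result type for a fitted line y = intercept + slope * x.
struct LineFit {
    slope: f64,
    intercept: f64,
    r_squared: f64,
}

impl LineFit {
    fn predict(&self, x: f64) -> f64 {
        self.intercept + self.slope * x
    }
}

/// Ordinary least squares for one predictor.
fn fit_line(x: &[f64], y: &[f64]) -> Option<LineFit> {
    let n = x.len();
    if n != y.len() || n < 2 {
        return None;
    }
    let mx = x.iter().sum::<f64>() / n as f64;
    let my = y.iter().sum::<f64>() / n as f64;
    // Sums of squared and cross deviations from the means.
    let (mut sxx, mut sxy, mut syy) = (0.0, 0.0, 0.0);
    for (xi, yi) in x.iter().zip(y) {
        let (dx, dy) = (xi - mx, yi - my);
        sxx += dx * dx;
        sxy += dx * dy;
        syy += dy * dy;
    }
    if sxx == 0.0 {
        return None; // all x values identical: slope is undefined
    }
    let slope = sxy / sxx;
    let intercept = my - slope * mx;
    // For simple regression, R² equals the squared Pearson correlation.
    // Constant y is fitted exactly, so this sketch reports 1.0 in that case.
    let r_squared = if syy == 0.0 { 1.0 } else { sxy * sxy / (sxx * syy) };
    Some(LineFit { slope, intercept, r_squared })
}
```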
Normal Distribution
use ;
// Custom normal distribution (mean=100, std_dev=15)
let pdf = normal_pdf.unwrap;
let cdf = normal_cdf.unwrap;
// Standard normal distribution (mean=0, std_dev=1)
let std_pdf = standard_normal_pdf;
let std_cdf = standard_normal_cdf;
println!; // ~0.975
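The formulas behind the PDF and CDF can be sketched without any dependencies. Rust's standard library has no error function, so this sketch uses the Abramowitz and Stegun approximation 7.1.26 (absolute error around 1.5e-7); that choice, the function names, and the None return for a non-positive standard deviation are all assumptions, not a description of Statify's internals.

```rust
use std::f64::consts::{PI, SQRT_2};

/// Error function via Abramowitz & Stegun 7.1.26 (approximate).
fn erf_approx(x: f64) -> f64 {
    let sign = if x < 0.0 { -1.0 } else { 1.0 };
    let x = x.abs();
    let t = 1.0 / (1.0 + 0.3275911 * x);
    let poly = ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t - 0.284496736) * t
        + 0.254829592)
        * t;
    sign * (1.0 - poly * (-x * x).exp())
}

/// Density of N(mean, std_dev²) at x.
fn normal_pdf_sketch(x: f64, mean: f64, std_dev: f64) -> Option<f64> {
    if std_dev <= 0.0 {
        return None;
    }
    let z = (x - mean) / std_dev;
    Some((-0.5 * z * z).exp() / (std_dev * (2.0 * PI).sqrt()))
}

/// P(X <= x) for X ~ N(mean, std_dev²), using the erf approximation above.
fn normal_cdf_sketch(x: f64, mean: f64, std_dev: f64) -> Option<f64> {
    if std_dev <= 0.0 {
        return None;
    }
    let z = (x - mean) / (std_dev * SQRT_2);
    Some(0.5 * (1.0 + erf_approx(z)))
}
```

With mean 0 and standard deviation 1, the CDF at 1.96 evaluates to roughly 0.975, the familiar two-sided 95% cutoff.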
Advanced Metrics
use ;
let data = vec!;
let skew = skewness.unwrap;
let kurt = kurtosis.unwrap;
let cv = coefficient_of_variation.unwrap;
let se = standard_error.unwrap;
println!;
println!;
println!;
println!;
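Several conventions exist for these metrics, so the following sketch states its own: skewness and excess kurtosis are computed from population central moments (dividing by n), while the standard error and coefficient of variation use the sample standard deviation. The function names are hypothetical, and Statify may apply different bias corrections.

```rust
/// k-th central moment with an n divisor. Callers must pass non-empty data.
fn central_moment(data: &[f64], k: i32) -> f64 {
    let n = data.len() as f64;
    let mean = data.iter().sum::<f64>() / n;
    data.iter().map(|v| (v - mean).powi(k)).sum::<f64>() / n
}

/// Moment-based skewness: m3 / m2^(3/2).
fn skewness_sketch(data: &[f64]) -> Option<f64> {
    if data.len() < 2 {
        return None;
    }
    let m2 = central_moment(data, 2);
    if m2 == 0.0 {
        return None;
    }
    Some(central_moment(data, 3) / m2.powf(1.5))
}

/// Excess kurtosis: m4 / m2² - 3 (zero for a normal distribution).
fn excess_kurtosis_sketch(data: &[f64]) -> Option<f64> {
    if data.len() < 2 {
        return None;
    }
    let m2 = central_moment(data, 2);
    if m2 == 0.0 {
        return None;
    }
    Some(central_moment(data, 4) / (m2 * m2) - 3.0)
}

/// Standard error of the mean: sample std dev / sqrt(n).
fn standard_error_sketch(data: &[f64]) -> Option<f64> {
    let n = data.len() as f64;
    if data.len() < 2 {
        return None;
    }
    let sample_var = central_moment(data, 2) * n / (n - 1.0);
    Some((sample_var / n).sqrt())
}

/// Coefficient of variation as a percentage: 100 * sample std dev / mean.
fn cv_percent_sketch(data: &[f64]) -> Option<f64> {
    let n = data.len() as f64;
    if data.len() < 2 {
        return None;
    }
    let mean = data.iter().sum::<f64>() / n;
    if mean == 0.0 {
        return None;
    }
    let sample_sd = (central_moment(data, 2) * n / (n - 1.0)).sqrt();
    Some(100.0 * sample_sd / mean)
}
```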
API Overview
Trait Methods (Stats)
All methods return a StatsResult<T> which handles errors gracefully:
- mean() - Arithmetic mean
- median() - Middle value when sorted
- mode() - Most frequent values
- variance() - Sample variance
- std_dev() - Sample standard deviation
- variance_pop() - Population variance
- std_dev_pop() - Population standard deviation
- min() - Minimum value
- max() - Maximum value
- range() - Difference between max and min
- sum() - Sum of all values
- percentile(p) - Value at the p-th percentile
- quartile_1() - 25th percentile
- quartile_3() - 75th percentile
- iqr() - Interquartile range (Q3 - Q1)
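The difference between the sample and population variants in this list comes down to the divisor. The sketch below uses a hypothetical helper, variance_with_ddof, to show both with one function; it is not part of Statify's API.

```rust
/// Variance with a configurable "delta degrees of freedom":
/// ddof = 1 gives the sample variance (divide by n - 1),
/// ddof = 0 gives the population variance (divide by n).
fn variance_with_ddof(data: &[f64], ddof: usize) -> Option<f64> {
    let n = data.len();
    if n <= ddof {
        return None; // not enough data for this divisor
    }
    let mean = data.iter().sum::<f64>() / n as f64;
    let ss: f64 = data.iter().map(|v| (v - mean).powi(2)).sum();
    Some(ss / (n - ddof) as f64)
}
```

For the data 2, 4, 4, 4, 5, 5, 7, 9 the squared deviations sum to 32, so the population variance is 32 / 8 = 4 and the sample variance is 32 / 7. The corresponding standard deviations are the square roots of those values.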
Standalone Functions
Correlation & Covariance
- correlation(x, y) - Pearson correlation coefficient
- covariance(x, y) - Covariance between two datasets
Normalization
- normalize_min_max(data) - Min-max normalization (0 to 1)
- normalize_standard(data) - Standard normalization (z-scores)
- normalize_range(data, min, max) - Normalize to custom range
Linear Regression
- linear_regression(x, y) - Returns LinearRegressionResult with:
  - slope - Regression line slope
  - intercept - Y-intercept
  - r_squared - Coefficient of determination
  - predict(x) - Predict y for given x
  - predict_many(x_values) - Predict multiple values
Normal Distribution
- normal_pdf(x, mean, std_dev) - Probability density function
- normal_cdf(x, mean, std_dev) - Cumulative distribution function
- standard_normal_pdf(x) - Standard normal PDF (μ=0, σ=1)
- standard_normal_cdf(x) - Standard normal CDF (μ=0, σ=1)
Standardization
- z_score(value, mean, std_dev) - Standard score for a single value
- z_scores(data) - Standard scores for all values in a dataset
Advanced Metrics
- standard_error(data) - Standard error of the mean
- coefficient_of_variation(data) - CV expressed as percentage
- skewness(data) - Measure of distribution asymmetry
- kurtosis(data) - Measure of distribution tailedness (excess kurtosis)
Error Handling
The library uses a custom StatsError enum for error handling:
- EmptyDataset - Dataset is empty
- InsufficientData - Not enough data for the operation
- DivisionByZero - Division by zero would occur
All statistical functions return StatsResult<T>, which is a Result<T, StatsError>.
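The pattern described above can be sketched with a local stand-in for the error type. The enum and alias below mirror the names listed in this section but are defined locally for illustration; the real StatsError may differ in detail, and which variant Statify returns in each situation is an assumption of this sketch. checked_cv is a hypothetical function.

```rust
/// Local stand-in mirroring the variant names listed above.
#[derive(Debug, PartialEq)]
enum StatsError {
    EmptyDataset,
    InsufficientData,
    DivisionByZero,
}

type StatsResult<T> = Result<T, StatsError>;

/// Coefficient of variation (percent) with explicit error reporting.
fn checked_cv(data: &[f64]) -> StatsResult<f64> {
    match data.len() {
        0 => return Err(StatsError::EmptyDataset),
        1 => return Err(StatsError::InsufficientData), // sample std dev needs n >= 2
        _ => {}
    }
    let n = data.len() as f64;
    let mean = data.iter().sum::<f64>() / n;
    if mean == 0.0 {
        return Err(StatsError::DivisionByZero);
    }
    let var = data.iter().map(|v| (v - mean).powi(2)).sum::<f64>() / (n - 1.0);
    Ok(100.0 * var.sqrt() / mean)
}

/// Callers can match on the variant instead of calling `unwrap()`.
fn describe(result: StatsResult<f64>) -> String {
    match result {
        Ok(v) => format!("cv = {v:.1}%"),
        Err(StatsError::EmptyDataset) => "no data".to_string(),
        Err(StatsError::InsufficientData) => "need at least two values".to_string(),
        Err(StatsError::DivisionByZero) => "mean is zero".to_string(),
    }
}
```

Matching on the error keeps a single bad dataset from panicking the program, which is the main reason to prefer this over unwrap() outside of quick examples.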
License
MIT
Contributing
Contributions are welcome. Please ensure tests pass before submitting pull requests.