statoxide 0.2.2

High-performance statistical computing library written in Rust, exposed to Python via PyO3

StatOxide: High-performance statistical computing in Rust with Python bindings

StatOxide is a high-performance statistical computing library written in Rust, with Python bindings via PyO3. It is designed for data scientists, statisticians, and researchers who need both performance and productivity.

🚀 Features

📊 Core Data Structures

  • Series: Columnar data with metadata (name, dtype, levels)
  • DataFrame: Tabular data structure with column operations
  • Formula: R-style formula parsing for model specification
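As a rough illustration of what R-style formula parsing involves, the sketch below splits a formula string into its response and right-hand-side terms in plain Python. It is not StatOxide's parser: `split_formula` is a hypothetical helper, and it only understands a single `~` and `+`-separated terms rather than the full R formula grammar.

```python
from typing import List, Tuple


def split_formula(formula: str) -> Tuple[str, List[str]]:
    """Split 'y ~ a + b' into ('y', ['a', 'b']).

    Hypothetical helper for illustration only; it does not expand
    interactions, powers, or random-effect terms.
    """
    if formula.count("~") != 1:
        raise ValueError("expected exactly one '~' in the formula")
    response, rhs = formula.split("~")
    # Each '+'-separated chunk is kept verbatim as one term.
    terms = [term.strip() for term in rhs.split("+") if term.strip()]
    return response.strip(), terms


print(split_formula("y ~ x + x^2"))  # ('y', ['x', 'x^2'])
```

A real parser would also have to interpret operators such as `:`, `*`, `^`, and `|`, which this sketch leaves untouched inside each term.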

📈 Statistical Functions

  • Descriptive Statistics: Mean, variance, skewness, kurtosis, quantiles
  • Probability Distributions: 12 continuous + 6 discrete distributions
  • Statistical Tests: t-test, chi-square, ANOVA, correlation tests
  • Correlation Measures: Pearson, Spearman, Kendall tau
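For readers who want a reference point to compare library output against, here is a minimal pure-Python sketch of the sample mean, sample variance, and Pearson correlation. The function names are illustrative and are not part of StatOxide's API; the data are the `x` and `y` columns used in the Python example further down.

```python
import math


def mean(values):
    return sum(values) / len(values)


def sample_variance(values):
    # Unbiased estimator: divide by n - 1.
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def pearson(x, y):
    # Covariance over the product of standard deviations; the common
    # 1 / (n - 1) factor cancels, so raw sums are enough.
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
print(f"mean={mean(x):.2f} var={sample_variance(x):.2f} r={pearson(x, y):.3f}")
# mean=3.00 var=2.50 r=0.775
```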

🧮 Statistical Models

  • Linear Models: OLS, Ridge, Lasso, Elastic Net with proper inference
  • Generalized Linear Models: Logistic, Poisson, Gamma, Negative Binomial regression
  • Mixed Effects Models: Linear mixed models and GLMMs, estimated with the EM algorithm
  • Robust Statistics: M-estimators, S-estimators, MM-estimators
  • Nonparametric Methods: Kernel regression, local regression, smoothing splines
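To make the robust-statistics entry concrete, the following sketch computes a Huber M-estimate of location by iteratively reweighted means. It is a textbook-style illustration under stated assumptions (tuning constant 1.345, scale fixed at the rescaled median absolute deviation), not a description of how StatOxide implements its M-estimators; `huber_location` is a hypothetical name.

```python
import statistics


def huber_location(values, k=1.345, tol=1e-8, max_iter=100):
    """Huber M-estimate of location via iteratively reweighted means."""
    mu = statistics.median(values)
    # Robust scale: median absolute deviation, rescaled for normal data.
    scale = 1.4826 * statistics.median([abs(v - mu) for v in values])
    if scale == 0:
        return mu  # at least half the points coincide; keep the median
    for _ in range(max_iter):
        # Points within k * scale of mu keep weight 1; farther points
        # are downweighted in proportion to their distance.
        weights = [
            1.0 if v == mu else min(1.0, k * scale / abs(v - mu))
            for v in values
        ]
        new_mu = sum(w * v for w, v in zip(weights, values)) / sum(weights)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu


data = [1.0, 2.0, 3.0, 4.0, 100.0]
print(f"mean={statistics.mean(data):.1f} huber={huber_location(data):.2f}")
# mean=22.0 huber=3.00
```

The single outlier drags the mean to 22, while the Huber estimate stays at the bulk of the data.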

📉 Time Series Analysis

  • Core Structures: TimeSeries with datetime indexing
  • ARIMA Models: AR, MA, ARMA, ARIMA, SARIMA
  • GARCH Models: ARCH, GARCH for volatility modeling
  • Decomposition: STL, moving averages, Hodrick-Prescott filter
  • Forecasting: Point forecasts, prediction intervals
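Of the decomposition methods listed, the moving average is the simplest to show. The sketch below computes a simple moving average over each full window in plain Python; `moving_average` is an illustrative name, not StatOxide's function, and edge handling in the library may differ.

```python
def moving_average(series, window):
    """Simple moving average; returns len(series) - window + 1 values."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


series = [1.0, 2.0, 3.0, 4.0, 5.0]
print(moving_average(series, 3))  # [2.0, 3.0, 4.0]
```

Subtracting such a smoothed trend from the original series is the usual first step of a classical decomposition.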

🛠️ Utilities

  • Linear Algebra: Matrix operations, solvers, decompositions
  • Random Generation: Distributions, bootstrap, train-test split
  • Data Validation: Type checking, missing value detection
  • Numerical Methods: Softmax, standardization, normalization
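The numerical helpers named above have short reference definitions. This pure-Python sketch shows a numerically stable softmax and z-score standardization; the function names and the choice of the n - 1 denominator are assumptions for illustration and may not match StatOxide's conventions.

```python
import math


def softmax(values):
    # Subtract the maximum first so exp() cannot overflow.
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def standardize(values):
    # z-scores using the sample standard deviation (n - 1 denominator).
    n = len(values)
    mu = sum(values) / n
    sd = math.sqrt(sum((v - mu) ** 2 for v in values) / (n - 1))
    return [(v - mu) / sd for v in values]


print([round(p, 3) for p in softmax([1.0, 2.0, 3.0])])
print([round(z, 3) for z in standardize([1.0, 2.0, 3.0, 4.0, 5.0])])
```

The softmax outputs sum to 1, and the standardized values have mean 0 and sample standard deviation 1.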

🐍 Python API

StatOxide provides a complete Python interface through PyO3 bindings:

import statoxide
import statoxide.core as soc
import statoxide.stats as sos

# Core data structures
df = soc.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0, 5.0],
    "y": [2.0, 4.0, 5.0, 4.0, 5.0]
})

series = df.get_column("x")
print(f"Mean of x: {series.mean():.2f}")
print(f"Std of x: {series.std(1.0):.2f}")

# Statistical functions
corr = sos.correlation(df.get_column("x").to_list(), df.get_column("y").to_list())
print(f"Correlation: {corr:.3f}")

summary = sos.descriptive_summary([1.0, 2.0, 3.0, 4.0, 5.0])
print(f"Summary: {summary}")

# Formula parsing
formula = soc.Formula("y ~ x + x^2")
print(f"Formula variables: {formula.variables()}")

# Models
import statoxide.models as som
result = som.linear_regression([[1, 1], [1, 2], [1, 3]], [5, 8, 11])
print(f"Regression coefficients: {result['coefficients']}")

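# Mixed effects models
# NOTE: `data` is not defined in this snippet; it stands for a dataset
# with y, x, and group columns.
mixed_results = som.mixed_effects("y ~ x + (1 | group)", data)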
print(f"Random effect variance: {mixed_results.random_variances}")

# Time series
import statoxide.tsa as sot
arima_result = sot.fit_arima([1.0, 2.0, 3.0, 4.0, 5.0], 1, 0, 1)
print(f"ARIMA AIC: {arima_result['aic']}")

# Utilities
import statoxide.utils as sou
train, test = sou.train_test_split([1.0, 2.0, 3.0, 4.0, 5.0], 0.2)
print(f"Train: {train}, Test: {test}")
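The regression call above passes a design matrix whose first column is all ones, so that column acts as the intercept. As a sanity check that does not depend on StatOxide, the sketch below solves the normal equations for the same data in plain Python. `ols_two_column` is a hypothetical helper limited to two-column design matrices, and whether StatOxide reports coefficients in the same order is an assumption.

```python
def ols_two_column(X, y):
    """Least-squares fit for a design matrix with exactly two columns.

    Solves the normal equations (X^T X) b = X^T y using the closed-form
    inverse of a 2x2 matrix.
    """
    # Entries of X^T X (symmetric 2x2) and X^T y.
    a = sum(r[0] * r[0] for r in X)
    b = sum(r[0] * r[1] for r in X)
    d = sum(r[1] * r[1] for r in X)
    p = sum(r[0] * t for r, t in zip(X, y))
    q = sum(r[1] * t for r, t in zip(X, y))
    det = a * d - b * b
    if det == 0:
        raise ValueError("columns are collinear")
    return [(d * p - b * q) / det, (a * q - b * p) / det]


print(ols_two_column([[1, 1], [1, 2], [1, 3]], [5, 8, 11]))  # [2.0, 3.0]
```

The three points lie exactly on y = 2 + 3x, so an OLS fit should recover an intercept of 2 and a slope of 3.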

🏗️ Architecture

StatOxide is organized as a multi-crate Rust workspace:

statoxide/
├── Cargo.toml             # Workspace configuration
├── crates/
│   ├── so-core/           # Core data structures & formula parsing
│   ├── so-linalg/         # Linear algebra abstraction
│   ├── so-stats/          # Statistical functions & distributions
│   ├── so-models/         # Statistical models (regression, GLM, mixed effects, etc.)
│   ├── so-tsa/            # Time series analysis
│   ├── so-utils/          # Utility functions
│   └── so-python/         # Python bindings (PyO3)
├── assets/logo.png        # Project logo
├── LICENSE-MIT            # MIT license
└── LICENSE-APACHE-2.0     # Apache 2.0 license

📦 Installation

Prerequisites

  1. Rust Toolchain: curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
  2. Python Development Files:
    • Ubuntu/Debian: sudo apt-get install python3-dev python3.11-dev
    • macOS: brew install python@3.11
  3. Maturin (recommended): pip install maturin

Building from Source

# Clone the repository
git clone https://github.com/EthanNOV56/StatOxide.git
cd StatOxide

# Build Python bindings with maturin
cd crates/so-python
maturin develop  # Editable install for development
# or
maturin build --release  # Build wheel for distribution

Direct Cargo Build

cd /path/to/statoxide
export PYO3_PYTHON=python3.11
cargo build --release --package so-python

On Linux, the shared library will be at target/release/libso_python.so (macOS builds produce libso_python.dylib instead).

🧪 Testing

Rust Tests

cargo test --all

Python Tests

After installation:

python -c "import statoxide; print(statoxide.version())"
python crates/so-python/test_api.py  # API demonstration
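Before running the commands above, it can help to confirm that the package is visible to the interpreter you intend to use. The following is a small sketch using only the standard library; `is_installed` is a hypothetical helper, and the suggested remedy simply repeats the maturin step from the installation section.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("statoxide"):
    print("statoxide is importable")
else:
    print("statoxide not found; run `maturin develop` in crates/so-python first")
```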

📚 Documentation

  • API Reference: Run cargo doc --all --no-deps --open for Rust documentation
  • Python Docstrings: All Python functions include detailed docstrings
  • Examples: See crates/so-python/test_api.py for usage examples

🎯 Design Principles

  1. Performance: Leverage Rust's zero-cost abstractions and LLVM optimizations
  2. Safety: Memory safety guarantees without garbage collection
  3. Interoperability: Seamless Python integration with minimal overhead
  4. Modularity: Independent crates for clear separation of concerns
  5. API Consistency: Familiar interfaces inspired by R, pandas, and statsmodels

🔧 Development Status

Module      Status        Notes
so-core     ✅ Complete   Data structures, formula parsing
so-linalg   ✅ Complete   Linear algebra abstraction
so-stats    ✅ Complete   Statistical functions & distributions
so-models   ✅ Complete   Regression, GLM, mixed effects, robust, nonparametric
so-tsa      ✅ Complete   ARIMA, GARCH, decomposition, forecasting
so-utils    ✅ Complete   Random generation, validation, numerical methods
so-python   ✅ Complete   Full Python bindings implemented

📄 License

StatOxide is dual-licensed under both:

  • MIT license (LICENSE-MIT)
  • Apache 2.0 license (LICENSE-APACHE-2.0)

You may use StatOxide under either license at your option.

🙏 Acknowledgments

  • R and statsmodels for statistical API inspiration
  • pandas for DataFrame design patterns
  • PyO3 team for excellent Rust-Python interop
  • ndarray and faer for numerical computing foundations

🤝 Contributing

Contributions are welcome!

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Run tests: cargo test --all
  5. Submit a pull request

📞 Support