simd_kernels/kernels/scientific/distributions/mod.rs
// Copyright Peter Bower 2025. All Rights Reserved.
// Licensed under Mozilla Public License (MPL) 2.0.

//! # **Statistical Distributions Module** - *Comprehensive Probability Distribution Computing*
//!
//! Advanced statistical distribution kernels providing high-performance probability density
//! functions (PDFs), cumulative distribution functions (CDFs), quantile functions, and
//! random sampling with SIMD acceleration and numerical accuracy tested against SciPy.
//!
//! ## Distribution Categories
//! - **Univariate distributions**: Complete coverage of common continuous and discrete distributions
//! - **Multivariate distributions**: Multivariate normal, Student-t, Wishart, and advanced distributions
//! - **Parametric families**: Beta, gamma, normal, exponential, and related distribution families
//! - **Discrete distributions**: Binomial, Poisson, geometric, and hypergeometric variants
//!
//! ## Core Statistical Functions
//! Each distribution provides a complete statistical interface:
//! - **Probability density/mass functions**: Optimised PDF/PMF evaluation with numerical stability
//! - **Cumulative distribution functions**: CDF computation with extended precision algorithms
//! - **Quantile functions**: Inverse CDF calculation using robust bracketing and refinement
//! - **Random sampling**: High-quality pseudorandom generation with distributional correctness
//!
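/*!
The "bracketing and refinement" idea in the quantile bullet above can be illustrated with a
minimal scalar sketch. It assumes a distribution supported on `[0, inf)` and `0 <= p < 1`.
`quantile_bisect` and `exp_cdf` are hypothetical names used only for illustration, not this
crate's API, and plain bisection stands in for whatever refinement the kernels actually use.

```rust
/// Hypothetical helper: invert a monotone CDF supported on [0, inf).
fn quantile_bisect<F: Fn(f64) -> f64>(cdf: F, p: f64) -> f64 {
    // Bracketing: double `hi` until the CDF reaches the target probability.
    let (mut lo, mut hi) = (0.0_f64, 1.0_f64);
    while cdf(hi) < p && hi < f64::MAX / 2.0 {
        lo = hi;
        hi *= 2.0;
    }
    // Refinement: halve the bracket until it is narrower than the tolerance.
    for _ in 0..200 {
        let mid = 0.5 * (lo + hi);
        if cdf(mid) < p {
            lo = mid;
        } else {
            hi = mid;
        }
        if hi - lo <= 1e-14 * (1.0 + mid.abs()) {
            break;
        }
    }
    0.5 * (lo + hi)
}

/// Exponential CDF with rate `lambda`, used only to exercise the sketch.
fn exp_cdf(x: f64, lambda: f64) -> f64 {
    if x <= 0.0 {
        0.0
    } else {
        1.0 - (-lambda * x).exp()
    }
}
```

For example, `quantile_bisect(|x| exp_cdf(x, 1.0), 0.5)` should land close to `ln 2`, the
median of the unit-rate exponential distribution.
*/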
//! ## Computational Architecture
//! Distribution calculations employ sophisticated numerical techniques for accuracy and performance:
//! - **SIMD vectorisation**: Hardware-accelerated evaluation of distribution functions
//! - **Rational approximations**: Optimised polynomial and rational function approximations
//! - **Series expansions**: Convergent series with adaptive truncation for transcendental functions
//! - **Numerical integration**: Gauss-Kronrod quadrature for complex distribution functions
//!
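/*!
As a rough illustration of "series with adaptive truncation" from the list above, the sketch
below sums the Maclaurin series of the error function and stops once a term no longer changes
the running sum. `erf_series` is a hypothetical name, not this crate's API. The series is only
well conditioned for small `|x|`, and the sketch is scalar rather than SIMD.

```rust
/// Hypothetical helper: erf(x) via its Maclaurin series,
/// erf(x) = 2/sqrt(pi) * sum_{n>=0} (-1)^n x^(2n+1) / (n! (2n+1)).
fn erf_series(x: f64) -> f64 {
    let mut power = x; // holds (-1)^n x^(2n+1) / n!
    let mut sum = x; // n = 0 term
    for n in 1..200 {
        power *= -x * x / n as f64;
        let term = power / (2 * n + 1) as f64;
        sum += term;
        // Adaptive truncation: stop once the term is below rounding level.
        if term.abs() <= f64::EPSILON * sum.abs() {
            break;
        }
    }
    sum * std::f64::consts::FRAC_2_SQRT_PI
}
```

The loop bound of 200 is an arbitrary safety cap chosen for this sketch. For moderate
arguments the relative-size test ends the loop long before the cap is reached.
*/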
//! ## Arrow Integration and Null Handling
//! The module integrates seamlessly with Apache Arrow's memory model and null semantics:
//! - **Null-aware processing**: Efficient handling of missing values with validity bitmasks
//! - **SIMD-accelerated masking**: Vectorised null propagation without conditional branches
//! - **Arrow-compatible layouts**: Direct operation on Arrow array structures
//! - **Memory efficiency**: Zero-copy operations where mathematically valid
//!
//! ### Null Value Philosophy
//! Rather than assume that `inf` and `NaN` denote missing data, we treat them as valid
//! float values (consistent with Apache Arrow semantics). Users may subsequently treat
//! them as nulls if they wish, since there are numerical scenarios where they carry
//! information. This approach avoids computational overhead in the hot path whilst
//! preserving mathematical correctness for edge cases.
//!
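/*!
A minimal scalar sketch of this philosophy, assuming an Arrow-style validity bitmap
(least-significant bit first, set bit = valid). `normal_pdf_masked` and `is_valid` are
hypothetical names for illustration, not this crate's API. Every slot is evaluated without
branching on validity, `NaN` and `inf` flow through as ordinary floats, and nulls are carried
only by the bitmap.

```rust
/// Hypothetical kernel: standard normal PDF over a slice with an optional
/// validity bitmap (bit i set = value i is valid, least-significant bit first).
fn normal_pdf_masked(values: &[f64], validity: Option<&[u8]>) -> (Vec<f64>, Option<Vec<u8>>) {
    let norm = 1.0 / (2.0 * std::f64::consts::PI).sqrt();
    // Evaluate every slot unconditionally: NaN stays NaN, inf maps to a density of 0.
    let out: Vec<f64> = values.iter().map(|&x| norm * (-0.5 * x * x).exp()).collect();
    // Null propagation: the input bitmap is copied through unchanged.
    (out, validity.map(|bits| bits.to_vec()))
}

/// Reads bit `i` of a least-significant-bit-first bitmap.
fn is_valid(bits: &[u8], i: usize) -> bool {
    (bits[i / 8] >> (i % 8)) & 1 == 1
}
```

A caller who prefers to treat `NaN` results as missing can clear the corresponding bits
afterwards, which keeps that cost out of the kernel itself.
*/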
//! ## Numerical Precision and Stability
//! All distribution implementations prioritise numerical accuracy across parameter ranges.
//! See `./tests` for specific tolerance requirements, where results are measured against SciPy.
//! Whilst these tests pass on the development machine, platform-specific differences may affect
//! your test results, so keep this in mind when evaluating this library's fit for your use case.
//!
//! ## Disclaimer
//! This implementation is provided on a best-effort basis and is intended for
//! general scientific and engineering use. While every attempt has been made to
//! match the accuracy and behaviour of established libraries such as SciPy, we
//! make no guarantees as to correctness, fitness for any particular purpose, or
//! suitability for uses such as life-critical, safety-critical, or financial applications.
//!
//! Results may differ from other libraries due to platform, compiler, or implementation
//! differences. Edge cases and special values are handled explicitly for compatibility
//! with SciPy (v1.16), but users are responsible for independently verifying that these
//! functions meet their accuracy and reliability requirements.
//!
//! By using these functions, you accept all responsibility for outcomes or decisions
//! based upon their results.

/// # **Shared Distribution Utilities** - *Common Infrastructure for Distribution Computing*
///
/// Foundational utilities, constants, and helper functions shared across all probability
/// distributions, providing consistent numerical methods and sampling infrastructure.
///
/// This module contains the core mathematical building blocks that enable efficient
/// and accurate distribution computation across all statistical functions.
///
/// ## Modules
/// - **`constants`**: Mathematical constants and precomputed values
/// - **`sampler`**: Random number generation and sampling utilities
/// - **`scalar`**: Special functions and mathematical utilities
pub mod shared {
    pub mod constants;
    pub mod sampler;
    pub mod scalar;
}

/// # **Univariate Distributions** - *Single-Variable Probability Distributions*
///
/// Complete collection of univariate probability distributions covering both continuous
/// and discrete families with comprehensive statistical function implementations.
///
/// Each distribution provides PDF/PMF, CDF, quantile functions, and random sampling
/// with SIMD acceleration and numerical accuracy tested against SciPy.
///
/// ## Distribution Categories
/// - **Continuous**: beta, cauchy, chi-squared, exponential, gamma, gumbel, laplace, logistic, lognormal, normal, student_t, uniform, weibull
/// - **Discrete**: binomial, discrete_uniform, geometric, hypergeometric, multinomial, neg_binomial, poisson
/// - **Common utilities**: Shared patterns and mathematical building blocks
pub mod univariate {
    // common kernel patterns
    pub mod common;

    // distributions
    pub mod beta;
    pub mod binomial;
    pub mod cauchy;
    pub mod chi_squared;
    /// Discrete uniform distribution kernels - equal probability over finite integer range.
    pub mod discrete_uniform;
    /// Exponential distribution kernels - continuous distribution for inter-arrival times.
    pub mod exponential;
    pub mod gamma;
    pub mod geometric;
    pub mod gumbell;
    pub mod hypergeometric;
    pub mod laplace;
    pub mod logistic;
    pub mod lognormal;
    pub mod multinomial;
    pub mod neg_binomial;
    pub mod normal;
    pub mod poisson;
    pub mod student_t;
    pub mod uniform;
    pub mod weibull;
}

#[cfg(feature = "linear_algebra")]
pub mod multivariate;