Module statistics

Statistics and probability: descriptive stats, distributions, hypothesis testing, regression, Bayesian inference, random number generation, information theory.

Structs

BetaBernoulli
Beta-Bernoulli conjugate model.
BetaDist
Beta distribution (Jöhnk’s method for sampling).
BinomialDist
Binomial distribution.
CauchyDist
Cauchy distribution.
ChiSquaredDist
Chi-squared distribution.
ExponentialDist
Exponential distribution.
GammaDist
Gamma distribution (Marsaglia-Tsang method for alpha >= 1).
GaussianGaussian
Gaussian-Gaussian conjugate model (known variance).
Lcg
Linear Congruential Generator.
LogNormalDist
Log-Normal distribution.
NormalDist
Normal (Gaussian) distribution.
Pcg32
PCG32 — Permuted Congruential Generator.
PoissonDist
Poisson distribution.
SplitMix64
SplitMix64 — fast 64-bit generator suitable as seed scrambler.
StudentTDist
Student’s t-distribution.
UniformDist
Continuous uniform distribution.
WeibullDist
Weibull distribution.
Xorshift64
Xorshift64 — fast, simple 64-bit RNG.
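
The SplitMix64 entry above describes the generator as a seed scrambler. As a rough illustration of what that means, here is a minimal sketch of the widely published SplitMix64 step, used to expand one small seed into several well-mixed state words. The names `SplitMix64Sketch` and `expand_seed` are hypothetical and are not this module's API; the module's own `SplitMix64` may expose different constructors and methods.

```rust
/// Hypothetical stand-in for illustration only; not the module's `SplitMix64`.
pub struct SplitMix64Sketch {
    state: u64,
}

impl SplitMix64Sketch {
    pub fn new(seed: u64) -> Self {
        Self { state: seed }
    }

    /// One step of the standard SplitMix64 algorithm.
    pub fn next_u64(&mut self) -> u64 {
        // Advance the state by a fixed odd increment (a Weyl sequence).
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        // Mix the state with two xorshift-multiply rounds and a final xorshift.
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Expand a single seed into `N` state words, e.g. to initialise a
/// generator with a larger state. (Hypothetical helper.)
pub fn expand_seed<const N: usize>(seed: u64) -> [u64; N] {
    let mut sm = SplitMix64Sketch::new(seed);
    let mut out = [0u64; N];
    for word in out.iter_mut() {
        *word = sm.next_u64();
    }
    out
}
```

Because each step mixes every state bit into the output, nearby seeds such as 1 and 2 yield unrelated-looking words, which is why this kind of generator is commonly used to seed others.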

Traits

Rng
Trait for random number generators.
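
The listing does not show the methods of `Rng`, so the following is only a sketch of how a generator trait of this kind is commonly shaped: one required method producing raw 64-bit words, a provided method deriving a uniform `f64` in [0, 1), and an algorithm (here a Fisher-Yates shuffle) written against the trait rather than a concrete generator. `RngSketch`, `XorshiftSketch`, and `shuffle_sketch` are hypothetical names, and the 13/7/17 shift triple is one common xorshift64 choice, not necessarily the one this module uses.

```rust
/// Hypothetical trait for illustration; the module's `Rng` may differ.
pub trait RngSketch {
    /// Produce the next raw 64-bit word.
    fn next_u64(&mut self) -> u64;

    /// Uniform f64 in [0, 1): keep the top 53 bits and scale by 2^-53.
    fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 * (1.0 / (1u64 << 53) as f64)
    }
}

/// Minimal xorshift64 generator (assumed shift triple 13, 7, 17).
pub struct XorshiftSketch {
    state: u64,
}

impl XorshiftSketch {
    pub fn new(seed: u64) -> Self {
        // Zero is a fixed point of xorshift, so substitute a nonzero constant.
        let state = if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed };
        Self { state }
    }
}

impl RngSketch for XorshiftSketch {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.state;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.state = x;
        x
    }
}

/// Fisher-Yates shuffle over any generator implementing the trait.
pub fn shuffle_sketch<T, R: RngSketch>(items: &mut [T], rng: &mut R) {
    for i in (1..items.len()).rev() {
        // Pick j in 0..=i; `min` guards against floating-point edge cases.
        let j = ((rng.next_f64() * (i as f64 + 1.0)) as usize).min(i);
        items.swap(i, j);
    }
}
```

Writing sampling routines against a trait like this lets callers swap in a different generator, for example a fixed-seed one for reproducible tests, without changing the routines themselves.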

Functions

akaike_information_criterion
Akaike Information Criterion.
bayesian_information_criterion
Bayesian Information Criterion.
betainc
Regularized incomplete beta function I_x(a,b).
chi_squared_test
Chi-squared goodness-of-fit test. Returns (chi2-statistic, p-value).
covariance
Sample covariance.
credible_interval
Equal-tailed credible interval for a Beta distribution.
cross_entropy
Cross-entropy H(P, Q) = -sum_x P(x) log Q(x).
entropy
Shannon entropy in nats (natural log base).
erf
Error function erf(x).
erfc
Complementary error function erfc(x).
gamma
Gamma function.
gammainc_lower
Regularized incomplete gamma function P(a, x) — lower.
iqr
Interquartile range.
jensen_shannon_divergence
Jensen-Shannon divergence: symmetric and bounded in [0, ln(2)].
kl_divergence
KL divergence D_KL(P || Q) = sum_x P(x) log(P(x)/Q(x)).
ks_test
Kolmogorov-Smirnov test against a theoretical CDF. Returns (D-statistic, approximate p-value).
kurtosis
Sample excess kurtosis.
lgamma
Natural log of the gamma function (Lanczos approximation).
linear_regression
Simple linear regression: y = slope * x + intercept.
logistic_regression
Logistic regression via gradient descent. x is n_samples × n_features; y holds the boolean labels. Returns the weight vector of length n_features + 1 (including the intercept).
mann_whitney_u
Mann-Whitney U test (non-parametric, two-sample). Returns (U-statistic, approximate two-tailed p-value).
mean
Arithmetic mean.
median
Median (sorts the slice in place).
mode
Mode(s) — returns all values that appear most frequently.
multiple_linear_regression
Multiple linear regression (OLS). X is n_samples × n_features. Returns the coefficient vector, with the intercept as the first element.
mutual_information
Mutual information I(X;Y) from joint probability matrix.
p_value_from_chi2
p-value from chi-squared statistic with k degrees of freedom.
p_value_from_t
Two-tailed p-value from t statistic with df degrees of freedom.
pearson_r
Pearson correlation coefficient.
percentile
p-th percentile (p in [0,100]).
polynomial_regression
Polynomial regression of given degree. Returns coefficients [a0, a1, …, a_deg].
posterior_mean
Posterior mean of the Beta-Bernoulli model.
probit
Inverse normal CDF (probit function) via rational approximation.
r_squared
R-squared coefficient of determination.
ridge_regression
Ridge regression (L2 regularized OLS). Returns coefficients.
sample_without_replacement
Sample k distinct indices from 0..n without replacement (Knuth’s algorithm S).
shapiro_wilk_stat
Shapiro-Wilk test statistic W for normality. Uses an approximation based on the first 20 a-coefficients.
shuffle
Fisher-Yates shuffle.
skewness
Sample skewness.
spearman_rho
Spearman rank correlation.
std_dev
Sample standard deviation.
t_test_one_sample
One-sample t-test against mu0. Returns (t-statistic, two-tailed p-value).
t_test_two_sample
Welch’s two-sample t-test. Returns (t-statistic, two-tailed p-value).
update_beta_bernoulli
Update a Beta prior with new Bernoulli observations.
variance
Sample variance (Bessel’s correction, n-1 denominator).
weighted_sample
Weighted sampling — draw one index proportional to weights.
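
Several of the information-theory entries above state their formulas directly (entropy in nats, H(P, Q) = -sum_x P(x) log Q(x), D_KL(P || Q) = sum_x P(x) log(P(x)/Q(x)), and the [0, ln(2)] bound on the Jensen-Shannon divergence). The sketch below implements those formulas over plain `&[f64]` probability vectors to show how they relate. The `*_sketch` function names and slice-based signatures are assumptions for illustration and may not match the module's actual functions; terms with P(x) = 0 are treated as contributing zero, which is the usual convention but is also an assumption here.

```rust
/// Shannon entropy H(P) = -sum_x P(x) ln P(x), in nats.
pub fn entropy_sketch(p: &[f64]) -> f64 {
    // Terms with P(x) = 0 are skipped (0 * ln 0 is taken as 0).
    p.iter().filter(|&&pi| pi > 0.0).map(|&pi| -pi * pi.ln()).sum()
}

/// Cross-entropy H(P, Q) = -sum_x P(x) ln Q(x).
pub fn cross_entropy_sketch(p: &[f64], q: &[f64]) -> f64 {
    assert_eq!(p.len(), q.len(), "P and Q must have the same length");
    let mut h = 0.0;
    for (&pi, &qi) in p.iter().zip(q.iter()) {
        if pi > 0.0 {
            h -= pi * qi.ln(); // becomes +inf if Q(x) = 0 where P(x) > 0
        }
    }
    h
}

/// KL divergence D_KL(P || Q) = sum_x P(x) ln(P(x) / Q(x)).
pub fn kl_divergence_sketch(p: &[f64], q: &[f64]) -> f64 {
    assert_eq!(p.len(), q.len(), "P and Q must have the same length");
    let mut d = 0.0;
    for (&pi, &qi) in p.iter().zip(q.iter()) {
        if pi > 0.0 {
            d += pi * (pi / qi).ln();
        }
    }
    d
}

/// Jensen-Shannon divergence: the average KL divergence of P and Q
/// from their midpoint M = (P + Q) / 2.
pub fn jensen_shannon_sketch(p: &[f64], q: &[f64]) -> f64 {
    assert_eq!(p.len(), q.len(), "P and Q must have the same length");
    let m: Vec<f64> = p
        .iter()
        .zip(q.iter())
        .map(|(&pi, &qi)| 0.5 * (pi + qi))
        .collect();
    0.5 * kl_divergence_sketch(p, &m) + 0.5 * kl_divergence_sketch(q, &m)
}
```

Under these definitions, cross-entropy decomposes as H(P, Q) = H(P) + D_KL(P || Q), and two distributions with disjoint support reach the Jensen-Shannon upper bound of ln(2).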