Module probabilistic

§Probabilistic Programming Module

This module provides probabilistic programming infrastructure for NumRS2: inference algorithms, Bayesian utilities, and probabilistic graphical models.

§Overview

The probabilistic module offers production-ready implementations of:

  • MCMC Inference: Metropolis-Hastings, Gibbs sampling, Hamiltonian Monte Carlo (HMC), No-U-Turn Sampler (NUTS), and Parallel Tempering
  • Variational Inference: Mean-field VI, Automatic Differentiation VI (ADVI), ELBO optimization
  • Bayesian Utilities: Conjugate priors, posterior computation, model comparison (BIC, DIC, WAIC), credible intervals, hypothesis testing
  • Graphical Models: Bayesian networks, Markov Random Fields, Hidden Markov Models, Gaussian Processes
  • Extended Distributions: Beta, Gamma, Dirichlet, Student’s t, Wishart, Inverse-Wishart, Von Mises, and more

§Mathematical Background

§Bayesian Inference

Bayesian inference provides a principled framework for updating beliefs about parameters θ given observed data D using Bayes’ theorem:

p(θ|D) = p(D|θ)p(θ) / p(D)

where:

  • p(θ|D) is the posterior distribution
  • p(D|θ) is the likelihood
  • p(θ) is the prior distribution
  • p(D) is the marginal likelihood (evidence)
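
As a worked instance of the theorem, a conjugate Beta prior with a binomial likelihood gives a closed-form posterior. The sketch below uses only the standard library; `BetaParams` and its methods are hypothetical names chosen for illustration and are not this module's API.

```rust
/// Parameters of a Beta(alpha, beta) distribution over a success probability.
/// Hypothetical illustration type, not part of this module.
#[derive(Debug, Clone, Copy, PartialEq)]
struct BetaParams {
    alpha: f64,
    beta: f64,
}

impl BetaParams {
    /// Conjugate update: a Beta prior combined with a binomial likelihood
    /// (`successes` out of `trials`) yields a Beta posterior whose parameters
    /// are the prior parameters plus the observed counts.
    fn update(self, successes: u32, trials: u32) -> BetaParams {
        assert!(successes <= trials, "successes cannot exceed trials");
        BetaParams {
            alpha: self.alpha + f64::from(successes),
            beta: self.beta + f64::from(trials - successes),
        }
    }

    /// Mean of the distribution: alpha / (alpha + beta).
    fn mean(self) -> f64 {
        self.alpha / (self.alpha + self.beta)
    }
}
```

Starting from a uniform Beta(1, 1) prior and observing 7 successes in 10 trials, `update(7, 10)` returns Beta(8, 4), whose mean is 8/12.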

§Markov Chain Monte Carlo (MCMC)

When the posterior distribution cannot be computed analytically, MCMC methods construct a Markov chain whose stationary distribution is the target posterior. Common algorithms include:

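  • Metropolis-Hastings: General-purpose MCMC that accepts or rejects draws from a proposal distribution
  • Gibbs Sampling: Samples each variable in turn from its full conditional distribution
  • Hamiltonian Monte Carlo: Uses gradient information for efficient exploration
  • NUTS: Adaptive HMC that chooses the trajectory length automatically, usually paired with step-size adaptation during warm-up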
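
To make the accept/reject step concrete, here is a minimal random-walk Metropolis sketch for a one-dimensional target. It is standard-library only and does not use this module's `MetropolisHastings` type; `SplitMix64` and `random_walk_metropolis` are hypothetical names, and the hand-rolled generator exists only so the sketch needs no crates (code inside the module should use `scirs2_core::random` instead).

```rust
/// Tiny deterministic generator (SplitMix64), used only to keep the sketch
/// dependency-free. Hypothetical helper, not part of this module.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform draw in [0, 1).
    fn uniform(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }

    /// Standard normal draw via the Box-Muller transform.
    fn normal(&mut self) -> f64 {
        let u1 = 1.0 - self.uniform(); // in (0, 1], so ln(u1) is finite
        let u2 = self.uniform();
        (-2.0 * u1.ln()).sqrt() * (2.0 * std::f64::consts::PI * u2).cos()
    }
}

/// Random-walk Metropolis for a target given by its unnormalized log-density.
/// Returns `n_samples` draws recorded after discarding `burn_in` iterations.
fn random_walk_metropolis<F: Fn(f64) -> f64>(
    log_target: F,
    initial: f64,
    step: f64,
    n_samples: usize,
    burn_in: usize,
    rng: &mut SplitMix64,
) -> Vec<f64> {
    let mut current = initial;
    let mut current_lp = log_target(current);
    let mut chain = Vec::with_capacity(n_samples);
    for i in 0..(n_samples + burn_in) {
        // Gaussian proposal centred on the current state.
        let proposal = current + step * rng.normal();
        let proposal_lp = log_target(proposal);
        // The proposal is symmetric, so the acceptance probability is
        // min(1, p(proposal) / p(current)), compared here in log space.
        if rng.uniform().ln() < proposal_lp - current_lp {
            current = proposal;
            current_lp = proposal_lp;
        }
        if i >= burn_in {
            chain.push(current);
        }
    }
    chain
}
```

Calling `random_walk_metropolis(|x| -0.5 * x * x, 0.0, 2.0, 50_000, 1_000, &mut SplitMix64(42))` targets a standard normal; the sample mean and variance of the returned chain should be close to 0 and 1.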

§Variational Inference

Variational inference approximates the posterior p(θ|D) with a simpler distribution q(θ) by minimizing the Kullback-Leibler divergence:

KL(q||p) = ∫ q(θ) log(q(θ)/p(θ|D)) dθ

This is equivalent to maximizing the Evidence Lower BOund (ELBO):

ELBO = E_q[log p(D,θ)] - E_q[log q(θ)]
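
The equivalence follows from expanding the KL divergence with p(θ|D) = p(D,θ) / p(D). The short derivation below uses the same symbols as the formulas above:

```latex
\begin{align*}
\mathrm{KL}(q \,\|\, p)
  &= \mathbb{E}_q[\log q(\theta)] - \mathbb{E}_q[\log p(\theta \mid D)] \\
  &= \mathbb{E}_q[\log q(\theta)] - \mathbb{E}_q[\log p(D, \theta)] + \log p(D) \\
  &= \log p(D) - \mathrm{ELBO}
\end{align*}
```

Because log p(D) does not depend on q, minimizing the KL divergence over q is the same as maximizing the ELBO, and since the KL divergence is non-negative, the ELBO is a lower bound on log p(D).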

§SCIRS2 Policy Compliance

This module strictly follows SCIRS2 ecosystem policies:

  • Random Number Generation: ALWAYS use scirs2_core::random (NEVER direct rand/rand_distr)
  • Array Operations: ALWAYS use scirs2_core::ndarray (NEVER direct ndarray)
  • Parallel Processing: ALWAYS use scirs2_core::parallel_ops (NEVER direct rayon)
  • Statistical Functions: Use scirs2_stats for statistical computations
  • Linear Algebra: Use scirs2_linalg for matrix operations (Pure Rust via OxiBLAS)

§Usage Examples

§Example 1: Metropolis-Hastings Sampling

use numrs2::new_modules::probabilistic::{MetropolisHastings, GaussianProposal};
use scirs2_core::random::default_rng;

// Define log-posterior function
let log_posterior = |theta: &[f64]| -> f64 {
    // Unnormalized log-density: log-likelihood + log-prior.
    // This toy target has no likelihood term, only a standard normal prior on theta[0].
    -0.5 * theta[0].powi(2)
};

// Create sampler with Gaussian proposal
let mut rng = default_rng();
let proposal = GaussianProposal::new(0.5)?; // Proposal std dev
let mut sampler = MetropolisHastings::new(log_posterior, proposal);

// Run MCMC for 10,000 iterations
let initial_state = vec![0.0];
let samples = sampler.sample(&initial_state, 10000, 1000, &mut rng)?;

§Example 2: Bayesian Linear Regression

use numrs2::new_modules::probabilistic::{BayesianLinearRegression, NormalInverseGammaPrior};
use numrs2::prelude::*;

// Data: y = 2*x + 1 + noise
let x = linspace(0.0, 10.0, 100).reshape(&[100, 1]);
let y = x.multiply_scalar(2.0).add_scalar(1.0).add(&randn(&[100]));

// Set up conjugate prior
let prior = NormalInverseGammaPrior::default();

// Compute posterior
let posterior = prior.update(&x, &y)?;

// Sample from posterior predictive
let x_new = linspace(10.0, 15.0, 50).reshape(&[50, 1]);
let y_pred = posterior.predict(&x_new)?;

§Example 3: Model Comparison with WAIC

use numrs2::new_modules::probabilistic::{ModelComparison, waic};

// Compute WAIC for model selection
let log_likelihood_samples = /* MCMC samples of log-likelihood */;
let waic_score = waic(&log_likelihood_samples)?;
println!("WAIC: {}", waic_score.waic);
println!("Effective parameters: {}", waic_score.p_waic);
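
For reference, the quantities reported above can be computed from a matrix of pointwise log-likelihoods as in the sketch below. It follows the usual definition (lppd minus the summed posterior variance of the log-likelihood, on the deviance scale) and assumes the input is indexed as `log_lik[draw][observation]`. `WaicSketch`, `waic_sketch`, and `log_mean_exp` are hypothetical names, and the module's own `waic` function may expect a different input type or use a different variant.

```rust
/// Hypothetical result type for this sketch; not the module's return type.
struct WaicSketch {
    waic: f64,
    p_waic: f64,
    lppd: f64,
}

/// log(mean(exp(x))) computed with a max shift to avoid underflow.
fn log_mean_exp(xs: &[f64]) -> f64 {
    let max = xs.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    let sum: f64 = xs.iter().map(|x| (x - max).exp()).sum();
    max + (sum / xs.len() as f64).ln()
}

/// WAIC from `log_lik[s][i]`, the log-likelihood of observation `i`
/// under posterior draw `s`. Requires at least two draws.
fn waic_sketch(log_lik: &[Vec<f64>]) -> WaicSketch {
    let n_draws = log_lik.len();
    assert!(n_draws >= 2, "need at least two posterior draws");
    let n_obs = log_lik[0].len();
    assert!(log_lik.iter().all(|row| row.len() == n_obs), "ragged input");

    let mut lppd = 0.0;
    let mut p_waic = 0.0;
    for i in 0..n_obs {
        let column: Vec<f64> = log_lik.iter().map(|row| row[i]).collect();
        // Log pointwise predictive density: log of the posterior-mean likelihood.
        lppd += log_mean_exp(&column);
        // Effective parameters: sample variance of the log-likelihood across draws.
        let mean = column.iter().sum::<f64>() / n_draws as f64;
        let ss: f64 = column.iter().map(|v| (v - mean).powi(2)).sum();
        p_waic += ss / (n_draws as f64 - 1.0);
    }
    WaicSketch { waic: -2.0 * (lppd - p_waic), p_waic, lppd }
}
```

When every draw assigns the same log-likelihood to each observation, the variance term vanishes and WAIC reduces to -2 times the summed log-likelihood.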

§Performance Considerations

  • SIMD Optimization: Distribution operations use SIMD when applicable
  • Parallel MCMC: Multiple chains can run in parallel using scirs2_core::parallel_ops
  • Memory Efficiency: Streaming algorithms for large-scale inference
  • Numerical Stability: Log-space computations to prevent underflow
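
The log-space point can be illustrated with the log-sum-exp trick, a common way to add probabilities that are stored as logs. This is a generic standard-library sketch, not code taken from this module, and `log_sum_exp` is a hypothetical name.

```rust
/// Computes log(sum_i exp(xs[i])) without underflow by factoring out the
/// largest term: log(sum exp(x)) = m + log(sum exp(x - m)), with m = max(x).
fn log_sum_exp(xs: &[f64]) -> f64 {
    let max = xs.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    if !max.is_finite() {
        // Empty input or all -inf: the sum is 0, so its log is -inf.
        // An input containing +inf returns +inf.
        return max;
    }
    max + xs.iter().map(|x| (x - max).exp()).sum::<f64>().ln()
}
```

For two log-probabilities of -1000.0, the naive `exp`, sum, `ln` sequence underflows to negative infinity, while `log_sum_exp` returns -1000 + ln 2.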

§References

  • Gelman, A., et al. (2013). Bayesian Data Analysis (3rd ed.). Chapman and Hall/CRC.
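  • Neal, R. M. (2011). MCMC using Hamiltonian dynamics. In Handbook of Markov Chain Monte Carlo. Chapman and Hall/CRC.
  • Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15, 1593-1623.
  • Blei, D. M., Kucukelbir, A., & McAuliffe, J. D. (2017). Variational Inference: A Review for Statisticians. Journal of the American Statistical Association, 112(518), 859-877.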
  • Bishop, C. M. (2006). Pattern Recognition and Machine Learning. Springer.

Re-exports§

pub use bayesian::*;
pub use distributions::*;
pub use graphical::*;
pub use inference::*;

Modules§

bayesian
Bayesian Inference Utilities
distributions
Extended Probability Distributions
graphical
Probabilistic Graphical Models
inference
Probabilistic Inference Algorithms

Enums§

ProbabilisticError
Comprehensive error type for probabilistic programming operations

Functions§

validate_non_negative
Helper function to validate non-negative parameter values
validate_positive
Helper function to validate positive parameter values
validate_probability
Helper function to validate probability values
validate_shape
Helper function to validate that array shapes match

Type Aliases§

Result
Result type for probabilistic operations