§Probabilistic Programming Module
This module provides comprehensive probabilistic programming infrastructure for NumRS2, including advanced inference algorithms, Bayesian utilities, and probabilistic graphical models.
§Overview
The probabilistic module offers production-ready implementations of:
- MCMC Inference: Metropolis-Hastings, Gibbs sampling, Hamiltonian Monte Carlo (HMC), No-U-Turn Sampler (NUTS), and Parallel Tempering
- Variational Inference: Mean-field VI, Automatic Differentiation VI (ADVI), ELBO optimization
- Bayesian Utilities: Conjugate priors, posterior computation, model comparison (BIC, DIC, WAIC), credible intervals, hypothesis testing
- Graphical Models: Bayesian networks, Markov Random Fields, Hidden Markov Models, Gaussian Processes
- Extended Distributions: Beta, Gamma, Dirichlet, Student’s t, Wishart, Inverse-Wishart, Von Mises, and more
§Mathematical Background
§Bayesian Inference
Bayesian inference provides a principled framework for updating beliefs about parameters θ given observed data D using Bayes’ theorem:
p(θ|D) = p(D|θ)p(θ) / p(D)
where:
- p(θ|D) is the posterior distribution
- p(D|θ) is the likelihood
- p(θ) is the prior distribution
- p(D) is the marginal likelihood (evidence)
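As a concrete instance of the theorem above, the following sketch works through a conjugate Beta-Binomial update, one of the few cases where the posterior is available in closed form. The `Beta` struct and its `update` and `mean` methods are names invented for this illustration and are not this module's API; only the standard library is used.

```rust
/// Parameters of a Beta(alpha, beta) distribution over a success probability.
/// Hypothetical type for illustration only.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Beta {
    alpha: f64,
    beta: f64,
}

impl Beta {
    /// Conjugate update: a Beta prior combined with a binomial likelihood
    /// yields a Beta posterior whose parameters are the prior parameters
    /// plus the observed counts.
    fn update(self, successes: u32, failures: u32) -> Beta {
        Beta {
            alpha: self.alpha + successes as f64,
            beta: self.beta + failures as f64,
        }
    }

    /// Mean of the distribution, alpha / (alpha + beta).
    fn mean(self) -> f64 {
        self.alpha / (self.alpha + self.beta)
    }
}

fn main() {
    // Prior p(θ): Beta(1, 1), i.e. uniform on [0, 1].
    let prior = Beta { alpha: 1.0, beta: 1.0 };
    // Data D: 7 successes and 3 failures.
    let posterior = prior.update(7, 3);
    println!("posterior = Beta({}, {})", posterior.alpha, posterior.beta);
    println!("posterior mean = {:.4}", posterior.mean());
}
```

Here the evidence p(D) never has to be computed explicitly, because conjugacy fixes the functional form of the posterior.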
§Markov Chain Monte Carlo (MCMC)
When the posterior distribution cannot be computed analytically, MCMC methods construct a Markov chain whose stationary distribution is the target posterior. Common algorithms include:
- Metropolis-Hastings: Generic MCMC using proposal distributions
- Gibbs Sampling: Samples from conditional distributions
- Hamiltonian Monte Carlo: Uses gradient information for efficient exploration
- NUTS: Adaptive HMC that chooses the trajectory length automatically, usually combined with step-size adaptation during warm-up
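To make the first algorithm in the list above concrete, here is a minimal random-walk Metropolis-Hastings sketch for a one-dimensional target. It uses a small hand-written SplitMix64 generator purely so that the example has no dependencies; code inside this module would be expected to draw random numbers through scirs2_core::random instead. The names `SplitMix64` and `metropolis_hastings` are hypothetical and do not describe the module's own `MetropolisHastings` type.

```rust
/// Tiny deterministic generator (SplitMix64). A stand-in that keeps this
/// sketch dependency-free; it is not the RNG this module uses.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform draw strictly inside (0, 1), built from the top 52 bits.
    fn uniform(&mut self) -> f64 {
        ((self.next_u64() >> 12) as f64 + 0.5) / (1u64 << 52) as f64
    }

    /// Standard normal draw via the Box-Muller transform.
    fn normal(&mut self) -> f64 {
        let (u1, u2) = (self.uniform(), self.uniform());
        (-2.0 * u1.ln()).sqrt() * (2.0 * std::f64::consts::PI * u2).cos()
    }
}

/// Random-walk Metropolis-Hastings for a one-dimensional target.
/// `log_target` is the unnormalized log density and `step` is the
/// standard deviation of the Gaussian proposal.
fn metropolis_hastings<F: Fn(f64) -> f64>(
    log_target: F,
    init: f64,
    step: f64,
    n_samples: usize,
    burn_in: usize,
    rng: &mut SplitMix64,
) -> Vec<f64> {
    let mut x = init;
    let mut log_p = log_target(x);
    let mut samples = Vec::with_capacity(n_samples);
    for i in 0..(burn_in + n_samples) {
        let proposal = x + step * rng.normal();
        let log_p_new = log_target(proposal);
        // The proposal is symmetric, so the acceptance probability reduces
        // to min(1, p(new) / p(old)), evaluated here in log space.
        if rng.uniform().ln() < log_p_new - log_p {
            x = proposal;
            log_p = log_p_new;
        }
        // Keep the current state (moved or not) once burn-in is over.
        if i >= burn_in {
            samples.push(x);
        }
    }
    samples
}

fn main() {
    let mut rng = SplitMix64(42);
    // Target: standard normal, log p(x) = -x^2 / 2 up to a constant.
    let samples = metropolis_hastings(|x| -0.5 * x * x, 0.0, 1.0, 50_000, 1_000, &mut rng);
    let n = samples.len() as f64;
    let mean = samples.iter().sum::<f64>() / n;
    let var = samples.iter().map(|s| (s - mean).powi(2)).sum::<f64>() / n;
    println!("mean = {:.3}, variance = {:.3}", mean, var);
}
```

For a standard normal target the printed mean and variance should land near 0 and 1. Successive samples are correlated, so the estimates carry more noise than the same number of independent draws would.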
§Variational Inference
Variational inference approximates the posterior p(θ|D) with a simpler distribution q(θ) by minimizing the Kullback-Leibler divergence:
KL(q||p) = ∫ q(θ) log(q(θ)/p(θ|D)) dθ
Because log p(D) = ELBO + KL(q||p) and log p(D) does not depend on q, this is equivalent to maximizing the Evidence Lower BOund (ELBO):
ELBO = E_q[log p(D,θ)] - E_q[log q(θ)]
§SCIRS2 Policy Compliance
This module strictly follows SCIRS2 ecosystem policies:
- Random Number Generation: ALWAYS use scirs2_core::random (NEVER direct rand/rand_distr)
- Array Operations: ALWAYS use scirs2_core::ndarray (NEVER direct ndarray)
- Parallel Processing: ALWAYS use scirs2_core::parallel_ops (NEVER direct rayon)
- Statistical Functions: Use scirs2_stats for statistical computations
- Linear Algebra: Use scirs2_linalg for matrix operations (Pure Rust via OxiBLAS)
§Usage Examples
§Example 1: Metropolis-Hastings Sampling
use numrs2::new_modules::probabilistic::{MetropolisHastings, GaussianProposal};
use scirs2_core::random::default_rng;
// Define log-posterior function
let log_posterior = |theta: &[f64]| -> f64 {
    // Log-likelihood + log-prior
    -0.5 * theta[0].powi(2) // Standard normal prior
};
// Create sampler with Gaussian proposal
let mut rng = default_rng();
let proposal = GaussianProposal::new(0.5)?; // Proposal std dev
let mut sampler = MetropolisHastings::new(log_posterior, proposal);
// Run MCMC for 10,000 iterations
let initial_state = vec![0.0];
let samples = sampler.sample(&initial_state, 10000, 1000, &mut rng)?;
§Example 2: Bayesian Linear Regression
use numrs2::new_modules::probabilistic::{BayesianLinearRegression, NormalInverseGammaPrior};
use numrs2::prelude::*;
// Data: y = 2*x + 1 + noise
let x = linspace(0.0, 10.0, 100).reshape(&[100, 1]);
let y = x.multiply_scalar(2.0).add_scalar(1.0).add(&randn(&[100]));
// Set up conjugate prior
let prior = NormalInverseGammaPrior::default();
// Compute posterior
let posterior = prior.update(&x, &y)?;
// Sample from posterior predictive
let x_new = linspace(10.0, 15.0, 50).reshape(&[50, 1]);
let y_pred = posterior.predict(&x_new)?;
§Example 3: Model Comparison with WAIC
use numrs2::new_modules::probabilistic::{ModelComparison, waic};
// Compute WAIC for model selection
let log_likelihood_samples = /* MCMC samples of log-likelihood */;
let waic_score = waic(&log_likelihood_samples)?;
println!("WAIC: {}", waic_score.waic);
println!("Effective parameters: {}", waic_score.p_waic);
§Performance Considerations
- SIMD Optimization: Distribution operations use SIMD when applicable
- Parallel MCMC: Multiple chains can run in parallel using scirs2_core::parallel_ops
- Memory Efficiency: Streaming algorithms for large-scale inference
- Numerical Stability: Log-space computations to prevent underflow
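The last point can be illustrated with the log-sum-exp trick, a standard way to add probabilities that are stored as logarithms. The helper `log_sum_exp` below is a hypothetical, standard-library-only sketch; whether this module exposes a function with that name or signature is an assumption.

```rust
/// Computes ln(sum_i exp(x_i)) without overflow or underflow by factoring
/// out the largest term before exponentiating.
fn log_sum_exp(log_values: &[f64]) -> f64 {
    let max = log_values.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    if max.is_infinite() {
        // Empty input or all terms equal to -inf (the sum is 0), or some
        // term equal to +inf (the sum is infinite).
        return max;
    }
    let sum: f64 = log_values.iter().map(|&v| (v - max).exp()).sum();
    max + sum.ln()
}

fn main() {
    // Two log-likelihoods whose exponentials are too small for an f64.
    let logs: [f64; 2] = [-1000.0, -1000.0];
    // Naive evaluation exponentiates first, so both terms underflow to 0.
    let naive = logs.iter().map(|v| v.exp()).sum::<f64>().ln();
    let stable = log_sum_exp(&logs);
    println!("naive  = {}", naive);
    println!("stable = {}", stable);
}
```

The naive route loses the result because both terms underflow before they are summed, while the shifted version only ever exponentiates values that are at most 0.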
§References
- Gelman, A., et al. (2013). Bayesian Data Analysis (3rd ed.). Chapman and Hall/CRC.
- Neal, R. M. (2011). MCMC using Hamiltonian dynamics. Handbook of Markov Chain Monte Carlo.
- Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn Sampler. Journal of Machine Learning Research.
- Blei, D. M., Kucukelbir, A., & McAuliffe, J. D. (2017). Variational Inference: A Review for Statisticians. Journal of the American Statistical Association.
- Bishop, C. M. (2006). Pattern Recognition and Machine Learning. Springer.
Re-exports§
pub use bayesian::*;
pub use distributions::*;
pub use graphical::*;
pub use inference::*;
Modules§
- bayesian: Bayesian Inference Utilities
- distributions: Extended Probability Distributions
- graphical: Probabilistic Graphical Models
- inference: Probabilistic Inference Algorithms
Enums§
- ProbabilisticError: Comprehensive error type for probabilistic programming operations
Functions§
- validate_non_negative: Helper function to validate non-negative parameter values
- validate_positive: Helper function to validate positive parameter values
- validate_probability: Helper function to validate probability values
- validate_shape: Helper function to validate array shapes match
Type Aliases§
- Result: Result type for probabilistic operations