Crate rmt


§rmt

Random Matrix Theory: eigenvalue distributions and spectral statistics.

§The Core Idea

When you have a large random matrix, its eigenvalues follow predictable distributions. This is surprising: randomness at the element level produces order at the spectral level.

§Key Distributions

| Distribution | Matrix Type | Density |
|---|---|---|
| [marchenko_pastur] | Wishart (X^T X) | Bounded support |
| [wigner_semicircle] | Symmetric random | Semicircle |
| [tracy_widom] | Largest eigenvalue | Skewed |

§Quick Start

use rmt::{marchenko_pastur_density, wigner_semicircle_density, sample_wishart};
use ndarray::Array2;

// Marchenko-Pastur: eigenvalue density of X^T X / n
let ratio = 0.5;  // p/n (dimensions / samples)
let density = marchenko_pastur_density(1.5, ratio, 1.0);

// Wigner semicircle: eigenvalue density of symmetric matrix
let density = wigner_semicircle_density(0.5, 1.0);

// Sample a Wishart matrix
let (n, p) = (100, 50);
let wishart = sample_wishart(n, p);

§Why RMT for ML?

  • Covariance matrices: For pure-noise data, sample covariance eigenvalues follow Marchenko-Pastur
  • Neural networks: Weight matrix spectra reveal training dynamics
  • PCA: Distinguish signal from noise eigenvalues
  • Regularization: Set shrinkage based on spectral distribution
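As an illustration of the PCA bullet, the sketch below keeps only the eigenvalues that exceed the Marchenko-Pastur upper edge λ_+ = σ² (1 + √(p/n))². The names `mp_upper_edge` and `signal_eigenvalues` are hypothetical helpers written for this example and are not part of this crate's API. The sketch assumes the noise variance `sigma2` is known and ignores finite-size fluctuations around the edge.

```rust
/// Upper Marchenko-Pastur edge: sigma2 * (1 + sqrt(p / n))^2.
/// Hypothetical helper, not a function exported by this crate.
fn mp_upper_edge(n: usize, p: usize, sigma2: f64) -> f64 {
    let gamma = p as f64 / n as f64;
    sigma2 * (1.0 + gamma.sqrt()).powi(2)
}

/// Returns the eigenvalues of the sample covariance that lie above the
/// noise edge. Assumes `sigma2` (the noise variance) is known in advance.
fn signal_eigenvalues(eigs: &[f64], n: usize, p: usize, sigma2: f64) -> Vec<f64> {
    let edge = mp_upper_edge(n, p, sigma2);
    eigs.iter().copied().filter(|&l| l > edge).collect()
}
```

With n = 100 samples, p = 25 features, and unit noise variance, the edge is (1 + 0.5)² = 2.25, so only eigenvalues above 2.25 would be treated as signal. In practice a small margin above the edge is often added to account for finite n.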

§The Marchenko-Pastur Law

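For a matrix X (n samples, p features) whose entries are i.i.d. with zero mean and variance σ², the eigenvalues of X^T X / n cluster in [λ_-, λ_+] as n, p → ∞ with p/n → γ ∈ (0, 1], where:

λ_± = σ² (1 ± √γ)²

Density: ρ(λ) = √((λ_+ - λ)(λ - λ_-)) / (2πσ²γλ)  for λ ∈ [λ_-, λ_+]

For γ > 1, X^T X / n is rank-deficient, so the law has a point mass of weight 1 - 1/γ at zero in addition to this density.

When γ → 0, this converges to a point mass at σ² (classical regime). When γ stays bounded away from 0, the eigenvalues spread over an interval of width 4σ²√γ (high-dimensional regime).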
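The formulas above can be evaluated directly. The following is a minimal standalone sketch, not the crate's implementation: `mp_support` and `mp_density` are hypothetical names, `gamma` stands for the ratio p/n, and `sigma2` for the variance σ². Only the continuous part of the law is computed, so any point mass at zero (which appears when p/n > 1) is not represented.

```rust
use std::f64::consts::PI;

/// Support edges (lambda_minus, lambda_plus) = sigma2 * (1 -/+ sqrt(gamma))^2.
/// Hypothetical helper; `gamma` is the ratio p/n.
fn mp_support(gamma: f64, sigma2: f64) -> (f64, f64) {
    let s = gamma.sqrt();
    (sigma2 * (1.0 - s).powi(2), sigma2 * (1.0 + s).powi(2))
}

/// Continuous part of the Marchenko-Pastur density at `lambda`.
/// Returns 0.0 outside the open interval (lambda_minus, lambda_plus).
fn mp_density(lambda: f64, gamma: f64, sigma2: f64) -> f64 {
    let (lo, hi) = mp_support(gamma, sigma2);
    if lambda <= lo || lambda >= hi {
        return 0.0;
    }
    ((hi - lambda) * (lambda - lo)).sqrt() / (2.0 * PI * sigma2 * gamma * lambda)
}
```

For p/n = 0.25 and σ² = 1 the support is [0.25, 2.25], and integrating `mp_density` numerically over that interval gives a total mass close to 1, which is a quick sanity check on the normalization.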

§The Wigner Semicircle

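For an N × N symmetric matrix whose entries on and above the diagonal are independent with zero mean, and whose off-diagonal entries have variance σ²/N, the eigenvalues follow, as N → ∞: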

ρ(λ) = (1/(2πσ²)) × √(4σ² - λ²)  for |λ| ≤ 2σ
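The semicircle formula translates into a few lines of code. This is a standalone sketch under the normalization stated in the formula (support [-2σ, 2σ]); `semicircle_density` is a hypothetical name and may not match the argument convention of the crate's `wigner_semicircle_density`.

```rust
use std::f64::consts::PI;

/// Wigner semicircle density sqrt(4*sigma^2 - lambda^2) / (2*pi*sigma^2),
/// and 0.0 for |lambda| >= 2*sigma. Hypothetical helper for illustration.
fn semicircle_density(lambda: f64, sigma: f64) -> f64 {
    let r2 = 4.0 * sigma * sigma - lambda * lambda;
    if r2 <= 0.0 {
        return 0.0;
    }
    r2.sqrt() / (2.0 * PI * sigma * sigma)
}
```

At λ = 0 with σ = 1 the density equals 1/π, the peak of the semicircle, and it vanishes at the edges λ = ±2.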

§Connections

  • wass: Wishart matrices → covariance → transport costs
  • lapl: Graph Laplacian eigenvalues follow RMT under random graphs
  • rkhs: Kernel matrix eigenspectra for kernel PCA

§What Can Go Wrong

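  1. Finite size effects: MP and the semicircle are asymptotic laws. For small n the empirical spectrum deviates from them, most visibly near the edges.
  2. Not centered: MP assumes zero-mean data. Center your features before forming the covariance.
  3. Correlated features: MP assumes independent entries. Correlated features produce a different limiting spectrum.
  4. Ratio out of range: MP needs p/n ∈ (0, ∞). For p/n > 1 the spectrum also has a point mass at zero. MP describes the bulk only; fluctuations of the largest eigenvalue around λ_+ follow Tracy-Widom.
  5. Numerical eigendecomposition: For large matrices, use iterative methods (e.g. Lanczos) when only part of the spectrum is needed.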
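The centering step from item 2 can be sketched as follows. `center_columns` is a hypothetical helper, not part of this crate; it assumes the data is stored as a slice of rows (n samples, each a `Vec` of p features) and that every row has the same length.

```rust
/// Subtracts each column's mean in place so every feature has zero mean.
/// `rows` holds n samples with p features each; all rows are assumed to
/// have the same length. Hypothetical helper for illustration.
fn center_columns(rows: &mut [Vec<f64>]) {
    if rows.is_empty() {
        return;
    }
    let n = rows.len() as f64;
    let p = rows[0].len();
    for j in 0..p {
        // Mean of feature j across all samples.
        let mean = rows.iter().map(|r| r[j]).sum::<f64>() / n;
        for r in rows.iter_mut() {
            r[j] -= mean;
        }
    }
}
```

Without this step, a nonzero mean adds a rank-one component to X^T X / n, which typically shows up as a single large eigenvalue far outside the Marchenko-Pastur bulk.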

§References

  • Marchenko & Pastur (1967). “Distribution of eigenvalues for some sets of random matrices”
  • Wigner (1955). “Characteristic vectors of bordered matrices with infinite dimensions”
  • Johnstone (2001). “On the distribution of the largest eigenvalue in principal components analysis”

Enums§

Error

Functions§

empirical_spectral_density
Empirical spectral density via histogram.
level_spacing_ratios
Level spacing ratio for eigenvalue sequence.
marchenko_pastur_density
Marchenko-Pastur density at point λ.
marchenko_pastur_support
Marchenko-Pastur support bounds [λ_-, λ_+].
mean_spacing_ratio
Mean level spacing ratio.
sample_goe
Sample a GOE (Gaussian Orthogonal Ensemble) matrix.
sample_goe_faer
Sample a GOE (Gaussian Orthogonal Ensemble) matrix.
sample_wishart
Sample a Wishart matrix: W = X^T X where X is n × p Gaussian.
sample_wishart_faer
Sample a Wishart matrix: W = X^T X where X is n × p Gaussian.
stieltjes_transform
Stieltjes transform: m(z) = (1/n) Σ 1/(λ_i - z)
wigner_semicircle_density
Wigner semicircle density at point λ.

Type Aliases§

Result