Module natural_gradient

Natural gradient optimization on statistical manifolds

§Natural Gradient Optimization

This module implements natural gradient descent algorithms for optimization on statistical and Riemannian manifolds, using tools from information geometry to improve convergence.

§Mathematical Background

Natural gradient descent modifies standard gradient descent by using the Fisher information matrix (or more generally, a Riemannian metric) to precondition the gradient updates:

θ_{t+1} = θ_t - α G^{-1}(θ_t) ∇f(θ_t)

where G(θ) is the Fisher information matrix or Riemannian metric tensor.
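The update rule above can be sketched concretely. The function below is illustrative only (the names and fixed 2×2 layout are not part of this module's API): it preconditions the gradient with the inverse metric and takes one step.

```rust
// One natural-gradient step for a two-parameter model.
// `fisher` plays the role of G(θ); all names here are illustrative.
fn natural_gradient_step(
    theta: [f64; 2],
    grad: [f64; 2],
    fisher: [[f64; 2]; 2],
    alpha: f64,
) -> [f64; 2] {
    // Invert the 2x2 metric explicitly: G^{-1} = adj(G) / det(G).
    let det = fisher[0][0] * fisher[1][1] - fisher[0][1] * fisher[1][0];
    assert!(det.abs() > 1e-12, "Fisher matrix must be non-singular");
    let inv = [
        [fisher[1][1] / det, -fisher[0][1] / det],
        [-fisher[1][0] / det, fisher[0][0] / det],
    ];
    // Precondition the gradient: d = G^{-1}(θ) ∇f(θ).
    let d = [
        inv[0][0] * grad[0] + inv[0][1] * grad[1],
        inv[1][0] * grad[0] + inv[1][1] * grad[1],
    ];
    // θ_{t+1} = θ_t - α d
    [theta[0] - alpha * d[0], theta[1] - alpha * d[1]]
}

fn main() {
    // With G = I the step reduces to plain gradient descent.
    let next = natural_gradient_step([1.0, 1.0], [0.5, -0.5], [[1.0, 0.0], [0.0, 1.0]], 0.1);
    println!("{:?}", next);
}
```

In practice one solves the linear system G d = ∇f (often with damping) rather than forming G^{-1} explicitly; the explicit inverse here is only for clarity at 2×2 scale.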

For statistical manifolds, the Fisher information matrix is:

G_{ij}(θ) = E[∂_i log p(x|θ) ∂_j log p(x|θ)]

Preconditioning by G makes the updates invariant under smooth reparameterizations of θ, and it often converges faster than standard gradient descent because each step follows the steepest-descent direction as measured by the manifold's metric rather than by the chosen coordinates.
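As a concrete instance of the expectation above (a standalone sketch, not this module's API): for the Bernoulli family p(x|θ) = θ^x (1-θ)^{1-x}, the score is ∂_θ log p = x/θ - (1-x)/(1-θ), and the expectation can be taken exactly over x ∈ {0, 1}, recovering the closed form G(θ) = 1/(θ(1-θ)).

```rust
// Score function ∂_θ log p(x|θ) for the Bernoulli distribution,
// where log p(x|θ) = x ln θ + (1-x) ln(1-θ).
fn bernoulli_score(x: f64, theta: f64) -> f64 {
    x / theta - (1.0 - x) / (1.0 - theta)
}

// Fisher information G(θ) = E[(∂_θ log p)^2], computed as the exact
// expectation over the two outcomes x ∈ {0, 1} with P(x=1) = θ.
fn bernoulli_fisher(theta: f64) -> f64 {
    theta * bernoulli_score(1.0, theta).powi(2)
        + (1.0 - theta) * bernoulli_score(0.0, theta).powi(2)
}

fn main() {
    // Matches the closed form 1/(θ(1-θ)): e.g. 1/(0.25 * 0.75) = 16/3.
    println!("{}", bernoulli_fisher(0.25));
}
```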

Modules§

info_geom
Information geometry utilities for statistical manifolds

Structs§

NaturalGradientConfig
Configuration for natural gradient optimization
NaturalGradientOptimizer
Natural gradient optimizer for statistical manifolds
NaturalGradientResult
Results from natural gradient optimization

Traits§

ObjectiveWithFisher
Trait for defining objective functions with Fisher information
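The trait's actual signature is not reproduced on this page. As a purely hypothetical sketch of what an objective-with-Fisher abstraction could provide (value, Euclidean gradient, and Fisher matrix), and of how a natural-gradient step would consume it:

```rust
// Hypothetical shape of an objective-with-Fisher trait; the crate's real
// `ObjectiveWithFisher` signature may differ.
trait ObjectiveWithFisher {
    /// Objective value f(θ).
    fn value(&self, theta: &[f64]) -> f64;
    /// Euclidean gradient ∇f(θ).
    fn gradient(&self, theta: &[f64]) -> Vec<f64>;
    /// Fisher information matrix G(θ), row-major.
    fn fisher(&self, theta: &[f64]) -> Vec<Vec<f64>>;
}

// Illustrative objective: negative log-likelihood of a Bernoulli model
// with observed success rate `p_hat`.
struct BernoulliNll {
    p_hat: f64,
}

impl ObjectiveWithFisher for BernoulliNll {
    fn value(&self, theta: &[f64]) -> f64 {
        let t = theta[0];
        -(self.p_hat * t.ln() + (1.0 - self.p_hat) * (1.0 - t).ln())
    }
    fn gradient(&self, theta: &[f64]) -> Vec<f64> {
        let t = theta[0];
        vec![-(self.p_hat / t - (1.0 - self.p_hat) / (1.0 - t))]
    }
    fn fisher(&self, theta: &[f64]) -> Vec<Vec<f64>> {
        let t = theta[0];
        vec![vec![1.0 / (t * (1.0 - t))]]
    }
}

fn main() {
    let obj = BernoulliNll { p_hat: 0.3 };
    // One natural-gradient step from θ = 0.5: θ' = θ - α G^{-1} ∇f, with α = 1.
    let theta = 0.5;
    let g = obj.gradient(&[theta])[0];
    let fisher = obj.fisher(&[theta])[0][0];
    let next = theta - g / fisher;
    println!("{}", next);
}
```

In this one-dimensional Bernoulli example the step with α = 1 lands exactly on the maximum-likelihood estimate θ = p̂ = 0.3, illustrating why Fisher preconditioning can converge much faster than the raw gradient.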