Crate gaussian_kde


gaussian_kde provides multivariate kernel density estimation (KDE) with Gaussian kernels and optionally weighted data points.

Given a dataset $X = \{\bm{x}_1, \cdots, \bm{x}_n\}$ sampled from an arbitrary probability density function (PDF), the underlying PDF is estimated as a weighted sum of kernel functions $K_H$ centered at the points of the original dataset: \[ f_\mathrm{KDE}(\bm{x}) = \frac{1}{\sum_i w_i} \sum_{i=1}^n w_i \, K_H\left(\bm{x} - \bm{x}_i\right). \] Here, $w_i$ is the weight of data point $\bm{x}_i$ (all weights are equal for unweighted data) and $H$ is the bandwidth matrix.

Specifically, this crate implements KDE with multivariate normal kernels and covariance-based bandwidths, \[ K_H(\bm{y}) = \frac{1}{\sqrt{(2\pi)^d \det H}} \exp\left(- \frac{1}{2} \bm{y}^\top H^{-1} \bm{y}\right) \quad \text{and} \quad H = h^2 V,\] where $d$ is the dimension of the data points, $h$ is the scalar bandwidth factor and $V$ is the dataset’s covariance matrix. Inserting this into the equation above and using $\det H = h^{2d} \det V$, the density estimate reads \[ f_\mathrm{KDE}(\bm{x}) = \frac{1}{h^d \sqrt{(2\pi)^d \det V} \sum_i w_i} \sum_{i=1}^n w_i \, \exp\left(- \frac{1}{2h^2}(\bm{x} - \bm{x}_i)^\top V^{-1}(\bm{x} - \bm{x}_i)\right). \] For more details on (multivariate) kernel density estimation, see e.g. [1, 2].
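To make the final formula concrete, the following minimal sketch evaluates it by hand for the one-dimensional case ($d = 1$), where $V$ reduces to a scalar variance and $\det V = V$. The function name kde_1d and its parameters are hypothetical and are not part of this crate's API; the variance is passed in explicitly so that no assumption is made about how the crate computes $V$ for weighted data.

```rust
use std::f64::consts::PI;

/// Evaluate the weighted one-dimensional (d = 1) Gaussian KDE at `x`.
///
/// Hypothetical helper for illustration only, not this crate's API.
/// `h` is the scalar bandwidth factor and `variance` plays the role of V.
fn kde_1d(x: f64, data: &[f64], weights: &[f64], h: f64, variance: f64) -> f64 {
    assert_eq!(data.len(), weights.len(), "one weight per data point");

    let weight_sum: f64 = weights.iter().sum();

    // Normalization h^d * sqrt((2 pi)^d det V) * sum_i w_i with d = 1.
    let norm = h * (2.0 * PI * variance).sqrt() * weight_sum;

    // Weighted sum of the Gaussian exponentials centered at each data point.
    let sum: f64 = data
        .iter()
        .zip(weights)
        .map(|(&xi, &wi)| {
            let diff = x - xi;
            wi * (-diff * diff / (2.0 * h * h * variance)).exp()
        })
        .sum();

    sum / norm
}
```

With a single data point at 0, unit weight, $h = 1$ and unit variance, kde_1d(0.0, ...) reduces to the peak of the standard normal density, $1/\sqrt{2\pi}$. Because the weights enter both the sum and the normalization, rescaling all weights by a common factor leaves the estimate unchanged.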

This implementation is largely based on SciPy's scipy.stats.gaussian_kde.
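For orientation, the sketch below shows how SciPy's gaussian_kde selects the scalar bandwidth factor $h$ under Scott's and Silverman's rules, using the effective sample size $n_\mathrm{eff} = (\sum_i w_i)^2 / \sum_i w_i^2$ for weighted data. The free functions are hypothetical and written for illustration; whether this crate's ScottBandwidth and SilvermanBandwidth use exactly these expressions is an assumption that should be checked against their documentation.

```rust
/// Effective sample size (sum w_i)^2 / sum w_i^2.
/// Equals the number of points when all weights are equal.
/// Hypothetical helper, not this crate's API.
fn effective_sample_size(weights: &[f64]) -> f64 {
    let sum: f64 = weights.iter().sum();
    let sum_sq: f64 = weights.iter().map(|w| w * w).sum();
    sum * sum / sum_sq
}

/// Scott's rule in SciPy's convention: h = n_eff^(-1 / (d + 4)).
fn scott_factor(n_eff: f64, d: usize) -> f64 {
    n_eff.powf(-1.0 / (d as f64 + 4.0))
}

/// Silverman's rule in SciPy's convention:
/// h = (n_eff * (d + 2) / 4)^(-1 / (d + 4)).
fn silverman_factor(n_eff: f64, d: usize) -> f64 {
    (n_eff * (d as f64 + 2.0) / 4.0).powf(-1.0 / (d as f64 + 4.0))
}
```

In both rules $h$ shrinks as the effective sample size grows, so larger datasets yield narrower kernels. For $d = 2$ the two expressions coincide, since $(d + 2)/4 = 1$.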


[1] Gramacki, Artur. Nonparametric Kernel Density Estimation and Its Computational Aspects. Vol. 37. Studies in Big Data. Springer, 2018.

[2] Scott, David W. Multivariate Density Estimation: Theory, Practice, and Visualization. Second edition. Wiley, 2014.

Structs§

GaussianKDE
Multivariate kernel density estimation with Gaussian kernels and optionally weighted data points.
KDEError
General error type for errors that occur during KDE calculation.
ScottBandwidth
Select the scalar bandwidth factor according to Scott’s rule.
SilvermanBandwidth
Select the scalar bandwidth factor according to Silverman’s rule of thumb.

Enums§

ErrorKind

Traits§

Bandwidth
General trait to customize the selection of the scalar bandwidth $h$.