Expand description
This crate implements a Probabilistic Principal Component Analysis in pure Rust, based on a
legacy Python version, which was just too slow for the job (even with numpy
“C code” to spice it up!)
If you want all the comfort that the Python ecosystem can provide, you are welcome to check out the python wrapper in PyPI. However, if you want top performance and great debugging support, this crate is for you.
To get to know more about the PPCA algorithm for missing data, please check the references below:
- https://www.robots.ox.ac.uk/~cvrg/hilary2006/ppca.pdf: first result on Google.
- http://perception.inrialpes.fr/people/Horaud/Courses/pdf/Horaud-MLSVP9.pdf: lecture slides, also covering PPCA mixtures.
- https://www.youtube.com/watch?v=lJ0cXPoEozg: video on YouTube explaining the topic.
Structs§
- Dataset
- Represents a dataset. This is a wrapper over a 2D array of dimensions
(n_samples, n_features)
. - Inferred
Masked - The inferred probability distribution in the state space of a given sample.
- Inferred
Masked Mix - The inferred probability distribution in the state space of a given sample of a PPCA Mixture
Model. This class is the analogous of
InferredMasked
for thePPCAMix
model. - Masked
Sample - A data sample with potentially missing values.
- PPCAMix
- A mixture of PPCA models. Each PPCA model is associated with a prior probability expressed in log-scale. This models allows for modelling of data clustering and non-linear learning of data. However, it will use significantly more memory and is not guaranteed to converge to a global maximum.
- PPCA
Model - A PPCA model which can infer missing values.
- Posterior
Sampler - Samples from the posterior distribution of an inferred sample. The sample is smoothed, that is, it does not include the model isotropic noise.
- Posterior
Sampler Mix - Samples from the posterior distribution of an inferred sample. The sample is smoothed, that is, it does not include the model isotropic noise.
- Prior
- A prior for the PPCA model. Use this class to mitigate overfit on training (especially on frequently masked dimensions) and to input a priori knowledge on what the PPCA should look like.