Module lace_cc::alg

Data types for choosing different methods of sampling in CrossCat

There are currently four algorithms, each with its own special considerations.

§Gibbs

This is the collapsed Gibbs sampler outlined by Radford Neal. In this method, rows/columns are removed one at a time in random order and reinserted into the table according to their contribution to the partition and their likelihood under each component.
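
Below is a minimal, self-contained sketch of one such sweep over row assignments, using a toy CRP mixture of Bernoulli data with a Beta(1, 1) prior so the posterior predictive has a closed form. The types and function names here are illustrative only; they are not part of the lace_cc API.

```rust
/// Tiny xorshift RNG so the sketch needs no external crates (seed must be nonzero).
struct Rng(u64);
impl Rng {
    fn unif(&mut self) -> f64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        (self.0 >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Beta(1, 1)-Bernoulli posterior predictive for a single observation.
fn predictive(x: bool, successes: usize, n: usize) -> f64 {
    let p = (successes as f64 + 1.0) / (n as f64 + 2.0);
    if x { p } else { 1.0 - p }
}

/// One sweep: remove each row, then reinsert it according to the CRP
/// partition weight times the predictive likelihood under each component.
/// (A real implementation would also drop emptied components.)
fn gibbs_sweep(data: &[bool], asgn: &mut [usize], n_cats: &mut usize, alpha: f64, rng: &mut Rng) {
    const UNASSIGNED: usize = usize::MAX;
    for i in 0..data.len() {
        asgn[i] = UNASSIGNED; // remove row i from the table

        // CRP weight x predictive for each existing component, plus one
        // extra slot for a brand-new component weighted by alpha.
        let mut weights: Vec<f64> = (0..*n_cats)
            .map(|k| {
                let members: Vec<usize> =
                    (0..data.len()).filter(|&j| asgn[j] == k).collect();
                let succ = members.iter().filter(|&&j| data[j]).count();
                members.len() as f64 * predictive(data[i], succ, members.len())
            })
            .collect();
        weights.push(alpha * predictive(data[i], 0, 0)); // new component

        // Sample the new assignment from the normalized weights.
        let total: f64 = weights.iter().sum();
        let mut u = rng.unif() * total;
        let mut k = 0;
        while k + 1 < weights.len() && u > weights[k] {
            u -= weights[k];
            k += 1;
        }
        if k == *n_cats {
            *n_cats += 1; // row i founded a new component
        }
        asgn[i] = k;
    }
}

fn main() {
    let data = vec![true, true, false, false, true];
    let mut asgn = vec![0usize; data.len()];
    let mut n_cats = 1;
    let mut rng = Rng(0xBAD5EED);
    for _ in 0..100 {
        gibbs_sweep(&data, &mut asgn, &mut n_cats, 1.0, &mut rng);
    }
    println!("assignment: {:?}", asgn);
}
```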

§Pros

  • Simple
  • Valid MCMC

§Cons

  • Slow due to random access and moderately expensive posterior predictive computation
  • Mixes slowly due to moving only one row/column at a time

§Citations

Neal, R. M. (2000). Markov chain sampling methods for Dirichlet process mixture models. Journal of Computational and Graphical Statistics, 9(2), 249-265.

§Sequential Adaptive Merge-Split (Sams)

Two rows/columns are selected at random. If they are in the same component, a split is proposed using an adaptive restricted Gibbs sweep; if they are in different components, a merge of the two components is proposed.
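
The sketch below illustrates only the split half of such a move, reusing the toy Beta(1, 1)-Bernoulli model from the Gibbs sketch above: it builds a restricted, sequentially allocated split of a component around two anchor rows and returns the probability of having generated that particular split, which the Metropolis-Hastings acceptance ratio would need. The acceptance step and the mirror-image merge branch are omitted, and none of these names come from the lace_cc API.

```rust
/// Tiny xorshift RNG so the sketch needs no external crates (seed must be nonzero).
struct Rng(u64);
impl Rng {
    fn unif(&mut self) -> f64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        (self.0 >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Beta(1, 1)-Bernoulli posterior predictive for a single observation.
fn predictive(x: bool, successes: usize, n: usize) -> f64 {
    let p = (successes as f64 + 1.0) / (n as f64 + 2.0);
    if x { p } else { 1.0 - p }
}

/// Propose splitting the component containing anchor rows `i` and `j` into
/// two sets by allocating the remaining members one at a time. Returns the
/// two proposed member lists and the probability of this allocation
/// sequence, which the acceptance ratio of a full Sams step would use.
fn propose_split(
    data: &[bool],
    members: &[usize], // rows currently sharing the anchors' component
    i: usize,
    j: usize,
    rng: &mut Rng,
) -> (Vec<usize>, Vec<usize>, f64) {
    let mut set_i = vec![i];
    let mut set_j = vec![j];
    let mut q = 1.0; // probability of this particular split

    for &row in members.iter().filter(|&&r| r != i && r != j) {
        // Restricted Gibbs weight: component size times predictive likelihood.
        let stat = |set: &[usize]| {
            let succ = set.iter().filter(|&&r| data[r]).count();
            set.len() as f64 * predictive(data[row], succ, set.len())
        };
        let (wi, wj) = (stat(&set_i[..]), stat(&set_j[..]));
        let p_i = wi / (wi + wj);
        if rng.unif() < p_i {
            set_i.push(row);
            q *= p_i;
        } else {
            set_j.push(row);
            q *= 1.0 - p_i;
        }
    }
    (set_i, set_j, q)
}
```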

§Pros

  • Fast since only subsets of the data are looked at
  • Potential for big moves since Sams operates on swaths of data

§Cons

  • Many proposals are rejected, especially merge proposals
  • Difficult to do fine-grained moves. It’s best to pair it with another algorithm.

§Citations

Jain, S., & Neal, R. M. (2004). A split-merge Markov chain Monte Carlo procedure for the Dirichlet process mixture model. Journal of Computational and Graphical Statistics, 13(1), 158-182.

§Finite Approximation (FiniteCpu)

In this algorithm a new component is proposed from the prior, then each row/column is reassigned according to its likelihood under each component. Unlike Sams and Gibbs, this algorithm does not marginalize away the component parameters; it represents them explicitly.
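
A sketch of one such sweep for the same toy Bernoulli model is shown below: component parameters are held explicitly, a fresh component is drawn from the prior, and each row is reassigned independently by its likelihood under every component, which is what makes the per-row loop easy to parallelize. The names are illustrative and not the lace_cc API.

```rust
/// Tiny xorshift RNG so the sketch needs no external crates (seed must be nonzero).
struct Rng(u64);
impl Rng {
    fn unif(&mut self) -> f64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        (self.0 >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// One finite-approximation sweep: `thetas` are the explicitly represented
/// Bernoulli parameters, one per component.
fn finite_sweep(data: &[bool], asgn: &mut [usize], thetas: &mut Vec<f64>, rng: &mut Rng) {
    // Propose a brand-new component by drawing its parameter from the
    // Beta(1, 1) prior (i.e. Uniform(0, 1)).
    thetas.push(rng.unif());

    for (i, &x) in data.iter().enumerate() {
        // Likelihood of row i under each explicitly represented component.
        // Each row's weights depend only on the fixed parameter vector, so
        // in practice this loop is run in parallel.
        let weights: Vec<f64> = thetas
            .iter()
            .map(|&t| if x { t } else { 1.0 - t })
            .collect();

        // Sample the assignment proportionally to the weights.
        let total: f64 = weights.iter().sum();
        let mut u = rng.unif() * total;
        let mut k = 0;
        while k + 1 < weights.len() && u > weights[k] {
            u -= weights[k];
            k += 1;
        }
        asgn[i] = k;
    }
    // A real implementation would then resample each theta from its
    // posterior given its assigned rows and drop empty components.
}
```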

§Pros

  • Extremely fast due to cache efficiency and parallelization
  • Simple

§Cons

  • Not a valid MCMC method. It only approximates the target distribution, so it tends to create fewer components than it should, which can cause under-fitting.
  • Mixes slowly, since rows/columns are moved individually and, because the new component is drawn from the prior, it is not guaranteed to fit anything particularly well.

§Slice sampler (Slice)

This is nearly identical to FiniteCpu, but uses a stick-breaking prior and a slice (or beam) variable to approximate the correct distribution.

NOTE: It is best to use a Gamma prior (not InvGamma) on the CRP α when using this algorithm.
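
The sketch below shows the stick-breaking step that drives the slice sampler and why the note above matters: sticks are broken until the leftover mass drops below the smallest slice variable, and a very large α makes each break tiny, so the loop can effectively run forever. Everything here is illustrative, not the lace_cc API.

```rust
/// Tiny xorshift RNG so the sketch needs no external crates (seed must be nonzero).
struct Rng(u64);
impl Rng {
    fn unif(&mut self) -> f64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        (self.0 >> 11) as f64 / (1u64 << 53) as f64
    }
    /// Beta(1, alpha) draw via inverse CDF: 1 - U^(1/alpha).
    fn beta_1_a(&mut self, alpha: f64) -> f64 {
        1.0 - self.unif().powf(1.0 / alpha)
    }
}

/// Extend the stick-breaking weights until the remaining stick mass falls
/// below `u_min`, the smallest slice variable over all rows, so that each
/// row only ever competes over a finite set of components.
fn break_sticks(weights: &mut Vec<f64>, alpha: f64, u_min: f64, rng: &mut Rng) {
    let mut remaining = 1.0 - weights.iter().sum::<f64>();
    while remaining > u_min {
        let b = rng.beta_1_a(alpha);
        weights.push(remaining * b);
        remaining *= 1.0 - b;
        // With a very large alpha, b is typically near zero, `remaining`
        // shrinks slowly, and this loop can run effectively forever --
        // hence the recommendation of a finite-variance Gamma prior on alpha.
    }
}
```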

§Pros

  • Fast
  • Correct MCMC

§Cons

  • Users must be careful when using a prior on the CRP α that has high or infinite variance. If α is too high, the stick will break indefinitely and cause a panic.

Enums§