ll_dirichlet

Function ll_dirichlet 

pub fn ll_dirichlet<T>(data1: &[T], data2: &[T]) -> T
where T: Float + Sum,

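Calculates the symmetric relative log likelihood (log Dirichlet likelihood) of rolling data2 versus data1 in n2 trials on a die that rolled data1 in n1 trials. Here data1 and data2 hold the per-face counts, and n1 and n2 are the total numbers of trials, i.e. the sums of the entries of data1 and data2 respectively.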

The formula used is based on the Dirichlet-Multinomial model, and it computes the difference in likelihood between the two sets of data under a Dirichlet distribution assumption. This measure is useful for comparing the distribution of counts between two categorical datasets, typically for hypothesis testing or evaluating model performance when categorical data is involved.
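As a rough illustration of the kind of input this comparison expects, the sketch below tallies raw die rolls into a per-face count vector and sums it to get the number of trials. The helper `face_counts` is hypothetical and not part of this crate; it only shows one way the `data1` slice and the total `n1` might be prepared.

```rust
/// Hypothetical helper (not part of fast_distances): tally 1-based die
/// faces into per-face counts. Panics if a roll is 0 or greater than `faces`.
fn face_counts(rolls: &[usize], faces: usize) -> Vec<f64> {
    let mut counts = vec![0.0; faces];
    for &face in rolls {
        counts[face - 1] += 1.0;
    }
    counts
}

fn main() {
    // Ten rolls of a four-sided die.
    let rolls1 = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4];
    let data1 = face_counts(&rolls1, 4);

    // The number of trials is the sum of the counts.
    let n1: f64 = data1.iter().sum();
    println!("data1 = {:?}, n1 = {}", data1, n1);
}
```

A count vector built this way has one entry per category, so two datasets over the same die can be passed to the function position by position.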

The equation is as follows:

$$
D(\text{data1}, \text{data2}) = \sqrt{
  \frac{1}{n2} \left( \log \beta(\text{data1}, \text{data2}) - \log \beta(n1, n2) - \left( \text{self\_denom2} - \log \text{single\_beta}(n2) \right) \right)
  + \frac{1}{n1} \left( \log \beta(\text{data2}, \text{data1}) - \log \beta(n2, n1) - \left( \text{self\_denom1} - \log \text{single\_beta}(n1) \right) \right)
}
$$
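The equation leaves several terms undefined, so the following is only a sketch of one possible reading, not the crate's implementation. It assumes that β is the Euler Beta function, that log β(data1, data2) is the sum of log B(data1[i], data2[i]) over categories where both counts are positive, that log single_beta(x) means log B(x, x), and that self_denom1 and self_denom2 are the sums of log B(x, x) over the positive counts of data1 and data2. The names `ln_gamma`, `log_beta`, and `ll_dirichlet_sketch` are hypothetical.

```rust
use std::f64::consts::PI;

/// ln Γ(x) via the Lanczos approximation (g = 7, 9 coefficients).
/// Intended here for x >= 1, which covers positive integer counts.
fn ln_gamma(x: f64) -> f64 {
    const G: f64 = 7.0;
    const COEF: [f64; 9] = [
        0.99999999999980993,
        676.5203681218851,
        -1259.1392167224028,
        771.32342877765313,
        -176.61502916214059,
        12.507343278686905,
        -0.13857109526572012,
        9.9843695780195716e-6,
        1.5056327351493116e-7,
    ];
    let x = x - 1.0;
    let mut a = COEF[0];
    for (i, c) in COEF.iter().enumerate().skip(1) {
        a += c / (x + i as f64);
    }
    let t = x + G + 0.5;
    0.5 * (2.0 * PI).ln() + (x + 0.5) * t.ln() - t + a.ln()
}

/// ASSUMPTION: log β(x, y) = ln Γ(x) + ln Γ(y) - ln Γ(x + y).
fn log_beta(x: f64, y: f64) -> f64 {
    ln_gamma(x) + ln_gamma(y) - ln_gamma(x + y)
}

/// Sketch of the documented equation under the assumptions stated above.
fn ll_dirichlet_sketch(data1: &[f64], data2: &[f64]) -> f64 {
    assert_eq!(data1.len(), data2.len(), "one count per category in both slices");
    let n1: f64 = data1.iter().sum();
    let n2: f64 = data2.iter().sum();

    let (mut log_b, mut self_denom1, mut self_denom2) = (0.0, 0.0, 0.0);
    for (&x1, &x2) in data1.iter().zip(data2) {
        // ASSUMPTION: categories with a zero count are skipped.
        if x1 > 0.0 && x2 > 0.0 {
            log_b += log_beta(x1, x2);
        }
        if x1 > 0.0 {
            self_denom1 += log_beta(x1, x1); // assumed log single_beta(x1)
        }
        if x2 > 0.0 {
            self_denom2 += log_beta(x2, x2); // assumed log single_beta(x2)
        }
    }

    // The two length-normalized terms of the equation, then the square root.
    let term2 = (log_b - log_beta(n1, n2) - (self_denom2 - log_beta(n2, n2))) / n2;
    let term1 = (log_b - log_beta(n2, n1) - (self_denom1 - log_beta(n1, n1))) / n1;
    (term1 + term2).sqrt()
}

fn main() {
    let data1 = [1.0, 2.0, 3.0, 4.0];
    let data2 = [5.0, 6.0, 7.0, 8.0];
    println!("sketch value: {}", ll_dirichlet_sketch(&data1, &data2));
}
```

Because the crate may use different approximations for these terms, the value printed by this sketch need not match the output of `ll_dirichlet`. The sketch is mainly useful for seeing how the per-category sums and the totals n1 and n2 enter the two halves of the expression.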

§Arguments

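  • data1 - A slice of T values holding the first set of counts, one entry per category (e.g., how often each face came up in n1 rolls of a die).
  • data2 - A slice of T values holding the second set of counts, one entry per category in the same order as data1 (e.g., how often each face came up in n2 rolls).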

§Returns

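Returns a T value: the symmetric relative log likelihood D(data1, data2) given by the equation above, i.e. the square root of the sum of the two terms normalized by n2 and n1. The expression is symmetric in its arguments, so it scores how the two count vectors compare with each other rather than favoring one direction.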

§Examples

use fast_distances::*;
let data1: Vec<f64> = vec![1.0, 2.0, 3.0, 4.0];
let data2: Vec<f64> = vec![5.0, 6.0, 7.0, 8.0];
let result = ll_dirichlet(&data1, &data2);
println!("Log Dirichlet likelihood: {}", result);