pub fn ll_dirichlet<T>(data1: &[T], data2: &[T]) -> T
Calculates the symmetric relative log likelihood (log Dirichlet likelihood) of rolling
data2 versus data1 in n2 trials on a die that rolled data1 in n1 trials, where n1 and n2 are the total counts in data1 and data2.
The formula is based on the Dirichlet-Multinomial model. Each data set is treated as a vector of per-category counts, and the measure combines two terms: how well data1 predicts data2, scaled by 1/n2, and how well data2 predicts data1, scaled by 1/n1. This makes it useful for comparing the distribution of counts between two categorical data sets, for example in hypothesis testing or when evaluating a model on categorical data.
The equation is as follows:
..math:: D(\text{data1}, \text{data2}) = \sqrt{ \frac{1}{n2} \left( \log \beta(\text{data1}, \text{data2}) - \log \beta(n1, n2) - \left( \text{self\_denom2} - \log \text{single\_beta}(n2) \right) \right) + \frac{1}{n1} \left( \log \beta(\text{data2}, \text{data1}) - \log \beta(n2, n1) - \left( \text{self\_denom1} - \log \text{single\_beta}(n1) \right) \right) }
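To make the terms of the equation concrete, here is a minimal f64-only sketch of one plausible reading of it. It is not the crate's implementation, and several pieces are assumptions the documentation does not spell out: log β(a, b) is taken as ln Γ(a) + ln Γ(b) − ln Γ(a + b); single_beta(x) is taken as β(x, x); log β(data1, data2) is taken as a sum over categories where both counts are positive; and self_denom1 / self_denom2 are taken as sums of log single_beta over the positive counts of each data set. The names `ln_gamma`, `log_beta`, `log_single_beta` and `ll_dirichlet_sketch` are hypothetical helpers introduced for illustration.

```rust
/// Natural log of the gamma function for x >= 0.5 (Lanczos approximation, g = 7).
/// Hypothetical helper: stable Rust has no built-in log-gamma for f64.
fn ln_gamma(x: f64) -> f64 {
    const G: f64 = 7.0;
    const COEF: [f64; 9] = [
        0.99999999999980993,
        676.5203681218851,
        -1259.1392167224028,
        771.32342877765313,
        -176.61502916214059,
        12.507343278686905,
        -0.13857109526572012,
        9.9843695780195716e-6,
        1.5056327351493116e-7,
    ];
    let z = x - 1.0;
    let t = z + G + 0.5;
    let mut a = COEF[0];
    for (i, c) in COEF.iter().enumerate().skip(1) {
        a += c / (z + i as f64);
    }
    0.5 * (2.0 * std::f64::consts::PI).ln() + (z + 0.5) * t.ln() - t + a.ln()
}

/// Assumed meaning of `log beta(a, b)`: log of the Beta function.
fn log_beta(a: f64, b: f64) -> f64 {
    ln_gamma(a) + ln_gamma(b) - ln_gamma(a + b)
}

/// Assumed meaning of `log single_beta(x)`: log Beta(x, x).
fn log_single_beta(x: f64) -> f64 {
    log_beta(x, x)
}

/// Sketch of the documented equation under the assumptions stated above.
fn ll_dirichlet_sketch(data1: &[f64], data2: &[f64]) -> f64 {
    assert_eq!(data1.len(), data2.len(), "both count vectors need the same categories");
    // n1 and n2 are the total number of trials in each data set.
    let n1: f64 = data1.iter().sum();
    let n2: f64 = data2.iter().sum();

    let (mut log_b, mut self_denom1, mut self_denom2) = (0.0, 0.0, 0.0);
    for (&a, &b) in data1.iter().zip(data2) {
        // Assumption: categories with a zero count are skipped.
        if a > 0.0 && b > 0.0 {
            log_b += log_beta(a, b);
        }
        if a > 0.0 {
            self_denom1 += log_single_beta(a);
        }
        if b > 0.0 {
            self_denom2 += log_single_beta(b);
        }
    }

    // The two bracketed terms of the equation, scaled by 1/n2 and 1/n1.
    let term2 = (log_b - log_beta(n1, n2) - (self_denom2 - log_single_beta(n2))) / n2;
    let term1 = (log_b - log_beta(n2, n1) - (self_denom1 - log_single_beta(n1))) / n1;
    (term1 + term2).sqrt()
}
```

Under these assumptions, two identical count vectors give 0, and swapping the arguments leaves the result unchanged, because the Beta function is symmetric in its arguments.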
§Arguments
data1 - A slice of T values representing the first data set (e.g., per-face counts from one series of die rolls).
data2 - A slice of T values representing the second data set (e.g., per-face counts from another series of die rolls).
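Both arguments are count vectors, one entry per category. As an illustration of how such a vector might be built from raw observations, the sketch below tallies a series of die rolls into per-face counts. The helper `face_counts` is hypothetical and not part of the crate.

```rust
/// Tally die rolls (faces numbered 1..=faces) into a per-face count vector.
/// Hypothetical helper for illustration only.
fn face_counts(rolls: &[usize], faces: usize) -> Vec<f64> {
    let mut counts = vec![0.0; faces];
    for &r in rolls {
        assert!((1..=faces).contains(&r), "roll {} is outside 1..={}", r, faces);
        // Face k is stored at index k - 1.
        counts[r - 1] += 1.0;
    }
    counts
}
```

Two vectors built this way for the same number of faces have matching lengths and can be passed as data1 and data2.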
§Returns
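Returns a T value: the square root of the symmetric relative log likelihood given by the equation above. Because the expression is symmetric in its two arguments, swapping data1 and data2 does not change the result, so the value describes how the two sets of counts relate to each other rather than the likelihood of one given the other.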
§Examples
use fast_distances::*;
let data1: Vec<f64> = vec![1.0, 2.0, 3.0, 4.0];
let data2: Vec<f64> = vec![5.0, 6.0, 7.0, 8.0];
let result = ll_dirichlet(&data1, &data2);
println!("Log Dirichlet likelihood: {}", result);