Crate xicor

This crate provides a reasonably efficient implementation of Sourav Chatterjee’s xi-correlation coefficient, based on the original paper.

Chatterjee’s xi measures how strongly one variable depends on another in a much more general sense than, for example, Pearson’s correlation coefficient. Suppose we have a sequence of random x values uniformly distributed from zero to tau (2π), and for each one we compute y = cos(x). Pearson’s correlation coefficient will be roughly zero for this data, because it measures only linear dependence and cos(x) has no linear trend over a full period. Chatterjee’s xi, on the other hand, will be close to 1, indicating that y is strongly a function of x, regardless of what function that may be.
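The contrast between the two coefficients can be sketched in a few lines of standalone Rust. This is a minimal illustration, not this crate's implementation: `sample_points`, `xi_no_ties` and `pearson` are hypothetical helpers written for this example, the random x values are replaced by a deterministic quasi-random sequence so the result is reproducible, and the xi formula used is the simple form from the original paper that assumes no ties in y. The sketch uses y = cos(x) on [0, tau), a case where the linear correlation vanishes by symmetry.

```rust
use std::f64::consts::TAU;

/// Hypothetical helper: `n` quasi-random points in [0, tau), built from
/// multiples of the golden-ratio fraction (a stand-in for random sampling).
fn sample_points(n: usize) -> Vec<f64> {
    const GOLDEN_FRACTION: f64 = 0.618_033_988_749_894_9;
    (0..n).map(|i| TAU * (i as f64 * GOLDEN_FRACTION).fract()).collect()
}

/// Hypothetical helper: Chatterjee's xi, assuming the y values are distinct:
/// xi = 1 - 3 * sum(|r[i+1] - r[i]|) / (n^2 - 1),
/// where r is the rank of y taken in order of increasing x.
fn xi_no_ties(x: &[f64], y: &[f64]) -> f64 {
    assert_eq!(x.len(), y.len(), "inputs must have equal length");
    let n = x.len();
    assert!(n >= 2, "need at least two pairs");

    // rank[i] = 1-based rank of y[i] among all y values.
    let mut by_y: Vec<usize> = (0..n).collect();
    by_y.sort_by(|&a, &b| y[a].total_cmp(&y[b]));
    let mut rank = vec![0usize; n];
    for (r, &i) in by_y.iter().enumerate() {
        rank[i] = r + 1;
    }

    // Visit the pairs in order of increasing x and sum the rank jumps.
    let mut by_x: Vec<usize> = (0..n).collect();
    by_x.sort_by(|&a, &b| x[a].total_cmp(&x[b]));
    let jumps: usize = by_x
        .windows(2)
        .map(|w| rank[w[0]].abs_diff(rank[w[1]]))
        .sum();

    let n = n as f64;
    1.0 - 3.0 * jumps as f64 / (n * n - 1.0)
}

/// Hypothetical helper: Pearson's correlation coefficient.
fn pearson(x: &[f64], y: &[f64]) -> f64 {
    let n = x.len() as f64;
    let mean_x = x.iter().sum::<f64>() / n;
    let mean_y = y.iter().sum::<f64>() / n;
    let (mut sxy, mut sxx, mut syy) = (0.0, 0.0, 0.0);
    for (&a, &b) in x.iter().zip(y) {
        sxy += (a - mean_x) * (b - mean_y);
        sxx += (a - mean_x) * (a - mean_x);
        syy += (b - mean_y) * (b - mean_y);
    }
    sxy / (sxx * syy).sqrt()
}
```

With `x = sample_points(1000)` and `y` set to the element-wise cosine of `x`, `xi_no_ties(&x, &y)` should come out close to 1 while `pearson(&x, &y)` should stay near 0. Both sorts dominate the running time, which fits the profiling remark under Highlights.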

§Highlights

  • Extremely simple to use (just call xicor(), xicorf(), etc., with two slices containing the data)
  • Generic over Ord, as xi does not require calculations on the elements themselves, only the ability to compare them. In principle even strings could be correlated in this manner (lexicographically), for example.
  • Quite fast. In release mode on a 12-year-old machine (Dell M4700), xicorf was able to process 1,000,000 pairs in 0.33 seconds. Profiling revealed that 80% of this calculation lay in the standard library’s sorting routines.
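To illustrate the point about being generic over Ord, here is a hedged sketch of how xi can be computed using comparisons only. It is not this crate's code: `xi_ord` is a hypothetical function written for this example, it uses the tie-aware formula from the original paper, and it breaks ties in x by input order (via a stable sort) where the paper breaks them at random.

```rust
/// Hypothetical sketch: Chatterjee's xi for any orderable types, with ties
/// in y allowed:
/// xi = 1 - n * sum(|r[i+1] - r[i]|) / (2 * sum(l[i] * (n - l[i]))),
/// where r[i] = #{j : y[j] <= y[i]} and l[i] = #{j : y[j] >= y[i]}.
/// Returns NaN when every y is equal (xi is undefined in that case).
fn xi_ord<T: Ord, U: Ord>(x: &[T], y: &[U]) -> f64 {
    assert_eq!(x.len(), y.len(), "inputs must have equal length");
    let n = x.len();
    assert!(n >= 2, "need at least two pairs");

    // Indices sorted by y, so equal y values form contiguous runs.
    let mut by_y: Vec<usize> = (0..n).collect();
    by_y.sort_by(|&a, &b| y[a].cmp(&y[b]));

    let mut r = vec![0usize; n];
    let mut l = vec![0usize; n];
    let mut start = 0;
    while start < n {
        // Find the end of the run of values equal to y[by_y[start]].
        let mut end = start + 1;
        while end < n && y[by_y[end]] == y[by_y[start]] {
            end += 1;
        }
        for &i in &by_y[start..end] {
            r[i] = end; // everything before the run's end is <= y[i]
            l[i] = n - start; // everything from the run's start is >= y[i]
        }
        start = end;
    }

    // Visit the pairs in x order; the stable sort keeps x ties in input order.
    let mut by_x: Vec<usize> = (0..n).collect();
    by_x.sort_by(|&a, &b| x[a].cmp(&x[b]));
    let jumps: usize = by_x.windows(2).map(|w| r[w[0]].abs_diff(r[w[1]])).sum();
    let spread: usize = l.iter().map(|&li| li * (n - li)).sum();

    1.0 - (n * jumps) as f64 / (2 * spread) as f64
}
```

Because only `cmp` and `==` are used, the same function accepts integers, strings, or any other `Ord` type. For example, `xi_ord(&["apple", "banana", "cherry"], &[1, 2, 3])` compares the strings lexicographically. Note that for small samples xi stays well below 1 even for a perfectly monotone relationship: with n distinct pairs the no-tie maximum is 1 - 3/(n + 1).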

§Progress

  • Calculation of the xi coefficient itself
  • P-values for testing independence

Functions§

xicor
Calculate the xi-correlation of two sequences whose values are orderable (they implement Ord).
xicor_norm
Calculate the normalised xi-correlation of two sequences whose values are orderable (they implement Ord).
xicorf
Calculate the xi-correlation of two floating-point sequences.
xicorf_norm
Calculate the normalised xi-correlation of two floating-point sequences.