pub fn pearson_correlation(a: &[f32], b: &[f32]) -> f32
Computes the Pearson correlation coefficient between two vectors a and b.
Two equivalent formulas:
- Using deviations from the mean (implemented here for better numerical stability):
  $r = \frac{\sum(x - \bar{x})(y - \bar{y})}{\sqrt{\sum(x - \bar{x})^2\sum(y - \bar{y})^2}}$
- Direct computation:
  $r = \frac{n\sum xy - \sum x\sum y}{\sqrt{(n\sum x^2 - (\sum x)^2)(n\sum y^2 - (\sum y)^2)}}$
where $\bar{x}$ and $\bar{y}$ are the means of vectors $x$ and $y$ respectively,
and $n$ is the length of the vectors.
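The equivalence of the two formulas can be checked by expanding the centered sums, using $\bar{x} = \frac{1}{n}\sum x$ and $\bar{y} = \frac{1}{n}\sum y$:

$$
\begin{aligned}
\sum(x - \bar{x})(y - \bar{y}) &= \sum xy - \bar{x}\sum y - \bar{y}\sum x + n\bar{x}\bar{y} = \frac{1}{n}\left(n\sum xy - \sum x\sum y\right) \\
\sum(x - \bar{x})^2 &= \sum x^2 - 2\bar{x}\sum x + n\bar{x}^2 = \frac{1}{n}\left(n\sum x^2 - (\sum x)^2\right)
\end{aligned}
$$

The same expansion holds for $y$. Substituting into the deviations-from-mean formula, the numerator carries a factor of $\frac{1}{n}$ and the square root in the denominator carries $\sqrt{\frac{1}{n}\cdot\frac{1}{n}} = \frac{1}{n}$, so the factors cancel and the direct formula results.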
Note: The first formula (deviations from the mean) is used in this implementation because it:
- Reduces the risk of numerical overflow by centering the data
- Provides better numerical stability for large values
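As an illustration of the deviations-from-mean approach, here is a minimal sketch. The name `pearson_sketch` is hypothetical and the body is not the crate's actual source. It assumes two passes over the data (one for the means, one for the centered sums) and mirrors the documented NaN behavior for empty or mismatched inputs.

```rust
/// Hypothetical sketch of the deviations-from-mean formula.
/// This is an illustration only, not the crate's implementation.
fn pearson_sketch(a: &[f32], b: &[f32]) -> f32 {
    // Mirror the documented contract: empty or mismatched inputs give NaN.
    if a.is_empty() || a.len() != b.len() {
        return f32::NAN;
    }

    // First pass: compute the means.
    let n = a.len() as f32;
    let mean_a = a.iter().sum::<f32>() / n;
    let mean_b = b.iter().sum::<f32>() / n;

    // Second pass: accumulate sums of centered products and squares.
    let mut cov = 0.0f32;
    let mut var_a = 0.0f32;
    let mut var_b = 0.0f32;
    for (&x, &y) in a.iter().zip(b) {
        let dx = x - mean_a;
        let dy = y - mean_b;
        cov += dx * dy;
        var_a += dx * dx;
        var_b += dy * dy;
    }

    // If either input is constant, the denominator is zero and this
    // sketch yields NaN (0.0 / 0.0); that choice is an assumption here.
    cov / (var_a * var_b).sqrt()
}
```

Because the sums are taken over deviations rather than raw values, the accumulated terms stay small even when the inputs share a large common offset, which is the property the note above relies on.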
§Arguments
- a - The first vector.
- b - The second vector.
§Returns
The Pearson correlation coefficient between a and b.
If either vector is empty or their lengths do not match, returns NaN.
§Examples
let a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0];
let b = [10.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0];
let correlation = nwr::pearson_correlation(&a, &b);
assert_eq!(format!("{:.4}", correlation), "-1.0000".to_string()); // Perfect negative correlation
let empty: [f32; 0] = [];
assert!(nwr::pearson_correlation(&empty, &empty).is_nan()); // Check handling of empty vectors