pub fn kl_divergence(p: &[f64], q: &[f64]) -> f64
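KL divergence: D_KL(P || Q) = sum over i of P(i) * log(P(i) / Q(i)), where i ranges over the elements of p and q. Terms with P(i) = 0 contribute 0 by convention, and the divergence is infinite if Q(i) = 0 for some i with P(i) > 0.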
Both P and Q must be probability distributions over the same outcomes: slices of equal length whose entries are non-negative and sum to 1.
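
A minimal sketch of how a body for this signature might look, not necessarily the actual implementation. It assumes the natural logarithm (so the result is in nats), panics on a length mismatch, and does not check that the inputs sum to 1. These choices are assumptions for illustration and are not stated in the source.

```rust
/// Hypothetical body for the signature above.
/// Assumptions: natural log (nats), equal-length inputs,
/// 0 * ln(0 / q) treated as 0, and p * ln(p / 0) treated as +infinity.
pub fn kl_divergence(p: &[f64], q: &[f64]) -> f64 {
    assert_eq!(p.len(), q.len(), "p and q must have the same length");

    p.iter()
        .zip(q.iter())
        .map(|(&pi, &qi)| {
            if pi == 0.0 {
                // Zero-probability outcomes under P contribute nothing.
                0.0
            } else if qi == 0.0 {
                // P puts mass where Q has none: the divergence is infinite.
                f64::INFINITY
            } else {
                pi * (pi / qi).ln()
            }
        })
        .sum()
}
```

With this sketch, kl_divergence(&[0.5, 0.5], &[0.5, 0.5]) is 0, and swapping the arguments generally changes the result, since KL divergence is not symmetric.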