pub fn l2_normalize(data: &Array2<f64>) -> Result<Array2<f64>>
L2 normalization: scales each sample (row) to unit L2 norm. Each row then has Euclidean length 1, which is useful for cosine similarity.
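
The row-wise scaling described above can be sketched as follows. This is a minimal illustration using only the standard library, not the crate's actual implementation: the name `l2_normalize_rows` and the `Vec<Vec<f64>>` representation are hypothetical stand-ins for `Array2<f64>`, and leaving all-zero rows unchanged is an assumption, since the source does not say how `l2_normalize` treats them.

```rust
/// Hypothetical std-only sketch: scale each row to unit L2 norm.
/// Rows whose norm is zero are returned unchanged (an assumption,
/// the real `l2_normalize` may handle this case differently).
fn l2_normalize_rows(data: &[Vec<f64>]) -> Vec<Vec<f64>> {
    data.iter()
        .map(|row| {
            // L2 norm of the row: square root of the sum of squares.
            let norm = row.iter().map(|x| x * x).sum::<f64>().sqrt();
            if norm > 0.0 {
                // Divide every element by the norm so the row has length 1.
                row.iter().map(|x| x / norm).collect()
            } else {
                row.clone()
            }
        })
        .collect()
}
```

Once rows have unit norm, the dot product of two rows equals their cosine similarity, which is why this step commonly precedes similarity computations.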