Function standardize_simd 

pub fn standardize_simd<F>(x: &ArrayView1<'_, F>) -> Array1<F>
where F: Float + SimdUnifiedOps,

Standardize a 1D array to zero mean and unit variance (SIMD-accelerated).

Computes (x - μ) / σ where μ is the mean and σ is the sample standard deviation. The resulting array will have mean ≈ 0 and standard deviation ≈ 1.

§Arguments

  • x - Input 1D array to standardize

§Returns

Array1<F> with the same length as the input, standardized to zero mean and unit variance. Returns an all-zero array of the same length if the input has at most one element or zero standard deviation (an empty input therefore yields an empty array).

§Performance

  • SIMD: Automatically used for large arrays (1000+ elements)
  • Scalar: Used for small arrays or when SIMD unavailable
  • Speedup: 2-4x for large f32 arrays on AVX2 systems

§Mathematical Definition

standardize(x) = (x - μ) / σ
where:
  μ = (1/n) Σ xᵢ         (sample mean)
  σ = sqrt((1/(n-1)) Σ (xᵢ - μ)²)  (sample std, ddof=1)
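
The definition above can be sketched as a plain scalar routine. This is a minimal illustration over f64 slices using only the standard library, not the crate's SIMD implementation; the name standardize_scalar is hypothetical, and treating any non-finite σ as a degenerate case is an assumption made for the sketch.

```rust
/// Scalar reference for (x - μ) / σ with the sample standard deviation (ddof=1).
/// Hypothetical helper for illustration; not part of scirs2_core.
fn standardize_scalar(x: &[f64]) -> Vec<f64> {
    let n = x.len();
    // Fewer than two elements: σ is undefined, so return zeros (empty stays empty).
    if n <= 1 {
        return vec![0.0; n];
    }
    // μ = (1/n) Σ xᵢ
    let mean = x.iter().sum::<f64>() / n as f64;
    // σ = sqrt((1/(n-1)) Σ (xᵢ - μ)²)
    let var = x.iter().map(|&v| (v - mean).powi(2)).sum::<f64>() / (n - 1) as f64;
    let std = var.sqrt();
    // Constant input (σ = 0) or NaN-contaminated input (σ not finite): return zeros.
    // Assumption: an exact comparison with 0.0 is enough for this sketch;
    // production code may prefer a tolerance.
    if std == 0.0 || !std.is_finite() {
        return vec![0.0; n];
    }
    x.iter().map(|&v| (v - mean) / std).collect()
}
```

For the input [2, 4, 4, 4, 5, 5, 7, 9], μ = 5 and Σ (xᵢ - μ)² = 32, so σ = sqrt(32/7) ≈ 2.138 and the first output element is (2 - 5) / σ ≈ -1.403.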

§Examples

use scirs2_core::ndarray::array;
use scirs2_core::ndarray_ext::preprocessing::standardize_simd;

let x = array![2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0];
let result = standardize_simd(&x.view());

// Verify mean ≈ 0
let mean: f64 = result.iter().sum::<f64>() / result.len() as f64;
assert!(mean.abs() < 1e-10);

// Verify std ≈ 1 (sample std with ddof=1); the mean is ≈ 0, so the
// sum of squares equals the sum of squared deviations
let variance: f64 = result.iter()
    .map(|&z| z * z)
    .sum::<f64>() / (result.len() - 1) as f64;
let std = variance.sqrt();
assert!((std - 1.0).abs() < 1e-10);

§Edge Cases

  • Empty array: Returns empty array
  • Single element: Returns zero array (std undefined for n=1)
  • Constant array: Returns zero array (std = 0)
  • NaN values: Returns zero array

§Applications

  • Machine Learning: Feature preprocessing for models assuming normally distributed features
  • Statistical Analysis: Z-score computation for outlier detection
  • Time Series: Mean removal (centering) and variance normalization
  • Data Science: Preparing features for PCA, clustering, regression
  • Neural Networks: Input normalization for faster convergence

§Implementation Notes

Uses sample standard deviation (ddof=1, Bessel’s correction) rather than population standard deviation (ddof=0). This matches the default of pandas std(); NumPy's std() and SciPy's zscore() default to ddof=0, so pass ddof=1 there to reproduce these results.
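
The choice of divisor matters most for short arrays. The following minimal sketch compares the two conventions; std_with_ddof is a hypothetical helper written for illustration with the standard library only, and is not part of scirs2_core.

```rust
/// Standard deviation with a configurable "delta degrees of freedom" (ddof):
/// ddof = 0 gives the population value, ddof = 1 the sample value.
/// Hypothetical helper for illustration; not part of scirs2_core.
fn std_with_ddof(x: &[f64], ddof: usize) -> Option<f64> {
    let n = x.len();
    // The divisor n - ddof must be positive, otherwise the result is undefined.
    if n <= ddof {
        return None;
    }
    let mean = x.iter().sum::<f64>() / n as f64;
    // Sum of squared deviations from the mean.
    let ss = x.iter().map(|&v| (v - mean).powi(2)).sum::<f64>();
    Some((ss / (n - ddof) as f64).sqrt())
}
```

For the eight-element input from the example, [2, 4, 4, 4, 5, 5, 7, 9], ddof=0 gives σ = 2 and ddof=1 gives σ = sqrt(32/7) ≈ 2.138. Z-scores computed under the two conventions therefore differ by a factor of sqrt(n/(n-1)) = sqrt(8/7) ≈ 1.069, which shrinks toward 1 as n grows.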