Module preprocessing

SIMD-accelerated preprocessing operations for arrays: normalization, standardization, and clipping.

This module provides high-performance implementations of common data preprocessing operations that are critical for machine learning pipelines, statistical analysis, and scientific computing.

§Operations

  • L2 Normalization (normalize_simd): Converts vectors to unit length
  • Z-Score Standardization (standardize_simd): Zero mean, unit variance
  • Value Clipping (clip_simd): Bounds values to a specified range
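As a point of reference, the three operations listed above can be sketched in plain scalar Rust. This is not the crate's implementation: the names `normalize_ref`, `standardize_ref`, and `clip_ref` are hypothetical, the functions work on slices rather than ndarray views, and the handling of zero-norm or zero-variance input and the use of the population standard deviation (dividing by n) are assumptions of this sketch.

```rust
/// Scalar reference for L2 normalization: divide each element by the Euclidean norm.
/// Assumption: a zero-norm input is returned unchanged.
fn normalize_ref(x: &[f64]) -> Vec<f64> {
    let norm = x.iter().map(|v| v * v).sum::<f64>().sqrt();
    if norm == 0.0 {
        return x.to_vec();
    }
    x.iter().map(|v| v / norm).collect()
}

/// Scalar reference for z-score standardization: (x - mean) / std.
/// Assumptions: population standard deviation (divide by n); constant input maps to zeros.
fn standardize_ref(x: &[f64]) -> Vec<f64> {
    if x.is_empty() {
        return Vec::new();
    }
    let n = x.len() as f64;
    let mean = x.iter().sum::<f64>() / n;
    let var = x.iter().map(|v| (v - mean).powi(2)).sum::<f64>() / n;
    let std = var.sqrt();
    if std == 0.0 {
        return vec![0.0; x.len()];
    }
    x.iter().map(|v| (v - mean) / std).collect()
}

/// Scalar reference for clipping: bound every value to [min, max].
/// Note: `f64::clamp` panics if `min > max` or if either bound is NaN.
fn clip_ref(x: &[f64], min: f64, max: f64) -> Vec<f64> {
    x.iter().map(|v| v.clamp(min, max)).collect()
}
```

With the population convention assumed here, `[2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]` has mean 5 and standard deviation 2, so its standardized form is `[-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]`.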

§Performance

All operations automatically use SIMD acceleration when:

  • Platform supports AVX2 (x86_64) or NEON (ARM)
  • Array size is large enough to benefit from vectorization
  • Array memory layout is contiguous

Otherwise, the operations fall back to scalar implementations, for example for small arrays or on unsupported platforms.
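The dispatch pattern described in this section can be illustrated with a minimal sketch for clipping on x86_64. It is not taken from the crate: `clip_in_place`, `clip_scalar`, `clip_avx2`, and the threshold `SIMD_MIN_LEN` are hypothetical names and values chosen for illustration. The sketch operates on a mutable slice, which is always contiguous; with an ndarray view, a check such as `as_slice()` returning `Some` would presumably play that role.

```rust
/// Hypothetical size threshold below which the scalar path is used.
const SIMD_MIN_LEN: usize = 64;

/// Scalar fallback: clamp each element in place.
fn clip_scalar(x: &mut [f64], lo: f64, hi: f64) {
    for v in x.iter_mut() {
        *v = v.clamp(lo, hi);
    }
}

/// Vectorized path: processes four f64 lanes per iteration.
/// Note: NaN inputs are handled differently from `f64::clamp`
/// (the min/max instructions return the bound rather than NaN).
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn clip_avx2(x: &mut [f64], lo: f64, hi: f64) {
    use std::arch::x86_64::*;
    let mut chunks = x.chunks_exact_mut(4);
    for c in &mut chunks {
        // SAFETY: each chunk holds exactly four f64 values, and the
        // unaligned load/store variants place no alignment requirement.
        unsafe {
            let vlo = _mm256_set1_pd(lo);
            let vhi = _mm256_set1_pd(hi);
            let v = _mm256_loadu_pd(c.as_ptr());
            let v = _mm256_min_pd(_mm256_max_pd(v, vlo), vhi);
            _mm256_storeu_pd(c.as_mut_ptr(), v);
        }
    }
    // The last 0..=3 elements are handled by the scalar code.
    clip_scalar(chunks.into_remainder(), lo, hi);
}

/// Chooses the SIMD path only when the CPU supports it and the input is large enough.
fn clip_in_place(x: &mut [f64], lo: f64, hi: f64) {
    assert!(lo <= hi, "lower bound must not exceed upper bound");
    #[cfg(target_arch = "x86_64")]
    {
        if x.len() >= SIMD_MIN_LEN && is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was verified at runtime just above.
            unsafe { clip_avx2(x, lo, hi) };
            return;
        }
    }
    clip_scalar(x, lo, hi);
}
```

Runtime detection keeps a single binary portable across CPUs, and the size threshold avoids paying dispatch overhead on inputs too short to benefit. A NEON path for ARM would follow the same shape and is omitted here.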

§Examples

use scirs2_core::ndarray::array;
use scirs2_core::ndarray_ext::preprocessing::{normalize_simd, standardize_simd, clip_simd};

// L2 normalization - convert to unit vector
let x = array![3.0, 4.0];  // norm = 5
let normalized = normalize_simd(&x.view());
// Result: [0.6, 0.8]

// Z-score standardization
let data = array![2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0];
let standardized = standardize_simd(&data.view());
// Result: mean ≈ 0, std ≈ 1

// Value clipping
let values = array![-10.0, -5.0, 0.0, 5.0, 10.0];
let clipped = clip_simd(&values.view(), -3.0, 7.0);
// Result: [-3.0, -3.0, 0.0, 5.0, 7.0]

§Functions

clip_simd
Clip (clamp) array values to a specified range (SIMD-accelerated).
leaky_relu_simd
Compute Leaky ReLU activation with SIMD acceleration.
normalize_simd
Normalize a 1D array to unit length using L2 norm (SIMD-accelerated).
relu_simd
Compute ReLU (Rectified Linear Unit) activation with SIMD acceleration.
softmax_simd
Compute softmax activation function with SIMD acceleration (Phase 33).
standardize_simd
Standardize a 1D array to zero mean and unit variance (SIMD-accelerated).
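The activation functions in this list are not covered by the examples earlier on the page, so the following scalar sketch shows the standard definitions they presumably implement. The names `relu_ref`, `leaky_relu_ref`, and `softmax_ref`, the `alpha` parameter, and the slice-based signatures are assumptions for illustration and may not match the crate's actual API.

```rust
/// ReLU: max(x, 0) applied element-wise.
fn relu_ref(x: &[f64]) -> Vec<f64> {
    x.iter().map(|v| v.max(0.0)).collect()
}

/// Leaky ReLU: x for positive inputs, alpha * x otherwise.
/// `alpha` is a hypothetical parameter name for the negative slope.
fn leaky_relu_ref(x: &[f64], alpha: f64) -> Vec<f64> {
    x.iter()
        .map(|&v| if v > 0.0 { v } else { alpha * v })
        .collect()
}

/// Softmax: exp(x_i) / sum_j exp(x_j).
/// Subtracting the maximum first avoids overflow in `exp` without changing the result.
fn softmax_ref(x: &[f64]) -> Vec<f64> {
    if x.is_empty() {
        return Vec::new();
    }
    let max = x.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = x.iter().map(|v| (v - max).exp()).collect();
    let sum: f64 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}
```

The max-subtraction step matters in practice: without it, an input such as `[1000.0, 1000.0]` would overflow to infinity in `exp` and produce NaN, whereas the shifted form yields `[0.5, 0.5]`.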