Function softmax_simd

pub fn softmax_simd<F>(x: &ArrayView1<'_, F>) -> Array1<F>
where F: Float + SimdUnifiedOps,

Compute softmax activation function with SIMD acceleration (Phase 33).

The softmax function converts a vector of real numbers into a probability distribution: every output value lies in (0, 1) and the values sum to 1. The implementation is kept numerically stable via the max-subtraction trick.

§Arguments

  • x - Input 1D array

§Returns

Array1<F> containing the softmax probabilities, where:

  • All values are in the range (0, 1)
  • Sum of all values is 1.0 (up to floating-point rounding)
  • Empty input returns an empty array (see the check below)
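
A quick sanity check of the empty-input behavior (the Array1 import path is assumed to mirror the array macro re-export used in the examples below):

use scirs2_core::ndarray::Array1;
use scirs2_core::ndarray_ext::preprocessing::softmax_simd;

let empty: Array1<f64> = Array1::zeros(0);
let result = softmax_simd(&empty.view());
assert!(result.is_empty()); // empty in, empty out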

§Performance

  • SIMD: Automatically used for large arrays (1000+ elements)
  • Scalar: Used for small arrays or when SIMD unavailable
  • Speedup: 4-8x for large arrays on AVX2/NEON systems
  • Builds on the crate's SIMD reductions: max_simd (Phase 29) and sum_simd (Phase 30)

§Mathematical Definition

softmax(x)ᵢ = exp(xᵢ - max(x)) / Σⱼ exp(xⱼ - max(x))

The max-subtraction trick (xᵢ - max(x)) ensures numerical stability by preventing overflow in the exponential function.
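
For reference, a minimal scalar sketch of the formula above in plain Rust (slices rather than ndarray, independent of the crate's SIMD path):

fn softmax_scalar(x: &[f64]) -> Vec<f64> {
    // Max-subtraction trick: shift inputs so the largest exponent is exp(0) = 1.
    let max = x.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = x.iter().map(|&v| (v - max).exp()).collect();
    let sum: f64 = exps.iter().sum();
    exps.into_iter().map(|e| e / sum).collect()
}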

§Examples

use scirs2_core::ndarray::array;
use scirs2_core::ndarray_ext::preprocessing::softmax_simd;

let x = array![1.0, 2.0, 3.0];
let result = softmax_simd(&x.view());

// Verify probabilities sum to 1
let sum: f64 = result.iter().sum();
assert!((sum - 1.0).abs() < 1e-10);

// Verify all values in (0, 1)
for &val in result.iter() {
    assert!(val > 0.0 && val < 1.0);
}

§Applications

  • Attention Mechanisms: Compute attention weights in Transformers (the primary use case; see the example below)
  • Multi-class Classification: Convert logits to class probabilities
  • Neural Networks: Final layer activation for multi-class problems
  • Reinforcement Learning: Action probability distributions
  • Natural Language Processing: Token probability distributions

§Numerical Stability

The implementation subtracts the maximum value before exponentiation to prevent overflow. This is mathematically equivalent to the standard softmax but numerically stable even for large input values.
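
A quick stress test of this property; the inputs are large enough that a naive exp(xᵢ) would overflow f64:

use scirs2_core::ndarray::array;
use scirs2_core::ndarray_ext::preprocessing::softmax_simd;

// exp(1000) alone overflows f64; the shifted form exp(xᵢ - max(x))
// keeps every exponent at or below zero.
let x = array![1000.0_f64, 1001.0, 1002.0];
let result = softmax_simd(&x.view());

assert!(result.iter().all(|v| v.is_finite()));
let sum: f64 = result.iter().sum();
assert!((sum - 1.0).abs() < 1e-10);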

§Example: Attention Weights

use scirs2_core::ndarray::array;
use scirs2_core::ndarray_ext::preprocessing::softmax_simd;

// Attention scores (before softmax)
let scores = array![2.0, 4.0, 1.0, 3.0];
let attention_weights = softmax_simd(&scores.view());

// Highest score (4.0) gets highest probability
let max_idx = attention_weights
    .iter()
    .enumerate()
    .max_by(|(_, a), (_, b)| a.partial_cmp(b).unwrap())
    .map(|(idx, _)| idx)
    .unwrap();
assert_eq!(max_idx, 1); // Index of score 4.0

// All weights sum to 1
let sum: f64 = attention_weights.iter().sum();
assert!((sum - 1.0).abs() < 1e-10);