Crate tenrso_kernels

§tenrso-kernels

High-performance tensor kernel operations for TenRSo.

Version: 0.1.0-alpha.2
Tests: 138 passing (100%)
Status: Production-ready with comprehensive statistical toolkit

§Overview

This crate provides optimized implementations of fundamental tensor operations used in tensor decompositions (CP-ALS, Tucker, TT) and tensor computations.

Key Features:

  • Khatri-Rao product - Column-wise Kronecker product (serial & parallel)
  • Kronecker product - Tensor product of matrices (serial & parallel)
  • Hadamard product - Element-wise multiplication (allocating & in-place)
  • N-mode products - Tensor-matrix multiplication along any mode
  • Tensor-Tensor Product (TTT) - General tensor contraction operation
  • MTTKRP - Core CP-ALS kernel (standard, blocked, fused, parallel variants)
  • Outer products - Tensor construction from vectors
  • Tucker operator - Multi-mode products with automatic optimization
  • Tensor Train (TT) operations - TT orthogonalization, norm, dot product
  • Blocked/tiled operations - Cache-efficient implementations
  • Tensor contractions - Generalized tensor contraction primitives
  • Tensor reductions - Sum, mean, variance, std, norms, percentiles, median, skewness, kurtosis, covariance, correlation
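
To make the "column-wise Kronecker product" description concrete, here is a minimal reference sketch of a Khatri-Rao product using only the standard library. The function name `khatri_rao_naive`, the `Vec<Vec<f64>>` storage, and the row ordering (row `i * J + j` of the output) are assumptions for illustration; they are not taken from this crate's implementation.

```rust
/// Naive Khatri-Rao product of `a` (I x R) and `b` (J x R), each stored as a
/// list of rows. Returns an (I*J) x R matrix whose column r is the Kronecker
/// product of column r of `a` with column r of `b`.
/// Hypothetical helper for illustration only, not part of tenrso_kernels.
fn khatri_rao_naive(a: &[Vec<f64>], b: &[Vec<f64>]) -> Vec<Vec<f64>> {
    let rank = a.first().map_or(0, |row| row.len());
    assert!(
        a.iter().chain(b.iter()).all(|row| row.len() == rank),
        "both matrices must have the same number of columns"
    );
    let mut out = Vec::with_capacity(a.len() * b.len());
    for a_row in a {
        for b_row in b {
            // Output row i*J + j holds a[i][r] * b[j][r] for every column r.
            out.push(a_row.iter().zip(b_row).map(|(x, y)| x * y).collect());
        }
    }
    out
}
```

With a 2×2 matrix and a 3×2 matrix as inputs, the result has 6 rows and 2 columns, which mirrors the 10×5 and 8×5 inputs producing an 80×5 output in the Quick Start.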

§Quick Start

use scirs2_core::ndarray_ext::{Array, Array2};
use tenrso_core::DenseND;
use tenrso_kernels::{khatri_rao, mttkrp, nmode_product};

// Khatri-Rao product (for CP decomposition)
let a = Array2::<f64>::ones((10, 5));
let b = Array2::<f64>::ones((8, 5));
let kr = khatri_rao(&a.view(), &b.view());
assert_eq!(kr.shape(), &[80, 5]);

// N-mode product (tensor-matrix multiplication)
let tensor = DenseND::<f64>::ones(&[3, 4, 5]);
let matrix = Array2::<f64>::ones((2, 3));
let result = nmode_product(&tensor.view(), &matrix.view(), 0).unwrap();
assert_eq!(result.shape(), &[2, 4, 5]); // mode-0 changed from 3 to 2

// MTTKRP (core of CP-ALS)
let factors = vec![
    Array2::<f64>::ones((3, 2)),
    Array2::<f64>::ones((4, 2)),
    Array2::<f64>::ones((5, 2)),
];
let factor_views: Vec<_> = factors.iter().map(|f| f.view()).collect();
let mttkrp_result = mttkrp(&tensor.view(), &factor_views, 1).unwrap();
assert_eq!(mttkrp_result.shape(), &[4, 2]);
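
For readers who want to see what the MTTKRP call above computes, the following is a naive reference sketch for a 3-way tensor, written against the standard library only. The name `mttkrp3_naive`, the flat row-major layout, and the nested-`Vec` factor matrices are assumptions made for this illustration; the crate's own kernels operate on array views and use blocked and parallel strategies.

```rust
/// Naive MTTKRP for a 3-way tensor stored row-major in `data` with shape `dims`.
/// `factors[m]` is a dims[m] x R matrix stored as a list of rows.
/// Returns a dims[mode] x R matrix.
/// Hypothetical helper for illustration only, not part of tenrso_kernels.
fn mttkrp3_naive(
    data: &[f64],
    dims: [usize; 3],
    factors: &[Vec<Vec<f64>>; 3],
    mode: usize,
) -> Vec<Vec<f64>> {
    assert!(mode < 3, "mode must be 0, 1, or 2");
    assert_eq!(data.len(), dims[0] * dims[1] * dims[2], "shape mismatch");
    let rank = factors[0][0].len();
    let mut out = vec![vec![0.0; rank]; dims[mode]];
    for i in 0..dims[0] {
        for j in 0..dims[1] {
            for k in 0..dims[2] {
                let x = data[(i * dims[1] + j) * dims[2] + k];
                let idx = [i, j, k];
                for r in 0..rank {
                    // Multiply by the factor entries of every mode except `mode`.
                    let mut term = x;
                    for m in 0..3 {
                        if m != mode {
                            term *= factors[m][idx[m]][r];
                        }
                    }
                    out[idx[mode]][r] += term;
                }
            }
        }
    }
    out
}
```

For an all-ones 3×4×5 tensor with all-ones rank-2 factors and `mode = 1`, every entry of the 4×2 result is the number of (i, k) pairs, that is 3 × 5 = 15.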

§Performance

The kernels rely on the following optimizations:

  • SIMD acceleration via scirs2_core
  • Parallel execution for large problems (feature-gated)
  • Cache-efficient tiling for MTTKRP
  • Zero-copy views to minimize allocations

Typical performance (see PERFORMANCE.md for details):

  • Khatri-Rao: 1.5 Gelem/s (serial), 3× speedup (parallel)
  • MTTKRP: 13.3 Gelem/s (blocked parallel)
  • N-mode: >5 Gelem/s sustained
  • Hadamard in-place: 11 Gelem/s, 2.7× faster than allocating
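
The gap between the allocating and in-place Hadamard variants comes down to whether a new output buffer is created. The sketch below shows the two shapes of the operation on plain slices; the function names `hadamard_alloc_sketch` and `hadamard_inplace_sketch` are hypothetical and the code makes no claim about how this crate implements or vectorizes either variant.

```rust
/// Allocating variant: returns a fresh vector holding a[i] * b[i].
/// Hypothetical helper for illustration only.
fn hadamard_alloc_sketch(a: &[f64], b: &[f64]) -> Vec<f64> {
    assert_eq!(a.len(), b.len(), "operands must have the same length");
    a.iter().zip(b).map(|(x, y)| x * y).collect()
}

/// In-place variant: overwrites `a` with a[i] * b[i], allocating nothing.
/// Hypothetical helper for illustration only.
fn hadamard_inplace_sketch(a: &mut [f64], b: &[f64]) {
    assert_eq!(a.len(), b.len(), "operands must have the same length");
    for (x, y) in a.iter_mut().zip(b) {
        *x *= *y;
    }
}
```

Both functions produce the same values; the in-place form reuses the first operand's storage, which is typically what makes it cheaper when the original contents are no longer needed.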

§Usage Recommendations

| Operation               | When to Use        | Notes                           |
|-------------------------|--------------------|---------------------------------|
| khatri_rao_parallel     | Matrices ≥200 rows | 2-3× speedup                    |
| kronecker_parallel      | Rarely beneficial  | Use serial version              |
| hadamard_inplace        | Always             | 2-3× faster than allocating     |
| mttkrp_blocked          | Tensors ≥20³       | Cache-efficient                 |
| mttkrp_blocked_parallel | Tensors ≥30³       | 4-5× speedup                    |
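
One way to apply the MTTKRP rows of these recommendations is a small size-based dispatcher. The sketch below is hypothetical: the enum, the function name, and the reading of "≥20³" and "≥30³" as total element counts (8,000 and 27,000) are assumptions, and real thresholds are best confirmed by benchmarking on the target machine.

```rust
/// Which MTTKRP variant to call. Hypothetical enum for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MttkrpVariant {
    Standard,
    Blocked,
    BlockedParallel,
}

/// Picks a variant from the tensor shape, treating the thresholds in the
/// recommendations as total element counts (an assumption).
fn choose_mttkrp_variant(dims: &[usize]) -> MttkrpVariant {
    let elems: usize = dims.iter().product();
    if elems >= 30usize.pow(3) {
        MttkrpVariant::BlockedParallel
    } else if elems >= 20usize.pow(3) {
        MttkrpVariant::Blocked
    } else {
        MttkrpVariant::Standard
    }
}
```

A 10×10×10 tensor maps to `Standard`, 20×20×20 to `Blocked`, and 30×30×30 to `BlockedParallel`.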

§Examples

The examples/ directory contains comprehensive demonstrations:

  • khatri_rao.rs - Khatri-Rao product with parallel speedup measurements
  • mttkrp_cp.rs - CP-ALS iteration and MTTKRP variants
  • nmode_tucker.rs - Tucker decomposition and compression

Run with:

cargo run --example khatri_rao --features parallel
cargo run --example mttkrp_cp --features parallel
cargo run --example nmode_tucker

§Features

  • parallel (default) - Enable parallel implementations using rayon
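
As a rough illustration of what a feature gate like `parallel` does at compile time, the sketch below selects between two implementations of the same function with `cfg` attributes. The function `sum_of_squares` is hypothetical, and scoped standard-library threads stand in for rayon so the example has no dependencies; it does not reflect how this crate structures its parallel code.

```rust
/// Compiled only when the `parallel` feature is enabled.
/// Splits the input in two and processes one half on a scoped thread.
/// Hypothetical stand-in for a rayon-based implementation.
#[cfg(feature = "parallel")]
fn sum_of_squares(data: &[f64]) -> f64 {
    let (left, right) = data.split_at(data.len() / 2);
    std::thread::scope(|s| {
        let handle = s.spawn(|| left.iter().map(|x| x * x).sum::<f64>());
        let right_sum: f64 = right.iter().map(|x| x * x).sum();
        handle.join().unwrap() + right_sum
    })
}

/// Compiled when the `parallel` feature is disabled: plain serial loop.
#[cfg(not(feature = "parallel"))]
fn sum_of_squares(data: &[f64]) -> f64 {
    data.iter().map(|x| x * x).sum()
}
```

Callers use `sum_of_squares` the same way in both configurations; building with `--no-default-features` would select the serial path in a crate that declares `parallel` as a default feature.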

§SciRS2 Integration

This crate uses scirs2-core for all array operations and numerical computations. Direct use of ndarray, rand, or num-traits is not permitted. See SCIRS2_INTEGRATION_POLICY.md for details.

§Re-exports

pub use error::KernelError;
pub use error::KernelResult;
pub use contractions::*;
pub use hadamard::*;
pub use khatri_rao::*;
pub use kronecker::*;
pub use mttkrp::*;
pub use nmode::*;
pub use outer::*;
pub use randomized::*;
pub use reductions::*;
pub use tt_ops::*;
pub use utils::*;

§Modules

contractions
Tensor contraction operations
error
Error types for tensor kernel operations
hadamard
Hadamard (element-wise) product implementation
khatri_rao
Khatri-Rao product (column-wise Kronecker product)
kronecker
Kronecker product implementation
mttkrp
MTTKRP (Matricized Tensor Times Khatri-Rao Product) implementation
nmode
N-mode product implementation (TTM - Tensor Times Matrix)
outer
Outer product operations for tensor construction
randomized
Randomized tensor operations for large-scale decompositions.
reductions
Tensor reduction operations
tt_ops
Tensor Train (TT) operations for TT decomposition and manipulation.
utils
Utility functions and helpers for tensor kernel operations