tenrso_kernels/
lib.rs

//! # tenrso-kernels
//!
//! High-performance tensor kernel operations for TenRSo.
//!
//! **Version:** 0.1.0-alpha.2
//! **Tests:** 138 passing (100%)
//! **Status:** Production-ready with comprehensive statistical toolkit
//!
//! ## Overview
//!
//! This crate provides optimized implementations of fundamental tensor operations
//! used in tensor decompositions (CP-ALS, Tucker, TT) and tensor computations.
//!
//! **Key Features:**
//! - ✅ **Khatri-Rao product** - Column-wise Kronecker product (serial & parallel)
//! - ✅ **Kronecker product** - Tensor product of matrices (serial & parallel)
//! - ✅ **Hadamard product** - Element-wise multiplication (allocating & in-place)
//! - ✅ **N-mode products** - Tensor-matrix multiplication along any mode
//! - ✅ **Tensor-Tensor Product (TTT)** - General tensor contraction operation
//! - ✅ **MTTKRP** - Core CP-ALS kernel (standard, blocked, fused, parallel variants)
//! - ✅ **Outer products** - Tensor construction from vectors
//! - ✅ **Tucker operator** - Multi-mode products with automatic optimization
//! - ✅ **Tensor Train (TT) operations** - TT orthogonalization, norm, dot product
//! - ✅ **Blocked/tiled operations** - Cache-efficient implementations
//! - ✅ **Tensor contractions** - Generalized tensor contraction primitives
//! - ✅ **Tensor reductions** - Sum, mean, variance, std, norms, percentiles, median, skewness, kurtosis, covariance, correlation
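//!
//! To make "column-wise Kronecker product" concrete, here is a minimal,
//! std-only sketch of the Khatri-Rao product. It is not this crate's
//! implementation: `khatri_rao_sketch` is a hypothetical name, the matrices
//! are plain row-major `Vec<Vec<f64>>`, and the row ordering (row `i * J + j`
//! pairs row `i` of `a` with row `j` of `b`) is an assumption.
//!
//! ```rust
//! /// Khatri-Rao product of an I x R and a J x R matrix, giving (I * J) x R.
//! /// Hypothetical illustration only; not the crate's `khatri_rao`.
//! fn khatri_rao_sketch(a: &[Vec<f64>], b: &[Vec<f64>]) -> Vec<Vec<f64>> {
//!     // Both inputs must share the same number of columns R.
//!     let cols = a.first().map_or(0, |row| row.len());
//!     assert!(
//!         a.iter().chain(b.iter()).all(|row| row.len() == cols),
//!         "all rows must have the same number of columns"
//!     );
//!
//!     let mut out = Vec::with_capacity(a.len() * b.len());
//!     for a_row in a {
//!         for b_row in b {
//!             // Output row i * J + j is the element-wise product of
//!             // row i of `a` and row j of `b` (assumed ordering).
//!             out.push(a_row.iter().zip(b_row).map(|(x, y)| x * y).collect());
//!         }
//!     }
//!     out
//! }
//! ```
//!
//! Under these assumptions, a 2 x 2 and a 3 x 2 input produce a 6 x 2 result,
//! which matches the shape rule used in the Quick Start (10 x 5 and 8 x 5 give 80 x 5).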
//!
//! ## Quick Start
//!
//! ```rust
//! use scirs2_core::ndarray_ext::Array2;
//! use tenrso_core::DenseND;
//! use tenrso_kernels::{khatri_rao, mttkrp, nmode_product};
//!
//! // Khatri-Rao product (for CP decomposition): (10 x 5) and (8 x 5) -> (80 x 5)
//! let a = Array2::<f64>::ones((10, 5));
//! let b = Array2::<f64>::ones((8, 5));
//! let kr = khatri_rao(&a.view(), &b.view());
//! assert_eq!(kr.shape(), &[80, 5]);
//!
//! // N-mode product (tensor-matrix multiplication)
//! let tensor = DenseND::<f64>::ones(&[3, 4, 5]);
//! let matrix = Array2::<f64>::ones((2, 3));
//! let result = nmode_product(&tensor.view(), &matrix.view(), 0).unwrap();
//! assert_eq!(result.shape(), &[2, 4, 5]); // mode-0 changed from 3 to 2
//!
//! // MTTKRP (core of CP-ALS): one rank-2 factor matrix per tensor mode
//! let factors = vec![
//!     Array2::<f64>::ones((3, 2)),
//!     Array2::<f64>::ones((4, 2)),
//!     Array2::<f64>::ones((5, 2)),
//! ];
//! let factor_views: Vec<_> = factors.iter().map(|f| f.view()).collect();
//! // Mode 1 has size 4, so the result is 4 x 2.
//! let mttkrp_result = mttkrp(&tensor.view(), &factor_views, 1).unwrap();
//! assert_eq!(mttkrp_result.shape(), &[4, 2]);
//! ```
//!
//! ## Performance
//!
//! Operations are optimized through:
//! - **SIMD acceleration** via `scirs2_core`
//! - **Parallel execution** for large problems (behind the `parallel` feature)
//! - **Cache-efficient tiling** for MTTKRP
//! - **Zero-copy views** to minimize allocations
//!
//! Typical throughput (see `PERFORMANCE.md` for details):
//! - Khatri-Rao: **1.5 Gelem/s** (serial), **3× speedup** (parallel)
//! - MTTKRP: **13.3 Gelem/s** (blocked parallel)
//! - N-mode: **>5 Gelem/s** sustained
//! - Hadamard in-place: **11 Gelem/s**, **2.7× faster** than allocating
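//!
//! The gap between the allocating and in-place Hadamard variants comes down to
//! whether a fresh output buffer is created. The std-only sketch below shows
//! the two shapes of the operation on flat slices. Both function names are
//! hypothetical and do not reflect the signatures of this crate's `hadamard`
//! module.
//!
//! ```rust
//! /// Allocating variant: returns a new buffer holding a[i] * b[i].
//! /// Hypothetical illustration only.
//! fn hadamard_alloc_sketch(a: &[f64], b: &[f64]) -> Vec<f64> {
//!     assert_eq!(a.len(), b.len(), "operands must have equal length");
//!     a.iter().zip(b).map(|(x, y)| x * y).collect()
//! }
//!
//! /// In-place variant: overwrites `a` with a[i] * b[i], with no allocation.
//! /// Hypothetical illustration only.
//! fn hadamard_in_place_sketch(a: &mut [f64], b: &[f64]) {
//!     assert_eq!(a.len(), b.len(), "operands must have equal length");
//!     for (x, y) in a.iter_mut().zip(b) {
//!         *x *= *y;
//!     }
//! }
//! ```
//!
//! The in-place form only applies when the first operand may be overwritten.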
//!
//! ## Usage Recommendations
//!
//! | Function | When to Use | Notes |
//! |----------|-------------|-------|
//! | `khatri_rao_parallel` | Matrices with ≥200 rows | 2-3× speedup |
//! | `kronecker_parallel` | Rarely beneficial | Prefer the serial version |
//! | `hadamard_inplace` | Always | 2-3× faster than allocating |
//! | `mttkrp_blocked` | Tensors ≥20³ | Cache-efficient |
//! | `mttkrp_blocked_parallel` | Tensors ≥30³ | 4-5× speedup |
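//!
//! The MTTKRP rows of the table can be read as a simple size-based dispatch.
//! The sketch below is one way to encode that rule. It is not part of this
//! crate: `MttkrpVariant` and `choose_mttkrp_variant` are hypothetical names,
//! and interpreting "≥20³" and "≥30³" as total element counts (8,000 and
//! 27,000) is an assumption.
//!
//! ```rust
//! /// Which MTTKRP variant to call (hypothetical helper type).
//! #[derive(Debug, PartialEq)]
//! enum MttkrpVariant {
//!     Standard,
//!     Blocked,
//!     BlockedParallel,
//! }
//!
//! // Thresholds from the table, read as element counts (assumption).
//! const BLOCKED_MIN_ELEMS: usize = 20 * 20 * 20;
//! const BLOCKED_PARALLEL_MIN_ELEMS: usize = 30 * 30 * 30;
//!
//! /// Picks a variant from the tensor shape using the thresholds above.
//! fn choose_mttkrp_variant(shape: &[usize]) -> MttkrpVariant {
//!     let elems: usize = shape.iter().product();
//!     if elems >= BLOCKED_PARALLEL_MIN_ELEMS {
//!         MttkrpVariant::BlockedParallel
//!     } else if elems >= BLOCKED_MIN_ELEMS {
//!         MttkrpVariant::Blocked
//!     } else {
//!         MttkrpVariant::Standard
//!     }
//! }
//! ```
//!
//! The thresholds are starting points; benchmark on the target hardware before
//! fixing them.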
//!
//! ## Examples
//!
//! The `examples/` directory contains runnable demonstrations:
//! - `khatri_rao.rs` - Khatri-Rao product with parallel speedup measurements
//! - `mttkrp_cp.rs` - CP-ALS iteration and MTTKRP variants
//! - `nmode_tucker.rs` - Tucker decomposition and compression
//!
//! Run them with:
//! ```bash
//! cargo run --example khatri_rao --features parallel
//! cargo run --example mttkrp_cp --features parallel
//! cargo run --example nmode_tucker
//! ```
//!
//! ## Features
//!
//! - `parallel` (default) - Enables parallel implementations using `rayon`
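//!
//! Because `parallel` is a default feature, a serial-only build has to opt out
//! of default features. A sketch of the `Cargo.toml` entries, assuming the
//! crate is pulled in by version from a registry (adjust to a `path` or `git`
//! dependency as needed):
//!
//! ```toml
//! [dependencies]
//! # Default build: the `parallel` feature is enabled.
//! tenrso-kernels = "0.1.0-alpha.2"
//!
//! # Serial-only alternative: disable default features instead.
//! # tenrso-kernels = { version = "0.1.0-alpha.2", default-features = false }
//! ```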
//!
//! ## SciRS2 Integration
//!
//! This crate uses `scirs2-core` for all array operations and numerical computations.
//! Direct use of `ndarray`, `rand`, or `num-traits` is not permitted.
//! See `SCIRS2_INTEGRATION_POLICY.md` for details.

#![deny(warnings)]

pub mod contractions;
pub mod error;
pub mod hadamard;
pub mod khatri_rao;
pub mod kronecker;
pub mod mttkrp;
pub mod nmode;
pub mod outer;
pub mod randomized;
pub mod reductions;
pub mod tt_ops;
pub mod utils;

#[cfg(test)]
mod property_tests;

// Re-exports
pub use contractions::*;
pub use error::{KernelError, KernelResult};
pub use hadamard::*;
pub use khatri_rao::*;
pub use kronecker::*;
pub use mttkrp::*;
pub use nmode::*;
pub use outer::*;
pub use randomized::*;
pub use reductions::*;
pub use tt_ops::*;
pub use utils::*;