//! Shared sparse matrix utilities
//!
//! This module provides backend-agnostic utilities for sparse matrix operations.
//! Functions here are used by both CPU and GPU backends to ensure consistency.
use crateElement;
/// Compute the zero-tolerance threshold for sparse operations.
///
/// Returns the threshold below which values are considered zero and eliminated
/// from sparse representations. The threshold is dtype-dependent to account for
/// different numeric precision levels.
///
/// # Rationale
///
/// Sparse matrices rely on the assumption that most values are exactly zero.
/// However, due to floating-point rounding errors, operations like subtraction
/// can produce values like `1e-16` instead of exactly `0.0`. These "near-zeros"
/// should be eliminated to:
/// 1. Maintain sparsity (avoid memory bloat)
/// 2. Preserve numerical stability (avoid accumulation of rounding errors)
/// 3. Match user expectations (A - A should be truly sparse)
///
/// # Precision Levels
///
/// - **F64/I64/U64**: `1e-15` - A few multiples of `f64` machine epsilon (~2.2e-16), preserves maximum precision
/// - **F32/I32/U32**: `1e-7` - Close to `f32` machine epsilon (~1.2e-7), balances precision and sparsity
/// - **F16/BF16/I16/U16**: `1e-3` - Aggressive threshold due to limited precision
/// - **FP8/I8/U8**: `1e-2` - Very aggressive due to extreme quantization
///
/// # Backend Consistency
///
/// This function is used identically by all backends (CPU, CUDA, WebGPU) to ensure
/// that sparse operations produce the same sparsity pattern regardless of where
/// they execute. The same input tensors always produce the same output structure.
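///
/// # Example
///
/// The sketch below illustrates how the thresholds listed under "Precision Levels"
/// might be applied when pruning near-zeros. It is a minimal sketch, not this
/// module's implementation: `Width`, `tolerance_for`, and `prune_near_zeros` are
/// hypothetical names, and the `(index, value)` entry layout is assumed purely
/// for illustration.
///
/// ```rust
/// // Hypothetical precision tag; the real code keys on the element dtype.
/// #[derive(Clone, Copy, Debug, PartialEq)]
/// enum Width {
///     Bits64,
///     Bits32,
///     Bits16,
///     Bits8,
/// }
///
/// // Map a precision level to the tolerance documented above.
/// fn tolerance_for(width: Width) -> f64 {
///     match width {
///         Width::Bits64 => 1e-15,
///         Width::Bits32 => 1e-7,
///         Width::Bits16 => 1e-3,
///         Width::Bits8 => 1e-2,
///     }
/// }
///
/// // Keep only entries whose magnitude is at or above the tolerance;
/// // anything strictly below it is treated as zero and dropped.
/// fn prune_near_zeros(entries: &[(usize, f64)], tol: f64) -> Vec<(usize, f64)> {
///     entries
///         .iter()
///         .copied()
///         .filter(|&(_, value)| value.abs() >= tol)
///         .collect()
/// }
/// ```
///
/// Under these assumptions, pruning with the 64-bit tolerance drops the rounding
/// residue of `0.1 + 0.2 - 0.3` (about `5.55e-17`) while keeping a genuine entry
/// such as `1e-3`. The same entry would be dropped under the 8-bit tolerance.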