1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
//! Stochastic Taylor Derivative Estimators (STDE).
//!
//! # Custom-ops caveat
//!
//! Every estimator in this module takes `&BytecodeTape<F>` (immutable),
//! so none of them call `tape.forward(x)` before the Taylor pass.
//! `forward_tangent` linearizes custom ops around the **recording-time**
//! primals stored in `tape.values`, not the call-time `x`. For tapes
//! containing custom ops evaluated at `x ≠ x_record`, estimator output
//! for second- and higher-order coefficients (Laplacian, Hessian
//! diagonal, Taylor-jet composition) is biased by an amount that does
//! not vanish as `S → ∞`. First-order estimators like `divergence` get
//! an `O(‖x − x_record‖)` linear bias through custom ops.
//!
//! If your tape contains custom ops and you need estimators at points
//! other than the recording `x`, either (a) re-record with a fresh
//! tape at the new `x`, or (b) refactor the custom op into primitive
//! operations so `forward_tangent`'s Taylor propagation remains exact.
//!
//! Estimate differential operators (Laplacian, Hessian diagonal, directional
//! derivatives) by pushing random direction vectors through Taylor-mode AD.
//!
//! # How it works
//!
//! For f: R^n -> R at point x, define g(t) = f(x + t*v) where v is a
//! direction vector. The Taylor coefficients of g at t=0 are:
//!
//! - c0 = f(x)
//! - c1 = nabla f(x) . v (directional first derivative)
//! - c2 = v^T H_f(x) v / 2 (half the directional second derivative)
//!
//! By choosing v appropriately (Rademacher, Gaussian, coordinate basis),
//! we can estimate operators like the Laplacian in O(S*K*L) time instead
//! of O(n^2*L) for the full Hessian.
//!
//! # Variance properties
//!
//! The Hutchinson estimator `(1/S) sum_s v_s^T H v_s` is unbiased when
//! E\[vv^T\] = I. Its variance depends on the distribution of v:
//!
//! - **Rademacher** (entries ±1): Var = `2 sum_{i≠j} H_ij^2`. The diagonal
//! contributes zero variance since `v_i^2 = 1` always.
//! - **Gaussian** (v ~ N(0,I)): Var = `2 ||H||_F^2`. Higher variance than
//! Rademacher because `v_i^2 ~ chi-squared(1)` introduces diagonal noise.
//!
//! The [`laplacian_with_control`] function reduces Gaussian variance to match
//! Rademacher by subtracting the exact diagonal contribution (a control
//! variate). This requires the diagonal from [`hessian_diagonal`] (n extra
//! evaluations). For Rademacher directions, the control variate has no effect
//! since the diagonal variance is already zero.
//!
//! **Antithetic sampling** (pairing +v with -v) does **not** reduce variance
//! for trace estimation. The Hutchinson sample `v^T H v` is quadratic (even)
//! in v, so `(-v)^T H (-v) = v^T H v` — antithetic pairs are identical.
//!
//! # Design
//!
//! - **No `rand` dependency**: all functions accept user-provided direction
//! vectors. The library stays pure; users bring their own RNG.
//! - **`Taylor<F, 3>`** for second-order operators: stack-allocated, Copy,
//! monomorphized. The order K=3 is statically known.
//! - **`TaylorDyn`** variants for runtime-determined order.
//! - **Panics on misuse**: dimension mismatches panic, following existing
//! API conventions (`record`, `grad`, `hvp`).
//!
//! # Const-Generic Higher-Order Diagonal
//!
//! [`diagonal_kth_order_const`] is a stack-allocated variant of
//! [`diagonal_kth_order`] for compile-time-known derivative order. It uses
//! `Taylor<F, ORDER>` directly (no `TaylorDyn` arena), which is faster when
//! the order is statically known. `ORDER = k + 1` where k is the derivative
//! order. For f32, practical limit is `ORDER ≤ 14` (k ≤ 13) since `k! > 2^23`
//! causes precision loss.
//!
//! # Higher-Order Estimation
//!
//! [`diagonal_kth_order`] generalises [`hessian_diagonal`] from k=2 to
//! arbitrary k, computing exact `[∂^k u/∂x_j^k]` for all coordinates via
//! `TaylorDyn` jets of order k+1. The stochastic variant
//! [`diagonal_kth_order_stochastic`] subsamples coordinates for an unbiased
//! estimate of `Σ_j ∂^k u/∂x_j^k`.
//!
//! # Dense STDE for Positive-Definite Operators
//!
//! [`dense_stde_2nd`] estimates `tr(C · H_u(x)) = Σ_{ij} C_{ij} ∂²u/∂x_i∂x_j`
//! for a positive-definite coefficient matrix C. The caller provides a Cholesky
//! factor L (such that `C = L L^T`) and standard Gaussian vectors z. Internally,
//! `v = L · z` is computed, then pushed through a second-order Taylor jet. When
//! `L = I`, this reduces to the Hutchinson Laplacian estimator.
//!
//! # Dense STDE for Indefinite Operators (requires `nalgebra`)
//!
//! [`dense_stde_2nd_indefinite`] handles arbitrary symmetric C matrices by
//! eigendecomposing into positive and negative parts. Near-zero eigenvalues are
//! clamped to prevent sign-flipping from floating-point noise.
//!
//! # Parabolic PDE σ-Transform
//!
//! [`parabolic_diffusion`] computes `½ tr(σσ^T · Hess u)` for parabolic PDEs
//! (Fokker-Planck, Black-Scholes, HJB) by pushing columns of σ through
//! second-order Taylor jets. This avoids forming the off-diagonal Hessian
//! entries that a naïve `tr(A·H)` computation would require.
//!
//! # Sparse STDE (requires `stde` + `diffop`)
//!
//! `stde_sparse` estimates arbitrary differential operators `Lu(x)` by
//! sampling sparse k-jets from a `SparseSamplingDistribution` built via
//! `DiffOp::sparse_distribution`. Each sample is a single forward
//! pushforward; the per-sample estimator is `sign(C_α) · Z · D^α u` where
//! Z = Σ|C_α| is the normalization constant. This implements the core
//! contribution of Shi et al. (NeurIPS 2024).
pub use ;
pub use ;
pub use ;
pub use ;
pub use dense_stde_2nd_indefinite;
pub use ;
pub use ;
pub use ;
pub use stde_sparse;