GPU substrate for de-nested cubic-cell derivative moments.
This module is the shared GPU evaluator for the de-nested cubic transport
kernel that currently lives in src/families/cubic_cell_kernel.rs. For
each partition cell (left, right, c_0, c_1, c_2, c_3) it computes the
derivative-moment vector
M_k = ∫_{left}^{right} z^k · exp(-q(z)) dz, k = 0..=max_degree,
q(z) = 0.5 · (z² + η(z)²),
η(z) = c_0 + c_1·z + c_2·z² + c_3·z³.
Three branches feed into the same device API:
- Affine (c_2 = c_3 = 0, finite interval): closed-form via the T_n(a,b) recurrence used by affine_anchor_moment_vector_into.
- Non-affine finite: fixed 384-point Gauss–Legendre on the cell.
- Affine tail: closed-form on a semi-infinite (or whole-line) interval.
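As an illustration of the non-affine finite branch, the following is a minimal CPU sketch of Gauss–Legendre quadrature for the moment vector M_k defined above. It is not the module's code: the function names `gauss_legendre` and `cell_moments` are hypothetical, and the nodes are generated here by Newton iteration on the Legendre polynomial, whereas the real kernel may well use a precomputed table.

```rust
/// Hypothetical helper: Gauss–Legendre nodes and weights on [-1, 1],
/// found by Newton iteration on the Legendre polynomial P_n.
fn gauss_legendre(n: usize) -> (Vec<f64>, Vec<f64>) {
    let mut x = vec![0.0; n];
    let mut w = vec![0.0; n];
    for i in 0..(n + 1) / 2 {
        // Standard cosine initial guess for the i-th largest root.
        let mut z = (std::f64::consts::PI * (i as f64 + 0.75) / (n as f64 + 0.5)).cos();
        let mut dp = 1.0;
        for _ in 0..100 {
            // Three-term recurrence: after the loop p1 = P_n(z), p0 = P_{n-1}(z).
            let (mut p0, mut p1) = (1.0, z);
            for j in 2..=n {
                let jf = j as f64;
                let p2 = ((2.0 * jf - 1.0) * z * p1 - (jf - 1.0) * p0) / jf;
                p0 = p1;
                p1 = p2;
            }
            // Derivative P_n'(z) from P_n and P_{n-1}.
            dp = n as f64 * (z * p1 - p0) / (z * z - 1.0);
            let dz = p1 / dp;
            z -= dz;
            if dz.abs() < 1e-15 {
                break;
            }
        }
        // Roots are symmetric about zero; fill both halves at once.
        x[i] = -z;
        x[n - 1 - i] = z;
        w[i] = 2.0 / ((1.0 - z * z) * dp * dp);
        w[n - 1 - i] = w[i];
    }
    (x, w)
}

/// Hypothetical helper: M_k = ∫ z^k · exp(-q(z)) dz over [left, right]
/// for k = 0..=max_degree, with q(z) = 0.5 · (z² + η(z)²).
fn cell_moments(
    left: f64,
    right: f64,
    c: [f64; 4],
    max_degree: usize,
    nodes: &[f64],
    weights: &[f64],
) -> Vec<f64> {
    let half = 0.5 * (right - left);
    let mid = 0.5 * (left + right);
    let mut m = vec![0.0; max_degree + 1];
    for (&t, &w) in nodes.iter().zip(weights) {
        let z = mid + half * t; // map [-1, 1] onto the cell
        let eta = c[0] + z * (c[1] + z * (c[2] + z * c[3])); // Horner form of η(z)
        let f = half * w * (-0.5 * (z * z + eta * eta)).exp();
        let mut zk = 1.0;
        for mk in m.iter_mut() {
            *mk += f * zk; // accumulate z^k · exp(-q(z))
            zk *= z;
        }
    }
    m
}
```

Calling `gauss_legendre(384)` once and reusing the nodes and weights for every cell mirrors the "fixed" rule described above. An n-point Gauss–Legendre rule integrates polynomials up to degree 2n − 1 exactly, so the error here comes only from how well exp(-q(z)) is approximated by a polynomial on the cell.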
This is distinct from src/gpu/cubic_bspline_moments.rs, which
computes tensor B-spline cell moments. The two modules share neither math
nor data layout: do not conflate them.
Layout
- [branch]: host-side branch classifier; mirrors cubic_cell_kernel::branch_cell + the semi-infinite tail logic of evaluate_cell_state_dispatched.
- [host_substrate]: CPU-resident implementation. Works on every platform and is the parity reference for the device kernel.
- [kernel_src]: NVRTC-compilable CUDA C++ source as Rust string constants (D9 / D15 / D21 specializations).
- [device]: Linux+CUDA dispatcher that compiles, launches, and gathers the NVRTC kernel for the NonAffineFinite bucket; Affine / AffineTail buckets stay on CPU until Stage-2.
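The classification and bucketing described in this layout could look roughly like the sketch below. Everything in it is an assumption made for illustration: the `Cell` struct, the `classify_cell` and `device_bucket` functions, the exact-zero test on c_2 and c_3, and the choice to return `None` for a non-affine cell on an unbounded interval are not taken from the module. Only the three bucket names come from the description above.

```rust
/// Bucket names as used in the module description.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Branch {
    Affine,
    NonAffineFinite,
    AffineTail,
}

/// Hypothetical cell record: (left, right, c_0, c_1, c_2, c_3).
#[derive(Debug, Clone, Copy)]
struct Cell {
    left: f64,
    right: f64,
    c: [f64; 4],
}

/// Hypothetical classifier. Assumes "affine" means c_2 and c_3 are exactly
/// zero; a real classifier might use a tolerance instead.
fn classify_cell(cell: &Cell) -> Option<Branch> {
    let affine = cell.c[2] == 0.0 && cell.c[3] == 0.0;
    let finite = cell.left.is_finite() && cell.right.is_finite();
    match (affine, finite) {
        (true, true) => Some(Branch::Affine),
        (false, true) => Some(Branch::NonAffineFinite),
        (true, false) => Some(Branch::AffineTail),
        // Assumption: the three documented branches do not cover this case.
        (false, false) => None,
    }
}

/// Indices of the cells that would be sent to the device kernel,
/// i.e. the NonAffineFinite bucket; all other cells stay on the CPU.
fn device_bucket(cells: &[Cell]) -> Vec<usize> {
    cells
        .iter()
        .enumerate()
        .filter(|(_, cell)| classify_cell(cell) == Some(Branch::NonAffineFinite))
        .map(|(i, _)| i)
        .collect()
}
```

Returning indices rather than copies of the cells would let results from the device be scattered back into the original cell order after the gather step.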