Trait RowJet 

pub trait RowJet<const K: usize>: Copy {
    type Value: Copy;

    // Required methods
    fn constant(c: f64) -> Self;
    fn variable(x: f64, slot: usize) -> Self;
    fn values(&self) -> Self::Value;
    fn add(&self, o: &Self) -> Self;
    fn sub(&self, o: &Self) -> Self;
    fn mul(&self, o: &Self) -> Self;
    fn scale(&self, s: f64) -> Self;
    fn compose_unary_with<const N: usize>(
        &self,
        stack_fn: impl Fn(f64) -> [f64; N],
    ) -> Self;
    fn guard(&self, pred: impl Fn(f64) -> bool) -> GuardVerdict;
    fn scale_rows(&self, s: Self::Value) -> Self;
    fn pack_rows(rows: &[usize], value_of: impl Fn(usize) -> f64) -> Self::Value;

    // Provided methods
    fn neg(&self) -> Self { ... }
    fn exp(&self) -> Self { ... }
    fn ln(&self) -> Self { ... }
    fn sqrt(&self) -> Self { ... }
    fn recip(&self) -> Self { ... }
    fn powf(&self, a: f64) -> Self { ... }
    fn ln_gamma(&self) -> Self { ... }
    fn digamma(&self) -> Self { ... }
}

The shared row-NLL algebra over BOTH the scalar jets and the f64x4 lane towers. It is the bound that lets ONE single-source row-NLL body SIMD-batch 4 rows per pass without a dual-source copy (module §“The RowJet bridge”).

Every scalar crate::jet_scalar::JetScalar<K> is a RowJet<K> via the blanket impl below (Value = f64), with each op bit-identical to the corresponding JetScalar method. Tower3Lane / Tower4Lane over f64x4 are RowJet<K> with Value = [f64; 4], routing through their per-lane methods so that lane i of a batched evaluation is to_bits-identical to the scalar evaluation on row i.
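The single-source idea can be illustrated with a minimal sketch that uses toy first-order dual numbers in place of the crate's real towers. Everything here (`ToyJet`, `Dual`, `Dual4`, `body`) is a hypothetical stand-in chosen for illustration, not the crate's API; the point is only that one generic body instantiates at both representations and agrees bit for bit per lane.

```rust
/// Toy stand-in for the trait: only the pieces this sketch needs (hypothetical).
trait ToyJet: Copy {
    type Value: Copy;
    fn constant(c: f64) -> Self;
    fn values(&self) -> Self::Value;
    fn add(&self, o: &Self) -> Self;
    fn mul(&self, o: &Self) -> Self;
}

/// Scalar first-order dual number: a value and one derivative.
#[derive(Clone, Copy)]
struct Dual { v: f64, d: f64 }

/// Four rows evaluated side by side, one lane per row.
#[derive(Clone, Copy)]
struct Dual4 { v: [f64; 4], d: [f64; 4] }

impl ToyJet for Dual {
    type Value = f64;
    fn constant(c: f64) -> Self { Dual { v: c, d: 0.0 } }
    fn values(&self) -> f64 { self.v }
    fn add(&self, o: &Self) -> Self { Dual { v: self.v + o.v, d: self.d + o.d } }
    fn mul(&self, o: &Self) -> Self {
        Dual { v: self.v * o.v, d: self.d * o.v + self.v * o.d }
    }
}

impl ToyJet for Dual4 {
    type Value = [f64; 4];
    fn constant(c: f64) -> Self { Dual4 { v: [c; 4], d: [0.0; 4] } }
    fn values(&self) -> [f64; 4] { self.v }
    fn add(&self, o: &Self) -> Self {
        let mut r = *self;
        for i in 0..4 {
            r.v[i] = self.v[i] + o.v[i];
            r.d[i] = self.d[i] + o.d[i];
        }
        r
    }
    fn mul(&self, o: &Self) -> Self {
        let mut r = *self;
        for i in 0..4 {
            // Same terms, same order as the scalar mul, applied to lane i.
            r.v[i] = self.v[i] * o.v[i];
            r.d[i] = self.d[i] * o.v[i] + self.v[i] * o.d[i];
        }
        r
    }
}

/// One body, written once over the trait: x*x + 3*x.
fn body<R: ToyJet>(x: R) -> R {
    x.mul(&x).add(&R::constant(3.0).mul(&x))
}
```

Calling `body` with a `Dual4` seeded from four rows and with a `Dual` seeded from any one of those rows runs the same arithmetic in the same order, which is why the per-lane results can match to the bit in this toy setting.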

Required Associated Types

type Value: Copy

The value channel(s) seen by guard and values: a single f64 on a scalar jet, [f64; 4] on an f64x4 lane tower.

Required Methods

fn constant(c: f64) -> Self

A constant (value c, all derivatives zero), broadcast to every lane.

fn variable(x: f64, slot: usize) -> Self

The seeded primary slot at value x (unit first derivative in slot), broadcast to every lane. Per-lane-DISTINCT seeding for the batch path is done by the lane instantiators (generic_batched_fourth_tower / generic_batched_third_tower), which build the tower variables directly from each row’s primaries; this method is for any row-invariant auxiliary variable a body introduces.
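As a rough illustration of what "unit first derivative in slot" means, the sketch below seeds a toy first-order jet over K primaries. `Grad` is a hypothetical type invented for this example; the crate's towers carry higher-order tensors that are not modelled here.

```rust
/// Toy first-order jet over K primaries (hypothetical, not the crate's tower).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Grad<const K: usize> {
    v: f64,
    g: [f64; K],
}

impl<const K: usize> Grad<K> {
    /// Value c with every derivative zero.
    fn constant(c: f64) -> Self {
        Grad { v: c, g: [0.0; K] }
    }

    /// Value x with a unit first derivative in `slot` and zeros elsewhere.
    fn variable(x: f64, slot: usize) -> Self {
        let mut g = [0.0; K];
        g[slot] = 1.0; // panics if slot >= K
        Grad { v: x, g }
    }

    /// First-order product rule, slot by slot.
    fn mul(&self, o: &Self) -> Self {
        let mut g = [0.0; K];
        for i in 0..K {
            g[i] = self.g[i] * o.v + self.v * o.g[i];
        }
        Grad { v: self.v * o.v, g }
    }
}
```

With two primaries seeded in slots 0 and 1, the gradient of their product comes out as the pair of partial derivatives, which is the behaviour the seeding is meant to produce.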

fn values(&self) -> Self::Value

The value channel(s): f64 (scalar) or [f64; 4] (lane).

fn add(&self, o: &Self) -> Self

Truncated Leibniz self + o.

fn sub(&self, o: &Self) -> Self

Truncated Leibniz self − o.

fn mul(&self, o: &Self) -> Self

Truncated Leibniz self · o.
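For intuition, here is a minimal sketch of the truncated Leibniz rules on a toy univariate jet that keeps derivatives up to second order. `Jet2` is a hypothetical type for this example; the crate's towers are multivariate and go to third or fourth order.

```rust
/// Toy univariate jet truncated at second order: (f, f', f'').
#[derive(Clone, Copy, Debug, PartialEq)]
struct Jet2 {
    v: f64,
    d1: f64,
    d2: f64,
}

impl Jet2 {
    /// The identity function at x: value x, first derivative 1.
    fn variable(x: f64) -> Self {
        Jet2 { v: x, d1: 1.0, d2: 0.0 }
    }

    /// Sums and differences act channel by channel.
    fn add(&self, o: &Self) -> Self {
        Jet2 { v: self.v + o.v, d1: self.d1 + o.d1, d2: self.d2 + o.d2 }
    }

    fn sub(&self, o: &Self) -> Self {
        Jet2 { v: self.v - o.v, d1: self.d1 - o.d1, d2: self.d2 - o.d2 }
    }

    /// Leibniz product rule; any term above second order is dropped.
    fn mul(&self, o: &Self) -> Self {
        Jet2 {
            v: self.v * o.v,
            d1: self.d1 * o.v + self.v * o.d1,
            d2: self.d2 * o.v + 2.0 * self.d1 * o.d1 + self.v * o.d2,
        }
    }
}
```

Multiplying `variable(3.0)` by itself twice gives the value and first two derivatives of x³ at 3; the third derivative is not represented, which is the "truncated" part.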

fn scale(&self, s: f64) -> Self

Multiply every channel by the plain scalar s.

fn compose_unary_with<const N: usize>(&self, stack_fn: impl Fn(f64) -> [f64; N]) -> Self

Faà di Bruno composition with a unary special function whose [f64; N] derivative stack is built from the running base value PER LANE through stack_fn. This is the SHARED-TRAIT version of the compose_unary_with inherent method that already exists on both the scalar towers and the lane towers: on a scalar jet stack_fn is run once at the value; on an f64x4 lane tower it is re-run per lane (the four rows carry four distinct base values), so lane i is to_bits-identical to the scalar result on row i. Making it a trait method is precisely what lets a body written once over R: RowJet<K> instantiate at the batch towers. N is widened or narrowed to the tower’s native width by resize_stack (N == 5 is a verbatim copy).
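The composition rule can be sketched on a toy univariate second-order jet. This is an illustration under simplifying assumptions, not the crate's implementation: `Jet2`, `Jet2x4`, and `exp_stack` are hypothetical, the stack width is fixed at 3 (value, first, second derivative), and no stack resizing is modelled.

```rust
/// Toy scalar jet: (u, u', u'').
#[derive(Clone, Copy)]
struct Jet2 { v: f64, d1: f64, d2: f64 }

/// Toy 4-lane jet: one row per lane.
#[derive(Clone, Copy)]
struct Jet2x4 { v: [f64; 4], d1: [f64; 4], d2: [f64; 4] }

impl Jet2 {
    /// stack_fn returns [f(u), f'(u), f''(u)] at the base value u = self.v.
    fn compose_unary_with(&self, stack_fn: impl Fn(f64) -> [f64; 3]) -> Self {
        let s = stack_fn(self.v); // run once at the single value
        Jet2 {
            v: s[0],
            d1: s[1] * self.d1,
            // Faà di Bruno at second order: f''(u) u'^2 + f'(u) u''.
            d2: s[2] * self.d1 * self.d1 + s[1] * self.d2,
        }
    }
}

impl Jet2x4 {
    fn compose_unary_with(&self, stack_fn: impl Fn(f64) -> [f64; 3]) -> Self {
        let mut r = *self;
        for i in 0..4 {
            let s = stack_fn(self.v[i]); // re-run per lane: each row has its own base value
            r.v[i] = s[0];
            r.d1[i] = s[1] * self.d1[i];
            r.d2[i] = s[2] * self.d1[i] * self.d1[i] + s[1] * self.d2[i];
        }
        r
    }
}

/// Derivative stack of exp: every entry equals e^u.
fn exp_stack(u: f64) -> [f64; 3] {
    let e = u.exp();
    [e, e, e]
}
```

Because the lane version evaluates `stack_fn` at each lane's own base value and then applies the same terms in the same order as the scalar version, lane i reproduces the scalar result for row i in this sketch.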

fn guard(&self, pred: impl Fn(f64) -> bool) -> GuardVerdict

Per-lane domain guard: evaluate pred on each active lane’s value channel and report which lanes failed (see GuardVerdict). A scalar jet checks its one value; a lane tower checks all four. This lets a batched program detect an out-of-domain row in a 4-group and bail that group to the scalar tail.
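The guard-and-bail pattern might look roughly like the sketch below. The real `GuardVerdict` type is not shown on this page, so `ToyVerdict` (a failed-lane bitmask) is purely an assumption, as are `guard_lanes`, `guard_scalar`, `ln_one`, and `ln_rows`; plain `f64` values stand in for jets, and all four lanes are treated as active.

```rust
/// Hypothetical stand-in for GuardVerdict: bit i is set when lane i failed.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ToyVerdict {
    failed_mask: u8,
}

impl ToyVerdict {
    fn all_ok(&self) -> bool {
        self.failed_mask == 0
    }
}

/// Lane form: check all four values and record each failure.
fn guard_lanes(v: [f64; 4], pred: impl Fn(f64) -> bool) -> ToyVerdict {
    let mut failed_mask = 0u8;
    for (i, &x) in v.iter().enumerate() {
        if !pred(x) {
            failed_mask |= 1 << i;
        }
    }
    ToyVerdict { failed_mask }
}

/// Scalar form: check the single value.
fn guard_scalar(v: f64, pred: impl Fn(f64) -> bool) -> ToyVerdict {
    ToyVerdict { failed_mask: if pred(v) { 0 } else { 1 } }
}

/// Scalar tail: one row at a time, None for an out-of-domain row.
fn ln_one(x: f64) -> Option<f64> {
    if guard_scalar(x, |x| x > 0.0).all_ok() { Some(x.ln()) } else { None }
}

/// Process rows in groups of four; a group with any failing lane bails to the scalar tail.
fn ln_rows(values: &[f64]) -> Vec<Option<f64>> {
    let mut out = Vec::with_capacity(values.len());
    let mut chunks = values.chunks_exact(4);
    for g in &mut chunks {
        let lanes = [g[0], g[1], g[2], g[3]];
        if guard_lanes(lanes, |x| x > 0.0).all_ok() {
            out.extend(lanes.iter().map(|x| Some(x.ln()))); // batched path
        } else {
            out.extend(lanes.iter().map(|&x| ln_one(x))); // whole group goes scalar
        }
    }
    out.extend(chunks.remainder().iter().map(|&x| ln_one(x)));
    out
}
```

The design choice illustrated is that a single bad row costs only its own group the batched path; clean groups are unaffected.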

fn scale_rows(&self, s: Self::Value) -> Self

Per-lane scale: multiply every channel by the per-lane factor s (Self::Value). On a scalar jet Self::Value = f64, so this is exactly scale and the scalar call sites stay BIT-IDENTICAL when .scale(x) is rewritten to .scale_rows(x); on an f64x4 lane tower Self::Value = [f64; 4] and lane i is multiplied by s[i]. This is the primitive that lets a batched body carry CONTINUOUS per-row data: the survival covariance_ones / z_sum / observation-weight wi factors enter the jet algebra as .scale(per_row_value), and the single-f64 scale would wrongly broadcast one row’s value across all four rows. Build s from the lane→row map with pack_rows.

fn pack_rows(rows: &[usize], value_of: impl Fn(usize) -> f64) -> Self::Value

Gather a per-lane auxiliary datum from the lane→row map rows: value_of(r) is evaluated for each active lane’s row and packed into Self::Value (a single f64 on a scalar jet, [f64; 4] on an f64x4 lane tower). This is how a body written once over RowJet feeds per-row CONTINUOUS data (the arguments to scale_rows) into the batch path without knowing the concrete representation. The program holds the per-row data and the caller threads rows (length 1 on the scalar path, length 4 on the batch path) into RowNllProgramRowJet::row_nll, so the body writes x.scale_rows(R::pack_rows(rows, |r| self.cov(r))). A multiplicative weight buried in a compose_unary_with stack is pulled out the same way: x.compose_unary_with(|u| stack(u, 1.0)).scale_rows(R::pack_rows(rows, |r| self.wi(r))). (Binary per-row branches such as the event indicator di are kept lane-uniform by grouping and the guard bail, not packed.)
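The pairing of pack_rows with scale_rows can be sketched as follows. This is a toy under stated assumptions: `ToyRows`, `Dual`, `Dual4`, and `Program` are hypothetical, the jets are first order only, and inactive lanes are zero-filled here, which the page does not specify.

```rust
/// Toy slice of the trait: just the two per-row primitives (hypothetical).
trait ToyRows: Copy {
    type Value: Copy;
    fn scale_rows(&self, s: Self::Value) -> Self;
    fn pack_rows(rows: &[usize], value_of: impl Fn(usize) -> f64) -> Self::Value;
}

#[derive(Clone, Copy, Debug, PartialEq)]
struct Dual { v: f64, d: f64 }

#[derive(Clone, Copy, Debug, PartialEq)]
struct Dual4 { v: [f64; 4], d: [f64; 4] }

impl ToyRows for Dual {
    type Value = f64;
    fn scale_rows(&self, s: f64) -> Self {
        Dual { v: self.v * s, d: self.d * s }
    }
    fn pack_rows(rows: &[usize], value_of: impl Fn(usize) -> f64) -> f64 {
        value_of(rows[0]) // scalar path: rows is expected to have length 1
    }
}

impl ToyRows for Dual4 {
    type Value = [f64; 4];
    fn scale_rows(&self, s: [f64; 4]) -> Self {
        let mut r = *self;
        for i in 0..4 {
            // Lane i gets its own factor s[i], not a broadcast one.
            r.v[i] = self.v[i] * s[i];
            r.d[i] = self.d[i] * s[i];
        }
        r
    }
    fn pack_rows(rows: &[usize], value_of: impl Fn(usize) -> f64) -> [f64; 4] {
        let mut out = [0.0; 4]; // assumption: lanes without a row stay zero
        for (lane, &r) in rows.iter().take(4).enumerate() {
            out[lane] = value_of(r);
        }
        out
    }
}

/// Hypothetical program that owns the per-row weights.
struct Program { weights: Vec<f64> }

impl Program {
    fn wi(&self, r: usize) -> f64 {
        self.weights[r]
    }

    /// Body written once: weight x by each row's own wi.
    fn weighted<R: ToyRows>(&self, x: R, rows: &[usize]) -> R {
        x.scale_rows(R::pack_rows(rows, |r| self.wi(r)))
    }
}
```

The body never names `f64` or `[f64; 4]`; it only passes `rows` through, and the representation chosen for `R` decides how the per-row values are packed.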

Provided Methods

fn neg(&self) -> Self

Negate every channel. Defaults to scale(-1.0); the blanket impl overrides it to delegate to crate::jet_scalar::JetScalar::neg.
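The default-plus-override arrangement is ordinary Rust provided-method behaviour, sketched below with hypothetical one-channel types (`ToyJet`, `UsesDefault`, `Overrides`) rather than the crate's jets.

```rust
/// Toy trait with one required method and one provided method (hypothetical).
trait ToyJet: Copy {
    fn scale(&self, s: f64) -> Self;

    /// Provided: falls back to scale(-1.0) unless an impl overrides it.
    fn neg(&self) -> Self {
        self.scale(-1.0)
    }
}

/// Keeps the provided neg.
#[derive(Clone, Copy, Debug, PartialEq)]
struct UsesDefault(f64);

impl ToyJet for UsesDefault {
    fn scale(&self, s: f64) -> Self {
        UsesDefault(self.0 * s)
    }
}

/// Overrides neg with its own negation, as an impl may choose to do.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Overrides(f64);

impl ToyJet for Overrides {
    fn scale(&self, s: f64) -> Self {
        Overrides(self.0 * s)
    }
    fn neg(&self) -> Self {
        Overrides(-self.0)
    }
}
```

For ordinary finite inputs both routes give the same bits in this toy, since multiplying by -1.0 and flipping the sign agree.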

fn exp(&self) -> Self

e^self.

fn ln(&self) -> Self

ln(self). Caller guarantees positivity.

fn sqrt(&self) -> Self

√self. Caller guarantees positivity.

fn recip(&self) -> Self

1/self.

fn powf(&self, a: f64) -> Self

self^a for real a. Caller guarantees a positive base.

fn ln_gamma(&self) -> Self

ln Γ(self). Caller guarantees a positive argument.

fn digamma(&self) -> Self

ψ(self) (digamma). Caller guarantees a positive argument.

Dyn Compatibility

This trait is not dyn compatible.

In older versions of Rust, dyn compatibility was called "object safety".

Implementors

impl<const K: usize, S: JetScalar<K>> RowJet<K> for S

Blanket: every scalar crate::jet_scalar::JetScalar<K> is a RowJet<K> with Value = f64. Each op delegates to the identical JetScalar method, so the existing scalar call sites compile UNCHANGED and evaluate bit-identically; the bridge adds the lane representation without churning the scalar path. (The concrete lane impls below cannot overlap this: Tower3Lane / Tower4Lane are local types that do not implement JetScalar, and the orphan rule forbids any downstream impl, so the coherence checker proves the impls disjoint.)
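The coherence argument can be reproduced in miniature. In the sketch below, `ToyScalar`, `ToyRow`, `Plain`, and `Lane4` are hypothetical names living in a single crate; the compiler accepts a blanket impl alongside a concrete impl because the local type `Lane4` does not implement the bound trait.

```rust
/// Stand-in for the scalar trait (hypothetical).
trait ToyScalar {
    fn value(&self) -> f64;
}

/// Stand-in for the bridge trait (hypothetical).
trait ToyRow {
    type Value;
    fn values(&self) -> Self::Value;
}

/// Blanket impl: every ToyScalar is a ToyRow with Value = f64.
impl<S: ToyScalar> ToyRow for S {
    type Value = f64;
    fn values(&self) -> f64 {
        self.value()
    }
}

/// A scalar type that reaches ToyRow only through the blanket impl.
struct Plain(f64);

impl ToyScalar for Plain {
    fn value(&self) -> f64 {
        self.0
    }
}

/// A local lane type with no ToyScalar impl, so its concrete ToyRow impl
/// cannot overlap the blanket one.
struct Lane4([f64; 4]);

impl ToyRow for Lane4 {
    type Value = [f64; 4];
    fn values(&self) -> [f64; 4] {
        self.0
    }
}
```

Adding `impl ToyScalar for Lane4` to this sketch would make the two `ToyRow` impls overlap and the crate would stop compiling, which is the situation the coherence checker rules out.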

impl<const K: usize> RowJet<K> for Tower3Lane<f64x4, K>

The f64x4 lane Tower3Lane is a RowJet<K> with Value = [f64; 4], the order-≤3 sibling of the Tower4Lane impl. A body that uses N == 5 stacks drops the (unused) fourth-derivative entry here, matching the scalar Tower3 which also carries only up to the third tensor.
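Dropping the fourth-derivative entry amounts to narrowing a width-5 stack to width 4. The function below is a guess at what such a resize could look like, assuming a stack is laid out as the value followed by successive derivatives; the real resize_stack signature and its widening behaviour are not shown on this page, so the zero-fill on widening is an assumption.

```rust
/// Hypothetical resize: copy the leading entries, zero-fill any extra ones.
/// Assumed layout: [f, f', f'', ...], so narrowing drops the highest orders.
fn resize_stack<const N: usize, const M: usize>(s: [f64; N]) -> [f64; M] {
    let mut out = [0.0; M];
    let n = if N < M { N } else { M };
    out[..n].copy_from_slice(&s[..n]);
    out
}
```

Under this assumption an order-≤3 tower would read only the first four entries of an N == 5 stack, while a same-width call returns the stack unchanged.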

type Value = [f64; 4]

impl<const K: usize> RowJet<K> for Tower4Lane<f64x4, K>

The f64x4 lane Tower4Lane is a RowJet<K> with Value = [f64; 4], routing each op through its existing per-lane method. Lane i of a batched evaluation is to_bits-identical to the scalar Tower4 evaluation on row i (the per-lane methods are term-for-term lifts of the scalar tower).

type Value = [f64; 4]