pub fn generic_batched_third_tower<const K: usize, P: RowNllProgramRowJet<K> + ?Sized>(
prog: &P,
rows: [usize; 4],
) -> Result<Tower3Batch<K>, String>
Evaluate a RowNllProgramRowJet on the f64x4-lane Tower3Batch,
computing (v, g, H, t3) for four rows in one SIMD pass. This is the
order-≤3 lane twin of generic_full_tower. Tower3Batch::lane(i) is
bit-identical (every component equal under f64::to_bits) to the order-≤3
scalar evaluation on rows[i].
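
The lane-versus-scalar guarantee can be illustrated with a small self-contained sketch. Nothing below is this crate's API: `Jet3`, `Jet3x4`, `scalar_jet`, `batched_jet`, and `bits_equal` are hypothetical stand-ins, the "program" is a toy one-parameter polynomial, and the four lanes are plain arrays rather than real SIMD registers. It only shows what "to_bits-identical" means in practice: each lane runs the same floating-point operations in the same order as the scalar path, so the results match bit for bit rather than merely within a tolerance.

```rust
/// Hypothetical order-<=3 result for one row: value plus first three derivatives.
#[derive(Debug, Clone, Copy)]
struct Jet3 {
    v: f64,
    g: f64,
    h: f64,
    t3: f64,
}

/// Hypothetical four-lane batch (plain arrays stand in for f64x4 lanes).
struct Jet3x4 {
    v: [f64; 4],
    g: [f64; 4],
    h: [f64; 4],
    t3: [f64; 4],
}

impl Jet3x4 {
    /// Extract lane `i` as a scalar result.
    fn lane(&self, i: usize) -> Jet3 {
        Jet3 { v: self.v[i], g: self.g[i], h: self.h[i], t3: self.t3[i] }
    }
}

/// Scalar reference: f(x) = x^3 + 2x^2 and its derivatives at x.
fn scalar_jet(x: f64) -> Jet3 {
    Jet3 {
        v: x * x * x + 2.0 * x * x,
        g: 3.0 * x * x + 4.0 * x,
        h: 6.0 * x + 4.0,
        t3: 6.0,
    }
}

/// Batched path: gather four rows, then apply the same operations,
/// in the same order, to every lane.
fn batched_jet(data: &[f64], rows: [usize; 4]) -> Result<Jet3x4, String> {
    let mut x = [0.0_f64; 4];
    for (lane, &r) in rows.iter().enumerate() {
        x[lane] = *data
            .get(r)
            .ok_or_else(|| format!("row {} out of range", r))?;
    }
    let mut out = Jet3x4 { v: [0.0; 4], g: [0.0; 4], h: [0.0; 4], t3: [0.0; 4] };
    for i in 0..4 {
        out.v[i] = x[i] * x[i] * x[i] + 2.0 * x[i] * x[i];
        out.g[i] = 3.0 * x[i] * x[i] + 4.0 * x[i];
        out.h[i] = 6.0 * x[i] + 4.0;
        out.t3[i] = 6.0;
    }
    Ok(out)
}

/// Bitwise comparison: stricter than `==` (it distinguishes 0.0 from -0.0
/// and treats identical NaN payloads as equal).
fn bits_equal(a: Jet3, b: Jet3) -> bool {
    a.v.to_bits() == b.v.to_bits()
        && a.g.to_bits() == b.g.to_bits()
        && a.h.to_bits() == b.h.to_bits()
        && a.t3.to_bits() == b.t3.to_bits()
}
```

A check in this style would call `batched_jet(&data, rows)` once and then compare `batch.lane(i)` against `scalar_jet(data[rows[i]])` with `bits_equal` for each `i` in `0..4`. Comparing via `to_bits` instead of an epsilon makes any reordering of floating-point operations in the batched path show up as a failure.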