pub fn conv2d_raw(
    input: &[f64],
    filters: &[f64],
    bias: &[f64],
    out: &mut [f64],
    n: usize,
    c_in: usize,
    h_in: usize,
    w_in: usize,
    c_out: usize,
    kh: usize,
    kw: usize,
    stride: usize,
)
2D convolution — NCHW layout, valid mode (no padding), configurable stride.
§Layout
input: [N, C_in, H_in, W_in] row-major contiguous
filters: [C_out, C_in, kH, kW] row-major contiguous
bias: [C_out]
out: [N, C_out, H_out, W_out] pre-allocated by caller
where H_out = (H_in - kH) / stride + 1 and W_out = (W_in - kW) / stride + 1.
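The shape formula and the row-major NCHW layout can be sketched as two small helpers. This is an illustrative sketch, not code from this crate: `out_extent` and `nchw_offset` are hypothetical names, and returning `None` for a zero stride or an oversized kernel is an assumption about how a caller might validate arguments before sizing `out`.

```rust
/// Output extent along one axis for a valid (unpadded) convolution.
/// Hypothetical helper: returns None when the stride is zero or the
/// kernel is larger than the input, where the formula has no meaning.
fn out_extent(input: usize, kernel: usize, stride: usize) -> Option<usize> {
    if stride == 0 || kernel > input {
        return None;
    }
    // usize division truncates, so trailing positions that cannot hold
    // a full kernel window are dropped.
    Some((input - kernel) / stride + 1)
}

/// Flat offset of element (n, c, h, w) in a row-major contiguous buffer
/// of shape [N, channels, height, width]. Hypothetical helper; it does
/// no bounds or overflow checking.
fn nchw_offset(
    n: usize,
    c: usize,
    h: usize,
    w: usize,
    channels: usize,
    height: usize,
    width: usize,
) -> usize {
    ((n * channels + c) * height + h) * width + w
}
```

With these, a caller could size the output buffer as `n * c_out * out_extent(h_in, kh, stride)? * out_extent(w_in, kw, stride)?` elements before calling `conv2d_raw`.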
§Numerical contract
Every kernel-to-patch dot product uses BinnedAccumulatorF64, guaranteeing
bit-identical results regardless of stride, batch size, or channel count.
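To see why a dedicated accumulator matters for this contract, consider what plain `f64` addition does. The sketch below is illustrative only and does not use or reproduce the `BinnedAccumulatorF64` API; `naive_sum` is a hypothetical helper that adds values left to right.

```rust
/// Sum left to right with plain f64 addition (no compensation, no binning).
/// Hypothetical helper used only to show order dependence.
fn naive_sum(values: &[f64]) -> f64 {
    values.iter().fold(0.0, |acc, &v| acc + v)
}
```

The same three values give different results in different orders: `naive_sum(&[1e16, 1.0, -1e16])` evaluates to `0.0` because `1.0` is absorbed when added to `1e16`, while `naive_sum(&[1e16, -1e16, 1.0])` evaluates to `1.0`. An accumulator whose result does not depend on the grouping of its terms avoids this kind of discrepancy.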
§NoGC guarantee
All index arithmetic uses u64 before narrowing to usize, preventing
overflow for high-resolution inputs (e.g., 8192×8192). The output buffer
is caller-allocated; this function performs zero heap allocations.
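One way to picture the widen-then-narrow index arithmetic is the sketch below. It is an assumption about the general technique, not the crate's actual code: `checked_nchw_offset` is a hypothetical name, and the use of checked operations plus `usize::try_from` is one possible way to surface overflow as `None` instead of wrapping.

```rust
use std::convert::TryFrom;

/// Flat NCHW offset computed in u64 and then narrowed to usize.
/// Hypothetical helper: `idx` is [n, c, h, w], `dims` is
/// [N, channels, height, width]. Returns None if the arithmetic
/// overflows u64 or the result does not fit in usize. It does not
/// check that `idx` lies within `dims`.
fn checked_nchw_offset(idx: [usize; 4], dims: [usize; 4]) -> Option<usize> {
    // Widen every operand first so intermediate products are not
    // limited by a 32-bit usize.
    let [n, c, h, w] = idx.map(|v| v as u64);
    let [_, channels, height, width] = dims.map(|v| v as u64);

    let offset = n
        .checked_mul(channels)?
        .checked_add(c)?
        .checked_mul(height)?
        .checked_add(h)?
        .checked_mul(width)?
        .checked_add(w)?;

    // Narrow only at the end; this fails cleanly on targets where the
    // offset exceeds usize::MAX.
    usize::try_from(offset).ok()
}
```

For a single 8192×8192 plane, the last element sits at offset `8192 * 8192 - 1 = 67108863`, which the helper returns for `checked_nchw_offset([0, 0, 8191, 8191], [1, 1, 8192, 8192])`. The function works entirely on stack values, so it allocates nothing on the heap.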