Real wgpu-backed GpuNdarray<T> that implements ArrayProtocol.
Enabled only with the array_protocol_wgpu feature, which implies wgpu_backend.
§Supported operations (GPU dispatch)
- add, subtract, multiply: elementwise binary, workgroup (256,1,1), uses arrayLength
- multiply_by_scalar_f32: elementwise scalar multiply, workgroup (256,1,1)
- matmul: naive (one thread per output element), workgroup (16,16,1)
- sum(axis=None): two-pass reduce, workgroup (256,1,1)
- transpose (2-D): 16×16 bank-conflict-padded tile, workgroup (16,16,1) (32×32 exceeds Metal's 256-invocation-per-workgroup limit)
- concatenate(axis=0): via copy_buffer_to_buffer, no shader
- concatenate(axis>0): WGSL gather kernel (CONCAT_AXISN_WGSL), storage-buf strides
- sum(axis=Some(ax)): WGSL per-output-element axis reduction (REDUCE_SUM_AXIS_WGSL)
- reshape: zero-copy (clone Arc<Buffer>, new shape/strides)
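The workgroup sizes listed above determine how many workgroups a dispatch has to launch. The following is a minimal sketch of that arithmetic, assuming plain ceiling division. The helper names and the choice to map columns to the x dimension are hypothetical and are not taken from this module.

```rust
/// Workgroup size of the 1-D elementwise kernels, per the list above.
const ELEMENTWISE_WORKGROUP: u32 = 256;
/// Tile edge of the 2-D kernels (matmul, transpose), per the list above.
const TILE: u32 = 16;

/// Workgroups needed for a 1-D dispatch over `len` elements (hypothetical helper).
fn elementwise_workgroups(len: u32) -> u32 {
    // Round up so a partially filled last workgroup is still launched.
    len.div_ceil(ELEMENTWISE_WORKGROUP)
}

/// (x, y) workgroup counts for a 2-D dispatch over a rows x cols output.
/// Assumption: columns map to x and rows map to y.
fn tiled_workgroups(rows: u32, cols: u32) -> (u32, u32) {
    (cols.div_ceil(TILE), rows.div_ceil(TILE))
}

/// Invocations per workgroup for a square tile with the given edge length.
fn invocations_per_workgroup(tile_edge: u32) -> u32 {
    tile_edge * tile_edge
}
```

Because the counts are rounded up, the last workgroup may cover indices past the end of the data, so a kernel dispatched this way needs a bounds check (for example against arrayLength). A 16×16 tile gives 256 invocations per workgroup, while a 32×32 tile would give 1024.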
§CPU-fallback operations
- svd: falls back to CPU NdarrayWrapper
- inverse: falls back to CPU NdarrayWrapper
- multiply_by_scalar_f64, divide_by_scalar_f64: convert to f32, then GPU
- GPU kernel errors on axis ops: fall back to CPU (graceful degradation)
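Converting to f32 before GPU dispatch loses precision. The sketch below illustrates that effect on the CPU, assuming it is the f64 scalar that gets narrowed. Both function names are hypothetical stand-ins and not part of this module.

```rust
/// Narrow an f64 scalar to f32, as an `as` cast does: round to the nearest
/// representable f32, with out-of-range magnitudes becoming infinity.
fn narrow_scalar(scalar: f64) -> f32 {
    scalar as f32
}

/// CPU stand-in for "convert to f32, then GPU" (hypothetical helper):
/// scales f32 data by the narrowed scalar.
fn multiply_by_scalar_f64_sketch(data: &[f32], scalar: f64) -> Vec<f32> {
    let s = narrow_scalar(scalar);
    data.iter().map(|x| x * s).collect()
}
```

Scalars that are exactly representable in f32, such as 0.5, pass through unchanged. A value like 0.1 is rounded, and a magnitude beyond the f32 range becomes infinity.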
§GPU threshold
Arrays with fewer than 4096 elements skip GPU dispatch entirely and fall back to CPU.
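A minimal sketch of this threshold check, assuming the element count is the product of the shape's dimensions. The constant name, the Backend enum, and choose_backend are hypothetical and only illustrate the rule stated above.

```rust
/// Minimum element count for GPU dispatch, per the text above (name is hypothetical).
const GPU_MIN_ELEMENTS: usize = 4096;

#[derive(Debug, PartialEq)]
enum Backend {
    Cpu,
    Gpu,
}

/// Total number of elements in an array with the given shape.
fn element_count(shape: &[usize]) -> usize {
    shape.iter().product()
}

/// Pick a backend: GPU only when an adapter exists and the array is large enough.
fn choose_backend(shape: &[usize], gpu_available: bool) -> Backend {
    if gpu_available && element_count(shape) >= GPU_MIN_ELEMENTS {
        Backend::Gpu
    } else {
        Backend::Cpu
    }
}
```

With this rule a 64×64 array (exactly 4096 elements) is dispatched to the GPU, while a 63×64 array (4032 elements) stays on the CPU.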
Structs§
- GpuNdarray
- An n-dimensional array backed by a real wgpu Buffer.
Traits§
- GpuScalar
- Marker trait for element types that wgpu-29 supports natively (f32 only; f64 is not supported in WGSL without extensions).
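A marker trait of this kind restricts generic code to element types the GPU path can handle. The sketch below shows the general pattern only. GpuScalarSketch, its bounds, and buffer_byte_len are hypothetical, and the real GpuScalar trait may carry different bounds.

```rust
use std::mem::size_of;

/// Hypothetical marker trait: implemented only for types usable in WGSL buffers.
trait GpuScalarSketch: Copy + 'static {}

// Only f32 is marked. f64 is deliberately left out.
impl GpuScalarSketch for f32 {}

/// Byte length of a buffer holding `len` elements of a GPU-capable type.
fn buffer_byte_len<T: GpuScalarSketch>(len: usize) -> usize {
    len * size_of::<T>()
}
```

Calling buffer_byte_len::<f64>(4) fails to compile under this sketch, which is the point of the marker: unsupported element types are rejected at compile time instead of at dispatch time.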
Functions§
- global_context
- Returns the shared WebGPUContext, or None if no adapter is available.
- is_gpu_available
- Returns true if a wgpu adapter was found when first called; cached afterwards.
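The "probe once, cache afterwards" behavior described for these two functions can be sketched with std::sync::OnceLock. This is an illustration under assumptions, not the module's implementation: FakeContext, probe_adapter, and the _sketch functions are hypothetical, and the probe here always reports that no adapter was found.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical stand-in for the real context type.
#[allow(dead_code)]
struct FakeContext;

/// Counts how often the (expensive) adapter probe actually runs.
static PROBE_CALLS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical probe: pretends no adapter is available.
fn probe_adapter() -> Option<FakeContext> {
    PROBE_CALLS.fetch_add(1, Ordering::SeqCst);
    None
}

/// Returns the shared context, probing for an adapter only on the first call.
fn global_context_sketch() -> Option<&'static FakeContext> {
    static CTX: OnceLock<Option<FakeContext>> = OnceLock::new();
    CTX.get_or_init(probe_adapter).as_ref()
}

/// True if the first probe found an adapter; later calls reuse the cached result.
fn is_gpu_available_sketch() -> bool {
    global_context_sketch().is_some()
}
```

Caching the negative result as well means a machine without an adapter pays the probing cost only once.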