GPU compute backend using wgpu (WebGPU)
Toyota Way Principles:
- Muda elimination: GPU only when compute > 5x transfer time
- Genchi Genbutsu: Empirical benchmarks show 50-100x speedups
Architecture:
- WGSL compute shaders for parallel reduction
- Workgroup size: 256 threads (a multiple of common warp/wavefront sizes, for full occupancy)
- Two-stage reduction: workgroup-local + global
References:
- HeavyDB (2017): GPU aggregation patterns
- Harris (2007): Optimizing parallel reduction in CUDA
- Leis et al. (2014): Morsel-driven parallelism
Modules§
- jit
- JIT WGSL Compiler for Kernel Fusion (CORE-003)
- kernels
- GPU compute kernels (WGSL shaders)
- multigpu
- Multi-GPU data partitioning and distribution
Structs§
- GpuEngine
- GPU compute engine for aggregations