Module gpu

GPU-accelerated batch inference for the MoE classifier via wgpu compute shaders.

Processes N feature vectors in a single GPU dispatch, reaching roughly 10-100x the CPU throughput on large batches. Falls back to the CPU path when no GPU is available or when the batch is smaller than the crossover threshold.
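The fallback rule can be sketched as a small decision function. This is only an illustration: `Backend`, `choose_backend`, and the `CROSSOVER_BATCH` value of 256 are hypothetical names and numbers chosen for the example, since the module's actual threshold and internal dispatch logic are not documented here.

```rust
/// Execution path for a batch (hypothetical type, for illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Gpu,
    Cpu,
}

/// Assumed crossover point; the real threshold is not stated in the docs.
pub const CROSSOVER_BATCH: usize = 256;

/// Pick the GPU only when one is available and the batch is large enough
/// to amortize dispatch and buffer-transfer overhead; otherwise use the CPU.
pub fn choose_backend(batch_len: usize, gpu_ok: bool) -> Backend {
    if gpu_ok && batch_len >= CROSSOVER_BATCH {
        Backend::Gpu
    } else {
        Backend::Cpu
    }
}
```

In a caller, `gpu_ok` would presumably come from something like `gpu_available()`, whose exact signature is not shown on this page.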

The architecture mirrors ml_scorer.rs exactly:

Gate: Linear(41→6) + softmax
6 experts: Linear(41→32)+ReLU → Linear(32→16)+ReLU → Linear(16→1)
Output: sigmoid(weighted sum of expert logits)
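As a reading aid, the layer stack above can be written as a plain CPU reference pass for a single feature vector. This is a minimal sketch under assumptions, not the code in ml_scorer.rs or the compute shader: the `Linear`, `Expert`, and `Moe` types, the row-major weight layout, and the all-zero placeholder weights are hypothetical.

```rust
pub const N_FEATURES: usize = 41;
pub const N_EXPERTS: usize = 6;

/// Dense layer; weights are row-major: weights[o * n_in + i] (assumed layout).
pub struct Linear {
    pub n_in: usize,
    pub n_out: usize,
    pub weights: Vec<f32>,
    pub bias: Vec<f32>,
}

impl Linear {
    /// All-zero layer, used here only as placeholder weights.
    pub fn zeros(n_in: usize, n_out: usize) -> Self {
        Linear { n_in, n_out, weights: vec![0.0; n_in * n_out], bias: vec![0.0; n_out] }
    }

    pub fn forward(&self, x: &[f32]) -> Vec<f32> {
        assert_eq!(x.len(), self.n_in);
        (0..self.n_out)
            .map(|o| {
                let row = &self.weights[o * self.n_in..(o + 1) * self.n_in];
                row.iter().zip(x).map(|(w, v)| w * v).sum::<f32>() + self.bias[o]
            })
            .collect()
    }
}

fn relu(v: Vec<f32>) -> Vec<f32> {
    v.into_iter().map(|x| x.max(0.0)).collect()
}

/// Numerically stable softmax (subtracts the max logit before exponentiating).
pub fn softmax(logits: &[f32]) -> Vec<f32> {
    let max = logits.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits.iter().map(|l| (l - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.into_iter().map(|e| e / sum).collect()
}

/// One expert: Linear(41→32)+ReLU → Linear(32→16)+ReLU → Linear(16→1).
pub struct Expert {
    pub l1: Linear,
    pub l2: Linear,
    pub l3: Linear,
}

impl Expert {
    pub fn zeros() -> Self {
        Expert {
            l1: Linear::zeros(N_FEATURES, 32),
            l2: Linear::zeros(32, 16),
            l3: Linear::zeros(16, 1),
        }
    }

    pub fn logit(&self, x: &[f32]) -> f32 {
        let h1 = relu(self.l1.forward(x));
        let h2 = relu(self.l2.forward(&h1));
        self.l3.forward(&h2)[0]
    }
}

/// Gate: Linear(41→6) + softmax, followed by six experts.
pub struct Moe {
    pub gate: Linear,
    pub experts: Vec<Expert>,
}

impl Moe {
    pub fn zeros() -> Self {
        Moe {
            gate: Linear::zeros(N_FEATURES, N_EXPERTS),
            experts: (0..N_EXPERTS).map(|_| Expert::zeros()).collect(),
        }
    }

    /// sigmoid(sum over experts of gate_weight * expert_logit).
    pub fn score(&self, x: &[f32]) -> f32 {
        let gate = softmax(&self.gate.forward(x));
        let mixed: f32 = gate.iter().zip(&self.experts).map(|(g, e)| g * e.logit(x)).sum();
        1.0 / (1.0 + (-mixed).exp())
    }
}
```

With the placeholder zero weights, every expert logit is 0 and the gate is uniform, so `score` returns sigmoid(0) = 0.5. A GPU batch version would compute the same function for N vectors in parallel, which makes a CPU pass like this a convenient check when comparing outputs.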

Functions

batch_ml_inference: Score multiple (credential, context) pairs in a single batch.
gpu_available: Returns true when GPU scoring support is available in this build/runtime.
gpu_probe: Probe GPU availability and adapter metadata without panicking.