Module kernels

Kernel implementations: scalar reference, AVX2 SIMD, and CUDA PTX.

Each submodule provides three variants of its kernel:

fn {name}_scalar(...) — Pure Rust scalar reference (ground truth)
unsafe fn {name}_avx2(...) — AVX2 SIMD implementation
fn {name}_ptx() -> &'static str — PTX assembly source string
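
As a rough illustration of the naming convention above, here is a minimal sketch of the three variants for a ReLU kernel. The names `relu_scalar`, `relu_avx2`, `relu_ptx`, the slice-based signatures, and the PTX entry name `relu_sketch` are assumptions made for this example; the crate's actual signatures may differ. The PTX text is illustrative only and has not been run through an assembler here.

```rust
/// Scalar reference: out[i] = max(x[i], 0.0). Treated as ground truth.
pub fn relu_scalar(input: &[f32], output: &mut [f32]) {
    assert_eq!(input.len(), output.len());
    for (o, &x) in output.iter_mut().zip(input) {
        *o = x.max(0.0);
    }
}

/// AVX2 variant: processes 8 lanes per iteration, then a scalar tail.
///
/// # Safety
/// The caller must ensure the CPU supports AVX2.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
pub unsafe fn relu_avx2(input: &[f32], output: &mut [f32]) {
    use std::arch::x86_64::*;
    assert_eq!(input.len(), output.len());
    let n = input.len();
    let mut i = 0;
    unsafe {
        let zero = _mm256_setzero_ps();
        while i + 8 <= n {
            // Unaligned load/store, so no alignment requirement on the slices.
            let v = _mm256_loadu_ps(input.as_ptr().add(i));
            _mm256_storeu_ps(output.as_mut_ptr().add(i), _mm256_max_ps(v, zero));
            i += 8;
        }
    }
    // Remaining (n % 8) elements fall back to the scalar path.
    relu_scalar(&input[i..], &mut output[i..]);
}

/// PTX variant: returns the kernel source as a string for a CUDA driver to load.
pub fn relu_ptx() -> &'static str {
    r#"
.version 7.0
.target sm_70
.address_size 64

.visible .entry relu_sketch(
    .param .u64 in_ptr,
    .param .u64 out_ptr,
    .param .u32 n
)
{
    .reg .pred %p<2>;
    .reg .f32 %f<3>;
    .reg .b32 %r<6>;
    .reg .b64 %rd<8>;

    ld.param.u64 %rd1, [in_ptr];
    ld.param.u64 %rd2, [out_ptr];
    ld.param.u32 %r1, [n];
    mov.u32 %r2, %ctaid.x;
    mov.u32 %r3, %ntid.x;
    mov.u32 %r4, %tid.x;
    mad.lo.s32 %r5, %r2, %r3, %r4;
    setp.ge.u32 %p1, %r5, %r1;
    @%p1 bra DONE;
    cvta.to.global.u64 %rd3, %rd1;
    cvta.to.global.u64 %rd4, %rd2;
    mul.wide.u32 %rd5, %r5, 4;
    add.s64 %rd6, %rd3, %rd5;
    add.s64 %rd7, %rd4, %rd5;
    ld.global.f32 %f1, [%rd6];
    max.f32 %f2, %f1, 0f00000000;
    st.global.f32 [%rd7], %f2;
DONE:
    ret;
}
"#
}
```

The AVX2 variant is `unsafe` because it must only be called after a runtime check such as `is_x86_feature_detected!("avx2")`; the scalar variant needs no such check and can serve as the baseline the other two are compared against.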

Modules§

absolute_position: Absolute position embeddings kernel.
activation: Activation kernels: ReLU, GELU, SiLU.
adamw: AdamW optimizer kernel.
alibi: ALiBi (Attention with Linear Biases) kernel.
attention: Scaled dot-product attention kernel.
batchnorm: Batch normalization kernel.
bias_add: Bias addition kernel.
cma_es: CMA-ES sampling kernel.
conv1d: 1D Convolution kernel.
cross_entropy: Cross-entropy loss kernel with log-softmax.
dropout: Dropout kernel.
embedding: Embedding lookup kernel.
f16_convert: F16 (half-precision) conversion kernel.
flash_attention: Flash Attention: IO-aware tiled attention.
gated_delta_net: Gated Delta Net recurrence kernel.
gelu: GELU kernel (standalone module).
gqa: Grouped Query Attention kernel.
kmeans: K-means clustering kernel.
layernorm: Layer normalization kernel.
lbfgs: L-BFGS two-loop recursion kernel.
linear: Linear projection kernel.
matmul: Matrix multiplication kernel.
ops: Shared kernel primitives: dot product, softmax row, score matrix.
pagerank: PageRank iteration kernel.
rmsnorm: RMSNorm kernel: root mean square layer normalization.
rope: Rotary Position Embedding (RoPE) kernel.
sampling: Sampling algorithms kernel.
silu_standalone: Standalone SiLU kernel with explicit sigmoid.
softmax: Softmax kernel: numerically stable exponential normalization.
ssm: State-Space Model (SSM) scan kernel.
swiglu: SwiGLU gated MLP kernel.
tied_embeddings: Tied embeddings kernel (language model head).
transpose: Matrix transpose kernel: out-of-place B = A^T with AVX2 8×8 micro-kernel.
ulp: ULP (Unit in the Last Place) distance utilities for floating-point comparison.
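
To make the idea behind ULP-based comparison concrete, the following is a minimal sketch of a ULP distance for `f32`. The function names `ordered_key` and `ulp_distance_f32` are hypothetical and are not taken from the `ulp` module; the sketch only shows the common bit-reinterpretation technique, under the assumption that NaN inputs are reported as "no distance".

```rust
/// Map an f32 to an integer key whose ordering matches the float ordering.
/// Both +0.0 and -0.0 map to 0 (hypothetical helper for this sketch).
fn ordered_key(x: f32) -> i64 {
    let bits = x.to_bits() as i32;
    if bits < 0 {
        // Negative floats: larger magnitude means larger raw bits,
        // so mirror them below zero.
        i32::MIN as i64 - bits as i64
    } else {
        bits as i64
    }
}

/// Number of representable f32 values between `a` and `b`.
/// Returns `None` if either input is NaN.
pub fn ulp_distance_f32(a: f32, b: f32) -> Option<u64> {
    if a.is_nan() || b.is_nan() {
        return None;
    }
    Some((ordered_key(a) - ordered_key(b)).unsigned_abs())
}
```

A comparison such as `ulp_distance_f32(simd_result, scalar_result) <= Some(4)` would then express "the SIMD output is within 4 ULPs of the scalar reference"; the tolerance of 4 is an arbitrary value chosen for illustration.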

Enums§

Backend: Backend selector for kernel dispatch.
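
The page does not list the variants of `Backend`, so the following is only a sketch of how a backend selector might drive dispatch over the three kernel variants. The enum `BackendSketch`, its variants, and the functions `pick_cpu_backend` and `bias_add` are all hypothetical names introduced for this example.

```rust
/// Hypothetical stand-in for a backend selector; variants are assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BackendSketch {
    Scalar,
    Avx2,
    Ptx,
}

/// Pick the fastest CPU backend available at runtime.
pub fn pick_cpu_backend() -> BackendSketch {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            return BackendSketch::Avx2;
        }
    }
    BackendSketch::Scalar
}

fn bias_add_scalar(x: &mut [f32], bias: f32) {
    for v in x.iter_mut() {
        *v += bias;
    }
}

#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn bias_add_avx2(x: &mut [f32], bias: f32) {
    use std::arch::x86_64::*;
    let n = x.len();
    let mut i = 0;
    unsafe {
        let b = _mm256_set1_ps(bias);
        let p = x.as_mut_ptr();
        while i + 8 <= n {
            let v = _mm256_loadu_ps(p.add(i));
            _mm256_storeu_ps(p.add(i), _mm256_add_ps(v, b));
            i += 8;
        }
    }
    // Scalar tail for the last (n % 8) elements.
    bias_add_scalar(&mut x[i..], bias);
}

/// Dispatch an in-place bias addition to the requested backend.
pub fn bias_add(backend: BackendSketch, x: &mut [f32], bias: f32) -> Result<(), &'static str> {
    match backend {
        BackendSketch::Scalar => {
            bias_add_scalar(x, bias);
            Ok(())
        }
        BackendSketch::Avx2 => {
            #[cfg(target_arch = "x86_64")]
            {
                if is_x86_feature_detected!("avx2") {
                    // SAFETY: AVX2 support was verified on the line above.
                    unsafe { bias_add_avx2(x, bias) };
                    return Ok(());
                }
            }
            Err("AVX2 is not available on this CPU")
        }
        // Launching PTX requires a CUDA driver, which this sketch does not model.
        BackendSketch::Ptx => Err("PTX launch is not handled in this sketch"),
    }
}
```

Keeping the feature check inside the dispatcher means callers never need to write `unsafe` themselves; whether the real `Backend` enum works this way is not stated on this page.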
