Crate rlx_wgpu

RLX wgpu backend — cross-platform GPU execution via the wgpu Rust crate (Metal on macOS, Vulkan on Linux, DX12 on Windows, WebGPU in browsers).

Compared to rlx-metal, this crate has the same overall shape (device singleton, buffer arena, per-op compute pipelines, one command buffer per forward pass) but uses WGSL kernels and the wgpu Rust API instead of MSL and the metal crate. Its dependencies are pure Rust, so there is no FFI or submodules to manage.

Layout:

  • device — wgpu instance/adapter/device singleton (sync wrapper)
  • buffer — typed GPU buffer + arena
  • kernels — WGSL source strings + per-kernel pipeline cache
  • backend — Backend trait impl + per-op dispatch
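
The "per-kernel pipeline cache" named in the layout above can be pictured as a map from kernel name to a pipeline that is built once and reused afterwards. The following is a minimal sketch of that idea only, not the crate's implementation: `Pipeline`, `PipelineCache`, `get_or_compile`, and the `ADD_ONE_WGSL` kernel are hypothetical names, and the "compile" step is a stand-in that does not call wgpu.

```rust
use std::collections::HashMap;

/// Illustrative WGSL source (hypothetical kernel, not taken from the crate).
const ADD_ONE_WGSL: &str = r#"
@group(0) @binding(0) var<storage, read_write> data: array<f32>;

@compute @workgroup_size(64)
fn main(@builtin(global_invocation_id) id: vec3<u32>) {
    data[id.x] = data[id.x] + 1.0;
}
"#;

/// Hypothetical stand-in for a compiled compute pipeline.
#[derive(Debug)]
struct Pipeline {
    kernel: &'static str,
    wgsl_len: usize,
}

/// Hypothetical cache: at most one pipeline per kernel name, built on first use.
struct PipelineCache {
    pipelines: HashMap<&'static str, Pipeline>,
    /// Counts how many times the (stand-in) compile step actually ran.
    compiles: usize,
}

impl PipelineCache {
    fn new() -> Self {
        Self { pipelines: HashMap::new(), compiles: 0 }
    }

    /// Returns the cached pipeline for `kernel`, building it only if absent.
    fn get_or_compile(&mut self, kernel: &'static str, wgsl: &str) -> &Pipeline {
        let compiles = &mut self.compiles;
        self.pipelines.entry(kernel).or_insert_with(|| {
            // A real backend would create a shader module and pipeline here.
            *compiles += 1;
            Pipeline { kernel, wgsl_len: wgsl.len() }
        })
    }
}
```

Requesting the same kernel name twice runs the build step once; a second kernel name triggers a second build. Keying by name assumes each kernel's WGSL source is fixed for the lifetime of the cache.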

Re-exports§

pub use device::is_vulkan_available;
pub use device::select_vulkan_backend;

Modules§

backend
WgpuExecutable — compiles an rlx-ir Graph into a sequence of kernel dispatches against a pre-allocated arena buffer.
buffer
Buffer arena for the wgpu backend. Mirrors the rlx-metal arena shape: pre-plan one big storage buffer at compile time, sub-allocate per-node offsets at known positions, treat I/O as write_buffer / read_buffer against those offsets.
calibrate
On-disk wgpu calibration for cost-model ranking.
coop_f16_vk
device
wgpu device discovery + capabilities.
fft_dispatch
fft_host
gdn_host
Host-side Op::GatedDeltaNet for wgpu arenas (readback → CPU → writeback).
gguf_host
Host-side GGUF K-quant Op::DequantMatMul for wgpu arenas.
im2col_host
kernels
WGSL kernel sources + per-kernel pipeline cache.
llada2_gate_host
Host-side Op::Custom("llada2.group_limited_gate") for wgpu arenas.
log_mel_host
training_bwd_host
Host-side training backward ops for wgpu arenas (readback → CPU → writeback).
umap_knn_host
Host-side Op::Custom("umap.knn") for wgpu arenas (small n only).
unfuse
IR-level “unfusion” pass for the wgpu backend.
welch_peaks_dispatch
welch_peaks_host
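
The buffer module's description (one big storage buffer planned at compile time, per-node offsets at known positions) and the readback, CPU, writeback pattern of the `*_host` modules can be sketched on the CPU as follows. This is an illustration under stated assumptions, not the crate's code: `Slot`, `plan_arena`, `HostArena`, and `host_map_f32` are hypothetical names, a `Vec<u8>` stands in for the GPU buffer, and the 256-byte alignment is an assumed value.

```rust
/// Assumed sub-allocation alignment in bytes (hypothetical; a real backend
/// would take this from the device limits).
const ALIGN: u64 = 256;

/// Byte range reserved for one graph node inside the arena.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Slot {
    offset: u64,
    size: u64,
}

fn align_up(n: u64, align: u64) -> u64 {
    (n + align - 1) / align * align
}

/// Assigns each node an aligned offset; returns the slots and the total size.
fn plan_arena(node_sizes: &[u64]) -> (Vec<Slot>, u64) {
    let mut cursor = 0;
    let mut slots = Vec::with_capacity(node_sizes.len());
    for &size in node_sizes {
        slots.push(Slot { offset: cursor, size });
        cursor = align_up(cursor + size, ALIGN);
    }
    (slots, cursor)
}

/// CPU stand-in for the single pre-allocated storage buffer.
struct HostArena {
    bytes: Vec<u8>,
}

impl HostArena {
    fn new(total: u64) -> Self {
        Self { bytes: vec![0; total as usize] }
    }

    /// Writes `data` at the slot's offset (the "write_buffer" side of I/O).
    fn write(&mut self, slot: Slot, data: &[u8]) {
        assert!(data.len() as u64 <= slot.size, "data exceeds slot size");
        let start = slot.offset as usize;
        self.bytes[start..start + data.len()].copy_from_slice(data);
    }

    /// Reads the slot's bytes back (the "read_buffer" side of I/O).
    fn read(&self, slot: Slot) -> &[u8] {
        let start = slot.offset as usize;
        &self.bytes[start..start + slot.size as usize]
    }
}

/// Host-side op pattern: read back `src`, compute on the CPU, write to `dst`.
fn host_map_f32(arena: &mut HostArena, src: Slot, dst: Slot, f: impl Fn(f32) -> f32) {
    let out: Vec<u8> = arena
        .read(src)
        .chunks_exact(4)
        .map(|c| f(f32::from_le_bytes([c[0], c[1], c[2], c[3]])))
        .flat_map(|v| v.to_le_bytes())
        .collect();
    arena.write(dst, &out);
}
```

With node sizes of 16, 300, and 4 bytes, this planner places the slots at offsets 0, 256, and 768 and reports a total of 1024 bytes. Because every offset is fixed at planning time, dispatches and host-side ops can address node data by offset without allocating at run time.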

Functions§

is_available
Returns true if a wgpu adapter is reachable on this system. The function itself is always available at the crate level; the runtime registry registers the backend only when it returns true, so tests on CI machines without a GPU do not fail.
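
The gating described for is_available can be sketched as a registry that runs a probe before registering a backend. This is a hypothetical illustration: `Registry` and `register_if` are invented names, and the probe closures stand in for a call to is_available, whose exact signature is assumed here to be a plain function returning `bool`.

```rust
/// Hypothetical runtime registry holding the names of usable backends.
struct Registry {
    backends: Vec<&'static str>,
}

impl Registry {
    fn new() -> Self {
        Self { backends: Vec::new() }
    }

    /// Registers `name` only if `probe` reports that the backend is usable.
    /// Returns whether the backend was registered.
    fn register_if(&mut self, name: &'static str, probe: impl FnOnce() -> bool) -> bool {
        let available = probe();
        if available {
            self.backends.push(name);
        }
        available
    }
}
```

On a machine without a GPU the probe returns false, the backend is never registered, and any test that iterates over registered backends skips it.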