oxiui-compute-wgpu — Pure-Rust wgpu GPU-compute abstraction for OxiUI
oxiui-compute-wgpu is the shared GPU-compute layer for the COOLJAPAN
ecosystem. It consolidates the repeated Instance → Adapter → Device → Queue
initialisation boilerplate that headless compute workloads (sparse solvers,
Lattice-Boltzmann, Monte-Carlo) each duplicated, and adds typed buffers, buffer
pooling and sub-allocation, pipeline caching, dispatch helpers, a WGSL
preprocessor and validator, and a set of validated built-in compute kernels.
#![forbid(unsafe_code)] is enforced crate-wide; pollster blocks on wgpu's
async adapter/device requests so the public API stays synchronous; bytemuck
handles zero-copy Pod casting between CPU and GPU.
Targets wgpu 29. See the wgpu 29 Notes for the renamed APIs.
Installation
[dependencies]
oxiui-compute-wgpu = "0.1"
wgpu, bytemuck, and pollster are re-exported, so a single dependency
declaration is enough.
Quick Start
Configure a context with the builder, double every value in a buffer on the GPU
with a DispatchBuilder, and read the result back:
use ;
// 1. Configure and build a compute context (Err on hosts with no GPU adapter).
let Ok = builder
.with_power_preference
.build
else ;
// 2. Upload input data to a storage buffer.
let input: = vec!;
let buffer = storage_buffer_init;
// 3. Compile a WGSL compute shader (auto-layout from reflection).
const SHADER: &str = r#"
@group(0) @binding(0) var<storage, read_write> data: array<f32>;
@compute @workgroup_size(64)
fn main(@builtin(global_invocation_id) gid: vec3<u32>) {
if gid.x < arrayLength(&data) {
data[gid.x] = data[gid.x] * 2.0;
}
}
"#;
let pipeline = compute_pipeline;
// 4. Bind, dispatch (ceil-div grid), and submit in one call.
let bind_group = ctx.device.create_bind_group;
new
.bind
.dispatch_1d
.label
.submit;
// 5. Read the results back to the CPU.
let output: = read_back;
assert_eq!;
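The ceil-div grid mentioned in step 4 can be pictured with a small CPU-side sketch. The helper names `workgroups_1d` and `surplus_invocations` are hypothetical and are not part of this crate; they only illustrate how many workgroups of size 64 (the `@workgroup_size` in the shader above) are needed to cover a given element count, and why the shader keeps its `gid.x < arrayLength(&data)` guard.

```rust
/// Hypothetical helper (not part of oxiui-compute-wgpu): the smallest number
/// of workgroups such that `groups * workgroup_size >= element_count`.
fn workgroups_1d(element_count: u32, workgroup_size: u32) -> u32 {
    assert!(workgroup_size > 0, "workgroup size must be non-zero");
    element_count.div_ceil(workgroup_size)
}

/// Invocations launched past the end of the buffer. These are the ones the
/// shader's bounds check has to reject.
fn surplus_invocations(element_count: u32, workgroup_size: u32) -> u64 {
    let groups = u64::from(workgroups_1d(element_count, workgroup_size));
    // Widen to u64 so the product cannot overflow for large element counts.
    groups * u64::from(workgroup_size) - u64::from(element_count)
}
```

For 1000 elements and a workgroup size of 64, this gives 16 workgroups and 24 surplus invocations, which is why an unguarded write in the shader would run past the end of the buffer.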
API Overview
| Module | Key exports |
|---|---|
| context | ComputeContext, ContextBuilder |
| buffer | storage_buffer_init, uniform_buffer, staging_buffer, read_back, read_back_range, read_back_async, TypedBuffer<T>, BufferPool, SubAllocator, mapped_storage_init |
| pipeline | compute_pipeline, checked_compute_pipeline, PipelineCache, DispatchBuilder, dispatch_1d, dispatch_2d, dispatch_3d, validate_immediates, encode_indirect_dispatch |
| wgsl | preprocess, validate, SHADER_PREFIX_SUM, SHADER_REDUCTION_SUM, SHADER_HISTOGRAM, SHADER_MATMUL |
| error | ComputeError (NoAdapter, DeviceRequest, OutOfMemory, ShaderCompilation, Operation) |
Built-in Kernels
Four validated WGSL kernels ship as pub const source strings in wgsl. Each
uses the entry point main_cs and is compiled with compute_pipeline (or
checked_compute_pipeline).
| Constant | Entry point | Constraints |
|---|---|---|
| SHADER_PREFIX_SUM | main_cs | Inclusive scan, f32. Single workgroup; input length ≤ 256. Dispatch one workgroup of size 256. |
| SHADER_REDUCTION_SUM | main_cs | Sum reduction, f32 → f32. Single workgroup; input length ≤ 256. Dispatch one workgroup of size 256. |
| SHADER_HISTOGRAM | main_cs | u32 values → bin counts. Up to 256 bins; each element binned as input[i] % num_bins. Workgroup size 64; workgroup-local atomic histogram. |
| SHADER_MATMUL | main_cs | Tiled f32 matmul, M×K · K×N → M×N (row-major). 16×16 shared-memory tiles; bind MatDims { M, K, N } uniform at @binding(3). Dispatch ceil(N/16) × ceil(M/16) × 1. |
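When checking kernel output after a read-back, CPU reference implementations of the four operations can be handy. The sketch below follows only the semantics stated in the table (inclusive scan, sum, `input[i] % num_bins` binning, row-major matmul, 16×16 tile grid); the function names are hypothetical and none of them are exported by this crate.

```rust
/// Inclusive scan: out[i] = input[0] + ... + input[i].
fn inclusive_scan(input: &[f32]) -> Vec<f32> {
    let mut running = 0.0f32;
    input
        .iter()
        .map(|&x| {
            running += x;
            running
        })
        .collect()
}

/// Sum reduction, f32 -> f32.
fn reduce_sum(input: &[f32]) -> f32 {
    input.iter().sum()
}

/// Histogram using the binning rule from the table: bin = value % num_bins.
fn histogram(input: &[u32], num_bins: u32) -> Vec<u32> {
    assert!(num_bins > 0 && num_bins <= 256, "kernel supports 1..=256 bins");
    let mut bins = vec![0u32; num_bins as usize];
    for &value in input {
        bins[(value % num_bins) as usize] += 1;
    }
    bins
}

/// Row-major matmul: (M x K) * (K x N) -> (M x N).
fn matmul(a: &[f32], b: &[f32], m: usize, k: usize, n: usize) -> Vec<f32> {
    assert_eq!(a.len(), m * k, "a must hold M*K elements");
    assert_eq!(b.len(), k * n, "b must hold K*N elements");
    let mut c = vec![0.0f32; m * n];
    for row in 0..m {
        for col in 0..n {
            let mut acc = 0.0f32;
            for i in 0..k {
                acc += a[row * k + i] * b[i * n + col];
            }
            c[row * n + col] = acc;
        }
    }
    c
}

/// Dispatch grid for 16x16 tiles: (ceil(N/16), ceil(M/16), 1).
fn matmul_grid(m: u32, n: u32) -> (u32, u32, u32) {
    (n.div_ceil(16), m.div_ceil(16), 1)
}
```

GPU reductions and tiled products generally add terms in a different order than these loops, so comparing f32 results with a small tolerance is usually safer than exact equality.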
wgpu 29 Notes
This crate targets wgpu 29. Relevant API changes from earlier versions:
- Push constants are now immediates: wgpu::Features::IMMEDIATES, ComputePass::set_immediates, and Limits::max_immediate_size. Alignment is wgpu::IMMEDIATE_DATA_ALIGNMENT (4); validate with validate_immediates.
- Instance::request_adapter returns Result (not Option).
- Adapter::request_device takes a single &DeviceDescriptor argument.
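To make the alignment rule for immediates concrete, here is a minimal sketch of the kind of check involved. It is not the crate's validate_immediates, whose signature is not shown here; `check_immediate_range` and its parameters are hypothetical, and the only value taken from the notes above is the alignment of 4. It assumes both the offset and the size must be aligned and that the range must fit within the device's max_immediate_size limit.

```rust
/// Alignment stated in the notes above (the value of
/// wgpu::IMMEDIATE_DATA_ALIGNMENT); duplicated here to stay self-contained.
const IMMEDIATE_ALIGNMENT: u32 = 4;

/// Hypothetical check, NOT the crate's `validate_immediates`.
/// Assumes offset and size must both be multiples of the alignment and that
/// `offset + size` must not exceed the device limit.
fn check_immediate_range(
    offset: u32,
    size: u32,
    max_immediate_size: u32,
) -> Result<(), String> {
    if offset % IMMEDIATE_ALIGNMENT != 0 {
        return Err(format!("offset {offset} is not a multiple of {IMMEDIATE_ALIGNMENT}"));
    }
    if size % IMMEDIATE_ALIGNMENT != 0 {
        return Err(format!("size {size} is not a multiple of {IMMEDIATE_ALIGNMENT}"));
    }
    let end = offset
        .checked_add(size)
        .ok_or_else(|| "offset + size overflows u32".to_string())?;
    if end > max_immediate_size {
        return Err(format!("range ends at {end}, limit is {max_immediate_size}"));
    }
    Ok(())
}
```

A struct of four f32 values (16 bytes) at offset 0 passes against a 128-byte limit, while a 6-byte payload or an offset of 2 is rejected before it reaches the GPU.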
Feature Flags
None. All functionality is on by default. GPU access is optional at runtime:
on hosts with no usable adapter (headless CI, VMs without GPU pass-through),
ComputeContext::try_new returns None, while ComputeContext::new and
ContextBuilder::build return Err(ComputeError::NoAdapter), so callers can skip
gracefully rather than fail.
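One way to structure the skip-on-no-GPU pattern is sketched below with stand-in types so that it compiles without a GPU or this crate. The `ComputeError` enum here is a local mock: only the variant names NoAdapter and Operation come from the API overview, and the String payload is an assumption. `Outcome` and `run_or_skip` are hypothetical names, not crate exports.

```rust
/// Local stand-in for the crate's error type. Variant names follow the API
/// overview; the payload on `Operation` is an assumption.
#[derive(Debug, PartialEq)]
enum ComputeError {
    NoAdapter,
    Operation(String),
}

/// Result of a GPU-dependent job that is allowed to be skipped.
#[derive(Debug, PartialEq)]
enum Outcome<T> {
    Ran(T),
    Skipped,
}

/// Builds a context with `make_ctx` and runs `job` on it. A missing adapter
/// is turned into `Outcome::Skipped`; every other error is propagated.
fn run_or_skip<C, T>(
    make_ctx: impl FnOnce() -> Result<C, ComputeError>,
    job: impl FnOnce(C) -> Result<T, ComputeError>,
) -> Result<Outcome<T>, ComputeError> {
    match make_ctx() {
        Ok(ctx) => job(ctx).map(Outcome::Ran),
        Err(ComputeError::NoAdapter) => Ok(Outcome::Skipped),
        Err(other) => Err(other),
    }
}
```

In a real test, `make_ctx` would be the closure that calls the context constructor, so a CI machine without a GPU reports a skip while genuine device or shader errors still fail the run.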
Related Crates
- oxiui: the OxiUI facade crate.
- oxiui-core: RenderBackend, DrawList, geometry types, shared require_gpu! macro.
- oxiui-render-soft: CPU render backend; will use compute shaders for blur/dithering.
- oxiui-render-wgpu: GPU render backend; shares the device/queue with this crate.
- oxiui-text: text pipeline; GPU glyph rasterization is a planned consumer.
- oxiui-theme: design tokens and ShadowSpec.
License
Apache-2.0 — COOLJAPAN OU (Team Kitasan)