pub struct WebGpuBackend { /* private fields */ }
Cross-platform GPU compute backend backed by wgpu.
§Lifecycle

- WebGpuBackend::new() — create an uninitialised backend.
- init() — select the best available adapter and create the device.
- Use alloc, copy_htod, the compute ops, copy_dtoh, and free.
- synchronize() — wait for all pending GPU work to finish.
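A minimal end-to-end sketch of that lifecycle. Everything below that this page does not show is an assumption: the crate path, the exact alloc/copy_htod/copy_dtoh signatures, and the UnaryOp::Sqrt variant; bytemuck handles the byte-level casts.

```rust
// Hypothetical crate path; the alloc/copy_htod/copy_dtoh signatures are
// assumptions: this page only names the methods.
use compute_backend::{BackendResult, ComputeBackend, UnaryOp, WebGpuBackend};

fn sqrt_on_gpu(host: &[f32]) -> BackendResult<Vec<f32>> {
    let mut gpu = WebGpuBackend::new();
    gpu.init()?; // select the best available adapter, create the device
    debug_assert!(gpu.is_initialized());

    let bytes = host.len() * std::mem::size_of::<f32>();
    let x = gpu.alloc(bytes)?; // assumed: alloc(size_in_bytes) -> BackendResult<u64>
    let y = gpu.alloc(bytes)?;
    gpu.copy_htod(x, bytemuck::cast_slice(host))?; // assumed: copy_htod(dst_ptr, &[u8])

    gpu.unary(UnaryOp::Sqrt, x, y, host.len())?; // UnaryOp::Sqrt is assumed

    let mut out = vec![0.0f32; host.len()];
    gpu.copy_dtoh(bytemuck::cast_slice_mut(&mut out), y)?; // assumed signature
    gpu.synchronize()?; // wait for all pending GPU work to finish

    gpu.free(x)?;
    gpu.free(y)?;
    Ok(out)
}
```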
Implementations§

impl WebGpuBackend
pub fn gemm_f16(
&self,
m: usize,
n: usize,
k: usize,
alpha: f64,
a_ptr: u64,
b_ptr: u64,
beta: f64,
c_ptr: u64,
) -> BackendResult<()>
FP16 GEMM: C = alpha * A * B + beta * C with half-precision storage.
This is an inherent method (not on ComputeBackend) because FP16
support is WebGPU-specific and requires the f16 WGSL extension.
Buffers pointed to by a_ptr, b_ptr, c_ptr must contain f16
elements (2 bytes each).
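A hedged sketch of a call, using the half crate (with its bytemuck feature enabled) for host-side f16 conversion; alloc and copy_htod are the same assumed helpers as in the lifecycle sketch above.

```rust
use half::f16; // external crate; enable its `bytemuck` feature for cast_slice

fn gemm_f16_demo(gpu: &WebGpuBackend) -> BackendResult<()> {
    let (m, n, k) = (128, 64, 256);
    let a: Vec<f16> = vec![f16::from_f32(1.0); m * k]; // A is m x k
    let b: Vec<f16> = vec![f16::from_f32(0.5); k * n]; // B is k x n

    let a_ptr = gpu.alloc(a.len() * 2)?; // f16 elements are 2 bytes each
    let b_ptr = gpu.alloc(b.len() * 2)?;
    let c_ptr = gpu.alloc(m * n * 2)?;
    gpu.copy_htod(a_ptr, bytemuck::cast_slice(&a))?;
    gpu.copy_htod(b_ptr, bytemuck::cast_slice(&b))?;

    // beta = 0.0: C is overwritten, so its initial contents don't matter.
    gpu.gemm_f16(m, n, k, 1.0, a_ptr, b_ptr, 0.0, c_ptr)?;
    gpu.synchronize()
}
```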
Trait Implementations§

impl ComputeBackend for WebGpuBackend
fn init(&mut self) -> BackendResult<()>

Initialize the backend (select device, create context).
fn is_initialized(&self) -> bool

Returns true if the backend is ready for operations.

fn gemm(
&self,
trans_a: BackendTranspose,
trans_b: BackendTranspose,
m: usize,
n: usize,
k: usize,
alpha: f64,
a_ptr: u64,
_lda: usize,
b_ptr: u64,
_ldb: usize,
beta: f64,
c_ptr: u64,
_ldc: usize,
) -> BackendResult<()>
General matrix multiply: C = alpha * op(A) * op(B) + beta * C.
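A sketch of a plain single-precision call. BackendTranspose::None is an assumed variant name (the actual enum is not shown on this page); the underscore-prefixed _lda/_ldb/_ldc parameters suggest this backend ignores leading dimensions and expects tightly packed matrices.

```rust
// C (m x n) = A (m x k) * B (k x n), alpha = 1, beta = 0, no transposition.
fn gemm_demo(gpu: &WebGpuBackend, a: u64, b: u64, c: u64) -> BackendResult<()> {
    let (m, n, k) = (256, 256, 512);
    gpu.gemm(
        BackendTranspose::None, // op(A) = A; variant name assumed
        BackendTranspose::None, // op(B) = B
        m, n, k,
        1.0,  // alpha
        a, k, // A and its leading dimension (likely ignored: note the _lda name)
        b, n,
        0.0,  // beta
        c, n,
    )
}
```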
fn batched_gemm(
&self,
trans_a: BackendTranspose,
trans_b: BackendTranspose,
m: usize,
n: usize,
k: usize,
alpha: f64,
a_ptr: u64,
_lda: usize,
stride_a: usize,
b_ptr: u64,
_ldb: usize,
stride_b: usize,
beta: f64,
c_ptr: u64,
_ldc: usize,
stride_c: usize,
batch_count: usize,
) -> BackendResult<()>
Strided batched GEMM: for each batch b in 0..batch_count, compute
C_b = alpha * op(A_b) * op(B_b) + beta * C_b, where A_b starts at
a_ptr + b * stride_a * 4 bytes (f32 elements), and similarly for B_b and C_b.
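A sketch of the stride arithmetic for densely packed batches: with matrices laid out back to back, the element strides are just the per-matrix sizes. Variant names are assumed as above.

```rust
// Eight independent C_b = A_b * B_b products over packed f32 matrices.
// Strides are in elements; the backend converts to bytes (x4 for f32).
fn batched_demo(gpu: &WebGpuBackend, a: u64, b: u64, c: u64) -> BackendResult<()> {
    let (m, n, k, batches) = (64, 64, 64, 8);
    gpu.batched_gemm(
        BackendTranspose::None, // variant name assumed
        BackendTranspose::None,
        m, n, k,
        1.0,
        a, k, m * k, // A base, ignored lda, stride of one m x k matrix
        b, n, k * n,
        0.0,
        c, n, m * n,
        batches,
    )
}
```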
&self,
input_ptr: u64,
input_shape: &[usize],
filter_ptr: u64,
filter_shape: &[usize],
output_ptr: u64,
output_shape: &[usize],
stride: &[usize],
padding: &[usize],
) -> BackendResult<()>
2D convolution forward pass.
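This page does not pin down the tensor layouts, so the sketch below assumes NCHW input and OIHW filters; treat the shapes as illustrative only.

```rust
// 3x3 convolution, stride 1, padding 1: spatial size is preserved,
// since (32 + 2*1 - 3) / 1 + 1 = 32.
fn conv_demo(gpu: &WebGpuBackend, x: u64, w: u64, y: u64) -> BackendResult<()> {
    gpu.conv2d_forward(
        x, &[1, 3, 32, 32],  // input: batch, channels, height, width (assumed NCHW)
        w, &[16, 3, 3, 3],   // filter: out_ch, in_ch, kh, kw (assumed OIHW)
        y, &[1, 16, 32, 32], // output shape implied by stride/padding below
        &[1, 1],             // stride (h, w)
        &[1, 1],             // padding (h, w)
    )
}
```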
fn attention(
&self,
q_ptr: u64,
k_ptr: u64,
v_ptr: u64,
o_ptr: u64,
batch: usize,
heads: usize,
seq_q: usize,
seq_kv: usize,
head_dim: usize,
scale: f64,
causal: bool,
) -> BackendResult<()>
Scaled dot-product attention.
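A sketch of causal self-attention over one sequence; passing 1/sqrt(head_dim) as the scale is the conventional choice, and the backend takes it explicitly rather than deriving it.

```rust
fn attention_demo(
    gpu: &WebGpuBackend,
    q: u64, k: u64, v: u64, o: u64,
) -> BackendResult<()> {
    let (batch, heads, seq, head_dim) = (1, 8, 128, 64);
    let scale = 1.0 / (head_dim as f64).sqrt(); // conventional 1/sqrt(d)
    // seq_q == seq_kv for self-attention; causal = true masks future positions.
    gpu.attention(q, k, v, o, batch, heads, seq, seq, head_dim, scale, true)
}
```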
fn reduce(
&self,
op: ReduceOp,
input_ptr: u64,
output_ptr: u64,
shape: &[usize],
axis: usize,
) -> BackendResult<()>
Reduction along an axis.
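A sketch of reducing a [rows, cols] matrix along its last axis; ReduceOp::Sum is an assumed variant name.

```rust
// Per-row sums of a 128 x 512 matrix: the output holds 128 elements.
fn row_sums(gpu: &WebGpuBackend, input: u64, output: u64) -> BackendResult<()> {
    gpu.reduce(ReduceOp::Sum, input, output, &[128, 512], 1) // axis 1 = columns
}
```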
fn unary(
&self,
op: UnaryOp,
input_ptr: u64,
output_ptr: u64,
n: usize,
) -> BackendResult<()>
Element-wise unary operation.
fn binary(
&self,
op: BinaryOp,
a_ptr: u64,
b_ptr: u64,
output_ptr: u64,
n: usize,
) -> BackendResult<()>
Element-wise binary operation.
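A sketch of an element-wise add; BinaryOp::Add is an assumed variant name, and all three buffers must hold at least n elements.

```rust
// z[i] = x[i] + y[i] for i in 0..n.
fn add_vecs(gpu: &WebGpuBackend, x: u64, y: u64, z: u64, n: usize) -> BackendResult<()> {
    gpu.binary(BinaryOp::Add, x, y, z, n)
}
```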
fn synchronize(&self) -> BackendResult<()>

Synchronize all pending operations on this backend.
fn free(&self, ptr: u64) -> BackendResult<()>

Free device memory previously allocated with alloc.

impl Debug for WebGpuBackend
Auto Trait Implementations§
impl Freeze for WebGpuBackend
impl !RefUnwindSafe for WebGpuBackend
impl Send for WebGpuBackend
impl Sync for WebGpuBackend
impl Unpin for WebGpuBackend
impl UnsafeUnpin for WebGpuBackend
impl !UnwindSafe for WebGpuBackend
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.