Struct GpuDevice

pub struct GpuDevice {
    pub adapter_name: String,
    pub backend: String,
    /* private fields */
}

GPU device handle. wgpu selects the backend (Vulkan, Metal, or DX12), so a single code path covers every supported vendor.

Fields

adapter_name: String
backend: String

Implementations

impl GpuDevice

pub fn gpu() -> Result<Self>

Discover the best GPU and initialize it. wgpu auto-selects the backend: Vulkan on Linux (AMD/NVIDIA/Intel), Metal on macOS, DX12 on Windows.

pub fn upload(&self, data: &[f32]) -> GpuBuffer

Upload an f32 slice to the GPU. Returns a storage buffer usable in compute shaders.

pub fn alloc(&self, n: usize) -> GpuBuffer

Allocate an empty GPU buffer for n f32 elements.

pub fn read(&self, buf: &GpuBuffer) -> Result<Vec<f32>>

Read a GPU buffer back to the CPU as a Vec<f32>.

pub fn relu_backward(
    &self,
    grad_out: &GpuBuffer,
    input: &GpuBuffer,
) -> Result<GpuBuffer>

ReLU backward: grad_a = grad_out * (input > 0)

pub fn sigmoid_backward(
    &self,
    grad_out: &GpuBuffer,
    output: &GpuBuffer,
) -> Result<GpuBuffer>

Sigmoid backward: grad_a = grad_out * output * (1 - output)

pub fn swish_backward(
    &self,
    grad_out: &GpuBuffer,
    input: &GpuBuffer,
) -> Result<GpuBuffer>

Swish backward: grad_a = grad_out * (sig(x) + x * sig(x) * (1 - sig(x)))

pub fn tanh_backward(
    &self,
    grad_out: &GpuBuffer,
    output: &GpuBuffer,
) -> Result<GpuBuffer>

Tanh backward: grad_a = grad_out * (1 - output^2)
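
The four backward formulas above can be checked against a plain CPU reference. The sketch below is illustrative only: the `*_ref` helper names are hypothetical and not part of `GpuDevice`, and it assumes `output` holds the forward activation value (sigmoid(x) or tanh(x)) while `input` holds the pre-activation x.

```rust
/// Logistic sigmoid, used by the swish reference below.
fn sigmoid(x: f32) -> f32 {
    1.0 / (1.0 + (-x).exp())
}

/// grad_out * (input > 0)
fn relu_backward_ref(grad_out: &[f32], input: &[f32]) -> Vec<f32> {
    grad_out
        .iter()
        .zip(input)
        .map(|(&g, &x)| if x > 0.0 { g } else { 0.0 })
        .collect()
}

/// grad_out * output * (1 - output), with output = sigmoid(x)
fn sigmoid_backward_ref(grad_out: &[f32], output: &[f32]) -> Vec<f32> {
    grad_out
        .iter()
        .zip(output)
        .map(|(&g, &y)| g * y * (1.0 - y))
        .collect()
}

/// grad_out * (sig(x) + x * sig(x) * (1 - sig(x)))
fn swish_backward_ref(grad_out: &[f32], input: &[f32]) -> Vec<f32> {
    grad_out
        .iter()
        .zip(input)
        .map(|(&g, &x)| {
            let s = sigmoid(x);
            g * (s + x * s * (1.0 - s))
        })
        .collect()
}

/// grad_out * (1 - output^2), with output = tanh(x)
fn tanh_backward_ref(grad_out: &[f32], output: &[f32]) -> Vec<f32> {
    grad_out
        .iter()
        .zip(output)
        .map(|(&g, &y)| g * (1.0 - y * y))
        .collect()
}
```

Comparing these results with the values read back from the GPU (within a small tolerance) is one way to validate a backend.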

impl GpuDevice

pub fn matmul(
    &self,
    a: &GpuBuffer,
    b: &GpuBuffer,
    m: u32,
    n: u32,
    k: u32,
) -> Result<GpuBuffer>

Matrix multiply: A(m,k) x B(k,n) = C(m,n). Row-major layout.
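
To make the row-major convention concrete, here is a minimal CPU sketch of the same product. `matmul_ref` is a hypothetical helper for illustration, not the GPU kernel; it assumes element (i, j) of a matrix with `cols` columns lives at index `i * cols + j`.

```rust
/// CPU reference for C(m,n) = A(m,k) x B(k,n), all row-major.
fn matmul_ref(a: &[f32], b: &[f32], m: usize, n: usize, k: usize) -> Vec<f32> {
    assert_eq!(a.len(), m * k, "A must hold m*k elements");
    assert_eq!(b.len(), k * n, "B must hold k*n elements");
    let mut c = vec![0.0f32; m * n];
    for i in 0..m {
        for p in 0..k {
            // A[i, p] is reused across the whole output row i.
            let a_ip = a[i * k + p];
            for j in 0..n {
                c[i * n + j] += a_ip * b[p * n + j];
            }
        }
    }
    c
}
```

Note the argument order in the signature above: `m`, `n`, `k`, where `k` is the shared inner dimension.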

pub fn batch_matmul(
    &self,
    a: &GpuBuffer,
    b: &GpuBuffer,
    batch: u32,
    m: u32,
    n: u32,
    k: u32,
) -> Result<GpuBuffer>

Batched matmul: A[batch,m,k] x B[batch,k,n] = C[batch,m,n].
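
The batched form can be read as `batch` independent products laid end to end. The sketch below assumes each batch entry is a contiguous row-major matrix (an assumption; the documentation does not state the strides), and `batch_matmul_ref` is a hypothetical CPU helper.

```rust
/// CPU reference for C[batch,m,n] = A[batch,m,k] x B[batch,k,n].
/// Assumes contiguous, row-major matrices stored one batch after another.
fn batch_matmul_ref(
    a: &[f32],
    b: &[f32],
    batch: usize,
    m: usize,
    n: usize,
    k: usize,
) -> Vec<f32> {
    assert_eq!(a.len(), batch * m * k);
    assert_eq!(b.len(), batch * k * n);
    let mut c = vec![0.0f32; batch * m * n];
    for bi in 0..batch {
        // Offsets of this batch entry's matrices in the flat buffers.
        let (a_off, b_off, c_off) = (bi * m * k, bi * k * n, bi * m * n);
        for i in 0..m {
            for j in 0..n {
                let mut acc = 0.0f32;
                for p in 0..k {
                    acc += a[a_off + i * k + p] * b[b_off + p * n + j];
                }
                c[c_off + i * n + j] = acc;
            }
        }
    }
    c
}
```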

pub fn conv2d(
    &self,
    input: &GpuBuffer,
    weight: &GpuBuffer,
    bias: Option<&GpuBuffer>,
    batch: u32,
    in_c: u32,
    in_h: u32,
    in_w: u32,
    out_c: u32,
    kh: u32,
    kw: u32,
    stride: (u32, u32),
    padding: (u32, u32),
    dilation: (u32, u32),
    groups: u32,
) -> Result<GpuBuffer>

Conv2d: input[N,C_in,H,W] * weight[C_out,C_in/groups,kH,kW] + bias[C_out]. NCHW layout. Returns output[N,C_out,out_H,out_W].
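
The documentation does not spell out how out_H and out_W are derived. The sketch below uses the conventional formula found in most deep learning frameworks, `floor((in + 2*pad - dil*(k-1) - 1) / stride) + 1`; that this crate follows the same convention is an assumption, and `conv_out_dim` is a hypothetical helper.

```rust
/// Output extent along one spatial axis under the conventional conv formula.
/// Returns None when the dilated kernel does not fit or the inputs overflow.
fn conv_out_dim(input: u32, kernel: u32, stride: u32, padding: u32, dilation: u32) -> Option<u32> {
    // Span covered by the dilated kernel: dilation * (kernel - 1) + 1.
    let effective = dilation.checked_mul(kernel.checked_sub(1)?)?.checked_add(1)?;
    let padded = input.checked_add(padding.checked_mul(2)?)?;
    if stride == 0 || padded < effective {
        return None;
    }
    Some((padded - effective) / stride + 1)
}
```

Applying it once with `(in_h, kh, stride.0, padding.0, dilation.0)` and once with the width counterparts gives the expected output shape, which is useful for sizing the `Vec<f32>` returned by `read`.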

pub fn conv_transpose2d(
    &self,
    input: &GpuBuffer,
    weight: &GpuBuffer,
    bias: Option<&GpuBuffer>,
    batch: u32,
    in_c: u32,
    in_h: u32,
    in_w: u32,
    out_c: u32,
    kh: u32,
    kw: u32,
    stride: (u32, u32),
    padding: (u32, u32),
    output_padding: (u32, u32),
    dilation: (u32, u32),
    groups: u32,
) -> Result<GpuBuffer>

Transposed conv2d (deconvolution): input[N,C_in,H,W] -> output[N,C_out,out_H,out_W]. Weight layout: [C_in, C_out/groups, kH, kW].
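
For the transposed case, the output extent is conventionally `(in - 1)*stride - 2*pad + dil*(k-1) + output_padding + 1`. Whether this crate uses exactly that formula is an assumption, and `conv_transpose_out_dim` below is a hypothetical helper for illustration.

```rust
use std::convert::TryFrom;

/// Output extent along one spatial axis under the conventional
/// transposed-convolution formula. Returns None for non-positive results.
fn conv_transpose_out_dim(
    input: u32,
    kernel: u32,
    stride: u32,
    padding: u32,
    output_padding: u32,
    dilation: u32,
) -> Option<u32> {
    if input == 0 || kernel == 0 {
        return None;
    }
    // Compute in i64 so that large paddings cannot underflow.
    let out = (input as i64 - 1) * stride as i64 - 2 * padding as i64
        + dilation as i64 * (kernel as i64 - 1)
        + output_padding as i64
        + 1;
    u32::try_from(out).ok().filter(|&o| o > 0)
}
```

With stride 2, kernel 4, and padding 1, this doubles the spatial size, a common upsampling configuration.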

impl GpuDevice

pub fn group_norm(
    &self,
    input: &GpuBuffer,
    gamma: &GpuBuffer,
    beta: &GpuBuffer,
    batch: u32,
    channels: u32,
    spatial: u32,
    groups: u32,
    eps: f32,
) -> Result<GpuBuffer>

Group normalization: input[N,C,*spatial], with the C channels split into `groups` groups of C/groups channels each. gamma[C] and beta[C] are learnable affine params.
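
A CPU reference makes the grouping explicit. This sketch assumes the standard group norm definition: consecutive channels form a group, statistics are taken over all channels and spatial positions in the group, and the variance is the biased (population) variance. `group_norm_ref` is a hypothetical helper, and `spatial` is taken to be the product of all spatial dims.

```rust
/// CPU reference for group normalization over input[N, C, spatial].
fn group_norm_ref(
    input: &[f32],
    gamma: &[f32],
    beta: &[f32],
    batch: usize,
    channels: usize,
    spatial: usize,
    groups: usize,
    eps: f32,
) -> Vec<f32> {
    assert!(groups > 0 && channels % groups == 0, "groups must divide channels");
    assert_eq!(input.len(), batch * channels * spatial);
    assert_eq!(gamma.len(), channels);
    assert_eq!(beta.len(), channels);
    let cpg = channels / groups; // channels per group
    let group_len = cpg * spatial;
    let mut out = vec![0.0f32; input.len()];
    for n in 0..batch {
        for g in 0..groups {
            let start = (n * channels + g * cpg) * spatial;
            let chunk = &input[start..start + group_len];
            let mean = chunk.iter().sum::<f32>() / group_len as f32;
            let var = chunk.iter().map(|&x| (x - mean) * (x - mean)).sum::<f32>()
                / group_len as f32;
            let inv_std = 1.0 / (var + eps).sqrt();
            for (i, &x) in chunk.iter().enumerate() {
                // Affine params are per channel, not per group.
                let c = g * cpg + i / spatial;
                out[start + i] = (x - mean) * inv_std * gamma[c] + beta[c];
            }
        }
    }
    out
}
```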

impl GpuDevice

pub fn concat(
    &self,
    a: &GpuBuffer,
    b: &GpuBuffer,
    outer_size: u32,
    a_inner: u32,
    b_inner: u32,
) -> Result<GpuBuffer>

Concatenate two buffers along one axis. outer_size = product of the dims before the concat axis. a_inner = a's size along the concat axis * product of the dims after it. b_inner = the same quantity for b.
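
Since the method takes flattened sizes rather than shapes, it can help to derive them from the tensor shapes. The sketch below is a CPU illustration under the assumption of contiguous row-major storage; `concat_params` and `concat_ref` are hypothetical helpers.

```rust
/// Derive (outer_size, a_inner, b_inner) for concatenating along `axis`.
/// Both shapes are expected to agree on every dim except `axis`.
fn concat_params(a_shape: &[u32], b_shape: &[u32], axis: usize) -> (u32, u32, u32) {
    let outer: u32 = a_shape[..axis].iter().product();
    // Size along the axis times everything after it.
    let a_inner: u32 = a_shape[axis..].iter().product();
    let b_inner: u32 = b_shape[axis..].iter().product();
    (outer, a_inner, b_inner)
}

/// CPU reference: for each outer index, emit a's chunk followed by b's chunk.
fn concat_ref(a: &[f32], b: &[f32], outer_size: usize, a_inner: usize, b_inner: usize) -> Vec<f32> {
    assert_eq!(a.len(), outer_size * a_inner);
    assert_eq!(b.len(), outer_size * b_inner);
    let mut out = Vec::with_capacity(a.len() + b.len());
    for o in 0..outer_size {
        out.extend_from_slice(&a[o * a_inner..(o + 1) * a_inner]);
        out.extend_from_slice(&b[o * b_inner..(o + 1) * b_inner]);
    }
    out
}
```

For example, joining a [2,3] tensor with a [2,2] tensor along axis 1 gives `outer_size = 2`, `a_inner = 3`, `b_inner = 2`.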

pub fn transpose(
    &self,
    a: &GpuBuffer,
    outer_size: u32,
    d0: u32,
    d1: u32,
    inner: u32,
) -> Result<GpuBuffer>

Transpose two dimensions of a tensor. Shape is […, d0, d1, …inner_dims]. outer_size = product of dims before d0. inner = product of dims after d1.
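
The index mapping can be sketched on the CPU as follows. It treats the tensor as [outer_size, d0, d1, inner] in contiguous row-major order and produces [outer_size, d1, d0, inner]; that the GPU kernel writes the output in this layout is an assumption, and `transpose_ref` is a hypothetical helper.

```rust
/// CPU reference: swap the d0 and d1 axes of a [outer, d0, d1, inner] tensor.
fn transpose_ref(a: &[f32], outer_size: usize, d0: usize, d1: usize, inner: usize) -> Vec<f32> {
    assert_eq!(a.len(), outer_size * d0 * d1 * inner);
    let mut out = vec![0.0f32; a.len()];
    for o in 0..outer_size {
        for i in 0..d0 {
            for j in 0..d1 {
                // Each (i, j) pair owns a contiguous run of `inner` elements.
                let src = ((o * d0 + i) * d1 + j) * inner;
                let dst = ((o * d1 + j) * d0 + i) * inner;
                out[dst..dst + inner].copy_from_slice(&a[src..src + inner]);
            }
        }
    }
    out
}
```

A plain 2D matrix transpose is the special case `outer_size = 1`, `inner = 1`.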
