pub struct Device { /* private fields */ }
A GPU compute device.
This is the main entry point for scry-gpu. A Device wraps a single
GPU and provides methods to upload data, dispatch shaders, and read
results back.
§Example
let gpu = Device::auto()?;
let input = gpu.upload(&[1.0f32, 2.0, 3.0, 4.0])?;
let output = gpu.alloc::<f32>(4)?;
gpu.dispatch(SHADER_SRC, &[&input, &output], 4)?;
let result: Vec<f32> = output.download()?;
§Implementations
impl Device
pub fn auto() -> Result<Self>
Auto-select the best available GPU.
Tries backends in order of preference: CUDA first, then Vulkan (Metal may be added in the future). CUDA is preferred when available because it enables cuBLAS matmul and native CUDA kernel dispatch.
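The preference order can be pictured as a first-match search over a fixed list. The sketch below is standalone and does not use scry-gpu's types: `Candidate`, `PREFERENCE`, and `pick_backend` are hypothetical names, and how the crate actually probes each backend is not described on this page.

```rust
/// Hypothetical stand-in for a backend identifier (not scry-gpu's `BackendKind`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Candidate {
    Cuda,
    Vulkan,
}

/// Preference order described above: CUDA first, then Vulkan.
const PREFERENCE: [Candidate; 2] = [Candidate::Cuda, Candidate::Vulkan];

/// Return the first preferred backend that `is_available` reports as usable.
fn pick_backend(is_available: impl Fn(Candidate) -> bool) -> Option<Candidate> {
    PREFERENCE.iter().copied().find(|&c| is_available(c))
}
```

A caller that needs one particular backend would use Device::with_backend rather than rely on this order.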
pub fn with_backend(kind: BackendKind) -> Result<Self>
Create a device with a specific backend.
pub fn upload<T: Pod>(&self, data: &[T]) -> Result<Buffer<T>>
Upload a slice to GPU memory, returning a typed buffer.
pub fn alloc<T: Pod>(&self, count: usize) -> Result<Buffer<T>>
Allocate an uninitialized GPU buffer for count elements of type T.
pub fn dispatch(
    &self,
    shader_src: &str,
    buffers: &[&dyn GpuBuf],
    invocations: u32,
) -> Result<()>
Dispatch a WGSL compute shader.
Buffers are bound in order to @binding(0), @binding(1), etc.
Workgroup dispatch dimensions are auto-calculated from invocations
and the shader’s @workgroup_size.
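A plausible reading of "auto-calculated" is a ceiling division of the invocation count by the workgroup size. The sketch below is standalone and assumes a 1D dispatch that rounds up; `workgroup_count` and `surplus_threads` are hypothetical helpers, not part of scry-gpu.

```rust
/// Number of workgroups needed so that at least `invocations` threads run,
/// given the shader's `@workgroup_size` along x. Rounds up.
fn workgroup_count(invocations: u32, workgroup_size: u32) -> u32 {
    assert!(workgroup_size > 0, "workgroup_size must be non-zero");
    invocations.div_ceil(workgroup_size)
}

/// Threads launched beyond `invocations` in the final workgroup.
/// Computed with a remainder so it cannot overflow `u32`.
fn surplus_threads(invocations: u32, workgroup_size: u32) -> u32 {
    assert!(workgroup_size > 0, "workgroup_size must be non-zero");
    (workgroup_size - invocations % workgroup_size) % workgroup_size
}
```

If rounding up is indeed what happens, 1000 invocations with `@workgroup_size(64)` launch 16 workgroups and 24 surplus threads, so the shader should bounds-check its index before touching a buffer.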
pub fn dispatch_configured(
    &self,
    config: &DispatchConfig<'_>,
    buffers: &[&dyn GpuBuf],
) -> Result<()>
Dispatch with full configuration.
pub fn compile(&self, shader_src: &str) -> Result<Kernel>
Compile a WGSL compute shader into a reusable Kernel.
The returned kernel holds all GPU objects (pipeline, layouts,
shader module) and can be dispatched many times via Device::run.
Uses "main" as the entry point. See Device::compile_named
for a custom entry point.
pub fn compile_named(
    &self,
    shader_src: &str,
    entry_point: &str,
) -> Result<Kernel>
Compile a WGSL shader with a specific entry point name.
pub fn run(
    &self,
    kernel: &Kernel,
    buffers: &[&dyn GpuBuf],
    invocations: u32,
) -> Result<()>
Dispatch a precompiled kernel.
Buffers are bound in order to @binding(0), @binding(1), etc.
Workgroup dispatch dimensions are auto-calculated from invocations
and the kernel’s compiled @workgroup_size.
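To make the binding order concrete, here is a sketch of a shader whose two bindings would line up with a `buffers` slice of `[&input, &output]`. The names `DOUBLE_SRC` and `binding_count` are hypothetical, and the use of `@group(0)` and a read-only storage buffer are assumptions; this page only specifies the `@binding` indices.

```rust
/// WGSL source with two storage buffers. `@group(0)` is an assumption here;
/// the documentation only specifies the `@binding` indices.
const DOUBLE_SRC: &str = r#"
@group(0) @binding(0) var<storage, read> input: array<f32>;
@group(0) @binding(1) var<storage, read_write> output: array<f32>;

@compute @workgroup_size(64)
fn main(@builtin(global_invocation_id) gid: vec3<u32>) {
    let i = gid.x;
    // Guard: the last workgroup may contain threads past the end of the data.
    if (i < arrayLength(&input)) {
        output[i] = input[i] * 2.0;
    }
}
"#;

/// Count `@binding(` attributes in a shader source.
/// This is a plain text scan, not a WGSL parse.
fn binding_count(shader_src: &str) -> usize {
    shader_src.matches("@binding(").count()
}
```

Comparing `binding_count(DOUBLE_SRC)` against `buffers.len()` before dispatching is a cheap way to catch a mismatched buffer list early.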
pub fn run_with_push_constants(
    &self,
    kernel: &Kernel,
    buffers: &[&dyn GpuBuf],
    invocations: u32,
    push_constants: &[u8],
) -> Result<()>
Dispatch a precompiled kernel with push constants.
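Because `push_constants` is a raw byte slice, the caller has to serialize its parameters. The sketch below shows one way to do that for a hypothetical `Params` struct; the field names and the little-endian, tightly packed layout are assumptions, and the real layout must match whatever the shader declares.

```rust
/// Hypothetical parameter block for a kernel: an element count and a scale factor.
struct Params {
    len: u32,
    scale: f32,
}

impl Params {
    /// Serialize the fields in declaration order as little-endian bytes.
    fn to_bytes(&self) -> [u8; 8] {
        let mut out = [0u8; 8];
        out[..4].copy_from_slice(&self.len.to_le_bytes());
        out[4..].copy_from_slice(&self.scale.to_le_bytes());
        out
    }
}
```

The resulting array can be borrowed as `&bytes` wherever a `&[u8]` is expected.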
pub fn run_configured(
    &self,
    kernel: &Kernel,
    buffers: &[&dyn GpuBuf],
    workgroups: [u32; 3],
    push_constants: Option<&[u8]>,
) -> Result<()>
Dispatch a precompiled kernel with explicit workgroup dimensions.
Use this for 2D/3D dispatches or when you need precise control over
workgroup counts. For simple 1D dispatches, prefer Device::run.
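For a 2D dispatch, the `workgroups` array is usually derived by rounding each axis up to a whole number of tiles. The sketch below is standalone; `workgroups_2d` is a hypothetical helper, and the tile sizes are assumed to match the shader's `@workgroup_size(x, y)`.

```rust
/// Workgroup counts for a `width` x `height` grid processed in
/// `tile_x` x `tile_y` tiles. Each axis is rounded up; z is fixed at 1.
fn workgroups_2d(width: u32, height: u32, tile_x: u32, tile_y: u32) -> [u32; 3] {
    assert!(tile_x > 0 && tile_y > 0, "tile sizes must be non-zero");
    [width.div_ceil(tile_x), height.div_ceil(tile_y), 1]
}
```

For a 1920 x 1080 image with 16 x 16 tiles this yields `[120, 68, 1]`; the last row of workgroups is only half used, so the shader needs a bounds check on both axes.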
pub fn copy_buffer<T: Pod>(&self, src: &Buffer<T>) -> Result<Buffer<T>>
Create a GPU-to-GPU copy of a buffer.
Allocates a new buffer on the same device and copies the contents
of src into it. The copy is synchronous (blocks until complete).
pub fn batch(&self) -> Result<Batch>
Begin a batched dispatch session.
Records multiple dispatches into a single command buffer, submitted
with one fence wait via Batch::submit.
pub fn subgroup_size(&self) -> u32
Subgroup (warp/wavefront) size.
Typically 32 on NVIDIA, 64 on AMD, 32 on Intel. Useful for sizing subgroup-aware shaders.
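One common use of the subgroup size is to pick a workgroup size that is a whole multiple of it. The sketch below is standalone; `round_up_to_subgroup` is a hypothetical helper, and its `subgroup_size` argument stands in for the value returned by this method.

```rust
/// Round `threads` up to the next multiple of `subgroup_size` so that no
/// subgroup in the workgroup is partially filled.
fn round_up_to_subgroup(threads: u32, subgroup_size: u32) -> u32 {
    assert!(subgroup_size > 0, "subgroup_size must be non-zero");
    threads.next_multiple_of(subgroup_size)
}
```

A request for 100 threads becomes 128 for either a 32-wide or a 64-wide subgroup.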
pub const fn backend_kind(&self) -> BackendKind
Which backend this device is using.