pub struct GpuCommandBatch { /* private fields */ }
Command batch for async GPU execution
Accumulates GPU operations and executes them together to minimize CPU↔GPU data transfers.
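The queue-then-execute pattern described above can be illustrated with a CPU-only mock. This is a sketch, not the real type: `MockBatch`, `MockBufferId`, `MockOp`, and `read()` are hypothetical names, the mock `execute()` is synchronous, and clearing the queue after execution is an assumption of the mock rather than documented behaviour.

```rust
/// Hypothetical stand-in for `BufferId`: an index into the mock's buffer table.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct MockBufferId(usize);

/// Operations recorded by the mock; nothing is computed until `execute()`.
enum MockOp {
    Relu { input: usize, output: usize },
    Scale { input: usize, scalar: f32, output: usize },
}

/// CPU-only illustration of "queue now, run later". Not the real GpuCommandBatch.
#[derive(Default)]
pub struct MockBatch {
    buffers: Vec<Vec<f32>>,
    ops: Vec<MockOp>,
}

impl MockBatch {
    /// Stores the data and hands back an ID for later operations.
    pub fn upload(&mut self, data: &[f32]) -> MockBufferId {
        self.buffers.push(data.to_vec());
        MockBufferId(self.buffers.len() - 1)
    }

    /// Reserves a zero-filled output buffer with the same length as `input`.
    fn alloc_like(&mut self, input: MockBufferId) -> usize {
        let len = self.buffers[input.0].len();
        self.buffers.push(vec![0.0; len]);
        self.buffers.len() - 1
    }

    /// Queues max(0, x); the output ID is valid immediately, the data is not.
    pub fn relu(&mut self, input: MockBufferId) -> MockBufferId {
        let output = self.alloc_like(input);
        self.ops.push(MockOp::Relu { input: input.0, output });
        MockBufferId(output)
    }

    /// Queues x * scalar.
    pub fn scale(&mut self, input: MockBufferId, scalar: f32) -> MockBufferId {
        let output = self.alloc_like(input);
        self.ops.push(MockOp::Scale { input: input.0, scalar, output });
        MockBufferId(output)
    }

    pub fn num_operations(&self) -> usize {
        self.ops.len()
    }

    /// Runs every queued operation in order, then clears the queue
    /// (clearing is a choice made by this mock).
    pub fn execute(&mut self) {
        for op in self.ops.drain(..) {
            match op {
                MockOp::Relu { input, output } => {
                    let out: Vec<f32> =
                        self.buffers[input].iter().map(|x| x.max(0.0)).collect();
                    self.buffers[output] = out;
                }
                MockOp::Scale { input, scalar, output } => {
                    let out: Vec<f32> =
                        self.buffers[input].iter().map(|x| x * scalar).collect();
                    self.buffers[output] = out;
                }
            }
        }
    }

    /// Hypothetical read-back helper for inspecting results in the mock.
    pub fn read(&self, id: MockBufferId) -> &[f32] {
        &self.buffers[id.0]
    }
}
```

In the mock, chaining `upload`, `relu`, and `scale` only records work; the output buffer stays zero-filled until `execute()` runs the whole chain in one pass.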
Implementations§
impl GpuCommandBatch
pub async fn execute(&mut self) -> Result<(), String>
Execute all queued operations on GPU
Uses a single command encoder for all operations (one GPU submission)
and caches pipelines per shader source to avoid redundant compilation.
The pipeline cache is local to this call — see execute_with_cache()
for persistent caching across multiple batch executions.
§Contract (C-BATCH-EXEC-001)
- Precondition: Operations queued via matmul(), relu(), etc.
- Postcondition: All operations executed, results in GPU buffers
- Invariant: Pipeline compiled at most once per unique shader source
- Invariant: Single queue.submit() per execute() call
pub async fn execute_with_cache(
    &mut self,
    cache: &mut PipelineCache,
) -> Result<(), String>
Execute with a persistent pipeline cache (KAIZEN-023).
Same as execute() but uses a caller-provided pipeline cache that
persists across multiple batch executions. Shaders compiled in a
previous batch are reused without recompilation.
For Qwen3-4B FFN (36 layers × 3 unique shaders per batch):
- execute(): 3 compilations per layer × 36 = 108 total
- execute_with_cache(): 3 compilations (layer 1) + 0 (layers 2-36) = 3 total
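The compilation counts above can be reproduced with a small simulation. This is a sketch under stated assumptions: `MockPipelineCache`, `get_or_compile`, and `count_compilations` are hypothetical names, the real `PipelineCache` internals are not shown in this documentation, and "compiling" here is just incrementing a counter.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a pipeline cache: maps shader source to a
/// fake pipeline handle and counts how many compilations actually happen.
#[derive(Default)]
pub struct MockPipelineCache {
    pipelines: HashMap<String, usize>,
    pub compilations: usize,
}

impl MockPipelineCache {
    /// Returns the cached handle, "compiling" only the first time a given
    /// shader source is seen.
    pub fn get_or_compile(&mut self, source: &str) -> usize {
        if let Some(&handle) = self.pipelines.get(source) {
            return handle;
        }
        self.compilations += 1;
        let handle = self.pipelines.len();
        self.pipelines.insert(source.to_string(), handle);
        handle
    }
}

/// Simulates `layers` batch executions that each use the same `shaders`.
/// With `persistent == false` every batch starts from an empty, call-local
/// cache; with `persistent == true` one cache is shared across all batches.
pub fn count_compilations(layers: usize, shaders: &[&str], persistent: bool) -> usize {
    let mut shared = MockPipelineCache::default();
    let mut total = 0;
    for _ in 0..layers {
        let mut local = MockPipelineCache::default();
        let cache = if persistent { &mut shared } else { &mut local };
        let before = cache.compilations;
        for src in shaders {
            cache.get_or_compile(src);
        }
        total += cache.compilations - before;
    }
    total
}
```

With 36 layers and three distinct placeholder shader sources, the call-local variant yields 108 compilations and the persistent variant yields 3, matching the arithmetic above.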
impl GpuCommandBatch
pub fn upload(&mut self, data: &[f32]) -> BufferId
Upload data to GPU (queued for batch execution)
Returns a buffer ID that can be used in subsequent operations.
pub fn relu(&mut self, input: BufferId) -> BufferId
Queue ReLU operation: max(0, x)
Returns buffer ID for the output.
pub fn scale(&mut self, input: BufferId, scalar: f32) -> BufferId
Queue scalar multiplication: x * scalar
Returns buffer ID for the output.
pub fn add(&mut self, a: BufferId, b: BufferId) -> BufferId
Queue element-wise addition: a + b
Returns buffer ID for the output.
§Panics
Panics if buffers have different sizes.
pub fn mul(&mut self, a: BufferId, b: BufferId) -> BufferId
Queue element-wise multiplication: a * b
Returns buffer ID for the output.
§Panics
Panics if buffers have different sizes.
pub fn dot(&mut self, a: BufferId, b: BufferId) -> BufferId
Queue dot product: sum(a[i] * b[i])
Returns buffer ID for a single-element output buffer.
§Panics
Panics if buffers have different sizes.
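For checking results, the element-wise and reduction semantics above can be written as CPU references. This is a sketch: `add_ref` and `dot_ref` are hypothetical helpers, not part of the crate, and the panic messages are illustrative rather than the library's actual wording.

```rust
/// CPU reference for element-wise addition: out[i] = a[i] + b[i].
/// Panics on a length mismatch, mirroring the documented panic condition.
pub fn add_ref(a: &[f32], b: &[f32]) -> Vec<f32> {
    assert_eq!(a.len(), b.len(), "buffers have different sizes");
    a.iter().zip(b).map(|(x, y)| x + y).collect()
}

/// CPU reference for the dot product: sum(a[i] * b[i]).
/// Returns a one-element vector to mirror the single-element output buffer.
pub fn dot_ref(a: &[f32], b: &[f32]) -> Vec<f32> {
    assert_eq!(a.len(), b.len(), "buffers have different sizes");
    vec![a.iter().zip(b).map(|(x, y)| x * y).sum::<f32>()]
}
```

Floating-point summation order on a GPU may differ from this sequential loop, so comparisons against GPU output should normally use a tolerance rather than exact equality.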
pub fn sigmoid(&mut self, input: BufferId) -> BufferId
Queue sigmoid activation: 1 / (1 + exp(-x))
Returns buffer ID for the output.
pub fn tanh(&mut self, input: BufferId) -> BufferId
Queue hyperbolic tangent: tanh(x)
Returns buffer ID for the output.
pub fn swish(&mut self, input: BufferId) -> BufferId
Queue Swish activation: x * sigmoid(x)
Returns buffer ID for the output.
pub fn gelu(&mut self, input: BufferId) -> BufferId
Queue GELU activation: x * Φ(x)
Returns buffer ID for the output.
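The activation formulas above can be written as scalar CPU references. This is a sketch with hypothetical helper names. The documentation gives GELU as x * Φ(x) without saying how Φ is evaluated in the shader; the version below uses the common tanh approximation, which is an assumption and may differ slightly from the GPU result.

```rust
/// Sigmoid: 1 / (1 + exp(-x)).
pub fn sigmoid_ref(x: f32) -> f32 {
    1.0 / (1.0 + (-x).exp())
}

/// Swish: x * sigmoid(x).
pub fn swish_ref(x: f32) -> f32 {
    x * sigmoid_ref(x)
}

/// GELU via the tanh approximation of x * Φ(x):
/// 0.5 * x * (1 + tanh(sqrt(2/π) * (x + 0.044715 * x^3))).
/// Assumption: the actual shader may evaluate Φ differently.
pub fn gelu_tanh_ref(x: f32) -> f32 {
    let c = (2.0 / std::f32::consts::PI).sqrt();
    0.5 * x * (1.0 + (c * (x + 0.044715 * x * x * x)).tanh())
}
```

All three functions return 0 or 0.5 at x = 0 as expected (`sigmoid_ref(0.0)` is 0.5, the other two are 0), which makes a quick sanity check before comparing against GPU output.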
pub fn sub(&mut self, a: BufferId, b: BufferId) -> BufferId
Queue element-wise subtraction: a - b
Returns buffer ID for the output.
§Panics
Panics if buffers have different sizes.
pub fn matmul(
    &mut self,
    a: BufferId,
    b: BufferId,
    m: u32,
    k: u32,
    n: u32,
) -> BufferId
Queue matrix multiplication: C = A × B
A is M×K elements, B is K×N elements, output is M×N elements. All matrices are row-major flat arrays.
Returns buffer ID for the M×N output.
§Panics
Panics if buffer sizes don’t match the declared dimensions.
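The row-major layout and the dimension check can be made concrete with a CPU reference. This is a sketch: `matmul_ref` is a hypothetical helper, it takes `usize` dimensions for indexing convenience where the real method takes `u32`, and the assertion messages are illustrative.

```rust
/// CPU reference for C = A × B with row-major flat arrays.
/// A holds M*K elements, B holds K*N elements, the result holds M*N.
pub fn matmul_ref(a: &[f32], b: &[f32], m: usize, k: usize, n: usize) -> Vec<f32> {
    // Mirror the documented panic: sizes must match the declared dimensions.
    assert_eq!(a.len(), m * k, "A must hold M*K elements");
    assert_eq!(b.len(), k * n, "B must hold K*N elements");

    let mut c = vec![0.0f32; m * n];
    for i in 0..m {
        for p in 0..k {
            // Element A[i][p] lives at index i*K + p in the flat array.
            let a_ip = a[i * k + p];
            for j in 0..n {
                // B[p][j] is at p*N + j; C[i][j] is at i*N + j.
                c[i * n + j] += a_ip * b[p * n + j];
            }
        }
    }
    c
}
```

For a 2×3 matrix `[1, 2, 3, 4, 5, 6]` multiplied by a 3×2 matrix `[7, 8, 9, 10, 11, 12]`, the result is the 2×2 matrix `[58, 64, 139, 154]`.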
pub fn import_buffer(&mut self, buffer: Arc<Buffer>, size: usize) -> BufferId
Import a pre-existing GPU buffer for use in batch operations.
Unlike upload() which copies host data to GPU during execute(),
imported buffers are already GPU-resident and skip the upload step.
The Arc wrapper allows the same buffer to be shared across multiple
batch executions without re-uploading (KAIZEN-015: GPU-resident weights).
§Contract (C-BATCH-IMPORT-001)
- Precondition: buffer is a valid wgpu::Buffer with STORAGE | COPY_SRC usage
- Postcondition: Returned BufferId can be used in all batch operations (matmul, etc.)
- Invariant: Imported buffer is NOT destroyed when the batch is dropped; the Arc keeps it alive as long as the caller retains a clone
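The lifetime invariant follows from ordinary `Arc` reference counting, which can be shown without a GPU. This is a sketch: `MockGpuBuffer` and `ImportingBatch` are hypothetical stand-ins for `wgpu::Buffer` and the real batch, and the mock `import_buffer` omits the `size` argument.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a GPU-resident buffer.
pub struct MockGpuBuffer(pub Vec<f32>);

/// Hypothetical batch that only records imported buffers.
#[derive(Default)]
pub struct ImportingBatch {
    imported: Vec<Arc<MockGpuBuffer>>,
}

impl ImportingBatch {
    /// Stores a clone of the handle; the underlying data is not copied.
    pub fn import_buffer(&mut self, buffer: Arc<MockGpuBuffer>) -> usize {
        self.imported.push(buffer);
        self.imported.len() - 1
    }
}

/// Returns (strong count while a batch holds the buffer,
///          strong count after that batch is dropped,
///          number of elements still readable by the caller).
pub fn weights_survive_batch() -> (usize, usize, usize) {
    let weights = Arc::new(MockGpuBuffer(vec![1.0, 2.0, 3.0]));
    let during = {
        let mut batch = ImportingBatch::default();
        batch.import_buffer(Arc::clone(&weights));
        // Caller and batch each hold one reference here.
        Arc::strong_count(&weights)
    }; // `batch` is dropped at the end of this block.
    let after = Arc::strong_count(&weights);
    (during, after, weights.0.len())
}
```

Dropping the batch only releases its own clone, so the caller's handle remains valid and can be imported into the next batch without re-uploading.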
pub fn wgpu_device(&self) -> &Device
Get the underlying wgpu device for creating persistent buffers.
Used to create wgpu::Buffer instances that outlive individual batch executions.
Created buffers can be registered via import_buffer().
pub fn wgpu_queue(&self) -> &Queue
Get the underlying wgpu queue for writing to persistent buffers.
pub fn num_operations(&self) -> usize
Get number of queued operations
pub fn num_buffers(&self) -> usize
Get number of buffers
Auto Trait Implementations§
impl Freeze for GpuCommandBatch
impl !RefUnwindSafe for GpuCommandBatch
impl Send for GpuCommandBatch
impl Sync for GpuCommandBatch
impl Unpin for GpuCommandBatch
impl UnsafeUnpin for GpuCommandBatch
impl !UnwindSafe for GpuCommandBatch
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> FmtForward for T
fn fmt_binary(self) -> FmtBinary<Self>
where
    Self: Binary,
Causes self to use its Binary implementation when Debug-formatted.
fn fmt_display(self) -> FmtDisplay<Self>
where
    Self: Display,
Causes self to use its Display implementation when Debug-formatted.
fn fmt_lower_exp(self) -> FmtLowerExp<Self>
where
    Self: LowerExp,
Causes self to use its LowerExp implementation when Debug-formatted.
fn fmt_lower_hex(self) -> FmtLowerHex<Self>
where
    Self: LowerHex,
Causes self to use its LowerHex implementation when Debug-formatted.
fn fmt_octal(self) -> FmtOctal<Self>
where
    Self: Octal,
Causes self to use its Octal implementation when Debug-formatted.
fn fmt_pointer(self) -> FmtPointer<Self>
where
    Self: Pointer,
Causes self to use its Pointer implementation when Debug-formatted.
fn fmt_upper_exp(self) -> FmtUpperExp<Self>
where
    Self: UpperExp,
Causes self to use its UpperExp implementation when Debug-formatted.
fn fmt_upper_hex(self) -> FmtUpperHex<Self>
where
    Self: UpperHex,
Causes self to use its UpperHex implementation when Debug-formatted.
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
impl<T> Pipe for T
where
    T: ?Sized,
fn pipe<R>(self, func: impl FnOnce(Self) -> R) -> R
where
    Self: Sized,
fn pipe_ref<'a, R>(&'a self, func: impl FnOnce(&'a Self) -> R) -> R
where
    R: 'a,
Borrows self and passes that borrow into the pipe function.
fn pipe_ref_mut<'a, R>(&'a mut self, func: impl FnOnce(&'a mut Self) -> R) -> R
where
    R: 'a,
Mutably borrows self and passes that borrow into the pipe function.
fn pipe_borrow<'a, B, R>(&'a self, func: impl FnOnce(&'a B) -> R) -> R
fn pipe_borrow_mut<'a, B, R>(
    &'a mut self,
    func: impl FnOnce(&'a mut B) -> R,
) -> R
fn pipe_as_ref<'a, U, R>(&'a self, func: impl FnOnce(&'a U) -> R) -> R
Borrows self, then passes self.as_ref() into the pipe function.
fn pipe_as_mut<'a, U, R>(&'a mut self, func: impl FnOnce(&'a mut U) -> R) -> R
Mutably borrows self, then passes self.as_mut() into the pipe function.
fn pipe_deref<'a, T, R>(&'a self, func: impl FnOnce(&'a T) -> R) -> R
Borrows self, then passes self.deref() into the pipe function.
impl<T> Pointable for T
impl<T> Tap for T
fn tap_borrow<B>(self, func: impl FnOnce(&B)) -> Self
Immutable access to the Borrow<B> of a value.
fn tap_borrow_mut<B>(self, func: impl FnOnce(&mut B)) -> Self
Mutable access to the BorrowMut<B> of a value.
fn tap_ref<R>(self, func: impl FnOnce(&R)) -> Self
Immutable access to the AsRef<R> view of a value.
fn tap_ref_mut<R>(self, func: impl FnOnce(&mut R)) -> Self
Mutable access to the AsMut<R> view of a value.
fn tap_deref<T>(self, func: impl FnOnce(&T)) -> Self
Immutable access to the Deref::Target of a value.
fn tap_deref_mut<T>(self, func: impl FnOnce(&mut T)) -> Self
Mutable access to the Deref::Target of a value.
fn tap_dbg(self, func: impl FnOnce(&Self)) -> Self
Calls .tap() only in debug builds, and is erased in release builds.
fn tap_mut_dbg(self, func: impl FnOnce(&mut Self)) -> Self
Calls .tap_mut() only in debug builds, and is erased in release builds.
fn tap_borrow_dbg<B>(self, func: impl FnOnce(&B)) -> Self
Calls .tap_borrow() only in debug builds, and is erased in release builds.
fn tap_borrow_mut_dbg<B>(self, func: impl FnOnce(&mut B)) -> Self
Calls .tap_borrow_mut() only in debug builds, and is erased in release builds.
fn tap_ref_dbg<R>(self, func: impl FnOnce(&R)) -> Self
Calls .tap_ref() only in debug builds, and is erased in release builds.
fn tap_ref_mut_dbg<R>(self, func: impl FnOnce(&mut R)) -> Self
Calls .tap_ref_mut() only in debug builds, and is erased in release builds.
fn tap_deref_dbg<T>(self, func: impl FnOnce(&T)) -> Self
Calls .tap_deref() only in debug builds, and is erased in release builds.