pub struct ComputeClient<Server: ComputeServer> { /* private fields */ }
The ComputeClient is the entry point for requesting tasks from the ComputeServer. It should be obtained for a specific device via the Compute struct.
Implementations
impl<Server> ComputeClient<Server>
where
    Server: ComputeServer,
pub fn init<D: Device>(device: &D, server: Server) -> Self
Create a new client with a new server.
pub unsafe fn set_stream(&mut self, stream_id: StreamId)
Sets the stream on which the current client operates.
§Safety
This is highly unsafe and should probably only be used by the CubeCL/Burn projects for now.
pub fn read_async(
    &self,
    handles: Vec<Handle>,
) -> impl Future<Output = Vec<Bytes>> + Send
Given handles, returns the owned resources as bytes.
pub fn read_tensor_async(
    &self,
    descriptors: Vec<CopyDescriptor<'_>>,
) -> impl Future<Output = Vec<Bytes>> + Send
Given copy descriptors, returns the owned resources as bytes.
pub fn read_tensor(&self, descriptors: Vec<CopyDescriptor<'_>>) -> Vec<Bytes>
Given copy descriptors, returns the owned resources as bytes.
§Remarks
Panics if the read operation fails.
The tensor must be in the same layout as created by the runtime, or a stricter one. Contiguous tensors are always fine; strided tensors are only valid if the strides match what the runtime would create (i.e. padded only on the last dimension). A way to check stride compatibility at runtime will be added in the future.
Also see ComputeClient::create_tensor.
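The stride-compatibility rule above can be checked by hand. Below is a minimal, self-contained sketch (not part of the CubeCL API; the helper names `contiguous_strides` and `is_read_compatible` are hypothetical) that accepts contiguous strides, or strides padded only on the last dimension:

```rust
/// Row-major contiguous strides for a shape (in elements).
fn contiguous_strides(shape: &[usize]) -> Vec<usize> {
    let mut strides = vec![1; shape.len()];
    for i in (0..shape.len().saturating_sub(1)).rev() {
        strides[i] = strides[i + 1] * shape[i + 1];
    }
    strides
}

/// Hypothetical check: a layout is readable if it is contiguous, or if only
/// the last dimension is padded (a "pitched" layout), i.e. the innermost
/// stride is 1 and every outer dimension is contiguous over the row pitch.
fn is_read_compatible(shape: &[usize], strides: &[usize]) -> bool {
    let n = shape.len();
    if n == 0 || strides.len() != n || strides[n - 1] != 1 {
        return false;
    }
    if n == 1 {
        return true;
    }
    // Row pitch: stride of the second-to-last dimension, at least a full row.
    let pitch = strides[n - 2];
    if pitch < shape[n - 1] {
        return false;
    }
    // Outer dims must be exactly contiguous over the pitched rows.
    let mut expected = pitch;
    for i in (0..n - 1).rev() {
        if strides[i] != expected {
            return false;
        }
        expected *= shape[i];
    }
    true
}

fn main() {
    let shape = [2, 3, 5];
    // Contiguous strides are always fine.
    assert!(is_read_compatible(&shape, &contiguous_strides(&shape)));
    // Pitched: rows of 5 elements padded to 8.
    assert!(is_read_compatible(&shape, &[24, 8, 1]));
    // Arbitrary (e.g. transposed) strides are not readable.
    assert!(!is_read_compatible(&shape, &[1, 2, 6]));
}
```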
pub fn read_one_tensor_async(
    &self,
    descriptor: CopyDescriptor<'_>,
) -> impl Future<Output = Bytes> + Send
Given a descriptor, returns the owned resource as bytes. See ComputeClient::read_tensor.
pub fn read_one_tensor(&self, descriptor: CopyDescriptor<'_>) -> Bytes
Given a descriptor, returns the owned resource as bytes.
§Remarks
Panics if the read operation fails. See ComputeClient::read_tensor.
pub fn get_resource(
    &self,
    binding: Binding,
) -> BindingResource<<Server::Storage as ComputeStorage>::Resource>
Given a resource handle, returns the storage resource.
pub fn create(&self, data: &[u8]) -> Handle
Given a resource, stores it and returns the resource handle.
pub fn create_tensor(
    &self,
    data: &[u8],
    shape: &[usize],
    elem_size: usize,
) -> Allocation
Given a resource and shape, stores it and returns the tensor handle and strides. This may or may not return contiguous strides. The layout is up to the runtime, and care should be taken when indexing.
Currently the tensor may either be contiguous (most runtimes), or “pitched”, to use the CUDA terminology. This means the last (contiguous) dimension is padded to fit a certain alignment, and the strides are adjusted accordingly. This can make memory accesses significantly faster since all rows are aligned to at least 16 bytes (the maximum load width), meaning the GPU can load as much data as possible in a single instruction. It may be aligned even more to also take cache lines into account.
However, the stride must be taken into account when indexing and reading the tensor (also see ComputeClient::read_tensor).
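As a concrete illustration of the pitched layout described above, here is a self-contained sketch (hypothetical helpers, not the actual runtime code) that pads the last dimension so each row starts on a 16-byte boundary, then uses the resulting strides for indexing. It assumes the alignment is a multiple of the element size:

```rust
/// Pad the innermost dimension so each row is aligned to `align` bytes,
/// then derive element strides for the padded ("pitched") layout.
fn pitched_strides(shape: &[usize], elem_size: usize, align: usize) -> Vec<usize> {
    let n = shape.len();
    let mut strides = vec![1; n];
    if n < 2 {
        return strides;
    }
    // Round the row length (in bytes) up to the alignment, then convert
    // back to a stride in elements for the second-to-last dimension.
    let row_bytes = shape[n - 1] * elem_size;
    let padded_bytes = (row_bytes + align - 1) / align * align;
    strides[n - 2] = padded_bytes / elem_size;
    // Outer dimensions are contiguous over the pitched rows.
    for i in (0..n - 2).rev() {
        strides[i] = strides[i + 1] * shape[i + 1];
    }
    strides
}

/// Flat element offset of a multi-dimensional index under given strides.
fn offset(index: &[usize], strides: &[usize]) -> usize {
    index.iter().zip(strides).map(|(i, s)| i * s).sum()
}

fn main() {
    // A 2x3 tensor of f32 (4-byte elements), rows aligned to 16 bytes:
    // each row of 3 elements (12 bytes) is padded to 16 bytes = 4 elements.
    let strides = pitched_strides(&[2, 3], 4, 16);
    assert_eq!(strides, vec![4, 1]);
    // Element [1, 2] lives at flat offset 1 * 4 + 2 * 1 = 6, not 5.
    assert_eq!(offset(&[1, 2], &strides), 6);
}
```

This is why naive contiguous indexing is wrong on a pitched allocation: the row padding shifts every element after the first row.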
pub fn create_tensors(
    &self,
    descriptors: Vec<(AllocationDescriptor<'_>, &[u8])>,
) -> Vec<Allocation>
Reserves all shapes in a single storage buffer, copies the corresponding data into each
handle, and returns the handles for them.
See ComputeClient::create_tensor
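The single-buffer reservation above can be pictured as placing each allocation at an aligned offset within one buffer. A minimal sketch of that bookkeeping (the `pack_offsets` helper and the 256-byte placement alignment are illustrative assumptions, not the runtime's actual values):

```rust
/// Compute byte offsets for packing several allocations of the given sizes
/// into one buffer, each placed at a multiple of `align` bytes.
fn pack_offsets(sizes: &[usize], align: usize) -> (Vec<usize>, usize) {
    let mut offsets = Vec::with_capacity(sizes.len());
    let mut cursor = 0usize;
    for &size in sizes {
        // Round the cursor up to the next aligned boundary.
        cursor = (cursor + align - 1) / align * align;
        offsets.push(cursor);
        cursor += size;
    }
    (offsets, cursor) // offset of each allocation, and total buffer size
}

fn main() {
    // Three tensors of 100, 300, and 64 bytes, placed at 256-byte boundaries.
    let (offsets, total) = pack_offsets(&[100, 300, 64], 256);
    assert_eq!(offsets, vec![0, 256, 768]);
    assert_eq!(total, 832);
}
```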
pub fn empty(&self, size: usize) -> Handle
Reserves size bytes in the storage, and returns a handle over them.
pub fn empty_tensor(&self, shape: &[usize], elem_size: usize) -> Allocation
Reserves a tensor of the given shape in the storage, and returns a tensor handle for it.
See ComputeClient::create_tensor
pub fn empty_tensors(
    &self,
    descriptors: Vec<AllocationDescriptor<'_>>,
) -> Vec<Allocation>
Reserves all shapes in a single storage buffer, and returns the handles for them.
See ComputeClient::create_tensor
pub fn to_client(&self, src: Handle, dst_server: &Self) -> Allocation
Transfers data from one client to another.
pub fn to_client_tensor(
    &self,
    src_descriptor: CopyDescriptor<'_>,
    dst_server: &Self,
) -> Allocation
Transfers data from one client to another.
Make sure the source descriptor can be read in a contiguous manner.
pub fn execute(
    &self,
    kernel: Server::Kernel,
    count: CubeCount,
    bindings: Bindings,
)
Executes the kernel over the given bindings.
pub unsafe fn execute_unchecked(
    &self,
    kernel: Server::Kernel,
    count: CubeCount,
    bindings: Bindings,
)
Executes the kernel over the given bindings without performing any bounds checks.
§Safety
To ensure this is safe, you must verify that your kernel:
- Performs no out-of-bounds reads or writes.
- Contains no loops that might never terminate.
pub fn properties(&self) -> &DeviceProperties
Get the features supported by the compute server.
pub fn properties_mut(&mut self) -> Option<&mut DeviceProperties>
§Warning
For private use only.
pub fn memory_usage(&self) -> MemoryUsage
Get the current memory usage of this client.
pub unsafe fn allocation_mode(&self, mode: MemoryAllocationMode)
Change the memory allocation mode.
§Safety
This function isn’t thread-safe and might create memory leaks.
pub fn memory_persistent_allocation<Input, Output, Func: Fn(Input) -> Output>(
    &self,
    input: Input,
    func: Func,
) -> Output
Use a persistent memory strategy to execute the provided function.
§Notes
- Using this memory strategy is beneficial for storing model parameters and similar workflows.
- You can call Self::memory_cleanup() if you want to free persistent memory.
pub fn memory_cleanup(&self)
Asks the client to release any memory it can.
Note: results depend on what the memory allocator deems beneficial, so it isn’t guaranteed that any memory is actually freed.
pub fn profile<O>(
    &self,
    func: impl FnOnce() -> O,
    func_name: &str,
) -> Result<ProfileDuration, ProfileError>
Measure the execution time of some inner operations.