pub struct ComputeClient<Server: ComputeServer> { /* private fields */ }
The ComputeClient is the entry point for requesting tasks from the ComputeServer. It should be obtained for a specific device via the Compute struct.
Implementations
impl<Server> ComputeClient<Server>
where
    Server: ComputeServer,
pub fn init<D: Device>(device: &D, server: Server) -> Self
Create a new client with a new server.
pub unsafe fn set_stream(&mut self, stream_id: StreamId)
Sets the stream on which the current client operates.
§Safety
This is highly unsafe and should probably only be used by the CubeCL/Burn projects for now.
pub fn read_async(
    &self,
    handles: Vec<Handle>,
) -> impl Future<Output = Vec<Bytes>> + Send
Given handles, returns the owned resources as bytes.
pub fn read_tensor_async(
    &self,
    descriptors: Vec<CopyDescriptor<'_>>,
) -> impl Future<Output = Vec<Bytes>> + Send
Given copy descriptors, returns the owned resources as bytes.
pub fn read_tensor(&self, descriptors: Vec<CopyDescriptor<'_>>) -> Vec<Bytes>
Given copy descriptors, returns the owned resources as bytes.
§Remarks
Panics if the read operation fails.
The tensor must be in the same layout as created by the runtime, or a stricter one. Contiguous tensors are always fine; strided tensors are only valid if the strides match what the runtime would create (i.e. padded only on the last dimension). A way to check stride compatibility at runtime will be added in the future.
Also see ComputeClient::create_tensor.
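The stride-compatibility rule above can be checked by hand. Below is a minimal, self-contained sketch (not part of the CubeCL API; the helper names `contiguous_strides` and `is_read_compatible` are hypothetical) that accepts contiguous strides, or strides padded only on the last dimension:

```rust
/// Row-major contiguous strides for a shape (in elements).
fn contiguous_strides(shape: &[usize]) -> Vec<usize> {
    let mut strides = vec![1; shape.len()];
    for i in (0..shape.len().saturating_sub(1)).rev() {
        strides[i] = strides[i + 1] * shape[i + 1];
    }
    strides
}

/// Hypothetical check: a layout is readable if it is contiguous, or if only
/// the last dimension is padded (a "pitched" layout), i.e. the innermost
/// stride is 1 and every outer dimension is contiguous over the row pitch.
fn is_read_compatible(shape: &[usize], strides: &[usize]) -> bool {
    let n = shape.len();
    if n == 0 || strides.len() != n || strides[n - 1] != 1 {
        return false;
    }
    if n == 1 {
        return true;
    }
    // Row pitch: stride of the second-to-last dimension, at least a full row.
    let pitch = strides[n - 2];
    if pitch < shape[n - 1] {
        return false;
    }
    // Outer dims must be exactly contiguous over the pitched rows.
    let mut expected = pitch;
    for i in (0..n - 1).rev() {
        if strides[i] != expected {
            return false;
        }
        expected *= shape[i];
    }
    true
}

fn main() {
    let shape = [2, 3, 5];
    // Contiguous strides are always fine.
    assert!(is_read_compatible(&shape, &contiguous_strides(&shape)));
    // Pitched: rows of 5 elements padded to 8.
    assert!(is_read_compatible(&shape, &[24, 8, 1]));
    // Arbitrary (e.g. transposed) strides are not readable.
    assert!(!is_read_compatible(&shape, &[1, 2, 6]));
}
```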
pub fn read_one_tensor_async(
    &self,
    descriptor: CopyDescriptor<'_>,
) -> impl Future<Output = Bytes> + Send
Given a descriptor, returns the owned resource as bytes. See ComputeClient::read_tensor.
pub fn read_one_tensor(&self, descriptor: CopyDescriptor<'_>) -> Bytes
Given a descriptor, returns the owned resource as bytes.
§Remarks
Panics if the read operation fails. See ComputeClient::read_tensor.
pub fn get_resource(
    &self,
    binding: Binding,
) -> BindingResource<<Server::Storage as ComputeStorage>::Resource>
Given a resource handle, returns the storage resource.
pub fn create(&self, data: &[u8]) -> Handle
Given a resource, stores it and returns the resource handle.
pub fn create_tensor(
    &self,
    data: &[u8],
    shape: &[usize],
    elem_size: usize,
) -> Allocation
Given a resource and shape, stores it and returns the tensor handle and strides. This may or may not return contiguous strides. The layout is up to the runtime, and care should be taken when indexing.
Currently the tensor may either be contiguous (most runtimes), or “pitched”, to use the CUDA terminology. This means the last (contiguous) dimension is padded to fit a certain alignment, and the strides are adjusted accordingly. This can make memory accesses significantly faster since all rows are aligned to at least 16 bytes (the maximum load width), meaning the GPU can load as much data as possible in a single instruction. It may be aligned even more to also take cache lines into account.
However, the stride must be taken into account when indexing and reading the tensor (also see ComputeClient::read_tensor).
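As a concrete illustration of the pitched layout described above, here is a self-contained sketch (hypothetical helpers, not the actual runtime code) that pads the last dimension so each row starts on a 16-byte boundary, then uses the resulting strides for indexing. It assumes the alignment is a multiple of the element size:

```rust
/// Pad the innermost dimension so each row is aligned to `align` bytes,
/// then derive element strides for the padded ("pitched") layout.
fn pitched_strides(shape: &[usize], elem_size: usize, align: usize) -> Vec<usize> {
    let n = shape.len();
    let mut strides = vec![1; n];
    if n < 2 {
        return strides;
    }
    // Round the row length (in bytes) up to the alignment, then convert
    // back to a stride in elements for the second-to-last dimension.
    let row_bytes = shape[n - 1] * elem_size;
    let padded_bytes = (row_bytes + align - 1) / align * align;
    strides[n - 2] = padded_bytes / elem_size;
    // Outer dimensions are contiguous over the pitched rows.
    for i in (0..n - 2).rev() {
        strides[i] = strides[i + 1] * shape[i + 1];
    }
    strides
}

/// Flat element offset of a multi-dimensional index under given strides.
fn offset(index: &[usize], strides: &[usize]) -> usize {
    index.iter().zip(strides).map(|(i, s)| i * s).sum()
}

fn main() {
    // A 2x3 tensor of f32 (4-byte elements), rows aligned to 16 bytes:
    // each row of 3 elements (12 bytes) is padded to 16 bytes = 4 elements.
    let strides = pitched_strides(&[2, 3], 4, 16);
    assert_eq!(strides, vec![4, 1]);
    // Element [1, 2] lives at flat offset 1 * 4 + 2 * 1 = 6, not 5.
    assert_eq!(offset(&[1, 2], &strides), 6);
}
```

This is why naive contiguous indexing is wrong on a pitched allocation: the row padding shifts every element after the first row.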
pub fn create_tensors(
    &self,
    descriptors: Vec<(AllocationDescriptor<'_>, &[u8])>,
) -> Vec<Allocation>
Reserves all shapes in a single storage buffer, copies the corresponding data into each
handle, and returns the handles for them.
See ComputeClient::create_tensor
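The single-buffer reservation above can be pictured as placing each allocation at an aligned offset within one buffer. A minimal sketch of that bookkeeping (the `pack_offsets` helper and the 256-byte placement alignment are illustrative assumptions, not the runtime's actual values):

```rust
/// Compute byte offsets for packing several allocations of the given sizes
/// into one buffer, each placed at a multiple of `align` bytes.
fn pack_offsets(sizes: &[usize], align: usize) -> (Vec<usize>, usize) {
    let mut offsets = Vec::with_capacity(sizes.len());
    let mut cursor = 0usize;
    for &size in sizes {
        // Round the cursor up to the next aligned boundary.
        cursor = (cursor + align - 1) / align * align;
        offsets.push(cursor);
        cursor += size;
    }
    (offsets, cursor) // offset of each allocation, and total buffer size
}

fn main() {
    // Three tensors of 100, 300, and 64 bytes, placed at 256-byte boundaries.
    let (offsets, total) = pack_offsets(&[100, 300, 64], 256);
    assert_eq!(offsets, vec![0, 256, 768]);
    assert_eq!(total, 832);
}
```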
pub fn empty(&self, size: usize) -> Handle
Reserves size bytes in the storage, and returns a handle over them.
pub fn empty_tensor(&self, shape: &[usize], elem_size: usize) -> Allocation
Reserves a tensor of the given shape in the storage, and returns a tensor handle for it.
See ComputeClient::create_tensor
pub fn empty_tensors(
    &self,
    descriptors: Vec<AllocationDescriptor<'_>>,
) -> Vec<Allocation>
Reserves all shapes in a single storage buffer, and returns the handles for them.
See ComputeClient::create_tensor
pub fn to_client(&self, src: Handle, dst_server: &Self) -> Allocation
Transfers data from one client to another.
pub fn to_client_tensor(
    &self,
    src_descriptor: CopyDescriptor<'_>,
    dst_server: &Self,
) -> Allocation
Transfers data from one client to another.
Make sure the source descriptor can be read in a contiguous manner.
pub fn execute(
    &self,
    kernel: Server::Kernel,
    count: CubeCount,
    bindings: Bindings,
)
Executes the kernel over the given bindings.
pub unsafe fn execute_unchecked(
    &self,
    kernel: Server::Kernel,
    count: CubeCount,
    bindings: Bindings,
)
Executes the kernel over the given bindings without performing any bounds checks.
§Safety
To ensure this is safe, you must verify that your kernel:
- Performs no out-of-bounds reads or writes.
- Contains no loops that might never terminate.
pub fn properties(&self) -> &DeviceProperties
Get the features supported by the compute server.
pub fn properties_mut(&mut self) -> Option<&mut DeviceProperties>
§Warning
For private use only.
pub fn memory_usage(&self) -> MemoryUsage
Get the current memory usage of this client.
pub unsafe fn allocation_mode(&self, mode: MemoryAllocationMode)
Change the memory allocation mode.
§Safety
This function isn’t thread-safe and might create memory leaks.
pub fn memory_persistent_allocation<Input, Output, Func: Fn(Input) -> Output>(
    &self,
    input: Input,
    func: Func,
) -> Output
Use a persistent memory strategy to execute the provided function.
§Notes
- Using this memory strategy is beneficial for storing model parameters and similar workflows.
- You can call Self::memory_cleanup() if you want to free persistent memory.
pub fn memory_cleanup(&self)
Asks the client to release any memory it can.
Note: results depend on what the memory allocator deems beneficial, so it isn’t guaranteed that any memory is actually freed.
pub fn profile<O>(
    &self,
    func: impl FnOnce() -> O,
    func_name: &str,
) -> Result<ProfileDuration, ProfileError>
Measure the execution time of some inner operations.