Struct rust_gpu_tools::cuda::Program
pub struct Program { /* private fields */ }
Abstraction that contains everything to run a CUDA kernel on a GPU.
The majority of methods are the same as crate::opencl::Program, so you can write code using this API which will then work with OpenCL as well as CUDA kernels.
Implementations
impl Program
pub fn device_name(&self) -> &str
Returns the name of the GPU, e.g. “GeForce RTX 3090”.
pub fn from_binary(
    device: &Device,
    filename: &CStr
) -> Result<Program, GPUError>
Creates a program for a specific device from a compiled CUDA binary file.
pub fn from_bytes(device: &Device, bytes: &[u8]) -> Result<Program, GPUError>
Creates a program for a specific device from a compiled CUDA binary.
pub unsafe fn create_buffer<T>(
    &self,
    length: usize
) -> Result<Buffer<T>, GPUError>
Creates a new buffer that can be used for input/output with the GPU. The length is the number of elements to create.
It is usually used to create buffers that are initialized by the GPU. If you want to directly transfer data from the host to the GPU, use the safe Program::create_buffer_from_slice instead.
Safety
The buffer needs to be initialized (by the host with Program::write_from_buffer, or by the GPU) before it can be read via Program::read_into_buffer.
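The create-then-initialize-then-read ordering can be illustrated with a host-side stand-in (MockBuffer below is hypothetical and lives in host memory, unlike the real Buffer, which lives on the GPU): reading before anything has written the buffer would observe undefined contents, so the mock tracks initialization explicitly.

```rust
// Hypothetical host-side model of the buffer contract: space exists after
// creation, but reading is only valid once the host (or GPU) has written it.
struct MockBuffer<T> {
    data: Vec<T>,
    initialized: bool,
}

impl<T: Copy + Default> MockBuffer<T> {
    // Like Program::create_buffer: memory is allocated, contents undefined.
    fn create(length: usize) -> Self {
        MockBuffer { data: vec![T::default(); length], initialized: false }
    }

    // Like Program::write_from_buffer: the host fills the buffer.
    fn write_from(&mut self, src: &[T]) {
        self.data.copy_from_slice(src);
        self.initialized = true;
    }

    // Like Program::read_into_buffer: only valid after initialization.
    fn read_into(&self, dst: &mut [T]) -> Result<(), &'static str> {
        if !self.initialized {
            return Err("read of uninitialized buffer");
        }
        dst.copy_from_slice(&self.data);
        Ok(())
    }
}

fn main() {
    let mut buffer = MockBuffer::<u32>::create(3);
    let mut out = [0u32; 3];
    assert!(buffer.read_into(&mut out).is_err()); // read before init: rejected
    buffer.write_from(&[1, 2, 3]);
    assert_eq!(buffer.read_into(&mut out), Ok(()));
    assert_eq!(out, [1, 2, 3]);
}
```

The real create_buffer cannot enforce this check at runtime, which is why it is marked unsafe and the obligation falls on the caller.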
pub fn create_buffer_from_slice<T>(
    &self,
    slice: &[T]
) -> Result<Buffer<T>, GPUError>
Creates a new buffer on the GPU and initializes it with the given slice.
pub fn create_kernel(
    &self,
    name: &str,
    gws: usize,
    lws: usize
) -> Result<Kernel<'_>, GPUError>
Returns a kernel.
The global_work_size does not follow the OpenCL definition: it is not the total number of threads. Instead it follows CUDA's definition and is the number of local_work_size-sized thread groups, so the total number of threads is global_work_size * local_work_size.
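A quick check of this convention, using a hypothetical helper (launch_dims is not part of the crate): to launch at least one thread per element of an n-element buffer with a local work size of 256, round the group count up.

```rust
// Hypothetical helper illustrating CUDA-style work sizes: `gws` counts
// thread groups of `lws` threads each, so a launch covers gws * lws threads.
fn launch_dims(n: usize, lws: usize) -> (usize, usize) {
    let gws = (n + lws - 1) / lws; // round up so every element gets a thread
    (gws, lws)
}

fn main() {
    let (gws, lws) = launch_dims(1000, 256);
    assert_eq!(gws, 4);          // 4 groups, not 1000: CUDA-style, not OpenCL
    assert_eq!(gws * lws, 1024); // total threads covers all 1000 elements
}
```

Under OpenCL's definition the same launch would pass a global work size of 1024 (the thread total); here you pass 4, the group count.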
pub fn write_from_buffer<T>(
    &self,
    buffer: &mut Buffer<T>,
    data: &[T]
) -> Result<(), GPUError>
Puts data from an existing buffer onto the GPU.
pub fn read_into_buffer<T>(
    &self,
    buffer: &Buffer<T>,
    data: &mut [T]
) -> Result<(), GPUError>
Reads data from the GPU into an existing buffer.
pub fn run<F, R, E, A>(&self, fun: F, arg: A) -> Result<R, E> where
    F: FnOnce(&Self, A) -> Result<R, E>,
    E: From<GPUError>,
Run some code in the context of the program.
It sets the correct contexts.
It takes the program as a parameter so that the same function body can be used for both the OpenCL and the CUDA code paths; the only difference is the type of the program.
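The shape of this pattern can be sketched with a stand-in program type (MockProgram and GpuError below are hypothetical, not part of the crate): the closure receives the program as an argument, so the same closure body could be handed to either back end.

```rust
#[derive(Debug, PartialEq)]
struct GpuError; // stand-in for rust_gpu_tools::GPUError

struct MockProgram; // stand-in for cuda::Program / opencl::Program

impl MockProgram {
    // Mirrors the signature of Program::run: a real implementation would
    // make the right GPU context current before handing off to the closure.
    fn run<F, R, E, A>(&self, fun: F, arg: A) -> Result<R, E>
    where
        F: FnOnce(&Self, A) -> Result<R, E>,
        E: From<GpuError>,
    {
        fun(self, arg)
    }
}

fn main() {
    let program = MockProgram;
    // The closure gets the program as a parameter, so this body is
    // independent of whether `program` is the CUDA or OpenCL variant.
    let doubled: Result<Vec<u32>, GpuError> = program.run(
        |_prog, input: Vec<u32>| Ok(input.iter().map(|x| x * 2).collect()),
        vec![1, 2, 3],
    );
    assert_eq!(doubled, Ok(vec![2, 4, 6]));
}
```

The E: From<GPUError> bound lets the closure use its own error type while still propagating GPU errors with the ? operator.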
Trait Implementations
Auto Trait Implementations
impl RefUnwindSafe for Program
impl !Sync for Program
impl Unpin for Program
impl UnwindSafe for Program
Blanket Implementations
impl<T> BorrowMut<T> for T where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.