Struct LibraryKernel

pub struct LibraryKernel<'a> { /* private fields */ }

Implementations

impl LibraryKernel<'_>

pub fn name(&self) -> Result<String>

pub fn function(&self) -> Result<DeviceFunction>

Returns the device function handle for this kernel and the current context. If the handle is not found, the call returns crate::error::Status::NotFound.

§Errors

Returns an error if the CUDA context cannot be bound, if the CUDA driver cannot find the function, or if the driver returns a null handle.
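
A caller may want to treat a missing function differently from other failures. The following is a minimal, self-contained sketch of that pattern. `Status`, `DeviceFunction`, `lookup_function`, and `function_or_fallback` are hypothetical stand-ins defined here for illustration, not the crate's real items; only the `NotFound` variant name is taken from the description above.

```rust
// Hypothetical stand-ins for the crate's types. Only the `NotFound`
// variant name comes from the documentation; everything else is assumed.
#[derive(Debug, PartialEq)]
pub enum Status {
    NotFound,
    Other(i32),
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct DeviceFunction(pub usize);

/// Stand-in for `LibraryKernel::function`: `None` models "handle not found",
/// and a zero handle models the driver returning a null handle.
pub fn lookup_function(handle: Option<usize>) -> Result<DeviceFunction, Status> {
    match handle {
        None => Err(Status::NotFound),
        Some(0) => Err(Status::Other(-1)),
        Some(h) => Ok(DeviceFunction(h)),
    }
}

/// Uses `fallback` only when the handle is missing. Every other error is
/// propagated unchanged so that real failures are not hidden.
pub fn function_or_fallback(
    handle: Option<usize>,
    fallback: DeviceFunction,
) -> Result<DeviceFunction, Status> {
    match lookup_function(handle) {
        Err(Status::NotFound) => Ok(fallback),
        other => other,
    }
}
```

Matching on the specific variant, rather than discarding the error, keeps context-binding and null-handle failures visible to the caller.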

pub unsafe fn add_to_graph<'a, P>(
    &self,
    graph: &mut Graph,
    dependencies: &[GraphNode],
    config: &LaunchConfig,
    params: P,
) -> Result<GraphNode>
where
    P: KernelLaunchArgs<'a>,

Adds this kernel to graph as a kernel node.

§Safety

The caller must ensure every pointer value passed through params remains valid for every graph instantiation, update, and launch that can execute the created node. Mutable pointer arguments must remain exclusive for the work ordered by those launches.
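
One way to uphold this contract is to let a single owner hold both the allocation and the pointer that was recorded in the node, and to keep that owner alive for as long as the graph can run. The sketch below illustrates the idea with host memory standing in for device memory. `PinnedArgs` and its methods are hypothetical and are not part of this crate.

```rust
/// Hypothetical owner that keeps an allocation alive next to the raw pointer
/// recorded for it. Host memory stands in for a device allocation here.
pub struct PinnedArgs {
    buffer: Box<[f32]>,
    // The pointer value that would be passed through `params`.
    recorded: *mut f32,
}

impl PinnedArgs {
    /// Allocates `len` zeroed elements and records the pointer to them.
    pub fn new(len: usize) -> Self {
        let mut buffer = vec![0.0f32; len].into_boxed_slice();
        let recorded = buffer.as_mut_ptr();
        Self { buffer, recorded }
    }

    /// The pointer value recorded for the graph node.
    pub fn recorded_ptr(&self) -> *mut f32 {
        self.recorded
    }

    /// True while the recorded pointer still refers to the owned allocation.
    pub fn still_valid(&self) -> bool {
        self.buffer.as_ptr() == self.recorded as *const f32
    }

    /// Number of elements kept alive by this owner.
    pub fn len(&self) -> usize {
        self.buffer.len()
    }
}
```

Because the buffer is private and no `&mut` access is handed out, nothing else can alias or free the allocation while the owner exists. Moving the owner moves only the `Box` handle, not the heap allocation, so the recorded pointer value stays the same.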

pub unsafe fn set_graph_node_params<'a, P>(
    &self,
    executable: &mut ExecutableGraph,
    node: GraphNode,
    config: &LaunchConfig,
    params: P,
) -> Result<()>
where
    P: KernelLaunchArgs<'a>,

Updates this kernel’s parameters in an executable graph node.

§Safety

The caller must ensure every pointer value passed through params remains valid for every future launch that can execute node. Mutable pointer arguments must remain exclusive for the work ordered by those launches.

pub fn attribute(&self, attribute: FunctionAttribute) -> Result<i32>

pub fn set_attribute(
    &self,
    attribute: FunctionAttribute,
    value: i32,
) -> Result<()>

pub fn set_cache_config(&self, config: FunctionCache) -> Result<()>

Sets the preferred cache configuration for this kernel on devices where L1 cache and shared memory use the same hardware resources. This setting is only a preference. The driver uses the requested configuration if possible, but it may choose a different configuration if required to execute the kernel. This per-kernel setting overrides any context-wide preference set via sys::cuCtxSetCacheConfig.

Attributes set using sys::cuFuncSetCacheConfig override this preference regardless of call order.

This setting has no effect on devices where the sizes of the L1 cache and shared memory are fixed.

Launching a kernel with a different preference than the most recent preference setting may insert a device-side synchronization point.

The supported cache configurations are:

FunctionCache::PreferNone: no preference for shared memory or L1 (default)
FunctionCache::PreferShared: prefer larger shared memory and smaller L1 cache
FunctionCache::PreferL1: prefer larger L1 cache and smaller shared memory
FunctionCache::PreferEqual: prefer equal-sized L1 cache and shared memory
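
The precedence rules described above can be summarized as a small pure function. This is a simplified model of the documented ordering only, not of driver behavior: the driver may still choose a different configuration. The enum below is a local stand-in that mirrors the four listed variants, `requested_preference` is a hypothetical helper, and using `None` to mean "never set" is an assumption of this sketch.

```rust
/// Local stand-in mirroring the four variants listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FunctionCache {
    PreferNone,
    PreferShared,
    PreferL1,
    PreferEqual,
}

/// Hypothetical helper modelling which preference is in effect:
/// a legacy per-function setting wins over the per-kernel setting,
/// which in turn wins over the context-wide setting.
/// `None` means the setting was never made (an assumption of this model).
pub fn requested_preference(
    context: FunctionCache,
    kernel: Option<FunctionCache>,
    legacy_function: Option<FunctionCache>,
) -> FunctionCache {
    legacy_function.or(kernel).unwrap_or(context)
}
```

Call order does not appear in the signature because, per the text above, the legacy per-function setting takes precedence regardless of when it was made.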

This function has stricter locking requirements than its legacy counterpart sys::cuFuncSetCacheConfig because the setting has device-wide semantics. If multiple threads try to set a configuration on the same device simultaneously, the final cache configuration depends on how the OS scheduler interleaves the threads and on memory consistency.
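
An application that configures kernels from several threads can serialize those calls itself. The sketch below funnels every change for one device through a mutex and records the order in which changes were applied. `DeviceConfigLock` and `configure_from_threads` are hypothetical helpers, not part of this crate, and the closure marks where a real `set_cache_config` call would go. The lock does not decide which thread wins; it only ensures that calls do not overlap and that the application knows which configuration was applied last.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical per-device guard: one instance per device, shared by all
/// threads that change cache configuration on that device.
pub struct DeviceConfigLock {
    applied: Mutex<Vec<u32>>,
}

impl DeviceConfigLock {
    pub fn new() -> Self {
        Self { applied: Mutex::new(Vec::new()) }
    }

    /// Runs `set` while holding the device lock, then records `config_id`.
    pub fn apply<F: FnOnce()>(&self, config_id: u32, set: F) {
        let mut applied = self.applied.lock().unwrap();
        set(); // a real caller would invoke the driver-facing setter here
        applied.push(config_id);
    }

    /// The identifier of the most recently applied configuration, if any.
    pub fn last(&self) -> Option<u32> {
        self.applied.lock().unwrap().last().copied()
    }

    /// How many configuration changes have been applied so far.
    pub fn count(&self) -> usize {
        self.applied.lock().unwrap().len()
    }
}

/// Spawns one thread per identifier; each applies its change under the lock.
pub fn configure_from_threads(lock: &Arc<DeviceConfigLock>, ids: &[u32]) {
    let handles: Vec<_> = ids
        .iter()
        .map(|&id| {
            let lock = Arc::clone(lock);
            thread::spawn(move || lock.apply(id, || {}))
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
}
```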

§Errors

Returns an error if the CUDA context cannot be bound or if the CUDA driver rejects the cache configuration.

pub fn param_info(&self, index: usize) -> Result<KernelParamInfo>

Queries the kernel parameter at the given index, returning the offset and size where the parameter resides in the device-side parameter layout. Use this information to update kernel node parameters from the device. The index must be less than the number of parameters that the kernel takes.

§Errors

Returns an error if the library context cannot be bound, index is not a valid kernel parameter index, CUDA cannot query the parameter layout, or a previous asynchronous launch reported an error.
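
As an illustration of how an offset and size pair can be consumed, the sketch below copies a value into a host-side staging buffer at the reported position, with bounds and size checks. It is self-contained and makes several assumptions: the field names `offset` and `size`, the local `KernelParamInfo` definition, and the `write_param` helper are all hypothetical, and the staging buffer stands in for whatever parameter storage the caller actually updates.

```rust
/// Local stand-in for the crate's `KernelParamInfo`; the field names are
/// assumed for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct KernelParamInfo {
    pub offset: usize,
    pub size: usize,
}

/// Hypothetical helper: copies `value` into `buffer` at the slot described
/// by `info`. Fails if the value size does not match the parameter size or
/// if the slot does not fit inside the buffer.
pub fn write_param(
    buffer: &mut [u8],
    info: KernelParamInfo,
    value: &[u8],
) -> Result<(), String> {
    if value.len() != info.size {
        return Err(format!(
            "value is {} bytes but the parameter is {} bytes",
            value.len(),
            info.size
        ));
    }
    let end = info
        .offset
        .checked_add(info.size)
        .ok_or_else(|| "offset + size overflows".to_string())?;
    let slot = buffer
        .get_mut(info.offset..end)
        .ok_or_else(|| "parameter slot lies outside the buffer".to_string())?;
    slot.copy_from_slice(value);
    Ok(())
}
```

Checking the size before writing catches a mismatch between the Rust-side argument type and the parameter the kernel declares, which would otherwise silently corrupt neighboring parameters.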
