pub struct LibraryKernel<'a> { /* private fields */ }
Implementations§
impl LibraryKernel<'_>
pub fn name(&self) -> Result<String>
pub fn function(&self) -> Result<DeviceFunction>
Returns the device function handle for this kernel and the current context.
If the handle is not found, the call returns crate::error::Status::NotFound.
§Errors
Returns an error if the CUDA context cannot be bound, if the CUDA driver cannot find the function, or if the driver returns a null handle.
pub unsafe fn add_to_graph<'a, P>(
    &self,
    graph: &mut Graph,
    dependencies: &[GraphNode],
    config: &LaunchConfig,
    params: P,
) -> Result<GraphNode>
where
    P: KernelLaunchArgs<'a>,
Adds this kernel to graph as a kernel node.
§Safety
The caller must ensure every pointer value passed through params
remains valid for every graph instantiation, update, and launch that can
execute the created node. Mutable pointer arguments must remain
exclusive for the work ordered by those launches.
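One way to address the lifetime half of this contract is to store the owning allocations next to the graph handle, so the memory cannot be freed while the graph still exists. The sketch below is a minimal illustration under stated assumptions: `GraphWithBuffers` and its methods are hypothetical names, and the type parameters `G` and `B` stand in for this crate's graph and device-buffer types. It covers only the graph handle itself. Executable graphs instantiated from it, and the exclusivity requirement for mutable pointers, still need separate care.

```rust
/// Hypothetical wrapper that owns a graph-like handle together with the
/// allocations its kernel nodes point into.
///
/// Struct fields are dropped in declaration order, so `graph` is destroyed
/// before `buffers`, and the graph cannot outlive the memory it references.
pub struct GraphWithBuffers<G, B> {
    graph: G,
    buffers: Vec<B>,
}

impl<G, B> GraphWithBuffers<G, B> {
    /// Wraps an existing graph handle with an empty keep-alive list.
    pub fn new(graph: G) -> Self {
        Self { graph, buffers: Vec::new() }
    }

    /// Takes ownership of an allocation so it lives as long as the wrapper.
    pub fn keep_alive(&mut self, buffer: B) -> &B {
        self.buffers.push(buffer);
        self.buffers.last().expect("buffer was just pushed")
    }

    /// Gives mutable access to the wrapped graph, for example to add nodes.
    pub fn graph_mut(&mut self) -> &mut G {
        &mut self.graph
    }

    /// Number of allocations currently kept alive.
    pub fn buffer_count(&self) -> usize {
        self.buffers.len()
    }
}
```

A caller would move each device allocation into `keep_alive` before passing its pointer through `params`, and reach the graph through `graph_mut` when adding the node.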
pub unsafe fn set_graph_node_params<'a, P>(
    &self,
    executable: &mut ExecutableGraph,
    node: GraphNode,
    config: &LaunchConfig,
    params: P,
) -> Result<()>
where
    P: KernelLaunchArgs<'a>,
Updates this kernel’s parameters in an executable graph node.
§Safety
The caller must ensure every pointer value passed through params
remains valid for every future launch that can execute node. Mutable
pointer arguments must remain exclusive for the work ordered by those
launches.
pub fn attribute(&self, attribute: FunctionAttribute) -> Result<i32>
pub fn set_attribute(&self, attribute: FunctionAttribute, value: i32) -> Result<()>
pub fn set_cache_config(&self, config: FunctionCache) -> Result<()>
Sets the preferred cache configuration for this kernel on devices where L1 cache and shared memory use the same hardware resources.
This setting is only a preference.
The driver uses the requested configuration if possible, but it may choose a different configuration if required to execute the kernel.
This per-kernel setting overrides any context-wide preference set via sys::cuCtxSetCacheConfig.
Attributes set using sys::cuFuncSetCacheConfig override this preference regardless of call order.
This setting has no effect on devices where the sizes of the L1 cache and shared memory are fixed.
Launching a kernel with a different preference than the most recent preference setting may insert a device-side synchronization point.
The supported cache configurations are:
FunctionCache::PreferNone: no preference for shared memory or L1 (default)
FunctionCache::PreferShared: prefer larger shared memory and smaller L1 cache
FunctionCache::PreferL1: prefer larger L1 cache and smaller shared memory
FunctionCache::PreferEqual: prefer equal sized L1 cache and shared memory
This has stricter locking requirements than its legacy counterpart sys::cuFuncSetCacheConfig because the setting has device-wide semantics.
If multiple threads try to set a configuration on the same device simultaneously, the final cache configuration depends on OS scheduler interleaving and memory consistency.
§Errors
Returns an error if the CUDA context cannot be bound or if the CUDA driver rejects the cache configuration.
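Because concurrent callers on the same device can race, an application may want to funnel cache-configuration changes through a single lock per device. The following is a minimal sketch under stated assumptions, not this crate's API: the `SetCacheConfig` trait is a hypothetical stand-in for `LibraryKernel::set_cache_config`, the local `FunctionCache` enum only mirrors the variants listed above, the error type is simplified to `String`, and `RecordingKernel` is a test double that never touches CUDA.

```rust
use std::sync::Mutex;

/// Local stand-in mirroring the variants listed above; not the crate's type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum FunctionCache {
    PreferNone,
    PreferShared,
    PreferL1,
    PreferEqual,
}

/// Hypothetical abstraction over `LibraryKernel::set_cache_config`.
pub trait SetCacheConfig {
    fn set_cache_config(&self, config: FunctionCache) -> Result<(), String>;
}

/// One guard per device. Holding the lock across the call keeps two threads
/// from interleaving their configuration changes through this guard.
pub struct DeviceCacheGuard {
    last: Mutex<Option<FunctionCache>>,
}

impl DeviceCacheGuard {
    pub fn new() -> Self {
        Self { last: Mutex::new(None) }
    }

    /// Applies `config` while holding the lock and records it on success.
    pub fn apply<K: SetCacheConfig>(
        &self,
        kernel: &K,
        config: FunctionCache,
    ) -> Result<(), String> {
        let mut last = self.last.lock().map_err(|e| e.to_string())?;
        kernel.set_cache_config(config)?;
        *last = Some(config);
        Ok(())
    }

    /// Returns the most recently applied preference, if any.
    pub fn last_applied(&self) -> Option<FunctionCache> {
        *self.last.lock().unwrap()
    }
}

/// Test double that records calls instead of calling into the driver.
pub struct RecordingKernel {
    calls: Mutex<Vec<FunctionCache>>,
}

impl RecordingKernel {
    pub fn new() -> Self {
        Self { calls: Mutex::new(Vec::new()) }
    }

    /// Number of configuration requests seen so far.
    pub fn call_count(&self) -> usize {
        self.calls.lock().unwrap().len()
    }
}

impl SetCacheConfig for RecordingKernel {
    fn set_cache_config(&self, config: FunctionCache) -> Result<(), String> {
        self.calls.lock().unwrap().push(config);
        Ok(())
    }
}
```

The guard only serializes callers that go through it. Code that sets a configuration directly on the kernel bypasses the lock, so the pattern works only if every thread shares the same guard for a given device.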
pub fn param_info(&self, index: usize) -> Result<KernelParamInfo>
Queries the kernel parameter at the given index, returning the offset and size where the parameter resides in the device-side parameter layout. Use this information to update kernel node parameters from the device. The index must be less than the number of parameters that the kernel takes.
§Errors
Returns an error if the library context cannot be bound, index is not a valid kernel
parameter index, CUDA cannot query the parameter layout, or a previous asynchronous launch
reported an error.
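To make the meaning of the offset and size concrete, the sketch below copies one argument into a host-side byte image of a parameter buffer. It is an illustration only: `ParamSlot` and its `offset` and `size` fields are hypothetical stand-ins, since the fields of `KernelParamInfo` are not shown here, and the actual device-side update path is not modelled.

```rust
/// Hypothetical mirror of the offset/size pair that `param_info` reports.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ParamSlot {
    pub offset: usize,
    pub size: usize,
}

/// Copies `value` into `buffer` at the slot's offset.
///
/// Fails if the value length differs from the slot size or if the slot
/// does not fit inside the buffer.
pub fn write_param(buffer: &mut [u8], slot: ParamSlot, value: &[u8]) -> Result<(), String> {
    if value.len() != slot.size {
        return Err(format!("expected {} bytes, got {}", slot.size, value.len()));
    }
    let end = slot
        .offset
        .checked_add(slot.size)
        .ok_or("slot range overflows usize")?;
    let dst = buffer
        .get_mut(slot.offset..end)
        .ok_or("slot lies outside the parameter buffer")?;
    dst.copy_from_slice(value);
    Ok(())
}
```

Checking the length against `size` before copying catches a mismatch between the Rust argument type and the kernel's parameter type early, rather than silently overwriting a neighbouring parameter.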
pub const fn as_raw(&self) -> CUkernel
Trait Implementations§
impl<'a> Clone for LibraryKernel<'a>
fn clone(&self) -> LibraryKernel<'a>
1.0.0 (const: unstable) · fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.