Struct rcublas_sys::cudaFuncAttributes

#[repr(C)]
pub struct cudaFuncAttributes {
    pub sharedSizeBytes: usize,
    pub constSizeBytes: usize,
    pub localSizeBytes: usize,
    pub maxThreadsPerBlock: c_int,
    pub numRegs: c_int,
    pub ptxVersion: c_int,
    pub binaryVersion: c_int,
    pub cacheModeCA: c_int,
    pub maxDynamicSharedSizeBytes: c_int,
    pub preferredShmemCarveout: c_int,
}

CUDA function attributes

Fields

sharedSizeBytes: usize

The size in bytes of statically-allocated shared memory per block required by this function. This does not include dynamically-allocated shared memory requested by the user at runtime.

constSizeBytes: usize

The size in bytes of user-allocated constant memory required by this function.

localSizeBytes: usize

The size in bytes of local memory used by each thread of this function.

maxThreadsPerBlock: c_int

The maximum number of threads per block, beyond which a launch of the function would fail. This number depends on both the function and the device on which the function is currently loaded.
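Since a launch fails once the block size exceeds this limit, a caller may want to check a planned block size before launching. The following is a minimal sketch, not part of this crate: `block_size_ok` is a hypothetical helper, and treating a non-positive limit as unusable is an assumption made here for illustration.

```rust
use std::os::raw::c_int;

/// Hypothetical helper: returns true if `threads_per_block` does not exceed
/// the limit read from `cudaFuncAttributes::maxThreadsPerBlock`.
fn block_size_ok(threads_per_block: u32, max_threads_per_block: c_int) -> bool {
    // Assumption of this sketch: a zero or negative limit is treated as
    // unusable (for example, the struct was never filled in).
    if max_threads_per_block <= 0 {
        return false;
    }
    // A block needs at least one thread and may use up to the limit.
    threads_per_block >= 1 && threads_per_block <= max_threads_per_block as u32
}
```

In practice the second argument would come from the `maxThreadsPerBlock` field of a struct populated by the CUDA runtime for the function and device in question.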

numRegs: c_int

The number of registers used by each thread of this function.

ptxVersion: c_int

The PTX virtual architecture version for which the function was compiled. This value is the major PTX version * 10 + the minor PTX version, so a PTX version 1.3 function would return the value 13.

binaryVersion: c_int

The binary architecture version for which the function was compiled. This value is the major binary version * 10 + the minor binary version, so a binary version 1.3 function would return the value 13.
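Both `ptxVersion` and `binaryVersion` use the same major * 10 + minor encoding, so one small function can split either value back into its parts. This is an illustrative sketch; `decode_version` is a hypothetical name and is not provided by this crate.

```rust
use std::os::raw::c_int;

/// Hypothetical helper: splits an encoded version (major * 10 + minor)
/// into a `(major, minor)` pair.
fn decode_version(encoded: c_int) -> (c_int, c_int) {
    // Integer division recovers the major part, the remainder the minor part.
    (encoded / 10, encoded % 10)
}
```

For the example given in the field descriptions, `decode_version(13)` yields `(1, 3)`.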

cacheModeCA: c_int

Indicates whether the function was compiled with the user-specified option “-Xptxas --dlcm=ca”.

maxDynamicSharedSizeBytes: c_int

The maximum size in bytes of dynamic shared memory per block for this function. Any launch must have a dynamic shared memory size smaller than this value.

preferredShmemCarveout: c_int

On devices where the L1 cache and shared memory use the same hardware resources, this sets the shared memory carveout preference, as a percentage of the maximum shared memory. Refer to cudaDevAttrMaxSharedMemoryPerMultiprocessor. This is only a hint, and the driver can choose a different ratio if required to execute the function. See cudaFuncSetAttribute.
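Because the carveout is expressed as a percentage of the maximum shared memory, it can be handy to translate it into a byte count. The sketch below assumes the caller has already obtained the per-multiprocessor maximum (the value behind cudaDevAttrMaxSharedMemoryPerMultiprocessor) by other means; `carveout_bytes` is a hypothetical helper, and returning `None` for values outside 0 to 100 is a choice made for this example. As the carveout is only a hint, the result is an estimate of the preference, not a guarantee of what the driver applies.

```rust
use std::os::raw::c_int;

/// Hypothetical helper: converts a carveout percentage into bytes of shared
/// memory, given the maximum shared memory per multiprocessor in bytes.
/// Returns `None` when `percent` is not in 0..=100 (for example a negative
/// value used to express "no preference").
fn carveout_bytes(percent: c_int, max_shared_per_sm: usize) -> Option<usize> {
    if !(0..=100).contains(&percent) {
        return None;
    }
    // Widen to u128 so the multiplication cannot overflow.
    let bytes = (max_shared_per_sm as u128) * (percent as u128) / 100;
    Some(bytes as usize)
}
```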

Trait Implementations

impl Clone for cudaFuncAttributes

fn clone(&self) -> cudaFuncAttributes

fn clone_from(&mut self, source: &Self)

impl Debug for cudaFuncAttributes

fn fmt(&self, f: &mut Formatter<'_>) -> Result

impl Copy for cudaFuncAttributes

Auto Trait Implementations

impl RefUnwindSafe for cudaFuncAttributes

impl Send for cudaFuncAttributes

impl Sync for cudaFuncAttributes

impl Unpin for cudaFuncAttributes

impl UnwindSafe for cudaFuncAttributes

Blanket Implementations

impl<T> Any for T where T: 'static + ?Sized,

pub fn type_id(&self) -> TypeId

impl<T> Borrow<T> for T where T: ?Sized,

pub fn borrow(&self) -> &T

impl<T> BorrowMut<T> for T where T: ?Sized,

pub fn borrow_mut(&mut self) -> &mut T

impl<T> From<T> for T

pub fn from(t: T) -> T

impl<T, U> Into<U> for T where U: From<T>,

pub fn into(self) -> U

impl<T> ToOwned for T where T: Clone,

type Owned = T

pub fn to_owned(&self) -> T

pub fn clone_into(&self, target: &mut T)

impl<T, U> TryFrom<U> for T where U: Into<T>,

type Error = Infallible

pub fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

impl<T, U> TryInto<U> for T where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

pub fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>
