pub struct GpuArchSpec {
    pub name: String,
    pub max_threads_per_sm: u32,
    pub max_blocks_per_sm: u32,
    pub max_warps_per_sm: u32,
    pub warp_size: u32,
    pub registers_per_sm: u32,
    pub register_alloc_granularity: u32,
    pub shared_memory_per_sm: u32,
    pub shared_memory_alloc_granularity: u32,
    pub sm_count: u32,
}
GPU architecture specification.
Fields
name: String
Compute capability or architecture target name (e.g., “sm_90”, “gfx1100”).
max_threads_per_sm: u32
Maximum number of resident threads per SM/CU.
max_blocks_per_sm: u32
Maximum number of resident blocks (thread groups) per SM.
max_warps_per_sm: u32
Maximum number of resident warps per SM.
warp_size: u32
Warp size (32 for NVIDIA, 32 for AMD RDNA in its default wave32 mode, 64 for AMD CDNA).
registers_per_sm: u32
Total registers per SM.
register_alloc_granularity: u32
Register allocation granularity (registers are allocated in chunks of this size).
shared_memory_per_sm: u32
Shared memory per SM (bytes).
shared_memory_alloc_granularity: u32
Shared memory allocation granularity (bytes).
sm_count: u32
Number of SMs/CUs on the device.
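The fields above are the usual inputs to an occupancy estimate. The following is a minimal sketch of how they might combine, not the crate's own method: the struct is a local stand-in so the block compiles on its own, `round_up`, `blocks_per_sm`, `occupancy`, and `illustrative_spec` are hypothetical helpers, and the numbers in `illustrative_spec` are placeholders rather than values taken from this crate. It also assumes registers are allocated per warp and rounded up to `register_alloc_granularity`.

```rust
// Local stand-in mirroring the documented fields (assumption: the real type
// lives in the crate; it is redefined here only to keep the sketch standalone).
#[derive(Clone)]
pub struct GpuArchSpec {
    pub name: String,
    pub max_threads_per_sm: u32,
    pub max_blocks_per_sm: u32,
    pub max_warps_per_sm: u32,
    pub warp_size: u32,
    pub registers_per_sm: u32,
    pub register_alloc_granularity: u32,
    pub shared_memory_per_sm: u32,
    pub shared_memory_alloc_granularity: u32,
    pub sm_count: u32,
}

/// Hypothetical helper: round `value` up to the next multiple of `granularity`.
fn round_up(value: u32, granularity: u32) -> u32 {
    if granularity == 0 {
        return value;
    }
    (value + granularity - 1) / granularity * granularity
}

/// Hypothetical helper: number of blocks that can be resident on one SM,
/// taken as the tightest of the block, warp, thread, register, and
/// shared-memory limits.
pub fn blocks_per_sm(
    spec: &GpuArchSpec,
    threads_per_block: u32,
    regs_per_thread: u32,
    shared_mem_per_block: u32,
) -> u32 {
    if spec.warp_size == 0
        || threads_per_block == 0
        || threads_per_block > spec.max_threads_per_sm
    {
        return 0;
    }
    let warps_per_block = (threads_per_block + spec.warp_size - 1) / spec.warp_size;
    let by_warps = spec.max_warps_per_sm / warps_per_block;
    let by_threads = spec.max_threads_per_sm / threads_per_block;

    // Assumption: registers are reserved per warp, in granularity-sized chunks.
    let regs_per_warp = round_up(regs_per_thread * spec.warp_size, spec.register_alloc_granularity);
    let regs_per_block = regs_per_warp * warps_per_block;
    let by_regs = if regs_per_block == 0 { u32::MAX } else { spec.registers_per_sm / regs_per_block };

    let smem = round_up(shared_mem_per_block, spec.shared_memory_alloc_granularity);
    let by_smem = if smem == 0 { u32::MAX } else { spec.shared_memory_per_sm / smem };

    spec.max_blocks_per_sm
        .min(by_warps)
        .min(by_threads)
        .min(by_regs)
        .min(by_smem)
}

/// Hypothetical helper: resident warps divided by `max_warps_per_sm`.
pub fn occupancy(spec: &GpuArchSpec, threads: u32, regs: u32, smem: u32) -> f64 {
    if spec.warp_size == 0 || spec.max_warps_per_sm == 0 {
        return 0.0;
    }
    let warps_per_block = (threads + spec.warp_size - 1) / spec.warp_size;
    let resident_warps = blocks_per_sm(spec, threads, regs, smem) * warps_per_block;
    resident_warps as f64 / spec.max_warps_per_sm as f64
}

/// Placeholder values for illustration only; not taken from the crate.
pub fn illustrative_spec() -> GpuArchSpec {
    GpuArchSpec {
        name: "example".to_string(),
        max_threads_per_sm: 1536,
        max_blocks_per_sm: 24,
        max_warps_per_sm: 48,
        warp_size: 32,
        registers_per_sm: 65536,
        register_alloc_granularity: 256,
        shared_memory_per_sm: 102400,
        shared_memory_alloc_granularity: 128,
        sm_count: 128,
    }
}
```

With these placeholder numbers, a 256-thread block using 32 registers per thread is limited by the warp and thread caps, while doubling the register count makes `registers_per_sm` the limiting resource.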
Implementations
impl GpuArchSpec
pub fn ada_lovelace() -> Self
NVIDIA Ada Lovelace (SM 8.9), as used in the RTX 4090.
Trait Implementations
impl Clone for GpuArchSpec
fn clone(&self) -> GpuArchSpec
Returns a duplicate of the value.
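Because the type is `Clone`, one spec can serve as a template for a variant. The sketch below assumes a local stand-in struct with the documented fields; `with_sm_count` and `placeholder_spec` are hypothetical helpers, and the field values are placeholders, not data from this crate.

```rust
// Local stand-in mirroring the documented fields (assumption: the real type
// is provided by the crate and already implements Clone).
#[derive(Clone)]
pub struct GpuArchSpec {
    pub name: String,
    pub max_threads_per_sm: u32,
    pub max_blocks_per_sm: u32,
    pub max_warps_per_sm: u32,
    pub warp_size: u32,
    pub registers_per_sm: u32,
    pub register_alloc_granularity: u32,
    pub shared_memory_per_sm: u32,
    pub shared_memory_alloc_granularity: u32,
    pub sm_count: u32,
}

/// Hypothetical helper: copy `base` and override only the SM count,
/// using struct update syntax on a clone so `base` stays usable.
pub fn with_sm_count(base: &GpuArchSpec, sm_count: u32) -> GpuArchSpec {
    GpuArchSpec {
        sm_count,
        ..base.clone()
    }
}

/// Placeholder values for illustration only; not taken from the crate.
pub fn placeholder_spec() -> GpuArchSpec {
    GpuArchSpec {
        name: "example".to_string(),
        max_threads_per_sm: 1536,
        max_blocks_per_sm: 24,
        max_warps_per_sm: 48,
        warp_size: 32,
        registers_per_sm: 65536,
        register_alloc_granularity: 256,
        shared_memory_per_sm: 102400,
        shared_memory_alloc_granularity: 128,
        sm_count: 128,
    }
}
```

Since all fields are public, the same effect could be had by cloning and then assigning `sm_count` directly; the helper only keeps the override in one expression.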
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
Auto Trait Implementations
impl Freeze for GpuArchSpec
impl RefUnwindSafe for GpuArchSpec
impl Send for GpuArchSpec
impl Sync for GpuArchSpec
impl Unpin for GpuArchSpec
impl UnsafeUnpin for GpuArchSpec
impl UnwindSafe for GpuArchSpec
Blanket Implementations
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.