pub struct DeviceOccupancyInfo {
    pub sm_count: u32,
    pub max_threads_per_sm: u32,
    pub max_blocks_per_sm: u32,
    pub max_registers_per_sm: u32,
    pub max_shared_memory_per_sm: u32,
    pub warp_size: u32,
}
Hardware parameters needed for CPU-side occupancy estimation.
Fields§
sm_count: u32
Number of streaming multiprocessors (SMs) on the device.
max_threads_per_sm: u32
Maximum resident threads per SM.
max_blocks_per_sm: u32
Maximum concurrent blocks per SM.
max_registers_per_sm: u32
Total 32-bit registers available per SM.
max_shared_memory_per_sm: u32
Shared memory capacity per SM, in bytes.
warp_size: u32
Threads per warp (typically 32).
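To illustrate how these fields can feed a CPU-side occupancy estimate, here is a minimal sketch. It does not use this crate: `OccupancyInfoSketch`, `ga10x_sketch`, `resident_blocks_per_sm`, and `occupancy` are hypothetical names introduced only for the example. The thread and shared memory limits are the GA10x values from the SM capability table on this page; the block and register limits (16 and 65536) are assumptions, since the table does not list them. The model is deliberately simplified and ignores register and shared memory allocation granularity.

```rust
/// Hypothetical local mirror of the documented fields, so the sketch
/// compiles without the crate.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct OccupancyInfoSketch {
    sm_count: u32,
    max_threads_per_sm: u32,
    max_blocks_per_sm: u32,
    max_registers_per_sm: u32,
    max_shared_memory_per_sm: u32,
    warp_size: u32,
}

/// GA10x-like values. `max_blocks_per_sm` and `max_registers_per_sm`
/// are assumptions made for this example.
fn ga10x_sketch() -> OccupancyInfoSketch {
    OccupancyInfoSketch {
        sm_count: 84,
        max_threads_per_sm: 1536,
        max_blocks_per_sm: 16,       // assumed
        max_registers_per_sm: 65536, // assumed
        max_shared_memory_per_sm: 102_400,
        warp_size: 32,
    }
}

/// Resident blocks per SM: the tightest of the four per-SM limits.
fn resident_blocks_per_sm(
    info: &OccupancyInfoSketch,
    threads_per_block: u32,
    registers_per_thread: u32,
    shared_memory_per_block: u32,
) -> u32 {
    if threads_per_block == 0 || info.warp_size == 0 {
        return 0;
    }
    // Threads are scheduled in whole warps, so round the block size up.
    let warps_per_block = (threads_per_block + info.warp_size - 1) / info.warp_size;
    let padded_threads = warps_per_block * info.warp_size;

    let by_threads = info.max_threads_per_sm / padded_threads;
    // A resource the kernel does not use imposes no limit (checked_div -> None).
    let by_registers = info
        .max_registers_per_sm
        .checked_div(registers_per_thread.saturating_mul(padded_threads))
        .unwrap_or(u32::MAX);
    let by_shared = info
        .max_shared_memory_per_sm
        .checked_div(shared_memory_per_block)
        .unwrap_or(u32::MAX);

    info.max_blocks_per_sm
        .min(by_threads)
        .min(by_registers)
        .min(by_shared)
}

/// Occupancy: resident warps divided by the maximum resident warps per SM.
fn occupancy(info: &OccupancyInfoSketch, resident_blocks: u32, threads_per_block: u32) -> f64 {
    if info.warp_size == 0 {
        return 0.0;
    }
    let max_warps = info.max_threads_per_sm / info.warp_size;
    if max_warps == 0 {
        return 0.0;
    }
    let warps_per_block = (threads_per_block + info.warp_size - 1) / info.warp_size;
    f64::from(resident_blocks * warps_per_block) / f64::from(max_warps)
}

fn main() {
    let info = ga10x_sketch();
    // Kernel: 256 threads per block, 32 registers per thread, 16 KiB shared memory.
    let blocks = resident_blocks_per_sm(&info, 256, 32, 16_384);
    println!("blocks per SM: {blocks}");
    println!("occupancy: {:.2}", occupancy(&info, blocks, 256));
    println!("resident blocks on the device: {}", blocks * info.sm_count);
}
```

In this sketch, raising the register count from 32 to 64 per thread makes registers the binding limit and drops the result from 6 to 4 resident blocks per SM.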
Implementations§
impl DeviceOccupancyInfo
pub fn for_compute_capability(sm_major: u32, sm_minor: u32) -> Self
Returns a synthetic DeviceOccupancyInfo for a given SM compute
capability, enabling CPU-side occupancy analysis without a live GPU.
Covers the major NVIDIA GPU architectures from Turing through Blackwell. Unrecognized compute capabilities fall back to the Ampere SM 8.6 (GA10x) defaults.
§SM capability table
| Architecture | sm_major | sm_minor | SMs | Threads/SM | Shared memory/SM (bytes) |
|---|---|---|---|---|---|
| Turing | 7 | 5 | 68 | 1024 | 65536 |
| Ampere A100 | 8 | 0 | 108 | 2048 | 167936 |
| Ampere GA10x | 8 | 6 | 84 | 1536 | 102400 |
| Ada Lovelace | 8 | 9 | 76 | 1536 | 101376 |
| Hopper H100 | 9 | 0 | 132 | 2048 | 232448 |
| Blackwell B100 | 10 | 0 | 132 | 2048 | 262144 |
| Blackwell B200 | 12 | 0 | 148 | 2048 | 262144 |
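The table and its fallback rule can be read as a lookup keyed on the (sm_major, sm_minor) pair. The following is a sketch only, not the crate's implementation: `TableRow` and `lookup_row` are hypothetical names, only the three numeric columns shown in the table are modeled, and it assumes exact-pair matching, which the documentation above does not confirm.

```rust
/// Hypothetical subset of the documented columns.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct TableRow {
    sm_count: u32,
    max_threads_per_sm: u32,
    max_shared_memory_per_sm: u32, // bytes
}

/// Look up a row by compute capability. Values are copied from the table;
/// matching on the exact (major, minor) pair is an assumption of this sketch.
fn lookup_row(sm_major: u32, sm_minor: u32) -> TableRow {
    let (sm_count, max_threads_per_sm, max_shared_memory_per_sm) = match (sm_major, sm_minor) {
        (7, 5) => (68, 1024, 65_536),    // Turing
        (8, 0) => (108, 2048, 167_936),  // Ampere A100
        (8, 9) => (76, 1536, 101_376),   // Ada Lovelace
        (9, 0) => (132, 2048, 232_448),  // Hopper H100
        (10, 0) => (132, 2048, 262_144), // Blackwell B100
        (12, 0) => (148, 2048, 262_144), // Blackwell B200
        // SM 8.6 itself and every unlisted pair use the Ampere GA10x defaults.
        _ => (84, 1536, 102_400),
    };
    TableRow {
        sm_count,
        max_threads_per_sm,
        max_shared_memory_per_sm,
    }
}

fn main() {
    let hopper = lookup_row(9, 0);
    let unknown = lookup_row(6, 1); // not in the table, falls back to SM 8.6
    println!("{hopper:?}");
    println!("{unknown:?}");
}
```

Placing the fallback in the wildcard arm means a caller always receives usable limits, at the cost of silently treating an unlisted device as GA10x.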
Trait Implementations§
impl Clone for DeviceOccupancyInfo
fn clone(&self) -> DeviceOccupancyInfo
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for DeviceOccupancyInfo
impl Hash for DeviceOccupancyInfo
impl PartialEq for DeviceOccupancyInfo
impl Copy for DeviceOccupancyInfo
impl Eq for DeviceOccupancyInfo
impl StructuralPartialEq for DeviceOccupancyInfo
Auto Trait Implementations§
impl Freeze for DeviceOccupancyInfo
impl RefUnwindSafe for DeviceOccupancyInfo
impl Send for DeviceOccupancyInfo
impl Sync for DeviceOccupancyInfo
impl Unpin for DeviceOccupancyInfo
impl UnsafeUnpin for DeviceOccupancyInfo
impl UnwindSafe for DeviceOccupancyInfo
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.