Aligned GPU memory allocation for optimal access patterns.
This module provides AlignedBuffer<T>, a device memory buffer that
guarantees a specific alignment for the starting address. Proper alignment
is critical for coalesced memory accesses on GPUs — misaligned loads and
stores can incur extra memory transactions, significantly hurting
throughput.
§Alignment options
| Variant | Bytes | Use case |
|---|---|---|
| `Default` | 256 | CUDA’s natural allocation alignment |
| `Align256` | 256 | Explicit 256-byte alignment |
| `Align512` | 512 | Optimal for many GPU texture/surface ops |
| `Align1024` | 1024 | Large-stride access patterns |
| `Align4096` | 4096 | Page-aligned for unified/mapped memory |
| `Custom(n)` | n | User-specified (must be a power of two) |
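The variant-to-bytes mapping in the table can be sketched as a small standalone enum. This is only an illustration under stated assumptions: the names `AlignmentSketch`, `bytes`, and `is_valid` are hypothetical and are not taken from this module, whose actual `Alignment` definition may differ.

```rust
/// Hypothetical mirror of the variants listed in the table.
/// Not the module's real `Alignment` type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AlignmentSketch {
    Default,
    Align256,
    Align512,
    Align1024,
    Align4096,
    Custom(usize),
}

impl AlignmentSketch {
    /// Byte value for each variant, as given in the table.
    pub fn bytes(self) -> usize {
        match self {
            AlignmentSketch::Default | AlignmentSketch::Align256 => 256,
            AlignmentSketch::Align512 => 512,
            AlignmentSketch::Align1024 => 1024,
            AlignmentSketch::Align4096 => 4096,
            AlignmentSketch::Custom(n) => n,
        }
    }

    /// The table requires custom values to be powers of two;
    /// `is_power_of_two` returns false for 0.
    pub fn is_valid(self) -> bool {
        self.bytes().is_power_of_two()
    }
}
```

Under this sketch, `Default` and `Align256` resolve to the same byte value, and `Custom(384)` is rejected because 384 is not a power of two.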
§Example
let buf = AlignedBuffer::<f32>::alloc(1024, Alignment::Align512)?;
assert!(buf.is_aligned());
assert_eq!(buf.as_device_ptr() % 512, 0);

Structs§
- AlignedBuffer - A device memory buffer whose starting address is guaranteed to meet the requested `Alignment`.
- AlignmentInfo - Information about the alignment of an existing device pointer.
Enums§
- Alignment - Specifies the byte alignment for a device memory allocation.
Functions§
- check_alignment - Inspects a device pointer and reports its alignment characteristics.
- coalesce_alignment - Computes the smallest alignment that ensures coalesced memory access for a given `access_width` (in bytes) across a warp of `warp_size` threads.
- optimal_alignment_for_type - Recommends an optimal `Alignment` for a type based on its size.
- round_up_to_alignment - Rounds `bytes` up to the next multiple of `alignment`.
- validate_alignment - Validates that an `Alignment` is a power of two and within a reasonable range.
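
The arithmetic behind rounding and pointer inspection can be sketched with plain integer operations. The signatures of the functions listed above are not shown on this page, so the helpers below (`round_up_sketch`, `largest_alignment_sketch`, `is_aligned_sketch`) are hypothetical stand-ins that operate on raw integers and assume power-of-two alignments; they are not the module's implementation.

```rust
/// Rounds `bytes` up to the next multiple of `alignment`.
/// Returns `None` if `alignment` is not a power of two or the result overflows.
pub fn round_up_sketch(bytes: usize, alignment: usize) -> Option<usize> {
    if !alignment.is_power_of_two() {
        return None;
    }
    let mask = alignment - 1;
    // Add (alignment - 1), then clear the low bits.
    bytes.checked_add(mask).map(|v| v & !mask)
}

/// Largest power of two that divides `addr`, treating the address as an integer.
/// Returns 0 for a null address, where the notion is not meaningful.
pub fn largest_alignment_sketch(addr: u64) -> u64 {
    if addr == 0 {
        return 0;
    }
    1u64 << addr.trailing_zeros()
}

/// True if `addr` is a multiple of the power-of-two `alignment`.
pub fn is_aligned_sketch(addr: u64, alignment: u64) -> bool {
    alignment.is_power_of_two() && addr & (alignment - 1) == 0
}
```

For instance, `round_up_sketch(1000, 256)` yields `Some(1024)`, and the address `0x1200` is 512-byte aligned but not 1024-byte aligned. The bit-mask form works only because the alignment is a power of two, which is why the sketch rejects other values instead of falling back to division.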