Module aligned


Aligned GPU memory allocation for optimal access patterns.

This module provides AlignedBuffer<T>, a device memory buffer whose starting address is guaranteed to meet a requested alignment. Proper alignment is critical for coalesced memory accesses on GPUs: a misaligned load or store can straddle additional memory segments and therefore require extra memory transactions, which lowers effective bandwidth.
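
To make the cost of misalignment concrete, here is a minimal sketch that counts how many fixed-size memory segments one contiguous warp access touches. It is a simplified model, not part of this module: the function name `segments_touched` and the segment size passed in are assumptions for illustration, and real hardware behaviour depends on the architecture and cache configuration.

```rust
/// Counts how many `segment`-byte memory segments a contiguous warp access touches.
///
/// `base` is the byte address read by the first thread; each of the `warp_size`
/// threads reads `access_width` consecutive bytes directly after the previous one.
/// This is a simplified model (hypothetical helper, not part of the crate).
fn segments_touched(base: u64, access_width: u64, warp_size: u64, segment: u64) -> u64 {
    assert!(segment > 0 && access_width > 0 && warp_size > 0);
    // Index of the segment containing the first byte of the access.
    let first = base / segment;
    // Index of the segment containing the last byte of the access.
    let last = (base + access_width * warp_size - 1) / segment;
    last - first + 1
}
```

Under this model, 32 threads reading 4 bytes each from a 128-byte-aligned base touch one 128-byte segment, while the same access shifted by 4 bytes touches two.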

§Alignment options

Variant      Bytes   Use case
Default      256     CUDA’s natural allocation alignment
Align256     256     Explicit 256-byte alignment
Align512     512     Optimal for many GPU texture/surface ops
Align1024    1024    Large-stride access patterns
Align4096    4096    Page-aligned for unified/mapped memory
Custom(n)    n       User-specified (must be a power of two)
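
The variant-to-bytes mapping in the table can be sketched as a plain enum. This is a hypothetical mirror written for illustration, not the crate's actual definition: the method names `bytes` and `is_power_of_two` and the use of `usize` are assumptions.

```rust
/// Hypothetical mirror of the alignment table; not the crate's actual definition.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Alignment {
    Default,
    Align256,
    Align512,
    Align1024,
    Align4096,
    Custom(usize),
}

impl Alignment {
    /// Byte count for each variant, as listed in the table.
    fn bytes(self) -> usize {
        match self {
            Alignment::Default | Alignment::Align256 => 256,
            Alignment::Align512 => 512,
            Alignment::Align1024 => 1024,
            Alignment::Align4096 => 4096,
            Alignment::Custom(n) => n,
        }
    }

    /// True when the byte count is a power of two, which the table
    /// requires of `Custom(n)`. Zero is not a power of two.
    fn is_power_of_two(self) -> bool {
        self.bytes().is_power_of_two()
    }
}
```

In this sketch `Custom(64)` passes the power-of-two check, while `Custom(96)` and `Custom(0)` do not.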

§Example

let buf = AlignedBuffer::<f32>::alloc(1024, Alignment::Align512)?;
assert!(buf.is_aligned());
assert_eq!(buf.as_device_ptr() % 512, 0);

Structs§

AlignedBuffer
A device memory buffer whose starting address is guaranteed to meet the requested Alignment.
AlignmentInfo
Information about the alignment of an existing device pointer.
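
As a rough idea of what alignment information about an existing pointer might contain, the sketch below inspects a raw address. The struct `AlignmentReport`, its fields, and the function `inspect_address` are hypothetical stand-ins; the real AlignmentInfo fields are not shown on this page.

```rust
/// Hypothetical stand-in for an alignment report; field names are assumptions.
#[derive(Debug, PartialEq, Eq)]
struct AlignmentReport {
    /// Largest power of two that divides the address.
    natural_alignment: u64,
    /// Whether the address is a multiple of the requested alignment.
    meets_requested: bool,
}

/// Inspects a raw address against a requested power-of-two alignment.
fn inspect_address(addr: u64, requested: u64) -> AlignmentReport {
    assert!(requested.is_power_of_two(), "alignment must be a power of two");
    // For a non-zero address, the lowest set bit is the largest power of two
    // dividing it. Address 0 is divisible by every power of two, so this sketch
    // reports the largest one representable in a u64.
    let natural_alignment = if addr == 0 {
        1u64 << 63
    } else {
        1u64 << addr.trailing_zeros()
    };
    AlignmentReport {
        natural_alignment,
        meets_requested: (addr & (requested - 1)) == 0,
    }
}
```

For the address `0x7000_0200`, this reports a natural alignment of 512 bytes, so a 512-byte request is met and a 1024-byte request is not.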

Enums§

Alignment
Specifies the byte alignment for a device memory allocation.

Functions§

check_alignment
Inspects a device pointer and reports its alignment characteristics.
coalesce_alignment
Computes the smallest alignment that ensures coalesced memory access for a given access_width (in bytes) across a warp of warp_size threads.
optimal_alignment_for_type
Recommends an optimal Alignment for a type based on its size.
round_up_to_alignment
Rounds bytes up to the next multiple of alignment.
validate_alignment
Validates that an Alignment is a power of two and within a reasonable range.
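
The arithmetic behind rounding, validation, and coalescing can be sketched with standalone helpers. These are illustrative only: the names `round_up`, `is_valid_alignment`, and `coalesce_bytes`, the `Option` return types, and the `MAX_ALIGNMENT` bound are all assumptions, and the coalescing rule shown (total bytes accessed by a warp, rounded up to a power of two) is one plausible reading of the description rather than the crate's confirmed behaviour.

```rust
/// Rounds `bytes` up to the next multiple of `alignment`.
/// Returns `None` if `alignment` is not a power of two or the result overflows.
fn round_up(bytes: usize, alignment: usize) -> Option<usize> {
    if !alignment.is_power_of_two() {
        return None;
    }
    // For a power-of-two alignment, adding (alignment - 1) and clearing the
    // low bits yields the next multiple.
    let mask = alignment - 1;
    bytes.checked_add(mask).map(|v| v & !mask)
}

/// Assumed upper bound, chosen for illustration only.
const MAX_ALIGNMENT: usize = 1 << 30;

/// Checks that `alignment` is a power of two and no larger than the assumed bound.
fn is_valid_alignment(alignment: usize) -> bool {
    alignment.is_power_of_two() && alignment <= MAX_ALIGNMENT
}

/// Assumed rule: total bytes touched by one warp, rounded up to a power of two.
/// Returns `None` on arithmetic overflow.
fn coalesce_bytes(access_width: usize, warp_size: usize) -> Option<usize> {
    access_width
        .checked_mul(warp_size)?
        .checked_next_power_of_two()
}
```

With these helpers, `round_up(1000, 512)` gives `Some(1024)`, and a warp of 32 threads each accessing 4 bytes gives `coalesce_bytes(4, 32) == Some(128)`.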