Module shared_memory

Expand description

Shared memory management for CUDA kernel emulation

Emulates CUDA shared memory (__shared__) on the CPU, including:

Static shared memory allocation (known size at compile time)
Dynamic (extern) shared memory allocation (size provided at launch)
Bank conflict detection for profiling/debugging

In CUDA, shared memory is per-block SRAM accessible by all threads in a block. On the CPU we emulate this with heap-allocated buffers shared among threads in the same block.

Structs§

BankConflictDetector: Tracks shared memory access patterns to detect bank conflicts.
DynamicSharedMemory: Dynamic shared memory allocation.
SharedMemory: Static shared memory allocation.

Constants§

BANK_WIDTH_BYTES: Size of each bank in bytes (4 bytes = 32 bits, matching CUDA).
NUM_BANKS: Number of memory banks (matches NVIDIA GPU shared memory banks).

Module shared_memory

Module shared_memory Copy item path

Structs§

Constants§

Module shared_memory