Skip to main content

Module shared_memory

Module shared_memory 

Source
Expand description

Shared memory management for CUDA kernel emulation

Emulates CUDA shared memory (__shared__) on the CPU, including:

  • Static shared memory allocation (known size at compile time)
  • Dynamic (extern) shared memory allocation (size provided at launch)
  • Bank conflict detection for profiling/debugging

In CUDA, shared memory is per-block SRAM accessible by all threads in a block. On the CPU we emulate this with heap-allocated buffers shared among threads in the same block.

Structs§

BankConflictDetector
Tracks shared memory access patterns to detect bank conflicts.
DynamicSharedMemory
Dynamic shared memory allocation.
SharedMemory
Static shared memory allocation.

Constants§

BANK_WIDTH_BYTES
Size of each bank in bytes (4 bytes = 32 bits, matching CUDA).
NUM_BANKS
Number of memory banks (matches NVIDIA GPU shared memory banks).