Module queue

Lock-free message queue implementation.

This module provides the core message queue abstraction used for communication between host and GPU kernels. The queue uses a ring buffer design with atomic operations for lock-free access.
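As a rough illustration of the ring-buffer idea, here is a minimal sketch in safe Rust. It is not this crate's `SpscQueue`: the `TinyRing` type, its `u64` payload, and the `try_push`/`try_pop` names are all hypothetical, and it assumes exactly one producer thread and one consumer thread. It follows the naming used in this module's docs, where the producer advances `head` and the consumer advances `tail`.

```rust
use std::sync::atomic::{AtomicU64, AtomicUsize, Ordering};

/// Hypothetical single-producer single-consumer ring of `u64` messages.
/// Illustration only; the real queue types in this module may differ.
pub struct TinyRing {
    slots: Vec<AtomicU64>,
    mask: usize,
    /// Next slot to write; stored only by the producer.
    head: AtomicUsize,
    /// Next slot to read; stored only by the consumer.
    tail: AtomicUsize,
}

impl TinyRing {
    /// `capacity` must be a power of two so index wrapping is a bit mask.
    pub fn new(capacity: usize) -> Self {
        assert!(capacity.is_power_of_two());
        Self {
            slots: (0..capacity).map(|_| AtomicU64::new(0)).collect(),
            mask: capacity - 1,
            head: AtomicUsize::new(0),
            tail: AtomicUsize::new(0),
        }
    }

    /// Producer side: hands the value back if the ring is full.
    pub fn try_push(&self, value: u64) -> Result<(), u64> {
        let head = self.head.load(Ordering::Relaxed);
        // Acquire pairs with the consumer's Release store to `tail`.
        let tail = self.tail.load(Ordering::Acquire);
        if head.wrapping_sub(tail) == self.slots.len() {
            return Err(value);
        }
        self.slots[head & self.mask].store(value, Ordering::Relaxed);
        // Release publishes the slot write before the new index is visible.
        self.head.store(head.wrapping_add(1), Ordering::Release);
        Ok(())
    }

    /// Consumer side: returns `None` if the ring is empty.
    pub fn try_pop(&self) -> Option<u64> {
        let tail = self.tail.load(Ordering::Relaxed);
        // Acquire pairs with the producer's Release store to `head`.
        let head = self.head.load(Ordering::Acquire);
        if head == tail {
            return None;
        }
        let value = self.slots[tail & self.mask].load(Ordering::Relaxed);
        self.tail.store(tail.wrapping_add(1), Ordering::Release);
        Some(value)
    }
}
```

Neither side ever takes a lock: each index has a single writer, and the Acquire/Release pairs order the slot accesses against the index updates.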

§Cache-line padding

SPSC queue throughput under concurrent producer/consumer load is dominated by cache-line bouncing between cores. When head and tail live on the same cache line, every producer store to head invalidates the consumer’s cached copy of the line that also holds tail, and vice versa, so every operation pays a forced cache-coherence round trip (false sharing). [CachePadded] places each hot field on its own 128-byte cache line (the widest modern line, covering both x86 spatial prefetching pairs and NVIDIA Hopper-era L2), so producer and consumer do not contend at line granularity.

128 bytes is a conservative choice: AMD Zen 4 / Intel Sapphire Rapids use 64-byte lines but prefetch in pairs (the “destructive interference pair”), so two adjacent 64-byte lines can still interfere. crossbeam-utils pads its CachePadded type to 128 bytes on x86-64 for the same reason.
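The padding described above can be sketched with `#[repr(align(128))]`. This is an assumption about how such a wrapper might look, not the module's actual `CachePadded` definition; `Padded`, `Indices`, and `index_distance` are hypothetical names used only for illustration.

```rust
use std::sync::atomic::AtomicUsize;

/// Hypothetical stand-in for a cache-padded wrapper. Aligning to 128 bytes
/// also rounds the type's size up to a multiple of 128 bytes.
#[repr(align(128))]
pub struct Padded<T>(pub T);

/// Hypothetical index pair; `repr(C)` keeps the fields in declaration order.
#[repr(C)]
pub struct Indices {
    pub head: Padded<AtomicUsize>,
    pub tail: Padded<AtomicUsize>,
}

/// Byte distance between the two hot fields.
pub fn index_distance(ix: &Indices) -> usize {
    let head = &ix.head as *const Padded<AtomicUsize> as usize;
    let tail = &ix.tail as *const Padded<AtomicUsize> as usize;
    tail.abs_diff(head)
}
```

With this layout the two indices sit 128 bytes apart, so a store to one cannot invalidate the line, or the adjacent prefetched line, that holds the other. The cost is 256 bytes for two counters, which is usually acceptable for a handful of hot fields per queue.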

Structs§

BoundedQueue: Bounded queue with blocking operations.
MpscQueue: Multi-producer single-consumer lock-free queue.
PartitionedQueue: A partitioned queue for reduced contention with multiple producers.
PartitionedQueueStats: Statistics for a partitioned queue.
QueueFactory: Factory for creating appropriately-sized message queues.
QueueMetrics: Comprehensive queue metrics snapshot.
QueueMonitor: Monitor for queue health and utilization.
QueueStats: Statistics for a message queue.
SpscQueue: Single-producer single-consumer lock-free ring buffer.

Enums§

QueueHealth: Queue health status from monitoring.
QueueTier: Queue capacity tiers for dynamic queue allocation.

Traits§

MessageQueue: Trait for message queue implementations.
