slotbus
Lock-free shared memory IPC for Rust. Sub-microsecond wake latency. Sub-millisecond round trips. Zero-copy payloads. Drop-in replacement for localhost HTTP in same-machine architectures.
Why slotbus?
If your services run on the same machine, you're paying for overhead you don't need. HTTP localhost adds 5-20ms of socket copies, HTTP parsing, and serialization per round trip. Unix sockets are better but still kernel-mediated. gRPC layers protobuf and HTTP/2 on top.
Slotbus eliminates all of it. Processes read from and write to shared memory pages directly, and OS-level event signaling handles wakeups. The result:
| Metric | slotbus | HTTP localhost | Unix socket | gRPC | shmem-ipc (ring) |
|---|---|---|---|---|---|
| Wake latency | 0-1 us | ~50 us | ~20 us | ~100 us | ~1-5 us |
| Round-trip (GET, small) | 0.1-0.4 ms | 5-15 ms | 1-3 ms | 2-5 ms | N/A (stream) |
| Round-trip (POST, body) | 0.7-0.8 ms | 8-20 ms | 2-5 ms | 3-8 ms | N/A (stream) |
| Concurrent in-flight | 32 slots | unlimited | unlimited | unlimited | 1 (SPSC) |
| Serialization overhead | postcard (binary) | JSON + HTTP framing | protocol-dependent | protobuf + HTTP/2 | raw bytes |
| CPU while idle | 0% (event wait) | 0% (epoll/IOCP) | 0% (epoll/IOCP) | 0% (epoll/IOCP) | polling or futex |
Measured on Windows 11, AMD Ryzen 9, DDR5.
10-50x faster than HTTP localhost for request/response workloads.
Use Cases
- Microservice communication on a single host — replace localhost HTTP between co-located services with shared memory IPC
- Sidecar architectures — connect main processes to sidecars (auth, logging, metrics) without network overhead
- Plugin systems — let plugins run in separate processes with near-zero communication cost
- AI/ML inference — dispatch requests to GPU worker processes with minimal latency
- Game servers — fast IPC between game logic, physics, and networking processes
- Desktop applications — communicate between a UI process and background workers
Quick Start
Add slotbus to your Cargo.toml:
[dependencies]
slotbus = "0.1"
Hub side — create a bus and dispatch requests
use ;
// Create a shared memory bus for a worker named "my-worker"
let config = builder
.name
.num_slots // 32 concurrent in-flight requests
.region_size // 1MB control region
.build;
let bus = create?;
// Start the response watcher (background thread)
bus.start_response_watcher;
// Dispatch a request — returns a oneshot receiver
let response = bus.dispatch_request?;
// Wait for the worker's response
let resp = response.await?;
println!;
Worker side — connect and handle requests
use SlotWorker;
// Open the shared memory region created by the hub
let worker = open?;
// Start the receive loop (runs on a dedicated OS thread)
worker.start_receive_loop;
slotbus-hub
Need an HTTP gateway? slotbus-hub is a standalone HTTP-to-shared-memory router. Workers register routes via HTTP; clients send normal HTTP requests; the hub dispatches them through shared memory with sub-millisecond round trips.
Install it separately: cargo install slotbus-hub — see the slotbus-hub repo for full documentation.
Comparison
| Feature | slotbus | Unix socket | HTTP localhost | gRPC | shmem-ipc | iceoryx2 |
|---|---|---|---|---|---|---|
| Topology | Hub/worker (req/rsp) | Point-to-point | Client/server | Client/server | Point-to-point | Pub/sub |
| Latency | 0.1-0.8 ms RTT | 1-5 ms RTT | 5-20 ms RTT | 2-8 ms RTT | <0.1 ms (stream) | <0.1 ms |
| Wake mechanism | Named events | epoll/IOCP | epoll/IOCP | epoll/IOCP | futex/polling | waitset |
| Concurrency | 32 slots (configurable) | unlimited | unlimited | unlimited | 1 (SPSC) | per-publisher |
| Request/response | Native | Manual | Native | Native | Manual | No (pub/sub) |
| Zero-copy reads | Inline heap | No | No | No | Yes | Yes |
| Serialization | postcard (meta only) | user choice | HTTP + JSON | protobuf | raw bytes | raw bytes |
| HTTP bridge | slotbus-hub binary | manual | native | grpc-web | manual | no |
| Route registration | Dynamic (runtime) | N/A | framework | protobuf schema | N/A | topic-based |
| Overflow handling | Auto spillover regions | N/A | chunked transfer | streaming | fixed buffer | loan mechanism |
| Windows | Yes | Partial | Yes | Yes | No | Yes |
| Linux | Yes | Yes | Yes | Yes | Yes | Yes |
| macOS | Yes | Yes | Yes | Yes | Yes | Yes |
When to use slotbus:
- You need request/response semantics (not streaming or pub/sub)
- Your processes are on the same machine
- Latency matters — you want sub-millisecond round-trips
- You want an HTTP-compatible interface without HTTP overhead (via slotbus-hub)
- You have multiple workers behind a single entry point
When to use something else:
- Cross-machine communication (use gRPC or HTTP)
- Pure streaming / pub-sub (use iceoryx2 or ZeroMQ)
- Single-producer single-consumer with maximum throughput (use shmem-ipc ring buffers)
Platform Support
| Platform | Status | Signaling mechanism |
|---|---|---|
| Windows | Supported | Named Events (CreateEventW / SetEvent / WaitForSingleObject) |
| Linux | Supported | POSIX named semaphores (sem_open / sem_post / sem_timedwait) |
| macOS | Supported | POSIX named semaphores (sem_open / sem_post / sem_trywait polling) |
The shared memory layer uses the shared_memory crate, which supports all three platforms. The signaling layer uses platform-native primitives for sub-microsecond wake latency on Windows and Linux. macOS uses a polling fallback (~1ms resolution) since sem_timedwait is not available.
Shared Memory Layout
        Hub Process                              Worker Process
  ┌─────────────────────┐                    ┌─────────────────────┐
  │       SlotBus       │                    │     SlotWorker      │
  │  (hub-side handle)  │                    │ (worker-side handle)│
  └──────────┬──────────┘                    └──────────┬──────────┘
             │                                          │
    dispatch_request()                        start_receive_loop()
             │                                          │
┌────────────▼──────────────────────────────────────────▼──────────────┐
│                 Shared Memory Control Region (1 MB)                  │
│                                                                      │
│  ┌────────────────────────────────────────────────────────────────┐  │
│  │ Header (64 bytes)                                              │  │
│  │ magic: 0x48554231 | version: 1 | num_slots: 32                 │  │
│  │ heap_offset | heap_size | alloc_head (AtomicU32)               │  │
│  └────────────────────────────────────────────────────────────────┘  │
│                                                                      │
│  ┌──────────┐ ┌──────────┐ ┌──────────┐                ┌──────────┐  │
│  │  Slot 0  │ │  Slot 1  │ │  Slot 2  │     . . .      │ Slot 31  │  │
│  │ 128 bytes│ │ 128 bytes│ │ 128 bytes│                │ 128 bytes│  │
│  │          │ │          │ │          │                │          │  │
│  │ status   │ │ status   │ │ status   │                │ status   │  │
│  │ req_id   │ │ req_id   │ │ req_id   │                │ req_id   │  │
│  │ method   │ │ method   │ │ method   │                │ method   │  │
│  │ meta_ptr │ │ meta_ptr │ │ meta_ptr │                │ meta_ptr │  │
│  │ body_ptr │ │ body_ptr │ │ body_ptr │                │ body_ptr │  │
│  │ resp_ptr │ │ resp_ptr │ │ resp_ptr │                │ resp_ptr │  │
│  └──────────┘ └──────────┘ └──────────┘                └──────────┘  │
│                                                                      │
│  ┌────────────────────────────────────────────────────────────────┐  │
│  │ Inline Heap (~1MB - header - slots)                            │  │
│  │ Bump-allocated. Metadata and small bodies written here.        │  │
│  │ CAS on alloc_head for thread-safe allocation.                  │  │
│  │ Auto-reset when all slots are Free.                            │  │
│  └────────────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────────┐
│  Overflow Regions (temporary, per-slot, created on demand)           │
│    slotbus-{name}-req-{slot}  large request bodies                   │
│    slotbus-{name}-rsp-{slot}  large response bodies                  │
└──────────────────────────────────────────────────────────────────────┘
Signaling (zero-polling cross-process wakeup):
┌─────────────────────────────────────────────┐
│  slotbus-{name}-req  ──►  wake worker       │  Hub signals after writing Ready slot
│  slotbus-{name}-rsp  ──►  wake hub          │  Worker signals after writing Done slot
└─────────────────────────────────────────────┘
Windows: Named Events (CreateEventW / SetEvent / WaitForSingleObject)
Linux:   POSIX named semaphores
Slot State Machine
Each slot transitions through four states using AtomicU32 compare-and-swap:
   Hub writes request   Worker CAS       Worker writes response  Hub CAS
   into slot            Ready → Claimed  into slot               Done → Free
            │                  │                    │                 │
            ▼                  ▼                    ▼                 ▼
┌──────┐          ┌───────┐          ┌─────────┐          ┌──────┐          ┌──────┐
│ Free │ ───────► │ Ready │ ───────► │ Claimed │ ───────► │ Done │ ───────► │ Free │
└──────┘          └───────┘          └─────────┘          └──────┘          └──────┘
   0                  1                   2                  3                 0
Free (0) — Slot is available for a new request
Ready (1) — Hub has written request data; worker may claim it
Claimed (2) — Worker is processing the request
Done (3) — Worker has written response data; hub may read it
No mutexes. No spinlocks. Just atomic CAS with Acquire/Release ordering to ensure memory visibility across processes.
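A minimal sketch of these transitions using only the standard library. The constant names and the `try_transition` and `full_cycle` helpers are illustrative assumptions, not slotbus's actual API, and a real slot's status word would live in the shared memory region rather than in process-local memory.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// Status values as listed above. The names are illustrative.
pub const FREE: u32 = 0;
pub const READY: u32 = 1;
pub const CLAIMED: u32 = 2;
pub const DONE: u32 = 3;

/// Try to move a slot's status from `from` to `to`.
/// Returns true only for the single caller that wins the CAS.
pub fn try_transition(status: &AtomicU32, from: u32, to: u32) -> bool {
    status
        .compare_exchange(from, to, Ordering::AcqRel, Ordering::Acquire)
        .is_ok()
}

/// Walk one slot through a full cycle; true if every step succeeded.
pub fn full_cycle(status: &AtomicU32) -> bool {
    try_transition(status, FREE, READY)           // hub publishes a request
        && try_transition(status, READY, CLAIMED) // worker claims it
        && try_transition(status, CLAIMED, DONE)  // worker publishes the response
        && try_transition(status, DONE, FREE)     // hub releases the slot
}
```

Because a CAS only succeeds when the status still holds the expected value, two workers racing for the same Ready slot cannot both claim it: the loser sees a failed exchange and moves on to the next slot.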
Request Lifecycle
- Hub finds a free slot (linear scan, typically slot 0 or 1 for low-traffic workloads).
- Hub bump-allocates space on the inline heap for the serialized RequestMeta (path, headers, query params) and the request body.
- Hub writes metadata pointers and body pointers into the slot's fixed-size fields (128 bytes per slot).
- Hub atomically sets the slot status from Free to Ready with Release ordering.
- Hub signals the request event; the worker wakes up in under 1 microsecond.
- Worker scans slots, finds the Ready one, and CAS transitions it to Claimed.
- Worker reads request data from the heap using the slot's offset/length pointers.
- Worker processes the request, writes the response into the heap (or an overflow region), and sets the slot to Done.
- Worker signals the response event; the hub wakes up.
- Hub reads the response, resolves the oneshot channel, and CAS transitions the slot back to Free.
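The lifecycle can be walked through in a single process as a rough sketch. Everything here is hypothetical: the `Slot` struct, the `hub_dispatch`, `worker_poll`, and `hub_collect` functions, and the use of a plain `Vec<u8>` as the heap are stand-ins for illustration. Event signaling, RequestMeta serialization, overflow regions, and the oneshot channel are left out, and a single hub thread is assumed.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

const FREE: u32 = 0;
const READY: u32 = 1;
const CLAIMED: u32 = 2;
const DONE: u32 = 3;

/// Hypothetical slot: a status word plus (offset, length) pairs into the heap.
#[derive(Default)]
pub struct Slot {
    status: AtomicU32,
    body: (AtomicU32, AtomicU32), // request body location
    resp: (AtomicU32, AtomicU32), // response location
}

/// Append bytes to the heap (bump style) and record where they landed.
fn put(heap: &mut Vec<u8>, ptr: &(AtomicU32, AtomicU32), bytes: &[u8]) {
    ptr.0.store(heap.len() as u32, Ordering::Relaxed);
    ptr.1.store(bytes.len() as u32, Ordering::Relaxed);
    heap.extend_from_slice(bytes);
}

/// Copy out the bytes an (offset, length) pair points at.
fn get(heap: &[u8], ptr: &(AtomicU32, AtomicU32)) -> Vec<u8> {
    let off = ptr.0.load(Ordering::Relaxed) as usize;
    let len = ptr.1.load(Ordering::Relaxed) as usize;
    heap[off..off + len].to_vec()
}

/// Hub: linear scan for a Free slot, write the request, publish it as Ready.
pub fn hub_dispatch(slots: &[Slot], heap: &mut Vec<u8>, body: &[u8]) -> Option<usize> {
    let idx = slots
        .iter()
        .position(|s| s.status.load(Ordering::Acquire) == FREE)?;
    put(heap, &slots[idx].body, body);
    slots[idx].status.store(READY, Ordering::Release);
    Some(idx)
}

/// Worker: CAS the first Ready slot to Claimed, handle it, publish Done.
pub fn worker_poll(
    slots: &[Slot],
    heap: &mut Vec<u8>,
    handler: impl Fn(&[u8]) -> Vec<u8>,
) -> Option<usize> {
    let idx = slots.iter().position(|s| {
        s.status
            .compare_exchange(READY, CLAIMED, Ordering::AcqRel, Ordering::Acquire)
            .is_ok()
    })?;
    let request = get(heap, &slots[idx].body);
    let response = handler(&request);
    put(heap, &slots[idx].resp, &response);
    slots[idx].status.store(DONE, Ordering::Release);
    Some(idx)
}

/// Hub: read the response from a Done slot and CAS it back to Free.
pub fn hub_collect(slots: &[Slot], heap: &[u8], idx: usize) -> Option<Vec<u8>> {
    let slot = slots.get(idx)?;
    if slot.status.load(Ordering::Acquire) != DONE {
        return None;
    }
    let response = get(heap, &slot.resp);
    slot.status
        .compare_exchange(DONE, FREE, Ordering::AcqRel, Ordering::Acquire)
        .ok()?;
    Some(response)
}
```

The Release store that publishes Ready (or Done) pairs with the Acquire load or CAS on the other side, which is what makes the offset and length fields written just before it visible to the reader.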
Inline Heap + Overflow
The control region contains a bump-allocated heap after the header and slots. Small payloads (metadata, typical JSON bodies) are written inline — no extra allocations, no extra shared memory regions.
Payloads that do not fit in the remaining heap space spill to overflow regions: temporary named shared memory mappings (slotbus-{name}-req-{slot} or slotbus-{name}-rsp-{slot}). These are created on demand and kept alive until the slot is freed.
The heap resets automatically when all slots return to Free — no fragmentation, no GC.
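A minimal sketch of a CAS-based bump allocator with this reset rule. The `BumpHeap` type and its methods are assumptions made for illustration and are not taken from the slotbus source; only offsets are tracked here, not the backing bytes.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical bump allocator over a heap of `heap_size` bytes.
pub struct BumpHeap {
    alloc_head: AtomicU32, // offset of the next free byte
    heap_size: u32,
}

impl BumpHeap {
    pub fn new(heap_size: u32) -> Self {
        Self { alloc_head: AtomicU32::new(0), heap_size }
    }

    /// Reserve `len` bytes and return their offset.
    /// Returns None when the request does not fit; a caller would then
    /// fall back to an overflow region.
    pub fn alloc(&self, len: u32) -> Option<u32> {
        let mut head = self.alloc_head.load(Ordering::Relaxed);
        loop {
            let end = head.checked_add(len)?;
            if end > self.heap_size {
                return None;
            }
            // CAS so that concurrent allocators never hand out the same range.
            match self.alloc_head.compare_exchange_weak(
                head,
                end,
                Ordering::AcqRel,
                Ordering::Relaxed,
            ) {
                Ok(_) => return Some(head),
                Err(current) => head = current,
            }
        }
    }

    /// Rewind the heap when every slot status is Free (0).
    /// Simplification: a real implementation must also make sure no
    /// dispatch is racing with the reset.
    pub fn reset_if_idle(&self, slot_status: &[AtomicU32]) -> bool {
        let all_free = slot_status
            .iter()
            .all(|s| s.load(Ordering::Acquire) == 0);
        if all_free {
            self.alloc_head.store(0, Ordering::Release);
        }
        all_free
    }
}
```

Bump allocation never frees individual ranges, which is why the reset is tied to all slots being Free: at that point nothing can still point into the heap.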
Serialization
Metadata is serialized with postcard, a compact binary format. Request and response bodies are raw bytes — slotbus does not impose a serialization format on your payloads.
Event Signaling
Each worker has two OS-level named events:
- Request event (slotbus-{name}-req): hub signals after writing a Ready slot
- Response event (slotbus-{name}-rsp): worker signals after writing a Done slot
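On Windows, events are auto-reset: a single WaitForSingleObject call blocks until signaled, then automatically resets. On Linux, sem_timedwait on the named semaphore plays the same role. Neither path needs polling loops, busy-waiting, or timer ticks; macOS is the exception and polls with sem_trywait at roughly 1 ms resolution. The 5-second timeout in the wait call is a safety fallback; under normal operation, the event fires in under 1 microsecond.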
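To illustrate the auto-reset semantics, here is an in-process analogue built on `Mutex` and `Condvar` from the standard library. It is only a sketch of the behavior: slotbus itself signals across processes with OS named events or semaphores, and the `AutoResetEvent` type is a hypothetical name.

```rust
use std::sync::{Condvar, Mutex};
use std::time::Duration;

/// In-process stand-in for an auto-reset event.
pub struct AutoResetEvent {
    signaled: Mutex<bool>,
    cv: Condvar,
}

impl AutoResetEvent {
    pub fn new() -> Self {
        Self { signaled: Mutex::new(false), cv: Condvar::new() }
    }

    /// Signal the event and wake one waiter.
    pub fn set(&self) {
        *self.signaled.lock().unwrap() = true;
        self.cv.notify_one();
    }

    /// Block until signaled or until `timeout` elapses.
    /// Returns true if the event was signaled; the flag is cleared on the
    /// way out, so the next wait blocks again (the "auto-reset" part).
    pub fn wait(&self, timeout: Duration) -> bool {
        let guard = self.signaled.lock().unwrap();
        let (mut guard, _timed_out) = self
            .cv
            .wait_timeout_while(guard, timeout, |signaled| !*signaled)
            .unwrap();
        let was_signaled = *guard;
        *guard = false;
        was_signaled
    }
}
```

A false return corresponds to the timeout path, which the text above describes as a safety fallback rather than the normal wake mechanism.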
Configuration
All options are set through the builder:
let config = builder
.name // Required. OS identifier for the SHM region.
.prefix // Prefix for all OS names. Default: "slotbus".
.num_slots // Concurrent request slots. Default: 32. Range: 1-256.
.region_size // Control region size in bytes. Default: 1MB.
.wait_timeout_ms // Event wait timeout (safety fallback). Default: 5000ms.
.instrumentation // Latency logging. Default: false.
.build;
| Option | Default | Description |
|---|---|---|
| name | required | Worker name. Used to derive all OS-level identifiers: {prefix}-{name} for the SHM region, {prefix}-{name}-req / {prefix}-{name}-rsp for events. |
| prefix | "slotbus" | Namespace prefix. Change this to run multiple independent slotbus instances on the same machine. |
| num_slots | 32 | Number of concurrent request/response slots. Each slot is 128 bytes of fixed metadata. More slots = more concurrency, but the heap shrinks. Clamped to 1-256. |
| region_size | 1,048,576 | Total size of the control region in bytes. Must fit the header (64B) + slots (128B each) + heap. With 32 slots, the heap gets 1,044,416 bytes. |
| wait_timeout_ms | 5,000 | Maximum time (ms) to block on an event wait. Safety fallback only; the event signal provides sub-microsecond wakeup. |
| instrumentation | false | When enabled, logs timing data for slot claims, round-trips, and heap allocations via tracing. |
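The trade-off between num_slots and heap size follows directly from the layout. The sketch below reproduces the arithmetic using the 64-byte header and 128-byte slots stated above; the constant and function names are illustrative assumptions, not part of the slotbus API.

```rust
pub const HEADER_SIZE: usize = 64; // header size from the layout above
pub const SLOT_SIZE: usize = 128;  // fixed metadata per slot

/// Clamp a requested slot count to the documented 1-256 range.
pub fn clamp_slots(requested: usize) -> usize {
    requested.clamp(1, 256)
}

/// Bytes left for the inline heap, or None if the region cannot even
/// hold the header and the slot array.
pub fn heap_size(region_size: usize, num_slots: usize) -> Option<usize> {
    let fixed = HEADER_SIZE + clamp_slots(num_slots) * SLOT_SIZE;
    region_size.checked_sub(fixed)
}
```

With the defaults this gives 1,048,576 - 64 - 32 × 128 = 1,044,416 bytes of heap; raising num_slots to the 256 maximum leaves 1,015,744 bytes.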
Derived OS Names
For a worker named "my-worker" with the default prefix:
| Resource | OS Name |
|---|---|
| Control region | slotbus-my-worker |
| Request event | slotbus-my-worker-req |
| Response event | slotbus-my-worker-rsp |
| Request overflow (slot 5) | slotbus-my-worker-req-5 |
| Response overflow (slot 5) | slotbus-my-worker-rsp-5 |
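The naming scheme in the table can be expressed as a small helper. This is a sketch of the pattern only; `OsNames`, `Direction`, `os_names`, and `overflow_name` are hypothetical names, not functions exported by slotbus.

```rust
/// OS-level names derived from a prefix and a worker name.
#[derive(Debug, PartialEq, Eq)]
pub struct OsNames {
    pub control_region: String,
    pub request_event: String,
    pub response_event: String,
}

/// Which side of the exchange an overflow region belongs to.
pub enum Direction {
    Request,
    Response,
}

/// Build the control region and event names: {prefix}-{name}[-req|-rsp].
pub fn os_names(prefix: &str, name: &str) -> OsNames {
    let base = format!("{prefix}-{name}");
    OsNames {
        request_event: format!("{base}-req"),
        response_event: format!("{base}-rsp"),
        control_region: base,
    }
}

/// Build a per-slot overflow region name: {prefix}-{name}-{req|rsp}-{slot}.
pub fn overflow_name(prefix: &str, name: &str, dir: Direction, slot: usize) -> String {
    let tag = match dir {
        Direction::Request => "req",
        Direction::Response => "rsp",
    };
    format!("{prefix}-{name}-{tag}-{slot}")
}
```

Calling `os_names("slotbus", "my-worker")` and `overflow_name("slotbus", "my-worker", Direction::Request, 5)` yields the same strings as the rows in the table.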
Contributing
Contributions are welcome. Please open an issue first to discuss what you'd like to change.
# Clone and build
# Run tests
# Run with instrumentation logging
RUST_LOG=slotbus=trace
The codebase is small by design. The core transport is under 1,000 lines.
Minimum Supported Rust Version
1.75
License
Licensed under the MIT License.