slotbus

Lock-free shared memory IPC for Rust. Sub-microsecond wake latency. Sub-millisecond round trips. Zero-copy payloads. Drop-in replacement for localhost HTTP in same-machine architectures.

Why slotbus?

If your services run on the same machine, you're paying for overhead you don't need. HTTP over localhost adds 5-20 ms per round trip in socket copies, HTTP parsing, and serialization. Unix sockets are better but still kernel-mediated. gRPC layers protobuf and HTTP/2 on top.

Slotbus eliminates all of it. Processes read from and write to shared memory pages directly, with OS-level event signaling for wakeups. The result:

Metric	slotbus	HTTP localhost	Unix socket	gRPC	shmem-ipc (ring)
Wake latency	0-1 us	~50 us	~20 us	~100 us	~1-5 us
Round-trip (GET, small)	0.1-0.4 ms	5-15 ms	1-3 ms	2-5 ms	N/A (stream)
Round-trip (POST, body)	0.7-0.8 ms	8-20 ms	2-5 ms	3-8 ms	N/A (stream)
Concurrent in-flight	32 slots	unlimited	unlimited	unlimited	1 (SPSC)
Serialization overhead	postcard (binary)	JSON + HTTP framing	protocol-dependent	protobuf + HTTP/2	raw bytes
CPU while idle	0% (event wait)	0% (epoll/IOCP)	0% (epoll/IOCP)	0% (epoll/IOCP)	polling or futex

Measured on Windows 11, AMD Ryzen 9, DDR5.

10-50x faster than HTTP localhost for request/response workloads.

Use Cases

Microservice communication on a single host — replace localhost HTTP between co-located services with shared memory IPC
Sidecar architectures — connect main processes to sidecars (auth, logging, metrics) without network overhead
Plugin systems — let plugins run in separate processes with near-zero communication cost
AI/ML inference — dispatch requests to GPU worker processes with minimal latency
Game servers — fast IPC between game logic, physics, and networking processes
Desktop applications — communicate between a UI process and background workers

Quick Start

Add slotbus to your Cargo.toml:

[dependencies]
slotbus = "0.1"

Hub side — create a bus and dispatch requests

use slotbus::{SlotBus, SlotBusConfig};

// Create a shared memory bus for a worker named "my-worker"
let config = SlotBusConfig::builder()
    .name("my-worker")
    .num_slots(32)          // 32 concurrent in-flight requests
    .region_size(1_048_576) // 1MB control region
    .build();

let bus = SlotBus::create(config)?;

// Start the response watcher (background thread)
bus.start_response_watcher();

// Dispatch a request — returns a oneshot receiver
let response = bus.dispatch_request(
    "req-001",          // request ID
    "GET",              // HTTP method
    "/api/status",      // path
    &[],                // request body
)?;

// Wait for the worker's response
let resp = response.await?;
println!("Status: {}, Body: {} bytes", resp.status, resp.body.len());

Worker side — connect and handle requests

use slotbus::SlotWorker;

// Open the shared memory region created by the hub
let worker = SlotWorker::open("my-worker", Default::default())?;

// Start the receive loop (runs on a dedicated OS thread)
worker.start_receive_loop(|transport, slot_index, request| {
    println!("{} {}", request.method, request.path);

    // Process the request...
    let response_body = b"OK";

    // Write response back through shared memory
    transport.send_response(
        slot_index,
        200,                            // HTTP status
        response_body.to_vec(),         // body
        "text/plain",                   // content-type
        vec![],                         // extra headers
    ).unwrap();
});

slotbus-hub

Need an HTTP gateway? slotbus-hub is a standalone HTTP-to-shared-memory router. Workers register routes via HTTP; clients send normal HTTP requests; the hub dispatches them through shared memory with sub-millisecond round trips.

Install it separately: cargo install slotbus-hub — see the slotbus-hub repo for full documentation.

Comparison

Feature	slotbus	Unix socket	HTTP localhost	gRPC	shmem-ipc	iceoryx2
Topology	Hub/worker (req/rsp)	Point-to-point	Client/server	Client/server	Point-to-point	Pub/sub
Latency	0.1-0.8 ms RTT	1-5 ms RTT	5-20 ms RTT	2-8 ms RTT	<0.1 ms (stream)	<0.1 ms
Wake mechanism	Named events	epoll/IOCP	epoll/IOCP	epoll/IOCP	futex/polling	waitset
Concurrency	32 slots (configurable)	unlimited	unlimited	unlimited	1 (SPSC)	per-publisher
Request/response	Native	Manual	Native	Native	Manual	No (pub/sub)
Zero-copy reads	Inline heap	No	No	No	Yes	Yes
Serialization	postcard (meta only)	user choice	HTTP + JSON	protobuf	raw bytes	raw bytes
HTTP bridge	slotbus-hub binary	manual	native	grpc-web	manual	no
Route registration	Dynamic (runtime)	N/A	framework	protobuf schema	N/A	topic-based
Overflow handling	Auto spillover regions	N/A	chunked transfer	streaming	fixed buffer	loan mechanism
Windows	Yes	Partial	Yes	Yes	No	Yes
Linux	Yes	Yes	Yes	Yes	Yes	Yes
macOS	Yes	Yes	Yes	Yes	Yes	Yes

When to use slotbus:

You need request/response semantics (not streaming or pub/sub)
Your processes are on the same machine
Latency matters — you want sub-millisecond round-trips
You want an HTTP-compatible interface without HTTP overhead (via slotbus-hub)
You have multiple workers behind a single entry point

When to use something else:

Cross-machine communication (use gRPC or HTTP)
Pure streaming / pub-sub (use iceoryx2 or ZeroMQ)
Single-producer single-consumer with maximum throughput (use shmem-ipc ring buffers)

Platform Support

Platform	Status	Signaling mechanism
Windows	Supported	Named Events (`CreateEventW` / `SetEvent` / `WaitForSingleObject`)
Linux	Supported	POSIX named semaphores (`sem_open` / `sem_post` / `sem_timedwait`)
macOS	Supported	POSIX named semaphores (`sem_open` / `sem_post` / `sem_trywait` polling)

The shared memory layer uses the shared_memory crate, which supports all three platforms. The signaling layer uses platform-native primitives for sub-microsecond wake latency on Windows and Linux. macOS uses a polling fallback (~1ms resolution) since sem_timedwait is not available.

Shared Memory Layout

                         Hub Process                              Worker Process
                    ┌─────────────────────┐                  ┌─────────────────────┐
                    │     SlotBus          │                  │    SlotWorker        │
                    │  (hub-side handle)   │                  │  (worker-side handle)│
                    └────────┬────────────┘                  └────────┬─────────────┘
                             │                                        │
                   dispatch_request()                       start_receive_loop()
                             │                                        │
          ┌──────────────────▼────────────────────────────────────────▼───────────┐
          │                    Shared Memory Control Region (1 MB)                │
          │                                                                      │
          │  ┌────────────────────────────────────────────────────────────────┐   │
          │  │  Header (64 bytes)                                            │   │
          │  │  magic: 0x48554231 | version: 1 | num_slots: 32              │   │
          │  │  heap_offset | heap_size | alloc_head (AtomicU32)             │   │
          │  └────────────────────────────────────────────────────────────────┘   │
          │                                                                      │
          │  ┌──────────┐ ┌──────────┐ ┌──────────┐         ┌──────────┐        │
          │  │  Slot 0  │ │  Slot 1  │ │  Slot 2  │  . . .  │  Slot 31 │        │
          │  │ 128 bytes│ │ 128 bytes│ │ 128 bytes│         │ 128 bytes│        │
          │  │          │ │          │ │          │         │          │        │
          │  │ status   │ │ status   │ │ status   │         │ status   │        │
          │  │ req_id   │ │ req_id   │ │ req_id   │         │ req_id   │        │
          │  │ method   │ │ method   │ │ method   │         │ method   │        │
          │  │ meta_ptr │ │ meta_ptr │ │ meta_ptr │         │ meta_ptr │        │
          │  │ body_ptr │ │ body_ptr │ │ body_ptr │         │ body_ptr │        │
          │  │ resp_ptr │ │ resp_ptr │ │ resp_ptr │         │ resp_ptr │        │
          │  └──────────┘ └──────────┘ └──────────┘         └──────────┘        │
          │                                                                      │
          │  ┌────────────────────────────────────────────────────────────────┐   │
          │  │  Inline Heap (~1MB - header - slots)                          │   │
          │  │  Bump-allocated. Metadata and small bodies written here.      │   │
          │  │  CAS on alloc_head for thread-safe allocation.                │   │
          │  │  Auto-reset when all slots are Free.                          │   │
          │  └────────────────────────────────────────────────────────────────┘   │
          └──────────────────────────────────────────────────────────────────────┘

          ┌──────────────────────────────────────────────────────────────────┐
          │  Overflow Regions (temporary, per-slot, created on demand)       │
          │  slotbus-{name}-req-{slot}  — large request bodies              │
          │  slotbus-{name}-rsp-{slot}  — large response bodies             │
          └──────────────────────────────────────────────────────────────────┘

          Signaling (zero-polling cross-process wakeup):
          ┌─────────────────────────────────────────────┐
          │  slotbus-{name}-req  ──►  wake worker       │   Hub signals after writing Ready slot
          │  slotbus-{name}-rsp  ──►  wake hub          │   Worker signals after writing Done slot
          └─────────────────────────────────────────────┘
          Windows: Named Events (CreateEventW / SetEvent / WaitForSingleObject)
          Linux:   POSIX named semaphores

Slot State Machine

Each slot transitions through four states using AtomicU32 compare-and-swap:

    Hub writes request           Worker CAS            Worker writes response       Hub CAS
         into slot              Ready → Claimed             into slot             Done → Free
            │                       │                          │                      │
            ▼                       ▼                          ▼                      ▼
  ┌──────┐      ┌───────┐      ┌─────────┐      ┌──────┐      ┌──────┐
  │ Free │ ───► │ Ready │ ───► │ Claimed │ ───► │ Done │ ───► │ Free │
  └──────┘      └───────┘      └─────────┘      └──────┘      └──────┘
     0              1               2               3             0

  Free (0)    — Slot is available for a new request
  Ready (1)   — Hub has written request data; worker may claim it
  Claimed (2) — Worker is processing the request
  Done (3)    — Worker has written response data; hub may read it

No mutexes. No spinlocks. Just atomic CAS with Acquire/Release ordering to ensure memory visibility across processes.
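
As an illustration of the state machine above, here is a minimal single-process sketch using only the standard library. The constants mirror the status values in the table; the `transition` helper is a hypothetical name for this example and is not taken from the slotbus source. The real slots live in shared memory, which this sketch does not model.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// Status values as listed in the state table above.
const FREE: u32 = 0;
const READY: u32 = 1;
const CLAIMED: u32 = 2;
const DONE: u32 = 3;

/// Hypothetical helper: attempt one transition with compare-and-swap.
/// Returns true only if the slot was in `from` at the moment of the swap.
fn transition(status: &AtomicU32, from: u32, to: u32) -> bool {
    status
        .compare_exchange(from, to, Ordering::AcqRel, Ordering::Acquire)
        .is_ok()
}

fn main() {
    let status = AtomicU32::new(FREE);

    assert!(transition(&status, FREE, READY)); // hub publishes a request
    assert!(transition(&status, READY, CLAIMED)); // worker claims it
    assert!(!transition(&status, READY, CLAIMED)); // a second claim loses the race
    assert!(transition(&status, CLAIMED, DONE)); // worker publishes the response
    assert!(transition(&status, DONE, FREE)); // hub releases the slot
    assert_eq!(status.load(Ordering::Acquire), FREE);
}
```

Because a failed compare-and-swap leaves the status untouched, two workers racing for the same Ready slot cannot both claim it.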

Request Lifecycle

Hub finds a free slot (linear scan, typically slot 0 or 1 for low-traffic workloads).
Hub bump-allocates space on the inline heap for serialized RequestMeta (path, headers, query params) and the request body.
Hub writes metadata pointers and body pointers into the slot's fixed-size fields (128 bytes per slot).
Hub atomically sets the slot status from Free to Ready with Release ordering.
Hub signals the request event — the worker wakes up in under 1 microsecond.
Worker scans slots, finds the Ready one, and CAS transitions it to Claimed.
Worker reads request data from the heap using the slot's offset/length pointers.
Worker processes the request, writes the response into the heap (or an overflow region), and sets the slot to Done.
Worker signals the response event — the hub wakes up.
Hub reads the response, resolves the oneshot channel, and CAS transitions the slot back to Free.
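
The steps above can be sketched in a single process with two threads. This is a simplified model under several assumptions, not the slotbus implementation: the `Slot` struct, `hub_dispatch`, `worker_poll`, and `hub_collect` are hypothetical names, a single `u32` stands in for the heap pointers, there is only one hub thread, and thread spawn/join stands in for the request and response events.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;

const FREE: u32 = 0;
const READY: u32 = 1;
const CLAIMED: u32 = 2;
const DONE: u32 = 3;

// Hypothetical, simplified slot: plain integers replace the heap pointers.
struct Slot {
    status: AtomicU32,
    request: AtomicU32,  // stand-in for meta_ptr / body_ptr
    response: AtomicU32, // stand-in for resp_ptr
}

fn new_slots(n: usize) -> Arc<Vec<Slot>> {
    let make = |_| Slot {
        status: AtomicU32::new(FREE),
        request: AtomicU32::new(0),
        response: AtomicU32::new(0),
    };
    Arc::new((0..n).map(make).collect())
}

/// Hub: linear scan for a Free slot, write the request, publish it as Ready.
/// Assumes a single hub thread, so a plain store is enough here.
fn hub_dispatch(slots: &[Slot], request: u32) -> Option<usize> {
    for (i, slot) in slots.iter().enumerate() {
        if slot.status.load(Ordering::Acquire) == FREE {
            slot.request.store(request, Ordering::Relaxed);
            slot.status.store(READY, Ordering::Release);
            return Some(i);
        }
    }
    None
}

/// Worker: claim the first Ready slot via CAS, run the handler, publish Done.
fn worker_poll(slots: &[Slot], handler: impl Fn(u32) -> u32) -> Option<usize> {
    for (i, slot) in slots.iter().enumerate() {
        let claimed = slot
            .status
            .compare_exchange(READY, CLAIMED, Ordering::AcqRel, Ordering::Acquire)
            .is_ok();
        if claimed {
            let request = slot.request.load(Ordering::Relaxed);
            slot.response.store(handler(request), Ordering::Relaxed);
            slot.status.store(DONE, Ordering::Release);
            return Some(i);
        }
    }
    None
}

/// Hub: read the response from a Done slot and return the slot to Free.
fn hub_collect(slots: &[Slot], index: usize) -> Option<u32> {
    let slot = &slots[index];
    if slot.status.load(Ordering::Acquire) != DONE {
        return None;
    }
    let response = slot.response.load(Ordering::Relaxed);
    slot.status
        .compare_exchange(DONE, FREE, Ordering::AcqRel, Ordering::Acquire)
        .ok()?;
    Some(response)
}

fn main() {
    let slots = new_slots(4);
    let index = hub_dispatch(&slots, 21).expect("a free slot");

    // Spawning the thread plays the role of the request event.
    let worker_slots = Arc::clone(&slots);
    let worker = thread::spawn(move || worker_poll(&worker_slots, |x| x * 2));

    // Joining the thread plays the role of the response event.
    worker.join().unwrap();
    let response = hub_collect(&slots, index).expect("slot is Done");
    println!("slot {index} -> response {response}");
}
```

The Release store on `status` paired with the Acquire load or CAS on the other side is what makes the request and response fields visible to the reader, which is the same ordering argument the lifecycle relies on.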

Inline Heap + Overflow

The control region contains a bump-allocated heap after the header and slots. Small payloads (metadata, typical JSON bodies) are written inline — no extra allocations, no extra shared memory regions.

Large payloads that do not fit in the heap spill to overflow regions: temporary named shared memory mappings (slotbus-{name}-req-{slot} or slotbus-{name}-rsp-{slot}). These are created on demand and kept alive until the slot is freed.

The heap resets automatically when all slots return to Free — no fragmentation, no GC.
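
To make the allocation strategy concrete, here is a minimal sketch of a bump allocator driven by CAS on an `alloc_head` counter. It assumes offsets are plain byte counts with no alignment padding, and `bump_alloc` and `reset` are hypothetical names; the actual slotbus allocator may differ in both respects.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Reserve `len` bytes from a heap of `heap_size` bytes.
/// Returns the offset of the reservation, or None when it does not fit
/// (the point at which a caller would fall back to an overflow region).
fn bump_alloc(alloc_head: &AtomicU32, heap_size: u32, len: u32) -> Option<u32> {
    let mut head = alloc_head.load(Ordering::Relaxed);
    loop {
        let end = head.checked_add(len)?;
        if end > heap_size {
            return None;
        }
        // Retry with the freshly observed head if another thread moved it.
        match alloc_head.compare_exchange_weak(head, end, Ordering::AcqRel, Ordering::Relaxed) {
            Ok(_) => return Some(head),
            Err(current) => head = current,
        }
    }
}

/// Rewind the heap; only valid once every slot is Free again.
fn reset(alloc_head: &AtomicU32) {
    alloc_head.store(0, Ordering::Release);
}

fn main() {
    let alloc_head = AtomicU32::new(0);
    let heap_size = 1024;

    let meta = bump_alloc(&alloc_head, heap_size, 100);
    let body = bump_alloc(&alloc_head, heap_size, 900);
    let too_big = bump_alloc(&alloc_head, heap_size, 100);
    println!("meta={meta:?} body={body:?} too_big={too_big:?}");

    reset(&alloc_head);
}
```

A bump allocator never frees individual blocks, which is why resetting the head when all slots are Free is enough to reclaim everything at once.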

Serialization

Metadata is serialized with postcard, a compact binary format. Request and response bodies are raw bytes — slotbus does not impose a serialization format on your payloads.

Event Signaling

Each worker has two OS-level named events:

Request event (slotbus-{name}-req): hub signals after writing a Ready slot
Response event (slotbus-{name}-rsp): worker signals after writing a Done slot

Events are auto-reset: on Windows, a single WaitForSingleObject call blocks until signaled, then automatically resets. On Windows and Linux there are no polling loops, no busy-waiting, and no timer ticks; macOS uses the polling fallback described under Platform Support. The 5-second timeout in the wait call is a safety fallback. Under normal operation, the event fires in under 1 microsecond.

Configuration

All options are set through the builder:

let config = SlotBusConfig::builder()
    .name("my-worker")        // Required. OS identifier for the SHM region.
    .prefix("slotbus")        // Prefix for all OS names. Default: "slotbus".
    .num_slots(32)             // Concurrent request slots. Default: 32. Range: 1-256.
    .region_size(1_048_576)    // Control region size in bytes. Default: 1MB.
    .wait_timeout_ms(5_000)    // Event wait timeout (safety fallback). Default: 5000ms.
    .instrumentation(false)    // Latency logging. Default: false.
    .build();

Option	Default	Description
`name`	required	Worker name. Used to derive all OS-level identifiers: `{prefix}-{name}` for the SHM region, `{prefix}-{name}-req` / `{prefix}-{name}-rsp` for events.
`prefix`	`"slotbus"`	Namespace prefix. Change this to run multiple independent slotbus instances on the same machine.
`num_slots`	`32`	Number of concurrent request/response slots. Each slot is 128 bytes of fixed metadata. More slots = more concurrency, but the heap shrinks. Clamped to 1-256.
`region_size`	`1,048,576`	Total size of the control region in bytes. Must fit the header (64B) + slots (128B each) + heap. With 32 slots, the heap gets ~1,044,416 bytes.
`wait_timeout_ms`	`5,000`	Maximum time (ms) to block on an event wait. Safety fallback only — the event signal provides sub-microsecond wakeup.
`instrumentation`	`false`	When enabled, logs timing data for slot claims, round-trips, and heap allocations via `tracing`.
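
The heap figure in the `region_size` row follows from simple arithmetic, sketched below. The constants come from the table (64-byte header, 128 bytes per slot); the `heap_bytes` function is a hypothetical helper for this example and assumes no extra padding between sections.

```rust
const HEADER_BYTES: usize = 64;
const SLOT_BYTES: usize = 128;

/// Bytes left for the inline heap, or None if the region cannot even
/// hold the header and slots. `num_slots` is clamped to 1-256 as documented.
fn heap_bytes(region_size: usize, num_slots: usize) -> Option<usize> {
    let fixed = HEADER_BYTES + SLOT_BYTES * num_slots.clamp(1, 256);
    region_size.checked_sub(fixed)
}

fn main() {
    for slots in [1, 32, 256] {
        println!("{slots} slots -> {:?} heap bytes", heap_bytes(1_048_576, slots));
    }
}
```

With the default 1,048,576-byte region, 32 slots leave 1,044,416 bytes of heap, and the maximum of 256 slots leaves 1,015,744 bytes.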

Derived OS Names

For a worker named "my-worker" with the default prefix:

Resource	OS Name
Control region	`slotbus-my-worker`
Request event	`slotbus-my-worker-req`
Response event	`slotbus-my-worker-rsp`
Request overflow (slot 5)	`slotbus-my-worker-req-5`
Response overflow (slot 5)	`slotbus-my-worker-rsp-5`
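
The naming scheme in the table can be reproduced with plain string formatting, which is handy when inspecting leftover OS objects during debugging. The functions below are hypothetical helpers written for this example; they are not part of the slotbus API.

```rust
/// Names of the control region and the two events for one worker.
fn base_names(prefix: &str, name: &str) -> (String, String, String) {
    let region = format!("{prefix}-{name}");
    let req_event = format!("{region}-req");
    let rsp_event = format!("{region}-rsp");
    (region, req_event, rsp_event)
}

/// Names of the per-slot request and response overflow regions.
fn overflow_names(prefix: &str, name: &str, slot: usize) -> (String, String) {
    (
        format!("{prefix}-{name}-req-{slot}"),
        format!("{prefix}-{name}-rsp-{slot}"),
    )
}

fn main() {
    let (region, req_event, rsp_event) = base_names("slotbus", "my-worker");
    let (req_overflow, rsp_overflow) = overflow_names("slotbus", "my-worker", 5);
    println!("{region}\n{req_event}\n{rsp_event}\n{req_overflow}\n{rsp_overflow}");
}
```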

Contributing

Contributions are welcome. Please open an issue first to discuss what you'd like to change.

# Clone and build
git clone https://github.com/JustMaier/slotbus.git
cd slotbus
cargo build

# Run tests
cargo test

# Run with instrumentation logging
RUST_LOG=slotbus=trace cargo test

The codebase is small by design. The core transport is under 1,000 lines.

Minimum Supported Rust Version

1.75

License

Licensed under the MIT License.

slotbus 0.1.0