# GMF - Thread-Per-Core gRPC Server Framework
A high-performance, runtime-agnostic gRPC server framework for Rust using thread-per-core architecture.
GMF pins one event loop per physical CPU core with no work-stealing, no shared task queues, and no lock contention on the request path.
## Runtimes
| Runtime | Feature | Platforms | Backend |
|---|---|---|---|
| monoio (default) | `monoio-runtime` | Linux, macOS | io_uring / kqueue |
| glommio | `glommio-runtime` | Linux only | io_uring |
| tokio | `tokio-runtime` | All | epoll / kqueue |
## Quick Start
```toml
[dependencies]
gmf = "2.0.0"
```
```rust
use gmf::MonoioServer;

MonoioServer::builder()
    .addr("0.0.0.0:50051")
    .max_connections(1024)
    .num_cores(4)
    .build()
    .serve()
    .await?;
```
To use a different runtime:
```toml
[dependencies]
gmf = { version = "2.0.0", default-features = false, features = ["tokio-runtime"] }
```
## Graceful Shutdown
```rust
MonoioServer::builder()
    .addr("0.0.0.0:50051")
    .build()
    .serve_with_shutdown(signal)
    .await?;
```
Where `signal` is any `Future<Output = ()> + Send + 'static` (e.g. a ctrl-c handler).
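For illustration, any hand-rolled future meeting that bound qualifies. Below is a minimal std-only sketch; the `FlagSignal` type and `satisfies_bound` helper are made up for this example and are not part of GMF's API (in practice you would pass your runtime's ctrl-c future):

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::sync::atomic::{AtomicBool, Ordering};
use std::task::{Context, Poll};

/// Hypothetical shutdown signal: resolves once the flag is set.
struct FlagSignal(Arc<AtomicBool>);

impl Future for FlagSignal {
    type Output = ();
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0.load(Ordering::Acquire) {
            Poll::Ready(())
        } else {
            // Naive busy wake-up; a real signal would register the waker
            // with whatever eventually flips the flag.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// Compile-time check that a value satisfies the required bound.
fn satisfies_bound<F: Future<Output = ()> + Send + 'static>(_: F) -> bool {
    true
}

fn main() {
    let stop = Arc::new(AtomicBool::new(false));
    let signal = FlagSignal(stop.clone());
    assert!(satisfies_bound(signal));
    // Elsewhere, stop.store(true, Ordering::Release) would trigger shutdown.
}
```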
## How It Works
Each core runs an independent event loop with its own TCP listener and connection limiter. On Linux, the kernel distributes connections across cores via SO_REUSEPORT and each core gets its own io_uring instance (monoio/glommio) or epoll fd (tokio). No userspace load balancing, no shared task queues.
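The no-shared-queue property can be sketched with plain threads and channels. This is a toy model, not GMF code: round-robin stands in for the kernel's `SO_REUSEPORT` hashing, and the `distribute` function is invented for the example.

```rust
use std::sync::mpsc;
use std::thread;

/// Toy model: one private channel per "core", no shared queue, no stealing.
/// Returns the total number of connections handled across all cores.
fn distribute(cores: usize, conns: u32) -> usize {
    let (txs, handles): (Vec<_>, Vec<_>) = (0..cores)
        .map(|_| {
            let (tx, rx) = mpsc::channel::<u32>();
            // Each worker drains only its own queue, like an event loop
            // that owns its accepted connections outright.
            let handle = thread::spawn(move || rx.into_iter().count());
            (tx, handle)
        })
        .unzip();

    // Stand-in for the kernel's SO_REUSEPORT hashing: plain round-robin.
    for conn in 0..conns {
        txs[conn as usize % cores].send(conn).unwrap();
    }
    drop(txs); // close the channels so each worker's loop ends

    handles.into_iter().map(|h| h.join().unwrap()).sum()
}

fn main() {
    assert_eq!(distribute(4, 100), 100);
    println!("all connections handled without a shared queue");
}
```

Because no worker ever touches another worker's queue, there is nothing to lock on the hot path; GMF gets the same effect by letting the kernel route each accepted connection to exactly one core.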
## Documentation
| Document | Description |
|---|---|
| Architecture | Thread-per-core design, io_uring, CPU pinning, request lifecycle |
| Benchmarking | How to benchmark with ghz and criterion |
| Development | Building, testing, Docker setup |
| Platform Notes | Linux vs macOS differences, Docker IPv6 |
## Performance
GMF's architecture is designed to outperform work-stealing runtimes on dedicated multi-core Linux servers by eliminating shared task queues and lock contention, leveraging io_uring syscall batching (monoio/glommio runtimes), and maintaining CPU cache locality through core pinning. The advantage grows with core count and hardware isolation.
See Benchmarking for how to run your own benchmarks and Architecture for a deep dive into why thread-per-core scales better.
## License
Apache-2.0