§Technical
English | 中文
Eventp is a zero-overhead event dispatch mechanism with a clean, test-friendly
API. This page tells the story of how it gets there.
§1. From mio, to event-manager, to eventp
mio is a thin, cross-platform wrapper over
epoll/kqueue/IOCP. You ask it to watch fds, it tells you which ones are
ready, you match on a Token (a usize you picked) to figure out what to
do. It is essentially “raw epoll with a portable accent” — see
mio’s tcp_server example
for the flavor.
event-manager goes a
step further: it adds a real subscription layer. Each fd is owned by a
Subscriber object that knows how to handle its own events; the dispatch
table is mutable at runtime; new sources can be registered from inside a
handler. This is a much nicer programming model for large projects (think
rust-vmm), and the
basic example
shows the kind of code you actually want to write.
So far, so good. But there is a price.
§1.1 The price: three HashMap lookups per event
When a Subscriber’s handler fires, it usually wants to do two things:
- Read or write its own data (&mut self).
- Mutate the reactor: add a new connection, remove itself, change interest flags (&mut Reactor).
These two &muts overlap, because Subscriber lives inside Reactor. Rust
refuses to compile that. The straightforward workaround is to give up
co-locating them and shuffle ownership around. event-manager does exactly
that, with four HashMaps in a three-layer structure:
The two &muts now come from genuinely distinct objects, and the borrow
checker is appeased. The cost is three HashMap lookups per dispatched event.
It gets a little worse: the maps use std::collections::HashMap, whose
default hasher is SipHash 1-3 — a HashDoS-resistant hash, which is great for
HTTP headers, but our keys are small integer fds handed out by the kernel.
There is no attacker. We are paying for armor against a threat that does not
exist.
§1.2 The hidden bomb: fd reuse and ghost events
Routing events by RawFd makes a whole class of ABA bugs all too easy
to trip over. POSIX specifies that open(2), accept(2), socket(2),
pipe(2) and friends return the lowest-numbered fd not currently in use
by this process. So the moment a fd is closed, its integer is the
first candidate for the next fd you open. Reuse is the norm, not the
exception.
Consider the kind of sequence fd-keyed dispatch invites:
- Subscriber A holds fd = 7, registered into the reactor; the dispatch table has a row keyed by 7.
- A’s destructor (or some deeper chain it triggers) closes fd = 7 but forgets to unregister.
- The process later accepts a new connection. The kernel hands back the number 7 for it. The application registers it as subscriber B.
- epoll fires. The reactor looks up fd = 7, lands on A’s entry, and dispatches the event to a corpse.
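The sequence above can be reproduced in miniature with nothing but the standard library. This is a sketch, not event-manager's code: the `table` map and the `ghost_event_demo` name are hypothetical stand-ins for an fd-keyed dispatch table, and `/dev/null` stands in for a socket. It assumes a Linux host and that no other thread opens a file descriptor between the two `open` calls.

```rust
use std::collections::HashMap;
use std::fs::File;
use std::os::fd::{AsRawFd, RawFd};

/// Returns (fd held by A, fd handed to B, who the stale table routes B's fd to).
fn ghost_event_demo() -> std::io::Result<(RawFd, RawFd, &'static str)> {
    // Hypothetical fd-keyed dispatch table: RawFd -> subscriber name.
    let mut table: HashMap<RawFd, &'static str> = HashMap::new();

    let a = File::open("/dev/null")?;
    let fd_a = a.as_raw_fd();
    table.insert(fd_a, "A");

    // A's destructor closes the fd, but nobody removes the table row.
    drop(a);

    // The kernel hands out the lowest free number, which is A's old fd.
    let b = File::open("/dev/null")?;
    let fd_b = b.as_raw_fd();

    // An event for B's fd is looked up in the stale table.
    let routed_to = table.get(&fd_b).copied().unwrap_or("nobody");
    Ok((fd_a, fd_b, routed_to))
}
```

On a single-threaded run the two fd numbers come out equal, and the lookup for B's descriptor lands on A's row.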
This class of bug — let us call it the ghost event — has three flavors of nasty:
- Silent. The compiler can’t see it. Unit tests almost never reproduce it. It only shows up in production, on a busy day, with a postmortem.
- It crosses ownership boundaries. Even after A’s storage has been freed and recycled for something else, the stale RawFd → subscriber id row still exists. Events get routed to whoever happens to occupy that memory now. Have fun.
- It’s not really the user’s fault. The API shape encourages “close-then-remove” ordering, especially when close is invoked deep inside a Drop chain. Pushing this invariant onto users is a design smell.
§1.3 The eventp insight
epoll_ctl(2) lets you attach an arbitrary 8-byte payload (epoll_data_t)
to every registered fd. When the event fires, epoll_wait(2) hands the same
payload back. Semantically, it’s a free-form “context pointer” slot —
that’s literally how the man page recommends using it.
So: put the heap address of the handler object in there. When the event
fires, we transmute the u64 back to a pointer, do one virtual call, and
we’re in user code. No hashing. No lookup. One callq. Done.
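To make the idea concrete, here is a minimal sketch of the round trip through a `u64`. It deliberately cheats: instead of eventp's thin pointer (built in §2), it double-boxes the trait object so that the outer pointer is one word. The `Handler`, `Counter`, `into_user_data`, `dispatch` and `release` names are hypothetical, and no epoll call is made; the `u64` simply plays the role of `epoll_data_t`.

```rust
trait Handler {
    fn handle(&mut self) -> u32;
}

struct Counter(u32);

impl Handler for Counter {
    fn handle(&mut self) -> u32 {
        self.0 += 1;
        self.0
    }
}

/// Leak the handler and return its heap address as the 8-byte payload.
fn into_user_data(handler: Box<dyn Handler>) -> u64 {
    // The outer Box is a thin pointer to the (fat) inner Box.
    let thin: *mut Box<dyn Handler> = Box::into_raw(Box::new(handler));
    thin as usize as u64
}

/// SAFETY: `data` must come from `into_user_data` and must not have been released.
unsafe fn dispatch(data: u64) -> u32 {
    let handler = unsafe { &mut *(data as usize as *mut Box<dyn Handler>) };
    handler.handle() // the one virtual call
}

/// SAFETY: same contract as `dispatch`; `data` must not be used afterwards.
unsafe fn release(data: u64) {
    drop(unsafe { Box::from_raw(data as usize as *mut Box<dyn Handler>) });
}

fn round_trip() -> (u32, u32) {
    let data = into_user_data(Box::new(Counter(0)));
    // Two "events" for the same registration reach the same object.
    let results = unsafe { (dispatch(data), dispatch(data)) };
    unsafe { release(data) };
    results
}
```

The double box costs an extra allocation and an extra pointer hop per dispatch, which is exactly what the single-allocation layout in §2 avoids.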
This also vaporizes the ghost-event class entirely: routing now follows the
object pointer the kernel hands back, not a RawFd lookup. The fd integer
being reused is irrelevant — different fd, different registration, different
pointer. And Eventp::delete is wired so that releasing the subscriber and
calling EPOLL_CTL_DEL are inseparable, which means “forget to remove” is
no longer an option the API even exposes.
Of course, none of this is free. To make it work we need to solve three Rust-specific puzzles, and that’s what the rest of this document is about:
- &dyn Trait is 16 bytes on 64-bit. It doesn’t fit in a u64.
- Handing &mut Reactor to a handler that itself lives inside the reactor is a textbook double-mutable-borrow.
- Handlers may mutate the reactor (add, modify, delete, possibly themselves) while a batch of events is mid-dispatch. We need this to be sound.
§2. Slimming Down Fat Pointers: ThinBoxSubscriber
§2.1 Why runtime polymorphism (and why that’s a problem)
You might ask: why not just parameterize the reactor over T: Subscriber and
let monomorphization sort it out? In practice — VMM-style codebases especially
— roughly 90% of real reactors hold subscribers of many different concrete
types: a control eventfd, a TCP listener, a bunch of TCP connections, a
serial console, a vsock channel, … The moment a generic parameter shows up in
the reactor type, it virally infects every owner all the way to fn main.
Trait objects are the practical answer, and we will pay the one indirect
call.
The trouble is Rust’s trait-object representation:
Rust’s &dyn Trait is a fat pointer: data pointer + vtable pointer,
16 bytes on a 64-bit target. That is 8 bytes too many for epoll_data_t.
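The size mismatch is easy to check directly. A small sketch, with a placeholder `Subscriber` trait that is not the crate's real one:

```rust
use std::mem::size_of;

// Placeholder trait, only here so that `dyn Subscriber` exists.
trait Subscriber {
    fn handle(&mut self);
}

/// Returns the byte sizes of (thin reference, fat reference, Box<dyn>, u64 slot).
fn pointer_sizes() -> (usize, usize, usize, usize) {
    (
        size_of::<&u8>(),
        size_of::<&dyn Subscriber>(),
        size_of::<Box<dyn Subscriber>>(),
        size_of::<u64>(),
    )
}
```

On a 64-bit target the thin reference matches the `u64` slot, while both trait-object pointers are twice as large.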
§2.2 Insight: don’t be afraid of the allocator
rustc’s memory layout is a default, not a prison. If we manage the
allocation ourselves, nothing stops us from putting the vtable pointer
inside the object, C++-style. Then a pointer to the object is just one
word — and that one word goes straight into epoll_event.data.
§2.3 First sketch
Let’s build it step by step.
pub struct ThinBoxSubscriber {
    ptr: NonNull<u8>,
    _marker: PhantomData<dyn Subscriber>,
}
impl ThinBoxSubscriber {
    pub fn new<T: Subscriber>(value: T) -> Self {
        todo!()
    }
}
§Step 1: cordon off the exotic cases
We only support 64-bit Linux. Anything else is a compile error:
#[cfg(not(target_pointer_width = "64"))]
compile_error!("Platforms with pointer width other than 64 are not supported.");
With that, we can nail down a fact at compile time:
const _: () = assert!(size_of::<&dyn Subscriber>() == 16);
If a future toolchain ever changes trait-object layout, the build fails on the spot, with no silent miscompile.
§Step 2: pry the vtable out of a fat pointer
A fat pointer is a (data, vtable) pair in memory. So we transmute it:
let fat_ptr = &value as &dyn Subscriber;
let (_data_ptr, vptr) = unsafe {
    mem::transmute::<&dyn Subscriber, (*const (), *const ())>(fat_ptr)
};
Now we want a heap layout that starts with the vptr:
(vptr, T)
Small but fatal: the alignment hole. If T has an alignment greater than usize
(think #[repr(align(16))] or a struct containing __m128), the compiler quietly inserts
padding between vptr and value:
So value is not at ptr + size_of::<usize>(). Our deref math is off.
Cue undefined behavior.
Trick: keep vptr adjacent to value, let padding fall outside. Use
Layout::extend to compose a one-usize header (which will hold the
vtable pointer) with the layout of T. Layout::extend returns the offset of
T alongside the combined layout. Because we store the vptr in the word
immediately before T rather than at offset 0, any padding ends up before
the vptr instead of between the vptr and T:
let (layout, value_offset) = Layout::new::<usize>()
    .extend(Layout::new::<T>())
    .expect("Failed to create combined layout");
We then make ptr point at T, and read vptr at the fixed negative
offset ptr - 8.
Exercise: why is the vptr offset valid? (hint: align rules of repr C)
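The offsets can be inspected without allocating anything. In this sketch `Wide` and `header_offsets` are hypothetical names; the numbers assume a 64-bit target, as the rest of this page does.

```rust
use std::alloc::Layout;
use std::mem::size_of;

/// An over-aligned payload, standing in for a subscriber that contains SIMD data.
#[repr(align(16))]
struct Wide([u8; 16]);

/// Returns (combined layout, offset of T, offset of the word right before T).
fn header_offsets<T>() -> (Layout, usize, usize) {
    let (layout, value_offset) = Layout::new::<usize>()
        .extend(Layout::new::<T>())
        .expect("layout overflow");
    // The vptr slot is the usize-sized word immediately before T.
    (layout, value_offset, value_offset - size_of::<usize>())
}
```

For `Wide`, T lands at offset 16 and the vptr slot at offset 8, so bytes 0..8 are the padding. For a byte-aligned T such as `u8`, T lands at offset 8 and the vptr slot at offset 0, with no padding at all. In both cases the slot offset is a multiple of 8.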
§Step 3: allocate, place, point
let ptr = unsafe {
    let raw = alloc::alloc(layout);
    if raw.is_null() { alloc::handle_alloc_error(layout); }
    NonNull::new_unchecked(raw.add(value_offset)) // point at T, not at the allocation
};
unsafe {
    ptr.as_ptr().sub(size_of::<usize>()) // vptr slot
        .cast::<*const ()>().write(vptr);
    ptr.as_ptr().cast::<T>().write(value); // move T in
}
Deref is the same trick in reverse: read the vptr from ptr - 8,
combine it with ptr into a fat pointer, hand out &mut dyn Subscriber<Ep>.
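Putting steps 2 and 3 together, a self-contained sketch of the whole idea might look like the following. It is not the crate's `src/thin.rs`: `ThinBox`, `Counter`, `as_fat` and `get_mut` are hypothetical names, the `Subscriber` trait is a simplified stand-in, there is no `raw_fd` header, and the destructor is the naive, non-panic-safe version. It also leans on the same assumption as the text, namely that a trait-object pointer is laid out as a (data, vtable) pair, which the language does not guarantee.

```rust
use std::alloc::{self, Layout};
use std::mem::{self, size_of};
use std::ptr::{self, NonNull};

trait Subscriber {
    fn handle(&mut self) -> u32;
}

struct Counter(u32);

impl Subscriber for Counter {
    fn handle(&mut self) -> u32 {
        self.0 += 1;
        self.0
    }
}

/// One-word owning pointer; the vtable pointer lives in the word before `T`.
struct ThinBox {
    ptr: NonNull<u8>,
}

impl ThinBox {
    fn new<T: Subscriber + 'static>(value: T) -> Self {
        let fat = (&value as *const T) as *const dyn Subscriber;
        // ASSUMPTION: fat pointers are [data, vtable]; not guaranteed by the language.
        let [_data, vptr] =
            unsafe { mem::transmute::<*const dyn Subscriber, [*const (); 2]>(fat) };
        let (layout, value_offset) = Layout::new::<*const ()>()
            .extend(Layout::new::<T>())
            .expect("layout overflow");
        unsafe {
            let raw = alloc::alloc(layout);
            if raw.is_null() {
                alloc::handle_alloc_error(layout);
            }
            let ptr = raw.add(value_offset); // points at T, not at the allocation
            ptr.sub(size_of::<*const ()>()).cast::<*const ()>().write(vptr);
            ptr.cast::<T>().write(value); // move T in
            ThinBox { ptr: NonNull::new_unchecked(ptr) }
        }
    }

    /// Rebuild the fat pointer from the thin one plus the stored vptr.
    fn as_fat(&self) -> *mut dyn Subscriber {
        unsafe {
            let data = self.ptr.as_ptr() as *const ();
            let vptr = self.ptr.as_ptr().sub(size_of::<*const ()>()).cast::<*const ()>().read();
            mem::transmute::<[*const (); 2], *mut dyn Subscriber>([data, vptr])
        }
    }

    fn get_mut(&mut self) -> &mut dyn Subscriber {
        unsafe { &mut *self.as_fat() }
    }
}

impl Drop for ThinBox {
    fn drop(&mut self) {
        unsafe {
            let fat = self.as_fat();
            // Recover the allocation layout from the vtable's size and alignment.
            let (layout, value_offset) = Layout::new::<*const ()>()
                .extend(Layout::for_value(&*fat))
                .expect("layout overflow");
            ptr::drop_in_place(fat); // naive: leaks the slot if this panics
            alloc::dealloc(self.ptr.as_ptr().sub(value_offset), layout);
        }
    }
}
```

`ThinBox` is one machine word wide, so its `ptr` can be cast to `u64` and stored in `epoll_event.data` directly.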
§2.4 Drop, panic-safely
Drop is where it gets fun. We have to:
1. Run T’s destructor.
2. dealloc the heap slot.
What if step 1 panics? Per
panic-in-drop discussion
(the RFC was withdrawn, but the behavior stands), a panic inside Drop
unwinds. If we naively wrote drop_in_place(value); dealloc(ptr), an unwind
through step 1 would skip step 2 — and leak.
The trick is the classic guard-inside-Drop pattern: hand the deallocation
responsibility to a local struct whose own Drop is unconditional:
let _guard = DropGuard { ptr, value_layout, _marker: PhantomData };
unsafe { ptr::drop_in_place(value_ptr) }; // may panic
// _guard.drop() runs in either path and calls alloc::dealloc.
The same pattern shows up in Vec and most other RAII containers, but
this is one of the rare cases where you actually need to write it yourself.
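The guard pattern can be exercised in isolation. The sketch below is an illustration under stated assumptions rather than the crate's code: `DeallocGuard`, `Grumpy`, `destroy_hand_allocated` and the `FREED` flag are hypothetical, the flag exists only so the effect is observable, and the build is assumed to use the default unwinding panic strategy.

```rust
use std::alloc::{self, Layout};
use std::cell::Cell;
use std::panic::{self, AssertUnwindSafe};
use std::ptr;

thread_local! {
    /// Observation hook: set once the guard has released the heap slot.
    static FREED: Cell<bool> = Cell::new(false);
}

/// Owns the raw allocation; its Drop runs on both the normal and the unwinding path.
struct DeallocGuard {
    raw: *mut u8,
    layout: Layout,
}

impl Drop for DeallocGuard {
    fn drop(&mut self) {
        // SAFETY: `raw` was returned by `alloc::alloc(self.layout)`.
        unsafe { alloc::dealloc(self.raw, self.layout) };
        FREED.with(|f| f.set(true));
    }
}

/// A value whose destructor always panics.
struct Grumpy(u64);

impl Drop for Grumpy {
    fn drop(&mut self) {
        panic!("destructor panicked");
    }
}

/// Places `value` in a hand-made allocation, then destroys it panic-safely.
fn destroy_hand_allocated<T>(value: T) {
    let layout = Layout::new::<T>();
    assert!(layout.size() > 0, "this sketch does not handle zero-sized types");
    unsafe {
        let raw = alloc::alloc(layout);
        if raw.is_null() {
            alloc::handle_alloc_error(layout);
        }
        raw.cast::<T>().write(value);
        let _guard = DeallocGuard { raw, layout };
        ptr::drop_in_place(raw.cast::<T>()); // may panic; the guard still runs
    }
}

/// Returns (did the destructor panic, was the slot freed anyway).
fn freed_despite_panic() -> (bool, bool) {
    FREED.with(|f| f.set(false));
    let outcome = panic::catch_unwind(AssertUnwindSafe(|| destroy_hand_allocated(Grumpy(7))));
    (outcome.is_err(), FREED.with(|f| f.get()))
}
```

Without the guard, the `dealloc` call would sit after `drop_in_place` and be skipped by the unwind.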
§2.5 The differences vs. the real code
The real src/thin.rs is slightly fancier than what’s above:
- The header also stores raw_fd (next to vptr). This avoids some virtual calls to as_fd(). It also serves as a sentinel, where a value of -1 indicates that the value has been drop_in_placed but the heap slot is still alive. We will use this in §4 to make reentrant deletion sound.
- Subscriber<Ep> is generic over the reactor type (so that the mock reactor can plug into the same ThinBoxSubscriber<MockEventp>). It’s uniform churn, not interesting on its own.
- from_box_dyn lets you convert an already type-erased Box<dyn Subscriber<Ep>> into a ThinBoxSubscriber.
§2.6 Why this kills the fd-reuse bug for free
Routing now goes:
epoll_wait → ev.data() (u64) → reinterpret as &mut dyn Subscriber<Ep>
There is no RawFd → subscriber map on the dispatch path. The kernel hands
back the exact heap address you registered, so the only way a “ghost”
subscriber could receive an event is if its heap slot were deallocated
behind epoll’s back — and the only API that removes a subscriber
(Eventp::delete) is the same one that calls EPOLL_CTL_DEL. The two are
welded together; you cannot have one without the other.
§3. The Double Mutable Borrow
§3.1 The interface we wish we could write
What we’d like in user code is brutally obvious:
trait Subscriber {
    fn handle(&mut self, reactor: &mut Eventp);
}
What the borrow checker thinks of it:
error[E0499]: cannot borrow `*reactor` as mutable more than once at a time
…because *self lives inside reactor.registered, and you have just
asked for two &muts that overlap. event-manager’s response was the
three-layer HashMap structure from §1; the cost was 3 lookups per event.
We’d rather not pay that.
§3.2 Approaching the problem from the other side
Let’s invert the framing. Suppose we have:
use rustc_hash::FxHashMap; // fast hasher, no DoS resistance (we don't need it)
struct Eventp {
    registered: FxHashMap<RawFd, ThinBoxSubscriber>,
    // ...
}
Suppose we accept “splitting” &mut Eventp into two logical halves:
- &mut subscriber_i: the one currently dispatching
- &mut (Eventp − subscriber_i): everything else
We know, by §2, that ThinBoxSubscriber is just a pointer. The actual
subscriber bytes live on another heap allocation that the map merely
references. So when we pluck out &mut (self.registered[fd].deref()) and hand it to
Subscriber::handle, the only thing that can invalidate it is something
that frees or moves the heap slot under us.
Now, what could &mut Eventp actually do to that heap slot, during the
handler call? Three things:
1. Public field access (reactor.registered = ...). Easy to forbid: don’t expose any fields as pub.
2. Public method calls (reactor.some_method(&mut self)). Annoying, but we control the method set. We can just not expose anything dangerous.
3. mem::replace, mem::take, *reactor = new_reactor. 💥 The old Eventp is destructed right now, including the entire registered map, including the heap slot we were currently inside. The &mut self that the handler holds is suddenly pointing into freed memory.
Categories 1 and 2 are in our hands. Category 3 is the actual showstopper.
§3.3 Descending deeper into the dark arts: Pin
We need a way to hand the handler “something with &mut Eventp-ish powers,
but with category 3 surgically removed”. Fortunately, Rust has already
been here. When async/await was being designed, Future faced the exact
same crisis — a Future returned by async fn is a self-referential state
machine, and mem::replace-ing it would invalidate its own internal
pointers. The fix, after a lot of debate and a lot of documentation, was
Pin.
Skipping the
sixteen chapters of Pin documentation:
the only thing it does that matters here is that safe code cannot turn
Pin<&mut T> back into &mut T unless T: Unpin. Inherent methods on the
pinned type can use unsafe internally to project back to &mut T, but
those methods are written by the type’s author and can be chosen to never
move the value out.
So: mark Eventp as !Unpin (one PhantomPinned field is enough), and
hand the handler a Pin<&mut Eventp>. Category 3 is gone. Safe user code
cannot mem::replace the reactor.
struct Eventp {
    registered: FxHashMap<RawFd, ThinBoxSubscriber>,
    _pinned: PhantomPinned,
    // ...
}
trait Subscriber {
    fn handle(&mut self, reactor: Pin<&mut Eventp>);
    //                            ^^^^^^^^^^^^^^^^
    //          "you can use it, but you cannot make it stop existing"
}
Before you cheer: keep The Problem With Single-threaded Shared
Mutability
in mind on the way back. The thing that makes this safe isn’t Pin waving a
wand; it’s the specific set of methods we expose on the pinned reactor,
which we will deliberately keep tiny.
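A minimal sketch of the mechanism, using only the standard library. `Reactor`, `add` and `handler` are hypothetical stand-ins for the real types; the point is only to show which operations safe code keeps and which it loses once the type is `!Unpin`.

```rust
use std::marker::PhantomPinned;
use std::pin::{pin, Pin};

struct Reactor {
    registered: Vec<i32>,
    _pinned: PhantomPinned, // makes Reactor: !Unpin
}

impl Reactor {
    /// An inherent method chosen by the type's author; it mutates but never moves.
    fn add(self: Pin<&mut Self>, fd: i32) {
        // SAFETY: we only push into a field; the Reactor itself stays in place.
        let this = unsafe { self.get_unchecked_mut() };
        this.registered.push(fd);
    }
}

fn handler(mut reactor: Pin<&mut Reactor>) {
    reactor.as_mut().add(7);
    // The next line would not compile: Pin<&mut Reactor> has no DerefMut
    // because Reactor is !Unpin, so safe code cannot obtain `&mut Reactor`.
    // std::mem::swap(&mut *reactor, &mut other_reactor);
}

fn registered_after_two_events() -> usize {
    let mut reactor = pin!(Reactor { registered: Vec::new(), _pinned: PhantomPinned });
    handler(reactor.as_mut());
    handler(reactor.as_mut());
    // Shared access still works: Pin<&mut T> always implements Deref.
    let count = reactor.registered.len();
    count
}
```

Read access goes through `Deref` as usual; only the route to a plain `&mut Reactor` is closed.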
§3.4 Pinned<'_, Ep>: the deliberately narrow API
Rather than handing out Pin<&mut Eventp> directly (which would let users
call any inherent method we ever add to Pin<&mut Eventp> later), we wrap
it in a newtype that has exactly the three methods corresponding to
epoll_ctl(2):
pub struct Pinned<'a, Ep>(pub Pin<&'a mut Ep>);
impl<'a, Ep: EventpOps> Pinned<'a, Ep> {
    pub fn add(&mut self, sub: ThinBoxSubscriber<Ep>) -> io::Result<()> { ... }
    pub fn modify(&mut self, fd: RawFd, interest: Interest) -> io::Result<()> { ... }
    pub fn delete(&mut self, fd: RawFd) -> io::Result<()> { ... }
}
These are exactly the three EPOLL_CTL_* operations, and nothing else.
run_once, into_inner, Drop, Default, you name it — all unreachable
from inside a handler. The reactor cannot be moved, cannot be replaced,
cannot even re-enter epoll_wait. The blast radius of “what a handler can
do to the reactor” is by construction the same as the blast radius of three
syscalls.
§3.5 What !Unpin actually guarantees (a small precision note)
A subtle point that’s easy to misread: !Unpin does not guarantee that
the registered map “doesn’t move in memory” — FxHashMap will happily
rehash and shuffle its internal buckets when you add a new subscriber.
What !Unpin guarantees is that the Eventp struct itself cannot be
moved or replaced, and therefore its registered field is not swapped out
from under us.
The actual reason the in-flight &mut Subscriber stays valid across a
rehash is §2’s indirection: the map only stores ThinBoxSubscriber (a
single word), and the subscriber bytes live on a separate heap allocation.
Rehashing moves the one-word handle, not the bytes it points at. The
handler’s &mut self continues to point at the same heap address.
In other words: §2 and §3 work together. The thin pointer gives us pointer
stability across rehashes, and Pin gives us pointer stability against
mem::replace. Either alone would not be enough.
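The rehash half of that claim is easy to observe with a plain `HashMap` of boxes. This sketch uses `std::collections::HashMap` and `Box<u64>` as stand-ins for `FxHashMap` and `ThinBoxSubscriber`; the function name is hypothetical.

```rust
use std::collections::HashMap;

/// Grows the map until it reallocates, then checks whether the first
/// value's heap address changed.
fn heap_address_survives_rehash() -> bool {
    let mut map: HashMap<i32, Box<u64>> = HashMap::new();
    map.insert(0, Box::new(7));
    let before = &*map[&0] as *const u64 as usize;

    // Insert until the table's capacity changes, which forces a rehash
    // that moves every one-word Box handle to a new bucket.
    let initial_capacity = map.capacity();
    let mut key = 1;
    while map.capacity() == initial_capacity {
        map.insert(key, Box::new(0));
        key += 1;
    }

    let after = &*map[&0] as *const u64 as usize;
    before == after
}
```

The handle stored in the map moved; the allocation it points at did not.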
§4. Handler internals: re-entrancy and the Handling state machine
§3 explained why &mut Eventp is safe to hand out (in narrowed form). It
left open the harder question: what may handlers actually do with it
without invalidating the in-flight subscriber reference?
§4.1 Per-operation hazard analysis
epoll_wait returns up to N ready events; we dispatch them one by one.
While handler i runs, it may call back into the reactor. For each
operation we must ask: could this corrupt the loop?
| Operation in handler | Risk | Resolution |
|---|---|---|
| add(new_sub) | FxHashMap rehash. But thin pointers are stable; new sub isn’t in this batch. | Allow. |
| modify(other, ..) | Updates kernel state + a Cell<Interest> inside the sub. Touches nothing else. | Allow. |
| delete(other) | other’s event may also be in this batch; naive dealloc ⇒ dangling pointer. | Drop in place now, defer free to batch end. |
| delete(self) | &mut self is still live; can’t drop now. But fd won’t reappear in this batch. | Mark drop_current = true; reap at tail. |
| run_once_with_timeout(...) | Would clobber the dispatch state and re-enter epoll_wait. | Panic. |
The state for all of this is one tiny struct:
struct Handling {
    fd: RawFd,                                      // who's running right now
    drop_current: bool,                             // self-delete requested
    deferred_drop: Vec<ThinBoxSubscriber<Eventp>>,  // dropped-in-place, awaiting dealloc
}
self.handling is Some iff we’re inside a dispatch batch. Entering
run_once_with_timeout while it’s already Some panics — that’s how we
forbid reentrant run_once (src/lib.rs:285-322).
§4.2 The two flavours of delete
fn delete(&mut self, fd: RawFd) -> io::Result<()> {
    // epoll_ctl(EPOLL_CTL_DEL) — same for every path
    ...
    if let Some(h) = &mut self.handling {
        if h.fd == fd {
            // (A) self-delete: registry entry stays put until loop tail
            h.drop_current = true;
        } else {
            // (B) cross-delete: pop from registry, run user destructor now
            //     (so fd/socket handles release immediately), but keep the
            //     heap slot alive until end of batch.
            let mut sub = self.registered.remove(&fd).unwrap();
            sub.drop_in_place();
            h.deferred_drop.push(sub);
        }
    } else {
        // (C) not in dispatch: just remove
        self.registered.remove(&fd);
    }
    Ok(())
}
This produces one user-visible quirk worth pinning in a test:
- Cross-delete then re-add the same fd in the same handler → works. The registry entry was removed in (B), so the new add doesn’t collide.
- Self-delete then re-add the same fd in the same handler → AlreadyExists. Self-delete only flips a flag; the registry entry is still there.
Both are pinned by tests (handler_can_re_add_other_fd_after_delete, self_delete_then_re_add_same_fd_returns_already_exists), so any future change is visible and deliberate.
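The quirk follows directly from the three branches, which a toy model makes visible. Everything here is a simplification and an assumption: `Model` replaces the reactor, subscribers are plain `String`s, there is no epoll call, and `add` is assumed to reject an fd that still has a registry row with `AlreadyExists`.

```rust
use std::collections::HashMap;
use std::io;

struct Handling {
    fd: i32,                    // who's running right now
    drop_current: bool,         // self-delete requested
    deferred_drop: Vec<String>, // removed mid-batch, released at batch end
}

struct Model {
    registered: HashMap<i32, String>,
    handling: Option<Handling>,
}

impl Model {
    fn add(&mut self, fd: i32, name: &str) -> io::Result<()> {
        // ASSUMPTION: a registry collision is reported as AlreadyExists.
        if self.registered.contains_key(&fd) {
            return Err(io::ErrorKind::AlreadyExists.into());
        }
        self.registered.insert(fd, name.to_owned());
        Ok(())
    }

    fn delete(&mut self, fd: i32) -> io::Result<()> {
        if !self.registered.contains_key(&fd) {
            return Err(io::ErrorKind::NotFound.into());
        }
        match &mut self.handling {
            // (A) self-delete: only flip the flag.
            Some(h) if h.fd == fd => h.drop_current = true,
            // (B) cross-delete: the registry row goes away immediately.
            Some(h) => {
                let sub = self.registered.remove(&fd).unwrap();
                h.deferred_drop.push(sub);
            }
            // (C) not in dispatch: just remove.
            None => {
                self.registered.remove(&fd);
            }
        }
        Ok(())
    }
}

/// Returns (cross-delete then re-add succeeded, error kind of self-delete then re-add).
fn re_add_quirk() -> (bool, io::ErrorKind) {
    let mut m = Model { registered: HashMap::new(), handling: None };
    m.add(3, "current").unwrap();
    m.add(4, "other").unwrap();
    // Pretend the handler for fd 3 is running.
    m.handling = Some(Handling { fd: 3, drop_current: false, deferred_drop: Vec::new() });

    m.delete(4).unwrap();
    let cross_ok = m.add(4, "other, again").is_ok();

    m.delete(3).unwrap();
    let self_err = m.add(3, "current, again").unwrap_err().kind();
    (cross_ok, self_err)
}
```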
§4.3 ThinBoxSubscriber, augmented with a sentinel
§2’s drop_in_place story needs one more piece. When (B) runs the user
destructor early, the heap slot still exists — but it’s logically
“already dropped”. If epoll_wait reported both A and B in the same batch
and the dispatch loop later reconstructs B’s thin pointer from
ev.data(), we must not re-run the user’s handle.
So the layout grows one more field — the raw fd slot promised back in §2:
+---------+---------+---------+---------+--------------------+
| _pad_   | raw fd  | _pad_   | vptr    | dyn Subscriber<Ep> |
+---------+---------+---------+---------+--------------------+
          ptr-16              ptr-8     ↑
                                        ThinBoxSubscriber { ptr }
It pulls double duty:
- Fast-path fd read. The dispatch loop wants to record “who’s running” in handling.fd before calling handle(). With the cached fd, that’s a single load, with no vtable dance.
- Dropped-in-place sentinel. drop_in_place writes raw_fd = -1 before calling the user destructor (so a re-entrant access during T::drop sees the “dead” state), and try_deref_mut returns None whenever it sees -1 (src/thin.rs:189-246).
The dispatch loop wraps each reconstructed thin pointer in ManuallyDrop
(src/lib.rs:333-336). The real owner is the
registry (or deferred_drop); even if the handler panics on the way out,
this local can’t double-free.
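The sentinel logic on its own fits in a few lines. This sketch swaps the hand-rolled heap layout for an ordinary struct with a `ManuallyDrop` field; `Slot` and `sentinel_demo` are hypothetical names, and only the ordering (mark dead first, then run the destructor) mirrors the description above.

```rust
use std::mem::ManuallyDrop;

struct Slot<T> {
    raw_fd: i32, // -1 means "value already dropped in place"
    value: ManuallyDrop<T>,
}

impl<T> Slot<T> {
    fn new(raw_fd: i32, value: T) -> Self {
        Slot { raw_fd, value: ManuallyDrop::new(value) }
    }

    /// Runs the user destructor early but keeps the slot itself alive.
    fn drop_in_place(&mut self) {
        if self.raw_fd == -1 {
            return; // already dead; never drop twice
        }
        self.raw_fd = -1; // mark dead before the destructor runs
        // SAFETY: the sentinel check above guarantees this happens once.
        unsafe { ManuallyDrop::drop(&mut self.value) };
    }

    /// Hands out the value only while it is still alive.
    fn try_deref_mut(&mut self) -> Option<&mut T> {
        if self.raw_fd == -1 {
            None
        } else {
            Some(&mut *self.value)
        }
    }
}

impl<T> Drop for Slot<T> {
    fn drop(&mut self) {
        self.drop_in_place();
    }
}

/// Returns (accessible before drop_in_place, accessible after).
fn sentinel_demo() -> (bool, bool) {
    let mut slot = Slot::new(5, String::from("subscriber"));
    let before = slot.try_deref_mut().is_some();
    slot.drop_in_place();
    let after = slot.try_deref_mut().is_some();
    (before, after)
}
```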
§4.4 The batch tail
After the loop, we take() self.handling to None. Dropping the
Handling drops the deferred_drop vector, which drops each
ThinBoxSubscriber, which finally calls alloc::dealloc. All the
in-place-dropped subscribers from (B) get their heap slots released
exactly here. Any drop_current-flagged subscribers were already
removed from the registry inline after each handler returned.
§5. Builder & DI: throwing away the boilerplate
Tired of writing a struct + AsFd + HasInterest + Handler quartet and a
mock quartet for every fd you want to watch? Same. Let’s see how far the
type system can carry us.
§5.1 What the user writes
eventp::interest()                   // empty Interest
    .edge_triggered()                // builder methods on Interest
    .read()
    .with_fd(listener)               // (Interest, Fd)
    .with_handler(on_connection)     // → TriSubscriber
    .register_into(&mut reactor)?;   // calls Eventp::add
fn on_connection(
    listener: &mut impl Accept,
    mut reactor: Pinned<impl EventpOps>,
) { ... }
No subscriber struct. No trait impls. The handler is a plain fn (or
closure), with whatever parameters it actually needs, in whatever order
it pleases.
§5.2 Two halves of the builder, dual-trait style
There’s no Builder<T> here. with_fd and with_handler are trait methods
that turn one tuple type into another, and they happen to commute:
impl<Args, F> WithFd for (Interest, FnHandler<Args, F>) { type Out<Fd> = TriSubscriber<Fd, Args, F>; ... }
impl<Fd: AsFd> WithHandler for (Interest, Fd) { type Out<Args, F> = TriSubscriber<Fd, Args, F>; ... }
Whichever you call first works; both paths converge on
TriSubscriber<Fd, Args, F>. The Subscriber<Ep> trait has a blanket
impl over AsFd + HasInterest + Handler<Ep>, so the resulting type plugs
straight into register_into.
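A stripped-down sketch of the commuting builder, using generic associated types. It is a simplification under stated assumptions: the `Args` parameter is dropped, `Interest` is a bare newtype, `Tri` and `Handler` stand in for `TriSubscriber` and `FnHandler`, and nothing is registered anywhere.

```rust
struct Interest(u32);
struct Handler<F>(F); // wrapper that distinguishes "handler set" from "fd set"
struct Tri<Fd, F> {
    interest: Interest,
    fd: Fd,
    handler: F,
}

trait WithFd {
    type Out<Fd>;
    fn with_fd<Fd>(self, fd: Fd) -> Self::Out<Fd>;
}

trait WithHandler {
    type Out<F>;
    fn with_handler<F>(self, handler: F) -> Self::Out<F>;
}

// First step, either direction.
impl WithFd for Interest {
    type Out<Fd> = (Interest, Fd);
    fn with_fd<Fd>(self, fd: Fd) -> Self::Out<Fd> {
        (self, fd)
    }
}

impl WithHandler for Interest {
    type Out<F> = (Interest, Handler<F>);
    fn with_handler<F>(self, handler: F) -> Self::Out<F> {
        (self, Handler(handler))
    }
}

// Second step: both paths converge on Tri<Fd, F>.
impl<F> WithFd for (Interest, Handler<F>) {
    type Out<Fd> = Tri<Fd, F>;
    fn with_fd<Fd>(self, fd: Fd) -> Self::Out<Fd> {
        let (interest, Handler(handler)) = self;
        Tri { interest, fd, handler }
    }
}

impl<Fd> WithHandler for (Interest, Fd) {
    type Out<F> = Tri<Fd, F>;
    fn with_handler<F>(self, handler: F) -> Self::Out<F> {
        let (interest, fd) = self;
        Tri { interest, fd, handler }
    }
}

fn double(x: i32) -> i32 {
    x * 2
}

fn both_orders() -> Vec<i32> {
    let a = Interest(1).with_fd(21).with_handler(double);
    let b = Interest(1).with_handler(double).with_fd(21);
    // Storing both in one array only compiles because the types are identical.
    let subs = [a, b];
    subs.iter().map(|s| (s.handler)(s.fd) + s.interest.0 as i32).collect()
}
```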
§5.3 Parameter injection: the macro factory
A handler can take any subset of { &mut Fd, Event, Interest, Pinned<'_, Ep> }
in any order. To make this possible without proc-macros, the library generates
all 65 impls with a macro_rules! factory
(1 nullary + P(4,1) + P(4,2) + P(4,3) + P(4,4) = 1 + 4 + 12 + 24 + 24 = 65;
see src/tri_subscriber.rs:143-253).
Two small things make this work:
- Signature lock-in via PhantomData<fn(Args)>. Rust technically lets you impl FnMut<A> multiple times for the same type. FnHandler<Args, F> carries an Args type parameter, so (fd, event) and (event, fd) become different Args, and the corresponding Handler impls don’t overlap.
- TT-muncher accumulator inside impl_handler! walks the parameter list left-to-right, building the call’s argument list as it goes: the classic macro_rules! pattern for n-ary code generation.
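The core of the trick, reduced to two injectable parameters, can be sketched as follows. This is not the crate's macro: it uses a plain repetition instead of a TT-muncher, puts the `Args` marker on the trait rather than in a `PhantomData<fn(Args)>` field, and every name (`Ctx`, `Fd`, `Event`, `FromCtx`, `Handler`, `dispatch`) is hypothetical.

```rust
/// Everything a handler could ask for, gathered at dispatch time.
struct Ctx {
    fd: i32,
    events: u32,
}

struct Fd(i32);
struct Event(u32);

/// How each injectable parameter is built from the dispatch context.
trait FromCtx {
    fn from_ctx(ctx: &Ctx) -> Self;
}

impl FromCtx for Fd {
    fn from_ctx(ctx: &Ctx) -> Self {
        Fd(ctx.fd)
    }
}

impl FromCtx for Event {
    fn from_ctx(ctx: &Ctx) -> Self {
        Event(ctx.events)
    }
}

/// `Args` is a marker tuple; distinct tuples make distinct, non-overlapping impls.
trait Handler<Args> {
    fn call(&mut self, ctx: &Ctx) -> String;
}

macro_rules! impl_handler {
    ($($arg:ident),*) => {
        impl<F> Handler<($($arg,)*)> for F
        where
            F: FnMut($($arg),*) -> String,
        {
            fn call(&mut self, _ctx: &Ctx) -> String {
                (*self)($(<$arg as FromCtx>::from_ctx(_ctx)),*)
            }
        }
    };
}

// One line per supported signature; the real factory enumerates all orderings.
impl_handler!();
impl_handler!(Fd);
impl_handler!(Event);
impl_handler!(Fd, Event);
impl_handler!(Event, Fd);

fn dispatch<Args, H: Handler<Args>>(mut handler: H, ctx: &Ctx) -> String {
    handler.call(ctx)
}

fn demo() -> [String; 3] {
    let ctx = Ctx { fd: 7, events: 1 };
    [
        dispatch(|fd: Fd, ev: Event| format!("fd={} ev={}", fd.0, ev.0), &ctx),
        dispatch(|ev: Event, fd: Fd| format!("ev={} fd={}", ev.0, fd.0), &ctx),
        dispatch(|| String::from("no args"), &ctx),
    ]
}
```

The compiler picks the impl from the closure's own signature, so the caller never names `Args`.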
§5.4 Testing for almost free
Because handlers are plain functions and reactor methods go through the
EventpOps trait, your test is just:
fn on_connection<Ep: EventpOps>(listener: &mut impl Accept, mut reactor: Pinned<Ep>) { ... }
#[test]
fn accepts_then_registers_stream() {
    let mut mock_accept  = MockAccept::new();   // ← only mock what you used
    let mut mock_reactor = MockEventp::new();
    mock_accept.expect_accept().returning(...);
    mock_reactor.expect_add().times(1).returning(|_| Ok(()));
    on_connection(&mut mock_accept, pinned!(mock_reactor));
}
MockEventp is generated by mockall (see src/mock.rs), and the pinned! macro
pins it on the stack without Box::pin ceremony (src/pinned.rs:82-86).
Parameters you never inject in fn handle need no mocks at all.
For a complete end-to-end test suite written in this style, see
examples/echo-server.rs.
§6. The zero-cost dispatch path, verified
Let’s see what Eventp::run_once_with_timeout actually compiles to. The
following is the inner dispatch loop from a --release build of the echo
server (lightly annotated):
; for ev in buf:
17b8c: mov rdi, [r14 + r15 + 0x4] ; rdi = ev.data (the subscriber addr)
17b91: mov eax, [rdi - 0x10] ; eax = *raw_fd_ref() ← no vtable
17b94: mov [r12], eax ; handling.fd = eax
; if !is_subscriber_dropped:
17b98: cmp eax, -1 ; raw_fd == -1 ?
17b9b: je .skip ; predicted not-taken via hand-rolled `unlikely`
; s.handle(Event::from(ev), Pinned(...))
17b9d: mov rax, [rdi - 0x8] ; rax = vptr
17ba1: mov esi, [r14 + r15] ; esi = ev.events (Event::from)
17ba5: mov rdx, rbx ; rdx = &mut self (the Pinned)
17ba8: call [rax + 0x30] ; one indirect call — the handler
; if handling.drop_current { ... }
17bab: cmp byte ptr [rbx + 0x34], 0
17baf: je .next_event ; common case: nothing to do
That’s it. Per event we have: one load of the user-data word, one load of
the cached fd, one branch (predicted away), one load of the vtable slot,
one indirect call. No hash, no allocation, no Token → Handler lookup, no
trampoline.
Compare this to the event-manager shape: SipHash 1-3 + three
HashMap::get_mut calls + a Box<dyn> deref, on every single event. The
difference isn’t a constant factor; it’s an axis.
§A few quieter optimisations supporting that
- FxHashMap. Keys are kernel-issued small integers; SipHash is pure overhead. (src/lib.rs:134)
- MaybeUninit<EpollEvent> event buffer. Allocate capacity slots, set_len to capacity without initialising, then re-slice to the first n that epoll_wait wrote. EpollEvent is a POD wrapper around libc::epoll_event. (src/lib.rs:201-219)
- hint::unreachable_unchecked() in the dispatch loop tells LLVM that self.handling is provably None at one specific point, saving a drop check. (src/lib.rs:308-322)
- Hand-rolled unlikely using checked_div(0), a known trick for giving the optimiser a branch hint without depending on unstable intrinsics. (src/thin.rs:230-237)
- mem::transmute_copy instead of transmute when laundering a thin pointer into a usize, because we still need the original value to move it into the registry. (src/lib.rs:383)
- Direct libc::epoll_ctl for EPOLL_CTL_DEL, because nix’s wrapper insists on an AsFd source, which we may not have if the source was already dropped. The fd number is all the kernel needs. (src/lib.rs:456-463)
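The uninitialised event buffer from the second bullet is the most reusable of these, so here is a sketch of the pattern with `u64` standing in for `EpollEvent`. `uninit_buffer`, `fake_wait` and `wait_once` are hypothetical names, and `fake_wait` replaces the real `epoll_wait` call.

```rust
use std::mem::MaybeUninit;

/// Allocates `capacity` slots without writing a single byte to them.
fn uninit_buffer(capacity: usize) -> Vec<MaybeUninit<u64>> {
    let mut buf = Vec::with_capacity(capacity);
    // SAFETY: MaybeUninit needs no initialisation, and the Vec has room for
    // at least `capacity` elements.
    unsafe { buf.set_len(capacity) };
    buf
}

/// Stand-in for epoll_wait: writes up to `fired` entries, returns how many.
fn fake_wait(buf: &mut [MaybeUninit<u64>], fired: usize) -> usize {
    let n = fired.min(buf.len());
    for (i, slot) in buf[..n].iter_mut().enumerate() {
        slot.write(i as u64 * 10);
    }
    n
}

/// One "wait" round: returns only the prefix that was actually written.
fn wait_once(buf: &mut [MaybeUninit<u64>], fired: usize) -> &[u64] {
    let n = fake_wait(buf, fired);
    // SAFETY: the first `n` slots were initialised by `fake_wait`, and
    // MaybeUninit<u64> has the same layout as u64.
    unsafe { std::slice::from_raw_parts(buf.as_ptr().cast::<u64>(), n) }
}
```

The buffer is allocated once and reused across rounds; no round pays for zeroing slots the kernel is about to overwrite.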
§Runtime measurements
The disassembly above is the microscope. Here is the clock.
The harness lives in benches/dispatch.rs.
Three reactors are driven through eventfd sources so that one round of
fire-and-drain involves the same three syscalls (epoll_wait,
eventfd_write, eventfd_read) regardless of dispatcher: eventp,
mio (plus a 30-line FxHashMap<Token, Box<dyn FnMut()>> user table —
the shape any mio user actually writes), and event-manager. Anything
else would be measuring kernel I/O, not dispatch.
Host: Intel Xeon Platinum 8163 @ 2.50 GHz (Skylake-SP, 33 MB L3 shared),
Linux 5.10.134, rustc 1.95.0; cargo bench with lto=true and
codegen-units=1 (see [profile.bench] in Cargo.toml). Not a
CPU-pinned, isolated host — read the deltas, not the absolutes.
§One ready event among N registered, single fd per subscriber
| N | eventp | event-manager | mio + FxHashMap | em − ep |
|---|---|---|---|---|
| 1 | 1.126 µs | 1.165 µs | 1.133 µs | +39 ns |
| 10 | 1.112 | 1.163 | 1.136 | +51 ns |
| 100 | 1.114 | 1.165 | 1.138 | +51 ns |
| 1 000 | 1.108 | 1.159 | 1.130 | +51 ns |
| 10 000 | 1.103 | 1.157 | 1.127 | +54 ns |
| 100 000 | 1.127 | 1.179 | 1.153 | +52 ns |
Three things to read off:
- Dispatch is O(1) for all three. Each backend’s median moves by less than 25 ns from N=1 to N=10,000. None of these designs have a “look up the handler” cost that grows with the registry.
- The bump at N=100,000 is shared. Every backend slows down by ~25 ns together. If this were HashMap cache pressure, only event-manager would feel it; the fact that all three move in lockstep pins the cost on the kernel side: the epoll interest set’s internal data structure feeling 100k entries, not anything in user space.
- The flat ~50 ns gap is two SipHash lookups. event-manager’s hot path does fd_dispatch.get(fd) followed by subscribers.get_mut_unchecked(id); both are std::collections::HashMap (SipHash 1-3). mio sits ~25 ns above eventp: one FxHash lookup. FxHash is roughly 2× faster than SipHash, and the numbers line up.
§Where the third HashMap actually fires
The dispatch_one_multi_fd_M4 group registers four eventfds per logical
subscriber — the natural shape of a virtio device, a vsock backend, or
anything multiplexing several signal fds.
| N (subs) | eventp | event-manager | mio | em − ep |
|---|---|---|---|---|
| 100 | 1.109 µs | 1.212 µs | 1.161 µs | +103 ns |
| 1 000 | 1.125 | 1.207 | 1.147 | +82 ns |
| 10 000 | 1.125 | 1.209 | 1.159 | +84 ns |
eventp and mio are essentially unchanged from the single-fd case.
event-manager picks up another ~30 ns on top of its existing 50 ns —
exactly the third lookup §1.1 promised. With four fds per
subscriber, process(events: Events, ...) only sees the RawFd, so to
call read on the right owned EventFd the handler has to do
self.fds.get_mut(&events.fd()) itself. There is no clean way out of
this in the event-manager API short of unsafe and a bare-RawFd
storage strategy. eventp doesn’t pay it because the fd object lives on
the subscriber as a field, handed to the handler as &mut Fd through
the dependency injection of §5.
§Per-event amortised throughput
dispatch_all_ready: N subscribers, all fired together, one run_once
to drain the batch.
| N | eventp ns/event | event-manager ns/event | mio ns/event |
|---|---|---|---|
| 16 | 804 | 856 | 828 |
| 64 | 809 | 862 | 833 |
| 256 | 806 | 866 | 837 |
| 1 024 | 817 | 896 | 855 |
Per-core throughput: eventp ≈ 1.24 M events/s, event-manager ≈ 1.16 M, mio + FxHashMap ≈ 1.20 M.
The em−ep delta widens from +52 ns at N=16 to +79 ns at N=1024, an extra 27 ns. A plausible cause is event-manager’s HashMap entries crowding the L1 data cache (1024 entries × ~24 bytes ≈ 24 KB, most of the 32 KB L1d on this host before counting the rest of the working set). eventp has no hashtable to miss.
§A note on the absolute numbers
The kernel’s three syscalls are roughly 1.05 µs of that 1.1 µs total — ~95% of one event today. So picking eventp over event-manager moves 4–7% of one event in this synthetic eventfd benchmark. That is a small win on its own.
The interesting axis is forward, not present: when the syscall floor
goes down (io_uring with IORING_SETUP_IOPOLL, batched ring polling,
busy-poll on a NAPI device, kernel bypass) the dispatch overhead this
section measures is what’s left. At that point the same 50 ns is the
lion’s share, not a rounding error. eventp is shaped for that future,
not today’s “syscall is everything” regime.
§7. Known limitations
- Eventp is !Send. Cross-thread access goes through the remote_endpoint module, which sends closures into the reactor over an eventfd + MPSC channel. Making Eventp itself Send would require revisiting several of the unsafe invariants in §3-§4 and is not currently planned.
- 64-bit Linux only. Both are checked at compile time (src/lib.rs:1-11, src/thin.rs:48-49); porting to 32-bit would mean giving up the “stash the address in u64” trick, which is the entire point of the library.