udp-relay-core


Lock-free, cache-line-aligned building blocks for high-performance UDP relay and tunnel servers.

Why this crate?

If you're building a UDP relay, game server proxy, VPN tunnel, or any service that forwards packets between many clients at high throughput, you need per-client state that is:

  • Fast — no mutexes on the hot path; bare atomics only.
  • Concurrent — safely shared across threads without contention.
  • Observable — bandwidth, latency, packet loss, and priority scores available at a glance.

udp-relay-core gives you exactly that in a small, dependency-light package. You bring your own I/O layer (tokio, mio, io-uring, etc.) and plug these types in.

Use cases

  • Game server relays — track per-player bandwidth and timeout idle clients.
  • VPN / tunnel endpoints — monitor connection quality and prioritize traffic.
  • Media streaming proxies — detect slow or lossy paths and react in real time.
  • Load balancers — score backends by measured latency and loss, not just round-robin.

What's inside

  • TunnelClient: per-client state — timeout tracking, bandwidth estimation (EWMA), latency, packet loss, and lazy priority scoring. Cache-line-aligned with hot/cold field separation.
  • QualityAnalyzer: lightweight packet-loss tracker with automatic counter halving to prevent overflow.
  • validate_address(): rejects loopback, unspecified, broadcast, multicast, and port-0 addresses.
  • create_dashmap_with_capacity(): creates a DashMap with a shard count tuned to your CPU count.
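
As an illustration of the kinds of checks validate_address() performs, here is a minimal sketch using only std::net. The function name and exact rules are illustrative, not the crate's actual implementation:

```rust
use std::net::{IpAddr, SocketAddr};

// Illustrative re-implementation of validate_address()-style checks;
// the crate's real logic may differ in detail.
fn is_valid_relay_target(addr: &SocketAddr) -> bool {
    if addr.port() == 0 {
        return false; // port 0 is never a routable destination
    }
    match addr.ip() {
        IpAddr::V4(ip) => {
            !ip.is_loopback()
                && !ip.is_unspecified()
                && !ip.is_broadcast()
                && !ip.is_multicast()
        }
        // IPv6 has no broadcast address, so only the other three checks apply.
        IpAddr::V6(ip) => !ip.is_loopback() && !ip.is_unspecified() && !ip.is_multicast(),
    }
}

fn main() {
    assert!(is_valid_relay_target(&"1.2.3.4:5678".parse().unwrap()));
    assert!(!is_valid_relay_target(&"127.0.0.1:5678".parse().unwrap())); // loopback
    assert!(!is_valid_relay_target(&"1.2.3.4:0".parse().unwrap()));      // port 0
}
```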

Getting started

Add to your Cargo.toml:

[dependencies]
udp-relay-core = "0.1"

Track clients

use std::sync::Arc;
use udp_relay_core::{TunnelClient, create_dashmap_with_capacity};

let clients = create_dashmap_with_capacity::<u32, Arc<TunnelClient>>(200);

let addr = "1.2.3.4:5678".parse().unwrap();
let client = Arc::new(TunnelClient::new_with_endpoint(addr, 30));
clients.insert(42, client.clone());

// On every received packet
let now = TunnelClient::current_timestamp();
client.update_stats(512, 0, now);
client.set_last_receive_tick_at(now);

// Periodic cleanup
if client.is_timed_out() {
    clients.remove(&42);
}
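
For intuition on the "shard count tuned to your CPU count" behind create_dashmap_with_capacity(), here is one plausible heuristic, sketched with std only. The helper name and the 4-shards-per-core weighting are assumptions, not the crate's documented algorithm:

```rust
use std::thread;

// Hypothetical shard-count heuristic: concurrent sharded maps such as DashMap
// want a power-of-two shard count, and a few shards per core keeps writer
// contention low without wasting memory.
fn shard_count_for(cpus: usize) -> usize {
    (cpus.max(1) * 4).next_power_of_two()
}

fn main() {
    let cpus = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
    let shards = shard_count_for(cpus);
    println!("{cpus} CPUs -> {shards} shards");
    assert!(shards.is_power_of_two());
}
```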

Monitor bandwidth and priority

use udp_relay_core::TunnelClient;

let client = TunnelClient::new(60);

// Feed traffic over time
let t0 = TunnelClient::current_timestamp();
client.update_stats(10_000, 5_000, t0);
client.update_stats(10_000, 5_000, t0 + 1);

// Report quality metrics from your ping/probe logic
client.set_latency(25);         // 25 ms
client.set_packet_loss_rate(50); // 5.0%

// Priority is recomputed lazily — only when metrics change
let _score = client.get_priority(); // lower = better
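
The lazy recomputation above can be sketched with a dirty-flag pattern. Everything below — struct name, field scales, and the weighting formula — is illustrative, not the crate's actual code:

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};

// Illustrative lazy-scoring pattern: writers flip a dirty flag, and readers
// recompute the cached score only when the flag is set.
struct LazyPriority {
    latency_ms: AtomicU32,
    loss_tenths_pct: AtomicU32, // 0..=1000, i.e. tenths of a percent
    cached_score: AtomicU32,
    dirty: AtomicBool,
}

impl LazyPriority {
    fn new() -> Self {
        Self {
            latency_ms: AtomicU32::new(0),
            loss_tenths_pct: AtomicU32::new(0),
            cached_score: AtomicU32::new(0),
            dirty: AtomicBool::new(true),
        }
    }

    fn set_latency(&self, ms: u32) {
        self.latency_ms.store(ms, Ordering::Relaxed);
        self.dirty.store(true, Ordering::Release);
    }

    fn set_packet_loss_rate(&self, tenths: u32) {
        self.loss_tenths_pct.store(tenths, Ordering::Relaxed);
        self.dirty.store(true, Ordering::Release);
    }

    fn get_priority(&self) -> u32 {
        if self.dirty.swap(false, Ordering::Acquire) {
            // Arbitrary weighting for illustration: loss hurts more than latency.
            let score = self.latency_ms.load(Ordering::Relaxed)
                + 10 * self.loss_tenths_pct.load(Ordering::Relaxed);
            self.cached_score.store(score, Ordering::Relaxed);
        }
        self.cached_score.load(Ordering::Relaxed)
    }
}

fn main() {
    let p = LazyPriority::new();
    p.set_latency(25);
    p.set_packet_loss_rate(50);
    assert_eq!(p.get_priority(), 525); // 25 + 10 * 50, recomputed once
    assert_eq!(p.get_priority(), 525); // served from cache on the second call
}
```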

Track packet loss

use udp_relay_core::QualityAnalyzer;

let qa = QualityAnalyzer::new();

qa.record_packet(false); // received OK
qa.record_packet(true);  // lost

// Returns 0–1000 (i.e. 0.0%–100.0%)
let _rate = qa.get_packet_loss_rate();
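
The "automatic counter halving" idea can be sketched as follows; once the totals grow large, both counters are halved, which preserves the ratio while keeping the values from ever overflowing. The struct, threshold, and method names here are assumptions, not the crate's code:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Illustrative counter-halving loss tracker.
struct LossTracker {
    lost: AtomicU64,
    total: AtomicU64,
}

const HALVE_AT: u64 = 1 << 20; // arbitrary threshold for illustration

impl LossTracker {
    fn new() -> Self {
        Self { lost: AtomicU64::new(0), total: AtomicU64::new(0) }
    }

    fn record_packet(&self, lost: bool) {
        if lost {
            self.lost.fetch_add(1, Ordering::Relaxed);
        }
        let total = self.total.fetch_add(1, Ordering::Relaxed) + 1;
        if total >= HALVE_AT {
            // A racy halving is acceptable: the ratio stays approximately right,
            // and the counters can never grow without bound.
            self.total.store(total / 2, Ordering::Relaxed);
            self.lost.store(self.lost.load(Ordering::Relaxed) / 2, Ordering::Relaxed);
        }
    }

    // Loss rate in tenths of a percent, mirroring the 0-1000 scale above.
    fn loss_rate_tenths(&self) -> u64 {
        let total = self.total.load(Ordering::Relaxed);
        if total == 0 {
            return 0;
        }
        self.lost.load(Ordering::Relaxed) * 1000 / total
    }
}

fn main() {
    let t = LossTracker::new();
    for i in 0..100 {
        t.record_packet(i % 20 == 0); // 5 lost out of 100
    }
    assert_eq!(t.loss_rate_tenths(), 50); // 5.0% on the 0-1000 scale
}
```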

Fast timestamps for event loops

For the lowest-overhead timestamp access, call update_clock() once per event-loop tick and use recent_timestamp() everywhere else:

use udp_relay_core::TunnelClient;

// Once per tick
TunnelClient::update_clock();

// Near-zero-overhead read: a single atomic load
let _now = TunnelClient::recent_timestamp();
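
The update_clock()/recent_timestamp() split boils down to a cached-clock pattern: one writer refreshes a global atomic each tick, and every reader pays only a single atomic load. A minimal std-only sketch (not the crate's implementation, which uses coarsetime):

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::{SystemTime, UNIX_EPOCH};

// Global cached timestamp, in whole seconds since the Unix epoch.
static CACHED_NOW: AtomicU64 = AtomicU64::new(0);

// Called once per event-loop tick by a single writer.
fn update_clock() {
    let secs = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock before Unix epoch")
        .as_secs();
    CACHED_NOW.store(secs, Ordering::Relaxed);
}

// Called anywhere on the hot path: just one relaxed atomic load.
fn recent_timestamp() -> u64 {
    CACHED_NOW.load(Ordering::Relaxed)
}

fn main() {
    update_clock();
    let now = recent_timestamp();
    assert!(now > 0);
}
```

The trade-off is resolution: every reader within the same tick sees the same value, which is exactly what timeout checks and rate windows need.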

Performance notes

  • All shared state uses bare atomics — no Mutex, no RwLock.
  • TunnelClient is Send + Sync and designed to sit behind Arc in a DashMap.
  • Hot-path fields (packet receive, bandwidth) are grouped on the first cache line; cold fields (priority, connection age) on the second — minimizing false sharing.
  • Timestamps use coarsetime (CLOCK_MONOTONIC_COARSE on Linux), which is 10–25x faster than SystemTime::now().
  • Bandwidth estimation uses an exponentially weighted moving average (EWMA) so the value adapts quickly without storing a history buffer.
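
The EWMA mentioned in the last point follows the standard update rule, new = old + alpha * (sample - old). A small sketch; the smoothing factor 0.25 is made up for illustration and is not the crate's actual constant:

```rust
// Standard exponentially weighted moving average step: no history buffer,
// just one stored value nudged toward each new sample.
fn ewma(prev_bps: f64, sample_bps: f64, alpha: f64) -> f64 {
    prev_bps + alpha * (sample_bps - prev_bps)
}

fn main() {
    let mut estimate = 0.0;
    // Three one-second intervals at ~10 KB/s, then a quiet interval.
    for sample in [10_000.0, 10_000.0, 10_000.0, 0.0] {
        estimate = ewma(estimate, sample, 0.25);
    }
    // The estimate tracks recent throughput and decays when traffic stops.
    assert!(estimate > 0.0 && estimate < 10_000.0);
}
```

A larger alpha reacts faster to bursts; a smaller one smooths out jitter. Either way the state is a single f64 per direction, which is why no history buffer is needed.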

License

MIT