oxpulse-sfu-kit
Reusable multi-client SFU primitives built on top of str0m.
str0m is a sans-I/O Rust WebRTC library — you plug in your own networking. This crate adds the multi-client glue: per-peer state machines, UDP packet routing, event fanout, simulcast layer forwarding, and bandwidth-adaptive layer selection.
What this gives you
- Client: per-peer state machine wrapping str0m::Rtc
- Registry: room-level UDP routing and event fanout
- Propagated: event enum flowing between registry and clients
- LayerSelector + BestFitSelector: per-subscriber simulcast layer selection using the desired layer and the publisher's active RIDs
- ClientOrigin::RelayFromSfu: cascade SFU support (mark a client as an upstream relay, reroute keyframe requests and Dynacast hints upstream)
- Optional pacer: BWE-adaptive layer switching (3-up/instant-down hysteresis, audio-only mode below 80 kbps)
- Optional av1-dd: AV1 Dependency Descriptor parser, per-subscriber temporal-layer drop gate
- Optional vfm: RFC 9626 Video Frame Marking for H.264/VP9/HEVC temporal-layer drop
- Optional active-speaker: dominant speaker detection with confidence margin
- Optional metrics-prometheus: Prometheus gauges including per-peer speaker activity scores
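To make the layer-selection bullet concrete, here is a minimal standalone sketch of one plausible best-fit policy. It does not use the crate's LayerSelector or BestFitSelector types, whose signatures are not shown in this README; the `Layer` enum and `best_fit` function are hypothetical stand-ins, and the fallback rule (lowest active layer when nothing fits) is an assumption.

```rust
/// Hypothetical simulcast layers, ordered from lowest to highest quality.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Layer {
    Low,
    Medium,
    High,
}

/// Assumed best-fit policy: return the highest active layer that does not
/// exceed `desired`. If every active layer is above `desired`, fall back to
/// the lowest active one. Return `None` when the publisher sends nothing.
fn best_fit(desired: Layer, active: &[Layer]) -> Option<Layer> {
    active
        .iter()
        .copied()
        .filter(|&layer| layer <= desired)
        .max()
        .or_else(|| active.iter().copied().min())
}
```

With this policy a subscriber asking for `High` while the publisher only sends `Low` and `Medium` receives `Medium`, and a subscriber asking for `Low` while only `Medium` and `High` are active still receives `Medium` rather than nothing.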
Usage
[dependencies]
oxpulse-sfu-kit = "0.5"
Minimal run loop:
use ;
async
Insert a peer after completing ICE/DTLS signaling:
use ;
use Arc;
let mut registry = new;
let rtc = new.build;
let client = new;
registry.insert;
Mark a client as a relay from another SFU edge (call before registry.insert):
use ;
let mut relay_client = new;
relay_client.set_origin;
registry.insert;
// Keyframe requests for relay-originated tracks now emit
// Propagated::UpstreamKeyframeRequest instead of sending PLI/FIR to the relay.
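The rerouting described in the comment above can be pictured as a small dispatch on the publisher's origin. The sketch below is standalone and illustrative only: `Origin`, `KeyframeAction`, and `route_keyframe_request` are hypothetical stand-ins, not the crate's ClientOrigin or Propagated types.

```rust
/// Hypothetical stand-in for where a publishing client's media comes from.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Origin {
    /// An endpoint connected directly to this SFU.
    Local,
    /// Another SFU edge relaying tracks to this one.
    RelayFromSfu,
}

/// Hypothetical actions a room loop could take for a keyframe request.
#[derive(Debug, PartialEq, Eq)]
enum KeyframeAction {
    /// Send PLI/FIR over RTCP straight to the publisher.
    SendPliToPublisher { track_id: u32 },
    /// Surface an event so the application can ask the upstream SFU.
    EmitUpstreamRequest { track_id: u32 },
}

/// Decide how a keyframe request is handled based on the publisher's origin.
fn route_keyframe_request(origin: Origin, track_id: u32) -> KeyframeAction {
    match origin {
        Origin::Local => KeyframeAction::SendPliToPublisher { track_id },
        Origin::RelayFromSfu => KeyframeAction::EmitUpstreamRequest { track_id },
    }
}
```

Surfacing an event instead of sending RTCP leaves the transport between SFU edges up to the application, which is consistent with signaling being out of scope for the kit.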
Feature flags
| Flag | What it does |
|---|---|
| kalman-bwe | GoogCC-inspired Kalman delay + loss-based BWE. BandwidthEstimator with TWCC ingestion. Registry::update_pacer_layers for automatic layer selection. Enable with pacer for full adaptive forwarding. |
| pacer | BWE-adaptive layer switching (LiveKit-style 3-up/instant-down hysteresis). Adds audio-only mode at the 80 kbps threshold. All thresholds can be tuned at runtime. |
| av1-dd | AV1 Dependency Descriptor parser (av1::dependency_descriptor). SfuMediaPayload::av1_dd() accessor. Client::set_max_temporal_layer(u8) per-subscriber drop gate. |
| vfm | RFC 9626 Video Frame Marking parser for H.264/VP9/HEVC. SfuMediaPayload::vfm_frame_marking(). Client::set_max_vfm_temporal_layer(u8). |
| active-speaker | Dominant speaker tracking via rust-dominant-speaker. Propagated::ActiveSpeakerChanged { peer_id, confidence }. Registry::tick_active_speaker / record_audio_level / peer_audio_scores. |
| metrics-prometheus | Prometheus counters on SfuMetrics, including per-peer BWE, loss, RTT, and speaker activity gauges. |
| googcc-bwe | GoogCC v2 per-subscriber estimator: TrendlineDetector (linear regression, 20-packet window) + AimdController (loss-based AIMD, +8% / ×0.85). GoogCcEstimator integrates as an additional bitrate ceiling in PerSubscriber::combined_bps. |
| test-utils | Test seam helpers (test_seed module, Registry::*_for_tests methods). |
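As a rough illustration of the pacer's 3-up/instant-down hysteresis, the standalone sketch below downgrades as soon as one estimate calls for a lower target and upgrades only after three consecutive estimates call for a higher one. Reading "3-up" as three consecutive estimates is an assumption, as are the `Target` ladder, the `Hysteresis` type, and every bitrate floor except the 80 kbps audio-only threshold; none of these names come from the crate.

```rust
/// Hypothetical forwarding targets, ordered from lowest to highest quality.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Target {
    AudioOnly,
    Low,
    Medium,
    High,
}

const AUDIO_ONLY_BELOW_BPS: u64 = 80_000; // threshold named in this README
const MEDIUM_FLOOR_BPS: u64 = 500_000; // assumed, illustrative only
const HIGH_FLOOR_BPS: u64 = 1_500_000; // assumed, illustrative only
const UPGRADE_STREAK: u32 = 3; // "3-up", read as three consecutive estimates

/// Map a bandwidth estimate to the target it could sustain.
fn target_for(bps: u64) -> Target {
    if bps < AUDIO_ONLY_BELOW_BPS {
        Target::AudioOnly
    } else if bps < MEDIUM_FLOOR_BPS {
        Target::Low
    } else if bps < HIGH_FLOOR_BPS {
        Target::Medium
    } else {
        Target::High
    }
}

struct Hysteresis {
    current: Target,
    up_streak: u32,
}

impl Hysteresis {
    fn new(start: Target) -> Self {
        Self { current: start, up_streak: 0 }
    }

    /// Feed one bandwidth estimate and return the target to forward.
    fn update(&mut self, estimate_bps: u64) -> Target {
        let wanted = target_for(estimate_bps);
        if wanted < self.current {
            // Instant down: a single low estimate is enough.
            self.current = wanted;
            self.up_streak = 0;
        } else if wanted > self.current {
            // Slow up: require a streak of higher estimates first.
            self.up_streak += 1;
            if self.up_streak >= UPGRADE_STREAK {
                // Simplification: jump to the target of the latest estimate.
                self.current = wanted;
                self.up_streak = 0;
            }
        } else {
            self.up_streak = 0;
        }
        self.current
    }
}
```

The asymmetry trades a slower recovery for fewer visible quality oscillations: a subscriber on a flapping link drops quickly and climbs back only once the estimate has held.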
Audio quality guidance
Publisher-side noise filtering
For cleaner dominant-speaker elections, publishers should filter audio through a noise suppressor before computing the RFC 6464 level:
- RNNoise (xiph/rnnoise, BSD-3-Clause): DSP/DNN hybrid, runs on mobile.
- ten-vad (TEN-framework/ten-vad, MIT): small CPU-friendly VAD alternative.
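For reference, the RFC 6464 level that a publisher computes after noise suppression is a 7-bit value in -dBov, where 0 is the loudest and 127 means silence. The sketch below is a minimal standalone version for 16-bit PCM; the function names are hypothetical, and treating 32768 as the overload point is an assumption. It does not perform any noise suppression itself.

```rust
/// Compute an RFC 6464 style audio level for one frame of 16-bit PCM.
/// The result is in -dBov: 0 is the loudest level, 127 is silence.
fn rfc6464_level(samples: &[i16]) -> u8 {
    if samples.is_empty() {
        return 127;
    }
    // Mean of squared samples, normalised to the assumed overload point.
    let sum_sq: f64 = samples
        .iter()
        .map(|&s| {
            let x = f64::from(s) / 32768.0;
            x * x
        })
        .sum();
    let rms = (sum_sq / samples.len() as f64).sqrt();
    if rms <= 0.0 {
        return 127;
    }
    let dbov = 20.0 * rms.log10(); // zero or negative for in-range input
    (-dbov).round().clamp(0.0, 127.0) as u8
}

/// Pack the one-byte extension payload: voice-activity flag in the top bit,
/// level in the low seven bits.
fn pack_level(voice: bool, level: u8) -> u8 {
    (u8::from(voice) << 7) | (level & 0x7f)
}
```

Running the suppressor first means background noise lands closer to 127, so a speaker-election algorithm that compares levels is less likely to pick a noisy but silent participant.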
Opus DRED (Deep REDundancy)
Opus DRED (libopus ≥ 1.5, shipping in recent Chromium) embeds a neural-decoded
redundant stream at ≈1 kbps overhead. The SFU forwards it transparently, so no kit
changes are required. Signal DRED capability with Propagated::AudioCodecHint.
End-to-end encryption (SFrame)
The kit forwards RTP payloads opaquely — SFrame (RFC 9605) frames pass through
unchanged. Use KeyEpoch from crate::sframe to forward the key-epoch RTP
header extension. Key distribution (MLS RFC 9420) is your signalling layer's
responsibility.
Not included (by design)
- Signaling (bring your own — WebSocket, HTTP, gRPC)
- TURN server (run coturn or similar)
- End-to-end encryption payload processing (use SFrame; see sframe::KeyEpoch)
- Server-side audio/video mixing (MCU mode)
- WHIP / WHEP ingestion endpoints
Examples
See examples/basic-sfu.rs for a complete single-node SFU with a Prometheus
/metrics endpoint.
Capacity testing
examples/synthetic_room.rs is a self-contained load generator that spawns N
synthetic peers in one process — no real network, no DTLS handshake. It drives the
kit's fanout dispatch, per-subscriber simulcast layer-filter, and delivered_media
counters directly via the test-utils seam, then reports peak RSS, CPU%, total
packets forwarded, and p50/p95/p99 fanout latency.
Output is grep-friendly for D-Lite 4 scripting:
SYNTHETIC_ROOM_RESULT peers=8 duration_s=30 packets_forwarded=72000 peak_rss_mb=124 cpu_percent=18.3 latency_p50_us=85 latency_p95_us=210 latency_p99_us=450
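Because the result is a single tagged line of key=value pairs, it is also easy to consume from Rust instead of grep. The helper below is a hypothetical sketch, not part of the kit: it splits such a line into a map and leaves numeric parsing to the caller.

```rust
use std::collections::HashMap;

/// Parse a `SYNTHETIC_ROOM_RESULT key=value ...` line into a map.
/// Returns `None` if the line does not start with the expected tag.
fn parse_result_line(line: &str) -> Option<HashMap<&str, &str>> {
    let rest = line.trim().strip_prefix("SYNTHETIC_ROOM_RESULT")?;
    Some(
        rest.split_whitespace()
            // Tokens without '=' are skipped rather than treated as errors.
            .filter_map(|pair| pair.split_once('='))
            .collect(),
    )
}
```

A caller could then check a budget with something like `map["latency_p99_us"].parse::<u64>()` and fail a CI job when the value exceeds a chosen limit.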
What is NOT measured: str0m DTLS/ICE/SRTP, real UDP kernel path, actual wire bitrate (synthetic payload is always 4 bytes), GoogCC/pacer adapting to real TWCC feedback. See the example's module doc for full scope.
Relationship to str0m
We build on str0m's Rtc state machine. We do not replace it — we connect
multiple instances together for multi-party rooms. All credit for the underlying
protocol work goes to Martin Algesten and the
str0m contributors.
Extracted from
Originally built as part of OxPulse Chat. Published standalone for the broader Rust WebRTC ecosystem.
Status
The API is stabilising through v0.x. Minor breaking changes may occur between minor versions;
check CHANGELOG.md before upgrading. A stability commitment begins with v1.0.
License
Dual MIT / Apache-2.0. See LICENSE-MIT and LICENSE-APACHE.