Multi-GPU reactive variant phase — the rung benefit.
Decode the source once and dynamically schedule every rung’s CMAF segments across all available GPUs using a fair lease pool with mid-flight helper dispatch:
decode pump (decode once)
    │ fan out normalized frames
    ▼
per-rung scaler ──► SegmentChunkQueue ──► encoder worker (holds a GpuLease)
                                      ──► helper worker (claims a freed lease)

- One encoder per GPU at a time (GpuPool enforces it: concurrent NVENC sessions on one context deadlock).
- A fast rung releases its lease early; the helper dispatcher grabs the freed lease and attaches an extra worker to a still-busy rung, so a slow rung finishes sooner. Segment work is the unit of parallelism.
- Helpers may land on a different GPU vendor than the rung’s first worker; the per-rung AV1 codec invariant (RungCodecInvariant) guarantees every contributed segment shares the av1C contract, so a cross-vendor (NVENC + QSV) rendition still decodes cleanly. A mismatched helper requeues its chunk and exits; the run never aborts on it.
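The lease behaviour in the list above can be modelled with a small std-only sketch. Everything in it is hypothetical: LeasePool and Lease are stand-ins written for illustration, not the crate's GpuPool and GpuLease, and the FIFO queue does not attempt to reproduce whatever fairness policy the real pool applies. It only shows the two properties the text relies on: at most one holder per GPU, and a dropped lease becoming claimable by a helper.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for a GPU pool: holds one lease slot per GPU index.
#[derive(Clone)]
pub struct LeasePool {
    free: Arc<Mutex<VecDeque<usize>>>,
}

/// Hypothetical stand-in for a GPU lease: gives its GPU back when dropped.
pub struct Lease {
    gpu: usize,
    free: Arc<Mutex<VecDeque<usize>>>,
}

impl LeasePool {
    /// Build a pool with exactly one lease per listed GPU index.
    pub fn new(gpu_indices: &[usize]) -> Self {
        LeasePool {
            free: Arc::new(Mutex::new(gpu_indices.iter().copied().collect())),
        }
    }

    /// Non-blocking claim: returns None while every GPU already has a holder.
    pub fn try_claim(&self) -> Option<Lease> {
        let gpu = self.free.lock().unwrap().pop_front()?;
        Some(Lease { gpu, free: Arc::clone(&self.free) })
    }

    /// Number of GPUs that currently have no holder.
    pub fn available(&self) -> usize {
        self.free.lock().unwrap().len()
    }
}

impl Lease {
    /// The GPU index this lease grants exclusive encoder use of.
    pub fn gpu(&self) -> usize {
        self.gpu
    }
}

impl Drop for Lease {
    fn drop(&mut self) {
        // Return the GPU to the back of the queue so a helper can claim it.
        if let Ok(mut free) = self.free.lock() {
            free.push_back(self.gpu);
        }
    }
}
```

In this model a rung that finishes early simply drops its Lease, and a dispatcher polling try_claim picks the GPU up for a helper worker on a rung that is still busy.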
Storage/transport specifics stay out of the engine: progress is reported
through the generic ProgressSink, so a consumer can layer an uploader
(object storage, a status queue, …) on top by watching RungStatus::Completed.
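As a rough illustration of layering an uploader on top of progress reporting, the sketch below forwards completed rungs to a channel that an upload task could drain. The trait shape is an assumption: the real ProgressSink signature is not shown on this page, so the report method, its arguments, the Started variant, and the UploadQueueSink and upload_sink names are all hypothetical. Only the RungStatus::Completed name comes from the text above.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Assumed status enum; only `Completed` is named by the documentation.
#[derive(Debug, Clone, PartialEq)]
pub enum RungStatus {
    Started,
    Completed,
}

/// Assumed trait shape; the crate's actual ProgressSink may differ.
pub trait ProgressSink {
    fn report(&self, rung: usize, status: RungStatus);
}

/// Hypothetical consumer-side sink: queues finished rungs for an uploader.
pub struct UploadQueueSink {
    tx: Sender<usize>,
}

impl ProgressSink for UploadQueueSink {
    fn report(&self, rung: usize, status: RungStatus) {
        // Only completed rungs are interesting to the uploader.
        if status == RungStatus::Completed {
            // A closed receiver just means nobody is uploading; ignore it.
            let _ = self.tx.send(rung);
        }
    }
}

/// Create the sink plus the receiving end an upload task would drain.
pub fn upload_sink() -> (UploadQueueSink, Receiver<usize>) {
    let (tx, rx) = channel();
    (UploadQueueSink { tx }, rx)
}
```

Keeping the channel on the consumer side is the point of the design described above: the engine only sees a sink, and the storage or transport choice stays outside it.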
Structs
- MultiGpuParams: Inputs to run_multigpu_hls.
- RungManifest: One rung’s finalized CMAF manifest.
- RungPackets: One rung’s full ordered AV1 packet stream, stitched from chunks encoded across GPUs. The caller muxes these into a single MP4 (+ audio).
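The stitching that produces an ordered packet stream can be sketched as follows. This is a minimal model, not the crate's finalizer: EncodedChunk and stitch_in_segment_order are hypothetical names, packets are represented as plain byte vectors, and the sketch assumes segment indices start at 0 and are contiguous. Chunks may arrive in any order because different GPUs finish at different times.

```rust
use std::collections::BTreeMap;

/// Hypothetical chunk result: the packets one worker produced for one segment.
pub struct EncodedChunk {
    pub segment_index: u64,
    pub packets: Vec<Vec<u8>>,
}

/// Concatenate chunks into one packet stream ordered by segment index.
/// Rejects duplicate segments and gaps (assumes indices run 0, 1, 2, ...).
pub fn stitch_in_segment_order(chunks: Vec<EncodedChunk>) -> Result<Vec<Vec<u8>>, String> {
    // BTreeMap keeps segments sorted regardless of completion order.
    let mut by_segment = BTreeMap::new();
    for chunk in chunks {
        let idx = chunk.segment_index;
        if by_segment.insert(idx, chunk.packets).is_some() {
            return Err(format!("segment {idx} was encoded twice"));
        }
    }

    let mut stream = Vec::new();
    for (expected, (idx, packets)) in by_segment.into_iter().enumerate() {
        // A hole in the sequence would yield an undecodable stream.
        if idx != expected as u64 {
            return Err(format!("missing segment {expected}"));
        }
        stream.extend(packets);
    }
    Ok(stream)
}
```

For example, a chunk for segment 1 that arrives before the chunk for segment 0 still ends up after it in the returned stream.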
Functions
- detect_gpu_pool: Build a GpuPool from the host’s detected GPU inventory.
- gpu_pool_for_policy: Build a GpuPool constrained to the given EncodePolicy. An empty pool (e.g. a pinned index or vendor family that isn’t present) yields capacity 0, so the orchestrator’s pre-flight probe / lease claim surfaces a clear error.
- policy_gpu_indices: The GPU indices an EncodePolicy selects, in detection order. Used to pin the decode pump to a device consistent with the policy (so decode honors a Family/SingleGpu constraint, not just encode).
- run_multigpu_hls: Run the reactive multi-GPU variant phase. Returns one Option<RungManifest> per rung (in rung order); None means the rung produced no segments.
- run_multigpu_single_file: Single-file counterpart to run_multigpu_hls: decode once, fan to per-rung scalers, and dynamically schedule each rung’s GOP-sized chunks across all GPUs (fair lease pool + mid-flight helper dispatch + cross-vendor codec invariant). Each worker encodes its chunk to packets (a fresh encoder per chunk → first frame is an IDR); the finalizer concatenates them in segment order into one ordered packet stream per rung, with no disk round-trip.
- serial_gpu_for_policy: The GPU index to pin a serial (single-GPU) encode/decode to under a policy: None (auto/first-available) for AllGpus, the pinned index for SingleGpu, the first device of the vendor for Family.
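The policy-to-GPU mapping described for policy_gpu_indices and serial_gpu_for_policy can be sketched as below. The types are assumptions made for illustration: the variant names AllGpus, SingleGpu, and Family come from the text, but their payloads, the Vendor and Gpu types, and the function names policy_indices and serial_gpu are hypothetical and are not the crate's signatures. Behaviour for a Family with no matching device is also a guess; this sketch returns None in that case.

```rust
/// Hypothetical vendor tag for a detected device.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Vendor {
    Nvidia,
    Intel,
    Amd,
}

/// Hypothetical inventory entry, listed in detection order.
pub struct Gpu {
    pub index: usize,
    pub vendor: Vendor,
}

/// Variant names follow the docs; the payload shapes are assumed.
pub enum EncodePolicy {
    AllGpus,
    SingleGpu(usize),
    Family(Vendor),
}

/// GPU indices the policy selects, preserving detection order.
pub fn policy_indices(policy: &EncodePolicy, detected: &[Gpu]) -> Vec<usize> {
    detected
        .iter()
        .filter(|g| match policy {
            EncodePolicy::AllGpus => true,
            EncodePolicy::SingleGpu(i) => g.index == *i,
            EncodePolicy::Family(v) => g.vendor == *v,
        })
        .map(|g| g.index)
        .collect()
}

/// GPU to pin a serial encode/decode to; None means auto/first-available.
pub fn serial_gpu(policy: &EncodePolicy, detected: &[Gpu]) -> Option<usize> {
    match policy {
        EncodePolicy::AllGpus => None,
        EncodePolicy::SingleGpu(i) => Some(*i),
        // First device of the vendor; None here if the vendor is absent.
        EncodePolicy::Family(_) => policy_indices(policy, detected).first().copied(),
    }
}
```

An empty result from policy_indices corresponds to the capacity-0 pool case mentioned under gpu_pool_for_policy, where the error is surfaced later by the probe or lease claim rather than by the selection step itself.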