
Crate media_seek


πŸ” A transport-agnostic container format index parser for Rust

Parse seek indices from MP4, WebM, MP3, OGG, FLAC, WAV, AIFF, AAC, FLV, AVI, and MPEG-TS streams.
Translate any [start_secs, end_secs] window into the exact byte ranges needed for partial HTTP downloads, with no subprocess and no bundled HTTP client.

⚠️ Part of the yt-dlp workspace. Published independently on crates.io.




§💭 Why media-seek?

Downloading only a time clip from a streaming URL requires knowing which bytes to request. Every container format stores this information differently: MP4 uses a sidx box, WebM has EBML Cues, MP3 uses the Xing TOC, and so on.

media-seek abstracts all of that. You feed it the leading bytes of a stream and it returns a ContainerIndex that answers one question: given a time window [start, end], which bytes do I need to fetch?

There is no HTTP client bundled and no subprocess spawned. Callers implement the two-line RangeFetcher trait to supply bytes from whatever transport they use. Formats whose index lies outside the probe window (WebM Cues, OGG page bisection, AVI idx1, MPEG-TS PCR search) request additional ranges through that trait automatically.

§📥 How to get it

Add the following to your Cargo.toml file:

[dependencies]
media-seek = "0.4.0"

Check the releases page for the latest version.

Β§πŸ” Observability & Tracing

This crate always depends on the tracing crate. It emits debug events for each format detection, parse step, and extra fetch.

⚠️ Important: tracing macros record nothing without a configured subscriber. If you don't add one, the runtime overhead is negligible.

To capture logs, add a subscriber in your application:

[dependencies]
tracing-subscriber = "0.3"

use tracing::Level;
use tracing_subscriber::FmtSubscriber;

let subscriber = FmtSubscriber::builder()
    .with_max_level(Level::DEBUG)
    .finish();
tracing::subscriber::set_global_default(subscriber)
    .expect("setting default subscriber failed");

Refer to the tracing-subscriber documentation for more advanced configuration (JSON output, log levels, targets, etc.).


§🎯 Supported formats

| Extension | Index type | Precision | Extra fetches |
|-----------|------------|-----------|---------------|
| mp4, m4a | fMP4 SIDX box | Fragment boundary | No |
| webm | EBML Cues element | Cluster boundary | Maybe |
| mp3 | Xing/VBRI TOC or CBR avg | Frame / 1 % TOC entry | No |
| ogg | OGG page granule bisection | Page boundary | Yes (up to 64) |
| flac | SEEKTABLE metadata block | Seek point | No |
| wav | PCM formula | BlockAlign-exact | No |
| aiff | PCM formula | Sample-exact | No |
| aac | ADTS frame scan average | ~21 ms frame | No |
| flv | AMF0 onMetaData keyframes | Keyframe | No |
| avi | idx1 chunk at EOF | Frame | Yes (1 fetch) |
| ts | PCR binary search | ~11 ms TS packet | Yes (up to 64) |

MHTML (storyboard segments), None, and unrecognized magic bytes return Err(Error::UnsupportedFormat).
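
To make the "PCM formula" rows concrete, here is a minimal sketch of how a time in seconds can map to a byte offset in an uncompressed WAV stream. It is an illustration, not the crate's implementation: the function and parameter names are hypothetical, and it assumes the offset of the first sample byte and the `fmt ` chunk fields (sample rate, block align) are already known.

```rust
/// Byte offset, from the start of the file, of the PCM frame containing `t_secs`.
///
/// All names here are hypothetical:
/// - `data_start`: offset of the first sample byte (start of the `data` payload)
/// - `sample_rate`: frames per second, from the `fmt ` chunk
/// - `block_align`: bytes per frame (channels * bytes per sample)
fn pcm_byte_offset(data_start: u64, sample_rate: u32, block_align: u16, t_secs: f64) -> u64 {
    // Round down to a whole frame so the offset never splits a sample.
    let frame = (t_secs.max(0.0) * f64::from(sample_rate)).floor() as u64;
    data_start + frame * u64::from(block_align)
}

/// Inclusive byte range covering `[start_secs, end_secs]`, aligned to whole frames.
fn pcm_byte_range(
    data_start: u64,
    sample_rate: u32,
    block_align: u16,
    start_secs: f64,
    end_secs: f64,
) -> (u64, u64) {
    let first = pcm_byte_offset(data_start, sample_rate, block_align, start_secs);
    // End on the last byte of the frame that contains `end_secs`.
    let last = pcm_byte_offset(data_start, sample_rate, block_align, end_secs)
        + u64::from(block_align)
        - 1;
    (first, last)
}
```

For 16-bit stereo at 44.1 kHz (block align 4) with a 44-byte header, second 1.0 lands on byte 176 444. A real parser would also clamp the result to the size of the `data` chunk, which this sketch leaves out.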

§🚀 Quick start

§1. Implement RangeFetcher

use media_seek::RangeFetcher;

struct HttpFetcher {
    client: reqwest::Client,
    url: String,
}

impl RangeFetcher for HttpFetcher {
    type Error = reqwest::Error;

    async fn fetch(&self, start: u64, end: u64) -> std::result::Result<Vec<u8>, Self::Error> {
        let range = format!("bytes={}-{}", start, end);
        self.client
            .get(&self.url)
            .header("Range", range)
            .send()
            .await?
            .bytes()
            .await
            .map(|b| b.to_vec())
    }
}

§2. Parse the container index

use media_seek::{parse, RangeFetcher};

struct FileFetcher(Vec<u8>);

impl RangeFetcher for FileFetcher {
    type Error = std::io::Error;

    async fn fetch(&self, start: u64, end: u64) -> std::result::Result<Vec<u8>, Self::Error> {
        let s = start as usize;
        let e = (end as usize + 1).min(self.0.len());
        Ok(self.0[s..e].to_vec())
    }
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let fetcher = FileFetcher(std::fs::read("video.mp4")?);

    // 512 KB is recommended: enough for most format headers and indices.
    // `fetch` takes an inclusive end, so the last probe byte is 512 * 1024 - 1.
    let probe: Vec<u8> = fetcher.fetch(0, 512 * 1024 - 1).await?;

    // Total stream size in bytes (from Content-Length or a HEAD request)
    let total_size: Option<u64> = Some(1_234_567_890);

    let index = parse(&probe, total_size, &fetcher).await?;
    Ok(())
}

§3. Translate timestamps to byte ranges

use media_seek::{parse, RangeFetcher};

struct FileFetcher(Vec<u8>);

impl RangeFetcher for FileFetcher {
    type Error = std::io::Error;

    async fn fetch(&self, start: u64, end: u64) -> std::result::Result<Vec<u8>, Self::Error> {
        let s = start as usize;
        let e = (end as usize + 1).min(self.0.len());
        Ok(self.0[s..e].to_vec())
    }
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let fetcher = FileFetcher(std::fs::read("video.mp4")?);
    // Inclusive end: bytes 0..=524_287 make up exactly 512 KB.
    let probe: Vec<u8> = fetcher.fetch(0, 512 * 1024 - 1).await?;
    let total_size: Option<u64> = Some(1_234_567_890);

    let index = parse(&probe, total_size, &fetcher).await?;

    if let Some(range) = index.find_byte_range(60.0, 120.0) {
        // Always prefetch the init segment so decoders have codec parameters
        let init = fetcher.fetch(0, index.init_end_byte).await?;
        let clip = fetcher.fetch(range.start, range.end).await?;

        // Write init + clip to a file, then trim with FFmpeg stream copy:
        // ffmpeg -i combined.mp4 -ss 60 -t 60 -c copy -avoid_negative_ts 1 -y out.mp4
        let _ = (init, clip);
    }
    Ok(())
}
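
One detail the example above glosses over: when the clip starts inside or right after the init segment, fetching both ranges separately would download some bytes twice. The sketch below shows one way to plan the requests. `plan_fetches` is a hypothetical helper, not part of the crate; its arguments correspond to `index.init_end_byte`, `range.start`, and `range.end`, all treated as inclusive offsets.

```rust
/// Plans the inclusive byte ranges to request for a clip (hypothetical helper).
///
/// Returns a single range when the clip overlaps or directly follows the init
/// segment, and two ranges (init, then clip) otherwise.
fn plan_fetches(init_end_byte: u64, clip_start: u64, clip_end: u64) -> Vec<(u64, u64)> {
    if clip_start <= init_end_byte.saturating_add(1) {
        // Overlapping or adjacent: one request avoids duplicated bytes.
        vec![(0, clip_end.max(init_end_byte))]
    } else {
        // Disjoint: fetch the init segment first, then the clip itself.
        vec![(0, init_end_byte), (clip_start, clip_end)]
    }
}
```

Concatenating the returned ranges in order gives the bytes to write before the FFmpeg trim step.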

§📖 Documentation

The full API reference is available on docs.rs.

§parse()

pub async fn parse<F: RangeFetcher>(
    probe: &[u8],
    total_size: Option<u64>,
    fetcher: &F,
) -> Result<ContainerIndex>;

Detects the container format from magic bytes in probe and dispatches to the appropriate parser. Returns Err(Error::UnsupportedFormat) for unrecognised formats.

§ContainerIndex

pub struct ContainerIndex {
    /// Last byte (inclusive) of the codec initialisation data (moov, EBML header, …).
    /// A partial download must always include `bytes 0..=init_end_byte`.
    pub init_end_byte: u64,
}

impl ContainerIndex {
    /// Returns `Some(ByteRange { start, end })` covering `[start_secs, end_secs]`,
    /// expanded to the nearest decodable boundary, or `None` if the range is not covered.
    pub fn find_byte_range(&self, start_secs: f64, end_secs: f64) -> Option<ByteRange>;
}
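
The crate does not document how `find_byte_range` works internally, so the following is only a sketch of the general idea behind "expanded to the nearest decodable boundary". It assumes a hypothetical index made of time-sorted seek points and widens the requested window outward to the surrounding entries. `SeekPoint` and `find_range` are illustrative names, not crate types.

```rust
/// One hypothetical index entry: a decodable boundary at `time_secs`
/// whose data begins at `byte_offset`.
#[derive(Debug, Clone, Copy, PartialEq)]
struct SeekPoint {
    time_secs: f64,
    byte_offset: u64,
}

/// Returns the inclusive byte range covering `[start_secs, end_secs]`,
/// widened to entry boundaries. `points` must be sorted by time.
fn find_range(
    points: &[SeekPoint],
    total_size: u64,
    start_secs: f64,
    end_secs: f64,
) -> Option<(u64, u64)> {
    let first_point = points.first()?;
    if total_size == 0 || end_secs < start_secs || end_secs < first_point.time_secs {
        return None; // empty index, inverted window, or window before the index
    }
    // Last entry at or before `start_secs` (or the first entry if none is).
    let before = points.partition_point(|p| p.time_secs <= start_secs);
    let start = points[before.saturating_sub(1)].byte_offset;
    // First entry strictly after `end_secs`; the bytes before it cover the window.
    let through = points.partition_point(|p| p.time_secs <= end_secs);
    let end = match points.get(through) {
        Some(next) => next.byte_offset.saturating_sub(1),
        None => total_size - 1, // window runs to the end of the stream
    };
    (start <= end).then_some((start, end))
}
```

Because the range is widened rather than narrowed, the result may contain a little more than the requested window, which is why a trim step is still needed afterwards.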

§RangeFetcher trait

pub trait RangeFetcher {
    type Error: std::error::Error + Send + Sync + 'static;

    /// Fetches bytes `[start, end]` (inclusive) from the remote stream.
    fn fetch(&self, start: u64, end: u64)
        -> impl Future<Output = std::result::Result<Vec<u8>, Self::Error>> + Send;
}
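
Since both bounds are inclusive, an in-memory or file-backed implementation has to turn `end` into an exclusive slice bound and guard against requests that run past the data. A minimal sketch using only the standard library follows. `slice_inclusive` and `range_header` are hypothetical helpers, not crate functions.

```rust
/// Returns bytes `[start, end]` (both inclusive) of `data`, clamped to the buffer.
/// An out-of-range or inverted request yields an empty slice instead of panicking.
fn slice_inclusive(data: &[u8], start: u64, end: u64) -> &[u8] {
    let len = data.len() as u64;
    if start >= len || end < start {
        return &[];
    }
    // Convert the inclusive end into an exclusive one without overflowing.
    let stop = end.saturating_add(1).min(len);
    &data[start as usize..stop as usize]
}

/// Formats the same inclusive range as an HTTP `Range` header value.
/// HTTP byte ranges are inclusive too, so no adjustment is needed.
fn range_header(start: u64, end: u64) -> String {
    format!("bytes={start}-{end}")
}
```

A `fetch` body built on `slice_inclusive` can return `.to_vec()` of the slice directly, and a short read near the end of the stream simply yields fewer bytes.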

fetch is called only when extra data is required beyond the initial probe:

  • WebM: when the Cues element starts beyond the probe window.
  • OGG: up to 64 equidistant binary-search probes across the stream.
  • AVI: one fetch of the last 64 KB to locate the idx1 chunk.
  • MPEG-TS: up to 64 equidistant binary-search probes for PCR timestamps.

All other formats (MP4, MP3, FLAC, WAV, AIFF, AAC, FLV) parse entirely from the probe.
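
For formats without a stored index, the OGG and MPEG-TS entries above describe a bounded search over byte offsets. The sketch below shows plain bisection with a probe budget. It is an assumption about the general technique, not the crate's actual algorithm. The `timestamp_at` closure is hypothetical and stands in for "fetch a small range at this offset and read the first timestamp found"; it must be non-decreasing in the offset.

```rust
/// Finds an offset in `[0, total_size)` whose timestamp is `<= target_secs`,
/// narrowing towards the largest such offset with at most `max_probes`
/// calls to `timestamp_at`.
fn bisect_offset<F>(total_size: u64, target_secs: f64, max_probes: u32, mut timestamp_at: F) -> u64
where
    F: FnMut(u64) -> f64,
{
    let (mut lo, mut hi) = (0u64, total_size);
    let mut probes = 0;
    // Invariant: `lo` never overshoots the target, so it is always a safe start.
    while hi - lo > 1 && probes < max_probes {
        let mid = lo + (hi - lo) / 2;
        probes += 1;
        if timestamp_at(mid) <= target_secs {
            lo = mid; // target is at or after `mid`
        } else {
            hi = mid; // target is before `mid`
        }
    }
    lo
}
```

If the probe budget runs out early, the returned offset is simply further before the target than necessary, so the caller downloads a few extra bytes rather than missing the start of the clip.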

§🚨 Error handling

pub enum Error {
    /// MHTML, plaintext, or unrecognized magic bytes.
    UnsupportedFormat,
    /// Container index could not be parsed (truncated data, invalid structure).
    ParseFailed { reason: String },
    /// An extra Range fetch required by the parser failed.
    FetchFailed(Box<dyn std::error::Error + Send + Sync>),
}

UnsupportedFormat is the expected case for storyboard MHTML segments and non-media content. Callers should handle it by falling back to a full download or reporting that seeking is unavailable.
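
A caller-side fallback could be sketched as follows. To keep the example self-contained it declares a local mirror of the `Error` enum shown above instead of importing the crate, and the `Plan` type and `plan_download` function are hypothetical. Treating `ParseFailed` as a reason to fall back to a full download is a policy assumption, not something the crate prescribes.

```rust
/// Local mirror of the documented error enum, so the sketch compiles on its own.
#[derive(Debug)]
enum Error {
    UnsupportedFormat,
    ParseFailed { reason: String },
    FetchFailed(Box<dyn std::error::Error + Send + Sync>),
}

/// What the caller does next (hypothetical type).
#[derive(Debug, PartialEq)]
enum Plan {
    /// Request only these inclusive bytes.
    Partial { start: u64, end: u64 },
    /// Seeking is unavailable: download the whole stream.
    FullDownload,
    /// Give up and surface the message.
    Abort(String),
}

/// Maps the outcome of index parsing plus range lookup to a download plan.
fn plan_download(result: Result<(u64, u64), Error>) -> Plan {
    match result {
        Ok((start, end)) => Plan::Partial { start, end },
        // Expected for storyboards and non-media content, not a real failure.
        Err(Error::UnsupportedFormat) => Plan::FullDownload,
        // Assumed policy: a damaged index still leaves the full stream usable.
        Err(Error::ParseFailed { .. }) => Plan::FullDownload,
        // A transport problem would likely hit a full download as well.
        Err(Error::FetchFailed(source)) => Plan::Abort(source.to_string()),
    }
}
```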


§🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request. Make sure to follow the Contributing Guidelines.

§📄 License

This project is licensed under the GPL-3.0 License.

Re-exports§

pub use error::Error;
pub use error::Result;
pub use index::ByteRange;
pub use index::ContainerIndex;

Modules§

error
Error types for the media-seek crate.
index
Container seek index types returned by all format parsers.

Traits§

RangeFetcher
A source capable of fetching an arbitrary byte range from a remote stream.

Functions§

parse
Parses the container index for a stream and returns a ContainerIndex.