A transport-agnostic container format index parser for Rust
Translate any [start_secs, end_secs] window into the exact byte ranges needed for partial HTTP downloads, with no subprocess and no HTTP client bundled.
§ Why media-seek?
Downloading only a time clip from a streaming URL requires knowing which bytes to request. Every container format stores this information differently: MP4 uses a sidx box, WebM has EBML Cues, MP3 uses the Xing TOC, and so on.
media-seek abstracts all of that. You feed it the leading bytes of a stream and it returns a ContainerIndex that answers one question: given a time window [start, end], which bytes do I need to fetch?
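To make the question concrete, here is a minimal sketch of a time-to-byte lookup over a sorted list of seek points. It is not the crate's implementation: `SeekPoint` and `window_to_bytes` are hypothetical names, and the sketch assumes the index is sorted by time with increasing offsets. The window is widened outward so that it starts and ends on a seek point.

```rust
/// Hypothetical index entry: the decodable unit beginning at `time_secs`
/// starts at byte `offset`. Not a type from media-seek.
#[derive(Debug, Clone, Copy, PartialEq)]
struct SeekPoint {
    time_secs: f64,
    offset: u64,
}

/// Returns the inclusive byte range `(start, end)` covering
/// `[start_secs, end_secs]`, widened to the surrounding seek points.
/// `points` must be sorted by time; `total_size` is the stream length in bytes.
fn window_to_bytes(
    points: &[SeekPoint],
    total_size: u64,
    start_secs: f64,
    end_secs: f64,
) -> Option<(u64, u64)> {
    if points.is_empty() || total_size == 0 || start_secs > end_secs {
        return None;
    }
    // Round the start down: last seek point at or before `start_secs`.
    // If there is none, the window is not covered by the index.
    let after_start = points.partition_point(|p| p.time_secs <= start_secs);
    let start = points[after_start.checked_sub(1)?].offset;
    // Round the end up: stop one byte before the first seek point that
    // lies strictly after `end_secs`, or at the last byte of the stream.
    let after_end = points.partition_point(|p| p.time_secs <= end_secs);
    let end = match points.get(after_end) {
        Some(p) => p.offset.saturating_sub(1),
        None => total_size - 1,
    };
    Some((start, end))
}
```

Both lookups use `partition_point`, so the cost is logarithmic in the number of seek points.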
There is no HTTP client bundled and no subprocess spawned. Callers implement the two-line RangeFetcher trait to supply bytes from whatever transport they use. Formats whose index lies outside the probe window (WebM Cues, OGG page bisection, AVI idx1, MPEG-TS PCR search) request additional ranges through that trait automatically.
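Whatever transport a fetcher uses, it has to get the inclusive range arithmetic right. The sketch below shows that arithmetic in isolation; `range_header` and `probe_end` are hypothetical helpers, not part of media-seek. They assume the inclusive `[start, end]` convention of HTTP Range requests.

```rust
/// Formats the value of an HTTP `Range` header for the inclusive
/// byte range `[start, end]`.
fn range_header(start: u64, end: u64) -> String {
    format!("bytes={start}-{end}")
}

/// Inclusive end offset of a probe of `probe_len` bytes starting at byte 0,
/// clamped to the last byte of the stream when the size is known.
/// Returns `None` when there is nothing to fetch.
fn probe_end(probe_len: u64, total_size: Option<u64>) -> Option<u64> {
    // A probe of N bytes covers offsets 0..=N-1.
    let wanted = probe_len.checked_sub(1)?;
    match total_size {
        Some(0) => None,
        Some(size) => Some(wanted.min(size - 1)),
        None => Some(wanted),
    }
}
```

With these helpers, a 512 KiB probe of a stream of unknown size ends at byte 524287, which corresponds to the header value `bytes=0-524287`.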
§ How to get it
Add the following to your Cargo.toml file:
[dependencies]
media-seek = "0.4.0"

Check the releases page for the latest version.
§ Observability & Tracing
This crate always includes the tracing crate. It emits debug events for each format detection, parse step, and extra fetch.
Important: without a configured subscriber, tracing macros are effectively no-ops, so the runtime overhead is negligible if you don't add one.
To capture logs, add a subscriber in your application:
[dependencies]
tracing-subscriber = "0.3"

use tracing::Level;
use tracing_subscriber::FmtSubscriber;

let subscriber = FmtSubscriber::builder()
    .with_max_level(Level::DEBUG)
    .finish();
tracing::subscriber::set_global_default(subscriber)
    .expect("setting default subscriber failed");

Refer to the tracing-subscriber documentation for more advanced configuration (JSON output, log levels, targets, etc.).
§ Supported formats
| Extension | Index type | Precision | Extra fetches |
|---|---|---|---|
| mp4, m4a | fMP4 SIDX box | Fragment boundary | No |
| webm | EBML Cues element | Cluster boundary | Maybe |
| mp3 | Xing/VBRI TOC or CBR avg | Frame / 1% TOC entry | No |
| ogg | OGG page granule bisection | Page boundary | Yes (up to 64) |
| flac | SEEKTABLE metadata block | Seek point | No |
| wav | PCM formula | BlockAlign-exact | No |
| aiff | PCM formula | Sample-exact | No |
| aac | ADTS frame scan average | ~21 ms frame | No |
| flv | AMF0 onMetaData keyframes | Keyframe | No |
| avi | idx1 chunk at EOF | Frame | Yes (1 fetch) |
| ts | PCR binary search | ~11 ms TS packet | Yes (up to 64) |
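
As an illustration of the "PCM formula" rows, uncompressed audio needs no index at all: the byte offset of a time `t` is `data_start + floor(t * sample_rate) * block_align`. The sketch below applies that formula under stated assumptions. `PcmLayout` and `pcm_byte_range` are hypothetical names, and the field values would come from the WAV `fmt ` and `data` chunks; this is not the crate's own code.

```rust
/// Hypothetical summary of a PCM stream's layout (not a media-seek type).
struct PcmLayout {
    sample_rate: u32, // sample frames per second
    block_align: u16, // bytes per sample frame, all channels included
    data_start: u64,  // offset of the first sample byte in the file
    data_len: u64,    // length of the sample data in bytes
}

/// Inclusive byte range `(start, end)` holding every sample frame that
/// overlaps `[start_secs, end_secs]`, aligned to whole frames.
fn pcm_byte_range(l: &PcmLayout, start_secs: f64, end_secs: f64) -> Option<(u64, u64)> {
    let valid_window = 0.0 <= start_secs && start_secs < end_secs;
    if l.block_align == 0 || l.sample_rate == 0 || !valid_window {
        return None;
    }
    let align = u64::from(l.block_align);
    let rate = f64::from(l.sample_rate);
    let total_frames = l.data_len / align;
    // Round the start down and the end up to whole sample frames,
    // then clamp the end to the frames that actually exist.
    let first = (start_secs * rate).floor() as u64;
    let end_exclusive = ((end_secs * rate).ceil() as u64).min(total_frames);
    if first >= end_exclusive {
        return None; // window lies entirely past the end of the data
    }
    Some((
        l.data_start + first * align,
        l.data_start + end_exclusive * align - 1,
    ))
}
```

For 10 seconds of 44.1 kHz 16-bit stereo audio (`block_align` of 4) behind a 44-byte header, the window [1.0, 2.0] maps to bytes 176444 through 352843.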
MHTML (storyboard segments), None, and unrecognized magic bytes return Err(Error::UnsupportedFormat).
§ Quick start
§ 1. Implement RangeFetcher
use media_seek::RangeFetcher;

struct HttpFetcher {
    client: reqwest::Client,
    url: String,
}

impl RangeFetcher for HttpFetcher {
    type Error = reqwest::Error;

    async fn fetch(&self, start: u64, end: u64) -> std::result::Result<Vec<u8>, Self::Error> {
        let range = format!("bytes={}-{}", start, end);
        self.client
            .get(&self.url)
            .header("Range", range)
            .send()
            .await?
            .bytes()
            .await
            .map(|b| b.to_vec())
    }
}

§ 2. Parse the container index
use media_seek::{parse, RangeFetcher};

struct FileFetcher(Vec<u8>);

impl RangeFetcher for FileFetcher {
    type Error = std::io::Error;

    async fn fetch(&self, start: u64, end: u64) -> std::result::Result<Vec<u8>, Self::Error> {
        let s = start as usize;
        let e = (end as usize + 1).min(self.0.len());
        Ok(self.0[s..e].to_vec())
    }
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let fetcher = FileFetcher(std::fs::read("video.mp4")?);
    // 512 KB is recommended: enough for most format headers and indices.
    // `end` is inclusive, so the last probed byte is 512 * 1024 - 1.
    let probe: Vec<u8> = fetcher.fetch(0, 512 * 1024 - 1).await?;
    // Total stream size in bytes (from Content-Length or a HEAD request)
    let total_size: Option<u64> = Some(1_234_567_890);
    let index = parse(&probe, total_size, &fetcher).await?;
    Ok(())
}

§ 3. Translate timestamps to byte ranges
use media_seek::{parse, RangeFetcher};

struct FileFetcher(Vec<u8>);

impl RangeFetcher for FileFetcher {
    type Error = std::io::Error;

    async fn fetch(&self, start: u64, end: u64) -> std::result::Result<Vec<u8>, Self::Error> {
        let s = start as usize;
        let e = (end as usize + 1).min(self.0.len());
        Ok(self.0[s..e].to_vec())
    }
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let fetcher = FileFetcher(std::fs::read("video.mp4")?);
    // `end` is inclusive, so a 512 KB probe ends at byte 512 * 1024 - 1.
    let probe: Vec<u8> = fetcher.fetch(0, 512 * 1024 - 1).await?;
    let total_size: Option<u64> = Some(1_234_567_890);
    let index = parse(&probe, total_size, &fetcher).await?;
    if let Some(range) = index.find_byte_range(60.0, 120.0) {
        // Always prefetch the init segment so decoders have codec parameters
        let init = fetcher.fetch(0, index.init_end_byte).await?;
        let clip = fetcher.fetch(range.start, range.end).await?;
        // Write init + clip to a file, then trim with FFmpeg stream copy:
        // ffmpeg -i combined.mp4 -ss 60 -t 60 -c copy -avoid_negative_ts 1 -y out.mp4
        let _ = (init, clip);
    }
    Ok(())
}

§ Documentation
The full API reference is available on docs.rs.
§ parse()
pub async fn parse<F: RangeFetcher>(
    probe: &[u8],
    total_size: Option<u64>,
    fetcher: &F,
) -> Result<ContainerIndex>;

Detects the container format from magic bytes in probe and dispatches to the appropriate parser. Returns Err(Error::UnsupportedFormat) for unrecognised formats.
§ ContainerIndex
pub struct ContainerIndex {
    /// Last byte (inclusive) of the codec initialisation data (moov, EBML header, …).
    /// A partial download must always include `bytes 0..=init_end_byte`.
    pub init_end_byte: u64,
}

impl ContainerIndex {
    /// Returns `Some(ByteRange { start, end })` covering `[start_secs, end_secs]`,
    /// expanded to the nearest decodable boundary, or `None` if the range is not covered.
    pub fn find_byte_range(&self, start_secs: f64, end_secs: f64) -> Option<ByteRange>;
}

§ RangeFetcher trait
pub trait RangeFetcher {
    type Error: std::error::Error + Send + Sync + 'static;

    /// Fetches bytes `[start, end]` (inclusive) from the remote stream.
    fn fetch(&self, start: u64, end: u64)
        -> impl Future<Output = std::result::Result<Vec<u8>, Self::Error>> + Send;
}

fetch is called only when extra data is required beyond the initial probe:
- WebM: when the Cues element starts beyond the probe window.
- OGG: up to 64 equidistant binary-search probes across the stream.
- AVI: one fetch of the last 64 KB to locate the idx1 chunk.
- MPEG-TS: up to 64 equidistant binary-search probes for PCR timestamps.
All other formats (MP4, MP3, FLAC, WAV, AIFF, AAC, FLV) parse entirely from the probe.
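For formats without a stored index, the extra fetches amount to a bounded search over byte offsets. The following is a minimal sketch of plain bisection with a probe budget, not the crate's actual search strategy. `bisect_offset` and the `timestamp_at` callback are hypothetical; the callback stands in for "fetch a small range at this offset and decode the first timestamp found there" and is assumed to be non-decreasing in the offset.

```rust
/// Searches `[0, total_size)` for the largest byte offset whose timestamp is
/// at or before `target_secs`, calling `timestamp_at` at most `max_probes`
/// times. If the budget runs out first, the result is the best lower bound
/// found so far. A failed probe (`None`) aborts the search.
fn bisect_offset<F>(
    total_size: u64,
    target_secs: f64,
    max_probes: u32,
    mut timestamp_at: F,
) -> Option<u64>
where
    F: FnMut(u64) -> Option<f64>,
{
    let (mut lo, mut hi) = (0u64, total_size);
    let mut best = None;
    for _ in 0..max_probes {
        if lo >= hi {
            break; // interval exhausted: `best` is exact
        }
        let mid = lo + (hi - lo) / 2;
        let t = timestamp_at(mid)?;
        if t <= target_secs {
            // `mid` is a valid starting point; look for a later one.
            best = Some(mid);
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    best
}
```

Each probe at least halves the remaining interval, so a budget of 64 probes is enough to exhaust any range of `u64` offsets.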
§ Error handling
pub enum Error {
    /// MHTML, plaintext, or unrecognized magic bytes.
    UnsupportedFormat,
    /// Container index could not be parsed (truncated data, invalid structure).
    ParseFailed { reason: String },
    /// An extra Range fetch required by the parser failed.
    FetchFailed(Box<dyn std::error::Error + Send + Sync>),
}

UnsupportedFormat is the expected case for storyboard MHTML segments and non-media content. Callers should handle it by falling back to a full download or reporting that seeking is unavailable.
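One way to express that fallback is sketched below. The `Error` enum here is a local stand-in that mirrors the variants listed above so the example compiles on its own; real code would match on `media_seek::Error`. `Plan` and `plan_download` are hypothetical, and treating `ParseFailed` and `FetchFailed` as hard failures is an assumption of this sketch rather than documented crate guidance.

```rust
use std::error::Error as StdError;

/// Local stand-in mirroring the documented variants of `media_seek::Error`.
#[derive(Debug)]
enum Error {
    UnsupportedFormat,
    ParseFailed { reason: String },
    FetchFailed(Box<dyn StdError + Send + Sync>),
}

/// Hypothetical download decision derived from the parse outcome.
#[derive(Debug, PartialEq)]
enum Plan {
    /// Fetch only the inclusive byte range `[start, end]`.
    Partial { start: u64, end: u64 },
    /// No seek index is available: download the whole stream.
    FullDownload,
    /// Give up and report the message to the user.
    Abort(String),
}

/// Maps the result of locating a byte range onto a download plan.
fn plan_download(outcome: Result<(u64, u64), Error>) -> Plan {
    match outcome {
        Ok((start, end)) => Plan::Partial { start, end },
        // Expected for storyboard segments and non-media content.
        Err(Error::UnsupportedFormat) => Plan::FullDownload,
        // Assumption: a corrupt index or a failed fetch is surfaced, not hidden.
        Err(Error::ParseFailed { reason }) => Plan::Abort(format!("index parse failed: {reason}")),
        Err(Error::FetchFailed(source)) => Plan::Abort(format!("range fetch failed: {source}")),
    }
}
```

Keeping the decision in a plain enum makes the fallback policy easy to unit-test without any network access.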
§ Contributing
Contributions are welcome! Please feel free to submit a Pull Request. Make sure to follow the Contributing Guidelines.
§ License
This project is licensed under the GPL-3.0 License.
§ Re-exports
pub use error::Error;
pub use error::Result;
pub use index::ByteRange;
pub use index::ContainerIndex;
§ Modules
- error: Error types for the media-seek crate.
- index: Container seek index types returned by all format parsers.
§ Traits
- RangeFetcher: A source capable of fetching an arbitrary byte range from a remote stream.
§ Functions
- parse: Parses the container index for a stream and returns a ContainerIndex.