Expand description
§Hunch
A fast, offline Rust library for extracting structured media metadata from filenames, inspired by Python’s guessit.
Hunch parses messy media filenames and release names into structured metadata: title, year, season, episode, video codec, audio codec, resolution, and 49 properties in total.
§Quick Start
use hunch::hunch;
let result = hunch("The.Matrix.1999.1080p.BluRay.x264-GROUP.mkv");
assert_eq!(result.title(), Some("The Matrix"));
assert_eq!(result.year(), Some(1999));
assert_eq!(result.screen_size(), Some("1080p"));
assert_eq!(result.source(), Some("Blu-ray"));
assert_eq!(result.video_codec(), Some("H.264"));
assert_eq!(result.release_group(), Some("GROUP"));
assert_eq!(result.container(), Some("mkv"));§Accessing Properties
HunchResult provides typed convenience accessors for common
properties, plus generic first and
all methods for any Property:
use hunch::hunch;
use hunch::Property;
let r = hunch("Movie.2024.FRENCH.1080p.BluRay.DTS.x264-GROUP.mkv");
// Typed accessors (return the first value):
assert_eq!(r.title(), Some("Movie"));
assert_eq!(r.year(), Some(2024));
// Generic accessor for any property:
assert_eq!(r.first(Property::Language), Some("French"));§Multi-Valued Properties
Some properties (languages, episodes, “other” flags) can have multiple
values. Use all to retrieve them:
use hunch::hunch;
let r = hunch("Movie.2024.2160p.UHD.BluRay.Remux.HDR.HEVC.DTS-HD.MA-GROUP.mkv");
let flags = r.other();
assert!(flags.contains(&"HDR10"));
assert!(flags.contains(&"Remux"));§JSON Output
Convert results to a JSON-friendly map with to_flat_map:
use hunch::hunch;
let r = hunch("Movie.2024.1080p.BluRay.x264-GROUP.mkv");
let map = r.to_flat_map();
assert_eq!(map["title"], "Movie");
assert_eq!(map["year"], 2024);§Logging / Debugging
Hunch uses the log crate for diagnostic output. Enable it to
see how each pipeline stage processes a filename:
# CLI: use --verbose for debug, RUST_LOG for trace
hunch -v "Movie.2024.1080p.mkv"
RUST_LOG=hunch=trace hunch "Movie.2024.1080p.mkv"In library usage, attach any log-compatible subscriber
(e.g., env_logger, tracing-log).
§Architecture
Hunch uses a span-based, two-pass architecture with plain function
pointers (no trait objects, no unsafe):
- Tokenize — split on separators, extract extension, detect brackets
- Zone map — detect anchors (SxxExx, 720p, x264) to establish title-zone vs tech-zone boundaries
- Pass 1: Match & Resolve — 20 TOML rule files + algorithmic
matchers produce
MatchSpans; conflict resolution keeps higher-priority / longer matches - Pass 2: Extract — release group, title, episode title run with access to resolved match positions from Pass 1
- Result —
HunchResultwith 49 typed property accessors
All regex patterns use the regex crate only (linear-time, ReDoS-immune).
TOML rule files are embedded at compile time via include_str! — no
runtime file I/O.
Re-exports§
pub use matcher::span::Property;
Modules§
- matcher
- Match span and matching infrastructure.
- properties
- Property matchers and TOML integration tests.
- tokenizer
- Tokenizer: splits a media filename into a stream of tokens.
- zone_
map - Zone map: structural analysis of filename zones.
Structs§
- Hunch
Result - The result of parsing a media filename.
- Pipeline
- The two-pass parsing pipeline.
Enums§
- Confidence
- How confident hunch is in the extracted result.
- Media
Type - The type of media detected.
Functions§
- hunch
- Parse a media filename and return structured metadata.
- hunch_
with_ context - Parse a media filename using sibling filenames for improved title detection.