🔍 Hunch
A fast, offline media filename parser for Rust — extract title, year, season, episode, codec, language, and 40+ other properties from messy filenames.
Hunch is a Rust rewrite of Python's guessit. All 49 guessit properties are implemented. The engine uses a tokenizer-first, two-pass, TOML-driven architecture with linear-time regex only (ReDoS-immune).
Install
Homebrew (macOS / Linux)
Cargo (from source)
Pre-built binaries
Download from GitHub Releases.
Also supports cargo-binstall:
As a library
Usage
CLI
{
}
Multiple files at once:
Options:
-t, --type <TYPE> Hint media type: "movie" or "episode"
-n, --name-only Treat input as name only (no path separators)
-j, --json Output compact JSON (default is pretty-printed)
Library
use hunch;
guessit Compatibility
Hunch validates against guessit's own 1,309-case YAML test suite. All 49 guessit properties are implemented (3 intentionally diverged).
| guessit (Python) | hunch (Rust) | |
|---|---|---|
| Overall pass rate | 100% | 79.1% (1,036 / 1,309) |
| Properties implemented | 49 | 49 (3 intentionally diverged) |
| Properties at 95%+ | 49 | 22 |
| Properties at 100% | 49 | 16 |
96–100% accurate: year, video_codec, container, source, screen_size, audio_codec, crc32, color_depth, streaming_service, edition, frame_rate, aspect_ratio, size, version, date, proper_count, and more.
90%+ accurate: title, release_group, episode, season, audio_channels, type, website, film_title.
For per-property breakdowns see COMPATIBILITY.md.
Intentional divergences
| Property | Reason |
|---|---|
audio_bit_rate / video_bit_rate |
Hunch uses a single bit_rate |
mimetype |
Trivially derived from container; redundant |
Design
Hunch does not port guessit's rebulk engine. Instead:
- Tokenize — split on separators, extract extension, detect brackets
- Zone map — detect anchors (SxxExx, 720p, x264) to establish title zone vs tech zone boundaries
- Pass 1: Match & Resolve — 20 TOML rule files (embedded at compile time) + algorithmic matchers. Conflict resolution by priority then length.
- Pass 2: Extract — release group, title, and episode title run with access to resolved match positions from Pass 1.
- Result —
HunchResultwith 49 typed property accessors
See ARCHITECTURE.md for the full design and decision log.
Project Structure
src/
├── lib.rs # Public API: hunch(), hunch_with()
├── main.rs # CLI binary (clap)
├── hunch_result.rs # HunchResult type + JSON serialization
├── options.rs # Configuration
├── zone_map.rs # Structural zone analysis
├── tokenizer.rs # Input → TokenStream
├── pipeline/ # Two-pass orchestration
├── matcher/ # Conflict resolution + TOML rule engine
└── properties/ # 31 property matcher modules
rules/ # 20 TOML data files (compile-time embedded)
tests/ # Integration tests + guessit regression suite
Contributing