Skip to main content

Crate hunch

Crate hunch 

Source
Expand description

§Hunch

A fast, offline Rust library for extracting structured media metadata from filenames, inspired by Python’s guessit.

Hunch parses messy media filenames and release names into structured metadata: title, year, season, episode, video codec, audio codec, resolution, and 49 properties in total.

§Quick Start

use hunch::hunch;

let result = hunch("The.Matrix.1999.1080p.BluRay.x264-GROUP.mkv");
assert_eq!(result.title(), Some("The Matrix"));
assert_eq!(result.year(), Some(1999));
assert_eq!(result.screen_size(), Some("1080p"));
assert_eq!(result.source(), Some("Blu-ray"));
assert_eq!(result.video_codec(), Some("H.264"));
assert_eq!(result.release_group(), Some("GROUP"));
assert_eq!(result.container(), Some("mkv"));

§Accessing Properties

HunchResult provides typed convenience accessors for common properties, plus generic first and all methods for any Property:

use hunch::hunch;
use hunch::Property;

let r = hunch("Movie.2024.FRENCH.1080p.BluRay.DTS.x264-GROUP.mkv");

// Typed accessors (return the first value):
assert_eq!(r.title(), Some("Movie"));
assert_eq!(r.year(), Some(2024));

// Generic accessor for any property:
assert_eq!(r.first(Property::Language), Some("French"));

§Multi-Valued Properties

Some properties (languages, episodes, “other” flags) can have multiple values. Use all to retrieve them:

use hunch::hunch;

let r = hunch("Movie.2024.2160p.UHD.BluRay.Remux.HDR.HEVC.DTS-HD.MA-GROUP.mkv");
let flags = r.other();
assert!(flags.contains(&"HDR10"));
assert!(flags.contains(&"Remux"));

§JSON Output

Convert results to a JSON-friendly map with to_flat_map:

use hunch::hunch;

let r = hunch("Movie.2024.1080p.BluRay.x264-GROUP.mkv");
let map = r.to_flat_map();
assert_eq!(map["title"], "Movie");
assert_eq!(map["year"], 2024);

§Logging / Debugging

Hunch uses the log crate for diagnostic output. Enable it to see how each pipeline stage processes a filename:

# CLI: use --verbose for debug, RUST_LOG for trace
hunch -v "Movie.2024.1080p.mkv"
RUST_LOG=hunch=trace hunch "Movie.2024.1080p.mkv"

In library usage, attach any log-compatible subscriber (e.g., env_logger, tracing-log).

§Architecture

Hunch uses a span-based, two-pass architecture with plain function pointers (no trait objects, no unsafe):

  1. Tokenize — split on separators, extract extension, detect brackets
  2. Zone map — detect anchors (SxxExx, 720p, x264) to establish title-zone vs tech-zone boundaries
  3. Pass 1: Match & Resolve — 20 TOML rule files + algorithmic matchers produce MatchSpans; conflict resolution keeps higher-priority / longer matches
  4. Pass 2: Extract — release group, title, episode title run with access to resolved match positions from Pass 1
  5. ResultHunchResult with 49 typed property accessors

All regex patterns use the regex crate only (linear-time, ReDoS-immune). TOML rule files are embedded at compile time via include_str! — no runtime file I/O.

Re-exports§

pub use matcher::span::Property;

Modules§

matcher
Match span and matching infrastructure.
properties
Property matchers and TOML integration tests.
tokenizer
Tokenizer: splits a media filename into a stream of tokens.
zone_map
Zone map: structural analysis of filename zones.

Structs§

HunchResult
The result of parsing a media filename.
Pipeline
The two-pass parsing pipeline.

Enums§

Confidence
How confident hunch is in the extracted result.
MediaType
The type of media detected.

Functions§

hunch
Parse a media filename and return structured metadata.
hunch_with_context
Parse a media filename using sibling filenames for improved title detection.