DialogDetective
Automatically identify and rename unknown TV series video files by letting AI listen to their dialogue.
Why I Built This
I sometimes rip TV series from my Blu-ray/DVD collection to have them available for easier binge watching. Unfortunately, the structure of those disc releases is often completely non-linear - you get files like TITLE_01.mkv, TITLE_03.mkv, TITLE_07.mkv with no clear indication which episode is which.
I didn't want to manually map these weird title IDs to actual season and episode numbers. That would require me to watch a bit of each file and guess based on episode summaries from TV databases. I thought modern LLMs should be able to do this for me. A quick prototype later, it turned out they can.
So I created DialogDetective to do this work automatically. If you have the same problem, this tool might help you too.
How It Works
DialogDetective extracts audio from your video files, transcribes the dialogue using Whisper (GPU-accelerated where the build supports it), fetches episode metadata from TVMaze, and uses an LLM (Gemini or Claude) to match the transcript to the correct episode. It then renames or copies the files with proper episode information.
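As a rough illustration of the first step, the sketch below builds an FFmpeg command line that extracts the audio track as 16 kHz mono 16-bit PCM WAV, a format that whisper.cpp-based tools commonly work with. It is written in Python for brevity and is not DialogDetective's actual implementation; the function name `ffmpeg_extract_args` and the exact flags the tool passes are assumptions.

```python
from pathlib import Path


def ffmpeg_extract_args(video: Path, wav_out: Path) -> list[str]:
    """Build an FFmpeg command that writes 16 kHz mono 16-bit PCM WAV.

    Hypothetical helper: the flags DialogDetective really uses may differ.
    """
    return [
        "ffmpeg",
        "-y",                 # overwrite the output file if it exists
        "-i", str(video),     # input video file
        "-vn",                # ignore the video stream
        "-ac", "1",           # downmix to a single (mono) channel
        "-ar", "16000",       # resample to 16 kHz
        "-c:a", "pcm_s16le",  # 16-bit little-endian PCM
        str(wav_out),
    ]
```

The resulting list can be passed to `subprocess.run(args, check=True)`, which requires FFmpeg to be on your PATH.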
Installation
Pre-built Binaries
Pre-built binaries are available on the GitHub Releases page:
- macOS (Apple Silicon & Intel): Built with Metal GPU acceleration
- Linux (x86_64 & aarch64): Built with CPU-only Whisper
- Windows (x86_64): Built with CPU-only Whisper
GPU Acceleration
DialogDetective uses whisper-rs for speech-to-text, which supports various GPU backends for faster transcription.
Default builds:
- macOS: Metal (Apple GPU) - enabled automatically
- Linux/Windows: CPU-only
Building with GPU support (Linux/Windows):
If you have the required GPU frameworks installed, you can build with GPU acceleration:
# NVIDIA CUDA (requires CUDA toolkit)
# Vulkan (requires Vulkan SDK)
# AMD ROCm/hipBLAS (requires ROCm)
See the whisper-rs documentation for detailed requirements for each GPU backend.
Prerequisites
- Rust toolchain (install from rustup.rs)
- FFmpeg - Must be installed and available in your PATH for audio extraction
  - macOS: brew install ffmpeg
  - Ubuntu/Debian: apt install ffmpeg
  - Windows: Download from ffmpeg.org
- AI CLI: Gemini CLI (default) or Claude Code
  - Must be installed and authenticated before use
Whisper models are downloaded automatically on first run.
Quick Start
# Dry run - see what would happen (recommended first step)
# It is encouraged to limit processing to specific seasons. See below for more information about this.
# Rename files in place
# Copy files to organized directory
# Select different Whisper model (default: base)
# See all available options
Season Filtering (Highly Encouraged!)
DialogDetective can work on full series without season filtering, but using -s or --season is highly encouraged for several important reasons:
- Reduces LLM context size - Only sends relevant episodes to the AI instead of the entire series
- Improves matching accuracy - Fewer episodes means less confusion and better identification
- Saves tokens - Significantly reduces API costs, especially for long-running series
- Faster processing - Less data to send and analyze
Important: The season filter limits the matching scope. If you specify -s 1 and a video file is actually from season 2, it will likely be mismatched to a season 1 episode. Only use season filtering when you know all your video files belong to the specified season(s).
Since you typically process a single season at a time when ripping discs, passing the matching season makes the tool much more effective, for example -s 1 or --season 2.
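The effect of the filter on the candidate list can be sketched in a few lines of Python. This is an illustration only, not the tool's code: the function name `filter_by_season` is hypothetical, the sample episodes are made up, and the `season`, `number`, and `name` keys are assumed to mirror the fields of TVMaze's episode JSON.

```python
def filter_by_season(episodes, seasons):
    """Keep only episodes whose season is in `seasons`.

    An empty `seasons` list means "no filter", so every episode stays
    a candidate (and the LLM prompt grows accordingly).
    """
    if not seasons:
        return list(episodes)
    wanted = set(seasons)
    return [ep for ep in episodes if ep["season"] in wanted]


# Made-up sample data for illustration.
episodes = [
    {"season": 1, "number": 1, "name": "Pilot"},
    {"season": 1, "number": 2, "name": "Second Episode"},
    {"season": 2, "number": 1, "name": "Season Two Opener"},
]

candidates = filter_by_season(episodes, [1])
print(len(candidates))  # 2 of the 3 episodes remain as match candidates
```

A file that actually belongs to season 2 has no correct candidate left in this list, which is why a wrong filter leads to a mismatch rather than an error.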
Usage
Run dialog_detective --help for complete usage information.
Important options:
- -s/--season - Filter to specific season (highly encouraged, can be repeated)
- --model - Select Whisper model
- --matcher - AI backend: gemini (default) or claude
- --mode - Operation: dry-run (default), rename, or copy
- --list-models - Show all available Whisper models
Filename Templates
Use --format to customize output filenames.
Default: {show} - S{season:02}E{episode:02} - {title}.{ext}
Available variables:
- {show} - Series name
- {season} - Season number (use {season:02} for zero-padding)
- {episode} - Episode number (use {episode:02} for zero-padding)
- {title} - Episode title
- {ext} - Original file extension
Example:
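To show how such a template expands, here is a minimal Python sketch. Python's `str.format` happens to accept the same `{name:02}` padding syntax, so it serves as a stand-in here; DialogDetective's own template parser may behave differently in edge cases, and the show and title values are made up.

```python
DEFAULT_TEMPLATE = "{show} - S{season:02}E{episode:02} - {title}.{ext}"


def render_filename(template, show, season, episode, title, ext):
    """Expand a filename template; season and episode must be integers
    so that the :02 zero-padding specifier applies."""
    return template.format(
        show=show, season=season, episode=episode, title=title, ext=ext
    )


print(render_filename(DEFAULT_TEMPLATE, "Example Show", 1, 3, "Pilot", "mkv"))
# -> Example Show - S01E03 - Pilot.mkv
```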
AI Backend Integration
DialogDetective currently uses CLI tools for LLM access (Gemini CLI and Claude Code). This was the easiest way for me to quickly support LLMs, as I already had both tools installed and authenticated on my system.
The interface is abstracted enough to easily add direct API access via API keys (OpenAI, Anthropic, etc.) if there's demand for it. If you need direct API support, feel free to reach out or submit a PR - contributions are highly welcome!
Cache & Storage
DialogDetective caches various data to avoid redundant processing and speed up repeated runs.
Cache Location
All cached data is stored in a platform-specific cache directory:
- macOS: ~/Library/Caches/de.westhoffswelt.dialogdetective/
- Linux: ~/.cache/dialogdetective/
- Windows: %LOCALAPPDATA%\dialogdetective\
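If you want to locate the cache from a script, the mapping above can be expressed as a small Python helper. This is a sketch under stated assumptions: the function name `cache_dir` is hypothetical, and it simply hard-codes the three paths listed above rather than reproducing whatever lookup the tool performs internally.

```python
import os
import sys
from pathlib import Path


def cache_dir(platform=sys.platform, home=None, env=None):
    """Return the cache directory documented for the given platform.

    `platform`, `home`, and `env` are injectable so the function can be
    exercised without depending on the machine it runs on.
    """
    home = Path.home() if home is None else Path(home)
    env = os.environ if env is None else env
    if platform == "darwin":
        return home / "Library" / "Caches" / "de.westhoffswelt.dialogdetective"
    if platform.startswith("win"):
        # Raises KeyError if LOCALAPPDATA is not set.
        return Path(env["LOCALAPPDATA"]) / "dialogdetective"
    return home / ".cache" / "dialogdetective"
```

Calling `cache_dir()` with no arguments resolves the directory for the current machine.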
What Gets Cached
| Data | Location | TTL | Why |
|---|---|---|---|
| Whisper Models | models/ | Permanent | Models are large (39MB - 2.9GB) and don't change. Downloaded once from HuggingFace on first use. |
| Series Metadata | metadata/ | 24 hours | Episode lists from TVMaze rarely change. Caching reduces API calls and speeds up repeated runs on the same show. |
| Transcripts | transcripts/ | 24 hours | Whisper transcription is CPU/GPU intensive. Caching by video file hash means re-running on the same files skips transcription entirely. |
| Match Results | matching/ | 24 hours | LLM matching costs tokens and time. Results are cached by a composite key (video hash + show + seasons + matcher), so identical queries return instantly. |
The 24-hour TTL balances freshness with efficiency. If you need to force a refresh (e.g., after TVMaze updates episode data), simply delete the relevant cache subdirectory.
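The composite key and the TTL check can be pictured with the following Python sketch. It illustrates the idea only: the use of SHA-256, the separator character, the sorting of seasons, and the names `match_cache_key` and `is_fresh` are all assumptions, not a description of how DialogDetective derives its keys.

```python
import hashlib
import time

TTL_SECONDS = 24 * 60 * 60  # the 24-hour TTL from the table above


def match_cache_key(video_hash, show, seasons, matcher):
    """Combine video hash + show + seasons + matcher into one key.

    Seasons are sorted so that "-s 1 -s 2" and "-s 2 -s 1" share an entry
    (an assumption made for this sketch).
    """
    season_part = ",".join(str(s) for s in sorted(seasons))
    raw = "\x1f".join([video_hash, show, season_part, matcher])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def is_fresh(written_at, now=None, ttl=TTL_SECONDS):
    """Return True while a cache entry is younger than the TTL.

    `written_at` and `now` are Unix timestamps in seconds.
    """
    now = time.time() if now is None else now
    return (now - written_at) < ttl
```

Changing any component of the key, such as switching the matcher from gemini to claude, yields a different key, so the earlier result is not reused.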
Temporary Files
During processing, DialogDetective extracts audio to temporary WAV files in your system's temp directory (/tmp, /var/folders/..., or %TEMP%). These files are automatically cleaned up when processing completes or if the program is interrupted.
Managing Cache
To clear all cached data:
# macOS
# Linux
To clear only models (to free disk space):
# macOS
# Linux
Use dialog_detective --list-models to see which models are currently cached and their sizes.
License
MIT License - see LICENSE file for details.