<p align="center">
  <picture>
    <source media="(prefers-color-scheme: dark)" srcset="assets/captube-aperture-dark.svg">
    <img src="assets/captube-aperture.svg" alt="captube" width="160">
  </picture>
</p>

<h1 align="center">captube</h1>
<p align="center">Turn any YouTube video into slides.</p>

<p align="center">
  <a href="https://crates.io/crates/captube"><img src="https://img.shields.io/crates/v/captube.svg?style=flat-square" alt="crates.io"></a>
  <a href="https://docs.rs/captube"><img src="https://img.shields.io/docsrs/captube?style=flat-square" alt="docs.rs"></a>
  <a href="https://crates.io/crates/captube"><img src="https://img.shields.io/crates/d/captube.svg?style=flat-square" alt="downloads"></a>
  <a href="#license"><img src="https://img.shields.io/crates/l/captube.svg?style=flat-square" alt="license"></a>
</p>

---

captube takes a YouTube lecture URL and gives you back a PDF where every
page is one unique slide, captured from the video itself.

## Install

```bash
cargo install captube
```

Runtime dependencies (not bundled):

- `ffmpeg` and `ffprobe` — on `PATH`
- `yt-dlp` — on `PATH`
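
If you want to confirm these tools are reachable before a run, a minimal Rust sketch of a `PATH` lookup is shown below. The helpers `find_on_path` and `missing_tools` are hypothetical names used for illustration and are not part of captube; the sketch also assumes a Unix-like system where executables have no file extension.

```rust
use std::env;
use std::ffi::OsStr;
use std::path::PathBuf;

/// Return the first `<dir>/<tool>` on `path_var` that exists as a file.
/// Hypothetical helper for illustration; it does not check the executable bit
/// and does not try Windows extensions such as `.exe`.
fn find_on_path(tool: &str, path_var: &OsStr) -> Option<PathBuf> {
    env::split_paths(path_var)
        .map(|dir| dir.join(tool))
        .find(|candidate| candidate.is_file())
}

/// Names of the runtime dependencies listed above that were not found.
fn missing_tools(path_var: &OsStr) -> Vec<&'static str> {
    ["ffmpeg", "ffprobe", "yt-dlp"]
        .into_iter()
        .filter(|tool| find_on_path(tool, path_var).is_none())
        .collect()
}

fn main() {
    let path_var = env::var_os("PATH").unwrap_or_default();
    let missing = missing_tools(&path_var);
    if missing.is_empty() {
        println!("all runtime dependencies found");
    } else {
        println!("missing from PATH: {}", missing.join(", "));
    }
}
```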

## Use

```bash
captube 'https://www.youtube.com/watch?v=<VIDEO_ID>' -o slides.pdf
```

All options:

```
captube <URL> [OPTIONS]

  -o, --output <PATH>              Output PDF path [default: output.pdf]
      --scene-threshold <F>        ffmpeg scene score cut-off (0.0-1.0) [default: 0.30]
      --fps <F>                    Sampling fps used during scene scanning [default: 2.0]
      --max-width <U32>            Maximum px width of embedded frames [default: 1280]
      --dedup-threshold <F>        Mean pixel diff (0-255) to consider frames
                                   the same slide — raise for fewer pages,
                                   lower to keep subtler slide variations
                                   [default: 20.0]
      --keep-workdir               Keep intermediate files for inspection
  -v, --verbose                    Print per-frame dedup decisions
```

## How it works

1. **Download** — `yt-dlp` fetches a video-only mp4 at ≤720p.
2. **Keyframe dump** — `ffmpeg -skip_frame nokey` decodes only keyframes
   (about one per GOP). Modern H.264 encoders put keyframes on scene
   boundaries, so these cover every real slide change — plus duplicates
   for slides that outlast a single GOP.
3. **Perceptual dedup** — each keyframe is downscaled to a 256×256
   grayscale thumbnail and compared to the previous kept frame by mean
   absolute difference. Frames that differ only by mouse-cursor motion
   fall below the threshold and are dropped.
4. **Settle re-extract** — every remaining keyframe is re-extracted via
   `-ss pts+0.8`. This bypasses a decoder quirk where `-skip_frame nokey`
   occasionally hands out corrupt-looking frames at cross-fade
   boundaries, and it also lands on the stable post-transition frame if
   the keyframe happened to fall mid-fade.
5. **Final dedup + PDF** — a small-threshold pass collapses any frames
   whose settled versions converged onto the same slide; `printpdf`
   writes one page per remaining frame.
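
The dedup in step 3 can be sketched in a few lines of Rust. This is an illustration under stated assumptions, not captube's actual code: the function names `mean_abs_diff` and `dedup` are hypothetical, thumbnails are plain grayscale byte buffers, and a frame is treated as a new slide when its difference from the last kept frame is strictly greater than the threshold (the source does not say whether the comparison is strict).

```rust
/// Mean absolute difference between two equally sized grayscale buffers.
/// The result lies in 0.0..=255.0, the same scale as `--dedup-threshold`.
fn mean_abs_diff(a: &[u8], b: &[u8]) -> f64 {
    assert_eq!(a.len(), b.len(), "thumbnails must have the same size");
    assert!(!a.is_empty(), "thumbnails must not be empty");
    let total: u64 = a
        .iter()
        .zip(b)
        .map(|(&x, &y)| u64::from(x.abs_diff(y)))
        .sum();
    total as f64 / a.len() as f64
}

/// Indices of the frames to keep. Each frame is compared to the previously
/// *kept* frame, not to its immediate predecessor.
/// Assumption: "different slide" means diff > threshold (strict).
fn dedup(thumbs: &[Vec<u8>], threshold: f64) -> Vec<usize> {
    let mut kept: Vec<usize> = Vec::new();
    for (i, thumb) in thumbs.iter().enumerate() {
        let is_new = match kept.last() {
            None => true, // the first frame is always kept
            Some(&prev) => mean_abs_diff(&thumbs[prev], thumb) > threshold,
        };
        if is_new {
            kept.push(i);
        }
    }
    kept
}

fn main() {
    // Tiny 2x2 stand-ins for the 256x256 thumbnails.
    let thumbs: Vec<Vec<u8>> = vec![
        vec![10, 10, 10, 10],     // slide A
        vec![10, 10, 10, 30],     // slide A, one pixel changed: diff 5.0
        vec![200, 200, 200, 200], // slide B: diff 190.0 from slide A
    ];
    println!("{:?}", dedup(&thumbs, 20.0)); // prints [0, 2]
}
```

Comparing against the last kept frame rather than the immediate predecessor means a slow drift across many near-identical keyframes cannot slip through one small step at a time.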

On a 58-minute lecture the full pipeline (download → PDF) runs in ~17s
on a modern x86_64 box.
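
For a concrete picture of the settle re-extract in step 4, the sketch below builds an `ffmpeg` argument list that seeks to `pts + 0.8` and writes a single frame. The exact invocation captube uses is not given here, so treat this as an assumption: `settle_args` is a hypothetical helper, and the flags (`-ss` before `-i` for input seeking, `-frames:v 1` for one output frame, `-y` to overwrite) are one plausible way to do it with standard ffmpeg options.

```rust
/// Offset added to a keyframe's timestamp, taken from `pts+0.8` in step 4.
const SETTLE_OFFSET_SECS: f64 = 0.8;

/// Build ffmpeg arguments that extract one frame shortly after `pts_secs`.
/// Hypothetical helper; file names are supplied by the caller.
fn settle_args(input: &str, pts_secs: f64, output: &str) -> Vec<String> {
    vec![
        "-y".to_string(), // overwrite the output file if it exists
        "-ss".to_string(), // placed before -i, so ffmpeg seeks in the input
        format!("{:.3}", pts_secs + SETTLE_OFFSET_SECS),
        "-i".to_string(),
        input.to_string(),
        "-frames:v".to_string(), // stop after a single video frame
        "1".to_string(),
        output.to_string(),
    ]
}

fn main() {
    let args = settle_args("video.mp4", 12.0, "frame_0001.png");
    println!("ffmpeg {}", args.join(" "));
    // To run it for real (needs ffmpeg on PATH and an existing input file):
    // let status = std::process::Command::new("ffmpeg").args(&args).status();
}
```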

## License

Licensed under either of

- Apache License, Version 2.0 ([LICENSE-APACHE](LICENSE-APACHE))
- MIT license ([LICENSE-MIT](LICENSE-MIT))

at your option.