Crate scenesdetect

scenesdetect

A Rust port of PySceneDetect: scene/shot cut detection built around a Sans-I/O streaming API, designed to slot into any frame source.

§Overview

scenesdetect is a from-scratch Rust port of PySceneDetect. It is deliberately Sans-I/O: the crate never opens a file, decodes a packet, or spawns a thread. Callers hand frames in one by one, and each detector returns an Option<Timestamp> identifying the cut point — or nothing. Composing those point cuts into scene ranges is the caller’s responsibility, which keeps this crate independent of any particular decoding pipeline.
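Because composing cuts into scenes is left to the caller, a small helper along the following lines is usually all that is needed. This is a minimal sketch, not part of the crate: `scenes_from_cuts` is a hypothetical name, the pts values are assumed to share one timebase and to arrive in ascending order, and each cut is assumed to mark the first frame of the next scene.

```rust
/// Hypothetical helper (not part of scenesdetect): turn cut points into
/// half-open scene ranges `[start, end)`.
///
/// Assumes all pts values share one timebase and `cuts` is ascending.
fn scenes_from_cuts(start: i64, end: i64, cuts: &[i64]) -> Vec<(i64, i64)> {
    let mut scenes = Vec::with_capacity(cuts.len() + 1);
    let mut scene_start = start;
    for &cut in cuts {
        // Skip cuts outside the stream or ones that would yield an empty scene.
        if cut <= scene_start || cut >= end {
            continue;
        }
        scenes.push((scene_start, cut));
        scene_start = cut;
    }
    // The last scene runs from the final accepted cut to the end of the stream.
    if scene_start < end {
        scenes.push((scene_start, end));
    }
    scenes
}
```

In a real pipeline the `cuts` slice would be filled by collecting every `Some(timestamp)` a detector returns while frames are fed in.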

Timestamps are represented as raw integer pts + Timebase (matching FFmpeg’s AVRational) rather than floating-point seconds, so all arithmetic is exact and cross-stream comparisons are unambiguous.
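To illustrate why integer pts plus a rational timebase keeps comparisons exact, here is a sketch of comparing two timestamps from different timebases without converting either to floating point. The `Timebase` and `Timestamp` structs below are simplified stand-ins written for this example; their field names and types are assumptions, not the crate's actual definitions.

```rust
use std::cmp::Ordering;

/// Stand-in for a rational timebase (seconds per tick), like FFmpeg's AVRational.
/// Field names and types are assumptions made for this sketch.
#[derive(Clone, Copy, Debug)]
struct Timebase {
    num: u32,
    den: u32, // assumed non-zero
}

/// Stand-in for a timestamp: a raw tick count plus the timebase it is measured in.
#[derive(Clone, Copy, Debug)]
struct Timestamp {
    pts: i64,
    timebase: Timebase,
}

/// Compare `a.pts * a.num / a.den` with `b.pts * b.num / b.den` exactly.
/// Both sides are multiplied by the positive product `a.den * b.den`, so the
/// ordering is preserved and no division is needed. An i64 times two u32
/// factors always fits in an i128.
fn compare(a: Timestamp, b: Timestamp) -> Ordering {
    let lhs = i128::from(a.pts) * i128::from(a.timebase.num) * i128::from(b.timebase.den);
    let rhs = i128::from(b.pts) * i128::from(b.timebase.num) * i128::from(a.timebase.den);
    lhs.cmp(&rhs)
}
```

With this scheme, pts 90000 in a 1/90000 timebase and pts 1000 in a 1/1000 timebase compare as equal, since both denote exactly one second.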

§Detectors

| Module | Algorithm | Good for |
|---|---|---|
| histogram | YUV-luma histogram correlation | Generic cuts, robust to camera shake |
| phash | DCT-based perceptual hash (pHash) | Similarity-tolerant dedup / cut detection |
| threshold | Mean-brightness state machine | Fade-to-black / fade-in transitions |
| content | HSV-space delta + optional Canny edge delta | Motion/composition changes (the default PySceneDetect algorithm) |
| adaptive | Rolling-average wrapper over content | Suppresses false positives on sustained fast motion |
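As a rough illustration of the idea behind the histogram detector, the sketch below builds 256-bin luma histograms for two frames and scores them with a Pearson correlation, where values near 1.0 mean similar brightness distributions and low values suggest a cut. The function names are hypothetical, and the crate's actual binning, normalization, and threshold may differ.

```rust
/// Count 8-bit luma samples into a 256-bin histogram.
fn luma_histogram(luma: &[u8]) -> [u32; 256] {
    let mut bins = [0u32; 256];
    for &y in luma {
        bins[usize::from(y)] += 1;
    }
    bins
}

/// Pearson correlation between two histograms, in the range [-1.0, 1.0].
/// If either histogram has zero variance this sketch returns 1.0 (an
/// arbitrary choice that treats the pair as "no change").
fn histogram_correlation(a: &[u32; 256], b: &[u32; 256]) -> f64 {
    let mean = |h: &[u32; 256]| h.iter().map(|&v| f64::from(v)).sum::<f64>() / 256.0;
    let (mean_a, mean_b) = (mean(a), mean(b));
    let (mut cov, mut var_a, mut var_b) = (0.0_f64, 0.0_f64, 0.0_f64);
    for (&x, &y) in a.iter().zip(b.iter()) {
        let (dx, dy) = (f64::from(x) - mean_a, f64::from(y) - mean_b);
        cov += dx * dy;
        var_a += dx * dx;
        var_b += dy * dy;
    }
    let denom = (var_a * var_b).sqrt();
    if denom == 0.0 { 1.0 } else { cov / denom }
}
```

A caller would compare each frame's histogram against the previous frame's and report a cut when the correlation drops below a chosen threshold.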

§Features

  • Sans-I/O streaming API — hand in LumaFrame / RgbFrame / HsvFrame (zero-copy slices), get Option<Timestamp> back per frame. No allocation on the hot path once the detector is primed.
  • Hand-written SIMD backends — aarch64 NEON, x86 SSSE3 + AVX2 (runtime-dispatched via is_x86_feature_detected!), and wasm simd128. All with scalar fallbacks, toggleable per-detector via Options::with_simd(false).
  • Exact rational timestamps: Timebase mirrors FFmpeg’s AVRational; Timestamp compares semantically across timebases via i128 cross-multiply.
  • no_std + alloc — the crate builds without std; enable the default std feature for runtime x86 feature detection.
  • Optional serde — all Options types derive Serialize / Deserialize under the serde feature.
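The runtime dispatch mentioned in the SIMD bullet can be pictured with the sketch below. It only reports which backend would be chosen and contains no real kernels; `detect_simd` and `select_backend` are hypothetical names, NEON is assumed to be available on aarch64 targets, and the compile-time wasm simd128 case is left out for brevity. The `is_x86_feature_detected!` macro is the standard-library one named above.

```rust
/// On x86, probe the CPU at run time, preferring the widest supported extension.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn detect_simd() -> Option<&'static str> {
    if std::is_x86_feature_detected!("avx2") {
        Some("avx2")
    } else if std::is_x86_feature_detected!("ssse3") {
        Some("ssse3")
    } else {
        None
    }
}

/// On aarch64 this sketch assumes NEON is always present.
#[cfg(target_arch = "aarch64")]
fn detect_simd() -> Option<&'static str> {
    Some("neon")
}

/// Every other target falls through to the scalar path.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64", target_arch = "aarch64")))]
fn detect_simd() -> Option<&'static str> {
    None
}

/// Pick a backend name, honoring a per-detector "SIMD off" switch.
fn select_backend(simd_enabled: bool) -> &'static str {
    if simd_enabled {
        detect_simd().unwrap_or("scalar")
    } else {
        "scalar"
    }
}
```

Passing `false` mirrors the effect described for Options::with_simd(false): the scalar fallback is used regardless of what the CPU supports.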

§Installation

[dependencies]
scenesdetect = "0.1"

§Crate features

| Feature | Default | Purpose |
|---|---|---|
| std | yes | Runtime x86 SIMD dispatch, standard library types |
| alloc | | no_std build using alloc only |
| serde | | Serialize / Deserialize for all Options types |
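As a sketch of how these features map onto a manifest, a no_std consumer that also wants serde support might declare the dependency as follows. This assumes the feature names listed above and that alloc has to be requested explicitly once default features are switched off; check the crate's own Cargo.toml before relying on it.

```toml
[dependencies]
# Assumed layout: drop the default `std` feature, then opt in to `alloc` and `serde`.
scenesdetect = { version = "0.1", default-features = false, features = ["alloc", "serde"] }
```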

§Benchmarks

Numbers below are per-frame runtimes from the benchmark.yml CI workflow on GitHub-hosted runners, compiled with the default release profile (opt-level = 3, thin LTO). Each row is a single process_* call — that is, the full pipeline for one frame including the per-channel delta reduction. Lower is better; fps is 1 s / per-frame time. Full data lives in the Benchmarks workflow artifacts.

§Per-detector timings at 1080p

Best SIMD-on path, single-threaded:

| Detector | macOS aarch64 NEON | Linux x86_64 AVX2 | Windows x86_64 AVX2 |
|---|---|---|---|
| histogram | 0.93 ms (≈1 080 fps) | 1.24 ms (≈810 fps) | 1.26 ms (≈790 fps) |
| phash | 1.65 ms (≈610 fps) | 2.03 ms (≈490 fps) | 2.22 ms (≈450 fps) |
| threshold (luma) | 0.12 ms (≈8 000 fps) | 0.33 ms (≈3 080 fps) | 0.34 ms (≈2 940 fps) |
| threshold (RGB) | 0.38 ms (≈2 650 fps) | 0.98 ms (≈1 030 fps) | 0.99 ms (≈1 020 fps) |
| content (luma-only) | 0.48 ms (≈2 080 fps) | 0.34 ms (≈2 940 fps) | 0.40 ms (≈2 510 fps) |
| content (BGR, no edges) | 3.38 ms (≈300 fps) | 2.78 ms (≈360 fps) | 2.84 ms (≈350 fps) |
| content (BGR with Canny edges) | 58.0 ms (≈17 fps) | 71.0 ms (≈14 fps) | 75.8 ms (≈13 fps) |
| adaptive (luma-only) | 0.49 ms (≈2 040 fps) | 0.30 ms (≈3 300 fps) | 0.40 ms (≈2 500 fps) |
| adaptive (BGR, no edges) | 3.18 ms (≈315 fps) | 2.78 ms (≈360 fps) | 3.06 ms (≈325 fps) |

§SIMD vs scalar at 1080p (content::process_bgr, default weights, no edges)

The BGR path is the hot spot — packed-BGR → planar HSV conversion is where the hand-written SIMD backends earn their keep. Scalar numbers come from the same benches with Options::with_simd(false).
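For orientation, the first half of that conversion, splitting packed BGR bytes into separate planes, looks roughly like this in scalar form. This is an illustrative sketch rather than the crate's code: `deinterleave_bgr` is a hypothetical name, it allocates fresh vectors for clarity, and the HSV math that would follow is omitted.

```rust
/// Split packed BGR bytes (B0 G0 R0 B1 G1 R1 ...) into three planes.
/// Trailing bytes that do not form a whole pixel are ignored.
fn deinterleave_bgr(packed: &[u8]) -> (Vec<u8>, Vec<u8>, Vec<u8>) {
    let pixels = packed.len() / 3;
    let mut b = Vec::with_capacity(pixels);
    let mut g = Vec::with_capacity(pixels);
    let mut r = Vec::with_capacity(pixels);
    for px in packed.chunks_exact(3) {
        b.push(px[0]);
        g.push(px[1]);
        r.push(px[2]);
    }
    (b, g, r)
}
```

A SIMD backend performs the same shuffle on many pixels per instruction, which is why this stride-3 access pattern benefits so much from hand-written code.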

| Tier | SIMD | Scalar | Uplift |
|---|---|---|---|
| macos-aarch64-neon | 3.38 ms | 4.61 ms | 1.36× |
| ubuntu-x86_64-default (runtime AVX2) | 2.78 ms | 24.99 ms | 9.0× |
| ubuntu-x86_64-native (-C target-cpu=native) | 2.72 ms | 9.00 ms | 3.3× |
| ubuntu-x86_64-ssse3-only (AVX/AVX2/FMA disabled) | 2.09 ms | 21.34 ms | 10.2× |
| windows-x86_64-default | 2.84 ms | 57.55 ms | 20.3× |

A few things fall out of this:

  • x86 SIMD is very much worth it. Intel/AMD runners without the hand-written std::arch dispatch — i.e. scalar — run the BGR pipeline 9–20× slower than the SSSE3/AVX2 backend. The biggest x86 win is the 3-plane deinterleave via PSHUFB, which the compiler doesn’t emit on its own.
  • NEON uplift is modest because aarch64’s auto-vectorizer handles the scalar fallback well; the hand-written NEON path still wins on the deinterleave (vld3q_u8) but the scalar baseline is already strong.
  • -C target-cpu=native closes most of the scalar gap on x86 (9 ms vs 25 ms default scalar) by unlocking AVX2 for LLVM’s auto-vectorizer, but it still loses to the hand-written dispatch by ~3×.
  • Canny edges are expensive. Turning on delta_edges dominates the frame time: roughly 58–76 ms per 1080p frame, versus about 3 ms for the same BGR path without edges. Only enable it when color deltas aren’t enough.
  • Adaptive overhead is ≈O(1) per frame. Varying window_width from 1 to 16 moves the 1080p luma-only timing by <5% — the rolling-sum fix made the per-frame cost flat.
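The flat per-frame cost noted in the last bullet is what a running sum gives you: each new score adds one value and subtracts one, regardless of window size. The sketch below shows the general technique only. `RollingMean` is a hypothetical type, and how the adaptive detector actually stores its window and applies the average is not shown in this document.

```rust
use std::collections::VecDeque;

/// Hypothetical fixed-size window that keeps a running sum so that the
/// mean costs O(1) per pushed score.
struct RollingMean {
    window: VecDeque<f64>,
    capacity: usize,
    sum: f64,
}

impl RollingMean {
    fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "window must hold at least one score");
        Self { window: VecDeque::with_capacity(capacity), capacity, sum: 0.0 }
    }

    /// Push a score, evicting the oldest one when the window is full,
    /// and return the mean of the scores currently in the window.
    fn push(&mut self, score: f64) -> f64 {
        if self.window.len() == self.capacity {
            if let Some(oldest) = self.window.pop_front() {
                self.sum -= oldest;
            }
        }
        self.window.push_back(score);
        self.sum += score;
        self.sum / self.window.len() as f64
    }
}
```

One caveat of this approach is that a floating-point running sum can drift slightly over very long streams; recomputing the sum from the window every so often is a common remedy.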

§Reproducing locally

cargo bench --bench content
cargo bench --bench adaptive
# ...or all of them:
cargo bench

The benchmark.yml workflow runs five matrix rows on every push to main and every PR touching src/**, benches/**, or the workflow file: macos-aarch64-neon, ubuntu-x86_64-default, ubuntu-x86_64-native, ubuntu-x86_64-ssse3-only, windows-x86_64-default. The per-run artifact contains both a bencher-format summary and the Criterion HTML detail tree.

§Acknowledgements

scenesdetect is a Rust port of PySceneDetect by Brandon Castellano, released under the BSD 3-Clause license. The detector algorithms — histogram correlation, DCT-based pHash, brightness-threshold fades, HSV + Canny content deltas, and the rolling-average adaptive layer — are re-implementations of the algorithms described in PySceneDetect’s source and documentation. Default parameters mirror PySceneDetect’s where practical; any deliberate deviations are called out in the relevant module docs.

See THIRD-PARTY.md for the full upstream license text and additional third-party notices.

§License

scenesdetect is licensed under the terms of both the MIT license and the Apache License (Version 2.0).

See LICENSE-APACHE and LICENSE-MIT for details.

Copyright (c) 2026 FinDIT studio authors.

§Modules

adaptive (feature std or alloc)
Rolling-average / adaptive scene detector built on top of the content detector’s scores. Reduces false positives on fast camera motion.
content (feature std or alloc)
Content-change scene detector using HSV-space per-frame deltas and optional Canny edge comparison.
frame
Frame-input types for the scene detectors.
histogram (feature std or alloc)
Histogram-based scene detector using YUV luma correlation.
phash (feature std or alloc)
Perceptual hash-based scene detector using the DCT-based pHash algorithm.
threshold (feature std or alloc)
Intensity-threshold scene detector for fade-in / fade-out transitions.