Dominant colors with human-vocabulary names for video keyframes — MMCQ extraction + nearest-neighbor lookup against the xkcd color survey.
Overview
colorthief extracts dominant colors from packed-RGB video keyframes
and maps each to its closest entry in a 949-color human-vocabulary
table sourced from the xkcd color survey. Built for video
indexing and search-vocabulary pipelines: every output dominant
carries both the actual MMCQ-extracted RGB (for swatch rendering)
and the named Color (for search-index vocabulary), sorted
descending by population.
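The shape of that output can be sketched without the crate itself. The block below is a minimal illustration, not the crate's API: `NamedDominant`, `TOY_PALETTE`, and `name_and_sort` are hypothetical names, the palette holds three entries instead of 949, and the distance is plain squared RGB rather than one of the LAB metrics described later.

```rust
/// Hypothetical stand-in for one output entry: the extracted RGB,
/// the name of the closest palette entry, and the pixel count.
#[derive(Debug, Clone, PartialEq)]
pub struct NamedDominant {
    pub rgb: [u8; 3],
    pub name: &'static str,
    pub population: u32,
}

/// Toy three-entry palette (assumption; the real table has 949 entries).
const TOY_PALETTE: [(&str, [u8; 3]); 3] = [
    ("red", [229, 0, 0]),
    ("green", [21, 176, 26]),
    ("blue", [3, 67, 223]),
];

/// Squared Euclidean distance in RGB. Illustration only: the crate's
/// metrics operate in CIE LAB, not raw RGB.
fn dist2(a: [u8; 3], b: [u8; 3]) -> u32 {
    a.iter()
        .zip(b.iter())
        .map(|(&x, &y)| {
            let d = x as i32 - y as i32;
            (d * d) as u32
        })
        .sum()
}

/// Attach the nearest palette name to each (rgb, population) cluster,
/// then sort descending by population.
pub fn name_and_sort(clusters: &[([u8; 3], u32)]) -> Vec<NamedDominant> {
    let mut out: Vec<NamedDominant> = clusters
        .iter()
        .map(|&(rgb, population)| {
            let (name, _) = TOY_PALETTE
                .iter()
                .copied()
                .min_by_key(|&(_, p)| dist2(rgb, p))
                .expect("palette is non-empty");
            NamedDominant { rgb, name, population }
        })
        .collect();
    out.sort_by(|a, b| b.population.cmp(&a.population));
    out
}
```

Each entry keeps the extracted RGB untouched, so a swatch renders the real color while the search index stores only the name.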
Crates in this workspace
| Crate | Purpose |
|---|---|
| colorthief | Dominant-color extraction (MMCQ) + naming pipeline. RgbFrame<'a> (8-bit) / Rgb48Frame<'a> (16-bit HDR) input. |
| colorthief-dataset | Static xkcd palette + nearest-neighbor lookup with three color-difference metrics (CIEDE2000, CIE94, Delta E 76). no_std + no_alloc. |
| xtask | Build-time codegen: re-run offline to regenerate the static dataset and CIEDE2000 LUT from the upstream CSV. Not published. |
Installation
[dependencies]
colorthief = "0.1"
# Or, if you only need the static palette + nearest-neighbor lookup
# (no MMCQ; works in no_std + no_alloc):
colorthief-dataset = "0.1"
Minimum supported Rust version: 1.95 (required for stable AVX-512F
intrinsics and core::error::Error in no_std builds via
thiserror 2 without its std feature).
Examples
| Example | Crate | Run |
|---|---|---|
| extract | colorthief | cargo run --release --example extract -p colorthief |
| extract_rgb48 (HDR / 16-bit) | colorthief | cargo run --release --example extract_rgb48 -p colorthief |
| extract_no_alloc (static mut Mmcq + fixed buffer) | colorthief | cargo run --release --example extract_no_alloc -p colorthief |
| lookup (name-only, no MMCQ) | colorthief-dataset | cargo run --release --example lookup -p colorthief-dataset |
See each crate's examples directory for more details.
Algorithms
Three nearest-neighbor metrics, behind a #[non_exhaustive] #[repr(u8)] enum:
| Algorithm | Speed (NEON) | Notes |
|---|---|---|
| Ciede2000Exact (default) | ~230 ns/query (LUT) or 71.5 µs (full scan) | Modern perceptual gold-standard. Provably exact at u8 RGB resolution when lut feature is on. |
| Cie94 | ~510 ns/query | Asymmetric (palette = reference). Mid-accuracy. |
| DeltaE76 | ~470 ns/query | Squared Euclidean LAB. Fastest, but well-known biases in the saturated blue / yellow regions. |
The default Ciede2000Exact is ~310× faster than a naive full scan
(71.5 µs vs ~230 ns per query) thanks to a pre-computed 32³
candidate-set LUT (see Codegen pipeline below).
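A 32³ grid over 8-bit RGB amounts to keeping the top 5 bits of each channel. The sketch below shows one way such a lookup could be organized. The cell layout (red-major) and the names `lut_cell` and `nearest_among` are assumptions for illustration, not the crate's actual internals.

```rust
/// Cells per axis: 256 levels / 8 levels per cell = 32.
pub const CELLS_PER_AXIS: usize = 32;
/// Total cells in the grid: 32 * 32 * 32 = 32768.
pub const LUT_CELLS: usize = CELLS_PER_AXIS * CELLS_PER_AXIS * CELLS_PER_AXIS;

/// Map an 8-bit RGB triple to its cell index by keeping the top 5 bits
/// of each channel. Assumed layout: red is the major axis, blue the minor.
pub fn lut_cell(rgb: [u8; 3]) -> usize {
    let r = (rgb[0] >> 3) as usize;
    let g = (rgb[1] >> 3) as usize;
    let b = (rgb[2] >> 3) as usize;
    (r << 10) | (g << 5) | b
}

/// Evaluate an expensive metric only on the candidate palette indices
/// stored for one cell, returning the index with the smallest distance.
/// Ties keep the earlier candidate.
pub fn nearest_among<F>(candidates: &[u16], mut metric: F) -> Option<u16>
where
    F: FnMut(u16) -> f32,
{
    let mut best: Option<(u16, f32)> = None;
    for &idx in candidates {
        let d = metric(idx);
        match best {
            Some((_, best_d)) if best_d <= d => {}
            _ => best = Some((idx, d)),
        }
    }
    best.map(|(idx, _)| idx)
}
```

The speedup comes from the candidate list being far shorter than the full palette, so the costly CIEDE2000 formula runs a handful of times per query instead of 949.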
Feature flags
colorthief:
| Feature | Default | Effect |
|---|---|---|
| std | ✓ | thread_local!-cached MMCQ workspace; zero-alloc-per-call after first call per thread. Implies alloc. |
| alloc | | Heap allocator available; enables Vec<Dominant>-returning APIs and Mmcq::new_boxed(). |
| lut | ✓ | 32³ candidate-set LUT for CIEDE2000: ~256 KB binary cost, ~310× CIEDE2000 speedup. |
colorthief-dataset:
| Feature | Default | Effect |
|---|---|---|
| std | ✓ | Enables x86_64 runtime CPU-feature detection. |
| alloc | | Forward-compat hook (current API is no_alloc). |
| lut | ✓ | The 32³ CIEDE2000 LUT, propagated from colorthief/lut. |
No-std + no-alloc support
Both crates are usable in no_std + no_alloc environments. The caller
manages the MMCQ workspace (a static mut Mmcq placed in .bss) and
the output buffer (a fixed-size [Option<Dominant>; N]). See the
extract_no_alloc example for the full pattern.
The Buffer<T> trait abstracts the output: Vec<T> (alloc-gated),
[Option<T>; N], &mut [Option<T>] ship by default; consumers can
plug in arrayvec::ArrayVec / heapless::Vec / custom types with
a one-line impl Buffer<T>.
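To make the idea concrete, here is a minimal sketch of an output abstraction of this kind. `OutputBuffer` and `try_push` are hypothetical stand-ins: the crate's real Buffer<T> trait may expose different methods, so treat this as an illustration of the pattern rather than its definition.

```rust
/// Hypothetical stand-in for an output-buffer trait; the crate's real
/// `Buffer<T>` may differ in method names and signatures.
pub trait OutputBuffer<T> {
    /// Try to append `item`; hand it back if the buffer is full.
    fn try_push(&mut self, item: T) -> Result<(), T>;
}

/// Growable heap-backed output (needs an allocator).
impl<T> OutputBuffer<T> for Vec<T> {
    fn try_push(&mut self, item: T) -> Result<(), T> {
        self.push(item);
        Ok(())
    }
}

/// Fixed-capacity output: fill the first empty slot, no allocation.
impl<T, const N: usize> OutputBuffer<T> for [Option<T>; N] {
    fn try_push(&mut self, item: T) -> Result<(), T> {
        match self.iter_mut().find(|slot| slot.is_none()) {
            Some(slot) => {
                *slot = Some(item);
                Ok(())
            }
            None => Err(item),
        }
    }
}

/// Write items until the buffer refuses one; return how many were stored.
pub fn emit_all<B: OutputBuffer<u32>>(out: &mut B, items: &[u32]) -> usize {
    let mut written = 0;
    for &item in items {
        if out.try_push(item).is_err() {
            break;
        }
        written += 1;
    }
    written
}
```

The producer (`emit_all` here) is written once against the trait, and the caller decides whether results land on the heap or in a fixed array.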
For zero-alloc-per-call in single-threaded no_std + alloc
environments (typical wasm32-unknown-unknown / interrupt-free bare
metal), place an Mmcq in static mut yourself — the unsafe
then sits at your call site, not silently inside this crate.
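The pattern can be sketched with a stand-in type. `Workspace` below is hypothetical (a tiny histogram, not the real Mmcq), and the example assumes a single-threaded, non-reentrant caller, which is exactly the condition the `unsafe` block asks you to uphold.

```rust
use core::ptr::addr_of_mut;

/// Hypothetical fixed-size scratch space standing in for an MMCQ workspace.
pub struct Workspace {
    histogram: [u32; 64],
}

impl Workspace {
    /// `const fn` so the value can initialize a `static`.
    pub const fn new() -> Self {
        Workspace { histogram: [0; 64] }
    }

    /// Count pixels into a 2-bit-per-channel histogram, reusing the same
    /// storage on every call, and return the population of the fullest bin.
    pub fn busiest_bin(&mut self, pixels: &[[u8; 3]]) -> u32 {
        self.histogram = [0; 64];
        for p in pixels {
            let r = (p[0] >> 6) as usize;
            let g = (p[1] >> 6) as usize;
            let b = (p[2] >> 6) as usize;
            self.histogram[(r << 4) | (g << 2) | b] += 1;
        }
        self.histogram.iter().copied().max().unwrap_or(0)
    }
}

/// Lives in static storage; no heap allocation at any point.
static mut WORKSPACE: Workspace = Workspace::new();

/// Convenience wrapper that owns the `unsafe` at the call site.
pub fn busiest_bin_static(pixels: &[[u8; 3]]) -> u32 {
    // SAFETY: sound only while no other reference to WORKSPACE exists,
    // i.e. single-threaded use with no re-entrancy (assumed here).
    let ws = unsafe { &mut *addr_of_mut!(WORKSPACE) };
    ws.busiest_bin(pixels)
}
```

Going through `addr_of_mut!` avoids taking a direct `&mut` to the `static mut` item, which newer Rust editions lint against.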
SIMD backends
Color::nearest_to (Delta E 76) and Color::nearest_to_cie94
dispatch to per-arch SIMD backends:
| Backend | ISA | Lanes | Detection |
|---|---|---|---|
| aarch64_neon | NEON | 4 (128-bit) | compile-time (target_feature = "neon") |
| x86_avx512 | AVX-512F | 16 (512-bit) | runtime (is_x86_feature_detected!) |
| x86_avx2 | AVX2 | 8 (256-bit) | runtime |
| x86_sse41 | SSE4.1 | 4 (128-bit) | runtime |
| wasm_simd128 | SIMD128 | 4 (128-bit) | compile-time (target_feature = "simd128") |
| scalar | - | 1 | always available |
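The mix of runtime and compile-time detection in the table could be wired up roughly as follows. This is a sketch of the general dispatch technique using the standard `is_x86_feature_detected!` macro; `select_backend` is a hypothetical name, and the crate's own dispatch code may be organized differently.

```rust
/// Return the name of the widest backend usable on this machine.
/// x86_64 tiers are probed at runtime; NEON and SIMD128 are fixed at
/// compile time via `target_feature`.
#[allow(unreachable_code)]
pub fn select_backend() -> &'static str {
    #[cfg(target_arch = "x86_64")]
    {
        // Probe from the widest tier down to the narrowest.
        if std::is_x86_feature_detected!("avx512f") {
            return "x86_avx512";
        }
        if std::is_x86_feature_detected!("avx2") {
            return "x86_avx2";
        }
        if std::is_x86_feature_detected!("sse4.1") {
            return "x86_sse41";
        }
    }
    #[cfg(all(target_arch = "aarch64", target_feature = "neon"))]
    {
        return "aarch64_neon";
    }
    #[cfg(all(target_arch = "wasm32", target_feature = "simd128"))]
    {
        return "wasm_simd128";
    }
    // Fallback that every target supports.
    "scalar"
}
```

Runtime probing needs `std`, which matches the colorthief-dataset feature table: the std feature is what enables x86_64 CPU-feature detection.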
Every backend is bit-identical to the scalar reference (plain
mul + add, no FMA) and is verified against a 17³ = 4913-point
inline parity grid plus an exhaustive 256³ = 16,777,216-point sweep
(#[ignore]-gated; run via cargo test --release -- --ignored).
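Bit-identical results follow from performing the same IEEE operations in the same order in every lane. The sketch below imitates a 4-lane backend with plain arrays and compares it bit-for-bit against a scalar reference over a 17³ grid. All names and the toy palette are hypothetical, and the grid coordinates are fed in directly as stand-in LAB values rather than converted from RGB.

```rust
/// Scalar reference: squared Euclidean distance with plain mul + add,
/// accumulated in a fixed channel order.
pub fn dist2_scalar(q: [f32; 3], p: [f32; 3]) -> f32 {
    let d0 = q[0] - p[0];
    let d1 = q[1] - p[1];
    let d2 = q[2] - p[2];
    d0 * d0 + d1 * d1 + d2 * d2
}

/// "4-lane" variant over a structure-of-arrays palette. Each lane does
/// the same operations in the same order as the scalar reference.
pub fn dist2_x4(q: [f32; 3], l: [f32; 4], a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    let mut out = [0.0f32; 4];
    for i in 0..4 {
        let d0 = q[0] - l[i];
        let d1 = q[1] - a[i];
        let d2 = q[2] - b[i];
        out[i] = d0 * d0 + d1 * d1 + d2 * d2;
    }
    out
}

/// Sweep a 17 x 17 x 17 grid and count (points visited, bit mismatches).
pub fn parity_check() -> (usize, usize) {
    // Toy palette values (assumption), one column per lane.
    let l = [0.0f32, 53.2, 87.7, 32.3];
    let a = [0.0f32, 80.1, -86.2, 79.2];
    let b = [0.0f32, 67.2, 83.2, -107.9];
    let (mut points, mut mismatches) = (0, 0);
    for i in 0..17u32 {
        for j in 0..17u32 {
            for k in 0..17u32 {
                // Stand-in query; a real test would convert RGB to LAB first.
                let q = [i as f32 * 6.25, j as f32 * 10.0 - 80.0, k as f32 * 10.0 - 80.0];
                let lanes = dist2_x4(q, l, a, b);
                for lane in 0..4 {
                    let s = dist2_scalar(q, [l[lane], a[lane], b[lane]]);
                    if s.to_bits() != lanes[lane].to_bits() {
                        mismatches += 1;
                    }
                }
                points += 1;
            }
        }
    }
    (points, mismatches)
}
```

Comparing `to_bits()` rather than using `==` catches differences a float comparison would hide, such as the sign of zero. A fused multiply-add would round once instead of twice and could break this equality, which is why the backends avoid it.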
CIEDE2000 is scalar-only by design: its atan2 / sin / cos /
exp calls and branchy hue-wraparound logic don't vectorize cleanly, and a
vectorization attempt ran ~35% slower than the scalar baseline.
Codegen pipeline
colorthief-dataset/src/generated.rs is produced offline by
cargo run --release -p xtask -- codegen. The xtask:
- Parses colorthief-dataset/assets/color_hierarchy.csv (sourced from Stitch Fix's colornamer, Apache-2.0).
- Computes CIE LAB (D65, 2°) per entry.
- Computes the 32³ CIEDE2000 candidate-set LUT (rayon-parallel, ~3 min on Apple Silicon; every u8 RGB swept through the full-scan reference).
- Emits two #[non_exhaustive] #[repr(u8)] enums (Family, Kind) covering every distinct value in the CSV.
- Pretty-prints + rustfmts the result so it passes cargo fmt --check.
CI's codegen-up-to-date job re-runs the xtask and fails if
generated.rs would change — guarantees no drift between assets/
and the committed source.
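The CIE LAB step of the pipeline above is the standard sRGB to XYZ to LAB conversion. The sketch below uses the commonly published sRGB matrix and D65 white point in f64; the xtask's exact constants and precision are not stated here, so treat those as assumptions. `srgb_to_lab` and `delta_e76_squared` are hypothetical names.

```rust
/// Convert 8-bit sRGB to CIE LAB (D65 white point, 2-degree observer).
pub fn srgb_to_lab(rgb: [u8; 3]) -> [f64; 3] {
    /// Undo the sRGB transfer curve to get linear light in [0, 1].
    fn linearize(c: u8) -> f64 {
        let c = c as f64 / 255.0;
        if c <= 0.04045 {
            c / 12.92
        } else {
            ((c + 0.055) / 1.055).powf(2.4)
        }
    }
    /// The LAB companding function with its linear toe near zero.
    fn f(t: f64) -> f64 {
        const DELTA: f64 = 6.0 / 29.0;
        if t > DELTA * DELTA * DELTA {
            t.cbrt()
        } else {
            t / (3.0 * DELTA * DELTA) + 4.0 / 29.0
        }
    }
    let (r, g, b) = (linearize(rgb[0]), linearize(rgb[1]), linearize(rgb[2]));
    // Linear sRGB to XYZ (D65).
    let x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b;
    let y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b;
    let z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b;
    // Normalize by the D65 reference white.
    let (fx, fy, fz) = (f(x / 0.95047), f(y / 1.0), f(z / 1.08883));
    [116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz)]
}

/// Squared Delta E 76: squared Euclidean distance in LAB. Skipping the
/// square root preserves nearest-neighbor ordering.
pub fn delta_e76_squared(p: [f64; 3], q: [f64; 3]) -> f64 {
    (0..3).map(|i| (p[i] - q[i]) * (p[i] - q[i])).sum()
}
```

Pure black maps to L = 0 and pure white to L of about 100, with a and b near zero in both cases, which makes a quick sanity check for any implementation.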
Coverage-side cfgs
For coverage runs that need to exercise lower-tier SIMD branches on hardware that natively supports a higher tier:
- --cfg colorthief_force_scalar: bypass every SIMD backend.
- --cfg colorthief_disable_avx512: drop x86_64 from AVX-512F to AVX2.
- --cfg colorthief_disable_avx2: drop x86_64 to SSE4.1.
These flags are also exercised by the simd.yml CI workflow.
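Custom cfgs like these are typically consumed with `cfg!` or `#[cfg]` inside the dispatch code. The sketch below shows one plausible reading of how the three flags cap the tier; `max_x86_tier` is a hypothetical function, and the precedence between flags is an assumption, not something the crate documents here.

```rust
/// Report the highest x86_64 tier the build is allowed to use, given
/// the coverage cfgs. Assumed precedence: force_scalar wins, then
/// disable_avx2 (which also rules out AVX-512F), then disable_avx512.
#[allow(unexpected_cfgs)]
pub fn max_x86_tier() -> &'static str {
    if cfg!(colorthief_force_scalar) {
        "scalar"
    } else if cfg!(colorthief_disable_avx2) {
        "sse4.1"
    } else if cfg!(colorthief_disable_avx512) {
        "avx2"
    } else {
        "avx512f"
    }
}
```

Because `cfg!` is resolved at compile time, the unused tiers are not merely skipped at runtime but removed from the build, which is what lets a coverage run reach the lower-tier branches. Flags of this kind are normally passed through the RUSTFLAGS environment variable.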
License
colorthief is dual-licensed under MIT or Apache-2.0 at your
option.
See LICENSE-APACHE and LICENSE-MIT for details.
The upstream xkcd color-survey data is public domain (Randall
Munroe); Stitch Fix's hierarchical name layers are Apache-2.0
(attribution in THIRD_PARTY_NOTICES.md).
Copyright (c) 2026 FinDIT Studio authors.