zensim 0.2.0

Fast psychovisual image similarity metric

zensim

Fast psychovisual image similarity metric. Combines ideas from SSIMULACRA2 and butteraugli — multi-scale SSIM + edge + high-frequency features in XYB color space, with trained weights and AVX2/AVX-512 SIMD throughout.

Quick start

use zensim::{Zensim, ZensimProfile, RgbSlice};

let z = Zensim::new(ZensimProfile::latest());
let source = RgbSlice::new(&src_pixels, width, height);
let distorted = RgbSlice::new(&dst_pixels, width, height);
let result = z.compute(&source, &distorted)?;
println!("{}: {:.2}", result.profile(), result.score()); // higher = more similar

With imgref (default feature, supports stride)

use zensim::{Zensim, ZensimProfile};

let source: imgref::ImgRef<rgb::Rgb<u8>> = imgref::Img::new(&src_pixels, width, height);
let distorted: imgref::ImgRef<rgb::Rgb<u8>> = imgref::Img::new(&dst_pixels, width, height);
let z = Zensim::new(ZensimProfile::latest());
let result = z.compute(&source, &distorted)?;

imgref::ImgRef carries width, height, and stride in one type — no separate dimension arguments, and stride-padded buffers work automatically.
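
To make "stride" concrete, here is a minimal std-only sketch of how a row-padded buffer is addressed. It is not zensim or imgref code: `row_pixels` and `pack_rows` are hypothetical helpers, and the buffer holds one byte per pixel for brevity.

```rust
/// Returns the visible pixels of row `y` from a stride-padded buffer.
/// `stride` is the distance between row starts, in elements; it may exceed `width`.
/// Hypothetical helper for illustration; not part of zensim or imgref.
fn row_pixels(buf: &[u8], width: usize, stride: usize, y: usize) -> &[u8] {
    let start = y * stride;
    &buf[start..start + width]
}

/// Copies a stride-padded image into a tightly packed one (stride == width),
/// dropping the padding elements at the end of each row.
fn pack_rows(buf: &[u8], width: usize, height: usize, stride: usize) -> Vec<u8> {
    (0..height)
        .flat_map(|y| row_pixels(buf, width, stride, y).iter().copied())
        .collect()
}
```

With a 3-pixel-wide image stored at stride 4, `row_pixels(&buf, 3, 4, 1)` starts at element 4 and skips the one padding element that ends row 0. A stride-aware input type does this addressing for you, so no packing copy is needed.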

RGBA

RGBA images are composited over a checkerboard before comparison, so alpha differences produce visible distortion:

use zensim::{Zensim, ZensimProfile, RgbaSlice};

let z = Zensim::new(ZensimProfile::latest());
let source = RgbaSlice::new(&src_rgba, width, height);
let distorted = RgbaSlice::new(&dst_rgba, width, height);
let result = z.compute(&source, &distorted)?;
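
As a rough illustration of the compositing step, the sketch below blends one straight-alpha pixel over a two-tone checkerboard. The tile size, the two gray levels, and blending directly on sRGB-encoded bytes are all assumptions made for this example; the README does not state which values or blend space zensim actually uses.

```rust
// Illustrative assumptions only; these are not zensim's actual constants.
const TILE: usize = 8;
const LIGHT: u8 = 192;
const DARK: u8 = 128;

/// Background gray level of the checkerboard at pixel (x, y).
fn checker(x: usize, y: usize) -> u8 {
    if (x / TILE + y / TILE) % 2 == 0 { LIGHT } else { DARK }
}

/// Composites one straight-alpha RGBA pixel over the checkerboard ("source over"),
/// using rounded integer arithmetic on the encoded 8-bit values.
fn composite(px: [u8; 4], x: usize, y: usize) -> [u8; 3] {
    let a = px[3] as u32;
    let bg = checker(x, y) as u32;
    let blend = |c: u8| ((c as u32 * a + bg * (255 - a) + 127) / 255) as u8;
    [blend(px[0]), blend(px[1]), blend(px[2])]
}
```

A color can match at most one of the two background tones, so a change in alpha always changes the composited result on at least one tile. This is what turns an alpha difference into a visible one.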

Batch comparison

When comparing one reference against many distorted variants, precompute the reference to skip redundant XYB conversion and pyramid construction:

use zensim::{Zensim, ZensimProfile, RgbSlice};

let z = Zensim::new(ZensimProfile::latest());
let source = RgbSlice::new(&ref_pixels, width, height);
let precomputed = z.precompute_reference(&source)?;
for dst_pixels in &distorted_images {
    let dst = RgbSlice::new(dst_pixels, width, height);
    let result = z.compute_with_ref(&precomputed, &dst)?;
    println!("score: {:.2}", result.score());
}

Precomputing saves ~25% per comparison at 4K and ~34% at 8K; it breaks even at 3 to 7 distorted images per reference.

Score semantics

100 = identical; higher = more similar. The score mapping is 100 - 18 × d^0.7, where d is the per-scale weighted feature distance. The mapping is compressive, which leaves more resolution at the high-quality end, where it matters most.

Scores are calibrated from 0 to 100 on our training data (344k synthetic pairs, q5–q100 across 6 codecs). Extreme distortions can produce scores below 0; the mapping is uncalibrated outside the training range.
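
The mapping can be written out as a small std-only sketch. It only transcribes the formula quoted in this section; the function names are hypothetical and the crate's internal implementation may differ.

```rust
/// Maps a raw feature distance `d` (>= 0) to a score using the formula
/// quoted in this section: score = 100 - 18 * d^0.7.
/// Sketch only; not the crate's code.
fn score_from_distance(d: f64) -> f64 {
    100.0 - 18.0 * d.powf(0.7)
}

/// Inverse of the mapping above, valid for scores <= 100.
fn distance_from_score(score: f64) -> f64 {
    ((100.0 - score) / 18.0).powf(1.0 / 0.7)
}
```

Under this formula a distance of 0 gives 100, a distance of 1 gives 82, and distances above roughly 11.6 give scores below 0, which is the uncalibrated region described above.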

ZensimResult provides:

| Method | Description |
|---|---|
| score() | Similarity score (higher = more similar, typically 0–100) |
| raw_distance() | Weighted feature distance before nonlinear mapping (lower = better) |
| dissimilarity() | (100 - score) / 100; 0 = identical |
| approx_ssim2() | Approximate SSIMULACRA2 score (MAE 4.4 pts, r = 0.974) |
| approx_dssim() | Approximate DSSIM value (MAE 0.0013, r = 0.952) |
| approx_butteraugli() | Approximate butteraugli distance (MAE 1.65, r = 0.713) |
| features() | Raw feature vector for diagnostics |
| mean_offset() | Per-channel XYB mean shift [X, Y, B] |
The mapping module provides bidirectional interpolation tables between zensim scores and SSIM2, DSSIM, butteraugli, libjpeg quality, and zenjpeg quality — calibrated on 344k synthetic pairs across 6 codecs.

Results are deterministic for the same input on the same architecture. Scores computed on different code paths (AVX2 vs scalar vs AVX-512) may differ by a few ULPs.
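
When comparing scores produced on different machines, a ULP-tolerant comparison is one way to absorb this. The sketch below assumes scores are `f64` and uses hypothetical helper names; the tolerance is for the caller to choose, and none of this is zensim API.

```rust
/// Maps an f64 to an integer whose ordering matches the float ordering,
/// so adjacent representable floats map to adjacent integers.
fn ordered_bits(x: f64) -> i64 {
    let b = x.to_bits() as i64;
    if b < 0 { i64::MIN - b } else { b }
}

/// Number of representable f64 steps between `a` and `b` (both must be finite).
fn ulp_distance(a: f64, b: f64) -> u64 {
    assert!(a.is_finite() && b.is_finite());
    ordered_bits(a).abs_diff(ordered_bits(b))
}

/// True when two scores are within `max_ulps` representable steps of each other.
fn scores_match(a: f64, b: f64, max_ulps: u64) -> bool {
    ulp_distance(a, b) <= max_ulps
}
```

In a cross-platform test suite, `scores_match(expected, actual, max_ulps)` with a small `max_ulps` would be less brittle than `==`.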

Profiles

Each ZensimProfile variant bundles weights and parameters that affect score output. A given profile produces approximately the same scores across versions, but profiles may be removed in future major versions as the algorithm evolves.

| Profile | Weights | Training data | 5-fold CV SROCC |
|---|---|---|---|
| PreviewV0_1 | 228 | 344k synthetic pairs (6 codecs, q5–q100) | 0.9936 |

ZensimProfile::latest() returns the most recent profile.

Input requirements

  • Color space: All inputs must be sRGB-encoded (gamma ~2.2). This is what you get from standard JPEG, PNG, and WebP decoders. If your pixels are linear-light (gamma 1.0), use PixelFormat::LinearF32Rgba via StridedBytes — zensim will apply the correct transfer function internally.
  • Wide gamut: Display P3 and BT.2020 inputs are accepted via ColorPrimaries on StridedBytes — gamut-mapped to sRGB internally. Passing wide-gamut data as sRGB will produce incorrect scores (the metric sees the wrong colors).
  • Pixel formats: RgbSlice (sRGB u8), RgbaSlice (sRGB u8 + alpha), imgref::ImgRef (sRGB u8, with stride), StridedBytes (any of: Srgb8Rgb, Srgb8Rgba, Srgb8Bgra, Srgb16Rgba, LinearF32Rgba), or implement the ImageSource trait directly.
  • Alpha: RGBA inputs are composited over a checkerboard so alpha differences produce visible distortion. Supports Straight and Opaque alpha modes.
  • Dimensions: Both images must be the same width × height, minimum 8×8.
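
To show why the encoding matters, here is a std-only sketch of the standard sRGB decoding function (IEC 61966-2-1) together with a check of the dimension rule above. The helper names are hypothetical, and this is not necessarily the code path zensim uses internally.

```rust
/// Standard sRGB decoding: encoded value in [0, 1] to linear light in [0, 1].
fn srgb_to_linear(c: f32) -> f32 {
    if c <= 0.04045 {
        c / 12.92
    } else {
        ((c + 0.055) / 1.055).powf(2.4)
    }
}

/// Convenience wrapper for 8-bit sRGB samples.
fn srgb8_to_linear(v: u8) -> f32 {
    srgb_to_linear(v as f32 / 255.0)
}

/// Checks the dimension rule stated above: equal sizes, at least 8x8.
/// Sizes are (width, height). Hypothetical helper, not zensim API.
fn check_dimensions(src: (usize, usize), dst: (usize, usize)) -> Result<(), String> {
    if src != dst {
        return Err(format!(
            "size mismatch: {}x{} vs {}x{}",
            src.0, src.1, dst.0, dst.1
        ));
    }
    if src.0 < 8 || src.1 < 8 {
        return Err(format!("image too small: {}x{} (minimum 8x8)", src.0, src.1));
    }
    Ok(())
}
```

An 8-bit sRGB value of 128 decodes to about 0.216 in linear light, not 0.5. Linear data labeled as sRGB is therefore decoded a second time and ends up much darker than intended.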

Performance

Pure-computation benchmarks (no I/O), synthetic gradient images, AMD Ryzen 9 7950X 16C/32T (WSL2). All implementations receive pre-allocated pixel buffers.

SSIMULACRA2

Threading: zensim and ssimulacra2-rs use rayon (all cores). C++ libjxl and fast-ssim2 are single-threaded. zensim_st is zensim with .with_parallel(false) for a fair single-threaded comparison.

| Resolution | zensim | zensim_st | C++ libjxl (FFI) | fast-ssim2 | ssimulacra2-rs |
|---|---|---|---|---|---|
| 512x512 | 8 ms | 11 ms | 45 ms | 39 ms | 251 ms |
| 1280x720 | 14 ms | 40 ms | 163 ms | 150 ms | 529 ms |
| 1920x1080 | 23 ms | 90 ms | 389 ms | 338 ms | 997 ms |
| 2560x1440 | 37 ms | 161 ms | 683 ms | 604 ms | 2,358 ms |
| 3840x2160 | 171 ms | 499 ms | 2,033 ms | 1,390 ms | 3,763 ms |

Even single-threaded, zensim is about 4x faster than C++ libjxl and 2.8–3.8x faster than fast-ssim2 across these resolutions. Multi-threaded zensim is 12x faster than C++ libjxl at 4K.
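
The ratios can be recomputed from the SSIMULACRA2 table above. The sketch below copies the published timings into a constant and divides; the function names are hypothetical.

```rust
/// Timings in milliseconds copied from the SSIMULACRA2 table above.
/// Columns: zensim, zensim_st, C++ libjxl (FFI), fast-ssim2, ssimulacra2-rs.
const TIMINGS_MS: [(&str, [f64; 5]); 5] = [
    ("512x512", [8.0, 11.0, 45.0, 39.0, 251.0]),
    ("1280x720", [14.0, 40.0, 163.0, 150.0, 529.0]),
    ("1920x1080", [23.0, 90.0, 389.0, 338.0, 997.0]),
    ("2560x1440", [37.0, 161.0, 683.0, 604.0, 2358.0]),
    ("3840x2160", [171.0, 499.0, 2033.0, 1390.0, 3763.0]),
];

/// How many times faster `candidate_ms` is than `baseline_ms`.
fn speedup(baseline_ms: f64, candidate_ms: f64) -> f64 {
    baseline_ms / candidate_ms
}

/// Per-resolution speedups of single-threaded zensim (zensim_st)
/// over C++ libjxl and over fast-ssim2, in that order.
fn single_thread_speedups() -> Vec<(&'static str, f64, f64)> {
    TIMINGS_MS
        .iter()
        .map(|(res, t)| (*res, speedup(t[2], t[1]), speedup(t[3], t[1])))
        .collect()
}
```

For the multi-threaded 4K figure, `speedup(2033.0, 171.0)` is about 11.9.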

Butteraugli

Both butteraugli implementations are single-threaded. butteraugli-rs is the imazen pure-Rust port of libjxl's butteraugli.

| Resolution | C++ libjxl (FFI) | butteraugli-rs |
|---|---|---|
| 512x512 | 72 ms | 60 ms |
| 1280x720 | 304 ms | 253 ms |
| 1920x1080 | 705 ms | 581 ms |
| 2560x1440 | 1,219 ms | 1,027 ms |
| 3840x2160 | 2,446 ms | 2,584 ms |

Benchmarks are in zensim-bench/ — run with cargo bench -p zensim-bench --bench bench_compare.

Design

  • XYB color space — cube root LMS, same perceptual space as ssimulacra2/butteraugli
  • Modified SSIM — ssimulacra2's variant: drops the luminance denominator, uses 1 - (mu1-mu2)² directly. Correct for perceptually-uniform values where dark/bright errors should weigh equally.
  • 4-scale pyramid — 1×, 2×, 4×, 8× via box downscale (ssimulacra2 uses 6)
  • O(1)-per-pixel box blur — 1-pass default with fused SIMD kernels
  • 228 trained weights — optimized on 344k synthetic pairs across 6 codecs (mozjpeg, zenjpeg, jpegli, zenwebp, zenavif, zenjxl)
  • AVX2/AVX-512 SIMD throughout via archmage, with safe scalar fallback

Feature layout (per channel per scale)

19 features per channel per scale, all scored:

Basic features (13):

| Index | Feature | Description |
|---|---|---|
| 0 | ssim_mean | Mean SSIM error |
| 1 | ssim_4th | L4-pooled SSIM error (emphasizes worst-case) |
| 2 | ssim_2nd | L2-pooled SSIM error |
| 3 | art_mean | Mean edge artifact (ringing, banding) |
| 4 | art_4th | L4-pooled edge artifact |
| 5 | art_2nd | L2-pooled edge artifact |
| 6 | det_mean | Mean detail lost (blur, smoothing) |
| 7 | det_4th | L4-pooled detail lost |
| 8 | det_2nd | L2-pooled detail lost |
| 9 | mse | Mean squared error in XYB |
| 10 | hf_energy_loss | High-frequency energy loss (L2 ratio) |
| 11 | hf_mag_loss | High-frequency magnitude loss (L1 ratio) |
| 12 | hf_energy_gain | High-frequency energy gain (ringing/sharpening) |

Peak features (6):

| Index | Feature | Description |
|---|---|---|
| 13 | ssim_max | Maximum SSIM error |
| 14 | art_max | Maximum edge artifact |
| 15 | det_max | Maximum detail lost |
| 16 | ssim_l8 | L8-pooled SSIM error (near-worst-case) |
| 17 | art_l8 | L8-pooled edge artifact |
| 18 | det_l8 | L8-pooled detail lost |
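
The pooling variants in these tables differ only in how strongly they weight the largest errors. Here is a minimal sketch of Lp pooling over a per-pixel error map, assuming the conventional definition with a p-th root. Whether zensim applies the root, or normalizes differently, is not stated here.

```rust
/// Lp-pooled value of an error map: (mean(|e|^p))^(1/p).
/// p = 1 is the plain mean; larger p emphasizes the worst errors.
/// Assumes the conventional definition; not taken from zensim's source.
fn lp_pool(errors: &[f64], p: i32) -> f64 {
    assert!(!errors.is_empty() && p >= 1);
    let mean_pow = errors.iter().map(|e| e.abs().powi(p)).sum::<f64>() / errors.len() as f64;
    mean_pow.powf(1.0 / p as f64)
}

/// Maximum absolute error, the limit of Lp pooling as p grows.
fn max_pool(errors: &[f64]) -> f64 {
    errors.iter().fold(0.0, |m, e| m.max(e.abs()))
}
```

For an error map that is zero everywhere except one pixel in four with error 1.0, the pooled values rise with p: 0.25 (mean), 0.5 (L2), about 0.71 (L4), about 0.84 (L8), and 1.0 (max).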

Total: 4 scales × 3 channels × 19 features = 228 weights. FeatureView provides named access to all features.
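
The count can be checked with a small indexing sketch. The scale-major, then channel, then feature ordering used here is an assumption for illustration; the crate's actual layout is what FeatureView exposes, and these helpers are hypothetical.

```rust
const SCALES: usize = 4;
const CHANNELS: usize = 3;
const FEATURES: usize = 19;
const TOTAL: usize = SCALES * CHANNELS * FEATURES; // 228

/// Flat index of (scale, channel, feature), assuming scale-major ordering.
/// The ordering is an illustrative assumption, not the documented layout.
fn flat_index(scale: usize, channel: usize, feature: usize) -> usize {
    assert!(scale < SCALES && channel < CHANNELS && feature < FEATURES);
    (scale * CHANNELS + channel) * FEATURES + feature
}

/// Inverse of `flat_index`: returns (scale, channel, feature).
fn unflatten(index: usize) -> (usize, usize, usize) {
    assert!(index < TOTAL);
    (
        index / (CHANNELS * FEATURES),
        (index / FEATURES) % CHANNELS,
        index % FEATURES,
    )
}
```

Under this assumed ordering, the last entry, `flat_index(3, 2, 18)`, is 227, so the flat vector has 228 entries.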

Feature flags

| Flag | Default | Description |
|---|---|---|
| avx512 | yes | Enable AVX-512 SIMD paths |
| imgref | yes | ImageSource impls for imgref::ImgRef<Rgb<u8>> and ImgRef<Rgba<u8>> (stride-aware) |
| training | no | Expose metric internals for weight training/research |
| classification | no | Error classification API (classify(), DeltaStats, ErrorCategory) |

Workspace crates

| Crate | Description |
|---|---|
| zensim | Core metric library |
| zensim-regress | Visual regression testing: checksum management, tolerance specs, remote reference storage, amplified diff images, side-by-side montages, and sixel terminal display. See zensim-regress/README.md. |
| zensim-validate | Training and validation CLI for weight optimization |

Visual diff images (zensim-regress)

zensim-regress generates amplified difference images and comparison montages for debugging visual regressions:

use zensim_regress::diff_image::*;

// Amplified diff: abs(expected - actual) * amplification_factor
let diff = generate_diff_image(&expected, &actual, 10);

// Side-by-side montage: expected | diff | actual (with border)
let montage = create_comparison_montage(&expected, &actual, 10, 2);

// Raw RGBA byte variants also available
let diff = generate_diff_image_raw(&exp_bytes, &act_bytes, w, h, 10);

Auto-save montages on checksum mismatch with .with_diff_output(), or display directly in sixel-capable terminals (foot, WezTerm, mintty). See zensim-regress/README.md for full API docs.

MSRV

Rust 1.89.0 (2024 edition).

License

MIT OR Apache-2.0

AI-Generated Code Notice

Developed with Claude (Anthropic). Not all code manually reviewed. Review critical paths before production use.