# Codec Integration Guide
Practical guide for integrating codec-eval into your image codec project (mozjpeg-rs, jpegli-rs, libavif, etc.).
## Quick Start
Add codec-eval to your dev-dependencies:
```toml
[dev-dependencies]
codec-eval = { git = "https://github.com/imazen/codec-eval" }
```
## Wiring Up Your Codec
codec-eval uses a callback-based design. You provide encode/decode functions, and the library handles metrics, reports, and analysis.
### Basic Pattern
```rust
use codec_eval::{EvalSession, EvalConfig, ImageData, ViewingCondition};
use std::path::PathBuf;

fn setup_session() -> EvalSession {
    let config = EvalConfig::builder()
        .report_dir(PathBuf::from("./benchmark-reports"))
        .viewing(ViewingCondition::desktop())
        .quality_levels(vec![20.0, 40.0, 60.0, 80.0, 95.0])
        .build();

    let mut session = EvalSession::new(config);

    // Register your codec with an encode callback
    session.add_codec(
        "my-codec",
        env!("CARGO_PKG_VERSION"),
        Box::new(|image, request| {
            // Your encoding logic here
            // image: &ImageData - the source image
            // request: &EncodeRequest - contains quality level and params
            let quality = request.quality as u8;
            let encoded_bytes = my_codec::encode(image, quality)?;
            Ok(encoded_bytes)
        }),
    );

    session
}
```
### MozJPEG Example
```rust
use codec_eval::{EvalSession, EvalConfig, ImageData, EncodeRequest, ViewingCondition};
use mozjpeg::{Compress, ColorSpace};

fn register_mozjpeg(session: &mut EvalSession) {
    session.add_codec(
        "mozjpeg",
        "4.1.1", // or pull from mozjpeg-sys version
        Box::new(|image, request| {
            let (width, height) = image.dimensions();
            let rgb_data = image.as_rgb_slice()?;

            let mut compress = Compress::new(ColorSpace::JCS_RGB);
            compress.set_size(width, height);
            compress.set_quality(request.quality as f32);
            compress.set_mem_dest();
            compress.start_compress();

            // Write scanlines one row at a time; write_scanlines takes a
            // flat byte slice of packed RGB pixels
            let row_stride = width * 3;
            for y in 0..height {
                let row_start = y * row_stride;
                let row = &rgb_data[row_start..row_start + row_stride];
                compress.write_scanlines(row);
            }

            compress.finish_compress();
            Ok(compress.data_to_vec()?)
        }),
    );
}
```
### Jpegli Example
```rust
use codec_eval::{EvalSession, ImageData, EncodeRequest};
use jpegli::{Encoder, ColorType};

fn register_jpegli(session: &mut EvalSession) {
    session.add_codec(
        "jpegli",
        jpegli::version(),
        Box::new(|image, request| {
            let (width, height) = image.dimensions();
            let rgb_data = image.as_rgb_slice()?;

            let encoder = Encoder::new_mem()?;
            encoder.set_quality(request.quality as f32)?;

            let encoded = encoder.encode(
                rgb_data,
                width as u32,
                height as u32,
                ColorType::Rgb,
            )?;

            Ok(encoded)
        }),
    );
}
```
### AVIF Example
```rust
use codec_eval::{EvalSession, ImageData, EncodeRequest};
use libavif::{AvifEncoder, AvifImage};

fn register_avif(session: &mut EvalSession) {
    session.add_codec_with_params(
        "avif",
        "1.0.0",
        Box::new(|image, request| {
            let (width, height) = image.dimensions();
            let rgba_data = image.as_rgba_slice()?;

            let avif_image = AvifImage::from_rgba(
                width as u32,
                height as u32,
                rgba_data,
            )?;

            let mut encoder = AvifEncoder::new();

            // AVIF quality is 0-63, lower = better
            // Map 0-100 scale to 63-0
            let avif_quality = ((100.0 - request.quality) * 0.63) as i32;
            encoder.set_quality(avif_quality);

            // Check for speed param
            if let Some(speed) = request.params.get("speed") {
                encoder.set_speed(speed.parse().unwrap_or(6));
            }

            Ok(encoder.encode(&avif_image)?)
        }),
    );
}
```
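The quality-to-quantizer conversion in the AVIF example is easy to get wrong at the ends of the range, so it can be worth isolating into a function you can unit-test. A minimal sketch, assuming the encoder expects a 0-63 value where lower is better; `quality_to_quantizer` is a hypothetical helper (not part of codec-eval or libavif), and it clamps and rounds where the inline expression truncates:

```rust
/// Map a 0-100 quality value (higher = better) to a 0-63 quantizer
/// (lower = better). Hypothetical helper for illustration only.
fn quality_to_quantizer(quality: f64) -> i32 {
    // Clamp first so out-of-range requests cannot yield an invalid quantizer.
    let q = quality.clamp(0.0, 100.0);
    // Scale the inverted quality onto 0..=63 and round to the nearest step.
    ((100.0 - q) * 63.0 / 100.0).round() as i32
}
```

Inside the callback this would replace the inline arithmetic, for example `encoder.set_quality(quality_to_quantizer(request.quality as f64))`.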
## Running Evaluations
### Single Image
```rust
use codec_eval::{ImageData, ViewingCondition};
use imgref::ImgVec;
use rgb::RGB8;

fn evaluate_single_image() -> anyhow::Result<()> {
    let session = setup_session();

    // Load your test image
    let img: ImgVec<RGB8> = load_png("test_images/photo.png")?;
    let image_data = ImageData::from_imgvec(&img);

    // Run evaluation across all quality levels
    let report = session.evaluate_image("photo.png", image_data)?;

    // Check results
    for result in &report.results {
        println!(
            "{} q={}: {} bytes, DSSIM={:.6}",
            result.codec_id,
            result.quality,
            result.encoded_size,
            result.metrics.dssim.unwrap_or(0.0)
        );
    }

    Ok(())
}
```
### Corpus Evaluation
```rust
fn evaluate_corpus() -> anyhow::Result<()> {
    let session = setup_session();

    // Evaluate all images in a directory
    let report = session.evaluate_corpus("./test_images")?;

    // Write reports
    session.write_reports(&report)?;
    // Creates:
    //   benchmark-reports/results.csv
    //   benchmark-reports/results.json
    //   benchmark-reports/summary.json

    Ok(())
}
```
### Using Sparse Checkout for Test Corpora
Download only the images you need from large corpus repositories:
```rust
use codec_eval::corpus::sparse::{SparseCheckout, SparseFilter};

fn setup_test_corpus() -> anyhow::Result<()> {
    // Shallow clone without checking out any files yet
    let checkout = SparseCheckout::clone_shallow(
        "https://github.com/imazen/codec-corpus",
        "./codec-corpus",
        1, // depth=1 for speed
    )?;

    // Download only what you need: PNG photos
    checkout.set_filters(&[
        SparseFilter::Format("png".to_string()),
        SparseFilter::Category("photos".to_string()),
    ])?;
    checkout.checkout()?;

    let status = checkout.status()?;
    println!("Downloaded {} files", status.checked_out_files);

    Ok(())
}
```
## Quality Assertions for CI
### Quick Quality Checks (New in 0.3)
For simple tests without full corpus evaluation:
```rust
use codec_eval::{assert_quality, assert_perception_level, PerceptionLevel};
use codec_eval::metrics::prelude::*;

#[test]
fn test_encoding_quality() -> Result<(), Box<dyn std::error::Error>> {
    // Load test image
    let original = load_png_as_rgb8("test.png")?;
    let (width, height) = (original.width(), original.height());

    // Encode at quality 80
    let encoded_bytes = my_codec::encode(&original, 80)?;
    let decoded = my_codec::decode(&encoded_bytes)?;

    // Convert to ImgVec<RGB8>
    let reference = ImgVec::new(original, width, height);
    let distorted = ImgVec::new(decoded, width, height);

    // Option 1: Specific thresholds
    assert_quality(
        &reference,
        &distorted,
        Some(80.0),  // min SSIMULACRA2 score
        Some(0.003), // max DSSIM
    )?;

    // Option 2: Semantic quality levels
    assert_perception_level(
        &reference,
        &distorted,
        PerceptionLevel::Subtle, // DSSIM < 0.0015
    )?;

    Ok(())
}
```
**Perception Levels** (based on DSSIM):
- `Imperceptible` - DSSIM < 0.0003 (requires near-lossless)
- `Marginal` - DSSIM < 0.0007 (excellent quality)
- `Subtle` - DSSIM < 0.0015 (high quality, suitable for most use cases)
- `Noticeable` - DSSIM < 0.003 (acceptable quality)
- `Degraded` - DSSIM ≥ 0.003 (visible artifacts)
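To make the boundaries concrete, the thresholds in the list above can be written as a plain classification function. This is a standalone sketch: `Level` and `classify_dssim` are hypothetical names that only mirror the list and are not codec-eval's `PerceptionLevel` implementation.

```rust
/// Hypothetical mirror of the perception levels listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Level {
    Imperceptible,
    Marginal,
    Subtle,
    Noticeable,
    Degraded,
}

/// Classify a DSSIM score using the thresholds from the list above.
/// Each upper bound is exclusive, so a score of exactly 0.0003 is `Marginal`.
fn classify_dssim(dssim: f64) -> Level {
    if dssim < 0.0003 {
        Level::Imperceptible
    } else if dssim < 0.0007 {
        Level::Marginal
    } else if dssim < 0.0015 {
        Level::Subtle
    } else if dssim < 0.003 {
        Level::Noticeable
    } else {
        Level::Degraded
    }
}
```

Because the bounds are exclusive, a result that lands exactly on a threshold falls into the next, worse level.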
### Full Corpus Evaluation
```rust
use codec_eval::metrics::PerceptionLevel;

#[test]
fn test_quality_at_q80() {
    let session = setup_session();
    let img = load_test_image();

    let report = session.evaluate_image("test.png", img).unwrap();

    for result in report.results.iter().filter(|r| r.quality == 80.0) {
        let dssim = result.metrics.dssim.unwrap();

        // Assert quality is at least "Subtle" (visible only if you know where to look)
        assert!(
            dssim < PerceptionLevel::Subtle.threshold(),
            "{} at q80 has DSSIM {:.6}, expected < {:.6}",
            result.codec_id,
            dssim,
            PerceptionLevel::Subtle.threshold()
        );
    }
}
```
### Migrating from Direct Metric Usage
If you're currently using dssim-core, butteraugli, or fast-ssim2 directly in your tests, codec-eval 0.3 can simplify your code significantly.
**Before** (manual metric setup):
```rust
use dssim_core::Dssim;
use rgb::RGBA;

fn compute_dssim(orig: &[u8], comp: &[u8], width: usize, height: usize) -> f64 {
    let attr = Dssim::new();

    // Manual RGBA conversion
    let orig_rgba: Vec<RGBA<u8>> = orig
        .chunks(3)
        .map(|c| RGBA::new(c[0], c[1], c[2], 255))
        .collect();
    let comp_rgba: Vec<RGBA<u8>> = comp
        .chunks(3)
        .map(|c| RGBA::new(c[0], c[1], c[2], 255))
        .collect();

    let orig_img = attr.create_image_rgba(&orig_rgba, width, height).unwrap();
    let comp_img = attr.create_image_rgba(&comp_rgba, width, height).unwrap();

    let (dssim, _) = attr.compare(&orig_img, comp_img);
    dssim.into()
}

#[test]
fn test_quality() {
    let (orig, width, height) = load_test_image();
    let encoded = encode(&orig, width, height, 80);
    let decoded = decode(&encoded);

    let dssim = compute_dssim(&orig, &decoded, width, height);
    assert!(dssim < 0.003, "Quality too low: DSSIM {}", dssim);
}
```
**After** (codec-eval helpers):
```rust
use codec_eval::{assert_quality, metrics::prelude::*};

#[test]
fn test_quality() -> Result<(), Box<dyn std::error::Error>> {
    let (orig, width, height) = load_test_image();
    let encoded = encode(&orig, width, height, 80);
    let decoded = decode(&encoded);

    // Convert to ImgVec<RGB8>
    let reference = ImgVec::new(
        orig.chunks_exact(3)
            .map(|c| RGB8::new(c[0], c[1], c[2]))
            .collect(),
        width,
        height,
    );
    let distorted = ImgVec::new(
        decoded
            .chunks_exact(3)
            .map(|c| RGB8::new(c[0], c[1], c[2]))
            .collect(),
        width,
        height,
    );

    assert_quality(&reference, &distorted, Some(80.0), Some(0.003))?;
    Ok(())
}
```
**Benefits:**
- ~10 lines vs 30+ lines per test
- Unified imports via `metrics::prelude`
- Single dependency instead of 3+ metric crates
- Consistent metric versions across projects
### Regression Testing Against Baseline
```rust
#[test]
fn test_no_quality_regression() {
    let session = setup_session();
    let report = session.evaluate_corpus("./test_images").unwrap();

    // Load previous baseline
    let baseline: CorpusReport = serde_json::from_str(
        &std::fs::read_to_string("baseline.json").unwrap()
    ).unwrap();

    // Compare each image/quality combination
    for result in &report.results {
        if let Some(base) = baseline.find_result(&result.image_name, result.quality) {
            let dssim = result.metrics.dssim.unwrap();
            let base_dssim = base.metrics.dssim.unwrap();

            // Allow 5% regression tolerance
            let tolerance = base_dssim * 1.05;
            assert!(
                dssim <= tolerance,
                "Quality regression: {} at q{} DSSIM {:.6} > baseline {:.6}",
                result.image_name,
                result.quality,
                dssim,
                base_dssim
            );
        }
    }
}
```
### Size Regression Testing
```rust
#[test]
fn test_no_size_regression() {
    let session = setup_session();
    let report = session.evaluate_corpus("./test_images").unwrap();
    let baseline = load_baseline();

    for result in &report.results {
        if let Some(base) = baseline.find_result(&result.image_name, result.quality) {
            // Allow 2% size increase
            let max_size = (base.encoded_size as f64 * 1.02) as usize;
            assert!(
                result.encoded_size <= max_size,
                "Size regression: {} at q{} is {} bytes, baseline {} bytes",
                result.image_name,
                result.quality,
                result.encoded_size,
                base.encoded_size
            );
        }
    }
}
```
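Both regression tests reduce to the same comparison: a lower-is-better value (DSSIM or encoded size) must not exceed its baseline by more than a relative tolerance. A minimal sketch of that check as a shared helper; `within_tolerance` is a hypothetical name and not part of codec-eval:

```rust
/// Returns true when `current` does not exceed `baseline` by more than
/// `tolerance`, expressed as a fraction (0.05 means 5%).
/// Intended for lower-is-better values such as DSSIM or encoded size.
fn within_tolerance(current: f64, baseline: f64, tolerance: f64) -> bool {
    current <= baseline * (1.0 + tolerance)
}
```

Note that a relative tolerance gives no slack when the baseline is zero (for example a lossless result with DSSIM 0), so such cases may need an absolute floor as well.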
## GitHub Actions Integration
Add this to your codec's `.github/workflows/quality.yml`:
```yaml
name: Quality Benchmarks

on:
  pull_request:
  push:
    branches: [main]

jobs:
  benchmark:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install Rust
        uses: dtolnay/rust-toolchain@stable

      - name: Download test corpus
        run: |
          cargo run -p codec-eval-cli -- sparse clone \
            https://github.com/imazen/codec-corpus \
            ./corpus \
            --depth 1 \
            --format png \
            --category photos

      - name: Run benchmarks
        run: cargo test --release quality_benchmarks

      - name: Compare to baseline
        run: |
          cargo run -p codec-eval-cli -- stats \
            -i benchmark-reports/results.json \
            --compare baseline.json

      - name: Upload results
        uses: actions/upload-artifact@v4
        with:
          name: benchmark-results
          path: benchmark-reports/

      # Optional: Update baseline on main branch
      - name: Update baseline
        if: github.ref == 'refs/heads/main'
        run: cp benchmark-reports/results.json baseline.json
```
## Interpreting Results
### DSSIM Thresholds
| DSSIM | Level | Description |
|-------|-------|-------------|
| < 0.0003 | Imperceptible | Mathematically different, visually identical |
| < 0.0007 | Marginal | Requires careful inspection to notice |
| < 0.0015 | Subtle | Visible if you know where to look |
| < 0.003 | Noticeable | Casual viewers may notice |
| >= 0.003 | Degraded | Obviously degraded |
### Quality Targets by Use Case
| Use Case | Target DSSIM | Typical Quality |
|----------|--------------|-----------------|
| Archival | < 0.0003 | q95+ |
| Photography site | < 0.001 | q85-92 |
| General web | < 0.002 | q75-85 |
| Thumbnails | < 0.01 | q60-75 |
| Aggressive compression | < 0.05 | q40-60 |
### Viewing Conditions
Choose based on your target audience:
```rust
// Desktop/laptop users at normal distance
ViewingCondition::desktop() // 40 PPD
// Mobile-first or retina displays
ViewingCondition::smartphone() // 90 PPD
// Mixed audience (conservative)
ViewingCondition::laptop() // 60 PPD
// Custom: high-end photo site on retina
ViewingCondition::new(60.0)
    .with_browser_dppx(2.0)
    .with_image_intrinsic_dppx(2.0)
```
Higher PPD = more demanding quality threshold. Mobile users on retina displays will notice artifacts that desktop users miss.
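When picking a value for `ViewingCondition::new(...)`, pixels per degree (PPD) can be estimated from display density and viewing distance with standard geometry: at distance `d`, one degree of visual angle spans `2 * d * tan(0.5°)`. A minimal sketch; `pixels_per_degree` is a hypothetical helper, and whether codec-eval's presets were derived this way is an assumption:

```rust
/// Pixels per degree of visual angle for a display with `ppi` pixels per
/// inch viewed from `distance_in` inches. Hypothetical helper.
fn pixels_per_degree(ppi: f64, distance_in: f64) -> f64 {
    // Width in inches subtended by one degree of visual angle at this distance.
    let inches_per_degree = 2.0 * distance_in * (0.5_f64).to_radians().tan();
    ppi * inches_per_degree
}
```

For example, `pixels_per_degree(96.0, 24.0)` (a 96 PPI monitor at 24 inches) comes out at roughly 40 PPD, in line with the `desktop()` preset above.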
## Pareto Analysis
Find the best codec at each quality/size tradeoff:
```rust
use codec_eval::stats::{ParetoFront, RDPoint};

fn analyze_pareto() -> anyhow::Result<()> {
    let report = load_benchmark_results()?;

    // Convert to rate-distortion points
    let points: Vec<RDPoint> = report.results.iter().map(|r| {
        RDPoint {
            codec: r.codec_id.clone(),
            bpp: r.bits_per_pixel(),
            quality: 1.0 - r.metrics.dssim.unwrap(), // Convert to quality (higher = better)
            quality_metric: "dssim".to_string(),
        }
    }).collect();

    let front = ParetoFront::compute(&points);

    println!("Pareto-optimal points:");
    for point in front.points() {
        println!("  {} @ {:.3} bpp: quality {:.4}",
            point.codec, point.bpp, point.quality);
    }

    // Find best codec at specific bit rate
    if let Some(best) = front.best_at_bpp(0.5) {
        println!("Best codec at 0.5 bpp: {}", best.codec);
    }

    Ok(())
}
```
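For intuition about what a Pareto front computation does, here is a standalone sketch using only the standard library. It is not codec-eval's `ParetoFront` implementation: `Point`, `bits_per_pixel`, and `pareto_front` are hypothetical names, and the approach (sort by size, keep each point that improves on the best quality seen so far) is one common way to compute a two-dimensional front.

```rust
/// Hypothetical rate-distortion point; not codec-eval's `RDPoint`.
#[derive(Debug, Clone, PartialEq)]
struct Point {
    codec: String,
    bpp: f64,     // bits per pixel, lower is better
    quality: f64, // higher is better
}

/// Bits per pixel for an encoded file of `bytes` bytes.
fn bits_per_pixel(bytes: usize, width: usize, height: usize) -> f64 {
    (bytes as f64 * 8.0) / (width as f64 * height as f64)
}

/// Keep only points that no other point beats on both size and quality.
fn pareto_front(points: &[Point]) -> Vec<Point> {
    let mut sorted = points.to_vec();
    // Sort by bpp ascending; on ties put the higher quality first.
    sorted.sort_by(|a, b| {
        a.bpp
            .total_cmp(&b.bpp)
            .then(b.quality.total_cmp(&a.quality))
    });

    let mut front = Vec::new();
    let mut best_quality = f64::NEG_INFINITY;
    for p in sorted {
        // A point survives only if it improves on every cheaper point.
        if p.quality > best_quality {
            best_quality = p.quality;
            front.push(p);
        }
    }
    front
}
```

The returned points are ordered by increasing bpp, so a lookup such as "best codec at 0.5 bpp" amounts to taking the last point whose `bpp` does not exceed the budget.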
## CLI Usage
The codec-eval CLI provides quick benchmarking without writing code:
```bash
# Discover images in a directory
codec-eval corpus discover ./test_images -o corpus.json
# Import results from another tool's CSV
codec-eval import -i results.csv -o results.json
# Calculate Pareto front
codec-eval pareto -i results.json -o pareto.json --metric dssim
# Show statistics
codec-eval stats -i results.json --by-codec
# Sparse checkout of test corpus
codec-eval sparse clone https://github.com/imazen/codec-corpus ./corpus \
    --depth 1 --format png --category photos
```
## Recommended Test Corpora
| Corpus | Images | Use Case | Sparse Filter |
|--------|--------|----------|---------------|
| Kodak | 24 | Quick validation | `--category kodak` |
| CLIC 2024 | 62 | High-res photos | `--category clic` |
| Tecnick | 100 | Diverse content | `--category tecnick` |
| CID22 | 250 | Research validation | Full checkout recommended |
## Troubleshooting
### "Dimension mismatch" errors
Ensure your decode function returns the same dimensions as input:
```rust
// Bad: decoder may return different size
let decoded = my_codec::decode(&encoded)?;
// Good: verify dimensions match
let decoded = my_codec::decode(&encoded)?;
assert_eq!(decoded.width(), original.width());
assert_eq!(decoded.height(), original.height());
```
### Inconsistent DSSIM results
DSSIM is sensitive to color space. Ensure you're using the same color space throughout:
```rust
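// Decode to 8-bit RGB. This changes the pixel layout only and does not
// apply ICC profiles, so make sure the source files are already sRGB.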
let img = image::open(path)?.to_rgb8();
```
### High DSSIM on certain images
Some image types naturally compress worse:
- Fine text/diagrams: Use higher quality or lossless
- Film grain: Consider denoising before compression
- Gradients: Check for banding artifacts
## Contributing Improvements
We welcome contributions from codec developers! See [CONTRIBUTING.md](CONTRIBUTING.md) for:
- Adding new metrics
- Improving viewing condition models
- Adding codec-specific optimizations
- Sharing benchmark results