# jpegli-rs
> **⚠️ This crate is being renamed to [`zenjpeg`](https://crates.io/crates/zenjpeg).**
>
> After six rewrites and significant divergence from the original jpegli, we're renaming to better reflect that this is now an independent project. Please migrate to `zenjpeg` for future updates.
>
> ```toml
> # Old (deprecated)
> jpegli-rs = "0.11"
>
> # New (recommended)
> zenjpeg = "0.12"
> ```
[crates.io](https://crates.io/crates/jpegli-rs) | [docs.rs](https://docs.rs/jpegli-rs) | [CI](https://github.com/imazen/jpegli-rs/actions/workflows/ci.yml) | [License](LICENSE)

A pure Rust JPEG encoder and decoder with perceptual optimizations.
## Heritage and Divergence
This project started as a port of [jpegli](https://github.com/libjxl/libjxl/tree/main/lib/jpegli), Google's improved JPEG encoder from the JPEG XL project. After six rewrites it has diverged significantly and is being renamed to **zenjpeg**.
**Ideas adopted from jpegli:**
- Adaptive quantization (content-aware bit allocation)
- XYB color space with ICC profiles (note: XYB output currently scores ~5 SSIMULACRA2 points below C++ jpegli)
- Perceptually-tuned quantization tables
- Zero-bias strategies for coefficient rounding
**Ideas adopted from mozjpeg:**
- Overshoot deringing for documents/graphics
- Trellis quantization for optimal coefficient selection
- Hybrid approach combining jpegli's AQ with mozjpeg's trellis
**Where we went our own way:**
- Pure Rust, `#![forbid(unsafe_code)]` by default (unsafe SIMD is opt-in)
- Streaming encoder API for memory efficiency (process images row-by-row)
- Portable SIMD via `wide` crate instead of platform intrinsics
- Parallel encoding support
- UltraHDR support (HDR gain maps for backward-compatible HDR JPEGs)
- Independent optimizations and bug fixes
## Features
- **Pure Rust** - No C/C++ dependencies, builds anywhere Rust does
- **Perceptual optimization** - Adaptive quantization for better visual quality at smaller sizes
- **Trellis quantization** - Optimal coefficient selection from mozjpeg
- **Overshoot deringing** - Eliminates ringing artifacts on documents and graphics (enabled by default)
- **Backward compatible** - Produces standard JPEG files readable by any decoder
- **SIMD accelerated** - Portable SIMD via `wide` crate
- **Streaming API** - Memory-efficient row-by-row encoding for large images
- **Parallel encoding** - Multi-threaded for large images (1024x1024+)
- **UltraHDR support** - Encode/decode HDR gain maps (optional `ultrahdr` feature)
- **Color management** - Optional ICC profile support
## Known Limitations
- **XYB color space** - Currently ~5 SSIMULACRA2 points behind C++ jpegli. Use YCbCr for best quality.
- **Decoder speed** - Prioritizes precision (12-bit pipeline) over speed; ~8x slower than zune-jpeg.
## API Reference
### Encoder API
All encoder types are in `jpegli::encoder`:
```rust
use jpegli::encoder::{
    EncoderConfig, PixelLayout, Quality, ChromaSubsampling, Unstoppable,
};
```
#### Quick Start
```rust
use jpegli::encoder::{EncoderConfig, PixelLayout, ChromaSubsampling, Unstoppable};
// Create reusable config (quality and color mode set in constructor)
let config = EncoderConfig::ycbcr(85, ChromaSubsampling::Quarter)
    .progressive(true);
// Encode from raw bytes
let mut enc = config.encode_from_bytes(1920, 1080, PixelLayout::Rgb8Srgb)?;
enc.push_packed(&rgb_bytes, Unstoppable)?;
let jpeg = enc.finish()?;
```
#### Three Encoder Entry Points
| Method | Input | Use case |
|--------|-------|----------|
| `encode_from_bytes(w, h, layout)` | `&[u8]` | Raw byte buffers |
| `encode_from_rgb::<P>(w, h)` | `rgb` crate types | `RGB<u8>`, `RGBA<f32>`, etc. |
| `encode_from_ycbcr_planar(w, h)` | `YCbCrPlanes` | Video decoder output |
#### Examples
```rust
use jpegli::encoder::{EncoderConfig, PixelLayout, ChromaSubsampling, Unstoppable};
let config = EncoderConfig::ycbcr(85, ChromaSubsampling::Quarter);
// From raw RGB bytes
let mut enc = config.encode_from_bytes(800, 600, PixelLayout::Rgb8Srgb)?;
enc.push_packed(&rgb_bytes, Unstoppable)?;
let jpeg = enc.finish()?;
// From rgb crate types
use rgb::RGB;
let mut enc = config.encode_from_rgb::<RGB<u8>>(800, 600)?;
enc.push_packed(&pixels, Unstoppable)?;
let jpeg = enc.finish()?;
// From planar YCbCr (video pipelines)
let mut enc = config.encode_from_ycbcr_planar(1920, 1080)?;
enc.push(&planes, num_rows, Unstoppable)?;
let jpeg = enc.finish()?;
```
#### EncoderConfig Constructors
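Choose one constructor based on the desired color mode:

| Constructor | Color mode | Use case |
|-------------|------------|----------|
| `EncoderConfig::ycbcr(q, sub)` | YCbCr | Standard JPEG (most compatible, currently best quality) |
| `EncoderConfig::xyb(q, b_sub)` | XYB | Perceptual color space (currently lower quality than YCbCr, see Known Limitations) |
| `EncoderConfig::grayscale(q)` | Grayscale | Single-channel output |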
#### Builder Methods
| Method | Description | Default |
|--------|-------------|---------|
| `.progressive(bool)` | Progressive JPEG (~3% smaller) | `false` |
| `.optimize_huffman(bool)` | Optimal Huffman tables | `true` |
| `.deringing(bool)` | Overshoot deringing for documents/graphics | `true` |
| `.sharp_yuv(bool)` | SharpYUV downsampling | `false` |
| `.separate_chroma_tables(bool)` | Use 3 quant tables (Y, Cb, Cr) vs 2 (Y, shared) | `true` |
| `.icc_profile(bytes)` | Attach ICC profile | None |
| `.exif(exif)` | Embed EXIF metadata | None |
| `.xmp(data)` | Embed XMP metadata | None |
| `.restart_interval(n)` | MCUs between restart markers | 0 |
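
To give a feel for what a `.restart_interval(n)` value means in practice, here is a minimal sketch that counts MCUs and restart markers for a single interleaved scan. The function names are hypothetical and written for this example only; they are not part of the crate, and the crate's own bookkeeping (especially for progressive scans) may differ.

```rust
/// Number of MCUs in an interleaved scan.
/// `h_factor`/`v_factor` are the chroma subsampling factors relative to luma:
/// (1, 1) for 4:4:4, (2, 1) for 4:2:2, (1, 2) for 4:4:0, (2, 2) for 4:2:0.
fn mcu_count(width: u32, height: u32, h_factor: u32, v_factor: u32) -> u32 {
    let mcu_w = 8 * h_factor; // MCU width in pixels
    let mcu_h = 8 * v_factor; // MCU height in pixels
    width.div_ceil(mcu_w) * height.div_ceil(mcu_h)
}

/// Number of RSTn markers written between restart intervals in one scan.
/// An interval of 0 means restart markers are disabled.
fn restart_marker_count(total_mcus: u32, interval: u32) -> u32 {
    if interval == 0 || total_mcus == 0 {
        return 0;
    }
    // Markers sit between intervals, so there is one fewer than the interval count.
    total_mcus.div_ceil(interval) - 1
}
```

For a 1920x1080 image at 4:2:0, each MCU covers 16x16 pixels, so one MCU row is 120 MCUs; an interval of 120 therefore places a restart marker at the end of every MCU row except the last.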
#### Quality Options
```rust
use jpegli::encoder::{EncoderConfig, Quality, ChromaSubsampling};
// Simple quality scale (0-100)
let config = EncoderConfig::ycbcr(85, ChromaSubsampling::Quarter);
// Quality enum variants
let config = EncoderConfig::ycbcr(
    Quality::ApproxJpegli(85.0), // Default scale
    ChromaSubsampling::Quarter,
);
// Or: Quality::ApproxMozjpeg(80) - Match mozjpeg output
// Or: Quality::ApproxSsim2(90.0) - Target SSIMULACRA2 score
// Or: Quality::ApproxButteraugli(1.0) - Target butteraugli distance
```
#### Pixel Layouts
| Layout | Bytes/pixel | Notes |
|--------|-------------|-------|
| `Rgb8Srgb` | 3 | Default, sRGB gamma |
| `Bgr8Srgb` / `Bgrx8Srgb` | 3/4 | Windows/GDI order |
| `Rgbx8Srgb` | 4 | 4th byte ignored |
| `Gray8Srgb` | 1 | Grayscale sRGB |
| `Rgb16Linear` | 6 | 16-bit linear |
| `RgbF32Linear` | 12 | HDR float (0.0-1.0) |
| `YCbCr8` / `YCbCrF32` | 3/12 | Pre-converted YCbCr |
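
As a rough illustration of how the bytes-per-pixel column translates into buffer sizes, the sketch below computes the expected length of a tightly packed input buffer. The `Layout` enum and `packed_len` function are hypothetical stand-ins written for this example (they are not the crate's `PixelLayout`), and the sketch assumes rows are packed with no padding.

```rust
/// Hypothetical mirror of a few layouts from the table above
/// (not the crate's `PixelLayout` type).
#[derive(Clone, Copy, Debug)]
enum Layout {
    Gray8,
    Rgb8,
    Rgbx8,
    Rgb16,
    RgbF32,
}

impl Layout {
    /// Bytes per pixel, taken from the table above.
    fn bytes_per_pixel(self) -> usize {
        match self {
            Layout::Gray8 => 1,
            Layout::Rgb8 => 3,
            Layout::Rgbx8 => 4,
            Layout::Rgb16 => 6,
            Layout::RgbF32 => 12,
        }
    }
}

/// Expected length in bytes of a tightly packed buffer, or `None` on overflow.
fn packed_len(width: usize, height: usize, layout: Layout) -> Option<usize> {
    width
        .checked_mul(height)?
        .checked_mul(layout.bytes_per_pixel())
}
```

Checking a buffer's length against a value like this before pushing pixels is a cheap way to catch a mismatched layout early, for example passing RGBX data while declaring a 3-byte layout.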
#### Chroma Subsampling
```rust
use jpegli::encoder::{EncoderConfig, ChromaSubsampling, XybSubsampling};
// YCbCr subsampling
let config = EncoderConfig::ycbcr(85, ChromaSubsampling::Quarter); // 4:2:0 (best compression)
let config = EncoderConfig::ycbcr(85, ChromaSubsampling::None); // 4:4:4 (best quality)
let config = EncoderConfig::ycbcr(85, ChromaSubsampling::HalfHorizontal); // 4:2:2
let config = EncoderConfig::ycbcr(85, ChromaSubsampling::HalfVertical); // 4:4:0
// XYB B-channel subsampling
let config = EncoderConfig::xyb(85, XybSubsampling::BQuarter); // B at 4:2:0
let config = EncoderConfig::xyb(85, XybSubsampling::Full); // No subsampling
```
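
The effect of each subsampling mode on the chroma planes can be sketched as follows. The `Subsampling` enum here is a hypothetical local type that only mirrors the variant names used above, and the function reports the nominal chroma size; a real encoder additionally pads planes to MCU boundaries.

```rust
/// Hypothetical local enum mirroring the modes named above
/// (not the crate's `ChromaSubsampling`).
#[derive(Clone, Copy)]
enum Subsampling {
    None,           // 4:4:4
    HalfHorizontal, // 4:2:2
    HalfVertical,   // 4:4:0
    Quarter,        // 4:2:0
}

/// Nominal chroma plane size for a given luma size, rounding up on odd dimensions.
fn chroma_dims(width: u32, height: u32, mode: Subsampling) -> (u32, u32) {
    let (h_div, v_div) = match mode {
        Subsampling::None => (1, 1),
        Subsampling::HalfHorizontal => (2, 1),
        Subsampling::HalfVertical => (1, 2),
        Subsampling::Quarter => (2, 2),
    };
    (width.div_ceil(h_div), height.div_ceil(v_div))
}
```

With 4:2:0, each chroma plane holds a quarter of the luma sample count, which is where most of the size savings relative to 4:4:4 come from.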
#### Resource Estimation
```rust
use jpegli::encoder::{EncoderConfig, ChromaSubsampling};
let config = EncoderConfig::ycbcr(85, ChromaSubsampling::Quarter);
// Typical memory estimate
let estimate = config.estimate_memory(1920, 1080);
// Guaranteed upper bound (for resource reservation)
let ceiling = config.estimate_memory_ceiling(1920, 1080);
```
---
### Decoder API
> **Prerelease:** The decoder API is behind the `decoder` feature flag and will have breaking changes.
> Enable with `jpegli-rs = { version = "...", features = ["decoder"] }`.
All decoder types are in `jpegli::decoder`:
```rust
use jpegli::decoder::{Decoder, DecodedImage, DecodedImageF32, DecoderConfig};
```
#### Basic Decoding
```rust
// Decode to RGB (default)
let image = Decoder::new().decode(&jpeg_data)?;
let pixels: &[u8] = image.pixels();
let (width, height) = image.dimensions();
```
#### High-Precision Decoding (f32)
Preserves jpegli's 12-bit internal precision:
```rust
let image: DecodedImageF32 = Decoder::new().decode_f32(&jpeg_data)?;
let pixels: &[f32] = image.pixels(); // Values in 0.0-1.0
// Convert to 8-bit or 16-bit when needed
let u8_pixels: Vec<u8> = image.to_u8();
let u16_pixels: Vec<u16> = image.to_u16();
```
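
For readers who want to post-process the f32 samples themselves, a conversion along these lines is one plausible approach. This is a sketch under the assumption that samples are normalized to 0.0-1.0 as stated above; the function names are made up for this example, and the crate's `to_u8()` / `to_u16()` may round or clamp differently.

```rust
/// Convert normalized f32 samples (nominally 0.0..=1.0) to 8-bit,
/// clamping out-of-range values and rounding to nearest.
fn f32_to_u8(samples: &[f32]) -> Vec<u8> {
    samples
        .iter()
        .map(|&v| (v.clamp(0.0, 1.0) * 255.0).round() as u8)
        .collect()
}

/// Same conversion for 16-bit output.
fn f32_to_u16(samples: &[f32]) -> Vec<u16> {
    samples
        .iter()
        .map(|&v| (v.clamp(0.0, 1.0) * 65535.0).round() as u16)
        .collect()
}
```

Keeping the data in f32 until the last step avoids stacking a second rounding error on top of the JPEG quantization error.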
#### YCbCr Output (Zero Color Conversion)
For video pipelines or re-encoding:
```rust
use jpegli::decoder::{Decoder, DecodedYCbCr};
let ycbcr: DecodedYCbCr = Decoder::new().decode_to_ycbcr_f32(&jpeg_data)?;
// Access Y, Cb, Cr planes directly (f32, range [-128, 127])
```
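
If you do need RGB from these planes later, the standard JFIF (BT.601 full-range) conversion can be sketched as below. It assumes the Y plane is level-shifted the same way as Cb and Cr, which the stated [-128, 127] range suggests but which you should verify against the crate's documentation; `ycbcr_to_rgb8` is a hypothetical helper, not a crate function.

```rust
/// Convert one level-shifted YCbCr sample (each channel in [-128, 127])
/// to 8-bit RGB using the JFIF / BT.601 full-range coefficients.
fn ycbcr_to_rgb8(y: f32, cb: f32, cr: f32) -> [u8; 3] {
    // Assumption: Y is stored centered on zero, so shift it back to 0..=255.
    let luma = y + 128.0;
    let r = luma + 1.402 * cr;
    let g = luma - 0.344136 * cb - 0.714136 * cr;
    let b = luma + 1.772 * cb;
    // Round to nearest and clamp to the displayable range.
    [r, g, b].map(|v| v.round().clamp(0.0, 255.0) as u8)
}
```

Neutral chroma (Cb = Cr = 0) yields gray, so a quick sanity check is that `ycbcr_to_rgb8(0.0, 0.0, 0.0)` gives mid-gray.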
#### Reading JPEG Info Without Decoding
```rust
let info = Decoder::new().read_info(&jpeg_data)?;
println!("{}x{}, {} components", info.width, info.height, info.num_components);
```
#### Decoder Options
| Option | Description | Default |
|--------|-------------|---------|
| `.output_format(fmt)` | Output pixel format | `Rgb` |
| `.fancy_upsampling(bool)` | Smooth chroma upsampling | `true` |
| `.block_smoothing(bool)` | DCT block edge smoothing | `false` |
| `.apply_icc(bool)` | Apply embedded ICC profile | `true` |
| `.max_pixels(n)` | Pixel count limit (DoS protection) | 100M |
| `.max_memory(n)` | Memory limit in bytes | 512 MB |
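
The idea behind a pixel-count limit can be sketched in a few lines: compute the pixel count from the header dimensions with overflow-safe arithmetic and reject the image before allocating anything. `check_pixel_limit` is a hypothetical helper for illustration and says nothing about how the crate implements `.max_pixels(n)` internally.

```rust
/// Reject images whose pixel count exceeds `max_pixels`.
/// Returns the pixel count on success.
fn check_pixel_limit(width: u32, height: u32, max_pixels: u64) -> Result<u64, String> {
    // Widening to u64 first means the product of two u32 values cannot overflow.
    let pixels = u64::from(width) * u64::from(height);
    if pixels > max_pixels {
        Err(format!(
            "image has {pixels} pixels, which exceeds the limit of {max_pixels}"
        ))
    } else {
        Ok(pixels)
    }
}
```

A JPEG header can declare up to 65535x65535 pixels in a few bytes, so a check like this matters most when decoding untrusted input.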
#### Decoded Image Methods
```rust
let image = Decoder::new().decode(&jpeg_data)?;
let width = image.width();             // Image width
let height = image.height();           // Image height
let (w, h) = image.dimensions();       // (width, height) tuple
let pixels = image.pixels();           // &[u8] pixel data
let bpp = image.bytes_per_pixel();     // Bytes per pixel for the output format
let stride = image.stride();           // Bytes per row
```
#### DecoderConfig (Advanced)
```rust
use jpegli::decoder::{Decoder, DecoderConfig};
// Most users should use the builder methods instead:
let image = Decoder::new()
    .fancy_upsampling(true)
    .block_smoothing(false)
    .apply_icc(true)
    .max_pixels(100_000_000)
    .max_memory(512 * 1024 * 1024)
    .decode(&jpeg_data)?;
// Or construct DecoderConfig directly:
let config = DecoderConfig::default();
let decoder = Decoder::from_config(config);
```
## Performance
### Encoding Speed
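| Image size | Sequential | Progressive | Notes |
|------------|------------|-------------|-------|
| 512x512 | 118 MP/s | 58 MP/s | Small images |
| 1024x1024 | 92 MP/s | 36 MP/s | Medium images |
| 2048x2048 | 87 MP/s | 46 MP/s | Large images |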
### Sequential vs Progressive
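| Quality | Sequential | Progressive | Size change | Encode time (vs. sequential) |
|---------|------------|-------------|-------------|------------------------------|
| Q50 | 322 KB | 313 KB | **-2.8%** | 2.5x |
| Q70 | 429 KB | 416 KB | **-3.0%** | 2.0x |
| Q85 | 586 KB | 568 KB | **-3.1%** | 2.1x |
| Q95 | 915 KB | 887 KB | **-3.1%** | 2.2x |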
**Progressive produces ~3% smaller files** at the same quality, but takes ~2x longer.
**Recommendation:**
- Use **Sequential** for: real-time encoding, high throughput
- Use **Progressive** for: web delivery, storage optimization
### Decoding Speed
| Decoder | Speed | Notes |
|---------|-------|-------|
| zune-jpeg | 392 MP/s | Integer IDCT, AVX2 |
| jpeg-decoder | 120 MP/s | Integer IDCT |
| **jpegli-rs** | **47 MP/s** | f32 IDCT, 12-bit precision |
The decoder prioritizes precision over speed, matching C++ jpegli's 12-bit pipeline.
## Table Optimization
The `EncodingTables` API provides fine-grained control over quantization and zero-bias
tables for researching better encoding parameters.
### Quick Start
```rust
use jpegli::encoder::{EncoderConfig, ChromaSubsampling};
use jpegli::encoder::tuning::{EncodingTables, ScalingParams, dct};
// Start from defaults and modify
let mut tables = EncodingTables::default_ycbcr();
// Scale a specific coefficient (component 0 = Y, k = coefficient index)
tables.scale_quant(0, 5, 1.2); // 20% higher quantization at position 5
// Or use exact quantization values (no quality scaling)
tables.scaling = ScalingParams::Exact;
tables.quant.c0[0] = 16.0; // DC quantization for Y
let config = EncoderConfig::ycbcr(85.0, ChromaSubsampling::Quarter)
    .tables(Box::new(tables));
```
### Understanding the Parameters
**Quantization Tables** (`quant`): 64 coefficients per component (Y/Cb/Cr or X/Y/B)
- Lower values = more precision = larger file
- Higher values = more compression = smaller file
- DC (index 0) affects brightness uniformity
- Low frequencies (e.g., indices 1, 2, 8, 9, 16, which lie within Manhattan distance 2 of DC) affect gradients
- High frequencies affect edges and texture
**Zero-Bias Tables** (`zero_bias_mul`, `zero_bias_offset_*`):
- Control rounding behavior during quantization
- `zero_bias_mul[k]` multiplies the dead zone around zero
- Higher values = more aggressive zeroing of small coefficients = smaller files
- `zero_bias_offset_dc/ac` add to the threshold before zeroing
**Scaling Params**:
- `ScalingParams::Scaled { global_scale, frequency_exponents }` - quality-dependent scaling
- `ScalingParams::Exact` - use raw values (must be valid u16 range)
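
To make the roles of the quantization value and the zero-bias parameters concrete, here is a simplified model of quantizing a single coefficient. It assumes the dead-zone threshold is `zero_bias_offset + zero_bias_mul * aq_strength`, where `aq_strength` stands in for a per-block adaptive-quantization strength; that formula and both function names are assumptions made for this sketch, not the crate's actual code.

```rust
/// Quantize one DCT coefficient, zeroing it if it falls inside the dead zone.
fn quantize(
    coef: f32,
    quant: f32,
    zero_bias_mul: f32,
    zero_bias_offset: f32,
    aq_strength: f32,
) -> i32 {
    let scaled = coef / quant;
    // Assumed threshold model: a fixed offset plus a multiplier scaled by block strength.
    let threshold = zero_bias_offset + zero_bias_mul * aq_strength;
    if scaled.abs() < threshold {
        0 // inside the dead zone: the coefficient is dropped
    } else {
        scaled.round() as i32
    }
}

/// Reconstruct the coefficient value a decoder would see.
fn dequantize(level: i32, quant: f32) -> f32 {
    level as f32 * quant
}
```

With a coefficient of 37.0, a quant value of 4.0 reconstructs to 36.0 while a quant value of 16.0 reconstructs to 32.0, which shows why lower values preserve more precision. Raising `zero_bias_mul` widens the dead zone, so a coefficient that would otherwise round to 1 becomes 0 and costs fewer bits.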
### DCT Coefficient Layout
```
Position in 8x8 block (row-major index k):
0 1 2 3 4 5 6 7
8 9 10 11 12 13 14 15
16 17 18 19 20 21 22 23
24 25 26 27 28 29 30 31
32 33 34 35 36 37 38 39
40 41 42 43 44 45 46 47
48 49 50 51 52 53 54 55
56 57 58 59 60 61 62 63
k=0 is DC (average brightness)
k=1,8 are lowest AC frequencies (horizontal/vertical gradients)
k=63 is highest frequency (diagonal detail)
```
Use `dct::freq_distance(k)` to get Manhattan distance from DC (0-14).
Use `dct::IMPORTANCE_ORDER` for coefficients sorted by perceptual impact.
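
The index arithmetic behind the layout above is small enough to sketch directly. These two stand-alone functions follow the row-major convention shown in the diagram; they use different names from the crate's `dct` helpers on purpose, since they are an illustration rather than the crate's implementation.

```rust
/// (row, col) of row-major index `k` within an 8x8 block.
fn row_col_of(k: usize) -> (usize, usize) {
    (k / 8, k % 8)
}

/// Manhattan distance of coefficient `k` from DC (0 for DC, 14 for k = 63).
fn manhattan_from_dc(k: usize) -> usize {
    let (row, col) = row_col_of(k);
    row + col
}
```

For example, `k = 9` sits at row 1, column 1, so its distance from DC is 2.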
### Research Methodology
#### 1. Corpus-Based Optimization
```rust
use jpegli::encoder::tuning::{EncodingTables, dct};
// `Image`, `encode_with_tables`, and `ssimulacra2_per_byte` are placeholders
// for your own corpus type, encode wrapper, and metric.
fn evaluate_tables(tables: &EncodingTables, corpus: &[Image]) -> f64 {
    let mut total_score = 0.0;
    for image in corpus {
        let jpeg = encode_with_tables(image, tables);
        let score = ssimulacra2_per_byte(&jpeg, image); // quality/size
        total_score += score;
    }
    total_score / corpus.len() as f64
}

// Grid search over coefficient k
fn optimize_coefficient(k: usize, component: usize, corpus: &[Image]) {
    let mut best_score = f64::MIN;
    let mut best_value = 1.0;
    for scale in [0.5, 0.75, 1.0, 1.25, 1.5, 2.0] {
        // Start from fresh defaults so each candidate is evaluated in isolation
        let mut tables = EncodingTables::default_ycbcr();
        tables.scale_quant(component, k, scale);
        let score = evaluate_tables(&tables, corpus);
        if score > best_score {
            best_score = score;
            best_value = scale;
        }
    }
    println!("Coefficient {} best scale: {}", k, best_value);
}
```
#### 2. Gradient-Free Optimization
For automated discovery, use derivative-free optimizers:
```rust
// Using argmin crate with Nelder-Mead
use argmin::solver::neldermead::NelderMead;
use jpegli::encoder::tuning::{EncodingTables, dct};

// Cost function for the solver; `evaluate_tables` is the corpus scorer from step 1.
fn objective(params: &[f64], corpus: &[Image]) -> f64 {
    let mut tables = EncodingTables::default_ycbcr();
    // Map params to table modifications (e.g., first 10 most impactful coefficients)
    for (i, &scale) in params.iter().enumerate() {
        let k = dct::IMPORTANCE_ORDER[i + 1]; // Skip DC
        tables.scale_quant(0, k, scale as f32); // Y component
    }
    -evaluate_tables(&tables, corpus) // Negative because we minimize
}
```
**Recommended optimizers:**
- **CMA-ES** (Covariance Matrix Adaptation): Best for 10-50 parameters
- **Nelder-Mead**: Good for quick exploration, 5-20 parameters
- **Differential Evolution**: Robust, handles constraints well
- **Bayesian Optimization**: Sample-efficient when evaluations are expensive
#### 3. Image-Adaptive Tables
Different image categories may benefit from different tables:

| Content type | Suggested adjustment |
|--------------|----------------------|
| Photographs | Lower DC/low-freq quant, preserve gradients |
| Graphics/UI | Higher high-freq quant, preserve edges |
| Text on photos | Balance - preserve both |
| Skin tones | Lower Cb/Cr quant in mid frequencies |
```rust
fn classify_and_encode(image: &Image) -> Vec<u8> {
    let tables = match classify_content(image) {
        ContentType::Photo => tables_optimized_for_photos(),
        ContentType::Graphic => tables_optimized_for_graphics(),
        ContentType::Mixed => EncodingTables::default_ycbcr(),
    };
    encode_with_tables(image, &tables)
}
```
#### 4. Perceptual Weighting
Use quality metrics to weight optimization:
```rust
// SSIMULACRA2 weights certain frequencies more than others
// Butteraugli penalizes different artifacts
fn multi_metric_score(jpeg: &[u8], original: &Image) -> f64 {
    let ssim2 = ssimulacra2(jpeg, original);
    let butteraugli = butteraugli_distance(jpeg, original);
    let size = jpeg.len() as f64;
    // Combine: higher quality, lower butteraugli, smaller size
    (ssim2 * 100.0 - butteraugli * 10.0) / (size / 1000.0)
}
```
### Ideas for Research
1. **Content-aware table selection**: Train a classifier to select optimal tables
2. **Quality-dependent tables**: Different tables for Q50 vs Q90
3. **Resolution-dependent**: High-res images may need different high-freq handling
4. **Per-block adaptive**: Use AQ to modulate per-block quantization
5. **Machine learning**: Use differentiable JPEG approximations to train tables
6. **Genetic algorithms**: Evolve table populations over a corpus
7. **Transfer learning**: Start from optimized tables for similar content
### Available Helpers
```rust
use jpegli::encoder::tuning::dct;
// Coefficient analysis
dct::freq_distance(k) // Manhattan distance from DC (0-14)
dct::row_col(k) // (row, col) in 8x8 block
dct::to_zigzag(k) // Row-major to zigzag order
dct::from_zigzag(z) // Zigzag to row-major
dct::IMPORTANCE_ORDER // Coefficients by perceptual impact
// Table manipulation
tables.scale_quant(c, k, factor) // Scale one coefficient
tables.perturb_quant(c, k, delta) // Add delta to coefficient
tables.blend(&other, t) // Linear interpolation (0.0-1.0)
tables.quant.scale_component(c, f) // Scale entire component
tables.quant.scale_all(f) // Scale all coefficients
```
## Overshoot Deringing
**Enabled by default.** This technique was pioneered by [@kornel](https://github.com/kornelski)
in [mozjpeg](https://github.com/mozilla/mozjpeg) and significantly improves quality for
documents, screenshots, and graphics without any quality penalty for photographic content.
### The Problem
JPEG uses DCT (Discrete Cosine Transform) which represents pixel blocks as sums of cosine
waves. Hard edges—like text on a white background—create high-frequency components that
are difficult to represent accurately. The result is "ringing": oscillating artifacts that
look like halos or waves emanating from sharp transitions.
### The Insight
JPEG decoders clamp output values to 0-255. This means to display white (255), any encoded
value ≥255 works identically after clamping. The encoder can exploit this "headroom" above
the displayable range.
### The Solution
Instead of encoding a flat plateau at the maximum value, deringing creates a smooth curve
that "overshoots" above the maximum:
- The peak (above 255) gets clamped to 255 on decode
- The result looks identical to the original
- But the smooth curve compresses much better with fewer artifacts!
This is analogous to "anti-clipping" in audio processing.
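
The clamping argument can be checked with a toy one-dimensional example. The sample values below are hand-picked for illustration and were not produced by the deringing algorithm; `clamp_to_u8` is a simplified model of a decoder's final step, not code from the crate.

```rust
/// Model of the decoder's final step: round each sample and clamp it
/// to the displayable 0..=255 range.
fn clamp_to_u8(samples: &[f32]) -> Vec<u8> {
    samples
        .iter()
        .map(|&v| v.round().clamp(0.0, 255.0) as u8)
        .collect()
}

/// A flat white plateau between two darker pixels.
const PLATEAU: [f32; 8] = [40.0, 255.0, 255.0, 255.0, 255.0, 255.0, 255.0, 40.0];

/// The same row with the plateau replaced by a curve that overshoots 255.
const OVERSHOOT: [f32; 8] = [40.0, 255.0, 268.0, 279.0, 279.0, 268.0, 255.0, 40.0];
```

After clamping, both rows display as the same pixels, so the encoder is free to pick whichever shape is cheaper to represent in the DCT domain.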
### When It Helps Most
- Documents and screenshots with white backgrounds
- Text and graphics with hard edges
- Any image with saturated regions (pixels at 0 or 255)
- UI elements with sharp corners
### Usage
Deringing is **on by default**. To disable it (not recommended):
```rust
use jpegli::encoder::{EncoderConfig, ChromaSubsampling};
let config = EncoderConfig::ycbcr(85, ChromaSubsampling::Quarter)
    .deringing(false); // Disable deringing
```
## C++ Parity Status
Tested against C++ jpegli on frymire.png (1118x1105):

| Metric | jpegli-rs | C++ jpegli | Difference |
|--------|-----------|------------|------------|
| File size (Q85 seq) | 586.3 KB | 586.7 KB | **-0.1%** |
| File size (Q85 prog) | 568.2 KB | 565.1 KB | **+0.5%** |
| SSIM2 (Q85) | 69.0 | 69.0 | **identical** |
Quality is identical; file sizes within 0.5%.
### Comparing with C++ jpegli: 2 vs 3 Quantization Tables
When comparing output between jpegli-rs and C++ jpegli, **use `jpegli_set_distance()`
in C++**, not `jpeg_set_quality()`. Here's why:
**The issue:**
- `jpeg_set_quality()` in C++ uses **2 tables** (one for Y, one shared by Cb and Cr)
- `jpegli_set_distance()` in C++ uses **3 tables** (separate Y, Cb, Cr tables)
- jpegli-rs uses **3 tables by default** (`.separate_chroma_tables(true)`)
Using `jpeg_set_quality()` for comparison will show ~4% file size differences and
different quantization behavior because the encoders are configured differently.
**Correct comparison (FFI):**
```c
// C++ - use distance-based quality (3 tables)
jpegli_set_distance(&cinfo, 1.0, JPEGLI_TRUE); // distance 1.0 ≈ quality 90
// NOT: jpeg_set_quality(&cinfo, 90, TRUE); // 2 tables - invalid comparison!
```
**Quality to distance conversion:**
```rust
fn quality_to_distance(q: f32) -> f32 {
    if q >= 100.0 {
        0.01
    } else if q >= 30.0 {
        0.1 + (100.0 - q) * 0.09
    } else {
        53.0 / 3000.0 * q * q - 23.0 / 20.0 * q + 25.0
    }
}
// q90 → distance 1.0, q75 → distance 2.35
```
With proper distance-based comparison, size and quality differences are typically
within ±1%.
**Matching jpeg_set_quality() behavior:**
If you need output that matches tools using `jpeg_set_quality()` (2 tables),
use the `.separate_chroma_tables(false)` option:
```rust
// Match jpeg_set_quality() behavior (2 tables: Y, shared chroma)
let config = EncoderConfig::ycbcr(85, ChromaSubsampling::Quarter)
    .separate_chroma_tables(false);
```
## Feature Flags
| Feature | Default | Description |
|---------|---------|-------------|
| `decoder` | No | Enable decoder API (prerelease, API will change) |
| `ultrahdr` | No | UltraHDR HDR gain map encoding/decoding (requires `decoder`) |
| `cms-lcms2` | Yes | Color management via lcms2 |
| `cms-moxcms` | No | Pure Rust color management |
| `unsafe_simd` | No | Raw AVX2/SSE intrinsics (~10-20% faster) |
| `test-utils` | Yes | Testing utilities |
By default, the crate uses `#![forbid(unsafe_code)]`. SIMD is provided via the safe, portable `wide` crate. Enable `unsafe_simd` for raw intrinsics on x86_64.
```toml
[dependencies]
jpegli-rs = "0.11"
# With UltraHDR support:
jpegli-rs = { version = "0.11", features = ["ultrahdr"] }
# Minimal (no CMS):
jpegli-rs = { version = "0.11", default-features = false }
# With unsafe SIMD (x86_64 only):
jpegli-rs = { version = "0.11", features = ["unsafe_simd"] }
```
## Encoder Status
| Feature | Status |
|---------|--------|
| Baseline JPEG | Working |
| Progressive JPEG | Working |
| Adaptive quantization | Working |
| Huffman optimization | Working |
| 4:4:4 / 4:2:0 / 4:2:2 / 4:4:0 | Working |
| XYB color space | Working |
| Grayscale | Working |
| Custom quant tables | Working |
| ICC profile embedding | Working |
| YCbCr planar input | Working |
## Decoder Status
> **Prerelease:** Enable with `features = ["decoder"]`. API will have breaking changes.

| Feature | Status |
|---------|--------|
| Baseline JPEG | Working |
| Progressive JPEG | Working |
| All subsampling modes | Working |
| Restart markers | Working |
| ICC profile extraction | Working |
| XYB decoding | Working (with CMS) |
| f32 output | Working |
## Future Optimization Opportunities
Profiling against C++ jpegli reveals these bottlenecks (2K image, progressive 4:2:0):

| Stage | Rust | C++ | Gap | Notes |
|-------|------|-----|-----|-------|
| **RGB→YCbCr** | 11.7% | 1.7% | **6.9x** | Biggest opportunity |
| **Adaptive quantization** | 28.6% | 12.1% | **2.4x** | Algorithm efficiency |
| **Huffman freq counting** | 5.7% | 0.5% | **11x** | Already SIMD, still slow |
| DCT | 7.3% | 5.5% | 1.3x | Reasonable |
| Entropy encoding | 10.9% | 35.9% | — | C++ slower here |
**Crates to investigate for RGB→YCbCr:**
- [`yuv`](https://lib.rs/crates/yuv) (0.8.9) - Faster than libyuv, AVX-512/AVX2/SSE/NEON
- [`yuvutils-rs`](https://lib.rs/crates/yuvutils-rs) - AVX2/SSE/NEON, optional AVX-512
- [`dcv-color-primitives`](https://lib.rs/crates/dcv-color-primitives) - AWS, AVX2/NEON
Current gap: Rust is **~1.6-1.9x slower** than C++ jpegli (fair FFI comparison).
## Development
### Verify C++ Parity
```bash
# Quick parity test (no C++ build needed)
cargo test --release --test cpp_parity_locked
# Full comparison (requires C++ jpegli built)
cargo test --release --test comprehensive_cpp_comparison -- --nocapture --ignored
```
### Building C++ Reference (Optional)
```bash
git submodule update --init --recursive
cd internal/jpegli-cpp && mkdir -p build && cd build
cmake -G Ninja -DCMAKE_BUILD_TYPE=Release -DJPEGXL_ENABLE_TOOLS=ON ..
ninja cjpegli djpegli
```
## License
**AGPL-3.0-or-later**
A commercial license is available from https://imageresizing.net/pricing
## Acknowledgments
Originally a port of [jpegli](https://github.com/libjxl/libjxl/tree/main/lib/jpegli)
from the JPEG XL project by Google (BSD-3-Clause). After six rewrites, this is now
an independent project that shares ideas but little code with the original.
## AI Disclosure
Developed with assistance from Claude (Anthropic). Extensively tested against
C++ reference with 340+ tests. Report issues at https://github.com/imazen/jpegli-rs/issues