SIMD-accelerated primitives for jxl_encoder.
This crate wraps platform-specific SIMD intrinsics behind safe public functions.
The main encoder crate (jxl_encoder) maintains #![forbid(unsafe_code)] and
calls into these safe wrappers.
Uses archmage for token-based SIMD dispatch and magetypes for cross-platform vector types.
§Direct variant access
Each kernel is available in three forms:
- A dispatching function (e.g. dct_8x8) that picks the best at runtime
- Concrete _avx2(token, ...) / _neon(token, ...) / _scalar(...) variants
For hot loops, callers should summon a token once, then call the concrete
variant directly from an #[arcane] function so LLVM can inline across the
target-feature boundary.
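The summon-once pattern described above can be sketched in plain Rust. This is a minimal illustration under stated assumptions, not the archmage or jxl_encoder_simd API: FastPathToken, cpu_has_fast_path, scale, scale_fast, scale_scalar, and scale_all are hypothetical stand-ins, and the "fast" variant reuses the portable loop body where a real kernel would use intrinsics inside an #[arcane] function.

```rust
// Hypothetical stand-ins that only imitate the token pattern; none of these
// names come from archmage or jxl_encoder_simd.

#[cfg(target_arch = "x86_64")]
fn cpu_has_fast_path() -> bool {
    std::arch::is_x86_feature_detected!("avx2")
}

#[cfg(target_arch = "aarch64")]
fn cpu_has_fast_path() -> bool {
    std::arch::is_aarch64_feature_detected!("neon")
}

#[cfg(not(any(target_arch = "x86_64", target_arch = "aarch64")))]
fn cpu_has_fast_path() -> bool {
    false
}

/// Zero-sized proof that the runtime feature check has passed.
/// The private field means a token can only be built through `summon`.
#[derive(Clone, Copy, Debug)]
pub struct FastPathToken(());

impl FastPathToken {
    /// Runs the feature check; returns `None` when the feature is missing.
    pub fn summon() -> Option<Self> {
        if cpu_has_fast_path() {
            Some(Self(()))
        } else {
            None
        }
    }
}

/// Portable fallback: scales every coefficient by `k`.
pub fn scale_scalar(block: &mut [f32; 64], k: f32) {
    for v in block.iter_mut() {
        *v *= k;
    }
}

/// Concrete variant: it demands a token, so it is only reachable after a
/// successful check. This sketch reuses the portable loop body.
pub fn scale_fast(_token: FastPathToken, block: &mut [f32; 64], k: f32) {
    for v in block.iter_mut() {
        *v *= k;
    }
}

/// Dispatching form: convenient, but repeats the check on every call.
pub fn scale(block: &mut [f32; 64], k: f32) {
    match FastPathToken::summon() {
        Some(token) => scale_fast(token, block, k),
        None => scale_scalar(block, k),
    }
}

/// Hot loop: summon once, then call the concrete variant for each block.
pub fn scale_all(blocks: &mut [[f32; 64]], k: f32) {
    match FastPathToken::summon() {
        Some(token) => {
            for block in blocks.iter_mut() {
                scale_fast(token, block, k);
            }
        }
        None => {
            for block in blocks.iter_mut() {
                scale_scalar(block, k);
            }
        }
    }
}
```

Calling scale_all(&mut blocks, 2.0) performs one feature check for the whole slice, whereas calling scale on each block repeats the check every time. Because the token is Copy and zero-sized, passing it into the inner call costs nothing at runtime.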
Structs§
- EntropyCoeffResult - Results from vectorized entropy coefficient processing.
- NeonToken - Proof that NEON is available.
Traits§
- SimdToken - Marker trait for SIMD capability tokens.
Functions§
- compute_block_l2_errors - Compute per-block masked weighted L2 error between original and reconstructed XYB planes.
- compute_block_l2_errors_neon
- compute_block_l2_errors_scalar
- compute_mask1x1 - Compute per-pixel masking field from XYB Y channel.
- compute_mask1x1_neon
- compute_mask1x1_scalar
- dct_8x8 - Compute scaled 8x8 forward DCT with SIMD acceleration.
- dct_8x8_neon - NEON 8x8 forward DCT: two-pass (4 columns at a time), in-register transpose.
- dct_8x8_scalar
- dct_8x16 - Compute scaled 8x16 forward DCT with SIMD acceleration.
- dct_8x16_neon - NEON 8x16 forward DCT.
- dct_8x16_scalar
- dct_16x8 - Compute scaled 16x8 forward DCT with SIMD acceleration.
- dct_16x8_neon - NEON 16x8 forward DCT.
- dct_16x8_scalar
- dct_16x16 - Compute 16x16 forward DCT with SIMD acceleration.
- dct_16x16_neon - NEON 16x16 forward DCT: process 4 rows at a time.
- dct_16x16_scalar
- dequant_block_dct8 - Dequantize a DCT8 block and apply CfL (chroma-from-luma) in one pass.
- dequant_dct8_neon
- dequant_dct8_scalar
- entropy_coeffs_neon
- entropy_coeffs_scalar
- entropy_estimate_coeffs - Vectorized entropy coefficient processing.
- epf_step1 - Apply EPF Step 1 to 3-channel XYB planes.
- epf_step2 - Apply EPF Step 2 to 3-channel XYB planes.
- epf_step1_neon
- epf_step1_scalar
- epf_step2_neon
- epf_step2_scalar
- forward_xyb_neon
- forward_xyb_scalar
- gab_smooth_channel - Apply 3x3 weighted gaborish smooth to a single channel in-place.
- gab_smooth_neon - NEON gab smooth: processes 4 pixels per iteration in interior rows.
- gab_smooth_scalar
- gaborish_5x5_channel - Apply the 5x5 gaborish inverse kernel to a single channel.
- gaborish_5x5_neon - NEON gaborish 5x5: processes 4 pixels per iteration in interior region.
- gaborish_5x5_scalar
- idct_8x8 - Compute scaled 8x8 inverse DCT with SIMD acceleration.
- idct_8x8_neon - NEON 8x8 inverse DCT.
- idct_8x8_scalar
- idct_8x16 - Compute 8x16 inverse DCT with SIMD acceleration.
- idct_8x16_neon - NEON 8x16 inverse DCT.
- idct_8x16_scalar
- idct_16x8 - Compute 16x8 inverse DCT with SIMD acceleration.
- idct_16x8_neon - NEON 16x8 inverse DCT.
- idct_16x8_scalar
- idct_16x16 - Compute 16x16 inverse DCT with SIMD acceleration.
- idct_16x16_neon - NEON 16x16 inverse DCT: process 4 rows at a time.
- idct_16x16_scalar
- inverse_xyb_neon
- inverse_xyb_planar_neon
- inverse_xyb_planar_scalar
- inverse_xyb_scalar
- linear_rgb_to_xyb_batch - Convert separate R, G, B channel buffers to separate X, Y, B channel buffers.
- pixel_domain_loss - Compute pixel-domain loss for one channel of a block.
- pixel_domain_loss_neon
- pixel_domain_loss_scalar
- quantize_block_dct8 - Quantize a DCT8 block (64 coefficients) with dead-zone thresholding.
- quantize_dct8_neon
- quantize_dct8_scalar
- transpose_8x8 - Transpose an 8x8 f32 matrix.
- transpose_8x8_neon - NEON 8x8 transpose using four 4x4 sub-transposes.
- xyb_to_linear_rgb_batch - Convert separate X, Y, B channel buffers to interleaved linear RGB.
- xyb_to_linear_rgb_planar - Convert separate X, Y, B channel buffers to planar linear RGB.
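
The "four 4x4 sub-transposes" idea mentioned for the NEON 8x8 transpose can be illustrated with a scalar sketch. It assumes a row-major [f32; 64] layout; the names transpose_tile_4x4 and transpose_8x8_blocked and their signatures are hypothetical and are not taken from this crate. Each 4x4 tile is transposed and written to the mirrored tile position, which is the same decomposition a 4-lane SIMD version would use.

```rust
// Scalar sketch of an 8x8 transpose built from four 4x4 tile transposes.
// Assumes row-major storage; names and signatures are illustrative only.

/// Top-left corners (row, column) of the four 4x4 tiles in an 8x8 matrix.
const TILES: [(usize, usize); 4] = [(0, 0), (0, 4), (4, 0), (4, 4)];

/// Transposes the 4x4 tile of `src` whose top-left corner is (r0, c0) and
/// writes it into the tile of `dst` whose top-left corner is (c0, r0).
fn transpose_tile_4x4(src: &[f32; 64], dst: &mut [f32; 64], r0: usize, c0: usize) {
    for r in 0..4 {
        for c in 0..4 {
            dst[(c0 + c) * 8 + (r0 + r)] = src[(r0 + r) * 8 + (c0 + c)];
        }
    }
}

/// Transposes a row-major 8x8 matrix by handling each 4x4 tile separately.
/// The off-diagonal tiles swap places; the diagonal tiles stay where they are.
pub fn transpose_8x8_blocked(src: &[f32; 64]) -> [f32; 64] {
    let mut dst = [0.0f32; 64];
    for &(r0, c0) in TILES.iter() {
        transpose_tile_4x4(src, &mut dst, r0, c0);
    }
    dst
}
```

Applying transpose_8x8_blocked twice returns the original matrix, which makes a convenient sanity check when comparing a SIMD variant against a scalar reference.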