oxideav_jpegxl/lib.rs
1//! JPEG XL (JXL) codec — decoder-side header parsing.
2//!
3//! JPEG XL is ISO/IEC 18181 (final specification 2022). It supersedes
4//! classic JPEG with a modal design that separates a "VarDCT" path
5//! (variable-size DCT + LF/HF subbands, quality-competitive with AVIF
6//! and modern JPEG) from a "Modular" path (grid-of-pixels predictor +
7//! MA-tree range coder, strong at lossless + non-photo material).
8//!
9//! This crate currently ships:
10//!
11//! * Container + signature detection for both JXL wrappings:
12//! raw codestream (`FF 0A`) and ISOBMFF-wrapped
13//! (`00 00 00 0C 4A 58 4C 20 0D 0A 87 0A`), including extraction of
14//! the codestream from `jxlc` / `jxlp` boxes.
15//! * An LSB-first [`bitreader::BitReader`] matching the reference
16//! bit packing used by the codestream.
17//! * Parsing of the codestream preamble: [`metadata::SizeHeader`] and the
18//! [`metadata::ImageMetadata`] fields up to `num_extra_channels`
19//! (bit depth, orientation, preview/animation flags). Fuller
20//! ColorEncoding + ToneMapping decoding is deferred.
21//! * Modular sub-bitstream pixel decode (per the 2019 committee draft,
22//! Annexes C.9 + D.7), made of:
23//! - [`abrac::Abrac`] — the bit-level adaptive range coder (D.7);
24//! - [`begabrac::Begabrac`] — bounded-Exp-Golomb integer coder over a
25//! known signed range (D.7.1);
26//! - [`matree::MaTree`] — the meta-adaptive decision tree that picks
27//! a per-context BEGABRAC for each pixel (D.7.2 / D.7.3);
28//! - [`predictors`] — the five named pixel predictors (Zero, Average,
29//! Gradient, Left, Top) from C.9.3.1;
30//! - [`modular`] — the channel header parser plus the per-pixel
31//! property + predictor + entropy decode loop.
32//!
33//! The integrated registered decoder is not yet wired: the registered
34//! `make_decoder` reports [`Error::Unsupported`] because the surrounding
35//! codestream framing (FrameHeader + TOC + frame-byte alignment) is not
36//! yet wired to the per-channel path. Programs that only need
37//! probe-level information (dimensions, bit depth) should call
38//! [`probe`] directly; programs that want to drive the per-channel
39//! Modular decode end-to-end should instantiate
40//! [`modular::decode_single_channel`] against a hand-built fixture
41//! (unit tests in `modular` show the expected wire format).
42//!
43//! Follow-up work (tracked for the eventual landing PR):
44//!
45//! * GlobalModular wiring (C.4.8) so the FDIS path can actually drive
46//! the Modular sub-bitstream end-to-end.
47//! * Squeeze inverse transform (I.3) for multi-resolution Modular
48//! images.
49//! * VarDCT-path decoder (variable-size DCT + LF/HF, Chroma-from-Luma,
50//! Gaborish smoothing, EPF) — out of scope for this round.
51//! * MABrotli / MAANS entropy coders (the 2019 committee draft's
52//! `entropy_coder` ∈ {1, 2}); only MABEGABRAC (`entropy_coder == 0`)
53//! is implemented today.
54//!
55//! ## FDIS 18181-1:2021 layer
56//!
57//! In addition to the committee-draft pipeline above, the FDIS layer
58//! is being built up additively across rounds:
59//!
60//! * Round 1: [`ans`] — FDIS Annex D entropy decoder (prefix codes,
61//! ANS, distribution clustering, hybrid integer coding).
62//! * Round 2: [`extensions`] — A.5 Extensions; [`metadata_fdis`] —
63//! full A.6 ImageMetadata refresh including ColorEncoding,
64//! ToneMapping, ExtraChannelInfo, AnimationHeader, OpsinInverseMatrix,
65//! PreviewHeader; [`frame_header`] — C.2 FrameHeader bundle including
66//! Passes, BlendingInfo, RestorationFilter; [`toc`] — C.3 TOC with
67//! Lehmer-code permutation decoder driven by the round-1 ANS layer;
68//! [`ans::cluster::read_general_clustering`] — D.3.5 general path.
69//! * Round 3 (planned): GlobalModular wiring + cjxl fixture decode.
70//!
71//! ## Round-1 (2024-spec) status (this commit)
72//!
73//! `make_decoder` returns a live decoder ([`JxlDecoder`]) that handles
74//! the simplest end-to-end Modular bitstreams:
75//!
76//! * Raw codestream OR ISOBMFF wrapping.
77//! * Grey (1 plane) OR RGB (3 planes), 8 bits per sample (integer).
78//! * Single-group, single-pass frame (`num_groups == 1 &&
79//! num_passes == 1`).
80//! * `nb_transforms` arbitrary at the *parse* level (TransformInfo
81//! bundles per H.7 are decoded for any nb_transforms > 0); inverse
82//! application of Palette / Squeeze defers to round 2 with a clean
83//! `Error::Unsupported` exit point. RCT (no channel-list change)
84//! passes through the layout step.
85//! * Multi-leaf MA tree evaluated end-to-end (decision-node
86//! `property[k] > value` traversal per H.4.1).
87//! * `use_global_tree` is honoured.
88//! * No Patches / Splines / NoiseParameters — those are LfGlobal
89//! features round 2 will land alongside the VarDCT path.
90//! * No ICC profile (Annex E.4).
91//! * Predictor 6 (Annex H.5 Self-correcting) only resolved at the
92//! (0, 0) origin; full WP defers to round 2.
93//!
94//! The acceptance fixture for round 1 is `pixel-1x1.jxl` (1×1 RGB
95//! lossless, 22 B): decodes to R=255 G=0 B=0 matching its
96//! `expected.png`.
97//!
98//! Anything outside this envelope returns
99//! [`Error::Unsupported`](oxideav_core::Error::Unsupported) at the
100//! relevant gate point. Wider coverage (VarDCT, Squeeze inverse,
101//! Palette inverse, ICC, full WP predictor 6) lands in round 2+.
102//!
103//! ## Round-6 (2024-spec) additions
104//!
105//! * **Annex E.4 ICC profile decode** ([`icc`]): the 7-state-equivalent
106//! entropy-coded ICC byte stream (41 pre-clustered distributions +
107//! `IccContext(i, b1, b2)` 41-context function) is decoded into the
108//! final ICC profile bytes per E.4.3 (header), E.4.4 (tag list) and
109//! E.4.5 (main content). When `metadata.colour_encoding.want_icc ==
110//! true` the bit-position is now correctly advanced past the ICC
111//! stream rather than failing with `Error::Unsupported` outright;
112//! the decoded bytes are validated for the "acsp" magic at offset 36
113//! but are not yet propagated to `oxideav_core::VideoFrame` (which
114//! has no ICC slot in 0.1.x).
115//! * **G.2 LfGroup / G.4 PassGroup type scaffolding** ([`lf_group`],
116//! [`pass_group`]): typed bundles + per-group rectangle geometry +
117//! `(minshift, maxshift)` computation per pass. Per-LfGroup and
118//! per-PassGroup decode itself is not yet wired (round-7 follow-up
119//! gated on the GlobalModular `nb_meta_channels`-aware refactor —
120//! see `lf_group` crate-level docs).
121//! * Multi-LfGroup / multi-group / multi-pass / VarDCT frames fail
122//! with precise round-7-targeting error messages instead of the
123//! round-3 generic "TOC with N entries" rejection.
124//!
125//! ## Round-7 (2024-spec) additions
126//!
127//! Four-piece refactor coordinating the GlobalModular partial-decode
128//! path with per-PassGroup decode + post-PassGroup transforms (Annex
129//! G.1.3 last paragraph + G.4.2):
130//!
131//! * **Partial GlobalModular** — [`global_modular::GlobalModular::read`]
132//! stops decoding at any non-meta channel exceeding `group_dim`
133//! (G.1.3 last paragraph). Such channels are zero-filled placeholders
134//! in `image.channels` until per-PassGroup decode fills them.
135//! * **`stream_index` threading** —
136//! [`modular_fdis::decode_channels_at_stream`] takes the stream index
137//! from Table H.4: `0` for GlobalModular,
138//! `1 + 3*num_lf_groups + 17 + num_groups * pass_idx + group_idx` for
139//! ModularGroup. Threaded through `get_properties` so the MA tree's
140//! `property[1] > value` decisions select the correct per-section
141//! leaf.
142//! * **TOC layout + empty entries** — [`toc::Toc::read`] now accepts
143//! zero-size entries (e.g. an empty LfGroup or PassGroup section is
144//! legal when no channel matches that section's filter). The
145//! `decode_codestream` consumer addresses sections by their TOC
146//! offsets (computed from the entry running sum), with permutation
147//! already handled in the round-2 TOC reader.
148//! * **Post-PassGroup transforms** —
149//! [`global_modular::apply_inverse_transforms`] is invoked AFTER all
150//! PassGroups complete (G.4.2 last paragraph), not inside
151//! `GlobalModular::read`, so the inverse transform sees the
152//! fully-assembled image rather than a half-decoded one.
153//!
154//! Per-PassGroup decode is in
155//! [`pass_group::decode_modular_group_into`]; the
156//! `(minshift, maxshift)` computation in [`pass_group::compute_pass_shift_range`]
157//! models an implicit `n=num_ds` final-resolution entry that the
158//! printed spec text omits but whose absence would make single-pass
159//! frames decode no modular data (documented SPECGAP).
160//!
161//! **Round-7 SPECGAP** — cjxl 0.11.1 emits multi-group lossless modular
162//! fixtures where the per-cluster ANS distribution's `alphabet_size`
163//! exceeds `1 << log_alphabet_size` (specifically: alphabet_size=33
164//! against table_size=32 when `log_alphabet_size = 5 + u(2) = 5`). The
165//! 2024 spec text in C.2.5 is silent on the cap (the introductory
166//! paragraph describes D as a `1 << log_alphabet_size`-element array
167//! but the listing's alphabet_size-iterating loop can exceed it).
168//!
169//! ## Round-8 (2024-spec) additions
170//!
171//! Two themes:
172//!
173//! 1. **C.2.5 SPECGAP partial resolution** ([`ans::distribution`]):
174//! [`ans::distribution::read_distribution`] now returns
175//! `(D, log_eff)` where `log_eff` is the effective log_alphabet_size
176//! for downstream alias-table sizing. Round 8 picks
177//! "interpretation C": iterate the logcounts loop for
178//! `min(alphabet_size, table_size)` entries, treating the
179//! bitstream's signalled `alphabet_size > table_size` as a
180//! soft cap (the encoder advertises a wider alphabet but only
181//! serialises `table_size` per-symbol entries). Empirically
182//! validated by parsing the LfGlobal section of
183//! `tests/fixtures/synth_320_grey/synth_320.jxl` cleanly past
184//! the round-7 SPECGAP error. Interpretations A (grow D to
185//! accommodate alphabet_size) and B (drop writes at i >=
186//! table_size, accumulate total_count only over stored entries)
187//! were both tried and rejected — see [`ans::distribution`]
188//! crate docs for the comparison. The synth_320 fixture is
189//! still NOT decoded end-to-end: a separate post-LfGlobal blocker
190//! appears (cjxl emits a 0-byte PassGroup[0][0] slot which
191//! contradicts the spec's "all groups carry data per pass"
192//! rule); that is round-9+ work.
193//!
194//! 2. **VarDCT scaffold** ([`vardct`]): the FrameHeader's
195//! `encoding == kVarDCT` path is now structurally recognised
196//! rather than rejected with a generic `Error::Unsupported`.
197//! The module exposes
198//! [`vardct::recognise_vardct_codestream`] which validates the
199//! round-8 envelope (single LF group, single pass, no extra
200//! channels, Grey or RGB colour space) and returns a
201//! [`vardct::VarDctScaffold`] geometry record. The IDCT-II
202//! primitive for the 8x8 block size ([`vardct::idct1d_8`] +
203//! [`vardct::idct2d_8x8`]) is also wired with unit tests. End-
204//! to-end VarDCT pixel decode (LF subband, HF subband, dequant,
205//! inverse transform dispatch across block sizes 8x8 / 8x16 /
206//! 16x8 / 16x16 / 32x32 / 64x64 / DCT4 / IDENTITY / AFV,
207//! Chroma-from-Luma, Gaborish smoothing, EPF) is round-9+
208//! work.
209//!
210//! ## Round-9 (2024-spec) additions
211//!
212//! Three concurrent fixes that together unblock the synth_320 fixture
213//! (multi-group lossless grey, 320×320, num_groups=9):
214//!
215//! 1. **§F.3.1 HfGlobal slot is unconditional** — the 2024 spec
216//! bullets list `HfGlobal` for every TOC, with NOTE 1 calling out
217//! that the slot is 0-byte for `encoding == kModular`. Round 8's
218//! `num_toc_entries` / [`toc::Toc::read`] gated HfGlobal on
219//! `encoding == kVarDCT`, off-by-oning every PassGroup index in
220//! multi-group kModular frames. Also: `HfPass[num_passes]` is part
221//! of the `HfGlobal` section per Annex G.3 Table G.4 — round 8 had
222//! incorrectly counted it as separate TOC entries. With both off-
223//! by-ones fixed, synth_320's TOC reads as 12 entries
224//! `[33, 0, 0, 9, 20, 7, 20, 9, 24, 7, 23, 7]` (slot 2 is the 0-
225//! byte HfGlobal, not PG[0][0]).
226//!
227//! 2. **§F.3 first-paragraph zero-padding** — "When decoding a
228//! section, no more bits are read from the codestream than 8 times
229//! the byte size indicated in the TOC; if fewer bits are read,
230//! then the remaining bits of the section all have the value
231//! zero." Round 8's [`bitreader::BitReader`] errored on EOF for
232//! section sub-readers, breaking PassGroup ANS decodes whose
233//! modular sub-bitstreams consumed fewer real bits than the
234//! TOC-stated section size. Round 9 adds
235//! [`bitreader::BitReader::new_section`] which returns 0 for any
236//! read past the end of the section data; the legacy
237//! [`bitreader::BitReader::new`] preserves strict EOF for whole-
238//! codestream parsing.
239//!
240//! 3. **Per-PassGroup transforms (Annex H.6 inside G.4.2)** —
241//! observed in cjxl 0.11.1's synth_320 edge groups: the encoder
242//! emits a per-group Palette transform (`begin_c=0, num_c=1,
243//! nb_colours=191`) for the 64-pixel-wide column-2 / row-2 groups.
244//! [`pass_group::decode_modular_group_into`] now applies the
245//! transform layout adjustment to the per-group channel descs,
246//! decodes against the adjusted descs, and applies the inverse
247//! transforms LOCALLY before copying samples back into the parent
248//! image. [`global_modular::apply_transforms_to_channel_layout`]
249//! is now `pub` so the per-group reuse path doesn't duplicate the
250//! table.
251//!
252//! **Round-9 status** — synth_320 reaches end-of-frame without
253//! erroring and ~21k of 102400 pixels match the expected
254//! `(y + x) & 0xFF` gradient (the first 6 rows across the first two
255//! group columns); the remaining pixels drift mid-decode in the
256//! smaller edge groups. Full pixel-correctness is round-10 work
257//! (suspected residual: ANS state nuance specific to F.3 zero-
258//! padded tail OR per-group WP bookkeeping). All five small
259//! lossless fixtures still pixel-correct vs round 4's
260//! `expected.png`.
261//!
262//! ## Round-10 (2024-spec) additions
263//!
264//! Two themes:
265//!
266//! 1. **C.1 + C.3.3 `lz_dist_ctx` spec-conformance fix** —
267//! [`modular_fdis::decode_uint_in`] and `decode_uint_in_with_dist`
268//! previously passed the per-symbol leaf context for both the
269//! literal token AND the LZ77 distance token, which contradicts
270//! the spec's "the LZ77 distance token is read using
271//! `D[clusters[lz_dist_ctx]]`" with `lz_dist_ctx = num_dist`
272//! (the dedicated extra context the codestream reserves whenever
273//! `lz77.enabled`). When LZ77 fires, that bug would distort
274//! every copy. Fixed: derive `lz_dist_ctx = cluster_map.len() -
275//! 1` from the post-prelude state of the `EntropyStream` and
276//! thread it to `HybridUintState::decode`'s `ctx_lz` argument.
277//! No-op for fixtures whose symbol stream has `lz77.enabled =
278//! false` (synth_320 included).
279//!
280//! 2. **synth_320 edge-group drift bisect** — instrumented per-
281//! decode tracing pinpoints the first mismatch at PG[0][0]
282//! decode #3087 (frame coords y=24, x=14). State 0x9CA780
283//! alias-maps to symbol 30 (cluster 0's low-prob `D[30] = 1`
284//! entry), forcing an ANS refill plus extra bits that
285//! over-consume 21 bits beyond the 9-byte slot. Bisect ruled
286//! out: per-PassGroup transform layout (PG[0][0] carries no
287//! transforms; only edge groups do); LZ77 path (off in the
288//! symbol stream); per-channel WP state reset (PG[0][0] is the
289//! first group); cluster_map / `dist_multiplier` derivation
290//! (matches H.3). Round-11+ work will need a finer state-by-
291//! state diff against djxl `--debug` (deferred to an Auditor
292//! round) since the implementer wall bars djxl source / the
293//! reference-decoder-trace doc.
294//!
295//! **Round-10 status** — synth_320 still decodes ~21k of 102400
296//! pixels matching the gradient (first 24 rows of PG[0][0] and
297//! PG[0][1] are pixel-correct; drift begins at y=24, x=14). All
298//! five small lossless fixtures still pixel-correct.
299//!
300//! ## Round-11 (2024-spec) additions
301//!
302//! Three pieces wire LF subband decode (Annex G.2.2 / I.2):
303//!
304//! 1. **LfGlobal VarDCT bundles** ([`lf_global`]):
305//! [`lf_global::Quantizer`] (FDIS C.4.3 — `global_scale` +
306//! `quant_lf`) drives LF dequant per Listing C.1.
307//! [`lf_global::LfChannelCorrelation`] (C.4.4) carries the CfL
308//! factors used by Annex G to reconstruct X/B from dY (default
309//! `colour_factor=84`, `base_correlation_x=0.0`,
310//! `base_correlation_b=1.0`). [`lf_global::HfBlockContext`]
311//! (C.8.4) implements only the `u(1)==1` default-table fast path
312//! in round 11; the per-LF-threshold / qf-threshold / clustering-
313//! map branch returns `Error::Unsupported`. With these bundles
314//! wired, `LfGlobal::read` advances correctly past the VarDCT-only
315//! region of the LfGlobal slot rather than rejecting outright.
316//!
317//! 2. **GlobalModular zero-channel acceptance**
318//! ([`global_modular`]): `GlobalModular::read` previously rejected
319//! any frame where `derive_channel_descs` returned 0 channels (the
320//! common VarDCT-without-extras case). Round 11 accepts the empty
321//! case: the inner ModularHeader (`use_global_tree`, `WPHeader`,
322//! `nb_transforms`) is still consumed, but the MA-tree and per-
323//! cluster distribution decode are skipped per FDIS C.9.1 last
324//! sentence ("In the trivial case where N is zero, the decoder
325//! takes no action.").
326//!
327//! 3. **LfGroup + LfCoefficients** ([`lf_group`]):
328//! [`lf_group::LfCoefficients::read`] reads `extra_precision = u(2)`
329//! per FDIS C.5.3, builds the per-LfGroup channel layout (3 LF
330//! channels of `ceil(group_w/8) × ceil(group_h/8)` samples,
331//! optionally further right-shifted by `frame_header.jpeg_upsampling`
332//! on chroma planes), then drives a Modular sub-bitstream with
333//! `stream_index = 1 + lf_group_index` per Table H.4.
334//! [`lf_group::LfGroup::read`] composes ModularLfGroup (G.2.3 —
335//! round-11 only handles the empty-channel-list case for now)
336//! with LfCoefficients. HfMetadata (G.2.4) is round-12+ work.
337//!
338//! Acceptance fixture: hand-built minimal VarDCT bitstream — no cjxl
339//! dependency, encoded directly from spec listings — covering an
340//! 8×8 frame with 1×1 LF coefficient channels, MA tree of one
341//! Zero-predictor leaf, and prefix-code symbol stream with
342//! alphabet_size=1 (so every decoded LF coefficient is 0). The
343//! fixture parses through `LfGlobal::read` → `LfGroup::read` →
344//! `LfCoefficients::read` end-to-end. Test:
345//! `lf_group::tests::round11_lfgroup_minimal_vardct_one_block_parses`.
346//!
347//! All five small lossless fixtures stay pixel-correct (see
348//! `tests/round11_lf_subband.rs`).
349//!
350//! ## Round-13 (2024-spec) additions
351//!
352//! Three pieces tighten the VarDCT pipeline so round-12's unit-tested
353//! F.1 / F.2 work actually runs on real codestreams:
354//!
355//! 1. **DctSelect / HfMul derivation from BlockInfo** ([`dct_select`]):
356//! walks each column of the per-LfGroup BlockInfo channel decoded
357//! in round 12, looks up the transform type in Table C.16, and
358//! places the varblock at the next-empty 8×8 cell of the LfGroup's
359//! block grid (raster order). HfMul is computed as `1 + mul`. The
360//! 27-entry transform-type table is committed verbatim with
361//! per-entry `(block_cols, block_rows)` from the FDIS spec.
362//!
363//! 2. **HfGlobal C.6 default-fast-path** ([`hf_global`]): reads the
364//! `u(1)` dequant-default flag and the `num_hf_presets - 1 =
365//! u(ceil(log2(num_groups)))` field. The non-default-encoding
366//! branch (per-matrix `encoding_mode = u(3)` + Listing C.7
367//! `ReadDctParams()`) returns `Error::Unsupported` until round 14+
368//! wires the full table.
369//!
370//! 3. **VarDCT pipeline wiring** ([`decode_vardct_round13`]): the
371//! top-level `decode_one_frame` no longer rejects VarDCT
372//! codestreams at the round-8 scaffold gate. Instead, for
373//! `num_lf_groups == 1 && num_passes == 1`, it now drives:
374//! LfGlobal → LfGroup (LfCoefficients + HfMetadata) → DctSelect
375//! derivation → HfGlobal → F.1 LF dequantisation → F.2 adaptive
376//! smoothing (when not skipped). The round-13 pipeline returns
377//! `Error::Unsupported` with a "round 14+: HF subband decode +
378//! IDCT not yet wired" message AFTER all round-12 work has run on
379//! the real input.
380//!
381//! Round-13 status — five small lossless Modular fixtures stay
382//! pixel-correct; both VarDCT fixtures (`vardct_256x256_d1.jxl` and
383//! `vardct_256x256_d3.jxl`, copied from `docs/image/jpegxl/fixtures/`)
384//! reach the round-13 pipeline (no longer hit the round-8 scaffold
385//! gate).
386//!
387//! Round-14 candidates (in dependency order):
388//!
389//! * HfBlockContext non-default-table branch (per-LF thresholds + qf
390//! thresholds + clustering map), required for any cjxl-encoded VarDCT
391//! fixture that doesn't take the `u(1)=1` default-table fast path.
392//! * HfGlobal C.6.2 dequant-matrix encoding modes (Listing C.7) +
393//! Listing C.10 `GetDCTQuantWeights` for per-DctSelect dequant
394//! matrices.
395//! * HfPass C.7.1 coefficient orders (`used_orders` 13-bit mask,
396//! `DecodePermutation`) + C.7.2 histograms (495 × num_hf_presets ×
397//! nb_block_ctx clustered distributions).
398//! * PassGroup HF coefficients C.8.3: per-block `hfp =
399//! u(ceil(log2(num_hf_presets)))` + clustered ANS coeff decode +
400//! F.3 HF dequantisation (Listing F.2 + per-channel scale +
401//! per-DctSelect dequant matrix multiply).
402//! * Inverse DCT dispatch across block sizes (8×8 IDCT wired round 8;
403//! 8×16 / 16×8 / 16×16 / 32×32 / 64×64 / DCT4 / DCT4×8 / DCT8×4 /
404//! IDENTITY / AFV remain).
405//! * Listing I.5 LLF-from-downsampled-LF composition (the bridge from
406//! F.2-smoothed LF samples to varblock LF coefficients) — pure-math
407//! step landed round 121 as [`llf_from_lf`] (FDIS Listings I.15 +
408//! I.16). Still pending: per-LfGroup wiring that drives the
409//! per-varblock invocation from the [`pass_group_hf`] coefficient
410//! buffer.
411//! * Chroma-from-Luma (Annex G), Gaborish (Annex J?), EPF.
412//!
413//! ## Round-16 (2024-spec) additions
414//!
415//! [`lf_group::HfMetadata::read`] now wires nested transforms (FDIS
416//! §C.5.4 + §C.9.4): the four-channel HfMetadata sub-bitstream parses
417//! `nb_transforms` + `TransformInfo[]` exactly like the GlobalModular
418//! section, applies the transform-rewritten channel layout via
419//! [`global_modular::apply_transforms_to_channel_layout`] before the
420//! inner per-channel decode, then walks
421//! [`global_modular::apply_inverse_transforms`] in reverse bitstream
422//! order to recover the four-channel base layout
423//! `[XFromY, BFromY, BlockInfo, Sharpness]`.
424//!
425//! Round-15 left the d1 fixture stuck on the round-12 deferral inside
426//! `HfMetadata::read` (`nb_transforms > 0` errored with "transforms
427//! inside HF metadata sub-bitstream not yet supported"). With round 16
428//! the parse succeeds; the d1 fixture surfaces a strictly-later
429//! blocker — its HfMetadata Squeeze step references channels beyond
430//! the four-channel baseline (`begin_c=39` on step 0), tripping the
431//! `apply_transforms_to_channel_layout` channel-count invariant.
432//! That's the round-17 candidate (suspected upstream bit-position
433//! drift in LfGlobal or LfCoefficients).
434//!
435//! `HfMetadata::read` and `LfGroup::read` now both take a
436//! `&ImageMetadataFdis` argument so the inverse Palette transform can
437//! read `bit_depth.bits_per_sample` for delta-palette prediction.
438//!
439//! ## Round-26 (2024-spec) — Annex L colour transforms
440//!
441//! Parent-dispatch "r11". New [`xyb`] module transcribes FDIS §L.2.2
442//! inverse XYB → linear RGB and §L.3 inverse YCbCr → RGB verbatim
443//! from the ISO/IEC 18181-1:2024 PDF. Three public entry points:
444//! [`xyb::inverse_xyb_to_rgb`], [`xyb::inverse_ycbcr_to_rgb`], and
445//! the convenience composite [`xyb::modular_xyb_to_linear_rgb`]
446//! (§L.2.2 preamble + inverse XYB in one call).
447//!
448//! Wired into [`decode_codestream`] modular output stage: when
449//! `metadata.xyb_encoded == true` and `colour_encoding.colour_space`
450//! is `Rgb`, the per-channel pass-through is replaced with
451//! [`build_rgb_planes_from_xyb`]; symmetric branch for
452//! `frame_header.do_ycbcr == true`. Pre-round-26 pass-through path
453//! preserved for `xyb_encoded == false && do_ycbcr == false` modular
454//! frames so the five small lossless fixtures stay pixel-correct.
455//!
456//! Round-26 SPECGAP: §L.2.2 NOTE describes the output as
457//! linear-domain RGB but doesn't prescribe a gamma encoding step
458//! before display. [`xyb::linear_rgb_to_u8`] emits linear bytes
459//! (clamp + scale by 255 + round); callers that need sRGB-encoded
460//! bytes apply the sRGB transfer function downstream.
461//!
462//! ## Round-27 (2024-spec) — IDCT dispatch
463//!
464//! Parent-dispatch "r12" item (5). New [`idct`] module wires the
465//! spec-conformant 1-D inverse DCT for power-of-two sizes
466//! `s ∈ {1, 2, 4, 8, 16, 32, 64, 128, 256}` (FDIS Annex I.2.1) and
467//! the 2-D inverse DCT (Annex I.2.2 Listing I.4) for rectangular
468//! `R × C` blocks. Three public entry points: [`idct::idct_1d`],
469//! [`idct::idct_2d`] (taking coefficients in spec `(short × long)`
470//! row-major natural-ordering layout per Annex I.2.4 and returning
471//! `(R × C)` row-major samples), and [`idct::idct_for_transform`]
472//! which dispatches on a [`dct_select::TransformType`] to the 2-D
473//! IDCT for the 18 plain-DCT block sizes in Table C.16.
474//!
475//! The 9 non-DCT transforms (Hornuss, DCT2x2, DCT4x4, DCT4x8,
476//! DCT8x4, AFV0..AFV3) — Listings I.7..I.13 — return
477//! `Err(Unsupported)` from [`idct::idct_for_transform`] and are
478//! deferred to round 28+ alongside HF coefficient decode + F.3
479//! dequantisation. The legacy [`vardct::idct1d_8`] /
480//! [`vardct::idct2d_8x8`] (round-8 scaffold, scaled-orthonormal
481//! IDCT) are retained for backward compatibility but are NOT
482//! spec-conformant; new HF-decode wiring will call through
483//! [`idct::idct_for_transform`] exclusively.
484
485pub mod abrac;
486pub mod afv;
487pub mod ans;
488pub mod begabrac;
489pub mod bitreader;
490pub mod block_context_resolver;
491pub mod block_dequant;
492pub mod chroma_from_luma;
493pub mod coeff_order;
494pub mod container;
495pub mod dct_quant_weights;
496pub mod dct_select;
497pub mod epf;
498pub mod extensions;
499pub mod frame_header;
500pub mod gaborish;
501pub mod global_modular;
502pub mod hf_coeff_histogram_size;
503pub mod hf_coefficient_histograms;
504pub mod hf_dequant;
505pub mod hf_global;
506pub mod hf_pass;
507pub mod icc;
508pub mod idct;
509pub mod lf_dequant;
510pub mod lf_global;
511pub mod lf_group;
512pub mod llf_from_lf;
513pub mod matree;
514pub mod metadata;
515pub mod metadata_fdis;
516pub mod modular;
517pub mod modular_fdis;
518pub mod multi_pass_decode;
519pub mod multi_pass_hf_header;
520pub mod multi_pass_hf_histogram_decoder;
521pub mod non_zeros_grid;
522pub mod pass_group;
523pub mod pass_group_hf;
524pub mod per_channel_non_zeros;
525pub mod per_pass_non_zeros;
526pub mod predictors;
527pub mod residual_plane;
528pub mod toc;
529pub mod varblock_walk;
530pub mod vardct;
531pub mod xyb;
532
533pub use container::{detect, extract_codestream, Signature};
534pub use metadata::{parse_headers, BitDepth, Headers, ImageMetadata, SizeHeader};
535
536use oxideav_core::{CodecCapabilities, CodecId, CodecParameters, Error, Result};
537use oxideav_core::{
538 CodecInfo, CodecRegistry, Decoder, Encoder, Frame, Packet, RuntimeContext, VideoFrame,
539 VideoPlane,
540};
541
542use crate::bitreader::BitReader;
543use crate::frame_header::{FrameDecodeParams, FrameHeader};
544use crate::lf_global::LfGlobal;
545use crate::metadata_fdis::{ColourSpace, ImageMetadataFdis, SizeHeaderFdis};
546use crate::toc::Toc;
547
548/// Public codec id string. Matches the aggregator feature name `jpegxl`.
549pub const CODEC_ID_STR: &str = "jpegxl";
550
551/// Register the JPEG XL decoder stub into the supplied
552/// [`CodecRegistry`]. The encoder slot is intentionally left
553/// unregistered: the crate is decoder-side only and currently
554/// retired-pending-cleanroom (see crate-level docs).
555pub fn register_codecs(reg: &mut CodecRegistry) {
556 let caps = CodecCapabilities::video("jpegxl_headers_only")
557 .with_lossy(true)
558 .with_intra_only(true);
559 reg.register(
560 CodecInfo::new(CodecId::new(CODEC_ID_STR))
561 .capabilities(caps)
562 .decoder(make_decoder),
563 );
564}
565
566/// Unified entry point: install the JPEG XL codec into a
567/// [`RuntimeContext`].
568pub fn register(ctx: &mut RuntimeContext) {
569 register_codecs(&mut ctx.codecs);
570}
571
572oxideav_core::register!("jpegxl", register);
573
574fn make_decoder(params: &CodecParameters) -> Result<Box<dyn Decoder>> {
575 let codec_id = params.codec_id.clone();
576 Ok(Box::new(JxlDecoder {
577 codec_id,
578 pending: None,
579 eof: false,
580 }))
581}
582
583/// Round-1 (2024-spec) JXL decoder. Drives `decode_one_frame` per packet.
584///
585/// Limitations (round 1):
586/// * Only Modular-encoded frames (kModular, not kVarDCT).
587/// * Grey (1ch) OR RGB (3ch) only — XYB / YCbCr defer.
588/// * Single-group, single-pass frames.
589/// * Inverse Palette / Squeeze transforms defer (parsing + RCT
590/// layout pass-through is wired).
591/// * Predictor 6 (Self-correcting) only at (0, 0) origin.
592/// * No Patches / Splines / Noise / ICC profile.
593///
594/// Anything outside this envelope returns `Error::Unsupported` from a
595/// well-defined point in the bitstream rather than panicking.
596struct JxlDecoder {
597 codec_id: CodecId,
598 pending: Option<Packet>,
599 eof: bool,
600}
601
602impl Decoder for JxlDecoder {
603 fn codec_id(&self) -> &CodecId {
604 &self.codec_id
605 }
606
607 fn send_packet(&mut self, packet: &Packet) -> Result<()> {
608 if self.pending.is_some() {
609 return Err(Error::other(
610 "jxl decoder: receive_frame must be called before sending another packet",
611 ));
612 }
613 self.pending = Some(packet.clone());
614 Ok(())
615 }
616
617 fn receive_frame(&mut self) -> Result<Frame> {
618 let Some(pkt) = self.pending.take() else {
619 return if self.eof {
620 Err(Error::Eof)
621 } else {
622 Err(Error::NeedMore)
623 };
624 };
625 let vf = decode_one_frame(&pkt.data, pkt.pts)?;
626 Ok(Frame::Video(vf))
627 }
628
629 fn flush(&mut self) -> Result<()> {
630 self.eof = true;
631 Ok(())
632 }
633}
634
635/// Decode the ICC stream (Annex E.4) at the current bit position and
636/// return the resulting ICC profile bytes.
637///
638/// The caller has already verified that
639/// `metadata.colour_encoding.want_icc == true`. Round 6 wires the
640/// decode end-to-end; the returned bytes are valid per E.4.3..E.4.5 if
641/// `Ok`. The function also performs a minimal ICC.1 sanity check —
642/// for outputs >= 40 bytes the magic "acsp" must be at offset 36 —
643/// because the predicted-header rule in E.4.3 forces those bytes when
644/// the encoded delta is zero, but a malformed delta could shift them.
645fn decode_icc_stream_at(br: &mut BitReader<'_>) -> Result<Vec<u8>> {
646 let encoded = icc::decode_encoded_icc_stream(br)?;
647 let profile = icc::reconstruct_icc_profile(&encoded)?;
648 if profile.len() >= 40 && &profile[36..40] != b"acsp" {
649 return Err(Error::InvalidData(format!(
650 "JXL ICC: decoded profile lacks 'acsp' magic at offset 36 (got {:02X?})",
651 &profile[36..40]
652 )));
653 }
654 Ok(profile)
655}
656
657/// Decode the entire JXL packet (raw codestream OR ISOBMFF-wrapped) and
658/// return the first frame as a [`VideoFrame`]. Round-3 envelope.
659pub fn decode_one_frame(input: &[u8], pts: Option<i64>) -> Result<VideoFrame> {
660 let sig = container::detect(input)
661 .ok_or_else(|| Error::InvalidData("jxl decoder: no JXL signature".into()))?;
662 match sig {
663 container::Signature::RawCodestream => decode_codestream(&input[2..], pts),
664 container::Signature::Isobmff => {
665 // The jxlc/jxlp box payload concatenation is itself a JXL
666 // codestream and therefore begins with the 2-byte `FF 0A`
667 // codestream signature (FDIS Annex B.1). Skip those 2 bytes
668 // before handing off to `decode_codestream` (which expects
669 // bits *after* the signature, matching the raw-codestream
670 // entry point above). Without this strip the SizeHeader
671 // parse below would misalign by 16 bits and cascade into
672 // corrupted FrameHeader/TOC reads.
673 let codestream_owned = container::extract_codestream(input)?;
674 let cs: &[u8] = &codestream_owned;
675 if cs.len() < 2 || cs[0] != 0xFF || cs[1] != 0x0A {
676 return Err(Error::InvalidData(
677 "JXL ISOBMFF: jxlc/jxlp payload missing FF 0A codestream signature".into(),
678 ));
679 }
680 decode_codestream(&cs[2..], pts)
681 }
682 }
683}
684
685fn decode_codestream(codestream: &[u8], pts: Option<i64>) -> Result<VideoFrame> {
686 let mut br = BitReader::new(codestream);
687
688 // 1. SizeHeader (FDIS A.3).
689 let size = SizeHeaderFdis::read(&mut br)?;
690
691 // 2. ImageMetadata (FDIS A.6).
692 let metadata = ImageMetadataFdis::read(&mut br)?;
693
694 // 3. ICC profile (Annex E.4) — round-6 lands the decoder. The
695 // decoded ICC bytes are validated (must contain "acsp" magic at
696 // offset 36 if length >= 40) but not currently propagated to
697 // `VideoFrame` because `oxideav_core::VideoFrame` has no ICC
698 // slot. The decode is still run because (a) it advances the
699 // bit reader past the ICC stream so subsequent FrameHeader /
700 // TOC parsing finds the right bit offset, and (b) it gives a
701 // direct `Error::InvalidData` if the codestream's ICC stream
702 // is malformed.
703 if metadata.colour_encoding.want_icc {
704 let _icc_bytes = decode_icc_stream_at(&mut br)?;
705 }
706
707 // 4. Byte-align before frame data per FDIS 6.3.
708 br.pu0()?;
709
710 // 5. FrameHeader (FDIS C.2).
711 let fh_params = FrameDecodeParams {
712 xyb_encoded: metadata.xyb_encoded,
713 num_extra_channels: metadata.num_extra_channels,
714 have_animation: metadata.have_animation,
715 have_animation_timecodes: metadata
716 .animation
717 .map(|a| a.have_timecodes)
718 .unwrap_or(false),
719 image_width: size.width,
720 image_height: size.height,
721 };
722 let fh = FrameHeader::read(&mut br, &fh_params)?;
723
724 // 6. TOC (FDIS C.3) — entries byte-aligned per spec.
725 let toc = Toc::read(&mut br, &fh)?;
726
727 // 7. Single-group frames have a single TOC entry containing all
728 // frame data. Round 6 only handled that case; round 7 wires
729 // multi-group via per-section bit readers, with inverse
730 // transforms applied AFTER all PassGroups complete (G.4.2).
731 let num_groups = fh.num_groups();
732 let num_lf_groups = fh.num_lf_groups();
733 if num_lf_groups > 1 {
734 return Err(crate::lf_group::unsupported_multi_lf_group_error(
735 num_lf_groups,
736 fh.encoding,
737 ));
738 }
739 // Diagnostic on unhandled features. Round 13 wires LfGlobal +
740 // LfGroup (incl. LfCoefficients + HfMetadata) + HfGlobal + F.1 LF
741 // dequant + F.2 adaptive smoothing into the VarDCT pipeline. End-
742 // to-end pixel decode (HF coefficient subband + IDCT dispatch +
743 // CfL + restoration filters) is round-14+ work — the fast path
744 // below errors with a precise round-14 message AFTER consuming
745 // the LfGlobal/LfGroup/HfGlobal sections + computing the
746 // dequantised LF samples per Listing F.1 + applying F.2 smoothing
747 // when `kSkipAdaptiveLFSmoothing == 0`.
748 if fh.encoding == crate::frame_header::Encoding::VarDct {
749 let scaffold = crate::vardct::recognise_vardct_codestream(&fh, &metadata)?;
750 return decode_vardct_round13(&fh, &metadata, &toc, &mut br, scaffold);
751 }
752 if fh.encoding != crate::frame_header::Encoding::Modular {
753 return Err(Error::Unsupported(format!(
754 "jxl decoder: encoding {:?} not supported",
755 fh.encoding
756 )));
757 }
758 if fh.width == 0 || fh.height == 0 {
759 return Err(Error::InvalidData("jxl decoder: zero-dim frame".into()));
760 }
761
762 // Map TOC entries to byte ranges (post-permutation order). Each
763 // section starts byte-aligned and runs `entries[i]` bytes. The
764 // bit reader is currently aligned to a byte (TOC consumed); the
765 // first section begins at the current byte offset.
766 let frame_data_start = br.bytes_consumed();
767 let codestream_data = br.data();
768 if frame_data_start > codestream_data.len() {
769 return Err(Error::InvalidData(
770 "JXL decoder: frame data start past codestream end".into(),
771 ));
772 }
773 let frame_bytes = &codestream_data[frame_data_start..];
774 // Validate total length against TOC sum.
775 let total_frame_len: u64 = toc.entries.iter().map(|&e| e as u64).sum();
776 if total_frame_len > frame_bytes.len() as u64 {
777 return Err(Error::InvalidData(format!(
778 "JXL decoder: TOC declares {total_frame_len} frame bytes but only {} remaining",
779 frame_bytes.len()
780 )));
781 }
782 // Compute per-section start offsets in the *bitstream* order from
783 // the running sum. The TOC permutation has already been applied to
784 // `entries` and `group_offsets` so they're in the order the spec
785 // says the sections appear on the wire (LfGlobal first, etc.).
786 let mut section_starts: Vec<usize> = Vec::with_capacity(toc.entries.len());
787 let mut acc: u64 = 0;
788 for &e in &toc.entries {
789 section_starts.push(acc as usize);
790 acc = acc.saturating_add(e as u64);
791 }
792 let section_byte_range = |idx: usize| -> Result<&[u8]> {
793 let start = section_starts[idx];
794 let len = toc.entries[idx] as usize;
795 let end = start + len;
796 if end > frame_bytes.len() {
797 return Err(Error::InvalidData(format!(
798 "JXL decoder: section {idx} byte range [{start}..{end}) exceeds frame bytes ({})",
799 frame_bytes.len()
800 )));
801 }
802 Ok(&frame_bytes[start..end])
803 };
804
805 // Slot index helpers per ISO/IEC 18181-1:2024 §F.3.1 TOC layout
806 // (round-9 fix: HfGlobal slot is unconditional, 0-byte for
807 // kModular; HfPass is part of HfGlobal, NOT separate slots):
808 // slot 0 — LfGlobal
809 // slots 1..1+num_lf_groups — LfGroup[*]
810 // slot 1+num_lf_groups — HfGlobal (0-byte for kModular)
811 // slots 2+num_lf_groups + p*num_groups + g — PassGroup[p][g]
812 let lf_global_slot = 0usize;
813 let lf_group_slot = |lf_group_idx: u64| -> usize { 1 + lf_group_idx as usize };
814 let hf_global_slot = 1 + num_lf_groups as usize;
815 let pass_group_slot = |pass_idx: u32, group_idx: u32| -> usize {
816 2 + num_lf_groups as usize + (pass_idx as u64 * num_groups + group_idx as u64) as usize
817 };
818
819 // 8. LfGlobal (slot 0) — read the GlobalModular prelude. For images
820 // where every channel fits in group_dim, this fully populates
821 // `lf_global.global_modular.image`. Otherwise the larger
822 // channels are zero-filled placeholders that PassGroups fill.
823 //
824 // Round-31 (parent-dispatch r16, noise-64x64-lossless unblock):
825 // The single-TOC-entry case still constitutes a section per
826 // FDIS §F.3: "no more bits are read from the codestream than 8
827 // times the byte size indicated in the TOC; if fewer bits are
828 // read, then the remaining bits of the section all have the
829 // value zero." High-entropy modular fixtures (e.g. `cjxl -e 7`
830 // noise) can leave the ANS / hybrid-uint refill loop trying to
831 // read past the last byte by a few bits — those reads must
832 // return zero, not error. Pre-round-31 the fast path used the
833 // non-padding main reader and rejected the `cjxl -e 7` noise
834 // fixture with `unexpected end of JXL bitstream` mid-pixel
835 // decode. Now every LfGlobal read goes through a section reader
836 // sliced to the TOC-declared length so the §F.3 zero-pad rule
837 // applies uniformly to single-TOC-entry frames and multi-TOC
838 // frames alike.
839 let lf_global_bytes = section_byte_range(lf_global_slot)?;
840 let mut lf_global = {
841 let mut lf_br = BitReader::new_section(lf_global_bytes);
842 LfGlobal::read(&mut lf_br, &fh, &metadata)?
843 };
844
845 // 8b. LfGroups (slots 1..1+num_lf_groups) — round 7 only handles
846 // num_lf_groups <= 1 (gated above). For num_lf_groups == 1 with
847 // a fully-decoded GlobalModular image (small-image case), the
848 // LfGroup section is empty (no channel has hshift>=3, vshift>=3
849 // by default for round-7 lossless fixtures). We still consume
850 // the slot bytes by reading the empty ModularLfGroup
851 // sub-bitstream — for round 7 the slot is allowed to be
852 // ignored when no channel matches the LfGroup criterion.
853
854 // 8c. PassGroups (slots 1+num_lf_groups + p*num_groups + g) —
855 // decode each per-pass per-group modular sub-bitstream and
856 // copy samples back into `lf_global.global_modular.image`.
857 if !lf_global.global_modular.fully_decoded || num_groups > 1 || fh.passes.num_passes > 1 {
858 for pass_idx in 0..fh.passes.num_passes {
859 for group_idx in 0..(num_groups as u32) {
860 let slot = pass_group_slot(pass_idx, group_idx);
861 let pg_bytes = section_byte_range(slot)?;
862 let mut pg_br = BitReader::new_section(pg_bytes);
863 crate::pass_group::decode_modular_group_into(
864 &mut pg_br,
865 &fh,
866 &mut lf_global,
867 pass_idx,
868 group_idx,
869 )?;
870 }
871 }
872 // After all PassGroups complete, apply inverse transforms over
873 // the now fully-assembled GlobalModular image (G.4.2 last
874 // paragraph).
875 let bit_depth = metadata.bit_depth.bits_per_sample.max(1);
876 let transforms = lf_global.global_modular.transforms.clone();
877 crate::global_modular::apply_inverse_transforms(
878 &mut lf_global.global_modular.image,
879 &transforms,
880 bit_depth,
881 )?;
882 }
883 let _ = lf_group_slot; // currently only used by round-8 multi-LfGroup
884 let _ = hf_global_slot; // round-10+ VarDCT consumer; for kModular the slot is 0-byte
885
886 // 9. Map the decoded modular image to a VideoFrame.
887 //
888 // Round-1 (2024-spec) supports:
889 // - Grey colour_space (single channel, 1 plane)
890 // - RGB colour_space (3 channels → 3 planes in R/G/B order)
891 // - 8-bit integer bit depth
892 //
893 // Round-11 (2024-spec) adds: kModular + `metadata.xyb_encoded == true`
894 // path through Annex L.2.2 inverse XYB → linear RGB; and the
895 // `frame_header.do_ycbcr == true` path through Annex L.3 inverse
896 // YCbCr → RGB. Both paths land 3-channel RGB output (Grey colour
897 // encoding remains a 1-channel pass-through).
898 //
899 // Other colour spaces (CMYK, etc.) and float bit depths fall in
900 // later rounds.
901 if metadata.bit_depth.float_sample {
902 return Err(Error::Unsupported(
903 "jxl decoder (round 1): float bit depth not supported".into(),
904 ));
905 }
906 // Round 30 (2024-spec) — accept 1..=16 integer samples for the
907 // pass-through path. The 8-bit and 16-bit cases each have their
908 // own pack loop further down; other widths in 1..=16 emit byte
909 // planes whose samples are clamped into the integer range
910 // `[0, 2^bps - 1]` (1 byte/sample for `bps <= 8`, 2 bytes/sample
911 // little-endian for `9 <= bps <= 16`).
912 //
913 // FDIS Annex A.6 + Table A.22 (`bit_depth.bits_per_sample`).
914 // The XYB / YCbCr branches further down still hard-require 8-bit
915 // because their dequantisation lattice is calibrated against the
916 // 8-bit output range; high-bit-depth XYB / YCbCr is round-31+.
917 if metadata.bit_depth.bits_per_sample == 0 || metadata.bit_depth.bits_per_sample > 16 {
918 return Err(Error::Unsupported(format!(
919 "jxl decoder (round 30): bits_per_sample {} not supported (1..=16 only)",
920 metadata.bit_depth.bits_per_sample
921 )));
922 }
923 let img = lf_global.global_modular.image;
924 let n_chans = img.channels.len();
925 let expected_chans = match metadata.colour_encoding.colour_space {
926 ColourSpace::Grey => 1,
927 ColourSpace::Rgb => 3,
928 _ => {
929 return Err(Error::Unsupported(format!(
930 "jxl decoder (round 1): colour_space {:?} not supported (Grey/RGB only)",
931 metadata.colour_encoding.colour_space
932 )));
933 }
934 };
935 // Round 29 (parent-dispatch r14) extends the channel-count contract:
936 // a kModular frame may carry `expected_chans` colour channels plus
937 // `metadata.num_extra_channels` extra channels (alpha, depth, …).
938 // The Modular decoder produces them as a flat channel array in
939 // colour-then-extras order (FDIS Annex G.1.3 "channel order" rule).
940 let n_extra = metadata.num_extra_channels as usize;
941 let expected_with_extras = expected_chans + n_extra;
942 if n_chans != expected_chans && n_chans != expected_with_extras {
943 return Err(Error::Unsupported(format!(
944 "jxl decoder (round 29): {} channels but colour_space wants {} (with {} extra channels = {})",
945 n_chans, expected_chans, n_extra, expected_with_extras
946 )));
947 }
948
949 // Round-11 inverse colour transform decision. The decoded modular
950 // image's first three channels are reinterpreted per Annex L:
951 // * `metadata.xyb_encoded` true → §L.2.2 inverse XYB → linear RGB
952 // (channel order on input: Y', X', B').
953 // * `frame_header.do_ycbcr` true (xyb_encoded must be false per
954 // §L.1) → §L.3 inverse YCbCr → RGB (channel order: Cb, Y, Cr).
955 // * else → channels are already in colour_encoding's space; pass
956 // through (round-1 behaviour).
957 if expected_chans == 3 && metadata.xyb_encoded {
958 if metadata.bit_depth.bits_per_sample != 8 {
959 return Err(Error::Unsupported(format!(
960 "jxl decoder (round 30): XYB high-bit-depth (bps={}) deferred",
961 metadata.bit_depth.bits_per_sample
962 )));
963 }
964 let planes = build_rgb_planes_from_xyb(&img, &lf_global.lf_dequant, &metadata)?;
965 return Ok(VideoFrame { pts, planes });
966 }
967 if expected_chans == 3 && fh.do_ycbcr {
968 if metadata.bit_depth.bits_per_sample != 8 {
969 return Err(Error::Unsupported(format!(
970 "jxl decoder (round 30): YCbCr high-bit-depth (bps={}) deferred",
971 metadata.bit_depth.bits_per_sample
972 )));
973 }
974 let planes = build_rgb_planes_from_ycbcr(&img)?;
975 return Ok(VideoFrame { pts, planes });
976 }
977
978 // Pass-through path: each channel becomes a plane (no colour
979 // conversion). Pre-round-11 behaviour, retained for the five
980 // small lossless fixtures and any non-XYB / non-YCbCr modular
981 // image. Round 30 (2024-spec) extends the per-sample pack rule:
982 //
983 // bps ≤ 8 → 1 byte/sample, plane stride == width;
984 // 9 ≤ bps ≤ 16 → 2 bytes/sample, little-endian, plane stride
985 // == width × 2.
986 //
987 // Choice of LE pack: PNG ships its 16-bit samples big-endian
988 // (RFC 2083 §2.1) whereas the JXL ImageMetadata bit-depth field
989 // is endian-agnostic; we therefore pick LE so a downstream
990 // consumer can treat each plane as a `&[u16]` after a
991 // `bytemuck::cast_slice` or `<u16>::from_le_bytes` step on a
992 // little-endian host without a swap. The convention is
993 // documented in this crate's README under "Plane byte layout".
994 let bps = metadata.bit_depth.bits_per_sample;
995 let max_sample: i32 = (1i32 << bps) - 1;
996 let mut planes: Vec<VideoPlane> = Vec::with_capacity(n_chans);
997 for (i, ch_data) in img.channels.iter().enumerate() {
998 let desc = img.descs[i];
999 let w = desc.width as usize;
1000 let h = desc.height as usize;
1001 let plane = if bps <= 8 {
1002 let mut bytes = Vec::with_capacity(w * h);
1003 for &v in ch_data.iter() {
1004 bytes.push(v.clamp(0, max_sample) as u8);
1005 }
1006 VideoPlane {
1007 stride: w,
1008 data: bytes,
1009 }
1010 } else {
1011 let mut bytes = Vec::with_capacity(w * h * 2);
1012 for &v in ch_data.iter() {
1013 let s = v.clamp(0, max_sample) as u16;
1014 bytes.extend_from_slice(&s.to_le_bytes());
1015 }
1016 VideoPlane {
1017 stride: w * 2,
1018 data: bytes,
1019 }
1020 };
1021 planes.push(plane);
1022 // Sanity check height while we're here.
1023 let expected_len = if bps <= 8 { w * h } else { w * h * 2 };
1024 debug_assert_eq!(planes[i].data.len(), expected_len);
1025 }
1026 Ok(VideoFrame { pts, planes })
1027}
1028
1029/// Convert a 3-channel decoded modular image whose channels carry
1030/// `(Y', X', B')` XYB-domain integer samples into an `R G B` plane
1031/// triple (per §L.2.2). All three channels must share the same
1032/// dimensions; the output planes are byte-stride packed at the same
1033/// width × height.
1034fn build_rgb_planes_from_xyb(
1035 img: &crate::modular_fdis::ModularImage,
1036 lf_dequant: &crate::lf_global::LfChannelDequantization,
1037 metadata: &ImageMetadataFdis,
1038) -> Result<Vec<VideoPlane>> {
1039 if img.channels.len() != 3 {
1040 return Err(Error::InvalidData(format!(
1041 "JXL XYB inverse: expected 3 channels (Y', X', B'), got {}",
1042 img.channels.len()
1043 )));
1044 }
1045 let desc0 = img.descs[0];
1046 for (i, d) in img.descs.iter().enumerate().take(3) {
1047 if d.width != desc0.width || d.height != desc0.height {
1048 return Err(Error::InvalidData(format!(
1049 "JXL XYB inverse: channel {i} dims {}x{} differ from channel 0 {}x{} \
1050 — chroma subsampling not supported in modular XYB output",
1051 d.width, d.height, desc0.width, desc0.height
1052 )));
1053 }
1054 }
1055 let w = desc0.width as usize;
1056 let h = desc0.height as usize;
1057 let n = w * h;
1058 if img.channels[0].len() < n || img.channels[1].len() < n || img.channels[2].len() < n {
1059 return Err(Error::InvalidData(format!(
1060 "JXL XYB inverse: channel sample count short of {}x{}={n}",
1061 w, h
1062 )));
1063 }
1064 let mut r_bytes = Vec::with_capacity(n);
1065 let mut g_bytes = Vec::with_capacity(n);
1066 let mut b_bytes = Vec::with_capacity(n);
1067 let oim = &metadata.opsin_inverse_matrix;
1068 let tm = &metadata.tone_mapping;
1069 for idx in 0..n {
1070 // Channel order on input is `(Y', X', B')` per FDIS §L.2.2
1071 // first paragraph.
1072 let y_prime = img.channels[0][idx];
1073 let x_prime = img.channels[1][idx];
1074 let b_prime = img.channels[2][idx];
1075 let (r_lin, g_lin, b_lin) =
1076 crate::xyb::modular_xyb_to_linear_rgb(y_prime, x_prime, b_prime, lf_dequant, oim, tm);
1077 r_bytes.push(crate::xyb::linear_rgb_to_u8(r_lin));
1078 g_bytes.push(crate::xyb::linear_rgb_to_u8(g_lin));
1079 b_bytes.push(crate::xyb::linear_rgb_to_u8(b_lin));
1080 }
1081 Ok(vec![
1082 VideoPlane {
1083 stride: w,
1084 data: r_bytes,
1085 },
1086 VideoPlane {
1087 stride: w,
1088 data: g_bytes,
1089 },
1090 VideoPlane {
1091 stride: w,
1092 data: b_bytes,
1093 },
1094 ])
1095}
1096
1097/// Convert a 3-channel decoded modular image whose channels carry
1098/// `(Cb, Y, Cr)` samples (YCbCr-encoded modular path) into an
1099/// `R G B` plane triple per §L.3. Outputs 8-bit bytes; the spec
1100/// formula treats inputs as floats in the [0, 1] interval, so we
1101/// rescale `[0..=255]` integer samples by `1/255` first then re-
1102/// quantise the RGB outputs by 255.
1103fn build_rgb_planes_from_ycbcr(img: &crate::modular_fdis::ModularImage) -> Result<Vec<VideoPlane>> {
1104 if img.channels.len() != 3 {
1105 return Err(Error::InvalidData(format!(
1106 "JXL YCbCr inverse: expected 3 channels (Cb, Y, Cr), got {}",
1107 img.channels.len()
1108 )));
1109 }
1110 let desc0 = img.descs[0];
1111 for (i, d) in img.descs.iter().enumerate().take(3) {
1112 if d.width != desc0.width || d.height != desc0.height {
1113 return Err(Error::Unsupported(format!(
1114 "JXL YCbCr inverse: channel {i} dims {}x{} differ from channel 0 {}x{} \
1115 — chroma subsampling not yet supported in YCbCr modular output",
1116 d.width, d.height, desc0.width, desc0.height
1117 )));
1118 }
1119 }
1120 let w = desc0.width as usize;
1121 let h = desc0.height as usize;
1122 let n = w * h;
1123 let mut r_bytes = Vec::with_capacity(n);
1124 let mut g_bytes = Vec::with_capacity(n);
1125 let mut b_bytes = Vec::with_capacity(n);
1126 for idx in 0..n {
1127 // Spec §L.3 channel order: (Cb, Y, Cr).
1128 let cb = img.channels[0][idx] as f32 / 255.0;
1129 let y = img.channels[1][idx] as f32 / 255.0;
1130 let cr = img.channels[2][idx] as f32 / 255.0;
1131 let (r_lin, g_lin, b_lin) = crate::xyb::inverse_ycbcr_to_rgb(cb, y, cr);
1132 r_bytes.push(crate::xyb::linear_rgb_to_u8(r_lin));
1133 g_bytes.push(crate::xyb::linear_rgb_to_u8(g_lin));
1134 b_bytes.push(crate::xyb::linear_rgb_to_u8(b_lin));
1135 }
1136 Ok(vec![
1137 VideoPlane {
1138 stride: w,
1139 data: r_bytes,
1140 },
1141 VideoPlane {
1142 stride: w,
1143 data: g_bytes,
1144 },
1145 VideoPlane {
1146 stride: w,
1147 data: b_bytes,
1148 },
1149 ])
1150}
1151
1152/// VarDCT round-13 driver. Reads LfGlobal + LfGroup + HfGlobal off
1153/// the TOC for a single-LfGroup frame, computes per-channel LF
1154/// multipliers per Listing C.1 / F.1, runs Listing F.1 dequant on the
1155/// LfCoefficients, and (when `kSkipAdaptiveLFSmoothing` is clear and no
1156/// channel is subsampled) applies F.2 adaptive smoothing in place. The
1157/// dequantised LF samples are then dropped — round 14 will pick up from
1158/// here and dispatch IDCT / CfL / Gaborish / EPF. Returns
1159/// `Error::Unsupported` with a precise "round 14+: HF subband decode +
1160/// IDCT not yet wired" message at the end of the round-13 pipeline.
1161fn decode_vardct_round13(
1162 fh: &FrameHeader,
1163 metadata: &ImageMetadataFdis,
1164 toc: &Toc,
1165 br: &mut BitReader<'_>,
1166 scaffold: crate::vardct::VarDctScaffold,
1167) -> Result<VideoFrame> {
1168 let num_groups = fh.num_groups();
1169 let num_lf_groups = fh.num_lf_groups();
1170 if num_lf_groups != 1 || fh.passes.num_passes != 1 {
1171 return Err(Error::Unsupported(format!(
1172 "jxl VarDCT decoder (round 13): num_lf_groups={num_lf_groups} num_passes={} \
1173 — multi-LfGroup / multi-pass VarDCT defers to round 14+",
1174 fh.passes.num_passes
1175 )));
1176 }
1177
1178 let frame_data_start = br.bytes_consumed();
1179 let codestream_data = br.data();
1180 if frame_data_start > codestream_data.len() {
1181 return Err(Error::InvalidData(
1182 "JXL VarDCT round 13: frame data start past codestream end".into(),
1183 ));
1184 }
1185 let frame_bytes = &codestream_data[frame_data_start..];
1186 let total_frame_len: u64 = toc.entries.iter().map(|&e| e as u64).sum();
1187 if total_frame_len > frame_bytes.len() as u64 {
1188 return Err(Error::InvalidData(format!(
1189 "JXL VarDCT round 13: TOC declares {total_frame_len} frame bytes but only {} \
1190 remaining",
1191 frame_bytes.len()
1192 )));
1193 }
1194 let mut section_starts: Vec<usize> = Vec::with_capacity(toc.entries.len());
1195 let mut acc: u64 = 0;
1196 for &e in &toc.entries {
1197 section_starts.push(acc as usize);
1198 acc = acc.saturating_add(e as u64);
1199 }
1200 let section_byte_range = |idx: usize| -> Result<&[u8]> {
1201 if idx >= toc.entries.len() {
1202 return Err(Error::InvalidData(format!(
1203 "JXL VarDCT round 13: TOC slot {idx} out of range (entries={})",
1204 toc.entries.len()
1205 )));
1206 }
1207 let start = section_starts[idx];
1208 let len = toc.entries[idx] as usize;
1209 let end = start + len;
1210 if end > frame_bytes.len() {
1211 return Err(Error::InvalidData(format!(
1212 "JXL VarDCT round 13: section {idx} byte range [{start}..{end}) exceeds frame bytes ({})",
1213 frame_bytes.len()
1214 )));
1215 }
1216 Ok(&frame_bytes[start..end])
1217 };
1218
1219 // Slot indexing per F.3.1 (round-9 fix: HfGlobal slot is unconditional):
1220 // slot 0 — LfGlobal
1221 // slots 1..1+num_lf_groups — LfGroup[*]
1222 // slot 1+num_lf_groups — HfGlobal (contains HfPass for kVarDCT)
1223 let lf_global_slot = 0usize;
1224 let lf_group_slot = |lf_group_idx: u64| -> usize { 1 + lf_group_idx as usize };
1225 let hf_global_slot = 1 + num_lf_groups as usize;
1226
1227 // Round 15: F.3.1 says when `num_groups == 1 && num_passes == 1`
1228 // the TOC has a SINGLE entry containing all section bytes
1229 // concatenated WITHOUT byte alignment between sections. Each section
1230 // continues from the previous section's bit cursor. When the TOC
1231 // has multiple entries, each section is sliced into its own byte
1232 // range and read against a fresh BitReader.
1233 let single_toc = toc.entries.len() == 1
1234 && num_groups == 1
1235 && fh.passes.num_passes == 1
1236 && num_lf_groups == 1;
1237
1238 let (lf_global, lf_group, _hf_global) = if single_toc {
1239 // Single-TOC-entry path: chain section reads on the same bit
1240 // reader, no byte-aligned slicing between sections.
1241 let lf_global_bytes = section_byte_range(lf_global_slot)?;
1242 let mut shared_br = BitReader::new_section(lf_global_bytes);
1243 let lf_global = LfGlobal::read(&mut shared_br, fh, metadata)?;
1244 let _quantizer = lf_global
1245 .quantizer
1246 .ok_or_else(|| Error::InvalidData("JXL VarDCT round 13: Quantizer missing".into()))?;
1247 let _ = lf_global.hf_block_context.as_ref().ok_or_else(|| {
1248 Error::InvalidData("JXL VarDCT round 13: HfBlockContext missing".into())
1249 })?;
1250 let _ = lf_global.lf_channel_correlation.ok_or_else(|| {
1251 Error::InvalidData("JXL VarDCT round 13: LfChannelCorrelation missing".into())
1252 })?;
1253
1254 let lf_group = crate::lf_group::LfGroup::read(&mut shared_br, fh, &lf_global, metadata, 0)?;
1255
1256 let hf_global = crate::hf_global::HfGlobal::read(&mut shared_br, num_groups)?;
1257 (lf_global, lf_group, hf_global)
1258 } else {
1259 // Multi-TOC-entry path: slice each section into its own byte
1260 // range and read against a fresh BitReader.
1261 let lf_global_bytes = section_byte_range(lf_global_slot)?;
1262 let mut lf_br = BitReader::new_section(lf_global_bytes);
1263 let lf_global = LfGlobal::read(&mut lf_br, fh, metadata)?;
1264 let _quantizer = lf_global
1265 .quantizer
1266 .ok_or_else(|| Error::InvalidData("JXL VarDCT round 13: Quantizer missing".into()))?;
1267 let _ = lf_global.hf_block_context.as_ref().ok_or_else(|| {
1268 Error::InvalidData("JXL VarDCT round 13: HfBlockContext missing".into())
1269 })?;
1270 let _ = lf_global.lf_channel_correlation.ok_or_else(|| {
1271 Error::InvalidData("JXL VarDCT round 13: LfChannelCorrelation missing".into())
1272 })?;
1273
1274 let lf_group_bytes = section_byte_range(lf_group_slot(0))?;
1275 let mut lg_br = BitReader::new_section(lf_group_bytes);
1276 let lf_group = crate::lf_group::LfGroup::read(&mut lg_br, fh, &lf_global, metadata, 0)?;
1277
1278 let hf_global_bytes = section_byte_range(hf_global_slot)?;
1279 let mut hg_br = BitReader::new_section(hf_global_bytes);
1280 let hf_global = crate::hf_global::HfGlobal::read(&mut hg_br, num_groups)?;
1281 (lf_global, lf_group, hf_global)
1282 };
1283
1284 // Re-extract Quantizer for the dequant path below (it was already
1285 // checked for presence above in both branches).
1286 let quantizer = lf_global
1287 .quantizer
1288 .ok_or_else(|| Error::InvalidData("JXL VarDCT round 13: Quantizer missing".into()))?;
1289
1290 let lf_coeff = lf_group.lf_coeff.ok_or_else(|| {
1291 Error::InvalidData("JXL VarDCT round 13: LfCoefficients missing on VarDCT LfGroup".into())
1292 })?;
1293 let hf_meta = lf_group.hf_meta.ok_or_else(|| {
1294 Error::InvalidData("JXL VarDCT round 13: HfMetadata missing on VarDCT LfGroup".into())
1295 })?;
1296
1297 // Derive DctSelect / HfMul from BlockInfo per FDIS C.5.4 prose.
1298 // The grid covers the LfGroup's pixel rectangle; for a single-
1299 // LfGroup frame that's the full frame.
1300 let lf_w = lf_group.mlf_group.lf_group_width;
1301 let lf_h = lf_group.mlf_group.lf_group_height;
1302 let _dct_grid = crate::dct_select::derive_dct_select(&hf_meta, lf_w, lf_h)?;
1303
1304 // HfGlobal already decoded above (in either single-TOC or multi-TOC
1305 // branch); `_hf_global` is the parsed bundle for round 14+ wiring.
1306
1307 // F.1 LF dequantisation (Listing F.1) over the per-LfGroup
1308 // LfCoefficients. Unwrap the lf_quant Vec into a fixed-size [3]
1309 // array as expected by `dequant_lf`.
1310 if lf_coeff.lf_quant.len() != 3 {
1311 return Err(Error::InvalidData(format!(
1312 "JXL VarDCT round 13: LfCoefficients has {} channels, expected 3",
1313 lf_coeff.lf_quant.len()
1314 )));
1315 }
1316 let lf_quant: [Vec<i32>; 3] = [
1317 lf_coeff.lf_quant[0].clone(),
1318 lf_coeff.lf_quant[1].clone(),
1319 lf_coeff.lf_quant[2].clone(),
1320 ];
1321 let multipliers = crate::lf_dequant::LfMultipliers::compute(&lf_global.lf_dequant, &quantizer);
1322 let mut dequant = crate::lf_dequant::dequant_lf(
1323 &lf_quant,
1324 lf_coeff.lf_quant_widths,
1325 lf_coeff.lf_quant_heights,
1326 lf_coeff.extra_precision,
1327 &multipliers,
1328 );
1329
1330 // F.2 adaptive LF smoothing (gated by kSkipAdaptiveLFSmoothing flag
1331 // + no channel subsampled).
1332 if crate::lf_dequant::should_apply_adaptive_lf_smoothing(fh) {
1333 crate::lf_dequant::apply_adaptive_lf_smoothing(&mut dequant, &multipliers);
1334 }
1335 // The dequantised LF samples in `dequant` are now the inputs to
1336 // round-14's IDCT / CfL / Gaborish / EPF chain. For now we drop
1337 // them and report a precise "next round" Unsupported.
1338 let _ = dequant;
1339
1340 Err(Error::Unsupported(format!(
1341 "jxl VarDCT decoder (round 13): codestream parsed and LfCoefficients dequantised + \
1342 smoothed ({}x{}, {} colour channels, group_dim={}, num_groups={}) — HF coefficient \
1343 subband + IDCT dispatch + CfL + Gaborish + EPF defer to round 14+",
1344 scaffold.width,
1345 scaffold.height,
1346 scaffold.num_colour_channels,
1347 scaffold.group_dim,
1348 num_groups
1349 )))
1350}
1351
1352/// FDIS-side `Headers` returned by [`probe_fdis`]. Mirrors the
1353/// committee-draft [`Headers`] but uses the FDIS bundle types.
1354#[derive(Debug, Clone)]
1355pub struct HeadersFdis {
1356 pub signature: container::Signature,
1357 pub size: SizeHeaderFdis,
1358 pub metadata: ImageMetadataFdis,
1359}
1360
1361/// FDIS-side probe: parse SizeHeader + full A.6 ImageMetadata. Falls
1362/// back to the committee-draft probe if the FDIS path errors (so that
1363/// container detection still works on edge cases the committee-draft
1364/// path tolerates).
1365pub fn probe_fdis(input: &[u8]) -> Result<HeadersFdis> {
1366 let signature = container::detect(input)
1367 .ok_or_else(|| Error::InvalidData("jxl probe: no JXL signature".into()))?;
1368 match signature {
1369 container::Signature::RawCodestream => probe_fdis_codestream(&input[2..], signature),
1370 container::Signature::Isobmff => {
1371 let codestream_owned = container::extract_codestream(input)?;
1372 probe_fdis_codestream(&codestream_owned, signature)
1373 }
1374 }
1375}
1376
1377fn probe_fdis_codestream(
1378 codestream: &[u8],
1379 signature: container::Signature,
1380) -> Result<HeadersFdis> {
1381 let mut br = BitReader::new(codestream);
1382 let size = SizeHeaderFdis::read(&mut br)?;
1383 let metadata = ImageMetadataFdis::read(&mut br)?;
1384 Ok(HeadersFdis {
1385 signature,
1386 size,
1387 metadata,
1388 })
1389}
1390
1391/// Inspect a JXL file (raw codestream or ISOBMFF-wrapped) and return the
1392/// signature type + parsed `SizeHeader` + `ImageMetadata` preamble.
1393///
1394/// This is the main API users can reach today: it covers identification,
1395/// dimensions and sample format without needing an actual decoder.
1396pub fn probe(input: &[u8]) -> Result<Headers> {
1397 parse_headers(input)
1398}
1399
1400/// Encoder slot, always rejected. Exposed for completeness so callers
1401/// that wire an `Encoder` factory by codec id get a clean `Unsupported`
1402/// error instead of `CodecNotFound`.
1403pub fn make_encoder(_params: &CodecParameters) -> Result<Box<dyn Encoder>> {
1404 Err(Error::Unsupported(
1405 "jxl encode is out of scope for this crate".into(),
1406 ))
1407}
1408
1409#[cfg(test)]
1410mod tests {
1411 use super::*;
1412
1413 #[test]
1414 fn decoder_factory_returns_live_decoder() {
1415 let mut ctx = RuntimeContext::new();
1416 register(&mut ctx);
1417 let params = CodecParameters::video(CodecId::new(CODEC_ID_STR));
1418 let dec = ctx
1419 .codecs
1420 .first_decoder(¶ms)
1421 .expect("expected live decoder");
1422 assert_eq!(dec.codec_id().as_str(), CODEC_ID_STR);
1423 }
1424
1425 #[test]
1426 fn probe_rejects_non_jxl() {
1427 let err = probe(&[0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A]).unwrap_err();
1428 assert!(matches!(err, Error::InvalidData(_)));
1429 }
1430
1431 #[test]
1432 fn probe_accepts_minimal_raw_codestream() {
1433 // small=1, 8x8 square (ratio=1), all_default=1 → 10 bits total.
1434 // LSB-first packing: byte0 holds bits 0..=7, byte1 holds bits 8..=9.
1435 // bit0=1, bits1..=5=0, bits6..=8=001 (ratio=1), bit9=1 (all_default)
1436 // → byte0 = 0b01000001 = 0x41, byte1 = 0b00000010 = 0x02.
1437 let input = [0xFF, 0x0A, 0x41, 0x02];
1438 let h = probe(&input).unwrap();
1439 assert_eq!(h.size.width, 8);
1440 assert_eq!(h.size.height, 8);
1441 assert!(h.metadata.all_default);
1442 }
1443
1444 #[test]
1445 fn encoder_factory_rejects_cleanly() {
1446 let params = CodecParameters::video(CodecId::new(CODEC_ID_STR));
1447 assert!(matches!(make_encoder(¶ms), Err(Error::Unsupported(_))));
1448 }
1449}