oxideav-webp 0.2.2

Pure-Rust WebP image codec — orphan-rebuild scaffold pending clean-room re-implementation.
# oxideav-webp

Pure-Rust WebP image codec (RIFF + VP8 + VP8L + VP8X + ALPH + ANIM +
ANMF). Decoder and encoder both at production status as of 2026-05-27.

* Full **decode** of every container variant: simple-lossy (VP8),
  simple-lossless (VP8L), extended (`VP8X`) with `ALPH` alpha plane,
  ICCP / EXIF / XMP metadata, and animated WebP (`ANIM` + `ANMF`).
* **Encode** of complete `.webp` files in both lossless (VP8L) and
  lossy (VP8) modes, plus complete animated `.webp` files.
* Decoded pixels land in a tightly-packed `Vec<u8>` of
  `width * height * 4` RGBA bytes, which drops directly into
  [`image`](https://crates.io/crates/image)'s `ImageBuffer::from_raw`
  with zero copy.
* The full crates.io `0.1.2` public surface is reachable, both with
  the default `registry` build and under `--no-default-features`.
  [`tests/api_compat_0_1_2.rs`](./tests/api_compat_0_1_2.rs) is the
  29-test compile-only assertion suite that pins every published
  symbol in place.
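
To make the flat layout above concrete, here is a minimal sketch of indexing one pixel in a tightly-packed RGBA buffer. `pixel_at` is a hypothetical helper written for illustration, not part of this crate's API; the only thing it assumes is the `width * height * 4` row-major layout stated above.

```rust
/// Hypothetical helper (not part of `oxideav-webp`): fetch the RGBA bytes of
/// pixel `(x, y)` from a tightly-packed, row-major `width * height * 4` buffer.
fn pixel_at(rgba: &[u8], width: usize, height: usize, x: usize, y: usize) -> Option<[u8; 4]> {
    // Reject buffers whose length does not match the stated layout.
    let expected = width.checked_mul(height)?.checked_mul(4)?;
    if rgba.len() != expected || x >= width || y >= height {
        return None;
    }
    // No row padding: each row is exactly `width * 4` bytes.
    let i = (y * width + x) * 4;
    Some([rgba[i], rgba[i + 1], rgba[i + 2], rgba[i + 3]])
}
```

Because there is no per-row stride or padding, the same buffer can be handed to any consumer that expects packed RGBA8 without a repack step.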

## Install

```toml
# Standalone — flat RGBA in / flat RGBA out, no framework dep:
[dependencies]
oxideav-webp = { version = "0.1", default-features = false }

# With the OxideAV runtime:
[dependencies]
oxideav-webp = "0.1"
```

| Feature | Default | What it does |
|---|---|---|
| `registry` | ✅ on | Pulls `oxideav-core` plus the framework-trait factories. Cascades into `oxideav-vp8/registry` so the VP8-lossy encode delegation can reach the sibling crate's factories. With this off, **lossless encode/decode + animation + metadata extraction all still work**; only the VP8-lossy *encode* requires `registry`. |
| `simd` | off (nightly only) | Opt-in `std::simd` acceleration of the hottest pixel-repack loop (`Vp8lImage::to_rgba`). Requires a nightly rustc because it activates `#![feature(portable_simd)]`. Byte-identical to the scalar path (asserted by `vp8l::tests::to_rgba_simd_matches_scalar_byte_for_byte`); see [`BENCHMARKS.md`](./BENCHMARKS.md) for the round-170 before/after numbers. |

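The `simd` row above describes an accelerated repack that is asserted byte-identical to a scalar one. A minimal sketch of that testing pattern, using stable Rust only, is shown below. Both function names are hypothetical, and the sketch assumes each pixel is packed as `0xAARRGGBB` in a `u32`; the actual internals of `Vp8lImage::to_rgba` are not shown in this README.

```rust
/// Hypothetical scalar repack: one `0xAARRGGBB` word becomes R, G, B, A bytes.
fn argb_to_rgba_scalar(argb: &[u32]) -> Vec<u8> {
    let mut out = Vec::with_capacity(argb.len() * 4);
    for &px in argb {
        out.push((px >> 16) as u8); // red
        out.push((px >> 8) as u8); // green
        out.push(px as u8); // blue
        out.push((px >> 24) as u8); // alpha
    }
    out
}

/// Hypothetical alternative path: rotating left by 8 bits turns `0xAARRGGBB`
/// into `0xRRGGBBAA`, whose big-endian bytes are already in R, G, B, A order.
fn argb_to_rgba_rotate(argb: &[u32]) -> Vec<u8> {
    argb.iter()
        .flat_map(|px| px.rotate_left(8).to_be_bytes())
        .collect()
}
```

Comparing the two outputs over the same input is the shape of a byte-for-byte equivalence check: any faster path is only acceptable if it reproduces the reference bytes exactly.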
### Benchmarks

The crate ships twenty-nine criterion benches under `benches/`,
grouped by domain:

* **End-to-end:** `lossless_decode`, `lossless_encode`,
  `stacked_transform_encode` (round 307: full-file `encode_webp_lossless`
  separated by the three content regimes the §3.5 stacked-transform
  chains target — `palette_indexed` (§4.4 color-indexing → §4.1
  predictor), `photo_decorrelated` (§4.2 cross-color → predictor and
  → §4.3 subtract-green → predictor), and `smooth_gradient` (the §4.1
  predictor sub-image lambda sweep across the residual-vs-§7.2-sub-image
  cost crossover rounds 302–306 tuned). Gives those cost-model rounds a
  committed per-regime A/B harness for both encode time and output size;
  all three inputs round-trip losslessly),
  `lossless_decode_mixes` (round 283: full-file decode per elected
  §4 transform mix — predictor / color-indexing / cross-color /
  subtract-green / no-transform, the elected list asserted at
  setup), `anim_decode` (round 283: §2.7.1.1 full-timeline animation
  decode, all-keyframe vs dirty-rect-delta `ANMF` layouts),
  `metadata_walk` (round 283: `extract_metadata` chunk walk at three
  chunk-count / payload tiers), `lossy_decode` (round 289: the
  §2.5 `VP8 ` lossy path at three altitudes — full public
  `decode_webp`, `decode_lossy_rgba` on the extracted bitstream, and
  the crate-owned `yuv420_to_rgba` YCbCr→RGB conversion loop in
  isolation; the sibling `oxideav-vp8` decoder owns the
  entropy/IDCT/loop-filter work), and `alpha_decode` (round 291: the
  §2.7.1.2 `ALPH` alpha-plane decode — the rank-1 webp-owned lossy
  cost — at three altitudes: public `decode_alpha_plane` e2e,
  `alph::decode_alpha` on the extracted payload, and the Stage-2
  inverse-filter per-pixel loop in isolation, one cell per `F` method).
* **Decoder §4.x inverse transforms:** `inverse_predictor`
  (per-mode), `inverse_color` (per `size_bits`),
  `inverse_color_indexing` (per palette tier),
  `inverse_subtract_green`, `inverse_color_table`, plus the
  `argb_to_rgba` repack.
* **Encoder forward passes:** `predictor_subtract`,
  `apply_subtract_green`, `lz77_match`, `lz77_chain` (round 286: the
  §5.2.2 matcher across five hash-chain-depth regimes — period-2/4/64
  repeats, near-unique, gradient), `pick_block_cte` (the §3.5.2
  chooser walk), `meta_prefix_cluster` (round 294: the §6.2.2
  entropy-image block-clustering heuristic behind
  `encode_with_meta_prefix` — coarse-RGB-histogram Lloyd's k-means
  across content regime / `num_groups` / image-size sweeps), the
  §5.2.2 `value_to_prefix` split, and `distance_code` (round 300: the
  §5.2.2 `pixel_distance_to_distance_code` chooser run twice per match
  — a 120-entry `DISTANCE_MAP` scan picking the smallest distance code
  — across RLE / row-above / close-neighbour / no-match regimes). Round
  301 gave the chooser a smallest-code early-out: because map codes
  occupy `1..=120` and the scan-line fallback is `D + 120 ≥ 121`, the
  *first* entry (in ascending code order) whose `max(xi + yi·W, 1)`
  equals the distance is already the smallest valid code, so the scan
  returns on first match instead of running all 120 entries. The chosen
  code — and therefore every emitted byte — is unchanged (proven by an
  equivalence test against the full no-early-out scan over distances
  1..=400 + a large tail across six widths); the matching regimes drop
  from ~64 µs/cell to ~0.8–2.4 µs (≈30–160× on the inner-loop probe),
  while the genuine no-match worst case (`dist_large_nomatch`) still
  scans all 120 entries as before.
* **Entropy / prefix-code chain:** `build_code_lengths` and
  `canonical_codes` (encoder §3.7.2) and `prefix_from_code_lengths`
  (decoder §6.2.1), each over the four §3.7.1 alphabets
  (distance-40 / literal-256 / green-281 / green-2328) in dense and
  sparse frequency regimes, plus `read_symbol` (round 286: the
  §6.2.1 per-symbol reader — the rank-1 decode hotspot — across the
  primary-table fast path vs the > 8-bit walk continuation), the
  per-call `read_lz77_value` (§3.6.2.2 Table 4 regimes) and
  `color_cache_hash` (§3.6.2.3 `code_bits` 1 / 4 / 8 / 11) decoder
  benches, plus `backward_reference` (round 297: the §5.2.2 decoder
  LZ77 copy-back `apply_backward_reference` — the run replay that
  mirrors the `lz77_match` / `lz77_chain` encoder matchers — across
  non-overlap / partial-overlap / `dist == 1` RLE / many-short-run
  regimes).
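
The overlap regimes named in the last bullet come from how an LZ77 copy-back reads pixels it has just written. A minimal sketch follows, assuming the `dist > position` and `position + length > total_pixels` refusal triggers that the fuzzing notes in this README describe. `copy_back` and `CopyError` are hypothetical names and not the signature of `apply_backward_reference`. With `dist == 1` the loop degenerates into a run-length repeat of the last pixel.

```rust
use std::ops::Range;

/// Hypothetical error type for the two refusals named in this README.
#[derive(Debug, PartialEq)]
enum CopyError {
    Underflow, // the source would start before the first decoded pixel
    Overflow,  // the copy would run past the declared pixel count
}

/// Hypothetical copy-back: append `length` pixels, each read `dist` pixels
/// behind the write position. Reads may hit pixels appended earlier in the
/// same call, which is what makes `dist < length` a self-repeat.
fn copy_back(
    out: &mut Vec<u32>,
    dist: usize,
    length: usize,
    total_pixels: usize,
) -> Result<Range<usize>, CopyError> {
    assert!(dist >= 1, "precondition: pixel distance is at least 1");
    let position = out.len();
    if dist > position {
        return Err(CopyError::Underflow);
    }
    if position + length > total_pixels {
        return Err(CopyError::Overflow);
    }
    for i in 0..length {
        // Indexing after the previous push lets overlapping copies repeat.
        let px = out[position + i - dist];
        out.push(px);
    }
    Ok(position..position + length)
}
```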

Rounds 277 / 278 rewrote the §3.7.2 / §6.2.1 length-then-code chain
bit-identically (sorted-leaf two-queue merge + single-rescan counting
sort + O(1)-per-adjustment length cap; capped dense `green2328`
417.8 µs → 26.4 µs end to end), rounds 280 / 281 hoisted the
§4.1 / §3.5.2 encoder chooser walks out of their per-pixel loops
(`lossless_encode_natural_128` ~170 ms → ~120 ms), and round 284 gave
the §6.2.1 `read_symbol` decoder a 256-entry peeked-bits primary
lookup table (entropy-heavy full-file decodes −37% to −49%,
bit-identical across the full fixture corpus and pinned in CI by a
corpus-wide decode digest test).

Round 286 (benchmark mode, `src/`
byte-identical) added the `read_symbol` and `lz77_chain` harnesses
that isolate the rank-1 decode and rank-3 encode hotspots, measured
the long-code (> 8-bit) read path at +27% per symbol over the
primary-table floor, and ranked the decoder 9–11-bit spill table as
the next PROFILE-OPT target.

Round 287 acted on that candidate:
the per-bit §6.2.1 walk now resolves "is there a code row at this
length?" through a 16-byte direct length→row side table instead of a
linear rescan per bit — a 2.33× speedup on the worst-case
many-distinct-length walk (`read_symbol_manylen16_walk` 86.8 → 37.2 µs),
byte-identical, with no added cache footprint; the spill table itself
was prototyped and rejected as an L1-thrashing regression.

Round 289
(benchmark mode, decoded bytes identical) added the `lossy_decode`
harness — the first coverage of the §2.5 `VP8 ` lossy path — and a
ranked lossy-decode hotspot map: of a 128×128 lossy frame's ≈359 µs
end-to-end decode, the container walk + `ALPH` layering is ≈52%, the
sibling `oxideav-vp8` decode (entropy + IDCT + intra-pred + loop
filter, out of this crate's scope) ≈39%, and the crate-owned
`yuv420_to_rgba` YCbCr→RGB conversion ≈9% — the latter purely
per-pixel-bound and the cleanest A/B target for a future SIMD pass (the
lossy analogue of the `argb_to_rgba` SIMD treatment).

Round 290 acted on
that candidate: `yuv420_to_rgba` now hoists the §9.2 chroma-matrix terms
out of the per-pixel loop — the two luma pixels of a 4:2:0 pair share one
chroma column, so the three `(Cb−128, Cr−128)` contributions are computed
once per column and reused, and the output is written through pre-sized
per-row slices instead of per-pixel `Vec::push`. The conversion drops from
≈34 µs to ≈10.5 µs at fixture size (−68%; −72% at 256×256), byte-for-byte
identical — proven by a per-pixel oracle test across 9 even/odd dimensions
and by `cargo fuzz` (decode_still_paths + decode, no divergence).

Round 291
(benchmark mode, no `src/` change) added the `alpha_decode` harness over
the §2.7.1.2 `ALPH` decode — the rank-1 webp-owned lossy cost the round-289
map had sized only by subtraction — and refined that map: a direct
measurement shows the container walk is ≈1 µs (negligible) and the rank-1
cost is almost entirely the headerless VP8L lossless decode inside
`decode_alpha` (already covered by `read_symbol` / `lossless_decode*`),
while the genuinely alpha-specific §2.7.1.2 inverse-filter loop ranks
Gradient (43.7 µs) > Horizontal (21.8 µs) > Vertical (13.5 µs) > None
(9.5 µs) at 128×128 — flagging a per-method border-rule hoist (the r180
`inverse_predictor` treatment) as the next PROFILE-OPT target.

Round 293
acted on that candidate: the §2.7.1.2 Stage-2 inverse filter now dispatches
on `F` once and splits each method into a one-shot border pass (top-left /
first-row / first-column) plus a tight interior loop, instead of
re-evaluating a `match (x, y)` + `match filtering` on every pixel. `None`
becomes a plain identity move (no per-pixel work) and drops 9.5 µs → 0.23 µs
(−97%); `Vertical` — whose predictor reads the row above, so the interior
loop vectorises — drops 13.5 µs → 1.57 µs (−88%); `Horizontal` / `Gradient`
are flat (their left-neighbour serial dependency, not the dispatch, was the
bound). Byte-for-byte identical — proven by a new per-pixel oracle test
across 9 dimension/method combinations and by 400 K `decode_alph` fuzz runs
with no divergence.

Round 294
(benchmark mode, no behavioural change — one `fn` → `pub fn` visibility
widen on `cluster_blocks_by_histogram_distance`, matching the
`pick_block_cte` exposure pattern) added the `meta_prefix_cluster`
harness over the encoder's §6.2.2 entropy-image block-clustering
heuristic — the last encode stage sized only by subtraction inside the
`lossless_encode` e2e number — and ranked it: the per-pixel
feature-binning pass dominates (≈70–80% of clustering self-time, isolated
by the uniform-content cell that skips the Lloyd loop; the kernel is
pixel-bound not block-bound), the Lloyd assignment/update loop is a clear
second only on poorly-separated content (gradient +36% over a clean
bimodal split), and `num_groups` 2→4 is nearly free at the default block
size — flagging a feature-pass scattered-write reduction as the next
PROFILE-OPT target.

Round 296 returned to the rank-1 lossless-decode
hotspot `inverse_predictor`: its interior loads the per-pixel predictor
mode every pixel though the mode is constant across each `1 <<
size_bits` block (`size_bits = ReadBits(3) + 2 ∈ [2, 9]`, so blocks are
always multi-pixel). A per-block mode hoist (mirroring the round-207
`inverse_color` CTE hoist) was implemented and proven byte-identical —
the existing cross-check test plus an FNV-1a A/B over all seven
`lossless-*` fixtures both matched — but yielded no measurable win (the
interior is dominated by the 14-way `predict()` dispatch, not the mode
load) and the host was saturated during measurement, so the original
body was retained per the round-224 precedent. The realistic block path
is now benched (`inverse_predictor_blocks16_mixed_256x256`, `size_bits
= 4`), filling the gap left by the pre-existing `size_bits = 0` cells.

Numbers, profile findings, the full round-283 regression re-run
(stable + nightly `simd`), and the optimization log live in
[`BENCHMARKS.md`](./BENCHMARKS.md). Run with:

```text
CARGO_TARGET_DIR=/tmp/oxideav-webp-bench-target \
  cargo bench --manifest-path crates/oxideav-webp/Cargo.toml \
    --bench <name> -- --quick
```
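
The round-293 split into a one-shot border pass and a tight interior loop can be sketched for the `Vertical` method as follows. This is an illustrative reconstruction, not the crate's code. `unfilter_vertical` is a hypothetical name, and the border rules are taken from the WebP container specification's alpha filtering description as understood here: the top-left pixel is predicted from 0, the rest of the first row from the left neighbour, and every later row from the pixel directly above, all modulo 256.

```rust
/// Hypothetical helper: undo the `ALPH` vertical filter in place on a
/// tightly-packed `width * height` alpha plane.
fn unfilter_vertical(plane: &mut [u8], width: usize, height: usize) {
    if width == 0 || height == 0 {
        return;
    }
    assert_eq!(plane.len(), width * height);

    // Border pass: the top-left pixel has predictor 0 and stays as is; the
    // rest of the first row predicts from its left neighbour.
    for x in 1..width {
        plane[x] = plane[x].wrapping_add(plane[x - 1]);
    }

    // Interior: each later row predicts from the already-restored row above.
    // There is no dependency between pixels of the same row, so this loop has
    // no per-pixel branching and is easy for the compiler to vectorise.
    for y in 1..height {
        let (done, rest) = plane.split_at_mut(y * width);
        let above = &done[(y - 1) * width..];
        let row = &mut rest[..width];
        for (cur, up) in row.iter_mut().zip(above) {
            *cur = cur.wrapping_add(*up);
        }
    }
}
```

The `Horizontal` and `Gradient` methods cannot use the same interior shape, since each pixel there depends on its just-restored left neighbour, which matches the flat results reported above for those two methods.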

### Fuzzing

Thirty-two [`cargo-fuzz`](https://rust-fuzz.github.io/book/cargo-fuzz.html)
targets live under [`fuzz/fuzz_targets/`](./fuzz/fuzz_targets):
`decode` and `extract_metadata` feed arbitrary bytes through the two
public single-shot entry points.

`roundtrip_lossless` synthesises a
≤64 × 64 RGBA tile from fuzz-controlled bytes and asserts the §3
lossless contract pixel-for-pixel across `encode_webp_lossless` →
`decode_webp`.

`roundtrip_animated` (round 238) widens the same
contract across the §2.7.1.1 animation carrier — a fuzz-controlled
1..8-frame animation (canvas ≤ 32 × 32) goes through
`build_animated_webp` → `decode_webp` and the frame count + per-frame
width/height + per-frame `duration_ms` + per-frame RGBA bytes are
asserted byte-identical.

`decode_alph` (round 255) drives the
§2.7.1.2 ALPH standalone entry point `alph::decode_alpha` directly
across the four filter methods (none / horizontal / vertical /
gradient) and the two compression methods (raw + headerless §3 VP8L)
with `plane.len() == width * height` asserted on success.

`parse_vp8x` (round 256) drives the §2.7.1 VP8X chunk parser
standalone entry point `vp8x::Vp8xHeader::parse` directly across the
full §2.7.1 Figure 7 flag-octet / reserved-field / canvas-dimension
cross-product with every successfully-decoded field cross-checked
against the input bytes the parser observed and every error branch
cross-checked against the §2.7.1 refusal triggers.

`parse_anmf`
(round 257) drives the §2.7.1.1 ANMF chunk header parser standalone
entry point `anmf::AnmfHeader::parse` directly across the full
§2.7.1.1 Figure 9 5 × uint24 + info-byte cross-product (Frame X * 2
doubling, Frame W/H Minus One + 1 resolution, uint24 LE duration,
info-byte Reserved / B / D extraction at bits 7..2 / 1 / 0) with
every successfully-decoded field cross-checked against the input
bytes the parser observed and the `PayloadTooShort` branch
cross-checked against the §2.7.1.1 16-byte minimum.

`parse_anim`
(round 258) drives the §2.7.1.1 ANIM chunk parser standalone entry
point `anim::AnimHeader::parse` directly across the full §2.7.1.1
Figure 8 BGRA × loop-count cross-product (BGRA byte-order
background, `as_u32_le()` matching the LE u32 reload, LE u16 loop
count, `loops_forever()` predicate) with the `BadPayloadLength`
branch cross-checked against the §2.7.1.1 fixed 6-byte length.

`parse_alph` (round 259) drives the §2.7.1.2 ALPH info-byte parser
standalone entry point `alph::AlphHeader::parse` directly across the
full §2.7.1.2 Figure 10 `Rsv|P|F|C` 2-bit-field cross-product
(MSB-first bit decomposition at bits 7..6 / 5..4 / 3..2 / 1..0,
typed-variant mapping for the `C` / `F` / `P` enums including the
`Reserved(_)` variants on undefined 2 / 3, fixed `bitstream_offset
== 1`) with the `EmptyPayload` branch cross-checked against the
§2.7.1.2 requirement that the payload carry at minimum the one info
byte.

`parse_transform_list` (round 260) drives the §4 VP8L
transform-list reader standalone entry point
`vp8l_stream::TransformList::read` directly across the full §4
transform-presence loop (per-type fixed fields, duplicate-detection
refusal, deferred §5 entropy-body boundary) with `Ok(list)` cross-checked
against `transforms().len() <= 4`, no repeated `TransformType` across
entries, §4.1 / §4.2 `size_bits ∈ [2, 9]`, §4.4 `color_table_size ∈
[1, 256]` plus the threshold-table `width_bits` derivation, the
`body_bit_position()` within the slice's bit length, and the
`stopped_at_entropy_body()` flag consistent with the last entry's
`has_entropy_body()`.

`parse_meta_prefix` (round 261) drives the
§5.2.3 color-cache info + §6.2.2 meta-prefix + §6.2
5-prefix-code-group reader standalone entry point
`meta_prefix::MetaPrefixHeader::read` directly across the full
§5.2.3 + §6.2.2 preamble cross-product (color-cache enable bit +
4-bit `color_cache_code_bits` range gate, §6.2.2 `ImageRole`
dispatch, `EntropyImagePending` `prefix_bits = ReadBits(3) + 2`
range, and the §6.2.2 `DIV_ROUND_UP(image_dim, 1 << prefix_bits)`
entropy-image dimension derivation) with `Ok(header)` cross-checked
against the §5.2.3 `code_bits ∈ {0} ∪ [1, 11]` range, the
`is_enabled()` / `size()` derivations, the `EntropyCoded` role
never reaching `EntropyImagePending` (the meta-prefix bit is
absent for sub-images), the `EntropyImagePending` branch's
`prefix_bits ∈ [2, 9]`, the recomputed entropy-image
width/height matching the recorded values, and the
`entropy_image_bit_position` within the slice's bit length;
`Err(InvalidColorCacheCodeBits)` cross-checked against the
`value ∈ {0} ∪ [12, 15]` rejection-window.

`parse_container` (round 262) drives the §2.3 / §2.4 RIFF/WEBP
chunk-walker standalone entry point `container::parse` directly
with every byte of the fuzz buffer attacker-controlled (including
the §2.4 `File Size` field at bytes 4..8 and every per-chunk `Size`
field at offsets `+4..+8` relative to its header) with `Ok(container)`
cross-checked against the §2.3 + §2.4 carrier rules (`riff_file_size`
== LE uint32 at `buf[4..8]`, every recorded `WebpChunk` cross-checked
byte-for-byte against the buffer it points into — FourCC at
`buf[header_offset..+4]`, LE uint32 `Size` at
`buf[header_offset + 4..+8]`, `payload_end - payload_start ==
size as usize`, `payload_end` inside both the buffer length and the
§2.4 declared RIFF window, on-disk order with
`chunks[i+1].header_offset == chunks[i].payload_end + (size & 1)`, the
`is_extended()` / `is_vp8_lossy()` / `is_vp8_lossless()` predicates
pure functions of FourCC, the `chunks_with_fourcc` /
`first_chunk_with_fourcc` helpers matching a manual filter) and every
error variant cross-checked against the §2.3 / §2.4 refusal trigger
(TooShortForHeader.got == buf.len() < 12; NotRiff.got == buf[0..4]
!= 'RIFF'; NotWebp.got == buf[8..12] != 'WEBP' with buf[0..4] ==
'RIFF'; RiffSizeOverflowsBuffer.declared == LE uint32 at buf[4..8]
with 8 + declared > buffer_len; TruncatedChunkHeader.offset >= 12
inside declared window with < 8 bytes remaining;
ChunkPayloadOverflowsRiff.offset >= 12 with 8-byte header fitting,
declared == LE uint32 at chunk header, available == declared_end
- (offset + 8), declared > available; MissingPadByte.offset >= 12
with declared Size odd, payload itself fitting, and pad byte at
payload_end + 1 outside declared window).

`distance_code` (round 263)
drives the §5.2.2 distance-code-to-pixel-distance pure-function
lookup standalone entry point
`vp8l_decode::distance_code_to_pixel_distance` directly across the
full attacker-reachable `(distance_code, image_width)` cross-product
(a series of `(image_width, distance_code)` u32 LE pairs sliced
out of the fuzz buffer, with the §3.4 14-bit image-width ceiling
applied and the §5.2.2 `distance_code >= 1` precondition honoured)
with every returned `D` cross-checked against the §5.2.2 spec
formula (`max(1, xi + yi * image_width)` for codes `1..=120` via
the 120-entry `DISTANCE_MAP`, `distance_code - 120` for codes
`> 120`) and the §5.2.2 clamp guarantee (`D >= 1` always — either
from the clamp on the neighborhood-lookup branch or from the
smallest reachable raw scan-line distance of `121 - 120 = 1`),
plus pure-function determinism asserted via a double-call equality
check.

`color_cache` (round 264) drives the §5.2.3
lossless-color-cache primitives standalone entry point
`vp8l_decode::ColorCache` directly across the full attacker-reachable
`code_bits ∈ [1, 11]` × `argb ∈ [0, u32::MAX]` cross-product (the
first fuzz byte fixes the §5.2.3 `code_bits` remapped into the
permitted window per the §5.2.3 "compliant decoders MUST indicate a
corrupted bitstream for other values" rule, every subsequent 4-byte
word is forwarded verbatim as a fuzz-controlled ARGB color into
`ColorCache::insert`) with every hash cross-checked against the
§5.2.3 spec formula `(0x1e35a7bd * argb) >> (32 - code_bits)`, every
insert/lookup round trip cross-checked against the §5.2.3 single-slot
single-write spec text ("Only one lookup is done in a color cache;
there is no conflict resolution"), every per-slot lookup cross-checked
against a parallel shadow model that records the §5.2.3
most-recently-inserted-wins overwrite behaviour, the §5.2.3 cache
initialization invariant cross-checked on a fresh cache (`size() ==
1 << code_bits`, every slot reads as `Some(0)`, `lookup(size())`
reads as `None`), and pure-function determinism asserted on the
insert sequence by rebuilding a replay cache from the same fuzz
bytes and verifying every slot agrees with the primary cache.

`inverse_predictor_color` (round 265) drives the §4.1 inverse-predictor
+ §4.2 inverse-color in-place transform passes standalone entry
points `vp8l_transform::inverse_predictor` +
`vp8l_transform::inverse_color` directly across the full
attacker-reachable `(width, height, size_bits, residual_pixels,
sub_resolution_image)` cross-product (the first three fuzz bytes
fix the §4.1 / §4.2 `(width, height, size_bits)` carrier triple with
`width` / `height` masked into `[1, 32]` for iteration cost and
`size_bits` remapped into `[0, 9]` to cover the full §4.1 / §4.2
`ReadBits(3) + 2` window plus the `size_bits == 0` hoist branch;
every subsequent 4-byte little-endian word is forwarded verbatim as
a fuzz-controlled ARGB residual pixel and, after `width * height`
words, as a fuzz-controlled sub-resolution predictor / color image
pixel) with the §4.1 left-topmost rule cross-checked against the
spec text (`pred_pixels[0] == residual[0] + 0xff000000` per channel
mod 256), the §4.1 single-column left-column rule cross-checked
against the §4.1 "all pixels on the leftmost column are T-pixel"
spec text (every `(0, y)` for `y >= 1` equals `residual + T` per
channel mod 256), the §4.1 single-row top-row rule cross-checked
against the §4.1 "all pixels on the top row are L-pixel" spec text
(every `(x, 0)` for `x >= 1` equals `residual + L` per channel mod
256), the §4.2 alpha-and-green preservation invariant cross-checked
against the §4.2 spec text ("The alpha and green channels are left
as is"), the §4.2 zero-CTE no-op invariant cross-checked by
re-running the pass against an all-zero sub-resolution image (every
per-pixel output equals the input), the §4.2 per-block constancy
invariant cross-checked against the §4.2 block structure (two
same-block pixels with equal pre-pass RGB produce equal post-pass
red + blue), and both passes' early-return contract cross-checked
against the §4.1 / §4.2 `(width == 0 || height == 0)` no-op (the
pixel buffer is byte-identical to the pre-call snapshot).

`inverse_subtract_green_indexing` (round 266) drives the §4.3
inverse-subtract-green + §4.4 inverse-color-table + §4.4
inverse-color-indexing transform passes standalone entry points
`vp8l_transform::{inverse_subtract_green, inverse_color_table,
inverse_color_indexing}` directly across their full
attacker-reachable input cross-products (the first three fuzz bytes
fix the §4.3 / §4.4 `(orig_width, height, table_size)` carrier triple
with `orig_width` / `height` masked into `[1, 32]` for iteration cost
and `table_size` mapped into the §4.4 wire window `[1, 256]`; every
subsequent 4-byte little-endian word is forwarded verbatim first as a
fuzz-controlled ARGB §4.3 input pixel, then as a fuzz-controlled §4.4
color-table delta entry, then as a fuzz-controlled §4.4 packed-index
ARGB pixel) with the §4.3 alpha-and-green preservation invariant
cross-checked against the spec text (every pixel's red byte equals
input red + input green mod 256, every pixel's blue byte equals input
blue + input green mod 256, alpha + green bytes byte-identical), the
§4.3 per-pixel locality invariant cross-checked by running the pass
on single-pixel inputs at the first eight positions and asserting the
solo output matches the multi-pixel output, the §4.3 zero-green-byte
no-op cross-checked against the `(red + 0) = red` reduction, the §4.4
color-table seed preservation cross-checked against the spec text
(`table[0]` is left untouched), the §4.4 color-table running-sum
invariant cross-checked against the §4.4 "adding the previous color
component values by each ARGB component separately and storing the
least significant 8 bits of the result" spec text (every `i >= 1`
entry is the per-channel running sum mod 256 of the original input
bytes), the §4.4 color-indexing output-length cross-checked against
the `orig_width * height` carrier contract, the §4.4 color-indexing
palette-lookup cross-checked against the §4.4 spec formula (output
pixel `(x, y)` is `color_table[((packed_green >> ((x % count) *
bits)) & mask) as usize]` with `width_bits` derived from the table
size via the §4.4 threshold table, falling back to transparent black
`0x00000000` when the index is out of range), and the §4.4
color-indexing empty-table edge case cross-checked against the §4.4
"unused indices map to transparent black" rule; the §4.3 empty-buffer
and §4.4 single-element-table degenerate no-op branches are
cross-checked unconditionally on every iteration.

`backward_reference`
(round 267) drives the §5.2.2 backward-reference assembler standalone
entry point `vp8l_decode::apply_backward_reference` directly: the fuzz
buffer fixes a `(prefill_len, length, dist, total_pixels)` carrier
tuple (`prefill_len` masked to `[0, 4096]`; `dist` floored at 1 to
honour the §5.2.2 `D >= 1` precondition the
`distance_code_to_pixel_distance` clamp guarantees; `total_pixels`
alternated between `prefill_len + length + headroom` and a shrunk
value below `prefill_len + length` so both the success / exact-fit
path and the §5.2.2 overflow refusal are routinely reached) plus a
stream of fuzz-controlled ARGB pre-fill pixels, with every `Ok`
outcome cross-checked against the §5.2.2 copy contract (returned range
equals `position..position + length`, exactly `length` pixels
appended, the already-decoded prefix byte-identical, every appended
pixel matching a parallel reference LZ77 walk `out[position + i] ==
out[position + i - dist]` read after the preceding writes — the
overlapping `dist < length` self-repeat included), the §5.2.2
underflow refusal cross-checked against its `dist > position` trigger
(fields echo the call, buffer byte-identical to its pre-call
snapshot), the §5.2.2 overflow refusal cross-checked against its
`position + length > total_pixels` trigger (with the underflow guard
having passed), and pure-function determinism cross-checked by
replaying a successful run from the same pre-fill.

`meta_prefix_index` (round 268) drives the §6.2.2 meta-prefix
block-lookup table standalone entry points
`vp8l_decode::MetaPrefixIndex::{from_parts, meta_code_for}` directly
across the full `(prefix_bits, block_width, block_height,
meta_codes)` cross-product (the first fuzz byte fixes `prefix_bits`
masked to `[0, 15]` so the §6.2.2 `ReadBits(3) + 2` window `[2, 9]`
and its rejection are both routinely reached; the next two bytes fix
the block grid in `[0, 32]²` with 0 reaching the degenerate-grid
refusal; a skew byte shifts the supplied code count off the
`block_width * block_height` expectation by `[-2, +2]`; every
remaining 2-byte LE word is forwarded verbatim as a meta-prefix code)
with every `Ok` index cross-checked against the §6.2.2 carrier rules
(accessors echo the parts, `num_prefix_groups() == max(entropy image)
+ 1`, and `meta_code_for(x, y)` at all four corners of every block's
`(1 << prefix_bits)`-pixel-square covered area matching the §6.2.2
position formula `meta_codes[(y >> prefix_bits) * block_width + (x >>
prefix_bits)]`), every error variant cross-checked against its §6.2.2
refusal trigger in precedence order (`InvalidPrefixBits` ⇔ `prefix_bits ∉ [2, 9]`; `EmptyIndex` ⇔ zero-block grid with the
prefix-bits gate passed; `CodeCountMismatch` ⇔ count off the
expectation with both earlier gates passed, `expected` / `got`
echoing the call), and constructor determinism cross-checked by
rebuilding from the same parts plus round-tripping the index's own
accessors back through `from_parts`; `decode_entropy_image`
(round 270) drives the §6.2.2 *entropy image* decode path standalone
entry point `vp8l_decode::decode_entropy_image` directly across the
`(prefix_bits, prefix_image_width, prefix_image_height, bitstream)`
cross-product (the first three fuzz bytes fix the §6.2.2 carrier triple
`prefix_bits` masked to `[0, 15]` since the function records it as an
opaque carrier without re-deriving a block size, the block dimensions
modulo 9 so the §7.3 sub-image decode stays bounded and 0 reaches the
§6.2.2 degenerate-dimension refusal; the remaining bytes feed a
zero-positioned `BitReader` the §7.3 `entropy-coded-image` bit
sequence) with every `Ok` index cross-checked against the §6.2.2 +
§7.3 carrier rules (accessors echo the carrier triple; §7.3 one
meta-code per block `meta_codes().len() == prefix_image_width *
prefix_image_height`; §6.2.2 `num_prefix_groups() == max(meta_codes) +
1`; the §6.2.2 fold `meta_prefix_code == (entropy_pixel >> 8) & 0xffff`
cross-checked against an independent decode of the same bytes through
the public sibling `decode_entropy_coded_image` — the harness refolds
that decode's raw ARGB pixels and asserts byte-equality with the
meta-codes plus both readers advancing to the same bit position; the
§6.2.2 carrier asymmetry where `from_parts` reproduces the index iff
`prefix_bits ∈ [2, 9]` and refuses with `InvalidPrefixBits` otherwise;
and determinism by replaying the same bytes + carrier triple), the
§6.2.2 degenerate-dimension refusal pinned to the `EmptyEntropyImage`
variant echoing the carrier dimensions iff at least one is zero, and
every other bitstream-level refusal required only to return a `Result`
rather than panic. A 30 s smoke pass cleared 8.9 M runs with no
crashes (reaching the §5.2 `read_lz77_value` / `apply_backward_reference`
/ `distance_code_to_pixel_distance` core through the entropy-coded
sub-image); `decode_entropy_coded_image` (round 271) drives the §7.3
*entropy-coded-image* decode path standalone entry point
`vp8l_decode::decode_entropy_coded_image` directly — the §7.3 ABNF
building block beneath the round-270 §6.2.2 entropy image (which wraps
it and folds its pixels) and the §4.1 / §4.2 / §4.4 sub-resolution
images — across the `(width, height, bitstream)` cross-product (the
first two fuzz bytes fix the §7.3 carrier dimensions each modulo 9 so
the §5.2 / §6.2 decode loop stays bounded and 0 reaches the §7.3
degenerate-dimension `EmptyEntropyImage` refusal; the remaining bytes
feed a zero-positioned `BitReader` the §5.2.3 color-cache-info bit +
one §6.2 prefix-code group + §5.2 LZ77 / color-cache data) with every
`Ok` image cross-checked against the §7.3 carrier rules (`width()` /
`height()` echo the carrier, `pixels().len() == width * height`, the
success path reachable only with both dimensions ≥ 1, the reader never
advancing past the slice's bit length) and against the §6.2.2 wrapper
(an independent `decode_entropy_image` over the same bytes reproduces
the `(pixel >> 8) & 0xffff` per-pixel fold as its per-block meta-codes
and advances the reader to the same bit position), plus pure-function
determinism cross-checked by replaying the same bytes + dimensions for
a byte-identical pixel buffer at an identical bit position; the §7.3
degenerate-dimension refusal pinned to the `EmptyEntropyImage` variant
echoing the carrier dimensions iff at least one is zero and every other
refusal required only to return a `Result` rather than panic;
`decode_argb` (round 272) drives the §6.2.2 top-level VP8L ARGB
main-image decode path standalone entry point
`vp8l_decode::decode_argb` directly — the §5.1 `spatially-coded-image`
ARGB-role decoder one layer above the round-270 / round-271 entropy-image
harnesses, reading the §5.2.3 `color-cache-info` bit + §6.2.2
meta-prefix bit, dispatching the single-group (one §6.2 prefix-code
group everywhere) vs multi-group (§6.2.2 entropy image →
`num_prefix_groups = max + 1` groups → per-pixel-block group selection
via `meta_code_for`, single §5.2.3 color cache in stream order) paths,
and running the §6.2.3 decode loop — across the `(width, height,
bitstream)` cross-product (the first two fuzz bytes fix the carrier
dimensions each clamped into `[1, 8]` so the success contract holds —
mirroring the §3.4-validated dimensions `decode_argb` is reachable with
— and the image stays ≤ 64 pixels; the remaining bytes feed a
zero-positioned `BitReader` the §6.2.2 ARGB image bit sequence) with
every `Ok` image cross-checked against the §6.2.2 carrier rules
(`width()` / `height()` echo the carrier, `pixels().len() == width *
height`, the reader never advancing past the slice's bit length), plus
pure-function determinism cross-checked by replaying the same bytes +
dimensions for a byte-identical pixel buffer at an identical bit
position, and every refusal (truncation, meta-prefix/color-cache-info
parse failure, entropy-image fault, prefix-code parse failure,
out-of-range green symbol, color-cache or backward-reference fault, or a
meta-prefix code beyond `num_prefix_groups`) required only to return a
`Result` rather than panic. A 30 s smoke pass cleared 2.66 M runs with
no crashes (476 cov / 1690 features over a 269-input corpus);
`decode_lossless` (round 273) drives the §4 transform-list + main-image
full lossless-bitstream decode path standalone entry points
`vp8l_transform::{decode_lossless, decode_lossless_headerless}` directly
— the layer immediately above the round-272 §6.2.2 `decode_argb`: it
walks the §4 / §7.2 optional-transform loop (per-transform §4.x fixed
fields + §5-encoded body via the §7.3 `decode_entropy_coded_image`, the
§4 once-each duplicate refusal, the §4.4 `color_table_size` /
`width_bits` width subsampling), decodes the main §5.1 ARGB image at the
subsampled width, then applies the §4 inverse-transform chain in reverse
read order ("last one first"); `decode_lossless_headerless` is the
§2.7.1.2 / §3 `ALPH` twin reading the same bytes from bit 0 (no §3.4
5-byte image-header skip). The first two fuzz bytes fix the
`(width, height)` carrier each clamped into `[1, 8]` (so the success
contract holds and the decode stays ≤ 64 pixels); the remaining bytes are
the VP8L chunk-payload bits, with every `Ok` image cross-checked against
the §4 / §6.2.2 carrier rules (`width()` / `height()` echo the carrier
even after a §4.4 color-indexing transform un-bundles the internal width
back to the canvas width, `pixels().len() == width * height`) plus
replay determinism, and every refusal required only to return a `Result`
rather than panic. On its first run this harness surfaced a
`BitReader::bits_remaining` `usize` underflow that let a sub-5-byte VP8L
chunk payload index out of bounds past the §3.4 image-header skip; the
round fixed it with `saturating_sub`. A 40 s smoke pass cleared
3.62 M runs with no crashes after the fix; `prefix_code_group`
(round 274) drives the §6.2 / §6.2.1 *prefix-code-group* reader
standalone entry point `meta_prefix::PrefixCodeGroup::read` directly —
the surface immediately below the round-271 §7.3
`decode_entropy_coded_image`, which reads a §5.2.3 color-cache-info bit
then exactly one `PrefixCodeGroup::read` before the §5.2 pixel loop. A
§6.2 group is the five canonical §6.2.1 prefix codes every VP8L pixel is
decoded with: green + backref-length + color-cache (alphabet `256 + 24 +
color_cache_size` per §6.2.3), red/blue/alpha (each `256`), and backref
distance (`40`), each read via `PrefixCode::read` then the §6.2.1
simple/normal `read_code_lengths` dispatch and the §6.2.1 canonical
`from_code_lengths` Kraft completeness build. The first fuzz byte selects
the §5.2.3 cache size from `{0}` (disabled) or `1 << code_bits` for
`code_bits ∈ [1, 11]` (`{2, 4, …, 2048}`), sizing the §6.2.3 green
alphabet; the remaining bytes feed a zero-positioned `BitReader`. Every
`Ok(group)` is cross-checked against the §6.2.3 / §6.2.1 carrier rules:
each of the five codes' `code_lengths().len()` equals its alphabet, every
nonzero length is `<= 15` (the `MAX_CODE_LENGTH` ceiling), `single_symbol()`
is `Some(s)` iff the length table has exactly one nonzero entry (at `s`)
and `None` iff two or more, `read_symbol` against an all-zero reader
resolves an in-range symbol index, the reader never advances past the
slice bit length, and replaying the same bytes + cache size yields an
equal group at an identical bit position; the §5.2.3
`InvalidColorCacheCodeBits` variant is asserted unreachable (the cache
size is caller-supplied, never read here). A 41 s smoke pass cleared
4.97 M runs with no crashes. `prefix_code` (round 275) drops one layer
further to the §6.2.1 *single canonical prefix-code* reader standalone
entry point `vp8l_prefix::PrefixCode::read` directly — the surface
`PrefixCodeGroup::read` calls five times in green/red/blue/alpha/distance
order. It reads one code's lengths off the wire (the §6.2.1 simple/normal
`read_code_lengths` dispatch) and builds the canonical decoder via
`from_code_lengths` with its Kraft completeness gate and single-leaf
exception, isolated across an attacker-controlled `(alphabet_size,
bitstream)` cross-product where the first fuzz byte selects one of the
wire-reachable §6.2.3 alphabets — `40` (distance), `256` (red/blue/alpha),
or the green `256 + 24 + color_cache_size` for the full
`color_cache_size ∈ {0} ∪ {2, …, 2048}` range — and the remaining bytes
feed a zero-positioned `BitReader`. Every `Ok(code)` is cross-checked
against the §6.2.3 / §6.2.1 carrier rules: `code_lengths().len()` equals
the selected alphabet, every nonzero length is `<= 15`, `single_symbol()`
is `Some(s)` iff exactly one nonzero entry (at `s`) and `None` iff two or
more, `read_symbol` against an all-zero reader resolves an in-range symbol
index, rebuilding from the returned length table through
`from_code_lengths` reproduces an equal code (the §6.2.1 `sum 2^-len == 1`
completeness invariant), the reader never advances past the slice bit
length, and replaying the same bytes + alphabet yields an equal code at an
identical bit position. A 14 s smoke pass cleared 2.00 M runs with no
crashes. `roundtrip_anim_modes` (round 279) is a differential oracle on
the §2.7.1.1 animation *assembly* path
`build_animated_webp_with_options` → `decode_webp` with every per-frame
carrier field fuzz-driven — even `(x, y)` sub-canvas offsets, mixed
`Auto` / `Delta` / `Lossless` frame modes (the dirty-rect sub-frame
encoder), `None` / `Background` disposal, `Overwrite` / `AlphaBlend`
blending, and the `ANIM` loop-count + background-colour options — with
every decoded full-canvas frame snapshot asserted byte-identical to an
independent §2.7.1.1 canvas simulation and duration / loop count /
background colour asserted to carry through. `roundtrip_metadata`
(round 282) is a differential oracle on the §2.7 metadata *write* path:
the two independent extended-layout writers
`build::build_webp_file_with_metadata` and
`encode_vp8l_argb_with_metadata` are driven with fuzz-controlled
§2.7.1.4 `ICCP` / §2.7.1.5 `EXIF` / `XMP ` payloads (presence, length
0..=255 — odd lengths exercising the §2.3 pad byte — and content), a
fuzz-controlled §2.7.1 `L` alpha-hint flag, and fuzz-controlled canvas
dimensions + ARGB pixels; every emitted file is cross-checked against
the §2.7 documented contract — the §2.3/§2.4 walker parses it, the
chunk sequence is the canonical `VP8X, ICCP?, VP8L, EXIF?, XMP?` order
(§2.7.1.4: the color profile "MUST appear before the image data"),
each metadata chunk's `Size` + payload bytes round-trip verbatim, the
§2.7.1 flag octet declares exactly the supplied features with the
canvas dimensions echoed, `extract_metadata` and the `decode_webp`
metadata carry agree byte-for-byte, and the lossless pixels survive
the metadata-bearing layout exactly — with the writer-B
no-alpha/no-metadata demotion to the §2.6 simple single-`VP8L` layout
pinned and both writers' metadata walks asserted identical to each
other. A 12-minute ASan pass cleared 30,543 runs with no crashes and
no assertion failures (3780 cov / 9031 features over a 790-input
corpus). `read_symbol_lut_diff` (round 285) is a differential oracle
on the round-284 §6.2.1 read-symbol fast path: `PrefixCode::read_symbol`
(the 256-entry primary lookup table keyed on the next 8 peeked
wire-order bits, the > 8-bit continuation walk resuming at length 9,
the near-EOF per-bit fallback, and the `MIN_LOOKUP_USED` used-symbol
amortization gate) is run in lockstep against the crate's own
pre-table per-bit row walk, kept as the `#[doc(hidden)]`
`PrefixCode::read_symbol_reference` oracle, over the same bytes — with
the decoded symbol (or typed refusal, including the `PrefixError::Eof`
`bit_pos` / `wanted` / `available` fields), the cursor bit position
after *every* symbol, and the alphabet bound asserted identical. The
code under test is built two ways: *wire mode* reads it off the fuzz
bytes through `PrefixCode::read` at a fuzz-selected §6.2.3 alphabet
(`40` / `256` / `256 + 24 + cache_size`) with the rest of the same
stream as the symbol soup (the on-disk §5 entropy-body layout), and
*table mode* synthesises the per-symbol lengths from fuzz bytes
repaired to an exact §6.2.1 Kraft sum (greedy front fill +
binary-decomposition tail fill), so the mutator steers the used-symbol
count across the table-build gate and the length profile across the
≤ 8-bit fast path, the > 8-bit continuation rows, and the 15-bit
ceiling at will; the §6.2.1 single-leaf-node tree (consumes no bits)
is compared once instead of looped. A 15-minute ASan campaign cleared
36.1 M runs with no divergence. `decode_lossless_lut` (round 285)
re-drives the §4 transform-list + main-image lossless decode entry
points `vp8l_transform::{decode_lossless, decode_lossless_headerless}`
at carrier dimensions widened into `[1, 64]` (≤ 4096 pixels — the
round-273 `decode_lossless` sibling clamps at `[1, 8]`, so its
accepted streams read only a handful of symbols per prefix code), with
the corpus seeded from the VP8L chunk payloads of the committed
fixture corpus plus entropy-heavy reference-encoder-produced streams
(64×64 noise / gradient / plasma tiles) whose §6.2 groups carry
100+-symbol codes with 9..15-bit tails — so the round-284 lookup-table
fast path, its continuation walk, and the word-load
`BitReader::read_bits` / `peek_bits` / `advance_bits` run hot inside
the assembled pipeline (transform sub-images, color cache, LZ77,
inverse-transform chain) under adversarial mutation at every cursor
phase; the round-273 carrier-echo / pixel-count / replay-determinism
contract is asserted unchanged. A 15-minute ASan campaign cleared
16.8 M runs with no crashes; same-session 4-minute regression re-runs
of `prefix_code` (32.2 M), `decode_lossless` (9.4 M), and
`prefix_code_group` (24.2 M) also ran clean. `decode_still_paths`
(round 288) is a differential oracle on the two public still-image
decode entry points `decode_webp` (the published `WebpImage` surface)
and `decode_webp_image` (the low-level `DecodedWebp` surface), seeded
from the in-tree §2.6 lossless + §2.7-extended fixtures. For a
non-animated input the published façade builds its single still frame by
literally calling `decode_webp_image`, so the harness asserts the two
surfaces agree exactly (`Ok` ⇒ `frames.len() == 1` with byte-identical
`frames[0].{rgba, width, height}` + canvas-dimension echo +
`duration_ms == 0` + no §2.7.1.1 carrier; `Err` ⇒ the published path
also `Err`), re-checks the §2.5/§2.6 flat-buffer carrier invariant
(`rgba.len() == width * height * 4`, non-empty iff both dimensions
nonzero) on every decoded still and every composited animation frame,
and asserts `decode_webp_image` replay determinism. The harness surfaced
a libFuzzer OOM — a ~60-byte file declaring a §2.7.1 16 777 154 × 64
animation canvas forced a ~4 GiB eager `Vec`; the round fixed it by
bounding the animation canvas at the §3.4 still-image ceiling
(`MAX_DECODE_DIMENSION = 16384` per side, rejected with `InvalidData`
before allocating). A ~300 s ASan campaign over 25 772 runs is now
crash-free. The §2.5 `VP8 ` *lossy* decode (routed to the `oxideav-vp8`
sibling, which currently panics on some malformed bitstreams at its
inverse-DCT stage) is deliberately excluded from the cross-check pending
sibling-side hardening. `decode_lossless_image` (round 292) drives the
public top-level lossless façade `decode_lossless_image` — the layer that
walks the §2.3 `RIFF`/`WEBP` container, selects the §2.6 `VP8L` chunk,
reads the chunk's own §3.4 image-header dimensions, and runs the full
§4/§5/§6 decode to a typed `DecodedImage`. Unlike the round-273
`decode_lossless` harness (dimensions supplied *by the harness* over a
bare payload), the decoded dimensions here come from the **file's own**
§3.4 14-bit fields, exercising the §3.4-header → §4-decode
dimension-coherence path end to end; on every `Ok(Some(image))` it
asserts `image.{width,height}` echo the §3.4-resolved chunk dimensions,
the §6.2.2 `width * height` pixel count, a non-empty buffer, and replay
determinism. A cheap structural pre-pass gates the full-decode tail by
declared pixel count so an adversarial `16384 × 16384` header can't blow
the per-iteration budget. The harness surfaced a second libFuzzer OOM —
distinct from the r288 animation-canvas finding: a ~30-byte `VP8L` chunk
declaring a §3.4 `16360 × 12284` still forced `vp8l_decode::decode_image`
to eager-reserve ~800 MiB *before* the EOF-checked §5/§6 loop ran. The
round fixed it by capping the eager `Vec::with_capacity` at
`MAX_EAGER_PIXEL_RESERVATION = 1 << 22` pixels (`eager_pixel_capacity`);
the buffer still grows on demand for a legitimately large image and the
self-terminating loop raises `DecodeError::Eof` on a truncated stream, so
decoded bytes for all valid images are unchanged. A ~120 s ASan campaign
over 48 882 runs is now crash-free, peak RSS 1.2 GiB.
`decode_alpha_plane` (round 295) drives the public *file-level
still-image alpha* entry point `decode_alpha_plane` — the layer that
walks the §2.3 `RIFF`/`WEBP` container, selects the §2.7.1.2 `ALPH`
chunk (`Ok(None)` when absent), resolves the plane dimensions *from the
file itself* (the §2.7.1 `VP8X` 24-bit canvas Width/Height, else the
§2.5 `VP8 ` keyframe header), and decodes the alpha bitstream through
both §2.7.1.2 compression methods (raw + headerless §3 lossless) and all
four filter methods. Unlike `decode_alph` (dimensions supplied *by the
harness* over a bare chunk-payload slice), the dimensions here come from
the file's own §2.7.1 / §2.5 header, exercising the dimension-source →
§2.7.1.2 alpha-decode coherence path end to end. A structural pre-pass
reads the §2.7.1 `VP8X` canvas and gates the decode tail by declared
pixel count so an adversarial canvas can't blow the per-iteration budget.
On every `Ok(Some(plane))` the §2.7.1.2 carrier invariant and replay
determinism are cross-checked. A ~90 s ASan campaign over **23 926 275
runs** (~263 K exec/s, peak RSS 541 MiB) is crash-free — no panic, OOM,
or overflow surfaced; the existing `decode_alpha` `checked_mul` and the
headerless lossless eager-reservation cap already defend this path.
`parse_vp8_chunk` (round 298) drives the §2.5 simple-lossy `VP8 ` chunk
handle standalone entry point `vp8_chunk::WebpLossyChunk::from_payload`,
the keyframe-header peek the §2 RIFF walker reaches only along the
well-formed-container path, over an attacker-controlled byte slice of
arbitrary length. Every successfully-decoded field is cross-checked
against the RFC 6386 §9.1 key-frame header byte layout the parser
observed (the little-endian frame tag from bytes 0..3 with the key-frame
frame-type, `version` at bits 1..3, `show_frame` at bit 4, the 19-bit
`first_partition_size` at bits 5..23, the §9.1 start code `0x9D 0x01 0x2A`
at bytes 3..6, the 14-bit `width` / 2-bit `horizontal_scale` split of the
width word at bytes 6..8, the same split of the height word at bytes
8..10, and `bitstream()` echoing the input verbatim); every refusal
branch is cross-checked against its §9.1 / §2.5 trigger
(`PayloadTooShortForKeyframe` below the 10-byte minimum, `NotAKeyframe`
on an interframe frame-type bit §2.5 forbids, `BadStartCode` echoing
bytes 3..6 verbatim). A ~60 s ASan campaign over **87 063 810 runs**
(~1.43 M exec/s, peak RSS 550 MiB) is crash-free — no panic, OOM, or
overflow surfaced. Run any one with (nightly + `cargo-fuzz` installed):

```text
cargo +nightly fuzz run decode               --manifest-path crates/oxideav-webp/fuzz/Cargo.toml
cargo +nightly fuzz run extract_metadata     --manifest-path crates/oxideav-webp/fuzz/Cargo.toml
cargo +nightly fuzz run roundtrip_lossless   --manifest-path crates/oxideav-webp/fuzz/Cargo.toml
cargo +nightly fuzz run roundtrip_animated   --manifest-path crates/oxideav-webp/fuzz/Cargo.toml
cargo +nightly fuzz run decode_alph          --manifest-path crates/oxideav-webp/fuzz/Cargo.toml
cargo +nightly fuzz run parse_vp8x           --manifest-path crates/oxideav-webp/fuzz/Cargo.toml
cargo +nightly fuzz run parse_anmf           --manifest-path crates/oxideav-webp/fuzz/Cargo.toml
cargo +nightly fuzz run parse_anim           --manifest-path crates/oxideav-webp/fuzz/Cargo.toml
cargo +nightly fuzz run parse_alph           --manifest-path crates/oxideav-webp/fuzz/Cargo.toml
cargo +nightly fuzz run parse_transform_list --manifest-path crates/oxideav-webp/fuzz/Cargo.toml
cargo +nightly fuzz run parse_meta_prefix    --manifest-path crates/oxideav-webp/fuzz/Cargo.toml
cargo +nightly fuzz run parse_container      --manifest-path crates/oxideav-webp/fuzz/Cargo.toml
cargo +nightly fuzz run distance_code        --manifest-path crates/oxideav-webp/fuzz/Cargo.toml
cargo +nightly fuzz run color_cache          --manifest-path crates/oxideav-webp/fuzz/Cargo.toml
cargo +nightly fuzz run inverse_predictor_color --manifest-path crates/oxideav-webp/fuzz/Cargo.toml
cargo +nightly fuzz run inverse_subtract_green_indexing --manifest-path crates/oxideav-webp/fuzz/Cargo.toml
cargo +nightly fuzz run backward_reference   --manifest-path crates/oxideav-webp/fuzz/Cargo.toml
cargo +nightly fuzz run meta_prefix_index    --manifest-path crates/oxideav-webp/fuzz/Cargo.toml
cargo +nightly fuzz run decode_entropy_image --manifest-path crates/oxideav-webp/fuzz/Cargo.toml
cargo +nightly fuzz run decode_entropy_coded_image --manifest-path crates/oxideav-webp/fuzz/Cargo.toml
cargo +nightly fuzz run decode_argb          --manifest-path crates/oxideav-webp/fuzz/Cargo.toml
cargo +nightly fuzz run decode_lossless      --manifest-path crates/oxideav-webp/fuzz/Cargo.toml
cargo +nightly fuzz run prefix_code_group    --manifest-path crates/oxideav-webp/fuzz/Cargo.toml
cargo +nightly fuzz run prefix_code          --manifest-path crates/oxideav-webp/fuzz/Cargo.toml
cargo +nightly fuzz run roundtrip_anim_modes --manifest-path crates/oxideav-webp/fuzz/Cargo.toml
cargo +nightly fuzz run roundtrip_metadata   --manifest-path crates/oxideav-webp/fuzz/Cargo.toml
cargo +nightly fuzz run read_symbol_lut_diff --manifest-path crates/oxideav-webp/fuzz/Cargo.toml
cargo +nightly fuzz run decode_lossless_lut  --manifest-path crates/oxideav-webp/fuzz/Cargo.toml
cargo +nightly fuzz run decode_still_paths   --manifest-path crates/oxideav-webp/fuzz/Cargo.toml
cargo +nightly fuzz run decode_lossless_image --manifest-path crates/oxideav-webp/fuzz/Cargo.toml
cargo +nightly fuzz run decode_alpha_plane   --manifest-path crates/oxideav-webp/fuzz/Cargo.toml
cargo +nightly fuzz run parse_vp8_chunk      --manifest-path crates/oxideav-webp/fuzz/Cargo.toml
```
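
To make the `parse_vp8_chunk` byte layout above concrete, here is a minimal sketch of an RFC 6386 §9.1 key-frame header peek. It is not the crate's `WebpLossyChunk::from_payload`: the `KeyframeHeader` struct, the `PeekError` enum, and the `peek_keyframe_header` function are hypothetical names introduced for illustration, and only the bit positions come from the description above.

```rust
/// Hypothetical holder for the RFC 6386 §9.1 key-frame header fields.
#[derive(Debug, PartialEq, Eq)]
pub struct KeyframeHeader {
    pub version: u8,
    pub show_frame: bool,
    pub first_partition_size: u32,
    pub width: u16,
    pub horizontal_scale: u8,
    pub height: u16,
    pub vertical_scale: u8,
}

/// Hypothetical refusal reasons, one per trigger named in the text.
#[derive(Debug, PartialEq, Eq)]
pub enum PeekError {
    PayloadTooShort(usize),
    NotAKeyframe,
    BadStartCode([u8; 3]),
}

pub fn peek_keyframe_header(payload: &[u8]) -> Result<KeyframeHeader, PeekError> {
    // 3-byte frame tag + 3-byte start code + 2-byte width + 2-byte height.
    if payload.len() < 10 {
        return Err(PeekError::PayloadTooShort(payload.len()));
    }
    // 24-bit little-endian frame tag from bytes 0..3.
    let tag = u32::from(payload[0])
        | (u32::from(payload[1]) << 8)
        | (u32::from(payload[2]) << 16);
    // Bit 0 is the frame type: 0 = key frame, 1 = interframe.
    if (tag & 1) != 0 {
        return Err(PeekError::NotAKeyframe);
    }
    // Start code 0x9D 0x01 0x2A at bytes 3..6.
    let start = [payload[3], payload[4], payload[5]];
    if start != [0x9D, 0x01, 0x2A] {
        return Err(PeekError::BadStartCode(start));
    }
    // Each dimension word: low 14 bits = size, high 2 bits = scale.
    let w = u16::from_le_bytes([payload[6], payload[7]]);
    let h = u16::from_le_bytes([payload[8], payload[9]]);
    Ok(KeyframeHeader {
        version: ((tag >> 1) & 0x7) as u8,       // bits 1..3
        show_frame: ((tag >> 4) & 1) == 1,       // bit 4
        first_partition_size: tag >> 5,          // bits 5..23 (19 bits)
        width: w & 0x3FFF,
        horizontal_scale: (w >> 14) as u8,
        height: h & 0x3FFF,
        vertical_scale: (h >> 14) as u8,
    })
}
```

A fuzz harness in the style described above would re-derive each field from the raw bytes independently and compare, and would check that every `Err` corresponds to exactly one of the three triggers.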

## Standalone use (no `oxideav-core`)

### Decode any `.webp` file

```rust
use oxideav_webp::{decode_webp, WebpImage};

// File bytes from disk, HTTP, …; "input.webp" is a placeholder path.
let file = std::fs::read("input.webp")?;
let webp_bytes: &[u8] = &file;
let image: WebpImage = decode_webp(webp_bytes)?;

println!("{} × {}, {} frame(s)", image.width, image.height, image.frames.len());
for frame in &image.frames {
    // frame.rgba is a tight Vec<u8> of width*height*4 RGBA bytes,
    // row-major, no per-row padding — drops into `image::ImageBuffer`:
    //
    //   let img = image::RgbaImage::from_raw(frame.width, frame.height,
    //                                        frame.rgba.clone()).unwrap();
    //
    println!("  frame: {}×{}, {} ms", frame.width, frame.height, frame.duration_ms);
}

// ICC / EXIF / XMP are on image.metadata.{icc, exif, xmp} (each Option<Vec<u8>>).
```
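
Because the frame buffer is tight and row-major, a single pixel can be addressed with plain index arithmetic. The helper below is a sketch, not part of the crate: `pixel_at` is a hypothetical name, and it assumes the dimensions are `u32` (consistent with the `image::RgbaImage::from_raw` call in the comment above, but not confirmed here).

```rust
/// Returns the `[R, G, B, A]` bytes of pixel `(x, y)` from a tight,
/// row-major RGBA buffer, or `None` if the coordinates fall outside the
/// image or the buffer is shorter than `width * height * 4`.
pub fn pixel_at(rgba: &[u8], width: u32, height: u32, x: u32, y: u32) -> Option<[u8; 4]> {
    if x >= width || y >= height {
        return None;
    }
    // No per-row padding: the row stride is exactly `width * 4` bytes.
    let idx = (y as usize * width as usize + x as usize) * 4;
    let px = rgba.get(idx..idx + 4)?;
    Some([px[0], px[1], px[2], px[3]])
}
```

With a decoded frame this would be called as `pixel_at(&frame.rgba, frame.width, frame.height, x, y)`.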

### Read metadata only (no pixel decode)

```rust
use oxideav_webp::extract_metadata;

let meta = extract_metadata(webp_bytes)?;
if let Some(icc) = meta.icc.as_deref()  { /* color-management profile */ }
if let Some(exif) = meta.exif.as_deref() { /* EXIF blob */ }
if let Some(xmp) = meta.xmp.as_deref()   { /* XMP UTF-8 XML */ }
```

### Encode a lossless `.webp` from RGBA bytes

The lossless encoder is a byte-cost **super-chooser**: it builds the §3
no-transform / subtract-green baseline plus every §4 single-transform and
§3.5 stacked-transform candidate — sweeping `size_bits`, the §5.2.3 color
cache, and the §6.2.2 meta-prefix grouping — and emits the byte-shortest
stream, so adding a candidate can never enlarge the output. Round 305
brought the three §3.5 stacked chains (color + predictor, color +
subtract-green + predictor, color-indexing + predictor) up to the same
per-block §4.1 mode-selection cost models the single-transform predictor
path has carried since rounds 159–162: each chain now sweeps the
folded-L1 magnitude proxy, the round-161 Shannon-entropy bit cost, and the
round-162 sub-image-aware entropy cost over the *transform-decorrelated*
residual the predictor actually models, keeping the smallest. Round 306
widened the sub-image-aware setting on the stacked chains from the single
mid-range weight round 305 bootstrapped to the **full lambda sweep** the
single-transform path uses — `4 000` / `16 000` / `64 000` / `256 000`
milli-per-bit — so each chain lands on the residual-vs-sub-image cost
crossover its own decorrelated residual exhibits rather than one fixed
guess. On smooth,
mildly-noisy photo-like content the entropy-aware models shrink the color +
predictor chain ~12–21 % versus the L1 proxy (the per-block mode histogram
concentrates, compacting both the §7.2 predictor sub-image and the residual
stream). Round-trip output is byte-identical regardless of which cost model
is chosen — the cost model only changes which §4.1 mode is *recorded*, and
the decoder reads the same modes back. Round 308 brought the *single-transform*
§4.2 cross-color path up to the same footing: its per-block color-transform-
element chooser now sweeps the L1-magnitude proxy **and** a Shannon-entropy
bit-cost model — the §4.2 analogue of the round-161 §4.1 predictor entropy
chooser — scoring each `(green_to_red, green_to_blue, red_to_blue)` candidate
by the bit cost of the resulting per-channel residual histogram rather than its
folded magnitude. The per-axis greedy stays exact (red residual depends only on
`green_to_red`, blue only on `(green_to_blue, red_to_blue)`, and red / blue
carry independent §5.x prefix codes); the entropy candidate is evaluated at the
per-region and single-block `size_bits` across the cache sweep, and the
super-chooser keeps the byte-shortest stream so it cannot regress. On a
channel-correlated-noise fixture it shrinks the §4.2 stream ~0.1 %, and as with
the §4.1 path the recorded CTE is the only thing the cost model changes — the
decoder re-applies whatever element the §4.2 sub-image carries.
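
The difference between the two cost models can be sketched on a single residual channel. This is an illustration under stated assumptions, not the encoder's code: `folded_l1_cost` and `shannon_bit_cost` are hypothetical names, and "folded" is assumed to mean treating each residual byte as a signed value modulo 256 and taking its magnitude.

```rust
/// L1 proxy (assumed definition): sum of residual magnitudes, with each
/// byte folded around zero so that 255 counts as 1, 254 as 2, and so on.
pub fn folded_l1_cost(residuals: &[u8]) -> u64 {
    residuals
        .iter()
        .map(|&r| u64::from(r.min(r.wrapping_neg())))
        .sum()
}

/// Shannon-entropy bit cost of the residual histogram: a symbol seen
/// `c` times out of `n` costs `log2(n / c)` bits per occurrence.
pub fn shannon_bit_cost(residuals: &[u8]) -> f64 {
    let mut hist = [0u32; 256];
    for &r in residuals {
        hist[r as usize] += 1;
    }
    let n = residuals.len() as f64;
    hist.iter()
        .filter(|&&c| c > 0)
        .map(|&c| {
            let c = f64::from(c);
            c * (n / c).log2()
        })
        .sum()
}
```

A constant residual such as `[200; 8]` scores 448 under the L1 proxy but 0 bits under the entropy model, while `[1, 255, 1, 255, 1, 255, 1, 255]` scores only 8 under L1 but 8 bits under entropy. The two models can therefore rank candidates differently, which is why sweeping both and keeping the smaller stream can help.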

The shortest path — flat RGBA in, complete `.webp` file out:

```rust
use oxideav_webp::encode_webp_lossless;

let rgba: Vec<u8> = /* width*height*4 RGBA bytes */;
let webp_bytes: Vec<u8> = encode_webp_lossless(&rgba, width, height)?;
// Write to disk:
std::fs::write("out.webp", &webp_bytes)?;
```

### Encode lossless with metadata (ICC / EXIF / XMP)

```rust
use oxideav_webp::{encode_vp8l_argb_with_metadata, WebpMetadata};

// VP8L works in ARGB, one u32/pixel.
let argb: Vec<u32> = /* width*height ARGB pixels */;

let meta = WebpMetadata {
    icc:  Some(&my_icc_profile),
    exif: Some(&my_exif_blob),
    xmp:  Some(&my_xmp_xml),
};
let webp_bytes = encode_vp8l_argb_with_metadata(
    width, height, &argb, /* has_alpha = */ true, &meta,
)?;
```

If `has_alpha` is `true` or any metadata field is set, the output
auto-promotes to the extended `VP8X` layout; otherwise it's the
simple lossless layout.
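
If the pixels start out as flat RGBA bytes, they need packing into one `u32` per pixel first. The sketch below assumes the usual VP8L packing with alpha in bits 24..31, red in 16..23, green in 8..15, and blue in 0..7; `rgba_to_argb` and `any_transparency` are hypothetical helpers, not crate functions.

```rust
/// Packs tight RGBA bytes into ARGB `u32` pixels (alpha in the top byte).
/// Returns `None` if the input length is not a multiple of 4.
pub fn rgba_to_argb(rgba: &[u8]) -> Option<Vec<u32>> {
    if rgba.len() % 4 != 0 {
        return None;
    }
    Some(
        rgba.chunks_exact(4)
            .map(|p| {
                (u32::from(p[3]) << 24)       // A
                    | (u32::from(p[0]) << 16) // R
                    | (u32::from(p[1]) << 8)  // G
                    | u32::from(p[2])         // B
            })
            .collect(),
    )
}

/// True if any pixel is not fully opaque; one possible way to choose the
/// `has_alpha` argument.
pub fn any_transparency(argb: &[u32]) -> bool {
    argb.iter().any(|&p| (p >> 24) != 0xff)
}
```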

### Bare VP8L bitstream (no RIFF wrap)

For consumers that wrap the bitstream themselves:

```rust
use oxideav_webp::vp8l::encode_vp8l_argb;
let vp8l: Vec<u8> = encode_vp8l_argb(&argb, width, height)?;
```
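
For reference, wrapping such a bitstream in the simple single-`VP8L` RIFF layout could look roughly like this. The sketch assumes the bytes returned by `encode_vp8l_argb` are usable verbatim as the `VP8L` chunk payload (that is not confirmed here), and `wrap_vp8l_in_riff` is a hypothetical helper rather than a crate function.

```rust
use std::convert::TryFrom;

/// Wraps a VP8L chunk payload as `RIFF <size> WEBP VP8L <size> <payload>`,
/// adding the single zero pad byte RIFF requires after an odd-length chunk.
/// Returns `None` if the sizes do not fit the 32-bit RIFF fields.
pub fn wrap_vp8l_in_riff(vp8l: &[u8]) -> Option<Vec<u8>> {
    let chunk_size = u32::try_from(vp8l.len()).ok()?;
    let pad = chunk_size & 1;
    // RIFF size covers "WEBP" (4) + chunk header (8) + payload + pad.
    let riff_size = chunk_size.checked_add(12)?.checked_add(pad)?;

    let mut out = Vec::with_capacity(vp8l.len() + 21);
    out.extend_from_slice(b"RIFF");
    out.extend_from_slice(&riff_size.to_le_bytes());
    out.extend_from_slice(b"WEBP");
    out.extend_from_slice(b"VP8L");
    // The chunk size field records the unpadded payload length.
    out.extend_from_slice(&chunk_size.to_le_bytes());
    out.extend_from_slice(vp8l);
    if pad == 1 {
        out.push(0);
    }
    Some(out)
}
```

In most cases `encode_webp_lossless` is the simpler route; a manual wrap like this is only relevant when the container is assembled elsewhere.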

### Build an animated `.webp`

```rust
use oxideav_webp::{build_animated_webp, build_animated_webp_with_options,
                   AnimFrame, AnimEncoderOptions};

// Each AnimFrame is a tile (width × height RGBA) at (x, y) on the
// canvas, with a duration in milliseconds.
let frames = vec![
    AnimFrame::new(/* w */ 64, /* h */ 64, /* rgba */ frame0_rgba, /* duration_ms */ 100),
    AnimFrame::new(64, 64, frame1_rgba, 100),
    AnimFrame::new(64, 64, frame2_rgba, 100),
];

// Defaults: per-frame Auto mode (picks byte-smallest of Lossless / Delta).
let webp = build_animated_webp(&frames)?;

// Or with options (loop count, background colour, file-level metadata):
let opts = AnimEncoderOptions {
    loop_count: 0,                      // 0 = infinite
    background_rgba: [0xff, 0xff, 0xff, 0xff],
    ..Default::default()
};
let webp = build_animated_webp_with_options(&frames, &opts)?;
```

## With the OxideAV runtime (`registry` feature on)

```rust
use oxideav_core::RuntimeContext;
use oxideav_webp::{CODEC_ID_VP8, CODEC_ID_VP8L};   // "webp_vp8" / "webp_vp8l"

let mut ctx = RuntimeContext::new();
oxideav_webp::register(&mut ctx);
// ctx now exposes the "webp" container plus "webp_vp8" + "webp_vp8l" codecs.
```

This is the only way to reach the **VP8-lossy encoder** — it delegates
to the `oxideav-vp8` sibling crate's framework factory family:

```rust
use oxideav_webp::encoder_vp8::{make_encoder_with_quality, make_encoder_with_qindex};

// Returns Box<dyn oxideav_core::Encoder>; emits RIFF/WEBP-wrapped output.
let enc = make_encoder_with_quality(&params, 75.0)?;
let enc = make_encoder_with_qindex(&params, 32)?;
```

(Lossless encode + decode + animation + metadata extraction all work
without `registry`; only the VP8 *lossy* encode path needs it.)

## Clean-room sources

Implementation is derived entirely from the public format specs:

* **RFC 9649** — WebP Image Format
  (`docs/image/webp/rfc9649-webp.txt`, also `rfc9649-webp.pdf`).
* **WebP Lossless Bitstream Specification** — the LZ77 + prefix-coded
  literals + color cache + spatial / color / color-indexing transforms
  (also reproduced in RFC 9649 §3).
* **RFC 6386** — VP8 Data Format and Decoding Guide
  (`docs/video/vp8/rfc6386-vp8-bitstream.txt`) for the VP8 lossy
  framing routed through the `oxideav-vp8` sibling.

The 18-fixture corpus at `docs/image/webp/fixtures/` is consumed as
opaque byte streams; end-to-end fixture tests validate against the
ARGB pixels of each fixture's committed `expected.png`. No third-party
codec library source is consulted.

## License

MIT. See [`LICENSE`](./LICENSE).