oxideav-mjpeg
Pure-Rust JPEG / Motion-JPEG codec and still-image container. It
decodes baseline (SOF0), extended-sequential (SOF1 Huffman + SOF9
arithmetic), progressive (SOF2 Huffman + SOF10 arithmetic)
and lossless (SOF3 Huffman + SOF11 arithmetic) JPEGs; lossless decode
covers single-component grayscale and three-component RGB-class at any
precision P ∈ 2..=16, plus four-component CMYK-class at P = 8. It encodes
baseline, progressive and lossless JPEG (the lossless path covers
single-component grayscale and three-component interleaved RGB at every
precision P ∈ 2..=16, plus four-component CMYK at P = 8, with
every Annex H Table H.1 predictor). Supports YUV 4:4:4 / 4:2:2 / 4:2:0
chroma layouts and grayscale. Zero C dependencies.
Part of the oxideav framework but usable standalone.
Installation
[]
= "0.1"
= "0.1"
= "0.1"
= "0.1"
Quick use
A JPEG file is a single SOI..EOI byte stream, so the still-image container is a pass-through: open the file, pull one packet, decode. Motion-JPEG streams (inside AVI / MOV / AMV / etc.) reuse the same codec — each video packet is a full JPEG.
use ;
let mut ctx = new;
register;
let codecs = &ctx.codecs;
let containers = &ctx.containers;
let input: = Boxnew;
let mut dmx = containers.open?;
let stream = &dmx.streams;
let mut dec = codecs.make_decoder?;
let pkt = dmx.next_packet?;
dec.send_packet?;
if let Ok = dec.receive_frame
# Ok::
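Because every frame is one SOI..EOI byte stream, locating the end of a frame in a buffer of concatenated JPEGs comes down to walking marker segments rather than searching for raw FF D9 bytes. The following is a minimal sketch of such a walk, not the crate's demuxer code; frame_len is a hypothetical helper, and it assumes a well-formed stream.

```rust
/// Return the byte length of the JPEG frame that starts at `buf[0]`, i.e. the
/// offset just past its EOI. Hypothetical helper for illustration only.
fn frame_len(buf: &[u8]) -> Option<usize> {
    if buf.len() < 2 || buf[0] != 0xFF || buf[1] != 0xD8 {
        return None; // a frame must open with SOI
    }
    let mut pos = 2;
    loop {
        if *buf.get(pos)? != 0xFF {
            return None; // not positioned on a marker
        }
        let marker = *buf.get(pos + 1)?;
        match marker {
            0xD9 => return Some(pos + 2),   // EOI closes the frame
            0xFF => pos += 1,               // fill byte before a marker
            0x01 | 0xD0..=0xD7 => pos += 2, // TEM / RSTn carry no length field
            _ => {
                // Length is big-endian and counts its own two bytes, so an
                // APPn payload containing FF D8 / FF D9 is skipped whole.
                let len = u16::from_be_bytes([*buf.get(pos + 2)?, *buf.get(pos + 3)?]);
                pos += 2 + usize::from(len);
                if marker == 0xDA {
                    // Entropy-coded data after SOS: FF 00 is a stuffed data
                    // byte and FF D0..D7 are restart markers, so keep going.
                    loop {
                        let b = *buf.get(pos)?;
                        let next = *buf.get(pos + 1)?;
                        let in_scan =
                            b != 0xFF || next == 0x00 || (0xD0..=0xD7).contains(&next);
                        if !in_scan {
                            break;
                        }
                        pos += if b == 0xFF { 2 } else { 1 };
                    }
                }
            }
        }
    }
}
```

Calling frame_len repeatedly on the remaining slice splits a raw stream into per-frame packets; a truncated frame yields None instead of a false boundary.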
Encoder
use ;
let mut params = video;
params.width = Some;
params.height = Some;
params.pixel_format = Some;
let mut enc = codecs.make_encoder?;
enc.send_frame?;
let pkt = enc.receive_packet?;
The encoder accepts planar Yuv444P, Yuv422P, Yuv420P, Gray8, or packed
Rgb24 input and emits a standalone baseline JPEG per frame:
SOI, JFIF APP0, DQT, SOF0, DHT (Annex K typical tables), optional DRI,
SOS, entropy scan, EOI. Default quality factor is 75 on the Annex K
Q=50 base-table scaling (see oxideav_mjpeg::encoder::DEFAULT_QUALITY);
encoder::encode_jpeg(frame, quality) is also exposed for sibling
crates that wrap the same bitstream in custom containers. For
single-component Gray8 callers that already hold a flat row-major
byte buffer, encoder::encode_jpeg_grayscale(width, height, samples, stride, quality) is the direct entry point; the corresponding
encode_jpeg_grayscale_with_opts(..., restart_interval) and
encode_jpeg_grayscale_with_meta(..., restart_interval, meta) variants
add DRI + RSTn emission and APP/COM pass-through respectively. The
matching encoder::encode_jpeg_rgb24(width, height, samples, stride, quality) entry point + its _with_opts / _with_meta companions emit
a baseline RGB JPEG from a packed RGB triple buffer: three components
at IDs 'R' / 'G' / 'B', every component at H = V = 1, every
component bound to the single luma quantiser table, and an Adobe APP14
transform = 0 segment alongside the JFIF APP0 to flag the stream as
plain R/G/B. The decoder mirrors the convention — RGB JPEGs (signalled
by either the Adobe APP14 flag or the 'R'/'G'/'B' component-id
triple) round-trip as a single packed Rgb24 plane with no YCbCr
conversion.
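As a rough illustration of the quality scaling mentioned above, the sketch below applies the widely used IJG (libjpeg) quality formula to the first row of the Annex K luminance table. Whether this crate rounds and clamps in exactly this way is an assumption; scale_quant and scaled_row are hypothetical helpers, not part of the oxideav_mjpeg API.

```rust
/// First row of the T.81 Annex K (Table K.1) luminance quantiser table.
const K1_LUMA_ROW0: [u16; 8] = [16, 11, 10, 16, 24, 40, 51, 61];

/// Scale one base quantiser step by an IJG-style quality factor (1..=100).
/// Hypothetical helper for illustration only.
fn scale_quant(base: u16, quality: u8) -> u8 {
    let q = quality.clamp(1, 100) as u32;
    // Quality 50 maps to 100 % (the base table); lower quality scales the
    // steps up, higher quality scales them down.
    let scale = if q < 50 { 5000 / q } else { 200 - 2 * q };
    let v = (base as u32 * scale + 50) / 100;
    v.clamp(1, 255) as u8 // an 8-bit DQT entry must be at least 1
}

/// Scale the whole sample row at the given quality.
fn scaled_row(quality: u8) -> [u8; 8] {
    let mut out = [0u8; 8];
    for (dst, &base) in out.iter_mut().zip(K1_LUMA_ROW0.iter()) {
        *dst = scale_quant(base, quality);
    }
    out
}
```

Under this formula quality 50 reproduces the base table and quality 75 halves every step, which is why higher quality settings produce larger files.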
Restart markers (RSTn + DRI) are supported for interop and bitstream
resiliency. They are off by default — call
MjpegEncoder::set_restart_interval(n_mcus) (or use
encoder::encode_jpeg_with_opts(frame, quality, n_mcus)) to enable
them. A non-zero value writes a DRI segment before SOS and cycles
RST0..=RST7 every n_mcus MCUs in the scan, resetting DC
predictors at each marker. Passing 0 preserves the historical
no-restart behaviour.
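The byte-level effect of a restart interval can be sketched as follows. The marker codes (DRI = 0xFFDD, RST0..RST7 = 0xFFD0..0xFFD7) come from T.81; the helper names are hypothetical and the code is an illustration, not the encoder's implementation.

```rust
/// DRI segment: marker 0xFFDD, length 4, 16-bit big-endian restart interval.
fn dri_segment(interval_mcus: u16) -> [u8; 6] {
    let [hi, lo] = interval_mcus.to_be_bytes();
    [0xFF, 0xDD, 0x00, 0x04, hi, lo]
}

/// The n-th restart marker in a scan (n counts from 0): RST0..RST7, cycling.
fn rst_marker(n: usize) -> [u8; 2] {
    [0xFF, 0xD0 + (n % 8) as u8]
}

/// Number of RSTn markers in a scan of `total_mcus` MCUs: one after every
/// full interval, except that nothing follows the final MCU of the scan.
fn rst_count(total_mcus: usize, interval_mcus: usize) -> usize {
    if interval_mcus == 0 || total_mcus == 0 {
        return 0; // restarts disabled, or nothing to code
    }
    (total_mcus - 1) / interval_mcus
}
```

A decoder that loses sync inside one interval can resume at the next RSTn, because DC prediction restarts from zero there.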
Progressive (SOF2) encode
Toggle progressive emission via MjpegEncoder::set_progressive(true)
on the concrete encoder (construct it with
MjpegEncoder::from_params), or call
encoder::encode_jpeg_progressive(frame, quality) directly. The
output is a standalone progressive JPEG with this scan decomposition:
- Interleaved DC-first scan (Ss=0, Se=0, Ah=0, Al=0) covering every component.
- Per-component low-band AC scan (Ss=1, Se=5, Ah=0, Al=0): luma, Cb, Cr.
- Per-component high-band AC scan (Ss=6, Se=63, Ah=0, Al=0).
That's 1 + 3 + 3 = 7 SOS segments. Restart markers are not emitted
on the progressive path. Compressed size is typically ~10% larger than
the equivalent baseline encode due to the extra SOS/DHT overhead and
per-scan EOB handling (no EOBn runs).
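The scan decomposition above can be written down as data. The sketch below builds the same script for an arbitrary component count; ScanSpec and spectral_selection_script are hypothetical names used for illustration and do not mirror the crate's internal types.

```rust
/// One progressive scan header, reduced to the fields discussed above.
#[derive(Debug, Clone, PartialEq)]
struct ScanSpec {
    components: Vec<u8>, // indices of the components coded in this scan
    ss: u8,              // first coefficient of the spectral band
    se: u8,              // last coefficient of the spectral band
    ah: u8,              // successive-approximation high bit position
    al: u8,              // successive-approximation low bit position
}

/// Spectral-selection-only script: one interleaved DC scan, then a low-band
/// and a high-band AC scan per component.
fn spectral_selection_script(n_components: u8) -> Vec<ScanSpec> {
    let mut scans = vec![ScanSpec {
        components: (0..n_components).collect(),
        ss: 0,
        se: 0,
        ah: 0,
        al: 0,
    }];
    for (ss, se) in [(1u8, 5u8), (6, 63)] {
        for c in 0..n_components {
            // AC scans are never interleaved in progressive JPEG.
            scans.push(ScanSpec { components: vec![c], ss, se, ah: 0, al: 0 });
        }
    }
    scans
}
```

With three components this yields the 1 + 3 + 3 = 7 scans listed above; with one component it yields three scans.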
Progressive with Successive Approximation (SA)
For full T.81 §G.1 compliance call
encoder::encode_jpeg_progressive_sa(frame, quality). This 14-scan
decomposition uses a 1-bit point transform:
- Phase 1, initial scans (Al=1): one interleaved DC scan + 3 per-component AC low-band scans + 3 per-component AC high-band scans. Each coefficient is encoded as coef >> 1, dropping the LSB.
- Phase 2, refinement scans (Ah=1, Al=0): DC and AC correction scans send the dropped LSB to the decoder, with AC correction bits for pre-existing nonzeros interleaved inline during the decoder's zero-history walk (T.81 §G.1.2.3).
Output round-trips through any conformant SOF2 decoder with PSNR ≥ 40 dB relative to the equivalent spectral-selection-only encode.
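The two-phase split is easiest to see on a single DC value. The sketch below covers DC coefficients only, where the point transform is a plain arithmetic shift; AC coefficients additionally need sign and zero-history handling, which is not shown. split_dc and merge_dc are hypothetical helpers.

```rust
/// Split a DC coefficient for a 1-bit successive approximation: the first
/// scan carries `coef >> 1`, the refinement scan carries the dropped LSB.
fn split_dc(coef: i32) -> (i32, u8) {
    (coef >> 1, (coef & 1) as u8)
}

/// Decoder side: rebuild the full-precision value from both scans.
fn merge_dc(approx: i32, lsb: u8) -> i32 {
    (approx << 1) | i32::from(lsb)
}
```

Because the shift is arithmetic, the round trip is exact for negative values as well, so the refinement phase restores the spectral-selection-only result bit for bit on DC.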
Progressive (SOF2) single-component grayscale encode
For single-component (Gray8) input call
encoder::encode_jpeg_progressive_grayscale(width, height, samples, stride, quality) directly. The bitstream layout is SOI / JFIF APP0 / DQT (luma) / SOF2 (Nf = 1, H = V = 1, P = 8) / DHT (Annex K luma DC + AC) / SOS_DC (Ss=0, Se=0) / scan / SOS_AC_low (Ss=1, Se=5) / scan / SOS_AC_high (Ss=6, Se=63) / scan / EOI — three spectral-selection
scans, no successive approximation, no DRI / RSTn. The output
round-trips through any conformant SOF2 decoder as a single Gray8
plane (max-diff ≤ 4 LSBs at Q=100 on smooth content; PSNR ≥ 30 dB
at the default Q=75). The companion _with_meta variant
(encode_jpeg_progressive_grayscale_with_meta(..., meta)) replaces
the default JFIF APP0 with caller-supplied APP/COM segments harvested
via extract_app_segments.
The trait-API encoder routes Gray8 input + set_progressive(true)
to the same path:
let mut params = video;
params.width = Some;
params.height = Some;
params.pixel_format = Some;
let mut enc = from_params?;
enc.set_progressive;
enc.send_frame?;
set_lossless(true) continues to override set_progressive for
grayscale (the SOF3 lossless path wins), and set_restart_interval
is ignored on the progressive path — neither the 3-component nor the
1-component SOF2 encoder emits DRI / RSTn.
Lossless (SOF3) encode
For single-component grayscale input call
encoder::encode_lossless_jpeg_grayscale(width, height, samples, stride, precision, predictor) directly:
- precision must be in 2..=16. Samples for P ≤ 8 are one byte each (stride = bytes per row); for P > 8 they are 16-bit little-endian (stride = width * 2).
- predictor selects one of the Annex H Table H.1 spatial predictors 1..=7 (1 = Ra / left is the safest default; 4..7 are two-dimensional and can compress better on smooth images).
- Output is bit-exact: the decoder side recovers every input sample verbatim, including the special Di = 32768 half-modulus case (T.81 §H.1.2.2).
- Point transform is fixed at Pt = 0 and no restart markers are emitted by the default entry point; for non-zero Pt or DRI + RSTn emission call encode_lossless_jpeg_grayscale_with_opts(width, height, samples, stride, precision, predictor, restart_interval, point_transform). On each restart boundary the encoder byte-aligns the stream, writes RST0..=RST7 cycling modulo 8 per T.81 §F.1.1.5.2, and re-seeds the predictor history to the per-component origin 2^(P − Pt − 1) (§H.1.2.1). With Pt > 0 every input sample is right-shifted by Pt before prediction; the decoder side then left-shifts the reconstructed sample by the same Pt on output.
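For reference, the seven Table H.1 predictors and the modulo-2^16 difference arithmetic can be sketched in a few lines. This illustrates the T.81 Annex H definitions, not the crate's code; the function names are hypothetical, and origin assumes precision > pt.

```rust
/// Annex H Table H.1 predictors. `ra` = left, `rb` = above, `rc` = above-left.
fn predict(selector: u8, ra: i32, rb: i32, rc: i32) -> Option<i32> {
    Some(match selector {
        1 => ra,
        2 => rb,
        3 => rc,
        4 => ra + rb - rc,
        5 => ra + ((rb - rc) >> 1),
        6 => rb + ((ra - rc) >> 1),
        7 => (ra + rb) >> 1,
        _ => return None, // selectors outside 1..=7 are not spatial predictors
    })
}

/// Initial prediction at the scan origin and after a restart: 2^(P - Pt - 1).
fn origin(precision: u32, pt: u32) -> i32 {
    1 << (precision - pt - 1)
}

/// Prediction difference, kept modulo 2^16 as in §H.1.2.2.
fn difference(sample: i32, prediction: i32) -> u16 {
    (sample - prediction) as u16
}

/// Decoder side: add the difference back, again modulo 2^16.
fn reconstruct(prediction: i32, diff: u16) -> u16 {
    (prediction + i32::from(diff)) as u16
}
```

The half-modulus case shows up when a 16-bit sample of 0 is predicted as 32768: the difference wraps to 32768, and adding it back wraps to 0 again.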
The same path is available through the trait-API encoder:
let mut params = video;
params.width = Some;
params.height = Some;
params.pixel_format = Some;
let mut enc = from_params?;
enc.set_lossless;
enc.set_lossless_predictor;
enc.send_frame?;
Without set_lossless(true) the trait-API encoder rejects grayscale
input rather than silently downgrading the bitstream.
Lossless arithmetic (SOF11) grayscale encode
For single-component grayscale input the lossless arithmetic-coded counterpart of the SOF3 path is exposed directly:
use encode_lossless_arith_jpeg_grayscale;
// precision ∈ 2..=16, predictor ∈ 1..=7 (Annex H Table H.1).
let jpeg = encode_lossless_arith_jpeg_grayscale?;
# Ok::
The spatial model is identical to the Huffman lossless path (Annex H
predictors 1..=7 over Ra / Rb / Rc), but each prediction
difference is coded with the Q-coder arithmetic statistical model of
T.81 §H.1.2.3 (Table H.3) — the L_Context(Da, Db) / X1_Context(Db)
conditioning over neighbouring differences — instead of a Huffman
magnitude category. The bitstream is SOI / JFIF APP0 / SOF11 (Nf = 1, H = V = 1) / [DRI] / SOS (Ss = predictor, Al = Pt) / arith scan / EOI;
no DAC segment is emitted, so the SOF11 decoder applies the default
conditioning bounds (L, U) = (0, 1) per §H.1.2.3.3. Output is
bit-exact for every precision P ∈ 2..=16, every predictor, and the
half-modulus Di = 32768 corner case (§H.1.2.2). The
encode_lossless_arith_jpeg_grayscale_with_opts(..., restart_interval, point_transform) variant adds DRI + RSTn emission (each interval
flushes the Q-coder, byte-aligns, writes RST0..=RST7 cycling modulo 8,
and re-seeds the statistical model + difference history + predictor to
the scan-origin default 2^(P − Pt − 1), §H.1.1 / §H.1.2.3.4) and a
non-zero point transform Pt (the low Pt bits are discarded on both
sides). The decoder has supported SOF11 since round 0.1.x, so these
encode entry points round-trip end-to-end (tests/lossless_roundtrip.rs).
The three-component (RGB-class) counterpart
encode_lossless_arith_jpeg_rgb(width, height, [c0, c1, c2], strides, precision, predictor) (plus its _with_opts(..., restart_interval, point_transform) companion) emits a SOF11 (Nf = 3, every component H = V = 1) interleaved scan — the Q-coder counterpart of
encode_lossless_jpeg_rgb. Each component is modelled independently per
§H.1.2 (its own statistics area + L_Context(Da, Db) /
X1_Context(Db) difference history), and one residual per component is
emitted per pixel position in scan order into a single arithmetic-coded
segment. Bit-exact for every precision P ∈ 2..=16 (decode shape:
P = 8 → packed Rgb24, P ∈ {10, 12, 14} → planar Gbrp*Le, every
other P → packed Rgb48Le), every predictor, the half-modulus
Di = 32768 case, non-zero Pt, and per-interval restart re-seeding of
all three components.
The four-component (CMYK-class) counterpart
encode_lossless_arith_jpeg_cmyk(width, height, [c0, c1, c2, c3], strides, predictor, adobe_transform) (plus its _with_opts(..., restart_interval, point_transform) companion) emits a SOF11 (Nf = 4, every component H = V = 1) interleaved scan at P = 8 — the Q-coder counterpart of
encode_lossless_jpeg_cmyk. Each component is modelled independently per
§H.1.2 (its own statistics area + L_Context(Da, Db) / X1_Context(Db)
difference history), and one residual per component is emitted per pixel
position in scan order into a single arithmetic-coded segment. The
adobe_transform flag matches the Huffman CMYK helpers: None writes no
APP14 (plain "regular" CMYK), Some(0) selects Adobe CMYK and inverts
every sample on the wire before predictive coding, Some(2) selects Adobe
YCCK (interpret the packed input as [Y, Cb, Cr, K] and invert only K
before coding). Decode shape is the same packed PixelFormat::Cmyk the
SOF3 four-component path produces (4 bytes/pixel). Bit-exact for every
predictor, the half-modulus Di = 32768 case, non-zero Pt, and
per-interval restart re-seeding of all four components on the no-APP14 /
Adobe-CMYK paths (YCCK is a lossy interop convention by construction —
BT.601 YCbCr → RGB → CMY clamps; the K plane round-trips exactly).
Lossless (SOF3) RGB / three-component encode
For three-component (R, G, B, or any three independent monochrome
planes) lossless output call encoder::encode_lossless_jpeg_rgb(width, height, [r, g, b], strides, precision, predictor) directly:
- The three planes share one DC Huffman table (Td = 0) and one predictor selector. Each component is modeled independently per T.81 §H.1.2: neighbours come from the same plane only.
- Each component is declared H_i = V_i = 1, so the MCU at every pixel position is exactly one residual per component in scan order (component IDs 1, 2, 3). Output: a standalone SOF3 JPEG with one interleaved SOS scan.
- precision is the same 2..=16 range as the grayscale entry point. Decode output is shaped by precision:
  - P = 8 → packed Rgb24 (one plane, 3 bytes/pixel).
  - P = 10 → planar Gbrp10Le (3 planes, 16-bit LE storage).
  - P = 12 → planar Gbrp12Le.
  - P = 14 → planar Gbrp14Le.
  - any other P → packed Rgb48Le (one plane, 6 bytes/pixel; samples narrower than 16 bits sit in the low bits of each 16-bit word).
- The codec is colour-agnostic on both sides: callers pass planes in whatever channel order they want (R-G-B, G-B-R, etc.) to the encoder, and the decoder hands them back in the same SOS scan order, both for the 8-bit packed-Rgb24 path and the high-bit-depth planar paths. Callers that want the canonical G-B-R plane order of Gbrp*Le should pass G, B, R to the encoder in that order.
- For DRI + RSTn emission or a non-zero point transform call encode_lossless_jpeg_rgb_with_opts(width, height, [r, g, b], strides, precision, predictor, restart_interval, point_transform). Both options behave identically to the grayscale variant; restarts reset every component's predictor in lockstep, and Pt shifts every sample of every plane uniformly.
Lossless (SOF3) CMYK / four-component encode
For four-component (C, M, Y, K — or any four independent monochrome
planes) lossless output at 8-bit precision call
encoder::encode_lossless_jpeg_cmyk(width, height, [c, m, y, k], strides, predictor, adobe_transform) directly:
- Each component is modeled independently per T.81 §H.1.2; the four planes share one DC Huffman table and one predictor selector. Each component is declared H_i = V_i = 1, so the MCU at every pixel position is exactly one residual per component in scan order (component IDs 1, 2, 3, 4). Output: a standalone SOF3 JPEG with one interleaved SOS scan.
- precision is fixed at 8 bits: the workspace PixelFormat enum has no high-bit-depth CMYK variant, so the four-component lossless path is P = 8 only. Output: packed PixelFormat::Cmyk (one plane, 4 bytes/pixel in C, M, Y, K order).
- adobe_transform selects the APP14 colour-transform marker, identical to the lossy CMYK helpers: None writes no APP14 (plain "regular" CMYK), Some(0) selects Adobe CMYK and inverts every sample on the wire before predictive coding, Some(2) selects Adobe YCCK (interpret the packed input as [Y, Cb, Cr, K] and invert only K before coding). The decoder un-does both transforms on output, so a no-APP14 or Adobe-CMYK round-trip is bit-exact.
- For DRI + RSTn emission or a non-zero point transform call encode_lossless_jpeg_cmyk_with_opts(width, height, [c, m, y, k], strides, predictor, adobe_transform, restart_interval, point_transform). Both options behave identically to the grayscale and three-component variants; restarts reset every component's predictor in lockstep, and Pt shifts every sample of every plane uniformly.
4-component CMYK / YCCK encode
The 4-component (CMYK / Adobe YCCK) decode paths landed in earlier
rounds are now matched by a public encoder API. Both a baseline
(SOF0) and a progressive (SOF2) variant accept the same packed
[C, M, Y, K] interleaved buffer the decoder produces (4 bytes per
pixel, stride bytes per row), so round-tripping a decoded CMYK
frame back into a JPEG is a single call:
use ;
let jpeg = encode_jpeg_cmyk?;
let prog = encode_jpeg_cmyk_progressive?;
# Ok::
adobe_transform selects the Adobe APP14 colour-transform marker:
None writes no APP14 (plain "regular" CMYK), Some(0) selects
Adobe CMYK and inverts every sample on the wire, Some(2) selects
Adobe YCCK, interpreting the packed input as [Y, Cb, Cr, K] and
inverting only the K plane (the decoder un-does both transforms on
output). The two per-plane back-end entry points
encoder::encode_jpeg_cmyk_1111 / encode_jpeg_progressive_cmyk_1111
are also pub for callers that already hold four separate component
buffers.
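The per-pixel effect of the three adobe_transform settings can be sketched as below. It assumes that "inverts" means 255 − v on 8-bit samples, which is the usual Adobe convention but is not spelled out here; invert_for_wire is a hypothetical helper, and the YCCK colour conversion itself is not shown.

```rust
/// Apply the on-the-wire inversion to one packed 4-byte pixel.
/// `None` = no APP14 (samples untouched), `Some(0)` = Adobe CMYK (all four
/// samples inverted), `Some(2)` = Adobe YCCK (only the K sample inverted).
fn invert_for_wire(pixel: [u8; 4], adobe_transform: Option<u8>) -> Option<[u8; 4]> {
    let [a, b, c, k] = pixel;
    match adobe_transform {
        None => Some(pixel),
        Some(0) => Some([255 - a, 255 - b, 255 - c, 255 - k]),
        Some(2) => Some([a, b, c, 255 - k]),
        Some(_) => None, // other transform codes are not described in the text
    }
}
```

The inversion is its own inverse, so a decoder that applies the same step again recovers the original samples.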
The trait-API encoder accepts CMYK input as well:
let mut params = video;
params.width = Some;
params.height = Some;
params.pixel_format = Some;
let mut enc = from_params?;
enc.set_adobe_transform?; // None / Some(0) / Some(2)
enc.set_progressive; // optional — SOF2 instead of SOF0
enc.send_frame?;
The plane stride must be at least width * 4; shorter strides are
rejected with Error::InvalidData.
Metadata pass-through
All encoder entry points have *_with_meta variants that accept a
meta: &[u8] byte slice of pre-serialised APP/COM segments to embed
immediately after SOI (replacing the default JFIF APP0). Use
encoder::extract_app_segments(jpeg) to harvest APP0-APP15 and COM
segments from an existing JPEG for pass-through to the re-encoded
output.
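To make the harvested data concrete, here is a minimal sketch of collecting APPn and COM segments from the header prefix of a JPEG. It is not the implementation of extract_app_segments, and the real return type may differ; collect_app_segments is a hypothetical stand-in that ignores fill bytes and stops at the first SOS.

```rust
/// Collect APP0..APP15 (0xFFE0..=0xFFEF) and COM (0xFFFE) segments that
/// precede the first SOS, as raw marker + length + payload bytes.
fn collect_app_segments(jpeg: &[u8]) -> Option<Vec<u8>> {
    if jpeg.len() < 2 || jpeg[0] != 0xFF || jpeg[1] != 0xD8 {
        return None; // missing SOI
    }
    let mut out = Vec::new();
    let mut pos = 2;
    while pos + 4 <= jpeg.len() {
        if jpeg[pos] != 0xFF {
            return None; // not positioned on a marker
        }
        let marker = jpeg[pos + 1];
        if marker == 0xDA || marker == 0xD9 {
            break; // SOS or EOI: the header prefix is over
        }
        // Segment length is big-endian and counts its own two bytes.
        let len = u16::from_be_bytes([jpeg[pos + 2], jpeg[pos + 3]]) as usize;
        let end = pos + 2 + len;
        if len < 2 || end > jpeg.len() {
            return None; // truncated or malformed segment
        }
        if (0xE0..=0xEF).contains(&marker) || marker == 0xFE {
            out.extend_from_slice(&jpeg[pos..end]);
        }
        pos = end;
    }
    Some(out)
}
```

The resulting byte string is already in wire format, which is what allows it to be written straight after SOI in a re-encoded file.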
RTP/JPEG depacketization (RFC 2435)
Motion-JPEG carried over RTP omits the JPEG frame and scan headers from
the wire (abbreviated table-specification format) and fragments the
entropy-coded scan across packets. rtp::JpegDepacketizer reassembles
those fragments and reconstructs the absent SOI / DQT / SOF0 / DHT /
[DRI] / SOS / EOI marker segments into a complete JPEG interchange
stream the decoder consumes directly.
use ;
let mut dp = new;
// `payload` = one RTP packet body with the 12-byte RTP fixed header
// already stripped; `marker` = the RTP marker bit (set on the last
// fragment of a frame).
# let payload: & = &;
# let marker = false;
match dp.push?
# Ok::
Coverage:
- Well-known fixed type mappings 0/64 (4:2:2-class, H=2 V=1 luma) and 1/65 (4:2:0-class, H=2 V=2 luma), three-component YUV interleaved scan (§4.1).
- Quantization tables recovered from the Q field via the Independent JPEG Group scale formula over Annex K.1 / K.2 for Q ∈ 1..=99 (§4.2), or read in-band from the §3.1.8 Quantization Table header for Q ∈ 128..=255 (8-bit, plus 16-bit saturated to the emitted 8-bit DQT).
- Cross-frame in-band table caching (§4.2): a static Q ∈ 128..=254 may carry its tables once and omit them (Length = 0) on later frames; the depacketizer caches them per Q value and reuses the cached pair, so a multi-frame static-Q stream keeps decoding. Q = 255 is dynamic and never cached (tables reload every frame). reset() keeps the cache; new() starts fresh.
- Types 64..=127 consume the §3.1.7 Restart Marker header and emit a DRI segment with the carried interval.
- Fragment reassembly keyed on the §3.1.2 Fragment Offset, so misordered intra-frame delivery is tolerated as long as the marker-bit fragment arrives.
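The fields the coverage list refers to all live in the 8-byte RFC 2435 main header. The sketch below decodes that header from a payload; the layout follows RFC 2435 §3.1, while MainHeader and the function names are hypothetical and unrelated to the crate's own rtp module.

```rust
/// RFC 2435 §3.1 main JPEG header (8 bytes at the start of each payload).
#[derive(Debug, PartialEq)]
struct MainHeader {
    type_specific: u8,
    fragment_offset: u32, // 24-bit byte offset into the frame's scan data
    jpeg_type: u8,
    q: u8,
    width_px: u32,  // carried on the wire in 8-pixel units
    height_px: u32, // carried on the wire in 8-pixel units
}

fn parse_header_sketch(payload: &[u8]) -> Option<MainHeader> {
    if payload.len() < 8 {
        return None; // too short to hold the main header
    }
    Some(MainHeader {
        type_specific: payload[0],
        fragment_offset: u32::from_be_bytes([0, payload[1], payload[2], payload[3]]),
        jpeg_type: payload[4],
        q: payload[5],
        width_px: u32::from(payload[6]) * 8,
        height_px: u32::from(payload[7]) * 8,
    })
}

/// Types 64..=127 carry the §3.1.7 Restart Marker header after the main header.
fn has_restart_header(jpeg_type: u8) -> bool {
    (64..=127).contains(&jpeg_type)
}

/// Q values 128..=255 carry a §3.1.8 Quantization Table header, but only on
/// the first fragment (offset 0) of a frame.
fn has_qtable_header(h: &MainHeader) -> bool {
    h.q >= 128 && h.fragment_offset == 0
}
```

Keying reassembly on fragment_offset rather than arrival order is what lets out-of-order fragments land in the right place.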
RTP/JPEG packetization (RFC 2435)
rtp::packetize(jpeg, max_payload, qmode) is the encode-side inverse:
it parses a complete baseline JPEG, strips the frame/scan headers, and
emits a Vec<rtp::JpegPacket> of RTP/JPEG payloads ready to drop after
the RTP fixed header.
use ;
# let jpeg: & = &;
// `jpeg` = a complete baseline SOF0/SOF1 4:2:2 or 4:2:0 YUV stream.
let packets = packetize?;
for pkt in &packets
#
# Ok::
Coverage:
- Luma sampling 2x1 → type 0 (4:2:2), 2x2 → type 1 (4:2:0); chroma must be 1x1 (the well-known §4.1 layout).
- A source DRI promotes the type to 64/65 and writes the §3.1.7 Restart Marker header. By default the chunks span arbitrary byte boundaries and the header signals whole-frame reassembly (F = L = 1, Restart Count 0x3FFF); pass PacketizeOpts::new(qmode).with_restart_align(true) to packetize_with_opts to split the scan on restart-interval boundaries instead: each emitted fragment then carries one or more complete intervals, sets F = L = 1, and reports its first interval's index in the 14-bit Restart Count (wrapping modulo 0x3FFF).
- QMode::Quality(1..=99) carries an IJG-quality Q value (receiver regenerates the Annex K tables); QMode::InBand(128..=255) carries the JPEG's own two DQT tables in a §3.1.8 Quantization Table header on the first fragment.
- The scan is fragmented at max_payload (header bytes counted); the first fragment has offset 0, the last has JpegPacket::marker == true.
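The fragmentation rule in the last item can be sketched as a simple plan over byte offsets. This is an illustration under a simplifying assumption: every packet carries the same header_len, whereas a real first fragment may also hold the restart and quantization-table headers. fragment_plan is a hypothetical helper.

```rust
/// Split `scan_len` bytes of entropy-coded data so that header + chunk never
/// exceeds `max_payload`. Returns (fragment_offset, chunk_len, marker_bit)
/// per packet, or None when the header alone fills the payload budget.
fn fragment_plan(
    scan_len: usize,
    max_payload: usize,
    header_len: usize,
) -> Option<Vec<(usize, usize, bool)>> {
    let chunk = max_payload.checked_sub(header_len).filter(|&c| c > 0)?;
    let mut plan = Vec::new();
    let mut offset = 0;
    while offset < scan_len {
        let len = chunk.min(scan_len - offset);
        let last = offset + len == scan_len; // marker bit closes the frame
        plan.push((offset, len, last));
        offset += len;
    }
    Some(plan)
}
```

For a 2500-byte scan, a 1008-byte payload limit and an 8-byte header, the plan is three fragments at offsets 0, 1000 and 2000, with the marker bit set only on the last.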
Lacks: RTP transport framing itself (the 12-byte RTP fixed header,
sequence numbering, 90 kHz timestamping stay the caller's job),
packetization of progressive / lossless / grayscale / CMYK JPEGs (no
well-known RTP/JPEG type — Unsupported), out-of-band table negotiation
via a session-setup protocol on depacketize (a static Q ≥ 128 frame
whose tables were never sent in-band, nor cached from an earlier frame,
→ Unsupported), and the dynamic non-well-known types 128..=255.
Codec / container IDs
- Codec: "mjpeg". Decoder output / encoder input pixel formats: Yuv444P, Yuv422P, Yuv420P, plus Gray8 on the decode side.
- Container: "jpeg", matches .jpg / .jpeg / .jpe / .jfif by extension and by FF D8 FF magic bytes. One frame per file; muxing is a pass-through of the codec packet.
- Container: "mjpeg-raw", matches .mjpeg / .mjpg by extension. Raw concatenated SOI..EOI JPEG frames, one packet per frame. Default time base is 1/25 so frame i carries pts = i; the demuxer implements seek_to(stream, pts) with a marker-aware scanner (no SOI false-positives from APP1 thumbnails / stuffed entropy bytes) and a lazy (pts, byte_offset) waypoint index (one entry every 5 frames).
Decode-free inspector
oxideav_mjpeg::inspect_jpeg(bytes) -> Result<JpegInfo> walks the
marker prefix of a JPEG buffer up to the first SOS (T.81 §B.1) and
returns a typed summary — SofKind (Baseline / ExtendedSequential /
Progressive / Lossless / ExtendedSequentialArith / ProgressiveArith /
LosslessArith / HierarchicalDct / HierarchicalArith), precision,
width, height, per-component sampling / quant-table descriptors,
a ChromaSubsampling discriminator (4:4:4 / 4:2:2 / 4:2:0 / 4:1:1 /
GrayscaleOnly / Custom), a ColorHint from JFIF (T.871) and Adobe
APP14 (T.872 §6.5.3) tags, the restart_interval from a DRI
segment if present, and — when the leading APP0 is a structurally
valid JFIF segment per T.871 §10.1 — an optional JfifApp0 typed
view (version_major/_minor, units: JfifUnits ∈
{AspectRatio, DotsPerInch, DotsPerCm}, h_density/v_density,
thumbnail_width/_height, plus has_thumbnail(),
thumbnail_payload_len(), h_density_dpi() / v_density_dpi()
unit-normalised accessors and pixel_aspect_ratio() for the
units = 0 case). When a JFIF extension APP0 segment (identifier
"JFXX\0", T.871 §10.2) follows the JFIF APP0 — the segment most
writers use to carry the thumbnail — an optional JfxxApp0 typed
view on JpegInfo::jfxx reports the thumbnail-storage variant via a
JfxxThumbnail enum exhaustive over the three defined
extension_code bytes: JpegEncoded { jpeg_len } (0x10, §10.3 —
embedded baseline JPEG), PaletteRgb { width, height } (0x11,
§10.4 — 768-byte palette + indices), and Rgb24 { width, height }
(0x13, §10.5 — packed 24-bit RGB). It carries no colour-convention
signal, so it leaves color_hint untouched. When an APP14 Adobe segment is present and
structurally valid per T.872 §6.5.3, an optional AdobeApp14
typed view is also exposed (dct_encode_version, flags_0,
flags_1, transform: AdobeColorTransform ∈ {Unknown,
YCbCr, Ycck}, plus is_standard_version() and
as_color_hint() projections). When one or more APP2 segments
carrying the "ICC_PROFILE\0" signature (T.872 / Annex L of
T.871) appear, an IccProfileChunks summary on
JpegInfo::icc_profile reports the declared chunk total, the
cumulative total_payload_len, the per-segment
(seq_no, payload_len) ordering, and an is_complete()
predicate that returns true when the sequence numbers cover
1..=total exactly once. No entropy decoding, no DCT, no
allocation proportional to the scan body — O(prefix-length).
Useful for pipeline triage (pick a target pixel format),
fallback-decoder routing without spinning up the full decode
path, DPI-aware thumbnail sizing, and corpus summarisation. The
SofKind exposes is_supported_by_decoder(), is_dct(), and
is_arithmetic() helpers so callers can negotiate without
matching on every variant by hand. Standalone
parse_jfif_app0(payload) -> Result<JfifApp0>,
parse_jfxx_app0(payload) -> Result<JfxxApp0>,
parse_adobe_app14(payload) -> Result<AdobeApp14>, and
parse_icc_profile_app2(payload) -> Result<IccProfileApp2Chunk<'_>>
validators are also re-exported for callers that already have the
APP0 / APP14 / APP2 payload bytes in hand. Standalone surface —
the inspector requires neither the registry feature nor an
oxideav-core dep.
use ;
let info = inspect_jpeg?;
println!;
if !info.sof_kind.is_supported_by_decoder
# Ok::
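The is_complete() rule for ICC chunks described above (sequence numbers cover 1..=total exactly once) is simple enough to sketch on its own. icc_chunks_complete is a hypothetical stand-in, not the crate's method, and treating total = 0 as incomplete is this sketch's own choice.

```rust
/// True when the chunk sequence numbers cover 1..=total exactly once.
fn icc_chunks_complete(total: u8, seq_nos: &[u8]) -> bool {
    if total == 0 || seq_nos.len() != usize::from(total) {
        return false; // nothing declared, or wrong number of chunks
    }
    let mut seen = [false; 256];
    for &n in seq_nos {
        if n == 0 || n > total || seen[usize::from(n)] {
            return false; // out of range or duplicated
        }
        seen[usize::from(n)] = true;
    }
    true
}
```

Arrival order does not matter for completeness; a caller reassembling the profile would sort the chunks by sequence number afterwards.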
Format coverage
Decoder:
- SOF0 (baseline sequential, Huffman, 8-bit).
- SOF1 (extended sequential, Huffman, 8-bit): same scan structure as SOF0 at 8-bit, so the same code path handles it.
- SOF2 (progressive, Huffman): multi-scan spectral selection and successive approximation (DC first + refinement, AC first + refinement with EOB-run). Accepts both P = 8 and P = 12; the scan path is precision-agnostic (i32 coefficient planes) and the EOI render dispatcher routes P = 12 to the same Gray12Le / Yuv444P12Le / Yuv422P12Le / Yuv420P12Le shape as the sequential 12-bit path below. 4-component CMYK / YCCK is supported at P = 8 and produces the same packed Cmyk output the sequential path emits (Adobe APP14 transform flag honoured).
- SOF10 (progressive, arithmetic): the SOF2 scan structure with the Annex D Q-coder as the entropy layer per T.81 §G.1.3. DC first scans reuse the §F.1.4.1 sequential model on the point-transformed values; DC refinement bits use the fixed 0.5 estimate; AC first scans run the §F.1.4 procedure per band (Kmin = Ss, EOB = end-of-band, DAC Kx honoured); AC refinement scans follow the §G.1.3.3 model (Figures G.10 / G.11, Table G.2: 189 statistics bins, end-of-band decision bypassed below the prior scan's EOBx). Restart intervals re-seed coder + statistics + DC prediction. P = 8 and P = 12, 4-component CMYK / YCCK at P = 8; same output shaping as SOF2 via the shared coefficient accumulator.
- Non-interleaved sequential scans (SOF0/SOF1 with one SOS per component): transparently routed through the shared coefficient accumulator.
- 12-bit precision sequential JPEGs (SOF0/SOF1, P=12) → 16-bit-LE Gray12Le for grayscale and Yuv444P12Le / Yuv422P12Le / Yuv420P12Le for three-component YUV at 4:4:4 / 4:2:2 / 4:2:0 chroma sampling. Level shift is 2048 as per the spec.
- Lossless JPEG (SOF3): single-component grayscale at any precision P ∈ 2..=16. Annex H predictor reconstruction (bit-exact). Output: Gray8 at P=8, Gray10Le / Gray12Le at P=10/12, else Gray16Le. Point transform (Pt = Al) honoured.
- Lossless JPEG (SOF3) three-component: every precision P ∈ 2..=16, interleaved scan with each component declared H_i = V_i = 1 (the natural RGB-class layout). Independent per-component predictor buffers per Annex H §H.1.2. Output is precision-shaped: packed Rgb24 at P = 8, planar Gbrp10Le / Gbrp12Le / Gbrp14Le at P = 10/12/14, packed Rgb48Le for every other precision in the valid range (the low P bits carry the post-Pt-shift sample, top bits zero; same widen policy the grayscale path uses to land P = 14 in Gray16Le).
- Lossless JPEG (SOF3) four-component: P = 8 only (the workspace PixelFormat enum has no high-bit-depth CMYK variant), interleaved scan with each component declared H_i = V_i = 1. Independent per-component predictor buffers per Annex H §H.1.2. Output: packed PixelFormat::Cmyk (4 bytes/pixel). Adobe APP14 colour-transform flag honoured identically to the lossy CMYK paths (no APP14 → plain "regular" CMYK, transform=0 → Adobe CMYK un-inverted on output, transform=2 → YCCK converted back to CMYK via BT.601).
- Lossless arithmetic JPEG (SOF11): the same Annex H coding model (grayscale P ∈ 2..=16, three-component RGB-class P ∈ 2..=16, four-component CMYK-class P = 8, all predictors, point transform, restart intervals) with the modulo-2^16 prediction differences entropy-coded by the Annex D Q-coder under the T.81 §H.1.2.3 two-dimensional statistical model: each binary decision is conditioned on the classifications of the differences coded for the sample to the left and the sample in the line above (the 5 × 5 L_Context(Da, Db) array of Figure H.2, 158 statistics bins per component per Table H.3), with the DAC marker's DC-conditioning (L, U) bounds honoured (defaults (0, 1) per §H.1.2.3.3). The first line of the scan and of each restart interval uses the 1-D horizontal predictor per §H.1.2.1. Output shaping is shared with the SOF3 path (bit-exact reconstruction, same precision-driven pixel-format policy).
- CMYK / YCCK 4-component JPEGs → packed PixelFormat::Cmyk. Adobe APP14 transform flag honoured: transform=0 (Adobe CMYK, stored inverted) un-inverts on decode; transform=2 (YCCK) converts back to CMYK via BT.601 YCbCr→RGB→CMY and K inversion; no APP14 → plain ("regular", C=0 = no ink) pass-through.
- Chroma subsampling: 4:4:4, 4:2:2, 4:2:0.
- Grayscale (single-component → Gray8).
- Baseline RGB (3-component SOF0 at H = V = 1, signalled by either an Adobe APP14 transform = 0 segment or component IDs 'R' / 'G' / 'B' in the SOF) → packed PixelFormat::Rgb24 (single plane, stride = width * 3). The encoder's matching encode_jpeg_rgb24_* entry points emit both signals (APP14 + component-id triple) by default; the decoder accepts either, so a caller-supplied APP-segment override that drops the APP14 still round-trips.
- Restart markers (RSTn) + DRI.
- DNL (Define Number of Lines, T.81 §B.2.5): when the SOF frame header codes the number of lines Y = 0, the real line count is recovered from the mandatory DNL segment (0xFFDC) that immediately follows the first scan, and the frame is decoded at that height. The Y = 0 case without a following DNL (the segment is mandatory there per §B.2.5), and a malformed NL = 0 DNL, are both rejected. Applies to every scan-decomposition path (baseline fast path, the sequential / progressive / arithmetic accumulator paths, and lossless).
- RTP/JPEG (RFC 2435) depacketization via rtp::JpegDepacketizer: reassembles fragmented RTP/JPEG payloads and reconstructs the absent frame/scan headers (from the Q field or an in-band quantization-table header) into a complete JPEG the decoder consumes. The encode-side inverse rtp::packetize fragments a baseline JPEG into RTP/JPEG payloads. See the RTP/JPEG sections above.
- APP0..APP15 segments skipped cleanly (EXIF/XMP/ICC preserved at the container level, not parsed).
- Trailing garbage past EOI is stripped by the demuxer.
Encoder:
- SOF0 (baseline sequential) — 8-bit Huffman, Annex K tables.
3-component YUV at 4:4:4 / 4:2:2 / 4:2:0, single-component
Gray8(H = V = 1, one DQT + DC/AC luma Huffman pair + one-entry SOS), 3-component packedRgb24atH = V = 1(component IDs'R'/'G'/'B', single DQT + DC/AC luma Huffman pair, Adobe APP14transform = 0emitted alongside JFIF APP0), plus 4-component CMYK / YCCK atH_i = V_i = 1with the Adobe APP14 colour-transform flag configurable via the dedicated public CMYK entry points (and the trait API'sset_adobe_transform). - SOF2 (progressive) — spectral-selection decomposition (default:
7 SOS scans,
Ah=0, Al=0) for 3-component YUV, and a 3-scan variant (DC + AC-low +
AC-high, Ss/Se ∈ {(0,0), (1,5), (6,63)}) for single-component Gray8.
The CMYK / YCCK variant uses a 9-segment spectral-selection scan
decomposition over four components. Full successive-approximation
decomposition (14 SOS scans, 1-bit point transform) is available on
the YUV path via encode_jpeg_progressive_sa. See above.
- SOF3 (lossless): single-component grayscale at any precision
  P ∈ 2..=16, three-component interleaved (RGB-class) at any precision
  P ∈ 2..=16, and four-component interleaved (CMYK-class) at P = 8,
  all with H_i = V_i = 1 per component and every Annex H Table H.1
  predictor 1..=7. Bit-exact roundtrip on the grayscale, RGB and
  no-APP14 / Adobe-CMYK four-component paths (YCCK is a lossy interop
  convention by construction, because the BT.601 YCbCr → RGB → CMY
  conversion clamps), including the SSSS=16 / Di=32768 half-modulus
  case. Optional DRI + RSTn emission and non-zero point transform via
  encode_lossless_jpeg_grayscale_with_opts /
  encode_lossless_jpeg_rgb_with_opts /
  encode_lossless_jpeg_cmyk_with_opts. Restart boundaries re-seed every
  component's predictor to 2^(P − Pt − 1) per T.81 §H.1.2.1.
- SOF11 (lossless, arithmetic-coded): single-component grayscale
  and three-component interleaved (RGB-class) at any precision
  P ∈ 2..=16, plus four-component interleaved (CMYK-class) at P = 8,
  with every Annex H Table H.1 predictor 1..=7. The Q-coder counterpart
  of the SOF3 path: the modulo-2^16 prediction differences are
  entropy-coded by the Annex D arithmetic coder under the §H.1.2.3
  two-dimensional statistical model (each component modelled
  independently, no DAC segment → default conditioning (L, U) = (0, 1)).
  Bit-exact for every predictor, the half-modulus Di = 32768 case,
  non-zero point transform, and per-interval restart re-seeding via
  encode_lossless_arith_jpeg_grayscale_with_opts /
  encode_lossless_arith_jpeg_rgb_with_opts /
  encode_lossless_arith_jpeg_cmyk_with_opts. The four-component path
  honours the Adobe APP14 colour-transform flag identically to the
  Huffman SOF3 CMYK encoder (no-APP14 / Adobe-CMYK round-trips are
  bit-exact; YCCK is a lossy interop convention).
- 4:4:4 / 4:2:2 / 4:2:0 YUV input on the lossy paths, plus
  single-component Gray8 and packed Rgb24 on the baseline SOF0 path;
  Gray8 / Gray10Le / Gray12Le / Gray16Le input on the lossless path.
- Optional DRI + RSTn emission on the baseline path (off by default;
  see the Encoder section above).
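The lossless bullets above lean on three pieces of T.81 Annex H arithmetic: the seven Table H.1 predictors, the 2^(P − Pt − 1) restart seed, and the modulo-2^16 difference. The following is a minimal standalone sketch of that arithmetic, not the crate's own code. The function names `predict`, `restart_seed` and `diff_mod_2_16` are hypothetical and do not appear in the oxideav-mjpeg API.

```rust
/// Annex H Table H.1 predictors, selection values 1..=7.
/// `ra` = left, `rb` = above, `rc` = above-left reconstructed neighbour.
/// Returns None for a selection value outside 1..=7.
/// (Hypothetical helper, for illustration only.)
fn predict(sel: u8, ra: i32, rb: i32, rc: i32) -> Option<i32> {
    Some(match sel {
        1 => ra,
        2 => rb,
        3 => rc,
        4 => ra + rb - rc,
        // The halving steps use an arithmetic right shift.
        5 => ra + ((rb - rc) >> 1),
        6 => rb + ((ra - rc) >> 1),
        7 => (ra + rb) >> 1,
        _ => return None,
    })
}

/// Predictor seed used at the start of a scan and after each restart
/// marker: 2^(P - Pt - 1) for precision `p` and point transform `pt`.
fn restart_seed(p: u32, pt: u32) -> Option<i32> {
    if !(2..=16).contains(&p) || pt >= p {
        return None;
    }
    Some(1i32 << (p - pt - 1))
}

/// Prediction difference reduced modulo 2^16, as a value in 0..=65535.
/// The value 32768 is the half-modulus case (SSSS = 16).
fn diff_mod_2_16(sample: i32, prediction: i32) -> u16 {
    (sample.wrapping_sub(prediction) & 0xFFFF) as u16
}
```

For example, with neighbours Ra = 10, Rb = 20, Rc = 5, predictor 4 yields 25, and a 16-bit scan with no point transform re-seeds to 32768.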
Not supported (decoder returns Error::Unsupported):
- Hierarchical (SOF5..SOF7, SOF13..SOF15). The arithmetic-coded
  non-hierarchical variants are all supported: SOF9 (extended
  sequential) at P=8, SOF10 (progressive) at P=8 / P=12, and SOF11
  (lossless) at every Annex H precision (see the decoder coverage list
  above).
- 12-bit 4-component progressive (SOF2 / SOF10 with Nf = 4, P = 12):
  the workspace PixelFormat enum has no 12-bit CMYK variant. P = 8
  4-component CMYK / YCCK is supported on the sequential (SOF0 / SOF1)
  and progressive (SOF2 / SOF10) scan decompositions, with the Adobe
  APP14 transform flag honoured.
- 4-component lossless above P = 8 (the workspace PixelFormat enum has
  no high-bit-depth CMYK variant, so wider precisions are rejected with
  Unsupported). P = 8 4-component lossless is supported on both encode
  and decode with the Adobe APP14 transform flag honoured on output.
- Lossless with non-unit sampling factors (the spec permits this
  but no real-world corpus exercises it; rejected with Unsupported).
Fuzzing
The fuzz/ sub-crate runs eight cargo-fuzz harnesses against the
public encoder + decoder + RTP surface, executed daily by the
org-wide reusable fuzz workflow:
- decode: feeds arbitrary bytes (≤ 64 KiB) through the public Decoder
  trait (make_decoder → send_packet → receive_frame). Contract: never
  panic. Covers the SOF / SOS validators (Tdj / Taj / Tq selectors,
  Nf / Ns bounds, Hi / Vi factors), the multi-SOF rejection, the
  Wt × Ht × Nf ≤ 64 Mpx pixel-budget cap, the BitReader::get_bits(n)
  guards (n == 0 short-circuit, n > 24 rejection), and the Pq = 1
  (16-bit quantiser) × coefficient dequantise multiplication (now in
  f32 to avoid i32 overflow). Last local 60 s baseline: 25 694 runs,
  0 crashes (cov 2023 / ft 7670).
- arith_decode: wraps fuzz-supplied bytes (≤ 16 KiB) in a minimal SOF9
  (extended-sequential arithmetic-coded) JPEG envelope and pushes the
  result through the same Decoder trait. A control nibble drives
  component count (1 vs 3), optional DAC conditioning, optional DRI
  (restart interval = 1 MCU), the luma sampling factor (4:4:4 vs
  4:2:2), and the image dimension (8..=64 px square), so the
  src/jpeg/arith.rs Q-coder (ArithDecoder::new / Initdec / Renorm_d /
  Byte_in / decode_dc_diff / decode_ac / decode_magnitude) and the
  decode_arith_scan per-component statistics + restart-interval
  bookkeeping execute on every iteration. Contract: never panic; see
  fuzz_targets/arith_decode.rs for the enumerated panic surfaces
  (per-component bin indexing in DcStats::bins[0..49] /
  AcStats::bins[0..245], the category > 15 magnitude guard, the
  decode_ac k > se bound, and the restart-mid-scan Err path).
- rtp_depacketize: feeds arbitrary bytes (≤ 16 KiB) through the
  RFC 2435 RTP/JPEG depacketizer (rtp::parse_main_header,
  rtp::parse_restart_header, rtp::JpegDepacketizer::push), splitting
  the input into up to 8 synthetic packets per iteration so the §3.1.2
  24-bit fragment-offset reassembly buffer, the §3.1.7 Restart Marker
  header, the §3.1.8 in-band Quantization Table header, the §4.2
  static-Q cache, the marker-bit close path, and the reset()
  cache-retention invariant all run on every iteration. Contract: never
  panic. Assembled frames are asserted SOI..EOI; interior correctness
  is owned by the unit tests in src/rtp.rs.
- rtp_packetize: feeds arbitrary bytes (≤ 16 KiB) through the RFC 2435
  RTP/JPEG packetizer (rtp::packetize). The packetizer walks a complete
  external JPEG byte stream and indexes into it by big-endian segment
  lengths; the harness exercises SOF / DQT / DRI / SOS / catch-all
  length-field bounds checks, the QMode::Quality(1..=99) and
  QMode::InBand(128..=255) validation branches, and a range of
  max_payload MTU buckets (16 / 256 / 1400 / 8192). Contract: never
  panic. Successful returns are shape-checked (first fragment offset 0,
  last fragment marker bit set, no payload exceeds max_payload).
  Round-trip correctness is owned by the unit tests in src/rtp.rs. Last
  local 15 s baseline: 21 819 067 runs, 0 crashes (debug build, no
  instrumentation; daily CI runs the release-instrumented binary).
- jpeg_self_roundtrip / jpeg_progressive_self_roundtrip: oxideav-mjpeg
  encode → oxideav-mjpeg decode round-trip with ±2 LSB YUV tolerance.
- libjpeg_encode_oxideav_decode / oxideav_encode_libjpeg_decode:
  cross-decode against system libturbojpeg (loaded via libloading at
  runtime; no *-sys crate in the dep tree).
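For readers unfamiliar with the RFC 2435 main header that the rtp_depacketize harness drives, here is a minimal standalone sketch of the 8-byte layout, including the 24-bit big-endian fragment offset. It is not the crate's rtp::parse_main_header. The `MainHeader` struct and `parse_main_header_sketch` function are hypothetical names introduced only for illustration, and the real API may return different types.

```rust
/// RFC 2435 main JPEG header (8 bytes). Hypothetical illustration type.
#[derive(Debug, PartialEq)]
struct MainHeader {
    type_specific: u8,
    /// 24-bit big-endian byte offset of this fragment within the frame.
    fragment_offset: u32,
    jpeg_type: u8,
    q: u8,
    /// The wire format stores width and height in units of 8 pixels.
    width_px: u32,
    height_px: u32,
}

impl MainHeader {
    /// Types 64..=127 signal that a Restart Marker header follows.
    fn has_restart_header(&self) -> bool {
        (64..=127).contains(&self.jpeg_type)
    }

    /// Q values 128..=255 signal an in-band Quantization Table header,
    /// carried only in the first fragment (offset 0) of a frame.
    fn has_qtable_header(&self) -> bool {
        self.q >= 128 && self.fragment_offset == 0
    }
}

/// Splits the main header off the front of an RTP/JPEG payload.
/// Returns None instead of panicking when the payload is too short.
fn parse_main_header_sketch(buf: &[u8]) -> Option<(MainHeader, &[u8])> {
    if buf.len() < 8 {
        return None;
    }
    let (h, rest) = buf.split_at(8);
    let header = MainHeader {
        type_specific: h[0],
        fragment_offset: u32::from_be_bytes([0, h[1], h[2], h[3]]),
        jpeg_type: h[4],
        q: h[5],
        width_px: u32::from(h[6]) * 8,
        height_px: u32::from(h[7]) * 8,
    };
    Some((header, rest))
}
```

The length check before `split_at` is the kind of guard a never-panic contract depends on: every index into the slice is preceded by a bounds test.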
Fixture corpus
tests/docs_corpus.rs decodes every fixture under
docs/image/jpeg/fixtures/<name>/ and compares the result against the
reference PGM/PPM. Each fixture is classified into one of two enforced
tiers (no more silent reporting):
- Tier::Exact (5 fixtures): every sample must equal the reference.
  Covers tiny-baseline-1x1, baseline-grayscale-32x32,
  lossless-1986-mode, arithmetic-coded (the SOF9 Q-coder path), and
  baseline-yuv411-32x32.
- Tier::PsnrFloor { db, exact_pct } (11 fixtures): total PSNR and total
  exact-sample percentage must both meet a floor recorded
  ~0.5–2 dB / ~1–2 pp below the observed value. A real regression
  (worse IDCT rounding, sloppier YCbCr→RGB) trips the assert; normal
  floating-point jitter does not flap the suite. Covers
  baseline-rgb-32x32, baseline-yuv422-32x32,
  baseline-yuv420-128x128-q75, baseline-q1-low-quality,
  baseline-q100-no-loss, progressive-yuv420-128x128,
  multi-scan-non-interleaved, extended-sequential-12bit,
  with-restart-interval-8, with-icc-profile-embedded, and
  without-jfif-marker.
The two remaining variants Tier::ReportOnly and Tier::Ignored stay
in the enum for future fixtures that haven't earned a baseline yet.
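The PsnrFloor check described above can be sketched as follows. This is an illustrative standalone version, not the code in tests/docs_corpus.rs. The names `compare_samples` and `meets_floor` are hypothetical, and the sketch assumes 8-bit samples with a peak of 255, so a 12-bit fixture would need a different peak value.

```rust
/// Returns (PSNR in dB, percentage of exactly matching samples) for two
/// equally sized 8-bit sample buffers, or None if the sizes differ or
/// the buffers are empty. Assumes a peak value of 255.
fn compare_samples(decoded: &[u8], reference: &[u8]) -> Option<(f64, f64)> {
    if decoded.is_empty() || decoded.len() != reference.len() {
        return None;
    }
    let mut sq_err: u64 = 0;
    let mut exact: usize = 0;
    for (&d, &r) in decoded.iter().zip(reference) {
        let e = i64::from(d) - i64::from(r);
        sq_err += (e * e) as u64;
        if e == 0 {
            exact += 1;
        }
    }
    let n = decoded.len() as f64;
    let psnr_db = if sq_err == 0 {
        // Identical buffers: PSNR is unbounded.
        f64::INFINITY
    } else {
        let mse = sq_err as f64 / n;
        10.0 * (255.0 * 255.0 / mse).log10()
    };
    Some((psnr_db, 100.0 * exact as f64 / n))
}

/// Both metrics must meet their recorded floor for the fixture to pass.
fn meets_floor(psnr_db: f64, exact_pct: f64, floor_db: f64, floor_pct: f64) -> bool {
    psnr_db >= floor_db && exact_pct >= floor_pct
}
```

Requiring both metrics guards against a regression that keeps the average error low while shifting many samples by one LSB, or the reverse.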
Benchmarks
benches/codec.rs is a Criterion harness for the encode + decode hot
paths. Run with:
cargo bench -p oxideav-mjpeg --bench codec
Six scenarios, each fed by a deterministically-built in-bench fixture
(xorshift32 + low-amplitude triangle-wave gradient — no committed
payload files, no docs/ reads, no third-party library calls):
- baseline_encode/yuv420_256x256_q75: full SOF0 path: forward DCT,
  AAN-style quantise, Huffman run-length encode, marker emission.
- baseline_encode/yuv444_64x64_q75: same path on a small 4:4:4
  fixture; isolates per-call header / Huffman-table-construction
  overhead from the per-block cost.
- baseline_decode/yuv420_256x256_q75: the inverse, driven through the
  Decoder trait so the bench tracks the same code path application
  callers exercise.
- progressive_encode/yuv420_64x64_q75: SOF2 spectral-selection
  decomposition (7 SOS scans).
- lossless_encode/gray_pred1_256x256: SOF3 grayscale encode with
  predictor 1 (Ra / left), the simplest case.
- lossless_encode/gray_pred4_256x256: SOF3 grayscale encode with
  predictor 4 (Ra + Rb − Rc), the most expensive 2-D Table H.1 variant;
  A/B against pred1 measures the predictor-loop cost.
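A deterministic fixture of the kind described above (xorshift32 noise over a low-amplitude triangle-wave gradient) could be built roughly as follows. This is a sketch under assumptions, not the code in benches/codec.rs. The `XorShift32`, `triangle` and `build_plane` names, the base level of 112, the period of 64, the amplitude of 32 and the 4-bit noise mask are all hypothetical choices made for illustration.

```rust
/// Marsaglia's xorshift32 generator with the 13 / 17 / 5 shift triple.
struct XorShift32(u32);

impl XorShift32 {
    /// A zero state would stay zero forever, so substitute a constant.
    fn new(seed: u32) -> Self {
        Self(if seed == 0 { 0x9E37_79B9 } else { seed })
    }

    fn next_u32(&mut self) -> u32 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 17;
        x ^= x << 5;
        self.0 = x;
        x
    }
}

/// Triangle wave rising from 0 to `amplitude` over half a period and
/// falling back over the second half. `period` must be at least 2.
fn triangle(i: usize, period: usize, amplitude: u32) -> u32 {
    let half = period / 2;
    let phase = i % period;
    let up = if phase < half { phase } else { period - phase };
    (up as u32) * amplitude / (half as u32)
}

/// Builds one 8-bit plane: mid-grey base + diagonal gradient + noise.
/// The same (width, height, seed) always yields the same bytes.
fn build_plane(width: usize, height: usize, seed: u32) -> Vec<u8> {
    let mut rng = XorShift32::new(seed);
    let mut plane = Vec::with_capacity(width * height);
    for y in 0..height {
        for x in 0..width {
            let noise = rng.next_u32() & 0x0F; // 0..=15
            let sample = 112 + triangle(x + y, 64, 32) + noise; // max 159
            plane.push(sample as u8);
        }
    }
    plane
}
```

Building the input in-process keeps the bench hermetic: no file I/O enters the measured path, and every run sees identical bytes for a given seed.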
Headline numbers on the round-209 dev machine (Apple Silicon, release
profile, criterion --quick): baseline 4:2:0 encode 256x256 q75 runs
~185 µs / call (≈ 353 Melem/s); the matching decode runs ~248 µs /
call (≈ 264 Melem/s). The 256x256 lossless grayscale encode runs
~370 µs / call independent of predictor choice (the magnitude /
Huffman emission dominates the per-sample cost — the four extra
predictor arithmetic ops in pred=4 disappear into the noise).
License
MIT — see LICENSE.