oxideav-ac4
Pure-Rust Dolby AC-4 audio decoder foundation — sync / TOC / presentation
/ substream parsing, plus a stub decode path that emits silence at the
correct channel count and sample rate so container fixtures can round-trip
without crashing. Zero C dependencies, no FFI, no *-sys crates.
Part of the oxideav framework but usable standalone.
Status: Foundation. AC-4 is a complex codec. This crate parses the bitstream framing, the table of contents (ac4_toc()), presentations and substream descriptors per ETSI TS 103 190-1 V1.4.1, and exposes a decoder that emits PCM with the right shape. Mono ASF, stereo CPE (split + joint MDCT), the full A-SPX front-end, A-CPL channel-pair synthesis (ASPX_ACPL_1 / ASPX_ACPL_2), and the DRC + DE + outer metadata walker are all implemented.
Round 20 unblocks the ETSI Huffman-table audit (60 codebooks validated byte-for-byte against the canonical ETSI accompaniment file in tests/etsi_table_validation.rs) and wires the 5.X channel-element walker family's Cfg0 / Cfg1 / Cfg2 outer shells plus a Table-21-correct sf_info_lfe() parser.
Round 21 lands the §5.7.7.6.2 ASPX_ACPL_3 transform-matrix synthesis math (Pseudocodes 118/119: Transform(), ACplModule2(), ACplModule3() and the full 5-channel pipeline run_pseudocode_118_5x()).
Round 22 lands the §5.7.7.6.1 ASPX_ACPL_1 / ASPX_ACPL_2 multichannel wrappers (Pseudocode 117, run_pseudocode_117_5x(): two parallel ACplModule instances with D0/D1 decorrelators) plus the 5_X-walker glue: the PCM-level helpers run_acpl_5x_pair_pcm() (ASPX_ACPL_1/2) and run_acpl_5x_mch_pcm() (ASPX_ACPL_3) consume the parsed acpl_config_* + acpl_data_* to produce 5-channel L/R/C/Ls/Rs PCM end-to-end via QMF analysis → A-CPL → QMF synthesis.
Round 23 wires the per-channel sf_data(ASF) Huffman bodies for the multichannel layouts (Tables 26 / 27 / 28 / 29): parse_two_channel_data / parse_three_channel_data / parse_four_channel_data / parse_five_channel_data now also walk the trailing 2 / 3 / 4 / 5 sf_data(ASF) calls through decode_mch_sf_data_channels() and deposit the per-channel scaled MDCT spectra on each *ChannelData::scaled_spec_per_channel for the long-frame, single-window-group case. The Huffman codebook IDs reused are HCB_1..HCB_11 (spectral lines), HCB_SCALEFAC (scale-factor DPCM) and HCB_SNF (spectral noise fill). Annex A.1 shares the codebooks across mono / stereo / multichannel; there is no separate "MCH" codebook set.
Round 24 closes the two r23 follow-ups: (1) the grouped / short-frame multichannel sf_data(ASF) walker (num_window_groups > 1) is now driven by decode_asf_grouped_mono_body_with_max_sfb(), where each per-channel spectrum is the concatenation of num_window_groups independent (section + spectral + scalefac + snf) chains, group-major; (2) the ASPX_ACPL_3 inner body walker is now wired in parse_5x_audio_data_outer: on an I-frame the walker chains stereo_data() + aspx_data_2ch() + acpl_data_2ch(), and the parsed tools.acpl_data_2ch flows straight into the §5.7.7.6.2 Pseudocode-118 5_X synthesis pipeline. The Table-52 aspx_data_2ch() body parser was factored out of the stereo CPE ASPX path into a shared parse_aspx_data_2ch_body() helper, so the stereo CPE mode and the 5_X ASPX_ACPL_3 mode use the same parser.
Round 25 wires the ASPX_ACPL_1 / ASPX_ACPL_2 inner body walker in parse_5x_audio_data_outer per §4.2.6.6 Table 25 (case ASPX_ACPL_1: case ASPX_ACPL_2:). A new parse_aspx_acpl_1_2_inner_body() helper walks two_channel_data() / three_channel_data() (selected by the 1-bit coding_config), the ASPX_ACPL_1-only joint-MDCT residual layer (max_sfb_master + 2x chparam_info + 2x sf_data(ASF) over the dominant transform length signalled by the upstream channel data; n_side_bits is derived per the §4.2.6.6 NOTE), the optional Cfg0 trailer mono_data(0), then aspx_data_2ch() + aspx_data_1ch() and finally the two parallel acpl_data_1ch() calls per Pseudocode 117. The pair lands in tools.acpl_data_1ch_pair[0/1] (D0 / D1 ACplModule). The walker is try-and-bail: any inner Huffman / parse miss leaves the already-populated tools.* slots intact and returns silently.
Round 27 lands the 7_X channel-element walker (parse_7x_audio_data_outer) per §4.2.6.14 Table 33, so immersive 7.0 and 7.1 streams now parse end-to-end. The 7.X walker mirrors the 5_X SIMPLE/ASPX path's coding_config selector but has its own quirks: a 2-bit 7_X_codec_mode (no ASPX_ACPL_3 in 7.X), companding_control(5) only on ASPX_ACPL_{1,2}, the centre/back monos move out of the coding_config switch into a single trailing mono_data(0) gated on coding_config in {0, 2}, and a SIMPLE/ASPX-only additional-channel block (b_use_sap_add_ch + optional chparam_info×2 + two_channel_data) carries the front-extension / back-surround pair beyond the 5.X core. walk_ac4_substream now dispatches channels == 7/8 (7.0/7.1) into the new walker.
Round 28 lands the mono / stereo short-frame sf_data(ASF) walker per ETSI TS 103 190-1 §4.2.8.3-6 Tables 39-42: new spec-correct _grouped payload parsers in asf_data.rs (each with its own outer for (g = 0; g < num_window_groups; g++) loop, a single 8-bit reference_scale_factor at the head of asf_scalefac_data() with first_scf_found carrying across groups, and a single 1-bit b_snf_data_exists gate at the head of asf_snf_data()), plus derive_per_group() helpers that resolve per-group (transf_length_idx, transform_length, max_sfb) from (ti, psy) per Pseudocodes 2 / 3 / 5 (handling the b_different_framing half-frame split). New body decoders decode_asf_grouped_mono_body[_with_max_sfb]() and decode_asf_grouped_stereo_joint_body() (shared section, per-group ms_used[g][sfb], inverse M/S) are wired into all four mono / stereo call sites: parse_mono_audio_data_outer, parse_aspx_acpl2_mdct_body, parse_aspx_acpl1_mdct_body (joint + split) and parse_stereo_data_body (joint + split). Real Dolby AC-4 mono / stereo streams using short-window sub-frames now decode end-to-end without bailing at the previous num_window_groups != 1 guard.
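As an illustration of the per-group ms_used[g][sfb] + inverse M/S step mentioned for Round 28, here is a minimal sketch for a single window group. It uses the conventional textbook form (L = M + S, R = M - S); the function name, the f32 spectrum type and the sfb_offsets layout are assumptions made for this example, and the spec's exact scaling is not reproduced. It is not the crate's decode_asf_grouped_stereo_joint_body().

```rust
/// Apply a conventional inverse mid/side transform to one window group, in place.
///
/// Assumptions (illustrative only): `sfb_offsets` holds the scale-factor-band
/// boundaries as bin indices and has `ms_used.len() + 1` entries; `ms_used[sfb]`
/// is true for bands that were coded as M/S. Bands with the flag cleared are
/// already L/R and are left untouched.
fn inverse_ms_group(
    mid_or_left: &mut [f32],
    side_or_right: &mut [f32],
    sfb_offsets: &[usize],
    ms_used: &[bool],
) {
    for (sfb, &used) in ms_used.iter().enumerate() {
        if !used {
            continue;
        }
        let (start, end) = (sfb_offsets[sfb], sfb_offsets[sfb + 1]);
        for k in start..end {
            let (m, s) = (mid_or_left[k], side_or_right[k]);
            mid_or_left[k] = m + s; // left
            side_or_right[k] = m - s; // right
        }
    }
}
```

For a grouped frame the same routine would be called once per group g with that group's ms_used[g] row and its own band offsets.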
Round 29 lands the full §5.2.8 SSF arithmetic decoder + Annex C scalar inventory + 37 prediction-coefficient matrices: the 705-entry CDF_TABLE, PREDICTOR_GAIN_CDF_LUT, ENVELOPE_CDF_LUT, DITHER_TABLE / RANDOM_NOISE_TABLE, STEP_SIZES_Q4_15, AC_COEFF_MAX_INDEX, the four C.14 dB↔linear LUTs, plus AcState (init / decode_target / decode_symbol_ext_cdf / decode_symbol_calc_cdf / decode_finish per Pseudocodes 41-47), the Idx2Reconstruction + CdfEst computed-CDF path (Pseudocodes 51-53), envelope / predictor-gain / coefficient convenience entry points (Pseudocodes 48-50), and the SsfRandGenState dither + noise RNG (Pseudocodes 54-57).
Round 30 lands the SSF bitstream walker (ssf::parse_ssf_data / parse_ssf_granule / parse_ssf_st_data / parse_ssf_ac_data per Tables 43-46), the Annex C.1 SSF-bandwidths matrix (SSF_BANDWIDTHS, 19 bands × 8 block-length columns), SsfBinLayout::build() (Pseudocode 7: start_bin[] / end_bin[] / num_bins), SsfFrameConfig (Tables 112-113), and the SsfChannelState carrying RNG / prev_pred_lag_idx / last_num_bands / env_prev[] across granules. It is wired into walk_ac4_substream for mono SIMPLE/ASPX, split-MDCT stereo, and the ASPX_ACPL_1 split residual layer, so spec_frontend == SSF no longer falls through silently.
Round 31 lands the §5.2.3-5.2.7 SSF PCM synthesis chain in a new ssf_synth module: envelope decoder + predictor + helpers + lossless decode + inverse-quant + subband predictor + inverse-flattening (Pseudocodes 4a / 4b / 4c / 4d / 4e / 26 / 31 / 32 / 33 / 34 / 35 / 36 / 37 / 38) plus the C-matrix reconstruction (Pseudocode 39) for all 37 tab_idx values. synthesize_ssf_data() threads env_prev[] between granules. Ac4Decoder now carries a per-channel Vec<SsfSynthState> and consumes tools.ssf_data_primary / tools.ssf_data_secondary after the ASF/A-CPL pipeline: each granule's num_blocks * n_mdct spectrum is split per-block and IMDCT'd through the existing KBD overlap-add path. SSF substreams now emit real PCM in place of silence. §5.2.5.2.2 Heuristic Scaling (Pseudocodes 27 / 28 / 29 / 30) is deferred: the spec's f_rfu == 0 short-circuit covers any block with the predictor disabled, which the current synth supports.
Round 32 closes the SHORT_STRIDE P-frame correctness gap by adding env_prev: Vec<i32> to SsfSynthState: synthesize_granule() latches the resolved envelope (post-decode_envelope δ-chain) at the end of each granule so that a SHORT_STRIDE P-granule with no caller-supplied env_prev[] interpolates against the previous frame's envelope rather than a zero fallback (§5.2.3.0 Note 2). The walker side gets a parallel hoist: Ac4Decoder now owns a Vec<SsfChannelState> (ssf_walker_state) and a new walk_ac4_substream_stateful() threads it through the SSF body parses so dither / noise RNGs (Pseudocodes 54-57) and prev_pred_lag_idx / last_num_bands persist across frames. Before r32 the walker built a fresh state per frame and dropped it.
Round 33 lands §5.2.5.2.2 Heuristic Scaling (Pseudocodes 27/28/29/30), the predictor-enabled spectrum-decoding branch that the spec's f_rfu == 0 short-circuit previously skipped. New map_db_to_lin_q10() / map_lin_to_db_q10() Q.10 fixed-point converters use the Annex C.14 LUTs; heuristic_scaling() runs the full Pseudocode 28 chain (dynamic-range compression of env_in[], sorted-descending Map_dB_to_Lin, iRfu²-weighted reverse water-filling, Map_Lin_to_dB-driven per-band weight); apply_heuristic_scaling() wraps it with the Pseudocode 27 env_in = 3 * env_alloc pre-multiply, LF-boost, and (env_alloc_mod, f_gain_q) post-processing. synthesize_granule() dispatches the §5.2.5.2.0 selector: when f_rfu > 0 && !variance_preserving the heuristic-scaling branch fires and inverse_heuristic_scale() consumes the resulting f_gain_q[] instead of the all-1 stub; variance_preserving blocks correctly skip the inverse-scale call per §5.2.5.2.0 step 5.
494 tests (481 lib + 5 + 8 integration).
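The Round 32 envelope latch can be pictured roughly as follows. This is a simplified sketch and not the crate's SsfSynthState: the struct name, method names and the zero fallback on a band-count mismatch are all assumptions made for illustration.

```rust
/// Simplified stand-in for per-channel SSF synthesis state (hypothetical).
#[derive(Default)]
struct SynthStateSketch {
    /// Envelope resolved at the end of the previous granule.
    env_prev: Vec<i32>,
}

impl SynthStateSketch {
    /// Choose the envelope a P-granule interpolates against:
    /// the caller's if supplied, else the latched one, else zeros.
    /// (Falling back to zeros when the band count differs is an assumption.)
    fn previous_envelope(&self, caller_env_prev: Option<&[i32]>, num_bands: usize) -> Vec<i32> {
        match caller_env_prev {
            Some(env) => env.to_vec(),
            None if self.env_prev.len() == num_bands => self.env_prev.clone(),
            None => vec![0; num_bands],
        }
    }

    /// Latch the resolved envelope once a granule has been synthesized.
    fn latch(&mut self, resolved_env: &[i32]) {
        self.env_prev.clear();
        self.env_prev.extend_from_slice(resolved_env);
    }
}
```

Keeping the latch on state owned by the decoder, rather than rebuilding it per frame, is what lets the second frame see the first frame's envelope.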
Specs
- ETSI TS 103 190-1 — Channel-based coding + bitstream syntax.
- ETSI TS 103 190-2 — Multi-stream / Immersive / Object-based (IFM).
Installation
[]
= "0.1"
= "0.1"
= "0.0"
What's parsed (TS 103 190-1 clause 4)
- Sync frame (ac4_syncframe(), Annex G): 0xAC40 plain or 0xAC41 CRC-protected, plus the two-tier frame_size() (16-bit, 0xFFFF escape to 24-bit).
- Raw frame (raw_ac4_frame()).
- Table of contents (ac4_toc()): bitstream_version (with variable_bits(2) escape for version == 3), sequence_counter, wait_frames, fs_index -> 44.1 / 48 kHz, frame_rate_index -> 24…120 fps + 23.44 (Table 83 / 84), b_iframe_global, payload_base.
- Presentations: per-presentation ac4_presentation_info() walking both the presentation_v1 (default) and presentation_v0 forms. Handles presentation_config 0..=5 (M+E+D, Main+DE, Main+Assoc, M+E+D+Assoc, Main+DE+Assoc, Main+HSF) plus the presentation_config_ext_info escape, b_hsf_ext, b_pre_virtualized and additional EMDF substreams.
- Substream info: ac4_substream_info() channel mode (1/2/4/7-bit with variable_bits(2) escape), sample-frequency multiplier, bitrate_indicator, content_type + language tag, per-frame-rate-factor b_iframe flags.
- Substream index table: per-substream substream_size with the b_more_bits / variable_bits(2) extension.
- Bit-rate indicator / content classifier / frame_rate_factor / sf_multiplier all surfaced on the parsed Ac4FrameInfo struct.
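The sync-frame item above (sync word plus two-tier frame_size) can be sketched as below. This is a hedged illustration, not the crate's parser: SyncHeader and parse_sync_header are hypothetical names, and the byte-aligned, big-endian field layout is an assumption that should be checked against Annex G.

```rust
/// Result of reading the sync-frame header (illustrative type, not the crate's API).
#[derive(Debug, PartialEq)]
struct SyncHeader {
    /// True for the 0xAC41 (CRC-protected) sync word.
    crc_protected: bool,
    /// Frame size taken from the 16-bit field, or the 24-bit field after the escape.
    frame_size: u32,
    /// Bytes consumed by the sync word plus the size field(s).
    header_len: usize,
}

/// Parse the sync word and two-tier frame_size from the start of `buf`.
/// Returns None on a wrong sync word or a truncated header.
fn parse_sync_header(buf: &[u8]) -> Option<SyncHeader> {
    if buf.len() < 4 {
        return None;
    }
    let crc_protected = match u16::from_be_bytes([buf[0], buf[1]]) {
        0xAC40 => false,
        0xAC41 => true,
        _ => return None,
    };
    let short = u16::from_be_bytes([buf[2], buf[3]]);
    if short != 0xFFFF {
        return Some(SyncHeader { crc_protected, frame_size: u32::from(short), header_len: 4 });
    }
    // 0xFFFF is the escape: the real size follows as a 24-bit field.
    if buf.len() < 7 {
        return None;
    }
    let frame_size = u32::from_be_bytes([0, buf[4], buf[5], buf[6]]);
    Some(SyncHeader { crc_protected, frame_size, header_len: 7 })
}
```

For example, the bytes AC 40 01 00 would yield a plain frame with frame_size 256 and a 4-byte header under these assumptions.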
What's not parsed yet
- ASF / ASF-A2 / A-SPX audio coefficient coding (the heart of the codec). The A-SPX aspx_config() header and companding_control() element are parsed (ETSI §4.2.11 / §4.2.12.1); the Huffman-coded envelope / noise payload (aspx_framing, aspx_ec_data, etc.) is not.
- Metadata payloads inside substreams (DRC, dialog normalization, downmix params): the spec's metadata() tree is skipped by size, not parsed.
- TS 103 190-2 IFM (immersive / object) extensions.
Decode path
make_decoder builds an Ac4Decoder that:
- Scans the packet for a sync word.
- Parses the full TOC + presentation + substream descriptors, and therefore knows the channel count, sample rate (44.1 / 48 kHz scaled by sf_multiplier), and frame length in samples.
- Emits a silence AudioFrame (S16 zeros) with the correct channels, sample_rate, samples and pts.
This is enough to keep a container/demuxer pipeline running against an AC-4 track without crashing, and to exercise the TOC parser against real fixtures.
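A minimal sketch of the silence-emitting step, assuming the sample rate is the base rate multiplied by sf_multiplier and that samples are stored interleaved. SilentFrame and silent_frame are hypothetical stand-ins for illustration; the framework's real AudioFrame type is not reproduced here.

```rust
/// Illustrative stand-in for the framework's audio frame type (hypothetical).
#[derive(Debug)]
struct SilentFrame {
    channels: u16,
    sample_rate: u32,
    /// Samples per channel in this frame.
    samples: usize,
    pts: i64,
    /// Interleaved signed 16-bit PCM (interleaving is an assumption).
    data: Vec<i16>,
}

/// Build an all-zero S16 frame with the shape derived from the parsed TOC.
fn silent_frame(
    channels: u16,
    base_rate: u32,     // 44_100 or 48_000, from fs_index
    sf_multiplier: u32, // assumed to be a plain multiplicative factor
    samples: usize,
    pts: i64,
) -> SilentFrame {
    SilentFrame {
        channels,
        sample_rate: base_rate * sf_multiplier,
        samples,
        pts,
        data: vec![0i16; samples * usize::from(channels)],
    }
}
```

A stereo frame of 1920 samples at 48 kHz would carry 3840 zero-valued i16 entries under this layout.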
Codec id
"ac4". Also registers the ISO BMFF fourcc ac-4 so MP4 tracks tagged
with the AC-4 sample entry resolve cleanly.
License
MIT — see LICENSE.