Module dmc_model

DMC (Dynamic Markov Compression) — bit-level automaton predictor.

A prediction model based on state cloning: starts with a small initial automaton and adaptively clones states when a transition is used frequently enough, creating context-specific states that capture sub-byte and cross-byte patterns.

Key properties:

  • Bit-level: predict() returns a 12-bit probability; update(bit) advances the automaton along the transition for the observed bit
  • State cloning: when a transition count exceeds clone_threshold and the target state has accumulated significantly more counts, clone the target into a new state specific to this transition path
  • Deterministic: count splitting uses integer arithmetic only (no floats)
  • Self-resetting: when max_states is reached, the model reinitializes to the starting automaton
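
The prediction and cloning steps listed above can be sketched as follows. This is a minimal illustration, not this module's implementation: the `State` layout, the `maybe_clone` helper, the +1 smoothing in `predict`, and the reading of "significantly more counts" as "at least `clone_threshold` counts arriving from other paths" are all assumptions made for the example.

```rust
/// Hypothetical state layout: one successor and one count per bit value.
#[derive(Clone, Copy, Debug, PartialEq)]
struct State {
    next: [u32; 2],
    count: [u32; 2],
}

/// 12-bit probability that the next bit is 1, kept inside 1..=4095.
fn predict(s: &State) -> u32 {
    let n0 = s.count[0] as u64 + 1; // +1 keeps the denominator non-zero
    let n1 = s.count[1] as u64 + 1;
    ((n1 * 4096) / (n0 + n1)).clamp(1, 4095) as u32
}

/// Clone the target of `states[cur].next[bit]` when the (assumed) cloning
/// condition holds. Returns the index of the new state if one was created.
fn maybe_clone(
    states: &mut Vec<State>,
    cur: usize,
    bit: usize,
    clone_threshold: u32,
) -> Option<usize> {
    let moved = states[cur].count[bit] as u64;
    let tgt = states[cur].next[bit] as usize;
    let total = states[tgt].count[0] as u64 + states[tgt].count[1] as u64;

    // Assumed condition: the transition is used often enough, and the target
    // has at least `clone_threshold` counts that came from other paths.
    if moved <= clone_threshold as u64 || total < moved + clone_threshold as u64 {
        return None;
    }

    // Split the target's counts proportionally, in integers only. The clone
    // gets floor(count * moved / total); the original keeps the remainder,
    // so no count is lost or invented.
    let mut fresh = states[tgt];
    for b in 0..2 {
        let share = (states[tgt].count[b] as u64 * moved / total) as u32;
        fresh.count[b] = share;
        states[tgt].count[b] -= share;
    }

    states.push(fresh);
    let idx = states.len() - 1;
    states[cur].next[bit] = idx as u32; // redirect only this transition path
    Some(idx)
}
```

Because the split uses floor division and subtraction, the clone's counts plus the original's remaining counts always equal the original totals, which keeps the model reproducible across platforms.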

Memory: ~64MB with 4M states at 16 bytes/state.
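
The memory figure and the reset rule can be checked with a short sketch. The 16-byte layout below (two `u32` successors plus two `u32` counts), the `MAX_STATES` constant, and the single-state starting automaton are assumptions chosen to match the numbers quoted above; the actual field layout and initial automaton may differ.

```rust
use std::mem::size_of;

/// Hypothetical 16-byte state: 2 x u32 successors + 2 x u32 counts.
#[derive(Clone, Copy)]
struct State {
    next: [u32; 2],
    count: [u32; 2],
}

/// Assumed state budget: 4M states, read here as 4 * 2^20.
const MAX_STATES: usize = 4 * 1024 * 1024;

/// Bytes needed for a full state table.
fn table_bytes() -> usize {
    MAX_STATES * size_of::<State>()
}

/// Assumed starting automaton: a single state that loops to itself.
fn initial_automaton() -> Vec<State> {
    vec![State { next: [0, 0], count: [0, 0] }]
}

/// Reinitialize once the state budget is exhausted.
/// Returns true if a reset happened.
fn reset_if_full(states: &mut Vec<State>, max_states: usize) -> bool {
    if states.len() >= max_states {
        *states = initial_automaton();
        true
    } else {
        false
    }
}
```

With this layout, `table_bytes()` comes to 4 × 2^20 × 16 = 67,108,864 bytes.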

Reference: Cormack & Horspool, “Data Compression using Dynamic Markov Modelling” (1987). PAQ8PX uses a DmcForest with multiple clone thresholds.

Structs§

DmcModel
DmcForest: multiple DMC instances with different clone thresholds. Each instance captures patterns at different granularities. Predictions are averaged in probability space.
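
Averaging in probability space means combining the 12-bit outputs directly, rather than mixing them in a transformed (for example logistic) domain. A minimal sketch, assuming a hypothetical free function `average_predictions` that receives one prediction per DMC instance; the name and the round-to-nearest choice are illustrative and not taken from this module:

```rust
/// Average 12-bit predictions using integer arithmetic, rounding to nearest.
/// Returns None when there are no predictions to combine.
fn average_predictions(preds: &[u32]) -> Option<u32> {
    if preds.is_empty() {
        return None;
    }
    let n = preds.len() as u64;
    let sum: u64 = preds.iter().map(|&p| p as u64).sum();
    Some(((sum + n / 2) / n) as u32)
}
```

For instance, instances predicting 1024, 2048, and 3072 average to 2048. Since every input lies in the 12-bit range, the result does too.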