DMC (Dynamic Markov Compression) — bit-level automaton predictor.
A prediction model based on state cloning: starts with a small initial automaton and adaptively clones states when a transition is used frequently enough, creating context-specific states that capture sub-byte and cross-byte patterns.
Key properties:
- Bit-level: predict() returns a 12-bit probability; update(bit) advances the automaton along the transition for that bit
- State cloning: when a transition's count exceeds clone_threshold and the target state has accumulated significantly more counts than that transition alone accounts for, the target is cloned into a new state specific to this transition path
- Deterministic: count splitting uses integer arithmetic only (no floats)
- Self-resetting: when max_states is reached, the model reinitializes to the starting automaton
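The properties above can be illustrated with a minimal sketch. It is not the DmcModel implementation: the `State` layout, the single-state starting automaton, the smoothing in `predict`, the reading of the 12-bit value as P(bit = 1), and the exact cloning condition are all assumptions chosen to keep the example short.

```rust
/// One automaton state: a successor and a count for each bit value.
/// (Hypothetical layout, for illustration only.)
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct State {
    pub next: [u32; 2],
    pub count: [u32; 2],
}

pub struct Dmc {
    pub states: Vec<State>,
    pub cur: usize,
    pub clone_threshold: u32,
    pub max_states: usize,
}

impl Dmc {
    pub fn new(clone_threshold: u32, max_states: usize) -> Self {
        let mut d = Dmc { states: Vec::new(), cur: 0, clone_threshold, max_states };
        d.reset();
        d
    }

    /// Assumed starting automaton: one state that loops to itself on both bits.
    pub fn reset(&mut self) {
        self.states.clear();
        self.states.push(State { next: [0, 0], count: [1, 1] });
        self.cur = 0;
    }

    /// 12-bit probability, assumed here to mean P(next bit = 1), in 1..=4095.
    /// The +1/+2 smoothing avoids division by zero after a split.
    pub fn predict(&self) -> u16 {
        let s = &self.states[self.cur];
        let c0 = s.count[0] as u64;
        let c1 = s.count[1] as u64;
        (((c1 + 1) * 4096) / (c0 + c1 + 2)).clamp(1, 4095) as u16
    }

    pub fn update(&mut self, bit: u8) {
        let b = (bit & 1) as usize;
        let from = self.cur;
        let trans = self.states[from].count[b];
        let target = self.states[from].next[b] as usize;
        let t = self.states[target];
        let total = t.count[0].saturating_add(t.count[1]);

        // Clone when this transition is well used and the target also
        // receives substantial traffic from elsewhere (assumed condition).
        if trans > self.clone_threshold
            && total > trans.saturating_add(self.clone_threshold)
        {
            if self.states.len() >= self.max_states {
                // Self-reset: drop everything learned and start over.
                self.reset();
                return;
            }
            // Integer-only split: the clone takes the share trans/total
            // of each outgoing count; the original keeps the remainder.
            let mut clone = t;
            for i in 0..2 {
                let moved = (t.count[i] as u64 * trans as u64 / total as u64) as u32;
                clone.count[i] = moved;
                self.states[target].count[i] = t.count[i] - moved;
            }
            let new_idx = self.states.len() as u32;
            self.states.push(clone);
            self.states[from].next[b] = new_idx;
        }

        self.states[from].count[b] = self.states[from].count[b].saturating_add(1);
        self.cur = self.states[from].next[b] as usize;
    }
}
```

In this sketch a coder would call `predict()` before each bit and `update(bit)` after it. Because every step uses integer arithmetic, two instances fed the same bit sequence stay in identical states.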
Memory: ~64MB with 4M states at 16 bytes/state.
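As a rough check of the memory figure, the sketch below shows one layout that occupies 16 bytes per state. `PackedState` and `table_bytes` are hypothetical names, the field choice is an assumption rather than the crate's actual layout, and "4M" is read as 4 × 2^20.

```rust
use std::mem::size_of;

/// Hypothetical 16-byte state: two successor indices and two counts.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct PackedState {
    pub next: [u32; 2],  // successor state index for bit 0 and bit 1
    pub count: [u32; 2], // transition counts for bit 0 and bit 1
}

/// Bytes needed for a state table with `n_states` entries.
pub fn table_bytes(n_states: usize) -> usize {
    n_states * size_of::<PackedState>()
}
```

With this layout, `table_bytes(4 * 1024 * 1024)` is 67,108,864 bytes, that is 64 MiB.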
Reference: Cormack & Horspool, “Data Compression using Dynamic Markov Modelling” (1987). PAQ8PX uses a DmcForest with multiple clone thresholds.
Structs§
- DmcModel
- DmcForest: multiple DMC instances with different clone thresholds. Each instance captures patterns at a different granularity. Their predictions are averaged in probability space.
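Averaging in probability space can be sketched as a plain arithmetic mean of the 12-bit predictions. `average_p12` is a hypothetical helper, not part of the documented API, and the rounding rule is an assumption.

```rust
/// Arithmetic mean of 12-bit probabilities, rounded to nearest.
/// Integer-only, so the result is deterministic across platforms.
pub fn average_p12(preds: &[u16]) -> u16 {
    assert!(!preds.is_empty(), "need at least one prediction");
    let n = preds.len() as u32;
    let sum: u32 = preds.iter().map(|&p| p as u32).sum();
    ((sum + n / 2) / n) as u16
}
```

For example, instances predicting 1000 and 3000 combine to 2000. A plain mean weights every instance equally regardless of how confident it is, which keeps the combination simple and integer-only.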