Module source_conditional_pair

Source-conditional Dirichlet account-pair sampler (SOTA-8).

Per source string, fits a Dirichlet-multinomial over a per-source account pool. Round 0 (FINDINGS §14) showed the synthetic engine’s source-conditional structure is too uniform (entropy 0.97 vs corpus 0.68) and too narrow (5 vs 23.5 accounts per source). This sampler closes both gaps simultaneously: a configurable larger pool, drawn through a concentrated (low-α) Dirichlet.

Math: a symmetric Dirichlet(α, …, α) is realised by pᵢ = Gᵢ / Σⱼ Gⱼ with each Gᵢ ~ Gamma(α, 1) drawn independently. Lower α concentrates the PMF on fewer accounts. With α = 0.5 and a pool size of N_s = 25 the expected normalised entropy is ≈ 0.65, close to the corpus median of 0.68.
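The Gamma-normalisation construction can be sketched in a few lines. This is a minimal, dependency-free illustration and not this module's implementation: `SplitMix64`, `gamma`, `symmetric_dirichlet`, `normalised_entropy` and `mean_entropy` are hypothetical names, the Gamma draw uses the Marsaglia-Tsang method as an assumed stand-in for whatever sampler the crate actually uses, and entropy is normalised here by ln N.

```rust
/// Tiny deterministic PRNG (SplitMix64) so the sketch needs no external crates.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform draw in (0, 1], so `ln()` of the result is always finite.
    fn uniform(&mut self) -> f64 {
        ((self.next_u64() >> 11) + 1) as f64 / (1u64 << 53) as f64
    }

    /// Standard normal draw via the Box-Muller transform.
    fn normal(&mut self) -> f64 {
        let (u1, u2) = (self.uniform(), self.uniform());
        (-2.0 * u1.ln()).sqrt() * (2.0 * std::f64::consts::PI * u2).cos()
    }
}

/// Gamma(shape, 1) draw (Marsaglia-Tsang; shapes below 1 use the U^(1/shape) boost).
fn gamma(rng: &mut SplitMix64, shape: f64) -> f64 {
    if shape < 1.0 {
        let u = rng.uniform();
        return gamma(rng, shape + 1.0) * u.powf(1.0 / shape);
    }
    let d = shape - 1.0 / 3.0;
    let c = 1.0 / (9.0 * d).sqrt();
    loop {
        let x = rng.normal();
        let v = (1.0 + c * x).powi(3);
        if v <= 0.0 {
            continue;
        }
        let u = rng.uniform();
        if u.ln() < 0.5 * x * x + d - d * v + d * v.ln() {
            return d * v;
        }
    }
}

/// p_i = G_i / sum_j G_j with each G_i ~ Gamma(alpha, 1).
fn symmetric_dirichlet(rng: &mut SplitMix64, alpha: f64, n: usize) -> Vec<f64> {
    let mut g = Vec::with_capacity(n);
    for _ in 0..n {
        g.push(gamma(rng, alpha));
    }
    let total: f64 = g.iter().sum();
    g.iter().map(|x| x / total).collect()
}

/// Shannon entropy divided by ln(N); 1.0 means perfectly uniform.
fn normalised_entropy(p: &[f64]) -> f64 {
    let h: f64 = p.iter().filter(|&&x| x > 0.0).map(|&x| -x * x.ln()).sum();
    h / (p.len() as f64).ln()
}

/// Monte Carlo estimate of the mean normalised entropy over `draws` PMFs.
fn mean_entropy(seed: u64, alpha: f64, n: usize, draws: usize) -> f64 {
    let mut rng = SplitMix64(seed);
    let total: f64 = (0..draws)
        .map(|_| normalised_entropy(&symmetric_dirichlet(&mut rng, alpha, n)))
        .sum();
    total / draws as f64
}
```

Calling `mean_entropy(seed, 0.5, 25, draws)` gives an empirical estimate to compare against the figure quoted above. The result depends on how entropy is measured (for example, on the PMF itself versus on counts from a finite sample), which this sketch does not attempt to reproduce from FINDINGS.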

This module is wired in by je_generator only when the transactions.source_conditional_account_pair.enabled config flag is set. The flag defaults to off, so the sampler is opt-in and existing users’ synthetic streams stay byte-identical.

Structs

SourceConditionalPairSampler
Top-level sampler — one SourcePool per source string.
SourcePool
One source’s account pool with a fitted Dirichlet PMF, ready to sample from.
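To illustrate how the two structs might relate, here is a rough sketch of a per-source map of pools with inverse-CDF sampling. Everything in it is an assumption for illustration: `PoolSketch`, `SamplerSketch`, their fields and the `pick` methods are hypothetical and do not reflect the real fields or signatures of `SourcePool` or `SourceConditionalPairSampler`. It also picks single accounts rather than account pairs, and takes the uniform draw `u` as a parameter instead of owning an RNG.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for one source's pool: account labels plus a CDF
/// precomputed from the PMF.
struct PoolSketch {
    accounts: Vec<String>,
    cdf: Vec<f64>,
}

impl PoolSketch {
    fn new(accounts: Vec<String>, pmf: &[f64]) -> Self {
        assert_eq!(accounts.len(), pmf.len());
        assert!(!accounts.is_empty());
        // Renormalise defensively, then accumulate into a running sum.
        let total: f64 = pmf.iter().sum();
        let mut acc = 0.0;
        let cdf: Vec<f64> = pmf
            .iter()
            .map(|p| {
                acc += p / total;
                acc
            })
            .collect();
        PoolSketch { accounts, cdf }
    }

    /// Inverse-CDF lookup: `u` is a uniform draw in [0, 1).
    fn pick(&self, u: f64) -> &str {
        let i = self.cdf.partition_point(|&c| c <= u);
        // Clamp in case rounding leaves the final CDF entry just below 1.0.
        &self.accounts[i.min(self.accounts.len() - 1)]
    }
}

/// Hypothetical top-level container: one pool per source string.
struct SamplerSketch {
    pools: HashMap<String, PoolSketch>,
}

impl SamplerSketch {
    /// Returns `None` when the source has no pool.
    fn pick(&self, source: &str, u: f64) -> Option<&str> {
        self.pools.get(source).map(|p| p.pick(u))
    }
}
```

Precomputing the CDF once per pool keeps each draw to a binary search, which matters only if pools are sampled many times; a linear scan over the PMF would be equally valid for pools of around 25 accounts.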