# Test Matrix
One component per section. Each section contains a table with test name, description, and what the test validates.
## How To Run
```bash
cargo test
```
## Sketches
### CountMin
Test file: [`src/sketches/countminsketch.rs`](../src/sketches/countminsketch.rs)
| `dimension_test` | Default/custom dimensions initialize zeroed counters. | Verifies default dimensions (`rows=3`, `cols=4096`), custom dimensions (`3x17`), and zero-initialized counters after construction. |
| `fast_insert_same_estimate` | Fast and regular insert paths produce identical estimates. | Inserts five string keys once into both `RegularPath` and `FastPath` sketches (`3x64`) and asserts equal estimates for every key. |
| `merge_adds_counters_element_wise` | Merge sums counters element-wise for matching dimensions. | Merges two `2x32` sketches after inserting the same key (`1` on left, `2` on right) and checks merged per-row target counters equal `3`. |
| `countmin_insert_emit_delta_emits_at_threshold_and_resets_period` | Worker-path delta emission fires at promotion threshold and resets. | Inserts key into `3x64` CMS via `insert_emit_delta`; verifies no delta emitted before `CM_PROMASK` inserts, then exactly `3` deltas (one per row) with `value == CM_PROMASK` at threshold, no extra deltas in next sub-threshold window, and another batch of `3` at the next threshold. |
| `countmin_apply_delta_increments_parent_counter` | Apply delta increments parent counter. | Constructs a `CmDelta{row=1, col=5, value=CM_PROMASK}`, applies it to a `3x64` parent CMS, and verifies the target counter at `(1,5)` equals `CM_PROMASK`. |
| `cm_regular_path_correctness` | Regular-path hashing, counters, and estimates are exact on a deterministic stream. | Recomputes expected counter indices for `I32(0..9)` using per-row hashing, asserts exact full-matrix equality after one pass, doubled counters after second pass, and estimate `== 2` for each inserted key. |
| `cm_fast_path_correctness` | Fast-path counter placement matches bit-sliced hash mapping. | Recomputes expected fast-path indices for `I32(0..9)` from one hash plus row bit-slices/mask bits and asserts exact full-matrix equality. |
| `cm_error_bound_zipf` | Zipf-stream error bound holds for regular and fast paths. | On `200_000` Zipf samples with domain `8192` and exponent `1.1`, checks both paths satisfy: number of distinct queried keys with `\|estimate - true\| < epsilon * N` is `> (1 - delta) * distinct_key_count`, with `epsilon = e / cols`, `delta = e^-rows`. |
| `cm_error_bound_uniform` | Uniform-stream error bound holds for regular and fast paths. | On `200_000` uniform samples in `[100.0, 1000.0]`, checks both paths satisfy: number of distinct queried keys with `\|estimate - true\| < epsilon * N` is `> (1 - delta) * distinct_key_count`, with `epsilon = e / cols`, `delta = e^-rows`. |
| `count_min_round_trip_serialization` | Serialization round trip preserves full sketch state. | Serializes/deserializes a populated `3x8` regular-path sketch and verifies dimensions plus the full counter array are unchanged. |
### Count
Test file: [`src/sketches/countsketch.rs`](../src/sketches/countsketch.rs)
| `default_initializes_expected_dimensions` | Default dimensions initialize zeroed counters. | Verifies default `Count` dimensions (`rows=3`, `cols=4096`) and that all counters are zero after construction. |
| `with_dimensions_uses_custom_sizes` | Custom dimensions initialize zeroed rows. | Verifies `with_dimensions(3, 17)` applies requested shape and each row slice is zero-initialized. |
| `insert_updates_signed_counters_per_row` | Regular insert applies per-row signed updates. | After one insert of key `"alpha"` into a `3x64` sketch, checks each row’s hashed counter equals that row’s expected sign (`+1` or `-1`). |
| `fast_insert_produces_consistent_estimates` | Fast-path single inserts return unit estimates. | Inserts five string keys once into a fast-path sketch (`4x128`) and asserts estimate `== 1.0` for each key. |
| `insert_produces_consistent_estimates` | Regular-path single inserts return unit estimates. | Inserts five string keys once into a regular-path sketch (`3x64`) and asserts estimate `== 1.0` for each key. |
| `estimate_recovers_frequency_for_repeated_key` | Regular path recovers repeated-key frequency. | Inserts key `"theta"` 37 times into a regular-path sketch (`3x64`) and asserts estimate `== 37.0`. |
| `fast_path_recovers_repeated_insertions` | Fast path recovers repeated insertions across keys. | Inserts five keys for 5 rounds into a fast-path sketch (`4x256`) and asserts estimate `== 5.0` for each key. |
| `merge_adds_counters_element_wise` | Merge sums signed counters for matching dimensions. | Merges two regular-path `2x32` sketches after inserting the same key (`1` on left, `2` on right) and checks per-row target counters equal `sign(row,key) * 3`. |
| `count_child_insert_emits_at_threshold` | Worker-path delta emission fires after sufficient inserts. | Inserts key into `3x64` `Count` via `insert_emit_delta` for `200` iterations and verifies at least `3` deltas (one per row) are emitted. |
| `zipf_stream_stays_within_twenty_percent_for_most_keys` | Zipf stream keeps relative error under 20% for most keys. | On Zipf stream (`rows=5`, `cols=8192`, `domain=8192`, `exponent=1.1`, `N=200_000`), computes per-key relative error and requires at least 70% of keys with error `< 0.20`. |
| `cs_regular_path_correctness` | Regular-path counter/sign mapping and estimates are exact on deterministic inserts. | Recomputes expected signed counter updates for `I32(0..9)` using regular hashing/sign logic, asserts exact matrix match after one pass, doubled counters after second pass, and estimate `== 2.0` for each inserted key. |
| `cs_fast_path_correctness` | Fast-path row-hash/sign mapping matches expected counters. | Recomputes expected fast-path updates for `I32(0..9)` using matrix hash row slices and row signs, then asserts exact full-matrix equality. |
| `cs_error_bound_zipf` | Zipf-stream error bound check passes for regular and fast paths. | On Zipf samples (`domain=8192`, `exponent=1.1`, `N=200_000`) with default dimensions, both paths require: count of distinct queried keys with `\|estimate - true\| < epsilon * N` is `> (1 - delta) * distinct_key_count`, with `epsilon = e / cols`, `delta = e^-rows`. |
| `cs_error_bound_uniform` | Uniform-stream error bound check passes for regular and fast paths. | On uniform samples in `[100.0, 1000.0]` with `N=200_000` and default dimensions, requires for both paths that in-bound distinct keys exceed `(1-delta)` fraction (`delta = e^-rows`); regular path uses `epsilon = sqrt(e / cols)` and bound `epsilon * L2_norm`, fast path uses `epsilon = e / cols` and bound `epsilon * N`. |
| `count_sketch_round_trip_serialization` | Serialization round trip preserves full `Count` state. | Serializes/deserializes a populated regular-path `3x8` sketch and verifies dimensions plus full counter array are unchanged. |
| `countl2hh_estimates_and_l2_are_consistent` | CountL2HH updates keep estimate and L2 consistent. | For `CountL2HH(3x32)`, applies `+5` then `-2` to one key, verifies estimates `5.0` then `3.0`, and asserts non-trivial L2 (`>= 3.0`). |
| `countl2hh_merge_combines_frequency_vectors` | CountL2HH merge combines per-key frequencies. | Merges two `CountL2HH(3x32)` sketches with same key counts `4` and `9`, then verifies merged estimate `== 13.0`. |
| `countl2hh_round_trip_serialization` | CountL2HH serialization round trip preserves estimate and L2. | Serializes/deserializes `CountL2HH::with_dimensions_and_seed(3,32,7)` after updates, verifying rows/cols and that both estimate and L2 remain unchanged (within `f64::EPSILON`). |
### HyperLogLog
Test file: [`src/sketches/hll.rs`](../src/sketches/hll.rs)
| `hll_child_insert_emits_on_improvement` | Child insert emits delta only on register improvement. | Inserts a key via `insert_emit_delta` into `HyperLogLog<Classic>`; verifies exactly `1` delta emitted on the first insert and `0` additional deltas on a duplicate insert. |
| `hyperloglog_accuracy_within_two_percent` | Classic HyperLogLog stays within 2% relative error across scale checkpoints. | Inserts sequential unique `U64` values and checks at targets `[10, 100, 1_000, 10_000, 100_000, 1_000_000, 10_000_000]` that relative error `\|estimate-truth\|/truth <= 0.02`. |
| `hll_ertl_accuracy_within_two_percent` | ErtlMLE HyperLogLog stays within 2% relative error across scale checkpoints. | Applies the same checkpointed unique-stream accuracy test as classic HLL, requiring relative error `<= 0.02` at each target cardinality. |
| `hllds_accuracy_within_two_percent` | HIP HyperLogLog stays within 2% relative error across scale checkpoints. | Applies the same checkpointed unique-stream accuracy test to `HyperLogLogHIP`, requiring relative error `<= 0.02` at each target cardinality. |
| `hyperloglog_p12_accuracy_within_two_percent` | P12 Classic HyperLogLog stays within P12 error tolerance across scale checkpoints. | Applies the same checkpointed unique-stream accuracy test to `HyperLogLogP12<Classic>`, requiring relative error `<= P12_ERROR_TOLERANCE` at each target cardinality. |
| `hll_ertl_p12_accuracy_within_two_percent` | P12 ErtlMLE HyperLogLog stays within P12 error tolerance across scale checkpoints. | Applies the same checkpointed accuracy test to `HyperLogLogP12<ErtlMLE>`, requiring relative error `<= P12_ERROR_TOLERANCE`. |
| `hllds_p12_accuracy_within_two_percent` | P12 HIP HyperLogLog stays within P12 error tolerance across scale checkpoints. | Applies the same checkpointed accuracy test to `HyperLogLogHIPP12`, requiring relative error `<= P12_ERROR_TOLERANCE`. |
| `hyperloglog_merge_within_two_percent` | Classic HyperLogLog merge remains within 2% relative error. | Splits unique stream into even keys (left) and odd keys (right), merges sketches at each target checkpoint, and requires merged relative error `<= 0.02`. |
| `hll_ertl_merge_within_two_percent` | ErtlMLE HyperLogLog merge remains within 2% relative error. | Uses the same even/odd split merge scenario and requires merged relative error `<= 0.02` at each target checkpoint. |
| `hyperloglog_p12_merge_within_two_percent` | P12 Classic HyperLogLog merge remains within P12 error tolerance. | Applies the same even/odd split merge scenario to `HyperLogLogP12<Classic>`, requiring merged relative error `<= P12_ERROR_TOLERANCE`. |
| `hll_ertl_p12_merge_within_two_percent` | P12 ErtlMLE HyperLogLog merge remains within P12 error tolerance. | Applies the same even/odd split merge scenario to `HyperLogLogP12<ErtlMLE>`, requiring merged relative error `<= P12_ERROR_TOLERANCE`. |
| `hyperloglog_round_trip_serialization` | Classic HyperLogLog round trip preserves bytes and estimate stability. | After inserting `100_000` unique values, verifies serialized payload is non-empty, `deserialize -> reserialize` bytes are identical, and estimate drift is within `0.02 * max(original_est, 1.0)`. |
| `hll_ertl_round_trip_serialization` | ErtlMLE HyperLogLog round trip preserves bytes and estimate stability. | Applies the same `100_000`-value serialization round-trip checks: non-empty bytes, byte-for-byte reserialization equality, and bounded estimate drift. |
| `hllds_round_trip_serialization` | HIP HyperLogLog round trip preserves bytes and estimate stability. | Applies the same `100_000`-value serialization round-trip checks for `HyperLogLogHIP`: non-empty bytes, byte-for-byte reserialization equality, and bounded estimate drift. |
| `hyperloglog_p12_round_trip_serialization` | P12 Classic HyperLogLog round trip preserves bytes and estimate stability. | Applies the same `100_000`-value serialization round-trip checks for `HyperLogLogP12<Classic>`: non-empty bytes, byte-for-byte reserialization equality, and bounded estimate drift. |
| `hll_ertl_p12_round_trip_serialization` | P12 ErtlMLE HyperLogLog round trip preserves bytes and estimate stability. | Applies the same `100_000`-value serialization round-trip checks for `HyperLogLogP12<ErtlMLE>`: non-empty bytes, byte-for-byte reserialization equality, and bounded estimate drift. |
| `hllds_p12_round_trip_serialization` | P12 HIP HyperLogLog round trip preserves bytes and estimate stability. | Applies the same `100_000`-value serialization round-trip checks for `HyperLogLogHIPP12`: non-empty bytes, byte-for-byte reserialization equality, and bounded estimate drift. |
| `hll_correctness_test` | Register update logic matches expected bucket/index behavior for all HLL variants. | Runs fixed hashed inserts against Classic, ErtlMLE, and HIP variants; asserts exact expected register values at specific bucket indices and confirms an untouched bucket remains zero. |
### KLL
Test file: [`src/sketches/kll.rs`](../src/sketches/kll.rs)
| `coin_bit_cache_behavior` | Coin consumes cached random bits in deterministic bit order. | From a fixed seed, validates 3 successive 64-bit RNG blocks are consumed bit-by-bit (`0..63`) before refill, matching expected xorshift-derived bits exactly. |
| `coin_state_never_zero` | Coin state is never zero, including zero-seed initialization. | Verifies `Coin::from_seed(0)` normalizes to non-zero state and remains non-zero across 128 tosses. |
| `distributions_quantiles_stay_within_rank_error` | KLL quantiles stay within 2% rank tolerance across distributions and scales. | For `k=200`, checks quantiles `{0,0.1,0.25,0.5,0.75,0.9,1}` on Uniform (`0..100,000,000`) and Zipf (`1,000,000..10,000,000`, domain `8192`, exponent `1.1`) streams at sizes `[1_000, 5_000, 20_000, 100_000, 1_000_000, 5_000_000]`; each estimate must fall within truth interval defined by `q +/- 0.02`. |
| `test_data_input_api` | DataInput numeric API is accepted and non-numeric input is rejected. | Inserts `I32`, `I64`, `F64`, `F32`, and `U32` values, checks median query lies between `20.0` and `40.2`, and verifies string input returns error `KLL sketch only accepts numeric inputs`. |
| `test_forced_compact` | Small-capacity KLL triggers compaction and keeps median in valid compacted outcomes. | With `KLL::init(3,3)` and inserts `[10,20,30,40,50]`, asserts median query is one of `{30.0, 40.0}` under forced compaction. |
| `test_no_compact` | Larger-capacity KLL avoids compaction for small stream and returns exact median. | With `KLL::init_kll(8)` and inserts `[10,20,30,40,50]`, asserts median query equals `30.0`. |
| `merge_preserves_quantiles_within_tolerance` | Merging two KLL sketches preserves quantiles within 2% rank tolerance. | Splits 10,000 uniform samples (`1,000,000..10,000,000`, seed `0xC0FFEE`) across two `k=200` sketches by index parity, merges, and checks quantiles `{0,0.1,0.25,0.5,0.75,0.9,1}` remain within `q +/- 0.02` truth bounds. |
| `cdf_handles_empty_sketch` | Empty KLL CDF queries return zero-valued defaults. | For empty `KLL::init_kll(64)`, asserts `cdf.quantile(123.0) == 0.0`, `cdf.query(0.5) == 0.0`, and `cdf.query_li(0.5) == 0.0`. |
| `kll_round_trip_rmp` | RMP round trip preserves KLL structure, packed data, and queried quantiles. | Serializes/deserializes `KLL::init_kll(256)` after 5,000 uniform updates (`0..1,000,000`, seed `0xDEAD_BEEF`), verifies non-empty bytes, core fields and packed arrays (`levels`, `items`) are identical, and CDF queries at `{0,0.1,0.25,0.5,0.75,0.9,1}` match within `f64::EPSILON`. |
| `generic_kll_i64_sanity` | Generic `KLL<T>` path works for non-`f64` numeric types. | Builds `KLL<i64>`, inserts `1..=20_000` through the typed `update(&T)` API, checks approximate count and p50/p90 quantiles, verifies merge on two `KLL<i64>` instances, and confirms MessagePack round-trip preserves weighted count. |
### KLLDynamic
Test file: [`src/sketches/kll_dynamic.rs`](../src/sketches/kll_dynamic.rs)
| `distributions_quantiles_stay_within_rank_error` | KLLDynamic quantiles stay within 2% rank tolerance across distributions and scales. | For `k=200`, checks quantiles `{0,0.1,0.25,0.5,0.75,0.9,1}` on Uniform (`0..100,000,000`) and Zipf (`1,000,000..10,000,000`, domain `8192`, exponent `1.1`) streams at sizes `[1_000, 5_000, 20_000, 100_000, 1_000_000, 5_000_000]`; each estimate must fall within truth interval defined by `q +/- 0.02`. |
| `test_data_input_api` | `KLLDynamic<f64>` accepts numeric `DataInput` and rejects non-numeric input. | Inserts `I32`, `I64`, `F64`, `F32`, and `U32` values through `update_data_input`, checks median query lies between `20.0` and `40.2`, and verifies string input returns error `KLL sketch only accepts numeric inputs`. |
| `test_forced_compact` | Small-capacity KLLDynamic triggers compaction and keeps median in valid compacted outcomes. | With `KLLDynamic::init(3,3)` and typed inserts `[10,20,30,40,50]`, asserts median query is one of `{30.0, 40.0}` under forced compaction. |
| `test_no_compact` | Larger-capacity KLLDynamic avoids compaction for small stream and returns exact median. | With `KLLDynamic::init_kll(8)` and typed inserts `[10,20,30,40,50]`, asserts median query equals `30.0`. |
| `merge_preserves_quantiles_within_tolerance` | Merging two KLLDynamic sketches preserves quantiles within 2% rank tolerance. | Splits 10,000 uniform samples (`1,000,000..10,000,000`, seed `0xC0FFEE`) across two `k=200` sketches by index parity, merges, and checks quantiles `{0,0.1,0.25,0.5,0.75,0.9,1}` remain within `q +/- 0.02` truth bounds. |
| `cdf_handles_empty_sketch` | Empty KLLDynamic CDF queries return zero-valued defaults. | For empty `KLLDynamic::<f64>::init_kll(64)`, asserts `cdf.quantile(123.0) == 0.0`, `cdf.query(0.5) == 0.0`, and `cdf.query_li(0.5) == 0.0`. |
| `kll_dynamic_round_trip_rmp` | RMP round trip preserves KLLDynamic structure, packed data, and queried quantiles. | Serializes/deserializes `KLLDynamic::init_kll(256)` after 5,000 uniform updates (`0..1,000,000`, seed `0xDEAD_BEEF`), verifies non-empty bytes, core fields and packed arrays (`levels`, `items`) are identical, and CDF queries at `{0,0.1,0.25,0.5,0.75,0.9,1}` match within `f64::EPSILON`. |
| `generic_kll_dynamic_i64_sanity` | Generic `KLLDynamic<T>` path works for non-`f64` numeric types. | Builds `KLLDynamic<i64>`, inserts `1..=20_000` through the typed `update(&T)` API, checks approximate count and p50/p90 quantiles, and confirms MessagePack round-trip preserves weighted count. |
### DDSketch
Test file: [`src/sketches/ddsketch.rs`](../src/sketches/ddsketch.rs)
| `insert_and_query_basic` | Basic insert/query preserves count semantics and quantile monotonicity. | Inserts mixed values `[0.0, -5.0, 1.0, 2.0, 3.0, 10.0, 50.0, 100.0, 1000.0]`, verifies non-positive values are ignored (`count == 7`), and checks queried quantiles at `{0.0, 0.5, 0.9, 0.99, 1.0}` are monotone and bounded by sketch min/max. |
| `empty_quantile_returns_none` | Empty sketch returns no quantiles and zero count. | For a new `DDSketch(alpha=0.01)`, asserts `get_value_at_quantile` returns `None` for `p in {0.0, 0.5, 1.0}` and `get_count() == 0`. |
| `dds_uniform_distribution_quantiles` | Uniform-distribution quantiles stay within configured relative error. | With `alpha=0.01`, samples sizes `[1_000, 5_000, 20_000]` from uniform range `[1_000_000, 10_000_000]` (seeded), and requires relative error `<= 0.01` at quantiles `{0, 0.1, 0.25, 0.5, 0.75, 0.9, 1}` against sorted-truth quantiles. |
| `dds_zipf_distribution_quantiles` | Zipf-distribution quantiles stay within configured relative error. | With `alpha=0.01`, samples sizes `[1_000, 5_000, 20_000]` from Zipf range `[1_000_000, 10_000_000]` (domain `8192`, exponent `1.1`, seeded), and requires relative error `<= 0.01` at quantiles `{0, 0.1, 0.25, 0.5, 0.75, 0.9, 1}`. |
| `dds_normal_distribution_quantiles` | Normal-distribution quantiles stay within configured relative error. | With `alpha=0.01`, samples sizes `[1_000, 5_000, 20_000]` from normal distribution (`mean=1000.0`, `std=100.0`, positive finite values retained), and requires relative error `<= 0.01` at quantiles `{0, 0.1, 0.25, 0.5, 0.75, 0.9, 1}`. |
| `dds_exponential_distribution_quantiles` | Exponential-distribution quantiles stay within near-1% relative error. | With `alpha=0.01`, `lambda=1e-3`, and sample sizes `[1_000, 5_000, 20_000]`, requires relative error `<= 0.011 + 1e-9` at quantiles `{0, 0.1, 0.25, 0.5, 0.75, 0.9, 1}`. |
| `merge_two_sketches_combines_counts_and_bounds` | Merge combines counts and preserves quantile boundary invariants. | Merges sketches built from `[1,2,3,4]` and `[5,10,20]`, then verifies merged `count=7`, `min=1`, `max=20`, exact boundary quantiles (`q0=1`, `q1=20`), and median lies within `[1,20]`. |
| `dds_serialization_round_trip` | Serialization round trip preserves count, bounds, and selected quantiles. | Serializes/deserializes a populated sketch (`alpha=0.01`), verifies non-empty bytes, equal `count/min/max`, and exact quantile matches at `{0.0, 0.1, 0.5, 0.9, 1.0}`. |
### CMSHeap
Test file: [`src/sketches/countminsketch_topk.rs`](../src/sketches/countminsketch_topk.rs)
| `insert_and_estimate` | Repeated inserts increment Count-Min estimate for one key. | Inserts `"hello"` 5 times into `CMSHeap::new(3,64,10)` and verifies `estimate("hello") == 5`. |
| `heap_tracks_top_k` | Heap keeps highest-frequency keys within top-k capacity. | Inserts keys `1..5` with frequencies `10,20,30,40,50` into `top_k=3` sketch and verifies heap counts are exactly `[30,40,50]`. |
| `merge_reconciles_heaps` | Merge combines counters and refreshes heap counts from merged sketch. | Merges two sketches containing `"merge_key"` counts `10` and `20`, then verifies merged estimate and heap count are both `30`. |
| `insert_many_updates_estimate_and_heap` | `insert_many` updates estimate and heap entry consistently. | Calls `insert_many("many", 11)` and verifies `estimate == 11` plus heap entry count `11`. |
| `bulk_insert_updates_multiple_keys` | `bulk_insert` updates multiple keys and heavy-hitter counts correctly. | Inserts stream `[7,8,7,9,7]` and verifies estimates `7->3`, `8->1`, `9->1`, with heap count for key `7` equal to `3`. |
| `clear_heap_keeps_cms_counters` | Clearing heap does not clear CMS counters. | After `insert_many("persist",5)`, calls `clear_heap()`, verifies estimate remains `5`, then one more insert rebuilds heap entry to `6`. |
| `from_storage_uses_storage_dimensions` | `from_storage` preserves backend dimensions and requested heap capacity. | Builds from `Vector2D::init(4,128)` with `top_k=9` and verifies `rows=4`, `cols=128`, `heap.capacity=9`. |
| `merge_refreshes_existing_self_heap_entries` | Merge refreshes pre-existing self heap keys to merged estimates. | After merging sketches with `a-key` counts `10` and `5`, verifies merged `a-key` estimate `15` and heap entry count `15`. |
| `fast_path_insert_and_estimate` | Fast path repeated inserts keep estimate exact for single key. | Inserts `"fast"` 7 times into fast-path sketch and verifies estimate `7`. |
| `fast_path_insert_many_and_bulk_insert` | Fast path batched APIs keep heap and estimate in sync. | Applies `insert_many("fast-many",6)` plus bulk inserts adding 2 more hits, then verifies estimate and heap count are `8`. |
| `fast_path_heap_tracks_top_k` | Fast path heap still preserves top-k ordering under weighted updates. | Inserts keys `1..5` with counts `10,20,30,40,50` via `insert_many` and verifies heap counts `[30,40,50]`. |
| `fast_path_merge_refreshes_existing_self_heap_entries` | Fast path merge refreshes self heap entries using merged totals. | Merges sketches where `"a-fast"` contributes `10` and `5` across sides, then verifies estimate and heap count are `15`. |
| `default_construction` | Default CMSHeap constructor uses expected dimensions and heap capacity. | Verifies `CMSHeap::<Vector2D<i64>, RegularPath>::default()` has `rows=3`, `cols=4096`, and `heap.capacity=DEFAULT_TOP_K`. |
| `default_construction_fixed_backends_parity` | Default constructors across storage backends keep intended size/capacity contracts. | Verifies defaults for Fixed/Quick backends are `5x2048`, DefaultMatrix backends are `3x4096`, and all regular/fast variants use `DEFAULT_TOP_K`. |
| `merge_requires_matching_dimensions_panics` | Merge panics on incompatible sketch dimensions. | Verifies merging `CMSHeap::new(3,256,4)` with `CMSHeap::new(4,256,4)` panics with dimension-mismatch message. |
| `heap_entries_match_cms_estimates_after_mutations` | Every heap entry count matches current CMS estimate after updates and merge. | Checks heap-entry equality to `estimate(key)` both before and after merging another mutated sketch. |
| `bulk_insert_equivalent_to_repeated_insert` | Bulk insert is equivalent to repeated single inserts. | Compares bulk vs repeated insertion on same stream and verifies identical per-key estimates and heap counts for keys `1..5`. |
| `regular_vs_fast_equivalence_on_same_stream` | Regular and fast wrappers agree on identical deterministic stream. | Feeds same 10-item string stream to both paths and verifies per-key estimates and heap counts match for `{alpha,beta,gamma,delta,epsilon}`. |
| `merge_with_empty_other_and_empty_self` | Merge behavior is stable when one side is empty. | Verifies merging non-empty with empty leaves counts unchanged and merging empty-self with non-empty copies counts/heap visibility correctly. |
| `duplicate_candidate_keys_during_merge_do_not_corrupt_heap` | Duplicate merge candidates do not duplicate heap entries. | Merges sketches both containing `"dup"`; verifies merged count `19`, heap size within capacity, and exactly one heap entry for `"dup"`. |
| `zipf_stream_top_k_recall_regular_fast_budget` | Regular path heap achieves high top-k recall on Zipf stream. | On Zipf stream (`rows=3`, `cols=4096`, `top_k=16`, `domain=1024`, `exponent=1.1`, `N=20_000`), verifies heap size bound, entry-count consistency, and recall hits `>= 15` vs truth top-16. |
| `zipf_stream_top_k_recall_fast_path_fast_budget` | Fast path heap achieves high top-k recall on Zipf stream. | Runs same Zipf setup in fast mode and verifies heap size bound, entry-count consistency, and recall hits `>= 15`. |
| `zipf_stream_regular_fast_heap_overlap` | Regular and fast heaps substantially overlap on Zipf heavy hitters. | On shared Zipf stream (`top_k=16`), verifies key overlap ratio between regular and fast top-k heaps is at least `0.8`. |
### CSHeap
Test file: [`src/sketches/countsketch_topk.rs`](../src/sketches/countsketch_topk.rs)
| `insert_and_estimate` | Repeated inserts increment `Count` estimate for one key. | Inserts `"hello"` 5 times into `CSHeap::new(5,256,10)` and verifies estimate is `5.0` within `1e-9`. |
| `heap_tracks_top_k` | Heap keeps highest-frequency keys within top-k capacity. | Inserts keys `1..5` with frequencies `100,200,300,400,500` into `top_k=3` sketch and verifies heap counts are exactly `[300,400,500]`. |
| `merge_reconciles_heaps` | Merge combines counters and refreshes heap counts from merged sketch. | Merges two sketches containing `"merge_key"` counts `10` and `20`, then verifies estimate is `30.0` and heap count is `30`. |
| `insert_many_updates_estimate_and_heap` | `insert_many` updates estimate and heap entry consistently. | Calls `insert_many("many", 17)` and verifies estimate `17.0` plus heap entry count equals estimated count. |
| `bulk_insert_updates_multiple_keys` | `bulk_insert` updates multiple keys and heavy-hitter counts correctly. | Inserts stream `[7,8,7,9,7]`, verifies estimate for key `7` is `3.0`, and heap count for key `7` matches estimate cast to integer. |
| `clear_heap_keeps_cs_counters` | Clearing heap does not clear `Count` counters. | After `insert_many("persist",5)`, calls `clear_heap()`, verifies estimate remains `5.0`, then one more insert repopulates heap with updated estimate count. |
| `from_storage_uses_storage_dimensions` | `from_storage` preserves backend dimensions and requested heap capacity. | Builds from `Vector2D::init(4,128)` with `top_k=9` and verifies `rows=4`, `cols=128`, `heap.capacity=9`. |
| `merge_refreshes_existing_self_heap_entries` | Merge refreshes pre-existing self heap keys to merged estimates. | Merges sketches where `"a-key"` is updated on both sides (`120` and `40`), then verifies heap count for `"a-key"` equals merged estimate. |
| `fast_path_insert_and_estimate` | Fast path repeated inserts keep estimate exact for single key. | Inserts `"fast"` 7 times into fast-path sketch and verifies estimate is `7.0` within `1e-9`. |
| `fast_path_insert_many_and_bulk_insert` | Fast path batched APIs keep heap and estimate in sync. | Applies `insert_many("fast-many",6)` plus bulk inserts adding 2 hits, then verifies estimate is `8.0` and heap count matches it. |
| `fast_path_heap_tracks_top_k` | Fast path heap preserves top-k ordering under weighted updates. | Inserts keys `1..5` with counts `100,200,300,400,500` via `insert_many` and verifies heap counts `[300,400,500]`. |
| `fast_path_merge_refreshes_existing_self_heap_entries` | Fast path merge refreshes self heap entries using merged totals. | Merges fast sketches where `"a-fast"` is updated on both sides (`120` and `40`) and verifies heap count equals merged estimate. |
| `default_construction` | Default CSHeap constructor uses expected dimensions and heap capacity. | Verifies `CSHeap::<Vector2D<i64>, RegularPath>::default()` has `rows=3`, `cols=4096`, and `heap.capacity=DEFAULT_TOP_K`. |
| `default_construction_fixed_backends_parity` | Default constructors across storage backends keep intended size/capacity contracts. | Verifies defaults for Fixed/Quick backends are `5x2048`, DefaultMatrix backends are `3x4096`, and all regular/fast variants use `DEFAULT_TOP_K`. |
| `merge_requires_matching_dimensions_panics` | Merge panics on incompatible sketch dimensions. | Verifies merging `CSHeap::new(5,256,4)` with `CSHeap::new(6,256,4)` panics with dimension-mismatch message. |
| `heap_entries_match_cs_estimates_after_mutations` | Every heap entry count matches current sketch estimate after updates and merge. | Checks heap-entry equality to `estimate(key)` both before and after merging another mutated sketch. |
| `bulk_insert_equivalent_to_repeated_insert` | Bulk insert is equivalent to repeated single inserts. | Compares bulk vs repeated insertion on same stream and verifies per-key estimates match within `1e-9` plus identical heap counts for keys `1..5`. |
| `regular_vs_fast_equivalence_on_same_stream` | Regular and fast wrappers agree on identical deterministic stream. | Feeds same 10-item string stream to both paths and verifies per-key estimates match within `1e-9` and heap counts match for `{alpha,beta,gamma,delta,epsilon}`. |
| `merge_with_empty_other_and_empty_self` | Merge behavior is stable when one side is empty. | Verifies merging non-empty with empty leaves estimates/heap size unchanged and merging empty-self with non-empty reproduces estimates and heap visibility. |
| `duplicate_candidate_keys_during_merge_do_not_corrupt_heap` | Duplicate merge candidates do not duplicate heap entries. | Merges sketches both containing `"dup"`; verifies heap count equals merged estimate, heap size stays within capacity, and only one heap entry exists for `"dup"`. |
| `zipf_stream_top_k_recall_regular_fast_budget` | Regular path heap achieves high top-k recall on Zipf stream. | On Zipf stream (`rows=5`, `cols=4096`, `top_k=16`, `domain=1024`, `exponent=1.1`, `N=20_000`), verifies heap size bound, entry-count consistency, and recall hits `>= 15` vs truth top-16. |
| `zipf_stream_top_k_recall_fast_path_fast_budget` | Fast path heap achieves high top-k recall on Zipf stream. | Runs same Zipf setup in fast mode and verifies heap size bound, entry-count consistency, and recall hits `>= 15`. |
| `zipf_stream_regular_fast_heap_overlap` | Regular and fast heaps substantially overlap on Zipf heavy hitters. | On shared Zipf stream (`top_k=16`), verifies key overlap ratio between regular and fast top-k heaps is at least `0.8`. |
### Elastic
Test file: [`src/sketches/elastic.rs`](../src/sketches/elastic.rs)
| `heavy_bucket_tracks_repeated_flow_exactly` | Heavy bucket tracks repeated flow exactly. | Top-K/heavy-hitter tracking and updates behave as expected. |
| `light_sketch_counts_colliding_flows` | Light sketch counts colliding flows. | Core functional behavior for this component path is validated. |
### Coco
Test file: [`src/sketches/coco.rs`](../src/sketches/coco.rs)
| `insert_then_estimate_matches_full_value_for_partial_key` | Insert then estimate matches full value for partial key. | Core behavior for insert/query/update and deterministic semantics is validated. |
| `estimate_with_udf_allows_custom_partial_matching` | Estimate with udf allows custom partial matching. | Core behavior for insert/query/update and deterministic semantics is validated. |
| `merge_combines_tables_without_losing_counts` | Merge combines tables without losing counts. | Merge behavior preserves expected aggregate semantics and internal invariants. |
### KMV
Test file: [`src/sketches/kmv.rs`](../src/sketches/kmv.rs)
| `assert_accuracy` | Assert accuracy. | Accuracy/error behavior stays within expected bounds on representative workloads. |
| `assert_merge_accuracy` | Assert merge accuracy. | Merge behavior preserves expected aggregate semantics and internal invariants. |
| `assert_serialization_round_trip` | Assert serialization round trip. | Serialization/deserialization preserves component state and behavior after round trip. |
### UniformSampling
Test file: [`src/sketches/uniform.rs`](../src/sketches/uniform.rs)
| `sample_count_tracks_rate` | Sample count tracks rate. | Core behavior for insert/query/update and deterministic semantics is validated. |
| `samples_are_drawn_from_input_stream` | Samples are drawn from input stream. | Core behavior for insert/query/update and deterministic semantics is validated. |
| `merge_combines_samples_using_rate_based_target` | Merge combines samples using rate based target. | Merge behavior preserves expected aggregate semantics and internal invariants. |
| `merge_rejects_different_rates` | Merge rejects different rates. | Merge behavior preserves expected aggregate semantics and internal invariants. |
| `sample_access_is_stable` | Sample access is stable. | Core behavior for insert/query/update and deterministic semantics is validated. |
## Sketch Frameworks
### Hydra
Test file: [`src/sketch_framework/hydra.rs`](../src/sketch_framework/hydra.rs)
| `hydra_updates_countmin_frequency` | Hydra updates countmin frequency. | Updates `"user;session"` with value `"event"` 5 times and verifies combined query `>= 5` while an unrelated key query is exactly `0.0`. |
| `hydra_updates_countmin_frequency_multiple_values` | Hydra updates countmin frequency multiple values. | Inserts values `I64(0..4)` with multiplicity `i` under one key, verifies per-value fan-out query `>= i`, and checks unrelated-key query returns `0.0`. |
| `hydra_round_trip_serialization` | Hydra round trip serialization. | After mixed inserts, verifies MessagePack round trip keeps non-empty payload, preserves dimensions/template type, and keeps queried frequencies exactly unchanged. |
| `multihead_hydra_updates_multiple_dimensions` | Multihead hydra updates multiple dimensions. | With two heads (`events`, `latency`), repeated updates make full-key and fan-out frequency queries for each head return at least `3.0`. |
| `hydra_subpopulation_frequency_test` | Hydra subpopulation frequency test. | On a fixed labeled dataset, asserts exact subpopulation frequencies for single-label, multi-label, full-key, and disjoint cross-population queries (including zero-result case). |
| `hydra_subpopulation_cardinality_test` | Hydra subpopulation cardinality test. | Using HLL-backed counters, checks single/multi/full-key cardinalities are approximately `3.0` (within `EPSILON`) and disjoint/unknown keys return `0.0`. |
| `hydra_tracks_kll_quantiles` | Hydra tracks KLL quantiles. | For inserted samples `[10,20,30,40,50]`, verifies CDF query at `30.0` is `0.6` (within `1e-9`) and empty-bucket query returns `0.0`. |
| `hydra_kll_single_label_cdfs` | Hydra KLL single label cdfs. | For each label group, verifies exact expected CDF levels `{1/3, 2/3, 1}` at chosen thresholds using `EPSILON` tolerance. |
| `hydra_kll_multi_label_cdfs` | Hydra KLL multi label cdfs. | Verifies exact CDF values for multi-label combinations and confirms a non-overlapping key pair returns CDF `0.0`. |
| `hydra_kll_extreme_queries` | Hydra KLL extreme queries. | Confirms CDF boundary behavior (`0` below range, `1` above range) for known keys and `0` for unknown keys. |
| `test_count_min_frequency_query` | Test count min frequency query. | Inserts one key three times into `HydraCounter::CM`, then verifies `Frequency` query succeeds and returns exactly `3.0`. |
| `test_count_min_invalid_query_types` | Test count min invalid query types. | Verifies unsupported CM queries return errors, including exact message for `Quantile` (`"Count-Min Sketch Counter does not support Quantile Query"`). |
| `test_hll_cardinality_query` | Test HLL cardinality query. | Inserts `100` unique items plus one duplicate and verifies `Cardinality` query succeeds with estimate constrained to `(90.0, 110.0)`. |
| `test_kll_quantile_query` | Test KLL quantile query. | Inserts values `1..=100` and verifies median query succeeds with estimate within `+/-5` of `50.0`. |
| `test_univmon_universal_queries` | Test univmon universal queries. | Inserts `A` 10 times and `B` 20 times, then checks `L1=30.0`, cardinality is approximately `2.0` (`abs err < 0.5`), and entropy is positive. |
| `test_merge_counters` | Test merge counters. | Merges two CM counters and verifies frequency sum (`2.0`) for shared key, then confirms merging with mismatched counter type (`HLL`) returns error. |
| `test_count_frequency_query` | Test count frequency query. | Inserts one `Count` key four times and verifies `Frequency` query succeeds with exact result `4.0`. |
| `test_count_invalid_query_types` | Test count invalid query types. | Verifies unsupported `Count` queries fail, including exact `Quantile` error message and error on `Cardinality`. |
### HashSketchEnsemble
Test file: [`src/sketch_framework/hashlayer.rs`](../src/sketch_framework/hashlayer.rs)
| `test_insert_and_estimate` | Insert and frequency estimate accuracy on Zipf stream. | On Zipf stream (`N=10_000`, `domain=1000`, `exp=1.5`), builds default 2-sketch ensemble (CMS + Count, `3x4096`) and verifies average relative error for both CMS (index `0`) and Count (index `1`) frequency estimates over sampled keys is below `0.1`. |
| `test_insert_at` | Targeted insert updates only specified indices. | Inserts only at index `[0]` via `insert_at`, then verifies CMS at index `0` has a positive estimate while Count at index `1` returns `0.0`. |
| `test_insert_with_hash_matches_insert` | Pre-computed hash insert matches regular insert. | Builds two identical ensembles; one uses `insert()`, the other uses `hash_input()` + `insert_with_hash()`. Verifies CMS estimates at index `0` are identical for a probe key. |
| `test_hll_cardinality` | HLL-only ensemble cardinality accuracy. | Builds HLL-only ensemble (`HyperLogLog<ErtlMLE>`), inserts Zipf stream, and verifies cardinality relative error vs true distinct count is `< 0.02`. |
| `test_estimate_on_hll_returns_error` | Frequency query on HLL sketch returns error. | Calls `estimate()` on an HLL-only ensemble and verifies it returns `Err`. |
| `test_cardinality_on_cms_returns_error` | Cardinality query on CMS sketch returns error. | Calls `cardinality()` on a CMS+Count ensemble and verifies it returns `Err`. |
| `test_direct_access` | Index-based get/get_mut and sketch type reporting. | Verifies `get(0)` and `get(1)` return `Some`, `get(2)` returns `None`, and `get_mut(0)` reports `sketch_type() == "CountMin"`. |
| `test_bounds_checking` | Out-of-bounds queries return errors. | Confirms `estimate(999, ...)`, `cardinality(999)`, and `estimate_with_hash(999, ...)` all return `Err`. |
| `test_custom_dimensions` | Custom-dimension ensemble insert and estimate. | Builds 2-sketch ensemble (CMS + Count, `5x2048`), verifies `len=2`/non-empty, inserts Zipf stream, and confirms both indices return positive estimates. |
| `test_mixed_matrix_and_hll` | Mixed CMS + HLL ensemble queries. | Builds ensemble with one CMS and one `HyperLogLog<ErtlMLE>`, inserts Zipf stream, verifies CMS estimate at index `0` is positive and HLL cardinality error at index `1` is `< 0.05`. |
| `test_push_compatible` | Push compatible sketch succeeds. | Creates single-CMS ensemble (`3x4096`), pushes a Count sketch with matching dimensions, verifies `push` returns `Ok` and `len=2`. |
| `test_push_incompatible_rejected` | Push incompatible sketch is rejected. | Creates single-CMS ensemble (`3x4096`), pushes a Count sketch with different dimensions (`5x2048`), verifies `push` returns `Err`. |
### UnivMon
Test file: [`src/sketch_framework/univmon.rs`](../src/sketch_framework/univmon.rs)
| `univmon_round_trip_serialization` | Univmon round trip serialization. | After weighted inserts, verifies non-empty serialization and round-trip preservation of configuration fields, `bucket_size`, `L1/L2/entropy` (`<1e-6` drift), and cardinality (`< EPSILON` drift). |
| `update_populates_bucket_size_and_heavy_hitters` | Update populates bucket size and heavy hitters. | Inserting one hot key `40` times sets `bucket_size=40`, tracks key in heavy-hitter heap with count `>=20`, and yields exact `L1=40` and `cardinality=1`. |
| `merge_with_combines_heavy_hitters` | Merge with combines heavy hitters. | Merging sketches with disjoint heavy keys verifies merged left heap contains both contributions (`left=25`, `right=30`) while right heap retains `right=30`. |
| `univmon_layers_use_different_seeds` | Univmon layers use different seeds. | Verifies hash outputs for the same key with seed indices `0..3` are all pairwise different. |
| `univmon_cardinality_is_positive` | Univmon cardinality is positive. | After inserting `20` distinct flow keys, cardinality estimate is exactly `20.0`. |
| `univmon_bucket_size_tracked_correctly` | Univmon bucket size tracked correctly. | Inserts counts `100`, `200`, `150` for three flows and verifies `bucket_size` equals total `450`. |
| `univmon_basic_operation` | Univmon basic operation. | On fixed mixed workload, verifies exact aggregate metrics `cardinality=10.0` and `L1=131.0`. |
| `test_statistical_accuracy` | Test statistical accuracy. | On heavy/medium/noise synthetic distribution, verifies relative error for both `L2` and `entropy` is below `0.15`. |
| `univmon_random_data_matches_ground_truth_within_five_percent` | Univmon random data matches ground truth within five percent. | Over `10_000` random weighted updates, requires relative error `<= 0.05` for `cardinality`, `L1`, `L2`, and `entropy` against exact truth map. |
### UnivMon Optimized
Test file: [`src/sketch_framework/univmon_optimized.rs`](../src/sketch_framework/univmon_optimized.rs)
| `pool_basic_take_put` | Pool basic take put. | Validates pool accounting: initial preallocation (`available=2`, `allocated=2`), on-demand allocation when empty (`allocated=3`), and reuse on put/take without further allocation. |
| `pool_free_resets_sketch` | Pool free resets sketch. | Confirms returning a used sketch to pool resets state so retaken sketch has `bucket_size=0` and near-zero `L2` in layer `0`. |
| `pyramid_basic_insert_and_query` | Pyramid basic insert and query. | For simple inserts, verifies exact aggregate state with `bucket_size=65`, `L1` approximately `65` (`<1e-6`), and `cardinality=3`. |
| `pyramid_fast_insert_matches_standard` | Pyramid fast insert matches standard. | On identical 500-item stream, verifies standard vs fast paths keep identical `bucket_size`, with `L1` deviation `<10%` and cardinality deviation `<15%`. |
| `pyramid_two_tier_dimensions` | Pyramid two tier dimensions. | Verifies two-tier layout metadata for configured pyramid (`layer_size=8`, `elephant_layers=4`). |
| `pyramid_free_resets_state` | Pyramid free resets state. | After bulk inserts, `free()` resets sketch to empty baseline (`bucket_size=0`, layer-0 `L2` approximately `0`). |
| `pyramid_merge_combines_data` | Pyramid merge combines data. | Merging disjoint halves verifies merged `L1` stays within `10%` of the sum of pre-merge `L1` values. |
| `pyramid_accuracy_zipf` | Pyramid accuracy Zipf. | On heavy/medium/light Zipf-like workload, requires relative error `<15%` for `L1`, `L2`, cardinality, and entropy. |
| `pyramid_fast_insert_accuracy` | Pyramid fast insert accuracy. | Using `fast_insert` only, requires relative error `<15%` for `L1`, `L2`, cardinality, and entropy versus exact frequency map. |
| `pyramid_memory_savings_vs_uniform` | Pyramid memory savings vs uniform. | Verifies pyramid column budget is smaller than uniform baseline and computed memory savings exceed `30%`. |
### NitroBatch
Test file: [`src/sketch_framework/nitro.rs`](../src/sketch_framework/nitro.rs)
| `nitro_batch_countmin_error_bound_zipf` | Nitro batch countmin error bound Zipf. | On Zipf stream (`rows=3`, `cols=4096`, `N=200_000`), verifies CountMin estimates satisfy in-bound key count `> (1-delta)*distinct` using `epsilon=e/cols`, `delta=e^-rows`, and bound `epsilon*N`. |
| `nitro_batch_count_error_bound_zipf` | Nitro batch count error bound Zipf. | Applies the same probabilistic in-bound criterion to `Count` median estimates with `epsilon=e/cols`, `delta=e^-rows`, and bound `epsilon*N`. |
### ExponentialHistogram
Test file: [`src/sketch_framework/eh.rs`](../src/sketch_framework/eh.rs)
| `constructor_infers_merge_norm` | Constructor infers merge norm. | Verifies constructor infers `SketchNorm::L1` for CM payload and `SketchNorm::L2` for `COUNTL2HH` payload. |
| `l1_merge_invariant_same_size` | L1 merge invariant same size. | Under repeated updates with `k=2`, verifies L1 merge policy compacts buckets so `bucket_count < 10`. |
| `l2_merge_invariant_sum_l22` | L2 merge invariant sum l22. | With `k=1` and weighted updates, verifies L2 merge rule keeps bucket count bounded (`bucket_count <= 2`). |
| `merge_recomputes_l2_mass` | Merge recomputes L2 mass. | After L2 merges, verifies bounded bucket count (`<=2`) and non-negative recomputed `l2_mass` for every payload bucket. |
| `test_basic_insertion_and_query` | Test basic insertion and query. | After one update at `t=100`, verifies single bucket presence, exact min/max timestamps (`100`), and successful interval merge query for `[100,100]`. |
### EHSketchList
Test file: [`src/sketch_framework/eh_sketch_list.rs`](../src/sketch_framework/eh_sketch_list.rs)
| `insert_routes_to_countl2hh_and_univmon` | Insert routes to countl2hh and univmon. | Verifies variant routing by checking `COUNTL2HH` estimate `>=9` after 9 inserts and `UNIVMON` `bucket_size=6` after 6 inserts. |
| `count_sketch_insert_and_query_round_trip` | Count insert and query round trip. | Confirms the `Count` variant updates/query path by inserting one key and verifying returned estimate is at least `1.0`. |
| `ddsketch_insert_and_quantile_query_round_trip` | DDSketch insert and quantile query round trip. | Inserts `10,20,30` into DDSketch variant and verifies queried median (`q=0.5`) lies within `[10.0, 30.0]`. |
| `supports_norm_whitelist_is_enforced` | Supports norm whitelist is enforced. | Validates norm capability matrix: `CM/CS/DDS` support `L1` only, while `COUNTL2HH/UNIVMON` support `L2` only. |
### EHUnivOptimized
Test file: [`src/sketch_framework/eh_univ_optimized.rs`](../src/sketch_framework/eh_univ_optimized.rs)
| `basic_insertion_and_query` | Basic insertion and query. | For updates `{(1,5),(2,3),(1,2)}` across `[100,102]`, verifies map-tier result with exact counts (`1->7`, `2->3`, `total=10`) plus `L1=10` and `cardinality=2`. |
| `map_merge_bounds_volume` | Map merge bounds volume. | With `k=1` and 50 one-count updates, verifies merge policy bounds growth so `bucket_count < 50`. |
| `promotion_creates_sketch_buckets` | Promotion creates sketch buckets. | Under small promotion thresholds and many distinct updates, verifies at least one map bucket is promoted (`um_buckets` becomes non-empty). |
| `window_expiration` | Window expiration. | With `window=100`, advancing to `t=200` after earlier inserts confirms expiration by forcing oldest surviving `min_time` to recent range (`>=100` or `==200`). |
| `hybrid_query_returns_sketch` | Hybrid query returns sketch. | After forcing both map and sketch tiers to coexist, verifies interval query spanning both returns `EHUnivQueryResult::Sketch` (not map-only). |
| `cover_check` | Cover check. | Verifies coverage logic transitions from false (empty) to true for contained intervals and remains false when query extends outside observed range. |
| `accuracy_known_distribution` | Accuracy known distribution. | On fixed known histogram, verifies query estimates for `L1`, `L2`, cardinality, and entropy each stay within `10%` relative error. |
| `pool_used_during_promotion` | Pool used during promotion. | With bounded preallocated pool, promotion workload verifies sketch-tier creation and confirms pool allocation accounting remains active (`total_allocated >= 2`). |
| `correctness_map_only_exact` | Correctness map only exact. | For map-only regime, verifies `L1/L2/cardinality/entropy` each match exact truth within `1%` tolerance. |
| `correctness_subinterval_query` | Correctness subinterval query. | For two-phase stream, verifies full-interval query recovers `L1` approximately `200` and `cardinality` approximately `2` within `5%` tolerance. |
| `correctness_expired_data_excluded` | Correctness expired data excluded. | After sliding beyond window cutoff, verifies very old segment is excluded by checking earliest retained bucket time is at least `50`. |
| `correctness_volume_bounded_long_stream` | Correctness volume bounded long stream. | Over `20_000` updates with `k=4`, verifies EH volume bound by requiring maximum observed bucket count `< 200`. |
| `correctness_pool_recycling_across_cycles` | Correctness pool recycling across cycles. | Long-run expiration/promotion cycling keeps pool bounded (`total_allocated < 50`) and still returns valid interval query results. |
| `correctness_sketch_merge_preserves_metrics` | Correctness sketch merge preserves metrics. | After repeated promotions/merges, verifies each sketch bucket has positive `L2^2` and stored `l22` stays within `1%` relative difference of recomputed value. |
| `accuracy_zipf_distribution_sketch_tier` | Accuracy Zipf distribution sketch tier. | On heavy/medium/light Zipf-like stream in sketch tier, requires `L1/L2/cardinality/entropy` relative errors each `<= 15%`. |
| `accuracy_uniform_distribution` | Accuracy uniform distribution. | On uniform stream, requires `L1/L2/cardinality/entropy` relative errors each `<= 10%`. |
| `accuracy_sliding_window` | Accuracy sliding window. | Across suffix and periodic sliding-window queries, verifies average relative error for `L1`, `L2`, cardinality, and entropy is each below `15%`. |
| `accuracy_varies_with_k` | Accuracy varies with K. | For `k in {2,8,32}`, verifies per-k average of `L1/L2` relative errors remains under `15%` on same fixed stream/window. |
| `accuracy_suffix_queries` | Accuracy suffix queries. | Across suffix lengths `[1000,2000,5000,8000]`, verifies worst observed `L2` relative error remains below `20%`. |
| `accuracy_distribution_shift` | Accuracy distribution shift. | For two-phase distribution shift stream, verifies full-span `L1/L2/cardinality/entropy` estimates each stay within `15%` relative error. |
## Common
### Common Hash Utilities
Test file: [`src/common/hash.rs`](../src/common/hash.rs)
| `hash128_seeded_preserves_cardinality` | Hash128 seeded preserves cardinality. | With `SEED_IDX=0` and `SAMPLE_SIZE=5000`, verifies uniform and Zipf sample unique-input counts exactly match unique-hash counts (no observed collisions). |
| `hash128_seeded_is_deterministic_for_repeated_inputs` | Hash128 seeded is deterministic for repeated inputs. | For fixed key `"deterministic-key"` and seed `3`, verifies 100 repeated `hash128_seeded` calls always equal the first hash value. |
### Common Heap Utilities
Test file: [`src/common/heap.rs`](../src/common/heap.rs)
| `heap_retains_top_k_items_by_count` | Heap retains top K items by count. | For `HHHeap::new(3)` updated with counts `1..5`, verifies heap size is `3` and retained counts are exactly `[3,4,5]`. |
| `update_count_increments_existing_entry` | Update count increments existing entry. | Repeatedly updates key `alpha` with counts `1,2,3` and verifies stored heap entry count is `3` (incremental update, not replacement). |
| `clean_resets_heap_state` | Clean resets heap state. | After inserting two items into `HHHeap::new(2)`, `clear()` is verified to leave the heap empty. |
| `test_min_heap_basic` | Test min heap basic. | For `CommonHeap::<i32, KeepSmallest>::new_min(5)`, verifies `peek=1` and pop order `1,3,5,7`, then `None`. |
| `test_max_heap_basic` | Test max heap basic. | For `CommonHeap::<i32, KeepLargest>::new_max(5)`, verifies `peek=7` and pop order `7,5,3,1`, then `None`. |
| `test_bounded_heap_capacity` | Test bounded heap capacity. | With min-heap capacity `3`, verifies length never exceeds 3 and final retained values are `[5,7,10]` after pushing `5,3,7,1,10`. |
| `test_update_at` | Test update at. | After mutating an internal element (`heap[1]=3`) and calling `update_at(1)`, verifies heap root updates so `peek()` becomes `3`. |
| `test_custom_struct_with_ord` | Test custom struct with ord. | Uses `HHItem` values with counts `5,3,7` and verifies min-heap ordering by checking root count is `3`. |
| `test_topk_use_case` | Test topk use case. | Simulated top-k flow keeps only counts `[3,4,5]` at capacity 3 and verifies lookup for `key-4` succeeds with count `4`. |
| `test_heap_size` | Test heap size. | Verifies both `CommonHeap<u64, KeepSmallest>` and `CommonHeap<u64, KeepLargest>` sizes equal `size_of::<Vec<u64>>() + size_of::<usize>()`. |
| `test_topk_with_custom_comparator` | Test topk with custom comparator. | With custom comparator and capacity 3, verifies low-count insert is rejected/replaced as expected so heap size is 3 and root count is `5`. |
| `test_exact_topk_heap_replacement` | Test exact topk heap replacement. | Reproduces TopK-style find/update flow for keys `1..5`, verifies retained counts `[3,4,5]`, finds `key-4` with count `4`, then verifies `clear()` makes heap empty. |
### Common Structure Utilities
Test file: [`src/common/structure_utils.rs`](../src/common/structure_utils.rs)
| `median_test` | Median test. | For 1,000 seeded random arrays of lengths 3, 4, and 5, verifies `compute_median_inline_f64` exactly matches sort-based median for every case. |
### Vector2D (Common Structure)
Test file: [`src/common/structures/vector2d.rs`](../src/common/structures/vector2d.rs)
| `required_bits_match_expected_thresholds` | Required bits match expected thresholds. | Verifies `Vector2D::get_required_bits()` returns `64` for `(3,4096)`, `32` for `(3,64)`, and `128` for `(5,1_048_576)`. |