Crate geo_filters

This crate implements probabilistic data structures that solve the Distinct Count Problem using geometric filters. Two variants are implemented, which differ in the way new elements are added to the filter:

  • GeoDiffCount adds elements through symmetric difference, so an element can be added and later removed again. It supports estimating the size of the symmetric difference of two sets with a precision that is related to the estimated size rather than to the size of the union of the original sets.
  • GeoDistinctCount adds elements through union. Elements can be added; duplicates are ignored. It supports estimating the size of the union of two sets with a precision related to the estimated size. It has properties similar to related filters such as HyperLogLog and MinHash, but uses less space.
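
The difference between the two variants may be easier to see with exact sets. The sketch below does not use this crate's API: it relies only on std::collections::HashSet, and the helper names (toggle, symmetric_difference_size, union_size) are hypothetical, chosen for illustration. It computes exactly the quantities that the geometric filters estimate probabilistically in far less space.

```rust
use std::collections::HashSet;

/// Exact stand-in for diff-count semantics (hypothetical helper, not a crate API):
/// adding an element that is already present removes it again, which is the
/// symmetric difference of the set with a singleton.
fn toggle(set: &mut HashSet<u64>, item: u64) {
    // `insert` returns false when the item was already present.
    if !set.insert(item) {
        set.remove(&item);
    }
}

/// Exact size of the symmetric difference of two sets.
fn symmetric_difference_size(a: &HashSet<u64>, b: &HashSet<u64>) -> usize {
    a.symmetric_difference(b).count()
}

/// Exact size of the union of two sets.
fn union_size(a: &HashSet<u64>, b: &HashSet<u64>) -> usize {
    a.union(b).count()
}

fn main() {
    // Diff-count semantics: the second occurrence of 2 cancels the first.
    let mut diff: HashSet<u64> = HashSet::new();
    for item in [1, 2, 3, 2] {
        toggle(&mut diff, item);
    }
    println!("diff semantics: {} elements", diff.len()); // prints 2

    // Distinct-count semantics: the duplicate 2 is ignored.
    let mut distinct: HashSet<u64> = HashSet::new();
    for item in [1, 2, 3, 2] {
        distinct.insert(item);
    }
    println!("distinct semantics: {} elements", distinct.len()); // prints 3

    // Combining two sets under each semantics.
    let a: HashSet<u64> = [1, 2, 3].into_iter().collect();
    let b: HashSet<u64> = [3, 4].into_iter().collect();
    println!("symmetric difference: {}", symmetric_difference_size(&a, &b)); // prints 3
    println!("union: {}", union_size(&a, &b)); // prints 4
}
```

An exact set needs memory proportional to the number of elements it holds; the filters in this crate trade that exactness for a compact representation with a bounded estimation error.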

Modules§

  • Geometric filter configuration types.
  • Geometric filter implementation for diff count.
  • Geometric filter implementation for distinct count.

Structs§

  • Indicates a diff count estimation, which allows addition and removal of items, and combines values using symmetric difference.
  • Indicates a distinct count estimation, which allows addition of items, and combines values using union.

Traits§

  • Trait for types solving the set cardinality estimation problem.
  • Marker trait to indicate the variant implemented by a Count instance.