Expand description
§hyperloglockless
Lightning fast concurrent HyperLogLog for Rust.
§Overview
HyperLogLogs are a space efficient data structures for the “count-distinct problem”, approximating the number of distinct elements in a multiset. Paper.
hyperloglockless is a lockless concurrent HyperLogLog, making it a direct replacement for RwLock<OtherHyperLogLog<K, V>>
. To accomplish these goals, all methods take &self
instead of modifying methods taking &mut self
. This allows you to put a HyperLogLog in an Arc<T>
and share it between threads while still being able to modify it. HyperLogLog puts great effort into performance and aims to be as fast as possible, even in single threaded scenarios.
hyperloglockless is also extremely accurate–it is able to count billions of items with 99.5% accuracy using only 65kb of memory.
§Usage
# Cargo.toml
[dependencies]
hyperloglockless = "0.1.0"
Basic usage:
use hyperloglockless::HyperLogLog;
let hll = HyperLogLog::new(16);
hll.insert("42");
hll.insert("🦀");
let count = hll.count();
§Performance
§Accuracy
hyperloglockless stays accurate while other implementations break down after millions of items.
§Available Features
-
rand
- Enabled by default, this has theDefaultHasher
source its random state usingthread_rng()
instead of hardware sources. Getting entropy from a user-space source is considerably faster, but requires additional dependencies to achieve this. Disabling this feature by usingdefault-features = false
makesDefaultHasher
source its entropy usinggetrandom
, which will have a much simpler code footprint at the expense of speed. -
serde
-HyperLogLog
s implementSerialize
andDeserialize
when possible.
§License
Licensed under either of
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
§Contribution
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
Structs§
- Hyper
LogLog - HyperLogLog is a data structure for the “count-distinct problem”, approximating the number of distinct elements in a multiset.
Enums§
Type Aliases§
- Default
Hasher - The default hasher for
crate::HyperLogLog
.