Expand description
Starshard: a high-performance, lazily sharded concurrent HashMap.
§Features
async
: enablesAsyncShardedHashMap
backed bytokio::sync::RwLock
.rayon
: enables parallel snapshot flattening inside iteration for large maps (sync + async).serde
: (sync) serialize/deserialize via a stable map snapshot; async map via snapshot helper.
§Serde Semantics
ShardedHashMap
:
- Serialized form: { shard_count: usize, entries: Vec<(K,V)> }.
- Hasher state is not preserved; deserialization rebuilds with
S::default()
. - Requires
K: Eq + Hash + Clone + Send + Sync + Serialize + Deserialize
,V: Clone + Send + Sync + Serialize + Deserialize
,S: BuildHasher + Clone + Send + Sync + Default
.
AsyncShardedHashMap
:
- No direct
Serialize
/Deserialize
(locks needawait
). - Use
async_snapshot_serializable().await
to obtain a snapshot wrapper implementingSerialize
. - To rebuild: create a new async map, then bulk insert entries.
§Design Goals
- Minimize contention via sharding (coarse dynamic set of RwLocks).
- Lazy shard materialization to reduce cold-start memory (slots are
None
until touched). - O(1) (amortized) length via atomic counter (fast cloning, no full scan).
- Parallel iteration using
rayon
if enabled (snapshots each shard then flattens). - Async version mirrors sync semantics; attempts optimistic
try_read
first to reduce await points. - Predictable memory layout leveraging
hashbrown::HashMap
and user-supplied hasher.
§Consistency Model
- Per-shard operations are linearizable with respect to that shard.
- Global iteration is snapshot-per-shard at the moment each shard lock is taken: You may see entries inserted/removed concurrently in other shards.
len()
is eventually consistent only in the trivial sense of atomic monotonic increments/decrements: It reflects completed inserts/removes; in-flight operations not yet applied are invisible.
§Thread / Task Safety
- Each shard guarded by a single RwLock (Std or Tokio).
- No nested acquisition of multiple shard locks (avoids lock order deadlocks).
- Atomic length update only after a structural insert/delete succeeds.
Clone
bounds onK
,V
needed for iteration snapshot flattening.
§Performance Notes (Indicative, not guaranteed)
- Read-heavy sync workloads: sharding reduces write interference vs a single map + RwLock.
rayon
speeds large aggregate scans (e.g. metrics dump, checkpoint) 3-4x on >100k elements.- Lazy shards: memory roughly proportional to number of distinct shard indices used.
§Hasher Choice
- Default
FxBuildHasher
: speed oriented (non-cryptographic). - For DoS / adversarial key defense use:
std::collections::hash_map::RandomState
. Example:use starshard::ShardedHashMap; use std::collections::hash_map::RandomState; let map: ShardedHashMap<String, u64, RandomState> = ShardedHashMap::with_shards_and_hasher(128, RandomState::default());
§Limitations
- No dynamic shard rebalancing / rehash across shards yet.
- No eviction / TTL (can be layered externally).
- Iteration allocates temporary vectors proportional to initialized shards (to snapshot).
- Not lock-free; large writer pressure can still cause convoying on hot shards.
§Future Extension Ideas
- Optional background shard growth / rebalancing.
- Configurable eviction: LRU per shard / clock / segmented queue.
- Metrics hooks (pre/post op).
- Batched mutation (multi-insert with single lock acquisition per target shard).
- Optional copy-on-write snapshots for near-zero iteration locking windows.
§Examples
Sync (default features):
use starshard::ShardedHashMap;
use rustc_hash::FxBuildHasher;
let map: ShardedHashMap<String, i32, FxBuildHasher> = ShardedHashMap::new(64);
map.insert("a".into(), 1);
assert_eq!(map.get(&"a".into()), Some(1));
assert_eq!(map.len(), 1);
Async (enable async
feature):
#[cfg(feature = "async")]
#[tokio::main]
async fn main() {
use starshard::AsyncShardedHashMap;
let map: AsyncShardedHashMap<String, i32> = AsyncShardedHashMap::new(64);
map.insert("k".into(), 7).await;
assert_eq!(map.get(&"k".into()).await, Some(7));
}
Parallel iteration (enable rayon
):
use starshard::ShardedHashMap;
let map: ShardedHashMap<String, u32> = ShardedHashMap::new(32);
for i in 0..10_000 {
map.insert(format!("k{i}"), i);
}
let count = map.iter().count(); // internally parallel if `rayon` feature active
assert_eq!(count, 10_000);
Async + Rayon (enable async,rayon
):
#[cfg(all(feature="async", feature="rayon"))]
#[tokio::main]
async fn main() {
use starshard::AsyncShardedHashMap;
let m: AsyncShardedHashMap<u32, u32> = AsyncShardedHashMap::new(64);
for i in 0..1000 { m.insert(i, i*i).await; }
let items = m.iter().await; // flattens in parallel internally
assert_eq!(items.len(), 1000);
}
Custom hasher (RandomState):
use starshard::ShardedHashMap;
use std::collections::hash_map::RandomState;
let secure: ShardedHashMap<String, i64, RandomState> =
ShardedHashMap::with_shards_and_hasher(64, RandomState::default());
secure.insert("x".into(), 1);
Structs§
- Async
Sharded Hash Map async
- Asynchronous sharded concurrent HashMap (Tokio
RwLock
). - Async
Sharded Hash MapSnapshot async
andserde
- A serializable snapshot of an
AsyncShardedHashMap
. - Sharded
Hash Map - Sharded concurrent HashMap (synchronous).
Constants§
- DEFAULT_
SHARDS - Default shard count (power-of-two not required; hashing modulo used).