Expand description
Starshard: a high-performance, lazily sharded concurrent HashMap.
§Features
async: enablesAsyncShardedHashMapbacked bytokio::sync::RwLock.rayon: enables parallel snapshot flattening inside iteration for large maps (sync + async).serde: (sync) serialize/deserialize via a stable map snapshot; async map via snapshot helper.
§Serde Semantics
ShardedHashMap:
- Serialized form: { shard_count: usize, entries: Vec<(K,V)> }.
- Hasher state is not preserved; deserialization rebuilds with
S::default(). - Requires
K: Eq + Hash + Clone + Send + Sync + Serialize + Deserialize,V: Clone + Send + Sync + Serialize + Deserialize,S: BuildHasher + Clone + Send + Sync + Default.
AsyncShardedHashMap:
- No direct
Serialize/Deserialize(locks needawait). - Use
async_snapshot_serializable().awaitto obtain a snapshot wrapper implementingSerialize. - To rebuild: create a new async map, then bulk insert entries.
§Design Goals
- Minimize contention via sharding (coarse dynamic set of RwLocks).
- Lazy shard materialization to reduce cold-start memory (slots are
Noneuntil touched). - O(1) (amortized) length via atomic counter (fast cloning, no full scan).
- Parallel iteration using
rayonif enabled (snapshots each shard then flattens). - Async version mirrors sync semantics; attempts optimistic
try_readfirst to reduce await points. - Predictable memory layout leveraging
hashbrown::HashMapand user-supplied hasher.
§Consistency Model
- Per-shard operations are linearizable with respect to that shard.
- Global iteration is snapshot-per-shard at the moment each shard lock is taken: You may see entries inserted/removed concurrently in other shards.
len()is eventually consistent only in the trivial sense of atomic monotonic increments/decrements: It reflects completed inserts/removes; in-flight operations not yet applied are invisible.
§Thread / Task Safety
- Each shard guarded by a single RwLock (Std or Tokio).
- No nested acquisition of multiple shard locks (avoids lock order deadlocks).
- Atomic length update only after a structural insert/delete succeeds.
Clonebounds onK,Vneeded for iteration snapshot flattening.
§Performance Notes (Indicative, not guaranteed)
- Read-heavy sync workloads: sharding reduces write interference vs a single map + RwLock.
rayonspeeds large aggregate scans (e.g. metrics dump, checkpoint) 3-4x on >100k elements.- Lazy shards: memory roughly proportional to number of distinct shard indices used.
§Hasher Choice
- Default
FxBuildHasher: speed oriented (non-cryptographic). - For DoS / adversarial key defense use:
std::collections::hash_map::RandomState. Example:use starshard::ShardedHashMap; use std::collections::hash_map::RandomState; let map: ShardedHashMap<String, u64, RandomState> = ShardedHashMap::with_shards_and_hasher(128, RandomState::default());
§Limitations
- No dynamic shard rebalancing / rehash across shards yet.
- No eviction / TTL (can be layered externally).
- Iteration allocates temporary vectors proportional to initialized shards (to snapshot).
- Not lock-free; large writer pressure can still cause convoying on hot shards.
§Future Extension Ideas
- Optional background shard growth / rebalancing.
- Configurable eviction: LRU per shard / clock / segmented queue.
- Metrics hooks (pre/post op).
- Batched mutation (multi-insert with single lock acquisition per target shard).
- Optional copy-on-write snapshots for near-zero iteration locking windows.
§Examples
Sync (default features):
use starshard::ShardedHashMap;
use rustc_hash::FxBuildHasher;
let map: ShardedHashMap<String, i32, FxBuildHasher> = ShardedHashMap::new(64);
map.insert("a".into(), 1);
assert_eq!(map.get(&"a".into()), Some(1));
assert_eq!(map.len(), 1);Async (enable async feature):
#[cfg(feature = "async")]
#[tokio::main]
async fn main() {
use starshard::AsyncShardedHashMap;
let map: AsyncShardedHashMap<String, i32> = AsyncShardedHashMap::new(64);
map.insert("k".into(), 7).await;
assert_eq!(map.get(&"k".into()).await, Some(7));
}Parallel iteration (enable rayon):
use starshard::ShardedHashMap;
let map: ShardedHashMap<String, u32> = ShardedHashMap::new(32);
for i in 0..10_000 {
map.insert(format!("k{i}"), i);
}
let count = map.iter().count(); // internally parallel if `rayon` feature active
assert_eq!(count, 10_000);Async + Rayon (enable async,rayon):
#[cfg(all(feature="async", feature="rayon"))]
#[tokio::main]
async fn main() {
use starshard::AsyncShardedHashMap;
let m: AsyncShardedHashMap<u32, u32> = AsyncShardedHashMap::new(64);
for i in 0..1000 { m.insert(i, i*i).await; }
let items = m.iter().await; // flattens in parallel internally
assert_eq!(items.len(), 1000);
}Custom hasher (RandomState):
use starshard::ShardedHashMap;
use std::collections::hash_map::RandomState;
let secure: ShardedHashMap<String, i64, RandomState> =
ShardedHashMap::with_shards_and_hasher(64, RandomState::default());
secure.insert("x".into(), 1);Re-exports§
pub use eviction::AtomicMetrics;lifecyclepub use eviction::DrainIterator;lifecyclepub use eviction::EvictionConfig;lifecyclepub use eviction::EvictionPolicy;lifecyclepub use eviction::IterBuilder;lifecyclepub use eviction::MemoryStats;lifecyclepub use eviction::MetricsStats;lifecyclepub use eviction::PerShardLoad;lifecyclepub use advanced::CasResult;advancedpub use advanced::CowSnapshot;advancedpub use advanced::IsolatedSnapshot;advancedpub use advanced::LockProfile;advancedpub use advanced::QuorumConfig;advancedpub use advanced::Replica;advancedpub use advanced::ReplicaError;advancedpub use advanced::ReplicationOp;advancedpub use advanced::Transaction;advancedpub use advanced::TransactionResult;advancedpub use advanced::TxnOp;advanced
Modules§
- advanced
advanced - Version 1.0.0 features: Transactions, CAS, replication, and diagnostics. Version 1.0.0: Advanced Transactions, CAS Operations, and Distributed Support
- eviction
lifecycle - Version 0.9.0 features: TTL, eviction, metrics, and advanced iteration. Version 0.9.0: Eviction, Metrics, and Advanced Iteration Support
Structs§
- Async
Sharded Hash Map async - Asynchronous sharded concurrent HashMap (Tokio
RwLock). - Async
Sharded Hash MapSnapshot asyncandserde - A serializable snapshot of an
AsyncShardedHashMap. - Shard
Stats - Statistics about shard distribution and utilization.
- Sharded
Hash Map - Sharded concurrent HashMap (synchronous).
Constants§
- DEFAULT_
SHARDS - Default shard count (power-of-two not required; hashing modulo used).