# Starshard
English | 简体中文
Starshard: a high-performance, lazily sharded concurrent HashMap for Rust.
Sync + Async + Optional Rayon + Optional Serde
## Status
Early stage. API may still evolve (semantic stability prioritized; minor naming changes possible).
## Motivation

You often need a fast concurrent map:

- A standard single `RwLock<HashMap<..>>` becomes contended under mixed read/write load.
- Fully lock-free or CHM (concurrent hash map) designs can add memory and complexity cost.
- Sharding with lazy initialization offers a pragmatic middle ground.
Starshard focuses on:

- Minimal uncontended overhead.
- Lazy shard allocation (memory proportional to actually touched shards).
- Atomic cached length.
- Snapshot iteration (parallel with `rayon`).
- Symmetric sync / async APIs.
- Extensible design (future: rebalancing, eviction, metrics).
## Features

| Feature | Description | Notes |
|---|---|---|
| `async` | Adds `AsyncShardedHashMap` (Tokio `RwLock`) | Independent of `rayon` |
| `rayon` | Parallel snapshot flatten for large iteration | Used internally; API unchanged |
| `serde` | `Serialize`/`Deserialize` (sync) + async snapshot helper | Hasher not persisted |
| (none) | Pure sync core | Lowest dependency surface |
Enable all features in the docs.rs build via:

```toml
[package.metadata.docs.rs]
all-features = true
```
## Installation

```toml
[dependencies]
starshard = { version = "0.5", features = ["async", "rayon", "serde"] }
# or minimal:
# starshard = "0.5"
```
`serde_json` (tests / examples):

```toml
[dev-dependencies]
serde_json = "1"
```
## Quick Start (Sync)

```rust
use starshard::ShardedHashMap;
use rustc_hash::FxBuildHasher;

// Key/value types and literals are illustrative; get() returns a cloned
// value (values must be Clone, see Safety / Concurrency below).
let map: ShardedHashMap<String, u32, FxBuildHasher> = ShardedHashMap::new();
map.insert("alpha".to_string(), 1);
assert_eq!(map.get(&"alpha".to_string()), Some(1));
assert_eq!(map.len(), 1);
```
## Custom Hasher (defense against adversarial keys)

```rust
use starshard::ShardedHashMap;
use std::collections::hash_map::RandomState;

// Shard count and key/value are illustrative; RandomState's seeded SipHash
// resists HashDoS-style adversarial keys.
let secure: ShardedHashMap<String, u32, RandomState> =
    ShardedHashMap::with_shards_and_hasher(32, RandomState::new());
secure.insert("key".to_string(), 42);
```
## Async Usage

A minimal sketch, assuming the async API mirrors the sync one (`AsyncShardedHashMap` behind the `async` feature, Tokio runtime); exact signatures may differ:
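```rust
use starshard::AsyncShardedHashMap;
use std::collections::hash_map::RandomState;

#[tokio::main]
async fn main() {
    // Constructor and method names mirror the sync API (assumed);
    // each shard sits behind a Tokio RwLock, so operations are awaited.
    let map: AsyncShardedHashMap<String, u32, RandomState> =
        AsyncShardedHashMap::new();
    map.insert("alpha".to_string(), 1).await;
    assert_eq!(map.get(&"alpha".to_string()).await, Some(1));
}
```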
## Parallel Iteration (rayon)

With the `rayon` feature enabled, snapshot flattening is parallelized internally and the calling code is unchanged (per the feature table). A sketch, assuming an `iter`-style snapshot method:
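```rust
use starshard::ShardedHashMap;
use rustc_hash::FxBuildHasher;

// Populate enough entries for parallel flattening to matter (100k+).
let map: ShardedHashMap<u64, u64, FxBuildHasher> = ShardedHashMap::new();
for i in 0..100_000u64 {
    map.insert(i, i * 2);
}

// With `rayon` enabled, the per-shard snapshots built here are flattened
// in parallel behind the scenes; the method name `iter` is an assumption.
let sum: u64 = map.iter().map(|(_k, v)| v).sum();
assert_eq!(sum, (0..100_000u64).map(|i| i * 2).sum());
```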
## Serde Semantics

Sync:

- Serialized shape: `{ "shard_count": usize, "entries": [[K,V], ...] }`.
- Hasher internal state is not preserved; it is recreated with `S::default()`.
- Requirements: `K: Eq + Hash + Clone + Serialize + Deserialize`, `V: Clone + Serialize + Deserialize`, `S: BuildHasher + Default + Clone`.

Async:

- No direct `Serialize`; call `async_snapshot_serializable()` to obtain a serializable snapshot.
- To reconstruct: create a new async map and bulk insert.
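A sync round-trip sketch using `serde_json` (type parameters are illustrative; `RandomState` satisfies the `Default` requirement):

```rust
use starshard::ShardedHashMap;
use std::collections::hash_map::RandomState;

let map: ShardedHashMap<String, u32, RandomState> = ShardedHashMap::new();
map.insert("alpha".to_string(), 1);

// Serializes to the documented shape: { "shard_count": ..., "entries": [...] }.
let json = serde_json::to_string(&map).unwrap();

// On deserialization the hasher is recreated via S::default(), as noted above.
let restored: ShardedHashMap<String, u32, RandomState> =
    serde_json::from_str(&json).unwrap();
assert_eq!(restored.get(&"alpha".to_string()), Some(1));
```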
## Consistency Model

- Per-shard ops are linearizable with respect to that shard.
- Global iteration builds a per-shard snapshot as each shard lock is taken (not a fully atomic global view).
- `len()` is maintained atomically (updated on structural insert/remove only).
- Iteration concurrent with writes may omit inserts performed after a shard's snapshot was captured.
## Performance Notes (Indicative)

| Scenario | Observation (relative) |
|---|---|
| Read-heavy mixed workload vs global `RwLock<HashMap>` | Reduced contention |
| Large snapshot iteration with `rayon` (100k+ entries) | 3-4x speedup in flattening |
| Sparse shard usage | Only touched shards allocate |
Benchmark with your own key/value distribution and CPU topology.
## Safety / Concurrency

- No nested multi-shard lock ordering -> avoids deadlocks.
- Each shard holds a single `RwLock`; iteration snapshots avoid long-lived global blocking.
- Values must be cloned (trades memory for contention isolation).
- Not lock-free: intense write focus on a single shard can still serialize.
## Limitations
- No dynamic shard rebalancing.
- No eviction / TTL.
- Snapshot iteration allocates intermediate vectors.
- Hasher state not serialized.
- No lock-free progress guarantees.
## Roadmap (Potential)
- Optional adaptive shard expansion / rebalancing.
- Per-shard eviction strategies (LRU / segmented).
- Metrics hooks (pre/post op instrumentation).
- Batched multi-insert API.
- Zero-copy or COW snapshot mode.
## Design Sketch

```text
Arc -> RwLock<Vec<Option<Arc<RwLock<HashMap<K,V,S>>>>>> + AtomicUsize (len)
```

The inner `Option` slot is lazily filled when the first key hashes into that shard.
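Rendered as a rough Rust type (field names are illustrative, not the crate's actual internals):

```rust
use std::collections::HashMap;
use std::sync::atomic::AtomicUsize;
use std::sync::{Arc, RwLock};

// Illustrative shape only, mirroring the sketch above.
struct Shards<K, V, S> {
    // The outer lock guards the shard table; each slot starts as None and
    // is lazily filled with its own Arc<RwLock<HashMap>> on first use.
    table: RwLock<Vec<Option<Arc<RwLock<HashMap<K, V, S>>>>>>,
    // Cached length, updated atomically on structural insert/remove.
    len: AtomicUsize,
}

// The public handle wraps the inner state in an Arc (per the sketch).
type SketchHandle<K, V, S> = Arc<Shards<K, V, S>>;
```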
## Examples Summary

| Goal | Snippet |
|---|---|
| Basic sync | see Quick Start |
| Async insert/get | see Async Usage |
| Parallel iterate | enable `rayon` |
| Serde snapshot (sync) | `serde_json::to_string(&map)` |
| Async serde snapshot | `async_snapshot_serializable()` |
| Custom hasher | `with_shards_and_hasher(..)` |
## License
Dual license: MIT OR Apache-2.0 (choose either).
## Contribution

PRs welcome: focus on correctness (tests), simplicity, and documentation clarity.

Run (one typical invocation):
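```sh
cargo test --all-features
```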
## Minimal Example (All Features)

A sketch combining all three features, assuming the APIs shown above (Tokio runtime; `async_snapshot_serializable()` from the Examples Summary); exact signatures may differ:
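```rust
use starshard::{AsyncShardedHashMap, ShardedHashMap};
use std::collections::hash_map::RandomState;

#[tokio::main]
async fn main() {
    // Sync map; with `rayon` enabled, snapshot iteration flattens in parallel.
    let sync_map: ShardedHashMap<String, u32, RandomState> = ShardedHashMap::new();
    sync_map.insert("a".to_string(), 1);

    // Serde (sync): serializes shard_count + entries, as documented above.
    println!("{}", serde_json::to_string(&sync_map).unwrap());

    // Async map (Tokio RwLock per shard); awaited ops assumed to mirror sync.
    let async_map: AsyncShardedHashMap<String, u32, RandomState> =
        AsyncShardedHashMap::new();
    async_map.insert("b".to_string(), 2).await;
    assert_eq!(async_map.get(&"b".to_string()).await, Some(2));

    // Serializable snapshot of the async map (helper name from the Examples
    // Summary; assumed here to be async and to yield a Serialize-able value).
    let snapshot = async_map.async_snapshot_serializable().await;
    println!("{}", serde_json::to_string(&snapshot).unwrap());
}
```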
## Disclaimer
Benchmarks and behavior notes are indicative only; validate under production load patterns.