Starshard

Starshard: a high-performance, lazily sharded concurrent HashMap for Rust.

Sync + Async + Optional Rayon + Optional Serde


Status

Early stage. API may still evolve (semantic stability prioritized; minor naming changes possible).

Motivation

You often need a fast concurrent map:

  • A single global RwLock<HashMap<..>> becomes contended under mixed read/write load.
  • Fully lock-free or concurrent-hashmap designs add memory and complexity costs.
  • Sharding with lazy initialization offers a pragmatic middle ground.

Starshard focuses on:

  1. Minimal uncontended overhead.
  2. Lazy shard allocation (memory proportional to actually touched shards).
  3. Atomic cached length.
  4. Snapshot iteration (parallel if rayon).
  5. Symmetric sync / async APIs.
  6. Extensible design (future: rebalancing, eviction, metrics).

Features

Feature   Description                                            Notes
async     Adds AsyncShardedHashMap (Tokio RwLock)                Independent of rayon
rayon     Parallel snapshot flatten for large iterations         Used internally; API unchanged
serde     Serialize/Deserialize (sync) + async snapshot helper   Hasher not persisted
(none)    Pure sync core                                         Lowest dependency surface

Enable all in docs.rs via:

[package.metadata.docs.rs]
all-features = true

Installation

[dependencies]
starshard = { version = "0.7", features = ["async", "rayon", "serde"] }
# or minimal:
# starshard = "0.7"

serde_json (tests / examples):

[dev-dependencies]
serde_json = "1"

Quick Start (Sync)

use starshard::ShardedHashMap;
use rustc_hash::FxBuildHasher;

let map: ShardedHashMap<String, i32, FxBuildHasher> = ShardedHashMap::new(64);
map.insert("a".into(), 1);
assert_eq!(map.get(&"a".into()), Some(1));
assert_eq!(map.len(), 1);

Custom Hasher (defense against adversarial keys)

use starshard::ShardedHashMap;
use std::collections::hash_map::RandomState;

let secure = ShardedHashMap::<String, u64, RandomState>::with_shards_and_hasher(128, RandomState::default());
secure.insert("k".into(), 7);

Async Usage

#[cfg(feature = "async")]
#[tokio::main]
async fn main() {
    use starshard::AsyncShardedHashMap;
    let m: AsyncShardedHashMap<String, u32> = AsyncShardedHashMap::new(64);
    m.insert("x".into(), 42).await;
    assert_eq!(m.get(&"x".into()).await, Some(42));
}

Parallel Iteration (rayon)

#[cfg(feature = "rayon")]
{
    use starshard::ShardedHashMap;
    let m: ShardedHashMap<String, u32> = ShardedHashMap::new(32);
    for i in 0..50_000 {
        m.insert(format!("k{i}"), i);
    }
    let count = m.iter().count(); // internal parallel flatten
    assert_eq!(count, 50_000);
}

Serde Semantics

Sync:

  • Serialized shape: { "shard_count": usize, "entries": [[K,V], ...] }.
  • Hasher internal state not preserved; recreated with S::default().
  • Requirements: K: Eq + Hash + Clone + Serialize + Deserialize, V: Clone + Serialize + Deserialize, S: BuildHasher + Default + Clone.

Async:

  • No direct Serialize; call:

#[cfg(all(feature = "async", feature = "serde"))]
{
    let snap = async_map.async_snapshot_serializable().await;
    let json = serde_json::to_string(&snap).unwrap();
}

  • To reconstruct: create a new async map and bulk insert.

Consistency Model

  • Per-shard operations are linearizable with respect to that shard.
  • Global iteration builds a per-shard snapshot as each shard lock is taken (not a fully atomic global view).
  • len() is maintained atomically (structural insert/remove only).
  • Iteration after concurrent writes may omit late inserts performed after a shard snapshot was captured.
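The last point can be seen in a small std-only sketch (hypothetical stand-in shards, not starshard's API): an entry inserted into a shard after that shard was snapshotted is simply missed by the in-flight iteration, even though len() already reflects it.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

// Two stand-in "shards"; snapshot them one at a time, with a write landing
// in between, to show iteration is per-shard consistent but not globally atomic.
fn snapshot_with_interleaved_write() -> usize {
    let shards = [RwLock::new(HashMap::new()), RwLock::new(HashMap::new())];
    shards[0].write().unwrap().insert("a", 1);

    let mut snapshot: Vec<(&str, i32)> = Vec::new();
    // Shard 0 is snapshotted and its lock released...
    snapshot.extend(shards[0].read().unwrap().iter().map(|(k, v)| (*k, *v)));
    // ...then a (simulated) concurrent writer inserts into shard 0...
    shards[0].write().unwrap().insert("late", 2);
    // ...before shard 1 is snapshotted.
    snapshot.extend(shards[1].read().unwrap().iter().map(|(k, v)| (*k, *v)));

    snapshot.len() // the "late" entry is not seen by this iteration
}
```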

Performance Notes (Indicative)

Scenario                                             Observation (relative)
Read-heavy mixed workload vs global RwLock<HashMap>  Reduced contention
Large snapshot iteration with rayon (100k+ entries)  Roughly 3-4x faster flattening
Sparse shard usage                                   Only touched shards allocate

Do benchmark with your own key/value distribution and CPU topology.


Safety / Concurrency

  • No nested multi-shard locking, which avoids lock-ordering deadlocks.
  • One RwLock per shard; iteration snapshots avoid long-lived global blocking.
  • Reads return cloned values (trading memory for contention isolation).
  • Not lock-free: heavy writes focused on one shard still serialize.

Limitations

  • No dynamic shard rebalancing.
  • No eviction / TTL.
  • Snapshot iteration allocates intermediate vectors.
  • Hasher state not serialized.
  • No lock-free progress guarantees.

Roadmap (Potential)

  • Optional adaptive shard expansion / rebalancing.
  • Per-shard eviction strategies (LRU / segmented).
  • Metrics hooks (pre/post op instrumentation).
  • Batched multi-insert API.
  • Zero-copy or COW snapshot mode.

Version 0.8.0 Features (Conditional Operations & Batch Operations)

Conditional Operations (method-based; one shard lock per call)

use starshard::ShardedHashMap;

let map: ShardedHashMap<String, i32> = ShardedHashMap::new(64);

// Update only if key exists; single shard lock
map.compute_if_present(&"counter".into(), |v| Some(v + 1));

// Insert only if key absent; single shard lock
let val = map.compute_if_absent("new_key".into(), || 42);

// Conditional deletion: returning None removes the entry
map.compute_if_present(&"key".into(), |_v| None);

Why not Entry API?

  • In a sharded context, returning a mutable reference is infeasible: holding a shard lock across an await point can deadlock async code.
  • Conditional operations provide the same expressiveness with better performance and simpler semantics.
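To illustrate why the closure pattern suffices, here is a minimal sketch of a compute_if_present-style conditional update over a single std RwLock-guarded shard (a hypothetical helper, not starshard's source): one lock acquisition, and the closure's return value selects update, removal, or no-op.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::sync::RwLock;

// Sketch of a conditional update over one shard. Returns true if the key
// was present (and was therefore updated or removed).
fn compute_if_present<K, V, F>(shard: &RwLock<HashMap<K, V>>, key: &K, f: F) -> bool
where
    K: Hash + Eq + Clone,
    F: FnOnce(&V) -> Option<V>,
{
    // Single lock acquisition; no guard is ever held across an await point.
    let mut guard = shard.write().unwrap();
    let decision = guard.get(key).map(f); // run the closure on the current value
    match decision {
        Some(Some(new_v)) => { guard.insert(key.clone(), new_v); true } // update
        Some(None) => { guard.remove(key); true }                       // remove
        None => false,                                                  // absent
    }
}
```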

Batch Operations

// Batch insert (amortizes shard lock acquisition)
let entries = vec![("a".into(), 1), ("b".into(), 2), ("c".into(), 3)];
let inserted = map.batch_insert(entries);

// Batch remove
let removed = map.batch_remove(vec!["a".into(), "b".into()]);

// Batch get
let keys: Vec<String> = vec!["a".into(), "b".into(), "c".into()];
let results = map.batch_get(&keys);
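The amortization works by grouping entries by shard index before touching any lock, so each shard's write lock is taken at most once per batch. A std-only sketch of the idea (hypothetical helper, not the crate's code):

```rust
use std::collections::hash_map::RandomState;
use std::collections::HashMap;
use std::hash::BuildHasher;
use std::sync::RwLock;

// Group entries by shard index first, then take each touched shard's write
// lock once and apply its whole sub-batch. Returns the number of new keys.
fn batch_insert<V>(
    shards: &[RwLock<HashMap<String, V>>],
    hasher: &RandomState, // the same hasher must route all other operations
    entries: Vec<(String, V)>,
) -> usize {
    let mut grouped: Vec<Vec<(String, V)>> = (0..shards.len()).map(|_| Vec::new()).collect();
    for (k, v) in entries {
        let idx = (hasher.hash_one(&k) as usize) % shards.len();
        grouped[idx].push((k, v));
    }
    let mut inserted = 0;
    for (idx, batch) in grouped.into_iter().enumerate() {
        if batch.is_empty() {
            continue; // untouched shards are never locked
        }
        let mut guard = shards[idx].write().unwrap(); // one lock per touched shard
        for (k, v) in batch {
            if guard.insert(k, v).is_none() {
                inserted += 1;
            }
        }
    }
    inserted
}
```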

Collection Views & Filtering

// Iterate all keys
let all_keys: Vec<_> = map.keys().collect();

// Iterate all values
let all_values: Vec<_> = map.values().collect();

// Standard iteration
for (k, v) in map.iter() {
    println!("{}: {}", k, v);
}

// Retention filtering: keep only entries where v > 10
map.retain(|_k, v| *v > 10);

Shard Introspection

// Get shard distribution statistics
let stats = map.shard_stats();
println!("Total slots: {}", stats.total);
println!("Initialized shards: {}", stats.initialized);
println!("Avg load: {:.2}", stats.avg_load);

// Get utilization percentage
let util = map.shard_utilization();
println!("Utilization: {:.1}%", util);

Version 0.9.0 Features (Eviction, Metrics, Advanced Iteration)

Planned features:

  • TTL & Eviction: LRU/LFU/custom eviction policies with background cleanup
  • Metrics Hooks: Production observability with hit rates, latency tracking
  • Advanced Iteration: Filter + limit + parallel control builders
  • Drain Operations: Efficient bulk removal

See ROADMAP.md for detailed specification.


Version 1.0.0 Features (Transactions, CAS, Replication)

Planned features:

  • MVCC Transactions: Atomic multi-key operations with conflict detection
  • Compare-And-Swap: Lock-free coordination primitives
  • Copy-On-Write Snapshots: Minimal-contention read optimization
  • Distributed Replication: Quorum-based consistency framework
  • Lock Diagnostics: Per-shard contention profiling

See ROADMAP.md for detailed specification.


Feature Matrix

Feature                      v0.7  v0.8  v0.9  v1.0  Status
Sharded HashMap (sync)       ✓     ✓     ✓     ✓     Stable
Async (Tokio)                ✓     ✓     ✓     ✓     Stable
Parallel iteration (rayon)   ✓     ✓     ✓     ✓     Stable
Serde (de)serialization      ✓     ✓     ✓     ✓     Stable
Conditional operations       -     ✓     ✓     ✓     New in 0.8
Batch operations             -     ✓     ✓     ✓     New in 0.8
TTL/Eviction                 -     -     ✓     ✓     Planned for 0.9
Metrics                      -     -     ✓     ✓     Planned for 0.9
Transactions                 -     -     -     ✓     Planned for 1.0
CAS operations               -     -     -     ✓     Planned for 1.0
Replication                  -     -     -     ✓     Planned for 1.0

Design Sketch


Outer structure: RwLock<Vec<Option<Arc<RwLock<HashMap<K, V, S>>>>>> plus an AtomicUsize length cache.

A shard's inner Option slot is lazily filled when the first key hashes into that shard.
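A compile-checked miniature of that layout, using only std types (an illustrative sketch of the design described above, not starshard's actual source):

```rust
use std::collections::hash_map::RandomState;
use std::collections::HashMap;
use std::hash::{BuildHasher, Hash};
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, RwLock};

// Outer RwLock guards the slot vector; each slot lazily holds one shard.
struct MiniShardedMap<K, V> {
    shards: RwLock<Vec<Option<Arc<RwLock<HashMap<K, V>>>>>>,
    len: AtomicUsize, // cached length, maintained on structural insert/remove
    hasher: RandomState,
}

impl<K: Eq + Hash, V: Clone> MiniShardedMap<K, V> {
    fn new(shard_count: usize) -> Self {
        Self {
            shards: RwLock::new((0..shard_count).map(|_| None).collect()),
            len: AtomicUsize::new(0),
            hasher: RandomState::new(),
        }
    }

    // Resolve (and lazily create) the shard a key hashes into.
    fn shard_for(&self, key: &K) -> Arc<RwLock<HashMap<K, V>>> {
        let idx = (self.hasher.hash_one(key) as usize) % self.shards.read().unwrap().len();
        if let Some(shard) = &self.shards.read().unwrap()[idx] {
            return Arc::clone(shard); // fast path: shard already initialized
        }
        let mut slots = self.shards.write().unwrap();
        Arc::clone(slots[idx].get_or_insert_with(|| Arc::new(RwLock::new(HashMap::new()))))
    }

    fn insert(&self, key: K, value: V) -> Option<V> {
        let prev = self.shard_for(&key).write().unwrap().insert(key, value);
        if prev.is_none() {
            self.len.fetch_add(1, Ordering::Relaxed); // structural insert only
        }
        prev
    }

    fn get(&self, key: &K) -> Option<V> {
        self.shard_for(key).read().unwrap().get(key).cloned() // reads clone out
    }

    fn len(&self) -> usize {
        self.len.load(Ordering::Relaxed)
    }
}
```

Note how the fast path of shard_for takes only the outer read lock; the outer write lock is needed just once per shard, on first touch.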


Examples Summary

Goal                   Snippet
Basic sync             see Quick Start
Async insert/get       see Async Usage
Parallel iterate       enable rayon
Serde snapshot (sync)  serde_json::to_string(&map)
Async serde snapshot   async_snapshot_serializable()
Custom hasher          with_shards_and_hasher(..)

License

Dual license: MIT OR Apache-2.0 (choose either).


Contribution

PRs welcome: focus on correctness (tests), simplicity, and documentation clarity.
Run:

cargo clippy --all-features -- -D warnings
cargo test --all-features

Minimal Example (All Features)

use starshard::{ShardedHashMap, AsyncShardedHashMap};
#[cfg(feature = "async")]
#[tokio::main]
async fn main() {
    let sync_map: ShardedHashMap<u64, u64> = ShardedHashMap::new(32);
    sync_map.insert(1, 10);

    #[cfg(feature = "serde")]
    {
        let json = serde_json::to_string(&sync_map).unwrap();
        let _de: ShardedHashMap<u64, u64> = serde_json::from_str(&json).unwrap();
    }

    let async_map: AsyncShardedHashMap<u64, u64> = AsyncShardedHashMap::new(32);
    async_map.insert(2, 20).await;
}

Disclaimer

Benchmarks and behavior notes are indicative only; validate under production load patterns.