Crate small_hash_map

Crate small_hash_map 

Source
Expand description

§SmallHashMap

A hash map optimized for small collections with automatic stack-to-heap transition.

§Motivation

In many programs, hash maps are created with only a few entries. The standard HashMap always allocates on the heap, which involves:

  1. Memory allocation syscalls
  2. Hash table bucket management
  3. Poor cache locality for small datasets

For maps that typically hold 4-16 entries, linear search through a contiguous array is often faster than hash-based lookup due to cache efficiency.

SmallHashMap solves this by:

  1. Starting with a stack-allocated InlineMap for small collections
  2. Automatically transitioning to a heap-allocated HeapMap when capacity is exceeded
  3. Supporting any hasher via the generic S parameter (defaults to RandomState, same as std::HashMap)

This provides optimal performance across the full range of collection sizes.

§Use Cases

  • Game engines: Per-entity components, particle attributes, input bindings — millions of small maps without heap thrashing
  • Trading systems: Order parameters, market data snapshots — zero allocation in hot paths
  • Request handlers: HTTP headers, session attributes — created and destroyed per request
  • Compilers/interpreters: AST node metadata, symbol tables per scope
  • Embedded structs: When you have millions of objects each containing a small map

§API Overview

use small_hash_map::SmallHashMap;

// Create a map with inline capacity of 8
let mut map: SmallHashMap<&str, i32, 8> = SmallHashMap::new();

// Standard map operations
map.insert("one", 1);
map.insert("two", 2);
map.insert("three", 3);

assert_eq!(map.get(&"two"), Some(&2));
assert_eq!(map.len(), 3);
assert!(map.contains_key(&"one"));

// Iteration
for (key, value) in map.iter() {
    println!("{}: {}", key, value);
}

// Remove entries
map.remove(&"two");
assert_eq!(map.get(&"two"), None);

// Automatic transition to HeapMap when exceeding capacity
for i in 0..20 {
    map.insert(Box::leak(format!("key{}", i).into_boxed_str()), i);
}
// Now using HeapMap internally, but API remains the same

§Pre-sizing for Large Collections

use small_hash_map::SmallHashMap;

// If you know you'll exceed inline capacity, start with HeapMap
let map: SmallHashMap<String, i32, 8> = SmallHashMap::with_capacity(100);
// Starts directly with HeapMap, avoiding transition overhead

§Custom Hashers

Like std::collections::HashMap, SmallHashMap supports custom hashers:

use small_hash_map::SmallHashMap;
use std::collections::hash_map::RandomState;

// Default hasher (RandomState) - same as std::HashMap
let map1: SmallHashMap<String, i32, 8> = SmallHashMap::new();

// Explicit hasher
let map2: SmallHashMap<String, i32, 8, RandomState> =
    SmallHashMap::with_hasher(RandomState::new());

// With capacity and custom hasher
let map3: SmallHashMap<String, i32, 8, RandomState> =
    SmallHashMap::with_capacity_and_hasher(100, RandomState::new());

// Access the hasher
let _hasher: &RandomState = map2.hasher();

For performance-critical applications, you can use faster hashers like fxhash:

use fxhash::FxBuildHasher;
use small_hash_map::SmallHashMap;

let mut map: SmallHashMap<String, i32, 8, FxBuildHasher> =
    SmallHashMap::with_hasher(FxBuildHasher::default());

§Limitations

§One-Way Transition

Once a SmallHashMap transitions from InlineMap to HeapMap, it never transitions back. Even if you remove elements, it remains heap-allocated.

let mut map: SmallHashMap<i32, i32, 4> = SmallHashMap::new();

// Fill and exceed capacity
for i in 0..10 {
    map.insert(i, i);
}
// Now using HeapMap

map.clear();
// Still HeapMap, even though empty

Mitigation: If you need to reclaim stack allocation, create a new SmallHashMap.

§Linear Scan for InlineMap

InlineMap uses O(n) linear search, not hash-based lookup. This is intentional and typically faster for small n due to cache locality, but becomes slower as n approaches the capacity limit.

§Trait Bounds

Keys, values, and hashers require trait bounds depending on the operation:

  • new, default: K: Hash + Eq, S: BuildHasher + Default
  • with_hasher: K: Hash + Eq, S: BuildHasher
  • with_capacity, with_capacity_and_hasher: K: Hash + Eq, S: BuildHasher + Default + Clone
  • insert, extend: K: Hash + Eq, S: BuildHasher + Clone
  • get, remove, etc.: K: Hash + Eq, S: BuildHasher
  • clone: K: Clone, V: Clone, S: Clone
  • Debug: K: Debug, V: Debug

§No Entry API

Unlike std::collections::HashMap, there’s no entry API for in-place mutation. The Entry API is problematic because it holds a mutable borrow of the map’s internals, but inserting via a VacantEntry could trigger a transition from InlineMap to HeapMap, invalidating that reference. Use get/insert/remove instead.

§Implementation Details

§InlineMap

Stack-allocated storage using fixed-size arrays with MaybeUninit:

pub struct InlineMap<K, V, const N: usize> {
    keys: [MaybeUninit<K>; N],
    values: [MaybeUninit<V>; N],
    len: usize,
}
  • Keys and values stored in separate arrays for better cache utilization during key lookup
  • Uses MaybeUninit to avoid requiring Default for uninitialized slots
  • Linear scan for all operations (get, insert, remove)
  • Maintains insertion order

§HeapMap

Thin wrapper around HashMap with configurable hasher:

pub struct HeapMap<K, V, S = RandomState> {
    map: HashMap<K, V, S>,
}
  • Supports any hasher implementing BuildHasher
  • Defaults to RandomState (same as std::HashMap)
  • Standard hash table performance characteristics
  • No ordering guarantees

§SmallHashMap

Enum-based dispatch between the two implementations:

pub struct SmallHashMap<K, V, const N: usize, S = RandomState> {
    inner: MapKind<K, V, N, S>,
    transition_threshold: usize,
    hash_builder: S,
}

pub enum MapKind<K, V, const N: usize, S = RandomState> {
    InlineMap(InlineMap<K, V, N>),
    HeapMap(HeapMap<K, V, S>),
}

The hasher S is stored and used when transitioning to HeapMap. Transition occurs when inserting a new key would exceed capacity N.

§Safety

InlineMap uses unsafe for:

  • Reading from MaybeUninit slots (safe because we track len)
  • Creating slices from raw pointers for iteration

All unsafe code maintains the invariant that only indices 0..len contain initialized values.

§Performance Characteristics

OperationInlineMapHeapMap
get()O(n)O(1) average
insert()O(n)O(1) average
remove()O(n)O(1) average
contains_key()O(n)O(1) average
iter()O(n)O(n)
clone()O(n)O(n)
MemoryStack, N×(K+V)Heap, hash table overhead

Crossover point: For typical key/value sizes, InlineMap outperforms HeapMap up to roughly 16-32 entries due to:

  • No hash computation
  • No bucket lookup
  • Cache-friendly contiguous memory
  • No allocation overhead

Beyond this, hash-based O(1) lookup wins.

§Similar Crates

CrateStorageHeap SpillSIMDKey ConstraintNotes
SmallHashMapStack arrayYesNoHash + EqAutomatic transition to HashMap
small-mapStack arrayYesYesHash + EqSIMD-accelerated (SSE2/NEON)
stackmapStack arrayNoNoHash + EqFixed capacity, panics if exceeded
smallmapPage arrayNoNoByte-indexableMax 256 entries, specialized
vecmap-rsHeap VecN/ANoEq onlyNo Hash required, no_std

§Why No SIMD?

SIMD-accelerated lookup (as in small-map) uses hash fingerprinting to compare 16 keys in parallel. However, this requires:

  1. Computing a hash for every lookup
  2. Storing an extra byte per entry for the fingerprint
  3. A minimum of ~16 elements before SIMD outperforms linear scan

For the typical use case of SmallHashMap (4-16 entries), plain linear scan is faster because:

  • No hash computation overhead
  • Better cache utilization (keys are contiguous)
  • Simpler branch prediction

SIMD becomes beneficial only when N ≥ 16 AND the map is frequently near capacity.

§SmallHashMap vs small-map

small-map is an excellent crate with SIMD acceleration. If you’re unsure which to use, try small-map first — it’s well-designed and performs great across a wide range of sizes.

Choose SmallHashMap when:

  • Your maps typically have 4-12 entries — linear scan beats SIMD due to hash overhead
  • You have millions of small maps — the h2 fingerprint array adds N bytes plus alignment padding per map, which might be significant
  • You have insert-heavy workloads — no fingerprint computation needed
  • You want minimal dependencies and simpler code to audit

Choose small-map when:

  • Your maps typically have 16+ entries — SIMD parallel comparison wins
  • You have lookup-heavy workloads — hash cost amortizes over many lookups
  • Your keys are expensive to compare — h2 filtering reduces full comparisons
AspectSmallHashMapsmall-map
Lookup (N ≤ 8)FasterSlower (hash overhead)
Lookup (N = 8-16)~Equal~Equal
Lookup (N > 16)SlowerFaster (SIMD)
InsertFasterSlower (stores h2)
Memory/entryK + VK + V + 1 byte

§How SmallHashMap Differs

SmallHashMap applies the small-vector optimization pattern to hash maps:

AspectSmallHashMapstd HashMap
Small collectionStack allocatedHeap allocated
Memory localityExcellent for small nHash table scattered
AllocationZero for small nAlways allocates
Lookup (small n)O(n) but cache-friendlyO(1) but cache-unfriendly
Lookup (large n)O(1) after transitionO(1)

§Quick Reference

§Construction

MethodDescription
SmallHashMap::new()Creates empty map with inline storage
SmallHashMap::with_capacity(n)Pre-sizes; uses HeapMap if n > N
SmallHashMap::with_hasher(s)Creates with custom hasher
SmallHashMap::with_capacity_and_hasher(n, s)Pre-sizes with custom hasher
SmallHashMap::default()Same as new()
iter.collect()Creates from iterator
map.hasher()Returns reference to the hasher

§Core Operations

MethodReturnsDescription
insert(k, v)Option<V>Insert or update; returns old value
get(&k)Option<&V>Get reference to value
get_mut(&k)Option<&mut V>Get mutable reference
get_key_value(&k)Option<(&K, &V)>Get key-value pair
remove(&k)Option<V>Remove and return value
contains_key(&k)boolCheck if key exists

§Capacity & Size

MethodReturnsDescription
len()usizeNumber of entries
is_empty()boolTrue if no entries
capacity()usizeCurrent capacity
is_inline()boolTrue if using stack storage
clear()()Remove all entries

§Iteration

MethodYieldsDescription
iter()(&K, &V)Immutable iteration
iter_mut()(&K, &mut V)Mutable value iteration
keys()&KIterate over keys
values()&VIterate over values
values_mut()&mut VMutable value iteration
into_iter()(K, V)Consuming iteration
retain(f)()Filter in place

§When to Use SmallHashMap

Good fit:

  • Maps that usually contain fewer than 16 entries
  • Performance-critical code with many small maps
  • Avoiding allocation pressure in hot paths
  • Embedded maps in frequently-instantiated structs

Poor fit:

  • Maps that consistently grow large (use HashMap directly)
  • Need for entry API or other HashMap-specific features
  • Types that don’t implement required trait bounds
  • no_std environments (depends on std::collections::HashMap)

§Quick Start

use small_hash_map::SmallHashMap;

// Create a map with inline capacity of 8
let mut map: SmallHashMap<&str, i32, 8> = SmallHashMap::new();

// Insert some values
map.insert("one", 1);
map.insert("two", 2);
map.insert("three", 3);

// Look up values
assert_eq!(map.get(&"two"), Some(&2));
assert_eq!(map.len(), 3);

// The map is still using inline (stack) storage
assert!(map.is_inline());

§Collecting from Iterators

use small_hash_map::SmallHashMap;

let pairs = vec![("a", 1), ("b", 2), ("c", 3)];
let map: SmallHashMap<&str, i32, 8> = pairs.into_iter().collect();

assert_eq!(map.get(&"b"), Some(&2));

Structs§

HeapMap
A HashMap wrapper that can use any hasher implementing BuildHasher.
InlineMap
A minimal map implementation optimized for small collections.
SmallHashMap
An adaptive map that starts with an InlineMap and transitions to HeapMap when it grows beyond a threshold.

Enums§

SmallHashMapIntoIter
Consuming iterator over key-value pairs of a SmallHashMap.
SmallHashMapIter
Iterator type for SmallHashMap that can handle both InlineMap and HeapMap iterators.
SmallHashMapIterMut
Mutable iterator over key-value pairs of a SmallHashMap.
SmallHashMapKeys
Iterator over keys of a SmallHashMap.
SmallHashMapValues
Iterator over values of a SmallHashMap.
SmallHashMapValuesMut
Mutable iterator over values of a SmallHashMap.