pub struct InternedStringMap { /* private fields */ }
The short-string intern table, shaped like C-Lua’s stringtable
(lstring.c): power-of-two hash buckets of GcRef<LuaString> chained by
Vec instead of u.hnext. Compared to the previous
HashMap<Box<[u8]>, GcRef<LuaString>>:
- lookup hashes the input bytes ONCE and never allocates (the entry-API
  shape boxed the key on every call; rejected experiment
  intern-hitpath-borrowed-lookup documents why a partial fix loses);
- insert reuses the same hash, no second probe;
- bytes are stored once (in the LuaString), not duplicated in a map key;
- dead strings are removed in O(dead) by (hash, identity) pairs collected
  during the GC mark phase, replacing the O(table·log live) sort +
  binary-search retain that dominated churn-heavy profiles
  (concat_chain 20260609T2201Z: intern machinery ~25% of wall).
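The lookup and insert points above can be illustrated with a minimal, self-contained sketch. It is not this crate's implementation: `Str`, `Table`, `hash_bytes` and `intern` are hypothetical names, `Rc<Str>` stands in for GcRef<LuaString>, and FNV-1a is only a placeholder hash function.

```rust
use std::rc::Rc;

// Hypothetical stand-in for LuaString: owns its bytes and caches its hash.
pub struct Str {
    pub bytes: Box<[u8]>,
    pub hash: u32,
}

// Hypothetical stand-in for the intern table.
// Rc<Str> plays the role of GcRef<LuaString>.
pub struct Table {
    buckets: Vec<Vec<Rc<Str>>>, // length is always a power of two
    count: usize,
}

// FNV-1a, used here only as a placeholder hash function.
pub fn hash_bytes(bytes: &[u8]) -> u32 {
    let mut h: u32 = 0x811c_9dc5;
    for &b in bytes {
        h ^= b as u32;
        h = h.wrapping_mul(0x0100_0193);
    }
    h
}

impl Table {
    pub fn new() -> Self {
        Table { buckets: vec![Vec::new(); 64], count: 0 }
    }

    pub fn len(&self) -> usize {
        self.count
    }

    fn index(&self, hash: u32) -> usize {
        // A power-of-two size lets a mask replace the modulo.
        hash as usize & (self.buckets.len() - 1)
    }

    // Borrowed lookup: compares against the caller's bytes, allocates nothing.
    pub fn find(&self, bytes: &[u8], hash: u32) -> Option<Rc<Str>> {
        self.buckets[self.index(hash)]
            .iter()
            .find(|s| s.hash == hash && &*s.bytes == bytes)
            .cloned()
    }

    // Places the entry using the hash cached inside it; no rehash of the bytes.
    pub fn insert(&mut self, s: Rc<Str>) {
        let i = self.index(s.hash);
        self.buckets[i].push(s);
        self.count += 1;
    }

    // Hash once, probe once, and allocate only on a miss.
    pub fn intern(&mut self, bytes: &[u8]) -> Rc<Str> {
        let hash = hash_bytes(bytes);
        if let Some(hit) = self.find(bytes, hash) {
            return hit;
        }
        let s = Rc::new(Str { bytes: bytes.into(), hash });
        self.insert(Rc::clone(&s));
        s
    }
}
```

In this sketch, interning the same bytes twice returns the same allocation, and the bytes live only inside `Str`, never in a separate map key.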
Implementations
impl InternedStringMap
pub fn len(&self) -> usize
pub fn is_empty(&self) -> bool
pub fn bucket_count(&self) -> usize
Number of hash buckets currently allocated (the power-of-two table
size, C’s strt.size). Exposed for the shrink-policy test, which
asserts the array grows under a flood and shrinks back toward 64 once
the interned strings are collected.
pub fn find(&self, bytes: &[u8], hash: u32) -> Option<GcRef<LuaString>>
pub fn insert(&mut self, s: GcRef<LuaString>)
Inserts s, applying C’s luaS_resize growth rule: the bucket array grows so that the average chain length stays around 1.
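A minimal sketch of such a growth rule follows. It assumes the trigger used by C Lua (double the table once the entry count reaches the bucket count); whether this crate uses exactly that condition is an assumption. `Buckets` is a hypothetical type whose entries are bare cached hashes, which is all that placement needs.

```rust
// Hypothetical table whose entries are just cached hashes.
pub struct Buckets {
    buckets: Vec<Vec<u32>>, // length is always a power of two
    count: usize,
}

impl Buckets {
    pub fn new() -> Self {
        Buckets { buckets: vec![Vec::new(); 64], count: 0 }
    }

    pub fn len(&self) -> usize {
        self.count
    }

    pub fn bucket_count(&self) -> usize {
        self.buckets.len()
    }

    // Rebuilds the bucket array at `new_size`, relocating every entry
    // by its cached hash.
    fn resize(&mut self, new_size: usize) {
        debug_assert!(new_size.is_power_of_two());
        let old = std::mem::replace(&mut self.buckets, vec![Vec::new(); new_size]);
        for hash in old.into_iter().flatten() {
            let i = hash as usize & (new_size - 1);
            self.buckets[i].push(hash);
        }
    }

    pub fn insert(&mut self, hash: u32) {
        // Assumed trigger: double once there are as many entries as buckets,
        // which keeps the average chain length at or below 1.
        if self.count >= self.buckets.len() {
            let doubled = self.buckets.len() * 2;
            self.resize(doubled);
        }
        let i = hash as usize & (self.buckets.len() - 1);
        self.buckets[i].push(hash);
        self.count += 1;
    }
}
```

Under this rule the entry count never exceeds the bucket count after an insert, so the average chain length stays at or below 1.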
pub fn shrink_if_sparse(&mut self)
C’s luaS_resize shrink path, driven by lgc.c:checkSizes
(if (g->strt.nuse < g->strt.size / 4) luaS_resize(L, size/2)).
Shrinks the bucket array when the live load factor falls below 25%
(count * 4 < buckets.len()), down to next_power_of_two(count)
floored at the initial 64 (C’s MINSTRTABSIZE). The 4× gap is
hysteresis: a table at load factor just under 1.0 will not thrash,
since the next grow-then-shrink cycle needs the population to drop
fourfold first. Rehashing only relocates the surviving
GcRef<LuaString> entries by their cached hash(); it never derefs a
string, so it is safe to call from the post-mark/sweep GC hook AFTER
all dead entries have been removed (only live refs remain).
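The shrink arithmetic described above can be sketched as a pure function. `shrink_target` and `MIN_BUCKETS` are hypothetical names, and the final guard that skips a no-op resize is an assumption rather than something the documentation states.

```rust
// Initial (and minimum) bucket count, per the description above.
const MIN_BUCKETS: usize = 64;

// Returns the new bucket count if the table should shrink, else None.
pub fn shrink_target(count: usize, buckets: usize) -> Option<usize> {
    // Shrink only when the load factor is below 25%.
    if count * 4 < buckets {
        // Smallest power of two that holds `count`, floored at the minimum.
        let target = count.next_power_of_two().max(MIN_BUCKETS);
        // Assumed guard: skip the rehash when nothing would change.
        if target < buckets {
            return Some(target);
        }
    }
    None
}
```

For example, 100 live strings in 1024 buckets shrink to 128 buckets, while 300 live strings in 1024 buckets (load factor about 0.29) leave the table untouched.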
pub fn remove(&mut self, hash: u32, identity: usize)
Removes one entry located by its cached hash + GC identity. Called once per dead string, so the total removal cost of a collection is O(dead).
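A minimal sketch of removal by (hash, identity) follows, assuming the identity is the allocation address of the string. `Str`, `Table`, `identity` and `contains` are hypothetical names, and `Rc<Str>` again stands in for GcRef<LuaString>; how the real GC derives an identity is not shown in this documentation.

```rust
use std::rc::Rc;

// Hypothetical stand-in for LuaString.
pub struct Str {
    pub bytes: Box<[u8]>,
    pub hash: u32,
}

pub struct Table {
    buckets: Vec<Vec<Rc<Str>>>, // length is always a power of two
    count: usize,
}

// Assumed identity: the address of the allocation.
pub fn identity(s: &Rc<Str>) -> usize {
    Rc::as_ptr(s) as usize
}

impl Table {
    pub fn new() -> Self {
        Table { buckets: vec![Vec::new(); 64], count: 0 }
    }

    pub fn len(&self) -> usize {
        self.count
    }

    pub fn insert(&mut self, s: Rc<Str>) {
        let i = s.hash as usize & (self.buckets.len() - 1);
        self.buckets[i].push(s);
        self.count += 1;
    }

    pub fn contains(&self, hash: u32, id: usize) -> bool {
        let i = hash as usize & (self.buckets.len() - 1);
        self.buckets[i].iter().any(|s| identity(s) == id)
    }

    // The hash selects the chain; the identity selects the entry.
    // String bytes are never compared or read.
    pub fn remove(&mut self, hash: u32, id: usize) {
        let i = hash as usize & (self.buckets.len() - 1);
        let chain = &mut self.buckets[i];
        if let Some(pos) = chain.iter().position(|s| identity(s) == id) {
            chain.swap_remove(pos); // order within a chain does not matter
            self.count -= 1;
        }
    }
}
```

Because each call touches a single chain, removing the dead set costs time proportional to the number of dead strings (given short chains) rather than to the table size.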