pub struct StringInterner { /* private fields */ }
String interner for symbol name deduplication.
StringInterner stores strings efficiently by maintaining a single
copy of each unique string. When the same string is interned multiple
times, the same StringId is returned.
§Reference Counting
Each interned string has an associated reference count. This enables garbage collection of unused strings during compaction phases.
§Thread Safety
The interner uses Arc<str> for string storage, making it safe to
share resolved strings across threads. However, the interner itself
requires external synchronization (e.g., RwLock) for concurrent access.
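As a rough illustration of the external synchronization described above, the sketch below wraps a small stand-in interner in `Arc<RwLock<...>>`. `ToyInterner`, its plain `u32` IDs, and both helper functions are hypothetical and are not part of this crate; only the locking pattern is the point.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

// Hypothetical stand-in for the real interner: IDs are plain u32 indices.
#[derive(Default)]
struct ToyInterner {
    strings: Vec<Arc<str>>,
    lookup: HashMap<Arc<str>, u32>,
}

impl ToyInterner {
    fn intern(&mut self, s: &str) -> u32 {
        if let Some(&id) = self.lookup.get(s) {
            return id;
        }
        let id = self.strings.len() as u32;
        let arc: Arc<str> = Arc::from(s);
        self.strings.push(Arc::clone(&arc));
        self.lookup.insert(arc, id);
        id
    }

    fn resolve(&self, id: u32) -> Option<Arc<str>> {
        self.strings.get(id as usize).cloned()
    }
}

/// Interns the same names from several threads and returns the IDs each thread saw.
fn intern_from_threads(names: &[&str], threads: usize) -> Vec<Vec<u32>> {
    let shared = Arc::new(RwLock::new(ToyInterner::default()));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let shared = Arc::clone(&shared);
            let names: Vec<String> = names.iter().map(|s| s.to_string()).collect();
            thread::spawn(move || {
                names
                    .iter()
                    // Mutation needs the exclusive (write) lock.
                    .map(|n| shared.write().unwrap().intern(n))
                    .collect::<Vec<u32>>()
            })
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

/// Resolves under a read lock, then moves the `Arc<str>` into another thread.
fn resolve_in_other_thread(shared: &Arc<RwLock<ToyInterner>>, id: u32) -> Option<usize> {
    // The read guard is released at the end of this statement.
    let resolved = shared.read().unwrap().resolve(id)?;
    thread::spawn(move || resolved.len()).join().ok()
}
```

The resolved `Arc<str>` outlives the lock guard, so readers only hold the lock for the duration of the lookup itself.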
§Example
let mut interner = StringInterner::new();
let id1 = interner.intern("foo").unwrap();
let id2 = interner.intern("foo").unwrap();
assert_eq!(id1, id2); // Same string → same ID
let resolved = interner.resolve(id1).unwrap();
assert_eq!(&*resolved, "foo");
Implementations§
impl StringInterner
pub fn with_capacity(capacity: usize) -> Self
Creates a new interner with the specified capacity.
pub fn with_max_ids(max_ids: u32) -> Self
Creates a new interner with a hard limit on the number of IDs.
This constructor is designed for testing error paths. It allows
deterministic testing of InternError::CapacityExhausted handling
without requiring billions of strings.
§Arguments
max_ids - Maximum number of unique strings that can be interned. Once this limit is reached, intern() will return InternError::CapacityExhausted.
§Example
// Create an interner that can only hold 3 strings
let mut interner = StringInterner::with_max_ids(3);
interner.intern("a").unwrap(); // OK
interner.intern("b").unwrap(); // OK
interner.intern("c").unwrap(); // OK
assert!(interner.intern("d").is_err()); // CapacityExhausted
pub fn len(&self) -> usize
Returns the number of interned strings (excluding INVALID slot).
§Panics
Panics if the lookup is stale (bulk slots written without rebuild).
pub fn is_empty(&self) -> bool
Returns true if no strings are interned.
§Panics
Panics if the lookup is stale (bulk slots written without rebuild).
pub fn intern(&mut self, s: &str) -> Result<StringId, InternError>
Interns a string and returns its StringId.
If the string was already interned, returns the existing ID and increments its reference count. Otherwise, allocates a new ID.
§Errors
Returns InternError::CapacityExhausted if the interner has
exhausted all available IDs (> 2^32 - 2 strings), or if max_ids
is set and the limit has been reached.
§Panics
Panics if the lookup is stale and has not been rebuilt with
build_dedup_table().
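A minimal sketch of how a caller might react to the capacity error when interning a batch of names. `BoundedInterner` and `ToyInternError` are hypothetical stand-ins, not this crate's types. The sketch also assumes that re-interning an already known string still succeeds at the limit, which the text above does not state.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum ToyInternError {
    CapacityExhausted,
}

// Hypothetical bounded interner; ID 0 is reserved as the invalid sentinel.
struct BoundedInterner {
    lookup: HashMap<String, u32>,
    max_ids: u32,
}

impl BoundedInterner {
    fn with_max_ids(max_ids: u32) -> Self {
        Self { lookup: HashMap::new(), max_ids }
    }

    fn intern(&mut self, s: &str) -> Result<u32, ToyInternError> {
        if let Some(&id) = self.lookup.get(s) {
            return Ok(id); // assumption: known strings never consume a new ID
        }
        let used = self.lookup.len() as u32;
        if used >= self.max_ids {
            return Err(ToyInternError::CapacityExhausted);
        }
        let id = used + 1; // skip the sentinel at 0
        self.lookup.insert(s.to_owned(), id);
        Ok(id)
    }
}

/// Interns every name, stopping at the first failure and reporting its index.
fn intern_all(
    interner: &mut BoundedInterner,
    names: &[&str],
) -> Result<Vec<u32>, (usize, ToyInternError)> {
    let mut ids = Vec::with_capacity(names.len());
    for (i, name) in names.iter().enumerate() {
        match interner.intern(name) {
            Ok(id) => ids.push(id),
            Err(e) => return Err((i, e)),
        }
    }
    Ok(ids)
}
```

Returning the failing index lets the caller decide whether to abort the whole batch or keep the IDs interned so far.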
pub fn intern_without_ref(&mut self, s: &str) -> Result<StringId, InternError>
Interns a string and returns its StringId without incrementing its reference count.
This is useful when the string is being stored in a structure that manages its own lifetime (e.g., a node entry).
§Errors
Returns InternError::CapacityExhausted if the interner has
exhausted all available IDs (> 2^32 - 2 strings), or if max_ids
is set and the limit has been reached.
§Panics
Panics if the lookup is stale and has not been rebuilt with
build_dedup_table().
pub fn resolve(&self, id: StringId) -> Option<Arc<str>>
Resolves a StringId to its string value.
Returns None if the ID is invalid or has been recycled.
pub fn ref_count(&self, id: StringId) -> u32
Returns the reference count for a string.
Returns 0 if the ID is invalid or has been recycled.
pub fn inc_ref(&mut self, id: StringId) -> Option<u32>
Increments the reference count for a string.
Returns the new count, or None if the ID is invalid.
pub fn dec_ref(&mut self, id: StringId) -> Option<u32>
Decrements the reference count for a string.
Returns the new count, or None if the ID is invalid.
Note: This does NOT automatically recycle the string when count reaches 0.
Use recycle_unreferenced() during compaction for that.
pub fn recycle_unreferenced(&mut self) -> usize
Recycles all strings with zero reference count.
Returns the number of strings recycled. This should be called during compaction phases.
§Panics
Panics if the lookup is stale (bulk slots written without rebuild).
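The split between dec_ref() (count only) and recycle_unreferenced() (actual cleanup) can be modelled roughly as follows. `RcInterner`, its `u32` IDs, and the free-list reuse policy are assumptions made for illustration, not the crate's implementation.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Hypothetical model: slot 0 is a sentinel, freed slots go onto a free list.
struct RcInterner {
    strings: Vec<Option<Arc<str>>>,
    ref_counts: Vec<u32>,
    lookup: HashMap<Arc<str>, u32>,
    free_list: Vec<u32>,
}

impl RcInterner {
    fn new() -> Self {
        Self {
            strings: vec![None],
            ref_counts: vec![0],
            lookup: HashMap::new(),
            free_list: Vec::new(),
        }
    }

    fn intern(&mut self, s: &str) -> u32 {
        if let Some(&id) = self.lookup.get(s) {
            self.ref_counts[id as usize] += 1;
            return id;
        }
        let arc: Arc<str> = Arc::from(s);
        let id = match self.free_list.pop() {
            // Reuse a recycled slot if one is available.
            Some(id) => {
                self.strings[id as usize] = Some(Arc::clone(&arc));
                self.ref_counts[id as usize] = 1;
                id
            }
            None => {
                self.strings.push(Some(Arc::clone(&arc)));
                self.ref_counts.push(1);
                (self.strings.len() - 1) as u32
            }
        };
        self.lookup.insert(arc, id);
        id
    }

    /// Lowers the count but leaves the string in place, even at zero.
    fn dec_ref(&mut self, id: u32) -> Option<u32> {
        self.strings.get(id as usize)?.as_ref()?; // None for empty or out-of-range slots
        let count = &mut self.ref_counts[id as usize];
        *count = count.saturating_sub(1);
        Some(*count)
    }

    fn resolve(&self, id: u32) -> Option<Arc<str>> {
        self.strings.get(id as usize)?.clone()
    }

    /// Clears every occupied slot whose count is zero and returns how many were cleared.
    fn recycle_unreferenced(&mut self) -> usize {
        let mut recycled = 0;
        for id in 1..self.strings.len() {
            if self.ref_counts[id] == 0 {
                if let Some(s) = self.strings[id].take() {
                    self.lookup.remove(&s);
                    self.free_list.push(id as u32);
                    recycled += 1;
                }
            }
        }
        recycled
    }
}
```

Deferring cleanup to one pass keeps dec_ref() cheap and means an ID stays resolvable until compaction runs.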
pub fn contains(&self, s: &str) -> bool
Checks if a string is interned.
§Panics
Panics if the lookup is stale (bulk slots written without rebuild).
pub fn get(&self, s: &str) -> Option<StringId>
Gets the StringId for a string if it’s already interned.
Unlike intern(), this does not create a new entry or modify ref counts.
§Panics
Panics if the lookup is stale (bulk slots written without rebuild).
pub fn iter(&self) -> impl Iterator<Item = (StringId, &Arc<str>)>
Returns an iterator over all interned strings with their IDs.
pub fn clear(&mut self)
Clears all interned strings.
Resets the interner to empty state, including clearing the
lookup_stale flag (lookup is trivially consistent when empty).
pub fn reserve(&mut self, additional: usize)
Reserves capacity for at least additional more strings.
pub fn alloc_range(&mut self, count: u32) -> Result<u32, InternError>
Pre-allocates count string slots for bulk parallel commit.
The new slots are initialized with None (no string) and ref_count = 0.
Returns the start index of the allocated range. The caller can then
fill slots start..start+count via StringInterner::bulk_slices_mut.
This method does not touch the free_list — it always appends to the
end of the strings and ref_counts vectors. This is intentional:
during parallel commit, each file gets a contiguous, non-overlapping range.
§Errors
Returns InternError::CapacityExhausted if the allocation would exceed
the LOCAL_TAG_BIT boundary (2^31 indices reserved for global IDs).
§Arguments
count - Number of slots to pre-allocate. If 0, this is a no-op returning the current length.
pub fn bulk_slices_mut(
    &mut self,
    start: u32,
    count: u32,
) -> (&mut [Option<Arc<str>>], &mut [u32])
Returns mutable sub-slices into the strings and ref_counts arrays for
the range start..start+count.
This enables parallel file commit workers to write directly into their pre-allocated range without contention. The caller is responsible for ensuring no overlapping ranges are accessed concurrently.
Defensively marks the lookup as stale when count > 0, since the
returned slices allow direct mutation of string slots without updating
the lookup HashMap.
§Panics
Panics if start + count exceeds the current vector length.
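One way to picture the non-overlapping ranges is the sketch below, which allocates a contiguous range per file and then splits the backing vectors with `split_at_mut` so that each worker thread owns exactly its own slots. `Slots` and `commit_files` are hypothetical names; how the real commit workers divide the slices returned by bulk_slices_mut is not specified here.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical slot storage: index 0 is the sentinel.
struct Slots {
    strings: Vec<Option<Arc<str>>>,
    ref_counts: Vec<u32>,
}

impl Slots {
    fn new() -> Self {
        Self { strings: vec![None], ref_counts: vec![0] }
    }

    /// Appends `count` empty slots and returns the index of the first one.
    fn alloc_range(&mut self, count: u32) -> u32 {
        let start = self.strings.len() as u32;
        let new_len = self.strings.len() + count as usize;
        self.strings.resize(new_len, None);
        self.ref_counts.resize(new_len, 0);
        start
    }
}

/// Gives each file its own contiguous range, then fills the ranges in parallel.
fn commit_files(slots: &mut Slots, files: &[Vec<&str>]) -> Vec<u32> {
    let starts: Vec<u32> = files
        .iter()
        .map(|f| slots.alloc_range(f.len() as u32))
        .collect();
    let first = starts.first().copied().unwrap_or(0) as usize;

    // Ranges were allocated back to back, so peeling chunks off the front of
    // the tail yields one disjoint chunk per file.
    let mut strings = &mut slots.strings[first..];
    let mut counts = &mut slots.ref_counts[first..];
    thread::scope(|scope| {
        for file in files {
            let (s_chunk, s_rest) = std::mem::take(&mut strings).split_at_mut(file.len());
            let (c_chunk, c_rest) = std::mem::take(&mut counts).split_at_mut(file.len());
            strings = s_rest;
            counts = c_rest;
            scope.spawn(move || {
                for (i, name) in file.iter().enumerate() {
                    s_chunk[i] = Some(Arc::from(*name));
                    c_chunk[i] = 1;
                }
            });
        }
    });
    starts
}
```

Because the workers write slots directly, identical strings from different files land in different slots, which is why a deduplication pass is needed afterwards.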
pub fn build_dedup_table(&mut self) -> HashMap<StringId, StringId>
Scans all string slots and deduplicates identical strings.
After parallel commit, multiple file workers may have inserted the same string into different slots. This method:
- Iterates slots 1..N in index order (deterministic).
- For the first occurrence of each string value, that slot becomes the canonical entry.
- For duplicate occurrences, their ref_count is accumulated into the canonical slot, and the duplicate slot is cleared (None, ref_count = 0).
- The lookup HashMap is rebuilt from canonical entries only.
Returns a remap table mapping duplicate StringId to canonical StringId.
Canonical entries are not included in the returned map.
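The steps above can be sketched as a free function over the two slot arrays. This is an illustrative reimplementation with plain `u32` IDs rather than `StringId`, and it returns the rebuilt lookup alongside the remap table so both are visible; the real method keeps the lookup internal.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Deduplicates slots in place and returns (remap, rebuilt lookup).
/// Slot 0 is treated as the sentinel and skipped.
fn build_dedup_table(
    strings: &mut [Option<Arc<str>>],
    ref_counts: &mut [u32],
) -> (HashMap<u32, u32>, HashMap<Arc<str>, u32>) {
    let mut lookup: HashMap<Arc<str>, u32> = HashMap::new();
    let mut remap = HashMap::new();

    // Index order makes the choice of canonical slot deterministic.
    for idx in 1..strings.len() {
        let Some(s) = strings[idx].clone() else { continue };
        match lookup.get(&s).copied() {
            // Duplicate: fold its count into the canonical slot, then clear it.
            Some(canonical) => {
                ref_counts[canonical as usize] += ref_counts[idx];
                ref_counts[idx] = 0;
                strings[idx] = None;
                remap.insert(idx as u32, canonical);
            }
            // First occurrence: this slot becomes canonical.
            None => {
                lookup.insert(s, idx as u32);
            }
        }
    }
    (remap, lookup)
}
```

Callers would then rewrite any stored duplicate IDs through the remap table; IDs absent from the table are already canonical.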
pub fn truncate_to(&mut self, saved_len: usize)
Truncates the strings and ref_counts vectors to saved_len.
This rolls back a failed bulk allocation by removing all slots at
index saved_len and beyond. The lookup HashMap is not modified
(the caller is responsible for ensuring no lookup entries point to the
truncated region).
§Panics
Panics if saved_len is 0 (would remove the sentinel slot).
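A hedged sketch of the save/rollback pattern this enables: record the raw slot count, allocate, attempt the fill, and truncate back on failure. `SlotStore` and `commit_or_rollback` are hypothetical names, and the `String` error type is a placeholder.

```rust
use std::sync::Arc;

// Hypothetical storage with the sentinel at index 0.
struct SlotStore {
    strings: Vec<Option<Arc<str>>>,
    ref_counts: Vec<u32>,
}

impl SlotStore {
    fn new() -> Self {
        Self { strings: vec![None], ref_counts: vec![0] }
    }

    /// Raw vector length, including the sentinel.
    fn string_count_raw(&self) -> usize {
        self.strings.len()
    }

    /// Appends `count` empty slots and returns the index of the first one.
    fn alloc_range(&mut self, count: usize) -> usize {
        let start = self.strings.len();
        self.strings.resize(start + count, None);
        self.ref_counts.resize(start + count, 0);
        start
    }

    fn truncate_to(&mut self, saved_len: usize) {
        assert!(saved_len > 0, "cannot remove the sentinel slot");
        self.strings.truncate(saved_len);
        self.ref_counts.truncate(saved_len);
    }
}

/// Runs a bulk fill and rolls the storage back if the fill reports an error.
fn commit_or_rollback<F>(store: &mut SlotStore, count: usize, fill: F) -> Result<usize, String>
where
    F: FnOnce(&mut [Option<Arc<str>>]) -> Result<(), String>,
{
    let saved_len = store.string_count_raw();
    let start = store.alloc_range(count);
    match fill(&mut store.strings[start..]) {
        Ok(()) => Ok(start),
        Err(e) => {
            // Drop every slot added by this attempt, including partial writes.
            store.truncate_to(saved_len);
            Err(e)
        }
    }
}
```

Since the fill never touches a lookup table in this model, truncation alone restores the previous state.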
pub fn string_count_raw(&self) -> usize
Returns the total number of string slots including the sentinel at index 0.
This is the raw vector length, not the number of interned strings. Useful for saving/restoring allocation state.
pub fn is_lookup_stale(&self) -> bool
Returns whether the lookup HashMap is stale (bulk slots written
without a build_dedup_table() rebuild).
This is primarily useful for testing and diagnostics.
pub fn stats(&self) -> InternerStats
Returns statistics about the interner.
Safe to call even when the lookup is stale, because it uses slot-based counting instead of the lookup length.
Trait Implementations§
impl Clone for StringInterner
fn clone(&self) -> StringInterner
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for StringInterner
impl Default for StringInterner
impl<'de> Deserialize<'de> for StringInterner
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error> where __D: Deserializer<'de>
impl Display for StringInterner
Auto Trait Implementations§
impl Freeze for StringInterner
impl RefUnwindSafe for StringInterner
impl Send for StringInterner
impl Sync for StringInterner
impl Unpin for StringInterner
impl UnsafeUnpin for StringInterner
impl UnwindSafe for StringInterner
Blanket Implementations§
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T where T: Clone
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
impl<D> OwoColorize for D