#![no_std]
#![doc(html_root_url = "https://docs.rs/crate/string-interner/0.18.0")]
#![warn(unsafe_op_in_unsafe_fn, clippy::redundant_closure_for_method_calls)]

//! Caches strings efficiently with a minimal memory footprint and associates them with unique symbols.
//! These symbols allow constant-time comparisons and look-ups of the underlying interned strings.
//!
//! ### Example: Interning & Symbols
//!
//! ```
//! use string_interner::StringInterner;
//!
//! let mut interner = StringInterner::default();
//! let sym0 = interner.get_or_intern("Elephant");
//! let sym1 = interner.get_or_intern("Tiger");
//! let sym2 = interner.get_or_intern("Horse");
//! let sym3 = interner.get_or_intern("Tiger");
//! assert_ne!(sym0, sym1);
//! assert_ne!(sym0, sym2);
//! assert_ne!(sym1, sym2);
//! assert_eq!(sym1, sym3); // same!
//! ```
//!
//! ### Example: Creation by `FromIterator`
//!
//! ```
//! # use string_interner::DefaultStringInterner;
//! let interner = ["Elephant", "Tiger", "Horse", "Tiger"]
//!     .into_iter()
//!     .collect::<DefaultStringInterner>();
//! ```
//!
//! ### Example: Look-up
//!
//! ```
//! # use string_interner::StringInterner;
//! let mut interner = StringInterner::default();
//! let sym = interner.get_or_intern("Banana");
//! assert_eq!(interner.resolve(sym), Some("Banana"));
//! ```
//!
//! ### Example: Iteration
//!
//! ```
//! # use string_interner::{DefaultStringInterner, Symbol};
//! let interner = <DefaultStringInterner>::from_iter(["Earth", "Water", "Fire", "Air"]);
//! for (sym, str) in &interner {
//!     println!("{} = {}", sym.to_usize(), str);
//! }
//! ```
//!
//! ### Example: Use Different Backend
//!
//! ```
//! # use string_interner::StringInterner;
//! use string_interner::backend::BufferBackend;
//! type Interner = StringInterner<BufferBackend>;
//! let mut interner = Interner::new();
//! let sym1 = interner.get_or_intern("Tiger");
//! let sym2 = interner.get_or_intern("Horse");
//! let sym3 = interner.get_or_intern("Tiger");
//! assert_ne!(sym1, sym2);
//! assert_eq!(sym1, sym3); // same!
//! ```
//!
//! ### Example: Use Different Backend & Symbol
//!
//! ```
//! # use string_interner::StringInterner;
//! use string_interner::{backend::BucketBackend, symbol::SymbolU16};
//! type Interner = StringInterner<BucketBackend<SymbolU16>>;
//! let mut interner = Interner::new();
//! let sym1 = interner.get_or_intern("Tiger");
//! let sym2 = interner.get_or_intern("Horse");
//! let sym3 = interner.get_or_intern("Tiger");
//! assert_ne!(sym1, sym2);
//! assert_eq!(sym1, sym3); // same!
//! ```
//!
//! ## Backends
//!
//! The `string_interner` crate provides different backends with different strengths.
//! The table below summarizes when to use which backend according to the following
//! performance characteristics and properties.
//!
//! | **Property** | **BucketBackend** | **StringBackend** | **BufferBackend** | | Explanation |
//! |:-------------|:-----------------:|:-----------------:|:-----------------:|:--|:--|
//! | Fill              | 🤷 | 👍 | ⭐ | | Efficiency of filling an empty string interner. |
//! | Fill Duplicates   | 1) | 1) | 1) | | Efficiency of filling a string interner with strings that are already interned. |
//! | Resolve           | ⭐ | 👍 | 👎 | | Efficiency of resolving a symbol of an interned string. |
//! | Resolve Unchecked | 👍 | 👍 | ⭐ 2) | | Efficiency of unchecked resolving a symbol of an interned string. |
//! | Allocations       | 🤷 | 👍 | ⭐ | | The number of allocations performed by the backend. |
//! | Footprint         | 🤷 | 👍 | ⭐ | | The total heap memory consumed by the backend. |
//! | Iteration         | ⭐ | 👍 | 👎 | | Efficiency of iterating over the interned strings. |
//! |                   | | | | | |
//! | Contiguous        | ✅ | ✅ | ❌ | | The returned symbols have contiguous values. |
//! | Stable Refs       | ✅ | ❌ | ❌ | | The interned strings have stable references. |
//! | Static Strings    | ✅ | ❌ | ❌ | | Allows interning `&'static str` without heap allocations. |
//!
//! 1. The performance of interning an already-interned string is the same for all backends,
//!    since this is implemented in the `StringInterner` front-end via a `HashMap` query for
//!    all `StringInterner` instances.
//!
//! 2. `BufferBackend` is slow with checked resolving because its internal representation
//!    is extremely sensitive to the correctness of the symbols, so many checks
//!    are performed. If you only use symbols provided by the same instance of
//!    `BufferBackend`, `resolve_unchecked` is a lot faster.
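//!
//! As a minimal sketch of the second note, the snippet below resolves a symbol both ways on a
//! `BufferBackend`. It assumes that `sym` comes from the same interner instance, which is the
//! safety condition for the unchecked call; the variable names are illustrative only.
//!
//! ```rust
//! use string_interner::{backend::BufferBackend, StringInterner};
//!
//! type Interner = StringInterner<BufferBackend>;
//! let mut interner = Interner::new();
//! let sym = interner.get_or_intern("Tiger");
//!
//! // Checked resolution validates the symbol and returns an `Option`.
//! assert_eq!(interner.resolve(sym), Some("Tiger"));
//!
//! // SAFETY: `sym` was returned by this very interner, so it is valid for it.
//! let name = unsafe { interner.resolve_unchecked(sym) };
//! assert_eq!(name, "Tiger");
//! ```
//!
//! Passing a symbol from another interner to `resolve_unchecked` is undefined behavior, so the
//! checked `resolve` remains the safer choice whenever the origin of a symbol is uncertain.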
//!
//! ### Legend
//!
//! | ⭐ | **best performance** | 👍 | **good performance** | 🤷 | **okay performance** | 👎 | **bad performance** |
//! |-|-|-|-|-|-|-|-|
//!
//! ## When to use which backend?
//!
//! ### Bucket Backend
//!
//! Given the table above, the `BucketBackend` might seem inferior to the other backends.
//! However, it can intern `&'static str` efficiently and avoids deallocations.
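//!
//! A short sketch of interning a static string with the `BucketBackend`, assuming the
//! `get_or_intern_static` method of `StringInterner` is used for this purpose:
//!
//! ```rust
//! use string_interner::{backend::BucketBackend, StringInterner};
//!
//! type Interner = StringInterner<BucketBackend>;
//! let mut interner = Interner::new();
//!
//! // The string literal has a `'static` lifetime, so the backend can refer to it directly.
//! let sym0 = interner.get_or_intern_static("Tiger");
//! // Interning the same contents again yields the same symbol.
//! let sym1 = interner.get_or_intern("Tiger");
//! assert_eq!(sym0, sym1);
//! assert_eq!(interner.resolve(sym0), Some("Tiger"));
//! ```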
//!
//! ### String Backend
//!
//! Overall, the `StringBackend` performs really well and is therefore the backend
//! that the `StringInterner` uses by default.
//!
//! ### Buffer Backend
//!
//! The `BufferBackend` is in some sense a `StringBackend` on steroids.
//! Some operations are even slightly more efficient and it consumes less memory.
//! However, all this comes at the cost of less efficient symbol resolution.
//! Note that the symbols generated by the `BufferBackend` are not contiguous.
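//!
//! The difference in symbol contiguity can be observed through `Symbol::to_usize`. The sketch
//! below assumes that "contiguous" means consecutively interned strings receive consecutive
//! indices; the exact values produced by the `BufferBackend` are an implementation detail, so
//! they are only printed here, not asserted.
//!
//! ```rust
//! use string_interner::{
//!     backend::{BufferBackend, StringBackend},
//!     StringInterner, Symbol,
//! };
//!
//! let mut strings = <StringInterner<StringBackend>>::new();
//! let a = strings.get_or_intern("Elephant");
//! let b = strings.get_or_intern("Tiger");
//! // Contiguous symbols: the second symbol directly follows the first.
//! assert_eq!(b.to_usize(), a.to_usize() + 1);
//!
//! let mut buffer = <StringInterner<BufferBackend>>::new();
//! for name in ["Elephant", "Tiger"] {
//!     let sym = buffer.get_or_intern(name);
//!     // These values are not guaranteed to be consecutive.
//!     println!("{} = {}", sym.to_usize(), name);
//! }
//! ```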
//!
//! ## Customizing String Hashing
//!
//! To ensure only one copy of each string is interned, [`StringInterner`] relies on [hashbrown]'s
//! [hashmap](hashbrown::HashMap), which necessitates choosing a hashing function for hashing the
//! strings.
//!
//! By default, [`StringInterner`] will use hashbrown's [`DefaultHashBuilder`], which should be
//! appropriate for most users. However, you may customize the hash function via
//! [`StringInterner`]'s second type parameter:
//!
//! ```
//! use std::hash::RandomState;
//! use string_interner::{StringInterner, DefaultBackend};
//!
//! // Create a StringInterner with the default backend but using std's RandomState hasher.
//! let interned_strs: StringInterner<DefaultBackend, RandomState> = StringInterner::new();
//! ```
//!
//! NB: as of hashbrown v0.15.2, the [`DefaultHashBuilder`] is [foldhash's
//! RandomState](https://docs.rs/foldhash/latest/foldhash/fast/struct.RandomState.html), which
//! relies on a one-time random initialization of shared global state; if you need stable hashes
//! then you may wish to use [foldhash's
//! FixedState](https://docs.rs/foldhash/latest/foldhash/fast/struct.FixedState.html) (or similar)
//! instead.
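//!
//! As one possible "or similar" option that needs no extra dependency, the sketch below plugs in
//! std's `BuildHasherDefault<DefaultHasher>`, which builds every hasher from the same default
//! state instead of a random seed. This is an illustration, not a recommendation: the algorithm
//! behind `DefaultHasher` is unspecified and may change between Rust releases, and the
//! `StableInterner` alias is a name chosen for this example.
//!
//! ```rust
//! use std::hash::{BuildHasherDefault, DefaultHasher};
//! use string_interner::{DefaultBackend, StringInterner};
//!
//! // Hypothetical alias: default backend with an unseeded, deterministic hasher.
//! type StableInterner = StringInterner<DefaultBackend, BuildHasherDefault<DefaultHasher>>;
//!
//! let mut interner = StableInterner::new();
//! let sym = interner.get_or_intern("Tiger");
//! assert_eq!(interner.get_or_intern("Tiger"), sym);
//! assert_eq!(interner.resolve(sym), Some("Tiger"));
//! ```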

extern crate alloc;
#[cfg(feature = "std")]
#[macro_use]
extern crate std;

#[cfg(feature = "serde")]
mod serde_impl;

pub mod backend;
mod interner;
pub mod symbol;

/// A convenience [`StringInterner`] type based on the [`DefaultBackend`].
#[cfg(feature = "backends")]
pub type DefaultStringInterner<B = DefaultBackend, H = DefaultHashBuilder> =
    self::interner::StringInterner<B, H>;

#[cfg(feature = "backends")]
#[doc(inline)]
pub use self::backend::DefaultBackend;
#[doc(inline)]
pub use self::{
    interner::StringInterner,
    symbol::{DefaultSymbol, Symbol},
};

#[doc(inline)]
pub use hashbrown::DefaultHashBuilder;