Expand description
Interned provides highly optimized, thread-local, generic
interning via Interned<T>
and a
memoization layer built on top of this
interning layer, provided by Memoized<I, T>
, which can cache the result of an arbitrary
input I: Hash
and intern this result in the underlying interning layer.
Blanket implementations supporting T
are provided for all primitives, slices of Sized
T
(including &[u8]
), as well as str
slices (&str
). Support for additional
arbitrary types can be added by implementing DataType
, Staticize
, and Hash
.
str
slices have a custom implementation since they are the only built-in unsized type
with slice support.
All values are heap-allocated 'static
s and benefit from TypeId
-specific locality of
reference in the heap. Any two Interned<T>
instances that have the same value of T
will be guaranteed to point to the same memory address in the heap. Among other things,
this allows for O(1)
(in the size of the data) equality comparisons since the heap
addresses are compared instead of having to compare the underlying data bit-by-bit. This
makes interned types especially suited for parsing and similar tasks.
A caveat of the 'static
lifetime and immutability of the underlying heap data is that
unique values of Interned<T>
and Memoized<I, T>
leak in the sense that they can
never be de-allocated. This allows us to implement Copy
on all interned types, because
we can rely on the heap pointer to continue existing for the life of the program once it
has been created for a particular value. For this reason, you should not use this crate
for long-running programs that will encounter an unbounded number of unique values, such as
those created by an unending stream of user input.
Because the internal size of an Interned<T>
on the stack is the size of a usize
(pointer) plus a u64
(cached hash code), it would be silly to use Interned<T>
with
integer types directly, however it makes sense to do so for the purposes of memoizing an
expensive computation via Memoized<I, T>
.
An interned string type, InStr
, is also provided as a convenient wrapper around
Interned<&'static str>
. It has a number of extra impls and should be your go-to type if
you want to work with interned strings.
§Interned Example
use interned::*;
let a: Interned<i32> = 1289.into();
let b = Interned::from(1289);
let c: Interned<i32> = 47.into();
assert_eq!(a, b);
assert_ne!(a, c);
assert_eq!(a.as_ptr(), b.as_ptr());
assert_ne!(b.as_ptr(), c.as_ptr());
let d: Interned<&str> = "asdf".into();
assert_ne!(d, "fdsa".into());
assert_eq!(Interned::from("asdf"), d);
let e = Interned::from([1, 2, 3, 4, 5].as_slice());
let f = InStr::from("abc");
let g: InStr = "abc".into();
assert_eq!(f, g);
assert_eq!(f.as_ptr(), g.as_ptr());
assert_eq!(e, [1, 2, 3, 4, 5].as_slice().into());
assert_ne!(e, [4, 1, 7].as_slice().into());
assert_eq!(format!("{b:?}"), "Interned<i32> { value: 1289 }");
assert_eq!(format!("{d:?}"), "Interned<&str> { str: \"asdf\" }");
assert_eq!(e[3], 4);
assert_eq!(e[0], 1);
assert_eq!(
format!("{e:?}"),
"Interned<&[i32]> { slice: [1, 2, 3, 4, 5] }"
);
§Memoized Examples
use interned::*;
let initial_interned = num_interned::<usize>();
let initial_memoized = num_memoized::<usize>();
let a = Memoized::from("scope a", "some_input", |input| input.len().into());
let b = Memoized::from("scope a", "other", |input| input.len().into());
assert_ne!(a, b);
let c = Memoized::from("scope a", "some_input", |input| input.len().into());
assert_eq!(a, c);
assert_ne!(b, c);
assert_eq!(a.as_value(), &10);
assert_ne!(*a.as_value(), 11);
assert_eq!(*b.interned().interned_value(), 5);
assert_eq!(*c.as_value(), 10);
assert_eq!(num_interned::<usize>(), initial_interned + 2);
assert_eq!(num_memoized::<usize>(), initial_memoized + 2);
The following demonstrates how “scopes” work with Memoized
:
use interned::*;
fn expensive_fn(a: usize, b: usize, c: usize) -> String {
format!("{}", a * a + b * b + c * c)
}
let a = Memoized::from("my_scope", (1, 2, 3), |tup: (usize, usize, usize)| {
expensive_fn(tup.0, tup.1, tup.2).as_str().into()
});
assert_eq!(a.as_str(), "14");
Re-exports§
pub use datatype::DataType;
pub use memoized::Memoized;
pub use unsized_types::*;
Modules§
- _unsafe
- Contains some of the unsafe backing types used by interned, mainly
Static
. - datatype
- Contains the
DataType
trait, which is a useful utility trait that provides a number of associated types forT
such as the internal type in a slice, etc, howT
should be dereferenced, etc. For a type to work withInterned
and/orMemoized
, it must implementDataType
. - memoized
- Home of the
Memoized
struct, which provides a memoization facility on top ofInterned
so you can memoize and intern values based on input and a specified scope. - unsized_
types
Macros§
- derive_
from_ interned_ impl_ slice - Derives
From<Interned<T>>
for the specified slice type. - derive_
from_ interned_ impl_ value - Derives
From<Interned<T>>
for the specified value type. - unsafe_
impl_ data_ type - A convenience macro provided to easily implement
DataType
forValue
types that are neither slices nor references, for whichValueType
,InnerType
, andDerefTargetType
are all equal toT
.
Structs§
- Interned
- The main type of this crate. Represents a unique, heap-allocated, statically interned value that will exist for the life of the program.
Functions§
- num_
interned - Returns the number of items currently interned by
Interned
on the current thread for the specified typeT
. This is useful for testing and debugging. - num_
memoized - Returns the number of items currently memoized by
Memoized
on the current thread for the specified typeT
. This is useful for testing and debugging.