Crate intern_arc

Interning library based on atomic reference counting

This library differs from arc-interner (which served as initial inspiration) in that

  • interners must be created and can be dropped (for convenience functions see below)
  • it does not dispatch based on TypeId (each interner is for exactly one type)
  • it offers both Hash-based and Ord-based storage
  • it handles unsized types without overhead, so you should use Interned<str> instead of Interned<String>
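To illustrate the last point, here is a small sketch that uses the standard library's Arc as a stand-in (it is not this crate's Interned type, and the helper names shared_str and shared_string are made up for the example). A reference-counted str lives in a single allocation behind one fat pointer, whereas a reference-counted String carries a second heap buffer and a second pointer hop.

```rust
use std::mem::size_of;
use std::sync::Arc;

/// Builds a reference-counted `str`: the bytes live in the same
/// allocation as the reference counts.
fn shared_str(s: &str) -> Arc<str> {
    Arc::from(s)
}

/// Builds a reference-counted `String`: the Arc allocation holds the
/// String header, which in turn points at a separate heap buffer.
fn shared_string(s: &str) -> Arc<String> {
    Arc::new(s.to_owned())
}

fn main() {
    let a = shared_str("hello");
    let b = shared_string("hello");

    // Both handles give access to the same text.
    assert_eq!(&*a, b.as_str());

    // Arc<str> is a fat pointer (address + length); Arc<String> is thin,
    // because the length lives inside the String it points to.
    assert_eq!(size_of::<Arc<str>>(), 2 * size_of::<usize>());
    assert_eq!(size_of::<Arc<String>>(), size_of::<usize>());
}
```

The same reasoning is why an interned str avoids the extra allocation that an interned String would need.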

Unfortunately, this combination of features makes the use of unsafe Rust unavoidable. The handling of reference counting and the construction of unsized values have been adapted from the standard library’s Arc type. Additionally, the test suite also passes under miri, which checks against some classes of undefined behavior in the unsafe code (including memory leaks). Dedicated tests employing loom are in place to verify adherence to the Rust memory model.

API flavors

The same API is provided in two flavors:

  • HashInterner: Hash-based storage
  • OrdInterner: Ord-based storage

Interning small values takes on the order of 100–200 ns on a typical server CPU. The Ord-based storage has an advantage when interning large values (like slices greater than 1 kB). The Hash-based storage has an advantage when keeping lots of values (many thousands and up) interned at the same time.

Nothing beats your own benchmarking, though.
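In that spirit, a rough timing harness might look like the sketch below. The two helpers are toy stand-ins built on the standard library's HashSet and BTreeSet, not this crate's HashInterner and OrdInterner, and all names (intern_hash, intern_ord, time_interning) as well as the workload are assumptions made for illustration. A real measurement should call the crate's interners with your own data, in a release build.

```rust
use std::collections::{BTreeSet, HashSet};
use std::sync::Arc;
use std::time::{Duration, Instant};

/// Toy Hash-based storage: a lookup hashes the whole value.
fn intern_hash(set: &mut HashSet<Arc<[u8]>>, value: &[u8]) -> Arc<[u8]> {
    if let Some(existing) = set.get(value) {
        return Arc::clone(existing);
    }
    let fresh: Arc<[u8]> = Arc::from(value);
    set.insert(Arc::clone(&fresh));
    fresh
}

/// Toy Ord-based storage: a lookup compares against a few stored values
/// and can stop each comparison at the first differing byte.
fn intern_ord(set: &mut BTreeSet<Arc<[u8]>>, value: &[u8]) -> Arc<[u8]> {
    if let Some(existing) = set.get(value) {
        return Arc::clone(existing);
    }
    let fresh: Arc<[u8]> = Arc::from(value);
    set.insert(Arc::clone(&fresh));
    fresh
}

/// Runs `intern` once per value and returns the elapsed wall-clock time.
fn time_interning<F: FnMut(&[u8])>(values: &[Vec<u8>], mut intern: F) -> Duration {
    let start = Instant::now();
    for v in values {
        intern(v.as_slice());
    }
    start.elapsed()
}

fn main() {
    // 1,000 distinct 2 kB values (workload chosen arbitrarily for the sketch).
    let values: Vec<Vec<u8>> = (0..1000u32)
        .map(|i| {
            let mut v = vec![0u8; 2048];
            v[..4].copy_from_slice(&i.to_be_bytes());
            v
        })
        .collect();

    let mut hashed = HashSet::new();
    let mut ordered = BTreeSet::new();
    let t_hash = time_interning(&values, |v| {
        intern_hash(&mut hashed, v);
    });
    let t_ord = time_interning(&values, |v| {
        intern_ord(&mut ordered, v);
    });
    println!("hash: {:?}, ord: {:?}", t_hash, t_ord);
}
```

Single-shot timings like these are noisy, so treat the numbers as a first hint and repeat the runs (or use a benchmarking harness) before drawing conclusions.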

Convenience access to interners

When employing interning within, for example, a dedicated deserialiser thread, it is best to create an interner and use it locally, avoiding further synchronisation overhead. You can also store interners in thread-local variables if you only care about deduplication per thread.
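The per-thread idea can be sketched with the standard library alone. The pool below is a toy stand-in, not this crate's interner: POOL and intern_local are hypothetical names, values are held in Rc rather than the crate's atomically counted handles, and the toy pool never evicts anything. It only shows that a thread-local pool deduplicates within a thread, without locks, and not across threads.

```rust
use std::cell::RefCell;
use std::collections::HashSet;
use std::rc::Rc;

thread_local! {
    /// One pool per thread; no locks or atomics are involved.
    static POOL: RefCell<HashSet<Rc<str>>> = RefCell::new(HashSet::new());
}

/// Returns this thread's shared copy of `s`, storing it on first use.
fn intern_local(s: &str) -> Rc<str> {
    POOL.with(|pool| {
        let mut pool = pool.borrow_mut();
        if let Some(existing) = pool.get(s) {
            return Rc::clone(existing);
        }
        let fresh: Rc<str> = Rc::from(s);
        pool.insert(Rc::clone(&fresh));
        fresh
    })
}

fn main() {
    let a = intern_local("hello");
    let b = intern_local("hello");
    // Same thread: both handles share one allocation.
    assert!(Rc::ptr_eq(&a, &b));

    // Another thread has its own pool and therefore its own copy.
    let other_addr = std::thread::spawn(|| intern_local("hello").as_ptr() as usize)
        .join()
        .unwrap();
    assert_ne!(a.as_ptr() as usize, other_addr);
}
```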

That said, this crate also provides convenience functions based on a global type-indexed pool:

use intern_arc::{global::hash_interner, Interned};

let i1 = hash_interner().intern_ref("hello"); // -> Interned<str>
let i2 = hash_interner().intern_sized(vec![1, 2, 3]); // -> Interned<Vec<i32>>

You can also use the type-indexed pool yourself to control its lifetime:

use intern_arc::{global::OrdInternerPool, Interned};

let mut pool = OrdInternerPool::new();
let i: Interned<[u8]> = pool.get_or_create().intern_box(vec![1, 2, 3].into());

Caveat emptor!

This crate’s Interned type does not optimise equality using pointer comparisons. There is a race condition between dropping a value and interning that same value, which can lead to “orphaned” instances: interning that same value again later will yield a different storage location. All similarly constructed interning implementations share this caveat (e.g. internment or the above-mentioned arc-interner).
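The consequence can be shown with plain Arc values from the standard library. This is an illustration only: the race itself is not reproduced here, and the “orphan” is simulated by allocating the same text twice. The helper name same_value is made up for the example.

```rust
use std::sync::Arc;

/// Compares two handles by the values they point to, not by address.
fn same_value(a: &Arc<str>, b: &Arc<str>) -> bool {
    **a == **b
}

fn main() {
    let original: Arc<str> = Arc::from("hello");
    // Stand-in for an orphaned instance: same contents, separate allocation.
    let orphan: Arc<str> = Arc::from("hello");

    // Value comparison still reports the two as equal...
    assert!(same_value(&original, &orphan));
    // ...while a pointer comparison would wrongly treat them as different.
    assert!(!Arc::ptr_eq(&original, &orphan));
}
```

Code that relies on address identity for interned values would therefore give wrong answers whenever an orphan exists, which is why equality has to go through the values.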

Modules

global

Convenience functions for managing type-indexed interner pools

Structs

HashInterner
Interned
OrdInterner