Crate intern_arc

Interning library based on atomic reference counting

This library differs from arc-interner (which served as initial inspiration) in that

  • interners must be created and can be dropped (for convenience functions see below)
  • it does not dispatch based on TypeId (each interner is for exactly one type)
  • it offers both Hash-based and Ord-based storage
  • it handles unsized types without overhead, so you should intern str instead of String
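
To make the last point concrete, here is a minimal sketch of interning str through atomic reference counting, using only the standard library. It is not this crate's implementation: the name StrInterner is hypothetical, and unlike the crate this sketch never removes a value once its last outside reference is dropped.

```rust
use std::collections::HashSet;
use std::sync::{Arc, Mutex};

/// Hypothetical minimal interner for `str` (an illustration, not intern_arc).
pub struct StrInterner {
    // `Arc<str>` keeps the string bytes in the same allocation as the
    // reference counts, so there is no second indirection as with `Arc<String>`.
    set: Mutex<HashSet<Arc<str>>>,
}

impl StrInterner {
    pub fn new() -> Self {
        Self { set: Mutex::new(HashSet::new()) }
    }

    /// Returns a shared handle; equal strings share one allocation.
    pub fn intern(&self, s: &str) -> Arc<str> {
        let mut set = self.set.lock().unwrap();
        // Lookup by `&str` works because `Arc<str>: Borrow<str>`.
        if let Some(existing) = set.get(s) {
            return Arc::clone(existing);
        }
        let fresh: Arc<str> = Arc::from(s);
        set.insert(Arc::clone(&fresh));
        fresh
    }

    /// Number of distinct values currently stored.
    pub fn len(&self) -> usize {
        self.set.lock().unwrap().len()
    }
}
```

Interning "hello" twice through such a structure yields two handles to the same allocation, which can be confirmed with Arc::ptr_eq.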

Unfortunately, this combination of features makes the use of unsafe Rust unavoidable. The construction of unsized values has been adapted from the standard library’s Arc type; reference counting employs a parking_lot mutex since it must also ensure consistent access to and usage of the cleanup function that will remove the value from its interner. The test suite also passes under miri, which checks for some classes of undefined behavior in the unsafe code as well as for memory leaks.

API flavors

The same API is provided in two flavors: one with Hash-based storage and one with Ord-based storage.

Interning small values takes on the order of 100–200 ns on a typical server CPU. The ord-based storage has an advantage when interning large values (like slices greater than 1 kB). The hash-based storage has an advantage when keeping lots of values (many thousands and up) interned at the same time.

Nothing beats your own benchmarking, though.
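
As a starting point for such a benchmark, the sketch below times ord-based against hash-based deduplication with standard library collections. It does not measure this crate: intern_ord, intern_hash and time_interning are hypothetical helpers, and BTreeSet and HashSet merely stand in for the two storage flavors. One plausible reason for the difference on large values is that an ordered comparison can stop at the first differing byte, while hashing reads the whole value.

```rust
use std::collections::{BTreeSet, HashSet};
use std::sync::Arc;
use std::time::{Duration, Instant};

/// Hypothetical helper: deduplicate `value` in an ordered set.
pub fn intern_ord(set: &mut BTreeSet<Arc<[u8]>>, value: &[u8]) -> Arc<[u8]> {
    if let Some(found) = set.get(value) {
        return Arc::clone(found);
    }
    let fresh: Arc<[u8]> = Arc::from(value);
    set.insert(Arc::clone(&fresh));
    fresh
}

/// Hypothetical helper: deduplicate `value` in a hash set.
pub fn intern_hash(set: &mut HashSet<Arc<[u8]>>, value: &[u8]) -> Arc<[u8]> {
    if let Some(found) = set.get(value) {
        return Arc::clone(found);
    }
    let fresh: Arc<[u8]> = Arc::from(value);
    set.insert(Arc::clone(&fresh));
    fresh
}

/// Times both strategies over the same inputs: (ord time, hash time).
pub fn time_interning(values: &[Vec<u8>]) -> (Duration, Duration) {
    let mut ord = BTreeSet::new();
    let start = Instant::now();
    for v in values {
        intern_ord(&mut ord, v);
    }
    let ord_time = start.elapsed();

    let mut hash = HashSet::new();
    let start = Instant::now();
    for v in values {
        intern_hash(&mut hash, v);
    }
    (ord_time, start.elapsed())
}
```

Numbers from a loop like this are only indicative. Use an optimized build and inputs that resemble your real workload in value size and in the number of values alive at once.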

Convenience access to interners

When interning is confined to one place, for example a dedicated deserialiser thread, it is best to create an interner there and use it locally, which avoids further synchronisation overhead. You can also store interners in thread-local variables if you only care about deduplication per thread.
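
The thread-local approach can be sketched with the standard library alone. The names LOCAL_STRINGS and intern_local are hypothetical, and a plain HashSet stands in for one of this crate's interners. The point is only that per-thread storage needs no lock around the lookup.

```rust
use std::cell::RefCell;
use std::collections::HashSet;
use std::sync::Arc;

thread_local! {
    // Hypothetical per-thread pool: it is never shared between threads,
    // so a RefCell suffices and no mutex is taken on lookup.
    static LOCAL_STRINGS: RefCell<HashSet<Arc<str>>> = RefCell::new(HashSet::new());
}

/// Deduplicates `s` against the values interned on the current thread only.
pub fn intern_local(s: &str) -> Arc<str> {
    LOCAL_STRINGS.with(|cell| {
        let mut set = cell.borrow_mut();
        if let Some(existing) = set.get(s) {
            return Arc::clone(existing);
        }
        let fresh: Arc<str> = Arc::from(s);
        set.insert(Arc::clone(&fresh));
        fresh
    })
}
```

The trade-off is that two threads interning the same string end up with two separate allocations, which is what "deduplication per thread" implies.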

That said, this crate also provides convenience functions based on a global type-indexed pool:

use intern_arc::global::hash_interner;

let i1 = hash_interner().intern_ref("hello"); // -> InternedHash<str>
let i2 = hash_interner().intern_sized(vec![1, 2, 3]); // -> InternedHash<Vec<i32>>

You can also use the type-indexed pool yourself to control its lifetime:

use intern_arc::{global::OrdInternerPool, InternedOrd};

let mut pool = OrdInternerPool::new();
let i: InternedOrd<[u8]> = pool.get_or_create().intern_box(vec![1, 2, 3].into());
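
For readers wondering what "type-indexed" means here, the following is a rough sketch of such a pool built on std::any::TypeId. It is an assumption about the general technique, not this crate's code: Pool and OrdSet are hypothetical types, and get_or_create only mirrors the method name used above.

```rust
use std::any::{Any, TypeId};
use std::collections::{BTreeSet, HashMap};
use std::sync::Arc;

/// Hypothetical stand-in for an ord-based interner of one type `T`.
pub struct OrdSet<T: ?Sized>(BTreeSet<Arc<T>>);

impl<T: ?Sized + Ord> OrdSet<T> {
    /// Returns the stored handle if an equal value exists, else stores `value`.
    pub fn intern(&mut self, value: Arc<T>) -> Arc<T> {
        if let Some(found) = self.0.get(&value) {
            return Arc::clone(found);
        }
        self.0.insert(Arc::clone(&value));
        value
    }
}

/// Hypothetical pool holding at most one `OrdSet<T>` per type `T`.
#[derive(Default)]
pub struct Pool {
    by_type: HashMap<TypeId, Box<dyn Any>>,
}

impl Pool {
    /// Looks up the set for `T` by its `TypeId`, creating it on first use.
    pub fn get_or_create<T: ?Sized + Ord + 'static>(&mut self) -> &mut OrdSet<T> {
        self.by_type
            .entry(TypeId::of::<T>())
            .or_insert_with(|| Box::new(OrdSet::<T>(BTreeSet::new())) as Box<dyn Any>)
            .downcast_mut::<OrdSet<T>>()
            .expect("entry was created for exactly this type")
    }

    /// Number of distinct types that currently have a set.
    pub fn type_count(&self) -> usize {
        self.by_type.len()
    }
}
```

Because the pool is an ordinary value, dropping it drops every set it contains, which is the sense in which owning the pool lets you control its lifetime.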


  • global: Convenience functions for managing type-indexed interner pools