---
---
{% include header-pre.html %}
<TITLE>Intrusive Data Structures in Rust</TITLE>
{% include header-post.html %}
{% include body-pre.html %}
<H1>Intrusive Data Structures in Rust</H1>
<P>
Intrusive data structures, when used appropriately, can improve code
reliability, resource utilization, and predictability of
performance, by tremendously reducing the number of allocations
necessary in a system design. Since it can be assumed that any
allocation may fail, reducing the number of allocations will
generally make it easier to reason rigorously about the possible
fault-points in an application.
</P>
<P>
Making faults easier to reason about and avoid (while keeping
resource usage low) is a large part of why the Rust language exists,
so it would be great if the language could take advantage of
intrusive structures. This is more difficult in Rust than in other
systems languages, because one of the major ways that Rust attempts
to make faults easier to reason about is by restricting the creation
of variable aliases - that is, by making it difficult to access the
same memory in two different ways. Working with intrusive structures
often means working with pointer arithmetic, which makes memory
aliasing much more difficult to statically analyze. It is difficult
to get code that uses intrusive structures to work in Rust.
</P>
<P>
I wrote
<A HREF="https://crates.io/crates/containerof">containerof</A> to
make generalized intrusive structures usable from Rust. It differs
from other approaches I've seen, in that I am <EM>not</EM> trying to
design specific intrusive structures in this module, but am rather
trying to solve the problem of supporting intrusive
structures <EM>in general</EM>. Here are the basic concepts:
</P>
<DL>
<DT>
<CODE>container</CODE>, <CODE>field</CODE>,
and <CODE>fieldtype</CODE>
</DT>
<DD>
<P>
At heart, an intrusive structure is a value embedded within a
container, in which the embedded value directly represents some
capability, and by embedding this value within a container the
container acquires that capability. That is,
the <CODE>fieldtype</CODE> describes a capability,
a <CODE>field</CODE> embodies the capability, and
a <CODE>container</CODE> receives the capability by embedding
a <CODE>field</CODE>.
</P>
</DD>
<DT>
<CODE>containerof::Intrusive</CODE>
and <CODE>containerof_intrusive!</CODE>
</DT>
<DD>
<P>
A trait, describing functions for translating from a
container-pointer to a field-pointer, and vice-versa.
The <CODE>containerof_intrusive!</CODE> macro defines a type
that implements this trait for a particular container and field
combination.
</P>
</DD>
<DT>
<CODE>from</CODE>, <CODE>into</CODE>, <CODE>as</CODE>,
and <CODE>of</CODE>
</DT>
<DD>
<P>
A <CODE>containerof::Intrusive</CODE> type defines several
routines, which describe different means of
translating <CODE>container</CODE> pointers
to <CODE>field</CODE> pointers, and vice-versa. To simplify
the <CODE>Intrusive</CODE> trait definition, the API follows a
hub-and-spokes type design: to translate
from <CODE>container</CODE> to <CODE>field</CODE>,
or <CODE>field</CODE> to <CODE>container</CODE>, or to or from
an <CODE>IntrusiveAlias</CODE> (described later), you must go
through an intermediate representation (of the type defined by
the <CODE>containerof_intrusive!</CODE> macro). This
intermediate type has translaters from source representations
(with prefixes <CODE>from</CODE> and <CODE>of</CODE>,
where <CODE>from</CODE> indicates a move and <CODE>of</CODE>
involves a borrow) and to source representations (with
prefixes <CODE>into</CODE> and <CODE>as</CODE>,
where <CODE>into</CODE> indicates a move and <CODE>as</CODE>
indicates a borrow).
</P>
</DD>
<DT>
<CODE>containerof::OwnBox</CODE>
and <CODE>containerof::BorrowBox</CODE>
</DT>
<DD>
<P>
Intrusive structures must, at root, rely on pointer-arithmetic to
allow translation between structure fields and their containers.
Rust provides a few ways of representing pointers internally:
</P>
<UL>
<LI>Borrowing pointers (<CODE>&</CODE> pointer types);</LI>
<LI>Unsafe pointers (<CODE>*</CODE> pointer types); and</LI>
<LI>Box pointers (<CODE>Box<></CODE> pointer types).
</UL>
<P>
None of these adequately represent the pointer manipulations
desired by intrusive structures, so <CODE>containerof</CODE>
defines two new classes of
pointer: <CODE>containerof::OwnBox</CODE>
and <CODE>containerof::BorrowBox</CODE>
(and <CODE>containerof::BorrowBoxMut</CODE>).
</P>
<P>
Owning pointers for intrusive structures should <EM>never</EM>
be dropped, since the drop behavior will be undefined unless
invoked on the outermost container, and <CODE>containerof</CODE>
is designed to allow transferring ownership of the container by
transferring ownership of the embedded field.
Therefore, <CODE>containerof</CODE> defines its own owning box
structure, called <CODE>OwnBox</CODE>. This is intended to be a
linear-type, so that eventually compilation will fail if
an <CODE>OwnBox</CODE> is dropped, rather than consumed. Until
Rust supports linear types, <CODE>OwnBox</CODE> types panic when
dropped.
</P>
<P>
On the other hand, borrow pointers do adequately represent the
semantics we want when translating borrows of fields to and from
borrows of containers. Unfortunately, there is an implementation
snag that prevents using borrow-pointers
for <CODE>Intrusive</CODE> intermediate types: it is impossible
to convert a borrow for the in-memory container or field into a
borrow for an intermediate type that does <EM>not</EM> exist in
memory at the point where the translation is necessary.
The <CODE>BorrowBox</CODE> and <CODE>BorrowBoxMut</CODE> types
are move types that represent a borrowing operation, so that
constructing a <CODE>BorrowBox</CODE> will behave very similarly
to constructing a borrow pointer.
</P>
</DD>
<DT><CODE>containerof::IntrusiveAlias</CODE></DT>
<DD>
The last major concept in the <CODE>containerof</CODE> crate is
the <CODE>containerof::IntrusiveAlias</CODE>
structure. <CODE>containerof::IntrusiveAlias</CODE> is designed to
have the same memory layout as any type defined by invocation of
the <CODE>containerof_intrusive!</CODE> macro. That's really all
there is to it. The major motivation for the class is precisely
that it is type-unsafe, so that it is possible to write code that
works with intrusive structures, without having
type-parameterization result in multiple identical copies of
intrusive structure code. In other words, it provides a means to
avoid excessive code overhead from monomorphisation.
</DD>
</DL>
{% include body-post.html %}