
varlen

Ergonomic variable-length types.

  1. Summary
  2. Motivation
  3. Examples
  4. Overview of types
  5. Use of Pin
  6. Feature flags

Summary

varlen defines foundational types and traits for working with variable-length types in Rust.

The main example of a variable-length type is a struct that stores a dynamically-sized array directly in its storage, without requiring a pointer to a separate memory allocation. varlen helps you define such types, and lets you build arbitrary concatenations and structs of them. Additionally, it provides equivalents of the standard library’s Box<T> and Vec<T> types that are adapted to work well with variable-length types.

If you want to reduce the number of pointer indirections in your types by storing variable-sized arrays directly in your objects, then varlen is the library for you.

Motivation

Traditionally, when we use variable-sized data such as strings or arrays, we use a pointer to a separately allocated object. For example, the following object …

type Person = (/* age */ usize, /* name */ Box<str>, /* email */ Box<str>);
let person: Person = 
    (16, Box::from("Harry Potter"), Box::from("harry.potter@example.com"));

… is represented in memory like this, with three separately allocated objects:

Sometimes we can reduce the number of object allocations by bringing the variable-length storage directly into the parent object, perhaps with a memory layout like this:

This layout reduced the number of object allocations from 3 to 2, potentially improving memory allocator performance, and potentially also improving CPU cache locality. It also reduced the number of pointers from 3 to 2, saving memory.

The main disadvantage of this layout is that the size and layout of the Person object are not known at compile time; they are only known at runtime, once the lengths of the strings are known. Working with such layouts in plain Rust is cumbersome, and also requires unsafe code to do the necessary pointer arithmetic.
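To make the idea of inline storage concrete, here is a minimal sketch in plain, safe Rust that packs the same fields into a single allocation. It is not how varlen lays out its types: PackedPerson, HEADER, and the byte format are hypothetical, chosen only for illustration, and the sketch trades the pointer arithmetic for manual offset bookkeeping.

```rust
/// Hypothetical packed record (not varlen's layout): one allocation holding
/// [age: 8 bytes][name_len: 8 bytes][name bytes][email bytes].
struct PackedPerson {
    buf: Box<[u8]>,
}

/// Size of the fixed part: age plus name length.
const HEADER: usize = 16;

impl PackedPerson {
    fn new(age: u64, name: &str, email: &str) -> Self {
        let mut buf = Vec::with_capacity(HEADER + name.len() + email.len());
        buf.extend_from_slice(&age.to_le_bytes());
        buf.extend_from_slice(&(name.len() as u64).to_le_bytes());
        buf.extend_from_slice(name.as_bytes());
        buf.extend_from_slice(email.as_bytes());
        PackedPerson { buf: buf.into_boxed_slice() }
    }

    /// Reads a little-endian u64 stored at `offset`.
    fn read_u64(&self, offset: usize) -> u64 {
        let mut bytes = [0u8; 8];
        bytes.copy_from_slice(&self.buf[offset..offset + 8]);
        u64::from_le_bytes(bytes)
    }

    fn age(&self) -> u64 {
        self.read_u64(0)
    }

    /// The name starts right after the header; its length is stored in the header.
    fn name(&self) -> &str {
        let len = self.read_u64(8) as usize;
        std::str::from_utf8(&self.buf[HEADER..HEADER + len])
            .expect("name bytes were copied from a &str")
    }

    /// The email offset depends on the name length, so it is only known at runtime.
    fn email(&self) -> &str {
        let start = HEADER + self.read_u64(8) as usize;
        std::str::from_utf8(&self.buf[start..])
            .expect("email bytes were copied from a &str")
    }
}

fn main() {
    let p = PackedPerson::new(16, "Harry Potter", "harry.potter@example.com");
    println!("{} {} {}", p.age(), p.name(), p.email());
}
```

Every accessor has to recompute offsets from stored lengths, and nothing in the type system ties the header to the payload. That bookkeeping is what becomes tedious and error-prone as the number of variable-length fields grows.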

varlen lets you easily define and use such types, without you having to write any unsafe code. The following code will create an object with the memory layout from above:

use varlen::prelude::*;
type Person = Tup3</* age */ FixedLen<usize>, /* name */ Str, /* email */ Str>;
let person: VBox<Person> = VBox::new(tup3::Init(
    FixedLen(16),
    Str::copy("Harry Potter"),
    Str::copy("harry.potter@example.com"),
));

Examples

use varlen::prelude::*;

// Define a variable-length tuple:
type MyTuple = Tup3<FixedLen<usize>, Str, Array<u16>>;
let my_tuple: VBox<MyTuple> = VBox::new(tup3::Init(
    FixedLen(16), Str::copy("hello"), Array::copy(&[1u16, 2])));

// Put multiple objects in a sequence, with tightly packed memory layout:
let sequence: Seq<MyTuple> = seq![my_tuple.vcopy(), my_tuple.vcopy()];

// Or arena-allocate them, if the "bumpalo" crate feature is enabled:
let arena = bumpalo::Bump::new();
let arena_tuple: Owned<MyTuple> = Owned::new_in(my_tuple.vcopy(), &arena);

// Define a newtype wrapper for the tuple:
define_varlen_newtype! {
    #[repr(transparent)]
    pub struct MyStruct(MyTuple);

    with init: struct MyStructInit<_>(_);
    with inner_ref: fn inner(&self) -> &_;
    with inner_mut: fn inner_mut(self: _) -> _;
}
let my_struct: VBox<MyStruct> = VBox::new(MyStructInit(my_tuple));

// Define a variable-length struct via a procedural macro, if the "macro"
// crate feature is enabled.
#[define_varlen]
struct MyMacroStruct {
    age: usize,
    #[varlen]
    name: Str,
    #[varlen]
    email: Str,
    #[varlen]
    child: MyStruct,
}
let s: VBox<MyMacroStruct> = VBox::new(
    my_macro_struct::Init{
        age: 16,
        name: Str::copy("Harry Potter"),
        email: Str::copy("harry.potter@example.com"),
        child: my_struct,
    }
);

// #[define_varlen] also lets you directly specify array lengths:
#[define_varlen]
struct MultipleArrays {
    #[controls_layout]
    len: usize,

    #[varlen_array]
    array1: [u16; *len],

    #[varlen_array]
    array2: [u8; *len],

    #[varlen_array]
    half_array: [u16; (*len) / 2],
}
let base_array = vec![0u16, 64000, 13, 105];
let a: VBox<MultipleArrays> = VBox::new(multiple_arrays::Init{
    len: base_array.len(),
    array1: FillSequentially(|i| base_array[i]),
    array2: FillSequentially(|i| base_array[base_array.len() - 1 - i] as u8),
    half_array: FillSequentially(|i| base_array[i * 2]),
});
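
To see which values the three arrays in MultipleArrays would hold, the same closures can be evaluated in plain Rust. This sketch assumes that FillSequentially calls its closure once per index, in order starting from 0; fill is a hypothetical stand-in helper, not part of varlen.

```rust
/// Hypothetical stand-in for sequential filling: calls `f` for each index in 0..len.
fn fill<T>(len: usize, f: impl FnMut(usize) -> T) -> Vec<T> {
    (0..len).map(f).collect()
}

fn main() {
    let base_array = vec![0u16, 64000, 13, 105];
    let len = base_array.len();

    // Same closures as in the MultipleArrays initializer.
    let array1 = fill(len, |i| base_array[i]);
    let array2 = fill(len, |i| base_array[len - 1 - i] as u8);
    let half_array = fill(len / 2, |i| base_array[i * 2]);

    assert_eq!(array1, [0, 64000, 13, 105]);
    // `as u8` keeps only the low byte, and 64000 is 0xFA00, so it becomes 0.
    assert_eq!(array2, [105, 13, 0, 0]);
    assert_eq!(half_array, [0, 13]);
}
```

Note that the `as u8` cast truncates silently, so array2 does not preserve the value 64000.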

Overview of types

varlen provides variable-length versions of various standard-library types and traits. This table gives the correspondence:

| Name | Fixed-length type T | Variable-length type T | Notes |
| --- | --- | --- | --- |
| Immutable reference | &T | &T | |
| Mutable reference | &mut T | Pin<&mut T> | Pin<> required for safety, see below |
| Owning, non-allocated | T | Owned<'storage, T> | Owned<T> is still a pointer to T’s payload |
| Owning, allocated | Box<T> | VBox<T> | |
| Sequence | Vec<T> | Seq<T> | Seq has tightly-packed variable-size elements. Random access is somewhat restricted |
| String | String | Str | String payload immediately follows the size, no pointer following |
| Array (fixed-size elems) | Box<[u16]> | Array<u16> | Array payload immediately follows the size, no pointer following |
| Tuple | (T, U) | Tup2<T, U> | Field U might not be at a statically known offset from start of object |
| Clone | Clone::clone() | VClone::vclone() | |
| Copy | | VCopy::vcopy() | |
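
The note that random access in a tightly packed sequence is restricted can be illustrated with a small std-only sketch. FlatSeq and its length-prefixed byte format are hypothetical and are not how Seq<T> is implemented; the point is only that when elements have different sizes, the position of element n depends on the sizes of all elements before it.

```rust
/// Hypothetical flat sequence (not varlen's Seq): each element is stored as
/// [len: 4 bytes, little-endian][payload bytes], back to back in one buffer.
struct FlatSeq {
    buf: Vec<u8>,
}

impl FlatSeq {
    fn new() -> Self {
        FlatSeq { buf: Vec::new() }
    }

    /// Appends one element: its length prefix followed by its bytes.
    fn push(&mut self, s: &str) {
        self.buf.extend_from_slice(&(s.len() as u32).to_le_bytes());
        self.buf.extend_from_slice(s.as_bytes());
    }

    /// Walks the buffer front to back, decoding one element per step.
    fn iter(&self) -> impl Iterator<Item = &str> + '_ {
        let mut rest: &[u8] = &self.buf;
        std::iter::from_fn(move || {
            if rest.is_empty() {
                return None;
            }
            let (prefix, tail) = rest.split_at(4);
            let len = u32::from_le_bytes([prefix[0], prefix[1], prefix[2], prefix[3]]) as usize;
            let (payload, tail) = tail.split_at(len);
            rest = tail;
            Some(std::str::from_utf8(payload).expect("payload was copied from a &str"))
        })
    }

    /// Indexing has to skip over all earlier elements, so it is O(n), not O(1).
    fn get(&self, index: usize) -> Option<&str> {
        self.iter().nth(index)
    }
}

fn main() {
    let mut seq = FlatSeq::new();
    seq.push("a");
    seq.push("hello");
    seq.push("");
    for s in seq.iter() {
        println!("{:?}", s);
    }
    println!("{:?}", seq.get(1));
}
```

Sequential iteration stays cheap in such a layout, while indexed lookup requires either a scan or a separate index of offsets.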

Use of Pin

Mutable references to variable-length types use Pin<&mut T> rather than &mut T. By doing so, we prevent patterns such as calling std::mem::swap on variable-length types. Such patterns would be a safety hazard, because the part of the type that the Rust compiler knows about when calling std::mem::swap is just the “fixed-size head” of the type. However, almost all variable-length types additionally have a “variable-sized tail” that the Rust compiler doesn’t know about. Swapping the head but not the tail could violate a type’s invariants, potentially breaking safety.
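
The head/tail hazard can be simulated in safe Rust with plain byte buffers. In this hypothetical format, which is not varlen's, byte 0 plays the role of the fixed-size head and records the payload length, and the remaining bytes play the role of the tail. Swapping only the heads mirrors what std::mem::swap would do to a type whose statically known size covers just the head.

```rust
/// Hypothetical buffer format: byte 0 is the "head" (payload length),
/// the remaining bytes are the "tail" (payload). Payloads must be under 256 bytes.
fn make(payload: &[u8]) -> Vec<u8> {
    assert!(payload.len() < 256);
    let mut buf = vec![payload.len() as u8];
    buf.extend_from_slice(payload);
    buf
}

/// The invariant a reader of this format relies on: the head matches the tail length.
fn is_consistent(buf: &[u8]) -> bool {
    buf[0] as usize == buf.len() - 1
}

fn main() {
    let mut a = make(b"hi");
    let mut b = make(b"hello");
    assert!(is_consistent(&a) && is_consistent(&b));

    // Swap the heads only; the tails stay where they are.
    std::mem::swap(&mut a[0], &mut b[0]);

    // Each head now describes the other buffer's tail.
    assert!(!is_consistent(&a));
    assert!(!is_consistent(&b));
}
```

In this simulation the mismatch is only a logic error. For a real variable-length type, code that trusts the head to describe the tail could read past the end of the allocation, which is why mutable access is restricted.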

If you never write unsafe code, you don’t need to worry about this issue. The only practical consequence is that mutable access to a variable-length type is always mediated through Pin<&mut T> rather than &mut T, and you will have to work with the slightly more cumbersome Pin APIs.

On the other hand, if you write unsafe code, you may have to be aware of the following invariant. If T is a variable-length type, we require that any reference &T points to a “valid” T, which we consider to be one which has a fixed-length head (of size std::mem::size_of::<T>()) followed by a variable-length tail, and the head and tail are “consistent” with each other. Here, “consistent” means that they were produced by a call to one of the type’s Initializer instances. In unsafe code, where you might have access to a &mut T (without a Pin), you must avoid code patterns which modify the head without also correspondingly modifying the tail.
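
As a rough illustration of the head-plus-tail picture, the standard library's std::alloc::Layout can compute the size of a fixed-size head followed by a runtime-sized array. Head and head_and_tail_layout are hypothetical names, and varlen's own layout computation may differ; the sketch only shows that size_of reports the head alone while the full object size depends on a runtime length.

```rust
use std::alloc::Layout;

/// Hypothetical fixed-size head that records how many u16 values follow it.
#[repr(C)]
struct Head {
    len: usize,
}

/// Returns the layout of the whole object and the byte offset of the tail.
fn head_and_tail_layout(len: usize) -> (Layout, usize) {
    let head = Layout::new::<Head>();
    let tail = Layout::array::<u16>(len).expect("tail size overflows");
    // `extend` places the tail after the head, inserting padding if needed.
    let (combined, tail_offset) = head.extend(tail).expect("layout overflows");
    (combined.pad_to_align(), tail_offset)
}

fn main() {
    let head_size = std::mem::size_of::<Head>();
    let (layout, tail_offset) = head_and_tail_layout(4);

    // The compiler only knows about the head ...
    println!("size_of::<Head>() = {}", head_size);
    // ... while the whole object also includes four u16 values.
    println!("full size = {}, tail at offset {}", layout.size(), tail_offset);

    assert_eq!(tail_offset, head_size);
    assert_eq!(layout.size(), head_size + 4 * std::mem::size_of::<u16>());
}
```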

Feature flags

This crate has no required dependencies. The following feature flags exist, which can turn on some dependencies.

  • bumpalo. Enables support for allocating an Owned<T> in a bumpalo::Bump arena. Adds a dependency on bumpalo.
  • macro. Enables procedural macro support, for defining variable-length structs using #[define_varlen]. Adds a dependency on varlen_macro, syn and quote.
  • doc. Enables pretty SVG diagrams in documentation. Adds a lot of dependencies.

Re-exports

pub use crate::str::Str;
pub use array::Array;
pub use owned::Owned;
pub use seq::Seq;
pub use vbox::VBox;

Modules

A variable-length array with inline storage.

An ArrayInitializer<T> is an object that knows how to initialize the memory for a [T]. It is useful for initializing [T; N] or (of more relevance for this crate) initializing a varlen::array::Array<T>.

Marker types for variable-length fields of structs.

Macros for defining newtype wrappers around variable-length types.

A pointer to T that calls its destructor but not its deallocator when dropped.

Single module with almost all varlen exports.

A sequence of variable-length objects in a flat buffer.

A string with inline storage.

Tuples of variable-length types, potentially mixed with fixed-length types.

Equivalent of Box<T> for variable-length types.

Macros

Defines a newtype wrapper around a varlen type.

Lifts a crate::Initializer<T> implementation to a newtype.

Creates a sequence with the specified elements.

Structs

Presents a fixed-length type T as a variable-length type.

Initializer that clones from a FixedLen<T>.

The variable-length layout of a fixed-length type.

An initializer that constructs a bytewise copy of T.

Traits

A type that knows how to construct a variable-length object.

A layout of a variable-length object.

Support for cloning variable-length types.

Support for shallow byte-wise copy of varlen types.

The fundamental trait for variable-length types.

Attribute Macros

Macro for defining variable-length structs.