Crate justly

Crate justly 

Source
Expand description

Collections that don’t need extra bounds-checking.

§Overview

Justly implements “justified” wrappers for the standard Rust data structures in std::collections. A justified container is one whose keys, such as the usize indices into a Vec, keep track of their validity. This has a few major consequences, listed below.

§Trustworthiness

Justly is built on the idea of trustworthy knowledge: call::Call represents a value called by a name, which lets you encode knowledge about that named thing, such as key::Key, an index that you know is valid. If you uphold the safety properties described in these APIs, then such knowledge is called trustworthy, and so you can rely on it to build safe abstractions. Now unchecked indexing can be well and truly safe!

§No extra checks

A “checked” method is one that performs bounds checking, and an “unchecked” method is one that unsafely omits it. Methods in Justly that safely omit bounds checks are called check-free. The Rust compiler attempts to elide checks when it can prove that they will always succeed. Justly implements check-free APIs by giving you the tools to prove that checks can be skipped.

It’s worth noting that some checks are strictly necessary! For example, you must do a runtime test to tell whether an arbitrary integer index is a valid key. Justly only lets you avoid extra checks, which are not strictly necessary, yet the compiler cannot prove it. But once you have gotten a typed key, it proves that the test has already been done, and doesn’t need to be repeated.

Also, most indices that you get from the collection APIs are already known to be valid keys, so Justly merely adds that information in its wrapper APIs.

§No invalid indices

When referring to elements in a collection, Rust encourages using indices, relative to the collection, instead of absolute pointers. When a container is mutable, mutating methods can reallocate the container’s storage if its capacity is exceeded, thereby invalidating all pointers to that storage. So relative indexing makes it easier to uphold memory safety compared to absolute addressing, because indices automatically respond to capacity changes.

However, relative indices still don’t automatically handle changes in size: any method that decreases the size of the container must naturally invalidate any indices referring to elements outside of its new bounds.

Worse, since indices are just bare integers, they come with no guarantees about the relationship between the index and the value at that index. For example, suppose we have a vector, v, and we compute some indices into it, i and j.

let mut v = vec![1, 7, 16, 27, 40, 55];

let i = v.iter().enumerate().position(|(k, &v)| k * 10 == v).unwrap();
let j = v.iter().enumerate().position(|(k, &v)| k * 11 == v).unwrap();

println!("{}", v[i]);  // valid (40)
println!("{}", v[j]);  // valid (55)

If we remove() an element, then this can easily break code that makes assumptions about the indices: remove() not only makes j inaccessible, but also silently changes the meaning of i. This is like a use-after-free error.

v.remove(2);

println!("{}", v[i]);  // logic error (55)
println!("{}", v[j]);  // runtime error (panic)

And of course if we push() an element, the problem is compounded: it silently makes j accessible again, but with a different meaning as well.

v.push(68);

println!("{}", v[j]);  // logic error (68)

So Justly wraps those mutating methods that could invalidate the keys of a collection in consuming methods that give back a modified object, along with information about how the new keys are related to the old ones.

We do this with phantom type parameters, so there is no runtime cost.

§About this documentation

Wrapper types and methods are annotated with tags that describe their purpose, which are marked with curly braces {…} for easier searching.

§{free}

Marks methods that used to require bounds-checking on the original type, but are now check-free.

§{just}

On a container type, this says that the type is a wrapper for a plain container where some methods use typestate to add safety.

On a method, it marks places where we change the signature, typically either:

  • Taking ownership of self to replace a mut method on the wrapped type with one that tracks knowledge about the container

  • Adding wrappers such as key::Key to record some knowledge about the result

§{like}

Links to the original wrapped thing.

§{safe}

Marks things that used to be unsafe (usually unchecked) but are now safe (implying check-freedom).

§TODO

§Allow naming unsized things

call::Call and link::Link ask for T: Sized, but they don’t really need to, since all the other fields are meant to be phantoms.

§Allocator support

To support allocators, there should be a type parameter on the wrappers, such as A = std::alloc::Global.

§Implicit dereferencing

We’re uneasy about whether to add implementations of Deref and DerefMut that would let a value be implicitly called by a name, like these for Vec:

impl<'name, T> Deref for Vec<'name, T> {
    type Target = Call<'name, vec::Vec<T>>;
    fn deref(&self) -> &Call<'name, vec::Vec<T>> {
        &self.same
    }
}

impl<'name, T> DerefMut for Vec<'name, T> {
    fn deref_mut(
        &mut self,
    ) -> &mut Call<'name, vec::Vec<T>> {
        &mut self.same
    }
}

§Tracking capacity

At the time of this writing, these APIs only track the validity of indices. This means that if std::collections has a mutating method that doesn’t change any indices, we can just forward it and keep the same API, which is likely better for usability.

However, in principle we could just as well track the capacity too. This would reflect the std::collections guarantees about allocation at the type level, making it possible in some circumstances to statically guarantee that code will not allocate. But it would come at the ergonomic cost of using a different API for functions that might change the capacity.

Modules§

binary_heap
{like}:std::collections::binary_heap
btree_map
{like}:std::collections::btree_map
call
name ≡ value relationships: define a unique static name for a dynamic value.
called
Gives names to values.
hash_map
{like}:std::collections::hash_map
key
Indices known to be valid keys in collections.
link
name × value relationships: pair a static name with a dynamic value.
linked_list
{like}:std::collections::linked_list
name
Static, type-level names for runtime values.
only
Helper for unique names.
proof
Static proofs.
prop
Common properties of collections.
vec
{just} {like}:std::vec {like}:std::slice
vec_deque
{like}:std::collections::vec_deque