dizzy 0.1.0

Macros for safely interacting with DST newtypes

dizzy

This crate provides the DstNewtype derive macro for defining newtypes of dynamically-sized types (DSTs) and generating standard boilerplate methods and trait implementations.

For example, you could define a strictly ASCII string slice:

use dizzy::DstNewtype;

#[derive(PartialEq, DstNewtype)]
#[dizzy(invariant = str::is_ascii)]
#[dizzy(constructor = pub const from_str, getter = pub const as_str)]
#[dizzy(derive(Debug, Deref))]
#[repr(transparent)]
struct AsciiStr(str);

assert!(AsciiStr::from_str("").is_some());
assert!(AsciiStr::from_str("dizzy").is_some());
assert_eq!(AsciiStr::from_str("dizzy").unwrap().len(), 5);
assert_eq!(AsciiStr::from_str("λ"), None);

Or you could define a non-empty generic slice type:

use dizzy::DstNewtype;

#[derive(Debug, PartialEq)]
struct EmptySliceError;

const fn slice1_invariant<T>(slice: &[T]) -> Result<(), EmptySliceError> {
    match slice.is_empty() {
        true => Err(EmptySliceError),
        false => Ok(()),
    }
}

#[derive(Debug, PartialEq, DstNewtype)]
#[dizzy(invariant = slice1_invariant, error = EmptySliceError)]
#[dizzy(constructor = const new)]
#[dizzy(getter = const get)]
#[repr(transparent)]
struct Slice1<T> {
    inner: [T],
}

assert_eq!(Slice1::<()>::new(&[]), Err(EmptySliceError));
assert!(Slice1::new(&[()]).is_ok());

More examples can be found in the tests/ directory.

Motivation

The central focus of this crate is avoiding boilerplate, and in particular avoiding unsafe boilerplate. When working with DST newtypes, it is often necessary to use unsafe code for even the most basic operations: construction requires [core::mem::transmute], and cloning a boxed DST usually entails either casting raw pointers or a less optimal conversion through String or Vec. You might also want to implement common traits like Deref<Target = Inner>, AsRef<Inner>, TryFrom<Inner>, Into<Inner>, and Debug (via the inner field), and then define a corresponding owned type that wraps a mutable buffer. At a certain point you will find yourself with an absurd amount of code for a few simple types that do almost nothing except convey a guarantee about their contents.

But DST newtypes can be incredibly useful! The standard library uses them for types like CStr, OsStr, and Path, and almost every protocol implementation finds itself dealing with string slices that obey a certain invariant at some point. Take as an example RFC 8984, which defines a JSON representation for calendar data; you need at least the following newtypes of str:

  • Id, a string of at least 1 and at most 255 bytes that matches the regex /[A-Za-z0-9\-\_]*/.
  • TimeZoneId, a string beginning with / that is a valid RFC 5545 paramtext value.
  • ImplicitJsonPointer, an RFC 6901 JSON pointer string without the leading /.
  • JsonPointer, an RFC 6901 JSON pointer string with the leading /.
  • Uri, an RFC 3986 URI string.
  • Uid, a unique identifier of roughly 255 bytes which is probably an RFC 4122 UUID.

To implement all these newtypes, their traits, methods, and corresponding owned types by hand would be an enormous waste of time. You should only have to define a function for the invariant each of them obeys, declare that their inner type is str, and have the rest of the code automatically implemented. This is the exact purpose for which dizzy exists.

Attributes

By itself, applying #[derive(DstNewtype)] does almost nothing[^derive-dst-newtype-effects] except error if the invariant attribute has not been set. To generate methods and trait implementations, further options must be set using attributes of the forms #[dizzy(<key> = <value>)] and #[dizzy(<key>(...))]. These options can be spread across multiple attributes, or set within a single attribute separated by commas.

invariant and error

The invariant attribute must be set to the path of a function, which must take a single parameter of type &T and return either bool or Result<(), E> (for the inner type T and designated error type E). It is the only mandatory attribute, since it describes which values of the inner type are valid as values of the subtype.

If the invariant function returns Result<(), E>, then the attribute error = <ty> (in which <ty> is the type expression corresponding to E) must be set; setting error when the invariant function returns bool will cause a compilation error.

The invariant function is considered successful if it returns either true or Ok(()), and unsuccessful if it returns either false or Err(_). By definition, the subtype represents exactly the set of values of the inner type for which the invariant is successful.
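As a rough illustration, the two accepted invariant shapes might look like the following. The names is_ascii_bool, is_ascii_result, and NotAsciiError are hypothetical and chosen only for this sketch; the code uses the standard library alone and does not invoke the macro.

```rust
/// Hypothetical error type for the `Result`-returning invariant.
#[derive(Debug, PartialEq)]
pub struct NotAsciiError {
    /// Byte offset of the first non-ASCII byte.
    pub position: usize,
}

/// `bool` form: would be paired with `invariant = is_ascii_bool`
/// and no `error` attribute.
pub fn is_ascii_bool(s: &str) -> bool {
    s.is_ascii()
}

/// `Result` form: would be paired with
/// `invariant = is_ascii_result, error = NotAsciiError`.
pub fn is_ascii_result(s: &str) -> Result<(), NotAsciiError> {
    // Find the first byte outside the ASCII range, if any.
    match s.bytes().position(|b| !b.is_ascii()) {
        Some(position) => Err(NotAsciiError { position }),
        None => Ok(()),
    }
}
```

The Result form is useful when callers need to know why a value was rejected; the bool form is enough when a plain yes/no answer will do.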

constructor and constructor_mut

The constructor attribute may be set to a function descriptor, in which case a single corresponding method is generated with the following signature:

  • (&Inner) -> Result<&Self, E> if the error attribute is defined as E.
  • (&Inner) -> Option<&Self> otherwise.

This method will call the invariant function on the input and successfully return Some(&Self) or Ok(&Self) if and only if the invariant succeeds. If the invariant returns some Err(_), that value is immediately returned.

The constructor_mut attribute has the same syntax as constructor, with the only difference being that the generated method will have the following signature:

  • (&mut Inner) -> Result<&mut Self, E> if the error attribute is defined as E.
  • (&mut Inner) -> Option<&mut Self> otherwise.
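For intuition, here is a hand-written sketch of what checked constructors of these shapes could look like for an ASCII string newtype with a bool invariant. It is not the macro's actual expansion: the pointer-cast approach and the method names from_str_mut and as_str are assumptions made for this illustration.

```rust
#[repr(transparent)]
pub struct AsciiStr(str);

impl AsciiStr {
    /// Shape: (&Inner) -> Option<&Self>.
    pub fn from_str(s: &str) -> Option<&Self> {
        if s.is_ascii() {
            // SAFETY: `AsciiStr` is `#[repr(transparent)]` over `str`, so both
            // pointer types have the same layout, and the invariant was checked.
            Some(unsafe { &*(s as *const str as *const AsciiStr) })
        } else {
            None
        }
    }

    /// Shape: (&mut Inner) -> Option<&mut Self>.
    pub fn from_str_mut(s: &mut str) -> Option<&mut Self> {
        if s.is_ascii() {
            // SAFETY: same reasoning as in `from_str`.
            Some(unsafe { &mut *(s as *mut str as *mut AsciiStr) })
        } else {
            None
        }
    }

    /// Returns the underlying string slice.
    pub fn as_str(&self) -> &str {
        &self.0
    }
}
```

With an error attribute set, the same bodies would return Result and forward the Err value from the invariant instead of returning None.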

unsafe_constructor and unsafe_constructor_mut

The unsafe_constructor attribute may be set to a function descriptor, in which case a single corresponding method is generated with the signature (&Inner) -> &Self. This method will be unsafe, with the precondition that the invariant function must hold for all inputs to the method.

This method will immediately convert from &Inner to &Self without necessarily checking that the invariant holds, although the invariant function may still be invoked for some optimization levels and feature flags.

The unsafe_constructor_mut attribute has the same syntax as unsafe_constructor, with the only difference being that the generated method will have the signature (&mut Inner) -> &mut Self.
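A minimal hand-written sketch of an unchecked constructor for a non-empty slice newtype follows. The method names new_unchecked and get are hypothetical, and the use of debug_assert! is only one assumed way in which an invariant check might remain in some builds; the crate's actual conditions are not shown here.

```rust
#[repr(transparent)]
pub struct Slice1<T> {
    inner: [T],
}

impl<T> Slice1<T> {
    /// Shape: (&Inner) -> &Self, with no mandatory check.
    ///
    /// # Safety
    ///
    /// `slice` must satisfy the invariant, i.e. it must be non-empty.
    pub unsafe fn new_unchecked(slice: &[T]) -> &Self {
        // Assumption for this sketch: verify the invariant in debug builds only.
        debug_assert!(!slice.is_empty());
        // SAFETY: `Slice1<T>` is `#[repr(transparent)]` over `[T]`, and the
        // caller guarantees that the invariant holds.
        unsafe { &*(slice as *const [T] as *const Slice1<T>) }
    }

    /// Returns the underlying slice.
    pub fn get(&self) -> &[T] {
        &self.inner
    }
}
```

The burden of proof moves to the caller: every call site needs its own argument for why the input is non-empty.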

getter

The getter attribute may be set to a function descriptor, in which case a single corresponding method is generated with the signature (&Self) -> &Inner. This method simply returns a reference to the underlying value of the inner type.

owned

The owned attribute may be set to a newtype descriptor, in which case a single corresponding type is generated to represent the owned version of the newtype. This type will have the same generic parameters and where clauses as the newtype, and implements the following traits automatically:

derive

The derive attribute takes the form derive(<ident> (, <ident>)*), in which each identifier corresponds to a trait implementation. The valid identifiers are as follows:

  • AsRef will generate an AsRef<Inner> implementation for Self by returning a reference to the underlying field.
  • CloneBoxed will generate a Clone implementation for Box<Self> by calling <Box<Inner> as From<&Inner>>::from and converting the resulting Box<Inner> into a Box<Self>. It introduces an additional Box<Inner>: for<'a> From<&'a Inner> bound on the resulting implementation.
  • Debug will generate a Debug implementation for Self by passing the underlying value to <Inner as Debug>::fmt. It introduces an additional Inner: Debug bound on the resulting implementation.
  • Deref will generate a Deref implementation for Self with Target = Inner by extracting the underlying value.
  • Into will generate a From<&Self> implementation for &Inner by extracting the underlying value.
  • IntoBoxed will generate From<&Self> and From<&mut Self> implementations for Box<Self> that behave exactly like CloneBoxed except that they take a reference rather than a boxed value. It introduces an additional Box<Inner>: for<'a> From<&'a Inner> bound on the resulting implementations.
  • TryFrom will generate TryFrom<&Inner> and TryFrom<&mut Inner> implementations for &Self and &mut Self respectively, with the same behaviour as the methods generated by the constructor and constructor_mut attributes. If the invariant function returns bool, the error type will be ().
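To make the boxed conversions more concrete, here is a hand-written approximation of what Deref, IntoBoxed (the From<&Self> half), and CloneBoxed might amount to for an ASCII string newtype. The Box::into_raw and Box::from_raw round trip is an assumption about how a Box<Inner> could be turned into a Box<Self>, not a description of the generated code.

```rust
use std::ops::Deref;

#[repr(transparent)]
pub struct AsciiStr(str);

impl AsciiStr {
    pub fn from_str(s: &str) -> Option<&Self> {
        if s.is_ascii() {
            // SAFETY: `AsciiStr` is `#[repr(transparent)]` over `str`.
            Some(unsafe { &*(s as *const str as *const AsciiStr) })
        } else {
            None
        }
    }
}

// Roughly what `Deref` describes: expose the inner value by reference.
impl Deref for AsciiStr {
    type Target = str;
    fn deref(&self) -> &str {
        &self.0
    }
}

// Roughly what `IntoBoxed` describes for `&Self`.
impl From<&AsciiStr> for Box<AsciiStr> {
    fn from(value: &AsciiStr) -> Self {
        // Copy the inner data into a fresh `Box<str>`.
        let boxed: Box<str> = Box::from(&value.0);
        // SAFETY: `AsciiStr` is `#[repr(transparent)]` over `str`, so the
        // allocation layout is unchanged, and the copied data already
        // satisfied the invariant.
        unsafe { Box::from_raw(Box::into_raw(boxed) as *mut AsciiStr) }
    }
}

// Roughly what `CloneBoxed` describes: clone by re-boxing the inner value.
impl Clone for Box<AsciiStr> {
    fn clone(&self) -> Self {
        Box::<AsciiStr>::from(&**self)
    }
}
```

Because the source value already satisfies the invariant, neither conversion needs to run the invariant function again.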

derive_owned

The derive_owned attribute has the same syntax as the derive attribute, and accepts the following identifiers:

  • Debug will generate a Debug implementation for Owned that matches the one generated for Self.
  • IntoBoxed will generate a From<Owned> implementation for Box<Self> by calling Into::into on the inner field of the owned struct.

Syntax

Some attributes use custom syntax, which is described here. Any syntax term not defined here is conventional Rust syntax, as defined in the Rust Reference.

Function Descriptor

A function descriptor describes the interface of a function without making reference to its parameters or return type. It has the following EBNF grammar:

function descriptor = { outer attribute }, [ visibility ], [ constness ], identifier ;
constness = "const" ;
outer attribute = ? The Rust Reference §7    ? ;
visibility      = ? The Rust Reference §12.6 ? ;
identifier      = ? The Rust Reference §2.3  ? ;

Newtype Descriptor

A newtype descriptor describes the interface of a newtype wrapping some inner type without making reference to its parameters. It has the following EBNF grammar:

newtype descriptor = { outer attribute }, [ visibility ], identifier, "(", type, ")" ;
outer attribute = ? The Rust Reference §7    ? ;
visibility      = ? The Rust Reference §12.6 ? ;
identifier      = ? The Rust Reference §2.3  ? ;
type            = ? The Rust Reference §10.1 ? ;

Mutability

In general, it is unsound to provide (safe) mutable access to the inner value of a newtype. As a simple example, consider the AsciiStr type defined above: if you had mutable access to the underlying str then you could very easily replace one of the bytes of the string with a byte outside the ASCII range. For this reason, dizzy does not provide facilities for automatically generating mutable getters[^mutable-constructors-are-allowed] or implementing the DerefMut trait, among other potentially unsound operations.

However, it is trivial to convert from &mut Self to &mut Inner within the defining module, and there are clearly cases where this conversion is useful. So the contract for mutability is as follows: you must ensure that all usages of mutable references to the inner value uphold the invariant of the newtype. As much as possible, interact with the inner value through generated methods rather than directly. Violating the invariant may cause undefined behavior, and will almost certainly cause logical bugs.
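The hazard can be demonstrated with a small hand-written sketch. A hypothetical AsciiBytes newtype over [u8] stands in for AsciiStr here, so that the offending write needs no unsafe code of its own; all names are invented for this illustration.

```rust
#[repr(transparent)]
pub struct AsciiBytes([u8]);

impl AsciiBytes {
    /// Checked mutable constructor: sound, because the invariant is verified.
    pub fn new_mut(bytes: &mut [u8]) -> Option<&mut Self> {
        if bytes.is_ascii() {
            // SAFETY: `AsciiBytes` is `#[repr(transparent)]` over `[u8]`.
            Some(unsafe { &mut *(bytes as *mut [u8] as *mut AsciiBytes) })
        } else {
            None
        }
    }

    /// Reports whether the invariant currently holds.
    pub fn is_valid(&self) -> bool {
        self.0.is_ascii()
    }

    /// Invariant-preserving mutation: ASCII uppercasing maps ASCII to ASCII.
    pub fn make_uppercase(&mut self) {
        self.0.make_ascii_uppercase();
    }

    /// The kind of mutable getter that is deliberately not generated:
    /// callers can write any byte through it.
    pub fn as_bytes_mut(&mut self) -> &mut [u8] {
        &mut self.0
    }
}
```

Calling make_uppercase keeps is_valid true, whereas writing 0xFF through as_bytes_mut leaves a value whose type still claims that it is ASCII.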

[^derive-dst-newtype-effects]: This isn't quite true, since DstNewtype will produce an error if the subject item is not a struct with at least one member, and also if that struct does not have the #[repr(transparent)] attribute. That transparency check is particularly important because the rest of the crate relies on newtype structs having the same representation as their underlying types.

[^mutable-constructors-are-allowed]: The reverse is not true, and mutable constructors (sending &mut Inner to &mut Self) are perfectly sound; consider the example of [str::from_utf8_mut].