# `dizzy`
This crate provides the `DstNewtype` derive macro for defining newtypes of [_dynamically-sized types_ (DSTs)](https://doc.rust-lang.org/reference/dynamically-sized-types.html) and generating standard boilerplate methods and trait implementations.
For example, you could define a strictly ASCII string slice:
```rust
use dizzy::DstNewtype;
#[derive(PartialEq, DstNewtype)]
#[dizzy(invariant = str::is_ascii)]
#[dizzy(constructor = pub const from_str, getter = pub const as_str)]
#[dizzy(derive(Debug, Deref))]
#[repr(transparent)]
struct AsciiStr(str);
assert!(AsciiStr::from_str("").is_some());
assert!(AsciiStr::from_str("dizzy").is_some());
assert_eq!(AsciiStr::from_str("dizzy").unwrap().len(), 5);
assert_eq!(AsciiStr::from_str("λ"), None);
```
Or you could define a non-empty generic slice type:
```rust
use dizzy::DstNewtype;
#[derive(Debug, PartialEq)]
struct EmptySliceError;
const fn slice1_invariant<T>(slice: &[T]) -> Result<(), EmptySliceError> {
    match slice.is_empty() {
        true => Err(EmptySliceError),
        false => Ok(()),
    }
}
#[derive(Debug, PartialEq, DstNewtype)]
#[dizzy(invariant = slice1_invariant, error = EmptySliceError)]
#[dizzy(constructor = const new)]
#[dizzy(getter = const get)]
#[repr(transparent)]
struct Slice1<T> {
    inner: [T],
}
assert_eq!(Slice1::<()>::new(&[]), Err(EmptySliceError));
assert!(Slice1::new(&[()]).is_ok());
```
More examples can be found in the `tests/` directory.
## Motivation
The central focus of this crate is avoiding boilerplate, and in particular avoiding _unsafe_ boilerplate. When working with DST newtypes, it is often necessary to use unsafe code for even the most basic operations: construction requires [`core::mem::transmute`] or an equivalent raw pointer cast, and cloning a boxed DST usually entails either casting raw pointers or a less optimal conversion through `String` or `Vec`. You might also want to implement common traits like `Deref<Target = Inner>`, `AsRef<Inner>`, `TryFrom<Inner>`, `Into<Inner>`, and `Debug` (via the inner field), and then define a corresponding owned type that wraps a mutable buffer. At a certain point you will find yourself with an absurd amount of code for a few simple types that do almost nothing except convey a guarantee about their contents.
But DST newtypes can be incredibly useful! The standard library uses them for types like `CStr`, `OsStr`, and `Path`, and almost every protocol implementation finds itself dealing with string slices that obey a certain invariant at some point. Take as an example [RFC 8984](https://datatracker.ietf.org/doc/html/rfc8984), which defines a JSON representation for calendar data; you need at least the following newtypes of `str`:
- `Id`, a string of at least 1 and at most 255 bytes that matches the regex `/[A-Za-z0-9\-\_]*/`.
- `TimeZoneId`, a string beginning with `/` that is a valid RFC 5545 `paramtext` value.
- `ImplicitJsonPointer`, an RFC 6901 JSON pointer string without the leading `/`.
- `JsonPointer`, an RFC 6901 JSON pointer string *with* the leading `/`.
- `Uri`, an RFC 3986 URI string.
- `Uid`, a unique identifier of roughly 255 bytes which is probably an RFC 4122 UUID.
To implement all these newtypes, their traits, methods, and corresponding owned types by hand would be an enormous waste of time. You should only have to define a function for the invariant each of them obeys, declare that their inner type is `str`, and have the rest of the code automatically implemented. This is the exact purpose for which `dizzy` exists.
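To give a sense of the hand-written boilerplate described above, here is a minimal sketch of an ASCII-only string slice with a checked constructor and a getter. It is an illustration written for this README, not the code that `dizzy` generates; the names `AsciiStr`, `from_str`, and `as_str` simply mirror the first example.

```rust
/// A hand-written ASCII-only string slice. `#[repr(transparent)]` guarantees
/// that `AsciiStr` has the same layout as `str`.
#[repr(transparent)]
pub struct AsciiStr(str);

impl AsciiStr {
    /// Checked constructor: returns `None` if `s` contains a non-ASCII byte.
    pub fn from_str(s: &str) -> Option<&AsciiStr> {
        if s.is_ascii() {
            // SAFETY: `AsciiStr` is a transparent wrapper around `str`, so the
            // pointer cast preserves layout and length metadata, and the
            // invariant has just been checked.
            Some(unsafe { &*(s as *const str as *const AsciiStr) })
        } else {
            None
        }
    }

    /// Getter: returns the underlying `str`.
    pub fn as_str(&self) -> &str {
        &self.0
    }
}
```

Even this small type needs an `unsafe` block and a safety argument, and it does not yet have any trait implementations or an owned counterpart.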
## Attributes
By itself, applying `#[derive(DstNewtype)]` does almost nothing[^derive-dst-newtype-effects] except produce an error if the `invariant` attribute has not been set. To generate methods and trait implementations, further options must be set using attributes of the forms `#[dizzy(<key> = <value>)]` and `#[dizzy(<key>(...))]`. These options can be spread across multiple attributes, or combined within a single attribute and separated by commas.
### `invariant` and `error`
The `invariant` attribute must be set to the [path](https://doc.rust-lang.org/reference/expressions/path-expr.html) of a function, which must take a single parameter of type `&T` and return either `bool` or `Result<(), E>` (for the inner type `T` and designated error type `E`). It is the only mandatory attribute, since it describes which values of the inner type are valid as values of the subtype.
If the invariant function returns `Result<(), E>`, then the attribute `error = <ty>` (in which `<ty>` is the [type expression](https://doc.rust-lang.org/reference/types.html#r-type.name) corresponding to `E`) must be set; setting `error` when the invariant function returns `bool` will cause a compilation error.
The invariant function is considered *successful* if it returns either `true` or `Ok(())`, and *unsuccessful* if it returns either `false` or `Err(_)`. By definition, the subtype represents exactly the set of values of the inner type for which the invariant is successful.
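The two accepted shapes of invariant function can be sketched in plain Rust as follows. The names `is_non_empty`, `check_lowercase`, and `NotLowercaseError` are hypothetical and exist only for this illustration; the second function would be paired with `error = NotLowercaseError`.

```rust
/// Hypothetical error type for the `Result`-returning form.
#[derive(Debug, PartialEq)]
pub struct NotLowercaseError {
    /// Byte offset of the first offending character.
    pub index: usize,
}

/// `bool` form: successful (`true`) when `s` is non-empty.
pub fn is_non_empty(s: &str) -> bool {
    !s.is_empty()
}

/// `Result` form: successful (`Ok(())`) when every character is an ASCII
/// lowercase letter, otherwise reports where the first violation occurs.
pub fn check_lowercase(s: &str) -> Result<(), NotLowercaseError> {
    match s.char_indices().find(|(_, c)| !c.is_ascii_lowercase()) {
        Some((index, _)) => Err(NotLowercaseError { index }),
        None => Ok(()),
    }
}
```

The `Result` form is worth the extra type when callers need to know why a value was rejected; the `bool` form is enough when a plain yes or no will do.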
### `constructor` and `constructor_mut`
The `constructor` attribute may be set to a [function descriptor](#function-descriptor), in which case a single corresponding method is generated with the following signature:
- `(&Inner) -> Result<&Self, E>` if the `error` attribute is defined as `E`.
- `(&Inner) -> Option<&Self>` otherwise.
This method calls the [invariant](#invariant-and-error) function on the input and returns `Some(&Self)` or `Ok(&Self)` if and only if the invariant succeeds. If the invariant returns `false`, the method returns `None`; if it returns some `Err(_)`, that value is immediately returned.
The `constructor_mut` attribute has the same syntax as `constructor`, with the only difference being that the generated method will have the following signature:
- `(&mut Inner) -> Result<&mut Self, E>` if the `error` attribute is defined as `E`.
- `(&mut Inner) -> Option<&mut Self>` otherwise.
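As a rough picture of these signatures, the following hand-written sketch shows methods with the `Result`-returning shapes for a non-empty slice type. It is an assumption about equivalent behaviour, not the macro's actual expansion; the method names `new`, `new_mut`, and `get` are chosen for illustration.

```rust
#[derive(Debug, PartialEq)]
pub struct EmptySliceError;

#[repr(transparent)]
pub struct Slice1<T> {
    inner: [T],
}

/// The invariant: a `Slice1` must contain at least one element.
fn invariant<T>(slice: &[T]) -> Result<(), EmptySliceError> {
    if slice.is_empty() {
        Err(EmptySliceError)
    } else {
        Ok(())
    }
}

impl<T> Slice1<T> {
    /// Shape `(&Inner) -> Result<&Self, E>`.
    pub fn new(slice: &[T]) -> Result<&Self, EmptySliceError> {
        // Any `Err(_)` from the invariant is returned immediately.
        invariant(slice)?;
        // SAFETY: `Slice1<T>` is `#[repr(transparent)]` over `[T]`, and the
        // invariant has just succeeded.
        Ok(unsafe { &*(slice as *const [T] as *const Self) })
    }

    /// Shape `(&mut Inner) -> Result<&mut Self, E>`.
    pub fn new_mut(slice: &mut [T]) -> Result<&mut Self, EmptySliceError> {
        invariant(&*slice)?;
        // SAFETY: as above, for a unique reference.
        Ok(unsafe { &mut *(slice as *mut [T] as *mut Self) })
    }

    /// Returns the underlying slice.
    pub fn get(&self) -> &[T] {
        &self.inner
    }
}
```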
### `unsafe_constructor` and `unsafe_constructor_mut`
The `unsafe_constructor` attribute may be set to a [function descriptor](#function-descriptor), in which case a single corresponding method is generated with the signature `(&Inner) -> &Self`. This method will be unsafe, with the precondition that the invariant function **must** succeed for every input passed to the method.
This method will immediately convert from `&Inner` to `&Self` without necessarily checking that the invariant holds, although the invariant function may still be invoked for some optimization levels and feature flags.
The `unsafe_constructor_mut` attribute has the same syntax as `unsafe_constructor`, with the only difference being that the generated method will have the signature `(&mut Inner) -> &mut Self`.
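A hand-written method with the same contract might look like the sketch below. The use of `debug_assert!` is an assumption about one way the invariant could still be checked in some builds; the exact conditions under which `dizzy` invokes the invariant are not specified here, and the name `from_str_unchecked` is hypothetical.

```rust
#[repr(transparent)]
pub struct AsciiStr(str);

impl AsciiStr {
    /// Converts without a guaranteed check.
    ///
    /// # Safety
    ///
    /// `s` must contain only ASCII bytes.
    pub unsafe fn from_str_unchecked(s: &str) -> &AsciiStr {
        // Assumption: check the invariant in debug builds only; this
        // assertion is compiled out when debug assertions are disabled.
        debug_assert!(s.is_ascii(), "caller violated the ASCII invariant");
        // SAFETY: `AsciiStr` is `#[repr(transparent)]` over `str`, and the
        // caller promises that the invariant holds.
        unsafe { &*(s as *const str as *const AsciiStr) }
    }

    /// Returns the underlying `str`.
    pub fn as_str(&self) -> &str {
        &self.0
    }
}
```

The caller takes on the proof obligation, so every call site needs its own `unsafe` block and justification.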
### `getter`
The `getter` attribute may be set to a [function descriptor](#function-descriptor), in which case a single corresponding method is generated with the signature `(&Self) -> &Inner`. This method simply returns a reference to the underlying value of the inner type.
### `owned`
The `owned` attribute may be set to a [newtype descriptor](#newtype-descriptor), in which case a single corresponding type is generated to represent the _owned_ version of the newtype. This type will have the same generic parameters and where clauses as the newtype, and implements the following traits automatically:
- [`impl Clone for Owned`](core::clone::Clone)
- [`impl From<&Newtype> for Owned`](core::convert::From)
- [`impl From<&mut Newtype> for Owned`](core::convert::From)
- [`impl Deref<Target = Newtype> for Owned`](core::ops::Deref)
- [`impl DerefMut for Owned`](core::ops::DerefMut)
- [`impl AsRef<Newtype> for Owned`](core::convert::AsRef)
- [`impl AsMut<Newtype> for Owned`](core::convert::AsMut)
- [`impl Borrow<Newtype> for Owned`](core::borrow::Borrow)
- [`impl ToOwned<Owned = Owned> for Newtype`](std::borrow::ToOwned)
- [`impl From<&'a Newtype> for Cow<'a, Newtype>`](std::borrow::Cow)
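To show how a few of these implementations fit together, here is a hand-written sketch of an owned counterpart to an ASCII string slice. It assumes a `String` buffer and covers only `Clone`, `From<&Newtype>`, `Deref`, `Borrow`, and `ToOwned`; the name `AsciiString` is hypothetical, and the code `dizzy` generates may differ.

```rust
use std::borrow::Borrow;
use std::ops::Deref;

#[repr(transparent)]
pub struct AsciiStr(str);

impl AsciiStr {
    pub fn from_str(s: &str) -> Option<&AsciiStr> {
        if s.is_ascii() {
            // SAFETY: transparent wrapper over `str`; invariant just checked.
            Some(unsafe { &*(s as *const str as *const AsciiStr) })
        } else {
            None
        }
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}

/// Owned counterpart; the buffer is only ever filled from a valid `AsciiStr`.
#[derive(Clone)]
pub struct AsciiString(String);

impl From<&AsciiStr> for AsciiString {
    fn from(value: &AsciiStr) -> Self {
        AsciiString(value.as_str().to_owned())
    }
}

impl Deref for AsciiString {
    type Target = AsciiStr;

    fn deref(&self) -> &AsciiStr {
        // SAFETY: the buffer was copied from a valid `AsciiStr` and is never
        // exposed mutably, so the invariant still holds.
        unsafe { &*(self.0.as_str() as *const str as *const AsciiStr) }
    }
}

impl Borrow<AsciiStr> for AsciiString {
    fn borrow(&self) -> &AsciiStr {
        self // deref coercion from `&AsciiString` to `&AsciiStr`
    }
}

impl ToOwned for AsciiStr {
    type Owned = AsciiString;

    fn to_owned(&self) -> AsciiString {
        AsciiString::from(self)
    }
}
```

With `ToOwned` in place, calling `to_owned()` on a `&AsciiStr` yields an `AsciiString`, and the owned value can use the methods of `AsciiStr` through `Deref`.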
### `derive`
The `derive` attribute takes the form `derive(<ident> (, <ident>)*)`, in which each identifier corresponds to a trait implementation. The valid identifiers are as follows:
- `AsRef` will generate an `AsRef<Inner>` implementation for `Self` by returning a reference to the underlying field.
- `CloneBoxed` will generate a `Clone` implementation for `Box<Self>` by calling `<Box<Inner> as From<&Inner>>::from` and converting the resulting `Box<Inner>` into a `Box<Self>`. It introduces an additional `Box<Inner>: for<'a> From<&'a Inner>` bound on the resulting implementation.
- `Debug` will generate a `Debug` implementation for `Self` by passing the underlying value to `<Inner as Debug>::fmt`. It introduces an additional `Inner: Debug` bound on the resulting implementation.
- `Deref` will generate a `Deref` implementation for `Self` with `Target = Inner` by extracting the underlying value.
- `Into` will generate a `From<&Self>` implementation for `&Inner` by extracting the underlying value.
- `IntoBoxed` will generate `From<&Self>` and `From<&mut Self>` implementations for `Box<Self>` that behave exactly like `CloneBoxed` except that they take a reference rather than a boxed value. It introduces an additional `Box<Inner>: for<'a> From<&'a Inner>` bound on the resulting implementations.
- `TryFrom` will generate `TryFrom<&Inner>` and `TryFrom<&mut Inner>` implementations for `&Self` and `&mut Self` respectively, with the same behaviour as the methods generated by the `constructor` and `constructor_mut` attributes. If the invariant function returns `bool`, the error type will be `()`.
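The boxed conversions described for `IntoBoxed` and `CloneBoxed` can be pictured with the following hand-written sketch, which goes through `Box<str>` and then reinterprets the allocation. It illustrates the approach the list describes and is not the macro's actual output.

```rust
#[repr(transparent)]
pub struct AsciiStr(str);

impl AsciiStr {
    pub fn from_str(s: &str) -> Option<&AsciiStr> {
        if s.is_ascii() {
            // SAFETY: transparent wrapper over `str`; invariant just checked.
            Some(unsafe { &*(s as *const str as *const AsciiStr) })
        } else {
            None
        }
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}

// Roughly what `IntoBoxed` describes for `&Self`.
impl From<&AsciiStr> for Box<AsciiStr> {
    fn from(value: &AsciiStr) -> Self {
        // Copy the contents into a fresh `Box<str>` first.
        let boxed: Box<str> = Box::from(value.as_str());
        // SAFETY: `AsciiStr` has the same layout as `str`, and the contents
        // were copied from a valid `AsciiStr`.
        unsafe { Box::from_raw(Box::into_raw(boxed) as *mut AsciiStr) }
    }
}

// Roughly what `CloneBoxed` describes.
impl Clone for Box<AsciiStr> {
    fn clone(&self) -> Self {
        Box::<AsciiStr>::from(&**self)
    }
}
```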
### `derive_owned`
The `derive_owned` attribute has the same syntax as the `derive` attribute, and accepts the following identifiers:
- `Debug` will generate a `Debug` implementation for `Owned` that matches the one generated for `Self`.
- `IntoBoxed` will generate a `From<Owned>` implementation for `Box<Self>` by calling `Into::into` on the inner field of the owned struct.
## Syntax
Some attributes use custom syntax, which is described here. If a syntax term is not defined here, it is conventional Rust syntax and will be defined in [the Rust Reference](https://doc.rust-lang.org/stable/reference/).
### Function Descriptor
A _function descriptor_ describes the interface of a function without making reference to its parameters or return type. It has the following EBNF grammar:
```custom,{class=ebnf}
function descriptor = { outer attribute }, [ visibility ], [ constness ], identifier ;
constness = "const" ;
outer attribute = ? The Rust Reference §7 ? ;
visibility = ? The Rust Reference §12.6 ? ;
identifier = ? The Rust Reference §2.3 ? ;
```
### Newtype Descriptor
A _newtype descriptor_ describes the interface of a newtype wrapping some inner type without making reference to its parameters. It has the following EBNF grammar:
```custom,{class=ebnf}
newtype descriptor = { outer attribute }, [ visibility ], identifier, "(", type, ")" ;
outer attribute = ? The Rust Reference §7 ? ;
visibility = ? The Rust Reference §12.6 ? ;
identifier = ? The Rust Reference §2.3 ? ;
type = ? The Rust Reference §10.1 ? ;
```
## Mutability
In general, it is unsound to provide (safe) mutable access to the inner value of a newtype. As a simple example, consider the `AsciiStr` type defined above: if you had mutable access to the underlying `str` then you could very easily replace one of the bytes of the string with a byte outside the ASCII range. For this reason, `dizzy` does not provide facilities for automatically generating mutable getters[^mutable-constructors-are-allowed] or implementing the [`DerefMut`](core::ops::DerefMut) trait, among other potentially unsound operations.
However, it is trivial to convert from `&mut Self` to `&mut Inner` within the defining module, and there are clearly cases where this conversion is useful. So the contract for mutability is as follows: you **must** ensure that all usages of mutable references to the inner value uphold the invariant of the newtype. As much as possible, interact with the inner value through generated methods rather than directly. Violating the invariant may cause undefined behavior, and will almost certainly cause logical bugs.
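As a sketch of a mutation that respects this contract, the hand-written method below touches the inner `str` mutably but only through an operation that keeps every byte in the ASCII range. The names `from_str_mut` and `make_uppercase` are hypothetical, and the type is written by hand rather than derived.

```rust
#[repr(transparent)]
pub struct AsciiStr(str);

impl AsciiStr {
    /// Checked mutable constructor.
    pub fn from_str_mut(s: &mut str) -> Option<&mut AsciiStr> {
        if s.is_ascii() {
            // SAFETY: transparent wrapper over `str`; invariant just checked.
            Some(unsafe { &mut *(s as *mut str as *mut AsciiStr) })
        } else {
            None
        }
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }

    /// Upper-cases the string in place. ASCII case conversion maps ASCII
    /// bytes to ASCII bytes, so the invariant is preserved.
    pub fn make_uppercase(&mut self) {
        self.0.make_ascii_uppercase();
    }
}
```

Exposing `&mut str` directly would instead let callers write arbitrary UTF-8 into the buffer, which is exactly the kind of access the contract above warns against.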
[^derive-dst-newtype-effects]: This isn't quite true, since `DstNewtype` will produce an error if the subject item is not a struct with at least one member, and also if that struct does not have the `#[repr(transparent)]` attribute. That transparency check is particularly important because the rest of the crate relies on newtype structs having the same representation as their underlying types.
[^mutable-constructors-are-allowed]: The reverse is not true, and mutable constructors (sending `&mut Inner` to `&mut Self`) are perfectly sound; consider the example of [`str::from_utf8_mut`].