# `shape`: a decidable static shape system for JSON
This library implements a Rust-based type system powerful enough to represent
any kind of JSON data, offering type-theoretic operations like simplification,
satisfaction testing, child shape selection, union and intersection shapes,
delayed shape binding, flexible error handling, and more.
> [!CAUTION]
> This library is still in early-stage development, so you should not expect its
> API to be fully stable until the 1.0.0 release.
## Installation
This crate provides a library, so installation means adding it as a dependency
to your `Cargo.toml` file:
```sh
cargo add shape
```
## Documentation
See the [`cargo doc`-generated documentation](https://apollographql.github.io/shape-rs/shape)
for detailed information about the `Shape` struct and `ShapeCase` enum.
## The `Shape` struct
```rust
pub(crate) type Ref<T> = std::sync::Arc<T>;

#[derive(Clone, PartialEq, Eq)]
pub struct Shape {
    case: Ref<ShapeCase>,
    case_hash: u64,
}

impl Shape {
    pub(crate) fn new_from_simplified(case: ShapeCase) -> Shape {
        let case = Ref::new(case);
        let case_hash = case.compute_hash();
        Shape { case, case_hash }
    }
}
```
To support flexible recombinations of shapes and their subshapes, the top-level
`Shape` struct wraps a reference-counted `ShapeCase` enum value. Reference
counting not only simplifies sharing subtrees among different `Shape`
structures, but also supplies the indirection that `rustc` requires before a
type like `Shape` can (indirectly) contain itself.
Since this `ShapeCase` value is immutable, we can precompute and cache its hash
in the `case_hash` field, then manually implement the `std::hash::Hash` trait to
ignore the `case` field in favor of this cached `case_hash` field:
```rust
impl std::hash::Hash for Shape {
    fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
        self.case_hash.hash(state);
    }
}
```
This allows `Shape` hashing to reuse previously computed subtree hashes, rather
than rehashing the entire tree each time a hash is requested.
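
As a rough illustration of why the cached hash helps, the sketch below rebuilds the same pattern using only the standard library. `Node` and `NodeCase` are hypothetical stand-ins for `Shape` and `ShapeCase`, and `DefaultHasher` is an assumption standing in for whatever `compute_hash` does internally.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::sync::Arc;

// Hypothetical stand-in for ShapeCase: a tiny immutable tree.
#[derive(Hash, PartialEq, Eq)]
enum NodeCase {
    Leaf(String),
    Pair(Node, Node),
}

// Hypothetical stand-in for Shape: a shared case plus its cached hash.
#[derive(Clone, PartialEq, Eq)]
struct Node {
    case: Arc<NodeCase>,
    case_hash: u64,
}

impl Node {
    fn new(case: NodeCase) -> Node {
        // Hashing `case` visits its direct children, but each child's Hash
        // impl (below) feeds only its cached u64 instead of its whole subtree.
        let mut hasher = DefaultHasher::new();
        case.hash(&mut hasher);
        Node {
            case: Arc::new(case),
            case_hash: hasher.finish(),
        }
    }
}

impl Hash for Node {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Ignore `case` and reuse the precomputed value.
        self.case_hash.hash(state);
    }
}
```

In this sketch, building a `Pair` hashes two cached `u64` values no matter how deep the two children are, and structurally equal nodes still end up with equal hashes, so they behave correctly as `HashSet` or `HashMap` keys.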
### Obtaining a `Shape`
Instead of tracking whether a given `ShapeCase` has been simplified or not, we
can simply mandate that `Shape` always wraps simplified shapes.
This invariant is enforced by restricting how `Shape` instances can be
(publicly) created: all `Shape` instances must come either from calling
`ShapeCase::simplify`, or from calling the crate-internal static method
`Shape::new_from_simplified`.
```rust
impl ShapeCase {
    pub fn simplify(self) -> Shape {
        // Shape::new_from_simplified is crate-private, so ShapeCase::simplify
        // is the only public way to turn a ShapeCase into a Shape.
        Shape::new_from_simplified(self.simplify_internal())
    }
}
lazy_static! {
    static ref BOOL_SHAPE: Shape = ShapeCase::Bool(None).simplify();
    // ... additional static Shape instances ...
}

impl Shape {
    // For convenience, Shape provides a number of static helper methods like
    // Shape::bool() and Shape::bool_value(true).
    pub fn bool() -> Shape {
        // A common BOOL_SHAPE can be returned whenever Shape::bool() is called.
        // This does not guarantee every Shape representing the Bool type will
        // share the memory of this BOOL_SHAPE, but it helps reduce unnecessary
        // memory allocations.
        BOOL_SHAPE.clone()
    }

    pub fn bool_value(value: bool) -> Shape {
        ShapeCase::Bool(Some(value)).simplify()
    }

    // Additional helpers (signatures only, bodies omitted)...
    pub fn int() -> Shape;
    pub fn int_value(value: i64) -> Shape;
    pub fn float() -> Shape;
    pub fn string() -> Shape;
    pub fn string_value(value: &str) -> Shape;
    pub fn null() -> Shape;
    pub fn none() -> Shape;
    pub fn empty_object() -> Shape;
    pub fn empty_map() -> IndexMap<String, Shape>;
    pub fn object(fields: IndexMap<String, Shape>, rest: Option<Shape>) -> Shape;
    pub fn array(prefix: &[Shape], tail: Option<Shape>) -> Shape;
    pub fn tuple(shapes: &[Shape]) -> Shape;
    pub fn list(of: Shape) -> Shape;
    pub fn empty_array() -> Shape;
    pub fn one(shapes: &[Shape]) -> Shape;
    pub fn all(shapes: &[Shape]) -> Shape;
    pub fn error(message: &str) -> Shape;
    pub fn error_with_range(message: &str, range: OffsetRange) -> Shape;
    pub fn error_with_partial(message: &str, partial: Shape) -> Shape;
    pub fn error_with_partial_and_range(
        message: &str,
        partial: Shape,
        range: OffsetRange,
    ) -> Shape;
}
```
Notice that `ShapeCase::simplify` takes ownership of its input `self` value. If
this is not the behavior you want, you can always clone your `ShapeCase` value
before simplifying it, thereby simplifying only the clone. However, you cannot
use an unsimplified `ShapeCase` as a child of another `ShapeCase` value, since
child shapes are always wrapped with `Shape`.
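
The invariant described in this section is an instance of the smart-constructor pattern: keep the wrapper's field private and expose a single consuming method that produces it. The following is a minimal sketch under that assumption. `SimpleCase`, `SimpleShape`, and the `case()` accessor are hypothetical, and the flattening and singleton-collapsing rules are illustrative only; they are not claimed to match the crate's `simplify_internal`.

```rust
mod toy {
    use std::sync::Arc;

    // Hypothetical, much-reduced stand-in for ShapeCase.
    #[derive(Clone, Debug, PartialEq, Eq)]
    pub enum SimpleCase {
        Bool(Option<bool>),
        One(Vec<SimpleShape>),
    }

    // The field is private, so code outside this module cannot build a
    // SimpleShape directly and has to go through SimpleCase::simplify.
    #[derive(Clone, Debug, PartialEq, Eq)]
    pub struct SimpleShape {
        case: Arc<SimpleCase>,
    }

    impl SimpleShape {
        pub fn case(&self) -> &SimpleCase {
            &self.case
        }
    }

    impl SimpleCase {
        // Consumes `self`; callers who need to keep the case clone it first.
        pub fn simplify(self) -> SimpleShape {
            let case = match self {
                SimpleCase::One(members) => {
                    // Children are already simplified, so only this level
                    // needs work: flatten nested unions, drop duplicates.
                    let mut flat: Vec<SimpleShape> = Vec::new();
                    for member in members {
                        let parts = match member.case() {
                            SimpleCase::One(inner) => inner.clone(),
                            _ => vec![member.clone()],
                        };
                        for part in parts {
                            if !flat.contains(&part) {
                                flat.push(part);
                            }
                        }
                    }
                    // A union with one distinct member is just that member.
                    if flat.len() == 1 {
                        return flat.remove(0);
                    }
                    SimpleCase::One(flat)
                }
                other => other,
            };
            SimpleShape { case: Arc::new(case) }
        }
    }
}

use toy::SimpleCase;
```

With this layout, `case.clone().simplify()` leaves the original `case` usable, and a `SimpleCase::One` can only be built from children that have already passed through `simplify`, which is the property the paragraph above relies on.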
### Testing supershape-subshape acceptance
To test whether the set of all values accepted by one `Shape` is a subset of the
set of all values accepted by another `Shape`, use the
`supershape.accepts(&subshape) -> bool` method, or its inverse
`subshape.satisfies(&supershape) -> bool`.
For example, a `Shape::one` union shape accepts any member shape of the union,
but does not itself satisfy any single member:
```rust
let int_string_union = Shape::one(&[Shape::int(), Shape::string()]);

// Each member satisfies (is accepted by) the union.
assert!(int_string_union.accepts(&Shape::int()));
assert!(int_string_union.accepts(&Shape::string()));
assert!(Shape::int().satisfies(&int_string_union));
assert!(Shape::string().satisfies(&int_string_union));

// The union is wider than either member, so it satisfies neither.
assert_eq!(int_string_union.satisfies(&Shape::int()), false);
assert_eq!(int_string_union.satisfies(&Shape::string()), false);

// Shapes outside the union are not accepted.
assert_eq!(int_string_union.accepts(&Shape::float()), false);
```
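To make the subset reading of `accepts` and `satisfies` concrete, here is a small self-contained model. The `UnionToy` enum and its two methods are hypothetical and cover only three primitive cases plus a union; they sketch the set-inclusion idea rather than the crate's actual algorithm.

```rust
// Hypothetical toy model, not the crate's types.
#[derive(Clone, Debug, PartialEq)]
enum UnionToy {
    Int,
    Float,
    Str,
    One(Vec<UnionToy>),
}

impl UnionToy {
    // True when every value described by `self` is also described by `other`.
    fn satisfies(&self, other: &UnionToy) -> bool {
        match (self, other) {
            // A union satisfies `other` only if all of its members do.
            (UnionToy::One(members), _) => {
                members.iter().all(|member| member.satisfies(other))
            }
            // A non-union satisfies a union if it satisfies some member.
            (_, UnionToy::One(members)) => {
                members.iter().any(|member| self.satisfies(member))
            }
            // The primitive cases here only satisfy themselves.
            _ => self == other,
        }
    }

    // `accepts` is `satisfies` with the roles swapped.
    fn accepts(&self, other: &UnionToy) -> bool {
        other.satisfies(self)
    }
}
```

Defining `accepts` in terms of `satisfies` keeps the two directions from drifting apart: under this model a union accepts each of its members, satisfies itself, and fails to satisfy any single member.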
#### Error satisfaction
A `ShapeCase::Error` variant generally represents a failure of shape processing,
but it can also optionally report `Some(partial)` shape information in cases
when there is a likely best guess at what the shape should be.
For this reason, a `ShapeCase::Error` shape either satisfies/accepts itself
trivially (according to `==` equality), or it can define a `partial` shape to
satisfy shapes that accept that `partial` shape.
This `partial: Option<Shape>` field allows errors to provide guidance
(potentially with chains of multiple errors) without interfering with the
accepts/satisfies logic.
```rust
let error = Shape::error_with_partial("Expected an integer", Shape::int());
assert!(error.accepts(&Shape::int()));
assert_eq!(error.accepts(&Shape::float()), false);
assert!(Shape::int_value(42).satisfies(&error));
assert_eq!(Shape::float().satisfies(&error), false);
```
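One way to picture the delegation to `partial` is the toy model below. `ErrorToy` is hypothetical, and the two delegation rules are assumptions pieced together from the prose and the example above, not a transcription of the crate's logic.

```rust
// Hypothetical toy model, not the crate's types.
#[derive(Clone, Debug, PartialEq)]
enum ErrorToy {
    Int,
    Float,
    Error {
        message: String,
        partial: Option<Box<ErrorToy>>,
    },
}

impl ErrorToy {
    fn satisfies(&self, other: &ErrorToy) -> bool {
        // Every shape, errors included, trivially satisfies itself.
        if self == other {
            return true;
        }
        match (self, other) {
            // Assumption: an error with a partial shape satisfies whatever
            // that partial shape satisfies.
            (ErrorToy::Error { partial: Some(partial), .. }, _) => {
                partial.satisfies(other)
            }
            // Assumption: an error with a partial shape accepts whatever
            // that partial shape accepts.
            (_, ErrorToy::Error { partial: Some(partial), .. }) => {
                self.satisfies(partial)
            }
            // An error without a partial shape matches only itself.
            _ => false,
        }
    }

    fn accepts(&self, other: &ErrorToy) -> bool {
        other.satisfies(self)
    }
}
```

In this sketch an error carrying `Some(Int)` behaves like `Int` for acceptance purposes, while an error with `partial: None` neither accepts nor satisfies anything but itself.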
#### The `null` singleton and the `None` shape
`ShapeCase::Null` represents the singleton `null` value found in JSON. It
satisfies and accepts only itself and no other shapes, except unions that allow
`null` as a member, or errors that wrap `null` as a partial shape.
`ShapeCase::None` represents the absence of a value, and is often used to
represent optional values. Like `null`, `None` is satisfied by (accepts) only
itself and no other shapes (except unions that include `None` as a member, or
errors that wrap `None` as a partial shape for some reason).
When either `null` or `None` participates in a `Shape::one` union shape, it is
usually preserved (other than being deduplicated) because it represents a
distinct possibility. However, `::Null` and `::None` do have a noteworthy
difference of behavior when simplifying `::All` intersection shapes.
When `null` participates in a `ShapeCase::All` intersection shape, it "poisons"
the intersection and causes the whole thing to simplify to `null`. This allows a
single intersection member shape to override the whole intersection, which is
useful for reporting certain kinds of error conditions (especially in GraphQL).
By contrast, `None` does not poison intersections, but is simply ignored. This
makes sense if you think of `Shape::all` intersections as _merging_ their member
shapes: when you merge `None` with another shape, you get the other shape back,
because `None` imposes no additional expectations.
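
The two intersection rules can be sketched in a few lines. `MergeToy` and `simplify_all` are hypothetical names, the handling of an all-`None` intersection is an assumption, and the sketch makes no attempt to resolve incompatible members such as `Int` and `Str`.

```rust
// Hypothetical toy model, not the crate's types.
#[derive(Clone, Debug, PartialEq)]
enum MergeToy {
    Int,
    Str,
    Null,
    None,
    All(Vec<MergeToy>),
}

// Simplify the members of a hypothetical intersection.
fn simplify_all(members: Vec<MergeToy>) -> MergeToy {
    // `null` poisons the intersection: one null member overrides the rest.
    if members.contains(&MergeToy::Null) {
        return MergeToy::Null;
    }
    // `None` imposes no additional expectations, so it is simply dropped.
    let mut kept: Vec<MergeToy> = members
        .into_iter()
        .filter(|member| *member != MergeToy::None)
        .collect();
    match kept.len() {
        // Assumption: an intersection of nothing but `None` is `None`.
        0 => MergeToy::None,
        // A single remaining member is the result of the merge.
        1 => kept.remove(0),
        _ => MergeToy::All(kept),
    }
}
```

Under these rules, merging `None` with `Int` gives back `Int`, while any intersection that contains `Null` collapses to `Null` regardless of its other members.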