shape 0.4.1

# `shape`: a decidable static shape system for JSON

This library implements a Rust-based type system powerful enough to represent
any kind of JSON data, offering type-theoretic operations like simplification,
satisfaction testing, child shape selection, union and intersection shapes,
delayed shape binding, flexible error handling, and more.

> [!CAUTION]
> This library is still in early-stage development, so you should not expect its
> API to be fully stable until the 1.0.0 release.

## Installation

This crate provides a library, so installation means adding it as a dependency
to your `Cargo.toml` file:

```sh
cargo add shape
```

## Documentation

See the [`cargo doc`-generated documentation](https://apollographql.github.io/shape-rs/shape)
for detailed information about the `Shape` struct and `ShapeCase` enum.

## The `Shape` struct

```rust
pub(crate) type Ref<T> = std::sync::Arc<T>;

#[derive(Clone, PartialEq, Eq)]
pub struct Shape {
    case: Ref<ShapeCase>,
    case_hash: u64,
}

impl Shape {
    pub(crate) fn new_from_simplified(case: ShapeCase) -> Shape {
        let case = Ref::new(case);
        let case_hash = case.compute_hash();
        Shape { case, case_hash }
    }
}
```

To support flexible recombinations of shapes and their subshapes, the top-level
`Shape` struct wraps a reference-counted `ShapeCase` enum value. Reference
counting not only simplifies sharing subtrees among different `Shape`
structures, but also prevents `rustc` from complaining about the `Shape` struct
referring to itself without indirection.

Since this `ShapeCase` value is immutable, we can precompute and cache its hash
in the `case_hash` field, then manually implement the `std::hash::Hash` trait to
ignore the `case` field in favor of this cached `case_hash` field:

```rust
impl std::hash::Hash for Shape {
    fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
        self.case_hash.hash(state);
    }
}
```

This allows `Shape` hashing to reuse previously computed subtree hashes, rather
than rehashing the entire tree each time a hash is requested.
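
The caching idea can be illustrated with a minimal, self-contained sketch. The `HashCase` and `HashNode` types below are hypothetical stand-ins for `ShapeCase` and `Shape`, not the crate's own definitions, and the use of `DefaultHasher` is an assumption made only for this illustration:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::sync::Arc;

// Hypothetical stand-in for ShapeCase: a named leaf or a list of children.
#[derive(PartialEq, Eq)]
pub enum HashCase {
    Leaf(String),
    Branch(Vec<HashNode>),
}

// Hypothetical stand-in for Shape: a shared case plus a hash computed once.
#[derive(Clone, PartialEq, Eq)]
pub struct HashNode {
    case: Arc<HashCase>,
    case_hash: u64,
}

impl HashNode {
    pub fn new(case: HashCase) -> HashNode {
        let mut hasher = DefaultHasher::new();
        match &case {
            HashCase::Leaf(name) => {
                0u8.hash(&mut hasher);
                name.hash(&mut hasher);
            }
            HashCase::Branch(children) => {
                1u8.hash(&mut hasher);
                // Each child contributes only its cached u64, so this loop
                // visits direct children and never walks a whole subtree.
                for child in children {
                    child.case_hash.hash(&mut hasher);
                }
            }
        }
        HashNode {
            case: Arc::new(case),
            case_hash: hasher.finish(),
        }
    }

    pub fn cached_hash(&self) -> u64 {
        self.case_hash
    }
}

impl Hash for HashNode {
    // Feed only the cached value to the hasher, as in the impl shown above.
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.case_hash.hash(state);
    }
}
```

In this sketch, two structurally equal trees built separately compare equal and carry the same cached hash, so either one can be used to look up the other in a `HashSet`.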

### Obtaining a `Shape`

Instead of tracking whether a given `ShapeCase` has been simplified or not, we
can simply mandate that `Shape` always wraps simplified shapes.

This invariant is enforced by restricting how `Shape` instances can be
(publicly) created: all `Shape` instances must come either from calling
`ShapeCase::simplify`, or from calling the crate-internal static method
`Shape::new_from_simplified`.

```rust
impl ShapeCase {
    pub fn simplify(self) -> Shape {
        // Since Shape::new_from_simplified is crate-private, ShapeCase::simplify
        // is the only public entry point for turning a ShapeCase into a Shape.
        Shape::new_from_simplified(self.simplify_internal())
    }
}

lazy_static! {
    static ref BOOL_SHAPE: Shape = ShapeCase::Bool(None).simplify();
    // ... additional static Shape instances ...
}

impl Shape {
    // For convenience, Shape provides a number of static helper methods like
    // Shape::bool() and Shape::bool_value(true).
    pub fn bool() -> Shape {
        // A common BOOL_SHAPE can be returned whenever Shape::bool() is called.
        // This does not guarantee every Shape representing the Bool type will
        // share the memory of this BOOL_SHAPE, but it helps reduce unnecessary
        // memory allocations.
        BOOL_SHAPE.clone()
    }

    pub fn bool_value(value: bool) -> Shape {
        ShapeCase::Bool(Some(value)).simplify()
    }

    // Additional helpers (signatures only, bodies omitted)...
    pub fn int() -> Shape;
    pub fn int_value(value: i64) -> Shape;
    pub fn float() -> Shape;
    pub fn string() -> Shape;
    pub fn string_value(value: &str) -> Shape;
    pub fn null() -> Shape;
    pub fn none() -> Shape;
    pub fn empty_object() -> Shape;
    pub fn empty_map() -> IndexMap<String, Shape>;
    pub fn object(fields: IndexMap<String, Shape>, rest: Option<Shape>) -> Shape;
    pub fn array(prefix: &[Shape], tail: Option<Shape>) -> Shape;
    pub fn tuple(shapes: &[Shape]) -> Shape;
    pub fn list(of: Shape) -> Shape;
    pub fn empty_array() -> Shape;
    pub fn one(shapes: &[Shape]) -> Shape;
    pub fn all(shapes: &[Shape]) -> Shape;
    pub fn error(message: &str) -> Shape;
    pub fn error_with_range(message: &str, range: OffsetRange) -> Shape;
    pub fn error_with_partial(message: &str, partial: Shape) -> Shape;
    pub fn error_with_partial_and_range(message: &str, partial: Shape, range: OffsetRange) -> Shape;
}
```

Notice that `ShapeCase::simplify` takes ownership of its input `self` value. If
this is not the behavior you want, you can always clone your `ShapeCase` value
before simplifying it, thereby simplifying only the clone. However, you cannot
use an unsimplified `ShapeCase` as a child of another `ShapeCase` value, since
child shapes are always wrapped with `Shape`.
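
The ownership behavior can be sketched with a small toy model. `ToyCase` and `ToyShape` are hypothetical types invented for this illustration, and the simplification rule used here (deduplicate union members, collapse a one-member union to that member) is an assumption rather than a description of the crate's actual rules:

```rust
use std::sync::Arc;

// Hypothetical, much-reduced stand-in for ShapeCase.
#[derive(Clone, Debug, PartialEq)]
pub enum ToyCase {
    Bool(Option<bool>),
    // Children are ToyShape values, so they have already been simplified.
    One(Vec<ToyShape>),
}

// Hypothetical stand-in for Shape. In a real crate the field would be
// private to its module, so `simplify` would be the only way to build one.
#[derive(Clone, Debug, PartialEq)]
pub struct ToyShape {
    case: Arc<ToyCase>,
}

impl ToyCase {
    // Takes `self` by value, mirroring the signature described above.
    pub fn simplify(self) -> ToyShape {
        let simplified = match self {
            ToyCase::One(members) => {
                // Assumed toy rule: drop duplicate members.
                let mut unique: Vec<ToyShape> = Vec::new();
                for member in members {
                    if !unique.contains(&member) {
                        unique.push(member);
                    }
                }
                // Assumed toy rule: a one-member union is just that member.
                if unique.len() == 1 {
                    return unique.remove(0);
                }
                ToyCase::One(unique)
            }
            other => other,
        };
        ToyShape {
            case: Arc::new(simplified),
        }
    }
}

impl ToyShape {
    pub fn case(&self) -> &ToyCase {
        &self.case
    }
}
```

Calling `raw.clone()` before `raw.simplify()` keeps the unsimplified `ToyCase` available afterwards, while the `One` variant can only ever hold `ToyShape` children that have already gone through `simplify`.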

### Testing supershape-subshape acceptance

To test whether the set of all values accepted by one `Shape` is a subset of the
set of all values accepted by another `Shape`, use the
`supershape.accepts(&subshape) -> bool` method, or its converse
`subshape.satisfies(&supershape) -> bool`.

For example, a `Shape::one` union shape accepts any member shape of the union:

```rust
let int_string_union = Shape::one(&[Shape::int(), Shape::string()]);

assert!(int_string_union.accepts(&Shape::int()));
assert!(int_string_union.accepts(&Shape::string()));
assert!(Shape::int().satisfies(&int_string_union));
assert!(Shape::string().satisfies(&int_string_union));

// The converse does not hold: neither member accepts the whole union.
assert_eq!(Shape::int().accepts(&int_string_union), false);
assert_eq!(Shape::string().accepts(&int_string_union), false);
assert_eq!(int_string_union.accepts(&Shape::float()), false);
```
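
To make the subset reading of `accepts` concrete, here is a minimal toy model that uses only the standard library. `SetShape` is a hypothetical type with three primitive shapes and a union, and its matching rules are assumptions chosen to mirror the subset definition above, not the crate's implementation:

```rust
// Hypothetical toy shapes: three primitives plus a union of shapes.
#[derive(Clone, Debug, PartialEq)]
pub enum SetShape {
    Int,
    Float,
    Str,
    One(Vec<SetShape>),
}

impl SetShape {
    // True when every value described by `other` is also described by `self`.
    pub fn accepts(&self, other: &SetShape) -> bool {
        match (self, other) {
            // A union on the right is accepted only if each member is.
            (_, SetShape::One(members)) => members.iter().all(|m| self.accepts(m)),
            // A union on the left accepts whatever any one member accepts.
            (SetShape::One(members), _) => members.iter().any(|m| m.accepts(other)),
            // Primitive shapes accept only themselves in this toy model.
            _ => self == other,
        }
    }

    // `satisfies` is `accepts` with the receiver and argument swapped.
    pub fn satisfies(&self, other: &SetShape) -> bool {
        other.accepts(self)
    }
}
```

In this model a union of `Int` and `Str` accepts `Int`, `Str`, and itself, but `Int` does not accept the union, because the union also describes string values.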

#### Error satisfaction

A `ShapeCase::Error` variant generally represents a failure of shape processing,
but it can also optionally report `Some(partial)` shape information in cases
when there is a likely best guess at what the shape should be.

For this reason, a `ShapeCase::Error` shape either satisfies/accepts itself
trivially (according to `==` equality), or it can define a `partial` shape to
satisfy shapes that accept that `partial` shape.

This `partial: Option<Shape>` field allows errors to provide guidance
(potentially with chains of multiple errors) without interfering with the
accepts/satisfies logic.

```rust
let error = Shape::error_with_partial("Expected an integer", Shape::int());

assert!(error.accepts(&Shape::int()));
assert_eq!(error.accepts(&Shape::float()), false);

assert!(Shape::int_value(42).satisfies(&error));
assert_eq!(Shape::float().satisfies(&error), false);
```
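
The delegation through `partial` can be sketched as a toy model. `ErrShape` is a hypothetical type, and the rule that an error defers to its partial shape on both sides of `accepts` is an assumption made for this illustration:

```rust
// Hypothetical toy shapes: two primitives plus an error that may carry a
// best-guess partial shape.
#[derive(Clone, Debug, PartialEq)]
pub enum ErrShape {
    Int,
    Float,
    Error {
        message: String,
        partial: Option<Box<ErrShape>>,
    },
}

impl ErrShape {
    pub fn accepts(&self, other: &ErrShape) -> bool {
        // Trivial case: every shape, errors included, accepts an equal shape.
        if self == other {
            return true;
        }
        // An error with a partial shape on the accepting side defers to it.
        if let ErrShape::Error { partial: Some(p), .. } = self {
            return p.accepts(other);
        }
        // An error with a partial shape on the satisfying side is judged by
        // that partial shape instead. Each step strips one error layer, so
        // chains of errors terminate.
        if let ErrShape::Error { partial: Some(p), .. } = other {
            return self.accepts(p);
        }
        false
    }

    pub fn satisfies(&self, other: &ErrShape) -> bool {
        other.accepts(self)
    }
}
```

Under these assumptions, an error without a partial shape accepts and satisfies only itself, while a chain of errors that eventually wraps `Int` behaves like `Int` for acceptance purposes.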

#### The `null` singleton and the `None` shape

`ShapeCase::Null` represents the singleton `null` value found in JSON. It
satisfies and accepts only itself and no other shapes, except unions that allow
`null` as a member, or errors that wrap `null` as a partial shape.

`ShapeCase::None` represents the absence of a value, and is often used to
represent optional values. Like `null`, `None` is satisfied by (accepts) only
itself and no other shapes (except unions that include `None` as a member, or
errors that wrap `None` as a partial shape for some reason).

When either `null` or `None` participates in a `Shape::one` union shape, it is
usually preserved (other than being deduplicated) because it represents a
distinct possibility. However, `ShapeCase::Null` and `ShapeCase::None` behave
differently when `ShapeCase::All` intersection shapes are simplified.

When `null` participates in a `ShapeCase::All` intersection shape, it "poisons"
the intersection and causes the whole thing to simplify to `null`. This allows a
single intersection member shape to override the whole intersection, which is
useful for reporting certain kinds of error conditions (especially in GraphQL).

By contrast, `None` does not poison intersections, but is simply ignored. This
makes sense if you think of `Shape::all` intersections as _merging_ their member
shapes: when you merge `None` with another shape, you get the other shape back,
because `None` imposes no additional expectations.
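
The two intersection rules can be sketched as a small standalone function. `MergeShape` and `simplify_all` are hypothetical names, and the details beyond the two rules above (deduplication, collapsing a single remaining member, and returning `None` when nothing remains) are assumptions of this toy model:

```rust
// Hypothetical toy shapes for illustrating intersection simplification.
#[derive(Clone, Debug, PartialEq)]
pub enum MergeShape {
    Int,
    Str,
    Null,
    None,
    All(Vec<MergeShape>),
}

// Simplifies the members of an intersection under the two rules above.
pub fn simplify_all(members: Vec<MergeShape>) -> MergeShape {
    // Rule 1: null poisons the intersection and overrides every member.
    if members.contains(&MergeShape::Null) {
        return MergeShape::Null;
    }
    let mut kept: Vec<MergeShape> = Vec::new();
    for member in members {
        // Rule 2: None imposes no additional expectations, so skip it.
        if member == MergeShape::None {
            continue;
        }
        // Assumption: duplicate members are dropped.
        if !kept.contains(&member) {
            kept.push(member);
        }
    }
    match kept.len() {
        // Assumption: an intersection of nothing but None stays None.
        0 => MergeShape::None,
        // Assumption: a single remaining member stands for the whole thing.
        1 => kept.remove(0),
        _ => MergeShape::All(kept),
    }
}
```

With this sketch, intersecting `Int`, `Null`, and `Str` yields `Null`, while intersecting `None` with `Int` yields plain `Int`.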