# `shape`: a portable static type system for JSON-compatible data
This library implements a Rust-based type system that can represent any kind of
JSON data, offering type-theoretic operations like simplification, acceptance
testing and validation, child shape selection, union and intersection shapes,
delayed shape binding, namespaces, automatic name propagation, copy-on-write
semantics, error handling, and more.
The term "shape" is used here as a very close synonym to "type" but with a focus
on structured JSON value types rather than abstract, object-oriented, functional
or higher-order types. The core idea is that both shape and type denote an
abstract (possibly infinite) set of possible values that satisfy a given
shape/type, and we can ask interesting questions about the relationships between
these sets of values. For example, when we simplify a shape, we are "asking"
whether another smaller representation of the shape exists that denotes the same
set of possible values.
> [!CAUTION]
> This library is still in early-stage development, so you should not expect its
> API to be fully stable until the 1.0.0 release.
## Installation
This crate provides a library, so installation means adding it as a dependency
to your `Cargo.toml` file:
```sh
cargo add shape
```
## Documentation
See the [`cargo doc`-generated documentation](https://apollographql.github.io/shape-rs/shape)
for detailed information about the `Shape` struct and `ShapeCase` enum.
## The `Shape` struct
```rust
pub(crate) type Ref<T> = std::sync::Arc<T>;
#[derive(Clone, Debug, Eq)]
pub struct Shape {
case: Ref<ShapeCase>,
meta: Ref<IndexSet<ShapeMeta>>,
}
```
To support recombinations of shapes and their subshapes, the top-level
`Shape` struct wraps a reference counted `ShapeCase` enum variant. Reference
counting not only simplifies sharing subtrees among different `Shape`
structures, but also prevents `rustc` from complaining about the `Shape` struct
referring to itself without indirection.
The `meta` field stores metadata about the shape, such as source code locations
where the shape was defined or derived from. This metadata is preserved when
shapes are combined or transformed.
The `Shape` struct implements `Hash` and `PartialEq` based on its `case` field,
ignoring the metadata:
```rust
impl Hash for Shape {
fn hash<H: Hasher>(&self, state: &mut H) {
self.case.hash(state);
}
}
impl PartialEq for Shape {
fn eq(&self, other: &Self) -> bool {
self.case == other.case
}
}
```
### Copy-on-write semantics
The `shape` library leverages `Arc` (Atomically Reference Counted) pointers to
provide efficient copy-on-write semantics for `Shape` instances. Since shapes are
immutable after creation, multiple `Shape` instances can safely share references
to the same underlying `ShapeCase` data without copying.
```rust
pub(crate) type Ref<T> = std::sync::Arc<T>;
```
When a shape needs to be modified (such as during name propagation or metadata
updates), the library uses `Arc::make_mut` to perform copy-on-write operations.
This ensures that:
- Shapes are only copied when actually modified, not when simply accessed
- Multiple shapes can share common subtrees efficiently
- Thread safety is maintained through `Arc`'s atomic reference counting
- Memory usage is minimized through structural sharing
This copy-on-write approach reduces memory usage when working with large
shape hierarchies or when creating many similar shapes that share common
substructures.
### Obtaining a `Shape`
Instead of tracking whether a given `ShapeCase` has been simplified or not, we
can simply mandate that `Shape` always wraps simplified shapes.
This invariant is enforced by restricting how `Shape` instances can be
(publicly) created: all `Shape` instances must come from calling the `Shape::new`
method with a simplified `ShapeCase` and associated metadata.
```rust
impl Shape {
pub(crate) fn new(case: ShapeCase, locations: impl IntoIterator<Item = Location>) -> Shape {
let meta = locations
.into_iter()
.map(ShapeMeta::Loc)
.collect::<IndexSet<_>>();
Shape {
case: Ref::new(case),
meta: Ref::new(meta),
}
}
// For convenience, Shape provides a number of static helper methods that include location tracking
pub fn bool(locations: impl IntoIterator<Item = Location>) -> Shape;
pub fn bool_value(value: bool, locations: impl IntoIterator<Item = Location>) -> Shape;
pub fn int(locations: impl IntoIterator<Item = Location>) -> Shape;
pub fn int_value(value: i64, locations: impl IntoIterator<Item = Location>) -> Shape;
pub fn float(locations: impl IntoIterator<Item = Location>) -> Shape;
pub fn string(locations: impl IntoIterator<Item = Location>) -> Shape;
pub fn string_value(value: &str, locations: impl IntoIterator<Item = Location>) -> Shape;
pub fn null(locations: impl IntoIterator<Item = Location>) -> Shape;
pub fn none(locations: impl IntoIterator<Item = Location>) -> Shape;
pub fn empty_object(locations: impl IntoIterator<Item = Location>) -> Shape;
pub fn empty_map() -> IndexMap<String, Shape>;
pub fn object(fields: IndexMap<String, Shape>, rest: Shape, locations: impl IntoIterator<Item = Location>) -> Shape;
pub fn array(prefix: impl IntoIterator<Item = Shape>, tail: Shape, locations: impl IntoIterator<Item = Location>) -> Shape;
pub fn tuple(shapes: impl IntoIterator<Item = Shape>, locations: impl IntoIterator<Item = Location>) -> Shape;
pub fn list(of: Shape, locations: impl IntoIterator<Item = Location>) -> Shape;
pub fn one(shapes: impl IntoIterator<Item = Shape>, locations: impl IntoIterator<Item = Location>) -> Shape;
pub fn all(shapes: impl IntoIterator<Item = Shape>, locations: impl IntoIterator<Item = Location>) -> Shape;
}
```
The `impl IntoIterator<Item = ...>` parameters are intended to allow maximum
flexibility of iterator argument passing style, including `[shape1, shape2,
...]`, vector slices, etc.
### Shape validation and acceptance testing
The library provides two primary methods for checking shape compatibility:
1. **`shape.validate(&other) -> Option<ShapeMismatch>`** - Returns detailed error information when shapes don't match, including a hierarchy of causes explaining exactly where validation failed.
2. **`shape.accepts(&other) -> bool`** - A convenience method that returns `true` if validation succeeds (equivalent to `shape.validate(&other).is_none()`).
```rust
# use shape::{Shape, ShapeMismatch};
let expected = Shape::string([]);
let received = Shape::int([]);
// Quick boolean check
assert!(!expected.accepts(&received));
// Detailed validation information
let mismatch = expected.validate(&received);
assert_eq!(
mismatch,
Some(ShapeMismatch {
expected: expected.clone(),
received: received.clone(),
causes: vec![], // Leaf-level mismatch has no sub-causes
})
);
```
For example, a `Shape::one` union shape accepts any member shape of the union:
```rust
let int_string_union = Shape::one([Shape::int([]), Shape::string([])], []);
assert!(int_string_union.accepts(&Shape::int([])));
assert!(int_string_union.accepts(&Shape::string([])));
assert!(!int_string_union.accepts(&Shape::float([])));
// Using validate for more details
let mismatch = int_string_union.validate(&Shape::float([]));
assert!(mismatch.is_some()); // Float doesn't match Int or String
```
#### Error satisfaction
A `ShapeCase::Error` variant generally represents a failure of shape processing,
but it can also optionally report `Some(partial)` shape information in cases
when there is a likely best guess at what the shape should be.
An `Error` with `Some(partial)` behaves as much like that partial shape as
possible - it accepts the same shapes, supports the same field/item access, etc.
However, unlike regular shapes, errors are never deduplicated during
simplification, ensuring each error's diagnostic message is preserved.
This `partial: Option<Shape>` field allows errors to provide guidance
(potentially with chains of multiple errors) without interfering with the
accepts logic.
```rust
let error = Shape::error_with_partial("Expected an integer", Shape::int([]), []);
assert!(error.accepts(&Shape::int([])));
assert!(!error.accepts(&Shape::float([])));
assert!(error.accepts(&Shape::int_value(42, [])));
assert!(!error.accepts(&Shape::float([])));
```
#### The `null` singleton and the `None` shape
`ShapeCase::Null` represents the singleton `null` value found in JSON. It
accepts only itself and no other shapes, except unions that allow
`null` as a member, or errors that wrap `null` as a partial shape.
`ShapeCase::None` represents the absence of a value, and is often used to
represent optional values. Like `null`, `None` is satisfied by (accepts) only
itself and no other shapes (except unions that include `None` as a member, or
errors that wrap `None` as a partial shape for some reason).
When either `null` or `None` participate in a `Shape::one` union shape, they are
usually preserved (other than being deduplicated) because they represent
distinct possibilities. However, `::Null` and `::None` do have a noteworthy
difference of behavior when simplifying `::All` intersection shapes.
When `null` participates in a `ShapeCase::All` intersection shape, it "poisons"
the intersection and causes the whole thing to simplify to `null`. This allows a
single intersection member shape to override the whole intersection, which is
useful for reporting certain kinds of error conditions (especially in GraphQL).
By contrast, `None` does not poison intersections, but is simply ignored. This
makes sense if you think of `Shape::all` intersections as _merging_ their member
shapes: when you merge `None` with another shape, you get the other shape back,
because `None` imposes no additional expectations.
## Namespaces and named shapes
The `shape` library provides a namespace system for managing collections
of named shapes and enabling recursive type definitions. Namespaces solve two
important problems: they allow shapes to reference themselves (creating recursive
types), and they provide automatic name propagation throughout shape hierarchies.
### Basic namespace usage
A `Namespace` is a collection of named `Shape`s that supports a two-stage
lifecycle:
```rust
# use shape::{Shape, Namespace};
// Create an unfinalized namespace
let mut namespace = Namespace::new();
// Insert a shape with a name - this triggers automatic name propagation
let id_shape = Shape::one([Shape::string([]), Shape::int([])], []);
let named_shape = namespace.insert("ID", id_shape);
// The shape now has names throughout its hierarchy
assert_eq!(named_shape.pretty_print(), "One<String, Int>");
assert_eq!(
named_shape.pretty_print_with_names(),
"One<String (aka ID), Int (aka ID)> (aka ID)"
);
// Finalize the namespace to resolve all name references
let final_namespace = namespace.finalize();
```
### Automatic name propagation
When a shape is inserted into a namespace, the library automatically propagates
derived child names to all nested shapes within the hierarchy. This means that
every field, array element, and nested structure receives contextual naming
based on its position:
```rust
# use shape::{Shape, Namespace};
let user_shape = Shape::object(
{
let mut fields = Shape::empty_map();
fields.insert("name".to_string(), Shape::string([]));
fields.insert("age".to_string(), Shape::int([]));
fields.insert("contacts".to_string(), Shape::list(Shape::string([]), []));
fields
},
Shape::none([]),
[]
);
let mut namespace = Namespace::new();
let named_user = namespace.insert("User", user_shape);
// Names are automatically propagated to nested shapes:
// - User.name for the name field
// - User.age for the age field
// - User.contacts for the contacts array
// - User.contacts.* for elements in the contacts array
assert_eq!(
named_user.pretty_print_with_names(),
r#"{
age: Int (aka User.age),
contacts: List<String (aka User.contacts.*)> (aka User.contacts),
name: String (aka User.name),
} (aka User)"#
);
```
`Name`s can only be assigned to `Shape`s through `Namespace` insertion.
### Recursive type definitions
`Namespace`s enable recursive type definitions through `ShapeCase::Name`
references. A shape can reference other shapes in the namespace by name,
including itself:
```rust,no_run
// Define a recursive JSON type that can contain arrays and objects of itself
let json_shape = Shape::one([
Shape::null([]),
Shape::bool([]),
Shape::string([]),
Shape::int([]),
Shape::float([]),
Shape::dict(Shape::name("JSON", []), []), // Objects containing JSON values
Shape::list(Shape::name("JSON", []), []), // Arrays containing JSON values
], []);
let mut namespace = Namespace::new();
namespace.insert("JSON", json_shape);
let final_namespace = namespace.finalize();
let json_type = final_namespace.get("JSON").unwrap();
// The resolved type can now handle recursive structures
assert!(json_type.accepts(&Shape::null([])));
assert!(json_type.accepts(&Shape::list(Shape::string([]), [])));
assert!(json_type.accepts(&Shape::dict(Shape::int([]), [])));
```
### `Namespace` finalization
`Namespace`s have two distinct phases:
1. **`Namespace<NotFinal>`**: Allows mutations like `insert()` and `extend()`
2. **`Namespace<Final>`**: Read-only with all `ShapeCase::Name` references resolved
The finalization process:
- Ensures exclusive ownership of all shape references for thread safety
- Resolves `ShapeCase::Name` references by updating their `WeakScope` to provide
weak references to all named shapes
- Allows recursive type self-references that expand incrementally/lazily to
avoid infinite cycles
- Prevents `Arc` memory leaks by using only weak references (`WeakScope`) from
`ShapeCase::Name` to the target `Shape` defined in some `Namespace<Final>`
```rust,no_run
let mut namespace = Namespace::new();
namespace.insert("MyType", Shape::string([]));
namespace.extend(&other_namespace); // Can merge with other namespaces
let final_namespace = namespace.finalize(); // No more mutations allowed
let my_type = final_namespace.get("MyType"); // Resolved shape available
```
This namespace system, combined with automatic name propagation and copy-on-write
semantics, enables complex type modeling scenarios while maintaining performance
and thread safety.
## Metadata management and MergeSet
The `shape` library implements a metadata management system that tracks
provenance information (source locations and names) separately from the logical
structure of shapes (as defined by the `ShapeCase` enum). This separation allows
shapes to be compared and hashed based purely on their structural content, while
preserving all metadata for debugging, error reporting, and tooling purposes.
### The challenge: preserving metadata during simplification
One risk of using only logical structure for equality/hashing and excluding
metadata occurs when storing such data structures in a set or map. Structurally
equivalent shapes are generally deduplicated by these data structures, typically
clobbering/discarding metadata from one side of the collision.
Since `Shape` deduplication is a common and important simplification step, when
shapes are combined or simplified, the library uses a data structure called a
`MergeSet` to ensure metadata is merged from all contributing sources. For
example, the `null` shape may end up in a `ShapeCase::One` union from multiple
different sources, carrying different names, locations, and other metadata:
```rust
# use shape::{Shape, name::Namespace};
let mut ns = Namespace::new();
// Insert shapes with different names - this gives them name metadata
let nullable_id = ns.insert("NullableID", Shape::one([Shape::null([]), Shape::int([])], []));
let nullable_str = ns.insert("NullableString", Shape::one([Shape::null([]), Shape::string([])], []));
let optional = ns.insert("Optional", Shape::one([Shape::null([]), Shape::bool([])], []));
// Each union contains a null with different name metadata
assert_eq!(
nullable_id.pretty_print_with_names(),
"One<null (aka NullableID), Int (aka NullableID)> (aka NullableID)"
);
assert_eq!(
nullable_str.pretty_print_with_names(),
"One<null (aka NullableString), String (aka NullableString)> (aka NullableString)"
);
assert_eq!(
optional.pretty_print_with_names(),
"One<null (aka Optional), Bool (aka Optional)> (aka Optional)"
);
// Now create a union of these already-named shapes
let combined = Shape::one([nullable_id, nullable_str, optional], []);
// The union deduplicates structurally identical shapes.
// All three nulls merge into one, but it retains all the names via MergeSet.
assert_eq!(combined.pretty_print(), "One<null, Int, String, Bool>");
// With names shown, the output reveals how the shapes were deduplicated:
// The null has accumulated all three names via MergeSet
assert_eq!(
combined.pretty_print_with_names(),
r#"One<
null (aka NullableID, NullableString, Optional),
Int (aka NullableID),
String (aka NullableString),
Bool (aka Optional),
>"#
);
```
### The MetaMergeable trait
The `MetaMergeable` trait enables efficient in-place merging of metadata between
structurally identical shapes. When duplicate shapes are detected during
operations like union or intersection formation, their metadata can be combined
without duplicating the structural information:
```rust,ignore
pub trait MetaMergeable {
/// Merge metadata from `other` into `self`, returning true if any changes occurred
fn merge_meta_from(&mut self, other: &Self) -> bool;
}
```
Both `Shape` and `Name` implement `MetaMergeable`, allowing them to:
- Combine location sets from multiple sources
- Merge name information across shape transformations
- Preserve all provenance data during simplification
- Perform copy-on-write updates only when metadata actually changes
### MergeSet: deduplication with metadata preservation
`MergeSet<T>` is a specialized collection that combines the deduplication
properties of `IndexSet` with automatic metadata merging. When a duplicate item
is inserted, instead of being ignored, its metadata is merged with the existing
item:
```rust
# use shape::Shape;
# use shape::location::Location;
let loc1 = Location::new("source1", 10, 0);
let loc2 = Location::new("source2", 20, 0);
let loc3 = Location::new("source3", 30, 0);
// Create shapes with different source locations but identical structure
let shapes = [
Shape::string([loc1]),
Shape::int([loc2]),
Shape::string([loc3]), // Same structure as first, different location
];
let union = Shape::one(shapes, []);
// The resulting union contains two distinct shapes (String and Int),
// but the String shape now has location information from both loc1 and loc3
assert_eq!(union.pretty_print(), "One<String, Int>");
```
This system is used internally by:
- **`Shape::one()`** - Unions that deduplicate member shapes while preserving all metadata
- **`Shape::all()`** - Intersections that merge compatible shapes and their provenance
- **`Namespace` operations** - When inserting named `Shape`s into a
`Namespace<NotFinal>`, multiple shapes added with the same name are merged
using `Shape::all`, which often leads to merging the metadata of
`ShapeCase`-equivalent shapes in the resulting intersection
## The ShapeVisitor trait
The `ShapeVisitor` trait provides a visitor pattern for traversing and analyzing
`Shape` structures. It enables custom logic for processing different
`ShapeCase` variants while maintaining full control over traversal decisions,
especially around recursive name resolution.
### Visitor methods and default fallback
The trait defines visit methods for each `ShapeCase` variant, all with default
implementations that delegate to a `default()` method:
```rust,ignore
pub trait ShapeVisitor<Output = ()> {
type Error: std::error::Error;
type Output;
fn default(&mut self, shape: &Shape) -> Result<Self::Output, Self::Error>;
fn visit_bool(&mut self, shape: &Shape, value: &Option<bool>) -> Result<Self::Output, Self::Error>;
fn visit_string(&mut self, shape: &Shape, value: &Option<String>) -> Result<Self::Output, Self::Error>;
fn visit_int(&mut self, shape: &Shape, value: &Option<i64>) -> Result<Self::Output, Self::Error>;
fn visit_float(&mut self, shape: &Shape) -> Result<Self::Output, Self::Error>;
fn visit_null(&mut self, shape: &Shape) -> Result<Self::Output, Self::Error>;
fn visit_array(&mut self, shape: &Shape, prefix: &[Shape], tail: &Shape) -> Result<Self::Output, Self::Error>;
fn visit_object(&mut self, shape: &Shape, fields: &IndexMap<String, Shape>, rest: &Shape) -> Result<Self::Output, Self::Error>;
fn visit_one(&mut self, shape: &Shape, one: &One) -> Result<Self::Output, Self::Error>;
fn visit_all(&mut self, shape: &Shape, all: &All) -> Result<Self::Output, Self::Error>;
fn visit_name(&mut self, shape: &Shape, name: &Name, weak: &WeakScope) -> Result<Self::Output, Self::Error>;
fn visit_unknown(&mut self, shape: &Shape) -> Result<Self::Output, Self::Error>;
fn visit_none(&mut self, shape: &Shape) -> Result<Self::Output, Self::Error>;
fn visit_error(&mut self, shape: &Shape, err: &Error) -> Result<Self::Output, Self::Error>;
}
```
Each visit method receives both the complete `Shape` (with metadata) and the
inner data specific to that variant. This design allows visitors to access both
structural information and provenance data during traversal.
### Critical design decision: name resolution
An important design choice of `ShapeVisitor` lies in how it handles
`ShapeCase::Name` variants. **The visitor does NOT automatically resolve name
bindings**, even when a `WeakScope` is available. This design prevents infinite
loops when traversing recursive type definitions:
```rust,ignore
// In the visitor implementation:
ShapeCase::Name(name, weak) => visitor.visit_name(self, name, weak),
// Notice: no automatic call to weak.upgrade(name) here!
```
The comment in the source code explains:
```rust
// It's tempting to perform that resolution here unconditionally,
// but then the visitor could fall into an infinitely deep cycle, so
// we have to let the implementation of visit_name decide whether to
// proceed further.
```
This decision gives `visit_name` implementations complete control over whether
and how to resolve named references, enabling cycle-aware traversal strategies.
### Example: detecting unbound names
Here's a practical visitor that finds `ShapeCase::Name` variants that cannot be
resolved in their `WeakScope`, which is useful for validating shape definitions
before namespace finalization:
```rust
# use shape::{Shape, ShapeVisitor, name::Name, name::WeakScope};
# use std::collections::HashSet;
struct UnboundNameDetector {
unbound_names: HashSet<String>,
}
#[derive(Debug)]
struct DetectionError;
impl std::fmt::Display for DetectionError {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
write!(f, "Detection error")
}
}
impl std::error::Error for DetectionError {}
impl UnboundNameDetector {
fn new() -> Self {
Self {
unbound_names: HashSet::new(),
}
}
fn into_unbound_names(self) -> HashSet<String> {
self.unbound_names
}
}
impl ShapeVisitor for UnboundNameDetector {
type Error = DetectionError;
type Output = ();
fn default(&mut self, _shape: &Shape) -> Result<Self::Output, Self::Error> {
// For most shapes, do nothing
Ok(())
}
fn visit_name(
&mut self,
_shape: &Shape,
name: &Name,
weak: &WeakScope,
) -> Result<Self::Output, Self::Error> {
// Check if this name can be resolved in the current WeakScope
if weak.upgrade(name).is_none() {
// Name cannot be resolved - add base name to unbound set
if let Some(base_name) = name.base_name() {
self.unbound_names.insert(base_name.to_string());
}
}
// Note: We deliberately do NOT recursively visit the resolved shape
// to avoid infinite loops in recursive type definitions
Ok(())
}
}
// Usage example:
# /*
let mut detector = UnboundNameDetector::new();
shape.visit_shape(&mut detector)?;
let unbound = detector.into_unbound_names();
if !unbound.is_empty() {
println!("Found unbound names: {:?}", unbound);
}
# */
```
### Use cases
The `ShapeVisitor` trait enables:
- **Reference validation**: Detecting unresolved `ShapeCase::Name` references
before namespace finalization
- **Shape analysis**: Collecting metrics about shape complexity, depth, or
variant distribution
- **Custom serialization**: Converting shapes to alternative formats while
handling recursive references appropriately
- **Constraint checking**: Validating shapes against custom rules or schemas
- **Transformation**: Building modified shape trees with cycle-aware logic
- **Documentation generation**: Extracting shape information for API
documentation while handling recursive types safely
The key insight is that `ShapeVisitor` provides a safe foundation for complex
shape analysis by requiring explicit decisions about name resolution, preventing
the infinite loops that would otherwise occur with recursive type definitions.
## JSON validation and conversion
The library provides direct support for validating JSON data against shapes and
converting JSON values into their corresponding shape representations.
### Converting JSON to shapes
```rust
# use shape::Shape;
# use serde_json::json;
let json = json!({
"a": true,
"b": "hello",
"c": 42,
"d": null,
"e": [1, 2, 3],
});
let json_shape = Shape::from_json(&json);
assert_eq!(
json_shape.pretty_print(),
r#"{
a: true,
b: "hello",
c: 42,
d: null,
e: [1, 2, 3],
}"#
);
```
### Validating JSON against shapes
```rust
# use shape::Shape;
# use serde_json::json;
let expected_shape = Shape::object(
{
let mut fields = Shape::empty_map();
fields.insert("name".to_string(), Shape::string([]));
fields.insert("age".to_string(), Shape::int([]));
fields
},
Shape::none([]),
[]
);
let valid_json = json!({"name": "Alice", "age": 30});
let invalid_json = json!({"name": "Bob", "age": "thirty"});
assert!(expected_shape.accepts_json(&valid_json));
assert!(!expected_shape.accepts_json(&invalid_json));
// Get detailed mismatch information
let mismatch = expected_shape.validate_json(&invalid_json);
assert!(mismatch.is_some());
```
## Child shape navigation
Shapes provide methods to navigate to the shapes of nested fields and array elements.
### Field and item access
```rust
# use shape::Shape;
let object_shape = Shape::object(
{
let mut fields = Shape::empty_map();
fields.insert("a".to_string(), Shape::bool([]));
fields.insert("b".to_string(), Shape::string([]));
fields
},
Shape::none([]),
[]
);
// Access field shapes
assert_eq!(object_shape.field("a", []).pretty_print(), "Bool");
assert_eq!(object_shape.field("b", []).pretty_print(), "String");
assert_eq!(object_shape.field("missing", []).pretty_print(), "None");
let array_shape = Shape::array(
[Shape::bool([]), Shape::int([])],
Shape::string([]),
[]
);
// Access array element shapes
assert_eq!(array_shape.item(0, []).pretty_print(), "Bool");
assert_eq!(array_shape.item(1, []).pretty_print(), "Int");
assert_eq!(array_shape.item(2, []).pretty_print(), "One<String, None>");
```
### Field access on arrays
When accessing a field on an array shape, the field access is mapped over all
array elements:
```rust
# use shape::Shape;
let users_array = Shape::array(
[
Shape::object(
{
let mut fields = Shape::empty_map();
fields.insert("name".to_string(), Shape::string([]));
fields
},
Shape::none([]),
[]
),
],
Shape::none([]),
[]
);
// Accessing a field on an array maps it over elements
let names_shape = users_array.field("name", []);
assert_eq!(names_shape.pretty_print(), "[String]");
```
## Special shape types
### Unknown shape
`Shape::unknown()` accepts any shape and is absorbed when it appears in unions:
```rust
# use shape::Shape;
let unknown = Shape::unknown([]);
// Unknown accepts everything
assert!(unknown.accepts(&Shape::string([])));
assert!(unknown.accepts(&Shape::int([])));
assert!(unknown.accepts(&Shape::empty_object([])));
// Unknown subsumes all other shapes in unions
let union_with_unknown = Shape::one(
[Shape::string([]), Shape::unknown([]), Shape::int([])],
[]
);
assert_eq!(union_with_unknown.pretty_print(), "Unknown");
```
### Dict shape
`Shape::dict()` represents objects with arbitrary string keys but consistent value types:
```rust
# use shape::Shape;
let string_dict = Shape::dict(Shape::string([]), []);
// Accepts objects with any keys, as long as values are strings
let dict_instance = Shape::object(
{
let mut fields = Shape::empty_map();
fields.insert("foo".to_string(), Shape::string([]));
fields.insert("bar".to_string(), Shape::string([]));
fields
},
Shape::string([]), // rest: all other fields are also strings
[]
);
assert!(string_dict.accepts(&dict_instance));
```
### Record shape
`Shape::record()` represents objects with exactly the specified fields and no others:
```rust
# use shape::Shape;
let user_record = Shape::record(
{
let mut fields = Shape::empty_map();
fields.insert("id".to_string(), Shape::int([]));
fields.insert("name".to_string(), Shape::string([]));
fields
},
[]
);
// Record shapes have no rest/wildcard - only exact fields
assert_eq!(user_record.pretty_print(), r#"{ id: Int, name: String }"#);
assert_eq!(user_record.field("other", []).pretty_print(), "None");
```
## Location tracking
The `Location` system tracks where shapes originate in source text:
```rust
# use shape::{Shape, location::Location};
let loc1 = Location::new("file1.json", 10, 5);
let loc2 = Location::new("file2.json", 20, 8);
let shape_with_location = Shape::string([loc1.clone()]);
// Locations are preserved through operations
let union = Shape::one([
Shape::string([loc1]),
Shape::string([loc2]),
], []);
// The union's string shape now has both locations
```
## Pretty printing
Shapes provide human-readable string representations:
```rust
# use shape::Shape;
let complex_shape = Shape::object(
{
let mut fields = Shape::empty_map();
fields.insert("name".to_string(), Shape::string([]));
fields.insert("tags".to_string(), Shape::list(Shape::string([]), []));
fields.insert("metadata".to_string(), Shape::dict(Shape::int([]), []));
fields
},
Shape::none([]),
[]
);
assert_eq!(
complex_shape.pretty_print(),
r#"{
metadata: Dict<Int>,
name: String,
tags: List<String>,
}"#
);
// With names (when inserted into a namespace)
# use shape::Namespace;
let mut ns = Namespace::new();
let named = ns.insert("Config", complex_shape);
assert_eq!(
named.pretty_print_with_names(),
r#"{
metadata: Dict<Int (aka Config.metadata.*)> (aka Config.metadata),
name: String (aka Config.name),
tags: List<String (aka Config.tags.*)> (aka Config.tags),
} (aka Config)"#
);
```
## Error shapes
Error shapes represent validation failures and can carry partial shape information. An
`Error` with `Some(partial)` shape behaves as much like that partial shape as possible
(for acceptance testing, field access, etc.), except that it won't be deduplicated
during simplification - each error remains distinct to preserve diagnostic information:
```rust
# use shape::Shape;
let error = Shape::error("Type mismatch", []);
let error_with_partial = Shape::error_with_partial(
"Expected string",
Shape::string([]),
[]
);
// Errors with partial shapes satisfy shapes that accept the partial
assert!(error_with_partial.accepts(&Shape::string([])));
assert!(!error_with_partial.accepts(&Shape::int([])));
// Multiple errors with the same partial shape remain distinct in unions
let error1 = Shape::error_with_partial("Parse failed", Shape::int([]), []);
let error2 = Shape::error_with_partial("Validation failed", Shape::int([]), []);
let union = Shape::one([error1, error2], []);
// Both errors are preserved, not deduplicated like regular shapes would be
// Errors can be nested in structures
let object_with_error = Shape::object(
{
let mut fields = Shape::empty_map();
fields.insert("valid".to_string(), Shape::string([]));
fields.insert("invalid".to_string(), Shape::error("Failed validation", []));
fields
},
Shape::none([]),
[]
);
// Errors can chain: an Error with another Error as its partial
let inner_error = Shape::error_with_partial(
"Value out of range",
Shape::int([]), // Best guess: should be an int
[]
);
let outer_error = Shape::error_with_partial(
"Configuration failed",
inner_error, // The partial is itself an error
[]
);
// This creates a chain of errors with diagnostic information at each level
assert!(outer_error.accepts(&Shape::int([]))); // Accepts int due to chain
```
## Shape validation hierarchy
The `Shape::validate()` method returns `Option<ShapeMismatch>` providing
detailed, hierarchical information about validation failures. Each
`ShapeMismatch` contains a `causes: Vec<ShapeMismatch>` field that creates a
tree of errors, allowing precise identification of where validation failed in
complex nested structures.
### ShapeMismatch structure
```rust
# use shape::{Shape, ShapeMismatch};
pub struct ShapeMismatch {
pub expected: Shape,
pub received: Shape,
pub causes: Vec<ShapeMismatch>,
}
```
When validation fails, the returned `ShapeMismatch` captures not just the
top-level mismatch, but also the specific sub-shapes that caused the failure.
This hierarchical structure helps debug complex shape validation issues.
### Example: object field validation
Here's a real example from the test suite showing how field-level mismatches
appear in the causes array:
```rust
# use shape::{Shape, ShapeMismatch};
let object_a_bool_b_int = Shape::record(
{
let mut fields = Shape::empty_map();
fields.insert("a".to_string(), Shape::bool([]));
fields.insert("b".to_string(), Shape::int([]));
fields
},
[]
);
let object_a_int_b_bool = Shape::record(
{
let mut fields = Shape::empty_map();
fields.insert("a".to_string(), Shape::int([]));
fields.insert("b".to_string(), Shape::bool([]));
fields
},
[]
);
// Validation fails with detailed causes for each field mismatch
assert_eq!(
object_a_bool_b_int.validate(&object_a_int_b_bool),
Some(ShapeMismatch {
expected: object_a_bool_b_int.clone(),
received: object_a_int_b_bool.clone(),
causes: vec![
ShapeMismatch {
expected: Shape::bool([]), // Field "a" expected Bool
received: Shape::int([]), // but received Int
causes: Vec::new(),
},
ShapeMismatch {
expected: Shape::int([]), // Field "b" expected Int
received: Shape::bool([]), // but received Bool
causes: Vec::new(),
},
]
})
);
```
This hierarchical error structure allows tools and error reporters to:
- Show exactly which fields or array elements failed validation
- Provide context-aware error messages at each level
- Navigate from high-level structural mismatches down to specific value differences
- Build detailed validation reports for complex nested data structures
The `causes` field can be empty for leaf-level mismatches (like primitive type
differences), or contain multiple entries for compound shapes where several
sub-validations failed simultaneously.