shape 0.6.0-preview.0

A portable static type system for JSON-compatible data
Documentation

shape: a portable static type system for JSON-compatible data

This library implements a Rust-based type system that can represent any kind of JSON data, offering type-theoretic operations like simplification, acceptance testing and validation, child shape selection, union and intersection shapes, delayed shape binding, namespaces, automatic name propagation, copy-on-write semantics, error handling, and more.

The term "shape" is used here as a very close synonym to "type" but with a focus on structured JSON value types rather than abstract, object-oriented, functional or higher-order types. The core idea is that both shape and type denote an abstract (possibly infinite) set of possible values that satisfy a given shape/type, and we can ask interesting questions about the relationships between these sets of values. For example, when we simplify a shape, we are "asking" whether another smaller representation of the shape exists that denotes the same set of possible values.

[!CAUTION] This library is still in early-stage development, so you should not expect its API to be fully stable until the 1.0.0 release.

Installation

This crate provides a library, so installation means adding it as a dependency to your Cargo.toml file:

cargo add shape

Documentation

See the cargo doc-generated documentation for detailed information about the Shape struct and ShapeCase enum.

The Shape struct

pub(crate) type Ref<T> = std::sync::Arc<T>;

#[derive(Clone, Debug, Eq)]
pub struct Shape {
    case: Ref<ShapeCase>,
    meta: Ref<IndexSet<ShapeMeta>>,
}

To support recombinations of shapes and their subshapes, the top-level Shape struct wraps a reference counted ShapeCase enum variant. Reference counting not only simplifies sharing subtrees among different Shape structures, but also prevents rustc from complaining about the Shape struct referring to itself without indirection.

The meta field stores metadata about the shape, such as source code locations where the shape was defined or derived from. This metadata is preserved when shapes are combined or transformed.

The Shape struct implements Hash and PartialEq based on its case field, ignoring the metadata:

impl Hash for Shape {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.case.hash(state);
    }
}

impl PartialEq for Shape {
    fn eq(&self, other: &Self) -> bool {
        self.case == other.case
    }
}

Copy-on-write semantics

The shape library leverages Arc (Atomically Reference Counted) pointers to provide efficient copy-on-write semantics for Shape instances. Since shapes are immutable after creation, multiple Shape instances can safely share references to the same underlying ShapeCase data without copying.

pub(crate) type Ref<T> = std::sync::Arc<T>;

When a shape needs to be modified (such as during name propagation or metadata updates), the library uses Arc::make_mut to perform copy-on-write operations. This ensures that:

  • Shapes are only copied when actually modified, not when simply accessed
  • Multiple shapes can share common subtrees efficiently
  • Thread safety is maintained through Arc's atomic reference counting
  • Memory usage is minimized through structural sharing

This copy-on-write approach reduces memory usage when working with large shape hierarchies or when creating many similar shapes that share common substructures.

Obtaining a Shape

Instead of tracking whether a given ShapeCase has been simplified or not, we can simply mandate that Shape always wraps simplified shapes.

This invariant is enforced by restricting how Shape instances can be (publicly) created: all Shape instances must come from calling the Shape::new method with a simplified ShapeCase and associated metadata.

impl Shape {
    pub(crate) fn new(case: ShapeCase, locations: impl IntoIterator<Item = Location>) -> Shape {
        let meta = locations
            .into_iter()
            .map(ShapeMeta::Loc)
            .collect::<IndexSet<_>>();
        Shape {
            case: Ref::new(case),
            meta: Ref::new(meta),
        }
    }

    // For convenience, Shape provides a number of static helper methods that include location tracking
    pub fn bool(locations: impl IntoIterator<Item = Location>) -> Shape;
    pub fn bool_value(value: bool, locations: impl IntoIterator<Item = Location>) -> Shape;
    pub fn int(locations: impl IntoIterator<Item = Location>) -> Shape;
    pub fn int_value(value: i64, locations: impl IntoIterator<Item = Location>) -> Shape;
    pub fn float(locations: impl IntoIterator<Item = Location>) -> Shape;
    pub fn string(locations: impl IntoIterator<Item = Location>) -> Shape;
    pub fn string_value(value: &str, locations: impl IntoIterator<Item = Location>) -> Shape;
    pub fn null(locations: impl IntoIterator<Item = Location>) -> Shape;
    pub fn none(locations: impl IntoIterator<Item = Location>) -> Shape;
    pub fn empty_object(locations: impl IntoIterator<Item = Location>) -> Shape;
    pub fn empty_map() -> IndexMap<String, Shape>;
    pub fn object(fields: IndexMap<String, Shape>, rest: Shape, locations: impl IntoIterator<Item = Location>) -> Shape;
    pub fn array(prefix: impl IntoIterator<Item = Shape>, tail: Shape, locations: impl IntoIterator<Item = Location>) -> Shape;
    pub fn tuple(shapes: impl IntoIterator<Item = Shape>, locations: impl IntoIterator<Item = Location>) -> Shape;
    pub fn list(of: Shape, locations: impl IntoIterator<Item = Location>) -> Shape;
    pub fn one(shapes: impl IntoIterator<Item = Shape>, locations: impl IntoIterator<Item = Location>) -> Shape;
    pub fn all(shapes: impl IntoIterator<Item = Shape>, locations: impl IntoIterator<Item = Location>) -> Shape;
}

The impl IntoIterator<Item = ...> parameters are intended to allow maximum flexibility of iterator argument passing style, including [shape1, shape2, ...], vector slices, etc.

Shape validation and acceptance testing

The library provides two primary methods for checking shape compatibility:

  1. shape.validate(&other) -> Option<ShapeMismatch> - Returns detailed error information when shapes don't match, including a hierarchy of causes explaining exactly where validation failed.

  2. shape.accepts(&other) -> bool - A convenience method that returns true if validation succeeds (equivalent to shape.validate(&other).is_none()).

# use shape::{Shape, ShapeMismatch};
let expected = Shape::string([]);
let received = Shape::int([]);

// Quick boolean check
assert!(!expected.accepts(&received));

// Detailed validation information
let mismatch = expected.validate(&received);
assert_eq!(
    mismatch,
    Some(ShapeMismatch {
        expected: expected.clone(),
        received: received.clone(),
        causes: vec![],  // Leaf-level mismatch has no sub-causes
    })
);

For example, a Shape::one union shape accepts any member shape of the union:

let int_string_union = Shape::one([Shape::int([]), Shape::string([])], []);

assert!(int_string_union.accepts(&Shape::int([])));
assert!(int_string_union.accepts(&Shape::string([])));
assert!(!int_string_union.accepts(&Shape::float([])));

// Using validate for more details
let mismatch = int_string_union.validate(&Shape::float([]));
assert!(mismatch.is_some());  // Float doesn't match Int or String

Error satisfaction

A ShapeCase::Error variant generally represents a failure of shape processing, but it can also optionally report Some(partial) shape information in cases when there is a likely best guess at what the shape should be.

An Error with Some(partial) behaves as much like that partial shape as possible - it accepts the same shapes, supports the same field/item access, etc. However, unlike regular shapes, errors are never deduplicated during simplification, ensuring each error's diagnostic message is preserved.

This partial: Option<Shape> field allows errors to provide guidance (potentially with chains of multiple errors) without interfering with the accepts logic.

let error = Shape::error_with_partial("Expected an integer", Shape::int([]), []);

assert!(error.accepts(&Shape::int([])));
assert!(!error.accepts(&Shape::float([])));

assert!(error.accepts(&Shape::int_value(42, [])));
assert!(!error.accepts(&Shape::float([])));

The null singleton and the None shape

ShapeCase::Null represents the singleton null value found in JSON. It accepts only itself and no other shapes, except unions that allow null as a member, or errors that wrap null as a partial shape.

ShapeCase::None represents the absence of a value, and is often used to represent optional values. Like null, None is satisfied by (accepts) only itself and no other shapes (except unions that include None as a member, or errors that wrap None as a partial shape for some reason).

When either null or None participate in a Shape::one union shape, they are usually preserved (other than being deduplicated) because they represent distinct possibilities. However, ::Null and ::None do have a noteworthy difference of behavior when simplifying ::All intersection shapes.

When null participates in a ShapeCase::All intersection shape, it "poisons" the intersection and causes the whole thing to simplify to null. This allows a single intersection member shape to override the whole intersection, which is useful for reporting certain kinds of error conditions (especially in GraphQL).

By contrast, None does not poison intersections, but is simply ignored. This makes sense if you think of Shape::all intersections as merging their member shapes: when you merge None with another shape, you get the other shape back, because None imposes no additional expectations.

Namespaces and named shapes

The shape library provides a namespace system for managing collections of named shapes and enabling recursive type definitions. Namespaces solve two important problems: they allow shapes to reference themselves (creating recursive types), and they provide automatic name propagation throughout shape hierarchies.

Basic namespace usage

A Namespace is a collection of named Shapes that supports a two-stage lifecycle:

# use shape::{Shape, Namespace};
// Create an unfinalized namespace
let mut namespace = Namespace::new();

// Insert a shape with a name - this triggers automatic name propagation
let id_shape = Shape::one([Shape::string([]), Shape::int([])], []);
let named_shape = namespace.insert("ID", id_shape);

// The shape now has names throughout its hierarchy
assert_eq!(named_shape.pretty_print(), "One<String, Int>");
assert_eq!(
    named_shape.pretty_print_with_names(),
    "One<String (aka ID), Int (aka ID)> (aka ID)"
);

// Finalize the namespace to resolve all name references
let final_namespace = namespace.finalize();

Automatic name propagation

When a shape is inserted into a namespace, the library automatically propagates derived child names to all nested shapes within the hierarchy. This means that every field, array element, and nested structure receives contextual naming based on its position:

# use shape::{Shape, Namespace};
let user_shape = Shape::object(
    {
        let mut fields = Shape::empty_map();
        fields.insert("name".to_string(), Shape::string([]));
        fields.insert("age".to_string(), Shape::int([]));
        fields.insert("contacts".to_string(), Shape::list(Shape::string([]), []));
        fields
    },
    Shape::none([]),
    []
);

let mut namespace = Namespace::new();
let named_user = namespace.insert("User", user_shape);

// Names are automatically propagated to nested shapes:
// - User.name for the name field
// - User.age for the age field
// - User.contacts for the contacts array
// - User.contacts.* for elements in the contacts array
assert_eq!(
    named_user.pretty_print_with_names(),
    r#"{
  age: Int (aka User.age),
  contacts: List<String (aka User.contacts.*)> (aka User.contacts),
  name: String (aka User.name),
} (aka User)"#
);

Names can only be assigned to Shapes through Namespace insertion.

Recursive type definitions

Namespaces enable recursive type definitions through ShapeCase::Name references. A shape can reference other shapes in the namespace by name, including itself:

// Define a recursive JSON type that can contain arrays and objects of itself
let json_shape = Shape::one([
    Shape::null([]),
    Shape::bool([]),
    Shape::string([]),
    Shape::int([]),
    Shape::float([]),
    Shape::dict(Shape::name("JSON", []), []),  // Objects containing JSON values
    Shape::list(Shape::name("JSON", []), []),  // Arrays containing JSON values
], []);

let mut namespace = Namespace::new();
namespace.insert("JSON", json_shape);

let final_namespace = namespace.finalize();
let json_type = final_namespace.get("JSON").unwrap();

// The resolved type can now handle recursive structures
assert!(json_type.accepts(&Shape::null([])));
assert!(json_type.accepts(&Shape::list(Shape::string([]), [])));
assert!(json_type.accepts(&Shape::dict(Shape::int([]), [])));

Namespace finalization

Namespaces have two distinct phases:

  1. Namespace<NotFinal>: Allows mutations like insert() and extend()
  2. Namespace<Final>: Read-only with all ShapeCase::Name references resolved

The finalization process:

  • Ensures exclusive ownership of all shape references for thread safety
  • Resolves ShapeCase::Name references by updating their WeakScope to provide weak references to all named shapes
  • Allows recursive type self-references that expand incrementally/lazily to avoid infinite cycles
  • Prevents Arc memory leaks by using only weak references (WeakScope) from ShapeCase::Name to the target Shape defined in some Namespace<Final>
let mut namespace = Namespace::new();
namespace.insert("MyType", Shape::string([]));
namespace.extend(&other_namespace);  // Can merge with other namespaces

let final_namespace = namespace.finalize();  // No more mutations allowed
let my_type = final_namespace.get("MyType"); // Resolved shape available

This namespace system, combined with automatic name propagation and copy-on-write semantics, enables complex type modeling scenarios while maintaining performance and thread safety.

Metadata management and MergeSet

The shape library implements a metadata management system that tracks provenance information (source locations and names) separately from the logical structure of shapes (as defined by the ShapeCase enum). This separation allows shapes to be compared and hashed based purely on their structural content, while preserving all metadata for debugging, error reporting, and tooling purposes.

The challenge: preserving metadata during simplification

One risk of using only logical structure for equality/hashing and excluding metadata occurs when storing such data structures in a set or map. Structurally equivalent shapes are generally deduplicated by these data structures, typically clobbering/discarding metadata from one side of the collision.

Since Shape deduplication is a common and important simplification step, when shapes are combined or simplified, the library uses a data structure called a MergeSet to ensure metadata is merged from all contributing sources. For example, the null shape may end up in a ShapeCase::One union from multiple different sources, carrying different names, locations, and other metadata:

# use shape::{Shape, name::Namespace};
let mut ns = Namespace::new();

// Insert shapes with different names - this gives them name metadata
let nullable_id = ns.insert("NullableID", Shape::one([Shape::null([]), Shape::int([])], []));
let nullable_str = ns.insert("NullableString", Shape::one([Shape::null([]), Shape::string([])], []));
let optional = ns.insert("Optional", Shape::one([Shape::null([]), Shape::bool([])], []));

// Each union contains a null with different name metadata
assert_eq!(
    nullable_id.pretty_print_with_names(),
    "One<null (aka NullableID), Int (aka NullableID)> (aka NullableID)"
);
assert_eq!(
    nullable_str.pretty_print_with_names(),
    "One<null (aka NullableString), String (aka NullableString)> (aka NullableString)"
);
assert_eq!(
    optional.pretty_print_with_names(),
    "One<null (aka Optional), Bool (aka Optional)> (aka Optional)"
);

// Now create a union of these already-named shapes
let combined = Shape::one([nullable_id, nullable_str, optional], []);

// The union deduplicates structurally identical shapes.
// All three nulls merge into one, but it retains all the names via MergeSet.
assert_eq!(combined.pretty_print(), "One<null, Int, String, Bool>");

// With names shown, the output reveals how the shapes were deduplicated:
// The null has accumulated all three names via MergeSet
assert_eq!(
    combined.pretty_print_with_names(),
    r#"One<
  null (aka NullableID, NullableString, Optional),
  Int (aka NullableID),
  String (aka NullableString),
  Bool (aka Optional),
>"#
);

The MetaMergeable trait

The MetaMergeable trait enables efficient in-place merging of metadata between structurally identical shapes. When duplicate shapes are detected during operations like union or intersection formation, their metadata can be combined without duplicating the structural information:

pub trait MetaMergeable {
    /// Merge metadata from `other` into `self`, returning true if any changes occurred
    fn merge_meta_from(&mut self, other: &Self) -> bool;
}

Both Shape and Name implement MetaMergeable, allowing them to:

  • Combine location sets from multiple sources
  • Merge name information across shape transformations
  • Preserve all provenance data during simplification
  • Perform copy-on-write updates only when metadata actually changes

MergeSet: deduplication with metadata preservation

MergeSet<T> is a specialized collection that combines the deduplication properties of IndexSet with automatic metadata merging. When a duplicate item is inserted, instead of being ignored, its metadata is merged with the existing item:

# use shape::Shape;
# use shape::location::Location;
let loc1 = Location::new("source1", 10, 0);
let loc2 = Location::new("source2", 20, 0);
let loc3 = Location::new("source3", 30, 0);

// Create shapes with different source locations but identical structure
let shapes = [
    Shape::string([loc1]),
    Shape::int([loc2]),
    Shape::string([loc3]),  // Same structure as first, different location
];

let union = Shape::one(shapes, []);

// The resulting union contains two distinct shapes (String and Int),
// but the String shape now has location information from both loc1 and loc3
assert_eq!(union.pretty_print(), "One<String, Int>");

This system is used internally by:

  • Shape::one() - Unions that deduplicate member shapes while preserving all metadata
  • Shape::all() - Intersections that merge compatible shapes and their provenance
  • Namespace operations - When inserting named Shapes into a Namespace<NotFinal>, multiple shapes added with the same name are merged using Shape::all, which often leads to merging the metadata of ShapeCase-equivalent shapes in the resulting intersection

The ShapeVisitor trait

The ShapeVisitor trait provides a visitor pattern for traversing and analyzing Shape structures. It enables custom logic for processing different ShapeCase variants while maintaining full control over traversal decisions, especially around recursive name resolution.

Visitor methods and default fallback

The trait defines visit methods for each ShapeCase variant, all with default implementations that delegate to a default() method:

pub trait ShapeVisitor<Output = ()> {
    type Error: std::error::Error;
    type Output;

    fn default(&mut self, shape: &Shape) -> Result<Self::Output, Self::Error>;

    fn visit_bool(&mut self, shape: &Shape, value: &Option<bool>) -> Result<Self::Output, Self::Error>;
    fn visit_string(&mut self, shape: &Shape, value: &Option<String>) -> Result<Self::Output, Self::Error>;
    fn visit_int(&mut self, shape: &Shape, value: &Option<i64>) -> Result<Self::Output, Self::Error>;
    fn visit_float(&mut self, shape: &Shape) -> Result<Self::Output, Self::Error>;
    fn visit_null(&mut self, shape: &Shape) -> Result<Self::Output, Self::Error>;
    fn visit_array(&mut self, shape: &Shape, prefix: &[Shape], tail: &Shape) -> Result<Self::Output, Self::Error>;
    fn visit_object(&mut self, shape: &Shape, fields: &IndexMap<String, Shape>, rest: &Shape) -> Result<Self::Output, Self::Error>;
    fn visit_one(&mut self, shape: &Shape, one: &One) -> Result<Self::Output, Self::Error>;
    fn visit_all(&mut self, shape: &Shape, all: &All) -> Result<Self::Output, Self::Error>;
    fn visit_name(&mut self, shape: &Shape, name: &Name, weak: &WeakScope) -> Result<Self::Output, Self::Error>;
    fn visit_unknown(&mut self, shape: &Shape) -> Result<Self::Output, Self::Error>;
    fn visit_none(&mut self, shape: &Shape) -> Result<Self::Output, Self::Error>;
    fn visit_error(&mut self, shape: &Shape, err: &Error) -> Result<Self::Output, Self::Error>;
}

Each visit method receives both the complete Shape (with metadata) and the inner data specific to that variant. This design allows visitors to access both structural information and provenance data during traversal.

Critical design decision: name resolution

An important design choice of ShapeVisitor lies in how it handles ShapeCase::Name variants. The visitor does NOT automatically resolve name bindings, even when a WeakScope is available. This design prevents infinite loops when traversing recursive type definitions:

// In the visitor implementation:
ShapeCase::Name(name, weak) => visitor.visit_name(self, name, weak),
// Notice: no automatic call to weak.upgrade(name) here!

The comment in the source code explains:

// It's tempting to perform that resolution here unconditionally,
// but then the visitor could fall into an infinitely deep cycle, so
// we have to let the implementation of visit_name decide whether to
// proceed further.

This decision gives visit_name implementations complete control over whether and how to resolve named references, enabling cycle-aware traversal strategies.

Example: detecting unbound names

Here's a practical visitor that finds ShapeCase::Name variants that cannot be resolved in their WeakScope, which is useful for validating shape definitions before namespace finalization:

# use shape::{Shape, ShapeVisitor, name::Name, name::WeakScope};
# use std::collections::HashSet;
struct UnboundNameDetector {
    unbound_names: HashSet<String>,
}

#[derive(Debug)]
struct DetectionError;
impl std::fmt::Display for DetectionError {
    fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
        write!(f, "Detection error")
    }
}
impl std::error::Error for DetectionError {}

impl UnboundNameDetector {
    fn new() -> Self {
        Self {
            unbound_names: HashSet::new(),
        }
    }

    fn into_unbound_names(self) -> HashSet<String> {
        self.unbound_names
    }
}

impl ShapeVisitor for UnboundNameDetector {
    type Error = DetectionError;
    type Output = ();

    fn default(&mut self, _shape: &Shape) -> Result<Self::Output, Self::Error> {
        // For most shapes, do nothing
        Ok(())
    }

    fn visit_name(
        &mut self,
        _shape: &Shape,
        name: &Name,
        weak: &WeakScope,
    ) -> Result<Self::Output, Self::Error> {
        // Check if this name can be resolved in the current WeakScope
        if weak.upgrade(name).is_none() {
            // Name cannot be resolved - add base name to unbound set
            if let Some(base_name) = name.base_name() {
                self.unbound_names.insert(base_name.to_string());
            }
        }
        // Note: We deliberately do NOT recursively visit the resolved shape
        // to avoid infinite loops in recursive type definitions
        Ok(())
    }
}

// Usage example:
# /*
let mut detector = UnboundNameDetector::new();
shape.visit_shape(&mut detector)?;
let unbound = detector.into_unbound_names();
if !unbound.is_empty() {
    println!("Found unbound names: {:?}", unbound);
}
# */

Use cases

The ShapeVisitor trait enables:

  • Reference validation: Detecting unresolved ShapeCase::Name references before namespace finalization
  • Shape analysis: Collecting metrics about shape complexity, depth, or variant distribution
  • Custom serialization: Converting shapes to alternative formats while handling recursive references appropriately
  • Constraint checking: Validating shapes against custom rules or schemas
  • Transformation: Building modified shape trees with cycle-aware logic
  • Documentation generation: Extracting shape information for API documentation while handling recursive types safely

The key insight is that ShapeVisitor provides a safe foundation for complex shape analysis by requiring explicit decisions about name resolution, preventing the infinite loops that would otherwise occur with recursive type definitions.

JSON validation and conversion

The library provides direct support for validating JSON data against shapes and converting JSON values into their corresponding shape representations.

Converting JSON to shapes

# use shape::Shape;
# use serde_json::json;
let json = json!({
    "a": true,
    "b": "hello",
    "c": 42,
    "d": null,
    "e": [1, 2, 3],
});

let json_shape = Shape::from_json(&json);
assert_eq!(
    json_shape.pretty_print(),
    r#"{
  a: true,
  b: "hello",
  c: 42,
  d: null,
  e: [1, 2, 3],
}"#
);

Validating JSON against shapes

# use shape::Shape;
# use serde_json::json;
let expected_shape = Shape::object(
    {
        let mut fields = Shape::empty_map();
        fields.insert("name".to_string(), Shape::string([]));
        fields.insert("age".to_string(), Shape::int([]));
        fields
    },
    Shape::none([]),
    []
);

let valid_json = json!({"name": "Alice", "age": 30});
let invalid_json = json!({"name": "Bob", "age": "thirty"});

assert!(expected_shape.accepts_json(&valid_json));
assert!(!expected_shape.accepts_json(&invalid_json));

// Get detailed mismatch information
let mismatch = expected_shape.validate_json(&invalid_json);
assert!(mismatch.is_some());

Child shape navigation

Shapes provide methods to navigate to the shapes of nested fields and array elements.

Field and item access

# use shape::Shape;
let object_shape = Shape::object(
    {
        let mut fields = Shape::empty_map();
        fields.insert("a".to_string(), Shape::bool([]));
        fields.insert("b".to_string(), Shape::string([]));
        fields
    },
    Shape::none([]),
    []
);

// Access field shapes
assert_eq!(object_shape.field("a", []).pretty_print(), "Bool");
assert_eq!(object_shape.field("b", []).pretty_print(), "String");
assert_eq!(object_shape.field("missing", []).pretty_print(), "None");

let array_shape = Shape::array(
    [Shape::bool([]), Shape::int([])],
    Shape::string([]),
    []
);

// Access array element shapes
assert_eq!(array_shape.item(0, []).pretty_print(), "Bool");
assert_eq!(array_shape.item(1, []).pretty_print(), "Int");
assert_eq!(array_shape.item(2, []).pretty_print(), "One<String, None>");

Field access on arrays

When accessing a field on an array shape, the field access is mapped over all array elements:

# use shape::Shape;
let users_array = Shape::array(
    [
        Shape::object(
            {
                let mut fields = Shape::empty_map();
                fields.insert("name".to_string(), Shape::string([]));
                fields
            },
            Shape::none([]),
            []
        ),
    ],
    Shape::none([]),
    []
);

// Accessing a field on an array maps it over elements
let names_shape = users_array.field("name", []);
assert_eq!(names_shape.pretty_print(), "[String]");

Special shape types

Unknown shape

Shape::unknown() accepts any shape and is absorbed when it appears in unions:

# use shape::Shape;
let unknown = Shape::unknown([]);

// Unknown accepts everything
assert!(unknown.accepts(&Shape::string([])));
assert!(unknown.accepts(&Shape::int([])));
assert!(unknown.accepts(&Shape::empty_object([])));

// Unknown subsumes all other shapes in unions
let union_with_unknown = Shape::one(
    [Shape::string([]), Shape::unknown([]), Shape::int([])],
    []
);
assert_eq!(union_with_unknown.pretty_print(), "Unknown");

Dict shape

Shape::dict() represents objects with arbitrary string keys but consistent value types:

# use shape::Shape;
let string_dict = Shape::dict(Shape::string([]), []);

// Accepts objects with any keys, as long as values are strings
let dict_instance = Shape::object(
    {
        let mut fields = Shape::empty_map();
        fields.insert("foo".to_string(), Shape::string([]));
        fields.insert("bar".to_string(), Shape::string([]));
        fields
    },
    Shape::string([]),  // rest: all other fields are also strings
    []
);
assert!(string_dict.accepts(&dict_instance));

Record shape

Shape::record() represents objects with exactly the specified fields and no others:

# use shape::Shape;
let user_record = Shape::record(
    {
        let mut fields = Shape::empty_map();
        fields.insert("id".to_string(), Shape::int([]));
        fields.insert("name".to_string(), Shape::string([]));
        fields
    },
    []
);

// Record shapes have no rest/wildcard - only exact fields
assert_eq!(user_record.pretty_print(), r#"{ id: Int, name: String }"#);
assert_eq!(user_record.field("other", []).pretty_print(), "None");

Location tracking

The Location system tracks where shapes originate in source text:

# use shape::{Shape, location::Location};
let loc1 = Location::new("file1.json", 10, 5);
let loc2 = Location::new("file2.json", 20, 8);

let shape_with_location = Shape::string([loc1.clone()]);

// Locations are preserved through operations
let union = Shape::one([
    Shape::string([loc1]),
    Shape::string([loc2]),
], []);
// The union's string shape now has both locations

Pretty printing

Shapes provide human-readable string representations:

# use shape::Shape;
let complex_shape = Shape::object(
    {
        let mut fields = Shape::empty_map();
        fields.insert("name".to_string(), Shape::string([]));
        fields.insert("tags".to_string(), Shape::list(Shape::string([]), []));
        fields.insert("metadata".to_string(), Shape::dict(Shape::int([]), []));
        fields
    },
    Shape::none([]),
    []
);

assert_eq!(
    complex_shape.pretty_print(),
    r#"{
  metadata: Dict<Int>,
  name: String,
  tags: List<String>,
}"#
);

// With names (when inserted into a namespace)
# use shape::Namespace;
let mut ns = Namespace::new();
let named = ns.insert("Config", complex_shape);
assert_eq!(
    named.pretty_print_with_names(),
    r#"{
  metadata: Dict<Int (aka Config.metadata.*)> (aka Config.metadata),
  name: String (aka Config.name),
  tags: List<String (aka Config.tags.*)> (aka Config.tags),
} (aka Config)"#
);

Error shapes

Error shapes represent validation failures and can carry partial shape information. An Error with Some(partial) shape behaves as much like that partial shape as possible (for acceptance testing, field access, etc.), except that it won't be deduplicated during simplification - each error remains distinct to preserve diagnostic information:

# use shape::Shape;
let error = Shape::error("Type mismatch", []);
let error_with_partial = Shape::error_with_partial(
    "Expected string",
    Shape::string([]),
    []
);

// Errors with partial shapes satisfy shapes that accept the partial
assert!(error_with_partial.accepts(&Shape::string([])));
assert!(!error_with_partial.accepts(&Shape::int([])));

// Multiple errors with the same partial shape remain distinct in unions
let error1 = Shape::error_with_partial("Parse failed", Shape::int([]), []);
let error2 = Shape::error_with_partial("Validation failed", Shape::int([]), []);
let union = Shape::one([error1, error2], []);
// Both errors are preserved, not deduplicated like regular shapes would be

// Errors can be nested in structures
let object_with_error = Shape::object(
    {
        let mut fields = Shape::empty_map();
        fields.insert("valid".to_string(), Shape::string([]));
        fields.insert("invalid".to_string(), Shape::error("Failed validation", []));
        fields
    },
    Shape::none([]),
    []
);

// Errors can chain: an Error with another Error as its partial
let inner_error = Shape::error_with_partial(
    "Value out of range",
    Shape::int([]),  // Best guess: should be an int
    []
);
let outer_error = Shape::error_with_partial(
    "Configuration failed",
    inner_error,  // The partial is itself an error
    []
);

// This creates a chain of errors with diagnostic information at each level
assert!(outer_error.accepts(&Shape::int([])));  // Accepts int due to chain

Shape validation hierarchy

The Shape::validate() method returns Option<ShapeMismatch> providing detailed, hierarchical information about validation failures. Each ShapeMismatch contains a causes: Vec<ShapeMismatch> field that creates a tree of errors, allowing precise identification of where validation failed in complex nested structures.

ShapeMismatch structure

# use shape::{Shape, ShapeMismatch};
pub struct ShapeMismatch {
    pub expected: Shape,
    pub received: Shape,
    pub causes: Vec<ShapeMismatch>,
}

When validation fails, the returned ShapeMismatch captures not just the top-level mismatch, but also the specific sub-shapes that caused the failure. This hierarchical structure helps debug complex shape validation issues.

Example: object field validation

Here's a real example from the test suite showing how field-level mismatches appear in the causes array:

# use shape::{Shape, ShapeMismatch};
let object_a_bool_b_int = Shape::record(
    {
        let mut fields = Shape::empty_map();
        fields.insert("a".to_string(), Shape::bool([]));
        fields.insert("b".to_string(), Shape::int([]));
        fields
    },
    []
);

let object_a_int_b_bool = Shape::record(
    {
        let mut fields = Shape::empty_map();
        fields.insert("a".to_string(), Shape::int([]));
        fields.insert("b".to_string(), Shape::bool([]));
        fields
    },
    []
);

// Validation fails with detailed causes for each field mismatch
assert_eq!(
    object_a_bool_b_int.validate(&object_a_int_b_bool),
    Some(ShapeMismatch {
        expected: object_a_bool_b_int.clone(),
        received: object_a_int_b_bool.clone(),
        causes: vec![
            ShapeMismatch {
                expected: Shape::bool([]),  // Field "a" expected Bool
                received: Shape::int([]),   // but received Int
                causes: Vec::new(),
            },
            ShapeMismatch {
                expected: Shape::int([]),   // Field "b" expected Int
                received: Shape::bool([]),  // but received Bool
                causes: Vec::new(),
            },
        ]
    })
);

This hierarchical error structure allows tools and error reporters to:

  • Show exactly which fields or array elements failed validation
  • Provide context-aware error messages at each level
  • Navigate from high-level structural mismatches down to specific value differences
  • Build detailed validation reports for complex nested data structures

The causes field can be empty for leaf-level mismatches (like primitive type differences), or contain multiple entries for compound shapes where several sub-validations failed simultaneously.