serde-reflection 0.3.2

Extract representations of Serde data formats
# serde-reflection

[![serde-reflection on crates.io](https://img.shields.io/crates/v/serde-reflection)](https://crates.io/crates/serde-reflection)
[![Documentation (latest release)](https://docs.rs/serde-reflection/badge.svg)](https://docs.rs/serde-reflection/)
[![Documentation (master)](https://img.shields.io/badge/docs-master-brightgreen)](https://novifinancial.github.io/serde-reflection/serde_reflection/)
[![License](https://img.shields.io/badge/license-Apache-green.svg)](../LICENSE-APACHE)
[![License](https://img.shields.io/badge/license-MIT-green.svg)](../LICENSE-MIT)

This crate provides a way to extract format descriptions for Rust containers
that implement the `Serialize` and/or `Deserialize` traits of Serde.

## Quick Start

Very often, Serde traits are only implemented using Serde derive macros.
In this case, simply
* call `trace_type` on the desired top-level definitions, then
* add a call to `trace_type` for each `enum` type. (This will fix any `MissingVariants` error.)

```rust
use serde::Deserialize;
use serde_reflection::{Samples, Tracer, TracerConfig};

#[derive(Deserialize)]
struct Foo {
  bar: Bar,
  choice: Choice,
}

#[derive(Deserialize)]
struct Bar(u64);

#[derive(Deserialize)]
enum Choice { A, B, C }

// Start the tracing session.
let mut tracer = Tracer::new(TracerConfig::default());
let samples = Samples::new();

// Trace the desired top-level type(s).
tracer.trace_type::<Foo>(&samples)?;

// Also trace each enum type separately to fix any `MissingVariants` error.
tracer.trace_type::<Choice>(&samples)?;

// Obtain the registry of Serde formats and serialize it in YAML (for instance).
let registry = tracer.registry()?;
let data = serde_yaml::to_string(&registry).unwrap() + "\n";
assert_eq!(&data, r#"---
Bar:
  NEWTYPESTRUCT: U64
Choice:
  ENUM:
    0:
      A: UNIT
    1:
      B: UNIT
    2:
      C: UNIT
Foo:
  STRUCT:
    - bar:
        TYPENAME: Bar
    - choice:
        TYPENAME: Choice
"#);
```

## Troubleshooting

The error type used in this crate provides a method `error.explanation()` to help with
troubleshooting during format tracing.

## Overview

In the following, more complete example, we extract the Serde formats of two containers
`Name` and `Person` and demonstrate how to handle a custom implementation of `serde::Deserialize`
for `Name`.

```rust
use serde::{Deserialize, Serialize};
use serde_reflection::{ContainerFormat, Error, Format, Samples, Tracer, TracerConfig};

#[derive(Serialize, PartialEq, Eq, Debug, Clone)]
struct Name(String);
// impl<'de> Deserialize<'de> for Name { ... }

#[derive(Serialize, Deserialize, PartialEq, Eq, Debug, Clone)]
enum Person {
    NickName(Name),
    FullName { first: Name, last: Name },
}

// Start a session to trace formats.
let mut tracer = Tracer::new(TracerConfig::default());
// Create a store to hold samples of Rust values.
let mut samples = Samples::new();

// For every type (here `Name`), if a user-defined implementation of `Deserialize` exists and
// is known to perform custom validation checks, use `trace_value` first so that `samples`
// contains a valid Rust value of this type.
let bob = Name("Bob".into());
tracer.trace_value(&mut samples, &bob)?;
assert!(samples.value("Name").is_some());

// Now, let's trace deserialization for the top-level type `Person`.
// We pass a reference to `samples` so that sampled values are used for custom types.
let (format, values) = tracer.trace_type::<Person>(&samples)?;
assert_eq!(format, Format::TypeName("Person".into()));

// As a byproduct, we have also obtained sample values of type `Person`.
// We can see that the user-provided value `bob` was used consistently to pass
// validation checks for `Name`.
assert_eq!(values[0], Person::NickName(bob.clone()));
assert_eq!(values[1], Person::FullName { first: bob.clone(), last: bob.clone() });

// We have no more top-level types to trace, so let's stop the tracing session and obtain
// a final registry of containers.
let registry = tracer.registry()?;

// We have successfully extracted a format description of all Serde containers under `Person`.
assert_eq!(
    registry.get("Name").unwrap(),
    &ContainerFormat::NewTypeStruct(Box::new(Format::Str)),
);
match registry.get("Person").unwrap() {
    ContainerFormat::Enum(variants) => assert_eq!(variants.len(), 2),
    _ => panic!(),
};

// Export the registry in YAML.
let data = serde_yaml::to_string(&registry).unwrap() + "\n";
assert_eq!(&data, r#"---
Name:
  NEWTYPESTRUCT: STR
Person:
  ENUM:
    0:
      NickName:
        NEWTYPE:
          TYPENAME: Name
    1:
      FullName:
        STRUCT:
          - first:
              TYPENAME: Name
          - last:
              TYPENAME: Name
"#);
```

## Tracing Serialization with `trace_value`

Tracing the serialization of a Rust value `v` consists of visiting the structural
components of `v` in depth and recording Serde formats for all the visited types.

```rust
use serde::Serialize;
use serde_reflection::{ContainerFormat, Samples, Tracer, TracerConfig};

#[derive(Serialize)]
struct FullName<'a> {
  first: &'a str,
  middle: Option<&'a str>,
  last: &'a str,
}

let mut tracer = Tracer::new(TracerConfig::default());
let mut samples = Samples::new();
tracer.trace_value(&mut samples, &FullName { first: "", middle: Some(""), last: "" })?;
let registry = tracer.registry()?;
match registry.get("FullName").unwrap() {
    ContainerFormat::Struct(fields) => assert_eq!(fields.len(), 3),
    _ => panic!(),
};
```

This approach works well but it can only recover the formats of datatypes for which
nontrivial samples have been provided:

* In enums, only the variants explicitly covered by user samples will be recorded.

* Providing a `None` value or an empty vector `[]` within a sample may result in
formats that are partially unknown.

```rust
// `FullName` is the struct defined in the previous example.
use serde_reflection::Error;

let mut tracer = Tracer::new(TracerConfig::default());
let mut samples = Samples::new();
tracer.trace_value(&mut samples, &FullName { first: "", middle: None, last: "" })?;
assert_eq!(tracer.registry().unwrap_err(), Error::UnknownFormatInContainer("FullName".to_string()));
```

For this reason, we introduce a complementary set of APIs to trace deserialization of types.

## Tracing Deserialization with `trace_type<T>`

Deserialization-tracing APIs take a type `T`, the current tracing state, and a
reference to previously recorded samples as input.

### Core Algorithm and High-Level API

The core algorithm `trace_type_once<T>`
attempts to reconstruct a witness value of type `T` by exploring the graph of all the types
occurring in the definition of `T`. At the same time, the algorithm records the
formats of all the visited structs and enum variants.

To guarantee that the exploration terminates, the core algorithm `trace_type_once<T>` explores
each possible recursion point only once (see the Design Considerations section below).
In particular, if `T` is an enum, `trace_type_once<T>` discovers only one variant of `T` at a time.

For this reason, the high-level API `trace_type<T>`
will repeat calls to `trace_type_once<T>` until all the variants of `T` are known.
Variant cases of `T` are explored in sequential order, starting with index `0`.
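
The relationship between the two entry points can be pictured with a small model that does not use the crate at all. In this sketch, `ToyEnum`, `ToyTracer`, `trace_once`, and `trace_all` are hypothetical names chosen for illustration; the real `trace_type_once` discovers variants through Serde deserialization callbacks rather than through a counter.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for an enum type: its variant names in declaration order.
struct ToyEnum {
    variants: Vec<&'static str>,
}

/// Hypothetical stand-in for the tracer state: variants recorded so far, keyed by index.
#[derive(Default)]
struct ToyTracer {
    known: BTreeMap<u32, &'static str>,
}

impl ToyTracer {
    /// Models a single pass: records the first variant not yet known (if any)
    /// and reports whether some variants are still missing afterwards.
    fn trace_once(&mut self, ty: &ToyEnum) -> bool {
        let next = self.known.len();
        if let Some(&name) = ty.variants.get(next) {
            self.known.insert(next as u32, name);
        }
        self.known.len() < ty.variants.len()
    }

    /// Models the high-level loop: repeat single passes until nothing is missing.
    /// Returns the number of passes that were needed.
    fn trace_all(&mut self, ty: &ToyEnum) -> usize {
        let mut passes = 1;
        while self.trace_once(ty) {
            passes += 1;
        }
        passes
    }
}

fn main() {
    let choice = ToyEnum { variants: vec!["A", "B", "C"] };
    let mut tracer = ToyTracer::default();
    let passes = tracer.trace_all(&choice);
    println!("{} passes, variants: {:?}", passes, tracer.known);
}
```

In this model an enum with three variants needs three passes, one per variant, explored in index order.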

### Coverage Guarantees

Under the assumptions listed below, a single call to `trace_type<T>` is guaranteed to
record formats for all the types that `T` depends on. In addition, if `T` is an enum, it
will record all the variants of `T`.

(0) Container names must not collide. If they do, consider using `#[serde(rename = "name")]`
or implementing the Serde traits manually.

(1) The first variant of each mutually recursive enum must be a "base case". That is,
defaulting to the first variant for every enum type (along with `None` for option values
and `[]` for sequences) must guarantee termination of depth-first traversals of the graph of type
declarations.

(2) If a type runs custom validation checks during deserialization, sample values must have been provided
previously by calling `trace_value`. In addition, the corresponding registered formats
must not contain unknown parts.
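
Assumption (1) can be checked mechanically on a description of the type graph. The sketch below is a self-contained model, not part of the crate: `TypeGraph`, `first_variant_terminates`, and `sample_graph` are hypothetical names, and the graph lists, for each enum variant, only the enums it refers to directly (references behind an `Option` or a sequence are left out, since those default to `None` and `[]`).

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical description of enums: for each variant (in declaration order),
/// the names of the enum types that its payload refers to directly.
type TypeGraph = HashMap<&'static str, Vec<Vec<&'static str>>>;

/// Returns true if a depth-first traversal that always picks the first
/// variant of every enum terminates when started from `root`.
fn first_variant_terminates(graph: &TypeGraph, root: &'static str) -> bool {
    fn visit(
        graph: &TypeGraph,
        name: &'static str,
        on_path: &mut HashSet<&'static str>,
    ) -> bool {
        if !on_path.insert(name) {
            // `name` is already on the current path: first variants form a cycle.
            return false;
        }
        let ok = match graph.get(name).and_then(|variants| variants.first()) {
            Some(deps) => deps.iter().all(|&dep| visit(graph, dep, on_path)),
            None => true, // unknown type or enum without variants: nothing to traverse
        };
        on_path.remove(name);
        ok
    }
    visit(graph, root, &mut HashSet::new())
}

/// Two toy enums: `Tree` lists its base case first, `BadTree` does not.
fn sample_graph() -> TypeGraph {
    let mut graph = TypeGraph::new();
    // enum Tree { Leaf, Node(Box<Tree>) }
    graph.insert("Tree", vec![vec![], vec!["Tree"]]);
    // enum BadTree { Node(Box<BadTree>), Leaf }
    graph.insert("BadTree", vec![vec!["BadTree"], vec![]]);
    graph
}

fn main() {
    let graph = sample_graph();
    println!("Tree ok: {}", first_variant_terminates(&graph, "Tree"));
    println!("BadTree ok: {}", first_variant_terminates(&graph, "BadTree"));
}
```

In this model, reordering the variants of `BadTree` so that `Leaf` comes first is enough to satisfy the assumption.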

### Design Considerations

Whenever we traverse the graph of type declarations using deserialization callbacks, the type
system requires us to return valid Rust values of type `V::Value`, where `V` is the type of
a given `visitor`. This constraint limits the way we can stop graph traversal to only a few cases.

The first 4 cases are what we have called *possible recursion points* above:

* while visiting an `Option<T>` for the second time, we choose to return the value `None` to stop;
* while visiting a `Seq<T>` for the second time, we choose to return the empty sequence `[]`;
* while visiting a `Map<K, V>` for the second time, we choose to return the empty map `{}`;
* while visiting an `enum T` for the second time, we choose to return the first variant, i.e.
a "base case" by assumption (1) above.

In addition to the cases above,

* while visiting a container, if the container's name is mapped to a recorded value,
we MAY decide to use it.

The default configuration `TracerConfig::default()` always picks the recorded value for a
`NewTypeStruct` and never does in the other cases.
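
The recursion points listed above can be illustrated with a toy witness builder. This is a minimal sketch under loud assumptions, not the crate's implementation: `Ty`, `Value`, `Defs`, `witness`, `build_variant`, `witness_variant`, and `sample_defs` are hypothetical names; "visited before" is tracked per type here for simplicity; maps are omitted because they would behave like sequences; and recorded sample values are ignored.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical miniature type language (not the crate's `Format`).
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
enum Ty {
    U64,
    Option(Box<Ty>),
    Seq(Box<Ty>),
    Enum(&'static str),
}

/// Witness values built by the toy traversal.
#[derive(Debug, PartialEq)]
enum Value {
    U64(u64),
    None,
    Some(Box<Value>),
    Seq(Vec<Value>),
    Variant(&'static str, Vec<Value>),
}

/// Enum name -> list of (variant name, payload types).
type Defs = HashMap<&'static str, Vec<(&'static str, Vec<Ty>)>>;

/// Builds a witness for `ty`, stopping at recursion points seen before.
fn witness(ty: &Ty, defs: &Defs, seen: &mut HashSet<Ty>) -> Value {
    match ty {
        Ty::U64 => Value::U64(0),
        Ty::Option(inner) => {
            if seen.insert(ty.clone()) {
                Value::Some(Box::new(witness(inner, defs, seen)))
            } else {
                Value::None // second visit: stop with `None`
            }
        }
        Ty::Seq(inner) => {
            if seen.insert(ty.clone()) {
                Value::Seq(vec![witness(inner, defs, seen)])
            } else {
                Value::Seq(Vec::new()) // second visit: stop with `[]`
            }
        }
        // Nested enums fall back to their first variant (the assumed base case).
        Ty::Enum(name) => build_variant(name, 0, defs, seen),
    }
}

/// Builds variant `index` of enum `name` by building a witness for each payload type.
fn build_variant(name: &str, index: usize, defs: &Defs, seen: &mut HashSet<Ty>) -> Value {
    let (variant, fields) = &defs[name][index];
    let values: Vec<Value> = fields.iter().map(|f| witness(f, defs, seen)).collect();
    Value::Variant(*variant, values)
}

/// Entry point: explore one chosen variant of the top-level enum.
fn witness_variant(name: &str, index: usize, defs: &Defs) -> Value {
    build_variant(name, index, defs, &mut HashSet::new())
}

fn sample_defs() -> Defs {
    let mut defs = Defs::new();
    defs.insert("List", vec![("Nil", vec![]), ("Cons", vec![Ty::U64, Ty::Enum("List")])]);
    defs.insert("Chain", vec![("Link", vec![Ty::Option(Box::new(Ty::Enum("Chain")))])]);
    defs.insert("Tree", vec![("Node", vec![Ty::Seq(Box::new(Ty::Enum("Tree")))])]);
    defs
}

fn main() {
    let defs = sample_defs();
    println!("{:?}", witness_variant("List", 1, &defs));
    println!("{:?}", witness_variant("Chain", 0, &defs));
    println!("{:?}", witness_variant("Tree", 0, &defs));
}
```

In this model, `Chain` and `Tree` have no explicit base-case variant, yet the traversal still stops because the second visit of the `Option` or the sequence yields `None` or `[]`.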

For efficiency reasons, the current algorithm does not attempt to scan the variants of enums
other than the parameter `T` of the main call `trace_type<T>`. As a consequence, each enum type must be
traced separately.

## Contributing

See the [CONTRIBUTING](../CONTRIBUTING.md) file for how to help out.

## License

This project is available under the terms of either the [Apache 2.0 license](../LICENSE-APACHE) or the [MIT license](../LICENSE-MIT).

<!--
README.md is generated from README.tpl by cargo readme. To regenerate:

cargo install cargo-readme
cargo readme > README.md
-->