comparable_derive 0.3.0

A library for comparing data structures in Rust, oriented toward testing

The `comparable` crate defines the trait [`Comparable`], along with a derive
macro for auto-generating instances of this trait for most data types.
The primary purpose of this trait is to offer a method,
[`Comparable::comparison`], by which two values of any type supporting the
trait can yield a summary of the differences between them.

Note that unlike other crates that do data differencing (primarily between
scalars and collections), `comparable` has been written primarily with testing
in mind. That is, the purpose of generating such change descriptions is to
enable writing tests that assert the set of expected changes after some
operation between an initial state and the resulting state. This goal also
means that some types, like
[`HashMap`](https://doc.rust-lang.org/std/collections/struct.HashMap.html),
must be differenced after ordering the keys first, so that the set of changes
produced can be made deterministic and thus expressible as a test expectation.
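
The key-ordering idea can be illustrated without the crate. The following is a minimal sketch using only the standard library, not the crate's implementation; `MapDelta` and `diff_maps` are hypothetical names made up for this example, and the key and value types are fixed to `String` and `i32` to keep it short.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical change record for one map key (not a type from `comparable`).
#[derive(Debug, PartialEq)]
enum MapDelta {
    Added(String, i32),
    Changed(String, i32, i32),
    Removed(String),
}

/// Diff two maps by walking the union of their keys in sorted order, so the
/// result does not depend on `HashMap` iteration order.
fn diff_maps(old: &HashMap<String, i32>, new: &HashMap<String, i32>) -> Vec<MapDelta> {
    // A BTreeSet both deduplicates the keys and yields them sorted.
    let keys: BTreeSet<&String> = old.keys().chain(new.keys()).collect();
    let mut deltas = Vec::new();
    for key in keys {
        match (old.get(key), new.get(key)) {
            (Some(a), Some(b)) if a != b => {
                deltas.push(MapDelta::Changed(key.clone(), *a, *b))
            }
            (Some(_), None) => deltas.push(MapDelta::Removed(key.clone())),
            (None, Some(b)) => deltas.push(MapDelta::Added(key.clone(), *b)),
            _ => {} // present in both with equal values: nothing to report
        }
    }
    deltas
}
```

Because the keys are visited in sorted order, two runs over the same pair of maps always produce the same vector, which is what makes the result usable as a test expectation.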

To these ends, the function [`assert_changes`] is also provided, taking two
values of the same type along with an expected "change description" as
returned by `foo.comparison(&bar)`. This function uses the
[`pretty_assertions`](https://crates.io/crates/pretty_assertions) crate under
the hood so that minute differences within deep structures can be easily seen
in the failure output.

# Quickstart

If you want to get started quickly with the `comparable` crate to enhance unit
testing, do the following:

1. Add the `comparable` crate as a dependency, enabling `features = ["derive"]`.
2. Derive the `Comparable` trait on as many structs and enums as needed.
3. Structure your unit tests to follow these three phases:
   a. Create the initial state or dataset you intend to test and make a copy
      of it.
   b. Apply your operations and changes to this state.
   c. Use [`assert_changes`] between the initial state and the resulting state
      to assert that whatever happened is exactly what you expected to happen.

The main benefit of this approach over the usual method of "probing" the
resulting state -- to ensure it changed as you expected it to -- is that it
asserts against the exhaustive set of changes to ensure that no unintended
side-effects occurred beyond what you expected to happen. In this way, it is
both a positive and a negative test: checking for what you expect to see as
well as what you don't expect to see.
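
The three phases can be sketched with a hand-written diff standing in for the derived one. Everything here (`Account`, `AccountChange`, `changes`, `deposit`) is hypothetical and exists only for illustration; with the crate, the derive macro and [`assert_changes`] would take the place of `changes` and `assert_eq!`.

```rust
/// State under test (hypothetical example type).
#[derive(Clone, Debug, PartialEq)]
struct Account {
    balance: i64,
    transactions: u32,
}

/// Hand-written stand-in for a generated change type: (previous, new).
#[derive(Debug, PartialEq)]
enum AccountChange {
    Balance(i64, i64),
    Transactions(u32, u32),
}

/// Report every field that differs, in declaration order.
fn changes(before: &Account, after: &Account) -> Vec<AccountChange> {
    let mut out = Vec::new();
    if before.balance != after.balance {
        out.push(AccountChange::Balance(before.balance, after.balance));
    }
    if before.transactions != after.transactions {
        out.push(AccountChange::Transactions(before.transactions, after.transactions));
    }
    out
}

/// The operation being tested.
fn deposit(account: &mut Account, amount: i64) {
    account.balance += amount;
    account.transactions += 1;
}

/// (a) build and copy the state, (b) operate, (c) assert the full change set.
fn deposit_changes_exactly_two_fields() {
    let initial = Account { balance: 100, transactions: 0 }; // (a)
    let mut current = initial.clone();
    deposit(&mut current, 50); // (b)
    assert_eq!(
        changes(&initial, &current), // (c)
        vec![
            AccountChange::Balance(100, 150),
            AccountChange::Transactions(0, 1),
        ]
    );
}
```

If `deposit` later started touching a third field, the assertion would fail even though the balance is still correct, which is the "negative test" half of the approach.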

# The Comparable trait

The [`Comparable`] trait has two associated types and two methods, one pair
corresponding to _value descriptions_ and the other to _value changes_:

```rust
pub trait Comparable {
    type Desc: std::cmp::PartialEq + std::fmt::Debug;
    fn describe(&self) -> Self::Desc;

    type Change: std::cmp::PartialEq + std::fmt::Debug;
    fn comparison(&self, other: &Self) -> comparable::Changed<Self::Change>;
}
```

## Descriptions: the [`Comparable::Desc`] associated type

Value descriptions (the [`Comparable::Desc`] associated type) are needed
because value hierarchies can involve many types. Perhaps some of these types
implement `PartialEq` and `Debug`, but not all. To work around this
limitation, the [`Comparable`] derive macro creates a "mirror" of your data
structure with all the same constructors and fields, but using the
[`Comparable::Desc`] associated type for each of its contained types.

```
# use comparable_derive::*;
#[derive(Comparable)]
struct Foo {
  bar: u32,
  baz: u32
}
```

This generates a description that mirrors the original type, but using type
descriptions rather than the types themselves:

```
struct FooDesc {
  bar: <u32 as comparable::Comparable>::Desc,
  baz: <u32 as comparable::Comparable>::Desc
}
```

You may also choose an alternate description type, such as a reduced form of a
value or some other type entirely. For example, complex structures could
describe themselves by the set of changes they represent from a `Default`
value. This is so common that it's supported via a `compare_default` macro
attribute provided by `comparable`:

```
# use comparable_derive::*;
#[derive(Comparable)]
#[compare_default]
struct Foo { /* ...lots of fields... */ }

impl Default for Foo {
    fn default() -> Self { Foo {} }
}
```

For scalars, the [`Comparable::Desc`] type is the same as the type it's
describing, and these are called "self-describing".

There are other macro attributes provided for customizing things even further,
which are covered below, beginning at the section on [Structures](#structs).

## Changes: the [`Comparable::Change`] associated type

When two values of a type differ, this difference gets represented using the
associated type [`Comparable::Change`]. Such values are produced by the
[`Comparable::comparison`] method, which actually returns `Changed<Change>`
since the result may be either `Changed::Unchanged` or
`Changed::Changed(_changes_)`.[^option]

[^option]: `Changed` is just a different flavor of the `Option` type, created
to make changesets clearer than just seeing `Some` in various places.

The primary purpose of a [`Comparable::Change`] value is to compare it to a
set of changes you expected to see, so design choices have been made to
optimize for clarity and printing rather than, say, the ability to transform
one value into another by applying a changeset. This is entirely possible
given a dataset and a change description, but no work has been done toward
that goal.

How changes are represented can differ greatly between scalars, collections,
structs and enums, so more detail is given below in the section discussing
each of these types.
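
To make the relationship between the two associated types concrete, here is a minimal hand-written implementation for a scalar-like newtype. It is a sketch under stated assumptions: the `Changed` enum and the `Comparable` trait are redeclared locally as stand-ins so the block compiles on its own, and `Celsius` and `CelsiusChange` are hypothetical names, not part of the crate.

```rust
use std::fmt::Debug;

/// Local stand-in for `comparable::Changed`: an `Option`-like wrapper.
#[derive(Debug, PartialEq)]
enum Changed<T> {
    Unchanged,
    Changed(T),
}

/// Local copy of the trait shape shown earlier, so the sketch stands alone.
trait Comparable {
    type Desc: PartialEq + Debug;
    fn describe(&self) -> Self::Desc;

    type Change: PartialEq + Debug;
    fn comparison(&self, other: &Self) -> Changed<Self::Change>;
}

/// Hypothetical scalar-like newtype.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Celsius(i32);

/// Change record holding the previous and the new value.
#[derive(Debug, PartialEq)]
struct CelsiusChange(i32, i32);

impl Comparable for Celsius {
    // Self-describing: the description is the value itself.
    type Desc = Celsius;
    fn describe(&self) -> Self::Desc {
        *self
    }

    type Change = CelsiusChange;
    fn comparison(&self, other: &Self) -> Changed<Self::Change> {
        if self == other {
            Changed::Unchanged
        } else {
            Changed::Changed(CelsiusChange(self.0, other.0))
        }
    }
}
```

With this in place, `Celsius(20).comparison(&Celsius(25))` yields `Changed::Changed(CelsiusChange(20, 25))`, while comparing equal values yields `Changed::Unchanged`.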

# Scalars

The [`Comparable`] trait has been implemented for all of the basic scalar
types. These are self-describing, and use a [`Comparable::Change`] structure
named after the type that holds the previous and changed values. For example,
the following assertions hold:

```
# use comparable::*;
assert_changes(&100, &100, Changed::Unchanged);
assert_changes(&100, &200, Changed::Changed(I32Change(100, 200)));
assert_changes(&true, &false, Changed::Changed(BoolChange(true, false)));
assert_changes(
    &"foo",
    &"bar",
    Changed::Changed(StringChange("foo".to_string(), "bar".to_string())),
);
```

# Vec and Set Collections

The sequence and set collections for which [`Comparable`] has been implemented
are: `Vec`, `HashSet`, and `BTreeSet`.

`Vec` uses `Vec<VecChange>` to report all of the indices at which changes
happened. Note that it cannot detect insertions in the middle, and so will
likely report every item as changed from there until the end of the vector, at
which point it will report an added member.
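
The reason a mid-vector insertion shows up as a run of changes is that the comparison is positional. A minimal sketch of such an index-by-index diff follows; it is an illustration of the idea rather than the crate's code, and `VecDelta` and `diff_positional` are hypothetical names, specialized to `i32` for brevity.

```rust
/// Hypothetical stand-in for a vector change record, specialized to `i32`.
#[derive(Debug, PartialEq)]
enum VecDelta {
    Added(usize, i32),
    Changed(usize, i32, i32),
    Removed(usize, i32),
}

/// Compare two slices index by index, then report the leftover tail.
fn diff_positional(old: &[i32], new: &[i32]) -> Vec<VecDelta> {
    let mut out = Vec::new();
    let shared = old.len().min(new.len());
    for i in 0..shared {
        if old[i] != new[i] {
            out.push(VecDelta::Changed(i, old[i], new[i]));
        }
    }
    // Elements past the shared length exist in only one of the two slices.
    for (i, value) in old.iter().enumerate().skip(shared) {
        out.push(VecDelta::Removed(i, *value));
    }
    for (i, value) in new.iter().enumerate().skip(shared) {
        out.push(VecDelta::Added(i, *value));
    }
    out
}
```

Inserting `2` into `[1, 3, 4]` to get `[1, 2, 3, 4]` therefore reports indices 1 and 2 as changed and index 3 as added, rather than a single insertion.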

`HashSet` and `BTreeSet` types both report changes the same way, using the
`SetChange` type. Note that in order for `HashSet` change results to be
deterministic, the values in a `HashSet` must support the `Ord` trait so they
can be sorted prior to comparison. Sets cannot tell when specific members have
changed, and so only report changes in terms of `SetChange::Added` and
`SetChange::Removed`.
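
A sorted set difference can be sketched with the standard library alone. This is an illustration only: `SetDelta` and `diff_sets` are hypothetical names, and listing additions before removals is a choice made for this sketch, not a documented guarantee of the crate.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a set change record, specialized to `i32`.
#[derive(Debug, PartialEq)]
enum SetDelta {
    Added(i32),
    Removed(i32),
}

/// Report members only in `new` as added and members only in `old` as removed.
fn diff_sets(old: &HashSet<i32>, new: &HashSet<i32>) -> Vec<SetDelta> {
    // Sort each difference so the report is stable across runs, since
    // `HashSet` iteration order is unspecified.
    let mut added: Vec<i32> = new.difference(old).copied().collect();
    added.sort();
    let mut removed: Vec<i32> = old.difference(new).copied().collect();
    removed.sort();
    added
        .into_iter()
        .map(SetDelta::Added)
        .chain(removed.into_iter().map(SetDelta::Removed))
        .collect()
}
```

Note that replacing `2` with `4` is reported as one addition and one removal; a set has no notion of "the same member, changed".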

Here are a few examples, taken from the `comparable_test` test suite:

```
# use comparable::*;
# use std::collections::HashSet;
// Vectors
assert_changes(
    &vec![1 as i32, 2],
    &vec![1 as i32, 2, 3],
    Changed::Changed(vec![VecChange::Added(2, 3)]),
);
assert_changes(
    &vec![1 as i32, 3],
    &vec![1 as i32, 2, 3],
    Changed::Changed(vec![
        VecChange::Changed(1, I32Change(3, 2)),
        VecChange::Added(2, 3),
    ]),
);
assert_changes(
    &vec![1 as i32, 2, 3],
    &vec![1 as i32, 3],
    Changed::Changed(vec![
        VecChange::Changed(1, I32Change(2, 3)),
        VecChange::Removed(2, 3),
    ]),
);
assert_changes(
    &vec![1 as i32, 2, 3],
    &vec![1 as i32, 4, 3],
    Changed::Changed(vec![VecChange::Changed(1, I32Change(2, 4))]),
);

// Sets
assert_changes(
    &vec![1 as i32, 2].into_iter().collect::<HashSet<_>>(),
    &vec![1 as i32, 2, 3].into_iter().collect::<HashSet<_>>(),
    Changed::Changed(vec![SetChange::Added(3)]),
);
assert_changes(
    &vec![1 as i32, 3].into_iter().collect::<HashSet<_>>(),
    &vec![1 as i32, 2, 3].into_iter().collect::<HashSet<_>>(),
    Changed::Changed(vec![SetChange::Added(2)]),
);
assert_changes(
    &vec![1 as i32, 2, 3].into_iter().collect::<HashSet<_>>(),
    &vec![1 as i32, 3].into_iter().collect::<HashSet<_>>(),
    Changed::Changed(vec![SetChange::Removed(2)]),
);
assert_changes(
    &vec![1 as i32, 2, 3].into_iter().collect::<HashSet<_>>(),
    &vec![1 as i32, 4, 3].into_iter().collect::<HashSet<_>>(),
    Changed::Changed(vec![SetChange::Added(4), SetChange::Removed(2)]),
);
```

When an expectation does not match, the failure output shows exactly where it
diverges. For example, if an expected vector change names index 1 where the
actual change occurred at index 0, the output looks something like this:

```text
running 1 test
test test_comparable_bar ... FAILED

failures:

---- test_comparable_bar stdout ----
thread 'test_comparable_bar' panicked at 'assertion failed: `(left == right)`

Diff < left / right > :
 Changed(
     [
         Change(
<            1,
>            0,
             I32Change(
                 100,
                 200,
             ),
         ),
     ],
 )

', /Users/johnw/src/comparable/comparable/src/lib.rs:19:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    test_comparable_bar
```

# Map Collections

**TODO**: jww (2021-11-01): Need content here.

# <a name="structs"></a>Structures

Differencing arbitrary structures was the original motive for creating
`comparable`. This is made feasible using a [`Comparable`] derive macro that
auto-generates code needed for such comparisons. The purpose of this section
is to explain how this macro works, and the various attribute macros that can
be used to guide the process. If all else fails, manual trait implementations
are always an alternative.

For the purpose of the following sub-sections, we consider the following
structure:

```
# use comparable_derive::*;
#[derive(Comparable)]
struct Foo {
  bar: u32,
  baz: u32,
  #[comparable_ignore]
  quux: Box<dyn FnOnce(u32)>
}
```

## `comparable_ignore`

The first attribute macro you'll notice that can be applied to individual
fields is `#[comparable_ignore]`, which must be used if the type in question
cannot be compared for differences.

## `comparable_synthetic`

The `#[comparable_synthetic { <BINDINGS...> }]` attribute allows you to attach
one or more "synthetic properties" to a field, which are then considered in
both descriptions and change sets, as if they were actual fields with the
computed value. Here is an example:

```
# use comparable_derive::*;
#[derive(Comparable)]
pub struct Synthetics {
    #[comparable_synthetic {
        let full_value = |x: &Self| -> u8 { x.ensemble.iter().sum() };
    }]
    #[comparable_ignore]
    pub ensemble: Vec<u8>,
}
```

This structure has an `ensemble` field containing a vector of `u8` values.
However, in tests we may not care if the vector's contents change, so long as
the final sum remains the same. This is done by ignoring the ensemble field so
that it's not generated or described at all, while creating a synthetic field
_derived from the full object_ that yields the sum.

Note that the syntax for the `comparable_synthetic` attribute is rather
specific: a series of simply-named `let` bindings, where the value in each
case is a fully typed closure that takes a reference to the object containing
the original field (`&Self`), and yields a value of some type for which
[`Comparable`] has been implemented or derived.
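
What the synthetic property buys you can be shown by hand. The sketch below mirrors the `Synthetics` example without the derive; `FullValueChange`, `full_value` and `compare_full_value` are hypothetical names, and `Option` stands in for the crate's `Changed` wrapper.

```rust
/// Same shape as the `Synthetics` example, without the derive.
struct Synthetics {
    ensemble: Vec<u8>,
}

/// Hypothetical change record for the synthetic `full_value` property.
#[derive(Debug, PartialEq)]
struct FullValueChange(u8, u8);

/// The closure from the attribute, written as a plain function.
/// As with any `u8` sum, this overflows if the total exceeds 255.
fn full_value(x: &Synthetics) -> u8 {
    x.ensemble.iter().sum()
}

/// Compare only the derived sum; the vector's contents are ignored.
fn compare_full_value(old: &Synthetics, new: &Synthetics) -> Option<FullValueChange> {
    let (a, b) = (full_value(old), full_value(new));
    if a == b {
        None
    } else {
        Some(FullValueChange(a, b))
    }
}
```

Two values whose vectors differ but sum to the same total, such as `[1, 2, 3]` and `[3, 3]`, compare as unchanged under this scheme.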

## Deriving Comparable for structs: the Desc type

By default, deriving [`Comparable`] for a structure will create a "mirror" of
that structure, with all the same fields, but replacing every type `T` with
`<T as Comparable>::Desc`:

```
# use comparable::*;
struct FooDesc {
  bar: <u32 as Comparable>::Desc,
  baz: <u32 as Comparable>::Desc
}
```

This process can be influenced using several attribute macros.

### `self_describing`

If the `self_describing` attribute is used, the [`Comparable::Desc`] type is
set to be the type itself, and the [`Comparable::describe`] method returns a
clone of the value.

Note the following traits are required for self-describing types: `Clone`,
`Debug` and `PartialEq`.

### `no_description`

If you want no description at all for a type, since you only care about how it
has changed and never want to report a description of the value in any other
context, then you can use `#[no_description]`. This sets the
[`Comparable::Desc`] type to be unit, and the [`Comparable::describe`] method
accordingly:

```ignore
type Desc = ();

fn describe(&self) -> Self::Desc {
    ()
}
```

It is assumed that when this is appropriate, such values will never appear in
any change output, so consider a different approach if you see lots of units
turning up.

### `describe_type` and `describe_body`

You can have more control over description by specifying exactly the text that
should appear for the [`Comparable::Desc`] type and the body of the
[`Comparable::describe`] function. Basically, for the following definition:

```ignore
# use comparable_derive::*;
#[derive(Comparable)]
#[describe_type(T)]
#[describe_body(B)]
struct Foo {
  bar: u32,
  baz: u32
}
```

The following is generated:

```ignore
type Desc = T;

fn describe(&self) -> Self::Desc {
    B
}
```

This also means that the expression argument passed to `describe_body` may
reference the `self` parameter. Here is a real-world use case:

```
# use comparable_derive::*;
#[cfg_attr(feature = "comparable",
           derive(comparable::Comparable),
           describe_type(String),
           describe_body(self.to_string()))]
struct Foo {}
```

This same approach could be used to represent large blobs of data by their
checksum hash, for example, or large data structures that you don't need to
ever display by their Merkle root hash.
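
As a rough sketch of the checksum idea, the function below plays the role of a `describe_body` expression whose `describe_type` would be `u64`. `Blob` is a hypothetical type, and the standard library's `DefaultHasher` is used only to keep the example dependency-free; its output is not guaranteed to stay the same across Rust releases, so a real test suite would more likely use a fixed checksum algorithm.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical type holding a large payload.
struct Blob {
    bytes: Vec<u8>,
}

impl Blob {
    /// Summarize the payload by a hash rather than copying it into the
    /// description. Assumption: a 64-bit hash is a good enough summary.
    fn describe(&self) -> u64 {
        let mut hasher = DefaultHasher::new();
        self.bytes.hash(&mut hasher);
        hasher.finish()
    }
}
```

Equal payloads produce equal descriptions, so a description mismatch in test output signals that the payload changed without printing the payload itself.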

### `compare_default`

When the `#[compare_default]` attribute macro is used, the
[`Comparable::Desc`] type is defined to be the same as the
[`Comparable::Change`] type, with the [`Comparable::describe`] method being
implemented as a comparison against the value of `Default::default()`:

```ignore
# use comparable::*;
impl comparable::Comparable for Foo {
    type Desc = Self::Change;

    fn describe(&self) -> Self::Desc {
        Foo::default().comparison(self).unwrap_or_default()
    }

    type Change = Vec<FooChange>;

    /* ... */
}
```

Note that changes for structures are always a vector, since this allows
changes to be reported separately for each field. More on this in the
following section.
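
Describing a value by its difference from the default can be sketched by hand as follows. This is an illustration, not generated code: `Settings`, `SettingsChange`, `comparison` and `describe` are hypothetical free-standing names, and an empty vector plays the role of "unchanged".

```rust
/// Hypothetical settings type with a `Default` implementation.
#[derive(Default)]
struct Settings {
    retries: u32,
    verbose: bool,
}

/// Hand-written stand-in for a per-field change type: (previous, new).
#[derive(Debug, PartialEq)]
enum SettingsChange {
    Retries(u32, u32),
    Verbose(bool, bool),
}

/// Collect one entry per field that differs; empty means "unchanged".
fn comparison(old: &Settings, new: &Settings) -> Vec<SettingsChange> {
    let mut changes = Vec::new();
    if old.retries != new.retries {
        changes.push(SettingsChange::Retries(old.retries, new.retries));
    }
    if old.verbose != new.verbose {
        changes.push(SettingsChange::Verbose(old.verbose, new.verbose));
    }
    changes
}

/// Describe a value by how it differs from `Settings::default()`.
fn describe(value: &Settings) -> Vec<SettingsChange> {
    comparison(&Settings::default(), value)
}
```

A value with only `retries` set to 3 is then described as `[Retries(0, 3)]`, and the default value itself has an empty description.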

## Deriving Comparable for structs: the Change type

By default for structs, deriving [`Comparable`] creates an `enum` with
variants for each field in the `struct`, and it represents changes using a
vector of such values. This means that for the following definition:

```
# use comparable_derive::*;
#[derive(Comparable)]
struct Foo {
  bar: u32,
  baz: u32
}
```

The [`Comparable::Change`] type is defined to be `Vec<FooChange>`, with
`FooChange` as follows:

```ignore
#[derive(PartialEq, Debug)]
enum FooChange {
    Bar(<u32 as Comparable>::Change),
    Baz(<u32 as Comparable>::Change),
}

impl comparable::Comparable for Foo {
    type Desc = FooDesc;
    type Change = Vec<FooChange>;
}
```

Here is an abbreviated example of how this looks when asserting changes:

```ignore
assert_changes(
    &initial_foo, &later_foo,
    Changed::Changed(vec![
        FooChange::Bar(...),
        FooChange::Baz(...),
    ]));
```

If the field hasn't been changed it won't appear in the vector, and each field
appears at most once. The reason for taking this approach is that structures
with many, many fields can be represented by a small change set if most of the
other fields were left untouched.

### Special case: Unit structs

If a struct has no fields it can never change, and so only a unitary
[`Comparable::Desc`] type is generated.

### Special case: Singleton structs

If a struct has only one field, there is no reason to specify changes using a
vector, since either the struct is unchanged or just that one field has
changed. For this reason, singleton structs optimize away the vector and use
`type Change = [type]Change` in their [`Comparable`] derivation, rather than
`type Change = Vec<[type]Change>` as for multi-field structs.

### `comparable_public` and `comparable_private`

By default, the auto-generated [`Comparable::Desc`] and [`Comparable::Change`]
types have the same visibility as their parent. This may not be appropriate,
however, if you want to keep the original data type private but allow
exporting of descriptions and change sets. To support this -- and the converse
-- you can use `#[comparable_public]` and `#[comparable_private]` to be
explicit about the visibility of these generated types.


# <a name="enums"></a>Enumerations

Enumerations are handled quite differently from structures, mainly because
while a `struct` is always a product of fields, an `enum` can be more than a
sum of variants: it can also be a sum of products.

To unpack that a bit: By a product of fields, it is meant that a `struct` is a
simple grouping of typed fields, where the same fields are available for
_every_ value of such a structure.

Meanwhile, an `enum` is a sum, or choice, among variants, but some of these
variants can themselves contain groups of fields, as though there were an
unnamed structure embedded in the variant. Consider the following `enum`,
which will be used for all the following examples:

```
# use comparable_derive::*;
#[derive(Comparable)]
enum MyEnum {
    One(bool),
    Two { two: Vec<bool>, two_more: u32 },
    Three,
}
```

Here we see an enum that has a variant with no fields (`Three`), one with
unnamed fields (`One`), and one with named fields like a usual structure
(`Two`). The problem, though, is that these embedded structures are never
represented as independent types, so we can't define [`Comparable`] for them
and just compute the differences between the enum arguments. Nor can we just
create a copy of the field type with a real name and generate [`Comparable`]
for it, because not every value is copyable or clonable, and it gets very
tricky to auto-generate a new hierarchy built out of fields with reference
types all the way down...

Instead, the following gets generated, which can end up being a bit verbose,
but captures the full nature of any differences:

```ignore
enum MyEnumChange {
    BothOne(<bool as comparable::Comparable>::Change),
    BothTwo {
        two: Changed<<Vec<bool> as comparable::Comparable>::Change>,
        two_more: Changed<<u32 as comparable::Comparable>::Change>,
    },
    BothThree,
    Different(
        <MyEnum as comparable::Comparable>::Desc,
        <MyEnum as comparable::Comparable>::Desc
    ),
}
```

Note that variants with singleton fields do not wrap their field in `Changed`,
since that information is already reflected when the variant is reported as
having changed at all using, for example, `BothOne`. In the case of `BothTwo`,
each of the field types is wrapped in `Changed` because it's possible that
either one or both of the fields may have changed.
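
The following hand-written comparison shows how values of `MyEnum` map onto such a change type. It is a simplified sketch, not the macro's output: `Changed` is redeclared locally, field changes are plain `(old, new)` pairs, a clone of the value stands in for its description, and `BothThree` is omitted because a fieldless variant has nothing to report. The helper names `field` and `compare` are hypothetical.

```rust
/// Local stand-in for `comparable::Changed`.
#[derive(Debug, PartialEq)]
enum Changed<T> {
    Unchanged,
    Changed(T),
}

#[derive(Clone, Debug, PartialEq)]
enum MyEnum {
    One(bool),
    Two { two: Vec<bool>, two_more: u32 },
    Three,
}

/// Simplified change type; see the lead-in for what was left out.
#[derive(Debug, PartialEq)]
enum MyEnumChange {
    BothOne(bool, bool),
    BothTwo {
        two: Changed<(Vec<bool>, Vec<bool>)>,
        two_more: Changed<(u32, u32)>,
    },
    Different(MyEnum, MyEnum),
}

/// Wrap a single field comparison, since either field may be untouched.
fn field<T: PartialEq + Clone>(a: &T, b: &T) -> Changed<(T, T)> {
    if a == b {
        Changed::Unchanged
    } else {
        Changed::Changed((a.clone(), b.clone()))
    }
}

fn compare(old: &MyEnum, new: &MyEnum) -> Changed<MyEnumChange> {
    match (old, new) {
        // Equal values, including `Three` vs `Three`, report nothing.
        _ if old == new => Changed::Unchanged,
        (MyEnum::One(a), MyEnum::One(b)) => {
            Changed::Changed(MyEnumChange::BothOne(*a, *b))
        }
        (
            MyEnum::Two { two: a, two_more: x },
            MyEnum::Two { two: b, two_more: y },
        ) => Changed::Changed(MyEnumChange::BothTwo {
            two: field(a, b),
            two_more: field(x, y),
        }),
        // Different variants: report both sides in full.
        _ => Changed::Changed(MyEnumChange::Different(old.clone(), new.clone())),
    }
}
```

Comparing two `Two` values that differ only in `two_more` yields a `BothTwo` whose `two` field is `Changed::Unchanged`, which is why each field carries its own wrapper.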

### Special case: Empty enums

If an enum has no variants it cannot be constructed, so both the
[`Comparable::Desc`] and [`Comparable::Change`] types are omitted and it is
always reported as unchanged.

# <a name="unions"></a>Unions

Unions cannot derive [`Comparable`] instances at the present time.