//! Execute tests against test vectors stored in external files
//!
//! The [macro@test_vectors] macro annotates a _test criterion function_ which executes against
//! multiple _cases_. Each case expands to a standalone Rust unit test (i.e. a `#[test]`
//! function). The data for each case is stored in a _case directory_, where each argument to the
//! test criterion function is associated with a file. All of the case directories live inside a
//! _corpus directory_ which is specified with the `dir` parameter to [macro@test_vectors].
//!
//! # Example
//!
//! Suppose we have a fancy-dancy crate which can replace spaces with underscores in a string, and
//! we want to test that functionality against a bunch of input vector files.
//!
//! We can organize the crate contents like this:
//!
//! - `Cargo.toml` - depending on [test-vectors](crate) in `[dev-dependencies]`
//! - `src/lib.rs` - containing the example code below
//! - `test-data/example1/alpha/input` - containing `this is alpha`
//! - `test-data/example1/alpha/expected` - containing `this_is_alpha`
//! - `test-data/example1/beta/input` - containing `this is beta`
//! - `test-data/example1/beta/expected` - containing `this_is_beta`
//!
//! Now in our `lib.rs` we have:
//!
//! ```
//! pub fn replace_spaces_with_underscores(input: &str) -> String {
//!     input.replace(' ', "_")
//! }
//!
//! #[test_vectors::test_vectors(
//! # doctest = true,
//!   dir = "test-data/example1"
//! )]
//! fn test_replace(input: &[u8], expected: &[u8]) -> Result<(), std::str::Utf8Error> {
//!     // Test setup:
//!     let instr = std::str::from_utf8(input)?;
//!     let expstr = std::str::from_utf8(expected)?;
//!
//!     // Application code test target:
//!     let output = replace_spaces_with_underscores(instr);
//!
//!     // Test verification:
//!     assert_eq!(expstr, &output);
//!
//!     Ok(())
//! }
//! ```
//!
//! This creates two Rust unit tests from the case directories inside the corpus directory
//! `test-data/example1`. The cases are named after the case directories `alpha` and
//! `beta`. For each test, the file contents of the `input` and `expected` files in the case
//! directory are mapped to the `&[u8]` test criterion function arguments. The output of `cargo
//! test` will include something like this:
//!
//! ```text
//! test test_replace_alpha ... ok
//! test test_replace_beta ... ok
//! ```
//!
//! # Motivations
//!
//! This design is well suited to tests which benefit from any of these features:
//!
//! - Split one criterion function into separate cases by input (similar to the
//!   [test-case](https://docs.rs/test-case) crate). If a subset of cases fails for the same test
//!   criterion, the `cargo test` output immediately identifies the specific failing cases, in
//!   contrast to a single `#[test]` function that loops over all of the test vectors.
//! - Test raw un-encoded data stored directly in files, rather than Rust-specific literal
//!   representations. This can help avoid divergence between live production data and the Rust
//!   literals meant to represent that data.
//! - Test against external files, which facilitates _conformance testing_
//!   of multiple implementations against a common set of test vectors. For example, a network
//!   protocol standard may include a set of message serialization test
//!   vectors which multiple implementations validate against.
//! - Use other external tools on the external data files. For example,
//!   if a video codec metadata parsing library has external test vector files, other tools for
//!   examining that video format, such as interactive video players, can be used directly
//!   on the test vectors.
//!
//! # Corpus and Case Directories
//!
//! The corpus directory is specified by the `dir` macro argument. This is a path relative to the
//! directory named by the `CARGO_MANIFEST_DIR` environment variable (which is where the crate's
//! `Cargo.toml` lives).
//!
//! Every directory inside a corpus directory is expected to be a case directory (after
//! traversing symlinks). Non-directories are ignored, and it's good practice to have a `README.md`
//! file explaining the corpus.
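//!
//! As a rough illustration of that rule, the sketch below lists the case directories of a corpus
//! using only the standard library. The helper names `corpus_path` and `case_dirs` are
//! hypothetical and are not part of this crate's API; the macro's actual discovery logic may
//! differ.
//!
//! ```rust
//! use std::fs;
//! use std::io;
//! use std::path::{Path, PathBuf};
//!
//! /// Hypothetical helper: resolve the `dir` argument against the manifest directory.
//! pub fn corpus_path(manifest_dir: &Path, dir: &str) -> PathBuf {
//!     manifest_dir.join(dir)
//! }
//!
//! /// Hypothetical helper: collect every case directory inside `corpus_dir`.
//! pub fn case_dirs(corpus_dir: &Path) -> io::Result<Vec<PathBuf>> {
//!     let mut cases = Vec::new();
//!     for entry in fs::read_dir(corpus_dir)? {
//!         let path = entry?.path();
//!         // `Path::is_dir` follows symlinks, so a symlink to a directory counts as a case.
//!         // Anything else (for example a `README.md`) is skipped.
//!         if path.is_dir() {
//!             cases.push(path);
//!         }
//!     }
//!     // `read_dir` order is platform dependent; sorting is an assumption of this sketch.
//!     cases.sort();
//!     Ok(cases)
//! }
//! ```
//!
//! Sorting only keeps the listing stable across platforms in this sketch; the text above does not
//! specify an ordering for generated tests.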
//!
//! Inside a case directory, only the paths derived from the criterion function argument names are
//! accessed, and other contents are ignored, so a good practice is to include a `README.md`
//! explaining the intention of the case. A consequence of this behavior is that different
//! criterion functions can reuse the same corpus directory.
//!
//! For example, a case directory under `test-data/example2` might have these files:
//!
//! - `input` with content `this is the input`
//! - `underscores` with the content `this_is_the_input`
//! - `elided` with the content `thisistheinput`
//!
//! Then two different criterion functions might test different conversions of the same inputs:
//!
//! ```
//! use test_vectors::test_vectors;
//! use std::str::Utf8Error;
//!
//! #[test_vectors(
//! # doctest = true,
//!   dir = "test-data/example2"
//! )]
//! fn replace_spaces_with_underscores(input: &[u8], underscores: &[u8]) -> Result<(), Utf8Error> {
//!     let instr = std::str::from_utf8(input)?;
//!     let expstr = std::str::from_utf8(underscores)?;
//!     let output = instr.replace(' ', "_");
//!     assert_eq!(expstr, &output);
//!     Ok(())
//! }
//!
//! #[test_vectors(
//! # doctest = true,
//!   dir = "test-data/example2"
//! )]
//! fn elide_spaces(input: &[u8], elided: &[u8]) -> Result<(), Utf8Error> {
//!     let instr = std::str::from_utf8(input)?;
//!     let expstr = std::str::from_utf8(elided)?;
//!     let output = instr.replace(' ', "");
//!     assert_eq!(expstr, &output);
//!     Ok(())
//! }
//! ```
//!
//! Since both criterion functions use the same corpus and both take `input`, they are testing
//! against the same test vector `input` files, while each function reads a different test vector
//! for its specific functionality, i.e. `underscores` vs `elided`.
//!
//! # Automatic Input Conversion From Bytes
//!
//! Arguments to test criterion functions are converted with `TryFrom<&[u8]>` from the
//! file contents, which are available as `&[u8]`. Since the standard library provides a blanket
//! `TryFrom` implementation wherever `Into` applies (including from a type to itself), an
//! argument of type `&[u8]` is the basic supported type.
//!
//! For other types, this takes care of some boilerplate for converting inputs by using a
//! standard Rust trait. This approach, versus supporting customizable conversions in the macro
//! interface, keeps the macro interface and logic simpler.
//!
//! The result of conversion is unwrapped, so any failure of conversion causes a panic and the test
//! case will fail. The call site looks something like:
//!
//! ```text
//! <T>::try_from(include_bytes!(…)).unwrap()
//! ```
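//!
//! To make that conversion concrete, here is a minimal sketch of the same `try_from` plus
//! `unwrap` step written as a generic function. The names `convert`, `passthrough`, and
//! `first_four` are hypothetical and exist only for this illustration; the code the macro
//! generates may be shaped differently.
//!
//! ```rust
//! use std::convert::TryFrom;
//! use std::fmt::Debug;
//!
//! /// Hypothetical stand-in for the conversion applied to each argument.
//! pub fn convert<'a, T>(bytes: &'a [u8]) -> T
//! where
//!     T: TryFrom<&'a [u8]>,
//!     <T as TryFrom<&'a [u8]>>::Error: Debug,
//! {
//!     // A failed conversion panics, which fails the test case.
//!     T::try_from(bytes).unwrap()
//! }
//!
//! /// `&[u8]` converts to itself through the standard library's blanket impl.
//! pub fn passthrough(bytes: &[u8]) -> &[u8] {
//!     convert(bytes)
//! }
//!
//! /// Fixed-size arrays also implement `TryFrom<&[u8]>`; a length mismatch panics here.
//! pub fn first_four(bytes: &[u8]) -> [u8; 4] {
//!     convert(bytes)
//! }
//! ```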
//!
//! Recall that in the first example we explicitly called [std::str::from_utf8] to convert the
//! byte slice parameters. This is an example of a conversion that is not available via
//! `TryFrom<&[u8]>` (because there might be multiple ways to convert bytes into a `str`). So that
//! example highlights how test criterion functions may need to rely on newtype wrapper types to
//! perform conversions. The [test-vectors](crate) crate provides some commonly needed wrapper
//! types, such as [Utf8Str] for that case. Compare the example in the [Utf8Str] docs to the first
//! example above.
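//!
//! For intuition about what such a wrapper involves, the sketch below defines a UTF-8 newtype by
//! hand. `Utf8Text` is a hypothetical name and this is not the definition of [Utf8Str], whose
//! actual interface is described in its own docs.
//!
//! ```rust
//! use std::convert::TryFrom;
//! use std::ops::Deref;
//! use std::str::Utf8Error;
//!
//! /// Hypothetical wrapper: a byte slice that has been validated as UTF-8.
//! pub struct Utf8Text<'a>(&'a str);
//!
//! impl<'a> TryFrom<&'a [u8]> for Utf8Text<'a> {
//!     type Error = Utf8Error;
//!
//!     fn try_from(bytes: &'a [u8]) -> Result<Self, Self::Error> {
//!         // Validation happens once, at conversion time.
//!         std::str::from_utf8(bytes).map(Utf8Text)
//!     }
//! }
//!
//! impl<'a> Deref for Utf8Text<'a> {
//!     type Target = str;
//!
//!     // Deref lets a criterion function call `str` methods on the wrapper directly.
//!     fn deref(&self) -> &str {
//!         self.0
//!     }
//! }
//! ```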
//!
//! If a test needs some custom conversion, it may need to implement a custom newtype wrapper, as
//! the next example shows:
//!
//! # Example of Implementing a Custom Conversion New-Type
//!
//! Suppose your crate type `T` implements [serde](https://docs.rs/serde)'s `Deserialize`
//! trait, your test vectors are JSON data, and you want to remove the boilerplate of deserializing
//! JSON in your criterion functions.
//!
//! You could implement a newtype that performs the conversion for you:
//!
//! ```
//! use serde::Deserialize;
//! use test_vectors::test_vectors;
//!
//! #[derive(Deserialize)]
//! struct AppType {
//!     valid: bool
//! }
//!
//! impl AppType {
//!     fn is_valid(&self) -> bool {
//!         self.valid
//!     }
//! }
//!
//! struct AppTypeFromJson(AppType);
//!
//! impl std::ops::Deref for AppTypeFromJson {
//!     type Target = AppType;
//!
//!     fn deref(&self) -> &Self::Target {
//!         &self.0
//!     }
//! }
//!
//! impl TryFrom<&[u8]> for AppTypeFromJson {
//!     type Error = serde_json::Error;
//!
//!     fn try_from(input: &[u8]) -> Result<Self, Self::Error> {
//!         serde_json::from_slice(input).map(AppTypeFromJson)
//!     }
//! }
//!
//! #[test_vectors(
//! # doctest = true,
//!   dir = "test-data/example3"
//! )]
//! fn validate(value: AppTypeFromJson) {
//!     // Perform test-logic on `value`:
//!     assert!(value.is_valid());
//! }
//! ```
//!
//! # Criterion Function Return Type
//!
//! The return type of a criterion function is replicated directly for each test case, and the test
//! returns the criterion function result unaltered. Criterion functions can return `()` or
//! [Result] with identical behavior to unit tests.
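//!
//! The sketch below shows one way to picture this, reusing the criterion function from the first
//! example. The wrapper `test_replace_alpha` is a hand-written, hypothetical approximation of a
//! generated case with its file contents inlined; a real generated case would also carry
//! `#[test]`, and the macro's actual expansion may differ.
//!
//! ```rust
//! use std::str::Utf8Error;
//!
//! /// Criterion function, following the first example.
//! pub fn test_replace(input: &[u8], expected: &[u8]) -> Result<(), Utf8Error> {
//!     let instr = std::str::from_utf8(input)?;
//!     let expstr = std::str::from_utf8(expected)?;
//!     assert_eq!(expstr, instr.replace(' ', "_"));
//!     Ok(())
//! }
//!
//! /// Hypothetical shape of one case: same return type, result passed through unaltered.
//! pub fn test_replace_alpha() -> Result<(), Utf8Error> {
//!     test_replace(b"this is alpha", b"this_is_alpha")
//! }
//! ```
//!
//! As with any unit test, an `Err` return value fails the case just as a panic does.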

mod utf8str;

pub use self::utf8str::Utf8Str;
pub use test_vectors_macro::test_vectors;