//! A tutorial about [fuzz testing](https://rust-fuzz.github.io/book/introduction.html) ufotofu-related code.
//!
//! This tutorial assumes that you already know what fuzz testing is, why you would want to do it, and how to do [structure-aware fuzzing](https://rust-fuzz.github.io/book/cargo-fuzz/structure-aware-fuzzing.html) with the `cargo fuzz` utility.
//!
//! Note that all functionality described here is available only when you enable the `dev` feature of ufotofu.
//!
//! ## Obtaining Arbitrary Producers and Consumers
//!
//! Often you might find yourself coding up some functionality that should work for arbitrary producers or consumers. Take, for example, the following function, which counts how many items a bulk producer produces:
//!
//! ```
//! # use ufotofu::prelude::*;
//! async fn bulk_count<P: BulkProducer>(p: &mut P) -> usize {
//!     let mut counted = 0;
//!
//!     loop {
//!         match p
//!             .expose_items(async |slots| {
//!                 counted += slots.len();
//!                 (slots.len(), ())
//!             })
//!             .await
//!         {
//!             Ok(Left(())) => { /* no-op, continue counting */ }
//!             _ => return counted,
//!         }
//!     }
//! }
//! ```
//!
//! There is an obvious way of checking this function for correctness: randomly generate some items, then create a bulk producer which produces those items, call `bulk_count` on it, and verify that `bulk_count` returns the number of items you created.
//!
//! The [`TestProducer`](crate::producer::TestProducer) type is a testing utility which lets you create (bulk) producers with configurable behaviour. Its [`Arbitrary`](arbitrary::Arbitrary) impl allows a fuzzer to create producers with arbitrary behaviour.
//!
//! The following fuzz test uses `TestProducer` to check the correctness of the `bulk_count` function:
//!
//! ```no_run
//! #![no_main]
//! use libfuzzer_sys::fuzz_target;
//! use ufotofu::prelude::*;
//!
//! # #[cfg(feature = "dev")] {
//! // Tell the fuzzer to generate `TestProducer`s with `u16` Items and `()` as Final and Error.
//! fuzz_target!(|data: TestProducer<u16, (), ()>| {
//!     let mut p = data;
//!
//!     // Obtain the number of items that will be produced.
//!     // (`p.as_slice()` returns a slice of all items which are yet to be produced.)
//!     let len = p.as_slice().len();
//!
//!     // The `pollster` crate lets you run async code in a sync closure.
//!     pollster::block_on(async {
//!         // Call our function on the TestProducer, crash if it behaves incorrectly.
//!         assert_eq!(bulk_count(&mut p).await, len);
//!     });
//! });
//! #
//! # async fn bulk_count<P: BulkProducer>(p: &mut P) -> usize {
//! #     let mut counted = 0;
//! #
//! #     loop {
//! #         match p
//! #             .expose_items(async |slots| {
//! #                 counted += slots.len();
//! #                 (slots.len(), ())
//! #             })
//! #             .await
//! #         {
//! #             Ok(Left(())) => { /* no-op, continue counting */ }
//! #             _ => return counted,
//! #         }
//! #     }
//! # }
//! # }
//! ```
//!
//! The counterpart for generating arbitrary (bulk) *consumers* is the [`TestConsumer`](crate::consumer::TestConsumer). The following example shows our fuzz test for the [`pipe`](crate::pipe) function: we generate a random producer *and* a random consumer, and check that, after piping, all data ends up where it should.
//!
//! ```no_run
//! #![no_main]
//! use libfuzzer_sys::fuzz_target;
//! use ufotofu::{pipe, prelude::*, PipeError};
//!
//! # #[cfg(feature = "dev")] {
//! fuzz_target!(
//!     |data: (TestProducer<u16, u16, u16>, TestConsumer<u16, u16, u16>)| {
//!         pollster::block_on(async {
//!             let (mut pro, mut con) = data;
//!
//!             // Create a copy of the items that will be produced.
//!             let items = pro.as_slice().to_vec();
//!             // Create a copy of the last value to be produced (a `Result<Final, Error>`).
//!             let last = *pro.peek_last().unwrap();
//!
//!             // Pipe the producer into the consumer, then check the outcome.
//!             match pipe(&mut pro, &mut con).await {
//!                 // Neither producer nor consumer emitted an error.
//!                 Ok(()) => {
//!                     // The producer emitted its final value.
//!                     assert!(pro.did_already_emit_last());
//!
//!                     // The slice of consumed items matches the slice of produced items.
//!                     assert_eq!(con.as_slice(), pro.already_produced());
//!
//!                     // The final value of the producer was used to close the consumer.
//!                     assert_eq!(con.peek_final(), Some(&last.unwrap()));
//!                 }
//!                 // The consumer emitted an error before the producer did.
//!                 Err(PipeError::Consumer(_err)) => {
//!                     // The consumer truly did emit its error already.
//!                     assert!(con.did_already_error());
//!
//!                     // The slice of consumed items matches the slice of produced items,
//!                     // except the last one is missing iff the error did not occur on closing.
//!                     if pro.did_already_emit_last() {
//!                         assert_eq!(con.as_slice(), pro.already_produced());
//!                     } else {
//!                         let len_prod = pro.already_produced().len();
//!                         assert_eq!(con.as_slice(), &pro.already_produced()[..len_prod - 1]);
//!                     }
//!                 }
//!                 // The producer emitted an error before the consumer did.
//!                 Err(PipeError::Producer(_err)) => {
//!                     // The producer truly did emit its error already, and the consumer did not.
//!                     assert!(pro.did_already_emit_last());
//!                     assert!(!con.did_already_error());
//!
//!                     // The slice of consumed items matches the slice of produced items.
//!                     assert_eq!(con.as_slice(), &items[..con.as_slice().len()]);
//!                     assert_eq!(pro.as_slice(), &items[con.as_slice().len()..]);
//!                 }
//!             }
//!         });
//!     }
//! );
//! # }
//! ```
//!
//! Note that the `Arbitrary` impls of `TestProducer` and `TestConsumer` not only create random sequences and errors, they also create random patterns of how large the slices exposed in bulk processing are, and they will make the trait methods randomly yield back to the executor. The resulting tests are pretty thorough!
//!
//! ## Testing Producers and Consumers
//!
//! Testing code that works with arbitrary producers and consumers is one thing, but often you have implemented a concrete bulk producer or bulk consumer yourself, and would like to verify its correctness. In particular, it should function correctly even under the most bizarre combinations of individual operations, bulk operations, and flush/slurp calls. This is where [`BulkProducerExt::to_bulk_scrambled`](crate::producer::BulkProducerExt::to_bulk_scrambled) and [`BulkConsumerExt::to_bulk_scrambled`](crate::consumer::BulkConsumerExt::to_bulk_scrambled) enter the picture.
//!
//! These methods allow you to wrap any bulk processor to obtain a new one. The new processor produces or consumes the exact same sequences, but it interacts with the wrapped processor according to a set pattern. And by letting a fuzzer generate this pattern, we can have it explore all sorts of interesting corner cases.
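//!
//! The idea of a fuzzer-supplied access pattern can be sketched without ufotofu. The following std-only model is an illustration, not the library's API: the `Op` enum and the `scrambled_chunks` function are hypothetical names invented for this sketch, and the real operation types (such as `BulkProducerOperation`) may look quite different. It shows the property that scrambler-based tests rely on: whatever the pattern, the same items come out in the same order, only the chunking differs.
//!
//! ```rust
//! /// A hypothetical step of an access pattern (not a ufotofu type).
//! #[derive(Clone, Copy, Debug)]
//! enum Op {
//!     /// Request a single item.
//!     Single,
//!     /// Request up to this many items at once.
//!     Bulk(usize),
//! }
//!
//! /// Reads all of `source` by cycling through `ops`, returning the chunks in the
//! /// order in which they were obtained.
//! fn scrambled_chunks(source: &[u16], ops: &[Op]) -> Vec<Vec<u16>> {
//!     let mut chunks = Vec::new();
//!     let mut pos = 0;
//!     let mut step = 0;
//!     while pos < source.len() {
//!         let want = match ops.get(step % ops.len().max(1)) {
//!             // An empty pattern falls back to single-item requests.
//!             Some(Op::Single) | None => 1,
//!             // A zero-sized request is bumped to one so that the loop makes progress.
//!             Some(Op::Bulk(n)) => (*n).max(1),
//!         };
//!         let end = pos.saturating_add(want).min(source.len());
//!         chunks.push(source[pos..end].to_vec());
//!         pos = end;
//!         step += 1;
//!     }
//!     chunks
//! }
//! ```
//!
//! With the pattern `[Op::Bulk(2), Op::Single]`, the input `[1, 2, 3, 4, 5]` is read as the chunks `[1, 2]`, `[3]`, `[4, 5]`; concatenating the chunks gives back the input for every pattern.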
//!
//! Take, as a silly example, the following bulk producer for repeating an item exactly 17 times:
//!
//! ```no_run
//! use ufotofu::prelude::*;
//!
//! struct SeventeenTimes<T> {
//!     already_emitted: usize,
//!     items: [T; 17],
//! }
//!
//! fn seventeen_times<T: Clone>(item: T) -> SeventeenTimes<T> {
//!     SeventeenTimes {
//!         already_emitted: 0,
//!         items: core::array::from_fn(|_| item.clone()),
//!     }
//! }
//!
//! impl<T: Clone> Producer for SeventeenTimes<T> {
//!     type Item = T;
//!     type Final = ();
//!     type Error = Infallible;
//!
//!     async fn produce(&mut self) -> Result<Either<Self::Item, Self::Final>, Self::Error> {
//!         if self.already_emitted == 17 {
//!             return Ok(Right(()));
//!         } else {
//!             self.already_emitted += 1;
//!             return Ok(Left(self.items[0].clone()));
//!         }
//!     }
//!
//!     async fn slurp(&mut self) -> Result<(), Self::Error> {
//!         return Ok(());
//!     }
//! }
//!
//! impl<T: Clone> BulkProducer for SeventeenTimes<T> {
//!     async fn expose_items<F, R>(&mut self, f: F) -> Result<Either<R, Self::Final>, Self::Error>
//!     where
//!         F: AsyncFnOnce(&[Self::Item]) -> (usize, R),
//!     {
//!         if self.already_emitted == 17 {
//!             return Ok(Right(()));
//!         } else {
//!             let (amount, ret) = f(&self.items[self.already_emitted..]).await;
//!             self.already_emitted += amount;
//!             return Ok(Left(ret));
//!         }
//!     }
//! }
//! ```
//!
//! The following short fuzz test ensures that it operates correctly:
//!
//! ```no_run
//! #![no_main]
//! use libfuzzer_sys::fuzz_target;
//! use ufotofu::{prelude::*, queues::new_fixed};
//!
//! # #[cfg(feature = "dev")] {
//! // Tell the fuzzer to generate a random item to repeat, and
//! // the random data we need to create a scrambled version of `SeventeenTimes`.
//! fuzz_target!(|data: (u16, usize, Vec<BulkProducerOperation>)| {
//!     pollster::block_on(async {
//!         let (item, mut buffer_size, ops) = data;
//!         // Ensure the scrambler uses a non-empty but not memory-shattering internal buffer.
//!         buffer_size = buffer_size.clamp(1, 4096);
//!
//!         // Create the producer-under-test...
//!         let pro = seventeen_times(item);
//!         // ...and wrap it in a scrambler.
//!         let mut scrambled = pro.to_bulk_scrambled(new_fixed(buffer_size), ops);
//!
//!         // Interact with the scrambled version in a simple way, the scrambler then
//!         // exercises the producer-under-test according to the access pattern supplied
//!         // by the fuzzer.
//!         for _ in 0..17 {
//!             assert_eq!(scrambled.produce().await, Ok(Left(item)));
//!         }
//!         assert_eq!(scrambled.produce().await, Ok(Right(())));
//!     });
//! });
//! # struct SeventeenTimes<T> {
//! #     already_emitted: usize,
//! #     items: [T; 17],
//! # }
//! #
//! # fn seventeen_times<T: Clone>(item: T) -> SeventeenTimes<T> {
//! #     SeventeenTimes {
//! #         already_emitted: 0,
//! #         items: core::array::from_fn(|_| item.clone()),
//! #     }
//! # }
//! #
//! # impl<T: Clone> Producer for SeventeenTimes<T> {
//! #     type Item = T;
//! #     type Final = ();
//! #     type Error = Infallible;
//! #
//! #     async fn produce(&mut self) -> Result<Either<Self::Item, Self::Final>, Self::Error> {
//! #         if self.already_emitted == 17 {
//! #             return Ok(Right(()));
//! #         } else {
//! #             self.already_emitted += 1;
//! #             return Ok(Left(self.items[0].clone()));
//! #         }
//! #     }
//! #
//! #     async fn slurp(&mut self) -> Result<(), Self::Error> {
//! #         return Ok(());
//! #     }
//! # }
//! #
//! # impl<T: Clone> BulkProducer for SeventeenTimes<T> {
//! #     async fn expose_items<F, R>(&mut self, f: F) -> Result<Either<R, Self::Final>, Self::Error>
//! #     where
//! #         F: AsyncFnOnce(&[Self::Item]) -> (usize, R),
//! #     {
//! #         if self.already_emitted == 17 {
//! #             return Ok(Right(()));
//! #         } else {
//! #             let (amount, ret) = f(&self.items[self.already_emitted..]).await;
//! #             self.already_emitted += amount;
//! #             return Ok(Left(ret));
//! #         }
//! #     }
//! # }
//! # }
//! ```
//!
//! Although the test code simply calls [`produce`](crate::Producer::produce) repeatedly, the wrapped `SeventeenTimes` producer is subjected to whichever bizarre access pattern the fuzzer has dreamed up.
//!
//! The following example demonstrates not only scrambling of bulk *consumers*, but it also shows a common technique for fuzzing adaptors: we employ both a `TestConsumer` (as the consumer being adapted) and scrambling (to exercise the adaptor under arbitrary usage patterns). Note that the scrambler for bulk consumers may internally buffer items before forwarding them to the wrapped consumer, so you may have to `flush` it before seeing the final results (demonstrated in the consumer-error-handling case of the example).
//!
//! ```no_run
//! #![no_main]
//! use libfuzzer_sys::fuzz_target;
//! use ufotofu::{prelude::*, queues::new_fixed};
//!
//! # #[cfg(feature = "dev")] {
//! // Generate a TestConsumer, buffer it with the `to_bulk_buffered` method, scramble *that*
//! // with `to_bulk_scrambled`, feed a sequence of values into it, and test that the buffered
//! // wrapper forwarded the same items to the underlying TestConsumer as a control
//! // consumer received.
//! fuzz_target!(|data: (
//!     TestConsumer<u16, u16, u16>,
//!     usize,
//!     usize,
//!     Vec<BulkConsumerOperation>,
//!     Box<[u16]>,
//!     u16
//! )| {
//!     pollster::block_on(async {
//!         let (con, mut buffered_buffer_size, mut scrambler_buffer_size, ops, input, fin) = data;
//!         // Clamp the sizes of internal buffers to ensure non-empty, not-too-large queues.
//!         buffered_buffer_size = buffered_buffer_size.clamp(1, 4096);
//!         scrambler_buffer_size = scrambler_buffer_size.clamp(1, 4096);
//!
//!         // A plain consumer; our buffered, scrambled version should behave just like this one.
//!         let mut control = con.clone();
//!
//!         // The consumer we want to test. We will feed equal sequences to both the control
//!         // consumer and this one, asserting that the consumer to which we added buffering
//!         // ultimately receives the same data as the non-buffered one.
//!         let consumer_under_test = con.to_bulk_buffered(new_fixed(buffered_buffer_size));
//!
//!         // And we scramble the buffered consumer, to exercise all usage patterns.
//!         let mut scrambled =
//!             consumer_under_test.to_bulk_scrambled(new_fixed(scrambler_buffer_size), ops);
//!
//!         for x in &input {
//!             match control.consume_item(*x).await {
//!                 Ok(()) => {
//!                     // Whenever the control consumer successfully consumes an item, we also
//!                     // feed that item into the buffered consumer.
//!                     assert!(scrambled.consume_item(*x).await.is_ok());
//!                 }
//!                 Err(control_err) => {
//!                     // When the control consumer errors, we want to ensure that the buffered
//!                     // consumer also errors. Due to buffering, we might need a flush to trigger
//!                     // the error.
//!                     let scrambled_err = match scrambled.consume_item(*x).await {
//!                         Err(err) => err,
//!                         Ok(()) => scrambled.flush().await.expect_err(
//!                             "control consumer errored, so the buffered consumer must also error",
//!                             // (or the buffered consumer is buggy, which is what we test for)
//!                         ),
//!                     };
//!
//!                     // After retrieving the error of the buffered consumer, we check that it is
//!                     // the correct error, and that before the error, equal sequences were
//!                     // consumed by both the control and the wrapped consumer.
//!                     assert_eq!(control_err, scrambled_err);
//!                     assert_eq!(control, scrambled.into_inner().into_inner());
//!                     return;
//!                 }
//!             }
//!         }
//!
//!         // We consumed all items without error.
//!         // Check that closing yields equal results.
//!         assert_eq!(control.consume_final(fin).await, scrambled.consume_final(fin).await);
//!
//!         // Finally assert that the scrambled and buffered consumer ultimately received equal
//!         // sequences of values.
//!         assert_eq!(control, scrambled.into_inner().into_inner());
//!     });
//! });
//! # }
//! ```
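//!
//! The need for an explicit `flush` before inspecting results is not specific to ufotofu. As an analogy only, the following sketch uses `std::io::BufWriter` rather than any ufotofu type (the function name `visible_before_and_after_flush` is made up for this illustration) to show buffered data reaching the wrapped sink only once it is flushed:
//!
//! ```rust
//! use std::io::{BufWriter, Write};
//!
//! /// Writes `data` through a 64-byte buffer into a `Vec`, returning how many bytes
//! /// the `Vec` holds before and after flushing the buffer.
//! fn visible_before_and_after_flush(data: &[u8]) -> (usize, usize) {
//!     let mut sink: Vec<u8> = Vec::new();
//!     let mut buffered = BufWriter::with_capacity(64, &mut sink);
//!
//!     buffered.write_all(data).expect("writing to a Vec does not fail");
//!     let before = buffered.get_ref().len();
//!
//!     buffered.flush().expect("flushing into a Vec does not fail");
//!     let after = buffered.get_ref().len();
//!
//!     (before, after)
//! }
//! ```
//!
//! For an input shorter than the buffer, say ten bytes, this returns `(0, 10)`: nothing is visible in the sink until the flush. Writes that do not fit into the buffer may be passed through earlier, which is exactly why a test should flush rather than guess.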
//!
//! ## Scrambling Non-Bulk Processors
//!
//! Analogously to [`BulkProducerExt::to_bulk_scrambled`](crate::producer::BulkProducerExt::to_bulk_scrambled) and [`BulkConsumerExt::to_bulk_scrambled`](crate::consumer::BulkConsumerExt::to_bulk_scrambled), there are also scramblers for non-bulk processors: [`ProducerExt::to_scrambled`](crate::producer::ProducerExt::to_scrambled) and [`ConsumerExt::to_scrambled`](crate::consumer::ConsumerExt::to_scrambled). These do fairly little beyond injecting the occasional call to `flush` or `slurp`, but it cannot hurt to be thorough in testing.
//!
//! The following example tests that the [`IntoConsumer`](crate::IntoConsumer) and [`IntoProducer`](crate::IntoProducer) impls for [`BTreeSet`](std::collections::BTreeSet) commute irrespective of any flushing or slurping:
//!
//! ```no_run
//! #![no_main]
//!
//! use std::collections::BTreeSet;
//!
//! use libfuzzer_sys::fuzz_target;
//! use ufotofu::{prelude::*, queues::new_fixed};
//!
//! # #[cfg(feature = "dev")] {
//! // Generate a random btree set, call into_producer, scramble that producer.
//! // Then use IntoConsumer on the empty btree set, scramble the consumer, and
//! // check that the original set is being reconstructed.
//!
//! fuzz_target!(|data: (
//!     BTreeSet<u16>,
//!     usize,
//!     Vec<ConsumerOperation>,
//!     usize,
//!     Vec<ProducerOperation>,
//! )| {
//!     pollster::block_on(async {
//!         let (input, mut con_buffer_size, con_ops, mut pro_buffer_size, pro_ops) = data;
//!         con_buffer_size = con_buffer_size.clamp(1, 4096);
//!         pro_buffer_size = pro_buffer_size.clamp(1, 4096);
//!
//!         let input_copy = input.clone();
//!
//!         let mut pro = input
//!             .into_producer()
//!             .to_scrambled(new_fixed(pro_buffer_size), pro_ops);
//!
//!         let mut produced = vec![0; input_copy.len()];
//!         assert_eq!(pro.overwrite_full_slice(&mut produced[..]).await, Ok(()));
//!
//!         let mut con = BTreeSet::default()
//!             .into_consumer()
//!             .to_scrambled(new_fixed(con_buffer_size), con_ops);
//!
//!         assert_eq!(con.consume_full_slice(&produced[..]).await, Ok(()));
//!         con.flush().await.unwrap();
//!
//!         let consumed: BTreeSet<u16> = con.into_inner().into();
//!
//!         assert_eq!(consumed, input_copy);
//!     });
//! });
//! # }
//! ```
//!
//! ## Other
//!
//! The [`ProducerExt::equals`](crate::producer::ProducerExt::equals) method is convenient for asserting that some tested producer emits the same sequence as a control producer. Its boolean output is rather useless for debugging, however. The [`ProducerExt::equals_dbg`](crate::producer::ProducerExt::equals_dbg) method does the same as [`ProducerExt::equals`](crate::producer::ProducerExt::equals), but additionally logs the values produced by both producers.
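//!
//! Why a bare boolean is awkward to debug can be illustrated with plain iterators. The following is an analogy on std types only, not how `equals_dbg` is implemented, and `first_mismatch` is a hypothetical helper invented for this sketch: instead of answering `true` or `false`, it reports where two sequences diverge and which values were seen there.
//!
//! ```rust
//! /// Compares two sequences item by item. Returns `None` if they are equal, and
//! /// otherwise the index of the first difference together with both values found
//! /// there (`None` standing for a sequence that has already ended).
//! fn first_mismatch<T, A, B>(a: A, b: B) -> Option<(usize, Option<T>, Option<T>)>
//! where
//!     T: PartialEq,
//!     A: IntoIterator<Item = T>,
//!     B: IntoIterator<Item = T>,
//! {
//!     let mut a = a.into_iter();
//!     let mut b = b.into_iter();
//!     let mut index = 0;
//!     loop {
//!         match (a.next(), b.next()) {
//!             // Both sequences ended at the same time: they are equal.
//!             (None, None) => return None,
//!             // Same item on both sides: keep going.
//!             (x, y) if x == y => index += 1,
//!             // Different items, or one sequence ended early.
//!             (x, y) => return Some((index, x, y)),
//!         }
//!     }
//! }
//! ```
//!
//! Comparing `[1, 2, 3]` against `[1, 9, 3]` returns `Some((1, Some(2), Some(9)))`, which points straight at the offending position, whereas a boolean would only say that something, somewhere, differed.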