#![no_std]
#![warn(missing_docs)]

//! This crate offers two different ways to deserialize sequences without
//! allocating.
//!
//! # Example
//!
//! Given the following JSON:
//!
//! ```json
//! [
//!     {"id": 0, "name": "bob", "subscribed_to": ["rust", "knitting", "cooking"]},
//!     {"id": 1, "name": "toby 🐶", "subscribed_to": ["sticks", "tennis-balls"]},
//!     {"id": 2, "name": "alice", "subscribed_to": ["rust", "hiking", "paris"]},
//!     {"id": 3, "name": "mark", "subscribed_to": ["rust", "rugby", "doctor-who"]},
//!     {"id": 4, "name": "vera", "subscribed_to": ["rust", "mma", "philosophy"]}
//! ]
//! ```
//! we can process it without allocating a vector of 5 items, as follows:
//!
//! ```rust
//! use serde_deser_iter::top_level::DeserializerExt;
//! # use std::{fs::File, io::BufReader, path::PathBuf, collections::HashSet};
//! #
//! /// The type each item in the sequence will be deserialized to.
//! #[derive(serde::Deserialize)]
//! struct DataEntry {
//!     // Not all fields are needed, but we could add "name"
//!     // and "id".
//!     subscribed_to: Vec<String>,
//! }
//!
//! fn main() -> anyhow::Result<()> {
//!     #
//!     # let example_json_path: PathBuf = [env!("CARGO_MANIFEST_DIR"), "examples", "top_level_data.json"]
//!     #   .iter()
//!     #   .collect();
//!     let buffered_file: BufReader<File> = BufReader::new(File::open(example_json_path)?);
//!     let mut json_deserializer = serde_json::Deserializer::from_reader(buffered_file);
//!     let mut all_channels = HashSet::new();
//!
//!     json_deserializer.for_each(|entry: DataEntry| all_channels.extend(entry.subscribed_to))?;
//!     println!("All existing channels:");
//!     for channel in all_channels {
//!         println!("  - {channel}")
//!     }
//!     Ok(())
//! }
//! ```
//!
//! # Top-level vs deep
//!
//! ## Top-level
//!
//! The [`top_level`] module offers the most user-friendly and powerful way to
//! deserialize sequences. However, it is restricted to sequences defined at
//! the top level. For example, it can work on each `{"name": ...}` from the following JSON:
//!
//! ```json
//! [
//!     {"name": "object1"},
//!     {"name": "object2"},
//!     {"name": "object3"}
//! ]
//! ```
//!
//! but not if they are deeper in the structure:
//!
//! ```json
//! {
//!     "result": [
//!         {"name": "object1"},
//!         {"name": "object2"},
//!         {"name": "object3"}
//!     ]
//! }
//! ```
//!
//! ## Deep
//!
//! The [`deep`] module allows working on sequences located at any depth
//! (and even on nested ones, though cumbersomely). However, it does not allow
//! running closures on the iterated items, only functions, and its interface is
//! less intuitive than that of [`top_level`].
//!
//! # Early returns
//!
//! **Caution.** In case of an early return from the aggregating function,
//! all remaining items will still be deserialized (but discarded immediately).
//! This is because the format deserializers expect to have consumed the whole
//! sequence before continuing.
//!
//! # FAQ
//!
//! ## Is this really iteration?
//!
//! This crate arguably offers a form of internal iteration, as opposed to
//! the external iteration offered by Rust's `Iterator` trait. See this blog post
//! [section](https://without.boats/blog/why-async-rust/index.html#iterators) for
//! more details.
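//!
//! To make the distinction concrete, here is a minimal sketch that uses only
//! standard-library slices, not this crate's API. The helper names
//! `sum_external` and `sum_internal` are hypothetical and exist only for
//! illustration.
//!
//! ```rust
//! /// External iteration: the caller drives the loop by pulling items
//! /// one at a time, and may stop whenever it wants.
//! fn sum_external(items: &[u32]) -> u32 {
//!     let mut iter = items.iter();
//!     let mut total = 0;
//!     while let Some(item) = iter.next() {
//!         total += item;
//!     }
//!     total
//! }
//!
//! /// Internal iteration: the producer drives the loop and calls the
//! /// caller's closure once per item.
//! fn sum_internal(items: &[u32]) -> u32 {
//!     let mut total = 0;
//!     items.iter().for_each(|item| total += item);
//!     total
//! }
//!
//! fn main() {
//!     let items = [1, 2, 3];
//!     assert_eq!(sum_external(&items), 6);
//!     assert_eq!(sum_internal(&items), 6);
//! }
//! ```
//!
//! In the second form the caller never holds an iterator it can abandon
//! midway, which is the shape that a deserializer-driven traversal imposes.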
//!
//! ## I don't understand how to use your crate to parse JSONL (one JSON object per line)
//!
//! That's because you can't. Parsing a file containing a sequence of well-formatted
//! serialized values separated by whitespace needs to be done by the format deserializer.
//! For JSON, for example, use [`serde_json::StreamDeserializer`](https://docs.rs/serde_json/latest/serde_json/struct.StreamDeserializer.html).
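//!
//! As a rough sketch of that approach, the snippet below relies on
//! `serde_json` alone and does not involve this crate. The helper
//! `collect_ids` and the `"id"` field are assumptions made for the
//! example.
//!
//! ```rust
//! use serde_json::{Deserializer, Value};
//!
//! /// Hypothetical helper: reads whitespace-separated JSON values and
//! /// collects the `"id"` field of each one (`None` if absent or not a `u64`).
//! fn collect_ids(jsonl: &str) -> serde_json::Result<Vec<Option<u64>>> {
//!     // `into_iter` turns the deserializer into a `StreamDeserializer`,
//!     // which yields one `Result<Value>` per JSON value in the input.
//!     Deserializer::from_str(jsonl)
//!         .into_iter::<Value>()
//!         .map(|value| value.map(|v| v["id"].as_u64()))
//!         .collect()
//! }
//!
//! fn main() -> serde_json::Result<()> {
//!     let jsonl = "{\"id\": 0}\n{\"id\": 1}\n{\"id\": 2}\n";
//!     assert_eq!(collect_ids(jsonl)?, [Some(0), Some(1), Some(2)]);
//!     Ok(())
//! }
//! ```
//!
//! Unlike the sequences handled by this crate, the values here are not
//! wrapped in a single enclosing array, so each one is a separate top-level
//! document from the deserializer's point of view.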

pub mod deep;

pub mod top_level;