// url_cleaner_engine/tutorial/cleaner/control_flow/repeat.rs

//! # Repeat
//!
//! Sometimes, an action needs to be applied multiple times.
//!
//! This mainly happens when cleaning redirect URLs. If you handle `t.co` after `bit.ly`, then a `t.co` redirect going to a `bit.ly` redirect won't be fully expanded, because the `bit.ly` handling has already run by the time the `bit.ly` URL appears.
//!
//! Using [`Action::Repeat`] (and, for convenience, [maps]) allows for handling redirect chains of any length and in any order.
//!
//! ```Json
//! {"Repeat": {
//!   "actions": [
//!     {"PartMap": {
//!       "part": "Host",
//!       "map": {
//!         "t.co": "ExpandRedirect",
//!         "bit.ly": "ExpandRedirect",
//!         "youtube.com": {"RemoveQueryParam": "si"}
//!       }
//!     }}
//!   ]
//! }}
//! ```
//!
//! Let's say you give the above example a `t.co` URL that redirects to a `bit.ly` URL that redirects to a `youtube.com` URL with an `si` query parameter.
//!
//! 1. `Repeat` makes a backup of the current [`TaskState`] to later compare with the [`TaskState`] at the end of the iteration.
//!
//! 2. `Repeat` applies its contained actions once. Here it will expand the `t.co` URL into the intermediate `bit.ly` URL[^redirect_policy].
//!
//! [^redirect_policy]: Technically this is only true if the [`HttpClientConfig::redirect_policy`] is set to [`RedirectPolicy::None`], which for the default cleaner is usually true as it's set in the [`Params::http_client_config`] used by all HTTP requests.
//!
//! 3. Once all the contained actions are applied, `Repeat` will compare the modified [`TaskState`] with the backup made in step 1.
//!
//! 4. Because the two states are different, `Repeat` does steps 1 through 3 again.
//!
//! 5. At the start of the second iteration, `Repeat` once again makes a backup of the current task's state.
//!
//! 6. The second iteration expands the `bit.ly` URL into the `youtube.com` URL with an `si` query parameter.
//!
//! 7. As with step 4, the task's state has changed, and thus `Repeat` will run a third time.
//!
//! 8. The third iteration removes the `si` query parameter from the `youtube.com` URL. The state has changed again, so `Repeat` runs a fourth time.
//!
//! 9. On the fourth iteration, while the [`Action::RemoveQueryParam`] *is* applied, there is no `si` query parameter left to remove, so the end result is identical to the backup made at the start of that iteration. `Repeat` therefore exits and does not run a fifth time.
//!
//! By default, [`Action::Repeat`] applies its contained actions at most 10 times, but this limit can be increased up to [`u64::MAX`]. The limit is that high because I wanted to make a Turing Machine.
//!
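/*!
The walkthrough above, including the iteration limit, can be modeled with a small standalone sketch. This is only a toy model and not how the engine is actually implemented: `State`, `repeat`, `part_map`, `host`, and `remove_query_param` are hypothetical names, the state is just a URL string rather than a real [`TaskState`], and the `example` URLs are made-up placeholders. No HTTP requests are made.

```rust
// Hypothetical stand-in for the task state: only the URL, as a string.
type State = String;

// Apply `actions` until one iteration leaves the state unchanged or
// `limit` iterations have run. Returns how many iterations ran.
fn repeat(state: &mut State, actions: &[fn(&mut State)], limit: u64) -> u64 {
    let mut iterations = 0;
    while iterations < limit {
        let backup = state.clone(); // Back up the state before the iteration.
        for action in actions {
            action(state); // Apply every contained action once.
        }
        iterations += 1;
        if *state == backup {
            break; // Nothing changed, so another iteration would do nothing.
        }
    }
    iterations
}

// Extract the host from an `https://` URL (deliberately simplistic).
fn host(url: &str) -> &str {
    let rest = url.strip_prefix("https://").unwrap_or(url);
    rest.split(|c: char| c == '/' || c == '?').next().unwrap_or("")
}

// Remove every query parameter called `name` from the URL.
fn remove_query_param(url: &str, name: &str) -> String {
    let Some((base, query)) = url.split_once('?') else {
        return url.to_string();
    };
    let prefix = format!("{name}=");
    let kept: Vec<&str> = query
        .split('&')
        .filter(|pair| *pair != name && !pair.starts_with(prefix.as_str()))
        .collect();
    if kept.is_empty() {
        base.to_string()
    } else {
        format!("{base}?{}", kept.join("&"))
    }
}

// Toy version of the host map: hard-coded fake redirects instead of
// real redirect expansion, plus the `si` removal for `youtube.com`.
fn part_map(state: &mut State) {
    let next = match host(state) {
        "t.co" => "https://bit.ly/example".to_string(),
        "bit.ly" => "https://youtube.com/watch?v=example&si=example".to_string(),
        "youtube.com" => remove_query_param(state, "si"),
        _ => return,
    };
    *state = next;
}
```

In this model, starting from `https://t.co/example` with a limit of 10, `repeat` runs four iterations and leaves `https://youtube.com/watch?v=example`, matching the steps above. With a limit of 2 it stops after two iterations, while the `si` query parameter is still present.
*/
//!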
//! ## Reverted changes
//!
//! It's important to note that `Repeat` won't loop again if a URL is changed and the change is then reverted within the same iteration. It only cares whether the state at the end of an iteration is the same as the state at the start of that iteration.
//!
//! For example, assuming you don't intentionally give this cleaner a URL with an `unused_parameter` query parameter to make what I'm about to say wrong, this `Repeat` will only apply its actions once.
//!
//! ```Json
//! {
//!   "actions": [
//!     {"Repeat": {
//!       "actions": [
//!         {"SetQueryParam": {
//!           "query_param": "unused_parameter",
//!           "value": "whatever"
//!         }},
//!         {"RemoveQueryParam": "unused_parameter"}
//!       ]
//!     }}
//!   ]
//! }
//! ```
//!
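/*!
As a rough model of why the example above runs only once, here is a small standalone sketch. It is an assumption-laden toy, not the engine's code: `Query`, `set_then_remove`, and `count_iterations` are hypothetical names, and the state is reduced to a map of query parameters.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-in for the task state: only the query parameters.
type Query = BTreeMap<String, String>;

// Mirrors the two contained actions: set the parameter, then remove it.
fn set_then_remove(query: &mut Query) {
    query.insert("unused_parameter".to_string(), "whatever".to_string());
    query.remove("unused_parameter");
}

// Count how many iterations run before the state stops changing
// (or `limit` is reached).
fn count_iterations(query: &mut Query, limit: u64) -> u64 {
    let mut iterations = 0;
    while iterations < limit {
        let backup = query.clone();
        set_then_remove(query);
        iterations += 1;
        if *query == backup {
            break; // The end state equals the start state; stop looping.
        }
    }
    iterations
}
```

In this model, a query without `unused_parameter` gives one iteration, because the set and the removal cancel out before the comparison happens. A query that already contains `unused_parameter` gives two: the first iteration really does change the state by removing it, and the second one changes nothing.
*/
//!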
//! To validate this intuition, see the [global debugging](debugging#global) tutorial.

pub(crate) use super::*;
