§Repeat
Sometimes, an action needs to be applied multiple times.
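This mainly happens when cleaning redirect URLs. If you handle t.co after bit.ly in a single pass, then a t.co redirect going to a bit.ly redirect won’t be fully expanded, because the bit.ly step has already run by the time the t.co step produces the bit.ly URL.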
Using Action::Repeat (and, for convenience, maps) allows for handling redirect chains of any length and in any order.
{"Repeat": {
  "actions": [
    {"PartMap": {
      "part": "Host",
      "map": {
        "t.co": "ExpandRedirect",
        "bit.ly": "ExpandRedirect",
        "youtube.com": {"RemoveQueryParam": "si"}
      }
    }}
  ]
}}
Let’s say you give the above example a t.co URL that redirects to a bit.ly URL that redirects to a youtube.com URL with an si query parameter.
1. Repeat makes a backup of the current TaskState to later compare with the TaskState at the end of the loop.
2. Repeat applies its contained actions once. Here it will expand the t.co URL into the intermediate bit.ly URL [1].
3. Once all the contained actions are applied, Repeat will compare the modified TaskState with the backup made in step 1.
4. Because the values are different, Repeat does steps 1 through 3 again.
5. At the start of the second iteration, Repeat once again makes a backup of the current task’s state.
6. The second iteration expands the bit.ly URL into the youtube.com URL with an si query parameter.
7. As with step 4, the task’s state has changed, and thus Repeat will run a third time.
8. The third iteration removes the si query parameter from the youtube.com URL. The state has changed again, so Repeat runs a fourth time.
9. On the fourth iteration, while the Action::RemoveQueryParam is applied, the end result is identical to the backup made at the start of that iteration, and therefore Repeat exits and does not run a fifth time.
By default, Action::Repeat applies its contained actions at most 10 times, but this limit can be increased up to u64::MAX. The limit is that high because I wanted to make a Turing Machine.
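The loop described above can be sketched in plain Rust. This is only an illustration under stated assumptions, not the crate's actual implementation: State, apply_once, host_of, remove_query_param and repeat are hypothetical stand-ins, the URLs are made up, and redirects are looked up in a fake table instead of being fetched over HTTP.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the real TaskState: just the URL as a string.
#[derive(Clone, Debug, PartialEq)]
struct State {
    url: String,
}

// Naive host extraction; assumes an "https://" URL. Illustration only.
fn host_of(url: &str) -> &str {
    let rest = url.strip_prefix("https://").unwrap_or(url);
    rest.split(|c: char| c == '/' || c == '?').next().unwrap_or("")
}

// Naive removal of every query pair whose key equals `name`.
fn remove_query_param(url: &mut String, name: &str) {
    let (base, query) = match url.split_once('?') {
        Some(parts) => parts,
        None => return,
    };
    let kept: Vec<&str> = query
        .split('&')
        .filter(|pair| pair.split('=').next() != Some(name))
        .collect();
    let new_url = if kept.is_empty() {
        base.to_string()
    } else {
        format!("{}?{}", base, kept.join("&"))
    };
    *url = new_url;
}

// One pass of the contained actions, mimicking the PartMap example.
// The redirect table is an assumption that replaces real HTTP requests.
fn apply_once(state: &mut State, redirects: &HashMap<&str, &str>) {
    let host = host_of(&state.url).to_string();
    match host.as_str() {
        "t.co" | "bit.ly" => {
            let target = redirects.get(state.url.as_str()).map(|t| t.to_string());
            if let Some(target) = target {
                state.url = target;
            }
        }
        "youtube.com" => remove_query_param(&mut state.url, "si"),
        _ => {}
    }
}

// Apply the actions until a pass leaves the state unchanged or the limit
// is reached. Returns how many passes were run.
fn repeat(state: &mut State, limit: u64, redirects: &HashMap<&str, &str>) -> u64 {
    let mut iterations = 0;
    while iterations < limit {
        let backup = state.clone(); // back up the state
        apply_once(state, redirects); // apply the contained actions once
        iterations += 1;
        if *state == backup {
            break; // nothing changed during this pass, so stop
        }
    }
    iterations
}

fn main() {
    let redirects = HashMap::from([
        ("https://t.co/abc", "https://bit.ly/xyz"),
        ("https://bit.ly/xyz", "https://youtube.com/watch?v=123&si=456"),
    ]);
    let mut state = State { url: "https://t.co/abc".to_string() };
    let iterations = repeat(&mut state, 10, &redirects);
    // prints "https://youtube.com/watch?v=123 after 4 iterations"
    println!("{} after {} iterations", state.url, iterations);
}
```

With a limit of 2 instead of 10, the same sketch stops at the youtube.com URL with the si parameter still attached, which is why the limit needs to be at least one more than the number of passes that change something.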
§Reverted changes
It’s important to note that Repeat won’t loop again if a URL is changed and the change is then reverted within the same iteration. It only cares whether the state at the end of an iteration is the same as the state at the start of that iteration.
For example, assuming you don’t intentionally give this cleaner a URL with an unused_parameter query parameter to make what I’m about to say wrong, this Repeat will only apply its actions once.
{
  "actions": [
    {"Repeat": {
      "actions": [
        {"SetQueryParam": {
          "query_param": "unused_parameter",
          "value": "whatever"
        }},
        {"RemoveQueryParam": "unused_parameter"}
      ]
    }}
  ]
}
To validate this intuition, see the global debugging tutorial.
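As a rough illustration of why the set-then-remove example runs only once, here is a small Rust sketch. It assumes a simplified model: State, set_query_param, remove_query_param, set_then_remove and repeat are hypothetical names, the query is stored as a list of key/value pairs, and none of this is the crate's real API.

```rust
// Hypothetical stand-in for the task state: only the query pairs of a URL.
#[derive(Clone, Debug, PartialEq)]
struct State {
    query: Vec<(String, String)>,
}

// Set `name` to `value`, replacing an existing pair or appending a new one.
fn set_query_param(state: &mut State, name: &str, value: &str) {
    if let Some(i) = state.query.iter().position(|(k, _)| k.as_str() == name) {
        state.query[i].1 = value.to_string();
    } else {
        state.query.push((name.to_string(), value.to_string()));
    }
}

// Remove every pair whose key is `name`.
fn remove_query_param(state: &mut State, name: &str) {
    state.query.retain(|(k, _)| k.as_str() != name);
}

// The two contained actions from the example, applied in order.
fn set_then_remove(state: &mut State) {
    set_query_param(state, "unused_parameter", "whatever");
    remove_query_param(state, "unused_parameter");
}

// Compare only the state before and after a whole pass; changes that are
// undone inside the pass are invisible to this check.
fn repeat(state: &mut State, limit: u64, actions: impl Fn(&mut State)) -> u64 {
    let mut iterations = 0;
    while iterations < limit {
        let backup = state.clone();
        actions(&mut *state);
        iterations += 1;
        if *state == backup {
            break;
        }
    }
    iterations
}

fn main() {
    let mut state = State {
        query: vec![("v".to_string(), "123".to_string())],
    };
    let iterations = repeat(&mut state, 10, set_then_remove);
    // prints "1"
    println!("{}", iterations);
}
```

If the starting state already contains unused_parameter, the first pass removes it, the state differs from the backup, and the sketch runs a second pass before stopping. That is the exception the text above jokes about.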
[1] Technically this is only true if the HttpClientConfig::redirect_policy is set to RedirectPolicy::None, which for the default cleaner is usually true as it’s set in the Params::http_client_config used by all HTTP requests.