stackblur_iter/iter.rs
1//! This module implements the [`StackBlur`] struct and the improved version of
2//! the Stackblur algorithm.
3//!
4//! ## The improved Stackblur algorithm
5//!
6//! As previously stated, this crate implements a modified version of Stackblur,
7//! and understanding the original algorithm is key to understanding the
8//! improvements that have been made to it.
9//!
10//! The original Stackblur is essentially a weighted box-blur. Instead of taking
11//! the average of a flat array of pixels, like this:
12//!
13//! ```text
14//! ( 00 + 01 + 02 + 03 + 04 + 05 + 06 ) / 7
15//! ```
16//!
17//! it takes a weighted average of the pixels:
18//!
19//! ```text
20//! ( 03 +
21//! 02 + 03 + 04 +
22//! 01 + 02 + 03 + 04 + 05 +
23//! 00 + 01 + 02 + 03 + 04 + 05 + 06 ) / 16
24//! ```
25//!
26//! This is a rough approximation of a Gaussian blur, and in fact it's already
27//! most of the way there to being a complete algorithm on its own. You can just
28//! multiply each pixel by something like `radius + 1 - center_dist` when you
29//! make your sum, then divide by `radius * (radius + 2) + 1`.
30//!
31//! But there are two problems with this:
32//!
33//! 1. That would be *O*(n(m * 2 + 1)²) complexity, where `n` is the number of
34//! pixels in the image, and `m` is the radius of the blur. This is basically
35//! just as expensive as running a proper convolution filter.
36//!
37//! 2. How do we handle pixels off the edge of the image?
38//!
39//! I'm scared of #1 so I'm going to handle #2 first. What most implementations
40//! choose to do is just repeat the edge of the image:
41//!
42//! ```text
43//! ( 00 +
44//! 00 + 00 + 01 +
45//! 00 + 00 + 00 + 01 + 02 +
46//! 00 + 00 + 00 + 00 + 01 + 02 + 03 ) / 16
47//! ```
48//!
49//! But this creates even more problems, because the edge of the blur will be
50//! quite biased towards the pixels that aren't even in the image. This is known
51//! as "edge bleeding", where a single pixel at the edge of a blur can cause
52//! very large and ugly artifacts to show up.
53//!
54//! The solution, of course, is to not calculate the denominator using that
55//! equation from earlier, and instead incrementally update the denominator as
56//! pixels are scanned, allowing you to sum a varying number of pixels:
57//!
58//! ```text
59//! ( 00 +
60//! 00 + 01 +
61//! 00 + 01 + 02 +
62//! 00 + 01 + 02 + 03 ) / 10
63//!
64//! ( 01 +
65//! 00 + 01 + 02 +
66//! 00 + 01 + 02 + 03 +
67//! 00 + 01 + 02 + 03 + 04 ) / 13
68//!
69//! ( 02 +
70//! 01 + 02 + 03 +
71//! 00 + 01 + 02 + 03 + 04 +
72//! 00 + 01 + 02 + 03 + 04 + 05 ) / 15
73//!
74//! ( 03 +
75//! 02 + 03 + 04 +
76//! 01 + 02 + 03 + 04 + 05 +
77//! 00 + 01 + 02 + 03 + 04 + 05 + 06 ) / 16
78//! ```
79//!
80//! This is one of the improvements made to the Stackblur algorithm that is
81//! implemented by this crate.
82//!
83//! Now for #1 - the complexity problem. It is possible to make a streaming
84//! Stackblur that does not need to read any pixels out of order or more than
85//! once, or even know how long the input is. In fact, that's exactly what this
86//! crate does.
87//!
88//! First you fill up the cache with `radius + 1` pixels in order to be able to
89//! make the first sum, then you start producing results until you run out of
90//! pixels, then you produce the last `radius` results using what you have in
91//! cache. I don't have the skill to explain the algorithm in full detail, but
92//! it's open-source so you can look at it if you want.
93//!
94//! In this crate the "cache" is not actually the actual heap of values, as that
95//! would be too slow. Instead, the "cache" is a list of changes to make to the
96//! rate of change of the sum, and the denominator is updated incrementally in
97//! response to the number of leading and trailing values changing.
98//!
99//! The reason the whole rate of change thing exists is so that adding to the
100//! sum and registering it in the cache becomes *O*(1) instead of *O*(2n+1)
101//! (where `n` is the radius). It's basically the most important thing that
102//! makes the algorithm constant-time.
103
104use std::collections::VecDeque;
105
106use crate::traits::StackBlurrable;
107
108/// An iterator that implements an improved Stackblur algorithm.
109///
110/// For any [`StackBlurrable`] element `T` and any iterator `I` over items of
111/// type `T`, [`StackBlur`] will yield the same amount of elements as `I`,
112/// blurred together according to the improved Stackblur algorithm as described
113/// by the [`iter`][crate::iter] module documentation.
114///
115/// ## Usage
116///
117/// [`StackBlur`] just wraps another iterator with a radius and a cache (called
118/// `ops` by [`StackBlur::new`]):
119///
120/// ```compile_fail
121/// # use std::collections::VecDeque;
122/// # use std::num::Wrapping;
123/// # use stackblur_iter::iter::StackBlur;
124/// #
125/// let arr = [255u8, 0, 0, 0, 127, 0, 0, 0, 255u8];
126/// let iter = arr.iter().copied().map(|i| Wrapping(i as usize));
127/// let blur = StackBlur::new(iter, 2, VecDeque::new());
128/// ```
129///
130/// That example unfortunately doesn't compile because `Wrapping<usize>` doesn't
131/// implement `Mul<usize>` and `Div<usize>`, which are required for the
132/// averaging that [`StackBlur`] performs. It's recommended to create a newtype
133/// wrapper around whatever you plan on blurring, and implement all the traits
134/// required by [`StackBlurrable`].
135///
136/// A [`StackBlur`] always yields exactly as many items as its inner iterator
137/// does. Additionally, a non-fused iterator which repeats will cause the
138/// [`StackBlur`] to repeat as well.
139///
140/// After using the [`StackBlur`], you can retrieve the [`VecDeque`] back out
141/// of it by calling [`StackBlur::into_ops`].
142pub struct StackBlur<T: StackBlurrable, I: Iterator<Item = T>> {
143 iter: I,
144 radius: usize,
145 sum: T,
146 rate: T,
147 dnom: usize,
148 ops: VecDeque<T>,
149 leading: usize,
150 trailing: usize,
151 done: bool,
152 first: bool
153}
154
155impl<T: StackBlurrable, I: Iterator<Item = T>> StackBlur<T, I> {
156 /// Creates a new [`StackBlur`] from the provided iterator, radius, and
157 /// [`VecDeque`].
158 ///
159 /// The iterator is not advanced until a call to [`StackBlur::next`].
160 pub fn new(iter: I, radius: usize, ops: VecDeque<T>) -> Self {
161 Self {
162 iter,
163 radius,
164 sum: T::default(),
165 rate: T::default(),
166 dnom: 0,
167 ops,
168 leading: 0,
169 trailing: 0,
170 done: true,
171 first: true
172 }
173 }
174
175 /// Consumes this [`StackBlur`] and returns the inner [`VecDeque`].
176 pub fn into_ops(self) -> VecDeque<T> {
177 self.ops
178 }
179
180 fn init(&mut self) {
181 self.done = false;
182
183 for sub in 0..=self.radius {
184 let item = match self.iter.next() {
185 Some(item) => item,
186 None => break
187 };
188
189 if sub == 0 {
190 let start = self.radius + 1;
191 let needed = start * 2 + 2;
192 self.ops.reserve(needed.saturating_sub(self.ops.capacity()));
193 self.ops.iter_mut().take(start).for_each(|place| *place = T::default());
194 self.ops.resize_with(start, T::default);
195
196 self.sum = T::default();
197 self.rate = T::default();
198 self.dnom = 0;
199 self.leading = 0;
200 self.trailing = 0;
201 }
202
203 let mul = self.radius + 1 - sub;
204 self.sum += item.clone() * mul;
205 self.rate += item.clone();
206 self.dnom += mul;
207
208 if self.dnom > mul {
209 self.trailing += 1;
210 }
211
212 self.ops[sub] -= item.clone() * 2;
213 self.ops.push_back(item);
214 }
215
216 if self.dnom == 0 {
217 self.done = true;
218 }
219 }
220}
221
222impl<T: StackBlurrable, I: Iterator<Item = T>> Iterator for StackBlur<T, I> {
223 type Item = T;
224
225 #[inline]
226 fn next(&mut self) -> Option<Self::Item> {
227 if self.done {
228 self.init();
229
230 if !std::mem::replace(&mut self.first, false) {
231 return None;
232 }
233 }
234
235 let result = self.sum.clone() / self.dnom;
236
237 self.rate += self.ops.pop_front().unwrap();
238 self.sum += self.rate.clone();
239
240 if self.leading < self.radius {
241 self.leading += 1;
242 self.dnom += self.radius + 1 - self.leading;
243 }
244
245 if self.radius == 0 || self.trailing == self.radius {
246 if let Some(item) = self.iter.next() {
247 self.sum += item.clone();
248 self.rate += item.clone();
249 self.ops[self.radius] -= item.clone() * 2;
250 self.ops.push_back(item);
251 } else if self.radius > 0 {
252 self.dnom -= self.radius + 1 - self.trailing;
253 self.trailing -= 1;
254 } else {
255 self.done = true;
256 }
257 } else if self.trailing > 0 {
258 self.dnom -= self.radius + 1 - self.trailing;
259 self.trailing -= 1;
260 } else if self.trailing == 0 {
261 self.done = true;
262 }
263
264 Some(result)
265 }
266}