stackblur_iter/
iter.rs

1//! This module implements the [`StackBlur`] struct and the improved version of
2//! the Stackblur algorithm.
3//!
4//! ## The improved Stackblur algorithm
5//!
6//! As previously stated, this crate implements a modified version of Stackblur,
7//! and understanding the original algorithm is key to understanding the
8//! improvements that have been made to it.
9//!
10//! The original Stackblur is essentially a weighted box-blur. Instead of taking
11//! the average of a flat array of pixels, like this:
12//!
13//! ```text
14//! ( 00 + 01 + 02 + 03 + 04 + 05 + 06 ) / 7
15//! ```
16//!
17//! it takes a weighted average of the pixels:
18//!
19//! ```text
20//!              ( 03 +
21//!           02 + 03 + 04 +
22//!      01 + 02 + 03 + 04 + 05 +
23//! 00 + 01 + 02 + 03 + 04 + 05 + 06 ) / 16
24//! ```
25//!
26//! This is a rough approximation of a Gaussian blur, and in fact it's already
27//! most of the way there to being a complete algorithm on its own. You can just
28//! multiply each pixel by something like `radius + 1 - center_dist` when you
29//! make your sum, then divide by `radius * (radius + 2) + 1`.
30//!
31//! But there are two problems with this:
32//!
33//! 1. That would be *O*(n(m * 2 + 1)²) complexity, where `n` is the number of
34//!    pixels in the image, and `m` is the radius of the blur. This is basically
35//!    just as expensive as running a proper convolution filter.
36//!
37//! 2. How do we handle pixels off the edge of the image?
38//!
39//! I'm scared of #1 so I'm going to handle #2 first. What most implementations
40//! choose to do is just repeat the edge of the image:
41//!
42//! ```text
43//!              ( 00 +
44//!           00 + 00 + 01 +
45//!      00 + 00 + 00 + 01 + 02 +
46//! 00 + 00 + 00 + 00 + 01 + 02 + 03 ) / 16
47//! ```
48//!
49//! But this creates even more problems, because the edge of the blur will be
50//! quite biased towards the pixels that aren't even in the image. This is known
51//! as "edge bleeding", where a single pixel at the edge of a blur can cause
52//! very large and ugly artifacts to show up.
53//!
54//! The solution, of course, is to not calculate the denominator using that
55//! equation from earlier, and instead incrementally update the denominator as
56//! pixels are scanned, allowing you to sum a varying number of pixels:
57//!
58//! ```text
59//! ( 00 +
60//!   00 + 01 +
61//!   00 + 01 + 02 +
62//!   00 + 01 + 02 + 03 ) / 10
63//!
64//!      ( 01 +
65//!   00 + 01 + 02 +
66//!   00 + 01 + 02 + 03 +
67//!   00 + 01 + 02 + 03 + 04 ) / 13
68//!
69//!           ( 02 +
70//!        01 + 02 + 03 +
71//!   00 + 01 + 02 + 03 + 04 +
72//!   00 + 01 + 02 + 03 + 04 + 05 ) / 15
73//!
74//!                ( 03 +
75//!             02 + 03 + 04 +
76//!        01 + 02 + 03 + 04 + 05 +
77//!   00 + 01 + 02 + 03 + 04 + 05 + 06 ) / 16
78//! ```
79//!
80//! This is one of the improvements made to the Stackblur algorithm that is
81//! implemented by this crate.
82//!
83//! Now for #1 - the complexity problem. It is possible to make a streaming
84//! Stackblur that does not need to read any pixels out of order or more than
85//! once, or even know how long the input is. In fact, that's exactly what this
86//! crate does.
87//!
88//! First you fill up the cache with `radius + 1` pixels in order to be able to
89//! make the first sum, then you start producing results until you run out of
90//! pixels, then you produce the last `radius` results using what you have in
91//! cache. I don't have the skill to explain the algorithm in full detail, but
92//! it's open-source so you can look at it if you want.
93//!
94//! In this crate the "cache" is not actually the actual heap of values, as that
95//! would be too slow. Instead, the "cache" is a list of changes to make to the
96//! rate of change of the sum, and the denominator is updated incrementally in
97//! response to the number of leading and trailing values changing.
98//!
99//! The reason the whole rate of change thing exists is so that adding to the
100//! sum and registering it in the cache becomes *O*(1) instead of *O*(2n+1)
101//! (where `n` is the radius). It's basically the most important thing that
102//! makes the algorithm constant-time.
103
104use std::collections::VecDeque;
105
106use crate::traits::StackBlurrable;
107
108/// An iterator that implements an improved Stackblur algorithm.
109///
110/// For any [`StackBlurrable`] element `T` and any iterator `I` over items of
111/// type `T`, [`StackBlur`] will yield the same amount of elements as `I`,
112/// blurred together according to the improved Stackblur algorithm as described
113/// by the [`iter`][crate::iter] module documentation.
114///
115/// ## Usage
116///
117/// [`StackBlur`] just wraps another iterator with a radius and a cache (called
118/// `ops` by [`StackBlur::new`]):
119///
120/// ```compile_fail
121/// # use std::collections::VecDeque;
122/// # use std::num::Wrapping;
123/// # use stackblur_iter::iter::StackBlur;
124/// #
125/// let arr = [255u8, 0, 0, 0, 127, 0, 0, 0, 255u8];
126/// let iter = arr.iter().copied().map(|i| Wrapping(i as usize));
127/// let blur = StackBlur::new(iter, 2, VecDeque::new());
128/// ```
129///
130/// That example unfortunately doesn't compile because `Wrapping<usize>` doesn't
131/// implement `Mul<usize>` and `Div<usize>`, which are required for the
132/// averaging that [`StackBlur`] performs. It's recommended to create a newtype
133/// wrapper around whatever you plan on blurring, and implement all the traits
134/// required by [`StackBlurrable`].
135///
136/// A [`StackBlur`] always yields exactly as many items as its inner iterator
137/// does. Additionally, a non-fused iterator which repeats will cause the
138/// [`StackBlur`] to repeat as well.
139///
140/// After using the [`StackBlur`], you can retrieve the [`VecDeque`] back out
141/// of it by calling [`StackBlur::into_ops`].
142pub struct StackBlur<T: StackBlurrable, I: Iterator<Item = T>> {
143	iter: I,
144	radius: usize,
145	sum: T,
146	rate: T,
147	dnom: usize,
148	ops: VecDeque<T>,
149	leading: usize,
150	trailing: usize,
151	done: bool,
152	first: bool
153}
154
155impl<T: StackBlurrable, I: Iterator<Item = T>> StackBlur<T, I> {
156	/// Creates a new [`StackBlur`] from the provided iterator, radius, and
157	/// [`VecDeque`].
158	///
159	/// The iterator is not advanced until a call to [`StackBlur::next`].
160	pub fn new(iter: I, radius: usize, ops: VecDeque<T>) -> Self {
161		Self {
162			iter,
163			radius,
164			sum: T::default(),
165			rate: T::default(),
166			dnom: 0,
167			ops,
168			leading: 0,
169			trailing: 0,
170			done: true,
171			first: true
172		}
173	}
174
175	/// Consumes this [`StackBlur`] and returns the inner [`VecDeque`].
176	pub fn into_ops(self) -> VecDeque<T> {
177		self.ops
178	}
179
180	fn init(&mut self) {
181		self.done = false;
182
183		for sub in 0..=self.radius {
184			let item = match self.iter.next() {
185				Some(item) => item,
186				None => break
187			};
188
189			if sub == 0 {
190				let start = self.radius + 1;
191				let needed = start * 2 + 2;
192				self.ops.reserve(needed.saturating_sub(self.ops.capacity()));
193				self.ops.iter_mut().take(start).for_each(|place| *place = T::default());
194				self.ops.resize_with(start, T::default);
195
196				self.sum = T::default();
197				self.rate = T::default();
198				self.dnom = 0;
199				self.leading = 0;
200				self.trailing = 0;
201			}
202
203			let mul = self.radius + 1 - sub;
204			self.sum += item.clone() * mul;
205			self.rate += item.clone();
206			self.dnom += mul;
207
208			if self.dnom > mul {
209				self.trailing += 1;
210			}
211
212			self.ops[sub] -= item.clone() * 2;
213			self.ops.push_back(item);
214		}
215
216		if self.dnom == 0 {
217			self.done = true;
218		}
219	}
220}
221
222impl<T: StackBlurrable, I: Iterator<Item = T>> Iterator for StackBlur<T, I> {
223	type Item = T;
224
225	#[inline]
226	fn next(&mut self) -> Option<Self::Item> {
227		if self.done {
228			self.init();
229
230			if !std::mem::replace(&mut self.first, false) {
231				return None;
232			}
233		}
234
235		let result = self.sum.clone() / self.dnom;
236
237		self.rate += self.ops.pop_front().unwrap();
238		self.sum += self.rate.clone();
239
240		if self.leading < self.radius {
241			self.leading += 1;
242			self.dnom += self.radius + 1 - self.leading;
243		}
244
245		if self.radius == 0 || self.trailing == self.radius {
246			if let Some(item) = self.iter.next() {
247				self.sum += item.clone();
248				self.rate += item.clone();
249				self.ops[self.radius] -= item.clone() * 2;
250				self.ops.push_back(item);
251			} else if self.radius > 0 {
252				self.dnom -= self.radius + 1 - self.trailing;
253				self.trailing -= 1;
254			} else {
255				self.done = true;
256			}
257		} else if self.trailing > 0 {
258			self.dnom -= self.radius + 1 - self.trailing;
259			self.trailing -= 1;
260		} else if self.trailing == 0 {
261			self.done = true;
262		}
263
264		Some(result)
265	}
266}