1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
//! This module implements the [`StackBlur`] struct and the improved version of
//! the Stackblur algorithm.
//!
//! ## The improved Stackblur algorithm
//!
//! As previously stated, this crate implements a modified version of Stackblur,
//! and understanding the original algorithm is key to understanding the
//! improvements that have been made to it.
//!
//! The original Stackblur is essentially a weighted box-blur. Instead of taking
//! the average of a flat array of pixels, like this:
//!
//! ```text
//! ( 00 + 01 + 02 + 03 + 04 + 05 + 06 ) / 7
//! ```
//!
//! it takes a weighted average of the pixels:
//!
//! ```text
//!              ( 03 +
//!           02 + 03 + 04 +
//!      01 + 02 + 03 + 04 + 05 +
//! 00 + 01 + 02 + 03 + 04 + 05 + 06 ) / 16
//! ```
//!
//! This is a rough approximation of a Gaussian blur, and in fact it's already
//! most of the way there to being a complete algorithm on its own. You can just
//! multiply each pixel by something like `radius + 1 - center_dist` when you
//! make your sum, then divide by `radius * (radius + 2) + 1`.
//!
//! But there are two problems with this:
//!
//! 1. That would be *O*(n(m * 2 + 1)²) complexity, where `n` is the number of
//!    pixels in the image, and `m` is the radius of the blur. This is basically
//!    just as expensive as running a proper convolution filter.
//!
//! 2. How do we handle pixels off the edge of the image?
//!
//! I'm scared of #1 so I'm going to handle #2 first. What most implementations
//! choose to do is just repeat the edge of the image:
//!
//! ```text
//!              ( 00 +
//!           00 + 00 + 01 +
//!      00 + 00 + 00 + 01 + 02 +
//! 00 + 00 + 00 + 00 + 01 + 02 + 03 ) / 16
//! ```
//!
//! But this creates even more problems, because the edge of the blur will be
//! quite biased towards the pixels that aren't even in the image. This is known
//! as "edge bleeding", where a single pixel at the edge of a blur can cause
//! very large and ugly artifacts to show up.
//!
//! The solution, of course, is to not calculate the denominator using that
//! equation from earlier, and instead incrementally update the denominator as
//! pixels are scanned, allowing you to sum a varying number of pixels:
//!
//! ```text
//! ( 00 +
//!   00 + 01 +
//!   00 + 01 + 02 +
//!   00 + 01 + 02 + 03 ) / 10
//!
//!      ( 01 +
//!   00 + 01 + 02 +
//!   00 + 01 + 02 + 03 +
//!   00 + 01 + 02 + 03 + 04 ) / 13
//!
//!           ( 02 +
//!        01 + 02 + 03 +
//!   00 + 01 + 02 + 03 + 04 +
//!   00 + 01 + 02 + 03 + 04 + 05 ) / 15
//!
//!                ( 03 +
//!             02 + 03 + 04 +
//!        01 + 02 + 03 + 04 + 05 +
//!   00 + 01 + 02 + 03 + 04 + 05 + 06 ) / 16
//! ```
//!
//! This is one of the improvements made to the Stackblur algorithm that is
//! implemented by this crate.
//!
//! Now for #1 - the complexity problem. It is possible to make a streaming
//! Stackblur that does not need to read any pixels out of order or more than
//! once, or even know how long the input is. In fact, that's exactly what this
//! crate does.
//!
//! First you fill up the cache with `radius + 1` pixels in order to be able to
//! make the first sum, then you start producing results until you run out of
//! pixels, then you produce the last `radius` results using what you have in
//! cache. I don't have the skill to explain the algorithm in full detail, but
//! it's open-source so you can look at it if you want.
//!
//! In this crate the "cache" is not actually the actual heap of values, as that
//! would be too slow. Instead, the "cache" is a list of changes to make to the
//! rate of change of the sum, and the denominator is updated incrementally in
//! response to the number of leading and trailing values changing.
//!
//! The reason the whole rate of change thing exists is so that adding to the
//! sum and registering it in the cache becomes *O*(1) instead of *O*(2n+1)
//! (where `n` is the radius). It's basically the most important thing that
//! makes the algorithm constant-time.

use std::collections::VecDeque;

use crate::traits::StackBlurrable;

/// An iterator that implements an improved Stackblur algorithm.
///
/// For any [`StackBlurrable`] element `T` and any iterator `I` over items of
/// type `T`, [`StackBlur`] will yield the same amount of elements as `I`,
/// blurred together according to the improved Stackblur algorithm as described
/// by the [`iter`][crate::iter] module documentation.
///
/// ## Usage
///
/// [`StackBlur`] just wraps another iterator with a radius and a cache (called
/// `ops` by [`StackBlur::new`]):
///
/// ```compile_fail
/// # use std::collections::VecDeque;
/// # use std::num::Wrapping;
/// # use stackblur_iter::iter::StackBlur;
/// #
/// let arr = [255u8, 0, 0, 0, 127, 0, 0, 0, 255u8];
/// let iter = arr.iter().copied().map(|i| Wrapping(i as usize));
/// let blur = StackBlur::new(iter, 2, VecDeque::new());
/// ```
///
/// That example unfortunately doesn't compile because `Wrapping<usize>` doesn't
/// implement `Mul<usize>` and `Div<usize>`, which are required for the
/// averaging that [`StackBlur`] performs. It's recommended to create a newtype
/// wrapper around whatever you plan on blurring, and implement all the traits
/// required by [`StackBlurrable`].
///
/// A [`StackBlur`] always yields exactly as many items as its inner iterator
/// does. Additionally, a non-fused iterator which repeats will cause the
/// [`StackBlur`] to repeat as well.
///
/// After using the [`StackBlur`], you can retrieve the [`VecDeque`] back out
/// of it by calling [`StackBlur::into_ops`].
pub struct StackBlur<T: StackBlurrable, I: Iterator<Item = T>> {
	iter: I,
	radius: usize,
	sum: T,
	rate: T,
	dnom: usize,
	ops: VecDeque<T>,
	leading: usize,
	trailing: usize,
	done: bool
}

impl<T: StackBlurrable, I: Iterator<Item = T>> StackBlur<T, I> {
	/// Creates a new [`StackBlur`] from the provided iterator, radius, and
	/// [`VecDeque`].
	///
	/// The iterator is not advanced until a call to [`StackBlur::next`].
	pub fn new(iter: I, radius: usize, ops: VecDeque<T>) -> Self {
		Self {
			iter,
			radius,
			sum: T::default(),
			rate: T::default(),
			dnom: 0,
			ops,
			leading: 0,
			trailing: 0,
			done: true
		}
	}

	/// Consumes this [`StackBlur`] and returns the inner [`VecDeque`].
	pub fn into_ops(self) -> VecDeque<T> {
		self.ops
	}

	fn init(&mut self) {
		self.done = false;

		for sub in 0..=self.radius {
			let item = match self.iter.next() {
				Some(item) => item,
				None => break
			};

			if sub == 0 {
				let start = self.radius + 1;
				let needed = start * 2 + 2;
				self.ops.reserve(needed.saturating_sub(self.ops.capacity()));
				self.ops.iter_mut().take(start).for_each(|place| *place = T::default());
				self.ops.resize_with(start, T::default);

				self.sum = T::default();
				self.rate = T::default();
				self.dnom = 0;
				self.leading = 0;
				self.trailing = 0;
			}

			let mul = self.radius + 1 - sub;
			self.sum += item.clone() * mul;
			self.rate += item.clone();
			self.dnom += mul;

			if self.dnom > mul {
				self.trailing += 1;
			}

			self.ops[sub] -= item.clone() * 2;
			self.ops.push_back(item);
		}

		if self.dnom == 0 {
			self.done = true;
		}
	}
}

impl<T: StackBlurrable, I: Iterator<Item = T>> Iterator for StackBlur<T, I> {
	type Item = T;

	#[inline]
	fn next(&mut self) -> Option<Self::Item> {
		if self.done {
			self.init();

			if self.done {
				return None;
			}
		}

		let result = self.sum.clone() / self.dnom;

		self.rate += self.ops.pop_front().unwrap();
		self.sum += self.rate.clone();

		if self.leading < self.radius {
			self.leading += 1;
			self.dnom += self.radius + 1 - self.leading;
		}

		if self.radius > 0 && self.trailing == self.radius {
			if let Some(item) = self.iter.next() {
				self.sum += item.clone();
				self.rate += item.clone();
				self.ops[self.radius] -= item.clone() * 2;
				self.ops.push_back(item);
			} else {
				self.dnom -= self.radius + 1 - self.trailing;
				self.trailing -= 1;
			}
		} else if self.trailing > 0 {
			self.dnom -= self.radius + 1 - self.trailing;
			self.trailing -= 1;
		} else if self.trailing == 0 {
			self.done = true;
		}

		Some(result)
	}
}