BucketQueue
A priority queue that efficiently stores and retrieves items whose priorities are small integers. Items are stored in 'buckets', which are other data structures such as `Vec` or `VecDeque`.
BucketQueue is designed to work with a variety of queueing semantics, such as First-In-First-Out, Last-In-First-Out and Double-Ended, but you can extend it with your own.
This implementation is loosely based on the description from Wikipedia.
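To make that idea concrete, here is a minimal, std-only sketch of a bucket queue (an illustration of the data structure described on Wikipedia, not this crate's implementation or API; `SketchQueue` is a made-up name):

```rust
use std::collections::VecDeque;

/// A minimal bucket queue: priorities are small integers that
/// index directly into a vector of FIFO buckets.
struct SketchQueue<T> {
    buckets: Vec<VecDeque<T>>,
}

impl<T> SketchQueue<T> {
    fn new(max_priority: usize) -> Self {
        SketchQueue {
            buckets: (0..=max_priority).map(|_| VecDeque::new()).collect(),
        }
    }

    fn enqueue(&mut self, item: T, priority: usize) {
        self.buckets[priority].push_back(item);
    }

    /// Scan from the lowest priority upwards and pop from the
    /// first non-empty bucket.
    fn dequeue_min(&mut self) -> Option<T> {
        for bucket in self.buckets.iter_mut() {
            if let Some(item) = bucket.pop_front() {
                return Some(item);
            }
        }
        None
    }
}

fn main() {
    let mut queue = SketchQueue::new(3);
    queue.enqueue("deploy", 2);
    queue.enqueue("write code", 0);
    queue.enqueue("run tests", 1);

    assert_eq!(queue.dequeue_min(), Some("write code"));
    assert_eq!(queue.dequeue_min(), Some("run tests"));
    assert_eq!(queue.dequeue_min(), Some("deploy"));
    assert_eq!(queue.dequeue_min(), None);
}
```

Because priorities index directly into the bucket vector, insertion is O(1) and retrieval is O(number of priorities) in the worst case, which is cheap when priorities are small integers.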
Basic Usage
```rust
extern crate bucket_queue;

use bucket_queue::*;
use std::collections::VecDeque;
```
Things to note:

- You need to `use bucket_queue::*` to pull in the required traits
- You can `dequeue_max` instead, if your priorities are reversed
- This example uses First-In-First-Out (FIFO) queueing semantics
Last-In-First-Out
```rust
extern crate bucket_queue;

use bucket_queue::*;
```
Things to note:

- A `Vec` provides Last-In-First-Out (LIFO) queueing semantics
- We use `push` and `pop_min` instead of `enqueue` and `dequeue_min`
- The queueing semantics only affect the order of retrieval for items with equal priority
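Continuing the std-only sketch from above (not this crate's API), swapping `VecDeque` buckets for `Vec` buckets changes only the tie-breaking order among equal priorities:

```rust
/// LIFO buckets: a plain Vec pops the most recently pushed item,
/// so equal-priority items come back in reverse insertion order.
struct LifoSketch<T> {
    buckets: Vec<Vec<T>>,
}

impl<T> LifoSketch<T> {
    fn new(max_priority: usize) -> Self {
        LifoSketch {
            buckets: (0..=max_priority).map(|_| Vec::new()).collect(),
        }
    }

    fn push(&mut self, item: T, priority: usize) {
        self.buckets[priority].push(item);
    }

    /// Pop from the lowest non-empty bucket, last-in-first-out.
    fn pop_min(&mut self) -> Option<T> {
        for bucket in self.buckets.iter_mut() {
            if let Some(item) = bucket.pop() {
                return Some(item);
            }
        }
        None
    }
}

fn main() {
    let mut queue = LifoSketch::new(1);
    queue.push("first", 0);
    queue.push("second", 0);
    queue.push("other", 1);

    // Equal priority: the most recently pushed comes out first.
    assert_eq!(queue.pop_min(), Some("second"));
    assert_eq!(queue.pop_min(), Some("first"));
    // Priority is still respected across buckets.
    assert_eq!(queue.pop_min(), Some("other"));
}
```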
Double-Ended
```rust
extern crate bucket_queue;

use bucket_queue::*;
use std::collections::VecDeque;
```
Things to note:

- A `VecDeque` (also) provides Double-Ended queueing semantics
- We can `push` and `pop` from both the front and the back
- Again, priorities are still respected; this only affects the ordering of items within buckets
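As a std-only sketch of the double-ended idea (illustrative names, not the crate's API), each `VecDeque` bucket can be pushed and popped at either end:

```rust
use std::collections::VecDeque;

/// Double-ended buckets: each bucket is a VecDeque, so items can
/// enter and leave from either end.
struct DequeSketch<T> {
    buckets: Vec<VecDeque<T>>,
}

impl<T> DequeSketch<T> {
    fn new(max_priority: usize) -> Self {
        DequeSketch {
            buckets: (0..=max_priority).map(|_| VecDeque::new()).collect(),
        }
    }

    fn push_back(&mut self, item: T, priority: usize) {
        self.buckets[priority].push_back(item);
    }

    fn push_front(&mut self, item: T, priority: usize) {
        self.buckets[priority].push_front(item);
    }

    /// Pop from the front of the lowest non-empty bucket.
    fn pop_front_min(&mut self) -> Option<T> {
        self.buckets.iter_mut().find_map(|b| b.pop_front())
    }

    /// Pop from the back of the lowest non-empty bucket.
    fn pop_back_min(&mut self) -> Option<T> {
        self.buckets.iter_mut().find_map(|b| b.pop_back())
    }
}

fn main() {
    let mut queue = DequeSketch::new(2);
    queue.push_back("a", 0);
    queue.push_back("b", 0);
    queue.push_front("c", 0);
    queue.push_back("low", 2);

    assert_eq!(queue.pop_front_min(), Some("c"));
    assert_eq!(queue.pop_back_min(), Some("b"));
    assert_eq!(queue.pop_front_min(), Some("a"));
    // Priority still comes first: the remaining item is in bucket 2.
    assert_eq!(queue.pop_back_min(), Some("low"));
}
```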
Utility Functions
```rust
extern crate bucket_queue;

use bucket_queue::*;
use std::collections::VecDeque;
```
Things to note:

- You can `pop`/`pop_front` and `pop_back` an item for a specific priority, too
- BucketQueue does not implement Iterator because there are too many different ways to retrieve items
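The priority-targeted variant can be sketched in std-only Rust (again illustrative, not the crate's API): instead of scanning for the minimum, it pops directly from one bucket.

```rust
use std::collections::VecDeque;

/// Sketch of priority-targeted retrieval: bypass the min scan and
/// pop directly from the bucket for one priority.
struct TargetSketch<T> {
    buckets: Vec<VecDeque<T>>,
}

impl<T> TargetSketch<T> {
    fn new(max_priority: usize) -> Self {
        TargetSketch {
            buckets: (0..=max_priority).map(|_| VecDeque::new()).collect(),
        }
    }

    fn push_back(&mut self, item: T, priority: usize) {
        self.buckets[priority].push_back(item);
    }

    /// Pop the next item for a specific priority only.
    fn pop_front_at(&mut self, priority: usize) -> Option<T> {
        self.buckets.get_mut(priority)?.pop_front()
    }
}

fn main() {
    let mut queue = TargetSketch::new(3);
    queue.push_back("urgent", 0);
    queue.push_back("later", 2);

    assert_eq!(queue.pop_front_at(2), Some("later")); // skips the min scan
    assert_eq!(queue.pop_front_at(1), None);          // that bucket is empty
    assert_eq!(queue.pop_front_at(0), Some("urgent"));
}
```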
Nested Queues
```rust
extern crate bucket_queue;

use bucket_queue::*;
use std::collections::VecDeque;
```
Things to note:

- BucketQueue can be arbitrarily nested to any number of levels
- `min_bucket` and `max_bucket` find the bucket with minimum or maximum priority
- These are equivalent: `queue.bucket.enqueue` and `queue.bucket.bucket.enqueue`
- So are these: `queue.prune` and `queue.bucket.clear`
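One way to picture nesting is a std-only sketch with a hard-coded two-level depth (the crate generalises this to any depth; the names here are made up): an outer priority selects a group, an inner priority selects the bucket within it.

```rust
use std::collections::VecDeque;

/// Two-level nesting sketch: an outer priority selects a group of
/// buckets, an inner priority selects the bucket within the group.
struct NestedSketch<T> {
    groups: Vec<Vec<VecDeque<T>>>,
}

impl<T> NestedSketch<T> {
    fn new(outer: usize, inner: usize) -> Self {
        NestedSketch {
            groups: (0..outer)
                .map(|_| (0..inner).map(|_| VecDeque::new()).collect())
                .collect(),
        }
    }

    fn enqueue(&mut self, item: T, outer: usize, inner: usize) {
        self.groups[outer][inner].push_back(item);
    }

    /// Dequeue from the lowest (outer, inner) priority pair.
    fn dequeue_min(&mut self) -> Option<T> {
        for group in self.groups.iter_mut() {
            for bucket in group.iter_mut() {
                if let Some(item) = bucket.pop_front() {
                    return Some(item);
                }
            }
        }
        None
    }
}

fn main() {
    let mut queue = NestedSketch::new(2, 2);
    queue.enqueue("c", 1, 0);
    queue.enqueue("b", 0, 1);
    queue.enqueue("a", 0, 0);

    // Outer priority dominates; inner priority breaks ties.
    assert_eq!(queue.dequeue_min(), Some("a"));
    assert_eq!(queue.dequeue_min(), Some("b"));
    assert_eq!(queue.dequeue_min(), Some("c"));
}
```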
Tests
All tests for the crate are here. This can also be used as a reference.
Benchmarks
```
test benchmark_100_items_into_4_buckets ... bench: 1,272 ns/iter (+/- 28)
test benchmark_1_000_items_into_8_buckets ... bench: 12,103 ns/iter (+/- 1,157)
test benchmark_10_000_items_into_16_buckets ... bench: 121,042 ns/iter (+/- 3,095)
test benchmark_100_000_items_into_32_buckets ... bench: 1,214,780 ns/iter (+/- 24,987)
test benchmark_1_000_000_items_into_64_buckets ... bench: 14,487,399 ns/iter (+/- 881,656)
test benchmark_100_items_into_4x4_nested_buckets ... bench: 3,742 ns/iter (+/- 170)
test benchmark_1_000_items_into_8x8_nested_buckets ... bench: 38,916 ns/iter (+/- 3,270)
test benchmark_10_000_items_into_16x16_nested_buckets ... bench: 353,102 ns/iter (+/- 11,718)
test benchmark_100_000_items_into_32x32_nested_buckets ... bench: 3,842,643 ns/iter (+/- 71,892)
test benchmark_1_000_000_items_into_64x64_nested_buckets ... bench: 47,129,660 ns/iter (+/- 726,701)
```
Things to note:
- These benchmarks were run on an Intel Core i5-4430 CPU
- The slowest example (one million items into 64x64 nested buckets) took 47 milliseconds
- These benchmarks can be run with `cargo bench`
Adding a new queueing semantic
In this example, we'll introduce a `BiggestFirstQueue`. This will retrieve items from buckets, biggest to smallest. BucketQueue's priorities will still be respected, but when items have equal priority, the biggest will be returned first.
There's quite a lot of boilerplate required (sorry). This is mostly a result of trying to make things flexible. I've broken it down into steps.
Step 1: Define a new type of `Bucket`:

```rust
use std::collections::BinaryHeap;
```
Things to note:

- This example uses a `BinaryHeap` to store items
- It needs to be wrapped in a struct due to Rust's orphan rules
- The `Ord` constraint is imposed by `BinaryHeap`, not `BucketQueue`
- Most of this is boilerplate that proxies calls through to `BinaryHeap`
Step 2: Define how the bucket works with items:
Things to note:

- `BiggestFirstBucket` has a supertrait of `Bucket`
- Items are added to the bucket with `insert` and retrieved with `biggest`
- This trait is implemented for `Heap`, which calls `push` and `pop` on the `BinaryHeap`
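The shape of these steps can be approximated in a self-contained, std-only sketch (the names and signatures here are illustrative, not the crate's actual traits):

```rust
use std::collections::BinaryHeap;

/// Newtype over BinaryHeap: Rust's orphan rules prevent implementing
/// an external crate's trait directly on BinaryHeap, so the heap is
/// wrapped in a local struct.
struct Heap<T: Ord>(BinaryHeap<T>);

impl<T: Ord> Heap<T> {
    fn new() -> Self {
        Heap(BinaryHeap::new())
    }
    /// Add an item to the bucket (proxies to BinaryHeap::push).
    fn insert(&mut self, item: T) {
        self.0.push(item);
    }
    /// Retrieve the biggest item (proxies to BinaryHeap::pop).
    fn biggest(&mut self) -> Option<T> {
        self.0.pop()
    }
}

/// Biggest-first queue sketch: priorities are respected first, then
/// the biggest item within the chosen bucket wins.
struct BiggestFirstSketch<T: Ord> {
    buckets: Vec<Heap<T>>,
}

impl<T: Ord> BiggestFirstSketch<T> {
    fn new(max_priority: usize) -> Self {
        BiggestFirstSketch {
            buckets: (0..=max_priority).map(|_| Heap::new()).collect(),
        }
    }

    fn enqueue(&mut self, item: T, priority: usize) {
        self.buckets[priority].insert(item);
    }

    /// Retrieve the biggest item from the minimum-priority bucket.
    fn biggest(&mut self) -> Option<T> {
        self.buckets.iter_mut().find_map(|b| b.biggest())
    }
}

fn main() {
    let mut queue = BiggestFirstSketch::new(2);
    queue.enqueue(3, 0);
    queue.enqueue(9, 0);
    queue.enqueue(100, 1);

    // Priority 0 wins over priority 1, and 9 beats 3 within its bucket.
    assert_eq!(queue.biggest(), Some(9));
    assert_eq!(queue.biggest(), Some(3));
    assert_eq!(queue.biggest(), Some(100));
}
```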
Step 3: Define a new type of `Queue` for our `Bucket`:
Things to note:

- `bucket_for_adding` and `bucket_for_removing` are internal functions that keep BucketQueue's index up to date
- `biggest` retrieves from the minimum priority bucket, but we could add `biggest_min` and `biggest_max` if we wanted
- The last line adds support for this queueing semantic to `BucketQueue`
Finally, we can use it:
Things to note:

- `BucketQueue` uses our custom `Heap` type
- The queueing semantics are inferred from the type of `Bucket` used
- `donkey` has a priority of `1`, so it appears at the end
- This example can be seen in full here and can be run with `cargo run`
(Optional) Step 4: Add support for nesting:

To make your new type of queue work when `BucketQueue` is nested, you'll need an extra bit of boilerplate:
Things to note:

- This is all boilerplate and calls through to functions already defined
- `adding` and `removing` are internal functions that keep BucketQueue's index up to date
Final example with nesting:
Things to note:
- This example uses the two equivalent ways to insert items (see above: search for 'equivalent')
Implementation Notes
As you've probably seen above, a lot of traits are used to make BucketQueue more flexible. This adds boilerplate, but it means custom queueing semantics can be added, or existing semantics can be built on different data structures.
There's also an `Index` trait, which currently has a single implementation called `SimpleIndex`. This implements the lower- and upper-bounds optimisation described on Wikipedia.
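The lower-bound half of that optimisation can be sketched as follows (an illustration of the technique, not `SimpleIndex` itself): the index remembers where the last scan stopped, so `dequeue_min` doesn't restart from priority zero every time.

```rust
use std::collections::VecDeque;

/// Sketch of the lower-bound optimisation: `lower` never points
/// below the first possibly non-empty bucket, so dequeue_min can
/// resume scanning where it last stopped instead of from zero.
struct IndexedSketch<T> {
    buckets: Vec<VecDeque<T>>,
    lower: usize, // lower bound on the minimum non-empty priority
}

impl<T> IndexedSketch<T> {
    fn new(max_priority: usize) -> Self {
        IndexedSketch {
            buckets: (0..=max_priority).map(|_| VecDeque::new()).collect(),
            lower: 0,
        }
    }

    fn enqueue(&mut self, item: T, priority: usize) {
        self.buckets[priority].push_back(item);
        // Inserting below the bound pulls the bound back down.
        self.lower = self.lower.min(priority);
    }

    fn dequeue_min(&mut self) -> Option<T> {
        while self.lower < self.buckets.len() {
            if let Some(item) = self.buckets[self.lower].pop_front() {
                return Some(item);
            }
            self.lower += 1; // bucket empty: the bound can only move up
        }
        None
    }
}

fn main() {
    let mut queue = IndexedSketch::new(9);
    queue.enqueue("a", 7);
    assert_eq!(queue.dequeue_min(), Some("a")); // scan stops at 7
    queue.enqueue("b", 3); // inserting lowers the bound again
    assert_eq!(queue.dequeue_min(), Some("b"));
    assert_eq!(queue.dequeue_min(), None);
}
```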
In theory, it would be possible to extend BucketQueue with better indexing strategies, perhaps using a `BinaryHeap` or `HashMap`. To use a custom `Index`, you'd specify it as a type parameter when initializing the `BucketQueue` with `new`.
I considered exploring better indexing strategies, but decided against it to keep the scope of this project under control.
Finally, one last thing to point out is that, although these are functionally equivalent:

```rust
queue.bucket.bucket.enqueue;
queue.bucket.enqueue;
```
There's a small performance overhead in the former. This is because it constructs a `DeferredBucket` for every call to `bucket`. In the most time-consuming benchmark, this overhead slows things down by about 7%. For time-critical use cases, you can do this instead:

```rust
queue.bucket_for_adding.enqueue
```
This bypasses the `DeferredBucket`, but there's more danger the `Index` becomes out-of-sync if you accidentally do this:

```rust
queue.bucket_for_adding.dequeue_min; // THIS IS WRONG
```
The problem is that you've informed the queue you'll be adding an item, then
removed one, putting the `Index`
into an inconsistent state. I thought about
whether the same flexibility could be granted, in a consistent way, without the
overhead, but didn't manage to find a way to do this. Perhaps someone with more
experience of Rust's generics and traits can.
Contribution
All contributions are welcome. At the time of writing, I've been using Rust for about six months, so I'm sure there's plenty of room for improvement. Please open an issue or create a pull request. Ping me on Twitter if I'm unresponsive.