Expand description
A priority queue implemented with a d-ary heap.
Insertion and popping the largest element have O(log(n)) time complexity. Checking the largest element is O(1). Converting a vector to a d-ary heap can be done in-place, and has O(n) complexity. A d-ary heap can also be converted to a sorted vector in-place, allowing it to be used for an O(n * log(n)) in-place heapsort.
§Comparison to standard library
The standard library contains a 2-ary heap
(std::collections::BinaryHeap
). The BinaryHeap
of this crate
aims to be a drop-in replacement, both in API and in performance. Cargo
features are used in place of unstable Rust features. The advantage of this
crate over the standard library lies in the possibility of easily changing
the arity of the heap, which can increase performance.
The standard library binary heap can contain up to isize::MAX
elements;
this is the same for the binary heap of this crate, but other heaps in this
crate can hold less elements. In the general case, the maximum number of
elements is (usize::MAX
- 1) / d for an arity of d. On 64-bit systems
this should generally not be a concern when using reasonable arities. On
32-bit systems this may be a concern when using very large heaps with a
relatively high arity.
§Comparison of different arities d
The arity d is defined as the maximum number of children each node can
have. A higher number means the heap has less layers, but may require more
work per layer because there are more children present. This generally makes
methods adding elements to the heap such as push
faster, and methods
removing them such as pop
slower. However, due to higher cache locality
for higher d, the drop in pop
performance is often diminished. If you’re
unsure what value of d to choose, the QuaternaryHeap
with d = 4 is
usually a good start, but benchmarking is necessary to determine the best
value of d.
§Usage
Rust type interference cannot infer the desired heap arity (value of d)
automatically when using DaryHeap
directly. It is therefore more
ergonomic to use one of the type aliases to select the desired arity:
Name | Arity |
---|---|
BinaryHeap | d = 2 |
TernaryHeap | d = 3 |
QuaternaryHeap | d = 4 |
QuinaryHeap | d = 5 |
SenaryHeap | d = 6 |
SeptenaryHeap | d = 7 |
OctonaryHeap | d = 8 |
The difference in ergonomics illustrated in the following:
use dary_heap::{DaryHeap, TernaryHeap};
// Type parameter T can be inferred, but arity cannot
let mut heap1 = DaryHeap::<_, 3>::new();
heap1.push(42);
// Type alias removes need for explicit type
let mut heap2 = TernaryHeap::new();
heap2.push(42);
If a different arity is desired, you can use the former or a define a type alias yourself. It should be noted that d > 8 is rarely beneficial.
§Validity of arities in d-ary heaps
Only arities of two or greater are useful in d-ary heap, and are therefore
the only ones implemented by default. Lower arities are only possible if you
put in the effort to implement them yourself. An arity of one is possible,
but yields a heap where every element has one child. This essentially makes
it a sorted vector with poor performance. Regarding an arity of zero: this
is not statically prevented, but constructing a DaryHeap
with it and
using it may (and probably will) result in a runtime panic.
§Examples
This is a larger example that implements Dijkstra’s algorithm
to solve the shortest path problem on a directed graph.
It shows how to use DaryHeap
with custom types.
use std::cmp::Ordering;
use dary_heap::TernaryHeap;
#[derive(Copy, Clone, Eq, PartialEq)]
struct State {
cost: usize,
position: usize,
}
// The priority queue depends on `Ord`.
// Explicitly implement the trait so the queue becomes a min-heap
// instead of a max-heap.
impl Ord for State {
fn cmp(&self, other: &Self) -> Ordering {
// Notice that we flip the ordering on costs.
// In case of a tie we compare positions - this step is necessary
// to make implementations of `PartialEq` and `Ord` consistent.
other.cost.cmp(&self.cost)
.then_with(|| self.position.cmp(&other.position))
}
}
// `PartialOrd` needs to be implemented as well.
impl PartialOrd for State {
fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
Some(self.cmp(other))
}
}
// Each node is represented as a `usize`, for a shorter implementation.
struct Edge {
node: usize,
cost: usize,
}
// Dijkstra's shortest path algorithm.
// Start at `start` and use `dist` to track the current shortest distance
// to each node. This implementation isn't memory-efficient as it may leave duplicate
// nodes in the queue. It also uses `usize::MAX` as a sentinel value,
// for a simpler implementation.
fn shortest_path(adj_list: &Vec<Vec<Edge>>, start: usize, goal: usize) -> Option<usize> {
// dist[node] = current shortest distance from `start` to `node`
let mut dist: Vec<_> = (0..adj_list.len()).map(|_| usize::MAX).collect();
let mut heap = TernaryHeap::new();
// We're at `start`, with a zero cost
dist[start] = 0;
heap.push(State { cost: 0, position: start });
// Examine the frontier with lower cost nodes first (min-heap)
while let Some(State { cost, position }) = heap.pop() {
// Alternatively we could have continued to find all shortest paths
if position == goal { return Some(cost); }
// Important as we may have already found a better way
if cost > dist[position] { continue; }
// For each node we can reach, see if we can find a way with
// a lower cost going through this node
for edge in &adj_list[position] {
let next = State { cost: cost + edge.cost, position: edge.node };
// If so, add it to the frontier and continue
if next.cost < dist[next.position] {
heap.push(next);
// Relaxation, we have now found a better way
dist[next.position] = next.cost;
}
}
}
// Goal not reachable
None
}
fn main() {
// This is the directed graph we're going to use.
// The node numbers correspond to the different states,
// and the edge weights symbolize the cost of moving
// from one node to another.
// Note that the edges are one-way.
//
// 7
// +-----------------+
// | |
// v 1 2 | 2
// 0 -----> 1 -----> 3 ---> 4
// | ^ ^ ^
// | | 1 | |
// | | | 3 | 1
// +------> 2 -------+ |
// 10 | |
// +---------------+
//
// The graph is represented as an adjacency list where each index,
// corresponding to a node value, has a list of outgoing edges.
// Chosen for its efficiency.
let graph = vec![
// Node 0
vec![Edge { node: 2, cost: 10 },
Edge { node: 1, cost: 1 }],
// Node 1
vec![Edge { node: 3, cost: 2 }],
// Node 2
vec![Edge { node: 1, cost: 1 },
Edge { node: 3, cost: 3 },
Edge { node: 4, cost: 1 }],
// Node 3
vec![Edge { node: 0, cost: 7 },
Edge { node: 4, cost: 2 }],
// Node 4
vec![]];
assert_eq!(shortest_path(&graph, 0, 1), Some(1));
assert_eq!(shortest_path(&graph, 0, 3), Some(3));
assert_eq!(shortest_path(&graph, 3, 0), Some(7));
assert_eq!(shortest_path(&graph, 0, 4), Some(5));
assert_eq!(shortest_path(&graph, 4, 0), None);
}
Structs§
- Dary
Heap - A priority queue implemented with a d-ary heap.
- Drain
- A draining iterator over the elements of a
DaryHeap
. - Drain
Sorted - A draining iterator over the elements of a
DaryHeap
. - Into
Iter - An owning iterator over the elements of a
DaryHeap
. - Into
Iter Sorted - Iter
- An iterator over the elements of a
DaryHeap
. - PeekMut
- Structure wrapping a mutable reference to the greatest item on a
DaryHeap
.
Type Aliases§
- Binary
Heap - A binary heap (d = 2).
- Octonary
Heap - An octonary heap (d = 8).
- Quaternary
Heap - A quaternary heap (d = 4).
- Quinary
Heap - A quinary heap (d = 5).
- Senary
Heap - A senary heap (d = 6).
- Septenary
Heap - A septenary heap (d = 7).
- Ternary
Heap - A ternary heap (d = 3).