Crate range_set_blaze

Crate range_set_blaze 

Source
Expand description

§range-set-blaze

github crates.io docs.rs

Integer sets as fast, sorted integer ranges; Maps with integer-range keys; Full set operations

Supports all of Rust’s integer-like types, u8 to u128, i8 to i128, char (Unicode characters), Ipv4Addr, and Ipv6Addr. Set operations—union, intersection, difference, symmetric difference, and complement— are available on both sets and maps.

The crate’s main structs are:

Unlike the standard BTreeSet/BTreeMap and HashSet/HashMap, RangeSetBlaze does not store every integer in the set. Rather, it stores sorted & disjoint ranges of integers in a cache-efficient BTreeMap. It differs from other interval libraries – that we know of – by offering full set operations and by being optimized for sets of clumpy integers.

We can construct a RangeSetBlaze or RangeMapBlaze from unsorted & redundant integers (or ranges). When the inputs are clumpy, construction will be linear in the number of inputs and set operations will be sped up quadratically.

⚠️ Warning: All RangeMapBlaze operations (into_iter, union, difference, symmetric difference, etc.) now give precedence to the right-hand map when ranges overlap. This change (from version 0.2.0) aligns with the behavior of BTreeMap and HashMap.

The crate’s main traits are

  • SortedDisjoint, implemented by iterators of sorted & disjoint ranges of integers. See documentation for details.
  • SortedDisjointMap, implemented by iterators of pairs, where the first item is a sorted & disjoint range of integers. The second item is a value. See documentation for details.

With any SortedDisjoint or SortedDisjointMap iterator we can perform set operations in one pass through the ranges and with minimal (constant) memory. The package enforces the “sorted & disjoint” constraint at compile time (making invalid states unrepresentable).

The crate supports no_std, WASM, and embedded (with alloc) projects. For no_std, etc., Use the command:

cargo add range-set-blaze --no-default-features

§Benchmarks

See the set benchmarks and map benchmarks for performance comparisons with other range-related crates.

Generally, for many tasks involving clumpy integers and ranges, RangeSetBlaze is much faster than alternatives.

The benchmarks are in the benches directory. To run them, use cargo bench.

§Articles

§Examples

Example 1: Set Operations


Here we take the union (operator “|”) of two RangeSetBlaze’s:

Example 1

use range_set_blaze::prelude::*;

 // a is the set of integers from 100 to 499 (inclusive) and 501 to 1000 (inclusive)
let a = RangeSetBlaze::from_iter([100..=499, 501..=999]);
 // b is the set of integers -20 and the range 400 to 599 (inclusive)
let b = RangeSetBlaze::from_iter([-20..=-20, 400..=599]);
// c is the union of a and b, namely -20 and 100 to 999 (inclusive)
let c = a | b;
assert_eq!(c, RangeSetBlaze::from_iter([-20..=-20, 100..=999]));

Example 2: Maps (and Network Addresses)


In networking, we can use RangeMapBlaze to visualize routing tables. It efficiently merges adjacent or overlapping routes with identical next-hops, removes overlaps (respecting priority), and provides fast lookups. While specialized prefix trees (tries) are typically used in production routers for maximum efficiency, this example shows how RangeMapBlaze makes the information more understandable. Similar concepts apply when working with other range-mappings like font tables.

use range_set_blaze::prelude::*;
use std::net::Ipv4Addr;

// A routing table, sorted by prefix length (so, highest priority last)
let routing = [
    // destination, prefix, next hop, interface
    ("0.0.0.0", 0, "152.10.0.0", "eth0"),
    ("10.0.0.0", 8, "10.3.4.2", "eth1"),
    ("10.0.1.12", 30, "10.1.1.0", "eth2"),
    ("10.0.1.8", 30, "10.1.1.0", "eth2"),
    ("10.0.1.7", 32, "10.1.1.0", "eth2"),
];

// Create a RangeMapBlaze from the routing table
let range_map = routing
    .iter()
    .map(|(dest, prefix_len, next_hop, interface)| {
        let dest: Ipv4Addr = dest.parse().unwrap();
        let next_hop: Ipv4Addr = next_hop.parse().unwrap();
        let mask = u32::MAX.checked_shr(*prefix_len).unwrap_or(0);
        let range_start = Ipv4Addr::from(u32::from(dest) & !mask);
        let range_end = Ipv4Addr::from(u32::from(dest) | mask);
        (range_start..=range_end, (next_hop, interface))
    })
    .collect::<RangeMapBlaze<_, _>>();

// Print the now disjoint, sorted ranges and their associated values
for (range, (next_hop, interface)) in range_map.range_values() {
    println!("{range:?} → ({next_hop}, {interface})");
}

// Look up an address
assert_eq!(
    range_map.get(Ipv4Addr::new(10, 0, 1, 6)),
    Some(&(Ipv4Addr::new(10, 3, 4, 2), &"eth1"))
);

Output:

0.0.0.0..=9.255.255.255 → (152.10.0.0, eth0)
10.0.0.0..=10.0.1.6 → (10.3.4.2, eth1)
10.0.1.7..=10.0.1.15 → (10.1.1.0, eth2)
10.0.1.16..=10.255.255.255 → (10.3.4.2, eth1)
11.0.0.0..=255.255.255.255 → (152.10.0.0, eth0)

Example 3: Biology


In biology, suppose we want to find the intron regions of a gene but we are given only the transcription region and the exon regions.

Example 3

We create a RangeSetBlaze for the transcription region and a RangeSetBlaze for all the exon regions. Then we take the difference between the transcription region and exon regions to find the intron regions.

use range_set_blaze::prelude::*;

let line = "chr15   29370   37380   29370,32358,36715   30817,32561,37380";

// split the line on white space
let mut iter = line.split_whitespace();
let chrom = iter.next().unwrap();

// Parse the start and end of the transcription region into a RangeSetBlaze
let trans_start: i32 = iter.next().unwrap().parse().unwrap();
let trans_end: i32 = iter.next().unwrap().parse().unwrap();
let trans = RangeSetBlaze::from_iter([trans_start..=trans_end]);
assert_eq!(trans, RangeSetBlaze::from_iter([29370..=37380]));

// Parse the start and end of the exons into a RangeSetBlaze
let exon_starts = iter.next().unwrap().split(',').map(|s| s.parse::<i32>());
let exon_ends = iter.next().unwrap().split(',').map(|s| s.parse::<i32>());
let exon_ranges = exon_starts
    .zip(exon_ends)
    .map(|(s, e)| s.unwrap()..=e.unwrap());
let exons = RangeSetBlaze::from_iter(exon_ranges);
assert_eq!(exons, RangeSetBlaze::from_iter([29370..=30817, 32358..=32561, 36715..=37380]));

// Use 'set difference' to find the introns
let intron = trans - exons;
assert_eq!(intron, RangeSetBlaze::from_iter([30818..=32357, 32562..=36714]));
for range in intron.ranges() {
    let (start, end) = range.into_inner();
    println!("{chrom}\t{start}\t{end}");
}

Modules§

prelude
This prelude module provides a convenient way to import the most commonly used types, traits, and functions.

Macros§

intersection_dyn
Intersects zero or more SortedDisjoint iterators, creating a new SortedDisjoint iterator. The input iterators need not be of the same type.
intersection_map_dyn
Intersects zero or more SortedDisjointMap iterators, creating a new SortedDisjointMap iterator. The input iterators need not be of the same type.
symmetric_difference_dyn
Computes the symmetric difference of zero or more SortedDisjoint iterators, creating a new SortedDisjoint iterator. The input iterators need not be of the same type.
symmetric_difference_map_dyn
Computes the symmetric difference of zero or more SortedDisjointMap iterators, creating a new SortedDisjointMap iterator. The input iterators need not be of the same type.
union_dyn
Unions zero or more SortedDisjoint iterators, creating a new SortedDisjoint iterator. The input iterators need not be of the same type.
union_map_dyn
Unions zero or more SortedDisjointMap iterators, creating a new SortedDisjointMap iterator. The input iterators need not be of the same type.

Structs§

AssumePrioritySortedStartsMap
Used internally by UnionIterMap and SymDiffIterMap.
AssumeSortedStarts
Gives any iterator of ranges the SortedStarts trait without any checking.
CheckSortedDisjoint
Gives the SortedDisjoint trait to any iterator of ranges. The iterator will panic if/when it finds that the ranges are not actually sorted and disjoint.
CheckSortedDisjointMap
Gives the SortedDisjointMap trait to any iterator of range-value pairs. Will panic if the trait is not satisfied.
DynSortedDisjoint
Gives SortedDisjoint iterators a uniform type. Used by the union_dyn, etc. macros to give all their input iterators the same type.
DynSortedDisjointMap
Gives SortedDisjointMap iterators a uniform type. Used by the union_map_dyn, etc. macros to give all their input iterators the same type.
IntersectionIterMap
This struct is created by the intersection and map_and_set_intersection methods on SortedDisjointMap. See the methods’ documentation for more.
IntoIter
An iterator over the integer elements of a RangeSetBlaze. Double-ended.
IntoIterMap
An iterator over the integer elements of a RangeMapBlaze. Double-ended.
IntoKeys
An iterator over the integer elements of a RangeMapBlaze. Double-ended.
IntoRangeValuesIter
This struct is created by the into_range_values method on RangeMapBlaze. See into_range_values’s documentation for more. Double-ended.
IntoRangesIter
This struct is created by the into_ranges method on RangeSetBlaze. See into_ranges’s documentation for more. Double-ended.
IntoValues
An iterator over the values of a RangeMapBlaze. Double-ended.
Iter
An iterator over the integer elements of a RangeSetBlaze. Double-ended.
IterMap
An iterator over the integer elements of a RangeMapBlaze. Double-ended.
KMerge
Used internally by UnionIter and SymDiffIter.
KMergeMap
Used internally by UnionIterMap and SymDiffIterMap.
Keys
An iterator over the integer elements of a RangeMapBlaze. Double-ended.
MapIntoRangesIter
This struct is created by the into_ranges method on RangeMapBlaze. See into_ranges’s documentation for more.
MapRangesIter
This struct is created by the ranges method on RangeMapBlaze. See ranges’s documentation for more.
Merge
Used internally by UnionIter and SymDiffIter.
MergeMap
Used internally by UnionIterMap and SymDiffIterMap.
NotIter
The output of SortedDisjoint::complement and SortedDisjointMap::complement_with.
RangeMapBlaze
A map from integers to values stored as a map of sorted & disjoint ranges to values.
RangeOnce
RangeOnce is an iterator which emits a single RangeInclusive value before fusing.
RangeSetBlaze
A set of integers stored as sorted & disjoint ranges.
RangeValuesIter
This struct is created by the range_values method on RangeMapBlaze. See range_values’s documentation for more. Double-ended.
RangeValuesToRangesIter
This struct is used internally.
RangesIter
This struct is created by the ranges method on RangeSetBlaze. See ranges’s documentation for more. Double-ended.
RogsIterDeprecated
Experimental: This struct is created by the rogs_range method on RangeSetBlaze. See rogs_range for more information.
SymDiffIter
This struct is created by the symmetric_difference method on SortedDisjoint. See symmetric_difference’s documentation for more.
SymDiffIterMap
This struct is created by the symmetric_difference method on SortedDisjointMap. See symmetric_difference’s documentation for more.
UnionIter
This struct is created by the union method on SortedStarts. See union’s documentation for more.
UnionIterMap
This struct is created by the union method. See union’s documentation for more.
Values
An iterator over the values of a RangeMapBlaze. Double-ended.

Enums§

RogDeprecated
Experimental: Represents an range or gap in a RangeSetBlaze.
UIntPlusOne
Represents values from 0 to u128::MAX + 1 (inclusive).

Traits§

Integer
Represents elements that can be used within RangeSetBlaze and as keys in RangeMapBlaze.
IntoString
Converts the implementing type into a String by consuming it.
MultiwayRangeMapBlaze
Provides methods on zero or more RangeMapBlaze’s, specifically union, intersection and symmetric_difference.
MultiwayRangeMapBlazeRef
Provides methods on zero or more RangeMapBlaze references, specifically union, intersection and symmetric_difference.
MultiwayRangeSetBlaze
Provides methods on zero or more RangeSetBlaze’s, specifically union, intersection and symmetric_difference.
MultiwayRangeSetBlazeRef
Provides methods on zero or more RangeSetBlaze references, specifically union, intersection, and symmetric_difference.
MultiwaySortedDisjoint
Provides methods on zero or more SortedDisjoint iterators, specifically union, intersection, and symmetric_difference.
MultiwaySortedDisjointMap
Provides methods on zero or more SortedDisjointMap iterators, specifically union, intersection, and symmetric_difference.
SortedDisjoint
Marks iterators that provide ranges that are sorted by start and disjoint. Set operations on iterators that implement this trait can be performed in linear time.
SortedDisjointMap
Marks iterators that provide (range, value) pairs that are sorted and disjoint. Set operations on iterators that implement this trait can be performed in linear time.
SortedStarts
Used internally. Marks iterators that provide ranges sorted by start, but that are not necessarily disjoint. The ranges are non-empty.
SortedStartsMap
Used internally. Marks iterators that provide (range, value) pairs that are sorted by the range’s start, but that are not necessarily disjoint.
ValueRef
A trait for cloneable references to Eq + Clone values, used by the SortedDisjointMap trait.