rle_vec
Usage
Put this in your Cargo.toml:
[]
= "0.2"
Then put this in your crate root:
extern crate rle_vec;
RleVec
This crate provides RleVec, a vector like structure that stores runs of identical values coded
by the value and the number of repeats.
If your data consists of long stretches of identical values is can be beneficial to only store the number of times each value occurs. This can result in significant space savings, but there is a cost. Accessing an arbitrary index requires a binary search over the stored runs resulting in a O(log n) complexity versus O(1) for a normal vector. Other complexities are in the table where n is equal to the number of runs, not the length of a comparable Vec.
|push|index |set with breaking a run|set without breaking a run|insert with breaking a run|insert without breaking a run|
--------|----|--------|-----------------------|--------------------------|--------------------------|-----------------------------|
RleVec|O(1)|O(log n)|O((log n) + 2n) |O(log n) |O((log n) + 2n) |O((log n) + n) |
Vec |O(1)|O(1) |O(1)* | |O(n) | |
- Benchmarks show that setting
vec[idx] = valueis a lot slower than gettingvec[idx]
The RleVec struct handles like a normal vector and supports a subset from the Vec methods.
Examples:
use RleVec;
let mut rle = new;
rle.push; rle.push; rle.push;
assert_eq!;
assert_eq!;
rle.insert;
assert_eq!;
rle.set;
assert_eq!;
RleVec can be constructed from Iterators and be iterated over just like a Vec.
use RleVec;
let v = vec!;
let mut rle: = v.into_iter.collect;
assert_eq!;
assert_eq!;
assert_eq!;
An RleVec can be indexed like a regular vector, but not mutated. Use RleVec::set to change the
value at an index.
use RleVec;
let v = vec!;
let mut rle: = v.into_iter.collect;
rle.set;
rle.insert;
assert_eq!;
RleVec::set and RleVec::insert require T: Clone.
Not all methods implemented on Vec are implemented for RleVec. All methods returning a slice
cannot work for RleVec.
Benchmarks
Cargo bench can be used to compare the real life difference of get/set/insert operations on a Vec
and RleVec. The test data is a vector of length 5000 that contains 1000 runs of length 5.
running 10 tests
test create_1000_runs_length_5 ... bench: 9,990 ns/iter (+/- 857)
test create_vec_1000_runs_length_5 ... bench: 7,553 ns/iter (+/- 385)
test index_100_from_test ... bench: 2,107 ns/iter (+/- 73)
test index_100_from_vec ... bench: 59 ns/iter (+/- 2)
test insert_100_vec ... bench: 50,248 ns/iter (+/- 1,945)
test insert_100_with_split ... bench: 57,934 ns/iter (+/- 4,008)
test insert_100_without_split ... bench: 27,416 ns/iter (+/- 801)
test set_100_vec ... bench: 7,453 ns/iter (+/- 286)
test set_100_with_split ... bench: 42,654 ns/iter (+/- 1,055)
test set_100_without_split ... bench: 12,094 ns/iter (+/- 234)
Inserting data is very competitive and can be faster if no run breaking is required. Indexing takes quite a big penalty, but mutable indexing is not that bad.