Expand description
§Big Lehmer
Big Lehmer is a lib for converting between arbitrarily sized sequence permutations into compact Lehmer codes and back to their uncompressed representation.
The number sequence must have similar properties as [0.N].shuffle. Basically sequential numbers in random order, starting at zero. The lib technically supports up to u32::MAX numbers, but performance will be the main issue beforehand.
§Usage
fn main() {
let sequence = [7, 2, 0, 6, 5, 1, 4, 3];
let lehmer_code = big_lehmer::encode(&sequence).unwrap();
let mut roundtrip = [0; 8];
big_lehmer::decode(&lehmer_code, &mut roundtrip).unwrap();
assert_eq!(sequence, roundtrip);
}§Benchmarks:
Measured on my “old system” (i7- 6700k). Not very accurate, just to showcase performance expectations.
| Sequence length | Lehmer code size | encode time | decode time |
|---|---|---|---|
| 512 | 4485 bytes | 470.70µs | 107.40µs |
| 10_000 | 14808 bytes = 15 KB | 2.40ms | 4.11ms |
| 100_000 | 189588 bytes = ~190 KB | 61.21ms | 205.44ms |
| 1_000_000 | 2311111 bytes = ~2.2 MiB | 2.15s | 11.39s |
The crate is mainly optimized for large sequences.
Performance for large sequences is dominated by the big integer math. A possible optimization is to replace Dashu with rug. Apparently rug (=GMP) is extremely well optimized, but it’s not native rust and not trivial to get working.
Functions§
- decode
- Decodes a Lehmer code generated by
big_lehmer::encode
theresultslice must have the same length as the sequence that was used to create the code - encode
- Encodes the number sequence into a Lehmer code.
Approximate size of the code can be computed withbig_lehmer::get_encode_sizeThe code can later be decoded withbig_lehmer::decode - get_
encode_ size - Estimate bounded byte size of the Lehmer code. Bit size = log2(N!) Byte size = Ceil(log2(N!) / 8)