Expand description
§Cacheline Elias-Fano
CachelineEf
is an integer encoding that packs chunks of 44 sorted 40-bit values into a single
cacheline, using 64/44*8 = 11.6
bits per value.
Each chunk can hold increasing values in a range of length 256*84=21504
.
If this range is exceeded, CachelineEfVec::new
will panic while CachelineEf::try_new
will return None
.
CachelineEfVec
stores a vector of CachelineEf
and provides CachelineEfVec::index
and CachelineEfVec::prefetch
.
This encoding is efficient when consecutive values differ by roughly 100, where using
Elias-Fano directly on the full list would use around 9
bits/value.
The main benefit is that CachelineEf only requires reading a single cacheline per query, where Elias-Fano encoding usually needs 3 reads.
Epserde is supported when the epserde
feature flag is enabled.
§Layout
The layout is described in detail in the PtrHash paper (arXiv, blog version).
In summary:
- First store a 4 byte offset, corresponding to the 32 high bits of the smallest value.
- Then store for each of 44 values the 8 low bits.
- Lastly we have 16 bytes (128 bits) to encode the high parts.
For the i’th value
x[i]
, we set the bit at positioni+(x[i]/256 - x[0]/256)
to1
.
Structs§
- Cacheline
Ef - A single cacheline that holds 44 Elias-Fano encoded 40-bit values in a range of size
256*84=21504
. - Cacheline
EfVec - A vector of
CachelineEf
.