Expand description
Welcome to lindel
, the linearisation and delinearisation crate! In here you will find functions implementing encoding and decoding operations for Z-indexing and Hilbert indexing, which help linearise data-points while preserving locality.
§Usage
As far as primitive integers are concerned, this crate offers functions that can work either as methods (eg x.z_index()
) or as stand-alone functions (eg morton_encode(x)
). All methods are defined within the Lineariseable
trait.
The program essentially offers two 1-on-1 mappings between arrays and integers; if the array data-type is known, the integer type is selected automatically. For encoding operations, the array data-type is the input, which means that the compiler automatically knows which data-type the output should be. For decoding operations, however, the same integer can be decoded to arrays of different sizes; for that reason, the user will need to signify the output data-type somehow.
Please find illustrating examples below:
§Z-indexing
let input = [5u8, 4, 12, 129]; // Input is known to be [u8; 4]
let z_index_1 = morton_encode(input); // Function style
let z_index_2 = input.z_index(); // Method style
assert_eq!(z_index_1, 268447241);
assert_eq!(z_index_1, z_index_2);
let reassembled_input_1: [u8; 4] = morton_decode(z_index_1);
// Output's data-type must be provided by the user
let reassembled_input_2 = <[u8; 4]>::from_z_index(z_index_2);
// Specifying the output data-type with method style is a little more involved.
assert_eq!(input, reassembled_input_1);
assert_eq!(input, reassembled_input_2);
§Hilbert indexing
let input = [0xDEADBEEFu32, 0xFACADE5]; // Input is known to be [u32; 2]
let hilbert_index_1 = hilbert_encode(input); // Function style
let hilbert_index_2 = input.hilbert_index(); // Method style
assert_eq!(hilbert_index_1, 17414049806762354884);
assert_eq!(hilbert_index_1, hilbert_index_2);
let reassembled_input_1: [u32; 2] = hilbert_decode(hilbert_index_1);
// Output's data-type must be provided by the user
let reassembled_input_2 = <[u32; 2]>::from_hilbert_index(hilbert_index_2);
// Specifying the output data-type with method style is a little more involved.
assert_eq!(input, reassembled_input_1);
assert_eq!(input, reassembled_input_2);
§Morton encoding (Z-indexing)
This crate re-exports everything from the morton_encoding
crate for this operation; the user is cordially invited to look there for further information.
§Hilbert encoding
§Algorithm details
The code for the Hilbert encoding is based on an algorithm by John Skilling, as implemented by Paul Chernoch.
§Implementation details
Our code is in essence a much-needed refactoring of mr Chernoch’s implementation, clarifying the original source code by a lot, leveraging the type-system to help with correctness, and improving the efficiency of the code. To wit:
- Instead of accepting arbitrarily-sized slices as input, we accept arrays, to help ensure that each key comes from the encoding of the same amount of dimensions.
- Since the total amount of bits in the input is now known statically, our implementation outputs primitive integers instead of dynamically-sized
BigUint
s. This also allows it to work inno_std
environments. - Seeing as we’ve already implemented
morton_encoding
, we leverage it to perform what mr Chernoch calls “transposition” and what mr Skilling doesn’t even deign to mention is necessary. - Given that the
nalgebra
crate exists, we opted to just implement this crate’s linearisation methods fornalgebra
’sPoint
data-types rather than re-implement them as mr Chernoch already did, so as to avail ourselves ofnalgebra
’s static correctness and sheer performance.
§Performance characteristics
Skilling’s algorithm has the disadvantage of only examining one bit at a time, but as a result it manages to avoid the very expensive computation of orientation that other algorithms have to perform. We don’t know what the theoretical fastest is, if any; this algorithm was selected on an “eh, good enough” basis. We’d like to bench-mark it against other algorithms in the future.
§Note about leading zeros
In contrast to Morton encoding, Hilbert encoding has the quirk of being dependent on the amount of leading zeros. As a result, because the amount of operations done by Skilling’s algorithm is linerarly dependent on the amount of bits examined, it’s imperative that one doesn’t waste time examining useless bits. It is the solution to this problem that presents a difference between the code we found and the one that ended up in our implementation.
Messrs Skilling and Chernoch solved this problem by taking the amount of bits to be examined as a parametre to the function. This was crucially important to them, because their implementation only accepts u32
s as its input, which would otherwise mean 32 bits to examine irrespective of the magnitude of the input.
Our solution, on the other hand, can accept coordinates as small as u8
s. As a result, any data-set which is statically known to contain small enough numbers can simply be modelled with a smaller coordinate data-type, solving the biggest part of this problem in one fell swoop. Nonetheless, we also examine the leading zeros of our coordinates, and skip any leading zeros that come in groups of D
, where D
the amount of dimensions; this outputs the exact same results as one would get by examining all bits from the beginning, and allows us to avoid taking the amount of bits as a parametre. Nonetheless, we admittedly haven’t benchmarked the cost of zero-counting to compare it to the other costs, because such micro-optimisations were deemed beyond the scope of this crate.
§Nalgebra
Hidden behind the nalg
feature, so as to avoid dragging unnecessary dependencies to people who don’t need them, is the
nalgebra_points
module, which offers all of those methods for Point
s.
Please refer there for more information, although there isn’t much to be said.
§Compact Hilbert encoding
Implemented by Chris Hamilton and copied with permission and gratitude into our own code. Information on implementation details and performance characteristics will have to wait until mr Hamilton can explain as much.
Modules§
- compact_
encoding - Implementation of Compact Hilbert Indices by Chris Hamilton, and (in the near future) of Compact Z-indices by the crate maintainer.
- nalgebra_
points - A few traits and functions for linearising
nalgebra
’sPoint
data-types. - new_
uints - A macro that creates a new
[usize; N]
data-type that behaves as an integer for the purposes of accepting a newKey
.
Macros§
Traits§
- Ideal
Key - Given a data type and a number, it yields the smallest unsigned
integer that’s at least
N
times larger. - Lineariseable
- General trait for lineariseable data-types. Those are generally arrays, or things that behave like arrays, such as geometric points.
- Valid
Key - Guarantees that a given data structure is suitable for a Morton Key.
Functions§
- hilbert_
decode - A free function for decoding a Hilbert Index into an array of primitive integers.
- hilbert_
encode - A free function for creating the Hilbert Index of an array of primitive integers.
- morton_
decode - The most ergonomic way to perform the Morton decoding operation.
- morton_
encode - The most ergonomic way to perform the Morton encoding operation.