Crate lindel

Expand description

Welcome to lindel, the linearisation and delinearisation crate! In here you will find functions implementing encoding and decoding operations for Z-indexing and Hilbert indexing, which help linearise data-points while preserving locality.

§Usage

As far as primitive integers are concerned, this crate offers functions that can work either as methods (eg x.z_index()) or as stand-alone functions (eg morton_encode(x)). All methods are defined within the Lineariseable trait.

The program essentially offers two 1-on-1 mappings between arrays and integers; if the array data-type is known, the integer type is selected automatically. For encoding operations, the array data-type is the input, which means that the compiler automatically knows which data-type the output should be. For decoding operations, however, the same integer can be decoded to arrays of different sizes; for that reason, the user will need to signify the output data-type somehow.

Please find illustrating examples below:

§Z-indexing

let input = [5u8, 4, 12, 129]; // Input is known to be [u8; 4]
let z_index_1 = morton_encode(input); // Function style
let z_index_2 = input.z_index(); // Method style
assert_eq!(z_index_1, 268447241);
assert_eq!(z_index_1, z_index_2);
 
let reassembled_input_1: [u8; 4] = morton_decode(z_index_1);
// Output's data-type must be provided by the user
 
let reassembled_input_2 = <[u8; 4]>::from_z_index(z_index_2);
// Specifying the output data-type with method style is a little more involved.
 
assert_eq!(input, reassembled_input_1);
assert_eq!(input, reassembled_input_2);

§Hilbert indexing

let input = [0xDEADBEEFu32, 0xFACADE5]; // Input is known to be [u32; 2]
let hilbert_index_1 = hilbert_encode(input); // Function style
let hilbert_index_2 = input.hilbert_index(); // Method style
assert_eq!(hilbert_index_1, 17414049806762354884);
assert_eq!(hilbert_index_1, hilbert_index_2);
 
let reassembled_input_1: [u32; 2] = hilbert_decode(hilbert_index_1);
// Output's data-type must be provided by the user

let reassembled_input_2 = <[u32; 2]>::from_hilbert_index(hilbert_index_2);
// Specifying the output data-type with method style is a little more involved.

assert_eq!(input, reassembled_input_1);
assert_eq!(input, reassembled_input_2);

§Morton encoding (Z-indexing)

This crate re-exports everything from the morton_encoding crate for this operation; the user is cordially invited to look there for further information.

§Hilbert encoding

§Algorithm details

The code for the Hilbert encoding is based on an algorithm by John Skilling, as implemented by Paul Chernoch.

§Implementation details

Our code is in essence a much-needed refactoring of mr Chernoch’s implementation, clarifying the original source code by a lot, leveraging the type-system to help with correctness, and improving the efficiency of the code. To wit:

Instead of accepting arbitrarily-sized slices as input, we accept arrays, to help ensure that each key comes from the encoding of the same amount of dimensions.
Since the total amount of bits in the input is now known statically, our implementation outputs primitive integers instead of dynamically-sized BigUints. This also allows it to work in no_std environments.
Seeing as we’ve already implemented morton_encoding, we leverage it to perform what mr Chernoch calls “transposition” and what mr Skilling doesn’t even deign to mention is necessary.
Given that the nalgebra crate exists, we opted to just implement this crate’s linearisation methods for nalgebra’s Point data-types rather than re-implement them as mr Chernoch already did, so as to avail ourselves of nalgebra’s static correctness and sheer performance.

§Performance characteristics

Skilling’s algorithm has the disadvantage of only examining one bit at a time, but as a result it manages to avoid the very expensive computation of orientation that other algorithms have to perform. We don’t know what the theoretical fastest is, if any; this algorithm was selected on an “eh, good enough” basis. We’d like to bench-mark it against other algorithms in the future.

§Note about leading zeros

In contrast to Morton encoding, Hilbert encoding has the quirk of being dependent on the amount of leading zeros. As a result, because the amount of operations done by Skilling’s algorithm is linerarly dependent on the amount of bits examined, it’s imperative that one doesn’t waste time examining useless bits. It is the solution to this problem that presents a difference between the code we found and the one that ended up in our implementation.

Messrs Skilling and Chernoch solved this problem by taking the amount of bits to be examined as a parametre to the function. This was crucially important to them, because their implementation only accepts u32s as its input, which would otherwise mean 32 bits to examine irrespective of the magnitude of the input.

Our solution, on the other hand, can accept coordinates as small as u8s. As a result, any data-set which is statically known to contain small enough numbers can simply be modelled with a smaller coordinate data-type, solving the biggest part of this problem in one fell swoop. Nonetheless, we also examine the leading zeros of our coordinates, and skip any leading zeros that come in groups of D, where D the amount of dimensions; this outputs the exact same results as one would get by examining all bits from the beginning, and allows us to avoid taking the amount of bits as a parametre. Nonetheless, we admittedly haven’t benchmarked the cost of zero-counting to compare it to the other costs, because such micro-optimisations were deemed beyond the scope of this crate.

§Nalgebra

Hidden behind the nalg feature, so as to avoid dragging unnecessary dependencies to people who don’t need them, is the nalgebra_points module, which offers all of those methods for Points. Please refer there for more information, although there isn’t much to be said.

§Compact Hilbert encoding

Implemented by Chris Hamilton and copied with permission and gratitude into our own code. Information on implementation details and performance characteristics will have to wait until mr Hamilton can explain as much.

Modules§

compact_encoding: Implementation of Compact Hilbert Indices by Chris Hamilton, and (in the near future) of Compact Z-indices by the crate maintainer.
nalgebra_points: A few traits and functions for linearising nalgebra’s Point data-types.
new_uints: A macro that creates a new [usize; N] data-type that behaves as an integer for the purposes of accepting a new Key.

Macros§

create_lineariseable_data_type

Traits§

IdealKey: Given a data type and a number, it yields the smallest unsigned integer that’s at least N times larger.
Lineariseable: General trait for lineariseable data-types. Those are generally arrays, or things that behave like arrays, such as geometric points.
ValidKey: Guarantees that a given data structure is suitable for a Morton Key.

Functions§

hilbert_decode: A free function for decoding a Hilbert Index into an array of primitive integers.
hilbert_encode: A free function for creating the Hilbert Index of an array of primitive integers.
morton_decode: The most ergonomic way to perform the Morton decoding operation.
morton_encode: The most ergonomic way to perform the Morton encoding operation.

Crate lindelCopy item path