Enum Canonical

pub enum Canonical {
    Null(NullArray),
    Bool(BoolArray),
    Primitive(PrimitiveArray),
    Decimal(DecimalArray),
    Struct(StructArray),
    List(ListArray),
    VarBinView(VarBinViewArray),
    Extension(ExtensionArray),
}

An enum capturing the default uncompressed encodings for each Vortex type.

Any array can be decoded into canonical form via the to_canonical trait method. Canonical form is the simplest encoding for a type: the array itself is not compressed, though its child arrays may be.

Canonical form is useful for type-specific compute that requires all elements to be laid out decompressed and contiguous in memory.
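To illustrate why a kernel might want contiguous, decompressed values, here is a minimal sketch using a toy model. `ToyArray`, its `RunLength` variant, and `max_adjacent_diff` are hypothetical stand-ins invented for this example; they are not Vortex types, and the real `to_canonical` signature may differ.

```rust
/// Hypothetical stand-in for an encoded integer array; not a Vortex type.
#[derive(Debug, Clone, PartialEq)]
enum ToyArray {
    /// Canonical: values stored contiguously.
    Primitive(Vec<i64>),
    /// Compressed: (value, run length) pairs.
    RunLength(Vec<(i64, usize)>),
}

impl ToyArray {
    /// Decode into a contiguous vector (a plain copy if already canonical).
    fn to_canonical(&self) -> Vec<i64> {
        match self {
            ToyArray::Primitive(values) => values.clone(),
            ToyArray::RunLength(runs) => runs
                .iter()
                .flat_map(|&(value, len)| std::iter::repeat(value).take(len))
                .collect(),
        }
    }
}

/// A kernel that needs neighbouring elements side by side in memory.
fn max_adjacent_diff(values: &[i64]) -> Option<i64> {
    values.windows(2).map(|w| (w[1] - w[0]).abs()).max()
}

fn main() {
    let flat = ToyArray::Primitive(vec![1, 5]);
    let encoded = ToyArray::RunLength(vec![(3, 2), (10, 1), (4, 3)]);

    // Both encodings end up as a contiguous slice the kernel can scan.
    println!("{:?}", max_adjacent_diff(&flat.to_canonical()));
    println!("{:?}", max_adjacent_diff(&encoded.to_canonical()));
}
```

The kernel is written once against a slice; every encoding only has to know how to produce that slice.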

§Laziness

Canonical form is not recursive, so while a StructArray is the canonical format for any Struct type, individual column child arrays may still be compressed. This allows compute over Vortex arrays to push decoding as late as possible; depending on the compute, many child arrays may never need to be decoded into canonical form at all.
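The non-recursive behaviour can be sketched with a toy model. Everything here (`Toy`, `Constant`, `column_sum`, and the decode counter) is hypothetical and exists only to illustrate the idea; none of it is Vortex API.

```rust
use std::cell::Cell;

/// Hypothetical toy encodings; these are not Vortex types.
#[derive(Debug, Clone, PartialEq)]
enum Toy {
    /// Canonical flat integers.
    Primitive(Vec<i64>),
    /// Compressed: one value repeated `len` times.
    Constant { value: i64, len: usize },
    /// Canonical struct: named columns, each in any encoding.
    Struct(Vec<(&'static str, Toy)>),
}

impl Toy {
    /// Looks only at the outermost encoding, never at children.
    fn is_canonical(&self) -> bool {
        matches!(self, Toy::Primitive(_) | Toy::Struct(_))
    }

    /// Shallow decode: only the outermost encoding is canonicalized.
    /// `decodes` counts how often real decoding work happens.
    fn to_canonical(&self, decodes: &Cell<usize>) -> Toy {
        match self {
            Toy::Constant { value, len } => {
                decodes.set(decodes.get() + 1);
                Toy::Primitive(vec![*value; *len])
            }
            // Already canonical at the top level; children are left alone.
            other => other.clone(),
        }
    }
}

/// Sum one column, decoding that column and nothing else.
fn column_sum(table: &Toy, name: &str, decodes: &Cell<usize>) -> Option<i64> {
    let Toy::Struct(fields) = table.to_canonical(decodes) else {
        return None;
    };
    let (_, column) = fields.into_iter().find(|(n, _)| *n == name)?;
    match column.to_canonical(decodes) {
        Toy::Primitive(values) => Some(values.iter().sum()),
        _ => None,
    }
}

fn main() {
    let decodes = Cell::new(0);
    let table = Toy::Struct(vec![
        ("a", Toy::Constant { value: 7, len: 4 }),
        ("b", Toy::Constant { value: 1, len: 1000 }),
    ]);
    // The struct counts as canonical even though both children are compressed.
    println!("table canonical: {}", table.is_canonical());
    println!("sum(a) = {:?}", column_sum(&table, "a", &decodes));
    // Only column "a" was decoded; column "b" stayed compressed.
    println!("decodes = {}", decodes.get());
}
```

In this sketch the 1000-element column `b` is never materialized, which is the saving that late decoding is meant to provide.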

§Arrow interoperability

All of the Vortex canonical encodings have an equivalent Arrow encoding that can be built zero-copy, and the corresponding Arrow array types can also be built directly.

The full list of canonical types and their equivalent Arrow array types is:

Vortex uses a logical type system, whereas Arrow ties each of its types to a physical encoding. For example, there are at least six valid physical encodings for a Utf8 array, which creates ambiguity: if you receive an Arrow array, compress it using Vortex, and later decompress it to pass to a compute kernel, several Arrow array variants could hold the data.

To disambiguate, we choose a canonical physical encoding for every Vortex DType, each of which corresponds to an arrow-rs arrow_schema::DataType.

§Views support

Binary and String views, also known as “German strings”, are a better encoding format for nearly all use cases. Variable-length binary views are part of the Apache Arrow spec, and are fully supported by the DataFusion query engine. We use them as our canonical string encoding for all Utf8 and Binary typed arrays in Vortex. They provide considerably faster filter execution than the core StringArray and BinaryArray types, at the expense of potentially needing garbage collection to clear unreferenced items from memory.
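As a rough illustration of why views filter quickly, the sketch below packs values into 16-byte views following the layout in the Arrow spec: a 4-byte length, then either up to 12 inline bytes or a 4-byte prefix plus a buffer index and offset. `View`, `ViewArray`, and `filter_eq` are hypothetical names for this example only; this is not the VarBinViewArray implementation, and it assumes a single data buffer.

```rust
/// One 16-byte view. Hypothetical toy type, not the Vortex or arrow-rs one.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct View([u8; 16]);

/// Values up to this many bytes are stored inside the view itself.
const MAX_INLINE: usize = 12;

impl View {
    fn new(value: &[u8], buffer_index: u32, offset: u32) -> View {
        let mut bytes = [0u8; 16];
        bytes[0..4].copy_from_slice(&(value.len() as u32).to_le_bytes());
        if value.len() <= MAX_INLINE {
            // Short value: stored inline, zero padded.
            bytes[4..4 + value.len()].copy_from_slice(value);
        } else {
            // Long value: 4-byte prefix, then where to find the full bytes.
            bytes[4..8].copy_from_slice(&value[..4]);
            bytes[8..12].copy_from_slice(&buffer_index.to_le_bytes());
            bytes[12..16].copy_from_slice(&offset.to_le_bytes());
        }
        View(bytes)
    }

    fn byte_len(&self) -> usize {
        u32::from_le_bytes(self.0[0..4].try_into().unwrap()) as usize
    }
}

/// Views plus one shared data buffer (an assumption made for brevity).
struct ViewArray {
    views: Vec<View>,
    buffer: Vec<u8>,
}

impl ViewArray {
    fn from_strs(values: &[&str]) -> ViewArray {
        let mut views = Vec::with_capacity(values.len());
        let mut buffer = Vec::new();
        for value in values {
            let bytes = value.as_bytes();
            let offset = buffer.len() as u32;
            if bytes.len() > MAX_INLINE {
                buffer.extend_from_slice(bytes);
            }
            views.push(View::new(bytes, 0, offset));
        }
        ViewArray { views, buffer }
    }

    fn get(&self, i: usize) -> &[u8] {
        let view = &self.views[i];
        let len = view.byte_len();
        if len <= MAX_INLINE {
            &view.0[4..4 + len]
        } else {
            let offset = u32::from_le_bytes(view.0[12..16].try_into().unwrap()) as usize;
            &self.buffer[offset..offset + len]
        }
    }

    /// Indices equal to `needle`. The first 8 bytes of a view hold the length
    /// and the first 4 data bytes, so most mismatches never touch the buffer.
    fn filter_eq(&self, needle: &str) -> Vec<usize> {
        let probe = View::new(needle.as_bytes(), 0, 0);
        (0..self.views.len())
            .filter(|&i| {
                self.views[i].0[..8] == probe.0[..8] && self.get(i) == needle.as_bytes()
            })
            .collect()
    }
}

fn main() {
    let array = ViewArray::from_strs(&[
        "apple",
        "a considerably longer string",
        "apple",
        "application server",
    ]);
    println!("{:?}", array.filter_eq("apple"));
    println!("{:?}", array.filter_eq("application server"));
}
```

A filter built this way only needs to copy the selected views. The data buffer is left as it is, so bytes that no remaining view points to stay allocated until something compacts them, which is the garbage-collection cost mentioned above.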

Variants§

Implementations§

Trait Implementations§

impl AsRef<dyn Array> for Canonical

fn as_ref(&self) -> &(dyn Array + 'static)

Converts this type into a shared reference of the (usually inferred) input type.

impl Clone for Canonical

fn clone(&self) -> Canonical

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for Canonical

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl From<Canonical> for ArrayRef

fn from(value: Canonical) -> Self

Converts to this type from the input type.

impl IntoArray for Canonical

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T> Pointee for T

type Metadata = ()

The metadata type for pointers and references to this type.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<T> ErasedDestructor for T
where T: 'static,