Crate vortex_array

Vortex crate containing core logic for encoding and memory representation of arrays.

At the heart of Vortex are arrays.

Arrays are typed views of memory buffers that hold scalars. These buffers can be held in a number of physical encodings to perform lightweight compression that exploits the particular data distribution of the array’s values.

Every data type recognized by Vortex also has a canonical physical encoding format, which arrays can be canonicalized into for ease of access in compute functions.

§Core Handles

ArrayRef is the erased, shared handle used by most public APIs. It carries the logical DType, row count, encoding id, children, buffers, and statistics for an array tree. Use it when an API should accept any encoding.

Array<V> is the typed owned handle for a known encoding V: VTable. It wraps an ArrayRef and dereferences to the encoding-specific V::TypedArrayData.

ArrayView<V> is the lightweight typed borrow handed to vtable methods. It exposes both the shared ArrayRef metadata and the encoding-specific data without cloning the handle.

ArrayParts<V> is the construction boundary for typed arrays. It groups externally supplied logical metadata and encoding data, then Array::try_from_parts validates that they agree.
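
The relationship between an erased shared handle, a typed borrow, and a validated construction step can be sketched with a toy model. Everything below (`ToyEncoding`, `ToyPrimitive`, `ToyArrayRef`, `ToyParts`, `as_typed`) is hypothetical and written only against the standard library; it is not the vortex_array API and only illustrates the shape of the idea.

```rust
// Toy model only: hypothetical types, not the vortex_array API.
use std::any::Any;
use std::sync::Arc;

/// Hypothetical stand-in for encoding-specific array data.
trait ToyEncoding: Any + Send + Sync {
    fn id(&self) -> &'static str;
    fn as_any(&self) -> &dyn Any;
}

/// One concrete encoding: plain i32 values.
struct ToyPrimitive {
    values: Vec<i32>,
}

impl ToyEncoding for ToyPrimitive {
    fn id(&self) -> &'static str {
        "toy.primitive"
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// Erased, shared handle: cloning copies a pointer, not the data.
#[derive(Clone)]
struct ToyArrayRef {
    len: usize,
    data: Arc<dyn ToyEncoding>,
}

/// Construction boundary: externally supplied metadata plus encoding data.
struct ToyParts {
    len: usize,
    data: ToyPrimitive,
}

impl ToyArrayRef {
    /// Validate that the declared metadata agrees with the encoding data.
    fn try_from_parts(parts: ToyParts) -> Result<ToyArrayRef, String> {
        let actual = parts.data.values.len();
        if parts.len != actual {
            return Err(format!(
                "declared len {} but data holds {} values",
                parts.len, actual
            ));
        }
        Ok(ToyArrayRef {
            len: parts.len,
            data: Arc::new(parts.data),
        })
    }

    /// Typed borrow: succeeds only if the erased data really is an `E`.
    fn as_typed<E: ToyEncoding>(&self) -> Option<&E> {
        self.data.as_any().downcast_ref::<E>()
    }
}
```

In this sketch, code that accepts any encoding takes a `ToyArrayRef`, while code that needs the concrete values calls `as_typed::<ToyPrimitive>()` and handles the `None` case.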

§Logical Types and Physical Encodings

A DType describes the logical values an array may hold. It does not describe the memory layout. For example, a DType::Primitive(I32, Nullable) can be stored as a canonical PrimitiveArray, a dictionary, a slice, or a compressed external encoding.

The Canonical enum names the default uncompressed encoding for each logical family. Execution normally moves an array tree toward canonical form, but canonicalization is shallow: children of canonical struct/list arrays may still be encoded.
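
As a rough illustration of one logical type backed by several physical encodings, the following toy model decodes each encoding into the same canonical form. The names `ToyDType`, `ToyArray`, and `canonicalize` are assumptions made for this sketch and do not correspond to vortex_array items.

```rust
// Toy model only: hypothetical types, not the vortex_array API.

/// Logical type: what values the array holds, not how they are stored.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ToyDType {
    I32,
}

/// Physical encodings that can all represent the same logical i32 column.
#[derive(Debug, Clone, PartialEq)]
enum ToyArray {
    /// Canonical form: one plain value per row.
    Primitive(Vec<i32>),
    /// A single value repeated `len` times.
    Constant { value: i32, len: usize },
    /// Small codes indexing into a table of distinct values.
    Dict { codes: Vec<u8>, values: Vec<i32> },
}

impl ToyArray {
    /// Every encoding reports the same logical type.
    fn dtype(&self) -> ToyDType {
        ToyDType::I32
    }

    /// Row count, independent of how many values are physically stored.
    fn len(&self) -> usize {
        match self {
            ToyArray::Primitive(values) => values.len(),
            ToyArray::Constant { len, .. } => *len,
            ToyArray::Dict { codes, .. } => codes.len(),
        }
    }

    /// Decode into the canonical (uncompressed) representation.
    fn canonicalize(&self) -> Vec<i32> {
        match self {
            ToyArray::Primitive(values) => values.clone(),
            ToyArray::Constant { value, len } => vec![*value; *len],
            ToyArray::Dict { codes, values } => {
                codes.iter().map(|&code| values[code as usize]).collect()
            }
        }
    }
}
```

A dictionary-encoded column and a plain column with the same rows compare equal after `canonicalize`, even though their in-memory layouts differ.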

§Built-in, Lazy, and Experimental Arrays

Built-in arrays live in arrays. Some are canonical (PrimitiveArray, StructArray, VarBinViewArray); others are utility or lazy arrays such as ChunkedArray, ConstantArray, FilterArray, SliceArray, and ScalarFnArray. Lazy arrays defer work so compute kernels can operate on encoded data or prune children before materialization.
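
To make the idea of deferred work concrete, here is a minimal sketch of a lazy slice node that can be simplified from metadata alone, without reading any values. The `ToyArray` type and its `reduce` method are hypothetical and stand in for the general technique; they are not how SliceArray or the Vortex optimizer are implemented.

```rust
// Toy model only: hypothetical types, not the vortex_array API.

#[derive(Debug, Clone, PartialEq)]
enum ToyArray {
    Primitive(Vec<i32>),
    Constant { value: i32, len: usize },
    /// Lazy slice: records the range [start, stop) instead of copying values.
    Slice { child: Box<ToyArray>, start: usize, stop: usize },
}

impl ToyArray {
    fn len(&self) -> usize {
        match self {
            ToyArray::Primitive(values) => values.len(),
            ToyArray::Constant { len, .. } => *len,
            ToyArray::Slice { start, stop, .. } => stop - start,
        }
    }

    /// Read one value through any number of lazy layers.
    fn scalar_at(&self, index: usize) -> i32 {
        match self {
            ToyArray::Primitive(values) => values[index],
            ToyArray::Constant { value, .. } => *value,
            ToyArray::Slice { child, start, .. } => child.scalar_at(start + index),
        }
    }

    /// One metadata-only rewrite step: no value buffer is touched.
    fn reduce(self) -> ToyArray {
        match self {
            ToyArray::Slice { child, start, stop } => match *child {
                // Slicing a constant yields a shorter constant.
                ToyArray::Constant { value, .. } => ToyArray::Constant {
                    value,
                    len: stop - start,
                },
                // Two nested slices collapse into one with shifted bounds.
                ToyArray::Slice { child: inner, start: base, .. } => ToyArray::Slice {
                    child: inner,
                    start: base + start,
                    stop: base + stop,
                },
                // Nothing to simplify: keep the lazy slice as it was.
                other => ToyArray::Slice {
                    child: Box::new(other),
                    start,
                    stop,
                },
            },
            other => other,
        }
    }
}
```

In this sketch, slicing a million-row constant never allocates a million values: the slice is rewritten into a ten-row constant purely from its bounds.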

Experimental arrays are public because they are used inside Vortex, but their storage contracts may still move. Prefer the higher-level constructors and accessors documented on each array module rather than relying on child slot order.

§Nulls and Scalars

Validity separates nullness from values. It can be a cheap constant state (NonNullable, AllValid, AllInvalid) or a boolean array that may itself be encoded. Scalar is the single-value counterpart: it pairs a DType with an optional ScalarValue.
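
The split between nullness and values can be sketched as follows. The variant names mirror the states listed above, but `ToyValidity`, `ToyScalar`, and `ToyColumn` are hypothetical simplifications (for example, the array-backed case is a plain `Vec<bool>` here rather than an encoded boolean array), not the vortex_array types.

```rust
// Toy model only: hypothetical types, not the vortex_array API.

/// Nullness tracked separately from the value buffer.
#[derive(Debug, Clone, PartialEq)]
enum ToyValidity {
    /// The type does not admit nulls at all.
    NonNullable,
    /// Nullable type, but every row is present.
    AllValid,
    /// Nullable type, and every row is null.
    AllInvalid,
    /// Per-row flags: `true` means the row holds a value.
    Array(Vec<bool>),
}

impl ToyValidity {
    fn is_valid(&self, index: usize) -> bool {
        match self {
            ToyValidity::NonNullable | ToyValidity::AllValid => true,
            ToyValidity::AllInvalid => false,
            ToyValidity::Array(flags) => flags[index],
        }
    }

    fn is_nullable(&self) -> bool {
        !matches!(self, ToyValidity::NonNullable)
    }
}

/// Single-value counterpart: a (simplified) type plus an optional value.
#[derive(Debug, Clone, PartialEq)]
struct ToyScalar {
    nullable: bool,
    value: Option<i32>,
}

/// A column is a value buffer plus its validity.
struct ToyColumn {
    values: Vec<i32>,
    validity: ToyValidity,
}

impl ToyColumn {
    /// Extract one row, consulting validity before reading the value.
    fn scalar_at(&self, index: usize) -> ToyScalar {
        let value = if self.validity.is_valid(index) {
            Some(self.values[index])
        } else {
            None
        };
        ToyScalar {
            nullable: self.validity.is_nullable(),
            value,
        }
    }
}
```

The constant states cost nothing per row, which is why they are worth distinguishing from the array-backed case.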

§Extending Vortex

New array encodings implement VTable, usually through the local array_slots! and vtable! patterns used by built-ins. The important extension contracts are:

New logical extension dtypes implement ExtVTable and store values in an ordinary Vortex storage dtype.
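
The idea of an extension dtype layered over an ordinary storage dtype can be illustrated with a small toy type. `ToyDType`, its `Ext` variant, and the `"toy.date"` identifier are invented for this sketch; the real ExtVTable contract is documented in the extension module and is not reproduced here.

```rust
// Toy model only: hypothetical types, not the vortex_array API.

#[derive(Debug, Clone, PartialEq)]
enum ToyDType {
    I32,
    I64,
    /// A named logical type whose values are physically stored as `storage`.
    Ext { id: &'static str, storage: Box<ToyDType> },
}

impl ToyDType {
    /// Strip extension wrappers to find the dtype the values are stored in.
    fn storage(&self) -> &ToyDType {
        match self {
            ToyDType::Ext { storage, .. } => storage.storage(),
            other => other,
        }
    }

    /// The extension identifier, if this is an extension dtype.
    fn ext_id(&self) -> Option<&'static str> {
        match self {
            ToyDType::Ext { id, .. } => Some(*id),
            _ => None,
        }
    }
}
```

Under this model a hypothetical date type adds meaning on top of `I32` storage, so any encoding that already handles `I32` can hold its values unchanged.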

Re-exports§

pub use smallvec;

Modules§

aggregate_fn
Aggregate function vtable machinery.
arrays
Built-in array encodings.
arrow
Utilities to work with Arrow data and types.
buffer
builders
Builders for Vortex arrays.
builtins
A collection of built-in common scalar functions.
compute
display
dtype
A type system for Vortex.
expr
Vortex’s expression language: scalar operations over arrays.
extension
Extension types.
flatbuffers
Re-exported autogenerated code from the core Vortex flatbuffer definitions.
iter
Iterator over slices of an array, and related utilities.
kernel
Parent kernels: child-driven fused execution of parent arrays.
mask
matcher
memory
Session-scoped memory allocation for host-side buffers.
normalize
optimizer
The optimizer applies metadata-only rewrite rules (reduce and reduce_parent) in a fixpoint loop until no more transformations are possible.
patches
scalar
Scalar values and types for the Vortex system.
scalar_fn
Scalar function vtable machinery.
search_sorted
serde
session
stats
Traits and utilities to compute and access array statistics.
stream
test_harness
validity
Array validity and nullability behavior, used by arrays and compute functions.
variants
This module defines extension functionality specific to each Vortex DType.
vtable
This module contains the VTable definitions for a Vortex encoding.

Macros§

assert_arrays_eq
assert_nth_scalar
Asserts that the scalar at position $n in array $arr equals $expected.
assert_nth_scalar_is_null
Asserts that the scalar at position $n in array $arr is null.
field_path
A helpful constructor for creating FieldPaths to nested struct fields of the format field_path!(x.y.z)
match_each_decimal_value
Matches over each decimal value variant, binding the inner value to a variable.
match_each_decimal_value_type
Macro to match over each decimal value type, binding the corresponding native type (from DecimalType)
match_each_float_ptype
Macro to match over each floating point type, binding the corresponding native type (from NativePType)
match_each_integer_ptype
Macro to match over each integer PType, binding the corresponding native type (from NativePType)
match_each_native_ptype
Macro to match over each PType, binding the corresponding native type (from NativePType)
match_each_native_simd_ptype
Macro to match over each SIMD capable PType, binding the corresponding native type (from NativePType)
match_each_pvalue
Utility macro that makes it easy to write expressions generic over the different PValue variants.
match_each_signed_integer_ptype
Macro to match over each signed integer type, binding the corresponding native type (from NativePType)
match_each_unsigned_integer_ptype
Macro to match over each unsigned integer type, binding the corresponding native type (from NativePType)
match_smallest_offset_type
Macro to match the smallest offset type for a given value
require_child
Require that a child array matches $M. If the child already matches, returns the same array unchanged. Otherwise, early-returns an ExecutionResult requesting execution of child $idx until it matches $M.
require_opt_child
Like require_child!, but for optional children. If the child is None, this is a no-op. If the child is Some but does not match $M, early-returns an ExecutionResult requesting execution of child $idx.
require_patches
Require that patch slots (indices, values, and optionally chunk_offsets) are Primitive. If no patches are present (slots are None), this is a no-op.
require_validity
Require that the validity slot is a Bool array. If validity is not array-backed (e.g. NonNullable or AllValid), this is a no-op. If it is array-backed but not Bool, early-returns an ExecutionResult requesting execution of the validity slot.
search_sorted_conformance_6371033076136957257
Apply #macro_name template to given body

Structs§

AnyCanonical
A matcher for any canonical array type.
AnyColumnar
Array
A typed owned handle to an array.
ArrayParts
Construction parameters for typed arrays.
ArrayRef
A reference-counted pointer to a type-erased array.
ArrayView
A lightweight, Copy-able typed view into an ArrayRef.
CanonicalValidity
Recursively execute the array until it reaches canonical form along with its validity.
DepthFirstArrayIterator
A depth-first pre-order iterator over an Array.
EmptyArrayData
Empty array metadata struct for encodings with no per-array metadata.
EmptyMetadata
Empty array metadata
ExecutionCtx
Execution context for batch CPU compute.
ExecutionResult
The result of a single execution step on an array encoding.
MaskFuture
A future that resolves to a mask.
NotSupported
Placeholder type used to indicate when a particular vtable is not supported by the encoding.
ProstMetadata
A utility wrapper for Prost metadata serialization.
RawMetadata
A utility wrapper for raw metadata serialization. This delegates the serialization step to the arrays’ vtable.
RecursiveCanonical
Recursively execute the array until all of its children are canonical.
ValidityVTableFromChild
An implementation of the ValidityVTable for arrays that delegate validity entirely to a child array.
ValidityVTableFromChildSliceHelper
An implementation of the ValidityVTable for arrays that hold an unsliced validity and a slice into it.

Enums§

Canonical
An enum capturing the default uncompressed encodings for each Vortex type.
CanonicalView
A view into a canonical array type.
Columnar
Represents a columnar array of data, either in canonical form or as a constant array.
ColumnarView
EqMode
The equality mode for structural equality and hashing of arrays.
ExecutionStep
Scheduler step indicator returned alongside an array in ExecutionResult.

Statics§

LEGACY_SESSION

Traits§

ArrayEq
An equality trait for arrays that represents structural equality with a configurable equality mode. This trait is used primarily to implement common subtree elimination and other array-based caching mechanisms.
ArrayHash
A hash trait for arrays that represents structural equality with a configurable equality mode. This trait is used primarily to implement common subtree elimination and other array-based caching mechanisms.
ArrayPlugin
Registry trait for ID-based deserialization of arrays.
ArrayVTable
The array VTable encapsulates logic for an Array type within Vortex.
DeserializeMetadata
Trait for deserializing Vortex metadata from a vector of unaligned bytes.
DynArrayDataEq
A dynamic version of ArrayEq.
DynArrayDataHash
A dynamic version of ArrayHash.
Executable
Marker trait for types that an ArrayRef can be executed into.
IntoArray
Trait for converting a type into a Vortex ArrayRef.
OperationsVTable
Element-level operations for an array encoding.
SerializeMetadata
Trait for serializing Vortex metadata to a vector of unaligned bytes.
ToCanonicalDeprecated
Trait for types that can be converted from an owned type into an owned array variant.
TypedArrayRef
Shared bound for helpers that should work over both owned Array<V> and borrowed ArrayView<V>.
VTable
The array VTable encapsulates logic for an Array type within Vortex.
ValidityChild
Helper trait for encodings whose validity is exactly one child slot.
ValidityChildSliceHelper
Helper for encodings that keep an unsliced validity child plus a local slice range.
ValidityVTable
Validity access for nullable instances of an encoding.
VortexSessionExecute
Extension trait for creating an execution context from a session.

Functions§

array_session
Builds a fresh VortexSession registered with all of vortex-array’s built-in session variables: arrays, dtypes, scalar functions, stats, optimizer kernels, aggregate functions, Arrow conversion, and memory.
child_to_validity
Reconstruct a Validity from an optional child array and nullability.
execute_into_builder
Execute array into the given builder.
initialize
Register vortex-array’s built-in session-scoped kernels into the active ArrayKernels registry.
patches_child
Returns the child at the given index within a patches component.
patches_child_name
Returns the name of the child at the given index within a patches component.
patches_nchildren
Returns the number of children produced by patches.
unsupported_buffer_replacement
Reject buffer replacement for encodings whose exposed buffers are not runtime backing buffers.
validity_nchildren
Returns 1 if validity produces a child, 0 otherwise.
validity_to_child
Returns the validity as a child array if it produces one.
with_empty_buffers
Rebuild an array that has no top-level buffers.

Type Aliases§

ArrayContext
ArrayId
ArrayId is a globally unique name for the array’s vtable.
ArrayPluginRef
Reference-counted array plugin.
ArraySlots
The slots of an array: a collection of optional child arrays.
DonePredicate
A predicate that determines when an array has reached a desired form during execution.

Attribute Macros§

array_slots
Generate slot index constants, a borrowed view struct, and a typed ext trait from a slot struct definition.