Crate datafusion_row

This module contains code to translate arrays to and from a row-based format. The row-based format is backed by raw bytes ([u8]) and is used to optimize certain operations.
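To make the idea of translating between columns and raw row bytes concrete, here is a minimal sketch using only the standard library. The layout (one validity byte followed by a little-endian i32 and i64), the constant `ROW_WIDTH`, and the functions `columns_to_rows` and `rows_to_columns` are all hypothetical names chosen for illustration; they are not the layout or API of this crate.

```rust
/// Hypothetical fixed-width row: one validity byte, then an i32 and an i64,
/// both little-endian. This is NOT the layout used by datafusion_row.
const ROW_WIDTH: usize = 1 + 4 + 8;

/// Pack two columns (column-major) into one contiguous byte buffer (row-major).
fn columns_to_rows(a: &[Option<i32>], b: &[Option<i64>]) -> Vec<u8> {
    assert_eq!(a.len(), b.len(), "columns must have equal length");
    let mut out = Vec::with_capacity(a.len() * ROW_WIDTH);
    for (x, y) in a.iter().zip(b) {
        // Bit 0 marks column `a` as valid, bit 1 marks column `b`.
        let validity = (x.is_some() as u8) | ((y.is_some() as u8) << 1);
        out.push(validity);
        // Nulls still occupy a zero-filled slot so every row has the same width.
        out.extend_from_slice(&x.unwrap_or(0).to_le_bytes());
        out.extend_from_slice(&y.unwrap_or(0).to_le_bytes());
    }
    out
}

/// Unpack the byte buffer produced by `columns_to_rows` back into two columns.
fn rows_to_columns(rows: &[u8]) -> (Vec<Option<i32>>, Vec<Option<i64>>) {
    assert_eq!(rows.len() % ROW_WIDTH, 0, "buffer must hold whole rows");
    let mut a = Vec::new();
    let mut b = Vec::new();
    for row in rows.chunks_exact(ROW_WIDTH) {
        let validity = row[0];
        let mut x_bytes = [0u8; 4];
        x_bytes.copy_from_slice(&row[1..5]);
        let mut y_bytes = [0u8; 8];
        y_bytes.copy_from_slice(&row[5..13]);
        // Only surface a value when its validity bit is set.
        a.push(if validity & 1 != 0 { Some(i32::from_le_bytes(x_bytes)) } else { None });
        b.push(if validity & 2 != 0 { Some(i64::from_le_bytes(y_bytes)) } else { None });
    }
    (a, b)
}
```

Because every row in this sketch has the same width, row `i` starts at byte offset `i * ROW_WIDTH`, so a single tuple can be read or updated in place without touching the other rows.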

In general, DataFusion uses a so-called “vectorized” execution model: it relies on the optimized calculation kernels in arrow to amortize dispatch overhead.

However, as mentioned in this paper, there are some “row oriented” operations in a database that are not typically amenable to vectorization. The “classics” are hash table updates in joins and hash aggregates, as well as tuple comparisons in sorting and merging.
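The following sketch illustrates why such operations are naturally row oriented: each input tuple triggers its own hash table update, and the key for that update is the tuple's group-by values laid out as contiguous bytes. The functions `encode_key` and `hash_aggregate`, and the (region, year) schema, are hypothetical and only use the standard library; they do not reflect how this crate encodes rows.

```rust
use std::collections::HashMap;

/// Hypothetical key encoding: concatenate the group-by values of one tuple
/// into bytes. Not the datafusion_row format.
fn encode_key(region: u32, year: u16) -> Vec<u8> {
    let mut key = Vec::with_capacity(6);
    // Big-endian keeps byte-wise ordering equal to numeric ordering for unsigned ints.
    key.extend_from_slice(&region.to_be_bytes());
    key.extend_from_slice(&year.to_be_bytes());
    key
}

/// Sum `amount` per (region, year) group, with one hash table update per input row.
fn hash_aggregate(region: &[u32], year: &[u16], amount: &[i64]) -> HashMap<Vec<u8>, i64> {
    assert!(region.len() == year.len() && year.len() == amount.len());
    let mut groups = HashMap::new();
    for i in 0..region.len() {
        // The whole tuple is hashed and compared as a single byte string.
        *groups.entry(encode_key(region[i], year[i])).or_insert(0) += amount[i];
    }
    groups
}
```

With unsigned fields written big-endian, comparing two encoded keys byte by byte gives the same result as comparing the original tuples field by field, which is the property a sort or merge over row bytes would rely on. Signed or variable-length fields would need a different encoding to preserve that ordering.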

Re-exports§

pub use layout::row_supported;

Modules§

accessor
RowAccessor provides read, write, and modify access to rows whose fields are all fixed-size
layout
Various row layouts for different use cases
reader
read_as_batch converts raw bytes to RecordBatch
writer
RowWriter writes RecordBatches to Vec<u8> to stitch attributes together

Macros§

fn_add_idx
fn_get_idx
fn_get_idx_opt
fn_set_idx
get_idx
set_idx

Structs§

MutableRecordBatch
Columnar batch buffer that assists in creating RecordBatches