Crate vortex_scan

The Vortex Scan API implements an abstract table scan interface that can be used to read data from various data sources.

It supports arbitrary projection and filter expressions, limit pushdown, and parallel or distributed execution via partitions.
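To make these terms concrete, here is a minimal, self-contained sketch of a scan that applies a filter, a projection, and a limit across several partitions. It is a toy model under stated assumptions, not the crate's API: `ToyScanRequest`, `ToyPartition`, `scan_partition`, and `scan` are hypothetical names, rows are plain `Vec<i64>` values, and the filter is a simple function pointer rather than a Vortex expression.

```rust
// Toy model only: these names are hypothetical and are NOT the vortex_scan API.
type Row = Vec<i64>;

/// What the caller asks for: which columns, which rows, and how many.
struct ToyScanRequest {
    projection: Vec<usize>,   // column indices to keep, in output order
    filter: fn(&Row) -> bool, // row predicate
    limit: Option<usize>,     // maximum number of rows to return
}

/// A unit of work; here simply an in-memory batch of rows.
struct ToyPartition {
    rows: Vec<Row>,
}

/// Scan one partition, returning at most `remaining` projected rows.
fn scan_partition(part: &ToyPartition, req: &ToyScanRequest, remaining: usize) -> Vec<Row> {
    part.rows
        .iter()
        .filter(|&row| (req.filter)(row)) // filter before projecting
        .take(remaining) // stop as soon as the limit is met
        .map(|row| req.projection.iter().map(|&col| row[col]).collect())
        .collect()
}

/// Scan partitions in order, skipping later ones once the limit is reached.
fn scan(parts: &[ToyPartition], req: &ToyScanRequest) -> Vec<Row> {
    let mut out: Vec<Row> = Vec::new();
    for part in parts {
        let remaining = match req.limit {
            Some(limit) => limit.saturating_sub(out.len()),
            None => usize::MAX,
        };
        if remaining == 0 {
            break; // limit pushdown: no need to touch remaining partitions
        }
        out.extend(scan_partition(part, req, remaining));
    }
    out
}
```

Because the limit is checked before each partition is read, the sketch never opens partitions it does not need, which is the point of pushing the limit into the scan instead of truncating the result afterwards.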

The API is currently under development and may change in future releases. We hope to stabilize it as a C ABI for use in foreign-language bindings.

If you are looking to scan Vortex files or layouts, the Vortex implementation of the Scan API can be found in the vortex-layout crate.

§Open Issues

  • We probably want to make the DataSource serializable as well, so that we can share source-level state with workers, separately from partition serialization.
  • We should add a way for the client to negotiate capabilities with the data source, for example which encodings it knows about.

Modules§

row_mask
A mask over a range of rows.
selection
Defines a selection mask over a scan.
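As a rough illustration of what "a mask over a range of rows" can mean, the sketch below pairs a starting row index with one boolean per row. This is an assumption made for illustration: `ToyRowMask` and its methods are hypothetical, and the actual types in `row_mask` and `selection` may be represented quite differently.

```rust
// Hypothetical illustration; not the type defined in vortex_scan::row_mask.
struct ToyRowMask {
    begin: u64,      // absolute index of the first row covered by the mask
    mask: Vec<bool>, // mask[i] is true if row `begin + i` is selected
}

impl ToyRowMask {
    /// Exclusive end of the covered row range.
    fn end(&self) -> u64 {
        self.begin + self.mask.len() as u64
    }

    /// True if no row in the range is selected, so the range can be skipped.
    fn is_all_false(&self) -> bool {
        !self.mask.iter().any(|&keep| keep)
    }

    /// Absolute indices of the selected rows, in ascending order.
    fn selected_rows(&self) -> Vec<u64> {
        self.mask
            .iter()
            .enumerate()
            .filter_map(|(i, &keep)| keep.then_some(self.begin + i as u64))
            .collect()
    }
}
```

A scan could use a check like `is_all_false` to avoid reading a row range at all when an earlier filter has already ruled out every row in it.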

Structs§

ScanRequest
A request to scan a data source.

Traits§

DataSource
A data source represents a streamable dataset that can be scanned with projection and filter expressions. Each scan produces partitions that can be executed in parallel to read data. Each partition can be serialized for remote execution.
DataSourceOpener
Opens a Vortex DataSource from a URI.
DataSourceRemote
Supports deserialization of a Vortex DataSource on a remote worker.
DataSourceScan
A data source scan produces partitions that can be executed to read data from the source.
Partition
A partition represents a unit of work that can be executed to produce a stream of arrays.
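The relationship between these traits can be sketched as follows: a source produces partitions for a scan, each partition can be executed independently (here on its own thread), and a partition can be serialized and rebuilt elsewhere. This is a simplified model under stated assumptions, not the crate's actual signatures: `ToyDataSource`, `ToyPartition`, `RangeSource`, `RangePartition`, `deserialize`, and `execute_parallel` are hypothetical names, and the sketch uses blocking threads and `Vec<i64>` where the real API describes streams of arrays.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical stand-ins for the roles described above. The real trait
// signatures in vortex_scan are not shown on this page and will differ.
trait ToyPartition: Send + Sync {
    /// Execute the partition and produce its rows.
    fn execute(&self) -> Vec<i64>;
    /// Encode the partition so another worker could rebuild and run it.
    fn serialize(&self) -> Vec<u8>;
}

trait ToyDataSource {
    /// Plan a scan that keeps values >= `min_value`, split into partitions.
    fn scan(&self, min_value: i64) -> Vec<Arc<dyn ToyPartition>>;
}

/// A partition over the half-open range `start..end`.
struct RangePartition {
    start: i64,
    end: i64,
    min_value: i64,
}

impl ToyPartition for RangePartition {
    fn execute(&self) -> Vec<i64> {
        (self.start..self.end).filter(|v| *v >= self.min_value).collect()
    }

    fn serialize(&self) -> Vec<u8> {
        // Three little-endian i64 fields, 24 bytes in total.
        [self.start, self.end, self.min_value]
            .iter()
            .flat_map(|v| v.to_le_bytes())
            .collect()
    }
}

/// Rebuild a partition from bytes, as a remote worker would.
fn deserialize(bytes: &[u8]) -> Option<RangePartition> {
    if bytes.len() != 24 {
        return None;
    }
    let field = |i: usize| {
        let mut buf = [0u8; 8];
        buf.copy_from_slice(&bytes[i * 8..(i + 1) * 8]);
        i64::from_le_bytes(buf)
    };
    Some(RangePartition { start: field(0), end: field(1), min_value: field(2) })
}

/// A source holding the values `0..len`; assumes `partition_size > 0`.
struct RangeSource {
    len: i64,
    partition_size: i64,
}

impl ToyDataSource for RangeSource {
    fn scan(&self, min_value: i64) -> Vec<Arc<dyn ToyPartition>> {
        (0..self.len)
            .step_by(self.partition_size as usize)
            .map(|start| {
                let end = (start + self.partition_size).min(self.len);
                Arc::new(RangePartition { start, end, min_value }) as Arc<dyn ToyPartition>
            })
            .collect()
    }
}

/// Run every partition on its own thread and concatenate results in order.
fn execute_parallel(parts: &[Arc<dyn ToyPartition>]) -> Vec<i64> {
    let handles: Vec<_> = parts
        .iter()
        .cloned()
        .map(|p| thread::spawn(move || p.execute()))
        .collect();
    handles
        .into_iter()
        .flat_map(|h| h.join().expect("partition thread panicked"))
        .collect()
}
```

Sharing partitions behind `Arc` mirrors the reference-counted aliases listed below, and keeping the filter inside each partition means a serialized partition carries everything a worker needs to execute it.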

Type Aliases§

DataSourceRef
A reference-counted data source.
DataSourceScanRef
A boxed data source scan.
PartitionRef
A reference-counted partition.
PartitionStream
A sendable stream of partitions.