Module api

The Vortex Scan API implements an abstract table scan interface that can be used to read data from various data sources.

It supports arbitrary projection expressions, filter expressions, and limit pushdown, as well as mechanisms for parallel and distributed execution via partitions.

The API is under active development and may change in future releases; we hope to stabilize it as a C ABI for use in foreign-language bindings.
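
To make the pushdown idea above concrete, here is a minimal, self-contained sketch in which a request carries a projection, a filter, and a limit, and the source applies all three itself instead of returning every row. All names here (`Row`, `ToyScanRequest`, `toy_scan`, `is_even_id`) are hypothetical stand-ins invented for illustration; they are not the Vortex types, and real projection and filter expressions are far richer than a list of column names and a function pointer.

```rust
// Hypothetical stand-ins for illustration only; not the Vortex API.
#[derive(Debug, Clone, PartialEq)]
struct Row {
    id: i64,
    score: i64,
}

/// Toy request: which columns to return, which rows to keep, and how many.
struct ToyScanRequest {
    projection: Vec<&'static str>,
    filter: Option<fn(&Row) -> bool>,
    limit: Option<usize>,
}

/// Applies the filter, then the limit, then the projection inside the "source",
/// so rows that are filtered out or beyond the limit are never materialized.
fn toy_scan(rows: &[Row], req: &ToyScanRequest) -> Vec<Vec<i64>> {
    rows.iter()
        .filter(|r| req.filter.map_or(true, |f| f(r)))
        .take(req.limit.unwrap_or(usize::MAX))
        .map(|r| {
            req.projection
                .iter()
                .map(|col| match *col {
                    "id" => r.id,
                    "score" => r.score,
                    other => panic!("unknown column: {}", other),
                })
                .collect()
        })
        .collect()
}

/// Example predicate: keep rows whose id is even.
fn is_even_id(r: &Row) -> bool {
    r.id % 2 == 0
}

fn main() {
    let rows: Vec<Row> = (0..10).map(|i| Row { id: i, score: i * 10 }).collect();
    let req = ToyScanRequest {
        projection: vec!["score"],
        filter: Some(is_even_id as fn(&Row) -> bool),
        limit: Some(3),
    };
    // Rows with ids 0, 2, and 4 survive; only their `score` column is returned.
    println!("{:?}", toy_scan(&rows, &req));
}
```

Because the iterator is lazy, the limit stops the scan after the third matching row rather than after reading the whole input, which is the point of pushing the limit into the source.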

§Open Issues

  • We probably want to make the DataSource serializable as well, so that we can share source-level state with workers, separately from partition serialization.
  • We should add a way for the client to negotiate capabilities with the data source, for example which encodings it knows about.

Structs§

ScanRequest
A request to scan a data source.

Traits§

DataSource
A data source represents a streamable dataset that can be scanned with projection and filter expressions. Each scan produces partitions that can be executed in parallel to read data. Each partition can be serialized for remote execution.
DataSourceOpener
Opens a Vortex DataSource from a URI.
DataSourceRemote
Supports deserialization of a Vortex DataSource on a remote worker.
DataSourceScan
A data source scan produces partitions that can be executed to read data from the source.
Partition
A partition represents a unit of work that can be executed to produce a stream of arrays.
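
The relationship described by the traits above (a source is scanned into partitions, partitions run in parallel, and a partition can be serialized for a remote worker) can be sketched with toy types. This is an assumption-laden illustration using only the standard library: `ToyDataSource`, `ToyPartition`, `RangeSource`, `RangePartition`, `deserialize_partition`, and `parallel_sum` are hypothetical names, the byte layout is invented, and the real traits produce streams of arrays rather than `Vec<u64>`.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical unit of work (not the Vortex `Partition` trait).
trait ToyPartition: Send + Sync {
    /// Produces the values covered by this partition.
    fn execute(&self) -> Vec<u64>;
    /// Encodes the partition so another worker could rebuild it.
    fn serialize(&self) -> Vec<u8>;
}

/// Hypothetical data source that splits itself into partitions.
trait ToyDataSource {
    fn scan(&self, partition_size: u64) -> Vec<Arc<dyn ToyPartition>>;
}

/// A half-open range `[start, end)` of row indices.
struct RangePartition {
    start: u64,
    end: u64,
}

impl ToyPartition for RangePartition {
    fn execute(&self) -> Vec<u64> {
        (self.start..self.end).collect()
    }
    fn serialize(&self) -> Vec<u8> {
        // Invented layout: two little-endian u64 values, start then end.
        let mut bytes = self.start.to_le_bytes().to_vec();
        bytes.extend_from_slice(&self.end.to_le_bytes());
        bytes
    }
}

/// Rebuilds a partition from bytes, as a remote worker might.
fn deserialize_partition(bytes: &[u8]) -> Option<RangePartition> {
    let mut start = [0u8; 8];
    let mut end = [0u8; 8];
    start.copy_from_slice(bytes.get(0..8)?);
    end.copy_from_slice(bytes.get(8..16)?);
    Some(RangePartition {
        start: u64::from_le_bytes(start),
        end: u64::from_le_bytes(end),
    })
}

struct RangeSource {
    len: u64,
}

impl ToyDataSource for RangeSource {
    fn scan(&self, partition_size: u64) -> Vec<Arc<dyn ToyPartition>> {
        let size = partition_size.max(1); // avoid a zero step
        (0..self.len)
            .step_by(size as usize)
            .map(|start| {
                let end = (start + size).min(self.len);
                Arc::new(RangePartition { start, end }) as Arc<dyn ToyPartition>
            })
            .collect()
    }
}

/// Executes every partition on its own thread and sums all produced values.
fn parallel_sum(partitions: Vec<Arc<dyn ToyPartition>>) -> u64 {
    let handles: Vec<_> = partitions
        .into_iter()
        .map(|p| thread::spawn(move || p.execute().iter().sum::<u64>()))
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("worker panicked"))
        .sum()
}

fn main() {
    let source = RangeSource { len: 10 };
    let partitions = source.scan(4);
    println!("partitions: {}", partitions.len());

    // Round-trip the last partition through bytes, as if shipping it to a worker.
    let bytes = partitions[2].serialize();
    let rebuilt = deserialize_partition(&bytes).expect("16 bytes expected");
    println!("rebuilt: {:?}", rebuilt.execute());

    println!("sum: {}", parallel_sum(partitions));
}
```

The sketch keeps the partition self-describing, so the bytes alone are enough to rebuild it; sharing source-level state with workers is a separate concern that this example does not address.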

Type Aliases§

DataSourceRef
A reference-counted data source.
DataSourceScanRef
A boxed data source scan.
PartitionRef
A reference-counted partition.
PartitionStream
A sendable stream of partitions.
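
As a rough illustration of what "reference-counted" and "boxed" aliases over trait objects usually look like in Rust, the sketch below defines two toy aliases. The alias shapes are an assumption (`Arc<dyn Trait>` for shared ownership, `Box<dyn Trait>` for unique ownership); the page does not state the actual definitions, and every name here (`ToySource`, `ToyScan`, `ToySourceRef`, `ToyScanRef`, `open_scan`) is hypothetical.

```rust
use std::sync::Arc;

/// Hypothetical shared, immutable source.
trait ToySource: Send + Sync {
    fn name(&self) -> &str;
}

/// Hypothetical stateful scan that hands out batch numbers.
trait ToyScan: Send {
    fn next_batch(&mut self) -> Option<u32>;
}

// Assumed alias shapes: shared ownership for sources, unique ownership for scans.
type ToySourceRef = Arc<dyn ToySource>;
type ToyScanRef = Box<dyn ToyScan>;

struct NamedSource(String);

impl ToySource for NamedSource {
    fn name(&self) -> &str {
        &self.0
    }
}

struct CountingScan {
    next: u32,
    total: u32,
}

impl ToyScan for CountingScan {
    fn next_batch(&mut self) -> Option<u32> {
        if self.next < self.total {
            self.next += 1;
            Some(self.next - 1)
        } else {
            None
        }
    }
}

/// Opens a scan with `total` batches; the source is only borrowed here.
fn open_scan(_source: &ToySourceRef, total: u32) -> ToyScanRef {
    Box::new(CountingScan { next: 0, total })
}

fn main() {
    let source: ToySourceRef = Arc::new(NamedSource("demo".to_string()));
    let shared = Arc::clone(&source); // cheap: bumps the reference count
    println!("{} has {} owners", shared.name(), Arc::strong_count(&source));

    let mut scan = open_scan(&source, 2);
    while let Some(batch) = scan.next_batch() {
        println!("batch {}", batch);
    }
}
```

A shared handle suits a source that many scans read from, while a box suits a scan that owns mutable progress state and has a single consumer.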