The Vortex Scan API implements an abstract table scan interface that can be used to read data from various data sources.
It supports arbitrary projection expressions, filter expressions, and limit pushdown as well as mechanisms for parallel and distributed execution via partitions.
The API is under active development and may change in future releases. We hope to stabilize it into a stable C ABI for use in foreign-language bindings.
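To make the terms above concrete, here is a minimal sketch of what filter, projection, and limit pushdown mean for a single in-memory batch of rows. It is not the Vortex API: `Row`, `ToyScanRequest`, `execute`, and `example` are hypothetical names invented for illustration, and plain closures stand in for real expressions.

```rust
// Hypothetical sketch only: none of these types belong to the Vortex API.

/// A toy row with two columns.
#[derive(Debug, Clone, PartialEq)]
struct Row {
    id: u64,
    name: String,
}

/// Hypothetical stand-in for a scan request.
struct ToyScanRequest {
    /// Filter: keep only rows for which this returns true.
    filter: Box<dyn Fn(&Row) -> bool>,
    /// Projection: compute the output value for each surviving row.
    projection: Box<dyn Fn(&Row) -> String>,
    /// Limit: stop after this many output rows, if set.
    limit: Option<usize>,
}

/// Applies the filter, then the projection, then the limit to a slice of rows.
fn execute(request: &ToyScanRequest, rows: &[Row]) -> Vec<String> {
    rows.iter()
        .filter(|row| (request.filter)(*row))
        .map(|row| (request.projection)(row))
        .take(request.limit.unwrap_or(usize::MAX))
        .collect()
}

/// Builds five rows and scans them for the upper-cased names of odd ids, capped at two.
fn example() -> Vec<String> {
    let rows: Vec<Row> = ["ann", "bob", "cat", "dan", "eve"]
        .iter()
        .enumerate()
        .map(|(i, name)| Row { id: i as u64 + 1, name: name.to_string() })
        .collect();
    let request = ToyScanRequest {
        filter: Box::new(|row: &Row| row.id % 2 == 1),
        projection: Box::new(|row: &Row| row.name.to_uppercase()),
        limit: Some(2),
    };
    execute(&request, &rows)
}
```

Because the iterator chain is lazy, `take` stops pulling rows once the limit is reached, which is the effect limit pushdown aims for: the source does no work for rows the caller will never see.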
§Open Issues
- We probably want to make the DataSource serializable as well, so that we can share source-level state with workers, separately from partition serialization.
- We should add a way for the client to negotiate capabilities with the data source, for example which encodings it knows about.
Structs§
- ScanRequest - A request to scan a data source.
Traits§
- DataSource - A data source represents a streamable dataset that can be scanned with projection and filter expressions. Each scan produces partitions that can be executed in parallel to read data. Each partition can be serialized for remote execution.
- DataSourceOpener - Opens a Vortex DataSource from a URI.
- DataSourceRemote - Supports deserialization of a Vortex DataSource on a remote worker.
- DataSourceScan - A data source scan produces partitions that can be executed to read data from the source.
- Partition - A partition represents a unit of work that can be executed to produce a stream of arrays.
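The relationship between these traits can be sketched with the standard library alone. The example below is an assumption-laden illustration, not the real trait definitions: `ToyPartition`, `RangePartition`, `ToySource`, `deserialize`, and `execute_parallel` are hypothetical names, a partition returns a `Vec<i64>` rather than a stream of arrays, and the byte encoding is invented for the example.

```rust
// Hypothetical sketch only: these are not the Vortex trait definitions.
use std::sync::Arc;
use std::thread;

/// Hypothetical unit of work produced by a scan.
trait ToyPartition: Send + Sync {
    /// Runs the partition and returns its values.
    fn execute(&self) -> Vec<i64>;
    /// Encodes the partition so that a remote worker could rebuild it.
    fn serialize(&self) -> Vec<u8>;
}

/// A partition covering the half-open row range `start..end` of shared data.
struct RangePartition {
    data: Arc<Vec<i64>>,
    start: usize,
    end: usize,
}

impl ToyPartition for RangePartition {
    fn execute(&self) -> Vec<i64> {
        self.data[self.start..self.end].to_vec()
    }

    fn serialize(&self) -> Vec<u8> {
        // Invented wire format: two little-endian u64 values, start then end.
        let mut bytes = Vec::with_capacity(16);
        bytes.extend_from_slice(&(self.start as u64).to_le_bytes());
        bytes.extend_from_slice(&(self.end as u64).to_le_bytes());
        bytes
    }
}

/// Reads a little-endian u64 at `offset`, or returns None if the input is too short.
fn read_u64(bytes: &[u8], offset: usize) -> Option<u64> {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(bytes.get(offset..offset + 8)?);
    Some(u64::from_le_bytes(buf))
}

/// Rebuilds a partition on a worker that holds its own handle to the data.
fn deserialize(bytes: &[u8], data: Arc<Vec<i64>>) -> Option<RangePartition> {
    let start = read_u64(bytes, 0)? as usize;
    let end = read_u64(bytes, 8)? as usize;
    if start <= end && end <= data.len() {
        Some(RangePartition { data, start, end })
    } else {
        None
    }
}

/// Hypothetical source that splits its rows into fixed-size partitions.
struct ToySource {
    data: Arc<Vec<i64>>,
    partition_size: usize,
}

impl ToySource {
    /// Produces one partition per chunk of `partition_size` rows.
    fn scan(&self) -> Vec<Arc<dyn ToyPartition>> {
        let len = self.data.len();
        let size = self.partition_size.max(1); // step_by panics on a zero step
        (0..len)
            .step_by(size)
            .map(|start| {
                let end = (start + size).min(len);
                let data = Arc::clone(&self.data);
                Arc::new(RangePartition { data, start, end }) as Arc<dyn ToyPartition>
            })
            .collect()
    }
}

/// Runs each partition on its own thread and concatenates results in partition order.
fn execute_parallel(partitions: Vec<Arc<dyn ToyPartition>>) -> Vec<i64> {
    let handles: Vec<_> = partitions
        .into_iter()
        .map(|partition| thread::spawn(move || partition.execute()))
        .collect();
    handles
        .into_iter()
        .flat_map(|handle| handle.join().expect("partition thread panicked"))
        .collect()
}
```

Only the row range is serialized in this sketch, so the worker must already be able to reach the data. That gap is one way to read the first open issue above: source-level state has to travel to workers by some route other than partition serialization.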
Type Aliases§
- DataSourceRef - A reference-counted data source.
- DataSourceScanRef - A boxed data source scan.
- PartitionRef - A reference-counted partition.
- PartitionStream - A sendable stream of partitions.