The Vortex Scan API implements an abstract table scan interface that can be used to read data from various data sources.
It supports arbitrary projection expressions, filter expressions, and limit pushdown as well as mechanisms for parallel and distributed execution via splits.
The API is under development and may change in future releases. We hope to stabilize it as a C ABI for use in foreign-language bindings.
§Open Issues
- We probably want to make the DataSource serializable as well, so that we can share source-level state with workers, separately from split serialization.
- We should add a way for the client to negotiate capabilities with the data source, for example which encodings it knows about.
Structs§
- ScanRequest - A request to scan a data source.
Enums§
- Estimate - An estimate that can be exact, an upper bound, or unknown.
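To illustrate how a three-state estimate of this kind can be consumed, here is a minimal standalone sketch. It does not reproduce the crate's actual definition: the variant names `Exact`, `UpperBound`, and `Unknown`, the `u64` payload, and the `upper_bound` and `combine` helpers are all assumptions made for the example.

```rust
/// Hypothetical stand-in for an estimate type; the variant names and the
/// `u64` payload are assumptions, not the crate's real definition.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Estimate {
    /// The value is known exactly.
    Exact(u64),
    /// The true value is at most this large.
    UpperBound(u64),
    /// Nothing is known.
    Unknown,
}

impl Estimate {
    /// Largest value the estimate permits, if any bound is known.
    fn upper_bound(self) -> Option<u64> {
        match self {
            Estimate::Exact(n) | Estimate::UpperBound(n) => Some(n),
            Estimate::Unknown => None,
        }
    }

    /// Combine the estimates of two disjoint pieces of work.
    /// The sum is exact only if both inputs are exact, and unknown
    /// as soon as either input is unknown.
    fn combine(self, other: Estimate) -> Estimate {
        match (self, other) {
            (Estimate::Exact(a), Estimate::Exact(b)) => Estimate::Exact(a.saturating_add(b)),
            _ => match (self.upper_bound(), other.upper_bound()) {
                (Some(a), Some(b)) => Estimate::UpperBound(a.saturating_add(b)),
                _ => Estimate::Unknown,
            },
        }
    }
}
```

Under these assumptions, a planner could sum per-split estimates with `combine` and fall back to a conservative strategy whenever the result is `Unknown`.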
Traits§
- DataSource - A data source represents a streamable dataset that can be scanned with projection and filter expressions. Each scan produces splits that can be executed (potentially in parallel) to read data. Each split can be serialized for remote execution.
- DataSourceProvider - Create a Vortex source from serialized configuration.
- DataSourceScan - A data source scan produces splits that can be executed to read data from the source.
- Split - A split represents a unit of work that can be executed to produce a stream of arrays.
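The relationship between a source, a scan, and its splits can be sketched with a simplified, synchronous model that uses only the standard library. This is not the crate's API: the trait methods below (`scan`, `execute`), the fields of the request, and the `RangeSource`, `RangeSplit`, and `run_scan` names are hypothetical, the filter is a plain closure rather than an expression, and each split returns a `Vec<i64>` instead of a stream of arrays.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical filter type standing in for a filter expression.
type Filter = Arc<dyn Fn(i64) -> bool + Send + Sync>;

/// Hypothetical request: a filter plus an optional row limit.
struct ScanRequest {
    filter: Filter,
    limit: Option<usize>,
}

/// A unit of work that can be executed independently of other splits.
trait Split: Send + Sync {
    fn execute(&self) -> Vec<i64>;
}

/// A dataset that can be scanned; each scan yields a list of splits.
trait DataSource {
    fn scan(&self, request: &ScanRequest) -> Vec<Arc<dyn Split>>;
}

/// Toy source producing the integers `0..len`, chunked into splits.
struct RangeSource {
    len: i64,
    split_size: i64,
}

struct RangeSplit {
    start: i64,
    end: i64,
    filter: Filter,
}

impl Split for RangeSplit {
    fn execute(&self) -> Vec<i64> {
        // The filter is applied inside the split, i.e. it is "pushed down".
        (self.start..self.end).filter(|v| (self.filter)(*v)).collect()
    }
}

impl DataSource for RangeSource {
    fn scan(&self, request: &ScanRequest) -> Vec<Arc<dyn Split>> {
        let step = self.split_size.max(1); // guard against a zero-sized split
        let mut splits: Vec<Arc<dyn Split>> = Vec::new();
        let mut start = 0;
        while start < self.len {
            let end = (start + step).min(self.len);
            splits.push(Arc::new(RangeSplit { start, end, filter: Arc::clone(&request.filter) }));
            start = end;
        }
        splits
    }
}

/// Execute every split on its own thread and concatenate results in split order.
/// The limit is applied after collection here; this toy does not push it down.
fn run_scan(source: &dyn DataSource, request: &ScanRequest) -> Vec<i64> {
    let handles: Vec<_> = source
        .scan(request)
        .into_iter()
        .map(|split| thread::spawn(move || split.execute()))
        .collect();
    let mut out = Vec::new();
    for handle in handles {
        out.extend(handle.join().expect("split thread panicked"));
    }
    if let Some(limit) = request.limit {
        out.truncate(limit);
    }
    out
}
```

In this model, scanning a `RangeSource { len: 10, split_size: 4 }` yields three splits. Because each split owns everything it needs (its range and a shared filter), it could in principle be serialized and shipped to a remote worker instead of a local thread.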
Type Aliases§
- DataSourceRef - A reference-counted data source.
- DataSourceScanRef - A boxed data source scan.
- SplitRef - A reference-counted split.
- SplitStream - A stream of splits.