Crate datafusion_datasource

A table that uses the ObjectStore listing capability to get the list of files to process.

Re-exports§

pub use self::file::as_file_source;
pub use self::url::ListingTableUrl;

Modules§

decoder
Module containing helper methods for the various file formats. See write.rs for write-related helper methods.
display
file
Common behaviors that every file format needs to implement
file_compression_type
File Compression type abstraction
file_format
Module containing helper methods for the various file formats. See write.rs for write-related helper methods.
file_groups
Logic for managing groups of PartitionedFiles in DataFusion
file_meta
file_scan_config
FileScanConfig to configure scanning of possibly partitioned file sources.
file_sink_config
file_stream
A generic stream over file format readers that can be used by any file format that reads its files from start to end.
memory
schema_adapter
SchemaAdapter and SchemaAdapterFactory to adapt file-level record batches to a table schema.
sink
Execution plan for writing data to DataSinks
source
DataSource and DataSourceExec
url
write
Module containing helper methods/traits related to enabling write support for the various file formats

Structs§

FileRange
Only scan a subset of Row Groups from the Parquet file whose data “midpoint” lies within the [start, end) byte offsets. This option can be used to scan non-overlapping sections of a Parquet file in parallel.
PartitionedFile
A single file or part of a file that should be read, along with its schema, statistics and partition column values that need to be appended to each row.
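
The midpoint rule described for FileRange can be illustrated with a minimal, standalone sketch. The ByteRange and RowGroup types and the select_row_groups function below are hypothetical stand-ins written for this example; they are not this crate's API, and the actual Parquet reader may compute the midpoint differently.

```rust
/// Hypothetical stand-in for a half-open byte range [start, end).
/// This is NOT the crate's `FileRange` type.
#[derive(Debug, Clone, Copy)]
struct ByteRange {
    start: u64,
    end: u64,
}

/// Hypothetical description of where a row group sits in a file.
#[derive(Debug, Clone, Copy)]
struct RowGroup {
    /// Byte offset of the first byte of the row group.
    offset: u64,
    /// Length of the row group in bytes.
    compressed_len: u64,
}

impl ByteRange {
    /// Returns true if the row group's midpoint falls in [start, end).
    fn contains_midpoint(&self, rg: &RowGroup) -> bool {
        let midpoint = rg.offset + rg.compressed_len / 2;
        self.start <= midpoint && midpoint < self.end
    }
}

/// Returns the indices of the row groups whose midpoint lies in `range`.
fn select_row_groups(range: ByteRange, row_groups: &[RowGroup]) -> Vec<usize> {
    row_groups
        .iter()
        .enumerate()
        .filter(|(_, rg)| range.contains_midpoint(rg))
        .map(|(index, _)| index)
        .collect()
}
```

Under these assumptions, three 100-byte row groups at offsets 0, 100 and 200 have midpoints 50, 150 and 250. The range [0, 150) selects only row group 0, and the range [150, 300) selects row groups 1 and 2. Because each midpoint falls in exactly one of the non-overlapping ranges, no row group is scanned twice or skipped.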

Enums§

RangeCalculation
Represents the possible outcomes of a range calculation.

Functions§

add_row_stats (Deprecated)
calculate_range
Calculates an appropriate byte range for reading from an object based on the provided metadata.
compute_all_files_statistics
Computes statistics for all files across multiple file groups.
generate_test_files
Generates test files with min-max statistics in different overlap patterns.
verify_sort_integrity
Used by tests and benchmarks

Type Aliases§

PartitionedFileStream
Stream of files listed from the object store