Crate datafusion_datasource


A table that uses the ObjectStore listing capability to get the list of files to process.

Re-exports§

pub use self::file::as_file_source;
pub use self::url::ListingTableUrl;
pub use table_schema::TableSchema;

Modules§

decoder
Module containing helper methods for the various file formats. See write.rs for write-related helper methods.
display
file
Common behaviors that every file format needs to implement
file_compression_type
File Compression type abstraction
file_format
Module containing helper methods for the various file formats. See write.rs for write-related helper methods.
file_groups
Logic for managing groups of PartitionedFiles in DataFusion
file_scan_config
FileScanConfig to configure scanning of possibly partitioned file sources.
file_sink_config
file_stream
A generic stream over file format readers that can be used by any file format that reads its files from start to end.
memory
morsel
Structures for Morsel Driven IO.
projection
schema_adapter
Deprecated: SchemaAdapter and SchemaAdapterFactory have been removed.
sink
Execution plan for writing data to DataSinks
source
DataSource and DataSourceExec
table_schema
Helper struct to manage table schemas with partition columns
url
write
Module containing helper methods/traits related to enabling write support for the various file formats

Structs§

FileRange
Only scan a subset of Row Groups from the Parquet file whose data “midpoint” lies within the [start, end) byte offsets. This option can be used to scan non-overlapping sections of a Parquet file in parallel.
PartitionedFile
A single file or part of a file that should be read, along with its schema, statistics and partition column values that need to be appended to each row.
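
The midpoint rule described for FileRange can be illustrated with a minimal, self-contained sketch. It does not use the crate's types: `ByteRange`, `RowGroup`, `select_row_groups`, and `split_ranges` are hypothetical names introduced here, and the row group offsets and lengths are made-up inputs. The point it shows is that when a file is split into contiguous, non-overlapping [start, end) ranges, each row group's midpoint falls in exactly one range, so parallel scans never read the same row group twice.

```rust
/// Hypothetical stand-in for a half-open byte range [start, end).
/// This is NOT the crate's `FileRange` type.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ByteRange {
    start: u64,
    end: u64,
}

/// Hypothetical row group description: starting byte offset and byte length.
struct RowGroup {
    offset: u64,
    len: u64,
}

/// Byte position halfway through the row group.
fn midpoint(rg: &RowGroup) -> u64 {
    rg.offset + rg.len / 2
}

/// Indices of the row groups whose midpoint lies within [start, end).
fn select_row_groups(groups: &[RowGroup], range: ByteRange) -> Vec<usize> {
    groups
        .iter()
        .enumerate()
        .filter(|(_, rg)| {
            let m = midpoint(rg);
            m >= range.start && m < range.end
        })
        .map(|(i, _)| i)
        .collect()
}

/// Split a file of `file_len` bytes into `n` contiguous, non-overlapping ranges.
/// Assumes `n > 0`.
fn split_ranges(file_len: u64, n: u64) -> Vec<ByteRange> {
    let chunk = (file_len + n - 1) / n; // ceiling division
    (0..n)
        .map(|i| ByteRange {
            start: (i * chunk).min(file_len),
            end: ((i + 1) * chunk).min(file_len),
        })
        .collect()
}
```

With three 100-byte row groups and two ranges over a 300-byte file, the second row group (midpoint 150) is assigned only to the second range, because the first range excludes its end offset.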

Enums§

RangeCalculation
Represents the possible outcomes of a range calculation.

Functions§

add_row_stats (Deprecated)
calculate_range
Calculates an appropriate byte range for reading from an object based on the provided metadata.
compute_all_files_statistics
Computes statistics for all files across multiple file groups.
generate_test_files
Generates test files with min-max statistics in different overlap patterns.
verify_sort_integrity
Used by tests and benchmarks

Type Aliases§

FileExtensions
User-defined per-file extension data, keyed by concrete Rust type.
PartitionedFileStream (Deprecated)
Stream of files listed from an object store
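
To make "keyed by concrete Rust type" (as in the FileExtensions description) more concrete, here is a minimal sketch of a type-keyed container built only on the standard library. It is an illustration of the general pattern, not the crate's actual definition of `FileExtensions`: the `TypeMap` struct and its `insert` and `get` methods are hypothetical names, and the real alias may be defined differently.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical type-keyed container. This is NOT the crate's
/// `FileExtensions` alias, only a sketch of the pattern.
#[derive(Default)]
struct TypeMap {
    inner: HashMap<TypeId, Arc<dyn Any + Send + Sync>>,
}

impl TypeMap {
    /// Store `value` under its concrete type. A second value of the
    /// same type replaces the first.
    fn insert<T: Any + Send + Sync>(&mut self, value: T) {
        let erased: Arc<dyn Any + Send + Sync> = Arc::new(value);
        self.inner.insert(TypeId::of::<T>(), erased);
    }

    /// Look up the value stored for type `T`, if any.
    fn get<T: Any + Send + Sync>(&self) -> Option<&T> {
        self.inner
            .get(&TypeId::of::<T>())
            .and_then(|v| v.downcast_ref::<T>())
    }
}
```

Because the key is the type itself, callers need no string names: a reader that knows about a particular extension type asks for it by type and gets `None` when it was never attached.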