APIs to read from ORC
Reading from ORC essentially consists of the following steps:
- Identify the column type based on the file’s schema
- Read the stripe (or only part of it, under projection pushdown)
- For each column, select the relevant region of the stripe
- Attach an iterator to the region
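The steps above can be sketched with the standard library alone. This is a minimal illustration and not this crate's API: PhysicalType, Region, FloatIter and read_float_column are hypothetical names, the stripe is assumed to be already in memory and uncompressed, and float values are assumed to be stored as 4-byte little-endian IEEE 754.

```rust
use std::io::{Error, ErrorKind, Read, Result};

// Hypothetical, simplified stand-ins for illustration; not the crate's types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum PhysicalType {
    Float,
    Boolean,
}

/// Where one column's bytes live inside the stripe (hypothetical layout).
struct Region {
    column: usize,
    offset: usize,
    length: usize,
}

/// An iterator attached to a region; yields little-endian `f32` values.
struct FloatIter<R: Read> {
    reader: R,
    remaining: usize,
}

impl<R: Read> Iterator for FloatIter<R> {
    type Item = Result<f32>;

    fn next(&mut self) -> Option<Self::Item> {
        if self.remaining == 0 {
            return None;
        }
        self.remaining -= 1;
        let mut bytes = [0u8; 4];
        if let Err(e) = self.reader.read_exact(&mut bytes) {
            return Some(Err(e));
        }
        Some(Ok(f32::from_le_bytes(bytes)))
    }
}

fn read_float_column(
    schema: &[PhysicalType],
    stripe: &[u8],
    regions: &[Region],
    column: usize,
) -> Result<Vec<f32>> {
    // 1. Identify the column type based on the schema.
    if schema.get(column) != Some(&PhysicalType::Float) {
        return Err(Error::new(ErrorKind::InvalidInput, "not a float column"));
    }
    // 2. The stripe is assumed to have been read into `stripe` already.
    // 3. Select the region of the stripe that belongs to this column.
    let region = regions
        .iter()
        .find(|r| r.column == column)
        .ok_or_else(|| Error::new(ErrorKind::InvalidData, "no region for column"))?;
    let bytes = stripe
        .get(region.offset..region.offset + region.length)
        .ok_or_else(|| Error::new(ErrorKind::UnexpectedEof, "region out of bounds"))?;
    // 4. Attach an iterator to the region and drain it.
    FloatIter { reader: bytes, remaining: bytes.len() / 4 }.collect()
}
```

Keeping the decoder generic over std::io::Read means the same iterator could sit on top of a decompressing reader instead of a plain byte slice.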
Modules§
- decode: Contains different iterators that receive a reader (std::io::Read) and return values for each of ORC’s physical types (e.g. boolean).
- decompress: Contains Decompressor.
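To make the decode description concrete, here is a rough sketch of an iterator that receives a reader and returns boolean values. It is an independent re-implementation for illustration, not the code in the decode module: BooleanSketchIter is a hypothetical name. It assumes the layout described in the ORC specification, where bits are packed most significant first into bytes, and the bytes are byte-run-length encoded (a header of 0 to 127 means a run of header + 3 copies of the next byte; a negative header means that many literal bytes follow).

```rust
use std::io::{self, Read};

/// Hypothetical sketch of a boolean decoder over any `Read`.
pub struct BooleanSketchIter<R: Read> {
    reader: R,
    run: Option<u8>, // byte repeated by the current run; `None` for literals
    pending: usize,  // bytes left in the current run or literal group
    byte: u8,        // byte whose bits are currently being emitted
    bits_left: u8,   // bits of `byte` not yet emitted
    remaining: usize, // values still to be produced
}

impl<R: Read> BooleanSketchIter<R> {
    pub fn new(reader: R, length: usize) -> Self {
        Self { reader, run: None, pending: 0, byte: 0, bits_left: 0, remaining: length }
    }

    fn read_u8(&mut self) -> io::Result<u8> {
        let mut buf = [0u8; 1];
        self.reader.read_exact(&mut buf)?;
        Ok(buf[0])
    }

    /// Returns the next byte of the byte-RLE stream.
    fn next_byte(&mut self) -> io::Result<u8> {
        if self.pending == 0 {
            let header = self.read_u8()? as i8;
            if header >= 0 {
                // Run: `header + 3` copies of the following byte.
                self.pending = header as usize + 3;
                self.run = Some(self.read_u8()?);
            } else {
                // Literals: `-header` bytes follow verbatim.
                self.pending = header.unsigned_abs() as usize;
                self.run = None;
            }
        }
        self.pending -= 1;
        let run = self.run;
        match run {
            Some(value) => Ok(value),
            None => self.read_u8(),
        }
    }
}

impl<R: Read> Iterator for BooleanSketchIter<R> {
    type Item = io::Result<bool>;

    fn next(&mut self) -> Option<Self::Item> {
        if self.remaining == 0 {
            return None;
        }
        if self.bits_left == 0 {
            match self.next_byte() {
                Ok(byte) => {
                    self.byte = byte;
                    self.bits_left = 8;
                }
                Err(e) => {
                    // Stop iterating after reporting the error once.
                    self.remaining = 0;
                    return Some(Err(e));
                }
            }
        }
        self.bits_left -= 1;
        self.remaining -= 1;
        Some(Ok((self.byte >> self.bits_left) & 1 == 1))
    }
}
```

The explicit length is needed because the final byte may carry padding bits that are not values.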
Structs§
- Column: Helper struct used to access the streams associated with an ORC column. Its main use is Column::get_stream, to get a stream.
- FileMetadata: The file’s metadata.
Functions§
- read_metadata
- read_stripe_column: Reads the given column from the stripe into a Column. The scratch buffer becomes owned by the Column, and you can recover it via into_inner.
- read_stripe_footer: Reads, decompresses and deserializes the stripe’s footer as StripeFooter, using scratch as an intermediary memory region.
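The scratch ownership pattern described for read_stripe_column can be illustrated with a small, self-contained sketch. The names ColumnSketch, StreamKind and read_column_sketch, as well as the (kind, offset, length) stream layout, are assumptions made for this example and are not the crate's actual signatures. The idea shown is only that the buffer is moved into the column, streams are handed out as views into it, and into_inner hands the allocation back for the next read.

```rust
use std::io::{self, Read, Seek, SeekFrom};

// Hypothetical stream kinds, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq)]
enum StreamKind {
    Present,
    Data,
}

/// Hypothetical stand-in for a column: owns the bytes it was read into.
struct ColumnSketch {
    data: Vec<u8>,
    // (kind, offset into `data`, length); an assumed layout.
    streams: Vec<(StreamKind, usize, usize)>,
}

impl ColumnSketch {
    /// Returns the bytes of the requested stream, if the column has one.
    fn get_stream(&self, kind: StreamKind) -> Option<&[u8]> {
        let &(_, offset, length) = self.streams.iter().find(|s| s.0 == kind)?;
        self.data.get(offset..offset + length)
    }

    /// Gives the buffer back so that its allocation can be reused.
    fn into_inner(self) -> Vec<u8> {
        self.data
    }
}

/// Reads `length` bytes at `start` into `scratch` and wraps them in a column.
fn read_column_sketch<R: Read + Seek>(
    reader: &mut R,
    start: u64,
    length: usize,
    streams: Vec<(StreamKind, usize, usize)>,
    mut scratch: Vec<u8>,
) -> io::Result<ColumnSketch> {
    scratch.clear();
    scratch.resize(length, 0); // keeps the existing capacity when large enough
    reader.seek(SeekFrom::Start(start))?;
    reader.read_exact(&mut scratch)?;
    Ok(ColumnSketch { data: scratch, streams })
}
```

A caller would typically allocate scratch once, pass it to each read, and take it back with into_inner before reading the next column, so that reading many columns does not allocate a new buffer each time.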