ar_row-rs
Row-oriented access to Apache Arrow
Currently, it only allows reading arrays, not building them.
Arrow is a column-oriented data storage format designed to be stored in memory. While a columnar layout is very efficient, it can be cumbersome to work with, so this crate provides a way to work on rows by "zipping" columns together into classic Rust structures.
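To make the "zipping" idea concrete, here is a minimal sketch using only the standard library. It does not use this crate or Arrow at all: the `Person` type, its fields, and the `zip_columns` and `sample_rows` helpers are hypothetical names chosen for illustration, and plain vectors stand in for Arrow arrays.

```rust
// Hypothetical row type; the field names are illustrative only.
#[derive(Debug, Clone, PartialEq)]
struct Person {
    id: i64,
    name: Option<String>,
}

/// Zips two columns (plain vectors standing in for Arrow arrays) into rows.
/// Returns `None` if the columns have different lengths.
fn zip_columns(ids: &[i64], names: &[Option<String>]) -> Option<Vec<Person>> {
    if ids.len() != names.len() {
        return None;
    }
    Some(
        ids.iter()
            .zip(names.iter())
            .map(|(id, name)| Person { id: *id, name: name.clone() })
            .collect(),
    )
}

/// Usage example: builds two small columns and zips them into rows.
fn sample_rows() -> Vec<Person> {
    let ids = vec![1, 2, 3];
    let names = vec![Some("a".to_string()), None, Some("c".to_string())];
    zip_columns(&ids, &names).expect("columns have equal length")
}
```

Each `Person` gathers the values found at the same index in every column, which is the row-oriented view the crate aims to provide over Arrow data.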
This crate was forked from orcxx, an ORC parsing library, by removing the bindings to the underlying ORC C++ library and rewriting the high-level API to operate on Arrow instead of ORC-specific structures.
The ar_row_derive crate provides a custom derive macro.
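extern crate ar_row;
extern crate ar_row_derive;
extern crate datafusion_orc;
use std::fs::File;
use std::num::NonZeroU64;
use datafusion_orc::projection::ProjectionMask;
use datafusion_orc::{ArrowReader, ArrowReaderBuilder};
use ar_row::deserialize::{ArRowDeserialize, ArRowStruct};
use ar_row::row_iterator::RowIterator;
use ar_row_derive::ArRowDeserialize;
// Define structure
#[derive(ArRowDeserialize, Clone, Default, Debug, PartialEq, Eq)]
struct Test1 {
    long1: Option<i64>,
}
// Open file
let orc_path = "../test_data/TestOrcFile.test1.orc";
let file = File::open(orc_path).expect("could not open .orc");
let builder = ArrowReaderBuilder::try_new(file).expect("could not make builder");
// Read only the "long1" column
let projection = ProjectionMask::named_roots(
    builder.file_metadata().root_data_type(),
    &["long1"],
);
let reader = builder.with_projection(projection).build();
// Deserialize each record batch into rows, then flatten
let rows: Vec<Option<Test1>> = reader
    .flat_map(|batch| -> Vec<Option<Test1>> {
        <Option<Test1>>::from_record_batch(batch.unwrap()).unwrap()
    })
    .collect();
assert_eq!(
    rows,
    vec![
        Some(Test1 { long1: Some(9223372036854775807) }),
        Some(Test1 { long1: Some(9223372036854775807) }),
    ]
);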
RowIterator API
This API allows reusing the buffer between record batches, but needs RecordBatch instead of Result<RecordBatch, _> as input.
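extern crate ar_row;
extern crate ar_row_derive;
extern crate datafusion_orc;
use std::fs::File;
use std::num::NonZeroU64;
use datafusion_orc::projection::ProjectionMask;
use datafusion_orc::{ArrowReader, ArrowReaderBuilder};
use ar_row::deserialize::{ArRowDeserialize, ArRowStruct};
use ar_row::row_iterator::RowIterator;
use ar_row_derive::ArRowDeserialize;
// Define structure
#[derive(ArRowDeserialize, Clone, Default, Debug, PartialEq, Eq)]
struct Test1 {
    long1: Option<i64>,
}
// Open file
let orc_path = "../test_data/TestOrcFile.test1.orc";
let file = File::open(orc_path).expect("could not open .orc");
let builder = ArrowReaderBuilder::try_new(file).expect("could not make builder");
// Read only the "long1" column
let projection = ProjectionMask::named_roots(
    builder.file_metadata().root_data_type(),
    &["long1"],
);
let reader = builder.with_projection(projection).build();
// Unwrap each Result<RecordBatch, _> before handing it to RowIterator
let mut rows: Vec<Option<Test1>> = RowIterator::new(reader.map(|batch| batch.unwrap()))
    .expect("Could not create iterator")
    .collect();
assert_eq!(
    rows,
    vec![
        Some(Test1 { long1: Some(9223372036854775807) }),
        Some(Test1 { long1: Some(9223372036854775807) }),
    ]
);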
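The buffer reuse mentioned for the RowIterator API can be pictured with the following standard-library sketch. It is not the crate's implementation: `Batch`, `Row`, `BufferedRows`, and `collect_rows` are hypothetical stand-ins, with a plain vector playing the role of an Arrow RecordBatch. It only illustrates one way a row iterator can keep a single allocation across batches, and why fallible batches are unwrapped before being handed over.

```rust
/// Stand-in for a record batch: one column of optional integers.
/// (Hypothetical type; the real crate works on Arrow record batches.)
struct Batch {
    long1: Vec<Option<i64>>,
}

/// Hypothetical row type with a single field.
#[derive(Debug, Clone, Default, PartialEq)]
struct Row {
    long1: Option<i64>,
}

/// Iterates over the rows of successive batches, reusing one buffer.
struct BufferedRows<I: Iterator<Item = Batch>> {
    batches: I,
    buffer: Vec<Row>, // allocation kept across batches
    index: usize,     // next row of `buffer` to yield
}

impl<I: Iterator<Item = Batch>> BufferedRows<I> {
    fn new(batches: I) -> Self {
        BufferedRows { batches, buffer: Vec::new(), index: 0 }
    }
}

impl<I: Iterator<Item = Batch>> Iterator for BufferedRows<I> {
    type Item = Row;

    fn next(&mut self) -> Option<Row> {
        // Refill the buffer until it holds an unread row or the input ends.
        while self.index >= self.buffer.len() {
            let batch = self.batches.next()?;
            self.buffer.clear(); // drops old rows but keeps the capacity
            self.buffer
                .extend(batch.long1.into_iter().map(|long1| Row { long1 }));
            self.index = 0;
        }
        let row = self.buffer[self.index].clone();
        self.index += 1;
        Some(row)
    }
}

/// Usage example: fallible batches are unwrapped before being handed over,
/// since the iterator takes batches rather than results.
fn collect_rows(batches: Vec<Result<Batch, String>>) -> Vec<Row> {
    BufferedRows::new(batches.into_iter().map(|batch| batch.unwrap())).collect()
}
```

Because the iterator owns its buffer, it cannot also carry a per-batch error to the caller, which is one plausible reason for taking already-unwrapped batches as input.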
Nested structures
The above two examples also work with nested structures:
extern crate ar_row;
extern crate ar_row_derive;
use ar_row_derive::ArRowDeserialize;