Welcome to arrow2’s documentation. Thanks for checking it out!
This is a library for efficient in-memory data operations with
the Arrow in-memory format.
It is a bottom-up rewrite of the official arrow crate with soundness
and type safety in mind.
Check out the guide for an introduction. Below is an example of some of the things you can do with it:
use std::sync::Arc;
use arrow2::array::*;
use arrow2::datatypes::{Field, DataType, Schema};
use arrow2::compute::arithmetics;
use arrow2::error::Result;
use arrow2::io::parquet::write::*;
use arrow2::chunk::Chunk;
fn main() -> Result<()> {
    // declare arrays
    let a = Int32Array::from(&[Some(1), None, Some(3)]);
    let b = Int32Array::from(&[Some(2), None, Some(6)]);
    // compute (probably the fastest implementation of a nullable op you can find out there)
    let c = arithmetics::basic::mul_scalar(&a, &2);
    assert_eq!(c, b);
    // declare a schema with fields
    let schema = Schema::from(vec![
        Field::new("c1", DataType::Int32, true),
        Field::new("c2", DataType::Int32, true),
    ]);
    // declare chunk
    let chunk = Chunk::new(vec![a.arced(), b.arced()]);
    // write to parquet (probably the fastest implementation of writing to parquet out there)
    let options = WriteOptions {
        write_statistics: true,
        compression: CompressionOptions::Snappy,
        version: Version::V1,
    };
    let row_groups = RowGroupIterator::try_new(
        vec![Ok(chunk)].into_iter(),
        &schema,
        options,
        vec![vec![Encoding::Plain], vec![Encoding::Plain]],
    )?;
    // anything implementing `std::io::Write` works
    let mut file = vec![];
    let mut writer = FileWriter::try_new(file, schema, options)?;
    // Write the file.
    for group in row_groups {
        writer.write(group?)?;
    }
    let _ = writer.end(None)?;
    Ok(())
}
Cargo features
This crate has a significant number of cargo features to reduce compilation
time and number of dependencies. The feature "full" activates most
functionality, such as:
io_ipc: to interact with the Arrow IPC format
io_ipc_compression: to read and write compressed Arrow IPC (v2)
io_csv: to read and write CSV
io_json: to read and write JSON
io_flight: to read and write to Arrow’s Flight protocol
io_parquet: to read and write parquet
io_parquet_compression: to read and write compressed parquet
io_print: to write batches to formatted ASCII tables
compute: to operate on arrays (addition, sum, sort, etc.)
The feature simd (not part of full) produces more explicit SIMD instructions
via std::simd, but requires the nightly channel.
Modules
Contains the Array and MutableArray trait objects declaring arrays, as well as concrete arrays (such as Utf8Array and MutableUtf8Array).
Contains a wide range of compute operations (e.g. arithmetics, aggregate, filter, comparison, and sort).
Contains FFI bindings to import and export Array via Arrow’s C Data Interface.
Contains the Scalar trait object representing individual items of Arrays, as well as concrete implementations such as BooleanScalar.
Conversion methods for dates and times.
Declares TrustedLen.
Sealed traits and implementations to handle all physical types used in this crate.
Misc utilities used in different places in the crate.
Structs
Enums
The enum Either with variants Left and Right is a general purpose sum type with two cases.
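To make the idea of a two-case sum type concrete, here is a minimal sketch that uses only the standard library. The Either type below is defined locally as a stand-in for illustration and is not the definition exported by this crate; the helpers parse_or_keep and describe are hypothetical names introduced for the example.

```rust
// Illustrative stand-in: a locally defined two-variant sum type.
// This is NOT the crate's own `Either`; it only mirrors the Left/Right shape.
#[derive(Debug, PartialEq)]
enum Either<L, R> {
    Left(L),
    Right(R),
}

// Hypothetical helper: try to parse the input as an integer and
// fall back to keeping the original text when parsing fails.
fn parse_or_keep(input: &str) -> Either<i64, String> {
    match input.parse::<i64>() {
        Ok(n) => Either::Left(n),
        Err(_) => Either::Right(input.to_string()),
    }
}

// Hypothetical helper: callers must handle both cases explicitly.
fn describe(value: &Either<i64, String>) -> String {
    match value {
        Either::Left(n) => format!("number {}", n),
        Either::Right(s) => format!("text {}", s),
    }
}

fn main() {
    for raw in ["42", "arrow"] {
        println!("{}", describe(&parse_or_keep(raw)));
    }
}
```

Because a value is exactly one of Left or Right, a match over it has to cover both variants, which is what makes the type useful for returning one of two alternatives from a single function.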