Crate minarrow

§Minarrow – High-Performance Rust with Apache Arrow Compatibility

Modern Rust implementation of the Apache Arrow zero-copy memory layout, for high-performance computing, streaming, and embedded systems. Built for those who like it fast and simple.

§Key Features

  • Fast compile times – typically <1.5s for standard builds, <0.15s for rebuilds.
  • 64-byte SIMD alignment for optimal CPU utilisation.
  • High runtime performance – see benchmarks below.
  • Cohesive, well-documented API with extensive coverage.
  • Built-in FFI with simple to_apache_arrow() and to_polars() conversions.
  • MIT Licensed.
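
To make the 64-byte alignment claim concrete, here is a minimal sketch using only the stable standard library. It is not Minarrow's implementation (the crate notes that it relies on the nightly allocator_api); the `ALIGN` constant and the `aligned_sum` function are hypothetical names chosen for illustration.

```rust
use std::alloc::{alloc, dealloc, Layout};

/// Alignment in bytes. 64 matches a typical x86_64 cache line.
const ALIGN: usize = 64;

/// Allocates room for `len` i64 values on a 64-byte boundary, fills the
/// buffer with 0..len, sums it, frees it, and returns (start address, sum).
/// Hypothetical helper for illustration only.
fn aligned_sum(len: usize) -> (usize, i64) {
    assert!(len > 0, "`alloc` must not be called with a zero-sized layout");
    let layout = Layout::from_size_align(len * std::mem::size_of::<i64>(), ALIGN)
        .expect("size and alignment form a valid layout");
    unsafe {
        let ptr = alloc(layout) as *mut i64;
        assert!(!ptr.is_null(), "allocation failed");
        // Initialise every slot before reading it back as a slice.
        for i in 0..len {
            ptr.add(i).write(i as i64);
        }
        let slice = std::slice::from_raw_parts(ptr, len);
        let sum: i64 = slice.iter().sum();
        let addr = ptr as usize;
        // Free with the same layout that was used to allocate.
        dealloc(ptr as *mut u8, layout);
        (addr, sum)
    }
}
```

Calling `aligned_sum(1000)` returns an address that is a multiple of 64 together with the sum of the values. A buffer that starts on such a boundary lets aligned SIMD loads begin at the first element without a scalar prologue.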

§Upcoming Additions

  1. Lightstream-IO – IPC streaming and Tokio async integration.
  2. SIMD Kernels – Large library of pre-optimised computation kernels.

§Compatibility

Implements Apache Arrow’s documented memory layouts while simplifying some APIs. Additional logical types are provided where they add practical value. Learn more about Apache Arrow at: https://arrow.apache.org/overview/.

Minarrow is not affiliated with Apache Arrow or the Apache Software Foundation. Apache Arrow is a registered trademark of the ASF, referenced under fair use.

§Acknowledgements

Thanks to the Apache Arrow community and contributors, with inspiration from Arrow2 and Polars.

§Requirements

Requires Rust nightly for features such as allocator_api.

§Benchmarks

Intel(R) Core(TM) Ultra 7 155H | x86_64 | 22 CPUs

§No SIMD

(n=1000, lanes=4, iters=1000)

Case                             Avg time
Vec                              85 ns
Minarrow direct IntegerArray     88 ns
arrow-rs struct Int64Array       147 ns
Minarrow enum IntegerArray       124 ns
arrow-rs dyn Int64Array          181 ns
Vec                              475 ns
Minarrow direct FloatArray       476 ns
arrow-rs struct Float64Array     527 ns
Minarrow enum FloatArray         507 ns
arrow-rs dyn Float64Array        1.952 µs

§SIMD

(n=1000, lanes=4, iters=1000)

Case                             Avg time
Vec                              64 ns
Vec64                            55 ns
Minarrow direct IntegerArray     88 ns
arrow-rs struct Int64Array       162 ns
Minarrow enum IntegerArray       170 ns
arrow-rs dyn Int64Array          173 ns
Vec                              57 ns
Vec64                            58 ns
Minarrow direct FloatArray       91 ns
arrow-rs struct Float64Array     181 ns
Minarrow enum FloatArray         180 ns
arrow-rs dyn Float64Array        196 ns

§SIMD + Rayon

(n=1,000,000,000, lanes=4)

Case                             Time (ms)
SIMD + Rayon IntegerArray        113.874
SIMD + Rayon FloatArray          114.095

Construction time for Vec (87 ns) and Vec64 (84 ns) excluded from benchmarks.
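
The "direct", "enum", and "dyn" labels in the benchmark tables presumably refer to how the array is reached before the inner loop runs. The sketch below illustrates those three access styles with hypothetical stand-in types (`IntColumn`, `AnyColumn`); it is an assumption about what the labels mean, and it is neither Minarrow's nor arrow-rs's code nor the benchmark harness itself.

```rust
use std::any::Any;
use std::sync::Arc;

/// Hypothetical stand-in for a concrete typed array.
struct IntColumn {
    data: Vec<i64>,
}

/// Hypothetical stand-in for an enum wrapping several concrete arrays.
enum AnyColumn {
    Int(IntColumn),
    Float(Vec<f64>),
}

/// "Direct": the concrete type is known statically, so the loop can be inlined.
fn sum_direct(col: &IntColumn) -> i64 {
    col.data.iter().sum()
}

/// "Enum": a single `match` on the variant, then the same concrete loop.
fn sum_enum(col: &AnyColumn) -> Option<i64> {
    match col {
        AnyColumn::Int(c) => Some(sum_direct(c)),
        AnyColumn::Float(_) => None,
    }
}

/// "Dyn": a type-erased handle that has to be downcast at runtime first.
fn sum_dyn(col: &Arc<dyn Any + Send + Sync>) -> Option<i64> {
    // Dereference through the Arc so the downcast applies to the inner value.
    (**col).downcast_ref::<IntColumn>().map(sum_direct)
}
```

All three functions end in the same loop over `data`; they differ only in how the concrete type is recovered, which is the kind of overhead the per-case timings would be expected to separate.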

§Re-exports

pub use aliases::BytesLength;
pub use aliases::DictLength;
pub use aliases::Length;
pub use aliases::Offset;
pub use aliases::ArrayVT;
pub use aliases::BitmaskVT;
pub use aliases::StringAVT;
pub use aliases::StringAVTExt;
pub use aliases::CategoricalAVT;
pub use aliases::CategoricalAVTExt;
pub use aliases::IntegerAVT;
pub use aliases::FloatAVT;
pub use aliases::BooleanAVT;
pub use aliases::DatetimeAVT;
pub use enums::time_units::IntervalUnit;
pub use enums::time_units::TimeUnit;
pub use enums::value::Value;
pub use enums::scalar::Scalar;
pub use enums::array::Array;
pub use enums::collections::numeric_array::NumericArray;
pub use enums::collections::temporal_array::TemporalArray;
pub use enums::collections::text_array::TextArray;
pub use structs::buffer::Buffer;
pub use structs::bitmask::Bitmask;
pub use structs::views::bitmask_view::BitmaskV;
pub use structs::chunked::super_array::SuperArray;
pub use structs::chunked::super_table::SuperTable;
pub use structs::views::chunked::super_array_view::SuperArrayV;
pub use structs::views::chunked::super_table_view::SuperTableV;
pub use structs::views::array_view::ArrayV;
pub use structs::views::collections::numeric_array_view::NumericArrayV;
pub use structs::views::collections::temporal_array_view::TemporalArrayV;
pub use structs::views::collections::text_array_view::TextArrayV;
pub use structs::field::Field;
pub use structs::field_array::FieldArray;
pub use structs::table::Table;
pub use structs::cube::Cube;
pub use structs::matrix::Matrix;
pub use structs::variants::boolean::BooleanArray;
pub use structs::variants::categorical::CategoricalArray;
pub use structs::variants::datetime::DatetimeArray;
pub use structs::variants::float::FloatArray;
pub use structs::variants::integer::IntegerArray;
pub use structs::variants::string::StringArray;
pub use structs::vec64::Vec64;
pub use structs::views::table_view::TableV;
pub use traits::masked_array::MaskedArray;
pub use traits::print::Print;
pub use traits::type_unions::Float;
pub use traits::type_unions::Integer;
pub use traits::type_unions::Numeric;
pub use traits::type_unions::Primitive;
pub use ffi::arrow_dtype::ArrowType;
pub use structs::shared_buffer::SharedBuffer;

§Modules

aliases
Aliases & View Tuples - Lightweight Tuple Views and Fast-To-Type Aliases
conversions
Conversions & Views - Most To/From boilerplate is implemented here
enums
Array, TextArray, NumericArray… - All the High-Level Array containers are here.
ffi
Shared Memory - Sending data over FFI like a Pro? Look here.
macros
Internal Macros - Automates boilerplate Array implementations
structs
Table, IntegerArray, FloatArray, Vec64 - All the Low-Level Control, Tables and Views.
traits
Type Standardisation - MaskedArray, View, Print traits + more.
utils
Utilities - Internal Helper Utilities

§Macros

arr_bool
arr_bool_opt
arr_cat8
arr_cat8_opt
arr_cat16
arr_cat32
arr_cat64
arr_cat16_opt
arr_cat32_opt
arr_cat64_opt
arr_f32
arr_f64
arr_f32_opt
arr_f64_opt
arr_i8
arr_i8_opt
arr_i16
arr_i32
arr_i64
arr_i16_opt
arr_i32_opt
arr_i64_opt
arr_str32
arr_str64
arr_str32_opt
arr_str64_opt
arr_u8
arr_u8_opt
arr_u16
arr_u32
arr_u64
arr_u16_opt
arr_u32_opt
arr_u64_opt
has_nulls
impl_arc_masked_array
Overview
impl_array_ref_deref
Implements AsRef, AsMut, Deref, and DerefMut for standard array types with .data: Vec64<T>. This macro is for value buffers only, not dictionary arrays or string offset views.
impl_from_vec_primitive
Implement from_vec + from_std_vec for the “numeric-shaped” arrays (IntegerArray / FloatArray / DatetimeArray).
impl_masked_array
Implements the MaskedArray trait for the given struct and bound.
impl_numeric_array_constructors
Implements standard constructors for columnar array types with data, null_mask, and PhantomData fields.
impl_usize_conversions
Implements usize conversions for integers
match_array
Reduces matching boilerplate when all positive paths share the outcome
vec64
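
The arr_* macros come in plain and _opt forms. As a rough illustration of what a nullable literal macro could expand to, the sketch below builds a value buffer plus an Arrow-style validity bitmap (bit set means valid, least-significant bit first) from `Option` values. The names `NullableI32` and `my_arr_i32_opt!` are hypothetical, and the assumption that the _opt macros take `Option` values is not confirmed by this page.

```rust
/// Hypothetical nullable column: one value slot per element plus a validity
/// bitmap where bit i (least-significant bit first) marks slot i as valid.
#[derive(Debug, PartialEq)]
struct NullableI32 {
    values: Vec<i32>,
    validity: Vec<u8>,
}

impl NullableI32 {
    fn from_options(items: &[Option<i32>]) -> Self {
        let mut values = Vec::with_capacity(items.len());
        // One bit per element, rounded up to whole bytes.
        let mut validity = vec![0u8; (items.len() + 7) / 8];
        for (i, item) in items.iter().enumerate() {
            match item {
                Some(v) => {
                    values.push(*v);
                    validity[i / 8] |= 1u8 << (i % 8);
                }
                // Null slots still occupy a value slot; 0 is a placeholder.
                None => values.push(0),
            }
        }
        Self { values, validity }
    }

    /// Returns the value at `i`, or `None` if the slot is marked null.
    fn get(&self, i: usize) -> Option<i32> {
        let valid = (self.validity[i / 8] & (1u8 << (i % 8))) != 0;
        valid.then(|| self.values[i])
    }
}

/// Hypothetical literal macro, deliberately not named like Minarrow's own.
macro_rules! my_arr_i32_opt {
    ($($x:expr),* $(,)?) => {
        NullableI32::from_options(&[$($x),*])
    };
}
```

With this sketch, `my_arr_i32_opt![Some(1), None, Some(3)]` yields three value slots and a single bitmap byte with bits 0 and 2 set, so `get(1)` returns `None`. Keeping the values dense and the nulls in a separate bitmap is what allows kernels to process the value buffer without branching on each element.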