The central types in Apache Arrow are arrays, represented by the Array trait. An array represents a known-length sequence of values all having the same type.
Internally, those values are represented by one or several buffers, the number and meaning of which depend on the array’s data type, as documented in the Arrow data layout specification. For example, the type Int16Array represents an Apache Arrow array of 16-bit integers.
Those buffers consist of the value data itself and an optional bitmap buffer that indicates which array entries are null values. The bitmap buffer can be entirely omitted if the array is known to have zero null values.
There are concrete implementations of this trait for each data type that help you access individual values of the array.
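To make the buffer layout concrete, the following is a minimal sketch of a values buffer paired with an optional validity bitmap. It uses only the Rust standard library. `ToyInt16Array` and its methods are hypothetical names chosen for illustration and are not part of the arrow crate; the sketch assumes one validity bit per slot, least-significant bit first, with a set bit meaning "valid".

```rust
/// A toy nullable array of 16-bit integers: a values buffer plus an
/// optional validity bitmap (one bit per slot, least-significant bit first).
/// Std-only illustration; this is NOT the arrow crate's implementation.
pub struct ToyInt16Array {
    values: Vec<i16>,
    /// `None` means no slot is null, so the bitmap is omitted entirely.
    validity: Option<Vec<u8>>,
}

impl ToyInt16Array {
    /// Builds both buffers from a slice of optional values.
    pub fn from_options(items: &[Option<i16>]) -> Self {
        let mut values = Vec::with_capacity(items.len());
        let mut bitmap = vec![0u8; (items.len() + 7) / 8];
        let mut null_count = 0usize;
        for (i, item) in items.iter().enumerate() {
            match item {
                Some(v) => {
                    values.push(*v);
                    bitmap[i / 8] |= 1u8 << (i % 8); // mark slot i as valid
                }
                None => {
                    values.push(0); // placeholder; a null slot's value is unspecified
                    null_count += 1;
                }
            }
        }
        // Drop the bitmap when the array is known to contain no nulls.
        let validity = if null_count == 0 { None } else { Some(bitmap) };
        ToyInt16Array { values, validity }
    }

    /// Number of slots, counting null slots.
    pub fn len(&self) -> usize {
        self.values.len()
    }

    /// Whether a validity bitmap is stored at all.
    pub fn has_validity_bitmap(&self) -> bool {
        self.validity.is_some()
    }

    /// A slot is null when its validity bit is unset.
    pub fn is_null(&self, i: usize) -> bool {
        match &self.validity {
            None => false,
            Some(bits) => (bits[i / 8] & (1u8 << (i % 8))) == 0,
        }
    }

    /// Returns the value at slot `i`, or `None` if the slot is null.
    pub fn get(&self, i: usize) -> Option<i16> {
        if self.is_null(i) {
            None
        } else {
            Some(self.values[i])
        }
    }
}
```

In this sketch, `ToyInt16Array::from_options(&[Some(1), None, Some(2)])` keeps three slots in the values buffer and one bitmap byte, while an input without any `None` stores no bitmap at all.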
Building an Array
Arrow’s Arrays are immutable, but there is the trait ArrayBuilder that helps you with constructing new Arrays. As with the Array trait, there are builder implementations for all concrete array types.
Example
use arrow::array::Int16Array;
// Create a new builder with a capacity of 100
let mut builder = Int16Array::builder(100);
// Append a single primitive value
builder.append_value(1).unwrap();
// Append a null value
builder.append_null().unwrap();
// Append a slice of primitive values
builder.append_slice(&[2, 3, 4]).unwrap();
// Build the array
let array = builder.finish();
assert_eq!(
    5,
    array.len(),
    "The array has 5 values, counting the null value"
);
assert_eq!(2, array.value(2), "Get the value with index 2");
assert_eq!(
    &array.values()[3..5],
    &[3, 4],
    "Get slice of len 2 starting at idx 3"
);
Structs
A generic representation of Arrow array data which encapsulates common attributes and operations for Arrow arrays. Specific operations for different array types (e.g., primitive, list, struct) are implemented in Array.
Builder for ArrayData type
Array of bools
Array builder for fixed-width primitive types
An iterator that returns Some(bool) or None.
Builder for creating a Buffer object.
DecimalArray stores fixed width decimal numbers, with a fixed precision and scale.
Array Builder for DecimalArray
An iterator that returns Some(i128) or None, that can be used on a DecimalArray
A dictionary array where each element is a single value indexed by an integer key. This is mostly used to represent strings or a limited set of primitive types as integers, for example when doing NLP analysis or representing chromosomes by name.
An array where each element is a fixed-size sequence of bytes.
A list array where each element is a fixed-size sequence of values with the same type whose maximum length is represented by an i32.
Array builder for ListArray
See BinaryArray and LargeBinaryArray for storing binary data.
An iterator that returns Some(&[u8]) or None, for binary arrays
Generic struct for a variable-size list array.
Array builder for ListArray
Generic struct for [Large]StringArray
An iterator that returns Some(&str) or None, for string arrays
A nested array type where each record is a key-value map. Keys should always be non-null, but values can be null.
An Array where all elements are nulls
Array whose elements are of primitive types.
Array builder for fixed-width primitive types
Array builder for DictionaryArray. For example, to map a set of byte indices to f32 values. Note that the use of a HashMap here will not scale to very large arrays or result in an ordered dictionary.
An iterator that returns Some(T) or None, that can be used on any PrimitiveArray
Array builder for DictionaryArray that stores Strings. For example, to map a set of byte indices to String values. Note that the use of a HashMap here will not scale to very large arrays or result in an ordered dictionary.
A nested array type where each child (called field) is represented by a separate array.
Array builder for Struct types.
An Array that can represent slots of varying types.
Builder type for creating a new UnionArray.
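The dictionary array and dictionary builder entries above describe values indexed by integer keys, with a HashMap used during building. A minimal sketch of that idea follows, using only the standard library. `ToyStringDictionary` and its methods are hypothetical names, and the choice of `u8` keys is an assumption made for brevity; this is not the arrow crate's builder.

```rust
use std::collections::HashMap;

/// Toy dictionary encoder: stores each distinct string once and represents
/// the array as a sequence of integer keys into that list of values.
/// Hypothetical illustration only; NOT the arrow crate's dictionary builder.
#[derive(Default)]
pub struct ToyStringDictionary {
    lookup: HashMap<String, u8>,
    values: Vec<String>,
    keys: Vec<Option<u8>>,
}

impl ToyStringDictionary {
    /// Appends a value and returns its key, or `None` (appending nothing)
    /// if the `u8` key space is already exhausted.
    pub fn append(&mut self, value: &str) -> Option<u8> {
        let existing = self.lookup.get(value).copied();
        let key = match existing {
            Some(k) => k,
            None => {
                if self.values.len() > u8::MAX as usize {
                    return None; // no free key left for a new distinct value
                }
                let k = self.values.len() as u8;
                self.lookup.insert(value.to_string(), k);
                self.values.push(value.to_string());
                k
            }
        };
        self.keys.push(Some(key));
        Some(key)
    }

    /// Appends a null slot; nulls live in the keys, not in the dictionary.
    pub fn append_null(&mut self) {
        self.keys.push(None);
    }

    /// The per-slot keys, in insertion order.
    pub fn keys(&self) -> &[Option<u8>] {
        &self.keys
    }

    /// The distinct values, in order of first appearance (not sorted).
    pub fn values(&self) -> &[String] {
        &self.values
    }

    /// Resolves slot `i` back to its string, or `None` for a null slot.
    pub fn get(&self, i: usize) -> Option<&str> {
        self.keys[i].map(|k| self.values[k as usize].as_str())
    }
}
```

Because keys are handed out in order of first appearance, the resulting dictionary is unordered, which matches the caveat in the builder descriptions above.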
Enums
Define capacities of child data or data buffers.
Traits
Trait for dealing with different types of array at runtime when the type of the array is not known in advance.
Trait for dealing with different array builders at runtime
Trait for comparing an Arrow array with a JSON array
Trait declaring an offset size, relevant for i32 vs i64 array types.
Functions
Force downcast ArrayRef to BooleanArray
Force downcast ArrayRef to DecimalArray
Force downcast ArrayRef to DictionaryArray
Force downcast ArrayRef to GenericBinaryArray
Force downcast ArrayRef to GenericListArray
Force downcast ArrayRef to LargeListArray
Force downcast ArrayRef to LargeStringArray
Force downcast ArrayRef to ListArray
Force downcast ArrayRef to MapArray
Force downcast ArrayRef to NullArray
Force downcast ArrayRef to PrimitiveArray
Force downcast ArrayRef to StringArray
Force downcast ArrayRef to StructArray
Force downcast ArrayRef to UnionArray
Returns a comparison function that compares two values at two different positions between the two arrays. The arrays’ types must be equal.
Exports an array to raw pointers of the C Data Interface provided by the consumer.
Constructs an array using the input data. Returns a reference-counted Array instance.
Creates a new array from two FFI pointers. Used to import arrays from the C Data Interface.
Returns a builder with capacity capacity that corresponds to the datatype DataType. This function is useful for constructing arrays from arbitrary vectors with a known or expected schema.
Creates a new empty array
Creates a new array of data_type of length length filled entirely of NULL values
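The "force downcast" helpers listed above turn a type-erased array reference into a concrete array type. The sketch below shows the general pattern with `std::any::Any`, using only the standard library. `ToyArray`, `ToyArrayRef`, `ToyBooleanArray`, `ToyStringArray`, `try_as_boolean`, and `as_boolean` are all hypothetical names, and the sketch assumes that "force" means the helper panics when the runtime type does not match.

```rust
use std::any::Any;
use std::sync::Arc;

/// Hypothetical stand-in for a type-erased array trait.
pub trait ToyArray {
    /// Exposes the concrete value as `Any` so callers can downcast it.
    fn as_any(&self) -> &dyn Any;
    fn len(&self) -> usize;
}

pub struct ToyBooleanArray(pub Vec<bool>);
pub struct ToyStringArray(pub Vec<String>);

impl ToyArray for ToyBooleanArray {
    fn as_any(&self) -> &dyn Any {
        self
    }
    fn len(&self) -> usize {
        self.0.len()
    }
}

impl ToyArray for ToyStringArray {
    fn as_any(&self) -> &dyn Any {
        self
    }
    fn len(&self) -> usize {
        self.0.len()
    }
}

/// Reference-counted handle to any toy array.
pub type ToyArrayRef = Arc<dyn ToyArray>;

/// Checked downcast: `None` if the array is not a `ToyBooleanArray`.
pub fn try_as_boolean(array: &ToyArrayRef) -> Option<&ToyBooleanArray> {
    array.as_any().downcast_ref::<ToyBooleanArray>()
}

/// Forced downcast: panics if the array is not a `ToyBooleanArray`.
pub fn as_boolean(array: &ToyArrayRef) -> &ToyBooleanArray {
    try_as_boolean(array).expect("unable to downcast to ToyBooleanArray")
}
```

A forced downcast is convenient when the data type has already been checked; the checked variant is the safer choice when the type is only expected.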
Type Definitions
A reference-counted reference to a generic Array.
An array where each element contains 0 or more bytes. The byte length of each element is represented by an i32.
Compare the values at two arbitrary indices in two arrays.
Example: Using collect
Example: Using collect
Example: Using collect
Example: Using collect
A dictionary array where each element is a single value indexed by an integer key.
Example: Using collect
A dictionary array where each element is a single value indexed by an integer key.
Example: Using collect
A dictionary array where each element is a single value indexed by an integer key.
Example: Using collect
A dictionary array where each element is a single value indexed by an integer key.
An array where each element contains 0 or more bytes. The byte length of each element is represented by an i64.
A list array where each element is a variable-sized sequence of values with the same type whose memory offsets between elements are represented by an i64.
An array where each element is a variable-sized sequence of bytes representing a string whose maximum length (in bytes) is represented by an i64.
A list array where each element is a variable-sized sequence of values with the same type whose memory offsets between elements are represented by an i32.
An array where each element is a variable-sized sequence of bytes representing a string whose maximum length (in bytes) is represented by an i32.
A primitive array where each element is of type TimestampMicrosecondType.
See examples for TimestampSecondArray.
A primitive array where each element is of type TimestampMillisecondType.
See examples for TimestampSecondArray.
A primitive array where each element is of type TimestampNanosecondType.
See examples for TimestampSecondArray.
A primitive array where each element is of type TimestampSecondType.
See also Timestamp.
Example: Using collect
A dictionary array where each element is a single value indexed by an integer key.
Example: Using collect
A dictionary array where each element is a single value indexed by an integer key.
Example: Using collect
A dictionary array where each element is a single value indexed by an integer key.
Example: Using collect
A dictionary array where each element is a single value indexed by an integer key.
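The i32 and i64 variants of the binary, string, and list type definitions above differ in the integer type used for element offsets. The following std-only sketch shows how offsets delimit variable-sized elements in one contiguous byte buffer. `ToyStringArray32` and its methods are hypothetical names for illustration, not arrow crate types, and the sketch assumes the usual layout of `len + 1` offsets starting at zero.

```rust
/// Toy variable-size string array: one contiguous byte buffer plus
/// `len + 1` i32 offsets; element `i` spans `offsets[i]..offsets[i + 1]`.
/// Std-only illustration; NOT the arrow crate's string array.
pub struct ToyStringArray32 {
    offsets: Vec<i32>,
    data: Vec<u8>,
}

impl ToyStringArray32 {
    /// Builds the array, or returns `None` if the total byte length
    /// no longer fits in an i32 offset.
    pub fn from_strs(items: &[&str]) -> Option<Self> {
        let mut offsets = vec![0i32];
        let mut data = Vec::new();
        for s in items {
            data.extend_from_slice(s.as_bytes());
            if data.len() > i32::MAX as usize {
                return None; // an i64-offset variant would be needed here
            }
            offsets.push(data.len() as i32);
        }
        Some(ToyStringArray32 { offsets, data })
    }

    /// Number of elements (one fewer than the number of offsets).
    pub fn len(&self) -> usize {
        self.offsets.len() - 1
    }

    /// The raw offsets buffer.
    pub fn offsets(&self) -> &[i32] {
        &self.offsets
    }

    /// Returns element `i` by slicing the data buffer between two offsets.
    pub fn value(&self, i: usize) -> &str {
        let start = self.offsets[i] as usize;
        let end = self.offsets[i + 1] as usize;
        std::str::from_utf8(&self.data[start..end]).expect("valid UTF-8 by construction")
    }
}
```

For the input `["ab", "", "cde"]` the offsets are `[0, 2, 2, 5]`: the empty string occupies no bytes, and the last offset equals the total byte length, which is the quantity bounded by the offset type.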