
Support for the Arrow IPC Format

The Arrow IPC format defines how to read and write RecordBatches to/from a file or stream of bytes. This format can be used to serialize and deserialize data to files and over the network.

There are two variants of the IPC format:

  1. IPC Streaming Format: Supports streaming data sources, implemented by StreamReader and StreamWriter.

  2. IPC File Format: Supports random access, implemented by FileReader and FileWriter.

See the reader and writer modules for more information.

Modules§

convert
Utilities for converting between IPC types and native Arrow types
gen
Generated code
reader
Arrow IPC File and Stream Readers
writer
Arrow IPC File and Stream Writers

Structs§

Binary
Opaque binary data
BinaryArgs
BinaryBuilder
BinaryView
Logically the same as Binary, but the internal representation uses a view struct that contains the string length and either the string’s entire data inline (for small strings) or an inlined prefix, an index of another buffer, and an offset pointing to a slice in that buffer (for non-small strings).
BinaryViewArgs
BinaryViewBuilder
Block
BodyCompression
Optional compression for the memory buffers constituting IPC message bodies. Intended for use with RecordBatch, but could be used for other message types.
BodyCompressionArgs
BodyCompressionBuilder
BodyCompressionMethod
Provided for forward compatibility in case different strategies for compressing the IPC message body (such as whole-body compression rather than buffer-level compression) need to be supported in the future.
Bool
BoolArgs
BoolBuilder
Buffer

CompressionType
Date
Date is either a 32-bit or 64-bit signed integer type representing an elapsed time since UNIX epoch (1970-01-01), stored in either of two units:
DateArgs
DateBuilder
DateUnit
Decimal
Exact decimal value represented as an integer value in two’s complement. Currently only 128-bit (16-byte) and 256-bit (32-byte) integers are used. The representation uses the endianness indicated in the Schema.
DecimalArgs
DecimalBuilder
DictionaryBatch
For sending dictionary encoding information. Any Field can be dictionary-encoded, but in this case none of its children may be dictionary-encoded. There is one vector/column per dictionary, but that vector/column may be spread across multiple dictionary batches by using the isDelta flag.
DictionaryBatchArgs
DictionaryBatchBuilder
DictionaryEncoding
DictionaryEncodingArgs
DictionaryEncodingBuilder
DictionaryKind

Duration
DurationArgs
DurationBuilder
Endianness

Feature
Represents Arrow Features that might not have full support within implementations. This is intended to be used in two scenarios:
Field

FieldArgs
FieldBuilder
FieldNode

FixedSizeBinary
FixedSizeBinaryArgs
FixedSizeBinaryBuilder
FixedSizeList
FixedSizeListArgs
FixedSizeListBuilder
FloatingPoint
FloatingPointArgs
FloatingPointBuilder
Footer

FooterArgs
FooterBuilder
Int
IntArgs
IntBuilder
Interval
IntervalArgs
IntervalBuilder
IntervalUnit
KeyValue

KeyValueArgs
KeyValueBuilder
LargeBinary
Same as Binary, but with 64-bit offsets, allowing extremely large data values to be represented.
LargeBinaryArgs
LargeBinaryBuilder
LargeList
Same as List, but with 64-bit offsets, allowing extremely large data values to be represented.
LargeListArgs
LargeListBuilder
LargeListView
Same as ListView, but with 64-bit offsets and sizes, allowing extremely large data values to be represented.
LargeListViewArgs
LargeListViewBuilder
LargeUtf8
Same as Utf8, but with 64-bit offsets, allowing extremely large data values to be represented.
LargeUtf8Args
LargeUtf8Builder
List
ListArgs
ListBuilder
ListView
Represents the same logical types that List can, but contains offsets and sizes allowing for writes in any order and sharing of child values among list values.
ListViewArgs
ListViewBuilder
Map
A Map is a logical nested type that is represented as
MapArgs
MapBuilder
Message
MessageArgs
MessageBuilder
MessageHeader

MessageHeaderUnionTableOffset
MetadataVersion
Null
These are stored in the flatbuffer in the Type union below
NullArgs
NullBuilder
Precision
RecordBatch
A data header describing the shared memory layout of a “record” or “row” batch. Some systems call this a “row batch” internally and others a “record batch”.
RecordBatchArgs
RecordBatchBuilder
RunEndEncoded
Contains two child arrays, run_ends and values. The run_ends child array must be a 16/32/64-bit integer array which encodes the indices at which the run with the value in each corresponding index in the values child array ends. Like list/struct types, the values array can be of any type.
RunEndEncodedArgs
RunEndEncodedBuilder
Schema

SchemaArgs
SchemaBuilder
SparseMatrixCompressedAxis
SparseMatrixIndexCSX
Compressed Sparse format, that is matrix-specific.
SparseMatrixIndexCSXArgs
SparseMatrixIndexCSXBuilder
SparseTensor
SparseTensorArgs
SparseTensorBuilder
SparseTensorIndex
SparseTensorIndexCOO

SparseTensorIndexCOOArgs
SparseTensorIndexCOOBuilder
SparseTensorIndexCSF
Compressed Sparse Fiber (CSF) sparse tensor index.
SparseTensorIndexCSFArgs
SparseTensorIndexCSFBuilder
SparseTensorIndexUnionTableOffset
Struct_
A Struct_ in the flatbuffer metadata is the same as an Arrow Struct (according to the physical memory layout). Struct_ is used here because Struct is a reserved word in Flatbuffers.
Struct_Args
Struct_Builder
Tensor
TensorArgs
TensorBuilder
TensorDim

TensorDimArgs
TensorDimBuilder
Time
Time is either a 32-bit or 64-bit signed integer type representing an elapsed time since midnight, stored in one of four units: seconds, milliseconds, microseconds or nanoseconds.
TimeArgs
TimeBuilder
TimeUnit
Timestamp
Timestamp is a 64-bit signed integer representing an elapsed time since a fixed epoch, stored in one of four units: seconds, milliseconds, microseconds or nanoseconds, and is optionally annotated with a timezone.
TimestampArgs
TimestampBuilder
Type

TypeUnionTableOffset
Union
A union is a complex type with children in Field. By default, ids in the type vector refer to the offsets in the children. Optionally, typeIds provides an indirection between the child offset and the type id for each child: typeIds[offset] is the id used in the type vector.
UnionArgs
UnionBuilder
UnionMode
Utf8
Unicode with UTF-8 encoding
Utf8Args
Utf8Builder
Utf8View
Logically the same as Utf8, but the internal representation uses a view struct that contains the string length and either the string’s entire data inline (for small strings) or an inlined prefix, an index of another buffer, and an offset pointing to a slice in that buffer (for non-small strings).
Utf8ViewArgs
Utf8ViewBuilder
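
The view layout described for Utf8View (and BinaryView) can be illustrated with a small standalone model. This is only a sketch and not this crate's API: the names StringView, make_view and resolve are hypothetical, and the 12-byte inline limit and 4-byte prefix are assumptions taken from the Arrow columnar format rather than from this page.

```rust
/// Hypothetical, simplified model of a single string view.
#[derive(Debug, PartialEq)]
enum StringView {
    /// Small string: the entire value lives inside the view itself.
    Inline { len: u32, data: [u8; 12] },
    /// Non-small string: an inlined prefix, plus the index of a data
    /// buffer and the offset of the value's slice inside that buffer.
    Reference { len: u32, prefix: [u8; 4], buffer_index: u32, offset: u32 },
}

/// Assumed inline threshold in bytes (not stated on this page).
const MAX_INLINE: usize = 12;

/// Builds a view for `value`, appending to the last data buffer when the
/// value is too long to be stored inline.
fn make_view(value: &str, buffers: &mut Vec<Vec<u8>>) -> StringView {
    let bytes = value.as_bytes();
    let len = bytes.len() as u32;
    if bytes.len() <= MAX_INLINE {
        let mut data = [0u8; 12];
        data[..bytes.len()].copy_from_slice(bytes);
        StringView::Inline { len, data }
    } else {
        if buffers.is_empty() {
            buffers.push(Vec::new());
        }
        let buffer_index = buffers.len() - 1;
        let buf = &mut buffers[buffer_index];
        let offset = buf.len() as u32;
        buf.extend_from_slice(bytes);
        let mut prefix = [0u8; 4];
        prefix.copy_from_slice(&bytes[..4]);
        StringView::Reference { len, prefix, buffer_index: buffer_index as u32, offset }
    }
}

/// Resolves a view back to the string it represents.
fn resolve<'a>(view: &'a StringView, buffers: &'a [Vec<u8>]) -> &'a str {
    let bytes = match view {
        StringView::Inline { len, data } => &data[..*len as usize],
        StringView::Reference { len, buffer_index, offset, .. } => {
            let start = *offset as usize;
            &buffers[*buffer_index as usize][start..start + *len as usize]
        }
    };
    std::str::from_utf8(bytes).expect("views are built from valid UTF-8")
}
```

In this model a short value such as "arrow" never touches the data buffers, while a longer value is appended to a buffer and located again through its buffer index and offset. The inlined prefix is what allows many comparisons to be decided without following the reference.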

Enums§

BinaryOffset
BinaryViewOffset
BodyCompressionOffset
BoolOffset
DateOffset
DecimalOffset
DictionaryBatchOffset
DictionaryEncodingOffset
DurationOffset
FieldOffset
FixedSizeBinaryOffset
FixedSizeListOffset
FloatingPointOffset
FooterOffset
IntOffset
IntervalOffset
KeyValueOffset
LargeBinaryOffset
LargeListOffset
LargeListViewOffset
LargeUtf8Offset
ListOffset
ListViewOffset
MapOffset
MessageOffset
NullOffset
RecordBatchOffset
RunEndEncodedOffset
SchemaOffset
SparseMatrixIndexCSXOffset
SparseTensorIndexCOOOffset
SparseTensorIndexCSFOffset
SparseTensorOffset
Struct_Offset
TensorDimOffset
TensorOffset
TimeOffset
TimestampOffset
UnionOffset
Utf8Offset
Utf8ViewOffset

Constants§

ENUM_MAX_BODY_COMPRESSION_METHOD (Deprecated)
ENUM_MAX_COMPRESSION_TYPE (Deprecated)
ENUM_MAX_DATE_UNIT (Deprecated)
ENUM_MAX_DICTIONARY_KIND (Deprecated)
ENUM_MAX_ENDIANNESS (Deprecated)
ENUM_MAX_FEATURE (Deprecated)
ENUM_MAX_INTERVAL_UNIT (Deprecated)
ENUM_MAX_MESSAGE_HEADER (Deprecated)
ENUM_MAX_METADATA_VERSION (Deprecated)
ENUM_MAX_PRECISION (Deprecated)
ENUM_MAX_SPARSE_MATRIX_COMPRESSED_AXIS (Deprecated)
ENUM_MAX_SPARSE_TENSOR_INDEX (Deprecated)
ENUM_MAX_TIME_UNIT (Deprecated)
ENUM_MAX_TYPE (Deprecated)
ENUM_MAX_UNION_MODE (Deprecated)
ENUM_MIN_BODY_COMPRESSION_METHOD (Deprecated)
ENUM_MIN_COMPRESSION_TYPE (Deprecated)
ENUM_MIN_DATE_UNIT (Deprecated)
ENUM_MIN_DICTIONARY_KIND (Deprecated)
ENUM_MIN_ENDIANNESS (Deprecated)
ENUM_MIN_FEATURE (Deprecated)
ENUM_MIN_INTERVAL_UNIT (Deprecated)
ENUM_MIN_MESSAGE_HEADER (Deprecated)
ENUM_MIN_METADATA_VERSION (Deprecated)
ENUM_MIN_PRECISION (Deprecated)
ENUM_MIN_SPARSE_MATRIX_COMPRESSED_AXIS (Deprecated)
ENUM_MIN_SPARSE_TENSOR_INDEX (Deprecated)
ENUM_MIN_TIME_UNIT (Deprecated)
ENUM_MIN_TYPE (Deprecated)
ENUM_MIN_UNION_MODE (Deprecated)
ENUM_VALUES_BODY_COMPRESSION_METHOD (Deprecated)
ENUM_VALUES_COMPRESSION_TYPE (Deprecated)
ENUM_VALUES_DATE_UNIT (Deprecated)
ENUM_VALUES_DICTIONARY_KIND (Deprecated)
ENUM_VALUES_ENDIANNESS (Deprecated)
ENUM_VALUES_FEATURE (Deprecated)
ENUM_VALUES_INTERVAL_UNIT (Deprecated)
ENUM_VALUES_MESSAGE_HEADER (Deprecated)
ENUM_VALUES_METADATA_VERSION (Deprecated)
ENUM_VALUES_PRECISION (Deprecated)
ENUM_VALUES_SPARSE_MATRIX_COMPRESSED_AXIS (Deprecated)
ENUM_VALUES_SPARSE_TENSOR_INDEX (Deprecated)
ENUM_VALUES_TIME_UNIT (Deprecated)
ENUM_VALUES_TYPE (Deprecated)
ENUM_VALUES_UNION_MODE (Deprecated)

Functions§

finish_footer_buffer
finish_message_buffer
finish_schema_buffer
finish_size_prefixed_footer_buffer
finish_size_prefixed_message_buffer
finish_size_prefixed_schema_buffer
finish_size_prefixed_sparse_tensor_buffer
finish_size_prefixed_tensor_buffer
finish_sparse_tensor_buffer
finish_tensor_buffer
root_as_footer
Verifies that a buffer of bytes contains a Footer and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use root_as_footer_unchecked.
root_as_footer_unchecked
Assumes, without verification, that a buffer of bytes contains a Footer and returns it.
root_as_footer_with_opts
Verifies, with the given options, that a buffer of bytes contains a Footer and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use root_as_footer_unchecked.
root_as_message
Verifies that a buffer of bytes contains a Message and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use root_as_message_unchecked.
root_as_message_unchecked
Assumes, without verification, that a buffer of bytes contains a Message and returns it.
root_as_message_with_opts
Verifies, with the given options, that a buffer of bytes contains a Message and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use root_as_message_unchecked.
root_as_schema
Verifies that a buffer of bytes contains a Schema and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use root_as_schema_unchecked.
root_as_schema_unchecked
Assumes, without verification, that a buffer of bytes contains a Schema and returns it.
root_as_schema_with_opts
Verifies, with the given options, that a buffer of bytes contains a Schema and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use root_as_schema_unchecked.
root_as_sparse_tensor
Verifies that a buffer of bytes contains a SparseTensor and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use root_as_sparse_tensor_unchecked.
root_as_sparse_tensor_unchecked
Assumes, without verification, that a buffer of bytes contains a SparseTensor and returns it.
root_as_sparse_tensor_with_opts
Verifies, with the given options, that a buffer of bytes contains a SparseTensor and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use root_as_sparse_tensor_unchecked.
root_as_tensor
Verifies that a buffer of bytes contains a Tensor and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use root_as_tensor_unchecked.
root_as_tensor_unchecked
Assumes, without verification, that a buffer of bytes contains a Tensor and returns it.
root_as_tensor_with_opts
Verifies, with the given options, that a buffer of bytes contains a Tensor and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use root_as_tensor_unchecked.
size_prefixed_root_as_footer
Verifies that a buffer of bytes contains a size prefixed Footer and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use size_prefixed_root_as_footer_unchecked.
size_prefixed_root_as_footer_unchecked
Assumes, without verification, that a buffer of bytes contains a size prefixed Footer and returns it.
size_prefixed_root_as_footer_with_opts
Verifies, with the given verifier options, that a buffer of bytes contains a size prefixed Footer and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use size_prefixed_root_as_footer_unchecked.
size_prefixed_root_as_message
Verifies that a buffer of bytes contains a size prefixed Message and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use size_prefixed_root_as_message_unchecked.
size_prefixed_root_as_message_unchecked
Assumes, without verification, that a buffer of bytes contains a size prefixed Message and returns it.
size_prefixed_root_as_message_with_opts
Verifies, with the given verifier options, that a buffer of bytes contains a size prefixed Message and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use size_prefixed_root_as_message_unchecked.
size_prefixed_root_as_schema
Verifies that a buffer of bytes contains a size prefixed Schema and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use size_prefixed_root_as_schema_unchecked.
size_prefixed_root_as_schema_unchecked
Assumes, without verification, that a buffer of bytes contains a size prefixed Schema and returns it.
size_prefixed_root_as_schema_with_opts
Verifies, with the given verifier options, that a buffer of bytes contains a size prefixed Schema and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use size_prefixed_root_as_schema_unchecked.
size_prefixed_root_as_sparse_tensor
Verifies that a buffer of bytes contains a size prefixed SparseTensor and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use size_prefixed_root_as_sparse_tensor_unchecked.
size_prefixed_root_as_sparse_tensor_unchecked
Assumes, without verification, that a buffer of bytes contains a size prefixed SparseTensor and returns it.
size_prefixed_root_as_sparse_tensor_with_opts
Verifies, with the given verifier options, that a buffer of bytes contains a size prefixed SparseTensor and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use size_prefixed_root_as_sparse_tensor_unchecked.
size_prefixed_root_as_tensor
Verifies that a buffer of bytes contains a size prefixed Tensor and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use size_prefixed_root_as_tensor_unchecked.
size_prefixed_root_as_tensor_unchecked
Assumes, without verification, that a buffer of bytes contains a size prefixed Tensor and returns it.
size_prefixed_root_as_tensor_with_opts
Verifies, with the given verifier options, that a buffer of bytes contains a size prefixed Tensor and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use size_prefixed_root_as_tensor_unchecked.
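
The difference between the root_as_* and size_prefixed_root_as_* families is the framing of the input buffer. The sketch below illustrates that framing with the standard library only. It assumes the usual FlatBuffers convention of a 4-byte little-endian length prefix that counts the bytes following it; the helpers add_size_prefix and strip_size_prefix are hypothetical and are not functions of this crate.

```rust
/// Hypothetical helper: frames `payload` with a 4-byte little-endian
/// length prefix (assumed FlatBuffers size-prefix convention).
fn add_size_prefix(payload: &[u8]) -> Vec<u8> {
    let mut framed = Vec::with_capacity(4 + payload.len());
    framed.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    framed.extend_from_slice(payload);
    framed
}

/// Hypothetical helper: returns the payload of a size-prefixed buffer,
/// or `None` if the buffer is shorter than the prefix claims.
fn strip_size_prefix(buf: &[u8]) -> Option<&[u8]> {
    if buf.len() < 4 {
        return None;
    }
    let len = u32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
    // `get` performs the bounds check, so a truncated buffer yields `None`.
    buf.get(4..4usize.checked_add(len)?)
}
```

The verifying functions listed above go further than this length check: they also validate the table structure inside the payload, which is why the unchecked variants should only be used on buffers from a trusted source.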