Struct SuperArray 

Source
pub struct SuperArray {
    pub chunks: Vec<Array>,
    pub field: Option<Arc<Field>>,
    pub null_counts: Option<Vec<usize>>,
}
§SuperArray

Higher-order container for multiple immutable Array chunks with optional shared field metadata.

§Description

  • Stores an ordered sequence of Array chunks with a single optional Field for all.
  • Equivalent to Apache Arrow’s ChunkedArray when sent over FFI, where it is treated as a single logical column.
  • It can also serve as an unbounded or continuously growing collection of segments, making it useful for streaming ingestion and partitioned storage.
  • Chunk lengths may vary without restriction.

§Field Metadata

  • Field metadata is stored once at the SuperArray level.
  • For streaming consolidation (e.g., Dam output), field may be None.
  • Use field() when metadata is optional and field_ref() when it is required; field_ref() panics if no field is set.
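
The difference between the two accessors can be sketched with stand-in types. `Field` and `Column` below are hypothetical simplifications written for this illustration; they are not the Minarrow definitions, and the panic message is an assumption.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for the real field type.
struct Field {
    name: String,
}

/// Hypothetical stand-in for a container with optional metadata.
struct Column {
    field: Option<Arc<Field>>,
}

impl Column {
    /// Optional access: callers decide what to do when metadata is absent.
    fn field(&self) -> Option<&Field> {
        self.field.as_deref()
    }

    /// Required access: panics when metadata is absent.
    fn field_ref(&self) -> &Field {
        self.field
            .as_deref()
            .expect("field metadata is required but was not set")
    }
}
```

In this sketch, a streaming consumer would call `field()` and handle `None`, while schema-building code would call `field_ref()` and treat a missing field as a programming error.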

§Example

// From raw arrays without field metadata
// (clone so arr1 and arr2 stay usable below)
let sa = SuperArray::from_arrays(vec![arr1.clone(), arr2.clone()]);
assert!(sa.field().is_none());

// From arrays with field metadata
let sa = SuperArray::from_arrays_with_field(
    vec![arr1, arr2],
    Field::new("col", ArrowType::Int32, false, None)
);
assert_eq!(sa.field().unwrap().name, "col");

// From FieldArrays (extracts field from first)
let expected_name = fa1.field.name.clone(); // read before fa1 is moved
let sa = SuperArray::from_field_array_chunks(vec![fa1, fa2]);
assert_eq!(sa.field().unwrap().name, expected_name);

Fields§

§chunks: Vec<Array>

The underlying array chunks.

§field: Option<Arc<Field>>

Optional field metadata, shared by all chunks.

§null_counts: Option<Vec<usize>>

Optional null counts, one per chunk. If present, the vector must have the same length as chunks.

Implementations§

Source§

impl SuperArray

Source

pub fn new() -> SuperArray

Constructs an empty SuperArray with no field metadata.

Source

pub fn from_arrays(chunks: Vec<Array>) -> SuperArray

Constructs a SuperArray from raw Array chunks without field metadata.

Use this for streaming consolidation patterns where field metadata is not needed.

§Panics

Panics if chunks have mismatched types.

Source

pub fn from_arrays_with_field( chunks: Vec<Array>, field: impl Into<Arc<Field>>, ) -> SuperArray

Constructs a SuperArray from raw Array chunks with field metadata.

The field metadata applies to all chunks (they represent the same logical column).

§Panics

Panics if chunks have mismatched types or don’t match the field type.

Source

pub fn from_arrays_nc(chunks: Vec<Array>, null_counts: Vec<usize>) -> SuperArray

Constructs a SuperArray from raw Array chunks with null counts.

§Panics

Panics if chunks have mismatched types or null_counts length doesn’t match chunks length.

Source

pub fn from_field_array_chunks(chunks: Vec<FieldArray>) -> SuperArray

Constructs a SuperArray from FieldArray chunks.

Extracts field metadata and null counts from the chunks.

§Panics

Panics if chunks is empty or metadata/type/nullable mismatch is found.

Source

pub fn from_chunks(chunks: Vec<FieldArray>) -> SuperArray

Constructs a SuperArray from a Vec<FieldArray>.

Alias for from_field_array_chunks.

Source

pub fn from_slices(slices: &[ArrayV], field: Arc<Field>) -> SuperArray

Materialises a SuperArray from an existing slice of ArrayView tuples, using the provided field metadata (applied to all slices).

§Panics

Panics if the slice list is empty, or if any slice’s type or nullability does not match the provided field.

Source

pub fn slice(&self, offset: usize, len: usize) -> SuperArrayV

Returns a zero-copy view of this chunked array over the half-open window [offset, offset + len).

If the chunks are fragmented in memory, access patterns may result in degraded cache locality and reduced SIMD optimisation.

§Panics

Panics if field metadata is not present.
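
To illustrate how a global window can span several chunks, the sketch below maps a window onto per-chunk segments. It works on chunk lengths only; `Segment` and `window_segments` are hypothetical names for this example, and the out-of-bounds assertion is an assumption rather than documented behaviour of slice().

```rust
/// One piece of a window: (chunk index, offset inside that chunk, length).
type Segment = (usize, usize, usize);

/// Maps the global window [offset, offset + len) onto per-chunk segments.
/// Panics if the window runs past the total length (an assumption of this sketch).
fn window_segments(chunk_lens: &[usize], offset: usize, len: usize) -> Vec<Segment> {
    let total: usize = chunk_lens.iter().sum();
    assert!(offset + len <= total, "window out of bounds");

    let mut segments = Vec::new();
    let mut skip = offset; // rows still to skip before the window starts
    let mut remaining = len; // rows still to cover
    for (idx, &chunk_len) in chunk_lens.iter().enumerate() {
        if remaining == 0 {
            break;
        }
        if skip >= chunk_len {
            skip -= chunk_len; // the window starts in a later chunk
            continue;
        }
        let take = (chunk_len - skip).min(remaining);
        segments.push((idx, skip, take));
        remaining -= take;
        skip = 0; // later chunks are read from their start
    }
    segments
}
```

With chunk lengths 3, 2 and 4, the window starting at row 2 with length 5 touches all three chunks, which is the fragmented access pattern the note on cache locality refers to.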

Source

pub fn field(&self) -> Option<&Field>

Returns the field metadata if present.

Source

pub fn field_ref(&self) -> &Field

Returns the field metadata, panicking if not present.

Use this when field metadata is required (e.g., for schema operations).

Source

pub fn has_field(&self) -> bool

Returns true if this SuperArray has field metadata.

Source

pub fn field_arc(&self) -> Option<&Arc<Field>>

Returns the Arc-wrapped field if present.

Source

pub fn arrow_type(&self) -> ArrowType

Returns the Arrow physical type from the first chunk.

Falls back to field metadata if no chunks present.

§Panics

Panics if both chunks and field are empty/None.
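
The fallback order described here can be sketched as a small function. `Kind` and `resolve_type` are hypothetical stand-ins for this illustration, not part of the Minarrow API.

```rust
/// Hypothetical stand-in for the Arrow type enum.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    Int32,
    Utf8,
}

/// Prefers the first chunk's type, then the field's type, and panics
/// when neither is available.
fn resolve_type(chunk_types: &[Kind], field_type: Option<Kind>) -> Kind {
    chunk_types
        .first()
        .copied()
        .or(field_type)
        .expect("no chunks and no field metadata to take a type from")
}
```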

Source

pub fn is_nullable(&self) -> bool

Returns the nullability flag.

§Panics

Panics if field metadata is not present.

Source

pub fn n_chunks(&self) -> usize

Returns the number of logical chunks.

Source

pub fn len(&self) -> usize

Returns total logical length (sum of all chunk lengths).

Source

pub fn is_empty(&self) -> bool

Returns true if the array has no chunks or all chunks are empty.
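
As a rough illustration of the len() and is_empty() definitions, the sketch below models chunks as plain `Vec<i32>` values. `total_len` and `all_empty` are hypothetical helpers, not Minarrow functions.

```rust
/// Total logical length: the sum of every chunk's length.
fn total_len(chunks: &[Vec<i32>]) -> usize {
    chunks.iter().map(|c| c.len()).sum()
}

/// True when there are no chunks, or when every chunk is empty.
/// `all` returns true for an empty iterator, which covers the no-chunks case.
fn all_empty(chunks: &[Vec<i32>]) -> bool {
    chunks.iter().all(|c| c.is_empty())
}
```

Note that under this definition a container holding two empty chunks is empty even though its chunk count is 2.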

Source

pub fn chunk(&self, idx: usize) -> Option<&Array>

Returns a reference to a specific chunk, if it exists.

Source

pub fn chunk_null_count(&self, idx: usize) -> Option<usize>

Returns the null count for a specific chunk, if available.

Source

pub fn push(&mut self, chunk: Array)

Appends a raw array chunk.

If null counts are being tracked, the null count is computed from the array’s null_mask. If you already know the null count, use push_with_null_count() to avoid recomputation.

§Panics

Panics if the chunk type doesn’t match existing chunks or field.

Source

pub fn push_with_null_count(&mut self, chunk: Array, null_count: usize)

Appends a raw array chunk with its null count.

When the null count is already known, this is slightly faster than push() because it skips recomputing the count from the null mask.
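
The trade-off between the two push methods can be sketched with a stand-in container. `Chunk` and `Tracked` are hypothetical types for this example (nulls are modelled as `None` rather than a bit mask), so this shows the idea only, not the Minarrow implementation.

```rust
/// Stand-in chunk: `None` marks a null value.
type Chunk = Vec<Option<i32>>;

/// Hypothetical container that keeps one null count per chunk.
#[derive(Default)]
struct Tracked {
    chunks: Vec<Chunk>,
    null_counts: Vec<usize>,
}

impl Tracked {
    /// Scans the chunk to count nulls, then stores both.
    fn push(&mut self, chunk: Chunk) {
        let nulls = chunk.iter().filter(|v| v.is_none()).count();
        self.push_with_null_count(chunk, nulls);
    }

    /// Trusts the caller's count and skips the scan.
    fn push_with_null_count(&mut self, chunk: Chunk, null_count: usize) {
        debug_assert!(null_count <= chunk.len());
        self.chunks.push(chunk);
        self.null_counts.push(null_count);
    }
}
```

Both paths keep `null_counts` the same length as `chunks`, which is the invariant stated for the null_counts field.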

Source

pub fn push_field_array(&mut self, chunk: FieldArray)

Validates and appends a FieldArray chunk.

If this SuperArray has no field metadata yet, it will be set from the chunk.

§Panics

Panics if the chunk does not match the expected type, nullability, or field name.

Source

pub fn insert_rows( &mut self, index: usize, other: impl Into<SuperArray>, ) -> Result<(), MinarrowError>

Inserts rows from another SuperArray (or Array) at the specified index.

This is an O(n) operation where n is the number of elements in the chunk containing the insertion point.

§Arguments

  • index - Global row position before which to insert (0 = prepend, len() = append)
  • other - SuperArray or Array to insert (via Into<SuperArray>)

§Requirements

  • Array types must match
  • index must be <= self.len()

§Strategy

Finds the chunk containing the insertion point and inserts all of other’s data into that chunk. This may cause the target chunk to grow significantly.

§Errors

  • IndexError if index > len()
  • Type mismatch if array types don’t match

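
The strategy can be sketched on plain `Vec<i32>` chunks. The function below is a hypothetical illustration: the `String` error, the handling of an empty container, and the choice to give a boundary position to the earlier chunk are all assumptions of this sketch.

```rust
/// Inserts `rows` before global position `index`, growing the chunk that
/// contains the insertion point. Returns Err if `index` is past the end.
fn insert_rows(chunks: &mut Vec<Vec<i32>>, index: usize, rows: &[i32]) -> Result<(), String> {
    let total: usize = chunks.iter().map(|c| c.len()).sum();
    if index > total {
        return Err(format!("index {index} is out of bounds for length {total}"));
    }
    if chunks.is_empty() {
        chunks.push(rows.to_vec()); // nothing to grow yet: start the first chunk
        return Ok(());
    }
    let mut local = index; // position relative to the current chunk
    for chunk in chunks.iter_mut() {
        // A boundary position goes to the earlier chunk (an arbitrary choice here).
        if local <= chunk.len() {
            let tail = chunk.split_off(local); // rows after the insertion point
            chunk.extend_from_slice(rows);
            chunk.extend(tail);
            return Ok(());
        }
        local -= chunk.len();
    }
    unreachable!("index <= total guarantees a containing chunk")
}
```

Only the target chunk is rewritten, which is why the cost scales with that chunk's length rather than with the whole container.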
Source

pub fn rechunk( &mut self, strategy: RechunkStrategy, ) -> Result<(), MinarrowError>

Rechunks the array according to the specified strategy.

Redistributes data across chunks using an efficient incremental approach that avoids full materialisation:

  • Count(n): Creates chunks of n elements. The last chunk may be smaller.
  • Auto: Uses a default size of 8192 elements.
  • Memory(bytes): Targets a specific memory size per chunk.

§Arguments

  • strategy - The rechunking strategy to use

§Errors

  • Returns IndexError if Count(0) is specified
  • Returns IndexError if the memory-based calculation results in a chunk size of 0

§Example

// Rechunk into 1024-element chunks
array.rechunk(RechunkStrategy::Count(1024))?;

// Rechunk with default size
array.rechunk(RechunkStrategy::Auto)?;

// Target 64KB per chunk
array.rechunk(RechunkStrategy::Memory(65536))?;
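
For intuition, the count-based redistribution can be sketched on plain `Vec<i32>` chunks. `Strategy` and this free `rechunk` function are hypothetical stand-ins: the sketch returns a new vector instead of mutating in place, reports errors as `String`, and leaves out the memory-based variant.

```rust
/// Hypothetical stand-in for the strategy enum described above.
enum Strategy {
    Count(usize),
    Auto,
}

/// Regroups rows into chunks of the target size, one output chunk at a time,
/// so no single buffer holding every row is ever built.
fn rechunk(chunks: Vec<Vec<i32>>, strategy: Strategy) -> Result<Vec<Vec<i32>>, String> {
    let size = match strategy {
        Strategy::Count(0) => return Err("chunk size must be non-zero".to_string()),
        Strategy::Count(n) => n,
        Strategy::Auto => 8192, // default size quoted in the text above
    };
    let mut out = Vec::new();
    let mut current = Vec::new();
    for chunk in chunks {
        for value in chunk {
            current.push(value);
            if current.len() == size {
                // Hand over the full chunk and start a fresh one.
                out.push(std::mem::take(&mut current));
            }
        }
    }
    if !current.is_empty() {
        out.push(current); // the last chunk may be smaller
    }
    Ok(out)
}
```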
Source

pub fn rechunk_to( &mut self, up_to_index: usize, strategy: RechunkStrategy, ) -> Result<(), MinarrowError>

Rechunks only the first up_to_index elements, leaving the rest untouched.

This is useful for streaming scenarios where new data is being appended and you want to rechunk stable data while leaving recent additions alone.

§Arguments

  • up_to_index - Rechunk only elements before this index
  • strategy - The rechunking strategy to use

§Errors

  • Returns IndexError if up_to_index is greater than the array length
  • Returns the same errors as rechunk() for invalid strategies

§Example

// Rechunk first 1000 elements, leave the rest untouched
array.rechunk_to(1000, RechunkStrategy::Count(512))?;
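
A prefix-only rechunk can be sketched as follows. `rechunk_prefix` is a hypothetical helper on `Vec<i32>` chunks; for brevity it gathers the prefix into one buffer before regrouping, and it splits a chunk that straddles the boundary, which is an assumption and may differ from how Minarrow treats that case.

```rust
/// Rechunks the first `up_to` rows into chunks of `size`, keeping later rows
/// in their original chunks. A chunk that straddles the boundary is split.
fn rechunk_prefix(
    chunks: Vec<Vec<i32>>,
    up_to: usize,
    size: usize,
) -> Result<Vec<Vec<i32>>, String> {
    let total: usize = chunks.iter().map(|c| c.len()).sum();
    if up_to > total {
        return Err("up_to is past the end of the data".to_string());
    }
    if size == 0 {
        return Err("chunk size must be non-zero".to_string());
    }
    let mut prefix: Vec<i32> = Vec::with_capacity(up_to);
    let mut tail: Vec<Vec<i32>> = Vec::new();
    for mut chunk in chunks {
        let need = up_to - prefix.len(); // rows still missing from the prefix
        if need == 0 {
            tail.push(chunk); // untouched: lies entirely after the boundary
        } else if chunk.len() <= need {
            prefix.extend(chunk);
        } else {
            let rest = chunk.split_off(need);
            prefix.extend(chunk);
            tail.push(rest);
        }
    }
    let mut out: Vec<Vec<i32>> = prefix.chunks(size).map(|c| c.to_vec()).collect();
    out.extend(tail);
    Ok(out)
}
```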
Source

pub fn into_chunks(self) -> Vec<Array>

Consumes the SuperArray and returns the underlying chunks.

Source

pub fn chunks(&self) -> &[Array]

Returns a reference to the underlying chunks.

Trait Implementations§

Source§

impl AsRef<SuperArray> for PyChunkedArray

Source§

fn as_ref(&self) -> &SuperArray

Converts this type into a shared reference of the (usually inferred) input type.
Source§

impl Clone for SuperArray

Source§

fn clone(&self) -> SuperArray

Returns a duplicate of the value.
1.0.0 · Source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.
Source§

impl Concatenate for SuperArray

Source§

fn concat(self, other: SuperArray) -> Result<SuperArray, MinarrowError>

Concatenates two SuperArrays by appending all chunks from other to self.

§Requirements

  • Both SuperArrays must have compatible types

§Returns

A new SuperArray containing all chunks from self followed by all chunks from other

§Errors

  • IncompatibleTypeError if array types don’t match

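
Chunk-level concatenation can be sketched with a stand-in chunk enum. `Chunk` and this free `concat` function are hypothetical, the type check compares only the first chunk of each side, and the error is a plain `String` rather than a MinarrowError.

```rust
use std::mem::discriminant;

/// Hypothetical stand-in for a typed array chunk.
#[derive(Debug, PartialEq)]
enum Chunk {
    Int(Vec<i32>),
    Text(Vec<String>),
}

/// Appends the chunks of `right` after those of `left` when their types agree.
fn concat(mut left: Vec<Chunk>, right: Vec<Chunk>) -> Result<Vec<Chunk>, String> {
    if let (Some(a), Some(b)) = (left.first(), right.first()) {
        // `discriminant` compares enum variants while ignoring their payloads.
        if discriminant(a) != discriminant(b) {
            return Err("incompatible chunk types".to_string());
        }
    }
    left.extend(right); // chunks are moved as whole units; rows are not copied
    Ok(left)
}
```

Because whole chunks are moved, the cost in this sketch depends on the number of chunks and not on the number of rows.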
Source§

impl Consolidate for SuperArray

Source§

fn consolidate(self) -> Array

Consolidates all chunks into a single contiguous Array.

Materialises all rows from all chunks into one contiguous buffer. Use this when you need contiguous memory for operations or APIs that require single buffers.

When the arena feature is enabled, all buffers are written into a single allocation then sliced into typed views. The resulting buffers are SharedBuffer-backed; mutations trigger copy-on-write.

Without the arena feature, it falls back to folding the chunks together with concat.

§Panics

Panics if the SuperArray is empty.
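
The basic idea of consolidation, without the arena path, can be sketched on `Vec<i32>` chunks. This free `consolidate` function is a hypothetical illustration; treating "empty" as "no chunks" is an assumption of the sketch.

```rust
/// Copies every row, in chunk order, into one contiguous buffer.
/// Panics when there are no chunks (this sketch's reading of "empty").
fn consolidate(chunks: Vec<Vec<i32>>) -> Vec<i32> {
    assert!(!chunks.is_empty(), "cannot consolidate an empty container");
    let total: usize = chunks.iter().map(|c| c.len()).sum();
    let mut out = Vec::with_capacity(total); // reserve the full size up front
    for chunk in chunks {
        out.extend(chunk);
    }
    out
}
```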

Source§

type Output = Array

The type produced after consolidation.
Source§

impl Debug for SuperArray

Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result<(), Error>

Formats the value using the given formatter.
Source§

impl Default for SuperArray

Source§

fn default() -> SuperArray

Returns the “default value” for a type.
Source§

impl Display for SuperArray

Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result<(), Error>

Formats the value using the given formatter.
Source§

impl From<Array> for SuperArray

Source§

fn from(array: Array) -> SuperArray

Converts to this type from the input type.
Source§

impl From<FieldArray> for SuperArray

Source§

fn from(fa: FieldArray) -> SuperArray

Converts to this type from the input type.
Source§

impl From<PyChunkedArray> for SuperArray

Source§

fn from(value: PyChunkedArray) -> Self

Converts to this type from the input type.
Source§

impl From<SuperArray> for PyChunkedArray

Source§

fn from(array: SuperArray) -> Self

Converts to this type from the input type.
Source§

impl From<Vec<Array>> for SuperArray

Source§

fn from(arrays: Vec<Array>) -> SuperArray

Converts to this type from the input type.
Source§

impl From<Vec<FieldArray>> for SuperArray

Source§

fn from(arrays: Vec<FieldArray>) -> SuperArray

Converts to this type from the input type.
Source§

impl FromIterator<Array> for SuperArray

Source§

fn from_iter<T>(iter: T) -> SuperArray
where T: IntoIterator<Item = Array>,

Creates a value from an iterator.
Source§

impl FromIterator<FieldArray> for SuperArray

Source§

fn from_iter<T>(iter: T) -> SuperArray
where T: IntoIterator<Item = FieldArray>,

Creates a value from an iterator.
Source§

impl PartialEq for SuperArray

Source§

fn eq(&self, other: &SuperArray) -> bool

Tests for self and other values to be equal, and is used by ==.
1.0.0 · Source§

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.
Source§

impl Shape for SuperArray

Source§

fn shape(&self) -> ShapeDim

Returns an arbitrary Shape dimension for any data shape.
Source§

fn shape_1d(&self) -> usize

Returns the first dimension shape.
Source§

fn shape_2d(&self) -> (usize, usize)

Returns the first and second dimension shapes.
Source§

fn shape_3d(&self) -> (usize, usize, usize)

Returns the first, second and third dimension shapes.
Source§

fn shape_4d(&self) -> (usize, usize, usize, usize)

Returns the first, second, third and fourth dimension shapes.
Source§

impl StructuralPartialEq for SuperArray

Auto Trait Implementations§

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self.
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value.
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.
Source§

impl<T> CloneToUninit for T
where T: Clone,

Source§

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.
Source§

impl<T> CustomValue for T
where T: Any + Send + Sync + Clone + PartialEq + Debug,

Source§

fn as_any(&self) -> &(dyn Any + 'static)

Downcasts the type as Any
Source§

fn deep_clone(&self) -> Arc<dyn CustomValue>

Returns a deep clone of the object.
Source§

fn eq_box(&self, other: &(dyn CustomValue + 'static)) -> bool

Performs semantic equality on the boxed object.
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> Print for T
where T: Display,

Source§

fn print(&self)
where Self: Display,

Source§

impl<T> ToOwned for T
where T: Clone,

Source§

type Owned = T

The resulting type after obtaining ownership.
Source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.
Source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.
Source§

impl<T> ToString for T
where T: Display + ?Sized,

Source§

fn to_string(&self) -> String

Converts the given value to a String.
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
Source§

impl<T> Ungil for T
where T: Send,