Struct arrow_array::array::RunArray  
pub struct RunArray<R: RunEndIndexType> { /* private fields */ }
Run-end encoding (REE) is a variation of run-length encoding (RLE).
It is well suited to data in which the same value repeats consecutively.
A RunArray contains a run_ends array and a values array of the same length.
The run_ends array stores the logical index at which each run ends (exclusive). The values array
stores the value of each run. The example below illustrates how a logical array is represented as a
RunArray.
┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─┐
  ┌─────────────────┐  ┌─────────┐       ┌─────────────────┐
│ │        A        │  │    2    │ │     │        A        │     
  ├─────────────────┤  ├─────────┤       ├─────────────────┤
│ │        D        │  │    3    │ │     │        A        │    run length of 'A' = run_ends[0] - 0 = 2
  ├─────────────────┤  ├─────────┤       ├─────────────────┤
│ │        B        │  │    6    │ │     │        D        │    run length of 'D' = run_ends[1] - run_ends[0] = 1
  └─────────────────┘  └─────────┘       ├─────────────────┤
│        values          run_ends  │     │        B        │     
                                         ├─────────────────┤
└ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─┘     │        B        │     
                                         ├─────────────────┤
               RunArray                  │        B        │    run length of 'B' = run_ends[2] - run_ends[1] = 3
              length = 3                 └─────────────────┘
  
                                            Logical array
                                               Contents
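The mapping in the diagram can be reproduced with a small sketch in plain Rust. It is not the arrow-rs implementation: `decode_runs` is a hypothetical helper that works on slices rather than Arrow arrays, and it assumes the run ends are strictly increasing.

```rust
/// Expands run-end encoded data into its logical sequence.
/// `run_ends[i]` is the exclusive logical index at which run `i` ends.
/// Hypothetical helper; assumes `run_ends` is strictly increasing.
fn decode_runs<T: Clone>(values: &[T], run_ends: &[usize]) -> Vec<T> {
    let mut out = Vec::new();
    let mut start = 0;
    for (value, &end) in values.iter().zip(run_ends) {
        // The run length is the distance from the previous run end (or 0).
        out.extend(std::iter::repeat(value.clone()).take(end - start));
        start = end;
    }
    out
}
```

Calling `decode_runs(&['A', 'D', 'B'], &[2, 3, 6])` yields the six-element logical array shown on the right of the diagram.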
Implementations
impl<R: RunEndIndexType> RunArray<R>
pub fn logical_len(run_ends: &PrimitiveArray<R>) -> usize
Calculates the logical length of the array encoded by the given run_ends array.
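Because each run end is an exclusive logical index, the logical length equals the last run end. A minimal sketch of that idea on a plain slice follows; the free function below is a hypothetical stand-in, not the associated function documented here.

```rust
/// Logical length encoded by a slice of run ends.
/// Hypothetical stand-in that takes a plain slice instead of a PrimitiveArray.
fn logical_len(run_ends: &[i32]) -> usize {
    // An empty run_ends slice encodes an empty logical array.
    run_ends.last().map_or(0, |&end| end as usize)
}
```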
pub fn try_new(
    run_ends: &PrimitiveArray<R>,
    values: &dyn Array
) -> Result<Self, ArrowError>
Attempts to create a RunArray from the given run_ends (the index where each run ends) and values (the value of each run). Returns an error if the given data is not compatible with the RunEndEncoded specification.
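The kind of check implied by "compatible with the RunEndEncoded specification" can be sketched as follows. This is an assumption-laden illustration, not the checks arrow-rs actually performs: `validate_run_ends` is a hypothetical helper, and it covers only the rules that run_ends and values have equal length and that run ends are positive and strictly increasing.

```rust
/// Checks a slice of run ends against a values length.
/// Hypothetical helper; covers only a subset of what a real validator needs.
fn validate_run_ends(run_ends: &[i32], values_len: usize) -> Result<(), String> {
    if run_ends.len() != values_len {
        return Err(format!(
            "run_ends has {} entries but values has {}",
            run_ends.len(),
            values_len
        ));
    }
    // Starting from 0 also rejects a first run end that is zero or negative.
    let mut prev = 0;
    for (i, &end) in run_ends.iter().enumerate() {
        if end <= prev {
            return Err(format!(
                "run_ends[{}] = {} is not greater than the previous run end {}",
                i, end, prev
            ));
        }
        prev = end;
    }
    Ok(())
}
```

With this sketch, `[2, 3, 6]` paired with three values passes, while `[2, 2, 6]` fails because the second run would be empty.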
pub fn run_ends(&self) -> &RunEndBuffer<R::Native>
Returns a reference to the RunEndBuffer.
pub fn values(&self) -> &ArrayRef
Returns a reference to the values array.
Note: any slicing of this RunArray is not applied to the returned array
and must be handled separately.
pub fn get_start_physical_index(&self) -> usize
Returns the physical index at which the array slice starts.
pub fn get_end_physical_index(&self) -> usize
Returns the physical index at which the array slice ends.
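Since slicing is not applied to the values array, a caller typically needs the first and last physical indices that a logical slice touches. The sketch below models that on plain slices; `physical_bounds` is a hypothetical helper, and `offset` and `len` stand in for the slice's logical offset and length.

```rust
/// Returns the inclusive range of physical indices covered by the logical
/// slice `[offset, offset + len)`, or `None` if the slice is empty or runs
/// past the end of the encoded data.
/// Hypothetical helper; assumes `run_ends` is strictly increasing.
fn physical_bounds(run_ends: &[usize], offset: usize, len: usize) -> Option<(usize, usize)> {
    if len == 0 {
        return None;
    }
    // The first run whose exclusive end lies beyond a logical index holds it.
    let start = run_ends.partition_point(|&end| end <= offset);
    let end = run_ends.partition_point(|&end| end <= offset + len - 1);
    if end >= run_ends.len() {
        return None;
    }
    Some((start, end))
}
```

For run ends `[2, 3, 6]`, the logical slice starting at 3 with length 3 lies entirely in the last run, so only `values[2..=2]` is relevant to it.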
pub fn downcast<V: 'static>(&self) -> Option<TypedRunArray<'_, R, V>>
Downcast this RunArray to a TypedRunArray
use arrow_array::{Array, ArrayAccessor, RunArray, StringArray, types::Int32Type};
let orig = [Some("a"), Some("b"), None];
let run_array = RunArray::<Int32Type>::from_iter(orig);
let typed = run_array.downcast::<StringArray>().unwrap();
assert_eq!(typed.value(0), "a");
assert_eq!(typed.value(1), "b");
assert!(typed.values().is_null(2));
pub fn get_physical_index(&self, logical_index: usize) -> usize
Returns the index into the physical array for the given index into the logical array.
This function adjusts the input logical index based on ArrayData::offset and
performs a binary search on the run_ends array for the adjusted index.
The result is arbitrary if logical_index >= self.len().
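The lookup described above can be sketched on a plain slice with the standard library's binary search. `physical_index` is a hypothetical free function, `offset` stands in for ArrayData::offset, and unlike the method documented here it returns `None` for an out-of-range index rather than an arbitrary value.

```rust
/// Maps a logical index (relative to a slice starting at `offset`) to the
/// physical index of the run that contains it.
/// Hypothetical helper; assumes `run_ends` is strictly increasing.
fn physical_index(run_ends: &[i32], offset: usize, logical_index: usize) -> Option<usize> {
    // Adjust for the slice offset before searching.
    let target = i32::try_from(offset + logical_index).ok()?;
    // Binary search for the first run whose exclusive end is past the target.
    let idx = run_ends.partition_point(|&end| end <= target);
    if idx < run_ends.len() {
        Some(idx)
    } else {
        None
    }
}
```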
pub fn get_physical_indices<I>(
    &self,
    logical_indices: &[I]
) -> Result<Vec<usize>, ArrowError>
where
    I: ArrowNativeType,
Returns the physical indices of the input logical indices. Returns an error if any
logical index cannot be converted to a physical index. The logical indices are sorted and iterated
along with the run_ends array to find the matching physical indices. This approach was chosen over
finding the physical index for each logical index with a binary search via
get_physical_index. Benchmarks of both approaches showed that the approach used here
scaled well for larger inputs.
See https://github.com/apache/arrow-rs/pull/3622#issuecomment-1407753727 for more details.
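The sort-and-walk strategy described above might look roughly like the following on plain slices. This is a sketch of the idea only: `physical_indices` is a hypothetical helper, it ignores any slice offset, and it reports errors as `String` instead of ArrowError.

```rust
/// Maps each logical index to its physical index in a single pass over
/// `run_ends`, returning results in the order of the input.
/// Hypothetical helper; assumes `run_ends` is strictly increasing.
fn physical_indices(run_ends: &[usize], logical: &[usize]) -> Result<Vec<usize>, String> {
    // Sort the positions rather than the indices themselves, so each result
    // can be written back to the slot of the input it belongs to.
    let mut order: Vec<usize> = (0..logical.len()).collect();
    order.sort_unstable_by_key(|&i| logical[i]);

    let mut out = vec![0; logical.len()];
    let mut physical = 0;
    for pos in order {
        let target = logical[pos];
        // Targets arrive in ascending order, so the cursor never moves back.
        while physical < run_ends.len() && run_ends[physical] <= target {
            physical += 1;
        }
        if physical == run_ends.len() {
            return Err(format!("logical index {} is out of bounds", target));
        }
        out[pos] = physical;
    }
    Ok(out)
}
```

After the sort, the walk visits each run end at most once, which is the property that separates this approach from one binary search per logical index.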
Trait Implementations
impl<T: RunEndIndexType> Array for RunArray<T>
fn data(&self) -> &ArrayData
fn slice(&self, offset: usize, length: usize) -> ArrayRef
fn nulls(&self) -> Option<&NullBuffer>
fn data_ref(&self) -> &ArrayData
fn offset(&self) -> usize
Returns the offset into the underlying data used by this array (slice). This defaults to 0.
fn is_null(&self, index: usize) -> bool
Returns whether the element at index is null.
When using this function on a slice, the index is relative to the slice.
fn is_valid(&self, index: usize) -> bool
Returns whether the element at index is not null.
When using this function on a slice, the index is relative to the slice.
fn null_count(&self) -> usize
fn get_buffer_memory_size(&self) -> usize
fn get_array_memory_size(&self) -> usize
Returns the total number of bytes of memory occupied physically by this array.
This value will always be greater than that returned by get_buffer_memory_size() and
includes the overhead of the data structures that contain the pointers to the various buffers.
impl<R: RunEndIndexType> Clone for RunArray<R>
impl<R: RunEndIndexType> Debug for RunArray<R>
impl<'a, T: RunEndIndexType> FromIterator<&'a str> for RunArray<T>
Constructs a RunArray from an iterator of strings.
Example:
use arrow_array::{RunArray, types::Int16Type};
let test = vec!["a", "a", "b", "c"];
let array: RunArray<Int16Type> = test.into_iter().collect();
assert_eq!(
    "RunArray {run_ends: [2, 3, 4], values: StringArray\n[\n  \"a\",\n  \"b\",\n  \"c\",\n]}\n",
    format!("{:?}", array)
);
impl<'a, T: RunEndIndexType> FromIterator<Option<&'a str>> for RunArray<T>
Constructs a RunArray from an iterator of optional strings.
Example:
use arrow_array::{RunArray, types::Int16Type};
let test = vec!["a", "a", "b", "c", "c"];
let array: RunArray<Int16Type> = test
    .iter()
    .map(|&x| if x == "b" { None } else { Some(x) })
    .collect();
assert_eq!(
    "RunArray {run_ends: [2, 3, 5], values: StringArray\n[\n  \"a\",\n  null,\n  \"c\",\n]}\n",
    format!("{:?}", array)
);