Struct polars_core::chunked_array::ChunkedArray

pub struct ChunkedArray<T: PolarsDataType> { /* private fields */ }

ChunkedArray

Every Series contains a ChunkedArray<T>. Unlike a Series, a ChunkedArray is typed. This allows us to apply closures to the data and collect the results into a ChunkedArray of the same type T. Below we use apply to compute the cosine of each value in a ChunkedArray.

fn apply_cosine(ca: &Float32Chunked) -> Float32Chunked {
    ca.apply(|v| v.cos())
}

If we would like to cast the result, we can use a Rust Iterator instead of an apply method. Note that iterators are slightly slower because null values are not skipped implicitly: every element is yielded as an Option.

fn apply_cosine_and_cast(ca: &Float32Chunked) -> Float64Chunked {
    ca.into_iter()
        .map(|opt_v| opt_v.map(|v| v.cos() as f64))
        .collect()
}

Another option is to apply and cast in a single call with apply_cast_numeric.

fn apply_cosine_and_cast(ca: &Float32Chunked) -> Float64Chunked {
    ca.apply_cast_numeric(|v| v.cos() as f64)
}
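
To illustrate why an apply can be cheaper than iterating over Options, here is a minimal std-only sketch. It is a simplified model, not Polars' implementation: `NullableF32`, `apply_values`, and `iter_opt` are hypothetical names, and the validity mask is a plain `Vec<bool>` rather than an Arrow bitmap.

```rust
/// Toy stand-in for one Arrow-style array: a dense value buffer plus a
/// validity mask (true = valid). Hypothetical type, not part of Polars.
#[derive(Debug, Clone, PartialEq)]
pub struct NullableF32 {
    values: Vec<f32>,
    validity: Vec<bool>,
}

impl NullableF32 {
    pub fn from_options(data: &[Option<f32>]) -> Self {
        NullableF32 {
            // A null still occupies a value slot; its content is irrelevant.
            values: data.iter().map(|v| v.unwrap_or(0.0)).collect(),
            validity: data.iter().map(|v| v.is_some()).collect(),
        }
    }

    /// apply-style: transform every value slot in one tight loop and reuse
    /// the validity mask unchanged. Nulls are never inspected.
    pub fn apply_values<F: Fn(f32) -> f32>(&self, f: F) -> Self {
        NullableF32 {
            values: self.values.iter().map(|&v| f(v)).collect(),
            validity: self.validity.clone(),
        }
    }

    /// Iterator-style: each element is checked against the mask and wrapped
    /// in an Option before the caller sees it.
    pub fn iter_opt(&self) -> impl Iterator<Item = Option<f32>> + '_ {
        self.values
            .iter()
            .zip(self.validity.iter())
            .map(|(&v, &valid)| if valid { Some(v) } else { None })
    }

    pub fn to_options(&self) -> Vec<Option<f32>> {
        self.iter_opt().collect()
    }
}
```

In this model, `apply_values(|v| v.cos())` never branches on nullness, while `iter_opt().map(|o| o.map(|v| v.cos() as f64))` pays for one check per element but can change the output type.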

Conversion between Series and ChunkedArrays

Converting between a Series and a ChunkedArray is straightforward in both directions.

fn to_chunked_array(series: &Series) -> PolarsResult<&Int32Chunked> {
    series.i32()
}

fn to_series(ca: Int32Chunked) -> Series {
    ca.into_series()
}
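
The asymmetry between the two signatures (a fallible borrow in one direction, an infallible move in the other) can be sketched with a toy type-erased column. This is an illustrative model only: `ToySeries` and `into_series` are hypothetical and much simpler than the real Series.

```rust
/// Toy type-erased column. Hypothetical; the real Series is far richer.
#[derive(Debug)]
pub enum ToySeries {
    Int32(Vec<Option<i32>>),
    Float32(Vec<Option<f32>>),
}

impl ToySeries {
    /// Same shape as `series.i32()`: borrow the typed data if the dtype
    /// matches, otherwise report an error instead of panicking.
    pub fn i32(&self) -> Result<&[Option<i32>], String> {
        match self {
            ToySeries::Int32(values) => Ok(values.as_slice()),
            ToySeries::Float32(_) => Err("expected Int32, got Float32".to_string()),
        }
    }
}

/// Going from typed to type-erased cannot fail: the type is known statically.
pub fn into_series(values: Vec<Option<i32>>) -> ToySeries {
    ToySeries::Int32(values)
}
```

Calling `i32()` on a `ToySeries::Float32` returns an `Err`, which is why the typed accessor in the example above returns a `PolarsResult`.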

Iterators

ChunkedArrays fully support Rust's native Iterator and DoubleEndedIterator traits, thereby giving access to all the adaptor methods available on iterators.


fn iter_forward(ca: &Float32Chunked) {
    ca.into_iter()
        .for_each(|opt_v| println!("{:?}", opt_v))
}

fn iter_backward(ca: &Float32Chunked) {
    ca.into_iter()
        .rev()
        .for_each(|opt_v| println!("{:?}", opt_v))
}
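
As a sketch of what those iterator methods allow, the helpers below work on a plain slice of `Option<f32>`, used here as a stand-in for the items that `ca.into_iter()` yields on a Float32Chunked. The function names are hypothetical; only standard-library adaptors are used.

```rust
/// Sum of the non-null values; `flatten` drops the `None`s.
pub fn sum_non_null(values: &[Option<f32>]) -> f32 {
    values.iter().copied().flatten().sum()
}

/// Last non-null value, found by walking from the back. `rev` is available
/// because slice iterators implement `DoubleEndedIterator`.
pub fn last_non_null(values: &[Option<f32>]) -> Option<f32> {
    values.iter().rev().copied().flatten().next()
}

/// Number of null entries.
pub fn null_count(values: &[Option<f32>]) -> usize {
    values.iter().filter(|v| v.is_none()).count()
}
```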

Memory layout

ChunkedArrays use Apache Arrow as the backend for their memory layout. Arrow's memory is immutable, which makes it possible to create multiple zero-copy (sub)views from a single array.
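
The idea of zero-copy views over immutable memory can be sketched with a reference-counted buffer plus an offset and a length. This is a simplified std-only model under stated assumptions: `View` is a hypothetical type, and Arrow's actual buffers also carry validity and type information.

```rust
use std::sync::Arc;

/// Toy immutable array view: a shared buffer plus an offset and a length.
#[derive(Debug, Clone)]
pub struct View {
    buffer: Arc<[i32]>,
    offset: usize,
    len: usize,
}

impl View {
    pub fn new(data: Vec<i32>) -> Self {
        let len = data.len();
        View { buffer: Arc::from(data), offset: 0, len }
    }

    /// Create a sub-view without copying any elements: only the reference
    /// count, offset, and length change. Panics if the range is out of bounds.
    pub fn slice(&self, offset: usize, len: usize) -> Self {
        assert!(offset + len <= self.len, "slice out of bounds");
        View {
            buffer: Arc::clone(&self.buffer),
            offset: self.offset + offset,
            len,
        }
    }

    pub fn as_slice(&self) -> &[i32] {
        &self.buffer[self.offset..self.offset + self.len]
    }

    /// True if both views point into the same allocation.
    pub fn shares_buffer_with(&self, other: &View) -> bool {
        Arc::ptr_eq(&self.buffer, &other.buffer)
    }
}
```

Because the buffer is never mutated, any number of views can coexist safely, and a view of a view still points into the original allocation.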

To be able to append data, Polars adds new memory locations as separate chunks, hence the ChunkedArray<T> data structure. Appends are cheap because they do not cause a full reallocation of the whole array (as could be the case with a Rust Vec).

However, multiple chunks in a ChunkedArray will slow down many operations that need random access, because there is an extra indirection and each index must be mapped to the proper chunk. Arithmetic may also be slowed down by this. For instance, when multiplying two ChunkedArrays with different chunk sizes, the operation cannot utilize SIMD.
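
The extra indirection for random access can be sketched as follows. This is an illustrative model, not Polars' code: `ChunkedVec` and `index_to_chunked_index` are hypothetical names, and the lookup is a simple linear walk over the chunk lengths.

```rust
/// Toy chunked column: a list of independently allocated chunks.
pub struct ChunkedVec {
    chunks: Vec<Vec<i32>>,
}

impl ChunkedVec {
    pub fn new(chunks: Vec<Vec<i32>>) -> Self {
        ChunkedVec { chunks }
    }

    /// Map a global index to (chunk index, index within that chunk) by
    /// walking the chunk lengths. Returns None if the index is out of bounds.
    pub fn index_to_chunked_index(&self, mut index: usize) -> Option<(usize, usize)> {
        for (chunk_idx, chunk) in self.chunks.iter().enumerate() {
            if index < chunk.len() {
                return Some((chunk_idx, index));
            }
            index -= chunk.len();
        }
        None
    }

    /// Random access costs one index mapping plus two lookups.
    pub fn get(&self, index: usize) -> Option<i32> {
        let (chunk_idx, inner_idx) = self.index_to_chunked_index(index)?;
        Some(self.chunks[chunk_idx][inner_idx])
    }
}
```

With a single chunk the loop exits on its first iteration, which is one reason a rechunked array gives more predictable access times.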

For predictable performance (no unexpected reallocation of memory), it is advised to call ChunkedArray::rechunk after multiple append operations.

See also ChunkedArray::extend for appends within a chunk.
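
The difference between appending a chunk, extending within a chunk, and rechunking can be sketched with a toy model. The type `GrowableChunked` is hypothetical, and unlike a real Arrow-backed implementation it clones the appended data instead of sharing buffers.

```rust
/// Toy chunked column used to contrast append, extend, and rechunk.
pub struct GrowableChunked {
    chunks: Vec<Vec<i32>>,
}

impl GrowableChunked {
    pub fn from_vec(v: Vec<i32>) -> Self {
        GrowableChunked { chunks: vec![v] }
    }

    /// append-style: add the other array's chunks as new chunks.
    /// Existing chunks are not touched or reallocated.
    /// (This sketch clones the data; a real implementation would share it.)
    pub fn append(&mut self, other: &GrowableChunked) {
        self.chunks.extend(other.chunks.iter().cloned());
    }

    /// extend-style: copy the other array's values into the last chunk.
    /// The chunk count stays the same, but that chunk may reallocate.
    pub fn extend(&mut self, other: &GrowableChunked) {
        if self.chunks.is_empty() {
            self.chunks.push(Vec::new());
        }
        let last = self.chunks.last_mut().expect("at least one chunk");
        for chunk in &other.chunks {
            last.extend_from_slice(chunk);
        }
    }

    /// rechunk-style: merge all chunks into one contiguous chunk.
    pub fn rechunk(&self) -> GrowableChunked {
        GrowableChunked { chunks: vec![self.chunks.concat()] }
    }

    pub fn n_chunks(&self) -> usize {
        self.chunks.len()
    }

    pub fn to_vec(&self) -> Vec<i32> {
        self.chunks.concat()
    }
}
```

In this model, repeated `append` calls grow the chunk count, `extend` keeps it constant at the cost of a possible reallocation, and `rechunk` pays one full copy to return to a single chunk.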

Implementations

impl<T: PolarsNumericType> ChunkedArray<T> where T::Native: Signed,

pub fn abs(&self) -> Self

impl<T> ChunkedArray<T> where T: PolarsNumericType,

pub fn append(&mut self, other: &Self)

impl<T: PolarsNumericType> ChunkedArray<T>

pub fn cast_and_apply_in_place<F, S>(&self, f: F) -> ChunkedArray<S> where F: Fn(S::Native) -> S::Native + Copy, S: PolarsNumericType,

impl<T: PolarsNumericType> ChunkedArray<T>

pub fn apply_mut<F>(&mut self, f: F) where F: Fn(T::Native) -> T::Native + Copy,

impl<T: PolarsDataType> ChunkedArray<T>

pub fn len(&self) -> usize

pub fn is_empty(&self) -> bool

pub fn rechunk(&self) -> Self

pub fn slice(&self, offset: i64, length: usize) -> Self

pub fn limit(&self, num_elements: usize) -> Self where Self: Sized,

pub fn head(&self, length: Option<usize>) -> Self where Self: Sized,

pub fn tail(&self, length: Option<usize>) -> Self where Self: Sized,

impl<T> ChunkedArray<T> where T: PolarsNumericType,

pub fn extend(&mut self, other: &Self)

impl ChunkedArray<ListType>

pub fn full_null_with_dtype(name: &str, length: usize, inner_dtype: &DataType) -> ListChunked

impl<T> ChunkedArray<T> where ChunkedArray<T>: IntoSeries, T: PolarsFloatType, T::Native: Float + IsFloat + SubAssign + Pow<T::Native, Output = T::Native>,

pub fn rolling_apply_float<F>(&self, window_size: usize, f: F) -> PolarsResult<Self> where F: FnMut(&mut ChunkedArray<T>) -> Option<T::Native>,

impl ChunkedArray<BooleanType>

pub fn all(&self) -> bool

pub fn any(&self) -> bool

impl<T> ChunkedArray<T> where T: PolarsFloatType, T::Native: Float,

pub fn is_nan(&self) -> BooleanChunked

pub fn is_not_nan(&self) -> BooleanChunked

pub fn is_finite(&self) -> BooleanChunked

pub fn is_infinite(&self) -> BooleanChunked

pub fn none_to_nan(&self) -> Self

impl ChunkedArray<ListType>

pub fn par_iter(&self) -> impl ParallelIterator<Item = Option<Series>> + '_

pub fn par_iter_indexed(&mut self) -> impl IndexedParallelIterator<Item = Option<Series>> + '_

impl ChunkedArray<Utf8Type>

pub fn par_iter_indexed(&self) -> impl IndexedParallelIterator<Item = Option<&str>>

pub fn par_iter(&self) -> impl ParallelIterator<Item = Option<&str>> + '_

impl<T> ChunkedArray<T> where T: PolarsNumericType,

pub fn to_ndarray(&self) -> PolarsResult<ArrayView1<'_, T::Native>>

impl ChunkedArray<ListType>

pub fn to_ndarray<N>(&self) -> PolarsResult<Array2<N::Native>> where N: PolarsNumericType,

impl<T> ChunkedArray<T> where T: PolarsDataType,

pub fn from_chunks(name: &str, chunks: Vec<ArrayRef>) -> Self

impl<T> ChunkedArray<T> where T: PolarsNumericType,

pub fn from_vec(name: &str, v: Vec<T::Native>) -> Self

pub fn new_from_owned_with_null_bitmap(name: &str, values: Vec<T::Native>, buffer: Option<Bitmap>) -> Self

impl ChunkedArray<ListType>

pub fn amortized_iter(&self) -> AmortizedListIter<'_, impl Iterator<Item = Option<ArrayBox>> + '_>

pub fn apply_amortized<'a, F>(&'a self, f: F) -> Self where F: FnMut(UnstableSeries<'a>) -> Series,

pub fn try_apply_amortized<'a, F>(&'a self, f: F) -> PolarsResult<Self> where F: FnMut(UnstableSeries<'a>) -> PolarsResult<Series>,

impl ChunkedArray<ListType>

pub fn set_fast_explode(&mut self)

pub fn _can_fast_explode(&self) -> bool

pub fn to_logical(&mut self, inner_dtype: DataType)

impl<T> ChunkedArray<ObjectType<T>> where T: PolarsObject,

pub fn new_from_vec(name: &str, v: Vec<T>) -> Self

impl<T> ChunkedArray<ObjectType<T>> where T: PolarsObject,

pub unsafe fn get_object_unchecked(&self, index: usize) -> Option<&dyn PolarsObjectSafe>

pub fn get_object(&self, index: usize) -> Option<&dyn PolarsObjectSafe>

impl<T> ChunkedArray<T> where T: PolarsNumericType, Standard: Distribution<T::Native>,

pub fn init_rand(size: usize, null_density: f32, seed: Option<u64>) -> Self

impl<T> ChunkedArray<T> where T: PolarsDataType, ChunkedArray<T>: ChunkTake,

pub fn sample_n(&self, n: usize, with_replacement: bool, shuffle: bool, seed: Option<u64>) -> PolarsResult<Self>

pub fn sample_frac(&self, frac: f64, with_replacement: bool, shuffle: bool, seed: Option<u64>) -> PolarsResult<Self>

impl<T> ChunkedArray<T> where T: PolarsNumericType, T::Native: Float,

pub fn rand_normal(name: &str, length: usize, mean: f64, std_dev: f64) -> PolarsResult<Self>

pub fn rand_standard_normal(name: &str, length: usize) -> Self

pub fn rand_uniform(name: &str, length: usize, low: f64, high: f64) -> Self

impl ChunkedArray<BooleanType>

pub fn rand_bernoulli(name: &str, length: usize, p: f64) -> PolarsResult<Self>

impl ChunkedArray<Utf8Type>

pub fn hex_decode(&self, strict: Option<bool>) -> PolarsResult<Utf8Chunked>

pub fn hex_encode(&self) -> Utf8Chunked

pub fn base64_decode(&self, strict: Option<bool>) -> PolarsResult<Utf8Chunked>

pub fn base64_encode(&self) -> Utf8Chunked

impl<T: PolarsDataType> ChunkedArray<T>

pub fn set_sorted(&mut self, reverse: bool)

pub fn is_sorted2(&self) -> IsSorted

pub fn iter_validities(&self) -> Map<Iter<'_, ArrayRef>, fn(_: &ArrayRef) -> Option<&Bitmap>>

pub fn unpack_series_matching_type(&self, series: &Series) -> PolarsResult<&ChunkedArray<T>>