Modules
arrow_ndjson (io_json): APIs to read from and write to NDJSON.
binary (dtype-binary)
cat (dtype-categorical)
Traits and utilities for temporal data.
Data types supported by Polars.
json (io_json): Convert data between the Arrow memory format and JSON line-delimited records.
nan_propagating_aggregate (propagate_nans)
string (strings)
utils (private)
zip (zip_with)
Macros
Structs
Arc: A thread-safe reference-counting pointer. ‘Arc’ stands for ‘Atomically Reference Counted’.
Represents Arrow’s metadata of a “column”.
BinaryTakeRandom (dtype-binary)
BinaryTakeRandomSingleChunk (dtype-binary)
BinaryType (dtype-binary)
Specialized expressions for Categorical dtypes.
ChunkedArray
CsvReader: Create a new DataFrame by reading a csv file.
CsvWriter: Write a DataFrame to csv.
DataFrame: A contiguous growable collection of Series that have the same length.
GroupBy: Returned by a groupby operation on a DataFrame. This struct supports several aggregations.
GroupsIdx: Indexes of the groups; the first index is stored separately. This makes sorting fast.
Int128Type (dtype-i128)
IpcReader: Read Arrow’s IPC format into a DataFrame.
IpcStreamReader: Read Arrow’s Stream IPC format into a DataFrame.
IpcStreamWriter: Write a DataFrame to Arrow’s Streaming IPC format.
IpcWriter: Write a DataFrame to Arrow’s IPC format.
LazyCsvReader (csv-file)
LazyFrame: Lazy abstraction over an eager DataFrame. It really is an abstraction over a logical plan. The methods of this struct will incrementally modify a logical plan until output is requested (via collect).
LazyGroupBy: Utility struct for lazy groupby operation.
ListBinaryChunkedBuilder (dtype-binary)
Specialized expressions for Series of DataType::List.
Logical: Maps a logical type to a chunked array implementation of the physical type. This saves a lot of compiler bloat and allows us to reuse functionality.
MeltArgs: Arguments for the DataFrame::melt function.
NoNull: Just a wrapper structure, useful for certain impl specializations. It is for instance used to implement impl<T> FromIterator<T::Native> for NoNull<ChunkedArray<T>>, as Option<T::Native> was already covered by impl<T> FromIterator<Option<T::Native>> for ChunkedArray<T>.
Null: The literal Null.
ObjectTakeRandom (object)
ObjectType (object)
OptState: State of the allowed optimizations.
OwnedObject (object)
ParquetReader: Read Apache parquet format into a DataFrame.
ParquetWriteOptions (parquet)
ParquetWriter: Write a DataFrame to parquet format.
Wrapper struct that allows us to use a PhysicalExpr in polars-io.
RollingOptions (rolling_window)
RollingOptionsImpl (rolling_window)
Series
Wrapper type that has special equality properties depending on the inner type specialization.
StructArray: A StructArray is a nested Array with an optional validity representing multiple Arrays with the same number of rows.
StructChunked: This is the logical type StructChunked, which dispatches most logic to the fields’ implementations.
When: Intermediate state of a when(..).then(..).otherwise(..) expr.
WhenThen: Intermediate state of a when(..).then(..).otherwise(..) expr.
WhenThenThen: Intermediate state of chained when-then exprs.
Window: Represents a window in time.
Represents a valid zstd compression level.
Enums
DataType: The set of supported logical types in this crate.
TimeUnit: The time units defined in Arrow.
Expr: Queries consist of multiple expressions.
Compression codec.
TakeIdx: One of the three arguments allowed in unchecked_take.
Constants
Traits
ArgAgg: Argmin / Argmax.
ChunkAgg: Aggregation operations.
ChunkAggSeries: Aggregations that return Series of unit length. Those can be used in broadcasting operations.
ChunkApply: Fastest way to do elementwise operations on a ChunkedArray when the operation is cheaper than branching due to null checking.
ChunkApplyKernel: Apply kernels on the arrow array chunks in a ChunkedArray.
ChunkCast: Cast ChunkedArray<T> to ChunkedArray<N>.
ChunkCumAgg (cum_agg)
ChunkExpandAtIndex: Create a new ChunkedArray filled with values at that index.
ChunkExplode: Explode / flatten a List or Utf8 Series.
ChunkFillNullValue: Replace None values with a value.
ChunkFilter: Filter values by a boolean mask.
ChunkFull: Fill a ChunkedArray with one value.
ChunkPeaks: Find local minima / maxima.
ChunkQuantile: Quantile and median aggregation.
ChunkReverse: Reverse a ChunkedArray.
ChunkRollApply (rolling_window): This differs from ChunkWindowCustom and ChunkWindow by not using a fold aggregator, but reusing a Series wrapper and calling Series aggregators. This is likely a bit slower than ChunkWindow.
ChunkSet: Create a ChunkedArray with new values by index or by boolean mask. Note that these operations clone data. This is however the only way we can modify at mask or index level, as the underlying Arrow arrays are immutable.
ChunkShift: Shift the values of a ChunkedArray by a number of periods.
ChunkSort: Sort operations on ChunkedArray.
ChunkTake: Fast access by index.
ChunkTakeEvery: Traverse and collect every nth element.
ChunkUnique: Get unique values in a ChunkedArray.
ChunkVar: Variance and standard deviation aggregation.
ChunkZip: Combine 2 ChunkedArrays based on some predicate.
IndexOfSchema (private): This trait exists to unify the API of polars’ Schema and arrow’s Schema.
IntoGroupsProxy: Used to create the tuples for a groupby operation.
IntoTakeRandom: Create a type that implements a faster TakeRandom.
IsFirst (is_first): Mask the first unique values as true.
Safety
IsIn (is_in): Check if an element is a member of a list array.
IsLast (is_first): Mask the last unique values as true.
PhysicalExpr: Take a DataFrame and evaluate the expressions.
Implement this for Column, lt, eq, etc.
PolarsIterator: A PolarsIterator is an iterator over a ChunkedArray which contains polars types. A PolarsIterator must implement ExactSizeIterator and DoubleEndedIterator.
Values need to implement this so that they can be stored into a Series and DataFrame.
Any type that is not nested.
RepeatBy (repeat_by): Repeat the values n times.
RollingAgg (rolling_window)
A wrapper trait for any binary closure Fn(Series, Series) -> PolarsResult<Series>.
A wrapper trait for any closure Fn(Vec<Series>) -> PolarsResult<Series>.
StrConcat (concat_str): Concat the values into a string array.
TakeRandom: Random access.
Ensure that the same hash is used as with VecHash.
Functions
all: Selects all columns.
all_exprs: Evaluate all the expressions with a bitwise and.
any_exprs: Evaluate all the expressions with a bitwise or.
apply_multiple: Apply a function/closure over the groups of multiple columns. This should only be used in a groupby aggregation.
arange (arange): Create list entries that are range arrays.
arg_where (arg_where): Get the indices where condition evaluates true.
Find the indexes that would sort these series in order of appearance. That means that the first Series will be used to determine the ordering until duplicates are found. Once duplicates are found, the next Series will be used, and so on.
as_struct (dtype-struct): Take several expressions and collect them into a StructChunked.
avg: Find the mean of all the values in this Expression.
coalesce: Folds the expressions from left to right, keeping the first non-null values.
col: Create a Column Expression based on a column name.
collect_all: Collect all LazyFrame computations.
cols: Select multiple columns by name.
Concat multiple
Concat lists entries.
concat_str (concat_str and strings): Horizontally concat string columns in linear time.
count: Count expression.
cov: Compute the covariance between two columns.
cumfold_exprs (dtype-struct): Accumulate over multiple columns horizontally / row wise.
cumreduce_exprs (dtype-struct): Accumulate over multiple columns horizontally / row wise.
datetime (temporal)
datetime_to_timestamp_ms (private)
datetime_to_timestamp_ns (private)
datetime_to_timestamp_us (private)
diag_concat_lf (diagonal_concat): Concat LazyFrames diagonally. Calls [concat] internally.
dtype_col: Select multiple columns by dtype.
dtype_cols: Select multiple columns by dtype.
duration (temporal)
first: First column in DataFrame.
fold_exprs: Accumulate over multiple columns horizontally / row wise.
format_str (concat_str and strings): Format the results of an array of expressions using a format string.
Different from groupby_windows, which defines window buckets and searches which values fit each pre-defined bucket, this function defines every window based on:
- timestamp (lower bound)
- timestamp + period (upper bound)
where timestamps are the individual values in the array time.
Based on the given Window, which has an
is_not_null: IsNotNull expression.
last: Last column in DataFrame.
lit: Create a Literal Expression from L.
map_binary: Apply a closure on the two columns that are evaluated from Expr a and Expr b.
Apply a function/closure over multiple columns once the logical plan gets executed.
Apply a function/closure over multiple columns once the logical plan gets executed.
max: Find the maximum of all the values in this Expression.
max_exprs: Get the maximum value per row.
mean: Find the mean of all the values in this Expression.
median: Find the median of all the values in this Expression.
min: Find the minimum of all the values in this Expression.
pearson_corr: Compute the Pearson correlation between two columns.
quantile: Find a specific quantile of all the values in this Expression.
range: Create a range literal.
repeat: Repeat a literal value n times.
spearman_rank_corr (rank and propagate_nans): Compute the Spearman rank correlation between two columns. Missing data will be excluded from the computation.
sum: Sum all the values in this Expression.
sum_exprs: Get the sum of the values per row.
when: Start a when-then-otherwise expression.
Type Definitions
AllowedOptimizations
PolarsResult: Typedef for a std::result::Result of an Error.
BinaryChunked (dtype-binary)
Dummy type; we need to instantiate all generic types, so we fill one with a dummy.
Every group is indicated by an array where the
IdxArr (Non-bigidx)
IdxCa (Non-bigidx)
IdxSize (Non-bigidx): The type used by polars to index data.
IdxType (Non-bigidx)
Int128Chunked (dtype-i128)
ObjectChunked (object)
PlHashMap (private)
PlHashSet (private)
PlIdHashMap (private): This hashmap uses an IdHasher.
PlIndexMap (private)
PlIndexSet (private)