Modules

Traits and utilities for temporal data.

Data types supported by Polars.

Convert data between the Arrow memory format and JSON line-delimited records.

APIs to read from and write to NDJSON.

Macros

Structs

A thread-safe reference-counting pointer. ‘Arc’ stands for ‘Atomically Reference Counted’.

Represents Arrow’s metadata of a “column”.

An ordered sequence of Fields with associated Metadata.

Represents a valid brotli compression level.

Specialized expressions for Categorical dtypes.

ChunkedArray

Create a new DataFrame by reading a CSV file.

Write a DataFrame to CSV.
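A minimal round-trip sketch of these two builders (file names are hypothetical, and some builder method names have shifted between Polars versions):

```rust
use polars::prelude::*;
use std::fs::File;

fn roundtrip() -> Result<DataFrame> {
    // Read a (hypothetical) foo.csv into a DataFrame.
    let mut df = CsvReader::from_path("foo.csv")?
        .has_header(true)
        .finish()?;

    // Write it back out as CSV.
    let mut file = File::create("out.csv").expect("could not create file");
    CsvWriter::new(&mut file).has_headers(true).finish(&mut df)?;
    Ok(df)
}
```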

A contiguous growable collection of Series that have the same length.

Characterizes the name and the DataType of a column.

Returned by a groupby operation on a DataFrame. This struct supports several aggregations.
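A sketch of the eager groupby flow, assuming hypothetical "category" and "value" columns:

```rust
use polars::prelude::*;

fn grouped_sum(df: &DataFrame) -> Result<DataFrame> {
    // Group rows by "category" and sum "value" within each group.
    df.groupby("category")?.select("value").sum()
}
```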

Indexes of the groups; the first index is stored separately. This makes sorting fast.

Represents a valid gzip compression level.

Read Arrow’s IPC format into a DataFrame.

Read Arrow’s streaming IPC format into a DataFrame.

Write a DataFrame to Arrow’s streaming IPC format.

Write a DataFrame to Arrow’s IPC format.

Lazy abstraction over an eager DataFrame. It really is an abstraction over a logical plan. The methods of this struct will incrementally modify a logical plan until output is requested (via collect).
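A minimal sketch of the lazy flow (column names hypothetical):

```rust
use polars::prelude::*;

fn plan(df: DataFrame) -> Result<DataFrame> {
    // Nothing runs here; each method only extends the logical plan.
    let lf = df
        .lazy()
        .filter(col("value").gt(lit(100)))
        .select(vec![col("category"), col("value")]);
    // Execution happens only now.
    lf.collect()
}
```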

Utility struct for lazy groupby operation.

Maps a logical type to a chunked array implementation of the physical type. This saves a lot of compiler bloat and allows us to reuse functionality.

Arguments for the [DataFrame::melt] function.

Just a wrapper structure, useful for certain impl specializations. It is used, for instance, to implement impl<T> FromIterator<T::Native> for NoNull<ChunkedArray<T>>, as impl<T> FromIterator<Option<T::Native>> for ChunkedArray<T> was already implemented.
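A sketch of what this enables: collecting bare values (no Option wrapper) straight into a chunked array:

```rust
use polars::prelude::*;

fn no_null_collect() -> Float64Chunked {
    // Collect bare f64 values; without NoNull this impl would clash with
    // the existing FromIterator<Option<f64>> for ChunkedArray.
    let ca: NoNull<Float64Chunked> = (0..5).map(|v| v as f64).collect();
    ca.into_inner()
}
```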

The literal Null.

State of the allowed optimizations.

Read Apache Parquet format into a DataFrame.

Write a DataFrame to Parquet format.

Wrapper struct that allows us to use a PhysicalExpr in polars-io.

Series

Wrapper type that has special equality properties depending on the inner type specialization.

A StructArray is a nested Array with an optional validity, representing multiple Arrays with the same number of rows.

This is the logical type StructChunked, which dispatches most logic to the fields’ implementations.

Intermediate state of when(..).then(..).otherwise(..) expr.

Intermediate state of when(..).then(..).otherwise(..) expr.

Intermediate state of a chained when(..).then(..) expression.
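A sketch of how the three intermediate states above arise in practice (column names and literals hypothetical):

```rust
use polars::prelude::*;

fn bucket() -> Expr {
    // Each when(..).then(..) pair yields one of the intermediate states
    // above; otherwise(..) closes the expression.
    when(col("value").gt(lit(100)))
        .then(lit("high"))
        .when(col("value").gt(lit(10)))
        .then(lit("mid"))
        .otherwise(lit("low"))
}
```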

Represents a window in time.

Represents a valid zstd compression level.

Enums

Constants

Traits

Argmin/argmax.

Aggregation operations.

Aggregations that return a Series of unit length. These can be used in broadcasting operations.

Fastest way to do elementwise operations on a ChunkedArray when the operation is cheaper than branching due to null checking.
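A minimal sketch of the elementwise apply on a numeric chunked array:

```rust
use polars::prelude::*;

fn double(ca: &Float64Chunked) -> Float64Chunked {
    // The closure runs over every physical value (even slots masked out
    // as null), avoiding a validity branch per element.
    ca.apply(|v| v * 2.0)
}
```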

Apply kernels on the arrow array chunks in a ChunkedArray.

Cast ChunkedArray<T> to ChunkedArray<N>.

Compare Series and ChunkedArrays and get a boolean mask that can be used to filter rows.

Create a new ChunkedArray filled with values at that index.

Explode/flatten a List or Utf8 Series.

Replace None values using various strategies.

Replace None values with a given value.
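A sketch using the Series-level entry point (strategy variant names differ slightly across Polars versions):

```rust
use polars::prelude::*;

fn fill(s: &Series) -> Result<Series> {
    // Propagate the last valid observation forward into null slots;
    // other strategies (Backward, Min, Max, Mean, ...) work the same way.
    s.fill_null(FillNullStrategy::Forward)
}
```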

Filter values by a boolean mask.

Fill a ChunkedArray with one value.

Find local minima/maxima.

Quantile and median aggregation.

Reverse a ChunkedArray.

This differs from ChunkWindowCustom and ChunkWindow by not using a fold aggregator, but reusing a Series wrapper and calling Series aggregators. This is likely a bit slower than ChunkWindow.

Create a ChunkedArray with new values by index or by boolean mask. Note that these operations clone data. This is, however, the only way we can modify at the mask or index level, as the underlying Arrow arrays are immutable.
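A sketch of setting by index (assuming the usize-index era of this API):

```rust
use polars::prelude::*;

fn set_first(ca: &Int32Chunked) -> Result<Int32Chunked> {
    // Returns a new array with index 0 set to Some(10); the underlying
    // Arrow buffers are immutable, so the data is cloned.
    ca.set_at_idx(vec![0], Some(10))
}
```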

Shift the values of a ChunkedArray by a number of periods.

Sort operations on ChunkedArray.

Fast access by index.

Traverse and collect every nth element.

Get the unique values in a ChunkedArray.

Variance and standard deviation aggregation.

Combine two ChunkedArrays based on some predicate.
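A minimal sketch of the predicate-based combination:

```rust
use polars::prelude::*;

fn choose(
    mask: &BooleanChunked,
    a: &Int32Chunked,
    b: &Int32Chunked,
) -> Result<Int32Chunked> {
    // Where mask is true take the value from `a`, otherwise from `b`.
    a.zip_with(mask, b)
}
```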

Executors will evaluate physical expressions and collect them in a DataFrame.

This trait exists to unify the API of polars’ Schema and arrow’s Schema.

Used to create the tuples for a groupby operation.

Create a type that implements a faster TakeRandom.

Mask the first unique values as true.

Safety

Check if an element is a member of a list array.

Mask the last unique values as true.

Take a DataFrame and evaluate the expressions. Implement this for Column, lt, eq, etc.

A type that implements this transforms a LogicalPlan to a physical plan.

A PolarsIterator is an iterator over a ChunkedArray which contains polars types. A PolarsIterator must implement ExactSizeIterator and DoubleEndedIterator.

Values need to implement this so that they can be stored in a Series and DataFrame.

Any type that is not nested.

Repeat the values n times.

A wrapper trait for any binary closure Fn(Series, Series) -> Result<Series>.

A wrapper trait for any closure Fn(Vec<Series>) -> Result<Series>.

Concatenate the values into a string array.

Random access.

Functions

Selects all columns.

Evaluate all the expressions with a bitwise AND.

Evaluate all the expressions with a bitwise OR.

Apply a function/closure over the groups of multiple columns. This should only be used in a groupby aggregation.

Create list entries that are range arrays.

Get the indices where condition evaluates true.

Find the indexes that would sort these series in order of appearance. That means that the first Series will be used to determine the ordering until duplicates are found. Once duplicates are found, the next Series will be used and so on.
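A sketch of the expression form (column names hypothetical; the exact signature varies by version):

```rust
use polars::prelude::*;

fn sort_indices() -> Expr {
    // Indices that would sort by "a" ascending, ties broken by "b"
    // descending.
    argsort_by(vec![col("a"), col("b")], &[false, true])
}
```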

Take several expressions and collect them into a StructChunked.

Find the mean of all the values in this Expression.

Cast expression.

Create a Column Expression based on a column name.

Collect all LazyFrame computations.

Select multiple columns by name.

Concatenate multiple LazyFrames.

Concatenate list entries.

Horizontally concatenate string columns in linear time.

Count expression.

Compute the covariance between two columns.

Create a DatetimeChunked from a given start and stop date and a given every interval.

Select all columns matching a given dtype.

Select all columns matching any of the given dtypes.

First column in a DataFrame.

Accumulate over multiple columns horizontally / row-wise.
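A sketch of a horizontal fold over hypothetical columns "a", "b", and "c":

```rust
use polars::prelude::*;

fn horizontal_sum() -> Expr {
    // Row-wise sum: the closure folds the accumulator Series with each
    // column Series in turn.
    fold_exprs(lit(0), |acc, s| Ok(acc + s), vec![col("a"), col("b"), col("c")])
}
```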

Different from groupby_windows, which defines window buckets and searches for the values that fit each pre-defined bucket, this function defines every window based on the timestamp (lower bound) and timestamp + period (upper bound), where the timestamps are the individual values in the array time.

Based on the given Window (which has an every, period, and offset), window boundaries are created, and for every window boundary we search for the values that fit that window.

IsNull expression.

Last column in a DataFrame.

Create a Literal Expression from L.

Apply a closure on the two columns that are evaluated from Expr a and Expr b.

Apply a function/closure over multiple columns once the logical plan gets executed.

Apply a function/closure over multiple columns once the logical plan gets executed.

Find the maximum of all the values in this Expression.

Get the maximum value per row.

Find the mean of all the values in this Expression.

Find the median of all the values in this Expression.

Find the minimum of all the values in this Expression.

Get the minimum value per row.

Not expression.

Compute the Pearson correlation between two columns.

Find a specific quantile of all the values in this Expression.

Create a range literal.

Repeat a literal value n times.

Compute the Spearman rank correlation between two columns.

Sum all the values in this Expression.

Get the sum of the values per row.

Start a when-then-otherwise expression.

Type Definitions