Crate polars

Polars: DataFrames in Rust

Polars is a DataFrame library for Rust. It is based on Apache Arrow's memory model. This means that operations on Polars arrays (called Series, or ChunkedArray<T> if the type T is known) are cache-friendly, optimally aligned, and can use SIMD. Sadly, Apache Arrow needs nightly Rust, which means that Polars cannot run on stable.

Polars supports an eager and a lazy API. The eager API is similar to pandas; the lazy API is similar to Spark.
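The difference between the two evaluation styles can be sketched with the standard library alone. This is not Polars code: `eager_pipeline` and `lazy_pipeline` are hypothetical names, and a Rust iterator chain only stands in for the idea of a lazy query that is described first and executed later.

```rust
/// Eager style: every step runs immediately and materializes a new Vec.
fn eager_pipeline(values: &[i32]) -> Vec<i32> {
    let doubled: Vec<i32> = values.iter().map(|v| v * 2).collect();
    let filtered: Vec<i32> = doubled.into_iter().filter(|v| *v > 4).collect();
    filtered
}

/// Lazy style: the steps are only described here; nothing is computed
/// until the caller consumes the iterator (for example with `collect`).
fn lazy_pipeline<'a>(values: &'a [i32]) -> impl Iterator<Item = i32> + 'a {
    values.iter().map(|v| v * 2).filter(|v| *v > 4)
}
```

Both functions give the same values once the lazy version is collected; the lazy one simply avoids the intermediate `Vec`.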

Eager

Read more in the pages of the DataFrame, Series, and ChunkedArray data structures.

Lazy

Read more in the lazy module.

Read and write CSV/JSON

use polars::prelude::*;
use std::fs::File;

fn example() -> Result<DataFrame> {
    let file = File::open("iris.csv")
                    .expect("could not read file");

    CsvReader::new(file)
            .infer_schema(None)
            .has_header(true)
            .finish()
}

For more IO examples see:

Joins

use polars::prelude::*;

fn join() -> Result<DataFrame> {
    // Create first df.
    let temp = df!("days" => &[0, 1, 2, 3, 4],
                   "temp" => &[22.1, 19.9, 7., 2., 3.])?;

    // Create second df.
    let rain = df!("days" => &[1, 2],
                   "rain" => &[0.1, 0.2])?;

    // Left join on days column.
    temp.left_join(&rain, "days", "days")
}

println!("{}", join().unwrap())
+------+------+------+
| days | temp | rain |
| ---  | ---  | ---  |
| i32  | f64  | f64  |
+======+======+======+
| 0    | 22.1 | null |
+------+------+------+
| 1    | 19.9 | 0.1  |
+------+------+------+
| 2    | 7    | 0.2  |
+------+------+------+
| 3    | 2    | null |
+------+------+------+
| 4    | 3    | null |
+------+------+------+
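The semantics of the left join above can be modelled without Polars. The following is a minimal std-only sketch, not the library's implementation: `left_join_sketch` is a hypothetical helper, rows are plain tuples, and it assumes the keys of the right table are unique.

```rust
use std::collections::HashMap;

/// Left join of two (key, value) tables on the key column.
/// Every left row is kept; a key with no match on the right yields None (null).
/// Assumption: keys in `right` are unique.
fn left_join_sketch(
    left: &[(i32, f64)],
    right: &[(i32, f64)],
) -> Vec<(i32, f64, Option<f64>)> {
    // Index the right table by key so each left row needs one lookup.
    let lookup: HashMap<i32, f64> = right.iter().copied().collect();
    left.iter()
        .map(|&(key, value)| (key, value, lookup.get(&key).copied()))
        .collect()
}
```

Called with the `days`/`temp` and `days`/`rain` rows from the example, this returns five rows, with `None` in the third position for days 0, 3, and 4.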

Groupbys | aggregations | pivots | melts

use polars::prelude::*;

fn groupby_sum(df: &DataFrame) -> Result<DataFrame> {
    df.groupby("column_name")?
        .select("agg_column_name")
        .sum()
}
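What a group-by followed by a sum computes can be sketched with the standard library. This is an illustration only, not how Polars implements it: `groupby_sum_sketch` is a hypothetical helper that takes the key column and the aggregated column as two equally long slices.

```rust
use std::collections::BTreeMap;

/// Groups `values` by the matching entry in `keys` and sums each group.
/// A BTreeMap is used so the groups come back in sorted key order.
fn groupby_sum_sketch(keys: &[&str], values: &[i32]) -> BTreeMap<String, i32> {
    let mut sums = BTreeMap::new();
    for (key, value) in keys.iter().zip(values) {
        // Start each new group at 0, then accumulate.
        *sums.entry(key.to_string()).or_insert(0) += *value;
    }
    sums
}
```

For keys `["a", "b", "a"]` and values `[1, 2, 3]` the result maps `"a"` to 4 and `"b"` to 2.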

Arithmetic

use polars::prelude::*;
let s = Series::new("foo", [1, 2, 3]);
let s_squared = &s * &s;

Rust iterators

use polars::prelude::*;

let s: Series = [1, 2, 3].iter().collect();
let s_squared: Series = s.i32()
    .expect("datatype mismatch")
    .into_iter()
    .map(|optional_v| {
        match optional_v {
            Some(v) => Some(v * v),
            None => None, // null value
        }
    })
    .collect();

Apply custom closures

Besides running custom iterators, custom closures can be applied to the values of a ChunkedArray with the apply method. This method accepts a closure that is applied to every non-null value; null values are skipped and stay null. Note that this is the fastest way to apply a custom closure to a ChunkedArray.

use polars::prelude::*;

let s: Series = Series::new("values", [Some(1.0), None, Some(3.0)]);
// null values are ignored automatically
let squared = s.f64()
    .unwrap()
    .apply(|value| value.powf(2.0))
    .into_series();

assert_eq!(Vec::from(squared.f64().unwrap()), &[Some(1.0), None, Some(9.0)]);
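The null-skipping behaviour of apply can be modelled on a plain slice of `Option<f64>`. This is a std-only sketch under that assumption, not the ChunkedArray implementation, and `apply_non_null` is a hypothetical name.

```rust
/// Applies `f` to every non-null value and leaves nulls (None) untouched.
fn apply_non_null<F>(values: &[Option<f64>], f: F) -> Vec<Option<f64>>
where
    F: Fn(f64) -> f64,
{
    // Option::map only calls the closure for Some(v), so None passes through.
    values.iter().map(|value| value.map(&f)).collect()
}
```

The closure itself never sees an `Option`, which mirrors why the closure passed to apply in the example works on bare `f64` values.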

Comparisons

use polars::prelude::*;
use itertools::Itertools;
let s = Series::new("dollars", &[1, 2, 3]);
let mask = s.eq(1);

assert_eq!(Vec::from(mask), &[Some(true), Some(false), Some(false)]);

Temporal data types

use polars::prelude::*;

let dates = &[
    "2020-08-21",
    "2020-08-21",
    "2020-08-22",
    "2020-08-23",
    "2020-08-22",
];
// date format
let fmt = "%Y-%m-%d";
// create date series
let s0 = Date32Chunked::parse_from_str_slice("date", dates, fmt)
    .into_series();
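In Arrow, a Date32 value is a 32-bit count of days since the Unix epoch (1970-01-01). The sketch below shows what parsing a `%Y-%m-%d` string into such a value could look like using only the standard library. It is not the Polars or Chrono implementation: `days_from_civil` and `parse_date32` are hypothetical helpers, and the day-of-month check is deliberately loose (it does not reject dates such as February 31).

```rust
/// Days between 1970-01-01 and the given proleptic Gregorian date.
fn days_from_civil(year: i64, month: i64, day: i64) -> i64 {
    // Treat January and February as months 11 and 12 of the previous year,
    // so the leap day falls at the end of the shifted year.
    let y = if month <= 2 { year - 1 } else { year };
    let era = y.div_euclid(400);
    let year_of_era = y.rem_euclid(400);
    let shifted_month = (month + 9) % 12; // March = 0
    let day_of_year = (153 * shifted_month + 2) / 5 + day - 1;
    let day_of_era = year_of_era * 365 + year_of_era / 4 - year_of_era / 100 + day_of_year;
    era * 146_097 + day_of_era - 719_468
}

/// Parses "YYYY-MM-DD" into days since the epoch; None on malformed input.
fn parse_date32(text: &str) -> Option<i32> {
    let mut parts = text.split('-');
    let year: i64 = parts.next()?.parse().ok()?;
    let month: i64 = parts.next()?.parse().ok()?;
    let day: i64 = parts.next()?.parse().ok()?;
    if parts.next().is_some() || !(1..=12).contains(&month) || !(1..=31).contains(&day) {
        return None;
    }
    let days = days_from_civil(year, month, day);
    if days < i64::from(i32::MIN) || days > i64::from(i32::MAX) {
        return None;
    }
    Some(days as i32)
}
```

With this sketch, `parse_date32("2020-08-21")` yields `Some(18495)`, and a string that does not match the format yields `None`.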

And more...

Features

Additional cargo features:

  • pretty (default)
    • Pretty printing of DataFrames
  • temporal (default)
    • Conversions between Chrono and Polars for temporal data
  • simd (default)
    • SIMD operations
  • parquet
    • Read Apache Parquet format
  • random
    • Generate arrays with randomly sampled values
  • ndarray
    • Convert from DataFrame to ndarray
  • parallel
    • Parallel variants of operations
  • lazy
    • Lazy API
  • strings
    • String utilities for Utf8Chunked

Modules

chunked_array

The typed heart of every Series column.

datatypes

Data types supported by Polars.

doc

Other documentation

error

frame

DataFrame module.

lazy

Lazy API of Polars

prelude

Everything you need to get started with Polars.

series

Type-agnostic columnar data structure.

testing

Testing utilities.

Macros

apply_method_all_series
apply_method_numeric_series
apply_method_numeric_series_and_return
apply_operand_on_chunkedarray_by_iter
as_result
df
match_arrow_data_type_apply_macro