polars 0.12.0-beta.0

DataFrame library

Polars: DataFrames in Rust

Polars is a DataFrame library for Rust. It is based on Apache Arrow's memory model, which means that operations on Polars arrays (called Series, or ChunkedArray<T> when the type T is known) are optimally aligned, cache-friendly, and can use SIMD. Sadly, Apache Arrow needs nightly Rust, which means that Polars cannot run on stable.

Polars supports an eager and a lazy API. The eager API is similar to pandas; the lazy API is similar to Spark.

Eager

Read more on the pages of the DataFrame, Series, and ChunkedArray<T> data structures and traits. A minimal sketch of eager usage follows below.
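
As a rough sketch of eager usage (the column names here are made up for illustration), typed Series columns can be combined into a DataFrame directly:

use polars::prelude::*;

fn eager_example() -> Result<DataFrame> {
    // build typed columns eagerly and combine them into a DataFrame
    let days = Series::new("days", &[0, 1, 2]);
    let temp = Series::new("temp", &[22.1, 19.9, 7.]);
    DataFrame::new(vec![days, temp])
}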

Lazy

Read more in the lazy module.
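
As a rough sketch of what a lazy query looks like (assuming the lazy feature is enabled; depending on the version, col and lit may need to be imported from the lazy dsl module rather than the prelude):

use polars::prelude::*;

fn lazy_example(df: DataFrame) -> Result<DataFrame> {
    // nothing is executed until collect() is called; the query plan is optimized first
    df.lazy()
        .filter(col("temp").gt(lit(10)))
        .select(&[col("days"), col("temp")])
        .collect()
}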

Read and write CSV / JSON

use polars::prelude::*;
use std::fs::File;

fn example() -> Result<DataFrame> {
    let file = File::open("iris.csv")
        .expect("could not read file");

    CsvReader::new(file)
        .infer_schema(None)
        .has_header(true)
        .finish()
}

For more IO examples, see the readers and writers for the other supported formats (JSON, IPC, Parquet).
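
Writing works through the matching writer types. Below is a minimal CSV-writing sketch; the output path is made up, and the exact finish signature may vary between releases:

use polars::prelude::*;
use std::fs::File;

fn write_example(df: &mut DataFrame) -> Result<()> {
    // write the DataFrame back to disk as CSV
    let mut file = File::create("iris_out.csv")
        .expect("could not create file");

    CsvWriter::new(&mut file)
        .finish(df)
}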

Joins

# #[macro_use] extern crate polars;
# fn main() {
use polars::prelude::*;

fn join() -> Result<DataFrame> {
    // Create first df.
    let temp = df!("days" => &[0, 1, 2, 3, 4],
                   "temp" => &[22.1, 19.9, 7., 2., 3.])?;

    // Create second df.
    let rain = df!("days" => &[1, 2],
                   "rain" => &[0.1, 0.2])?;

    // Left join on days column.
    temp.left_join(&rain, "days", "days")
}

println!("{}", join().unwrap());
# }
+------+------+------+
| days | temp | rain |
| ---  | ---  | ---  |
| i32  | f64  | f64  |
+======+======+======+
| 0    | 22.1 | null |
+------+------+------+
| 1    | 19.9 | 0.1  |
+------+------+------+
| 2    | 7    | 0.2  |
+------+------+------+
| 3    | 2    | null |
+------+------+------+
| 4    | 3    | null |
+------+------+------+
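
The other join types follow the same pattern; a short sketch reusing the DataFrames from the example above:

use polars::prelude::*;

fn join_variants(temp: &DataFrame, rain: &DataFrame) -> Result<DataFrame> {
    // inner join keeps only matching rows; outer join keeps rows from both sides
    let _inner = temp.inner_join(rain, "days", "days")?;
    temp.outer_join(rain, "days", "days")
}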

Groupbys | aggregations | pivots | melts

use polars::prelude::*;
fn groupby_sum(df: &DataFrame) -> Result<DataFrame> {
    df.groupby("column_name")?
        .select("agg_column_name")
        .sum()
}
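
Pivots and melts hang off the same DataFrame and GroupBy APIs. The sketch below uses placeholder column names, and the exact argument types may differ between versions:

use polars::prelude::*;

fn pivot_and_melt(df: &DataFrame) -> Result<DataFrame> {
    // pivot: group by one column, spread another column's values into new columns
    let _pivoted = df.groupby("column_name")?
        .pivot("pivot_column", "agg_column_name")
        .first()?;

    // melt: turn wide data into long format
    df.melt(&["column_name"], &["agg_column_name"])
}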

Arithmetic

use polars::prelude::*;
let s = Series::new("foo", [1, 2, 3]);
let s_squared = &s * &s;
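
Arithmetic with scalars broadcasts over the whole Series; a small sketch:

use polars::prelude::*;
let s = Series::new("foo", [1, 2, 3]);
// scalar operations are applied element-wise
let plus_one = &s + 1;
let doubled = &s * 2;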

Rust iterators

use polars::prelude::*;

let s: Series = [1, 2, 3].iter().collect();
let s_squared: Series = s.i32()
    .expect("datatype mismatch")
    .into_iter()
    .map(|optional_v| {
        match optional_v {
            Some(v) => Some(v * v),
            None => None, // null value
        }
    })
    .collect();
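
The match on the Option can also be written with Option::map; a slightly more compact sketch of the same operation, renaming the result afterwards:

use polars::prelude::*;

let s: Series = [1, 2, 3].iter().collect();
let mut s_squared: Series = s.i32()
    .expect("datatype mismatch")
    .into_iter()
    .map(|opt_v| opt_v.map(|v| v * v)) // null stays null
    .collect();
s_squared.rename("foo_squared");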

Apply custom closures

Besides running custom iterators, custom closures can be applied to the values of a ChunkedArray with the apply method. The closure is applied to every non-null value of Option<T>. Note that this is the fastest way to apply a custom closure to a ChunkedArray.

# use polars::prelude::*;
let s: Series = Series::new("values", [Some(1.0), None, Some(3.0)]);
// null values are ignored automatically
let squared = s.f64()
    .unwrap()
    .apply(|value| value.powf(2.0))
    .into_series();

assert_eq!(Vec::from(squared.f64().unwrap()), &[Some(1.0), None, Some(9.0)]);

Comparisons

use polars::prelude::*;
let s = Series::new("dollars", &[1, 2, 3]);
let mask = s.eq(1);

assert_eq!(Vec::from(mask), &[Some(true), Some(false), Some(false)]);
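
Boolean masks produced by comparisons can be used to filter a Series; a small sketch:

use polars::prelude::*;

let s = Series::new("dollars", &[1, 2, 3]);
// keep only the values greater than 1
let mask = s.gt(1);
let filtered = s.filter(&mask).expect("filter failed");
assert_eq!(filtered.len(), 2);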

Temporal data types

# use polars::prelude::*;
let dates = &[
    "2020-08-21",
    "2020-08-21",
    "2020-08-22",
    "2020-08-23",
    "2020-08-22",
];
// date format
let fmt = "%Y-%m-%d";
// create date series
let s0 = Date32Chunked::parse_from_str_slice("date", dates, fmt)
    .into_series();
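
With the temporal feature, components can then be extracted from the parsed dates. The sketch below assumes year/month accessors are available on Date32Chunked in this version:

use polars::prelude::*;

let dates = &["2020-08-21", "2020-08-22"];
let ca = Date32Chunked::parse_from_str_slice("date", dates, "%Y-%m-%d");
// extract date components as new chunked arrays (assumed accessors)
let years = ca.year();
let months = ca.month();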

And more...

Features

Additional cargo features:

  • temporal (default) - conversions between Chrono and Polars for temporal data
  • simd (default) - SIMD operations
  • parquet - read the Apache Parquet format
  • json - JSON serialization
  • ipc - Arrow's IPC format serialization
  • random - generate arrays with randomly sampled values
  • ndarray - convert from DataFrame to ndarray
  • parallel - parallel variants of operations
  • lazy - lazy API
  • strings - string utilities for Utf8Chunked
  • object - support for generic ChunkedArrays called ObjectChunked<T> (generic over T). These can be downcast from a Series through the Any trait.