pub struct CsvReader<'a, R> where
    R: MmapBytesReader
{ /* private fields */ }
Available on crate feature csv-file only.

Create a new DataFrame by reading a CSV file.

Example

use polars_core::prelude::*;
use polars_io::prelude::*;
use std::fs::File;

fn example() -> Result<DataFrame> {
    CsvReader::from_path("iris_csv")?
        .has_header(true)
        .finish()
}

Implementations

Skip this number of rows after the header.

Add a row_count column.

Sets the chunk size used by the parser. This influences performance.

Try to stop parsing when n rows are parsed. During multithreaded parsing the upper bound n cannot be guaranteed.

Continue with next batch when a ParserError is encountered.
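
One reading of this description is that a batch which fails to parse is dropped and parsing resumes with the following batch. The sketch below illustrates that reading with plain integers and the standard library only; `parse_batch` and `parse_all` are hypothetical names, and this is not how polars implements the option.

```rust
/// Parses one batch of newline-separated integers.
/// Hypothetical helper for illustration; not part of polars.
fn parse_batch(batch: &str) -> Result<Vec<i64>, std::num::ParseIntError> {
    batch.lines().map(|l| l.trim().parse::<i64>()).collect()
}

/// Parses all batches. With `ignore_errors` set, a failing batch is
/// skipped as a whole; otherwise the first error aborts parsing.
fn parse_all(
    batches: &[&str],
    ignore_errors: bool,
) -> Result<Vec<i64>, std::num::ParseIntError> {
    let mut out = Vec::new();
    for batch in batches {
        match parse_batch(batch) {
            Ok(values) => out.extend(values),
            Err(_) if ignore_errors => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(out)
}
```

Note that under this reading the valid rows of a failing batch are lost too, which is worth keeping in mind before enabling the option on data you care about.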

Set the CSV file’s schema. This only accepts datatypes that are implemented in the csv parser and expects a complete Schema.

It is recommended to use with_dtypes instead.

Skip the first n rows during parsing. The header will be parsed at line n, that is, the first line after the skipped rows.
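
To make the interaction between skipped rows and the header concrete, here is a minimal standard-library sketch. `skip_rows_then_header` is a hypothetical helper, it assumes a comma delimiter and no quoting, and it is not the polars implementation.

```rust
/// Skips the first `n` lines, then treats the next line as the header
/// and the remaining lines as data rows.
/// Hypothetical helper for illustration; not part of polars.
fn skip_rows_then_header(text: &str, n: usize) -> Option<(Vec<&str>, Vec<Vec<&str>>)> {
    let mut lines = text.lines().skip(n);
    // The header is the first line that survives the skip.
    let header: Vec<&str> = lines.next()?.split(',').collect();
    let rows: Vec<Vec<&str>> = lines.map(|l| l.split(',').collect()).collect();
    Some((header, rows))
}
```

With two preamble lines in front of the real header, calling the helper with `n = 2` yields the header `a,b` rather than the preamble text.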

Rechunk the DataFrame to contiguous memory after the CSV is parsed.
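
Conceptually, rechunking copies the separately allocated chunks produced during parsing into one contiguous buffer. The following sketch shows the idea on plain vectors; `rechunk` here is a hypothetical free function, not the polars method.

```rust
/// Concatenates several chunks into one contiguous buffer.
/// Hypothetical helper for illustration; not part of polars.
fn rechunk(chunks: Vec<Vec<i64>>) -> Vec<i64> {
    // Allocate once, up front, for the combined length.
    let total: usize = chunks.iter().map(Vec::len).sum();
    let mut out = Vec::with_capacity(total);
    for chunk in chunks {
        out.extend(chunk);
    }
    out
}
```

The copy costs time once, but later operations can then scan a single allocation instead of hopping between chunks.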

Set whether the CSV file has headers.

Set the CSV file’s column delimiter as a byte character.

Set the comment character. Lines starting with this character will be ignored.
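
As an illustration of the rule "lines starting with this character are ignored", the sketch below filters such lines with the standard library. `strip_comment_lines` is a hypothetical helper and says nothing about how polars handles comments internally.

```rust
/// Drops every line whose first byte equals `comment_char`.
/// Hypothetical helper for illustration; not part of polars.
fn strip_comment_lines(text: &str, comment_char: u8) -> Vec<&str> {
    text.lines()
        .filter(|line| line.as_bytes().first() != Some(&comment_char))
        .collect()
}
```

Only the start of the line matters: a `#` in the middle of a row, as in `3,#4`, does not make that row a comment.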

Set values that will be interpreted as missing/null. Note that any value you set as a null value will not be escaped, so if quotation marks are part of the null value you should include them.
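
The note about quotation marks can be illustrated with a small sketch that compares the raw field text against the configured null values. `parse_field` is a hypothetical helper built on the assumption that matching is done on the unescaped field exactly as it appears in the file.

```rust
/// Returns `None` when the raw field text matches one of the null values,
/// otherwise returns the field unchanged. The comparison is literal, so
/// surrounding quotes are part of the text being compared.
/// Hypothetical helper for illustration; not part of polars.
fn parse_field<'a>(raw: &'a str, null_values: &[&str]) -> Option<&'a str> {
    if null_values.iter().any(|nv| *nv == raw) {
        None
    } else {
        Some(raw)
    }
}
```

Under that assumption a quoted field `"NA"` is only treated as null if the null value itself is written with the quotes.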

Overwrite the schema with the dtypes in this given Schema. The given schema may be a subset of the total schema.

Overwrite the dtypes in the schema in the order of the slice that’s given. This is useful if you don’t know the column names beforehand.
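
The difference between overwriting by name (a partial schema) and overwriting by position (a slice of dtypes) can be sketched as follows. The `DType` enum, the `Schema` alias, and both functions are hypothetical stand-ins defined only for this example; they are not the polars types.

```rust
/// Stand-in for a column data type. Hypothetical; not the polars `DataType`.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DType {
    Int64,
    Float64,
    Utf8,
}

/// Stand-in schema: ordered (column name, dtype) pairs.
type Schema = Vec<(String, DType)>;

/// Overwrites only the columns named in `overrides`; others are untouched.
fn overwrite_by_name(schema: &mut Schema, overrides: &[(&str, DType)]) {
    for (name, dtype) in overrides {
        if let Some(field) = schema.iter_mut().find(|field| field.0 == *name) {
            field.1 = *dtype;
        }
    }
}

/// Overwrites the first `dtypes.len()` columns in order, ignoring names.
fn overwrite_by_position(schema: &mut Schema, dtypes: &[DType]) {
    for (field, dtype) in schema.iter_mut().zip(dtypes) {
        field.1 = *dtype;
    }
}
```

The positional variant is the one that works when the header is unknown or absent, since it never looks at column names.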

Set the CSV reader to infer the schema of the file.

Arguments
  • max_records - Maximum number of rows read for schema inference. Setting this to None will do a full table scan (slow).
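
The trade-off behind max_records can be seen in a toy inference routine: a small limit is fast but may miss values that appear later in the column. The `Inferred` enum and `infer_column` function are hypothetical and far simpler than the real inference in polars.

```rust
/// Toy result of schema inference for one column.
/// Hypothetical type for illustration; not part of polars.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Inferred {
    Int,
    Float,
    Text,
}

/// Infers a column type from at most `max_records` values.
/// `None` means every value is inspected (a full scan).
fn infer_column(values: &[&str], max_records: Option<usize>) -> Inferred {
    let limit = max_records.unwrap_or(values.len());
    let mut result = Inferred::Int;
    for v in values.iter().take(limit) {
        if v.parse::<i64>().is_ok() {
            continue;
        }
        if v.parse::<f64>().is_ok() {
            // Widen from integer to float, never back.
            result = Inferred::Float;
        } else {
            // Anything unparsable as a number forces text.
            return Inferred::Text;
        }
    }
    result
}
```

For the column `1, 2, 3.5, x`, inspecting two records suggests integers, three records suggests floats, and only a full scan reveals that the column must be text.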

Set the reader’s column projection. This counts from 0, meaning that vec![0, 4] would select the 1st and 5th column.
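
A zero-based projection can be pictured as indexing into each parsed row. The `project` function below is a hypothetical helper; skipping out-of-range indices is a choice made for this sketch and not a statement about how polars reacts to them.

```rust
/// Selects the fields of `row` at the given zero-based indices.
/// Out-of-range indices are skipped in this sketch.
/// Hypothetical helper for illustration; not part of polars.
fn project<'a>(row: &[&'a str], projection: &[usize]) -> Vec<&'a str> {
    projection
        .iter()
        .filter_map(|&i| row.get(i).copied())
        .collect()
}
```

For a row `a,b,c,d,e`, the projection `[0, 4]` keeps `a` and `e`, the 1st and 5th columns.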

Columns to select/project.

Set the number of threads used in CSV reading. The default uses the number of cores of your CPU.

Note that this only works if the reader is initialized with CsvReader::from_path, and that the number of cores is the maximum allowed number of threads.

The preferred way to initialize this builder. This allows the CSV file to be memory mapped and thereby greatly increases parsing performance.

Sets the size of the sample taken from the CSV file. The sample is used to gather statistics about the file, which are then used to try to allocate optimally up front. Increasing the sample size may improve performance.

Reduce memory consumption at the expense of performance.

Set the char used as quote char. The default is b'"'. If set to None, quoting is disabled.
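
The effect of the delimiter byte and of disabling quoting can be shown with a small field splitter. `split_fields` is a hypothetical helper that assumes an ASCII delimiter and quote byte and does not handle escaped quotes inside quoted fields; it is a sketch of the behaviour described here, not the polars parser.

```rust
/// Splits one line into fields on `delimiter`. While inside a pair of
/// `quote_char` bytes the delimiter is treated as ordinary text.
/// With `quote_char == None`, quote characters are kept as data.
/// Hypothetical helper for illustration; not part of polars.
fn split_fields(line: &str, delimiter: u8, quote_char: Option<u8>) -> Vec<String> {
    let mut fields = Vec::new();
    let mut current: Vec<u8> = Vec::new();
    let mut in_quotes = false;
    for &b in line.as_bytes() {
        if Some(b) == quote_char {
            // Toggle quoted state and drop the quote byte itself.
            in_quotes = !in_quotes;
        } else if b == delimiter && !in_quotes {
            fields.push(String::from_utf8_lossy(&current).into_owned());
            current.clear();
        } else {
            current.push(b);
        }
    }
    fields.push(String::from_utf8_lossy(&current).into_owned());
    fields
}
```

With the default quote byte, the line `1,"a,b",c` splits into three fields; with quoting disabled the same line splits into four, and the quote characters stay in the data.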

Automatically try to parse dates, datetimes, and times. If parsing fails, columns remain of dtype DataType::Utf8.
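
The "try, and fall back to strings" behaviour can be sketched for a single column as below. The `Column` enum, `parse_iso_date`, and `try_parse_dates` are hypothetical; the sketch only accepts `YYYY-MM-DD`, does not validate days per month, and assumes the fallback applies to the column as a whole.

```rust
/// Toy column: either parsed (year, month, day) dates or the original text.
/// Hypothetical type for illustration; not part of polars.
#[derive(Debug, PartialEq)]
enum Column {
    Date(Vec<(i32, u32, u32)>),
    Utf8(Vec<String>),
}

/// Parses `YYYY-MM-DD` with only coarse range checks on month and day.
fn parse_iso_date(s: &str) -> Option<(i32, u32, u32)> {
    let mut parts = s.split('-');
    let year = parts.next()?.parse::<i32>().ok()?;
    let month = parts.next()?.parse::<u32>().ok()?;
    let day = parts.next()?.parse::<u32>().ok()?;
    if parts.next().is_some() || !(1..=12).contains(&month) || !(1..=31).contains(&day) {
        return None;
    }
    Some((year, month, day))
}

/// Returns a date column if every value parses, otherwise keeps the text.
fn try_parse_dates(values: &[&str]) -> Column {
    match values.iter().map(|v| parse_iso_date(v)).collect::<Option<Vec<_>>>() {
        Some(dates) => Column::Date(dates),
        None => Column::Utf8(values.iter().map(|v| v.to_string()).collect()),
    }
}
```

A single unparsable value such as `n/a` is enough to keep the whole column as text in this sketch.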

This is the recommended way to create a CSV reader, as it allows for the fastest parsing.

Trait Implementations

Create a new CsvReader from a file/stream.

Read the file and create the DataFrame.

Rechunk to a single chunk after reading the file.

Auto Trait Implementations

Blanket Implementations

Gets the TypeId of self.

Immutably borrows from an owned value.

Mutably borrows from an owned value.

Returns the argument unchanged.

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

The alignment of pointer.

The type for initializers.

Initializes a with the given initializer.

Dereferences the given pointer.

Mutably dereferences the given pointer.

Drops the object pointed to by the given pointer.

The type returned in the event of a conversion error.

Performs the conversion.

The type returned in the event of a conversion error.

Performs the conversion.