Struct csv::ReaderBuilder

pub struct ReaderBuilder { /* private fields */ }

Builds a CSV reader with various configuration knobs.

This builder can be used to tweak the field delimiter, record terminator and more. Once a CSV Reader is built, its configuration cannot be changed.

Implementations

impl ReaderBuilder

pub fn new() -> ReaderBuilder

Create a new builder for configuring CSV parsing.

To convert a builder into a reader, call one of the methods starting with from_.

Example
use std::error::Error;
use csv::{ReaderBuilder, StringRecord};

fn example() -> Result<(), Box<dyn Error>> {
    let data = "\
city,country,pop
Boston,United States,4628910
Concord,United States,42695
";
    let mut rdr = ReaderBuilder::new().from_reader(data.as_bytes());

    let records = rdr
        .records()
        .collect::<Result<Vec<StringRecord>, csv::Error>>()?;
    assert_eq!(records, vec![
        vec!["Boston", "United States", "4628910"],
        vec!["Concord", "United States", "42695"],
    ]);
    Ok(())
}

pub fn from_path<P: AsRef<Path>>(&self, path: P) -> Result<Reader<File>>

Build a CSV parser from this configuration that reads data from the given file path.

If there was a problem opening the file at the given path, then this returns the corresponding error.

Example
use std::error::Error;
use csv::ReaderBuilder;

fn example() -> Result<(), Box<dyn Error>> {
    let mut rdr = ReaderBuilder::new().from_path("foo.csv")?;
    for result in rdr.records() {
        let record = result?;
        println!("{:?}", record);
    }
    Ok(())
}

pub fn from_reader<R: Read>(&self, rdr: R) -> Reader<R>

Build a CSV parser from this configuration that reads data from rdr.

Note that the CSV reader is buffered automatically, so you should not wrap rdr in a buffered reader like io::BufReader.

Example
use std::error::Error;
use csv::ReaderBuilder;

fn example() -> Result<(), Box<dyn Error>> {
    let data = "\
city,country,pop
Boston,United States,4628910
Concord,United States,42695
";
    let mut rdr = ReaderBuilder::new().from_reader(data.as_bytes());
    for result in rdr.records() {
        let record = result?;
        println!("{:?}", record);
    }
    Ok(())
}

pub fn delimiter(&mut self, delimiter: u8) -> &mut ReaderBuilder

The field delimiter to use when parsing CSV.

The default is b','.

Example
use std::error::Error;
use csv::ReaderBuilder;

fn example() -> Result<(), Box<dyn Error>> {
    let data = "\
city;country;pop
Boston;United States;4628910
";
    let mut rdr = ReaderBuilder::new()
        .delimiter(b';')
        .from_reader(data.as_bytes());

    if let Some(result) = rdr.records().next() {
        let record = result?;
        assert_eq!(record, vec!["Boston", "United States", "4628910"]);
        Ok(())
    } else {
        Err(From::from("expected at least one record but got none"))
    }
}

pub fn has_headers(&mut self, yes: bool) -> &mut ReaderBuilder

Whether to treat the first row as a special header row.

By default, the first row is treated as a special header row, which means the header is never returned by any of the record reading methods or iterators. When this is disabled (yes set to false), the first row is not treated specially.

Note that the headers and byte_headers methods are unaffected by whether this is set. Those methods always return the first record.

Example

This example shows what happens when has_headers is disabled. Namely, the first row is treated just like any other row.

use std::error::Error;
use csv::ReaderBuilder;

fn example() -> Result<(), Box<dyn Error>> {
    let data = "\
city,country,pop
Boston,United States,4628910
";
    let mut rdr = ReaderBuilder::new()
        .has_headers(false)
        .from_reader(data.as_bytes());
    let mut iter = rdr.records();

    // Read the first record.
    if let Some(result) = iter.next() {
        let record = result?;
        assert_eq!(record, vec!["city", "country", "pop"]);
    } else {
        return Err(From::from(
            "expected at least two records but got none"));
    }

    // Read the second record.
    if let Some(result) = iter.next() {
        let record = result?;
        assert_eq!(record, vec!["Boston", "United States", "4628910"]);
    } else {
        return Err(From::from(
            "expected at least two records but got one"));
    }
    Ok(())
}

pub fn flexible(&mut self, yes: bool) -> &mut ReaderBuilder

Whether the number of fields in records is allowed to change or not.

When disabled (which is the default), parsing CSV data will return an error if a record is found with a number of fields different from the number of fields in a previous record.

When enabled, this error checking is turned off.

Example: flexible records enabled
use std::error::Error;
use csv::ReaderBuilder;

fn example() -> Result<(), Box<dyn Error>> {
    // Notice that the first row is missing the population count.
    let data = "\
city,country,pop
Boston,United States
";
    let mut rdr = ReaderBuilder::new()
        .flexible(true)
        .from_reader(data.as_bytes());

    if let Some(result) = rdr.records().next() {
        let record = result?;
        assert_eq!(record, vec!["Boston", "United States"]);
        Ok(())
    } else {
        Err(From::from("expected at least one record but got none"))
    }
}
Example: flexible records disabled

This shows the error that appears when records of unequal length are found and flexible records have been disabled (which is the default).

use std::error::Error;
use csv::{ErrorKind, ReaderBuilder};

fn example() -> Result<(), Box<dyn Error>> {
    // Notice that the first row is missing the population count.
    let data = "\
city,country,pop
Boston,United States
";
    let mut rdr = ReaderBuilder::new()
        .flexible(false)
        .from_reader(data.as_bytes());

    if let Some(Err(err)) = rdr.records().next() {
        match *err.kind() {
            ErrorKind::UnequalLengths { expected_len, len, .. } => {
                // The header row has 3 fields...
                assert_eq!(expected_len, 3);
                // ... but the first row has only 2 fields.
                assert_eq!(len, 2);
                Ok(())
            }
            ref wrong => {
                Err(From::from(format!(
                    "expected UnequalLengths error but got {:?}",
                    wrong)))
            }
        }
    } else {
        Err(From::from(
            "expected at least one errored record but got none"))
    }
}

pub fn trim(&mut self, trim: Trim) -> &mut ReaderBuilder

Whether fields are trimmed of leading and trailing whitespace or not.

By default, no trimming is performed. This method permits one to override that behavior and choose one of the following options:

  1. Trim::Headers trims only header values.
  2. Trim::Fields trims only non-header or “field” values.
  3. Trim::All trims both header and non-header values.

A value is only interpreted as a header value if this CSV reader is configured to read a header record (which is the default).

When reading string records, characters meeting the definition of Unicode whitespace are trimmed. When reading byte records, characters meeting the definition of ASCII whitespace are trimmed. ASCII whitespace characters correspond to the set [\t\n\v\f\r ].

Example

This example shows what happens when all values are trimmed.

use std::error::Error;
use csv::{ReaderBuilder, StringRecord, Trim};

fn example() -> Result<(), Box<dyn Error>> {
    let data = "\
city ,   country ,  pop
Boston,\"
   United States\",4628910
Concord,   United States   ,42695
";
    let mut rdr = ReaderBuilder::new()
        .trim(Trim::All)
        .from_reader(data.as_bytes());
    let records = rdr
        .records()
        .collect::<Result<Vec<StringRecord>, csv::Error>>()?;
    assert_eq!(records, vec![
        vec!["Boston", "United States", "4628910"],
        vec!["Concord", "United States", "42695"],
    ]);
    Ok(())
}

pub fn terminator(&mut self, term: Terminator) -> &mut ReaderBuilder

The record terminator to use when parsing CSV.

A record terminator can be any single byte. The default is a special value, Terminator::CRLF, which treats any occurrence of \r, \n or \r\n as a single record terminator.

Example: $ as a record terminator
use std::error::Error;
use csv::{ReaderBuilder, Terminator};

fn example() -> Result<(), Box<dyn Error>> {
    let data = "city,country,pop$Boston,United States,4628910";
    let mut rdr = ReaderBuilder::new()
        .terminator(Terminator::Any(b'$'))
        .from_reader(data.as_bytes());

    if let Some(result) = rdr.records().next() {
        let record = result?;
        assert_eq!(record, vec!["Boston", "United States", "4628910"]);
        Ok(())
    } else {
        Err(From::from("expected at least one record but got none"))
    }
}

pub fn quote(&mut self, quote: u8) -> &mut ReaderBuilder

The quote character to use when parsing CSV.

The default is b'"'.

Example: single quotes instead of double quotes
use std::error::Error;
use csv::ReaderBuilder;

fn example() -> Result<(), Box<dyn Error>> {
    let data = "\
city,country,pop
Boston,'United States',4628910
";
    let mut rdr = ReaderBuilder::new()
        .quote(b'\'')
        .from_reader(data.as_bytes());

    if let Some(result) = rdr.records().next() {
        let record = result?;
        assert_eq!(record, vec!["Boston", "United States", "4628910"]);
        Ok(())
    } else {
        Err(From::from("expected at least one record but got none"))
    }
}

pub fn escape(&mut self, escape: Option<u8>) -> &mut ReaderBuilder

The escape character to use when parsing CSV.

In some variants of CSV, quotes are escaped using a special escape character like \ (instead of escaping quotes by doubling them).

By default, recognizing these idiosyncratic escapes is disabled.

Example
use std::error::Error;
use csv::ReaderBuilder;

fn example() -> Result<(), Box<dyn Error>> {
    let data = "\
city,country,pop
Boston,\"The \\\"United\\\" States\",4628910
";
    let mut rdr = ReaderBuilder::new()
        .escape(Some(b'\\'))
        .from_reader(data.as_bytes());

    if let Some(result) = rdr.records().next() {
        let record = result?;
        assert_eq!(record, vec![
            "Boston", "The \"United\" States", "4628910",
        ]);
        Ok(())
    } else {
        Err(From::from("expected at least one record but got none"))
    }
}

pub fn double_quote(&mut self, yes: bool) -> &mut ReaderBuilder

Enable double quote escapes.

This is enabled by default, but it may be disabled. When disabled, doubled quotes are not interpreted as escapes.

Example
use std::error::Error;
use csv::ReaderBuilder;

fn example() -> Result<(), Box<dyn Error>> {
    let data = "\
city,country,pop
Boston,\"The \"\"United\"\" States\",4628910
";
    let mut rdr = ReaderBuilder::new()
        .double_quote(false)
        .from_reader(data.as_bytes());

    if let Some(result) = rdr.records().next() {
        let record = result?;
        assert_eq!(record, vec![
            "Boston", "The \"United\"\" States\"", "4628910",
        ]);
        Ok(())
    } else {
        Err(From::from("expected at least one record but got none"))
    }
}

pub fn quoting(&mut self, yes: bool) -> &mut ReaderBuilder

Enable or disable quoting.

This is enabled by default, but it may be disabled. When disabled, quotes are not treated specially.

Example
use std::error::Error;
use csv::ReaderBuilder;

fn example() -> Result<(), Box<dyn Error>> {
    let data = "\
city,country,pop
Boston,\"The United States,4628910
";
    let mut rdr = ReaderBuilder::new()
        .quoting(false)
        .from_reader(data.as_bytes());

    if let Some(result) = rdr.records().next() {
        let record = result?;
        assert_eq!(record, vec![
            "Boston", "\"The United States", "4628910",
        ]);
        Ok(())
    } else {
        Err(From::from("expected at least one record but got none"))
    }
}

pub fn comment(&mut self, comment: Option<u8>) -> &mut ReaderBuilder

The comment character to use when parsing CSV.

If a record begins with the byte given here, then that line is ignored by the CSV parser.

This is disabled by default.

Example
use std::error::Error;
use csv::ReaderBuilder;

fn example() -> Result<(), Box<dyn Error>> {
    let data = "\
city,country,pop
#Concord,United States,42695
Boston,United States,4628910
";
    let mut rdr = ReaderBuilder::new()
        .comment(Some(b'#'))
        .from_reader(data.as_bytes());

    if let Some(result) = rdr.records().next() {
        let record = result?;
        assert_eq!(record, vec!["Boston", "United States", "4628910"]);
        Ok(())
    } else {
        Err(From::from("expected at least one record but got none"))
    }
}

pub fn ascii(&mut self) -> &mut ReaderBuilder

A convenience method for specifying a configuration to read ASCII delimited text.

This sets the delimiter and record terminator to the ASCII unit separator (\x1F) and record separator (\x1E), respectively.

Example
use std::error::Error;
use csv::ReaderBuilder;

fn example() -> Result<(), Box<dyn Error>> {
    let data = "\
city\x1Fcountry\x1Fpop\x1EBoston\x1FUnited States\x1F4628910";
    let mut rdr = ReaderBuilder::new()
        .ascii()
        .from_reader(data.as_bytes());

    if let Some(result) = rdr.records().next() {
        let record = result?;
        assert_eq!(record, vec!["Boston", "United States", "4628910"]);
        Ok(())
    } else {
        Err(From::from("expected at least one record but got none"))
    }
}

pub fn buffer_capacity(&mut self, capacity: usize) -> &mut ReaderBuilder

Set the capacity (in bytes) of the buffer used in the CSV reader. This defaults to a reasonable setting.

Trait Implementations

impl Debug for ReaderBuilder

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Default for ReaderBuilder

fn default() -> ReaderBuilder

Returns the “default value” for a type.

Blanket Implementations

impl<T> Any for T where T: 'static + ?Sized

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T where T: ?Sized

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T where T: ?Sized

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T where U: From<T>

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T, U> TryFrom<U> for T where U: Into<T>

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T where U: TryFrom<T>

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.