This csv-sniffer crate provides methods to infer CSV file details (delimiter choice, quote
character, number of fields, field data types, etc.).
§Overview
The Sniffer type is the primary entry point for using this crate. Its Sniffer::open_path and Sniffer::open_reader methods return a configured csv::Reader.
Alternatively, the Sniffer::sniff_path and Sniffer::sniff_reader methods return a Metadata object containing the deduced details about the underlying CSV input.
This sniffer detects the following metadata about a CSV file:
- Delimiter – byte character between fields in a record
- Has a header row? – whether or not the first row of the data file provides column headers
- Number of preamble rows – number of rows in a CSV file before the data starts (occasionally used in data files to introduce the data)
- Quote – byte character (either ", ', or `) used to quote fields, or an indication that the file has no quoted fields
- Flexible – whether or not records are all of the same length
- Is utf8-encoded? – whether the file is utf-8 encoded
- Number of delimiters/fields – maximum number of delimiters in each row (and therefore the number of fields in each row)
- Field names – the name of each field
- Types – the inferred data type of each field in the data table
See Metadata for full information about what the sniffer returns.
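To make the delimiter, field count, and flexible items above concrete, here is a minimal standard-library sketch of how such details could be guessed from a text sample. It is an illustration only and is not this crate's detection algorithm: the `Guess` struct, the `guess_delimiter` function, and the candidate delimiter list are all hypothetical names introduced for this example, and the sketch ignores quoting, preamble rows, and header detection.

```rust
/// Hypothetical result type for this sketch (not the crate's `Metadata`).
#[derive(Debug, PartialEq)]
struct Guess {
    /// Byte that separates fields.
    delimiter: u8,
    /// Maximum number of delimiters seen on a line, plus one.
    num_fields: usize,
    /// True when lines do not all have the same number of fields.
    flexible: bool,
}

/// Hypothetical helper: choose the candidate byte whose per-line count is
/// most consistent, preferring uniform counts first and larger counts second.
/// Assumes no quoted fields contain the delimiter.
fn guess_delimiter(sample: &str) -> Option<Guess> {
    // Assumed candidate set; a real sniffer may consider other bytes.
    const CANDIDATES: [u8; 4] = [b',', b'\t', b';', b'|'];

    let lines: Vec<&str> = sample.lines().filter(|l| !l.is_empty()).collect();
    if lines.is_empty() {
        return None;
    }

    // Best candidate so far as (delimiter, max count per line, flexible).
    let mut best: Option<(u8, usize, bool)> = None;
    for &cand in CANDIDATES.iter() {
        let counts: Vec<usize> = lines
            .iter()
            .map(|l| l.bytes().filter(|&b| b == cand).count())
            .collect();
        // `lines` is non-empty, so `counts` always has a maximum.
        let max = *counts.iter().max().unwrap();
        if max == 0 {
            continue; // this candidate never appears
        }
        let flexible = counts.iter().any(|&c| c != max);
        let better = match best {
            None => true,
            Some((_, best_max, best_flex)) => (!flexible, max) > (!best_flex, best_max),
        };
        if better {
            best = Some((cand, max, flexible));
        }
    }

    best.map(|(delimiter, max, flexible)| Guess {
        delimiter,
        num_fields: max + 1,
        flexible,
    })
}

fn main() {
    let sample = "name;age;city\nAda;36;London\nLinus;54;Portland\n";
    match guess_delimiter(sample) {
        Some(guess) => println!("{:?}", guess),
        None => println!("no delimiter found"),
    }
}
```

Counting raw bytes per line keeps the sketch short, but it will miscount when a quoted field contains the delimiter, which is one reason a real sniffer has to infer the quote character as well.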
§Setup
Add this to your Cargo.toml:
[dependencies]
csv-sniffer = "0.1"
and this to your crate root:
extern crate qsv_sniffer;
§Example
This example shows how to write a simple command-line tool for discovering the metadata of a CSV file:
extern crate qsv_sniffer;

use std::env;

fn main() {
    let args: Vec<String> = env::args().collect();
    if args.len() != 2 {
        eprintln!("Usage: {} <file>", args[0]);
        ::std::process::exit(1);
    }

    // sniff the path provided by the first argument
    match qsv_sniffer::Sniffer::new().sniff_path(&args[1]) {
        Ok(metadata) => {
            println!("{}", metadata);
        },
        Err(err) => {
            eprintln!("ERROR: {}", err);
        }
    }
}
This example is provided as the primary binary for this crate. In the source directory, this can be run as:
$ cargo run -- tests/data/library-visitors.csv
Modules§
Structs§
- Sniffer
- A CSV sniffer.
Enums§
- DatePreference
- Argument used when calling date_preference on Sniffer.
- SampleSize
- Argument used when calling sample_size on Sniffer.
- Type
- The valid field types for fields in a CSV record.