1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106
/*!
This `csv-sniffer` crate provides methods to infer CSV file details (delimiter choice, quote
character, number of fields, field data types, etc.).
# Overview
The [`Sniffer`](struct.Sniffer.html) type is the primary entry point for using this crate. Its
[`Sniffer::open_path`](struct.Sniffer.html#method.open_path) and
[`Sniffer::open_reader`](struct.Sniffer.html#method.open_reader) methods return a configured
[`csv::Reader`](https://docs.rs/csv).
Alternatively, the [`Sniffer::sniff_path`](struct.Sniffer.html#method.sniff_path) and
[`Sniffer::sniff_reader`](struct.Sniffer.html#method.sniff_reader) methods return a
[`Metadata`](metadata/struct.Metadata.html) object containing the deduced details about the
underlying CSV input.
This sniffer detects the following metadata about a CSV file:
* Delimiter -- byte character between fields in a record
* Number of preamble rows -- number of rows in a CSV file before the data starts (occasionally used
in data files to introduce the data)
* Has a header row? -- whether or not the first row of the data file provdes column headers
* Quote -- byte character (either ", ', or `) used to quote fields, or that the file has no quotes
* Flexible -- whether or not records are all of the same length
* Delimiter count -- maximum number of delimiters in each row (and therefore number of fields in
each row)
* Types -- the inferred data type of each field in the data table
See [`Metadata`](metadata/struct.Metadata.html) for full information about what the sniffer returns.
# Setup
Add this to your `Cargo.toml`:
```toml
[dependencies]
csv-sniffer = "0.1"
```
and this to your crate root:
```rust
extern crate csv_sniffer;
```
# Example
This example shows how to write a simple command-line tool for discovering the metadata of a CSV
file:
```no_run
extern crate csv_sniffer;
use std::env;
fn main() {
let args: Vec<String> = env::args().collect();
if args.len() != 2 {
eprintln!("Usage: {} <file>", args[0]);
::std::process::exit(1);
}
// sniff the path provided by the first argument
match csv_sniffer::Sniffer::new().sniff_path(&args[1]) {
Ok(metadata) => {
println!("{}", metadata);
},
Err(err) => {
eprintln!("ERROR: {}", err);
}
}
}
```
This example is provided as the primary binary for this crate. In the source directory, this can be
run as:
```ignore
$ cargo run -- tests/data/library-visitors.csv
```
*/
#![warn(missing_docs)]
extern crate csv;
extern crate csv_core;
extern crate regex;
#[macro_use]
extern crate bitflags;
extern crate memchr;
pub(crate) mod chain;
pub mod error;
pub mod metadata;
mod sniffer;
pub use sniffer::Sniffer;
mod sample;
pub use sample::SampleSize;
pub(crate) mod field_type;
pub use field_type::Type;
mod snip;