/*!
This `csv-sniffer` crate provides methods to infer CSV file details (delimiter choice, quote
character, number of fields, field data types, etc.).

# Overview

The [`Sniffer`](struct.Sniffer.html) type is the primary entry point for using this crate. Its
[`Sniffer::open_path`](struct.Sniffer.html#method.open_path) and
[`Sniffer::open_reader`](struct.Sniffer.html#method.open_reader) methods return a configured
[`csv::Reader`](https://docs.rs/csv).

Alternatively, the [`Sniffer::sniff_path`](struct.Sniffer.html#method.sniff_path) and
[`Sniffer::sniff_reader`](struct.Sniffer.html#method.sniff_reader) methods return a
[`Metadata`](metadata/struct.Metadata.html) object containing the deduced details about the
underlying CSV input.

This sniffer detects the following metadata about a CSV file:

* Delimiter -- byte character between fields in a record
* Number of preamble rows -- number of rows in a CSV file before the data starts (occasionally used
in data files to introduce the data)
* Has a header row? -- whether or not the first row of the data file provides column headers
* Quote -- byte character (`"`, `'`, or `` ` ``) used to quote fields, or an indication that the file
has no quotes
* Flexible -- whether or not records are all of the same length
* Delimiter count -- maximum number of delimiters in each row (and therefore the number of fields in
each row, which is one more than the delimiter count)
* Types -- the inferred data type of each field in the data table

See [`Metadata`](metadata/struct.Metadata.html) for full information about what the sniffer returns.
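
To make the "delimiter count" and "flexible" entries above concrete, here is a minimal sketch using
only the standard library. It is not this crate's detection algorithm: the helper names
`delimiter_counts` and `summarize` are hypothetical, the candidate delimiter is assumed to be known
already, and quoting is ignored entirely.

```rust
// Hypothetical helper: count occurrences of `delim` on each non-empty line.
// Assumes no quoted fields, so every delimiter byte separates two fields.
fn delimiter_counts(sample: &str, delim: u8) -> Vec<usize> {
    sample
        .lines()
        .filter(|line| !line.is_empty())
        .map(|line| line.bytes().filter(|&b| b == delim).count())
        .collect()
}

// Hypothetical helper: summarize per-row counts as
// (maximum delimiter count, maximum field count, rows differ in length?).
// Returns `None` when there are no rows to inspect.
fn summarize(counts: &[usize]) -> Option<(usize, usize, bool)> {
    let max = *counts.iter().max()?;
    // Rows differ in length if any row has fewer delimiters than the maximum.
    let flexible = counts.iter().any(|&c| c != max);
    // A row with `max` delimiters has `max + 1` fields.
    Some((max, max + 1, flexible))
}

fn main() {
    // The last row is short, so the records are not all of the same length.
    let sample = "id,name,city\n1,Ada,London\n2,Grace\n";
    let counts = delimiter_counts(sample, b',');
    println!("{:?}", counts);
    println!("{:?}", summarize(&counts));
}
```

A real sniffer also has to choose among candidate delimiters and account for quoted fields, which
this sketch deliberately leaves out.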

# Setup

Add this to your `Cargo.toml`:

```toml
[dependencies]
csv-sniffer = "0.1"
```

and this to your crate root:

```rust
extern crate csv_sniffer;
```

# Example

This example shows how to write a simple command-line tool for discovering the metadata of a CSV
file:

```no_run
extern crate csv_sniffer;

use std::env;

fn main() {
    let args: Vec<String> = env::args().collect();
    if args.len() != 2 {
        eprintln!("Usage: {} <file>", args[0]);
        ::std::process::exit(1);
    }

    // sniff the path provided by the first argument
    match csv_sniffer::Sniffer::new().sniff_path(&args[1]) {
        Ok(metadata) => {
            println!("{}", metadata);
        },
        Err(err) => {
            eprintln!("ERROR: {}", err);
        }
    }
}
```

This example is provided as the primary binary for this crate. In the source directory, this can be
run as:

```ignore
$ cargo run -- tests/data/library-visitors.csv
```

*/

#![warn(missing_docs)]

extern crate csv;
extern crate csv_core;
extern crate regex;
#[macro_use]
extern crate bitflags;
extern crate memchr;

pub(crate) mod chain;
pub mod error;
pub mod metadata;

mod sniffer;
pub use sniffer::Sniffer;

mod sample;
pub use sample::SampleSize;

pub(crate) mod field_type;
pub use field_type::Type;

mod snip;