csv_index/
lib.rs

/*!
The `csv-index` crate provides data structures for indexing CSV data.

# Usage

This crate is
[on crates.io](https://crates.io/crates/csv-index)
and can be used by adding `csv-index` to your dependencies in your project's
`Cargo.toml`

```toml
[dependencies]
csv-index = "0.2"
```

# Example: build a simple random access index

The `RandomAccessSimple` index is a simple data structure that maps record
indices to the byte offset corresponding to the start of that record in CSV
data. This example shows how to save this index to disk for a particular CSV
file.

Note that this indexing data structure cannot be updated. That means that if
your CSV data has changed since the index was created, then the index will need
to be regenerated.
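
Because the index cannot be updated in place, callers need some way to decide
when to rebuild it. The following is a minimal sketch of one possible
heuristic, not something this crate provides: compare file modification times
and regenerate when the CSV data is newer than the index. The helper names
`is_stale` and `index_needs_rebuild` are hypothetical, and modification times
are only an approximation (they can be coarse, or reset when files are copied).

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::SystemTime;

// Hypothetical helper (not part of `csv-index`): the index is treated as
// stale when the CSV data was modified after the index was written.
fn is_stale(csv_modified: SystemTime, idx_modified: SystemTime) -> bool {
    csv_modified > idx_modified
}

// Hypothetical helper: returns true if the index file is missing or is
// older than the CSV file, based on file system modification times.
fn index_needs_rebuild(csv_path: &Path, idx_path: &Path) -> io::Result<bool> {
    let csv_modified = fs::metadata(csv_path)?.modified()?;
    match fs::metadata(idx_path) {
        Ok(meta) => Ok(is_stale(csv_modified, meta.modified()?)),
        Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(true),
        Err(err) => Err(err),
    }
}

fn main() {
    // Same file names as the indexing example in this section.
    let csv_path = Path::new("data.csv");
    let idx_path = Path::new("data.csv.idx");
    match index_needs_rebuild(csv_path, idx_path) {
        Ok(true) => println!("index is missing or stale; regenerate it"),
        Ok(false) => println!("index looks up to date"),
        Err(err) => println!("could not inspect files: {}", err),
    }
}
```

The example that follows always builds the index from scratch.
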

```no_run
use std::error::Error;
use std::fs::File;
use std::io::{self, Write};
use csv_index::RandomAccessSimple;

# fn main() { example().unwrap(); }
fn example() -> Result<(), Box<dyn Error>> {
    // Open a normal CSV reader.
    let mut rdr = csv::Reader::from_path("data.csv")?;

    // Create an index for the CSV data in `data.csv` and write it
    // to `data.csv.idx`.
    let mut wtr = io::BufWriter::new(File::create("data.csv.idx")?);
    RandomAccessSimple::create(&mut rdr, &mut wtr)?;
    wtr.flush()?;

    // Open the index we just created, get the position of the last
    // record and seek the CSV reader to the last record.
    let mut idx = RandomAccessSimple::open(File::open("data.csv.idx")?)?;
    if idx.is_empty() {
        return Err(From::from("expected a non-empty CSV index"));
    }
    let last = idx.len() - 1;
    let pos = idx.get(last)?;
    rdr.seek(pos)?;

    // Read the next record.
    if let Some(result) = rdr.records().next() {
        let record = result?;
        println!("{:?}", record);
        Ok(())
    } else {
        Err(From::from("expected at least one record but got none"))
    }
}
```
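
To make the mapping from record index to byte offset concrete, here is a
minimal sketch that uses only the standard library. It is not how
`RandomAccessSimple` is implemented: the hypothetical `record_offsets`
function naively treats every `\n` as a record terminator, so it would give
wrong answers for quoted fields that contain newlines, which a real CSV parser
handles.

```rust
// Hypothetical helper (not part of `csv-index`): returns the byte offset at
// which each line of `data` starts. Naive: assumes no quoted newlines.
fn record_offsets(data: &[u8]) -> Vec<u64> {
    let mut offsets = Vec::new();
    let mut start = 0usize;
    for (i, &b) in data.iter().enumerate() {
        if b == b'\n' {
            offsets.push(start as u64);
            start = i + 1;
        }
    }
    // A final record without a trailing newline still counts.
    if start < data.len() {
        offsets.push(start as u64);
    }
    offsets
}

fn main() {
    let data = b"city,pop\nBoston,4628910\nConcord,42695\n";
    let offsets = record_offsets(data);
    // Entry `i` is where line `i` starts; in this sketch the header is entry 0.
    assert_eq!(offsets, vec![0, 9, 24]);

    // Random access: jump straight to line 2 without scanning from the start.
    let start = offsets[2] as usize;
    assert_eq!(&data[start..], &b"Concord,42695\n"[..]);
}
```

The table of offsets is what makes random access cheap: looking up entry `i`
is a single index operation, after which the reader can seek directly to that
byte position.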

# Future work

The full scope of this crate hasn't been determined yet. For example, it's not
clear whether this crate should support data structures more amenable to
in-memory indexing. (The current indexing data structures are all amenable to
serializing to disk.)
*/

#![deny(missing_docs)]

pub use crate::simple::RandomAccessSimple;

mod simple;