Expand description
The csv-index
crate provides data structures for indexing CSV data.
§Usage
This crate is
on crates.io
and can be used by adding csv-index
to your dependencies in your project’s
Cargo.toml
[dependencies]
csv-index = "0.2"
§Example: build a simple random access index
The RandomAccessSimple
index is a simple data structure that maps record
indices to the byte offset corresponding to the start of that record in CSV
data. This example shows how to save this index to disk for a particular CSV
file.
Note that this indexing data structure cannot be updated. That means that if your CSV data has changed since the index was created, then the index will need to be regenerated.
use std::error::Error;
use std::fs::File;
use std::io::{self, Write};
use csv_index::RandomAccessSimple;
fn example() -> Result<(), Box<dyn Error>> {
// Open a normal CSV reader.
let mut rdr = csv::Reader::from_path("data.csv")?;
// Create an index for the CSV data in `data.csv` and write it
// to `data.csv.idx`.
let mut wtr = io::BufWriter::new(File::create("data.csv.idx")?);
RandomAccessSimple::create(&mut rdr, &mut wtr)?;
wtr.flush()?;
// Open the index we just created, get the position of the last
// record and seek the CSV reader to the last record.
let mut idx = RandomAccessSimple::open(File::open("data.csv.idx")?)?;
if idx.is_empty() {
return Err(From::from("expected a non-empty CSV index"));
}
let last = idx.len() - 1;
let pos = idx.get(last)?;
rdr.seek(pos)?;
// Read the next record.
if let Some(result) = rdr.records().next() {
let record = result?;
println!("{:?}", record);
Ok(())
} else {
Err(From::from("expected at least one record but got none"))
}
}
§Future work
The full scope of this crate hasn’t been determined yet. For example, it’s not clear whether this crate should support data structures more amenable to in-memory indexing. (Where the current set of indexing data structures are all amenable to serializing to disk.)
Structs§
- Random
Access Simple - A simple index for random access to CSV records.