Crate rust_warc
A high-performance Web Archive (WARC) file parser
The WarcReader iterates over WarcRecords from a BufRead input.
Performance should be good: roughly 500 MiB/s on a single CPU core.
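To make concrete what reading records from a BufRead involves, here is a minimal sketch that parses a single WARC record using only the standard library. It is not this crate's implementation: `SketchRecord`, `read_record`, and `invalid` are hypothetical names introduced for illustration, and the sketch assumes the usual WARC layout (a `WARC/x.y` version line, `Name: value` header lines, a blank line, `Content-Length` bytes of content, then two CRLFs).

```rust
use std::collections::HashMap;
use std::io::{self, BufRead};

/// Hypothetical record type for this sketch (not the crate's `WarcRecord`).
#[derive(Debug)]
pub struct SketchRecord {
    pub version: String,
    /// Header names are stored lowercased so lookups ignore case.
    pub headers: HashMap<String, String>,
    pub content: Vec<u8>,
}

/// Helper (hypothetical) that builds an `InvalidData` error.
fn invalid(msg: &str) -> io::Error {
    io::Error::new(io::ErrorKind::InvalidData, msg.to_string())
}

/// Reads one record from `input`; returns `Ok(None)` at a clean end of input.
pub fn read_record<R: BufRead>(input: &mut R) -> io::Result<Option<SketchRecord>> {
    let mut line = String::new();
    if input.read_line(&mut line)? == 0 {
        return Ok(None); // nothing left to read
    }
    let version = line.trim_end().to_string();
    if !version.starts_with("WARC/") {
        return Err(invalid("missing WARC version line"));
    }

    // Header block: one "Name: value" pair per line, ended by an empty line.
    let mut headers = HashMap::new();
    loop {
        line.clear();
        if input.read_line(&mut line)? == 0 {
            return Err(invalid("unexpected end of input in header block"));
        }
        let trimmed = line.trim_end();
        if trimmed.is_empty() {
            break;
        }
        let (name, value) = trimmed
            .split_once(':')
            .ok_or_else(|| invalid("header line without ':'"))?;
        headers.insert(name.trim().to_ascii_lowercase(), value.trim().to_string());
    }

    // The content block is exactly Content-Length bytes long.
    let len: usize = headers
        .get("content-length")
        .ok_or_else(|| invalid("missing Content-Length"))?
        .parse()
        .map_err(|_| invalid("Content-Length is not a number"))?;
    let mut content = vec![0u8; len];
    input.read_exact(&mut content)?;

    // Each record is terminated by two CRLF sequences.
    let mut trailer = [0u8; 4];
    input.read_exact(&mut trailer)?;
    if &trailer != b"\r\n\r\n" {
        return Err(invalid("record not terminated by two CRLFs"));
    }

    Ok(Some(SketchRecord { version, headers, content }))
}
```

Calling `read_record` in a loop until it returns `Ok(None)` gives the same record-by-record traversal that an iterator would provide; a real parser would also need to care about buffer reuse and compressed input, which this sketch ignores.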
Usage
    use rust_warc::WarcReader;

    fn main() {
        // we're taking input from stdin here, but any BufRead will do
        let stdin = std::io::stdin();
        let handle = stdin.lock();

        let mut warc = WarcReader::new(handle);

        let mut response_counter = 0;
        for item in warc {
            let record = item.expect("IO/malformed error");

            // header names are case insensitive
            if record.header.get(&"WARC-Type".into()) == Some(&"response".into()) {
                response_counter += 1;
            }
        }

        println!("# response records: {}", response_counter);
    }
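The usage example relies on header names being case insensitive. As a rough illustration of how a map key can behave that way, the sketch below defines a wrapper whose equality and hash both ignore ASCII case. This is an assumption about the general technique, not the crate's actual `CaseString` code; `CaseInsensitiveKey` and `sample_headers` are hypothetical names.

```rust
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical case-insensitive key (not the crate's `CaseString`).
#[derive(Debug, Clone)]
pub struct CaseInsensitiveKey(String);

impl From<&str> for CaseInsensitiveKey {
    fn from(s: &str) -> Self {
        CaseInsensitiveKey(s.to_string())
    }
}

impl PartialEq for CaseInsensitiveKey {
    fn eq(&self, other: &Self) -> bool {
        // Compare without regard to ASCII case.
        self.0.eq_ignore_ascii_case(&other.0)
    }
}

impl Eq for CaseInsensitiveKey {}

impl Hash for CaseInsensitiveKey {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hash the lowercased bytes so that keys which compare equal
        // also hash equally, as `HashMap` requires.
        for b in self.0.bytes() {
            state.write_u8(b.to_ascii_lowercase());
        }
    }
}

/// Builds a small header map for demonstration purposes.
pub fn sample_headers() -> HashMap<CaseInsensitiveKey, String> {
    let mut headers = HashMap::new();
    headers.insert(CaseInsensitiveKey::from("WARC-Type"), "response".to_string());
    headers
}
```

With this in place, looking up `CaseInsensitiveKey::from("warc-type")` in the map returned by `sample_headers()` finds the value stored under `WARC-Type`. The original spelling is kept inside the key, so it can still be written back out unchanged.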
Structs
CaseString | Case-insensitive string |
WarcReader | WARC reader instance |
WarcRecord | WARC Record |
Enums
WarcError | WARC Processing error |