rust_warc 1.1.0

A high-performance, easy-to-use Web Archive (WARC) file reader
Repository: orottier/rust-warc
Owner: orottier

Rust-Warc

The following example reads WARC records from stdin, counts the records whose WARC-Type is "response", and reports their total content size:

use rust_warc::WarcReader;

use std::io;

fn main() {
    // we're taking input from stdin here, but any BufRead will do
    let stdin = io::stdin();
    let handle = stdin.lock();

    let warc = WarcReader::new(handle);

    let mut response_counter = 0;
    let mut response_size = 0;

    for item in warc {
        // each item is a Result: reading can fail with an I/O error or a malformed record
        let record = item.expect("failed to read WARC record");

        // header names are case-insensitive
        if record.header.get(&"WARC-Type".into()) == Some(&"response".into()) {
            response_counter += 1;
            response_size += record.content.len();
        }
    }

    println!("response records: {}", response_counter);
    println!("response size: {} MiB", response_size >> 20);
}
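
The comment about case-insensitive header names can be illustrated with a standard-library-only sketch. This is not rust_warc's implementation: `HeaderName` and `is_response` are hypothetical names introduced here, and the sketch only assumes that a header map can be keyed by a type that compares and hashes without regard to ASCII case.

```rust
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical key type (not part of rust_warc): equality and hashing
/// ignore ASCII case, so "WARC-Type" and "warc-type" are the same key.
#[derive(Debug, Clone)]
struct HeaderName(String);

impl PartialEq for HeaderName {
    fn eq(&self, other: &Self) -> bool {
        self.0.eq_ignore_ascii_case(&other.0)
    }
}

impl Eq for HeaderName {}

impl Hash for HeaderName {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hash the lowercased bytes so that keys which compare equal
        // also hash equally, as HashMap requires.
        for b in self.0.bytes() {
            state.write_u8(b.to_ascii_lowercase());
        }
    }
}

impl From<&str> for HeaderName {
    fn from(s: &str) -> Self {
        HeaderName(s.to_owned())
    }
}

/// Hypothetical helper: true when the header map marks a response record.
fn is_response(header: &HashMap<HeaderName, String>) -> bool {
    header
        .get(&HeaderName::from("WARC-Type"))
        .map(String::as_str)
        == Some("response")
}

fn main() {
    let mut header: HashMap<HeaderName, String> = HashMap::new();
    // Stored in lowercase, looked up in mixed case inside is_response.
    header.insert(HeaderName::from("warc-type"), "response".to_owned());

    println!("is response: {}", is_response(&header)); // prints "is response: true"
}
```

Only the header name is matched case-insensitively in this sketch. The value is compared exactly, so "Response" would not count as a response record.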