nom-gzip 0.1.0

nom parser for the GZIP file format
Repository: nharward/nom-gzip
Owner: nharward

nom-gzip

nom parser for the GZIP file format, as documented in RFC 1952.

Installation

nom-gzip is available on crates.io and can be used in your project by adding the following to your Cargo.toml file:

[dependencies]
nom-gzip = "0.1.0"

Usage

Three functions are available:

  • gzip_file
  • gzip_header
  • gzip_footer

Once the GZIP header has been parsed, the remaining data are the compressed blocks followed by an 8-byte footer. If you are using a seekable stream, the recommended approach is to parse the header with gzip_header, take the remaining bytes (minus the final 8) as the compressed blocks, then call gzip_footer on those final 8 bytes. This should be considerably faster than parsing byte by byte looking for the end of the stream.
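The split described above can be sketched with the standard library alone. This is a minimal illustration, not this crate's API: `Trailer`, `split_body` and the `header_len` parameter are hypothetical names, with `header_len` standing in for however many bytes the header parser consumed. The footer layout (a little-endian CRC-32 followed by the little-endian uncompressed size modulo 2^32) comes from RFC 1952.

```rust
/// The two fields of the 8-byte GZIP footer described in RFC 1952.
/// (Hypothetical type for illustration; not part of nom-gzip.)
#[derive(Debug, PartialEq)]
pub struct Trailer {
    /// CRC-32 of the uncompressed data.
    pub crc32: u32,
    /// Size of the uncompressed data, modulo 2^32 (ISIZE in the RFC).
    pub uncompressed_len: u32,
}

/// Splits everything after the header into (compressed blocks, footer).
///
/// `header_len` is an assumption of this sketch: the number of bytes
/// the header occupied. Returns `None` if fewer than 8 bytes follow it.
pub fn split_body(data: &[u8], header_len: usize) -> Option<(&[u8], Trailer)> {
    let body = data.get(header_len..)?;
    if body.len() < 8 {
        return None;
    }
    // The last 8 bytes are the footer; everything before is compressed data.
    let (compressed, tail) = body.split_at(body.len() - 8);
    let crc32 = u32::from_le_bytes([tail[0], tail[1], tail[2], tail[3]]);
    let uncompressed_len = u32::from_le_bytes([tail[4], tail[5], tail[6], tail[7]]);
    Some((compressed, Trailer { crc32, uncompressed_len }))
}
```

With a seekable source, the same arithmetic applies to file offsets: seek to 8 bytes before the end for the footer, and treat the range between the end of the header and that offset as the compressed blocks.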

Notes on this parser

TL;DR

This parser assumes the GZIP stream contains only a single compressed file that goes until EOF.

Details

In theory, a single GZIP stream can hold multiple files, simply by concatenating several GZIP files together (see section 2.2 of the RFC). In practice, at least GNU gzip and 7z do not appear to support this correctly. For two files cat'd together, both tools report the header of the first file with the footer (the uncompressed size of the file) from the second. Decompressing such a file with the gzip utility produces the uncompressed contents of both files concatenated together in a single file, instead of two files with separate content. IMHO, if this feature of the GZIP format can't be used in any practical sense, there is no point in spending time writing a theoretically correct but far more involved (and slower!) parser here.
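To make the concatenation case concrete, here is a rough, stdlib-only sketch that builds two tiny GZIP members and joins them. It is not part of this crate: `crc32`, `stored_member` and `trailing_size` are hypothetical helpers, and the members use uncompressed ("stored") DEFLATE blocks so no compression library is needed. It shows why reading the last bytes of a concatenated stream yields the size of the second member only.

```rust
/// Bitwise CRC-32 (reflected polynomial 0xEDB88320), as used by GZIP.
fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            crc = if crc & 1 != 0 {
                (crc >> 1) ^ 0xEDB8_8320
            } else {
                crc >> 1
            };
        }
    }
    !crc
}

/// Builds one GZIP member holding `payload` in a single stored
/// (uncompressed) DEFLATE block. Hypothetical helper for illustration.
pub fn stored_member(payload: &[u8]) -> Vec<u8> {
    assert!(payload.len() <= 0xFFFF, "a stored block holds at most 65535 bytes");
    let len = payload.len() as u16;
    // Magic, CM = 8 (deflate), no flags, MTIME = 0, XFL = 0, OS = 255 (unknown).
    let mut out: Vec<u8> = vec![0x1f, 0x8b, 0x08, 0x00, 0, 0, 0, 0, 0x00, 0xff];
    out.push(0x01); // BFINAL = 1, BTYPE = 00 (stored)
    out.extend_from_slice(&len.to_le_bytes());
    out.extend_from_slice(&(!len).to_le_bytes()); // NLEN: one's complement of LEN
    out.extend_from_slice(payload);
    // Footer: CRC-32, then uncompressed size modulo 2^32.
    out.extend_from_slice(&crc32(payload).to_le_bytes());
    out.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    out
}

/// Reads the size field from the last 4 bytes of a stream.
pub fn trailing_size(stream: &[u8]) -> Option<u32> {
    let start = stream.len().checked_sub(4)?;
    let b = &stream[start..];
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}
```

Concatenating `stored_member(b"hello")` and `stored_member(b"gzip!!!")` gives a stream whose first bytes are the first member's header, while `trailing_size` returns the length of the second payload alone. That is the same mismatch described above for tools that only look at the start and end of the file.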