Crate cshannon

A library of some early compression algorithms based on replacement schemes.

This library implements the standard Huffman coding scheme and two precursors to the Huffman scheme, both of which are often called Shannon-Fano coding.
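For readers unfamiliar with Huffman coding, the sketch below computes Huffman codeword lengths from symbol counts. It is a standalone illustration and not part of the cshannon API; the function name `huffman_code_lengths` and the use of `char` symbols are assumptions made for the example.

```rust
use std::cmp::Reverse;
use std::collections::{BTreeMap, BinaryHeap};

/// Returns the Huffman codeword length (in bits) for each symbol.
/// Hypothetical helper for illustration; not the cshannon API.
/// With a single distinct symbol no merge happens and its length stays 0.
fn huffman_code_lengths(counts: &BTreeMap<char, u64>) -> BTreeMap<char, u32> {
    let mut lengths: BTreeMap<char, u32> = counts.keys().map(|&c| (c, 0)).collect();

    // Min-heap of subtrees, each stored as (total weight, symbols it contains).
    let mut heap: BinaryHeap<Reverse<(u64, Vec<char>)>> = counts
        .iter()
        .map(|(&c, &n)| Reverse((n, vec![c])))
        .collect();

    // Repeatedly merge the two lightest subtrees. Every symbol inside the
    // merged subtree moves one level deeper, so its codeword grows by one bit.
    while heap.len() > 1 {
        let Reverse((w1, mut s1)) = heap.pop().unwrap();
        let Reverse((w2, s2)) = heap.pop().unwrap();
        s1.extend(s2);
        for c in &s1 {
            *lengths.get_mut(c).unwrap() += 1;
        }
        heap.push(Reverse((w1 + w2, s1)));
    }
    lengths
}
```

For counts a=5, b=2, c=1, d=1 this yields lengths 1, 2, 3 and 3 bits respectively: the more frequent a symbol, the shorter its codeword.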

Usage

cshannon provides a binary that can be used for compression / decompression at the command line and a library that can be integrated into other projects.

Run cshannon --help to see the command-line options for the binary.

The easiest way to use the cshannon library is:

use cshannon::{Args, run};

run(Args{
    command: "compress",
    input_file: "/path/to/input_file",
    output_file: "/path/to/output_file",
    tokenizer: "byte",
    encoding: "fano",
});

Crate layout

The abstract steps in compression are as follows:

Input --> Tokens --> Model --> Encoding -+
  |                                      |
  +-----> Tokens ------------------------+--> Compressed
                                                Output

Different modules in the crate correspond to each of these steps.
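The compression steps can be sketched end to end using only the standard library. This is a minimal illustration assuming byte tokens and Fano's top-down splitting method. The function names (`tokenize`, `build_model`, `fano`, `encode`) are hypothetical and do not reflect the crate's actual types. The output is shown as a string of '0' and '1' characters for readability, and the step that stores the encoding alongside the output is omitted.

```rust
use std::collections::BTreeMap;

/// Step 1 (Input -> Tokens): treat every byte as one token.
fn tokenize(input: &[u8]) -> Vec<u8> {
    input.to_vec()
}

/// Step 2 (Tokens -> Model): count how often each token occurs.
fn build_model(tokens: &[u8]) -> Vec<(u8, u64)> {
    let mut counts: BTreeMap<u8, u64> = BTreeMap::new();
    for &t in tokens {
        *counts.entry(t).or_insert(0) += 1;
    }
    let mut model: Vec<(u8, u64)> = counts.into_iter().collect();
    // Most frequent first; the stable sort keeps ties in ascending byte order.
    model.sort_by(|a, b| b.1.cmp(&a.1));
    model
}

/// Step 3 (Model -> Encoding): Fano's method. Split the sorted tokens into
/// two groups of nearly equal total count, append 0 / 1, and recurse.
fn fano(model: &[(u8, u64)], prefix: String, out: &mut BTreeMap<u8, String>) {
    match model.len() {
        0 => return,
        1 => {
            // A lone token still needs at least one bit.
            let code = if prefix.is_empty() { "0".to_string() } else { prefix };
            out.insert(model[0].0, code);
            return;
        }
        _ => {}
    }
    let total: u64 = model.iter().map(|m| m.1).sum();
    // Find the split point that balances the two halves best.
    let (mut best, mut best_diff, mut left) = (1, u64::MAX, 0u64);
    for i in 1..model.len() {
        left += model[i - 1].1;
        let diff = (2 * left).abs_diff(total);
        if diff < best_diff {
            best = i;
            best_diff = diff;
        }
    }
    fano(&model[..best], format!("{}0", prefix), out);
    fano(&model[best..], format!("{}1", prefix), out);
}

/// Step 4 (Tokens + Encoding -> Compressed output), as a bit string.
fn encode(tokens: &[u8], encoding: &BTreeMap<u8, String>) -> String {
    tokens.iter().map(|t| encoding[t].as_str()).collect()
}
```

For the input `aaaabbcd`, this sketch assigns the codewords 0, 10, 110 and 111 to a, b, c and d, and the encoded bit string is `00001010110111` (14 bits instead of 64).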

The abstract steps for decompression are as follows:

Compressed --> extract prefix --> Encoding
  Input                              |
   |                                 |
   +--> remaining data --------------+--> Output

Decompression is conceptually simpler because there are no choices (of tokenizer and encoding). The encoding is included as a prefix in-band in the compressed data. Most of the decompression logic resides in the code module.
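To illustrate why decoding needs no further choices once the encoding is known, here is a small sketch of prefix-code decoding. It assumes the encoding table has already been extracted from the prefix of the compressed data, and it represents bits as a string of '0' and '1' characters. The `decode` function and its signature are hypothetical, not the interface of the code module.

```rust
use std::collections::HashMap;

/// Decodes `bits` (a string of '0' and '1') with a prefix code given as
/// (token, codeword) pairs. Returns None if the input ends in the middle
/// of a codeword. Hypothetical helper; not the cshannon API.
fn decode(bits: &str, encoding: &[(u8, &str)]) -> Option<Vec<u8>> {
    // Invert the table: codeword -> token.
    let by_code: HashMap<&str, u8> = encoding.iter().map(|&(t, c)| (c, t)).collect();

    let mut out = Vec::new();
    let mut start = 0;
    for end in 1..=bits.len() {
        // No codeword is a prefix of another, so the first match found
        // while extending the window is the only possible match.
        if let Some(&token) = by_code.get(&bits[start..end]) {
            out.push(token);
            start = end;
        }
    }
    if start == bits.len() {
        Some(out)
    } else {
        None
    }
}
```

With the table a=0, b=10, c=110, d=111, decoding `00001010110111` gives back `aaaabbcd`, while `0001` is rejected because the final bit does not complete a codeword.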

Modules

code

Provides facilities to read & write data encoded with a prefix code.

encoding

Defines the Encoding struct that maps a Token to a Letter.

model

Exports Model, a statically computed zero order model over a Token stream.

tokens

This module provides traits for tokenizing text.
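As a rough picture of what a tokenizing trait can look like, the sketch below defines a trait with two implementations. All names here (`Tokenizer`, `ByteTokenizer`, `WordTokenizer`) are assumptions made for illustration; the actual traits in the tokens module may be shaped differently.

```rust
/// A hypothetical tokenizer trait; the real traits in `tokens` may differ.
trait Tokenizer {
    type Token;
    fn tokenize(&self, text: &str) -> Vec<Self::Token>;
}

/// Emits one token per byte of the input.
struct ByteTokenizer;

impl Tokenizer for ByteTokenizer {
    type Token = u8;
    fn tokenize(&self, text: &str) -> Vec<u8> {
        text.bytes().collect()
    }
}

/// Emits one token per word. Each token keeps its trailing whitespace
/// character so that concatenating the tokens reproduces the input exactly.
struct WordTokenizer;

impl Tokenizer for WordTokenizer {
    type Token = String;
    fn tokenize(&self, text: &str) -> Vec<String> {
        text.split_inclusive(char::is_whitespace)
            .map(str::to_owned)
            .collect()
    }
}
```

Keeping the whitespace inside each word token matters for compression: a tokenizer that discards separators cannot be inverted during decompression.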

Structs

Args

Functions

compress

Document me. TODO: Convert to use AsRef

decompress

Document me. TODO: Convert to use AsRef

run