dir-signature 0.1.0

This crate allows to create a signature (index) of a directory with files.
Documentation
==================================
Directory Signature File Format v1
==================================

The format is a text file with ascii-only characters.

Limitations of the current format:

* Only stores executable bit for files, no permissions and ownership
  support (this also means files can be replicated without privileges)
* File modification times are not checked and not replicated
* It's ascii text, so potentially 2x larger than what binary file could be

While these limitations are not enough for generic backup purposes they
are fine for deploying configs and read-only images to production servers
in 99% use cases. Latter was a primary use case for the library. We will
probably make a more featureful format as v2 and later as deemed necessary.

Design of the format features the following things:

* Reproducible (does not depend on order of file scan or phase of the moon)
* Easy to check even using a bash script (sans edge cases)
* Usable for file synchonization
* Can be produced and checked without loading full index into memory


Header
======

File starts with a header line, which looks like::

    DIRSIGNATURE.v1 sha512/256 block_size=32768

It consists of three parts separated by a single space, all of them are
case sensitive and order is fixed. Parts have the following meaning:

1. ``DIRSIGNATURE.v1 `` is a signature of a file format and version.
   Only ``v1` format is defined by this specification.

2. Is a hash type (must be all lower case). This specification defines two
   hash types ``blake2b/256`` (which is blake2b with 256 bit hash) and
   ``sha512/256`` which means ``sha512`` truncated to a 256 bits.
   Other hash kinds might be added in future. It's expected that sha512/256
   will be supported by every implementation and others are optional.

3. Space separated key value pairs. This specification defines only
   ``block_size``. It must be the first key in the header. This specification
   requires to support only ``32768`` block size. Other block sizes can be
   added in future. Additional key value pairs may exists and may be skipped
   by the parser (but must be accounted in final hash, see below).


File List
=========

Directories and files follow header and go in the following format. Example::

    /
      file1.txt f 0
    /dir
      file2.txt f 1 a4abd4448c49562d828115d13a1fccea927f52b4d5459297f8b43e42da89238b
      symlink s ../file1.txt
    /dir/subdir

The rules are:

* Directory lines start with slash ``/``, directory path is specified relative
  to the root of the scanned directory
* Files and symlinks start with exactly two space ``  `` and followed by a name
  relative to the directory, followed by attributes (see below)
* No other kinds of entries are allowed
* Directory does not have any attributes and should be recreated with
  appropriate umask or use mode ``755`` when created
* All fields in file entries are space-separated
* In file (and directory) names all control characters, non-ascii,
  non-printable characters and space are escaped using hex escapes (e.g. space
  is ``\x20``), unicode characters are first serialized to utf-8 then escaped
  (specifically all chars with code <= 0x20 and >= 0x7F are escaped)
* Line-endings are always ``\n``
* Directory paths are sorted as utf-8-encoded binary strings
* File names are sorted locally inside the directory as utf-8-encoded binary
  strings


File Entries
============

Files can be of the following types:

* ``f`` -- regular file (recommended mode ``644``)
* ``x`` -- executable (recommended mode ``755``)
* ``s`` -- a symlink

A symlink is a stored in the index as name followed by ``s`` followed by a
symlink's destination (obtained by ``readlink()``).

Files (both executables and not) are indexed as name followed by ``f`` or
``x``, followed by file size, followed by a lowercase hex-encoded hashes for
each block.  If last block of file is less than ``block_size`` it's not padded
only bytes that exist in file are hashed.

Files with the size of zero do not have any hashes (finish line by zero file
length). In general number of hashes may be calculates as
``ceil(file_size / block_size)``.


Footer
======

Footer consists of a hash of the all lines above, including header line as
written in the file hashed with the same hash function (and serialized as a
lowercase hex value). Footer ends with a newline. And this is the final line
of the file.

If you're writing a parser any line except the first that does not start with
a slash ``/`` or a space `` `` must be considerered a footer.


Full Example
============

Here is an example of the simple directory::

    DIRSIGNATURE.v1 sha512/256 block_size=32768
    /
      file2.txt f 18 c4cadd1e2e2aded1cdb2ba48fdfe8a831d9236042aec16472725d45b001c1ad5
    /sub2
      hello.txt f 6 e0494295cc1dfdd443d09f81913881a112745174778cc0c224ccc7137024fe41
    /subdir
      bigdata.bin f 81920 768007e06b0cd9e62d50f458b9435c6dda0a6d272f0b15550f97c478394b7433 768007e06b0cd9e62d50f458b9435c6dda0a6d272f0b15550f97c478394b7433 6eb7f16cf7afcabe9bdea88bdab0469a7937eb715ada9dfd8f428d9d38d86133
      file3.txt f 12 b130fa20a2ba5a3d9976e6c15e8a59ad9e5cbbc52536a4458952872cda5c218d
    c23f2579827456818fc855c458d1ad7339d144b57ee247a6628e4fc8e39958bb