file_checker 0.3.1

A tool to check that files and directories follow a given pattern

# File Checker

This is a file and directory checker that verifies that a directory structure
follows the rules defined in a configuration file.

## Installation

    $ cargo install file_checker

## Usage

    $ file_checker --help
    File checking program

    Usage: file_checker [OPTIONS]

    Options:
    -p, --path <PATH>      Path to the root directory to check [default: .]
    -c, --config <CONFIG>  Path to config file [default: config.toml]
    -t, --target <TARGET>  Target directory to print diagnostics for [default: .]
    -h, --help             Print help

## Configuration

The configuration file is a TOML file that contains the rules to check the
files and directories.

The configuration file is composed of two parts:

- [[alias]]: This part contains the aliases that can be used in the rules.
- [dir]: This part contains the rules to check the directories.

### Aliases

The aliases are used to simplify the rules. For example, if you
want to match a date, you can define an alias for it:

    [[alias]]
    name = 'date'
    regex = '\d{4}-\d{2}-\d{2}'
    description = "Date"
    example = '2019-01-01'

The description and example are optional.

The regex entry can be a string or an array of strings. If it is an array,
the regex is the concatenation of all the strings. Any string that is the name of
an alias is replaced by the regex of that alias:

    [[alias]]
    name = 'hour-date'
    regex = ['\d{2}:\d{2}', ' ', 'date']

The previous alias will match a string like '12:00 2019-01-01' with its regex being
'\d{2}:\d{2} \d{4}-\d{2}-\d{2}'.
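
The expansion described above can be sketched in a few lines of Rust. This is only an illustration of the concatenation rule, not the tool's actual implementation; the `expand` function and the map of aliases are hypothetical names introduced for this example.

```rust
use std::collections::HashMap;

/// Hypothetical helper: builds one pattern from the parts of a `regex` array.
/// A part that equals a known alias name is replaced by that alias's regex;
/// any other part is kept verbatim.
fn expand<'a>(parts: &[&'a str], aliases: &HashMap<&'a str, &'a str>) -> String {
    parts
        .iter()
        .map(|part| aliases.get(*part).copied().unwrap_or(*part))
        .collect()
}

fn main() {
    let mut aliases = HashMap::new();
    aliases.insert("date", r"\d{4}-\d{2}-\d{2}");

    // Same parts as the 'hour-date' alias above.
    let pattern = expand(&[r"\d{2}:\d{2}", " ", "date"], &aliases);
    println!("{pattern}"); // prints \d{2}:\d{2} \d{4}-\d{2}-\d{2}
}
```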

### Rules

The rules are defined in the [dir] section. Only aliases defined in the
[[alias]] section can be used in the rules. You cannot use any raw regex.

Inside a rule you can use the following keywords:

- required-files: A list of lists of files that must be present in the directory. At least
  one list must have all of its files present for the directory to be considered valid.
- optional-files: A list of files that can be present in the directory.
- optional-dirs: A list of directories that can be present in the directory and will not be checked.

To match a directory, use the matching alias as the name of the section:

    [dir.date]
    required-files = [['file1', 'file2'], ['file3', 'file4']]
    optional-files = ['file5', 'file6']
    optional-dirs = ['dir1']

The previous rule will match a top level directory that matches the 'date' regex, and it
will verify that the directory contains at least one of the following combinations of aliases:

- file1 and file2
- file3 and file4

It will also verify that the directory contains no files other than those matched by the
required-files list that was fully satisfied and by the optional-files section.

Any missing required file will create a violation MissingRequiredFile. Any extra file will
create a violation UnknownFile.
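
The file checks described above can be sketched as follows. This is a simplified model written for this README, not the tool's source. To stay dependency-free, an alias is treated as a file extension rather than as a regex. The sketch also assumes that, when no required list is satisfied, the missing aliases of the first list are reported. The names `Violation`, `matches` and `check_dir` are hypothetical.

```rust
#[derive(Debug, PartialEq)]
enum Violation {
    MissingRequiredFile(String),
    UnknownFile(String),
}

/// Stand-in for regex matching (assumption): alias "csv" matches "*.csv".
fn matches(alias: &str, file: &str) -> bool {
    file.ends_with(&format!(".{alias}"))
}

fn check_dir(files: &[&str], required: &[Vec<&str>], optional: &[&str]) -> Vec<Violation> {
    let mut violations = Vec::new();

    // A required list is satisfied when each of its aliases matches some file.
    let satisfied = required
        .iter()
        .find(|list| list.iter().all(|alias| files.iter().any(|f| matches(alias, f))));

    // Assumption: if no list is satisfied, report what the first list lacks.
    if satisfied.is_none() {
        if let Some(first) = required.first() {
            for alias in first {
                if !files.iter().any(|f| matches(alias, f)) {
                    violations.push(Violation::MissingRequiredFile(alias.to_string()));
                }
            }
        }
    }

    // Allowed files match an optional alias or the satisfied required list.
    let mut allowed: Vec<&str> = optional.to_vec();
    if let Some(list) = satisfied {
        allowed.extend(list.iter().copied());
    }
    for file in files {
        if !allowed.iter().any(|alias| matches(alias, file)) {
            violations.push(Violation::UnknownFile(file.to_string()));
        }
    }
    violations
}

fn main() {
    let required = vec![vec!["csv"], vec!["txt"]];
    let optional = ["toml"];
    // prints [UnknownFile("info.json")]
    println!("{:?}", check_dir(&["info.json", "file1.csv"], &required, &optional));
    // prints [MissingRequiredFile("csv")]
    println!("{:?}", check_dir(&["info.toml"], &required, &optional));
}
```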

To specify a nested directory, append its alias to the section name of the parent directory:

    [dir.date.hour-date]
    required-files = [['file1', 'file2'], ['file3', 'file4']]
    optional-files = ['file5', 'file6']
    optional-dirs = ['dir1']

If a directory specified in the rules does not exist, it will create a violation MissingRequiredDir. You can
add the following keyword to the rule to ignore this violation:

    ignore-missing = true

If a directory is not matched by any of the regexes, it will create a violation UnknownDir.

### Example

The following is an example of a configuration file:

    [[alias]]
    name = 'date'
    regex = '\d{4}-\d{2}-\d{2}'
    description = "Date"
    example = '2019-01-01'

    [[alias]]
    name = 'hour-date'
    regex = ['\d{2}:\d{2}', ' ', 'date']

    [[alias]]
    name = 'csv'
    regex = '.*\.csv$'

    [[alias]]
    name = 'txt'
    regex = '.*\.txt$'

    [[alias]]
    name = 'toml'
    regex = '.*\.toml$'

    [[alias]]
    name = 'dir1'
    regex = 'opt'

    [dir.date]
    required-files = [['toml']]
    optional-files = []

    [dir.date.hour-date]
    required-files = [['csv'], ['txt']]
    optional-files = ['toml']
    optional-dirs = ['dir1']

Checking the following directory structure against this configuration file produces some errors:

    .
    ├── 2019-01-01
    │   ├── 12:00 2019-01-01
    │   │   ├── info.json
    │   │   ├── file1.csv
    │   │   └── opt
    │   │       └── file7.txt
    │   └── 13:00 2019-01-01
    │       └── info.toml
    └── 2019-01-02
        ├── config.toml
        ├── 12:00 2019-01-02
        │   └── file5.csv
        └── 13:00 2019-01-02
            └── file8.txt

The errors produced are:

    "./2019-01-01/12:00 2019-01-01" - UnknownFile - File "info.json" does not match any regex
    "./2019-01-01/13:00 2019-01-01" - MissingRequiredFile - File "csv" is missing