chaintools 0.0.8

work with .chain files in Rust
<p align="center">
  <p align="center">
    <img width=200 align="center" src="../logo.png" >
  </p>

  <span>
    <h1 align="center">
        chaintools
    </h1>
  </span>

  <p align="center">
    <a href="https://img.shields.io/badge/version-0.0.2-green" target="_blank">
      <img alt="Version Badge" src="https://img.shields.io/badge/version-0.0.2-green">
    </a>
    <a href="https://crates.io/crates/chaintools" target="_blank">
      <img alt="Crates.io Version" src="https://img.shields.io/crates/v/chaintools">
    </a>
    <a href="https://github.com/alejandrogzi/chaintools" target="_blank">
      <img alt="GitHub License" src="https://img.shields.io/github/license/alejandrogzi/chaintools?color=blue">
    </a>
    <a href="https://crates.io/crates/chaintools" target="_blank">
      <img alt="Crates.io Total Downloads" src="https://img.shields.io/crates/d/chaintools">
    </a>
  </p>
</p>


# chaintools anti-repeat

Filters out chains that are dominated by soft-masked sequence or by degenerate nucleotide matches.

## Usage

```bash
chaintools anti-repeat \
  --reference target.2bit-or-fasta \
  --query query.2bit-or-fasta \
  [--chain input.chain] \
  [--out-chain output.chain] \
  [--gzip] \
  [--min-score 5000] \
  [--no-check-score 200000]
```

If `--chain` is omitted, input is read from standard input. If `--out-chain` is omitted, output is written to standard output.

## Behavior

- Chains whose score is at least `--no-check-score` are written without any sequence checks.
- All other chains are written only if they pass both:
  - the degeneracy filter
  - the repeat-mask filter
- Output order matches input order.
- `#` metadata lines are preserved in stream order.
- Output chain text is written in canonical dense chain format.
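
The keep-or-drop rule above can be sketched in a few lines of Rust. This is an illustration only, not the crate's implementation: the `Chain` struct, `keep_chain`, `filter_in_order`, and the two filter callbacks are hypothetical names, and the real filters inspect sequence data rather than just the score.

```rust
/// Hypothetical stand-in for a parsed chain; only the score matters here.
struct Chain {
    score: i64,
}

/// Decide whether one chain is kept.
/// `passes_degeneracy` and `passes_repeat` are placeholder callbacks that
/// stand in for the real sequence-based filters.
fn keep_chain<D, R>(
    chain: &Chain,
    no_check_score: i64,
    passes_degeneracy: D,
    passes_repeat: R,
) -> bool
where
    D: Fn(&Chain) -> bool,
    R: Fn(&Chain) -> bool,
{
    // Chains at or above the threshold skip the sequence checks.
    if chain.score >= no_check_score {
        return true;
    }
    // Every other chain must pass both filters.
    passes_degeneracy(chain) && passes_repeat(chain)
}

/// Filter a list of chains; `Iterator::filter` keeps the input order.
fn filter_in_order(
    chains: Vec<Chain>,
    no_check_score: i64,
    passes_degeneracy: impl Fn(&Chain) -> bool,
    passes_repeat: impl Fn(&Chain) -> bool,
) -> Vec<Chain> {
    chains
        .into_iter()
        .filter(|c| keep_chain(c, no_check_score, &passes_degeneracy, &passes_repeat))
        .collect()
}
```

Passing the filters as callbacks keeps the sketch independent of how the sequence checks are actually computed.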

## Accepted sequence inputs

`--reference` and `--query` each accept:

- `.2bit`
- `.fa`
- `.fasta`
- `.fna`
- gzipped FASTA variants such as `.fa.gz` and `.fasta.gz`
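
A caller that wants to validate paths before invoking the command could classify them by extension, roughly as below. This is a sketch under assumptions: `SeqFormat` and `detect_format` are hypothetical names, matching is case-sensitive, `.fna.gz` is treated like the other gzipped FASTA variants, and the tool itself may detect formats differently.

```rust
/// Hypothetical classification of the accepted sequence inputs.
#[derive(Debug, PartialEq)]
enum SeqFormat {
    TwoBit,
    Fasta,
    GzippedFasta,
}

/// Classify a path by its extension; returns `None` for anything else.
fn detect_format(path: &str) -> Option<SeqFormat> {
    const FASTA_EXTS: [&str; 3] = [".fa", ".fasta", ".fna"];

    if path.ends_with(".2bit") {
        return Some(SeqFormat::TwoBit);
    }

    // Peel off an optional `.gz` suffix before checking FASTA extensions.
    let (stem, gzipped) = match path.strip_suffix(".gz") {
        Some(stem) => (stem, true),
        None => (path, false),
    };

    if FASTA_EXTS.iter().any(|ext| stem.ends_with(*ext)) {
        return Some(if gzipped {
            SeqFormat::GzippedFasta
        } else {
            SeqFormat::Fasta
        });
    }
    None
}
```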

Soft-masked lowercase bases from `.2bit` are preserved and used by the repeat filter.
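
Soft-masking marks repeat bases by writing them in lowercase, so a reader of the sequence can measure it directly. The helper below is a hypothetical illustration of that representation; it is not the scoring rule the repeat filter applies.

```rust
/// Fraction of bases in `seq` that are soft-masked (ASCII lowercase).
/// Returns 0.0 for an empty slice.
fn soft_masked_fraction(seq: &[u8]) -> f64 {
    if seq.is_empty() {
        return 0.0;
    }
    let masked = seq.iter().filter(|b| b.is_ascii_lowercase()).count();
    masked as f64 / seq.len() as f64
}
```

For example, `soft_masked_fraction(b"ACgtAC")` counts the two lowercase bases out of six.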

## Differences from UCSC `chainAntiRepeat`

- Uses descriptive named arguments instead of positional UCSC arguments.
- Supports stdin/stdout and optional gzip-compressed output.
- Processes chains in bounded batches and parallelizes filtering when built with the `parallel` feature, while preserving output order.
- Preserves the observed UCSC edge-case behavior for zero-match and zero-length filter cases instead of normalizing it.
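
Order-preserving parallel filtering of one batch can be sketched as follows. This uses scoped threads from the standard library as a stand-in; it is not the crate's `parallel` implementation, and `filter_batch_in_order`, `workers`, and `keep` are hypothetical names.

```rust
use std::thread;

/// Filter one batch in parallel while keeping the input order.
/// `keep` is a placeholder for the per-chain filter decision.
fn filter_batch_in_order<T, F>(batch: Vec<T>, workers: usize, keep: F) -> Vec<T>
where
    T: Sync,
    F: Fn(&T) -> bool + Sync,
{
    let workers = workers.max(1);
    // Ceiling division so every item lands in some chunk; never zero.
    let chunk_len = ((batch.len() + workers - 1) / workers).max(1);
    let keep = &keep;

    // Each worker computes keep/drop flags for one contiguous chunk.
    let flags: Vec<bool> = thread::scope(|s| {
        let handles: Vec<_> = batch
            .chunks(chunk_len)
            .map(|chunk| {
                s.spawn(move || chunk.iter().map(|item| keep(item)).collect::<Vec<bool>>())
            })
            .collect();
        // Joining in spawn order restores the original item order.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker thread panicked"))
            .collect()
    });

    // Apply the flags sequentially, so output order matches input order.
    batch
        .into_iter()
        .zip(flags)
        .filter(|(_, kept)| *kept)
        .map(|(item, _)| item)
        .collect()
}
```

Separating the parallel decision step from the sequential write step is what keeps the output order independent of thread scheduling.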