# Markup Link Extractor
Extracts links and/or anchors from markup files.
Currently, `markdown`/`md` and `html` files are supported.
The main purpose of the Markup Link Extractor
is to extract links from a set of files,
so that a separate tool can then check them for validity,
e.g. the [Markdown Link *Checker*](https://github.com/hoijui/mlc).
Together, two such tools can be integrated into your CI pipeline
to warn about broken links in your markup docs.
## Features
* Extracts links from `markdown`/`md` and `html` files
* Extracts anchors from `markdown`/`md` and `html` files. \
Anchors are parts of a file that can be linked to,
by appending the part's identifier/name to the file path/URL after a `#` (hash); \
e.g. `https://www.example.com/some-dir/some-file.html#sub-section`
* Supports HTML links and plain URLs in `markdown` files
* Command line interface following the first item of the [UNIX philosophy]:
"Make each program do one thing well". \
-> Therefore, this tool neither scans for markup files
nor checks the links itself.
* Easy [CI pipeline integration](#ci-pipeline-integration)
* Very fast execution using [async](https://rust-lang.github.io/async-book/)
* Operates offline, accessing only files on the local file-system
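
To illustrate the notion of an anchor from the list above, here is a minimal shell sketch that splits a link into its target and its anchor at the first `#`. The function name `split_link` is hypothetical and not part of *mle*; it only demonstrates the concept and says nothing about how *mle* parses links internally.

```bash
# Hypothetical helper (not part of mle): split a link at the first '#'.
# Prints the target on the first line and the anchor (possibly empty)
# on the second line.
split_link() {
    link="$1"
    target="${link%%\#*}"                 # everything before the first '#'
    case "$link" in
        *'#'*) anchor="${link#*\#}" ;;    # everything after the first '#'
        *)     anchor="" ;;               # the link carries no anchor
    esac
    printf '%s\n%s\n' "$target" "$anchor"
}

split_link 'https://www.example.com/some-dir/some-file.html#sub-section'
```

For the example URL this prints `https://www.example.com/some-dir/some-file.html` followed by `sub-section`.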
## Install Locally
There are different ways to install and use *mle*.
### Cargo
Use Rust's package manager [cargo](https://doc.rust-lang.org/cargo/)
to install *mle* from [crates.io](https://crates.io/crates/mle):
``` bash
cargo install mle
```
### Download Binaries
To download a compiled binary version of *mle*,
go to [GitHub releases](https://github.com/hoijui/mle/releases)
and download the binaries compiled for `x86_64-unknown-linux-gnu`
or `x86_64-apple-darwin`.
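
As a rough sketch of how a script might decide which of the two targets named above applies, the following maps the output of `uname` to a target name. The function `release_target` is a hypothetical helper, not something shipped with *mle*, and it makes no assumption about how the release assets themselves are named.

```bash
# Hypothetical helper (not part of mle): map the OS and CPU architecture
# to one of the two release targets mentioned above.
# Arguments are optional; they default to the values reported by `uname`.
release_target() {
    os="${1:-$(uname -s)}"
    arch="${2:-$(uname -m)}"
    case "$os/$arch" in
        Linux/x86_64)  echo 'x86_64-unknown-linux-gnu' ;;
        Darwin/x86_64) echo 'x86_64-apple-darwin' ;;
        *) echo "no prebuilt binary listed for $os/$arch" >&2; return 1 ;;
    esac
}

# On any other platform, fall back to building from source.
release_target || echo 'consider: cargo install mle'
```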
## CI Pipeline Integration
### GitHub Actions
Use *mle* on GitHub via the *GitHub Action*
from the [Marketplace](https://github.com/marketplace/actions/markup-link-checker-mle).
``` yaml
- name: Markup Link Extractor (mle)
  uses: hoijui/mle@v0.14.3
```
Pass *mle* [command line arguments](docs/reference.md) via the `with` key:
``` yaml
- name: Markup Link Extractor (mle)
  uses: hoijui/mle@v0.14.3
  with:
    args: ./README.md
```
### Binary
To integrate *mle* into a CI pipeline running in a *Linux x86_64 environment*,
you can add the following commands to download the tool:
``` bash
curl -L https://github.com/hoijui/mle/releases/download/v0.14.3/mle -o mle
chmod +x mle
```
For an example, take a look at the [ntest repo](
https://github.com/becheran/ntest/blob/master/.gitlab-ci.yml)
which uses *mle* in the CI pipeline.
### Docker
Use the *mle* Docker image from [Docker Hub](
https://hub.docker.com/repository/docker/hoijui/mle).
## Usage
Once you have *mle* installed,
it can be called from the command line.
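The following call extracts all links from the markup files
found under the current folder (including sub-directories).
It relies on recursive globbing (`**/`),
which zsh supports by default and bash supports after `shopt -s globstar`:
``` bash
mle ./**/*.{html,md}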
```
The next example extracts links from all git-tracked markup files,
except those whose path matches `README` or `LICENSE`,
and writes the result to stdout in CSV format:
``` bash
# explicit version
git ls-files **.{html,md} -z \
| grep --null-data --invert-match --ignore-case --regexp README --regexp LICENSE \
| xargs -0 mle --result-format csv
# same in short form
git ls-files **.{html,md} -z | grep -z -v -i -e README -e LICENSE | xargs -0 mle --result-format csv
```
Here we write the list of files to a file first,
and then pass that file to `mle`.
This is useful when the list of files is used multiple times,
or when it is very large,
potentially exceeding the shell's limit for arguments.
``` bash
git ls-files **.{html,md} -z | tr '\0' '\n' > /tmp/link-check_files.csv
mle --markup-files-list /tmp/link-check_files.csv
```
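
The argument limit mentioned above can be inspected with the POSIX command `getconf ARG_MAX`, which reports the number of bytes available for arguments plus environment. The sketch below compares the size of a file list against that limit; the function `fits_in_args` and the 50 % headroom are illustrative assumptions, not part of *mle*.

```bash
# Hypothetical helper (not part of mle): succeed if the newline-separated
# file list in "$1" is likely to fit on a single command line.
fits_in_args() {
    list_file="$1"
    limit="$(getconf ARG_MAX)"                 # bytes for arguments + environment
    size=$(( $(wc -c < "$list_file") ))        # bytes used by the listed paths
    # Assumed safety margin: use at most half the limit,
    # leaving the rest for the environment.
    [ "$size" -lt $((limit / 2)) ]
}

# Demo with a small temporary list of example paths.
list="$(mktemp)"
printf '%s\n' README.md docs/reference.md > "$list"
if fits_in_args "$list"; then
    echo 'small enough to pass as direct arguments'
else
    echo 'prefer: mle --markup-files-list <file>'
fi
rm -f "$list"
```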
Call *mle* with the `--help` flag to display all available CLI arguments:
``` bash
mle --help
```
## Funding
This project was funded by the European Regional Development Fund (ERDF)
in the context of the [INTERFACER Project](https://www.interfacerproject.eu/),
from July 2022 (fork from [`mlc`](https://github.com/becheran/mlc)/project start)
until March 2023.

[UNIX philosophy]: https://en.wikipedia.org/wiki/Unix_philosophy#Origin