areyouok 0.1.0

Fast CLI to scan Markdown/HTML/text files for broken links

🔍 AreYouOk

A fast, standalone Rust tool to scan files for broken or dead links. Works with Markdown, HTML, and any text files.

Features

  • Multi-format support - Scan Markdown, HTML, or any text files
  • Flexible filtering - Ignore specific directories and files
  • Multiple report formats - JSON, HTML, TXT, and GitHub-flavored Markdown
  • Fast link validation - Async HTTP checking with timeout protection
  • Cross-platform - Works on Linux, macOS, and Windows
  • Standalone binary - No runtime dependencies; download and run

Installation

Download Pre-built Binary

Visit the Releases page and download the binary for your platform:

  • Linux (amd64): areyouok-linux-amd64
  • macOS (amd64): areyouok-macos-amd64
  • Windows (amd64): areyouok-windows-amd64.exe

After downloading, make the binary executable (on Linux/macOS):

chmod +x areyouok-linux-amd64

Build from Source

Requires Rust 1.70+

git clone https://github.com/yourusername/areyouok.git
cd areyouok
cargo build --release

The binary will be at target/release/areyouok (or areyouok.exe on Windows).

Usage

Basic Usage

./areyouok ./docs

Scans all Markdown files in ./docs and subdirectories for broken links.

Command-line Options

USAGE:
    areyouok [OPTIONS] <PATH>

OPTIONS:
    -t, --type <TYPE>
        Type of files to scan [default: markdown]
        Options: markdown, html, text, or specific extensions (e.g., "md,txt")

    -i, --ignore <PATTERNS>
        Files or directories to ignore (comma-separated)
        Example: "node_modules,.git,_site"

    -r, --report <FORMAT>
        Report format to generate [default: txt]
        Options: json, html, txt, github

    -h, --help
        Print help information

ARGS:
    <PATH>
        Directory path to scan

Examples

Scan with file type filter

./areyouok -t html ./website

Scan with ignore patterns

./areyouok -i "node_modules,.git,_site,README.md" ./project

Generate HTML report

./areyouok -r html ./docs

Multiple file types

./areyouok -t "md,txt,html" ./content

Complete example (Jekyll blog)

./areyouok \
  -t markdown \
  -i "_layouts,.git,_site,README.md,USAGE.md,build.py" \
  -r html \
  ./blog

This writes an interactive HTML report to areyouok_report_YYYY-MM-DD_HH-MM-SS.html, where the suffix is the timestamp of the run.

Report Formats

TXT (Default)

Human-readable text format with summary and broken links listing.

JSON

Structured JSON output containing all link validation results. Useful for CI/CD pipelines and automation.

[
  {
    "url": "https://example.com",
    "status_code": 404,
    "is_valid": false,
    "error_message": "HTTP 404",
    "files": [
      {
        "file": "docs/index.md",
        "line": 42
      }
    ]
  }
]
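For CI use, a report in this shape can be post-processed with a short script. The sketch below is a hypothetical helper, not part of AreYouOk: the function name `summarize_broken` is illustrative, and it assumes the report contains exactly the fields shown above.

```python
import json


def summarize_broken(report_text):
    """Return one 'file:line -> url (reason)' string per broken-link occurrence."""
    results = json.loads(report_text)
    lines = []
    for entry in results:
        if entry.get("is_valid"):
            continue  # skip links that validated successfully
        # Fall back to the status code when no error message is present.
        reason = entry.get("error_message") or f"HTTP {entry.get('status_code')}"
        for location in entry.get("files", []):
            lines.append(
                f"{location['file']}:{location['line']} -> {entry['url']} ({reason})"
            )
    return lines


# Sample input copied from the JSON example above.
SAMPLE = """[{"url": "https://example.com", "status_code": 404,
  "is_valid": false, "error_message": "HTTP 404",
  "files": [{"file": "docs/index.md", "line": 42}]}]"""

for line in summarize_broken(SAMPLE):
    print(line)
```

In a pipeline, a non-empty result could be turned into a non-zero exit code so that the job fails when broken links are present.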

HTML

Visual, interactive report with statistics and color-coded results. Perfect for web viewing.

GitHub

GitHub-flavored Markdown format compatible with GitHub Issues and Pull Requests.

CI/CD Integration

GitHub Actions

Create .github/workflows/link-check.yml:

name: Check Links

on:
  schedule:
    - cron: '0 0 * * 0'  # Weekly on Sunday
  workflow_dispatch:

jobs:
  check-links:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      
      - name: Download AreYouOk
        run: |
          curl -L https://github.com/yourusername/areyouok/releases/download/v0.1.0/areyouok-linux-amd64 -o areyouok
          chmod +x areyouok
      
      - name: Check links
        run: ./areyouok -r html .  # HTML output matches the upload path below
        continue-on-error: true
      
      - name: Upload report
        uses: actions/upload-artifact@v4
        with:
          name: link-report
          path: areyouok_report_*.html

Generic CI/CD (Every 30 days)

#!/bin/bash
set -e

# Download fresh copy
curl -L https://github.com/yourusername/areyouok/releases/download/v0.1.0/areyouok-linux-amd64 -o areyouok
chmod +x areyouok

# Run scan
./areyouok -r github . > report.md || true

# Upload/notify on broken links
if grep -q "Broken Links" report.md; then
  echo "⚠️ Found broken links in the repository"
  # Send notification, create issue, etc.
fi

How It Works

  1. Scan - Recursively walks through directories, identifying files matching your criteria
  2. Extract - Uses regex patterns to find URLs in Markdown, HTML, and text files
  3. Validate - Makes HEAD requests to check if URLs are accessible
  4. Report - Generates a report in your chosen format with results and locations
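As an illustration of step 1, the sketch below walks a directory tree, prunes ignored names, and keeps files with matching extensions. It is written in Python for brevity and is not AreYouOk's Rust implementation. The function name `find_files` is hypothetical, and matching ignore entries against exact file or directory names is an assumption about how `--ignore` behaves.

```python
import os


def find_files(root, extensions=(".md",), ignore=()):
    """Yield paths under root whose extension matches, skipping ignored names."""
    ignored = set(ignore)
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune ignored directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames if d not in ignored]
        for name in sorted(filenames):
            if name in ignored:
                continue  # assumed: ignore entries match exact file names
            if name.lower().endswith(tuple(extensions)):
                yield os.path.join(dirpath, name)
```

Pruning `dirnames` in place is what keeps large ignored trees such as node_modules from being traversed at all.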

Supported Link Formats

  • Markdown links: [text](https://example.com)
  • HTML attributes: href="https://example.com"
  • Plain URLs: https://example.com
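The three formats above can be matched with fairly small regular expressions. The patterns below are assumptions written for illustration in Python; AreYouOk's actual regexes are not shown in this document and may differ, for example in how they treat trailing punctuation.

```python
import re

# Assumed patterns, for illustration only; the tool's real regexes may differ.
MARKDOWN_LINK = re.compile(r"\[[^\]]*\]\((https?://[^\s)]+)\)")
HTML_HREF = re.compile(r"""href=["'](https?://[^"']+)["']""")
PLAIN_URL = re.compile(r"https?://[^\s<>\"')\]]+")


def extract_links(text):
    """Return (line_number, url) pairs, reporting each URL once per line."""
    found = []
    for number, line in enumerate(text.splitlines(), start=1):
        seen = set()
        for pattern in (MARKDOWN_LINK, HTML_HREF, PLAIN_URL):
            for match in pattern.finditer(line):
                # Patterns with a capture group keep only the URL part.
                url = match.group(1) if pattern.groups else match.group(0)
                if url not in seen:
                    seen.add(url)
                    found.append((number, url))
    return found
```

Recording the line number alongside each URL is what makes it possible to report locations like the "file" and "line" fields in the JSON format.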

Performance

  • Fast scanning - Walks directories efficiently, skips ignored paths
  • Async validation - Checks multiple links in parallel
  • Deduplication - Avoids re-checking duplicate URLs
  • Timeout protection - 10-second timeout per link by default (no CLI flag; change it in the source)
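Deduplication and parallel checking can be sketched as follows. This is a Python illustration under stated assumptions, not the tool's code: AreYouOk uses async Rust, whereas this sketch substitutes a thread pool, and the names `head_status`, `check_links`, and the `workers` value are hypothetical. The 10-second timeout mirrors the default described above.

```python
import urllib.error
import urllib.request
from concurrent.futures import ThreadPoolExecutor

TIMEOUT_SECONDS = 10  # mirrors the default per-link timeout


def head_status(url):
    """Send a HEAD request and return (status_code, error_message)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=TIMEOUT_SECONDS) as response:
            return response.status, None
    except urllib.error.HTTPError as error:
        return error.code, f"HTTP {error.code}"
    except OSError as error:  # DNS failure, TLS error, timeout, refused connection
        return None, str(error)


def check_links(occurrences, checker=head_status, workers=8):
    """Check each distinct URL once and map url -> (status_code, error_message).

    `occurrences` is an iterable of (file, line, url) tuples.
    """
    unique_urls = sorted({url for _file, _line, url in occurrences})
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(checker, unique_urls))
    return dict(zip(unique_urls, results))
```

Passing the checker as a parameter keeps the deduplication logic testable without network access: a stub function can stand in for `head_status`.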

Troubleshooting

SSL/Certificate errors

Sites with invalid or expired SSL/TLS certificates fail validation, and the tool reports their links as broken.

Timeout errors

If a site responds slowly, the request may time out (the default is 10 seconds per link). To allow more time, adjust the request timeout in the source code and rebuild.

Rate limiting

Some servers rate-limit requests. Space out link checks by running the tool at different times.

License

See LICENSE file for details.

Contributing

Contributions welcome! Please submit issues and pull requests.


AreYouOk v0.1.0 - Made with 🦀 Rust