exine 0.1.1

Universal Markdown extraction engine. 37+ formats, zero external dependencies, 10-96× faster than Pandoc.
Documentation
# Contributing to Exine

Thank you for your interest in contributing to Exine! This document provides guidelines for contributing.

## Contribution Guidelines

Exine's core principle is **zero external dependencies** for format extraction.
PRs that introduce mandatory external service dependencies will not be merged.

Vision AI and OCR remain optional feature-gated additions.

**What we welcome:**
- New format extractors
- Performance improvements
- Bug fixes for existing extractors
- Better test coverage
- Documentation improvements

**What we'll decline:**
- Anything that requires external services as a core dependency
- Changes that would break the zero-dependency extraction guarantee

## Getting Started

### Prerequisites

- **Rust toolchain** — Install via [rustup]https://rustup.rs/
- **Optional**: Tesseract (`brew install tesseract` on macOS) for OCR tests

### Building

```bash
git clone https://github.com/nma-vc/exine.git
cd exine
cargo build
```

### Running Tests

```bash
# All library tests
cargo test --lib

# All tests including integration
cargo test
```

### Code Quality

```bash
# Linting
cargo clippy -- -D warnings

# Formatting
cargo fmt

# Check formatting (CI uses this)
cargo fmt -- --check
```

## Making Changes

1. **Fork** the repository and create a feature branch
2. **Write tests** for any new functionality
3. **Run the full check suite** before submitting:
   
   ```bash
   cargo fmt -- --check
   cargo clippy -- -D warnings
   cargo test --lib
   cargo build
   ```

4. **Submit a pull request** with a clear description of what you changed and why

## Adding a New Extractor

1. Create `src/extract/your_format.rs` implementing the `Extractor` trait
2. Register the module in `src/extract/mod.rs`
3. Add the extension(s) to `extract_by_extension()` in `mod.rs`
4. Add tests — see `src/extract/docx.rs` for a comprehensive example
5. Update the format count in `README.md` and `CHANGELOG.md`

## Code Style

- Follow standard Rust conventions (`rustfmt` defaults)
- Use `thiserror` for error types via `ExineError`
- Prefer returning `Result<T>` over panicking
- Document public APIs with `///` doc comments
- Keep dependencies minimal — prefer the Rust stdlib when practical

## Reporting Issues

- Use GitHub Issues for bug reports and feature requests
- Include reproduction steps, expected vs actual behavior, and your Rust version

## License

By contributing, you agree that your contributions will be licensed under the [MIT License](LICENSE).