pdf_oxide 0.3.22

The fastest Rust PDF library with text extraction: 0.8ms mean, 100% pass rate on 3,830 PDFs. 5× faster than pdf_extract, 17× faster than oxidize_pdf. Extract, create, and edit PDFs.
Documentation
# PDF Library Scripts

Utility scripts for development, testing, and analysis.

## Cleanup

### `cleanup_for_release.sh`
Cleans repository for public release by removing development artifacts.

```bash
./scripts/cleanup_for_release.sh
```

**What it does:**
- Removes development markdown files from root
- Removes Python comparison scripts from root
- Removes log and JSON analysis files
- Cleans docs/development/sessions/
- Cleans docs/issues/
- Prunes docs/quality/ to keep only summaries

## Download Scripts

### `download/`
Scripts for downloading test PDFs from various sources.

Located in `../test_datasets/download.py`

## Analysis Scripts

Utility scripts for analyzing PDF content and quality.

### Usage
Most scripts are Python-based:
```bash
python scripts/analyze_something.py input.pdf
```

## Python type stubs

Type stub files (`.pyi`) are generated from the Rust PyO3 source with [Rylai](https://github.com/monchin/Rylai) (no compilation). Use this after changing the Python API in `src/python.rs`:

```bash
# From project root (runs uvx rylai -o python/pdf_oxide/)
pdm run stub_gen
```

Config is in `rylai.toml` at the crate root. Output is written under `python/pdf_oxide/` and bundled into the wheel by maturin.

## Benchmark Scripts

Performance benchmarking and comparison utilities.

### Running Benchmarks
```bash
cargo bench
```

For detailed benchmarking, see `../test_datasets/` and comparison scripts.

## Running Python Scripts
```bash
# Install corresponding python dependencies
uv sync --group {group_name} # Refer to [dependency-groups] in pyproject.toml
uv run scripts/{script_name}.py {args}
```


## Development

These scripts are for development purposes and are not part of the public API.

## Contributing

When adding new scripts:
1. Add clear documentation in this README
2. Include usage examples
3. Add error handling
4. Follow existing patterns

See [../CONTRIBUTING.md](../CONTRIBUTING.md) for general guidelines.

## License

All scripts are licensed under MIT OR Apache-2.0, same as the main library.