unpdf 0.4.3 - Docs.rs

# unpdf

Python bindings for [unpdf](https://github.com/iyulab/unpdf) - High-performance PDF content extraction to Markdown, text, and JSON.

## Installation

```bash
pip install unpdf
```

## Quick Start

```python
import unpdf

# Convert PDF to Markdown
markdown = unpdf.to_markdown("document.pdf")
print(markdown)

# Convert PDF to plain text
text = unpdf.to_text("document.pdf")
print(text)

# Convert PDF to JSON
json_data = unpdf.to_json("document.pdf", pretty=True)
print(json_data)

# Get document information
info = unpdf.get_info("document.pdf")
print(info)

# Get page count
pages = unpdf.get_page_count("document.pdf")
print(f"Total pages: {pages}")

# Check if file is a valid PDF
is_valid = unpdf.is_pdf("document.pdf")
print(f"Is valid PDF: {is_valid}")
```

## API Reference

### `to_markdown(path: str) -> str`
Convert a PDF file to Markdown format.

### `to_text(path: str) -> str`
Convert a PDF file to plain text.

### `to_json(path: str, pretty: bool = False) -> str`
Convert a PDF file to JSON format.

### `get_info(path: str) -> dict`
Get document metadata (title, author, page count, etc.)

### `get_page_count(path: str) -> int`
Get the number of pages in a PDF file.

### `is_pdf(path: str) -> bool`
Check if a file is a valid PDF.

### `version() -> str`
Get the version of the native library.

## License

MIT License