oxidize-pdf-api 1.1.7

REST API for oxidizePdf (Community edition)
Documentation
# oxidize-pdf-api

REST API for oxidizePdf - A high-performance PDF manipulation library written in Rust.

## Features

- **RESTful API**: Clean and intuitive HTTP endpoints for PDF operations
- **Async Processing**: Built on Tokio for high-performance async I/O
- **File Upload Support**: Multipart form data handling for PDF uploads
- **CORS Support**: Cross-origin resource sharing for web applications
- **Community Edition**: Full-featured API for common PDF operations

## Installation

### From Source
```bash
git clone https://github.com/yourusername/oxidize-pdf
cd oxidize-pdf/oxidize-pdf-api
cargo build --release
```

### Using Cargo
```bash
cargo install oxidize-pdf-api
```

## Running the Server

```bash
# Default port 3000
oxidize-pdf-api

# Custom port
oxidize-pdf-api --port 8080

# With environment variables
OXIDIZEPDF_API_PORT=8080 RUST_LOG=info oxidize-pdf-api
```

## API Endpoints

### Health Check
```
GET /health
```
Returns server status and version information.

### Merge PDFs
```
POST /api/merge
Content-Type: multipart/form-data
```

Merge multiple PDF files into a single document.

**Request:**
- `files`: Multiple PDF files (form field name: `file`)
- `preserve_metadata`: Boolean (optional, default: true)
- `optimize`: Boolean (optional, default: false)

**Response:**
```json
{
  "success": true,
  "pages": 42,
  "size": 1048576,
  "processing_time_ms": 150
}
```

Returns the merged PDF file with appropriate headers.

### Split PDF
```
POST /api/split
Content-Type: multipart/form-data
```

Split a PDF into individual pages or chunks.

**Request:**
- `file`: PDF file to split
- `mode`: "pages" | "chunks" (optional, default: "pages")
- `chunk_size`: Number (optional, for chunks mode)

**Response:** ZIP file containing split PDFs

### Extract Pages
```
POST /api/extract
Content-Type: multipart/form-data
```

Extract specific pages from a PDF.

**Request:**
- `file`: Source PDF file
- `pages`: Page range (e.g., "1-5,10,15-20")

**Response:** PDF containing only the specified pages

### Rotate Pages
```
POST /api/rotate
Content-Type: multipart/form-data
```

Rotate pages in a PDF.

**Request:**
- `file`: PDF file
- `angle`: 90 | 180 | 270
- `pages`: Page range (optional, rotates all if not specified)

**Response:** PDF with rotated pages

### Extract Text
```
POST /api/text
Content-Type: multipart/form-data
```

Extract text content from a PDF.

**Request:**
- `file`: PDF file
- `pages`: Page range (optional)
- `format`: "plain" | "json" (optional, default: "plain")

**Response:**
- Plain text or JSON structure with page-by-page text

### Get PDF Info
```
POST /api/info
Content-Type: multipart/form-data
```

Get metadata and information about a PDF.

**Request:**
- `file`: PDF file

**Response:**
```json
{
  "pages": 10,
  "title": "Document Title",
  "author": "Author Name",
  "subject": "Document Subject",
  "keywords": "pdf, document",
  "creator": "oxidize_pdf",
  "producer": "oxidize_pdf v0.1.0",
  "creation_date": "2024-01-01T12:00:00Z",
  "modification_date": "2024-01-15T14:30:00Z",
  "file_size": 1048576,
  "pdf_version": "1.7"
}
```

## Example Usage

### cURL
```bash
# Merge PDFs
curl -X POST http://localhost:3000/api/merge \
  -F "file=@doc1.pdf" \
  -F "file=@doc2.pdf" \
  -F "file=@doc3.pdf" \
  -o merged.pdf

# Extract pages
curl -X POST http://localhost:3000/api/extract \
  -F "file=@input.pdf" \
  -F "pages=1-5,10" \
  -o extracted.pdf

# Get PDF info
curl -X POST http://localhost:3000/api/info \
  -F "file=@document.pdf"
```

### JavaScript/Fetch
```javascript
// Merge PDFs
const formData = new FormData();
formData.append('file', file1);
formData.append('file', file2);
formData.append('preserve_metadata', 'true');

const response = await fetch('http://localhost:3000/api/merge', {
  method: 'POST',
  body: formData
});

const blob = await response.blob();
// Handle the merged PDF blob
```

### Python
```python
import requests

# Merge PDFs
files = [
    ('file', open('doc1.pdf', 'rb')),
    ('file', open('doc2.pdf', 'rb'))
]

response = requests.post(
    'http://localhost:3000/api/merge',
    files=files
)

with open('merged.pdf', 'wb') as f:
    f.write(response.content)
```

## Configuration

Environment variables:
- `OXIDIZEPDF_API_PORT`: Server port (default: 3000)
- `OXIDIZEPDF_API_HOST`: Server host (default: 0.0.0.0)
- `OXIDIZEPDF_MAX_FILE_SIZE`: Maximum upload size in bytes (default: 100MB)
- `RUST_LOG`: Logging level (trace, debug, info, warn, error)

## Error Handling

All endpoints return consistent error responses:

```json
{
  "error": "Error message",
  "code": "ERROR_CODE",
  "details": "Additional error details"
}
```

HTTP status codes:
- 200: Success
- 400: Bad Request (invalid input)
- 413: Payload Too Large
- 415: Unsupported Media Type
- 500: Internal Server Error

## Security

- File size limits to prevent DoS
- Input validation for all parameters
- Temporary file cleanup after processing
- CORS configuration for web security

## License

This project is licensed under the MIT License - see the LICENSE file for details.

## See Also

- [oxidize-pdf]https://crates.io/crates/oxidize-pdf - Core PDF library
- [oxidize-pdf-cli]https://crates.io/crates/oxidize-pdf-cli - CLI tool
- [API Documentation]https://github.com/yourusername/oxidize-pdf/blob/main/oxidize-pdf-api/API_DOCUMENTATION.md - Detailed API docs