rasterrocket
Pure Rust PDF → pixels pipeline. Zero Poppler, zero subprocesses, zero Leptonica in the render path.
Renders PDF pages to 8-bit grayscale pixel buffers for direct consumption by Tesseract OCR or any other downstream consumer. No intermediate files.
# Cargo.toml — library
= "1.0"
# CLI — drop-in pdftoppm replacement
What's new in v1.0.0
- Spec-correct simple-font text. The
Widths-array lookup now prevails over FreeType metrics for all embedded and non-embedded simple fonts, per PDF §9.2.4. Academic and English-language PDFs show the largest improvement: avg RMSE vs pdftoppm dropped 17–19 points on corpus-05 and corpus-14. - Vulkan compute backend.
--features vulkanor--backend vulkanruns AA fill, tile fill, and parallel-Huffman JPEG decode on any Vulkan 1.3+ device — NVIDIA, AMD, Intel, or Apple viaMoltenVK. Verified on RTX 5070; cross-vendor smoke pending hardware. Underauto, Vulkan is preferred over CUDA when both are compiled in (faster process init on single-session workloads). - Process-static GPU init. The CUDA and Vulkan contexts are initialised once per process (not once per
open_session). Short-lived multi-document pipelines no longer pay ~240 ms per document. PDF_RASTER_BACKENDenv var. Switch backends at runtime without recompiling. Valid values:auto,cpu,cuda,vaapi,vulkan. The CLI--backendflag takes precedence.- Parallel-Huffman JPEG. GPU-accelerated Huffman decode (
gpu-jpeg-huffmanfeature, implied byvulkan) is wired into the production decode path. Dormant by default (threshold =u32::MAX); enable withPDF_RASTER_HUFFMAN_THRESHOLD=0for benchmarking. RasterOptions::default()—dpi = 300,first_page = 1,last_page = u32::MAX,deskew = false,pages = None. Use..RasterOptions::default()to fill unset fields.PageSet— render a sparse subset of pages without visiting intermediate ones.
use ;
let opts = RasterOptions ;
for in raster_pdf
Documentation
| Document | Contents |
|---|---|
| Getting Started | Installation, quickstart, Tesseract integration, DPI guidance, error handling, security |
| API Reference | Full signatures for raster_pdf, render_channel, RasterOptions, RenderedPage, RasterError, PageDiagnostics, feature flags, GPU dispatch thresholds |
| CLI Reference | All rrocket command-line flags, output format matrix, examples, pixel-diff comparison |
| Benchmarks | Methodology, 10-document corpus results, CPU-only AVX-512 vs AVX2, GPU-accelerated, reproduction steps |
| OCR Integration | Tesseract (leptess) and ocrs — instance reuse, zero-copy patterns, DPI wiring, multi-threaded examples |
| LLM Vision OCR Integration | Google Cloud Vision and GPT-5 — encoding helper, Rust + Python examples, cost and latency guidance |
Crates
| Crate | What you get |
|---|---|
rasterrocket |
Library — raster_pdf, render_channel, RasterOptions, RenderedPage |
rasterrocket-cli |
rrocket binary — drop-in pdftoppm replacement |
Hardware compatibility
CPU: x86-64 (AMD and Intel) and aarch64 (ARM). AVX2/AVX-512 on x86-64; NEON (and SVE2 on nightly) on aarch64. Build with -C target-cpu=native to enable AVX-512 or native NEON width.
GPU (optional):
- NVIDIA via CUDA 12 or 13 — full feature set (nvJPEG, nvJPEG2000, AA fill, ICC CLUT, ICC matrix, deskew, image cache).
cudarcis pinned to thecuda-12080driver-API binding so the same source builds against both 12.x and 13.x drivers (forward-compatible per the CUDA driver-API ABI). - Cross-vendor via Vulkan compute — AA fill, tile fill, and parallel-Huffman JPEG decode kernels run on any Vulkan 1.3+ device (NVIDIA, AMD, Intel, Apple via
MoltenVK). Verified on RTX 5070; cross-vendor smoke pending hardware. No nvJPEG / cache support under Vulkan today (JPEG decode goes through the GPU parallel-Huffman path, not nvJPEG). - Linux iGPU/dGPU via VA-API — JPEG baseline decode on AMD VCN, Intel Quick Sync, Intel Arc.
All GPU features fall back to CPU automatically when unavailable. AMD/Radeon ROCm and Apple Metal-native backends are not implemented (Vulkan covers Apple via MoltenVK).
Build
# CPU-only (no CUDA)
# With all GPU features (CUDA 12 or 13 toolkit, NVIDIA GPU required)
# Default CUDA_ARCH is sm_80 (Ampere); override for older or newer GPUs.
CUDA_ARCH=sm_120
# With Vulkan compute backend (cross-vendor; no NVIDIA dependency).
# Requires the LunarG Vulkan SDK on the build host (slangc compiles the
# .slang shaders to SPIR-V). Vulkan 1.3+ ICD on the runtime host.
Picking CUDA_ARCH for your GPU
The CUDA_ARCH environment variable controls which Compute Capability the PTX kernels target. Mismatched arch flags produce kernels the GPU can't load at runtime. Set it to your card's CC (e.g. sm_75, sm_86, sm_120).
| GPU generation | Architecture | CUDA_ARCH |
|---|---|---|
| GTX 10-series | Pascal | sm_61 |
| RTX 20-series, Quadro RTX | Turing | sm_75 |
| RTX 30-series, A100 | Ampere | sm_80 / sm_86 |
| RTX 40-series | Ada Lovelace | sm_89 |
| H100 / Hopper | Hopper | sm_90 |
| RTX 50-series | Blackwell | sm_120 |
Look up your card's exact Compute Capability at developer.nvidia.com/cuda-gpus. The build defaults to sm_80 if CUDA_ARCH is unset; that's a reasonable fallback for any Ampere-or-later card thanks to PTX forward-compatibility, but matching your hardware exactly produces better-optimised code.
Feature flags
| Flag | What it enables | Required runtime |
|---|---|---|
nvjpeg |
GPU JPEG decode for DCTDecode |
libnvjpeg.so (ships with CUDA 12 or 13 toolkit) |
nvjpeg2k |
GPU JPEG-2000 decode for JPXDecode |
libnvjpeg2k.so |
gpu-aa |
GPU supersampled anti-aliased fill | CUDA |
gpu-icc |
GPU CMYK→RGB ICC transform | CUDA |
gpu-deskew |
GPU deskew rotation via NPP | CUDA + NPP |
cache |
Device-resident image cache (3-tier VRAM/host/disk) | CUDA |
vaapi |
Linux iGPU/dGPU JPEG decode (AMD/Intel) | libva.so.2 + DRM render node |
vulkan |
Vulkan compute backend for AA fill, tile fill, and parallel-Huffman JPEG decode (cross-vendor) | Vulkan 1.3+ ICD; pulls in gpu-aa and gpu-jpeg-huffman. Slang shaders compiled to SPIR-V via slangc from the LunarG Vulkan SDK |
All GPU features fall back to CPU automatically when the runtime requirement is missing, except --backend cuda / --backend vulkan / --backend vaapi which fail loudly with a clear error.
Backend selection
The runtime backend is chosen from three sources, in priority order:
- The CLI
--backend {auto,cpu,cuda,vaapi,vulkan}flag. - The
PDF_RASTER_BACKENDenvironment variable (same valid values). - The compile-time default —
auto.
Under auto, when both backends are compiled in, Vulkan is preferred over CUDA. Vulkan's per-process init is faster and the kernel dispatch is comparable on the workloads that matter; CUDA wins narrowly when the device-resident cache feature is firing and amortising across many pages from one session. Both backends fall through to CPU when their runtime is unavailable; --backend cuda / --backend vulkan make the failure loud instead.
# Ship a Vulkan-default binary, override per-process when you need CUDA:
PDF_RASTER_BACKEND=cuda
# CLI flag always wins over the env var:
PDF_RASTER_BACKEND=cuda
Compile cache (sccache) — optional
.cargo/config.toml sets SCCACHE_CACHE_MULTIARCH=1 so -C target-cpu=native builds can be cached, but the wrapper itself is opt-in: plain cargo build works without sccache. To enable:
# add to ~/.bashrc to make it permanent
Shared with any other Rust project on the same machine that opts in — cross-project cache keys don't collide (sccache hashes the full compiler args). Verify with sccache --show-stats (a healthy hit-rate after the first build is ≥ 70%). Do not rsync ~/.cache/sccache between machines with different CPUs — target-cpu=native resolves differently per arch.
Testing
# Unit tests (always filter by module, never run unfiltered)
# Pixel-diff comparison against pdftoppm (requires release build in PATH)
Performance
Benchmarks vs Poppler's pdftoppm on a 10-document corpus at 150 DPI. Full methodology, hardware details, and AVX2 vs AVX-512 comparison in the Benchmarks wiki page.
CPU-only (no GPU), Ryzen 9 9900X3D + AVX-512, v0.9.1, RAM-backed output, cold cache, hyperfine 5 runs:
| Document | Pages | rasterrocket |
|---|---|---|
| Native text, small | 16 | 41 ms ± 1 ms |
| Native vector + text | 16 | 18 ms ± 1 ms |
| Native text, dense | 254 | 231 ms ± 2 ms |
| Ebook, mixed | 358 | 278 ms ± 3 ms |
| Academic book | 601 | 582 ms ± 12 ms |
| Modern layout, DCT | 160 | 1 450 ms ± 10 ms |
| Journal, DCT-heavy | 162 | 783 ms ± 5 ms |
| 1927 scan, DCT | 390 | 1 652 ms ± 89 ms |
| 1836 scan, DCT | 490 | 2 859 ms ± 658 ms |
| Scan, JBIG2+JPX | 576 | 17 616 ms ± 260 ms |
Per-version regression history and the full pdftoppm comparison are in the Benchmarks wiki page.
GPU-accelerated (CUDA: nvJPEG + nvJPEG2000), same machine + RTX 5070 (v0.9.1):
| Document | Pages | rasterrocket | pdftoppm | Speedup |
|---|---|---|---|---|
| Native text, dense | 254 | 4.3 s | 9.8 s | 2.3× |
| 1927 scan, DCT | 390 | 50 s | 279 s | 5.6× |
| 1836 scan, DCT | 490 | 71 s | 356 s | 5.0× |
| Scan, JBIG2+JPX | 576 | 19.6 s | 148.9 s | 7.6× |
Largest gains on scan-heavy corpora where SIMD JPEG decoding (CPU) and nvJPEG/nvJPEG2000 (GPU) dominate. Short native-text PDFs are startup-bound and show modest gains. See the Benchmarks wiki page for the full table including an Intel i7-8700K (AVX2-only) comparison.