§harumi
A pure-Rust library for overlaying text onto existing PDFs, with full support for CJK (Japanese / Chinese / Korean) fonts.
§Motivation
Rust lacks a high-level, zero-C-dependency library for injecting text into
existing PDFs. Low-level crates like lopdf expose the raw PDF object graph
and require manual CID font assembly. harumi wraps that complexity behind
a simple, ergonomic API.
§Quick start
use harumi::{Document, TextRun};
let mut doc = Document::from_file("scanned.pdf")?;
let font = doc.embed_font(include_bytes!("../tests/fixtures/NotoSansJP-Regular.ttf"))?;
// Invisible OCR text layer
doc.page(1)?.add_invisible_text("日本語テキスト", font, [72.0, 700.0], 12.0)?;
// Visible red label
doc.page(1)?.add_text("CONFIDENTIAL", font, [72.0, 750.0], 18.0, [0.8, 0.0, 0.0])?;
doc.save("output.pdf")?;
§Coordinate system
All coordinates are in PDF points (1 pt = 1/72 inch). The origin is at
the bottom-left of the page. Use page.size() to
query the page dimensions and position text relative to them.
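As a rough illustration of the arithmetic involved, the sketch below converts physical units to points and turns a position measured from the top-left corner into bottom-left-origin coordinates. The helper names (`inches_to_points`, `mm_to_points`, `from_top_left`) are hypothetical and not part of harumi. The page height is passed in as a plain number because the return type of page.size() is not shown here, and the use of `f64` is likewise an assumption.

```rust
/// Points per inch in PDF user space (1 pt = 1/72 inch).
const POINTS_PER_INCH: f64 = 72.0;

/// Convert inches to PDF points. (Hypothetical helper, not a harumi API.)
fn inches_to_points(inches: f64) -> f64 {
    inches * POINTS_PER_INCH
}

/// Convert millimetres to PDF points (25.4 mm = 1 inch).
fn mm_to_points(mm: f64) -> f64 {
    mm / 25.4 * POINTS_PER_INCH
}

/// Turn a position measured from the top-left corner of the page into
/// PDF coordinates, whose origin is the bottom-left corner.
/// `page_height` is the page height in points.
fn from_top_left(page_height: f64, x: f64, y_from_top: f64) -> [f64; 2] {
    [x, page_height - y_from_top]
}
```

For a US Letter page (612 × 792 pt), a point one inch in from the left and one inch down from the top comes out as `[72.0, 720.0]`, which has the same shape as the position arrays used in the quick start.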
§Font subsetting
embed_font stores the raw TTF bytes without
processing them. At save time, harumi collects every
character used across all pages, runs a single subsetting pass per font, and
embeds the result. The subsetting cost is therefore paid once per font, no
matter how many pages or text runs reference it.
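The collection step described above can be pictured with a small sketch. This is not harumi's internal code: `FontId`, `QueuedText`, and `collect_used_chars` are hypothetical names invented for illustration, and the actual subsetting of the font bytes is left out. It only shows how one character set per font can be gathered across all pages before a single subsetting pass.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical stand-in for a font handle; not harumi's FontHandle.
type FontId = usize;

/// Hypothetical record of one queued piece of text on a page.
struct QueuedText {
    font: FontId,
    text: String,
}

/// Walk every page once and gather, per font, the set of characters used.
/// Each font would then be subset exactly once against its own set.
fn collect_used_chars(pages: &[Vec<QueuedText>]) -> BTreeMap<FontId, BTreeSet<char>> {
    let mut used: BTreeMap<FontId, BTreeSet<char>> = BTreeMap::new();
    for page in pages {
        for run in page {
            // Duplicate characters collapse automatically in the set.
            used.entry(run.font).or_default().extend(run.text.chars());
        }
    }
    used
}
```

Because the map is keyed by font, the number of subsetting passes in this sketch depends only on the number of distinct fonts, not on the number of pages or text runs.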
§Feature flags
| Flag | What it enables |
|---|---|
| ocr | [ocr] module: helpers for converting Tesseract/hOCR pixel coordinates to PDF points |
| draw | [PageHandle::add_rect], [PageHandle::add_line]: filled rectangles and stroked lines (no extra dependencies) |
| image | [PageHandle::add_image], [PageHandle::add_image_with_opacity]: JPEG/PNG raster image overlay (enables draw, adds image crate) |
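To give a sense of the kind of conversion the ocr flag is concerned with, here is a minimal sketch of mapping an hOCR bounding box to PDF points. It does not use the ocr module, whose actual functions are not listed here. The function `hocr_bbox_to_pdf` is hypothetical, and it assumes the usual hOCR convention of `x0 y0 x1 y1` pixel coordinates with a top-left origin, plus a known scan resolution in DPI.

```rust
/// Convert an hOCR bounding box `[x0, y0, x1, y1]` (pixels, top-left origin)
/// into PDF points with a bottom-left origin.
/// Returns `[left, bottom, right, top]`.
/// (Hypothetical helper for illustration; not part of harumi.)
fn hocr_bbox_to_pdf(bbox: [f64; 4], dpi: f64, page_height_pt: f64) -> [f64; 4] {
    // One pixel at `dpi` dots per inch is 72 / dpi points.
    let scale = 72.0 / dpi;
    let [x0, y0, x1, y1] = bbox;
    [
        x0 * scale,
        // The lower pixel edge (y1) becomes the bottom edge after the flip.
        page_height_pt - y1 * scale,
        x1 * scale,
        // The upper pixel edge (y0) becomes the top edge after the flip.
        page_height_pt - y0 * scale,
    ]
}
```

For a 300 DPI scan of a 792 pt tall page, the pixel box `[300, 150, 600, 300]` maps to roughly `[72, 720, 144, 756]` in points.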
Structs§
- Document
- An existing PDF document that can be annotated with text overlays.
- FontHandle
- Opaque handle to a font registered with [Document::embed_font].
- PageHandle
- A handle to a specific page for queuing text overlays.
- PdfMetadata
- PDF /Info dictionary fields.
- TextFragment
- A text fragment extracted from a page content stream.
- TextRun
- A single text placement descriptor for use with PageHandle::add_invisible_text_runs.
Enums§
- Error
- Errors returned by harumi operations.
- VerticalAlign
- Vertical alignment for PageHandle::add_text_box_aligned.
Type Aliases§
- Result
- Alias for std::result::Result<T, harumi::Error>.