Crate harumi

§harumi

A pure-Rust library for overlaying text onto existing PDFs, with full support for CJK (Japanese / Chinese / Korean) fonts.

§Motivation

Rust lacks a high-level, zero-C-dependency library for injecting text into existing PDFs. Low-level crates like lopdf expose the raw PDF object graph and require manual CID font assembly. harumi wraps that complexity behind a simple, ergonomic API.

§Quick start

use harumi::Document;

// `?` needs an enclosing function; this assumes every call below returns harumi::Result.
fn main() -> harumi::Result<()> {
    let mut doc = Document::from_file("scanned.pdf")?;
    let font = doc.embed_font(include_bytes!("../tests/fixtures/NotoSansJP-Regular.ttf"))?;

    // Invisible OCR text layer
    doc.page(1)?.add_invisible_text("日本語テキスト", font, [72.0, 700.0], 12.0)?;

    // Visible red label
    doc.page(1)?.add_text("CONFIDENTIAL", font, [72.0, 750.0], 18.0, [0.8, 0.0, 0.0])?;

    doc.save("output.pdf")?;
    Ok(())
}

§Coordinate system

All coordinates are in PDF points (1 pt = 1/72 inch). The origin is at the bottom-left of the page. Use page.size() to query the page dimensions and position text relative to them.
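Layout offsets are often measured from the top-left corner of the page, so it can help to convert them explicitly. The following is a minimal sketch of that arithmetic only: the helper names `inches_to_points` and `from_top_left` are hypothetical and not part of harumi, and the `f32` coordinate type is an assumption.

```rust
/// Points per inch in PDF user space (1 pt = 1/72 inch).
const POINTS_PER_INCH: f32 = 72.0;

/// Converts a length in inches to PDF points.
/// Hypothetical helper for illustration; not a harumi function.
fn inches_to_points(inches: f32) -> f32 {
    inches * POINTS_PER_INCH
}

/// Converts a position measured from the top-left corner of the page
/// (y growing downward, in points) into PDF coordinates, whose origin
/// is the bottom-left corner with y growing upward.
/// `page_height` is the page height in points; the caller is assumed
/// to have obtained it from the page dimensions.
fn from_top_left(x: f32, y_from_top: f32, page_height: f32) -> [f32; 2] {
    [x, page_height - y_from_top]
}
```

On a US Letter page (612 × 792 pt), a point one inch from the left edge and 92 pt below the top edge maps to [72.0, 700.0] in PDF coordinates.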

§Font subsetting

embed_font stores the raw TTF bytes without processing them. At save time, harumi collects every character used across all pages, subsets each font once, and embeds the result. The subsetting cost is therefore paid once per font, regardless of how many pages or text runs reference it.
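The deferred approach described above can be pictured as a per-font set of characters that grows as text is queued and is read once at save time. This is a sketch of the idea only, not harumi's internals: `FontId`, `GlyphUsage`, `record`, and `subset_inputs` are hypothetical names introduced for illustration.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical stand-in for a font handle; not harumi's FontHandle.
type FontId = usize;

/// Records which characters each font has been asked to draw.
#[derive(Default)]
struct GlyphUsage {
    used: BTreeMap<FontId, BTreeSet<char>>,
}

impl GlyphUsage {
    /// Called for every queued text run, on any page.
    /// Repeated characters are deduplicated by the set.
    fn record(&mut self, font: FontId, text: &str) {
        self.used.entry(font).or_default().extend(text.chars());
    }

    /// Called once at save time: yields one deduplicated, sorted
    /// character list per font, ready to hand to a subsetter.
    fn subset_inputs(&self) -> Vec<(FontId, String)> {
        self.used
            .iter()
            .map(|(font, chars)| (*font, chars.iter().collect()))
            .collect()
    }
}
```

Because the set is keyed by font, a thousand runs that reuse the same characters still produce a single small subsetting request for that font.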

§Feature flags

| Flag | What it enables |
|---|---|
| ocr | ocr module: helpers for converting Tesseract/hOCR pixel coordinates to PDF points |
| draw | PageHandle::add_rect, PageHandle::add_line: filled rectangles and stroked lines (no extra dependencies) |
| image | PageHandle::add_image, PageHandle::add_image_with_opacity: JPEG/PNG raster image overlay (enables draw, adds image crate) |
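The ocr module's own helpers are not listed on this page, so the following only sketches the conversion they address: scaling pixel coordinates by 72 / DPI and flipping the y axis. `PixelBox`, `PdfRect`, and `pixel_box_to_pdf` are hypothetical names, and the sketch assumes the OCR image covers the whole page at a known resolution.

```rust
/// Bounding box in image pixels, as in an hOCR `bbox x0 y0 x1 y1`
/// property: origin at the top-left, y growing downward.
struct PixelBox {
    x0: f32,
    y0: f32,
    x1: f32,
    y1: f32,
}

/// Rectangle in PDF points: (x, y) is the lower-left corner,
/// origin at the bottom-left of the page, y growing upward.
#[derive(Debug, PartialEq)]
struct PdfRect {
    x: f32,
    y: f32,
    width: f32,
    height: f32,
}

/// Converts a pixel bounding box to PDF points.
/// `dpi` is the resolution the page image was rendered at and
/// `page_height_pt` is the page height in points.
/// Hypothetical helper for illustration; not a harumi function.
fn pixel_box_to_pdf(b: &PixelBox, dpi: f32, page_height_pt: f32) -> PdfRect {
    let scale = 72.0 / dpi; // points per pixel
    PdfRect {
        x: b.x0 * scale,
        // The bottom edge of the box (y1 in pixels) becomes the lower-left y.
        y: page_height_pt - b.y1 * scale,
        width: (b.x1 - b.x0) * scale,
        height: (b.y1 - b.y0) * scale,
    }
}
```

At 144 DPI each pixel is half a point, so a box spanning pixels (144, 184) to (344, 208) on a 792 pt tall page becomes a 100 × 12 pt rectangle with its lower-left corner at (72, 688).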

Structs§

Document
An existing PDF document that can be annotated with text overlays.
FontHandle
Opaque handle to a font registered with Document::embed_font.
PageHandle
A handle to a specific page for queuing text overlays.
PdfMetadata
PDF /Info dictionary fields.
TextFragment
A text fragment extracted from a page content stream.
TextRun
A single text placement descriptor for use with PageHandle::add_invisible_text_runs.

Enums§

Error
Errors returned by harumi operations.
VerticalAlign
Vertical alignment for PageHandle::add_text_box_aligned.

Type Aliases§

Result
Alias for std::result::Result<T, harumi::Error>.