//! # harumi
//!
//! A pure-Rust library for overlaying text onto existing PDFs, with full
//! support for CJK (Chinese / Japanese / Korean) fonts.
//!
//! ## Use cases
//!
//! | Scenario | Key API |
//! |---|---|
//! | OCR invisible text layer | `add_invisible_text` · `add_invisible_text_runs` |
//! | AI / RAG text extraction | `extract_text_runs` · `extract_text_chunks` · `extract_as_markdown` |
//! | PDF watermark / stamp | `add_text` · `add_text_with_rotation` |
//! | Scanned PDF → searchable | `add_invisible_text` + hOCR helpers (`ocr` feature) |
//! | HTML → PDF | `render_html_to_pdf` (`html` feature) |
//! | PDF text replacement | `replace_text` · `replace_text_resubset` |
//! | Page merge / split | `merge_from` · `extract_pages` |
//! | WASM / Edge / Lambda | All APIs — zero C/C++ dependencies |
//!
//! ## Motivation
//!
//! Rust lacks a high-level, zero-C-dependency library for injecting text into
//! existing PDFs. Low-level crates like `lopdf` expose the raw PDF object graph
//! and require manual CID font assembly. `harumi` wraps that complexity behind
//! a simple, ergonomic API.
//!
//! ## Quick start
//!
//! ```no_run
//! use harumi::{Document, TextRun};
//!
//! # fn main() -> harumi::Result<()> {
//! let mut doc = Document::from_file("scanned.pdf")?;
//! let font = doc.embed_font(include_bytes!("../tests/fixtures/NotoSansJP-Regular.ttf"))?;
//!
//! // Invisible OCR text layer
//! doc.page(1)?.add_invisible_text("日本語テキスト", font, [72.0, 700.0], 12.0)?;
//!
//! // Visible red label
//! doc.page(1)?.add_text("CONFIDENTIAL", font, [72.0, 750.0], 18.0, [0.8, 0.0, 0.0])?;
//!
//! doc.save("output.pdf")?;
//! # Ok(())
//! # }
//! ```
//!
//! ## Coordinate system
//!
//! All coordinates are in **PDF points** (1 pt = 1/72 inch). The origin is at
//! the **bottom-left** of the page. Use [`page.size()`](PageHandle::size) to
//! query the page dimensions and position text relative to them.
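//!
//! OCR engines and image tools usually report positions in pixels measured
//! from the top-left corner, so they have to be flipped and scaled before use.
//! The following is a minimal sketch of that conversion. The helper
//! `top_left_px_to_pdf` is hypothetical and not part of harumi, and the use of
//! `f64` and a known scan DPI are assumptions made for illustration.
//!
//! ```rust
//! /// Hypothetical helper (not part of harumi): converts a point given in
//! /// image pixels with a top-left origin into PDF points with a
//! /// bottom-left origin.
//! ///
//! /// `dpi` is the resolution of the source image and `page_height_pt` is
//! /// the page height in points (assumed to be known by the caller).
//! fn top_left_px_to_pdf(px: [f64; 2], dpi: f64, page_height_pt: f64) -> [f64; 2] {
//!     // 1 pt = 1/72 inch, so pixels scale by 72 / dpi.
//!     let x = px[0] * 72.0 / dpi;
//!     // Flip the vertical axis: PDF y grows upward from the bottom edge.
//!     let y = page_height_pt - px[1] * 72.0 / dpi;
//!     [x, y]
//! }
//! ```
//!
//! For a 300 DPI scan of a US Letter page (792 pt tall), the pixel
//! `[300.0, 300.0]` maps to `[72.0, 720.0]`: one inch in from the left edge
//! and one inch down from the top.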
//!
//! ## Font subsetting
//!
//! [`embed_font`](Document::embed_font) stores the raw TTF bytes without
//! processing them. At [`save`](Document::save) time, harumi collects every
//! character used across all pages, subsets each font once, and embeds the
//! result. Subsetting overhead is therefore paid once per font, regardless of
//! how many pages or text runs reference it.
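//!
//! The collection step can be pictured roughly as follows. This is an
//! illustrative sketch only, not harumi's actual implementation: `FontId` and
//! `collect_used_chars` are hypothetical names, and the real bookkeeping may
//! be organized differently.
//!
//! ```rust
//! use std::collections::{BTreeMap, BTreeSet};
//!
//! /// Hypothetical stand-in for a font handle; the real `FontHandle` may differ.
//! type FontId = u32;
//!
//! /// Gathers the distinct characters used with each font across all text
//! /// runs, so that every font needs to be subset only once.
//! fn collect_used_chars(runs: &[(FontId, &str)]) -> BTreeMap<FontId, BTreeSet<char>> {
//!     let mut used: BTreeMap<FontId, BTreeSet<char>> = BTreeMap::new();
//!     for (font, text) in runs {
//!         // Repeated characters collapse into a single set entry.
//!         used.entry(*font).or_default().extend(text.chars());
//!     }
//!     used
//! }
//! ```
//!
//! Because the result holds one character set per font, the number of
//! subsetting passes depends on the number of fonts rather than on the number
//! of pages or runs.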
//!
//! ## Feature flags
//!
//! | Flag | Enables | Extra deps |
//! |---------------------|---------|------------|
//! | `ocr` | hOCR pixel→PDF coordinate helpers | none |
//! | `draw` | Shapes: rect, line, ellipse, polygon, path | none |
//! | `image` | JPEG/PNG embed + extraction; enables `draw` | `image` crate |
//! | `flow` | `FlowDocument` auto-pagination builder + headers/footers | none |
//! | `html` | HTML→PDF renderer; enables `flow` | `scraper` |
//! | `digital-signature` | PDF digital signature metadata extraction | `cms`, `rsa`, `x509-cert` |
pub
pub use ;
pub use ;
pub use ;
pub use ;
pub use FontHandle;
pub use ;
pub use ;
pub use ;
pub use SignatureInfo;
// Re-export lopdf for integration test access.
pub use lopdf;