//! # harumi
//!
//! Pure-Rust PDF library — CJK font embedding (Chinese/Japanese/Korean),
//! OCR text overlay, text extraction, HTML→PDF, page merge/split.
//! Zero C/C++ dependencies. WASM-compatible.
//!
//! ## Use cases
//!
//! | Scenario | Key API |
//! |---|---|
//! | OCR invisible text layer | `add_invisible_text` · `add_invisible_text_runs` |
//! | AI / RAG text extraction | `extract_text_runs` · `extract_text_chunks` · `extract_as_markdown` |
//! | PDF watermark / stamp | `add_text` · `add_text_with_rotation` |
//! | Scanned PDF → searchable | `add_invisible_text` + hOCR helpers (`ocr` feature) |
//! | HTML → PDF | `render_html_to_pdf` (`html` feature) |
//! | PDF text replacement | `replace_text` · `replace_text_resubset` |
//! | Page merge / split | `merge_from` · `extract_pages` |
//! | Digital signature creation | `sign_document` · `add_signature_field` (`digital-signature` feature) |
//! | WASM / Edge / Lambda | All APIs — zero C/C++ dependencies |
//!
//! ## Motivation
//!
//! Rust lacks a high-level, zero-C-dependency library for injecting text into
//! existing PDFs. Low-level crates like `lopdf` expose the raw PDF object graph
//! and require manual CID font assembly. `harumi` wraps that complexity behind
//! a simple, ergonomic API.
//!
//! ## Quick start
//!
//! ```no_run
//! use harumi::{Document, TextRun};
//!
//! # fn main() -> harumi::Result<()> {
//! let mut doc = Document::from_file("scanned.pdf")?;
//! let font = doc.embed_font(include_bytes!("../tests/fixtures/NotoSansJP-Regular.ttf"))?;
//!
//! // Invisible OCR text layer
//! doc.page(1)?.add_invisible_text("日本語テキスト", font, [72.0, 700.0], 12.0)?;
//!
//! // Visible red label
//! doc.page(1)?.add_text("CONFIDENTIAL", font, [72.0, 750.0], 18.0, [0.8, 0.0, 0.0])?;
//!
//! doc.save("output.pdf")?;
//! # Ok(())
//! # }
//! ```
//!
//! ## Coordinate system
//!
//! All coordinates are in **PDF points** (1 pt = 1/72 inch). The origin is at
//! the **bottom-left** of the page. Use [`page.size()`](PageHandle::size) to
//! query the page dimensions and position text relative to them.
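//!
//! OCR engines usually report positions in pixels measured from the top-left
//! corner, so the Y axis has to be flipped when converting. The sketch below
//! only illustrates the arithmetic: `pixel_to_pdf_point` is a hypothetical
//! helper, not part of harumi's API, and the use of `f64` is an assumption
//! (the `ocr` feature ships its own conversion helpers).
//!
//! ```rust
//! /// Hypothetical helper (not a harumi function): converts a top-left-origin
//! /// pixel position into a bottom-left-origin position in PDF points.
//! fn pixel_to_pdf_point(x_px: f64, y_px: f64, dpi: f64, page_height_pt: f64) -> [f64; 2] {
//!     // 72 points per inch, `dpi` pixels per inch.
//!     let x_pt = x_px * 72.0 / dpi;
//!     let y_from_top_pt = y_px * 72.0 / dpi;
//!     // Flip the Y axis: PDF measures upward from the bottom edge.
//!     [x_pt, page_height_pt - y_from_top_pt]
//! }
//! ```
//!
//! For a 300 DPI scan of a page 792 pt tall, the pixel `(300, 300)` lies one
//! inch from the left and top edges and maps to roughly `[72.0, 720.0]`.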
//!
//! ## Font subsetting
//!
//! [`embed_font`](Document::embed_font) stores the raw TTF bytes without
//! processing. At [`save`](Document::save) time, harumi collects every
//! character used across all pages, runs a single subset per font, and embeds
//! the result. This means subsetting overhead is paid once regardless of how
//! many pages or text runs reference the same font.
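//!
//! One way to picture this deferred approach is a per-font set of characters
//! that grows as text is added and is read once at save time. The sketch below
//! is illustrative only: `GlyphUsage`, its methods, and the `usize` font id are
//! hypothetical and do not describe harumi's internal types.
//!
//! ```rust
//! use std::collections::{BTreeMap, BTreeSet};
//!
//! /// Hypothetical tracker (not a harumi type): font id -> characters used.
//! #[derive(Default)]
//! struct GlyphUsage {
//!     per_font: BTreeMap<usize, BTreeSet<char>>,
//! }
//!
//! impl GlyphUsage {
//!     /// Record every character of `text` against `font_id`; duplicates collapse.
//!     fn record(&mut self, font_id: usize, text: &str) {
//!         self.per_font.entry(font_id).or_default().extend(text.chars());
//!     }
//!
//!     /// Sorted, de-duplicated characters a single subset pass would need to keep.
//!     fn chars_for(&self, font_id: usize) -> Vec<char> {
//!         self.per_font
//!             .get(&font_id)
//!             .map(|set| set.iter().copied().collect())
//!             .unwrap_or_default()
//!     }
//! }
//! ```
//!
//! Because the set only grows, adding the same string on a hundred pages
//! leaves the subset input unchanged after the first call.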
//!
//! ## Feature flags
//!
//! | Flag | Enables | Extra deps |
//! |---------------------|---------|------------|
//! | `ocr` | hOCR pixel→PDF coordinate helpers | none |
//! | `draw` | Shapes: rect, line, ellipse, polygon, path | none |
//! | `image` | JPEG/PNG embed + extraction; enables `draw` | `png` crate |
//! | `flow` | `FlowDocument` auto-pagination builder + headers/footers | none |
//! | `html` | HTML→PDF renderer; enables `flow` | none (internal tokenizer) |
//! | `digital-signature` | Create and verify PKCS#7/CMS signatures | RustCrypto crates |
pub
pub
pub
pub use ;
pub use ;
pub use ;
pub use ;
pub use FontHandle;
pub use ;
pub use ;
pub use ;
pub use SignatureInfo;
pub use ;
// Re-export lopdf for integration test access.
pub use lopdf;