§pdfer_forms — Fast Pure-Rust PDF Forms & Document Operations
pdfer_forms is a fast, pure-Rust library for filling PDF forms, inspecting AcroForm fields, flattening fillable PDFs, and document operations (merge, split, rotate, encrypt/decrypt) — with an API modeled on Python’s pypdf and PyPDF2.
Built and maintained by Clark Labs Inc.
- Pure Rust, no Python, no C dependencies (built on lopdf)
- Fast: up to 23× faster than pypdf on real-world government forms (see benchmarks below)
- PyPDF / PyPDF2 compatibility layer: familiar get_fields, update_page_form_field_values, updatePageFormFieldValues, etc.
- AcroForm inspection, form filling, and flattening in one crate
- Document operations: merge, split, rotate, encrypt, decrypt (replaces the qpdf CLI)
- #![forbid(unsafe_code)]
§Why pdfer_forms?
If you are porting a Python PDF workflow to Rust, or building a Rust service that needs to fill fillable PDFs (government forms, tax forms, application forms, contracts), the pure-Rust PDF ecosystem has historically been thin on AcroForm support. pdfer_forms closes that gap:
- Inspect every AcroForm field, including qualified and partial names
- Fill text, checkbox, radio, and choice (listbox / combo) fields
- Regenerate widget appearance streams so filled values render in every viewer
- Flatten forms (draw appearances into page content) for archival or print
- Strip widget annotations for final delivery
- Reattach orphan widgets to /AcroForm /Fields
- Toggle /NeedAppearances, group fields under a top-level name, and more
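For readers new to AcroForm naming: a field's partial name is its own /T entry, and its fully qualified name joins the partial names from the root of the field tree downward with periods. That is why the examples on this page address fields as sender.city. The sketch below only illustrates that convention; both helper names are hypothetical and are not part of pdfer_forms.

```rust
/// Join partial field names (root first) into a fully qualified name.
/// Hypothetical helper for illustration; not part of pdfer_forms.
fn qualified_name(partials: &[&str]) -> String {
    partials.join(".")
}

/// Split a fully qualified name back into its partial names.
/// Hypothetical helper for illustration; not part of pdfer_forms.
fn partial_names(qualified: &str) -> Vec<&str> {
    qualified.split('.').collect()
}
```

Calling qualified_name(&["sender", "city"]) yields the same key used in the quick start's update map.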
§Install
[dependencies]
pdfer_forms = "0.2"
Requires Rust 1.85+ (set by the pinned lopdf = 0.40 dependency).
§Quick start — fill a PDF form in Rust
use pdfer_forms::{FieldInput, PageSelection, PdfReaderCompat, PdfWriterCompat};
use std::collections::BTreeMap;

fn main() -> pdfer_forms::Result<()> {
    let reader = PdfReaderCompat::load("input.pdf")?;
    let fields = reader.get_fields()?;
    println!("fields: {fields:#?}");

    let mut writer = PdfWriterCompat::from_reader(&reader);
    let mut updates = BTreeMap::new();
    updates.insert("sender.city".to_string(), FieldInput::from("Paris"));
    updates.insert(
        "sender.name".to_string(),
        FieldInput::from(("Alice Example", "/Helv", 11.0)),
    );
    writer.update_page_form_field_values(
        PageSelection::All,
        &updates,
        0,
        Some(false),
        false,
    )?;
    writer.save("output.pdf")?;
    Ok(())
}
§Document operations: merge, split, rotate, encrypt
New in 0.2.0: the ops module provides pure-Rust replacements for qpdf / pdftk CLI operations.
use pdfer_forms::ops;

fn main() -> pdfer_forms::Result<()> {
    // Merge multiple PDFs
    let mut merged = ops::merge_files(&["doc1.pdf", "doc2.pdf", "doc3.pdf"])?;
    merged.save("merged.pdf")?;

    // Split: extract pages 1 and 3
    let doc = lopdf::Document::load("input.pdf")?;
    let mut subset = ops::split_pages(&doc, &[1, 3])?;
    subset.save("pages_1_3.pdf")?;

    // Split into one PDF per page
    let mut pages = ops::split_each_page(&doc)?;
    for (i, page_doc) in pages.iter_mut().enumerate() {
        page_doc.save(format!("page_{}.pdf", i + 1))?;
    }

    // Rotate pages 90° clockwise
    let mut doc = lopdf::Document::load("input.pdf")?;
    ops::rotate_pages(&mut doc, &[1, 2], 90)?;
    doc.save("rotated.pdf")?;

    // Encrypt with passwords
    let mut doc = lopdf::Document::load("input.pdf")?;
    ops::encrypt_document(&mut doc, "user_pass", "owner_pass")?;
    doc.save("encrypted.pdf")?;

    // Decrypt
    let mut doc = lopdf::Document::load("encrypted.pdf")?;
    ops::decrypt_document(&mut doc, "user_pass")?;
    doc.save("decrypted.pdf")?;
    Ok(())
}
§Available functions
| Function | Description |
|---|---|
| ops::merge_documents(docs) | Merge multiple lopdf::Documents into one |
| ops::merge_files(paths) | Load and merge PDFs from file paths |
| ops::split_pages(doc, pages) | Extract specific pages (1-based) into a new document |
| ops::split_each_page(doc) | Split into one document per page |
| ops::rotate_pages(doc, pages, degrees) | Rotate pages by 0/90/180/270 degrees |
| ops::encrypt_document(doc, user_pw, owner_pw) | Encrypt with AES-128 |
| ops::decrypt_document(doc, password) | Decrypt with password |
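The table states that page numbers are 1-based and that rotation takes one of four angles. It does not say how out-of-range input is handled, so a caller may want to check arguments first. The following is a minimal sketch of such pre-flight checks under that assumption; normalize_rotation and pages_in_range are hypothetical names, not functions in the ops module.

```rust
/// Map any multiple of 90 (including negatives) onto 0, 90, 180, or 270.
/// Returns None for angles that are not multiples of 90.
/// Hypothetical helper; not part of pdfer_forms.
fn normalize_rotation(degrees: i32) -> Option<i32> {
    let d = degrees.rem_euclid(360);
    if d % 90 == 0 { Some(d) } else { None }
}

/// True when every page number is 1-based and no larger than page_count.
/// An empty list is trivially in range.
/// Hypothetical helper; not part of pdfer_forms.
fn pages_in_range(pages: &[u32], page_count: u32) -> bool {
    pages.iter().all(|&p| (1..=page_count).contains(&p))
}
```

A caller could, for instance, turn a counter-clockwise request of -90 into 270 before passing it on, and reject page 0 early instead of relying on the library's error.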
§Features
- AcroForm tree inspection
- Qualified and partial field names
- Text field value extraction
- Page lookup for repeated widgets
- Top-level form grouping / renaming
- /NeedAppearances control
- Page-scoped field filling
- Text and choice appearance regeneration
- Button state updates for checkboxes and radio groups
- Orphan widget reattachment to /AcroForm /Fields
- Annotation removal for post-flatten cleanup
- Optional flatten step that draws widget appearance streams into page content
- FieldInput::KeepCurrent for flattening without changing the stored value
§Known caveats
- Generated text appearances use built-in Type1 fonts and a simple WinAnsi text stream. The stored field value uses UTF-16BE, but generated appearance content itself is safest for ASCII / WinAnsi text.
- Signature-field appearance generation is not implemented.
- The API is intentionally close to pypdf / PyPDF2, but remains idiomatic Rust rather than mimicking Python objects exactly.
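Because generated appearances are safest for ASCII / WinAnsi text, it can help to screen values before filling. The sketch below is a deliberately conservative check, assuming that printable ASCII and the U+00A0 to U+00FF range (which WinAnsi encodes at the same byte values) are acceptable. It rejects the handful of extra WinAnsi characters such as the euro sign, as well as tabs and newlines. is_winansi_safe is a hypothetical name, not a pdfer_forms function.

```rust
/// Conservative screen for text that a simple WinAnsi appearance stream can show.
/// Accepts printable ASCII and U+00A0..=U+00FF; rejects everything else,
/// including control characters and the extra WinAnsi symbols such as the euro sign.
/// Hypothetical helper; not part of pdfer_forms.
fn is_winansi_safe(text: &str) -> bool {
    text.chars()
        .all(|c| matches!(c, ' '..='~' | '\u{00A0}'..='\u{00FF}'))
}
```

Values that fail the check can still be stored, since the field value itself uses UTF-16BE; the concern is only how the regenerated appearance renders.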
§PyPDF / PyPDF2 API compatibility
pdfer_forms mirrors the form-manipulation surface of pypdf and PyPDF2, including camelCase aliases:
| pypdf / PyPDF2 | pdfer_forms |
|---|---|
| PdfReader.get_fields() | PdfReaderCompat::get_fields |
| PdfReader.get_form_text_fields() | PdfReaderCompat::get_form_text_fields |
| PdfReader.get_pages_showing_field() | PdfReaderCompat::get_pages_showing_field |
| PdfReader.add_form_topname() | PdfReaderCompat::add_form_topname |
| PdfReader.rename_form_topname() | PdfReaderCompat::rename_form_topname |
| PdfWriter.set_need_appearances_writer() | PdfWriterCompat::set_need_appearances_writer |
| PdfWriter.update_page_form_field_values() | PdfWriterCompat::update_page_form_field_values |
| PdfWriter.reattach_fields() | PdfWriterCompat::reattach_fields |
| PdfWriter.remove_annotations() | PdfWriterCompat::remove_annotations |
| updatePageFormFieldValues (PyPDF2) | updatePageFormFieldValues |
| setNeedAppearancesWriter (PyPDF2) | setNeedAppearancesWriter |
§Reader-like surface
use pdfer_forms::PdfReaderCompat;
let mut reader = PdfReaderCompat::load("form.pdf")?;
let all_fields = reader.get_fields()?;
let text_fields = reader.get_form_text_fields(false)?;
let pages = reader.get_pages_showing_field("sender.city")?;
reader.add_form_topname("form1")?;
reader.rename_form_topname("renamed_form")?;
§Writer-like surface
use pdfer_forms::{FieldInput, PageSelection, PdfWriterCompat};
use std::collections::BTreeMap;
let mut writer = PdfWriterCompat::load("form.pdf")?;
writer.set_need_appearances_writer(false)?;
let mut fields = BTreeMap::new();
fields.insert("check1".into(), FieldInput::from("/Yes"));
fields.insert("city".into(), FieldInput::from("Berlin"));
fields.insert("choices".into(), FieldInput::from(vec!["A".into(), "C".into()]));
writer.update_page_form_field_values(
    PageSelection::Index(0),
    &fields,
    0,
    Some(false),
    true,
)?;
writer.remove_annotations(Some(&["/Widget"]))?;
writer.save("flattened.pdf")?;
§PyPDF2 camelCase shims
use pdfer_forms::{PageSelection, PdfWriterCompat};
use std::collections::BTreeMap;
let mut writer = PdfWriterCompat::load("form.pdf")?;
writer.setNeedAppearancesWriter()?;
let mut fields = BTreeMap::new();
fields.insert("city".to_string(), "Berlin".to_string());
writer.updatePageFormFieldValues(PageSelection::Index(0), &fields, 0)?;
§Main types
- PdfReaderCompat: pypdf-style reader wrapper
- PdfWriterCompat: pypdf-style writer wrapper
- FormField: an AcroForm field with value, type, and widgets
- FieldValue: decoded field value (text, button state, choice list)
- FieldInput: input variant for field updates (text, button, choice, KeepCurrent)
- PageSelection: All or Index(n) scope for updates
- PageHandle: page identity helper
- FieldSpecifier: qualified / partial field name resolver
§Benchmarks — pdfer_forms vs pypdf / PyPDF2
Benchmarked against pypdf 6.9.2 and PyPDF2 3.0.1 on 9 real-world government PDF forms (IRS, USCIS, GSA, Hong Kong IRD, Guatemala SAT) in English, Spanish, and Chinese.
§Accuracy
| Metric | Result |
|---|---|
| Field name match rate | 1004/1011 (99.3%) |
| Field type match rate | 1004/1004 (100.0%) |
| Field value match rate | 1004/1004 (100.0%) |
The 7 name mismatches are encoding differences on a single Spanish-language PDF where pypdf decodes non-ASCII field names (e.g. DÍA) while pdfer_forms currently returns the raw bytes.
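Until names are decoded by the crate, a caller who has the raw bytes of a field name could decode them separately. The sketch below assumes the bytes follow the usual PDF text-string convention: a leading FE FF byte-order mark means UTF-16BE, and anything else is treated as single-byte text. It approximates PDFDocEncoding as Latin-1, which agrees for characters such as Í but not for every byte in the 0x80 to 0xA0 range. decode_pdf_text is a hypothetical helper, and how pdfer_forms exposes the raw bytes is not specified here.

```rust
/// Decode the bytes of a PDF text string into a Rust String.
/// Hypothetical helper; not part of pdfer_forms.
fn decode_pdf_text(bytes: &[u8]) -> String {
    if let Some(rest) = bytes.strip_prefix(&[0xFE_u8, 0xFF]) {
        // UTF-16BE after the byte-order mark: pair bytes into big-endian code units.
        // A trailing odd byte is ignored; invalid sequences become U+FFFD.
        let units: Vec<u16> = rest
            .chunks_exact(2)
            .map(|pair| u16::from_be_bytes([pair[0], pair[1]]))
            .collect();
        String::from_utf16_lossy(&units)
    } else {
        // Approximation (assumption): map each byte to the Latin-1 code point
        // with the same value instead of a full PDFDocEncoding table.
        bytes.iter().map(|&b| char::from(b)).collect()
    }
}
```

With this sketch, both the UTF-16BE and the single-byte spelling of the name in the example above decode to DÍA.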
§Performance (average across 9 PDFs)
| Operation | pypdf | pdfer_forms | Speedup |
|---|---|---|---|
| get_fields | 12.1 ms | 0.51 ms | 23.6× faster |
| get_pages_showing_field | 2.2 ms | 0.47 ms | 4.8× faster |
| fill_form | 40.3 ms | 11.0 ms | 3.7× faster |
| remove_annotations | 26.4 ms | 6.8 ms | 3.9× faster |
| get_form_text_fields | 1.2 ms | 0.49 ms | 2.4× faster |
| load | 1.9 ms | 5.2 ms | 2.7× slower* |
*Load is slower because lopdf eagerly parses the full cross-reference table; pypdf uses lazy loading. For most workflows the total round-trip is still faster.
§API parity
All 6 core pypdf form APIs pass on every test PDF. PyPDF2-style camelCase aliases (getFields, updatePageFormFieldValues, etc.) are included.
§Related crates
- lopdf: the pure-Rust PDF library this crate is built on
- printpdf: for generating PDFs from scratch
- pdf: another pure-Rust PDF reader
§Contributing
Issues and pull requests are welcome at https://github.com/clark-labs-inc/pdfer-forms-rs.
§License
Licensed under either of
- Apache License, Version 2.0 (https://www.apache.org/licenses/LICENSE-2.0)
- MIT license (https://opensource.org/licenses/MIT)
at your option.
© Clark Labs Inc. pdfer_forms is not affiliated with the authors of pypdf or PyPDF2. pypdf and PyPDF2 are trademarks of their respective owners; compatibility is provided for porting convenience.
Modules§
- field_flags
- ops (PDF document operations: merge, split, rotate, encrypt/decrypt)