Crate pdfer_forms

§pdfer_forms — Fast Pure-Rust PDF Forms & Document Operations

pdfer_forms is a fast, pure-Rust library for filling PDF forms, inspecting AcroForm fields, flattening fillable PDFs, and document operations (merge, split, rotate, encrypt/decrypt) — with an API modeled on Python’s pypdf and PyPDF2.

Built and maintained by Clark Labs Inc.

  • Pure Rust, no Python, no C dependencies (built on lopdf)
  • Fast — up to 23× faster than pypdf on real-world government forms (see benchmarks below)
  • PyPDF / PyPDF2 compatibility layer — familiar get_fields, update_page_form_field_values, updatePageFormFieldValues, etc.
  • AcroForm inspection, form filling, and flattening in one crate
  • Document operations — merge, split, rotate, encrypt, decrypt (replaces qpdf CLI)
  • #![forbid(unsafe_code)]

§Why pdfer_forms?

If you are porting a Python PDF workflow to Rust, or building a Rust service that needs to fill fillable PDFs (government forms, tax forms, application forms, contracts), the pure-Rust PDF ecosystem has historically been thin on AcroForm support. pdfer_forms closes that gap:

  • Inspect every AcroForm field, including qualified and partial names
  • Fill text, checkbox, radio, and choice (listbox / combo) fields
  • Regenerate widget appearance streams so filled values render in every viewer
  • Flatten forms (draw appearances into page content) for archival or print
  • Strip widget annotations for final delivery
  • Reattach orphan widgets to /AcroForm /Fields
  • Toggle /NeedAppearances, group fields under a top-level name, and more
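
The qualified and partial names mentioned above follow the PDF specification: each field carries a partial name in /T, and its fully qualified name is the partial names of its ancestors and itself joined with periods. The sketch below illustrates that rule with the standard library only. The `Node` type and `qualified_name` function are hypothetical and are not part of the pdfer_forms API.

```rust
/// Hypothetical, simplified field node: a partial name (/T) plus an optional parent index.
struct Node {
    partial: Option<&'static str>, // nodes without /T (e.g. bare widgets) add nothing
    parent: Option<usize>,
}

/// Build the fully qualified name by walking up the parent chain.
/// Assumes the parent links are acyclic.
fn qualified_name(nodes: &[Node], mut idx: usize) -> String {
    let mut parts = Vec::new();
    loop {
        if let Some(p) = nodes[idx].partial {
            parts.push(p);
        }
        match nodes[idx].parent {
            Some(parent) => idx = parent,
            None => break,
        }
    }
    // Names were collected leaf-first, so reverse before joining.
    parts.reverse();
    parts.join(".")
}
```

With a parent named `sender` and a child named `city`, the child resolves to `sender.city`, the same shape as the field names used in the quick start.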

§Install

[dependencies]
pdfer_forms = "0.2"

Requires Rust 1.85+ (set by the pinned lopdf = 0.40 dependency).

§Quick start — fill a PDF form in Rust

use pdfer_forms::{FieldInput, PageSelection, PdfReaderCompat, PdfWriterCompat};
use std::collections::BTreeMap;

fn main() -> pdfer_forms::Result<()> {
    let reader = PdfReaderCompat::load("input.pdf")?;
    let fields = reader.get_fields()?;
    println!("fields: {fields:#?}");

    let mut writer = PdfWriterCompat::from_reader(&reader);

    let mut updates = BTreeMap::new();
    updates.insert("sender.city".to_string(), FieldInput::from("Paris"));
    updates.insert(
        "sender.name".to_string(),
        FieldInput::from(("Alice Example", "/Helv", 11.0)),
    );

    writer.update_page_form_field_values(
        PageSelection::All,
        &updates,
        0,
        Some(false),
        false,
    )?;

    writer.save("output.pdf")?;
    Ok(())
}

§Document operations — merge, split, rotate, encrypt

New in 0.2.0: the ops module provides pure-Rust replacements for qpdf / pdftk CLI operations.

use pdfer_forms::ops;

fn main() -> pdfer_forms::Result<()> {
    // Merge multiple PDFs
    let mut merged = ops::merge_files(&["doc1.pdf", "doc2.pdf", "doc3.pdf"])?;
    merged.save("merged.pdf")?;

    // Split: extract pages 1 and 3
    let doc = lopdf::Document::load("input.pdf")?;
    let mut subset = ops::split_pages(&doc, &[1, 3])?;
    subset.save("pages_1_3.pdf")?;

    // Split into one PDF per page
    let mut pages = ops::split_each_page(&doc)?;
    for (i, page_doc) in pages.iter_mut().enumerate() {
        page_doc.save(format!("page_{}.pdf", i + 1))?;
    }

    // Rotate pages 90° clockwise
    let mut doc = lopdf::Document::load("input.pdf")?;
    ops::rotate_pages(&mut doc, &[1, 2], 90)?;
    doc.save("rotated.pdf")?;

    // Encrypt with passwords
    let mut doc = lopdf::Document::load("input.pdf")?;
    ops::encrypt_document(&mut doc, "user_pass", "owner_pass")?;
    doc.save("encrypted.pdf")?;

    // Decrypt
    let mut doc = lopdf::Document::load("encrypted.pdf")?;
    ops::decrypt_document(&mut doc, "user_pass")?;
    doc.save("decrypted.pdf")?;

    Ok(())
}

§Available functions

Function                                       Description
ops::merge_documents(docs)                     Merge multiple lopdf::Documents into one
ops::merge_files(paths)                        Load and merge PDFs from file paths
ops::split_pages(doc, pages)                   Extract specific pages (1-based) into a new document
ops::split_each_page(doc)                      Split into one document per page
ops::rotate_pages(doc, pages, degrees)         Rotate pages by 0/90/180/270 degrees
ops::encrypt_document(doc, user_pw, owner_pw)  Encrypt with AES-128
ops::decrypt_document(doc, password)           Decrypt with password
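
The rotation entry accepts only multiples of 90 because a PDF page's /Rotate value must be a multiple of 90. A rotation step therefore usually validates the request and normalizes the result into the 0 to 359 range. The sketch below shows that arithmetic. `combine_rotation` is a hypothetical helper, and whether ops::rotate_pages adds to or overwrites an existing rotation is not stated in this document.

```rust
/// Combine an existing /Rotate value with a requested rotation (both in degrees).
/// Returns None when the request is not a multiple of 90.
fn combine_rotation(current: i64, delta: i64) -> Option<i64> {
    if delta % 90 != 0 {
        return None;
    }
    // rem_euclid keeps the result non-negative even for counter-clockwise requests.
    Some((current + delta).rem_euclid(360))
}
```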

§Features

  • AcroForm tree inspection
  • Qualified and partial field names
  • Text field value extraction
  • Page lookup for repeated widgets
  • Top-level form grouping / renaming
  • /NeedAppearances control
  • Page-scoped field filling
  • Text and choice appearance regeneration
  • Button state updates for checkboxes and radio groups
  • Orphan widget reattachment to /AcroForm /Fields
  • Annotation removal for post-flatten cleanup
  • Optional flatten step that draws widget appearance streams into page content
  • FieldInput::KeepCurrent for flattening without changing the stored value
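
Several of the features above depend on telling field kinds apart. In the PDF specification this is driven by the /FT entry (/Tx, /Btn, /Ch, /Sig) together with bits in the /Ff flags integer. The sketch below shows that classification using the flag bit positions from the specification. The `Kind` enum, the constants, and `classify` are hypothetical names and are not claimed to match the crate's field_flags module.

```rust
// Flag bits from the PDF specification (bit N is 1 << (N - 1)).
const FF_RADIO: u32 = 1 << 15; // bit 16
const FF_PUSHBUTTON: u32 = 1 << 16; // bit 17
const FF_COMBO: u32 = 1 << 17; // bit 18

#[derive(Debug, PartialEq)]
enum Kind {
    Text,
    Checkbox,
    Radio,
    PushButton,
    ComboBox,
    ListBox,
    Signature,
    Unknown,
}

/// Classify a field from its /FT name (without the leading slash) and /Ff flags.
fn classify(ft: &str, ff: u32) -> Kind {
    match ft {
        "Tx" => Kind::Text,
        "Btn" if ff & FF_PUSHBUTTON != 0 => Kind::PushButton,
        "Btn" if ff & FF_RADIO != 0 => Kind::Radio,
        "Btn" => Kind::Checkbox,
        "Ch" if ff & FF_COMBO != 0 => Kind::ComboBox,
        "Ch" => Kind::ListBox,
        "Sig" => Kind::Signature,
        _ => Kind::Unknown,
    }
}
```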

§Known caveats

  • Generated text appearances use built-in Type1 fonts and a simple WinAnsi text stream. The stored field value is encoded as UTF-16BE and can hold any Unicode text, but the generated appearance is only reliable for ASCII / WinAnsi characters.
  • Signature-field appearance generation is not implemented.
  • The API is intentionally close to pypdf / PyPDF2, but remains idiomatic Rust rather than mimicking Python objects exactly.
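
The first caveat contrasts the stored value (UTF-16BE) with the generated appearance (WinAnsi). The sketch below illustrates both halves under simple assumptions: a PDF text string encoded as UTF-16BE starts with the byte order mark FE FF, and an ASCII-only check serves as a conservative stand-in for "safe in WinAnsi". Both function names are hypothetical and are not crate APIs.

```rust
/// Encode a string as a PDF UTF-16BE text string, including the FE FF byte order mark.
fn to_pdf_utf16be(text: &str) -> Vec<u8> {
    let mut out = vec![0xFE, 0xFF];
    for unit in text.encode_utf16() {
        out.extend_from_slice(&unit.to_be_bytes());
    }
    out
}

/// Conservative check: plain ASCII text stays within WinAnsi.
/// Some non-ASCII text (for example "é") is also representable but is rejected here.
fn is_appearance_safe(text: &str) -> bool {
    text.is_ascii()
}
```

A caller could run a check like this before filling, and fall back to leaving appearance generation to the viewer when the text is not safe.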

§PyPDF / PyPDF2 API compatibility

pdfer_forms mirrors the form-manipulation surface of pypdf and PyPDF2, including camelCase aliases:

pypdf / PyPDF2                             pdfer_forms
PdfReader.get_fields()                     PdfReaderCompat::get_fields
PdfReader.get_form_text_fields()           PdfReaderCompat::get_form_text_fields
PdfReader.get_pages_showing_field()        PdfReaderCompat::get_pages_showing_field
PdfReader.add_form_topname()               PdfReaderCompat::add_form_topname
PdfReader.rename_form_topname()            PdfReaderCompat::rename_form_topname
PdfWriter.set_need_appearances_writer()    PdfWriterCompat::set_need_appearances_writer
PdfWriter.update_page_form_field_values()  PdfWriterCompat::update_page_form_field_values
PdfWriter.reattach_fields()                PdfWriterCompat::reattach_fields
PdfWriter.remove_annotations()             PdfWriterCompat::remove_annotations
updatePageFormFieldValues (PyPDF2)         updatePageFormFieldValues
setNeedAppearancesWriter (PyPDF2)          setNeedAppearancesWriter

§Reader-like surface

use pdfer_forms::PdfReaderCompat;

fn main() -> pdfer_forms::Result<()> {
    let mut reader = PdfReaderCompat::load("form.pdf")?;

    // Inspect the form
    let all_fields = reader.get_fields()?;
    let text_fields = reader.get_form_text_fields(false)?;
    let pages = reader.get_pages_showing_field("sender.city")?;

    // Group all fields under a top-level name, then rename that group
    reader.add_form_topname("form1")?;
    reader.rename_form_topname("renamed_form")?;
    Ok(())
}

§Writer-like surface

use pdfer_forms::{FieldInput, PageSelection, PdfWriterCompat};
use std::collections::BTreeMap;

fn main() -> pdfer_forms::Result<()> {
    let mut writer = PdfWriterCompat::load("form.pdf")?;
    writer.set_need_appearances_writer(false)?;

    // One entry per field: a button state, a text value, and a multi-value choice
    let mut fields = BTreeMap::new();
    fields.insert("check1".into(), FieldInput::from("/Yes"));
    fields.insert("city".into(), FieldInput::from("Berlin"));
    fields.insert("choices".into(), FieldInput::from(vec!["A".into(), "C".into()]));

    // The trailing arguments presumably follow pypdf's order:
    // flags, auto_regenerate, flatten
    writer.update_page_form_field_values(
        PageSelection::Index(0),
        &fields,
        0,
        Some(false),
        true,
    )?;

    // Drop the widget annotations, then save
    writer.remove_annotations(Some(&["/Widget"]))?;
    writer.save("flattened.pdf")?;
    Ok(())
}

§PyPDF2 camelCase shims

use pdfer_forms::{PageSelection, PdfWriterCompat};
use std::collections::BTreeMap;

fn main() -> pdfer_forms::Result<()> {
    let mut writer = PdfWriterCompat::load("form.pdf")?;
    writer.setNeedAppearancesWriter()?;

    // The camelCase shim takes plain string values
    let mut fields = BTreeMap::new();
    fields.insert("city".to_string(), "Berlin".to_string());
    writer.updatePageFormFieldValues(PageSelection::Index(0), &fields, 0)?;
    Ok(())
}
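
A common way to offer aliases like these in Rust is a thin method that forwards to the snake_case implementation, with the naming lint silenced on the alias. The sketch below shows the pattern on a hypothetical `Writer` type. It does not reproduce the crate's actual signatures or internals.

```rust
/// Hypothetical stand-in for a writer type; not the crate's PdfWriterCompat.
#[derive(Default)]
struct Writer {
    need_appearances: bool,
}

impl Writer {
    /// Idiomatic snake_case method that holds the real logic.
    fn set_need_appearances_writer(&mut self, state: bool) {
        self.need_appearances = state;
    }

    /// PyPDF2-style alias that forwards to the snake_case method.
    #[allow(non_snake_case)]
    fn setNeedAppearancesWriter(&mut self) {
        self.set_need_appearances_writer(true);
    }
}
```

Keeping the logic in one place means the alias cannot drift from the primary method.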

§Main types

  • PdfReaderCompat — pypdf-style reader wrapper
  • PdfWriterCompat — pypdf-style writer wrapper
  • FormField — an AcroForm field with value, type, and widgets
  • FieldValue — decoded field value (text, button state, choice list)
  • FieldInput — input variant for field updates (text, button, choice, KeepCurrent)
  • PageSelection — All or Index(n) scope for updates
  • PageHandle — page identity helper
  • FieldSpecifier — qualified / partial field name resolver

§Benchmarks — pdfer_forms vs pypdf / PyPDF2

Benchmarked against pypdf 6.9.2 and PyPDF2 3.0.1 on 9 real-world government PDF forms (IRS, USCIS, GSA, Hong Kong IRD, Guatemala SAT) in English, Spanish, and Chinese.

§Accuracy

Metric                  Result
Field name match rate   1004/1011 (99.3%)
Field type match rate   1004/1004 (100.0%)
Field value match rate  1004/1004 (100.0%)

The 7 name mismatches are encoding differences on a single Spanish-language PDF where pypdf decodes non-ASCII field names (e.g. DÍA) while pdfer_forms currently returns the raw bytes.

§Performance (average across 9 PDFs)

Operation                pypdf    pdfer_forms  Speedup
get_fields               12.1 ms  0.51 ms      23.6× faster
get_pages_showing_field  2.2 ms   0.47 ms      4.8× faster
fill_form                40.3 ms  11.0 ms      3.7× faster
remove_annotations       26.4 ms  6.8 ms       3.9× faster
get_form_text_fields     1.2 ms   0.49 ms      2.4× faster
load                     1.9 ms   5.2 ms       2.7× slower*

*Load is slower because lopdf eagerly parses the full cross-reference table; pypdf uses lazy loading. For most workflows the total round-trip is still faster.
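
The round-trip claim can be checked against the averaged timings with simple arithmetic. Assuming a workflow of one load followed by one fill_form, the totals are 1.9 + 40.3 = 42.2 ms for pypdf and 5.2 + 11.0 = 16.2 ms for pdfer_forms, roughly 2.6× faster overall. The helper below is only a sketch of that calculation, and `round_trip_speedup` is a hypothetical name.

```rust
/// Ratio of total baseline time to total candidate time.
/// Each slice holds per-step timings in milliseconds.
fn round_trip_speedup(baseline_ms: &[f64], candidate_ms: &[f64]) -> f64 {
    baseline_ms.iter().sum::<f64>() / candidate_ms.iter().sum::<f64>()
}
```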

§API parity

All 6 core pypdf form APIs pass on every test PDF. PyPDF2-style camelCase aliases (getFields, updatePageFormFieldValues, etc.) are included.

§Related crates

  • lopdf — the pure-Rust PDF library this crate is built on
  • printpdf — for generating PDFs from scratch
  • pdf — another pure-Rust PDF reader

§Contributing

Issues and pull requests are welcome at https://github.com/clark-labs-inc/pdfer-forms-rs.

§License

Licensed under either of

at your option.


© Clark Labs Inc. pdfer_forms is not affiliated with the authors of pypdf or PyPDF2. pypdf and PyPDF2 are trademarks of their respective owners; compatibility is provided for porting convenience.

Modules§

field_flags
ops
PDF document operations: merge, split, rotate, encrypt/decrypt.

Structs§

FormField
PageHandle
PdfReaderCompat
PdfWriterCompat

Enums§

FieldInput
FieldSpecifier
FieldValue
PageSelection
PdferError

Type Aliases§

Result