Crate gotenberg_pdf

§Gotenberg PDF Client

gotenberg_pdf is a Rust library that provides an easy-to-use interface for interacting with the Gotenberg API.

Gotenberg is a Docker-based service for converting HTML, Markdown, URLs, and various document formats to PDF. It uses the Chrome engine to render web content and the LibreOffice engine to convert documents.

§Features

  • URL to PDF: Generate PDFs directly from a webpage URL.
  • HTML to PDF: Convert raw HTML into a PDF.
  • Markdown to PDF: Render Markdown files into a PDF.
  • Document to PDF: Convert various document formats (e.g., DOCX, PPTX) to PDF using the LibreOffice engine.
  • Screenshot: Capture screenshots of webpages or HTML content.

§Installation

Add gotenberg_pdf to your Cargo.toml:

[dependencies]
gotenberg_pdf = "0.5"

Ensure you have a running instance of Gotenberg, typically via Docker:

docker run --rm -p 3000:3000 gotenberg/gotenberg:8

N.B.: This crate is compatible with Gotenberg version 8.

§Usage Examples

§Convert URL to PDF

use gotenberg_pdf::{Client, WebOptions, PaperFormat};
use tokio;

#[tokio::main]
async fn main() {
    // Initialize the client with the Gotenberg server URL
    let client = Client::new("http://localhost:3000");

    // Define optional rendering configurations
    let mut options = WebOptions::default();
    options.set_paper_format(PaperFormat::A4);

    // Convert a URL to PDF
    let pdf_bytes = client.pdf_from_url("https://example.com", options).await.unwrap();
}
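
The example above leaves pdf_bytes unused. As a minimal sketch of what might come next, the helpers below check that a buffer starts with the PDF magic header and then write it to disk. Both looks_like_pdf and save_pdf are hypothetical names, not part of gotenberg_pdf, and the sketch assumes the returned value dereferences to a byte slice (as bytes::Bytes does).

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Returns true if the buffer starts with the PDF magic header `%PDF-`.
fn looks_like_pdf(bytes: &[u8]) -> bool {
    bytes.starts_with(b"%PDF-")
}

/// Writes `bytes` to `path` after a basic header check.
/// Hypothetical helper for illustration; not part of gotenberg_pdf.
fn save_pdf(path: &Path, bytes: &[u8]) -> io::Result<()> {
    if !looks_like_pdf(bytes) {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "buffer does not start with %PDF-",
        ));
    }
    fs::write(path, bytes)
}
```

Under that assumption, a call such as save_pdf(Path::new("example.pdf"), &pdf_bytes) would persist the result of the conversion.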

§Convert HTML to PDF

use gotenberg_pdf::{Client, WebOptions};
use tokio;

#[tokio::main]
async fn main() {
    let client = Client::new("http://localhost:3000");

    let html_content = r#"
    <!doctype html>
    <html>
        <head><title>My PDF</title></head>
        <body><h1>Hello, PDF!</h1></body>
    </html>
    "#;

    let options = WebOptions::default();

    let pdf_bytes = client.pdf_from_html(html_content, options).await.unwrap();
}

§Convert Markdown to PDF

use gotenberg_pdf::{Client, WebOptions};
use std::collections::HashMap;
use tokio;

#[tokio::main]
async fn main() {
    let client = Client::new("http://localhost:3000");

    // Markdown content
    let mut markdown_files = HashMap::new();
    markdown_files.insert("example.md", "# My Markdown PDF\nThis is a test document.");

    // HTML template to wrap the markdown
    let html_template = r#"
    <!doctype html>
    <html>
        <head><title>Markdown PDF</title></head>
        <body>{{ toHTML "example.md" }}</body>
    </html>
    "#;

    let options = WebOptions::default();

    let pdf_bytes = client.pdf_from_markdown(html_template, markdown_files, options).await.unwrap();
}
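
In the example above, the template refers to a Markdown file by name through toHTML "example.md", and the map must contain an entry with the same name. A small pre-flight check can catch mismatches before the request is sent. This is only a sketch: referenced_markdown and missing_markdown are hypothetical helpers, and the scan is a plain string search that assumes double-quoted file names.

```rust
use std::collections::HashMap;

/// Collects file names referenced as `toHTML "name"` in the template.
fn referenced_markdown(template: &str) -> Vec<String> {
    let marker = "toHTML \"";
    let mut names = Vec::new();
    let mut rest = template;
    while let Some(start) = rest.find(marker) {
        let after = &rest[start + marker.len()..];
        match after.find('"') {
            Some(end) => {
                names.push(after[..end].to_string());
                rest = &after[end + 1..];
            }
            // Unterminated quote: stop scanning.
            None => break,
        }
    }
    names
}

/// Returns the referenced names that have no entry in `files`.
fn missing_markdown(template: &str, files: &HashMap<&str, &str>) -> Vec<String> {
    referenced_markdown(template)
        .into_iter()
        .filter(|name| !files.contains_key(name.as_str()))
        .collect()
}
```

An empty result from missing_markdown means every file the template asks for is present in the map.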

§Take a Screenshot of a URL

use gotenberg_pdf::{Client, ScreenshotOptions, ImageFormat};
use tokio;

#[tokio::main]
async fn main() {
    let client = Client::new("http://localhost:3000");

    let mut options = ScreenshotOptions::default();
    options.width = Some(1920);
    options.height = Some(1080);
    options.format = Some(ImageFormat::Png);

    let image_bytes = client.screenshot_url("https://example.com", options).await.unwrap();

    println!("Screenshot captured: {} bytes", image_bytes.len());
}

§Convert Document to PDF Using LibreOffice Engine

use gotenberg_pdf::{Client, DocumentOptions};
use tokio;

#[tokio::main]
async fn main() {
    let client = Client::new("http://localhost:3000");

    let filename = "test_files/example.docx";
    let file_content = std::fs::read(filename).expect("Failed to read the file");

    let options = DocumentOptions {
        landscape: Some(false),
        ..Default::default()
    };

    let pdf_bytes = client.pdf_from_doc(filename, file_content, options).await.unwrap();
}

§Convert HTML to Screenshot Image

use gotenberg_pdf::{Client, ScreenshotOptions, ImageFormat};
use tokio;

#[tokio::main]
async fn main() {
    let client = Client::new("http://localhost:3000");

    let html_content = r#"
    <!doctype html>
    <html>
        <head><title>Screenshot</title></head>
        <body><h1>Hello, Screenshot!</h1></body>
    </html>
    "#;

    let mut options = ScreenshotOptions::default();
    options.width = Some(800);
    options.height = Some(600);
    options.format = Some(ImageFormat::Png);

    let image_bytes = client.screenshot_html(html_content, options).await.unwrap();
}

§Use the streaming client

Requires the stream feature to be enabled in your Cargo.toml.

use gotenberg_pdf::{StreamingClient, WebOptions};
use futures::StreamExt; // for `next()`
use tokio::fs::File;
use tokio::io::AsyncWriteExt;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = StreamingClient::new("http://localhost:3000");

    let options = WebOptions::default();
    let mut stream = client.pdf_from_url("https://example.com", options).await?;

    // Create or overwrite the PDF file asynchronously
    let temp_dir = std::env::temp_dir();
    let pdf_path = temp_dir.join("example_com.pdf");
    let mut file = File::create(pdf_path).await?;

    // As we receive chunks, write them directly to disk
    while let Some(chunk) = stream.next().await {
        let chunk = chunk?;
        file.write_all(&chunk).await?;
    }

    println!("PDF rendered and saved as example_com.pdf");
    Ok(())
}

§Use the blocking client (no async runtime required)

Requires the blocking feature to be enabled in your Cargo.toml.

use gotenberg_pdf::{BlockingClient, WebOptions, PaperFormat};

fn main() {
    // Initialize the client with the Gotenberg server URL
    let client = BlockingClient::new("http://localhost:3000");

    // Define optional rendering configurations
    let mut options = WebOptions::default();
    options.set_paper_format(PaperFormat::A4);

    // Convert a URL to PDF
    let pdf_bytes = client.pdf_from_url("https://example.com", options).unwrap();
}

§Configuration Options

§WebOptions

Provides control over PDF generation by the Chrome engine. These options are accepted by the web-to-PDF methods shown in the examples above: pdf_from_url, pdf_from_html, and pdf_from_markdown.

| Field Name | Description | Default |
|---|---|---|
| trace_id | Unique trace ID for request | Random UUID |
| single_page | Print content on one page | false |
| paper_width | Paper width as a LinearDimention | 8.5 inches |
| paper_height | Paper height as a LinearDimention | 11 inches |
| margin_top | Top margin as a LinearDimention | 0.39 inches |
| margin_bottom | Bottom margin as a LinearDimention | 0.39 inches |
| margin_left | Left margin as a LinearDimention | 0.39 inches |
| margin_right | Right margin as a LinearDimention | 0.39 inches |
| prefer_css_page_size | Use CSS-defined page size | false |
| generate_document_outline | Embed document outline | false |
| print_background | Include background graphics | false |
| omit_background | Allow transparency in PDF | false |
| landscape | Set page orientation to landscape | false |
| scale | Scale of page rendering | 1.0 |
| native_page_ranges | PageRange to print, e.g. "1,3,5" or "1-4" | All pages |
| header_html | HTML for header content | None |
| footer_html | HTML for footer content | None |
| wait_delay | Delay before conversion | None |
| wait_for_expression | Wait until this JS expression returns true | None |
| emulated_media_type | Emulated MediaType (“screen” or “print”) | print |
| cookies | Cookies for Chromium | None |
| skip_network_idle_events | Ignore network idle events | true |
| user_agent | Override default User-Agent header | None |
| extra_http_headers | Additional HTTP headers | None |
| pdfa | Convert to specific PDF/A PDFFormat | None |
| pdfua | Enable Universal Access compliance | false |
| metadata | PDF metadata | None |
| fail_on_http_status_codes | HTTP status codes to fail on; a code ending in 99 is a wildcard for its class (499 covers 400 to 499) | [499, 599] |
| fail_on_resource_http_status_codes | Resource HTTP status codes to fail on | None |
| fail_on_resource_loading_failed | Fail if resource loading fails | false |
| fail_on_console_exceptions | Fail on Chromium console exceptions | false |
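
The wildcard convention for fail_on_http_status_codes can be sketched as follows. This illustrates the matching rule only (a pattern ending in 99 covers its whole hundred-range); it is not the crate's or Gotenberg's actual code, and pattern_matches and should_fail are hypothetical names.

```rust
/// Returns true if `status` is covered by `pattern`.
/// A pattern ending in 99 (e.g. 499) covers its whole class (400..=499);
/// any other pattern matches only itself.
fn pattern_matches(pattern: u16, status: u16) -> bool {
    if pattern % 100 == 99 {
        pattern / 100 == status / 100
    } else {
        pattern == status
    }
}

/// Returns true if any configured pattern covers the response status.
fn should_fail(patterns: &[u16], status: u16) -> bool {
    patterns.iter().any(|&p| pattern_matches(p, status))
}
```

With the default [499, 599], any 4xx or 5xx response from the main page counts as a failure, while 2xx and 3xx responses do not.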

WebOptions also provides the set_paper_format utility method for setting common paper sizes.
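
To give a feel for what a paper format means in terms of paper_width and paper_height, here is a rough sketch that maps a few formats to inches. Paper is a hypothetical stand-in for the crate's PaperFormat, covering only some of its variants; the dimensions come from ISO 216 and the standard US paper sizes, and the exact values the crate sets may be rounded differently.

```rust
/// Hypothetical stand-in for a subset of the crate's `PaperFormat`.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Paper {
    A3,
    A4,
    A5,
    Letter,
    Legal,
    Tabloid,
    Ledger,
}

/// Returns (width, height) in inches.
/// ISO sizes are defined in millimetres and converted at 25.4 mm per inch.
fn size_in_inches(paper: Paper) -> (f64, f64) {
    const MM_PER_INCH: f64 = 25.4;
    let mm = |w: f64, h: f64| (w / MM_PER_INCH, h / MM_PER_INCH);
    match paper {
        Paper::A3 => mm(297.0, 420.0),
        Paper::A4 => mm(210.0, 297.0),
        Paper::A5 => mm(148.0, 210.0),
        Paper::Letter => (8.5, 11.0),
        Paper::Legal => (8.5, 14.0),
        Paper::Tabloid => (11.0, 17.0),
        Paper::Ledger => (17.0, 11.0),
    }
}
```

Letter matches the default paper_width and paper_height listed in the table, and A4 comes out at roughly 8.27 by 11.69 inches.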

§ScreenshotOptions

Provides control over screenshot generation by the Chrome engine. These options are accepted by the screenshot methods shown in the examples above, such as screenshot_url and screenshot_html.

| Field Name | Description | Default |
|---|---|---|
| trace_id | Unique trace ID for request | Random UUID |
| width | Device screen width in pixels | 800 |
| height | Device screen height in pixels | 600 |
| clip | Clip screenshot to device dimensions | false |
| format | Image format as an ImageFormat | png |
| quality | Compression quality (jpeg only, 0-100) | 100 |
| omit_background | Generate screenshot with transparency | false |
| optimize_for_speed | Optimize image encoding for speed | false |
| wait_delay | Delay before taking screenshot | None |
| wait_for_expression | Wait until this JS expression returns true | None |
| emulated_media_type | Emulated MediaType (“screen” or “print”) | print |
| cookies | Cookies for Chromium | None |
| skip_network_idle_events | Ignore network idle events | true |
| user_agent | Override default User-Agent header | None |
| extra_http_headers | Additional HTTP headers | None |
| fail_on_http_status_codes | HTTP status codes to fail on; a code ending in 99 is a wildcard for its class (499 covers 400 to 499) | [499, 599] |
| fail_on_resource_http_status_codes | Resource HTTP status codes to fail on | None |
| fail_on_resource_loading_failed | Fail if resource loading fails | false |
| fail_on_console_exceptions | Fail on Chromium console exceptions | None |

§DocumentOptions

Provides control over document conversion by the LibreOffice engine. These options are accepted by the pdf_from_doc method shown in the example above.

| Field Name | Description | Default |
|---|---|---|
| trace_id | Unique trace ID for request | Random UUID |
| password | Password for opening the source file | None |
| landscape | Set paper orientation to landscape | false |
| native_page_ranges | PageRange to print, e.g. "1,2,3" or "1-4" | All pages |
| export_form_fields | Export form fields as widgets | true |
| allow_duplicate_field_names | Allow duplicate field names in form fields | false |
| export_bookmarks | Export bookmarks to PDF | true |
| export_bookmarks_to_pdf_destination | Export bookmarks as named destinations | false |
| export_placeholders | Export only the visual markings of placeholder fields | false |
| export_notes | Export notes to PDF | false |
| export_notes_pages | Export notes pages (Impress only) | false |
| export_only_notes_pages | Export only notes pages | false |
| export_notes_in_margin | Export notes in margin | false |
| convert_ooo_target_to_pdf_target | Convert .od[tpgs] links to .pdf | false |
| export_links_relative_fsys | Export file:// links as relative | false |
| export_hidden_slides | Export hidden slides (Impress only) | false |
| skip_empty_pages | Suppress automatically inserted empty pages | false |
| add_original_document_as_stream | Add original document as a stream | false |
| single_page_sheets | Put each sheet on one page | false |
| lossless_image_compression | Use lossless image compression (e.g., PNG) | false |
| quality | JPG export quality (1-100) | 90 |
| reduce_image_resolution | Reduce image resolution | false |
| max_image_resolution | Max resolution in DPI: 75, 150, 300, 600, or 1200 | 300 |
| pdfa | Convert to specific PDF/A PDFFormat | None |
| pdfua | Enable Universal Access compliance | false |
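
The native_page_ranges strings used above ("1,2,3", "1-4") combine single pages and ranges separated by commas. As an illustration of that format, here is a minimal parser sketch. Chunk, parse_chunk and parse_page_ranges are hypothetical names that only mirror the idea behind PageRange and PageRangeChunk; the crate's own types and parsing rules may differ.

```rust
/// Hypothetical mirror of a page range chunk: a single page or a range.
#[derive(Debug, PartialEq)]
enum Chunk {
    Single(u32),
    Range(u32, u32),
}

/// Parses one page number, reporting the offending text on failure.
fn parse_page(s: &str) -> Result<u32, String> {
    s.trim()
        .parse::<u32>()
        .map_err(|_| format!("invalid page number: {:?}", s))
}

/// Parses one comma-separated part, e.g. "7" or "3-5".
fn parse_chunk(part: &str) -> Result<Chunk, String> {
    match part.split_once('-') {
        Some((start, end)) => Ok(Chunk::Range(parse_page(start)?, parse_page(end)?)),
        None => Ok(Chunk::Single(parse_page(part)?)),
    }
}

/// Parses a full specification such as "1,3-5,7".
fn parse_page_ranges(input: &str) -> Result<Vec<Chunk>, String> {
    input.split(',').map(parse_chunk).collect()
}
```

Collecting into a Result means the first malformed part aborts the whole parse, which is usually what you want before sending a request.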

§Cargo Features

§TLS / HTTPS

By default there is no support for HTTPS. If you need TLS, you can enable it by adding one of the following features to your Cargo.toml:

  • rustls-tls - Enables TLS / HTTPS support using the rustls library.
  • native-tls - Enables TLS / HTTPS support using the native system TLS library.

§HTTP/2

By default there is no HTTP/2 support. It can be enabled with the http2 feature. Even with the feature enabled, HTTP/2 will not be selected unless connecting over HTTPS. If you need HTTP/2 over plain HTTP, use Client::new_with_client together with reqwest::ClientBuilder::http2_prior_knowledge.

§Additional features

  • stream - Enables the streaming client to stream generated PDFs directly to disk or other destinations.
  • blocking - Enables the blocking client for use without tokio or another async runtime.
  • zeroize - Enables zeroizing sensitive data in the client. Enabled by default.
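
As a rough illustration, the feature flags listed above might be combined in Cargo.toml like this. The feature names and version are taken from this page; the particular combination is only an example, so pick the flags your deployment actually needs.

```toml
[dependencies]
# Example only: TLS via rustls, HTTP/2, and the streaming client.
gotenberg_pdf = { version = "0.5", features = ["rustls-tls", "http2", "stream"] }
```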

§WebAssembly / Browser Support

This crate compiles to wasm32-unknown-unknown and is runnable in the browser. In the browser, it will use the built-in browser fetch API to make requests to the Gotenberg server. The stream, blocking, rustls-tls and native-tls features are not available on wasm32 or in the browser.

When running in the browser, the Gotenberg server must sit behind a proxy that sets the correct CORS headers (such as Access-Control-Allow-Origin).

Modules§

health
Gotenberg server health status. See Client::health_check.

Structs§

BlockingClient
Gotenberg API blocking client. Available when the blocking feature is enabled.
Bytes
Re-exported from the bytes crate (See bytes::Bytes). A cheaply cloneable and sliceable chunk of contiguous memory.
Client
Gotenberg API client.
Cookie
Cookie to send to the end server.
DocumentOptions
Options for converting a document to a PDF using the LibreOffice engine.
LinearDimention
Linear dimension. Allowed units are mm, cm, in, px, pt, pc; the default unit is in.
PageRange
Represents a set of page ranges, eg "1,3-5,7".
ScreenshotOptions
Options for taking a screenshot of a webpage.
StreamingClient
Gotenberg Streaming API client. Available with the stream feature enabled.
WebOptions
Configuration for rendering PDF from web content using the Chromium engine.

Enums§

Error
Error type for the Gotenberg API.
ImageFormat
Image format to use when taking a screenshot.
MediaType
Media type, either “print” or “screen”.
PDFFormat
Supported PDF binary formats.
PageRangeChunk
Represents a chunk of a page range, either a single page or a range of pages.
PaperFormat
Paper format: A0 to A6, Ledger, Legal, Letter, or Tabloid.
SameSite
The SameSite cookie attribute.
Unit
Unit of the linear dimension, for example mm, cm, in, px, pt, pc.
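
To relate these units to the inch-based defaults shown in the option tables, the sketch below converts a value to inches. It assumes the CSS absolute-length ratios (96 px, 72 pt and 6 pc per inch), which is an assumption about how the rendering engine interprets them. SketchUnit and to_inches are hypothetical and do not reproduce the crate's Unit enum.

```rust
/// Hypothetical mirror of the unit list above; variant names are assumed.
#[derive(Clone, Copy, Debug, PartialEq)]
enum SketchUnit {
    Mm,
    Cm,
    In,
    Px,
    Pt,
    Pc,
}

/// Converts `value` in `unit` to inches using CSS absolute-length ratios.
fn to_inches(value: f64, unit: SketchUnit) -> f64 {
    match unit {
        SketchUnit::Mm => value / 25.4,
        SketchUnit::Cm => value / 2.54,
        SketchUnit::In => value,
        SketchUnit::Px => value / 96.0,
        SketchUnit::Pt => value / 72.0,
        SketchUnit::Pc => value / 6.0,
    }
}
```

Under these ratios, the default 0.39 inch margin corresponds to about 10 mm.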