# printpdf.js WASM API
This document describes the **JavaScript** API exposed by the `printpdf.js` WebAssembly module,
which wraps the Rust code in `wasm.rs`. These functions allow you to:
1. Generate a PDF document from HTML (`Pdf_HtmlToDocument`)
2. Parse an existing PDF document from PDF bytes (`Pdf_BytesToDocument`)
3. Extract resource IDs (images, fonts, layers) from a PDF page (`Pdf_ResourcesForPage`)
4. Convert a PDF page into an SVG string (`Pdf_PageToSvg`)
5. Save a PDF document back into PDF bytes (`Pdf_DocumentToBytes`)
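All functions take a stringified JS object as input and return a stringified JS object as output.
The output, when converted back to a JS object, has the following shape:
```ts
{
    status: number, // 0 = okay, non-zero = error
    data: any       // result payload on success, error details on failure
}
```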
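Because every call returns the same `status` / `data` envelope, the status check can be factored into one place. The following is a minimal sketch: `unwrapResult` is a hypothetical helper name, not part of `printpdf.js`, and it assumes (as the examples in this document do) that `data` carries the error details whenever `status` is non-zero.

```js
// Hypothetical helper (not part of printpdf.js): parse the JSON string
// returned by any Pdf_* function and return its `data`, throwing on error.
function unwrapResult(outputJson) {
  const result = JSON.parse(outputJson);
  if (result.status !== 0) {
    // Assumption: on failure, `data` holds the error details.
    const details = typeof result.data === "string"
      ? result.data
      : JSON.stringify(result.data);
    throw new Error(`printpdf error (status ${result.status}): ${details}`);
  }
  return result.data;
}
```

With such a helper, a call could read `const { doc } = unwrapResult(await Pdf_HtmlToDocument(JSON.stringify(input)));`.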
**Enum fields** in the underlying Rust code are **always renamed** to `kebab-case` in JSON.
Tagged enums are tagged with `type` for the variant and `data` for the payload.
> `BuiltinFont::TimesRoman` becomes `"times-roman"`.
>
> `MyEnum::VariantType { data }` becomes `{ type: "variant-type", data: ... }`.
**Struct fields** in Rust are **always renamed** to `camelCase` when serialized to JSON.
> `foo.page_height` in Rust becomes `foo.pageHeight` in JSON
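To make the two renaming rules concrete, the sketch below mimics them in plain JavaScript. Both function names are hypothetical and exist only for illustration. The conversion is deliberately simplified: it handles ASCII identifiers without runs of consecutive capitals and may differ from the Rust side in edge cases.

```js
// Illustrative only, not part of printpdf.js.

// Enum variants: "TimesRoman" -> "times-roman"
function toKebabCase(pascalCase) {
  return pascalCase
    .replace(/([a-z0-9])([A-Z])/g, "$1-$2") // insert "-" at lower/upper boundaries
    .toLowerCase();
}

// Struct fields: "page_height" -> "pageHeight"
function toCamelCase(snakeCase) {
  return snakeCase.replace(/_([a-z0-9])/g, (_, ch) => ch.toUpperCase());
}

console.log(toKebabCase("TimesRoman"));  // times-roman
console.log(toCamelCase("page_height")); // pageHeight
```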
**Note:** Every async API function documented here also has a synchronous counterpart with the same name plus a `Sync` suffix (e.g., `Pdf_HtmlToDocumentSync`). The synchronous versions take the same input and produce the same output, but execute synchronously.
## Initialization
Before calling any of the functions below, **make sure** you have initialized the WASM module.
```js
// printpdf_bg.wasm needs to be in the same directory as printpdf.js
import init, {
Pdf_HtmlToDocument,
Pdf_BytesToDocument,
Pdf_ResourcesForPage,
Pdf_PageToSvg,
Pdf_DocumentToBytes,
} from './pkg/printpdf.js';
async function main() {
// Initialize the WASM
await init();
// Now we can safely call our PDF functions
// ...
}
main().catch(console.error);
```
## Pdf_HtmlToDocument
Generates a **new PDF document** from given HTML, optional images,
optional fonts, and page generation options.
```ts
interface PdfHtmlToDocumentInput {
    html: string;                       // Required: the source HTML to convert
    title?: string;                     // Title of the PDF document
    images?: Record<string, string>;    // filename => base64-encoded image data
    fonts?: Record<string, string>;     // filename => base64-encoded font data
    options?: {                         // PDF generation options
        fontEmbedding?: boolean;        // default true
        pageWidth?: number;             // in mm; default 210
        pageHeight?: number;            // in mm; default 297
        imageOptimization?: {
            quality?: number;           // e.g. 0.75 for 75% image quality, null/undefined to disable
            maxImageSize?: string;      // max size like "300kb"
            ditherGreyscale?: boolean;  // apply dithering to greyscale images
            autoOptimize?: boolean;     // auto-optimize images (remove unused alpha, etc.)
            format?: string;            // preferred format: "auto", "jpeg", "flate", etc.
        };
    };
}
```
```json5
{
    "status": 0,
    "data": {
        "doc": { // the PdfDocument object
            "metadata": { /* ... */ },
            "resources": { /* ... */ },
            "bookmarks": { /* ... */ },
            "pages": [
                {
                    "mediaBox": { /* ... */ },
                    "trimBox": { /* ... */ },
                    "cropBox": { /* ... */ },
                    "ops": [ /* ... */ ]
                }
                // ...
            ]
        }
    }
}
```
### Example: Generating PDF from HTML
```js
const inputObject = {
title: "My PDF!",
html: "<!doctype html><html><body><h1>Hello World!</h1></body></html>",
// Suppose we have a base64 version of 'dog.png' we want to embed:
images: {
"dog.png": "..."
},
fonts: {
"f1.woff2": "data:font/woff2;base64,..." // supports: ttf, otf, woff, woff2
},
options: {
pageWidth: 210, // mm
pageHeight: 297 // mm
}
};
const inputJson = JSON.stringify(inputObject);
const outputJson = await Pdf_HtmlToDocument(inputJson);
const result = JSON.parse(outputJson);
if (result.status === 0) {
// result.data.doc is the PdfDocument object
console.log("PDF document:", result.data.doc);
} else {
console.error("Error generating PDF:", result.data);
}
```
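The example above elides how the base64 strings for `images` and `fonts` are produced. One way to do it, given the raw file contents as a `Uint8Array`, is sketched below. `bytesToBase64` is a hypothetical helper name; it relies on the global `btoa()`, which is available in browsers and in recent Node.js versions.

```js
// Hypothetical helper (not part of printpdf.js): encode raw bytes as base64,
// e.g. for the values of the `images` / `fonts` maps.
function bytesToBase64(bytes) {
  const CHUNK = 0x8000; // keep String.fromCharCode argument lists small
  let binary = "";
  for (let i = 0; i < bytes.length; i += CHUNK) {
    // Build a "binary string" with one character per byte
    binary += String.fromCharCode.apply(null, bytes.subarray(i, i + CHUNK));
  }
  return btoa(binary);
}
```

In a browser, the bytes could come from something like `new Uint8Array(await (await fetch(url)).arrayBuffer())`, where `url` points at your own image or font file.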
## Pdf_BytesToDocument
Parses an **existing PDF** from a **Base64-encoded** byte string.
Outputs a structured JSON representation of the PDF and any warnings that occurred.
```ts
interface PdfBytesToDocumentInput {
bytes: string; // Base64-encoded PDF bytes (can be raw or with data:application/pdf;base64, prefix)
options?: {
failOnError?: boolean; // default false; if true, parse errors become fatal
}
}
```
```json5
{
"status": 0,
"data": {
"doc": { /* ... */ }, // The PdfDocument JSON
"warnings": [ /* ... */ ] // Array of any PDF parse warnings
}
}
```
### Example: Parsing a PDF file
```js
const myPdfBase64 = "data:application/pdf;base64,JVBEAwIG9C9M...";
const parseInput = {
    bytes: myPdfBase64,
    options: { failOnError: false }
};
const parseOutputJson = await Pdf_BytesToDocument(JSON.stringify(parseInput));
const parseResult = JSON.parse(parseOutputJson);
if (parseResult.status === 0) {
    // parseResult.data.doc is the PdfDocument object
    const pdfDoc = parseResult.data.doc;
    console.log("PDF parsed!", pdfDoc);
    console.log("Warnings:", parseResult.data.warnings);
} else {
    console.error("Failed to parse PDF:", parseResult.data);
}
```
## Pdf_ResourcesForPage
Given a **single PDF page** (in JSON), returns the **xobject IDs**, **font IDs**, and
**layer IDs** that the page references. This is especially useful if you want to render
or further process only the resources a page actually uses, because all fonts and images
are encoded to, and decoded from, base64 whenever JSON passes through the Rust side.
Without this function, every font in the entire PDF would have to be re-decoded to
render each page, even if the page does not use that font.
```ts
interface PdfResourcesForPageInput {
page: PdfPage; // single element from `pdfDocument.pages[index]`
}
```
```json5
{
"status": 0,
"data": {
"xobjects": [ "X001", "X012", "X10", ... ],
"fonts": [ "F1", "F3", "F4", ... ],
"layers": [ "Page1-Layer0138", ... ]
}
}
```
### Example: Extracting resource IDs used on a page
```js
const inputObj = {
page: pdfDocument.pages[0]
};
const resourcesJson = await Pdf_ResourcesForPage(JSON.stringify(inputObj));
const resourcesResult = JSON.parse(resourcesJson);
if (resourcesResult.status === 0) {
console.log("Resources for this page:", resourcesResult.data);
// data.xobjects, data.fonts, data.layers
} else {
console.error("Error getting resources:", resourcesResult.data);
}
```
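Once the IDs are known, a smaller resources object can be assembled by copying only the referenced entries. The sketch below is a generic building block rather than a complete solution: `pickEntries` is a hypothetical helper, and it assumes the resource maps are plain objects keyed by ID. Where exactly those maps sit inside `pdfDocument.resources` is documented in STRUCTS.md and is not assumed here.

```js
// Hypothetical helper (not part of printpdf.js): copy only the entries
// whose keys appear in `ids` from an ID-keyed object.
function pickEntries(map, ids) {
  const picked = {};
  for (const id of ids) {
    // Skip IDs that are not present instead of inserting `undefined`
    if (Object.prototype.hasOwnProperty.call(map, id)) {
      picked[id] = map[id];
    }
  }
  return picked;
}
```

For example, `pickEntries(allFontsById, resourcesResult.data.fonts)` would keep only the fonts the page references, with `allFontsById` standing in for whichever map in the document holds the fonts.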
## Pdf_PageToSvg
Converts a **PDF page** into **SVG** for rendering or preview. The function
also needs the **document resources** that page relies on and optional conversion options.
```ts
interface PdfPageToSvgInput {
page: PdfPage; // The PDF page you want to render as SVG
resources?: PdfResources; // The subset (or entire) PDF resources
options?: PdfToSvgOptions;
}
interface PdfToSvgOptions {
// The image formats you prefer in the SVG `<image xlink:href="data:image/...;base64,">`
// tags, in order of preference. Depends on what image features the library was compiled with.
imageFormats?: Array<"png"|"jpeg"|"gif"|"webp"|"pnm"|"tiff"|"tga"|"bmp"|"avif">;
}
```
```json5
{
"status": 0,
"data": {
"svg": "<svg ...> ... </svg>" // raw SVG string
}
}
```
### Example: Rendering a PDF page as SVG
```js
// Determine which resources this page needs
const page = pdfDocument.pages[0];
const pageResourcesRequest = { page: page };
const resourcesJson = await Pdf_ResourcesForPage(JSON.stringify(pageResourcesRequest));
const resourcesResult = JSON.parse(resourcesJson);
let pageResources = pdfDocument.resources;
if (resourcesResult.status === 0) {
    // If needed, you could copy only the IDs listed in resourcesResult.data
    // from pdfDocument.resources into a new object. For this example,
    // we'll just use the full resources:
    pageResources = pdfDocument.resources;
}
const svgRequest = {
    page: page,
    resources: pageResources,
    options: {
        imageFormats: ["png", "jpeg"]
    }
};
const svgOutputJson = await Pdf_PageToSvg(JSON.stringify(svgRequest));
const svgResult = JSON.parse(svgOutputJson);
if (svgResult.status === 0) {
    // Insert the SVG string into the DOM. Don't forget to set `display: block` on the parent!
    document.getElementById("mySvgContainer").innerHTML = svgResult.data.svg;
} else {
    console.error("SVG conversion error:", svgResult.data);
}
```
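If injecting markup through `innerHTML` is undesirable, the SVG string can instead be displayed via an `<img>` element. A minimal sketch, with `svgToDataUrl` as a hypothetical helper name; it uses only the standard `encodeURIComponent`.

```js
// Hypothetical helper (not part of printpdf.js): wrap an SVG string in a
// data: URL that can be assigned to an <img> element's `src`.
function svgToDataUrl(svg) {
  // Percent-encode so characters such as "#", "<" and quotes survive in the URL
  return "data:image/svg+xml;charset=utf-8," + encodeURIComponent(svg);
}
```

Usage would look like `imgElement.src = svgToDataUrl(svgResult.data.svg);`. Browsers do not execute scripts inside an SVG that is loaded as an image, which can be a reason to prefer this route for previews of untrusted PDFs.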
## Pdf_DocumentToBytes
Takes a **`PdfDocument`** (the JSON structure) plus optional save options,
and **serializes** it into a **Base64**-encoded PDF.
```ts
interface PdfDocumentToBytesInput {
doc: PdfDocument; // The PdfDocument object you want to export
options?: {
optimize?: boolean; // default true, compress/prune unreferenced objects
subsetFonts?: boolean; // default true, subsets embedded fonts
secure?: boolean; // default true, skip unknown PDF ops if encountered
}
}
```
```json5
{
"status": 0,
"data": {
"bytes": "<base64-encoded PDF bytes>" // use atob(data.bytes)
}
}
```
### Example: Saving a PdfDocument back into PDF bytes
```js
const inputObj = {
doc: pdfDocument,
options: {
optimize: true,
subsetFonts: true,
secure: true
}
};
const inputJson = JSON.stringify(inputObj);
const outputJson = await Pdf_DocumentToBytes(inputJson);
const outputResult = JSON.parse(outputJson);
if (outputResult.status === 0) {
// outputResult.data.bytes is the PDF in base64 form
const base64Pdf = outputResult.data.bytes;
const pdfBytes = atob(base64Pdf); // decode base64
const pdfBuffer = new Uint8Array(pdfBytes.length);
for (let i = 0; i < pdfBytes.length; i++) {
pdfBuffer[i] = pdfBytes.charCodeAt(i);
}
const blob = new Blob([pdfBuffer], { type: 'application/pdf' });
const url = URL.createObjectURL(blob);
// Trigger a download
const link = document.createElement('a');
link.href = url;
link.download = "my_exported.pdf";
link.click();
URL.revokeObjectURL(url);
console.log("PDF exported successfully!");
} else {
console.error("PDF export error:", outputResult.data);
}
```
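The decoding loop in the example above can also be packaged as a small reusable function with a basic sanity check. This is a sketch under stated assumptions: `decodePdfBase64` is a hypothetical helper name, and the check relies on PDF files starting with the `%PDF-` header, which is true of typical PDF output but is not something this API documents.

```js
// Hypothetical helper (not part of printpdf.js): turn the base64 `bytes`
// field into a Uint8Array and verify it starts with the "%PDF-" header.
function decodePdfBase64(base64) {
  // Tolerate an optional data-URL prefix such as "data:application/pdf;base64,"
  const comma = base64.startsWith("data:") ? base64.indexOf(",") : -1;
  const binary = atob(base64.slice(comma + 1));
  // atob() yields one character per byte, so char codes are the byte values
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  const header = String.fromCharCode(...bytes.subarray(0, 5));
  if (header !== "%PDF-") {
    throw new Error("decoded data does not start with a PDF header");
  }
  return bytes;
}
```

The returned array can be passed straight to `new Blob([bytes], { type: 'application/pdf' })` in a browser, or to `fs.writeFileSync` in Node.js.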
## Datastructures
Many advanced fields appear in the `pdfDocument` JSON (fonts, xobjects, layers, color definitions, etc.).
For most basic use cases, you only need to manipulate the top-level `pages`, or embed images/fonts. If you need
to dig deeper, the datastructures are documented in the [/STRUCTS.md](/STRUCTS.md) file.
Enjoy creating, parsing, and manipulating PDFs with `printpdf.js`!