Crate printpdf

printpdf is a library designed for creating printable PDF documents.

Crates.io | Documentation

[dependencies]
printpdf = "0.5.0"

§Features

Currently, printpdf can only write documents, not read them.

  • Page generation
  • Layers (Illustrator like layers)
  • Graphics (lines, shapes, bezier curves)
  • Images (currently BMP/PNG/JPG only or generate your own images)
  • Embedded fonts (TTF and OTF) with Unicode support
  • Advanced graphics - overprint control, blending modes, etc.
  • Advanced typography - character scaling, character spacing, superscript, subscript, outlining, etc.
  • PDF layers (you should be able to open the PDF in Illustrator and have the layers appear)

§Getting started

§Writing PDF

§Simple page

use printpdf::*;
use std::fs::File;
use std::io::BufWriter;

let (doc, page1, layer1) = PdfDocument::new("PDF_Document_title", Mm(247.0), Mm(210.0), "Layer 1");
let (page2, layer1) = doc.add_page(Mm(10.0), Mm(250.0),"Page 2, Layer 1");

doc.save(&mut BufWriter::new(File::create("test_working.pdf").unwrap())).unwrap();

§Adding graphical shapes

use printpdf::*;
use printpdf::path::{PaintMode, WindingOrder};
use std::fs::File;
use std::io::BufWriter;
use std::iter::FromIterator;

let (doc, page1, layer1) = PdfDocument::new("printpdf graphics test", Mm(297.0), Mm(210.0), "Layer 1");
let current_layer = doc.get_page(page1).get_layer(layer1);

// Quadrilateral shape. The "false" determines whether the next (following)
// point is a bezier handle (for curves)
//
// If you want holes, use WindingOrder::EvenOdd
let points1 = vec![(Point::new(Mm(100.0), Mm(100.0)), false),
                   (Point::new(Mm(100.0), Mm(200.0)), false),
                   (Point::new(Mm(300.0), Mm(200.0)), false),
                   (Point::new(Mm(300.0), Mm(100.0)), false)];

let line1 = Polygon {
    rings: vec![points1],
    mode: PaintMode::FillStroke,
    winding_order: WindingOrder::NonZero,
};

let fill_color = Color::Cmyk(Cmyk::new(0.0, 0.23, 0.0, 0.0, None));
let outline_color = Color::Rgb(Rgb::new(0.75, 1.0, 0.64, None));
let mut dash_pattern = LineDashPattern::default();
dash_pattern.dash_1 = Some(20);

current_layer.set_fill_color(fill_color);
current_layer.set_outline_color(outline_color);
current_layer.set_outline_thickness(10.0);

// Draw first line
current_layer.add_polygon(line1);

let fill_color_2 = Color::Cmyk(Cmyk::new(0.0, 0.0, 0.0, 0.0, None));
let outline_color_2 = Color::Greyscale(Greyscale::new(0.45, None));

// More advanced graphical options
current_layer.set_overprint_stroke(true);
current_layer.set_blend_mode(BlendMode::Seperable(SeperableBlendMode::Multiply));
current_layer.set_line_dash_pattern(dash_pattern);
current_layer.set_line_cap_style(LineCapStyle::Round);

current_layer.set_fill_color(fill_color_2);
current_layer.set_outline_color(outline_color_2);
current_layer.set_outline_thickness(15.0);

// Triangle shape
let line2 = Line::from_iter(vec![
    (Point::new(Mm(150.0), Mm(150.0)), false),
    (Point::new(Mm(150.0), Mm(250.0)), false),
    (Point::new(Mm(350.0), Mm(250.0)), false)]);

// draw second line
current_layer.add_line(line2);

§Adding images

Note: Images only get compressed in release mode. You might get huge PDFs (6 or more MB) in debug mode. In release mode, the compression makes these files much smaller (~ 100 - 200 KB).

To make this process faster, use BufReader instead of directly reading from the file. Images are currently not a top priority.

Scaling of images is implicitly done to fit one pixel = one dot at 300 dpi.
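
As a rough illustration of that rule, the sketch below converts a pixel count into a printed size, assuming one pixel is rendered as one dot at the given resolution. The helper `px_to_mm` is hypothetical and not part of printpdf; it only shows the arithmetic (25.4 mm per inch).

```rust
/// Hypothetical helper (not part of printpdf): printed size in millimetres
/// of `px` pixels when one pixel is rendered as one dot at `dpi`.
fn px_to_mm(px: u32, dpi: f64) -> f64 {
    // pixels / (dots per inch) = inches; 1 inch = 25.4 mm
    f64::from(px) / dpi * 25.4
}
```

Under this assumption, a 300 px wide image comes out 25.4 mm (one inch) wide at 300 dpi.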

// Compile with --features="embedded_images"
extern crate printpdf;

// imports the `image` library with the exact version that we are using
use printpdf::*;

use std::convert::From;
use std::convert::TryFrom;
use std::fs::File;

fn main() {
    let (doc, page1, layer1) = PdfDocument::new("PDF_Document_title", Mm(247.0), Mm(210.0), "Layer 1");
    let current_layer = doc.get_page(page1).get_layer(layer1);

    // currently, the only reliable file formats are bmp/jpeg/png
    // this is an issue of the image library, not a fault of printpdf
    let mut image_file = File::open("assets/img/BMP_test.bmp").unwrap();
    let image = Image::try_from(image_crate::codecs::bmp::BmpDecoder::new(&mut image_file).unwrap()).unwrap();

    // translate x, translate y, rotate, scale x, scale y
    // by default, an image is optimized to 300 DPI (if scale is None)
    // rotations and translations are always in relation to the lower left corner
    image.add_to_layer(current_layer.clone(), ImageTransform::default());

    // you can also construct images manually from your data:
    let image_file_2 = ImageXObject {
        width: Px(200),
        height: Px(200),
        color_space: ColorSpace::Greyscale,
        bits_per_component: ColorBits::Bit8,
        interpolate: true,
        /* put your bytes here. Make sure the total number of bytes =
           width * height * (bytes per component * number of components)
           (e.g. 2 (bytes) x 3 (colors) for RGB 16bit) */
        image_data: Vec::new(),
        image_filter: None, /* does not work yet */
        clipping_bbox: None, /* doesn't work either, untested */
    };

    let image2 = Image::from(image_file_2);
}
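
The byte-count rule from the comment on `image_data` can be checked before the image is constructed. The following is a minimal sketch using only the standard library; `expected_image_bytes` is a hypothetical helper, not a printpdf function, and it assumes the bit depth is a whole number of bytes (8 or 16).

```rust
/// Hypothetical helper (not part of printpdf): number of bytes that
/// `image_data` should hold for an uncompressed image.
/// Assumes `bits_per_component` is a whole number of bytes (8 or 16).
fn expected_image_bytes(
    width: usize,
    height: usize,
    bits_per_component: usize,
    components: usize,
) -> Option<usize> {
    if bits_per_component == 0 || bits_per_component % 8 != 0 {
        // Sub-byte depths are not covered by this sketch.
        return None;
    }
    // bytes per pixel = bytes per component * number of components
    let bytes_per_pixel = (bits_per_component / 8) * components;
    // checked_mul guards against overflow for very large images
    width.checked_mul(height)?.checked_mul(bytes_per_pixel)
}
```

For the 200 x 200 px, 8-bit greyscale object above this gives 40,000 bytes, so the empty `Vec::new()` in the example is only a placeholder for real pixel data.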

§Adding fonts

Note: Fonts are shared between pages. This means that they are added to the document first and then a reference to this one object can be passed to multiple pages. This is different to images, for example, which can only be used once on the page they are created on (since that’s the most common use-case).

use printpdf::*;
use std::fs::File;

let (doc, page1, layer1) = PdfDocument::new("PDF_Document_title", Mm(247.0), Mm(210.0), "Layer 1");
let current_layer = doc.get_page(page1).get_layer(layer1);

let text = "Lorem ipsum";
let text2 = "unicode: стуфхfцчшщъыьэюя";

let font = doc.add_external_font(File::open("assets/fonts/RobotoMedium.ttf").unwrap()).unwrap();
let font2 = doc.add_external_font(File::open("assets/fonts/RobotoMedium.ttf").unwrap()).unwrap();

// text, font size, x from left edge, y from bottom edge, font
current_layer.use_text(text, 48.0, Mm(200.0), Mm(200.0), &font);

// For more complex layout of text, you can use functions
// defined on the PdfLayerReference
// Make sure to wrap your commands
// in a `begin_text_section()` and `end_text_section()` wrapper
current_layer.begin_text_section();

    // setup the general fonts.
    // see the docs for these functions for details
    current_layer.set_font(&font2, 33.0);
    current_layer.set_text_cursor(Mm(10.0), Mm(10.0));
    current_layer.set_line_height(33.0);
    current_layer.set_word_spacing(3000.0);
    current_layer.set_character_spacing(10.0);
    current_layer.set_text_rendering_mode(TextRenderingMode::Stroke);

    // write two lines (one line break)
    current_layer.write_text(text.clone(), &font2);
    current_layer.add_line_break();
    current_layer.write_text(text2.clone(), &font2);
    current_layer.add_line_break();

    // write one line, but write text2 in superscript
    current_layer.write_text(text.clone(), &font2);
    current_layer.set_line_offset(10.0);
    current_layer.write_text(text2.clone(), &font2);

current_layer.end_text_section();
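
Because text positions are measured from the left and bottom edges of the page, layouts that are planned top-down need their y values flipped. A minimal sketch of that conversion follows; `y_from_top` is a hypothetical helper, not part of printpdf, and it assumes the second `Mm` argument of `PdfDocument::new` is the page height.

```rust
/// Hypothetical helper (not part of printpdf): convert a distance measured
/// from the top edge of the page into a y coordinate measured from the
/// bottom edge. Both values are in the same unit, e.g. millimetres.
fn y_from_top(page_height: f64, distance_from_top: f64) -> f64 {
    page_height - distance_from_top
}
```

Under that assumption, on the 210 mm high page from the example a position 10 mm below the top edge corresponds to the `Mm(200.0)` passed to `use_text`.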

§Changelog

See the CHANGELOG.md file.

§Further reading

The PdfDocument is hidden behind a PdfDocumentReference, which restricts what you can do behind a facade. Pretty much all functions operate on a PdfLayerReference, so that is the place to look for existing functions or to implement new ones. The PdfDocumentReference is a reference-counted document. It uses the pages and layers for inner mutability, because I ran into borrowing issues with the document. IMPORTANT: All functions that mutate the state of the document “borrow” the document mutably for the duration of the function. It is important that you don’t borrow the document twice (your program will crash if you do so). I have prevented this wherever possible by making the document public only within the crate, so you cannot lock it from outside of this library.
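
The double-borrow hazard can be reproduced with the standard library alone. The sketch below assumes the shared state behaves like an `Rc<RefCell<...>>`; that is an assumption made for illustration, and the `Vec<String>` is a stand-in, not printpdf's real document type.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Returns (refused_while_held, allowed_after_release) for a second
/// mutable borrow of a reference-counted, interior-mutable value.
fn borrow_demo() -> (bool, bool) {
    // Stand-in for a shared document (hypothetical, not printpdf's type).
    let doc = Rc::new(RefCell::new(Vec::<String>::new()));

    let mut first = doc.borrow_mut();
    first.push("Page 1".to_string());

    // `borrow_mut()` would panic here because `first` is still alive;
    // `try_borrow_mut()` reports the conflict as an error instead.
    let refused_while_held = doc.try_borrow_mut().is_err();

    drop(first); // release the first borrow

    let allowed_after_release = doc.try_borrow_mut().is_ok();
    (refused_while_held, allowed_after_release)
}
```

Keeping each mutable borrow confined to a single function call, as described above, is what avoids the first case.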

Images have to be added to the page’s resources before using them. This means you can only use an image on the page that you added it to. Otherwise, you may end up with a corrupt PDF.

Fonts are embedded using freetype. There is a rusttype branch in this repository, but rusttype fails to get the height of an unscaled font correctly, which is why you currently have to use freetype.

Please report issues if you have any, especially if you see BorrowMut errors (they should not happen). Kerning is currently not done, because neither freetype nor rusttype can reliably read kerning data. However, “correct” kerning / placement requires a full font shaping engine, etc. This would be a completely different project.

For learning how a PDF is actually made, please read the wiki (currently not completely finished). When I began making this library, these resources were not available anywhere, so I hope to help other people with these topics. Reading the wiki is essential if you want to contribute to this library.

§Goals and Roadmap

The goal of printpdf is to be a general-use PDF library, similar to libharu. PDFs generated by printpdf should always adhere to a PDF standard, unless you turn that off. Currently, only the standard PDF/X-3:2002 is covered (i.e. valid PDF according to Adobe Acrobat). Over time, there will be more standards supported. Checking a PDF for errors is currently only a stub.

§Planned features / Not done yet

The following features aren’t implemented yet:

  • Clipping
  • Aligning / layouting text
  • Open Prepress Interface
  • Halftoning images, Gradients, Patterns
  • SVG / instantiated content
  • Forms, annotations
  • Bookmarks / Table of contents
  • Conformance / error checking for various PDF standards
  • Embedded JavaScript
  • Reading PDF
  • Completion of printpdf wiki

§Testing

Currently, testing is pretty much non-existent, because PDF is very hard to test. This should change over time. Testing should be done in two stages: first, test the individual PDF objects to check that the conversion into a PDF object is done correctly; second, inspect the PDF objects manually via Adobe Preflight.

Put the tests of the first stage in /tests/mod.rs. The second-stage tests are better handled inside the plugins’ mod.rs file. printpdf depends heavily on lopdf, so you can test your object either against a real type or against a debug string of your serialized type. Either way is fine; you just have to check that the test object conforms to what PDF expects.

Here are some resources I found while working on this library:

PDFXPlorer, shows the DOM tree of a PDF, needs .NET 2.0

Official PDF 1.7 reference

[GERMAN] How to embed unicode fonts in PDF

PDF X/1-a Validator

PDF X/3 technical notes

Re-exports§

  • pub extern crate image as image_crate;
  • pub extern crate log;
  • pub use lopdf;

Modules§

  • Color module (CMYK or RGB). Shared between 2D and 3D module.
  • Current transformation matrix, for transforming shapes (rotate, translate, scale)
  • Info dictionary of a PDF document
  • Errors for printpdf
  • Extended graphics state, for advanced graphical operation (overprint, black point control, etc.)
  • Embedding fonts in 2D for Pdf
  • ICC profile that can be embedded into a PDF
  • Abstraction class for images. Please use this class instead of adding ImageXObjects yourself
  • These indices are for library internal use only. Use the add_* functions to get an index instead.
  • Utilities to work with path objects.
  • Module regulating the comparison and feature sets / allowed plugins of a PDF document
  • A PDFDocument represents the whole content of the file
  • PDF layer management. Layers can contain referenced or real content.
  • Wrapper type for shared metadata between XMP Metadata and the DocumentInfo dictionary
  • PDF page management
  • Utilities for rectangle paths.
  • Scaling types for reducing errors between conversions between point (pt) and millimeter (mm)
  • Abstraction class for images. Please use this class instead of adding ImageXObjects yourself
  • Utility / convenience functions for commonly used graphical shapes
  • Stub plugin for XMP Metadata streams, to be expanded later

Structs§

  • CMYK color
  • Allows building custom conformance profiles. This is useful if you want very small documents for example and you don’t need conformance with any PDF standard, you just want a PDF file.
  • Direct reference (wrapper for lopdf::Object::Reference) for increased type safety
  • “Info” dictionary of a PDF document. Actual data is contained in DocumentMetadata, to keep it in sync with the XmpMetadata (if the timestamps / settings are not in sync, Preflight will complain)
  • ExtGState dictionary
  • List of many ExtendedGraphicsState
  • A reference to the graphics state, for reusing the graphics state during a stream without adding new graphics states all the time
  • Index of a font
  • Font list for tracking fonts within a single PDF document
  • The unscaled base metrics for a font provided by a FontData implementation.
  • THIS IS NOT A PDF FORM! A form XObject can be nearly everything. PDF allows you to reuse content for the graphics stream in a FormXObject. A FormXObject is basically a layer-like content stream and can contain anything as long as it’s a valid stream. A FormXObject is intended to be used for repeated content on one page.
  • The metrics for a glyph provided by a FontData implementation.
  • Greyscale color
  • Icc profile
  • Named reference for an ICC profile
  • Image - wrapper around an ImageXObject to allow for more control within the library
  • Transform that is applied immediately before the image gets painted. Does not affect anything other than the image.
  • Named reference to an image
  • Indexed reference to a font that was added to the document. This is a “reference by postscript name”
  • Line dash pattern is made up of a total width
  • Named reference to a LinkAnnotation
  • Scale in millimeter
  • Optional content group, for PDF layers. Only available in PDF 1.4 but (I think) lower versions of PDF allow this, too. Used to create Adobe Illustrator-like layers in PDF
  • STUB
  • Named reference to a pattern
  • Index of the arbitrary content data
  • PDF document
  • Marker struct for a document. Used to make the API a bit nicer. It simply calls PdfDocument functions.
  • One layer of PDF data
  • Index of the layer on the nth page
  • A “reference” to the current layer, allows for inner mutability but only inside this library
  • This is a wrapper in order to keep shared data between the document’s XMP metadata and the “Info” dictionary in sync
  • PDF page
  • Index of the page (0-based)
  • A “reference” to the current page, allows for inner mutability but only inside this library
  • Struct for storing the PDF Resources, to be used on a PDF page
  • TODO, very low priority
  • Scale in point
  • Scale in pixels
  • A helper struct to insert rectangular shapes into a PDF.
  • PDF 1.4 and higher. Contains a PDF file to be embedded in the current PDF
  • RGB color
  • SMask dictionary. A soft mask (or SMask) is a greyscale image that is used to mask another image
  • A soft mask is used for transparent images such as PNG with an alpha component. The bytes range from 0xFF (opaque) to 0x00 (transparent). The alpha channel of a PNG image has to be sorted out. Can also be used for vignettes, etc. Beware of color spaces! See PDF Reference Page 545 - Soft masks
  • Spot color. Spot colors are like Cmyk, but without color space. They are essentially “named” colors from specific vendors; currently they are the same as a CMYK color.
  • SVG - wrapper around an XObject to allow for more control within the library.
  • Index of a svg file
  • Transform that is applied immediately before the image gets painted. Does not affect anything other than the image.
  • List of XObjects
  • Named reference to an XObject
  • Initial struct for Xmp metadata. This should be expanded later for XML handling, etc. Right now it just fills out the necessary fields

Enums§

  • Black generation calculates the amount of black to be used when trying to reproduce a particular color.
  • Standard built-in PDF fonts
  • Wrapper for Rgb, Cmyk and other color types
  • How many bits does a color have?
  • Color space (enum for marking the number of bits a color has)
  • PDF “current transformation matrix”. Once set, will operate on all following shapes, until the layer.restore_graphics_state() is called. It is important to call layer.save_graphics_state() earlier.
  • The font
  • In PDF 1.2, the graphics state includes a current halftone parameter, which determines the halftoning process to be used by the painting operators. It may be defined by either a dictionary or a stream, depending on the type of halftone; the term halftone dictionary is used generically throughout this section to refer to either a dictionary object or the dictionary portion of a stream object. (The halftones that are defined by streams are specifically identified as such in the descriptions of particular halftone types; unless otherwise stated, they are understood to be defined by simple dictionaries instead.) Deserialized into Integer: 1, 5, 6, 10 or 16
  • Type of the icc profile
  • Describes the format the image bytes are compressed with.
  • See PDF Reference (Page 216) - Line cap (ending) style
  • See PDF Reference Page 216 - Line join style
  • Since the nonseparable blend modes consider all color components in combination, their computation depends on the blending color space in which the components are interpreted. They may be applied to all multiple-component color spaces that are allowed as blending color spaces (see Section 7.2.3, “Blending Color Space”).
  • Intent to use for the optional content groups
  • (PDF 1.3) A code specifying whether a color component value of 0 in a DeviceCMYK color space should erase that component (EraseUnderlying) or leave it unchanged (KeepUnderlying) when overprinting (see Section 4.5.6, “Overprint Control”). Initial value: EraseUnderlying
  • Tuple for differentiating outline and fill colors
  • List of (relevant) PDF versions. Please note the difference between PDF/A (archiving), PDF/UA (universal accessibility), PDF/X (printing), PDF/E (engineering / CAD), PDF/VT (large volume transactions with repeated content)
  • Although CIE-based color specifications are theoretically device-independent, they are subject to practical limitations in the color reproduction capabilities of the output device. Such limitations may sometimes require compromises to be made among various properties of a color specification when rendering colors for a given device. Specifying a rendering intent (PDF 1.1) allows a PDF file to set priorities regarding which of these properties to preserve and which to sacrifice.
  • PDF Reference 1.7, Page 520, Table 7.2: Blending modes for objects. In the following reference, each function gets one new color (the thing to paint on top) and an old color (the color that was already present before the object gets painted)
  • Spot functions, Table 6.1, Page 489 in PDF Reference v1.7. The code is pseudo code, returning the grey component at (x, y).
  • Text matrix. Text placement is a bit different, but uses the same concepts as a CTM that’s why it’s merged here
  • The text rendering mode determines how a text is drawn. The default rendering mode is Fill. The color of the fill / stroke is determined by the current page’s outline / fill color.
  • See BlackGenerationFunction, too. Undercolor removal reduces the amounts of the cyan, magenta, and yellow components to compensate for the amount of black that was added by black generation.
  • External object that gets referenced outside the PDF content stream. Gets constructed similar to the ExtGState, then inserted into the /XObject dictionary on the page. You can instantiate XObjects with the /Do operator. The layer.add_xobject() (or better yet, the layer.add_image(), layer.add_form()) methods will do this for you.

Functions§

  • Calculates and returns the points for an approximated circle, given a radius and an offset into the centre of circle (starting from bottom left corner of page).
  • Calculates and returns the points for a rectangle, given a horizontal and vertical scale, and an offset into the centre of rectangle (starting from bottom left corner of page).
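
To make the rectangle description concrete, here is a minimal sketch of the underlying geometry. It is not printpdf's implementation: `rectangle_corners` is a hypothetical function working on plain `f64` values, and the corner order is an assumption chosen to match the shape example earlier on this page.

```rust
/// Hypothetical sketch (not printpdf's implementation): corner points of
/// an axis-aligned rectangle of size `width` x `height`, centred on
/// (`center_x`, `center_y`). All values share one unit, e.g. millimetres,
/// measured from the bottom left corner of the page.
/// Order: bottom-left, top-left, top-right, bottom-right.
fn rectangle_corners(
    width: f64,
    height: f64,
    center_x: f64,
    center_y: f64,
) -> [(f64, f64); 4] {
    let (half_w, half_h) = (width / 2.0, height / 2.0);
    [
        (center_x - half_w, center_y - half_h), // bottom-left
        (center_x - half_w, center_y + half_h), // top-left
        (center_x + half_w, center_y + half_h), // top-right
        (center_x + half_w, center_y - half_h), // bottom-right
    ]
}
```

A 200 x 100 rectangle centred on (200, 150) yields the same four points as `points1` in the "Adding graphical shapes" example: (100, 100), (100, 200), (300, 200), (300, 100).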