Module content

PDF Content Stream Parser - Complete support for PDF graphics operators

This module implements comprehensive parsing of PDF content streams according to the PDF specification. Content streams contain the actual drawing instructions (operators) that render text, graphics, and images on PDF pages.

§Overview

Content streams are sequences of PDF operators that describe:

Text positioning and rendering
Path construction and painting
Color and graphics state management
Image and XObject placement
Coordinate transformations

§Architecture

The parser is divided into two main components:

ContentTokenizer: Low-level tokenization of content stream bytes
ContentParser: High-level parsing of tokens into structured operations
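
The two-stage split can be illustrated with a small standalone sketch. It does not use or reproduce the crate's implementation: `Token`, `tokenize`, and `group_operations` are hypothetical names, and the tokenizer deliberately ignores nested or escaped strings, hex strings, arrays, dictionaries, and comments. It only shows how bytes become tokens and how tokens are grouped around operators, which follow their operands (postfix notation).

```rust
// Hypothetical, simplified token type for illustration; not the crate's API.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Number(f64),
    Name(String),     // e.g. /F1 (stored without the leading slash)
    Str(Vec<u8>),     // e.g. (Hello World); no nesting or escapes handled
    Operator(String), // e.g. BT, Tf, Tj
}

/// Stage 1: split raw content stream bytes into tokens.
fn tokenize(input: &[u8]) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut i = 0;
    while i < input.len() {
        let b = input[i];
        if b.is_ascii_whitespace() {
            i += 1;
        } else if b == b'(' {
            // Literal string: read up to the next ')' (or end of input).
            let start = i + 1;
            let mut end = start;
            while end < input.len() && input[end] != b')' {
                end += 1;
            }
            tokens.push(Token::Str(input[start..end].to_vec()));
            i = end + 1;
        } else {
            // Any other token runs until whitespace or the start of a string.
            let start = i;
            while i < input.len() && !input[i].is_ascii_whitespace() && input[i] != b'(' {
                i += 1;
            }
            let word = String::from_utf8_lossy(&input[start..i]).into_owned();
            if let Some(name) = word.strip_prefix('/') {
                tokens.push(Token::Name(name.to_string()));
            } else if let Ok(n) = word.parse::<f64>() {
                tokens.push(Token::Number(n));
            } else {
                tokens.push(Token::Operator(word));
            }
        }
    }
    tokens
}

/// Stage 2: group tokens into (operator, operands) pairs.
/// Operands accumulate until an operator token claims them.
/// Operands left over after the last operator are dropped.
fn group_operations(tokens: Vec<Token>) -> Vec<(String, Vec<Token>)> {
    let mut ops = Vec::new();
    let mut operands = Vec::new();
    for token in tokens {
        match token {
            Token::Operator(name) => ops.push((name, std::mem::take(&mut operands))),
            other => operands.push(other),
        }
    }
    ops
}
```

Calling `group_operations(tokenize(b"BT /F1 12 Tf 100 200 Td (Hello World) Tj ET"))` yields five pairs, one per operator, each carrying the operands that preceded it. A real parser would then map each pair onto a typed operation and validate operand counts and types.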

§Example

use oxidize_pdf::parser::content::{ContentParser, ContentOperation};

// Wrapped in `main` so that `?` compiles; this assumes the parser's error
// type implements `std::error::Error`.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Parse a content stream
    let content_stream = b"BT /F1 12 Tf 100 200 Td (Hello World) Tj ET";
    let operations = ContentParser::parse_content(content_stream)?;

    // Print the text-related operations and ignore everything else
    for op in operations {
        match op {
            ContentOperation::BeginText => println!("Start text object"),
            ContentOperation::SetFont(name, size) => println!("Font: {} at {}", name, size),
            ContentOperation::ShowText(text) => println!("Text: {:?}", text),
            _ => {}
        }
    }
    Ok(())
}

§Supported Operators

This parser supports all standard PDF operators including:

Text operators (BT, ET, Tj, TJ, Tf, Td, etc.)
Graphics state operators (q, Q, cm, w, J, etc.)
Path construction operators (m, l, c, re, h)
Path painting operators (S, f, B, n, etc.)
Color operators (g, rg, k, cs, scn, etc.)
XObject operators (Do)
Marked content operators (BMC, BDC, EMC, etc.)
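
As a rough illustration of how these families relate to operator names, the sketch below classifies only the operators listed above. `OperatorKind` and `classify` are hypothetical names introduced for this example and are not part of the crate; the "etc." entries in the list are not covered. Note that operator names are case-sensitive, so `q` and `Q`, or `Tj` and `TJ`, are distinct operators.

```rust
/// Operator families from the list above (hypothetical enum, not the crate's API).
#[derive(Debug, PartialEq)]
enum OperatorKind {
    Text,
    GraphicsState,
    PathConstruction,
    PathPainting,
    Color,
    XObject,
    MarkedContent,
}

/// Map an operator name to its family, or `None` if it is not one of the
/// operators named in the list. Matching is case-sensitive.
fn classify(op: &str) -> Option<OperatorKind> {
    use OperatorKind::*;
    Some(match op {
        "BT" | "ET" | "Tj" | "TJ" | "Tf" | "Td" => Text,
        "q" | "Q" | "cm" | "w" | "J" => GraphicsState,
        "m" | "l" | "c" | "re" | "h" => PathConstruction,
        "S" | "f" | "B" | "n" => PathPainting,
        "g" | "rg" | "k" | "cs" | "scn" => Color,
        "Do" => XObject,
        "BMC" | "BDC" | "EMC" => MarkedContent,
        _ => return None,
    })
}
```

For example, `classify("re")` returns `Some(OperatorKind::PathConstruction)`, while an unlisted name such as `"xx"` returns `None`.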

Structs§

ContentParser: High-level content stream parser.
ContentTokenizer: Content stream tokenizer.

Enums§

ContentOperation: Represents a single operator in a PDF content stream.
TextElement: Represents a text element in a TJ array for ShowTextArray operations.
