Module content

PDF Content Stream Parser - Complete support for PDF graphics operators

This module parses PDF content streams as defined by the PDF specification. A content stream holds the drawing instructions (operators and their operands) that render text, graphics, and images on a PDF page.

§Overview

Content streams are sequences of PDF operators that describe:

  • Text positioning and rendering
  • Path construction and painting
  • Color and graphics state management
  • Image and XObject placement
  • Coordinate transformations

§Architecture

The parser is divided into two main components:

  • ContentTokenizer: Low-level tokenization of content stream bytes
  • ContentParser: High-level parsing of tokens into structured operations
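The two-stage split described above can be illustrated with a small, self-contained sketch. The `Token` and `Operation` types and the `tokenize` and `parse` functions below are hypothetical names invented for this illustration; they are not the definitions used by ContentTokenizer or ContentParser. The sketch assumes a deliberately reduced syntax: numbers, names, operators, and literal strings without escapes or nested parentheses.

```rust
// Hypothetical types for illustration only; not this module's definitions.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Number(f64),
    Name(String),     // written as /F1 in the stream
    Str(Vec<u8>),     // literal string, written as (Hello)
    Operator(String), // e.g. BT, Tf, Tj
}

#[derive(Debug, Clone, PartialEq)]
struct Operation {
    operator: String,
    operands: Vec<Token>,
}

/// Stage 1: split raw bytes into tokens.
/// Simplified: no string escapes or nested parentheses, and no arrays,
/// dictionaries, hex strings, or comments.
fn tokenize(input: &[u8]) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut i = 0;
    while i < input.len() {
        let b = input[i];
        if b.is_ascii_whitespace() {
            i += 1;
        } else if b == b'(' {
            // Literal string: take everything up to the next ')'.
            let start = i + 1;
            let mut end = start;
            while end < input.len() && input[end] != b')' {
                end += 1;
            }
            tokens.push(Token::Str(input[start..end].to_vec()));
            i = end + 1;
        } else {
            // Any other run of bytes up to whitespace or '('.
            let start = i;
            while i < input.len() && !input[i].is_ascii_whitespace() && input[i] != b'(' {
                i += 1;
            }
            let word = String::from_utf8_lossy(&input[start..i]).into_owned();
            if let Some(name) = word.strip_prefix('/') {
                tokens.push(Token::Name(name.to_string()));
            } else if let Ok(n) = word.parse::<f64>() {
                tokens.push(Token::Number(n));
            } else {
                tokens.push(Token::Operator(word));
            }
        }
    }
    tokens
}

/// Stage 2: group tokens into operations.
/// Content streams are postfix: operands come first, then the operator.
fn parse(tokens: Vec<Token>) -> Vec<Operation> {
    let mut operations = Vec::new();
    let mut operands = Vec::new();
    for token in tokens {
        match token {
            // An operator closes the current operation and takes the
            // operands collected so far.
            Token::Operator(operator) => operations.push(Operation {
                operator,
                operands: std::mem::take(&mut operands),
            }),
            other => operands.push(other),
        }
    }
    operations
}
```

Calling `parse(tokenize(b"BT /F1 12 Tf 100 200 Td (Hello World) Tj ET"))` in this sketch yields five operations, with `Tf` carrying the name `F1` and the number `12` as operands. Keeping byte-level concerns in the first stage means the second stage only has to reason about operand order.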

§Example

use oxidize_pdf::parser::content::{ContentOperation, ContentParser};

// Assumes the parser's error type implements std::error::Error,
// so that `?` can convert it into a boxed error.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Parse a content stream into a list of operations
    let content_stream = b"BT /F1 12 Tf 100 200 Td (Hello World) Tj ET";
    let operations = ContentParser::parse_content(content_stream)?;

    // Handle the operations of interest and ignore the rest
    for op in operations {
        match op {
            ContentOperation::BeginText => println!("Start text object"),
            ContentOperation::SetFont(name, size) => println!("Font: {} at {}", name, size),
            ContentOperation::ShowText(text) => println!("Text: {:?}", text),
            _ => {}
        }
    }

    Ok(())
}

§Supported Operators

This parser supports all standard PDF operators including:

  • Text operators (BT, ET, Tj, TJ, Tf, Td, etc.)
  • Graphics state operators (q, Q, cm, w, J, etc.)
  • Path construction operators (m, l, c, re, h)
  • Path painting operators (S, f, B, n, etc.)
  • Color operators (g, rg, k, cs, scn, etc.)
  • XObject operators (Do)
  • Marked content operators (BMC, BDC, EMC, etc.)
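As a rough illustration of the grouping above, the following sketch maps operator keywords to their category. `OperatorCategory` and `categorize` are hypothetical names introduced here and are not part of this module's API. The table covers only the operators named in the list, so any other keyword returns `None` even if it is a valid PDF operator.

```rust
// Hypothetical helper for illustration; not part of this module's API.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum OperatorCategory {
    Text,
    GraphicsState,
    PathConstruction,
    PathPainting,
    Color,
    XObject,
    MarkedContent,
}

/// Look up the category of an operator keyword.
/// Only the operators named in the list above are covered.
fn categorize(op: &str) -> Option<OperatorCategory> {
    use OperatorCategory::*;
    let category = match op {
        "BT" | "ET" | "Tj" | "TJ" | "Tf" | "Td" => Text,
        "q" | "Q" | "cm" | "w" | "J" => GraphicsState,
        "m" | "l" | "c" | "re" | "h" => PathConstruction,
        "S" | "f" | "B" | "n" => PathPainting,
        "g" | "rg" | "k" | "cs" | "scn" => Color,
        "Do" => XObject,
        "BMC" | "BDC" | "EMC" => MarkedContent,
        // Not in the table: unknown or simply not listed here.
        _ => return None,
    };
    Some(category)
}
```

Operator keywords are case-sensitive, which is why the match uses exact strings: `q` and `Q` are distinct operators, as are `Tj` and `TJ`, and `bt` matches nothing.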

Structs§

ContentParser
High-level content stream parser.
ContentTokenizer
Content stream tokenizer.

Enums§

ContentOperation
Represents a single operator in a PDF content stream.
TextElement
Represents a text element in a TJ array for ShowTextArray operations.