Crate pdfplumber_core

Backend-independent data types and algorithms for pdfplumber-rs.

This crate provides the foundational types (BBox, Char, Word, Line, Rect, Table, etc.) and algorithms (text grouping, table detection) used by pdfplumber-rs. It has no required external dependencies — all functionality is pure Rust.
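To give a feel for the table detection side, here is a minimal, self-contained sketch of one step that a lattice-style detector typically needs: finding the points where horizontal and vertical edges cross. The `HEdge` and `VEdge` structs, the `find_intersections` function, and the `tolerance` parameter are hypothetical names invented for this illustration. They are not the crate's `Edge` type or its `edges_to_intersections` function, whose signatures are not shown on this page.

```rust
/// Hypothetical horizontal edge for illustration (not the crate's `Edge`).
#[derive(Debug, Clone, Copy)]
struct HEdge {
    y: f64,
    x0: f64,
    x1: f64,
}

/// Hypothetical vertical edge for illustration (not the crate's `Edge`).
#[derive(Debug, Clone, Copy)]
struct VEdge {
    x: f64,
    top: f64,
    bottom: f64,
}

/// Returns every (x, y) point where a vertical edge crosses a horizontal
/// edge. An edge may fall short of the other by up to `tolerance` and
/// still count as crossing it.
fn find_intersections(h_edges: &[HEdge], v_edges: &[VEdge], tolerance: f64) -> Vec<(f64, f64)> {
    let mut points = Vec::new();
    for h in h_edges {
        for v in v_edges {
            // The vertical edge's x must lie within the horizontal span.
            let x_hits = v.x >= h.x0 - tolerance && v.x <= h.x1 + tolerance;
            // The horizontal edge's y must lie within the vertical span.
            let y_hits = h.y >= v.top - tolerance && h.y <= v.bottom + tolerance;
            if x_hits && y_hits {
                points.push((v.x, h.y));
            }
        }
    }
    points
}
```

With two horizontal and two vertical edges forming a closed box, this sketch yields the four corner points of a single cell. Turning such points into cells and tables is a separate step that is not sketched here.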

Re-exports

pub use edges::Edge;
pub use edges::EdgeSource;
pub use edges::derive_edges;
pub use edges::edge_from_curve;
pub use edges::edge_from_line;
pub use edges::edges_from_rect;
pub use encoding::EncodingResolver;
pub use encoding::FontEncoding;
pub use encoding::StandardEncoding;
pub use error::ExtractOptions;
pub use error::ExtractResult;
pub use error::ExtractWarning;
pub use error::PdfError;
pub use geometry::BBox;
pub use geometry::Ctm;
pub use geometry::Orientation;
pub use geometry::Point;
pub use images::Image;
pub use images::ImageMetadata;
pub use images::image_from_ctm;
pub use layout::TextBlock;
pub use layout::TextLine;
pub use layout::TextOptions;
pub use layout::blocks_to_text;
pub use layout::cluster_lines_into_blocks;
pub use layout::cluster_words_into_lines;
pub use layout::sort_blocks_reading_order;
pub use layout::split_lines_at_columns;
pub use layout::words_to_text;
pub use painting::Color;
pub use painting::DashPattern;
pub use painting::ExtGState;
pub use painting::FillRule;
pub use painting::GraphicsState;
pub use painting::PaintedPath;
pub use path::Path;
pub use path::PathBuilder;
pub use path::PathSegment;
pub use shapes::Curve;
pub use shapes::Line;
pub use shapes::LineOrientation;
pub use shapes::Rect;
pub use shapes::extract_shapes;
pub use table::Cell;
pub use table::ExplicitLines;
pub use table::Intersection;
pub use table::Strategy;
pub use table::Table;
pub use table::TableFinder;
pub use table::TableSettings;
pub use table::cells_to_tables;
pub use table::edges_to_intersections;
pub use table::explicit_lines_to_edges;
pub use table::extract_text_for_cells;
pub use table::intersections_to_cells;
pub use table::join_edge_group;
pub use table::snap_edges;
pub use table::words_to_edges_stream;
pub use text::Char;
pub use text::TextDirection;
pub use text::is_cjk;
pub use text::is_cjk_text;
pub use words::Word;
pub use words::WordExtractor;
pub use words::WordOptions;

Modules

edges
Edge derivation from geometric primitives for table detection.
encoding
Font encoding mapping (Standard, Windows, Mac, Custom): standard PDF text encodings and encoding resolution.
error
Error and warning types for PDF processing in pdfplumber-rs.
geometry
Geometric primitives: Point, BBox, CTM, Orientation.
images
Image metadata and image extraction from the XObject Do operator.
layout
Text layout: words → lines → blocks, reading order, text output.
painting
Graphics state, ExtGState types, colors, dash patterns, path painting operators, and painted paths.
path
PDF path construction (MoveTo, LineTo, CurveTo, ClosePath).
shapes
Shape extraction: Lines, Rects, and Curves from painted paths.
table
Table detection types and pipeline: lattice, stream, and explicit strategies.
text
Character data types and CJK detection.
words
Word extraction from characters based on spatial proximity.
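
As an illustration of what "word extraction based on spatial proximity" can mean, the following is a minimal sketch that groups the characters of a single line into words by horizontal gap. `SketchChar`, `group_into_words`, and `x_tolerance` are hypothetical names made up for this example. They are not the crate's `Char`, `WordExtractor`, or `WordOptions`, and the real implementation may differ (for example, in how it handles vertical position or text direction).

```rust
/// Hypothetical positioned character for illustration (not the crate's `Char`).
#[derive(Debug, Clone)]
struct SketchChar {
    text: char,
    x0: f64, // left edge
    x1: f64, // right edge
}

/// Groups characters from one text line into words. A new word starts at a
/// whitespace character or when the horizontal gap to the previous
/// character exceeds `x_tolerance`.
fn group_into_words(chars: &[SketchChar], x_tolerance: f64) -> Vec<String> {
    // Sort left to right so gaps are measured between neighbours.
    let mut sorted: Vec<&SketchChar> = chars.iter().collect();
    sorted.sort_by(|a, b| a.x0.total_cmp(&b.x0));

    let mut words = Vec::new();
    let mut current = String::new();
    let mut prev_x1: Option<f64> = None;

    for ch in sorted {
        let gap_too_wide = prev_x1.map_or(false, |x1| ch.x0 - x1 > x_tolerance);
        // Flush the word collected so far when a boundary is reached.
        if (ch.text.is_whitespace() || gap_too_wide) && !current.is_empty() {
            words.push(std::mem::take(&mut current));
        }
        if !ch.text.is_whitespace() {
            current.push(ch.text);
        }
        prev_x1 = Some(ch.x1);
    }
    // Flush the final word, if any.
    if !current.is_empty() {
        words.push(current);
    }
    words
}
```

The sketch assumes all input characters already belong to the same line. Clustering characters into lines first, which the layout module's description suggests happens at the word level, is left out to keep the example short.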