pdfplumber-cli
Command-line tool to extract text, characters, words, and tables from PDF documents.
pdfplumber-cli is the CLI frontend for pdfplumber-rs, a Rust port of Python's pdfplumber.
Installation
Usage
Subcommands
| Command | Description |
|---|---|
| `text` | Extract text from PDF pages |
| `chars` | Extract individual characters with coordinates |
| `words` | Extract words with bounding box coordinates |
| `tables` | Detect and extract tables from PDF pages |
| `info` | Display PDF metadata and page information |
Global Options
| Option | Description |
|---|---|
| `--version` | Print version number |
| `--help` | Print help information |
Extract Text
# Extract all text
# Extract text from specific pages
# Layout-preserving extraction
# JSON output (one object per page)
Extract Characters
# Tab-separated output (default)
# JSON output with all fields (text, fontname, size, bbox, etc.)
# CSV output
Example CSV output:
page,text,x0,top,x1,bottom,fontname,size
1,H,72.00,72.00,84.00,84.00,Helvetica,12.00
1,e,84.00,72.00,90.72,84.00,Helvetica,12.00
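
The CSV form is convenient for post-processing in a script. The following is a minimal Python sketch, assuming the output has exactly the columns shown in the example above; `load_chars` and `SAMPLE` are hypothetical names used only for illustration and are not part of pdfplumber-cli.

```python
import csv
import io

# Sample rows copied from the example CSV output above.
SAMPLE = """\
page,text,x0,top,x1,bottom,fontname,size
1,H,72.00,72.00,84.00,84.00,Helvetica,12.00
1,e,84.00,72.00,90.72,84.00,Helvetica,12.00
"""

# Columns assumed to hold numbers, based on the example header.
NUMERIC_FIELDS = ("x0", "top", "x1", "bottom", "size")


def load_chars(csv_text):
    """Parse chars CSV text into dicts with numeric coordinates."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row["page"] = int(row["page"])
        for field in NUMERIC_FIELDS:
            row[field] = float(row[field])
        rows.append(row)
    return rows


chars = load_chars(SAMPLE)
for ch in chars:
    # Width of each glyph box, in the same units as the coordinates.
    width = ch["x1"] - ch["x0"]
    print(f'{ch["text"]!r} width={width:.2f} font={ch["fontname"]}')
```

In practice the CSV text would come from a file or a pipe rather than an inline string.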
Extract Words
# Tab-separated output (default)
# JSON output
# CSV output with custom tolerances
Example CSV output:
page,text,x0,top,x1,bottom
1,Hello,72.00,72.00,108.00,84.00
1,World,112.00,72.00,148.00,84.00
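
Word boxes can be regrouped into lines of text downstream. The sketch below is one simple way to do this in Python, assuming the columns shown in the example above and treating words with an identical `top` value as one line. `lines_from_words` is a hypothetical helper, not something the tool provides, and real documents will usually need a tolerance on `top` rather than exact matching.

```python
import csv
import io
from collections import defaultdict

# Sample rows copied from the example CSV output above.
SAMPLE = """\
page,text,x0,top,x1,bottom
1,Hello,72.00,72.00,108.00,84.00
1,World,112.00,72.00,148.00,84.00
"""


def lines_from_words(csv_text):
    """Group words by (page, top) and join each group left to right."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (int(row["page"]), float(row["top"]))
        groups[key].append((float(row["x0"]), row["text"]))
    # Sort groups by page then vertical position, and words by x0.
    return [
        " ".join(text for _, text in sorted(words))
        for _, words in sorted(groups.items())
    ]


lines = lines_from_words(SAMPLE)
print(lines)
```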
Extract Tables
# Human-readable grid format (default)
# JSON output
# CSV output
# Use stream strategy instead of lattice
# Tune detection parameters
Example grid output:
Name | Age | City
Alice | 30 | New York
Bob | 25 | London
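
If only the grid text is available, it can be turned back into records with a few lines of Python. This is a rough sketch that assumes cells are separated by `|` exactly as in the example above and that the first row is a header; `parse_grid` is a hypothetical helper. A cell that itself contains `|` would break this approach, so the JSON or CSV output is the safer choice for machine consumption.

```python
# Sample copied from the example grid output above.
SAMPLE = """\
Name | Age | City
Alice | 30 | New York
Bob | 25 | London
"""


def parse_grid(text):
    """Split grid text on '|' and return one dict per data row."""
    rows = [
        [cell.strip() for cell in line.split("|")]
        for line in text.splitlines()
        if line.strip()  # skip blank lines
    ]
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


records = parse_grid(SAMPLE)
for record in records:
    print(record)
```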
Inspect PDF Info
# Text summary
# JSON output
# Specific pages only
Example text output:
=== PDF Info ===
Pages: 3
Chars: 1250
Lines: 45
Rects: 12
Curves: 0
Images: 2
=== Summary ===
Total chars: 3200
Total tables: 1
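
The text summary is meant for reading, and the JSON output is the better input for scripts. For a quick check, though, the text form can be parsed as shown in this Python sketch. It assumes the layout matches the example above (section headers wrapped in `===`, followed by `Key: integer` lines); `parse_info` is a hypothetical helper, not part of the tool.

```python
import re

# Sample copied from the example text output above.
SAMPLE = """\
=== PDF Info ===
Pages: 3
Chars: 1250
Lines: 45
Rects: 12
Curves: 0
Images: 2
=== Summary ===
Total chars: 3200
Total tables: 1
"""


def parse_info(text):
    """Return {section name: {key: int}} from the text summary."""
    sections = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        header = re.fullmatch(r"=== (.+) ===", line)
        if header:
            # Start a new section keyed by its title.
            current = sections.setdefault(header.group(1), {})
        elif ":" in line and current is not None:
            key, _, value = line.partition(":")
            current[key.strip()] = int(value)
    return sections


info = parse_info(SAMPLE)
print(info["PDF Info"]["Pages"], info["Summary"]["Total tables"])
```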
Output Formats
| Subcommand | text (default) | json | csv |
|---|---|---|---|
| `text` | Plain text | JSON lines | n/a |
| `chars` | TSV | JSON array | CSV |
| `words` | TSV | JSON array | CSV |
| `tables` | Grid | JSON array | CSV |
| `info` | Summary | JSON | n/a |
Page Selection
Use --pages to select specific pages (1-indexed):
- `--pages 1`: single page
- `--pages 1-5`: range
- `--pages 1,3,5`: list
- `--pages 1-3,7,10-12`: mixed
Omit --pages to process all pages.
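
To make the page syntax concrete, here is a small Python sketch of how such a specification could be expanded into page numbers. It illustrates the syntax described above and is not the tool's actual implementation; `parse_pages` is a hypothetical name, and the handling of duplicates (merged) and reversed ranges (rejected) is an assumption.

```python
def parse_pages(spec):
    """Expand a spec such as '1-3,7,10-12' into sorted 1-indexed pages."""
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            raise ValueError(f"empty entry in page spec {spec!r}")
        start_text, sep, end_text = part.partition("-")
        start = int(start_text)
        # A bare number is treated as a one-page range.
        end = int(end_text) if sep else start
        if start < 1 or end < start:
            raise ValueError(f"invalid page range {part!r}")
        pages.update(range(start, end + 1))
    return sorted(pages)


print(parse_pages("1"))
print(parse_pages("1-5"))
print(parse_pages("1,3,5"))
print(parse_pages("1-3,7,10-12"))
```

Each call corresponds to one of the four forms listed above; the last one expands to pages 1, 2, 3, 7, 10, 11, and 12.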
License
MIT OR Apache-2.0