Module text

Structs§

WorkTextResult
Result of extracting text from a work’s PDF.
ZoteroItemInfo
Brief Zotero library info for a work matched by DOI.

Enums§

PdfSource
Where the PDF was obtained from.
ProcessingMode
WorkTextError
Errors from the work_text pipeline.

Functions§

extract_text_bytes
Extract text from PDF bytes using pdf-extract.
find_work_in_zotero
Check if a work exists in the Zotero library, matched by DOI.
poll_zotero_for_work
Poll Zotero for a work by DOI. Waits 5s initially, then polls every 2s for up to ~2 min.
try_zotero
Try to find and download a PDF from Zotero (local storage first, then remote API).
work_text
Download and extract the full text of a scholarly work.
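
The polling schedule described for poll_zotero_for_work (an initial 5s wait, then a check every 2s for roughly two minutes) can be sketched as follows. This is a minimal illustration under stated assumptions, not the crate's implementation: the function name `poll_until`, the three constants, and the injected `sleep` callback are all hypothetical, and the real function may well be async and have a different signature.

```rust
use std::time::Duration;

// Hypothetical constants mirroring the documented schedule.
const INITIAL_DELAY: Duration = Duration::from_secs(5);
const POLL_INTERVAL: Duration = Duration::from_secs(2);
const MAX_WAIT: Duration = Duration::from_secs(120);

/// Waits `INITIAL_DELAY`, then calls `check` every `POLL_INTERVAL` until it
/// returns `Some` or another wait would push the total past `MAX_WAIT`.
///
/// `sleep` is injected (an assumption made for this sketch) so the schedule
/// can be exercised without actually waiting.
fn poll_until<T>(
    mut check: impl FnMut() -> Option<T>,
    mut sleep: impl FnMut(Duration),
) -> Option<T> {
    sleep(INITIAL_DELAY);
    let mut waited = INITIAL_DELAY;
    loop {
        if let Some(found) = check() {
            return Some(found);
        }
        // Give up once the next wait would exceed the time budget.
        if waited + POLL_INTERVAL > MAX_WAIT {
            return None;
        }
        sleep(POLL_INTERVAL);
        waited += POLL_INTERVAL;
    }
}
```

In blocking code, `std::thread::sleep` can be passed directly as the `sleep` argument, with `check` wrapping whatever lookup is being retried, for example a DOI match against the library.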