Function work_text 

pub async fn work_text(
    openalex: &OpenAlexClient,
    zotero: Option<&ZoteroClient>,
    datalab: Option<(&DatalabClient, ProcessingMode)>,
    work_id: &str,
) -> Result<WorkTextResult, WorkTextError>

Download and extract the full text of a scholarly work.

Tries multiple sources in priority order:

  1. Local Zotero storage (filesystem)
  2. Remote Zotero API (if credentials available)
  3. Direct PDF URLs from OpenAlex locations (whitelisted domains)
  4. OpenAlex Content API (requires OPENALEX_API_KEY)
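
The priority order above can be sketched as a simple first-success fallback loop. This is only an illustration, not the crate's actual implementation: the `Source` enum, the `PRIORITY` constant, and `first_available` are hypothetical names, and the fetch step is passed in as a closure so the sketch stays self-contained. It also assumes that the first source to succeed wins and that failures are collected for reporting, which the description implies but does not state.

```rust
// Hypothetical sketch of a priority-ordered fallback; none of these names
// come from the crate itself.

/// The four sources, mirroring the documented priority list.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Source {
    ZoteroLocal,
    ZoteroRemote,
    DirectPdfUrl,
    OpenAlexContent,
}

/// Sources in the order they are tried.
const PRIORITY: [Source; 4] = [
    Source::ZoteroLocal,
    Source::ZoteroRemote,
    Source::DirectPdfUrl,
    Source::OpenAlexContent,
];

/// Try each source in order and return the first success together with the
/// source that produced it. If every source fails, return all failure reasons.
fn first_available<F>(mut fetch: F) -> Result<(Source, Vec<u8>), Vec<(Source, String)>>
where
    F: FnMut(Source) -> Result<Vec<u8>, String>,
{
    let mut failures = Vec::new();
    for source in PRIORITY.iter().copied() {
        match fetch(source) {
            Ok(bytes) => return Ok((source, bytes)),
            Err(reason) => failures.push((source, reason)),
        }
    }
    Err(failures)
}
```

In this sketch a caller would pass a closure that attempts the download for the given source; sources that are not configured (for example, a missing Zotero client) would simply return an `Err` and the loop moves on to the next one.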

When datalab is Some, the final extraction step uses the DataLab Marker API instead of local pdfium extraction, producing higher-quality markdown. The ProcessingMode trades speed for quality; in order of increasing quality and decreasing speed, the modes are Fast, Balanced, and Accurate.
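
The choice of extraction backend can be sketched as a match on the optional `datalab` argument. The types below are stand-ins defined only for this example: the real `DatalabClient` and `ProcessingMode` live in the crate and may look different, and the `Extractor` enum and `choose_extractor` function are hypothetical. Deriving `PartialOrd` on the stand-in `ProcessingMode` is likewise an assumption, used here only to mirror the Fast, Balanced, Accurate ordering.

```rust
// Stand-in types for illustration; the crate's real definitions may differ.

/// Quality/speed setting; variants are declared in increasing order of quality
/// so the derived ordering gives Fast < Balanced < Accurate.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum ProcessingMode {
    Fast,
    Balanced,
    Accurate,
}

/// Placeholder for the real DataLab client.
struct DatalabClient;

/// Which backend performs the final PDF-to-text step (hypothetical).
#[derive(Debug, PartialEq, Eq)]
enum Extractor {
    LocalPdfium,
    DatalabMarker(ProcessingMode),
}

/// Pick the DataLab Marker API when a client and mode are supplied,
/// otherwise fall back to local pdfium extraction.
fn choose_extractor(datalab: Option<(&DatalabClient, ProcessingMode)>) -> Extractor {
    match datalab {
        Some((_client, mode)) => Extractor::DatalabMarker(mode),
        None => Extractor::LocalPdfium,
    }
}
```

Bundling the client and mode in one `Option` of a tuple, as the signature of `work_text` does, means a mode can never be supplied without a client to apply it to.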