Module extract


Content extraction — converts raw HTML into readable Markdown or structured JSON.

Structs

ArticleData
Structured article data for JSON output.
ExtractInput
Input parameters for content extraction.

Enums

ExtractError
Errors that can occur during content extraction.

Functions

extract_json
Extract readable content as JSON.
extract_pdf
Extract text content from a PDF byte slice.
extract_text
Extract readable content as Markdown text.
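
This page lists the items but not their signatures, so the following is only a self-contained sketch of the general shape such an API might take: an input struct, an error enum, and a function returning a `Result`. The names `SketchInput`, `SketchError`, and `sketch_extract_text` are hypothetical stand-ins, not the module's actual `ExtractInput`, `ExtractError`, or `extract_text`, and the toy tag stripper is an assumption made for illustration rather than the module's real extraction logic.

```rust
// Hypothetical stand-ins: the real fields, variants, and signatures of
// ExtractInput, ExtractError, and extract_text are not shown on this page.

/// Assumed error type with a single illustrative variant.
#[derive(Debug, PartialEq)]
pub enum SketchError {
    /// The input contained no non-whitespace characters.
    EmptyInput,
}

/// Assumed input type carrying only the raw HTML.
pub struct SketchInput<'a> {
    pub html: &'a str,
}

/// Toy extraction: drops everything between '<' and '>' and collapses
/// runs of whitespace. A real extractor would parse the document properly.
pub fn sketch_extract_text(input: &SketchInput) -> Result<String, SketchError> {
    if input.html.trim().is_empty() {
        return Err(SketchError::EmptyInput);
    }

    let mut out = String::new();
    let mut in_tag = false;
    for ch in input.html.chars() {
        match ch {
            // Start of a tag: skip characters until the closing '>'.
            '<' => in_tag = true,
            // End of a tag: emit a space so adjacent elements stay separated.
            '>' if in_tag => {
                in_tag = false;
                out.push(' ');
            }
            // Ordinary text outside any tag is kept.
            _ if !in_tag => out.push(ch),
            // Characters inside a tag are discarded.
            _ => {}
        }
    }

    // Collapse whitespace runs and trim both ends.
    Ok(out.split_whitespace().collect::<Vec<_>>().join(" "))
}
```

Under these assumptions, a caller would build a `SketchInput` from the raw HTML and match on the returned `Result`, handling the error enum explicitly rather than panicking on bad input.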