Module extract

Functions

content_root
Pick the element most likely to contain the primary article content.
extract_metadata
Extract citation-oriented metadata: description, author, publish date, language, and site name (from standard <meta>/OpenGraph tags).
extract_title
Extract the page title from <title> or the first <h1>.
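The listing above does not show signatures, so the following is only a minimal sketch of the kind of fallback the `extract_title` description suggests, not the crate's implementation. The names `first_tag_text` and `sketch_extract_title`, the `Option<String>` return type, the preference for `<title>` over `<h1>`, and the naive string scanning (standard library only, no HTML parser) are all assumptions made for illustration.

```rust
/// Hypothetical helper (not part of the documented module): return the trimmed
/// text between the first `<tag ...>` and the next `</tag`, or `None` if the
/// tag is missing or its text is empty. Nested markup is NOT stripped.
fn first_tag_text(html: &str, tag: &str) -> Option<String> {
    // ASCII lowercasing keeps byte offsets identical to the original string,
    // so indices found in `lower` can be used to slice `html`.
    let lower = html.to_ascii_lowercase();
    let open = format!("<{}", tag);
    let close = format!("</{}", tag);

    let start = lower.find(&open)?;
    // Skip past the end of the opening tag, including any attributes.
    let body_start = start + lower[start..].find('>')? + 1;
    let body_end = body_start + lower[body_start..].find(&close)?;

    let text = html[body_start..body_end].trim();
    if text.is_empty() {
        None
    } else {
        Some(text.to_string())
    }
}

/// Sketch of the described behavior: use `<title>` when it has text,
/// otherwise fall back to the first `<h1>`. The ordering is an assumption.
fn sketch_extract_title(html: &str) -> Option<String> {
    first_tag_text(html, "title").or_else(|| first_tag_text(html, "h1"))
}
```

Calling `sketch_extract_title` on a page with a non-empty `<title>` yields that title with surrounding whitespace trimmed; a page with only an `<h1>` yields the heading text; a page with neither yields `None`. A real implementation would more likely walk a parsed DOM, which also handles comments, entities, and nested markup that this string scan ignores.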