Module html

HTML processing utilities

Provides HTML cleaning and conversion functions for documentation extraction.

Functions

clean_html
Clean HTML by removing unwanted tags (script, style, noscript, iframe) and their content
extract_documentation
Extract documentation from HTML by cleaning and converting to Markdown
extract_search_results
Extract search results from HTML
html_to_text
Convert HTML to plain text by removing all HTML tags
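
The listing above gives only one-line summaries, so the following is a minimal, standard-library-only sketch of the kind of processing that `clean_html` and `html_to_text` describe. The function names `clean_html_sketch` and `html_to_text_sketch`, their signatures, and the string-scanning approach are all assumptions made for illustration. They are not this module's actual API, and the real functions may well rely on a proper HTML parser.

```rust
/// Hypothetical stand-in for `clean_html`: removes script, style, noscript
/// and iframe elements together with their content.
/// Limitation of this sketch: it matches tag-name prefixes, so a custom tag
/// such as `<styles>` would also be treated as `<style>`.
fn clean_html_sketch(html: &str) -> String {
    const UNWANTED: [&str; 4] = ["script", "style", "noscript", "iframe"];
    let mut out = html.to_string();
    for tag in UNWANTED {
        let open = format!("<{tag}");
        let close = format!("</{tag}>");
        loop {
            // ASCII lowercasing keeps byte offsets identical to `out`,
            // which gives a case-insensitive search over the same indices.
            let lower = out.to_ascii_lowercase();
            let Some(start) = lower.find(&open) else { break };
            // The element ends after its closing tag, or at the end of the
            // input if the closing tag is missing.
            let end = match lower[start..].find(&close) {
                Some(rel) => start + rel + close.len(),
                None => out.len(),
            };
            out.replace_range(start..end, "");
        }
    }
    out
}

/// Hypothetical stand-in for `html_to_text`: drops everything between `<`
/// and `>` and collapses runs of whitespace. Entities such as `&amp;` are
/// left undecoded in this sketch.
fn html_to_text_sketch(html: &str) -> String {
    let mut text = String::new();
    let mut in_tag = false;
    for ch in html.chars() {
        match ch {
            '<' => in_tag = true,
            '>' if in_tag => in_tag = false,
            _ if !in_tag => text.push(ch),
            _ => {}
        }
    }
    text.split_whitespace().collect::<Vec<_>>().join(" ")
}

fn main() {
    let page = "<h1>Docs</h1>\n<script>alert(1)</script>\n<p>Hello <b>world</b></p>";
    let cleaned = clean_html_sketch(page);
    println!("{cleaned}");
    // Prints: Docs Hello world
    println!("{}", html_to_text_sketch(&cleaned));
}
```

Running the cleaning step before the text step matters here: stripping tags alone would leave the body of the `script` element (`alert(1)`) in the output. The description of `extract_documentation` ("cleaning and converting") suggests the same ordering.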