pub fn convert_with_metadata(
    html: &str,
    options: Option<ConversionOptions>,
    metadata_cfg: MetadataConfig,
) -> Result<(String, ExtendedMetadata)>
Convert HTML to Markdown with comprehensive metadata extraction (requires the metadata feature).
Performs HTML-to-Markdown conversion while simultaneously extracting structured metadata in a single pass for maximum efficiency. Ideal for content analysis, SEO optimization, and document indexing workflows.
§Arguments
- html - The HTML string to convert. Line endings are normalized (CRLF → LF).
- options - Optional conversion configuration. Defaults to ConversionOptions::default() if None. Controls heading style, list indentation, escape behavior, wrapping, and other output formatting.
- metadata_cfg - Configuration for metadata extraction granularity. Use MetadataConfig::default() to extract all metadata types, or customize with selective extraction flags.
§Returns
On success, returns a tuple of:
- String: The converted Markdown output
- ExtendedMetadata: Comprehensive metadata containing:
  - document: Title, description, author, language, Open Graph, Twitter Card, and other meta tags
  - headers: All heading elements (h1-h6) with hierarchy and IDs
  - links: Hyperlinks classified as anchor, internal, external, email, or phone
  - images: Image elements with source, dimensions, and alt text
  - structured_data: JSON-LD, Microdata, and RDFa blocks
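The five link categories listed above can be illustrated with a small prefix-based classifier. This is a minimal sketch and not the crate's implementation: the LinkKind enum, the classify_href function, and the exact rules (for example, treating every absolute or protocol-relative URL as external) are assumptions made for illustration.

```rust
/// Hypothetical link categories mirroring the five kinds named in the docs.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LinkKind {
    Anchor,
    Internal,
    External,
    Email,
    Phone,
}

/// Classify an `href` value by its prefix (illustrative rules only).
pub fn classify_href(href: &str) -> LinkKind {
    let href = href.trim();
    // Scheme names are case-insensitive, so compare against a lowercased copy.
    let lower = href.to_ascii_lowercase();
    if href.starts_with('#') {
        LinkKind::Anchor
    } else if lower.starts_with("mailto:") {
        LinkKind::Email
    } else if lower.starts_with("tel:") {
        LinkKind::Phone
    } else if lower.starts_with("http://")
        || lower.starts_with("https://")
        || href.starts_with("//")
    {
        // Assumption: any absolute or protocol-relative URL counts as external.
        LinkKind::External
    } else {
        // Relative paths such as "/about" or "docs/intro.html".
        LinkKind::Internal
    }
}
```

Under these assumed rules, classify_href("#intro") yields LinkKind::Anchor and classify_href("/about") yields LinkKind::Internal. A real classifier would likely also compare the link's host against the document's own origin before calling a link external.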
§Errors
Returns ConversionError if:
- HTML parsing fails
- Invalid UTF-8 sequences are encountered
- An internal panic occurs during conversion (wrapped in ConversionError::Panic)
- Configured size limits are exceeded
§Performance Notes
- Single-pass collection: metadata extraction has minimal overhead
- Zero cost when the metadata feature is disabled
- Pre-allocated buffers: typically handles 50+ headers, 100+ links, 20+ images efficiently
- Structured data size-limited to prevent memory exhaustion (configurable)
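The size limit on structured data can be pictured as a collector with a fixed byte budget. The sketch below is hypothetical: BoundedCollector and its methods are not part of the crate, and the initial capacity of 4 is an arbitrary choice. It only shows how a pre-allocated buffer combined with a cap, similar in spirit to max_structured_data_size, keeps memory use bounded.

```rust
/// Hypothetical collector that refuses new blocks once a byte budget is spent.
pub struct BoundedCollector {
    blocks: Vec<String>,
    used: usize,
    max_bytes: usize,
}

impl BoundedCollector {
    /// Pre-allocate a small buffer so typical documents avoid reallocation.
    pub fn new(max_bytes: usize) -> Self {
        Self {
            blocks: Vec::with_capacity(4),
            used: 0,
            max_bytes,
        }
    }

    /// Store `block` and return `true`, or return `false` if it would exceed the budget.
    pub fn push(&mut self, block: &str) -> bool {
        match self.used.checked_add(block.len()) {
            Some(total) if total <= self.max_bytes => {
                self.used = total;
                self.blocks.push(block.to_owned());
                true
            }
            // Over budget (or arithmetic overflow): drop the block.
            _ => false,
        }
    }

    /// Blocks accepted so far, in insertion order.
    pub fn blocks(&self) -> &[String] {
        &self.blocks
    }

    /// Total bytes currently stored.
    pub fn used_bytes(&self) -> usize {
        self.used
    }
}
```

With a budget of 0, every non-empty block is rejected. Whether the crate drops, truncates, or reports an error for oversized structured data is not stated here, so the drop-on-overflow behavior in this sketch is an assumption.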
§Example: Basic Usage
use html_to_markdown_rs::{convert_with_metadata, MetadataConfig};
let html = r#"
<html lang="en">
<head><title>My Article</title></head>
<body>
<h1 id="intro">Introduction</h1>
<p>Welcome to <a href="https://example.com">our site</a></p>
</body>
</html>
"#;
let (markdown, metadata) = convert_with_metadata(html, None, MetadataConfig::default())?;
assert_eq!(metadata.document.title, Some("My Article".to_string()));
assert_eq!(metadata.document.language, Some("en".to_string()));
assert_eq!(metadata.headers[0].text, "Introduction");
assert_eq!(metadata.headers[0].id, Some("intro".to_string()));
assert_eq!(metadata.links.len(), 1);
§Example: Selective Metadata Extraction
use html_to_markdown_rs::{convert_with_metadata, MetadataConfig};
let html = "<html><body><h1>Title</h1><a href='#anchor'>Link</a></body></html>";
// Extract only headers and document metadata, skip links/images
let config = MetadataConfig {
    extract_headers: true,
    extract_links: false,
    extract_images: false,
    extract_structured_data: false,
    max_structured_data_size: 0,
};
let (markdown, metadata) = convert_with_metadata(html, None, config)?;
assert!(!metadata.headers.is_empty());
assert!(metadata.links.is_empty()); // Not extracted
§Example: With Conversion Options and Metadata Config
use html_to_markdown_rs::{convert_with_metadata, ConversionOptions, MetadataConfig, HeadingStyle};
let html = "<html><head><title>Blog Post</title></head><body><h1>Hello</h1></body></html>";
let options = ConversionOptions {
    heading_style: HeadingStyle::Atx,
    wrap: true,
    wrap_width: 80,
    ..Default::default()
};
let metadata_cfg = MetadataConfig::default();
let (markdown, metadata) = convert_with_metadata(html, Some(options), metadata_cfg)?;
// Markdown will use ATX-style headings (# H1, ## H2, etc.)
// Wrapped at 80 characters
// All metadata extracted
§See Also
- convert - Simple HTML to Markdown conversion without metadata
- convert_with_inline_images - Conversion with inline image extraction
- MetadataConfig - Configuration for metadata extraction
- ExtendedMetadata - Metadata structure documentation
- metadata module - Detailed type documentation for metadata components