//! HTML to Markdown conversion engine with modular architecture.
//!
//! This module provides the complete conversion pipeline for transforming HTML documents
//! into Markdown format. It follows a modular, type-safe design where HTML element handling
//! is organized by semantic category (block, inline, list, table, etc.) with dispatch functions
//! routing elements to their specialized handlers.
//!
//! # Module Organization
//!
//! The converter module is organized into semantic categories:
//!
//! - **[block]**: Block-level elements (headings, paragraphs, blockquotes, preformatted text, tables)
//! - **[inline]**: Inline formatting (emphasis, links, code, semantic formatting)
//! - **[list]**: List structures (ordered, unordered, definition lists)
//! - **[table]**: Table handling, accessible via the `block::table` submodule
//! - **[media]**: Media elements (images, video, audio, embedded content, SVG)
//! - **[semantic]**: Semantic HTML5 elements (sectioning, figures, interactive elements)
//! - **[form]**: Form elements (inputs, selects, buttons, fieldsets)
//! - **[utility]**: Helper functions (DOM traversal, caching, serialization, attributes)
//! - **[text]**: Text processing and escaping (via the `crate::text` module)
//!
//! # Public Types
//!
//! The main context types used across the conversion pipeline:
//!
//! - **[Context]**: Conversion state tracked during the walk (e.g., list nesting, code blocks, `in_heading`)
//! - **[DomContext]**: DOM relationship cache for efficient tree navigation
//!
//! # Conversion Flow
//!
//! The conversion process follows these steps:
//!
//! 1. **Parse HTML**: Input HTML is parsed into a DOM tree using the astral-tl parser
//! 2. **Walk Tree**: Recursive tree walk starting from the root document node
//! 3. **Dispatch**: Each element is dispatched to its handler based on tag name
//! 4. **Convert**: Handler transforms the element to Markdown representation
//! 5. **Post-process**: Text escaping and whitespace normalization
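//!
//! A minimal sketch of steps 2 to 5 follows. It uses a hypothetical `Node` enum in place
//! of the parser's DOM types, and `walk` and `normalize` are illustrative names rather than
//! functions from this crate. Text escaping is left out to keep the sketch short:
//!
//! ```text
//! enum Node {
//!     Text(String),
//!     Element { tag: String, children: Vec<Node> },
//! }
//!
//! // Steps 2-4: recursive walk that dispatches on the tag name.
//! fn walk(node: &Node, out: &mut String) {
//!     match node {
//!         Node::Text(t) => out.push_str(t),
//!         Node::Element { tag, children } => {
//!             let (open, close) = match tag.as_str() {
//!                 "strong" | "b" => ("**", "**"),
//!                 "em" | "i" => ("*", "*"),
//!                 "p" => ("", "\n\n"),
//!                 _ => ("", ""), // unrecognized tags: emit children only
//!             };
//!             out.push_str(open);
//!             for child in children {
//!                 walk(child, out);
//!             }
//!             out.push_str(close);
//!         }
//!     }
//! }
//!
//! // Step 5 (whitespace only): trim trailing whitespace, end with one newline.
//! fn normalize(markdown: &str) -> String {
//!     format!("{}\n", markdown.trim_end())
//! }
//! ```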
//!
//! # Handler Pattern
//!
//! Each submodule (block, inline, list, etc.) follows a consistent pattern:
//!
//! ```text
//! // Module declares handlers for specific element types
//! pub fn dispatch_<category>_handler(
//!     tag_name: &str,
//!     node_handle: &NodeHandle,
//!     parser: &Parser,
//!     output: &mut String,
//!     options: &ConversionOptions,
//!     ctx: &Context,
//!     depth: usize,
//!     dom_ctx: &DomContext,
//! ) -> bool {
//!     // Route to appropriate handler, return true if handled
//! }
//! ```
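//!
//! As an illustration only, a dispatcher body might look like the sketch below. The
//! signature is reduced to two parameters, and `dispatch_demo_handler` and
//! `handle_heading` are hypothetical names, not part of this crate:
//!
//! ```text
//! fn handle_heading(level: usize, output: &mut String) {
//!     output.push_str(&"#".repeat(level));
//!     output.push(' ');
//! }
//!
//! fn dispatch_demo_handler(tag_name: &str, output: &mut String) -> bool {
//!     match tag_name {
//!         "h1" => handle_heading(1, output),
//!         "h2" => handle_heading(2, output),
//!         "hr" => output.push_str("---\n"),
//!         // Not handled here: return false so the caller can try the next dispatcher.
//!         _ => return false,
//!     }
//!     true
//! }
//! ```
//!
//! Returning `bool` instead of an error lets the caller chain dispatchers and fall back
//! to default handling when none of them claims the tag.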
//!
//! # Visibility Rules
//!
//! - **Context & DomContext**: Public types for external module coordination
//! - **Dispatch functions**: Public, so the main `walk_node` caller can reach them
//! - **Individual handlers**: `pub(crate)`; internal and not part of the public API
//! - **Internal utilities**: `pub(crate)` or `pub(super)` for module-internal use
//!
//! # Feature Support
//!
//! - Inline image extraction (`inline-images` feature)
//! - Metadata collection (`metadata` feature)
//! - Custom visitor callbacks (`visitor` feature)
//!
//! # Example Integration
//!
//! Once `converter.rs` is refactored to use `converter/main.rs`, the `walk_node` function
//! will call the dispatch functions in sequence, for example:
//!
//! ```text
//! use crate::converter::{block, inline, list, media, semantic, form};
//!
//! fn walk_node(...) {
//!     // Try each dispatcher in order
//!     if block::dispatch_block_handler(&tag, ...) { return; }
//!     if inline::dispatch_inline_handler(&tag, ...) { return; }
//!     if list::dispatch_list_handler(&tag, ...) { return; }
//!     if media::dispatch_media_handler(&tag, ...) { return; }
//!     if semantic::dispatch_semantic_handler(&tag, ...) { return; }
//!     if form::dispatch_form_handler(&tag, ...) { return; }
//!     // Default handling for unrecognized tags
//! }
//! ```
// Import and re-export public types and functions from the main module
pub use Context;
pub use DomContext;
// Import the tree walker and utility functions from main and main_helpers
pub use ;
pub use trim_trailing_whitespace;
// Re-export helper functions from utility modules (migrated from converter_legacy)
pub use crate;
pub use crate;
// Helper functions migrated to utility modules
pub use crateappend_inline_suffix;
// Caching functions migrated to utility/caching
// Content functions migrated to utility/content
// Heading functions migrated to block/heading
pub use cratefind_single_heading_child;
// Link functions migrated to inline/link
// Re-export dispatch functions for routing elements to handlers
// Media module doesn't have a dispatcher - it exports utility functions
// Re-export utility submodules for public access to their types
// NOTE: utility::preprocessing is deliberately not re-exported to avoid naming conflict
// with preprocessing_helpers module. Users should access utility::preprocessing directly.
// Re-export format renderer types
// Block and inline handlers are internal - only dispatchers are exposed
// Individual handlers are pub(crate) and not meant to be part of the public API
// Re-export media utilities for internal use (crate-private)
// Re-export list utilities for internal use (crate-private)
// Semantic and form handlers are also internal (pub(crate))