# MD033 - No HTML tags
Aliases: `no-inline-html`
## What this rule does
Flags HTML tags in Markdown files so that the equivalent Markdown syntax is used instead.
## Why this matters
- **Portability**: Pure Markdown renders on any platform, while HTML may be blocked or stripped
- **Security**: Many platforms sanitize HTML for security reasons
- **Simplicity**: Markdown syntax is cleaner and easier to read than HTML
- **Consistency**: Mixing HTML and Markdown creates inconsistent documents
## Examples
### ✅ Correct
```markdown
# Heading
This is a paragraph with **bold** and *italic* text.
> This is a quote
- List item 1
- List item 2
[Link text](https://example.com)
![Image description](image.png)
Contact us at <support@example.com>
Visit <https://example.com>
```
### ❌ Incorrect
```markdown
# Heading
This is a paragraph with <strong>bold</strong> and <em>italic</em> text.
<blockquote>This is a quote</blockquote>
<ul>
<li>List item 1</li>
<li>List item 2</li>
</ul>
<a href="https://example.com">Link text</a>
<img src="image.png" alt="Image description">
```
### 🔧 Fixed
```markdown
# Heading
This is a paragraph with **bold** and *italic* text.
> This is a quote
- List item 1
- List item 2
[Link text](https://example.com)
![Image description](image.png)
```
## Configuration
```toml
[MD033]
allowed-elements = [] # List of allowed HTML tags (default: none)
disallowed-elements = [] # List of disallowed HTML tags (enables disallowed-only mode)
fix = false # Enable auto-fix to convert simple HTML to Markdown (default: false)
fix-mode = "conservative" # conservative (default) or relaxed
drop-attributes = ["target", "rel", "width", "height", "align", "class", "id", "style"] # Used in relaxed mode
strip-wrapper-elements = ["p"] # Used in relaxed mode
br-style = "trailing-spaces" # Style for <br> conversion: "trailing-spaces" or "backslash"
```
Shorthand aliases are also supported:
```toml
[MD033]
allowed = [] # Alias for allowed-elements
disallowed = [] # Alias for disallowed-elements
```
### Example allowing specific tags
```toml
[MD033]
allowed-elements = ["br", "hr", "details", "summary"]
```
This would allow line breaks, horizontal rules, and collapsible sections while blocking other HTML.
### GFM Security Mode (disallowed-only)
For GitHub Flavored Markdown, you can use the `disallowed-elements` option to only flag
security-sensitive HTML tags while allowing all other HTML. Use the special value `"gfm"`
to automatically include all GFM-disallowed tags:
```toml
[MD033]
disallowed-elements = ["gfm"]
```
This flags only these security-sensitive tags:
- `<title>`, `<textarea>`, `<style>`, `<xmp>`, `<iframe>`
- `<noembed>`, `<noframes>`, `<script>`, `<plaintext>`
These are the same tags that GitHub filters out of rendered Markdown for security reasons.
### Custom disallowed tags
You can also specify your own list of disallowed tags:
```toml
[MD033]
disallowed-elements = ["script", "iframe", "style"]
```
Or combine GFM tags with custom ones:
```toml
[MD033]
disallowed-elements = ["gfm", "marquee", "blink"]
```
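As a rough mental model of how the two lists interact, the sketch below is written in Python purely for illustration and is not the rule's implementation. The names `expand_disallowed` and `should_flag`, the case-insensitive comparison, and the assumption that a non-empty `disallowed-elements` list takes precedence over `allowed-elements` are all hypothetical.

```python
# Illustrative model only; not the linter's actual code.
# The nine tags listed above for the special "gfm" value.
GFM_DISALLOWED = frozenset({
    "title", "textarea", "style", "xmp", "iframe",
    "noembed", "noframes", "script", "plaintext",
})


def expand_disallowed(entries):
    """Expand the special "gfm" entry into the GFM-disallowed tag names."""
    expanded = set()
    for entry in entries:
        name = entry.lower()
        if name == "gfm":
            expanded |= GFM_DISALLOWED
        else:
            expanded.add(name)
    return expanded


def should_flag(tag, allowed=(), disallowed=()):
    """Return True if a tag would be reported under this simplified model."""
    name = tag.lower()
    if disallowed:
        # Disallowed-only mode: report only the listed tags (assumed precedence).
        return name in expand_disallowed(disallowed)
    # Default mode: report every tag that is not explicitly allowed.
    return name not in {a.lower() for a in allowed}
```

Under this model, `should_flag("div", disallowed=["gfm"])` is false while `should_flag("script", disallowed=["gfm"])` is true, which matches the intent of flagging only security-sensitive tags.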
### mdBook projects with semantic HTML
mdBook documentation often uses HTML with CSS classes to add semantic meaning that pure Markdown cannot express (e.g., marking text as filenames, captions, or warnings). For mdBook projects, you can allow semantic containers:
```toml
[tool.rumdl.MD033]
allowed-elements = ["div", "span"]
```
This permits semantic HTML like:
- `<span class="filename">src/main.rs</span>` - Filename styling
- `<div class="warning">Important note</div>` - Warning boxes
- `<span class="caption">Figure 1: Architecture</span>` - Figure captions
The rule still catches potentially problematic HTML, such as `<em>` and `<strong>`, which have Markdown equivalents, and `<script>`, which raises security concerns.
## Automatic fixes
Auto-fix for MD033 is **opt-in** (disabled by default). Enable it with:
```toml
[MD033]
fix = true
```
When enabled, simple HTML tags are converted to their Markdown equivalents:
| HTML | Markdown |
| --- | --- |
| `<em>text</em>`, `<i>text</i>` | `*text*` |
| `<strong>text</strong>`, `<b>text</b>` | `**text**` |
| `<code>text</code>` | `` `text` `` |
| `<a href="url">text</a>` | `[text](url)` |
| `<a href="url" title="tip">text</a>` | `[text](url "tip")` |
| `<img src="url" alt="text">` | `![text](url)` |
| `<img src="url" alt="text" title="tip">` | `![text](url "tip")` |
| `<br>`, `<br/>` | Two trailing spaces + newline |
| `<hr>`, `<hr/>` | `---` |
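
To make the paired inline conversions above concrete, here is a minimal Python sketch. It is an illustration only, not the fixer's implementation: the names `DELIMITERS`, `SIMPLE_PAIR`, and `convert_simple` are hypothetical, and the regex deliberately handles only attribute-free tags whose content contains no further HTML.

```python
import re

# Hypothetical mapping for the paired inline tags in the table above.
DELIMITERS = {"em": "*", "i": "*", "strong": "**", "b": "**", "code": "`"}

# Matches <tag>text</tag> only when the tag has no attributes and the text
# contains no angle brackets, so nested HTML is left untouched.
SIMPLE_PAIR = re.compile(r"<(em|i|strong|b|code)>([^<>]*)</\1>")


def convert_simple(text):
    """Replace simple paired inline tags with Markdown delimiters."""
    def replace(match):
        delimiter = DELIMITERS[match.group(1)]
        return f"{delimiter}{match.group(2)}{delimiter}"

    return SIMPLE_PAIR.sub(replace, text)
```

For example, `convert_simple("<strong>bold</strong> and <em>italic</em>")` returns `**bold** and *italic*`, while a tag carrying attributes is returned unchanged.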
### Relaxed conversion mode
By default, MD033 uses conservative conversion. In conservative mode, links and images
with significant extra attributes are not converted.
If you want broader conversion for common real-world docs HTML, enable relaxed mode:
```toml
[MD033]
fix = true
fix-mode = "relaxed"
```
In relaxed mode:
- `<a href="..." target="_blank">...</a>` can be converted to Markdown link syntax
- `<img src="..." alt="..." width="120">` can be converted to Markdown image syntax
- configured wrapper elements (default: `p`) can be stripped once their inner content no longer contains HTML tags
You can customize which attributes are dropped and which wrappers are stripped:
```toml
[MD033]
fix = true
fix-mode = "relaxed"
drop-attributes = ["target", "rel", "width", "height", "align", "class", "id", "style"]
strip-wrapper-elements = ["p"]
```
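One way to picture how `fix-mode` and `drop-attributes` decide whether a link or image is converted is the Python sketch below. It is a simplified model, not the rule's implementation: `can_convert`, `CORE_ATTRIBUTES`, and `DEFAULT_DROP` are hypothetical names, conservative mode is approximated as "any extra attribute blocks conversion", and event handlers are approximated as any attribute name starting with `on`.

```python
# Attributes that map directly onto Markdown link/image syntax.
CORE_ATTRIBUTES = frozenset({"href", "src", "alt", "title"})

# Mirrors the default drop-attributes list shown above.
DEFAULT_DROP = frozenset(
    {"target", "rel", "width", "height", "align", "class", "id", "style"}
)


def can_convert(attributes, mode="conservative", drop_attributes=DEFAULT_DROP):
    """Return True if a tag with these attribute names could be converted."""
    extra = {name.lower() for name in attributes} - CORE_ATTRIBUTES
    # Event handlers (onclick, onload, ...) always block conversion,
    # even if a user lists them in drop_attributes.
    if any(name.startswith("on") for name in extra):
        return False
    if mode == "relaxed":
        # Every extra attribute must be explicitly droppable.
        return extra <= {name.lower() for name in drop_attributes}
    # Simplification: in conservative mode any extra attribute blocks conversion.
    return not extra
```

In this model, `can_convert(["href", "target"])` is false, but the same call with `mode="relaxed"` is true because `target` is in the drop list.
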
**Link and image conversion safety:**
Links (`<a>`) and images (`<img>`) are only converted when safe:
- URL must use a safe scheme (http, https, mailto, tel, ftp) or be a relative path
- Dangerous schemes like `javascript:`, `data:`, `vbscript:` are never converted
- URL encoding bypass attempts are detected and blocked (for example, `java%73cript:`)
- Conservative mode: tags with significant extra attributes are not converted
- Relaxed mode: only attributes in `drop-attributes` are dropped; unknown extra attributes still block conversion
- Event handler attributes (`onclick`, `onload`, etc.) are never dropped, even in relaxed mode
- The `title` attribute IS supported and preserved in the Markdown output
- Links with nested HTML content are not converted
- Special characters in URLs (parentheses) and text (brackets) are properly escaped
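
The scheme check described in the list above can be sketched in Python as follows. This is an assumption-laden illustration rather than the actual check: `is_safe_url`, `SAFE_SCHEMES`, and `SCHEME` are hypothetical names, only percent-encoding is decoded, and any URL without a scheme is treated as a relative path.

```python
import re
from urllib.parse import unquote

# Schemes the list above describes as safe.
SAFE_SCHEMES = frozenset({"http", "https", "mailto", "tel", "ftp"})

# A URL scheme: a letter followed by letters, digits, "+", "." or "-", then ":".
SCHEME = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*):")


def is_safe_url(url):
    """Return True if the URL is relative or uses an allow-listed scheme."""
    # Percent-decode first so that java%73cript: is seen as javascript:.
    decoded = unquote(url).strip()
    match = SCHEME.match(decoded)
    if match is None:
        return True  # no scheme: treated as a relative path in this sketch
    return match.group(1).lower() in SAFE_SCHEMES
```

Decoding before matching is the key design choice here: checking the raw string would let `java%73cript:alert(1)` slip past a naive comparison against `javascript:`.
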
**Limitations:**
- Conservative mode: tags with attributes beyond href/src/alt/title are usually not converted
- Relaxed mode: wrapper stripping is applied only after wrapper inner content no longer contains HTML tags
- Tags with nested HTML content are not converted to Markdown
- Complex tags (like `<div>`, `<span>`) have their content extracted but are not converted to Markdown equivalents
- For deeply nested HTML, you may need to run the fix multiple times
**Line break style:**
By default, `<br>` tags are converted to two trailing spaces followed by a newline (CommonMark standard). You can use backslash-style line breaks instead:
```toml
[MD033]
fix = true
br-style = "backslash" # Converts <br> to backslash + newline
```
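The difference between the two `br-style` values can be illustrated with a short Python sketch. It is not the fixer's code: `convert_br` and `BR_TAG` are hypothetical names, and the choice to absorb a newline that directly follows the tag is an assumption made to avoid doubled line breaks.

```python
import re

# Matches <br>, <br/>, and <br /> plus one directly following newline, if any.
BR_TAG = re.compile(r"<br\s*/?>\n?", re.IGNORECASE)


def convert_br(text, style="trailing-spaces"):
    """Replace <br> tags with a Markdown hard line break in the given style."""
    if style not in ("trailing-spaces", "backslash"):
        raise ValueError(f"unknown br-style: {style!r}")
    marker = "\\" if style == "backslash" else "  "
    # A function replacement avoids escape processing of the backslash.
    return BR_TAG.sub(lambda _match: marker + "\n", text)
```

With the default style, `one<br>two` becomes `one` followed by two spaces and a newline; with `style="backslash"`, the line ends in a single backslash instead.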
## What's allowed
These are **not** considered HTML and are allowed:
- HTML comments: `<!-- This is a comment -->`
- Email autolinks: `<user@example.com>`
- URL autolinks: `<https://example.com>`
- FTP autolinks: `<ftp://files.example.com>`
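
To see why comments and autolinks are not treated as tags, consider the simplified Python detector below. It is only a sketch, not the rule's parser: `TAG` and `find_html_tags` are hypothetical names, and the regex ignores code blocks and would still match a tag written inside a comment.

```python
import re

# A tag name must start with a letter and be followed by whitespace, "/" or ">".
# "<!--" fails on "!", and autolinks fail because ":" or "@" follows the name.
TAG = re.compile(r"</?([A-Za-z][A-Za-z0-9-]*)(?:\s[^<>]*)?/?>")


def find_html_tags(text):
    """Return the lowercase names of HTML-like tags found in the text."""
    return [match.group(1).lower() for match in TAG.finditer(text)]
```

Running `find_html_tags` on a line that contains only a comment, an email autolink, and a URL autolink returns an empty list, whereas `<em>hi</em>` yields `["em", "em"]` for the opening and closing tags.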
## Learn more
- [CommonMark HTML blocks](https://spec.commonmark.org/0.31.2/#html-blocks) - When HTML is needed
- [Markdown Guide - Basic Syntax](https://www.markdownguide.org/basic-syntax/) - Markdown alternatives to HTML
## Related rules
- [MD046](md046.md) - Code block style should be consistent
- [MD034](md034.md) - URLs should be formatted as links