Cached Regex Patterns and Fast Content Checks for Markdown Linting
This module provides a centralized collection of pre-compiled, cached regex patterns for all major Markdown constructs (headings, lists, code blocks, links, images, etc.). It also includes fast-path utility functions for quickly checking if content potentially contains certain Markdown elements, allowing rules to skip expensive processing when unnecessary.
§Performance
All regexes are compiled once, on first use, via lazy_static, which avoids
repeated compilation across the linter. Use these shared patterns
in rules instead of compiling new regexes.
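The compile-once idea can be sketched with the standard library alone. The example below is only an illustration under stated assumptions: it uses `std::sync::OnceLock` in place of lazy_static, and `CompiledPattern`, `heading_pattern`, and `compile_count` are hypothetical names standing in for a real compiled regex and its accessor. None of them are part of this module.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

// Counts how many times the (pretend) expensive compilation runs.
static COMPILE_COUNT: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for a compiled regex; a real rule would hold a regex object.
struct CompiledPattern {
    prefix: &'static str,
}

impl CompiledPattern {
    // Pretend this is expensive, like compiling a regex.
    fn compile(prefix: &'static str) -> Self {
        COMPILE_COUNT.fetch_add(1, Ordering::SeqCst);
        CompiledPattern { prefix }
    }

    // Matches lines whose first non-whitespace text starts with the prefix.
    fn is_match(&self, line: &str) -> bool {
        line.trim_start().starts_with(self.prefix)
    }
}

// Shared accessor: the pattern is built on the first call and reused afterwards.
fn heading_pattern() -> &'static CompiledPattern {
    static PATTERN: OnceLock<CompiledPattern> = OnceLock::new();
    PATTERN.get_or_init(|| CompiledPattern::compile("#"))
}

fn compile_count() -> usize {
    COMPILE_COUNT.load(Ordering::SeqCst)
}

fn main() {
    let lines = ["# Title", "plain text", "  ## Indented heading"];
    let headings = lines
        .iter()
        .filter(|l| heading_pattern().is_match(l))
        .count();
    println!("{headings} heading-like lines, compiled {} time(s)", compile_count());
}
```

However many lines are checked, the initializer closure runs only once. The shared statics in this module are meant to give rules the same property.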
§Usage
- Use the provided statics for common Markdown patterns.
- Use the regex_lazy! macro for ad-hoc regexes that are not predefined.
- Use the utility functions for fast content checks before running regexes.
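A minimal sketch of the fast-path idea from the last bullet follows, using only the standard library. `might_contain_url` and `find_url_lines` are hypothetical helpers written for this illustration, not functions exported by this module, and the per-line scan merely stands in for a real regex pass.

```rust
// Hypothetical cheap pre-check: any URL-like text needs "://" or "www." somewhere.
fn might_contain_url(content: &str) -> bool {
    content.contains("://") || content.contains("www.")
}

// Stand-in for the expensive pass (a real rule would run a shared regex here).
fn find_url_lines(content: &str) -> Vec<usize> {
    // Fast path: skip the per-line scan entirely when no URL can be present.
    if !might_contain_url(content) {
        return Vec::new();
    }
    content
        .lines()
        .enumerate()
        .filter(|(_, line)| {
            line.split_whitespace().any(|word| {
                word.starts_with("http://")
                    || word.starts_with("https://")
                    || word.starts_with("www.")
            })
        })
        .map(|(idx, _)| idx + 1) // report 1-based line numbers
        .collect()
}

fn main() {
    let doc = "Intro\nSee https://example.com for details\nDone";
    println!("{:?}", find_url_lines(doc));
}
```

The pre-check may return false positives (for example, "://" inside prose), but it should never return false negatives, so skipping the expensive pass is safe whenever it returns false.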
Macros§
- regex_lazy - Macro for defining a lazily-initialized, cached regex pattern.
Structs§
- RegexCache - Global regex cache for dynamic patterns.
Constants§
- URL_IPV6_STR - Pattern for IPv6 URLs specifically.
- URL_QUICK_CHECK_STR - Quick check pattern for early exits.
- URL_SIMPLE_STR - Simple URL pattern for content detection.
- URL_STANDARD_STR - Pattern for standard HTTP(S)/FTP(S) URLs with full path support.
- URL_WWW_STR - Pattern for www URLs without protocol.
- XMPP_URI_STR - Pattern for XMPP URIs per GFM extended autolinks specification.
Statics§
- ABBREVIATION
- ASTERISK_EMPHASIS
- ATX_HEADING_REGEX
- BLOCKQUOTE_PREFIX_RE
- DISPLAY_MATH_REGEX
- DOUBLE_UNDERSCORE_EMPHASIS
- EMAIL_PATTERN
- EMOJI_SHORTCODE_REGEX
- FENCED_CODE_BLOCK_END
- FENCED_CODE_BLOCK_START
- FOOTNOTE_REF_REGEX
- HTML_COMMENT_PATTERN
- HTML_ENTITY_REGEX
- HTML_HEADING_PATTERN
- HTML_TAG_PATTERN
- HTML_TAG_QUICK_CHECK
- HTML_TAG_REGEX
- HUGO_SHORTCODE_REGEX
- IMAGE_REF_PATTERN
- IMAGE_REGEX
- INLINE_IMAGE_REGEX
- INLINE_LINK_FANCY_REGEX
- INLINE_MATH_REGEX
- LINKED_IMAGE_INLINE_INLINE
- LINKED_IMAGE_INLINE_REF
- LINKED_IMAGE_REF_INLINE
- LINKED_IMAGE_REF_REF
- LINK_REF_PATTERN
- LIST_ITEM
- ORDERED_LIST_MARKER_REGEX
- REF_IMAGE_REGEX
- REF_LINK_REGEX
- SHORTCUT_REF_REGEX
- UNDERSCORE_EMPHASIS
- UNORDERED_LIST_MARKER_REGEX
- URL_IPV6_REGEX - IPv6 URL regex, for URLs with IPv6 addresses. See URL_IPV6_STR for documentation.
- URL_PATTERN - Alias for URL_SIMPLE_REGEX. Used by MD013 for line length exemption.
- URL_QUICK_CHECK_REGEX - Quick check regex, a fast early-exit test. See URL_QUICK_CHECK_STR for documentation.
- URL_SIMPLE_REGEX - Simple URL regex, for content detection and line length exemption. See URL_SIMPLE_STR for documentation.
- URL_STANDARD_REGEX - Standard URL regex, the primary pattern for bare URL detection (MD034). See URL_STANDARD_STR for documentation.
- URL_WWW_REGEX - WWW URL regex, for URLs starting with www. without protocol. See URL_WWW_STR for documentation.
- WIKI_LINK_REGEX
- XMPP_URI_REGEX - XMPP URI regex, for GFM extended autolinks. See XMPP_URI_STR for documentation.
Functions§
- escape_regex - Escapes a string for use in a regex pattern.
- get_cache_stats - Gets cache usage statistics.
- get_cached_regex - Gets a regex from the global cache.
- is_blank_in_blockquote_context - Checks if a line is blank in the context of blockquotes.
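To give a rough sense of what "blank in the context of blockquotes" could mean, the sketch below treats a line as blank when it contains nothing but `>` markers and whitespace. This is an assumption made for illustration: `blank_in_blockquote_sketch` is a hypothetical function, and the module's actual is_blank_in_blockquote_context may apply different rules.

```rust
// Hypothetical sketch: a line counts as blank inside a blockquote if, apart
// from '>' markers and whitespace, it carries no content at all.
fn blank_in_blockquote_sketch(line: &str) -> bool {
    line.chars().all(|c| c == '>' || c.is_whitespace())
}

fn main() {
    for line in ["", ">", "> >  ", "> quoted text", "plain text"] {
        println!("{:?} -> {}", line, blank_in_blockquote_sketch(line));
    }
}
```

A check like this would let a rule treat a bare `>` line as a paragraph separator within a quote instead of as content.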