Cached Regex Patterns and Fast Content Checks for Markdown Linting
This module provides a centralized collection of pre-compiled, cached regex patterns for all major Markdown constructs (headings, lists, code blocks, links, images, etc.). It also includes fast-path utility functions for quickly checking if content potentially contains certain Markdown elements, allowing rules to skip expensive processing when unnecessary.
§Performance
All regexes are compiled once, on first use, via lazy_static. This avoids repeated
compilation and improves performance across the linter. Rules should use these shared
patterns instead of compiling new regexes.
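The compile-once behaviour can be illustrated with the standard library alone. The sketch below is not this module's code: it uses `std::sync::OnceLock` as a stand-in for lazy_static, and `build_heading_prefixes`, `heading_prefixes`, and `build_count` are hypothetical names, with a cheap string table standing in for an expensive `Regex::new` call.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how many times the (hypothetical) expensive build step runs.
static BUILD_COUNT: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for an expensive compile step such as building a regex.
fn build_heading_prefixes() -> Vec<String> {
    BUILD_COUNT.fetch_add(1, Ordering::SeqCst);
    // "# ", "## ", ... up to six hashes.
    (1..=6).map(|level| format!("{} ", "#".repeat(level))).collect()
}

/// Shared accessor: builds the value on the first call, then returns
/// the same cached value on every later call.
fn heading_prefixes() -> &'static [String] {
    static PREFIXES: OnceLock<Vec<String>> = OnceLock::new();
    PREFIXES.get_or_init(build_heading_prefixes)
}

/// Number of times the build step has run so far.
fn build_count() -> usize {
    BUILD_COUNT.load(Ordering::SeqCst)
}
```

Calling `heading_prefixes()` from any number of rules runs the build step once. That is the property the shared statics are meant to provide.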
§Usage
- Use the provided statics for common Markdown patterns.
- Use the regex_lazy! macro for ad-hoc regexes that are not predefined.
- Use the utility functions for fast content checks before running regexes.
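The last point, a cheap check that gates the expensive work, might look like the following sketch. Both functions are hypothetical illustrations rather than this module's API, and the heading detection is deliberately simplified (plain string scanning instead of a regex, ATX headings only).

```rust
/// Hypothetical fast check: a heading needs a '#' (ATX) or an
/// underline made of '=' or '-' (setext). If none of these bytes
/// occur, no heading can be present.
fn might_contain_heading(content: &str) -> bool {
    content.bytes().any(|b| matches!(b, b'#' | b'=' | b'-'))
}

/// Hypothetical rule body: returns 1-based line numbers of ATX headings.
/// The per-line scan stands in for the expensive regex pass.
fn atx_heading_lines(content: &str) -> Vec<usize> {
    // Fast path: skip the per-line work when no marker byte exists.
    if !might_contain_heading(content) {
        return Vec::new();
    }
    content
        .lines()
        .enumerate()
        .filter(|(_, line)| {
            let trimmed = line.trim_start();
            let hashes = trimmed.bytes().take_while(|&b| b == b'#').count();
            // One to six '#', followed by a space or the end of the line.
            (1..=6).contains(&hashes)
                && (trimmed.len() == hashes || trimmed.as_bytes()[hashes] == b' ')
        })
        .map(|(index, _)| index + 1)
        .collect()
}
```

The fast check may return false positives (a `-` in ordinary prose, for example). It must never return a false negative, because a negative result skips the rule entirely.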
Macros§
- regex_lazy - Macro for defining a lazily-initialized, cached regex pattern. Use this for ad-hoc regexes that are not already defined in this module.
Structs§
- ABBREVIATION
- ALTERNATE_FENCED_CODE_BLOCK_END
- ALTERNATE_FENCED_CODE_BLOCK_START
- ASTERISK_EMPHASIS
- ATX_HEADING_REGEX
- ATX_HEADING_WITH_CAPTURE
- BARE_URL_REGEX
- BLOCKQUOTE_PREFIX_RE
- BOLD_ASTERISK_REGEX
- BOLD_UNDERSCORE_REGEX
- CLOSED_ATX_HEADING_REGEX
- CODE_FENCE_REGEX
- DECIMAL_NUMBER
- DISPLAY_MATH_REGEX
- DOUBLE_ASTERISK_EMPHASIS
- DOUBLE_ASTERISK_SPACE_END
- DOUBLE_ASTERISK_SPACE_START
- DOUBLE_UNDERSCORE_EMPHASIS
- EMAIL_PATTERN
- EMOJI_SHORTCODE_REGEX
- EMPHASIS_REGEX
- EXTERNAL_URL_REGEX
- FENCED_CODE_BLOCK_END
- FENCED_CODE_BLOCK_END_REGEX
- FENCED_CODE_BLOCK_START
- FENCED_CODE_BLOCK_START_REGEX
- FOOTNOTE_REF_REGEX
- FRONT_MATTER_REGEX
- HEADING_CHECK
- HR_ASTERISK
- HR_DASH
- HR_SPACED_ASTERISK
- HR_SPACED_DASH
- HR_SPACED_UNDERSCORE
- HR_UNDERSCORE
- HTML_COMMENT_END
- HTML_COMMENT_PATTERN
- HTML_COMMENT_START
- HTML_ENTITY_REGEX
- HTML_HEADING_PATTERN
- HTML_SELF_CLOSING_TAG_REGEX
- HTML_TAG_FINDER
- HTML_TAG_PATTERN
- HTML_TAG_QUICK_CHECK
- HTML_TAG_REGEX
- IMAGE_REF_PATTERN
- IMAGE_REGEX
- INDENTED_CODE_BLOCK_PATTERN
- INDENTED_CODE_BLOCK_REGEX
- INLINE_CODE_REGEX
- INLINE_IMAGE_FANCY_REGEX
- INLINE_LINK_FANCY_REGEX
- INLINE_LINK_REGEX
- INLINE_MATH_REGEX
- ITALIC_ASTERISK_REGEX
- ITALIC_UNDERSCORE_REGEX
- LINK_REFERENCE_DEFINITION_REGEX
- LINK_REF_PATTERN
- LINK_REGEX
- LINK_TEXT_FULL_REGEX
- LINK_TEXT_REGEX
- LIST_ITEM
- LIST_MARKER_ANY_REGEX
- MULTIPLE_BLANK_LINES_REGEX
- MULTIPLE_HYPHENS
- ORDERED_LIST_MARKER_REGEX
- REFERENCE_LINK
- REF_IMAGE_REGEX
- REF_LINK_REGEX
- RegexCache - Global regex cache for dynamic patterns
- SENTENCE_END
- SETEXT_HEADING_REGEX
- SETEXT_HEADING_WITH_CAPTURE
- SHORTCUT_REF_REGEX
- SPACE_IN_EMPHASIS_REGEX
- STRIKETHROUGH_FANCY_REGEX
- STRIKETHROUGH_REGEX
- TOC_SECTION_START
- TRAILING_PUNCTUATION_REGEX
- TRAILING_WHITESPACE_REGEX
- UNDERSCORE_EMPHASIS
- UNORDERED_LIST_MARKER_REGEX
- URL_IN_TEXT
- URL_PATTERN
- URL_REGEX
- WIKI_LINK_REGEX
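For patterns that are only known at runtime, a global cache such as RegexCache avoids recompiling the same pattern string. The sketch below shows the general idea under stated assumptions; it is not the actual RegexCache implementation. `PatternCache`, `get_or_compile`, and `len` are hypothetical names, and the generic `T` stands in for a compiled regex type so the example needs only the standard library.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical cache keyed by pattern source text.
/// `T` stands in for a compiled regex type.
struct PatternCache<T> {
    entries: Mutex<HashMap<String, Arc<T>>>,
}

impl<T> PatternCache<T> {
    fn new() -> Self {
        Self { entries: Mutex::new(HashMap::new()) }
    }

    /// Returns the cached value for `pattern`, calling `compile` only
    /// on a cache miss. Failed compilations are not cached.
    fn get_or_compile<E>(
        &self,
        pattern: &str,
        compile: impl FnOnce(&str) -> Result<T, E>,
    ) -> Result<Arc<T>, E> {
        let mut entries = self.entries.lock().unwrap();
        if let Some(hit) = entries.get(pattern) {
            return Ok(Arc::clone(hit));
        }
        let compiled = Arc::new(compile(pattern)?);
        entries.insert(pattern.to_string(), Arc::clone(&compiled));
        Ok(compiled)
    }

    /// Number of cached patterns.
    fn len(&self) -> usize {
        self.entries.lock().unwrap().len()
    }
}
```

Handing out `Arc<T>` lets callers keep a compiled pattern without holding the lock. Holding the mutex during compilation keeps the sketch short, at the cost of serializing concurrent misses.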
Functions§
- contains_url - Detect URLs with a character-by-character scanner that is much faster than a regex when the content contains no URL
- escape_regex - Escapes a string to be used in a regex pattern
- get_cache_stats - Get cache usage statistics
- get_cached_fancy_regex - Get a fancy regex from the global cache
- get_cached_regex - Get a regex from the global cache
- has_code_block_markers - Check if content contains any code blocks (quick check before regex)
- has_emphasis_markers - Check if content contains any emphasis markers (quick check before regex)
- has_heading_markers - Check if content contains any headings (quick check before regex)
- has_html_tags - Check if content contains any HTML tags (quick check before regex)
- has_image_markers - Check if content contains any images (quick check before regex)
- has_link_markers - Check if content contains any links (quick check before regex)
- has_list_markers - Check if content contains any lists (quick check before regex)
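A scanner in the spirit of contains_url might look like the sketch below. It is an assumption about the general technique, not the function's real implementation: `might_contain_url` is a hypothetical name, and the set of recognised schemes (lowercase `http`, `https`, `ftp`) is chosen only for illustration.

```rust
/// Hypothetical scanner: true if `content` contains "http://",
/// "https://" or "ftp://" (lowercase only in this sketch).
fn might_contain_url(content: &str) -> bool {
    let bytes = content.as_bytes();
    let mut i = 0;
    while i + 2 < bytes.len() {
        // Every scheme of interest is followed by "://", so only
        // positions holding ':' need a closer look.
        if bytes[i] == b':' && bytes[i + 1] == b'/' && bytes[i + 2] == b'/' {
            let before = &bytes[..i];
            if before.ends_with(b"http")
                || before.ends_with(b"https")
                || before.ends_with(b"ftp")
            {
                return true;
            }
        }
        i += 1;
    }
    false
}
```

Content without a `:` is rejected after a single pass over the bytes, with no regex engine involved. A rule would run the URL regexes only when this check returns true.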