Module regex_cache

Cached Regex Patterns and Fast Content Checks for Markdown Linting

This module provides a centralized collection of pre-compiled, cached regex patterns for all major Markdown constructs (headings, lists, code blocks, links, images, etc.). It also includes fast-path utility functions for quickly checking if content potentially contains certain Markdown elements, allowing rules to skip expensive processing when unnecessary.

§Performance

All regexes are compiled once, on first use, via lazy_static, so no pattern is recompiled during a lint run. Rules should use these shared patterns instead of compiling their own regexes.
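The compile-once behaviour can be illustrated with a minimal sketch. It uses the standard library's `OnceLock` in place of `lazy_static`, and a hypothetical `CompiledPattern` type stands in for a real compiled regex so that the example has no external dependencies. None of the names below come from this module.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical stand-in for a compiled regex (not a type from this module).
struct CompiledPattern {
    source: String,
}

/// Counts how many times the expensive "compile" step actually runs.
static COMPILE_COUNT: AtomicUsize = AtomicUsize::new(0);

/// Assumed to be expensive, like building a real regex.
fn compile(source: &str) -> CompiledPattern {
    COMPILE_COUNT.fetch_add(1, Ordering::SeqCst);
    CompiledPattern {
        source: source.to_string(),
    }
}

/// Shared accessor: the pattern is compiled on the first call only.
fn atx_heading_pattern() -> &'static CompiledPattern {
    static PATTERN: OnceLock<CompiledPattern> = OnceLock::new();
    PATTERN.get_or_init(|| compile(r"^#{1,6}\s"))
}

fn main() {
    // Many rules may ask for the same pattern; only the first call compiles it.
    for _ in 0..1000 {
        let _ = atx_heading_pattern();
    }
    println!(
        "pattern {:?} compiled {} time(s)",
        atx_heading_pattern().source,
        COMPILE_COUNT.load(Ordering::SeqCst)
    );
}
```

Every later call returns a reference to the same value, which is the property that makes sharing the module's statics cheaper than building a regex inside each rule.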

§Usage

  • Use the provided statics for common Markdown patterns.
  • Use the regex_lazy! macro for ad-hoc regexes that are not predefined.
  • Use the utility functions for fast content checks before running regexes.
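The fast-path idea from the list above can be sketched as follows. This is an illustration under stated assumptions, not the module's implementation: `might_contain_heading` is a hypothetical stand-in for a check such as `has_heading_markers`, and `count_atx_headings` stands in for a more expensive regex-based pass.

```rust
/// Hypothetical fast-path check: a single byte scan for characters that a
/// heading needs ('#' for ATX headings, '=' or '-' for setext underlines).
/// A `false` result means the content cannot contain a heading.
fn might_contain_heading(content: &str) -> bool {
    content.bytes().any(|b| matches!(b, b'#' | b'=' | b'-'))
}

/// Stand-in for the expensive step: counts lines that look like ATX headings
/// (1 to 6 '#' characters followed by a space, a tab, or the end of the line).
fn count_atx_headings(content: &str) -> usize {
    content
        .lines()
        .filter(|line| {
            let trimmed = line.trim_start();
            let hashes = trimmed.bytes().take_while(|&b| b == b'#').count();
            // '#' is ASCII, so `hashes` is always a valid char boundary.
            let rest = &trimmed[hashes..];
            (1..=6).contains(&hashes)
                && (rest.is_empty() || rest.starts_with(' ') || rest.starts_with('\t'))
        })
        .count()
}

/// A rule skips the expensive pass entirely when the quick check fails.
fn count_headings_fast(content: &str) -> usize {
    if !might_contain_heading(content) {
        return 0;
    }
    count_atx_headings(content)
}

fn main() {
    let doc = "# Title\n\nBody text\n\n## Section\n";
    println!("headings found: {}", count_headings_fast(doc));
}
```

The quick check may return `true` for content without headings (for example, a line containing a hyphen), so it only serves as a filter; the full pass still decides the result.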

Macros§

regex_lazy
Macro for defining a lazily-initialized, cached regex pattern. Use this for ad-hoc regexes that are not already defined in this module.
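As a rough illustration of how a macro of this kind can work, the sketch below defines a hypothetical `lazy_cached!` macro using only the standard library. It is not this module's `regex_lazy!` and makes no claim about that macro's actual syntax; the cached value is a plain `Vec` rather than a regex so the example stays dependency-free.

```rust
use std::sync::OnceLock;

/// Hypothetical macro (not this module's `regex_lazy!`): each call site gets
/// its own hidden static, initialized on first use and reused afterwards.
macro_rules! lazy_cached {
    ($ty:ty, $init:expr) => {{
        static CELL: OnceLock<$ty> = OnceLock::new();
        CELL.get_or_init(|| $init)
    }};
}

/// Stand-in for an ad-hoc pattern that only one rule needs.
fn todo_markers() -> &'static Vec<&'static str> {
    lazy_cached!(Vec<&'static str>, vec!["TODO", "FIXME", "XXX"])
}

/// Uses the cached value without rebuilding it on every call.
fn has_todo_marker(line: &str) -> bool {
    todo_markers().iter().any(|marker| line.contains(*marker))
}

fn main() {
    println!("{}", has_todo_marker("// TODO: tighten this pattern"));
    println!("{}", has_todo_marker("nothing to see here"));
}
```

Because the static lives inside the macro expansion, the caller does not need to declare or name it, which is what makes this style convenient for one-off patterns.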

Structs§

ABBREVIATION
ALTERNATE_FENCED_CODE_BLOCK_END
ALTERNATE_FENCED_CODE_BLOCK_START
ASTERISK_EMPHASIS
ATX_HEADING_REGEX
ATX_HEADING_WITH_CAPTURE
BARE_URL_REGEX
BLOCKQUOTE_PREFIX_RE
BOLD_ASTERISK_REGEX
BOLD_UNDERSCORE_REGEX
CLOSED_ATX_HEADING_REGEX
CODE_FENCE_REGEX
DECIMAL_NUMBER
DISPLAY_MATH_REGEX
DOUBLE_ASTERISK_EMPHASIS
DOUBLE_ASTERISK_SPACE_END
DOUBLE_ASTERISK_SPACE_START
DOUBLE_UNDERSCORE_EMPHASIS
EMAIL_PATTERN
EMOJI_SHORTCODE_REGEX
EMPHASIS_REGEX
EXTERNAL_URL_REGEX
FENCED_CODE_BLOCK_END
FENCED_CODE_BLOCK_END_REGEX
FENCED_CODE_BLOCK_START
FENCED_CODE_BLOCK_START_REGEX
FOOTNOTE_REF_REGEX
FRONT_MATTER_REGEX
HEADING_CHECK
HR_ASTERISK
HR_DASH
HR_SPACED_ASTERISK
HR_SPACED_DASH
HR_SPACED_UNDERSCORE
HR_UNDERSCORE
HTML_COMMENT_END
HTML_COMMENT_PATTERN
HTML_COMMENT_START
HTML_ENTITY_REGEX
HTML_HEADING_PATTERN
HTML_SELF_CLOSING_TAG_REGEX
HTML_TAG_FINDER
HTML_TAG_PATTERN
HTML_TAG_QUICK_CHECK
HTML_TAG_REGEX
IMAGE_REF_PATTERN
IMAGE_REGEX
INDENTED_CODE_BLOCK_PATTERN
INDENTED_CODE_BLOCK_REGEX
INLINE_CODE_REGEX
INLINE_IMAGE_FANCY_REGEX
INLINE_LINK_FANCY_REGEX
INLINE_LINK_REGEX
INLINE_MATH_REGEX
ITALIC_ASTERISK_REGEX
ITALIC_UNDERSCORE_REGEX
LINK_REFERENCE_DEFINITION_REGEX
LINK_REF_PATTERN
LINK_REGEX
LINK_TEXT_FULL_REGEX
LINK_TEXT_REGEX
LIST_ITEM
LIST_MARKER_ANY_REGEX
MULTIPLE_BLANK_LINES_REGEX
MULTIPLE_HYPHENS
ORDERED_LIST_MARKER_REGEX
REFERENCE_LINK
REF_IMAGE_REGEX
REF_LINK_REGEX
RegexCache
Global regex cache for dynamic patterns
SENTENCE_END
SETEXT_HEADING_REGEX
SETEXT_HEADING_WITH_CAPTURE
SHORTCUT_REF_REGEX
SPACE_IN_EMPHASIS_REGEX
STRIKETHROUGH_FANCY_REGEX
STRIKETHROUGH_REGEX
TOC_SECTION_START
TRAILING_PUNCTUATION_REGEX
TRAILING_WHITESPACE_REGEX
UNDERSCORE_EMPHASIS
UNORDERED_LIST_MARKER_REGEX
URL_IN_TEXT
URL_PATTERN
URL_REGEX
WIKI_LINK_REGEX
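For patterns that are only known at run time, a global cache such as the `RegexCache` entry listed above can key compiled patterns by their source string. The sketch below shows one possible shape for such a cache using only the standard library. `PatternCache`, `CompiledPattern`, `cached_pattern`, and `cache_stats` are hypothetical names, and the hit/miss counters are an assumption about what usage statistics might track.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, OnceLock};

/// Hypothetical stand-in for a compiled regex.
#[derive(Debug)]
struct CompiledPattern {
    source: String,
}

/// Hypothetical cache: maps pattern source text to a shared compiled pattern.
#[derive(Default)]
struct PatternCache {
    entries: HashMap<String, Arc<CompiledPattern>>,
    hits: u64,
    misses: u64,
}

impl PatternCache {
    /// Returns the cached pattern, "compiling" and storing it on a miss.
    fn get(&mut self, source: &str) -> Arc<CompiledPattern> {
        if let Some(found) = self.entries.get(source) {
            self.hits += 1;
            return Arc::clone(found);
        }
        self.misses += 1;
        let compiled = Arc::new(CompiledPattern {
            source: source.to_string(),
        });
        self.entries.insert(source.to_string(), Arc::clone(&compiled));
        compiled
    }
}

/// Process-wide instance, created on first use and guarded by a mutex.
fn global_cache() -> &'static Mutex<PatternCache> {
    static CACHE: OnceLock<Mutex<PatternCache>> = OnceLock::new();
    CACHE.get_or_init(|| Mutex::new(PatternCache::default()))
}

fn cached_pattern(source: &str) -> Arc<CompiledPattern> {
    global_cache().lock().unwrap().get(source)
}

/// Returns (hits, misses) for the global cache.
fn cache_stats() -> (u64, u64) {
    let cache = global_cache().lock().unwrap();
    (cache.hits, cache.misses)
}

fn main() {
    let first = cached_pattern(r"\d+");
    let second = cached_pattern(r"\d+");
    println!("same instance: {}", Arc::ptr_eq(&first, &second));
    println!("source: {}, stats: {:?}", first.source, cache_stats());
}
```

Returning an `Arc` lets callers keep using a pattern after the lock is released, so the mutex is held only for the lookup itself.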

Functions§

contains_url
Fast URL detection using a character-by-character scanner, which is much faster than a regex when the content contains no URL
escape_regex
Escapes a string to be used in a regex pattern
get_cache_stats
Get cache usage statistics
get_cached_fancy_regex
Get a fancy regex from the global cache
get_cached_regex
Get a regex from the global cache
has_code_block_markers
Check if content contains any code blocks (quick check before regex)
has_emphasis_markers
Check if content contains any emphasis markers (quick check before regex)
has_heading_markers
Check if content contains any headings (quick check before regex)
has_html_tags
Check if content contains any HTML tags (quick check before regex)
has_image_markers
Check if content contains any images (quick check before regex)
has_link_markers
Check if content contains any links (quick check before regex)
has_list_markers
Check if content contains any lists (quick check before regex)
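To illustrate the scanner approach described for `contains_url`, here is a minimal sketch of a byte-by-byte check that avoids a regex. It is not the module's implementation: `might_contain_url` is a hypothetical name, and the set of recognized schemes (`http`, `https`, `ftp`, lowercase only) is an assumption made for the example.

```rust
/// Hypothetical scanner: returns true if `content` contains "http://",
/// "https://", or "ftp://" (lowercase schemes only, by assumption).
fn might_contain_url(content: &str) -> bool {
    let bytes = content.as_bytes();
    let mut i = 0;
    while i + 2 < bytes.len() {
        // Cheap test first: most bytes are not ':', so the loop usually
        // advances after a single comparison.
        if bytes[i] == b':' && bytes[i + 1] == b'/' && bytes[i + 2] == b'/' {
            // Only now look backwards for a known scheme.
            let before = &bytes[..i];
            if before.ends_with(b"http")
                || before.ends_with(b"https")
                || before.ends_with(b"ftp")
            {
                return true;
            }
        }
        i += 1;
    }
    false
}

fn main() {
    println!("{}", might_contain_url("See https://example.com for details."));
    println!("{}", might_contain_url("No links here, just a ratio of 3:1."));
}
```

A rule can call a check like this first and run its URL regex only when the result is `true`, which keeps the common no-URL case to a single pass over the bytes.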