Module sentence_utils

Sentence detection utilities

This module provides shared functionality for detecting sentence boundaries in markdown text. It is used by both text reflow (MD013) and the multiple spaces rule (MD064).

Features:

  • Common abbreviation detection (Mr., Dr., Prof., etc.)
  • CJK punctuation support (。, ！, ？)
  • Closing quote detection (straight and curly)
  • Both forward-looking (reflow) and backward-looking (MD064) sentence detection
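The character-level checks listed above can be sketched roughly as follows. This is only an illustration, not the module's actual code: the `_sketch` function names, their signatures, and the exact character sets and Unicode ranges are assumptions, and the real functions may cover more characters.

```rust
/// Hypothetical sketch: true for the three CJK sentence-ending marks
/// named in the docs. The real function's signature may differ.
fn is_cjk_sentence_ending_sketch(c: char) -> bool {
    matches!(
        c,
        '\u{3002}'   // 。 ideographic full stop
        | '\u{FF01}' // ！ fullwidth exclamation mark
        | '\u{FF1F}' // ？ fullwidth question mark
    )
}

/// Hypothetical sketch: a few common CJK blocks only (assumed coverage).
fn is_cjk_char_sketch(c: char) -> bool {
    matches!(
        c,
        '\u{4E00}'..='\u{9FFF}'   // CJK Unified Ideographs
        | '\u{3040}'..='\u{309F}' // Hiragana
        | '\u{30A0}'..='\u{30FF}' // Katakana
        | '\u{AC00}'..='\u{D7AF}' // Hangul Syllables
    )
}

/// Hypothetical sketch: straight quotes plus curly closing quotes.
fn is_closing_quote_sketch(c: char) -> bool {
    matches!(c, '"' | '\'' | '\u{201D}' | '\u{2019}')
}

/// Hypothetical sketch: straight quotes plus curly opening quotes.
fn is_opening_quote_sketch(c: char) -> bool {
    matches!(c, '"' | '\'' | '\u{201C}' | '\u{2018}')
}
```

Note that in this sketch straight quotes count as both opening and closing, since they carry no direction on their own, and the ASCII `!` is deliberately not treated as CJK punctuation.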

Functions

get_abbreviations
Get the effective abbreviations set based on custom additions. All abbreviations are normalized to lowercase for case-insensitive matching. Custom abbreviations are always merged with built-in defaults.
is_after_sentence_ending
Check if multiple spaces occur immediately after sentence-ending punctuation. This is a backward-looking check used by MD064.
is_cjk_char
Check if a character is a CJK character (Chinese, Japanese, Korean).
is_cjk_sentence_ending
Check if a character is CJK sentence-ending punctuation. These include 。 (ideographic full stop), ！ (fullwidth exclamation mark), and ？ (fullwidth question mark).
is_closing_quote
Check if a character is a closing quote mark. Includes straight quotes and curly/smart quotes.
is_opening_quote
Check if a character is an opening quote mark. Includes straight quotes and curly/smart quotes.
text_ends_with_abbreviation
Check if text ends with a common abbreviation followed by a period.
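
As a rough illustration of how the abbreviation handling and the backward-looking check might fit together, here is a minimal sketch. Everything in it is an assumption made for the example: the `_sketch` names, the signatures, and the tiny `BUILTIN_ABBREVIATIONS_SKETCH` list are hypothetical and do not reflect the module's real API or its actual default list.

```rust
use std::collections::HashSet;

// Hypothetical, deliberately tiny stand-in for the built-in defaults.
const BUILTIN_ABBREVIATIONS_SKETCH: &[&str] = &["mr", "dr", "prof"];

/// Merge custom abbreviations with the built-in ones, lowercased and
/// without a trailing period (assumed normalization).
fn get_abbreviations_sketch(custom: &[&str]) -> HashSet<String> {
    BUILTIN_ABBREVIATIONS_SKETCH
        .iter()
        .chain(custom.iter())
        .map(|a| a.trim_end_matches('.').to_lowercase())
        .filter(|a| !a.is_empty())
        .collect()
}

/// True if `text` ends with a known abbreviation followed by a period.
/// Only single-word abbreviations are handled in this sketch.
fn text_ends_with_abbreviation_sketch(text: &str, abbreviations: &HashSet<String>) -> bool {
    let stem = match text.trim_end().strip_suffix('.') {
        Some(s) => s,
        None => return false,
    };
    // Take the last run of alphanumeric characters before the period.
    let last_word = stem
        .rsplit(|c: char| !c.is_alphanumeric())
        .next()
        .unwrap_or("");
    !last_word.is_empty() && abbreviations.contains(&last_word.to_lowercase())
}

/// Backward-looking check: `before` is the text that precedes a run of
/// spaces. Returns true if it ends a sentence, optionally followed by
/// closing quotes, and the final period does not belong to an abbreviation.
fn is_after_sentence_ending_sketch(before: &str, abbreviations: &HashSet<String>) -> bool {
    let core = before
        .trim_end_matches(|c: char| matches!(c, '"' | '\'' | '\u{201D}' | '\u{2019}'));
    match core.chars().last() {
        Some('!') | Some('?') | Some('\u{3002}') | Some('\u{FF01}') | Some('\u{FF1F}') => true,
        Some('.') => !text_ends_with_abbreviation_sketch(core, abbreviations),
        _ => false,
    }
}
```

With `get_abbreviations_sketch(&["Approx."])`, the text `"See Dr."` is treated as ending in an abbreviation rather than a sentence, while `"It ended."` counts as a sentence ending. Multi-part abbreviations such as "e.g." would need extra handling that this sketch leaves out.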