Sentence detection utilities
This module provides shared functionality for detecting sentence boundaries in markdown text. It is used by both text reflow (MD013) and the multiple-spaces rule (MD064).
Features:
- Common abbreviation detection (Mr., Dr., Prof., etc.)
- CJK punctuation support (。, ！, ？)
- Closing quote detection (straight and curly)
- Both forward-looking (reflow) and backward-looking (MD064) sentence detection
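The character-level features above can be illustrated with a minimal sketch. The function names and the exact character sets below are assumptions made for illustration; the module's real helpers may accept more characters or use different signatures.

```rust
/// Hypothetical helper: true for the three CJK sentence enders listed above
/// (ideographic full stop, fullwidth exclamation mark, fullwidth question mark).
fn is_cjk_sentence_end(c: char) -> bool {
    matches!(c, '\u{3002}' | '\u{FF01}' | '\u{FF1F}')
}

/// Hypothetical helper: ASCII or CJK sentence-ending punctuation.
fn is_sentence_end(c: char) -> bool {
    matches!(c, '.' | '!' | '?') || is_cjk_sentence_end(c)
}

/// Hypothetical helper: straight quotes plus right-hand curly quotes.
/// The set of quote characters here is an assumption.
fn is_closing_quote_char(c: char) -> bool {
    matches!(c, '"' | '\'' | '\u{201D}' | '\u{2019}')
}
```

Matching on `char` values keeps each check a constant-time comparison with no allocation, which suits code that runs once per character of a document.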
Constants
- DEFAULT_ABBREVIATIONS - Default abbreviations that should NOT be treated as sentence endings.
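As a rough illustration of how a default list like this might be combined with user-supplied entries (lowercased, and always merged with the defaults), consider the sketch below. `SKETCH_DEFAULTS`, `merged_abbreviations`, and the choice to store entries without a trailing period are all assumptions; the actual contents and representation of DEFAULT_ABBREVIATIONS are not shown on this page.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the module's default list (real contents unknown).
const SKETCH_DEFAULTS: &[&str] = &["mr", "dr", "prof"];

/// Merge custom abbreviations with the defaults, normalizing every entry
/// to lowercase and dropping any trailing period (an assumed convention).
fn merged_abbreviations(custom: &[&str]) -> HashSet<String> {
    fn normalize(a: &str) -> String {
        a.trim_end_matches('.').to_lowercase()
    }

    let mut set: HashSet<String> = SKETCH_DEFAULTS.iter().map(|a| normalize(a)).collect();
    set.extend(custom.iter().map(|a| normalize(a)));
    set
}
```

Under these assumptions, calling `merged_abbreviations(&["Approx."])` yields a set containing `"approx"` alongside the three defaults, so lookups can compare against a lowercased candidate word.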
Functions
- get_abbreviations - Get the effective abbreviations set based on custom additions. All abbreviations are normalized to lowercase for case-insensitive matching. Custom abbreviations are always merged with the built-in defaults.
- is_after_sentence_ending - Check if multiple spaces occur immediately after sentence-ending punctuation. This is a backward-looking check used by MD064.
- is_after_sentence_ending_with_abbreviations - Check if multiple spaces occur immediately after sentence-ending punctuation, with a custom abbreviations set.
- is_cjk_char - Check if a character is a CJK character (Chinese, Japanese, Korean).
- is_cjk_sentence_ending - Check if a character is CJK sentence-ending punctuation. These include: 。 (ideographic full stop), ！ (fullwidth exclamation mark), ？ (fullwidth question mark).
- is_closing_quote - Check if a character is a closing quote mark. Includes straight quotes and curly/smart quotes.
- is_opening_quote - Check if a character is an opening quote mark. Includes straight quotes and curly/smart quotes.
- is_sentence_ending_punctuation - Check if a character is sentence-ending punctuation (ASCII or CJK).
- is_trailing_close_punctuation - Check if a character is closing punctuation that can follow sentence-ending punctuation. This includes closing quotes, parentheses, and brackets.
- text_ends_with_abbreviation - Check if text ends with a common abbreviation followed by a period.
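To make the backward-looking check more concrete, here is a minimal sketch of one way it could work: take the text that precedes a run of spaces, skip any trailing closing punctuation, then decide whether what remains ends a sentence, treating a period after a known abbreviation as a non-ending. The name `ends_sentence`, its signature, and the character sets are assumptions for illustration and do not reproduce the module's actual implementation.

```rust
/// Sketch: does `before` (the text preceding a run of spaces) end a sentence?
/// `abbreviations` holds lowercase entries without a trailing period (assumed).
fn ends_sentence(before: &str, abbreviations: &[&str]) -> bool {
    // Skip closing quotes, parentheses, and brackets that may follow the punctuation.
    let core = before.trim_end_matches(|c: char| {
        matches!(c, '"' | '\'' | ')' | ']' | '\u{201D}' | '\u{2019}')
    });

    match core.chars().last() {
        // '!' and '?' (ASCII or fullwidth) and the ideographic full stop always end a sentence.
        Some('!' | '?' | '\u{3002}' | '\u{FF01}' | '\u{FF1F}') => true,
        // A period ends a sentence unless the word before it is a known abbreviation.
        Some('.') => {
            let word = core[..core.len() - 1] // '.' is one byte, so this slice is valid
                .rsplit(|c: char| !c.is_alphanumeric())
                .next()
                .unwrap_or("")
                .to_lowercase();
            !abbreviations.iter().any(|a| *a == word)
        }
        _ => false,
    }
}
```

With `abbreviations = &["dr"]`, this sketch treats `"Hello world."` as a sentence ending but not `"See Dr."`, which is the distinction a rule like MD064 would need before deciding whether extra spaces follow a sentence boundary.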