Text utilities for markdown parsing and chunking.
Line splitting, fence/heading detection, token estimation, wikilink parsing, and keyword/path normalization. Ported from the TypeScript Talon implementation.
Structs
- LineSpan - A line span within the original content.
- ParsedWikiLink - Parsed components of a wikilink.
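
As an illustration of what "parsed components of a wikilink" could look like, here is a minimal sketch. The struct name, its fields (`target`, `heading`, `alias`), and the `[[target#heading|alias]]` syntax are assumptions based on common wikilink conventions, not the crate's actual definition.

```rust
/// Hypothetical stand-in for the parsed wikilink struct; the field names are
/// assumptions for illustration, not the crate's real layout.
#[derive(Debug, PartialEq)]
struct WikiLinkSketch {
    target: String,
    heading: Option<String>,
    alias: Option<String>,
}

/// Splits a raw wikilink such as `[[Notes/Plan#Goals|my plan]]` into parts.
/// Surrounding `[[` and `]]` are optional in this sketch.
fn parse_wikilink_sketch(raw: &str) -> WikiLinkSketch {
    let trimmed = raw.trim();
    let inner = trimmed.strip_prefix("[[").unwrap_or(trimmed);
    let inner = inner.strip_suffix("]]").unwrap_or(inner);

    // Everything after the first '|' is treated as the display alias.
    let (link, alias) = match inner.split_once('|') {
        Some((link, alias)) => (link, Some(alias.trim().to_string())),
        None => (inner, None),
    };

    // Everything after the first '#' in the link part is the heading anchor.
    let (target, heading) = match link.split_once('#') {
        Some((target, heading)) => (target, Some(heading.trim().to_string())),
        None => (link, None),
    };

    WikiLinkSketch {
        target: target.trim().to_string(),
        heading,
        alias,
    }
}
```

Using `Option<String>` for the heading and alias keeps "absent" distinct from "present but empty", which may or may not match how the real struct models these parts.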
Constants
- TOKEN_CHAR_RATIO - Token-to-character ratio for rough token estimation.
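
A character-ratio estimate can be sketched as follows. The value of four characters per token and the round-up behaviour are assumptions chosen for illustration; the actual value of `TOKEN_CHAR_RATIO` and the rounding used by `estimate_tokens` are not stated here.

```rust
/// Hypothetical ratio: assumes roughly four characters per token.
/// The crate's real TOKEN_CHAR_RATIO may use a different value or unit.
const ASSUMED_CHARS_PER_TOKEN: usize = 4;

/// Rough token estimate: Unicode scalar count divided by the assumed ratio,
/// rounded up so that any non-empty text counts as at least one token.
fn estimate_tokens_sketch(text: &str) -> usize {
    let chars = text.chars().count();
    (chars + ASSUMED_CHARS_PER_TOKEN - 1) / ASSUMED_CHARS_PER_TOKEN
}
```

Counting `chars()` rather than bytes keeps the estimate stable for non-ASCII text, at the cost of a linear scan over the input.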
Functions
- estimate_tokens - Estimates the number of tokens in text using a character ratio.
- is_fence_line - Checks if a line opens or closes a fenced code block (3+ backticks or tildes).
- is_heading_line - Checks if a line is an ATX heading (1-6 hash characters followed by space).
- normalize_keyword - Normalizes a keyword for comparison: NFD normalization + lowercase + trim.
- normalize_vault_path - Normalizes a vault path: backslashes to forward slashes, NFD normalization.
- parse_wikilink - Parses a raw wikilink string into components.
- split_lines - Splits markdown content into line spans.
- strip_heading_text - Strips heading markers from a heading line.
- strip_outer_quotes - Strips outer matching quotes from a string.
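
The fence and heading rules listed above can be sketched in a few lines. The function names below carry a `_sketch` suffix to mark them as illustrations; details such as tolerating leading whitespace before a fence and trimming whitespace around heading text are assumptions, not confirmed behaviour of the crate.

```rust
/// Returns true if the line starts (after optional leading whitespace,
/// an assumption of this sketch) with a run of 3+ backticks or 3+ tildes.
fn is_fence_line_sketch(line: &str) -> bool {
    let trimmed = line.trim_start();
    for marker in ['`', '~'] {
        let run = trimmed.chars().take_while(|&c| c == marker).count();
        if run >= 3 {
            return true;
        }
    }
    false
}

/// Returns true if the line begins with 1 to 6 '#' characters followed by a space.
fn is_heading_line_sketch(line: &str) -> bool {
    let hashes = line.chars().take_while(|&c| c == '#').count();
    // '#' is a single byte in UTF-8, so the count is also a valid byte offset.
    (1..=6).contains(&hashes) && line[hashes..].starts_with(' ')
}

/// Removes the leading '#' markers and surrounding whitespace from a heading line.
fn strip_heading_text_sketch(line: &str) -> &str {
    line.trim_start_matches('#').trim()
}
```

A chunker would typically call the fence check first and skip heading detection while inside a fenced block, so that a `# comment` line in code is not mistaken for a heading.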