Function count_document 

pub fn count_document(
    introspector: &Introspector,
    exclude_imports: bool,
    main_file_id: FileId,
) -> Count

Counts words and characters in a compiled Typst document.

This function traverses all elements in the document using the introspector and extracts plain text content. It handles the following cases:

  • Text styling: Skips styling elements (bold, italic, etc.) to avoid double-counting since their text is already included in parent elements.
  • Math equations: Skips mathematical notation to avoid counting math symbols as words.
  • Imports: Optionally excludes text from imported/included files.
  • Rendered content: Only counts text that appears in the final rendered document, ignoring code, comments, and markup syntax.

§Arguments

  • introspector - The Typst introspector providing access to document elements
  • exclude_imports - If true, only counts text from the main file
  • main_file_id - File ID of the main document (used when exclude_imports is true)
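The effect of exclude_imports can be illustrated with a minimal sketch. It assumes a toy model in which each piece of extracted text is tagged with the file it came from; `FileKey`, `Fragment`, `count_words_from`, and `sample_fragments` are hypothetical names for illustration only and are not part of this crate or of Typst.

```rust
/// Hypothetical stand-in for a file identifier (not Typst's `FileId`).
type FileKey = u32;

/// Toy text fragment tagged with the file it originated from.
struct Fragment {
    file: FileKey,
    text: &'static str,
}

/// Counts words across fragments. When `exclude_imports` is true,
/// only fragments that belong to `main_file` contribute to the total.
fn count_words_from(fragments: &[Fragment], exclude_imports: bool, main_file: FileKey) -> usize {
    fragments
        .iter()
        .filter(|f| !exclude_imports || f.file == main_file)
        .map(|f| f.text.split_whitespace().count())
        .sum()
}

/// Sample data: three words in the main file (key 0), two in an included file (key 1).
fn sample_fragments() -> Vec<Fragment> {
    vec![
        Fragment { file: 0, text: "Main file text" },
        Fragment { file: 1, text: "Included chapter" },
    ]
}
```

In this sketch, the main-file key is only consulted when the flag is set, which mirrors the note above that main_file_id is used when exclude_imports is true.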

§Returns

A Count struct containing the word and character counts.

§Examples

use typst_count::count_document;

// `introspector` and `main_file_id` are assumed to come from an already compiled document.
let count = count_document(&introspector, false, main_file_id);
println!("Words: {}, Characters: {}", count.words, count.characters);

§Counting Method

  • Words: Split by Unicode whitespace (equivalent to Rust’s split_whitespace())
  • Characters: Total Unicode scalar values (equivalent to Rust’s chars().count())
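The two rules above can be sketched on a plain string using only the standard library. `TextCount` and `count_text` are hypothetical names chosen for this illustration; they are not the crate's `Count` type or its internal helpers.

```rust
/// Word and character totals for a piece of plain text.
/// Hypothetical stand-in; the crate's own `Count` type may differ.
#[derive(Debug, PartialEq, Eq)]
struct TextCount {
    words: usize,
    characters: usize,
}

/// Applies the two counting rules to a string slice.
fn count_text(text: &str) -> TextCount {
    TextCount {
        // Words: maximal runs of non-whitespace, split on Unicode whitespace.
        words: text.split_whitespace().count(),
        // Characters: Unicode scalar values, not bytes or grapheme clusters.
        characters: text.chars().count(),
    }
}
```

Because characters are counted as scalar values, the precomposed string "caf\u{e9}" yields 4 characters in this sketch, while the decomposed form "cafe\u{301}" yields 5 even though both render the same way. Whitespace in the input string also counts toward the character total here, as it does with chars().count().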

§Avoiding Double-Counting

Typst’s document tree includes both container elements and their styled children. For example, *bold text* creates:

  • A paragraph element containing “bold text”
  • A strong element also containing “bold text”

To avoid counting the same text twice, the function skips known styling elements, whose content is already included in their parent elements.
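The double-counting problem and the skip rule can be reproduced in a minimal sketch. It assumes a simplified, flat list of elements that each carry their plain text; `Kind`, `Element`, `is_styling`, `count_words`, and `bold_example` are hypothetical names, and Typst's real element types are considerably richer.

```rust
/// Hypothetical element kinds for this toy model.
#[derive(Clone, Copy, PartialEq, Eq)]
enum Kind {
    Paragraph,
    Strong,
    Emph,
}

/// Toy element: a kind plus the plain text it contains.
struct Element {
    kind: Kind,
    text: &'static str,
}

/// Styling elements repeat text that their parent element already holds.
fn is_styling(kind: Kind) -> bool {
    matches!(kind, Kind::Strong | Kind::Emph)
}

/// Sums words over all elements, optionally skipping styling elements.
fn count_words(elements: &[Element], skip_styling: bool) -> usize {
    elements
        .iter()
        .filter(|e| !(skip_styling && is_styling(e.kind)))
        .map(|e| e.text.split_whitespace().count())
        .sum()
}

/// The two elements that `*bold text*` produces in this toy model.
fn bold_example() -> Vec<Element> {
    vec![
        Element { kind: Kind::Paragraph, text: "bold text" },
        Element { kind: Kind::Strong, text: "bold text" },
    ]
}
```

With skipping disabled, the sketch counts the two words twice (once per element); with skipping enabled, only the paragraph contributes and the total is two.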