Module tokenizer

ASS script tokenizer module

Provides zero-copy, incremental lexical analysis of ASS subtitle scripts. Delimiter scanning and hex parsing can use SIMD acceleration when it is enabled.
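
To illustrate the hex parsing mentioned above, here is a minimal scalar sketch. ASS writes colour values as hexadecimal fields such as &H00FFFFFF or &HFFFFFF&. The function names parse_hex_u32 and parse_ass_color are hypothetical and are not part of this crate's API; a SIMD path would be expected to produce the same results as this scalar reference.

```rust
/// Hypothetical helper (not part of the crate): parses 1 to 8 hex digits.
/// Returns None on empty input, on more than 8 digits, or on a non-hex byte.
fn parse_hex_u32(digits: &str) -> Option<u32> {
    if digits.is_empty() || digits.len() > 8 {
        return None;
    }
    let mut value: u32 = 0;
    for byte in digits.bytes() {
        let nibble = match byte {
            b'0'..=b'9' => byte - b'0',
            b'a'..=b'f' => byte - b'a' + 10,
            b'A'..=b'F' => byte - b'A' + 10,
            _ => return None,
        };
        // Shift in four bits per digit; at most 8 digits, so no overflow.
        value = (value << 4) | u32::from(nibble);
    }
    Some(value)
}

/// Hypothetical helper: strips the "&H" prefix and an optional trailing '&'
/// from an ASS colour field, then parses the remaining hex digits.
fn parse_ass_color(field: &str) -> Option<u32> {
    let body = field
        .strip_prefix("&H")
        .or_else(|| field.strip_prefix("&h"))?;
    let body = body.strip_suffix('&').unwrap_or(body);
    parse_hex_u32(body)
}
```

Both helpers work directly on the borrowed input and allocate nothing, in keeping with the zero-allocation goal stated for this module.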

§Performance

  • Target: under 1 ms per 1 KB of input, with zero allocations
  • SIMD: 20-30% faster delimiter scanning when enabled
  • Memory: Zero-copy via &'a str spans referencing source
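
The zero-copy, incremental design described in the list above can be sketched as follows. This is an illustration only: SketchToken and SketchTokenizer are hypothetical names, and the token set is a simplified assumption, not the crate's actual Token or AssTokenizer. The point it shows is that every token payload is a &'a str slice of the source, so no text is copied or allocated, and that tokens are pulled one at a time.

```rust
/// Hypothetical token type: every payload borrows from the source text.
#[derive(Debug, PartialEq, Clone, Copy)]
enum SketchToken<'a> {
    /// Text between '[' and ']' on one line.
    SectionHeader(&'a str),
    /// Any other run of characters up to ':' or a newline.
    Text(&'a str),
    Colon,
    Newline,
}

/// Hypothetical pull-based tokenizer: holds only the source and a byte offset.
struct SketchTokenizer<'a> {
    source: &'a str,
    pos: usize,
}

impl<'a> SketchTokenizer<'a> {
    fn new(source: &'a str) -> Self {
        Self { source, pos: 0 }
    }

    /// Returns the next token, or None at end of input.
    /// Each returned slice points into `source`; nothing is allocated.
    fn next_token(&mut self) -> Option<SketchToken<'a>> {
        let source: &'a str = self.source;
        let rest = &source[self.pos..];
        let first = rest.chars().next()?;
        match first {
            '\n' => {
                self.pos += 1;
                Some(SketchToken::Newline)
            }
            ':' => {
                self.pos += 1;
                Some(SketchToken::Colon)
            }
            '[' => {
                // Look for the closing bracket on the current line only.
                let line_end = rest.find('\n').unwrap_or(rest.len());
                match rest[..line_end].find(']') {
                    Some(close) => {
                        self.pos += close + 1;
                        Some(SketchToken::SectionHeader(&rest[1..close]))
                    }
                    None => {
                        // Unterminated header: emit the line as plain text.
                        self.pos += line_end;
                        Some(SketchToken::Text(&rest[..line_end]))
                    }
                }
            }
            _ => {
                let end = rest
                    .find(|c: char| c == ':' || c == '\n')
                    .unwrap_or(rest.len());
                self.pos += end;
                Some(SketchToken::Text(&rest[..end]))
            }
        }
    }
}
```

Because the tokenizer state is just a string slice and an offset, a caller can stop after any token and resume later, which is what makes the tokenization incremental.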

§Example

use ass_core::tokenizer::AssTokenizer;

let source = "[Script Info]\nTitle: Example";
let mut tokenizer = AssTokenizer::new(source);

// `?` requires an enclosing function that returns a compatible `Result`.
while let Some(token) = tokenizer.next_token()? {
    println!("{:?}", token);
}

Re-exports§

pub use scanner::CharNavigator;
pub use scanner::TokenScanner;
pub use state::IssueCollector;
pub use state::IssueLevel;
pub use state::TokenContext;
pub use state::TokenIssue;
pub use tokens::DelimiterType;
pub use tokens::Token;
pub use tokens::TokenType;

Modules§

scanner
Token scanning methods for ASS tokenizer
simd
SIMD-accelerated tokenization utilities
state
Tokenizer state management and issue reporting
tokens
Token definitions for ASS script tokenization
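
As a rough picture of what delimiter scanning involves, the following scalar sketch walks the input in fixed-size chunks, the way a vectorised scanner would, and reports the offset of the first delimiter. The delimiter set, the chunk width of 16, and the names DELIMITERS, is_delimiter, and find_delimiter are all assumptions made for illustration; they are not taken from the scanner or simd modules.

```rust
/// Assumed delimiter set for illustration; the crate's real set may differ.
const DELIMITERS: [u8; 4] = [b':', b',', b'\n', b'\r'];

fn is_delimiter(byte: u8) -> bool {
    DELIMITERS.contains(&byte)
}

/// Hypothetical scalar scan: returns the byte offset of the first delimiter
/// at or after `start`, or None if there is none.
fn find_delimiter(input: &[u8], start: usize) -> Option<usize> {
    // Chunk width mirrors a 128-bit vector register (16 bytes).
    const LANES: usize = 16;
    let mut offset = start.min(input.len());
    for chunk in input[offset..].chunks(LANES) {
        // A SIMD version would test all lanes of the chunk at once;
        // here each byte is checked in turn.
        if let Some(i) = chunk.iter().position(|&b| is_delimiter(b)) {
            return Some(offset + i);
        }
        offset += chunk.len();
    }
    None
}
```

A scalar routine like this is also useful as a reference: an accelerated implementation can be checked against it on the same inputs.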

Structs§

AssTokenizer
Incremental tokenizer for ASS scripts with zero-copy design