
Struct Tokenizer 

Source
pub struct Tokenizer(/* private fields */);

High-level tokenizer for SQLite SQL.

In most codebases this is the tokenizer you want.

  • Fast lexical analysis without building an AST.
  • Returns token kind + original source slice.
  • Reusable across many SQL inputs.

Advanced generic tokenizer APIs exist in crate::typed and crate::any.

Implementations§

Source§

impl Tokenizer

Source

pub fn new() -> Self

Create a tokenizer for SQLite SQL.

Source

pub fn tokenize<'a>(&self, source: &'a str) -> impl Iterator<Item = Token<'a>>

Tokenize one SQL source string and iterate SQLite tokens.

§Examples
use syntaqlite_syntax::{Tokenizer, TokenType};

let tokenizer = Tokenizer::new();
let tokens: Vec<_> = tokenizer
    .tokenize("SELECT x FROM t")
    .map(|tok| (tok.token_type(), tok.text().to_string()))
    .collect();

assert!(tokens.iter().any(|(ty, _)| *ty == TokenType::Select));
assert!(tokens.iter().any(|(_, text)| text == "x"));
§Panics

Panics if another cursor from this tokenizer is still active. Drop the previous iterator before starting a new one.
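One plausible way such single-cursor exclusivity can be enforced is interior mutability over shared tokenizer state. The sketch below is hypothetical and std-only, not this crate's actual implementation: it uses `RefCell`, whose mutable borrow fails while a previous borrow is still alive, mirroring the "drop the previous iterator before starting a new one" contract.

```rust
use std::cell::{RefCell, RefMut};

// Hypothetical guard: shared scratch state that only one cursor may use at a time.
struct Scratch(RefCell<Vec<u8>>);

impl Scratch {
    fn cursor(&self) -> RefMut<'_, Vec<u8>> {
        // borrow_mut panics if a previous cursor (RefMut) is still alive,
        // analogous to the tokenizer's "still active" panic.
        self.0.borrow_mut()
    }
}

fn main() {
    let scratch = Scratch(RefCell::new(Vec::new()));

    let first = scratch.cursor();
    // While `first` is alive, starting a second cursor would panic;
    // try_borrow_mut shows the conflict without panicking.
    assert!(scratch.0.try_borrow_mut().is_err());

    drop(first);
    // After dropping the previous cursor, a new one succeeds.
    assert!(scratch.0.try_borrow_mut().is_ok());
}
```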

Source

pub fn tokenize_cstr<'a>(&self, source: &'a CStr) -> impl Iterator<Item = Token<'a>>

Zero-copy tokenization over a nul-terminated source buffer.

Use this when your SQL already lives in a CStr and you want to avoid copying.

§Examples
use std::ffi::CString;
use syntaqlite_syntax::{Tokenizer, TokenType};

let tokenizer = Tokenizer::new();
let sql = CString::new("SELECT 1").unwrap();
let types: Vec<_> = tokenizer.tokenize_cstr(&sql).map(|t| t.token_type()).collect();

assert!(types.contains(&TokenType::Select));
§Panics

Panics if another cursor from this tokenizer is still active, or if source is not valid UTF-8.
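The UTF-8 panic exists because a `CStr` guarantees only nul-termination, not any particular encoding. A std-only sketch (independent of this crate) showing the same validation via `CStr::to_str`, which surfaces the failure as a `Result` instead of a panic:

```rust
use std::ffi::CString;

fn main() {
    // Valid UTF-8: conversion to &str succeeds.
    let ok = CString::new("SELECT 1").unwrap();
    assert!(ok.to_str().is_ok());

    // 0xFF never appears in valid UTF-8, so conversion fails.
    let bad = CString::new(vec![0x53, 0xFF]).unwrap();
    assert!(bad.to_str().is_err());
}
```

Validating with `to_str` up front is one way to avoid the panic when the buffer's encoding is not known in advance.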