Crate uwl

A stream of chars for building things such as a lexer. It makes the step of "iterating between characters" considerably easier and provides certain utilities that simplify the code. Respects both ASCII and Unicode.

Example: lexing identifiers, numbers, and some punctuation marks:

use uwl::AsciiStream;
use uwl::StrExt;

#[derive(Debug, PartialEq)]
enum TokenKind {
    Ident,
    Number,
    Question,
    Exclamation,
    Comma,
    Point,

    // An invalid token
    Illegal,
}

#[derive(Debug, PartialEq)]
struct Token<'a> {
    kind: TokenKind,
    lit: &'a str,
}

impl<'a> Token<'a> {
    fn new(kind: TokenKind, lit: &'a str) -> Self {
        Token {
            kind,
            lit,
        }
    }
}

fn lex<'a>(stream: &mut AsciiStream<'a>) -> Option<Token<'a>> {
    if stream.at_end() {
        return None;
    }

    Some(match stream.current()? {
        // Ignore whitespace.
        s if s.is_whitespace() => {
            stream.next()?;
            return lex(stream);
        },
        s if s.is_alphabetic() => Token::new(TokenKind::Ident, stream.take_while(|s| s.is_alphabetic())),
        s if s.is_numeric() => Token::new(TokenKind::Number, stream.take_while(|s| s.is_numeric())),
        "?" => Token::new(TokenKind::Question, stream.next()?),
        "!" => Token::new(TokenKind::Exclamation, stream.next()?),
        "," => Token::new(TokenKind::Comma, stream.next()?),
        "." => Token::new(TokenKind::Point, stream.next()?),
        _ => Token::new(TokenKind::Illegal, stream.next()?),
    })
}

fn main() {
    let mut stream = AsciiStream::new("Hello, world! ...world? Hello?");

    assert_eq!(lex(&mut stream), Some(Token::new(TokenKind::Ident, "Hello")));
    assert_eq!(lex(&mut stream), Some(Token::new(TokenKind::Comma, ",")));
    assert_eq!(lex(&mut stream), Some(Token::new(TokenKind::Ident, "world")));
    assert_eq!(lex(&mut stream), Some(Token::new(TokenKind::Exclamation, "!")));
    assert_eq!(lex(&mut stream), Some(Token::new(TokenKind::Point, ".")));
    assert_eq!(lex(&mut stream), Some(Token::new(TokenKind::Point, ".")));
    assert_eq!(lex(&mut stream), Some(Token::new(TokenKind::Point, ".")));
    assert_eq!(lex(&mut stream), Some(Token::new(TokenKind::Ident, "world")));
    assert_eq!(lex(&mut stream), Some(Token::new(TokenKind::Question, "?")));
    assert_eq!(lex(&mut stream), Some(Token::new(TokenKind::Ident, "Hello")));
    assert_eq!(lex(&mut stream), Some(Token::new(TokenKind::Question, "?")));

    // Reached the end
    assert_eq!(lex(&mut stream), None);
}
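The `take_while` calls in the example consume a run of matching chars and hand back the consumed slice. As a rough illustration of that idea in plain standard-library Rust, here is a minimal sketch; `split_run` is a hypothetical helper invented for this page, not part of uwl, and uwl's own method may behave differently in edge cases.

```rust
/// Hypothetical helper (not a uwl API): splits `input` into the longest
/// prefix whose chars all satisfy `pred`, and the remainder.
fn split_run(input: &str, pred: impl Fn(char) -> bool) -> (&str, &str) {
    // Byte offset of the first char that fails the predicate,
    // or the full length if every char passes.
    let end = input
        .char_indices()
        .find(|&(_, c)| !pred(c))
        .map_or(input.len(), |(i, _)| i);
    input.split_at(end)
}
```

Because the split point always comes from `char_indices`, it falls on a char boundary, so the slice is valid for non-ASCII input as well.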

Structs

Ascii

Stream

A stream of "chars". Handles ASCII and/or Unicode depending on the Advancer.

Unicode

Traits

Advancer

Defines how the stream constitutes "chars" and how many bytes it advances per "char".
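To make the "bytes per char" idea concrete, the sketch below models it with a hypothetical `Step` trait and a tiny `Cursor` type. None of these names come from uwl, and the real `Advancer` trait may have a different shape. The sketch only shows how an ASCII-style strategy can step one byte at a time while a Unicode-style strategy steps by the UTF-8 width of the next char.

```rust
use std::marker::PhantomData;

/// Hypothetical stand-in for the idea behind `Advancer`; not uwl's trait.
trait Step {
    /// Bytes occupied by the "char" at the start of `rest` (0 if empty).
    fn width(rest: &str) -> usize;
}

/// Assumed ASCII-style strategy: every "char" is one byte.
struct OneByte;
/// Assumed Unicode-style strategy: every "char" is one UTF-8 scalar value.
struct Utf8;

impl Step for OneByte {
    fn width(rest: &str) -> usize {
        if rest.is_empty() { 0 } else { 1 }
    }
}

impl Step for Utf8 {
    fn width(rest: &str) -> usize {
        rest.chars().next().map_or(0, char::len_utf8)
    }
}

/// Minimal cursor over a string, parameterized by the stepping strategy.
struct Cursor<'a, S: Step> {
    src: &'a str,
    offset: usize,
    _step: PhantomData<S>,
}

impl<'a, S: Step> Cursor<'a, S> {
    fn new(src: &'a str) -> Self {
        Cursor { src, offset: 0, _step: PhantomData }
    }

    /// Returns the current "char" as a slice and moves past it.
    fn next(&mut self) -> Option<&'a str> {
        let rest = &self.src[self.offset..];
        let width = S::width(rest);
        if width == 0 {
            return None;
        }
        // `get` yields None if the width would split a multi-byte char,
        // in which case the cursor does not advance.
        let piece = rest.get(..width)?;
        self.offset += width;
        Some(piece)
    }
}
```

With this design, `Cursor<OneByte>` refuses to step into the middle of a multi-byte char, whereas `Cursor<Utf8>` returns it whole.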

StrExt

Brings over some is_* methods from char to &str, and some methods for identifiers/symbols.
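The extension-trait pattern behind this can be sketched as follows. `StrChecks` and its method names are hypothetical and chosen so they do not clash with anything in uwl. Whether uwl's `StrExt` treats an empty string as matching is not stated here, so the sketch assumes it does not.

```rust
/// Hypothetical extension trait; uwl's `StrExt` may differ in names and semantics.
trait StrChecks {
    /// True if the string is non-empty and every char is alphabetic.
    fn all_alphabetic(&self) -> bool;
    /// True if the string is non-empty and every char is numeric.
    fn all_numeric(&self) -> bool;
}

impl StrChecks for str {
    fn all_alphabetic(&self) -> bool {
        // Assumption: an empty string does not count as alphabetic.
        !self.is_empty() && self.chars().all(char::is_alphabetic)
    }

    fn all_numeric(&self) -> bool {
        // Assumption: an empty string does not count as numeric.
        !self.is_empty() && self.chars().all(char::is_numeric)
    }
}
```

Implementing the trait for `str` lets the checks be called directly on string slices, which is what allows match guards such as `s if s.is_alphabetic()` to work on `&str` values.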

Type Definitions

AsciiStream

A stream of chars. Handles only ASCII.

UnicodeStream

A stream of chars. Handles both ASCII and Unicode.