i18n_lexer-rizzen-yazston 0.6.1

The `i18n_lexer` crate of the Internationalisation project.

= i18n_lexer
Rizzen Yazston
:DataProvider: https://docs.rs/icu_provider/1.2.0/icu_provider/trait.DataProvider.html
:url-unicode: https://home.unicode.org/
:CLDR: https://cldr.unicode.org/
:icu4x: https://github.com/unicode-org/icu4x

String lexer and resultant tokens.

The Lexer is initialised with any data provider that implements the {DataProvider}[DataProvider] trait and gives access to a {url-unicode}[Unicode Consortium] {CLDR}[CLDR] data repository (even a custom database). Usually the repository is just a local copy of the CLDR in the application's data directory. Once the Lexer has been initialised, it can tokenise any number of strings without being re-initialised.

Consult the {icu4x}[ICU4X] website for instructions on generating a data repository tailored to the application, one that leaves out the data the application does not use.

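Strings are tokenised using `tokenise()`, which takes a string slice and a vector containing the grammar syntax characters. In the example under Examples, it also takes a reference to the reference-counted data provider.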
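To make the role of the grammar syntax characters concrete, here is a minimal standard-library sketch of splitting a string into tokens around a caller-supplied set of grammar characters. It is an illustration only and not this crate's implementation: the names `SketchKind`, `classify` and `sketch_tokenise` are hypothetical, and the classification uses simple `char` tests rather than Unicode data from a data provider.

```rust
// Hypothetical illustration only: this is NOT the i18n_lexer implementation.
// It classifies characters with plain `char` tests instead of ICU data.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SketchKind {
    Word,
    WhiteSpace,
    Grammar,
    Other,
}

// Classify a single character; grammar characters take priority.
fn classify(c: char, grammar: &[char]) -> SketchKind {
    if grammar.contains(&c) {
        SketchKind::Grammar
    } else if c.is_whitespace() {
        SketchKind::WhiteSpace
    } else if c.is_alphanumeric() {
        SketchKind::Word
    } else {
        SketchKind::Other
    }
}

// Group consecutive characters of the same kind into one token.
// Every grammar character always forms a token of its own.
fn sketch_tokenise(input: &str, grammar: &[char]) -> Vec<(SketchKind, String)> {
    let mut tokens: Vec<(SketchKind, String)> = Vec::new();
    for c in input.chars() {
        let kind = classify(c, grammar);
        // Extend the previous token only when the kind matches and is not grammar.
        let extend = matches!(
            tokens.last(),
            Some((last, _)) if *last == kind && kind != SketchKind::Grammar
        );
        if extend {
            if let Some((_, text)) = tokens.last_mut() {
                text.push(c);
            }
        } else {
            tokens.push((kind, c.to_string()));
        }
    }
    tokens
}
```

Calling `sketch_tokenise("String contains a {placeholder}.", &['{', '}'])` on the string used in the Examples section splits it into words, whitespace, the two brace tokens and the trailing full stop. The real lexer distinguishes more token types and relies on the data provider for character properties.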

== Acknowledgement

Thanks to Stefano Angeleri for advice on various design aspects of implementing the components of the internationalisation project, and for providing the Italian translation of the error message strings.

== Cargo.toml

[dependencies]
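i18n_icu-rizzen-yazston = "0.6.1"
# The example below also imports from `i18n_lexer` (this crate).
i18n_lexer-rizzen-yazston = "0.6.1"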
icu_provider = "1.2.0"

# These are required for the DataProvider.
icu_properties = "1.2.0"
icu_segmenter = "1.2.0"
icu_plurals = "1.2.0"
icu_decimal = "1.2.0"
icu_calendar = "1.2.0"
icu_datetime = "1.2.0"

# This is required for the DataProvider.
[dependencies.fixed_decimal]
version = "0.5.3"
# Needed for floating point support.
features = [ "ryu" ]

== Examples

use i18n_icu::IcuDataProvider;
use i18n_lexer::{Token, TokenType, tokenise};
use icu_testdata::buffer;
use icu_provider::serde::AsDeserializingBufferProvider;
use std::rc::Rc;
use std::error::Error;

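fn test_tokenise() -> Result<(), Box<dyn Error>> {
    // Build a deserialising data provider from the ICU4X test data.
    let buffer_provider = buffer();
    let data_provider = buffer_provider.as_deserializing();
    // Assumption: the original listing never defines `icu_data_provider`; the
    // otherwise unused `IcuDataProvider` import suggests it is built like this.
    let icu_data_provider = IcuDataProvider::try_new( &data_provider )?;
    // Tokenise the string, treating `{` and `}` as grammar syntax characters.
    let tokens = tokenise(
        "String contains a {placeholder}.",
        &vec![ '{', '}' ],
        &Rc::new( icu_data_provider ),
    );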
    let mut grammar = 0;
    assert_eq!( tokens.0.iter().count(), 10, "Supposed to be a total of 10 tokens." );
    for token in tokens.0.iter() {
        if token.token_type == TokenType::Grammar {
            grammar += 1;
        }
    }
    assert_eq!( grammar, 2, "Supposed to be 2 grammar tokens." );
    Ok( () )
}