Struct icu_locid::Locale

pub struct Locale {
    pub id: LanguageIdentifier,
    pub extensions: Extensions,
}

A core struct representing a Unicode Locale Identifier.

A locale is made of two parts:

  • Unicode Language Identifier
  • A set of Unicode Extensions

Locale exposes all of the same fields and methods as LanguageIdentifier, and in addition can parse, manipulate, and serialize Unicode extension fields.

Examples

use icu_locid::{
    extensions_unicode_key as key, extensions_unicode_value as value,
    locale, subtags_language as language, subtags_region as region,
};

let loc = locale!("en-US-u-ca-buddhist");

assert_eq!(loc.id.language, language!("en"));
assert_eq!(loc.id.script, None);
assert_eq!(loc.id.region, Some(region!("US")));
assert_eq!(loc.id.variants.len(), 0);
assert_eq!(
    loc.extensions.unicode.keywords.get(&key!("ca")),
    Some(&value!("buddhist"))
);

Parsing

Unicode recognizes three levels of standard conformance for a locale:

  • well-formed - syntactically correct
  • valid - well-formed and only uses registered language subtags, extensions, keywords, types…
  • canonical - valid and no deprecated codes or structure.

At the moment, parsing normalizes a well-formed locale identifier by converting _ separators to - and adjusting casing to conform to the Unicode standard.

Any syntactically malformed subtag causes parsing to fail with an error. No subtag validation or canonicalization is performed.

Examples

use icu::locid::{subtags::*, Locale};

let loc: Locale = "eN_latn_Us-Valencia_u-hC-H12"
    .parse()
    .expect("Failed to parse.");

assert_eq!(loc.id.language, "en".parse::<Language>().unwrap());
assert_eq!(loc.id.script, "Latn".parse::<Script>().ok());
assert_eq!(loc.id.region, "US".parse::<Region>().ok());
assert_eq!(
    loc.id.variants.get(0),
    "valencia".parse::<Variant>().ok().as_ref()
);
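
The normalization described above can be illustrated without the crate. The following is a minimal std-only sketch, not the icu_locid implementation: normalize_locale is a hypothetical helper that only checks subtag shape (1 to 8 ASCII alphanumerics) and applies casing by subtag length, without checking subtag order or validity.

```rust
/// Hypothetical helper (not part of icu_locid): normalizes separators and
/// casing of a locale string, returning None when a subtag is malformed.
fn normalize_locale(input: &str) -> Option<String> {
    let mut out = String::with_capacity(input.len());
    // Set once a single-character subtag (extension singleton) is seen.
    let mut in_extensions = false;

    for (i, subtag) in input.split(|c: char| c == '-' || c == '_').enumerate() {
        // Every subtag must be 1 to 8 ASCII alphanumeric characters.
        let well_formed = (1..=8).contains(&subtag.len())
            && subtag.bytes().all(|b| b.is_ascii_alphanumeric());
        if !well_formed {
            return None;
        }
        if i > 0 {
            out.push('-');
        }
        if subtag.len() == 1 {
            in_extensions = true;
        }

        let alpha = subtag.bytes().all(|b| b.is_ascii_alphabetic());
        if i == 0 || in_extensions || !alpha {
            // Language, extension content, and digit-bearing subtags: lowercase.
            out.push_str(&subtag.to_ascii_lowercase());
        } else if subtag.len() == 4 {
            // Script: titlecase, e.g. "latn" becomes "Latn".
            out.push_str(&subtag[..1].to_ascii_uppercase());
            out.push_str(&subtag[1..].to_ascii_lowercase());
        } else if subtag.len() == 2 {
            // Alphabetic region: uppercase, e.g. "us" becomes "US".
            out.push_str(&subtag.to_ascii_uppercase());
        } else {
            // Variants and anything else: lowercase.
            out.push_str(&subtag.to_ascii_lowercase());
        }
    }
    Some(out)
}
```

Under these assumptions, the input "eN_latn_Us-Valencia_u-hC-H12" normalizes to "en-Latn-US-valencia-u-hc-h12", while an input with an empty or overlong subtag is rejected.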

Fields

id: LanguageIdentifier

The basic language/script/region components in the locale identifier along with any variants.

extensions: Extensions

Any extensions present in the locale identifier.

Implementations

A constructor which takes a UTF-8 byte slice, parses it, and produces a well-formed Locale.

Examples
use icu::locid::Locale;

Locale::try_from_bytes(b"en-US-u-hc-h12").unwrap();

The default undefined locale “und”. Same as default().

Examples
use icu::locid::Locale;

assert_eq!(Locale::default(), Locale::UND);

This is a best-effort operation that performs all available levels of canonicalization.

At the moment the operation will normalize casing and the separator, but in the future it may also validate and update from deprecated subtags to canonical ones.

Examples
use icu::locid::Locale;

assert_eq!(
    Locale::canonicalize("pL_latn_pl-U-HC-H12").as_deref(),
    Ok("pl-Latn-PL-u-hc-h12")
);

Compare this Locale with BCP-47 bytes.

The return value is equivalent to what would happen if you first converted this Locale to a BCP-47 string and then performed a byte comparison.

This function is case-sensitive and results in a total order, so it is appropriate for binary search. The only argument producing Ordering::Equal is self.to_string().

Examples
use icu::locid::Locale;
use std::cmp::Ordering;

let bcp47_strings: &[&str] = &[
    "pl-Latn-PL",
    "und",
    "und-fonipa",
    "und-t-m0-true",
    "und-u-ca-hebrew",
    "und-u-ca-japanese",
    "zh",
];

for ab in bcp47_strings.windows(2) {
    let a = ab[0];
    let b = ab[1];
    assert!(a.cmp(b) == Ordering::Less);
    let a_loc = a.parse::<Locale>().unwrap();
    assert!(a_loc.strict_cmp(a.as_bytes()) == Ordering::Equal);
    assert!(a_loc.strict_cmp(b.as_bytes()) == Ordering::Less);
}
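
Because the ordering is a plain byte ordering of the BCP-47 string, a sorted table of tags can be searched with a binary search. The sketch below is std-only and uses string bytes as the probe; find_tag is a hypothetical helper. With a parsed Locale as the probe, the comparator would presumably be loc.strict_cmp(entry.as_bytes()).reverse(), since binary_search_by expects the ordering of the entry relative to the target.

```rust
/// Hypothetical helper: looks up `needle` in a slice of BCP-47 strings that
/// is sorted by byte order, returning the index of the match if present.
fn find_tag(sorted: &[&str], needle: &str) -> Option<usize> {
    sorted
        // The closure reports how each entry compares to the target.
        .binary_search_by(|entry| entry.as_bytes().cmp(needle.as_bytes()))
        .ok()
}
```

The comparison is case-sensitive, so the table must hold normalized strings for lookups to succeed.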

Compare this Locale with an iterator of BCP-47 subtags.

This function has the same equality semantics as Locale::strict_cmp. It is intended as a more modular version that allows multiple subtag iterators to be chained together.

For an additional example, see SubtagOrderingResult.

Examples
use icu::locid::locale;
use std::cmp::Ordering;

let subtags: &[&[u8]] =
    &[b"ca", b"ES", b"valencia", b"u", b"ca", b"hebrew"];

let loc = locale!("ca-ES-valencia-u-ca-hebrew");
assert_eq!(
    Ordering::Equal,
    loc.strict_cmp_iter(subtags.iter().copied()).end()
);

let loc = locale!("ca-ES-valencia");
assert_eq!(
    Ordering::Less,
    loc.strict_cmp_iter(subtags.iter().copied()).end()
);

let loc = locale!("ca-ES-valencia-u-nu-arab");
assert_eq!(
    Ordering::Greater,
    loc.strict_cmp_iter(subtags.iter().copied()).end()
);
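
The results above match a byte comparison against the subtags joined with hyphens. As an illustration only, the following std-only sketch performs that comparison without allocating the joined string; cmp_with_subtags is a hypothetical function and says nothing about how strict_cmp_iter is implemented internally.

```rust
use std::cmp::Ordering;

/// Hypothetical helper: compares `canonical` with the subtags as if they
/// had been joined with '-' into a single byte string.
fn cmp_with_subtags<'a>(
    canonical: &str,
    subtags: impl IntoIterator<Item = &'a [u8]>,
) -> Ordering {
    let mut left = canonical.bytes();
    for (i, subtag) in subtags.into_iter().enumerate() {
        // A hyphen precedes every subtag except the first.
        let separator = (i > 0).then_some(b'-');
        for b in separator.into_iter().chain(subtag.iter().copied()) {
            match left.next() {
                // The left side ended first, so it sorts before the subtags.
                None => return Ordering::Less,
                Some(a) if a != b => return a.cmp(&b),
                Some(_) => {}
            }
        }
    }
    // Any bytes remaining on the left make it the greater side.
    if left.next().is_some() {
        Ordering::Greater
    } else {
        Ordering::Equal
    }
}
```

Since the function accepts any iterator of byte slices, several subtag sources can be chained with Iterator::chain before the call.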

Compare this Locale with a potentially unnormalized BCP-47 string.

The return value is equivalent to what would happen if you first parsed the BCP-47 string to a Locale and then performed a structural comparison.

Examples
use icu::locid::Locale;
use std::cmp::Ordering;

let bcp47_strings: &[&str] = &[
    "pl-LaTn-pL",
    "uNd",
    "UND-FONIPA",
    "UnD-t-m0-TrUe",
    "uNd-u-CA-Japanese",
    "ZH",
];

for a in bcp47_strings {
    assert!(a.parse::<Locale>().unwrap().normalizing_eq(a));
}
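
For inputs like the ones above, which differ from the normalized form only in casing, the comparison behaves like a case-insensitive string match. The std-only sketch below shows that simplified idea; loosely_equal is a hypothetical function, it also treats _ as -, and it is an approximation: it does not parse anything, so it does not reject malformed input or account for any reordering a real parse might apply to extension keywords.

```rust
/// Hypothetical helper: compares two locale strings while ignoring ASCII
/// case and treating '_' and '-' as the same separator.
fn loosely_equal(a: &str, b: &str) -> bool {
    // Map each byte to a common form before comparing.
    let fold = |c: u8| if c == b'_' { b'-' } else { c.to_ascii_lowercase() };
    a.len() == b.len() && a.bytes().map(fold).eq(b.bytes().map(fold))
}
```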

Trait Implementations

Converts this type into a mutable reference of the (usually inferred) input type.
Converts this type into a shared reference of the (usually inferred) input type.
Returns a copy of the value.
Performs copy-assignment from source.
Formats the value using the given formatter.
Returns the “default value” for a type.

This trait is implemented for compatibility with fmt!. To create a string, Writeable::write_to_string is usually more efficient.

Formats the value using the given formatter.

Examples

use icu::locid::Locale;
use icu::locid::{
    locale, subtags_language as language, subtags_region as region,
    subtags_script as script,
};

assert_eq!(
    Locale::from((
        language!("en"),
        Some(script!("Latn")),
        Some(region!("US"))
    )),
    locale!("en-Latn-US")
);
Converts to this type from the input type.

Examples

use icu::locid::Locale;
use icu::locid::{locale, subtags_language as language};

assert_eq!(Locale::from(language!("en")), locale!("en"));
Converts to this type from the input type.
Converts to this type from the input type.
Converts to this type from the input type.

Examples

use icu::locid::Locale;
use icu::locid::{locale, subtags_region as region};

assert_eq!(Locale::from(Some(region!("US"))), locale!("und-US"));
Converts to this type from the input type.

Examples

use icu::locid::Locale;
use icu::locid::{locale, subtags_script as script};

assert_eq!(Locale::from(Some(script!("latn"))), locale!("und-Latn"));
Converts to this type from the input type.
The associated error which can be returned from parsing.
Parses a string s to return a value of this type.
Feeds this value into the given Hasher.
Feeds a slice of this type into the given Hasher.
This method tests for self and other values to be equal, and is used by ==.
This method tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.
Writes a string to the given sink. Errors from the sink are bubbled up. The default implementation delegates to write_to_parts, and discards any Part annotations.
Returns a hint for the number of UTF-8 bytes that will be written to the sink.
Creates a new String with the data from this Writeable. Like ToString, but smaller and faster.
Write bytes and Part annotations to the given sink. Errors from the sink are bubbled up. The default implementation delegates to write_to, and doesn’t produce any Part annotations.

Blanket Implementations

Gets the TypeId of self.
Immutably borrows from an owned value.
Mutably borrows from an owned value.

Returns the argument unchanged.

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

The resulting type after obtaining ownership.
Creates owned data from borrowed data, usually by cloning.
Uses borrowed data to replace owned data, usually by cloning.
Converts the given value to a String.
The type returned in the event of a conversion error.
Performs the conversion.
The type returned in the event of a conversion error.
Performs the conversion.