Struct locale_config::LanguageRange

pub struct LanguageRange<'a> { /* fields omitted */ }

Language and culture identifier.

This object holds an RFC4647 extended language range.

The internal data may be owned or borrowed from an object with lifetime 'a. The lifetime can be extended using the into_static() method, which internally clones the data as needed.
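
The owned-or-borrowed behaviour described above can be sketched with std::borrow::Cow. This is only an illustration, not the crate's actual implementation: the type Range, its field, and the helpers borrowed() and as_str() are hypothetical names, and the assumption is that the data is a single string.

```rust
use std::borrow::Cow;

/// Hypothetical stand-in for a type that either borrows or owns its text.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Range<'a> {
    tag: Cow<'a, str>,
}

impl<'a> Range<'a> {
    /// Borrow the text from the caller; no allocation happens here.
    pub fn borrowed(tag: &'a str) -> Range<'a> {
        Range { tag: Cow::Borrowed(tag) }
    }

    /// Detach from the original lifetime. The text is cloned only if it
    /// was borrowed; already owned text is moved without copying.
    pub fn into_static(self) -> Range<'static> {
        Range { tag: Cow::Owned(self.tag.into_owned()) }
    }

    /// Create another instance that borrows its text from this one.
    pub fn to_shared(&'a self) -> Range<'a> {
        Range { tag: Cow::Borrowed(&*self.tag) }
    }

    /// View the text regardless of how it is stored.
    pub fn as_str(&self) -> &str {
        &self.tag
    }
}
```

With this shape, a value built from a short-lived string can outlive it once into_static() has been called, because the result no longer refers to the original buffer.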

Syntax

The range is composed of alphanumeric subtags separated by -, any of which may be replaced by *. The range may also be empty.

In agreement with RFC4647, this object only requires that the tag matches:

language_tag = (alpha{1,8} | "*")
               ("-" (alphanum{1,8} | "*"))*
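
The grammar above is small enough to check by hand. The following is a minimal sketch of such a check, assuming ASCII input; the function name is_extended_language_range is hypothetical and is not part of this crate's API.

```rust
/// Check a string against the extended language range syntax:
/// the first subtag is 1 to 8 letters or "*", every later subtag is
/// 1 to 8 letters or digits or "*". The empty string is accepted too,
/// because the range is allowed to be empty.
pub fn is_extended_language_range(range: &str) -> bool {
    if range.is_empty() {
        return true;
    }
    range.split('-').enumerate().all(|(index, subtag)| {
        if subtag == "*" {
            return true;
        }
        // Byte length equals character count once the ASCII checks pass.
        let length_ok = (1..=8).contains(&subtag.len());
        let chars_ok = if index == 0 {
            subtag.chars().all(|c| c.is_ascii_alphabetic())
        } else {
            subtag.chars().all(|c| c.is_ascii_alphanumeric())
        };
        length_ok && chars_ok
    })
}
```

Under these rules "en-US", "*-CH" and "de-*-DE" are accepted, while "en--US" (empty subtag) and "abcdefghi" (nine letters) are rejected.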

The exact interpretation is up to the downstream localization provider, but it is expected that the range will be matched against a normalized RFC5646 language tag, which has the structure:

language_tag    = language
                  ("-" script)?
                  ("-" region)?
                  ("-" variant)*
                  ("-" extension)*
                  ("-" private)?

language        = alpha{2,3} ("-" alpha{3}){0,3}

script          = alpha{4}

region          = alpha{2}
                | digit{3}

variant         = alphanum{5,8}
                | digit alphanum{3}

extension       = [0-9a-wyz] ("-" alphanum{2,8})+

private         = "x" ("-" alphanum{1,8})+
  • language is an ISO639 2-letter code or, where none is defined, a 3-letter code. The code of a macro-language may be followed by the code of a specific dialect.
  • script is an ISO15924 4-letter code.
  • region is either an ISO3166 2-letter code or, for areas other than countries, a UN M.49 3-digit numeric code.
  • variant is a string indicating a variant of the language.
  • extension and private define additional options. The private part has the same structure as the Unicode -u- extension. Available options are documented for the facets that use them.

The values obtained by inspecting the system are normalized according to those rules.

The content will be case-normalized as recommended in RFC5646 §2.1.1, namely:

  • language is written in lowercase,
  • script is written with first capital,
  • region (country) is written in uppercase and
  • all other subtags are written in lowercase.
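
The casing convention can be sketched using the positional rule from RFC5646 §2.1.1: everything is lowercase, except that 2-letter subtags are uppercased and 4-letter subtags get an initial capital when they are neither the first subtag nor located after a singleton. This is an illustration under that rule only; the function name case_normalize is hypothetical and the crate may recognize subtags differently.

```rust
/// Case-normalize a "-"-separated tag following the convention of
/// RFC 5646 section 2.1.1. Only ASCII case is changed.
pub fn case_normalize(tag: &str) -> String {
    let mut after_singleton = false;
    let mut parts: Vec<String> = Vec::new();
    for (index, subtag) in tag.split('-').enumerate() {
        let lower = subtag.to_ascii_lowercase();
        let cased = if index == 0 || after_singleton {
            // Language subtag, or anything inside an extension.
            lower
        } else if subtag.len() == 2 {
            // Two-letter subtag in this position: region.
            lower.to_ascii_uppercase()
        } else if subtag.len() == 4 {
            // Four-letter subtag in this position: script.
            let mut titled = lower;
            if let Some(head) = titled.get_mut(..1) {
                head.make_ascii_uppercase();
            }
            titled
        } else {
            lower
        };
        // A one-character subtag other than the wildcard is a singleton
        // that introduces an extension or the private-use part.
        if subtag.len() == 1 && subtag != "*" {
            after_singleton = true;
        }
        parts.push(cased);
    }
    parts.join("-")
}
```

Because the rule is purely positional, wildcards pass through unchanged and subtags after -u- or -x- stay lowercase.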

When detecting system configuration, additional options that may be generated under the -u- extension currently are:

  • cf — Currency format (account for parenthesized negative values, standard for minus sign).
  • fw — First day of week (mon to sun).
  • hc — Hour cycle (h12 for 1–12, h23 for 0–23).
  • ms — Measurement system (metric or ussystem).
  • nu — Numbering system—only decimal systems are currently used.
  • va — Variant, used when the locale is specified in Unix format and the tag after @ does not correspond to any variant defined in the Language Subtag Registry.
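
A consumer that wants to read these options back can walk the subtags after the u singleton. The sketch below assumes the usual Unicode extension layout, in which 2-character subtags are keys and the longer subtags that follow are their values; the function name unicode_extension_options is hypothetical and not part of this crate.

```rust
/// Collect the key/value pairs found under the -u- extension of a tag.
/// Keys are 2-character subtags; the longer subtags after a key are
/// joined with "-" to form its value. Scanning stops at the -x- part.
pub fn unicode_extension_options(tag: &str) -> Vec<(String, String)> {
    let lower = tag.to_ascii_lowercase();
    let mut options: Vec<(String, String)> = Vec::new();
    let mut inside = false;
    for subtag in lower.split('-') {
        if subtag.len() == 1 {
            if subtag == "x" {
                // Everything after "x" is private use, not Unicode options.
                break;
            }
            // Any other singleton starts an extension; only "u" matters here.
            inside = subtag == "u";
        } else if inside {
            if subtag.len() == 2 {
                options.push((subtag.to_string(), String::new()));
            } else if let Some(last) = options.last_mut() {
                if !last.1.is_empty() {
                    last.1.push('-');
                }
                last.1.push_str(subtag);
            }
            // Subtags before the first key (attributes) are ignored.
        }
    }
    options
}
```

For a tag such as cs-CZ-u-fw-mon-hc-h23 this yields the pairs (fw, mon) and (hc, h23).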

Under the -x- extension, the following options are defined:

  • df — Date format:

    • iso: Short date should be in ISO format of yyyy-MM-dd.

    For example -df-iso.

  • dm — Decimal separator for monetary:

    Followed by one or more Unicode codepoints in hexadecimal. For example, -dm-002c means to use a comma.

  • ds — Decimal separator for numbers:

    Followed by one or more Unicode codepoints in hexadecimal. For example, -ds-002c means to use a comma.

  • gm — Group (thousand) separator for monetary:

    Followed by one or more Unicode codepoints in hexadecimal. For example, -gm-00a0 means to use a non-breaking space.

  • gs — Group (thousand) separator for numbers:

    Followed by one or more Unicode codepoints in hexadecimal. For example, -gs-00a0 means to use a non-breaking space.

  • ls — List separator:

    Followed by one or more Unicode codepoints in hexadecimal. For example, -ls-003b means to use a semicolon.
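
Decoding the separator options amounts to parsing hexadecimal codepoints. The sketch below assumes that each codepoint is carried in its own subtag, which the text above does not spell out; the function names decode_one and decode_codepoints are hypothetical.

```rust
/// Decode one hexadecimal codepoint such as "002c" into a character.
fn decode_one(hex: &str) -> Option<char> {
    if hex.is_empty() || !hex.chars().all(|c| c.is_ascii_hexdigit()) {
        return None;
    }
    // from_u32 rejects surrogates and values above U+10FFFF.
    u32::from_str_radix(hex, 16).ok().and_then(char::from_u32)
}

/// Decode the value subtags of an option such as -ds- or -gs- into the
/// separator string. Returns None if there are no subtags or if any
/// subtag is not a valid codepoint.
pub fn decode_codepoints(subtags: &[&str]) -> Option<String> {
    if subtags.is_empty() {
        return None;
    }
    subtags.iter().map(|hex| decode_one(hex)).collect()
}
```

Here 002c decodes to a comma, 00a0 to a non-breaking space and 003b to a semicolon.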

Methods

impl<'a> LanguageRange<'a>

Construct LanguageRange from string, with normalization.

The input must follow the RFC4647 syntax. It will be case-normalized as recommended in RFC5646 §2.1.1, namely:

  • language, if recognized, is written in lowercase,
  • script, if recognized, is written with first capital,
  • region (country), if recognized, is written in uppercase and
  • all other subtags are written in lowercase.

Return LanguageRange for the invariant locale.

The invariant language is identified simply by the empty string.

Clone the internal data to extend lifetime.

Create new instance sharing the internal data.

Create language tag from Unix/Linux/GNU locale tag.

Unix locale tags have the form

language [ _ region ] [ . encoding ] [ @ variant ]

The language and region have the same format as in RFC5646. Encoding is not relevant here, since Rust always uses UTF-8. That leaves variant, which is unfortunately rather free-form. This function therefore translates known variants to the corresponding RFC5646 subtags and represents anything else with the Unicode POSIX variant (-u-va-) extension.

Note: This function is public for the benefit of applications that may come across this kind of tag from sources other than the system configuration.
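
The decomposition of a Unix locale tag can be sketched as follows. This is a simplified illustration, not the crate's implementation: the names UnixLocale, split_unix_locale and to_language_tag are hypothetical, and the sketch maps every variant to -u-va- instead of first translating known variants to RFC5646 subtags.

```rust
/// The four parts of a Unix locale tag, borrowed from the input.
#[derive(Debug, PartialEq, Eq)]
pub struct UnixLocale<'a> {
    pub language: &'a str,
    pub region: Option<&'a str>,
    pub encoding: Option<&'a str>,
    pub variant: Option<&'a str>,
}

/// Split "language[_region][.encoding][@variant]" into its parts.
pub fn split_unix_locale(locale: &str) -> UnixLocale<'_> {
    // Peel the optional parts off from the right-hand side.
    let (rest, variant) = match locale.split_once('@') {
        Some((head, tail)) => (head, Some(tail)),
        None => (locale, None),
    };
    let (rest, encoding) = match rest.split_once('.') {
        Some((head, tail)) => (head, Some(tail)),
        None => (rest, None),
    };
    let (language, region) = match rest.split_once('_') {
        Some((head, tail)) => (head, Some(tail)),
        None => (rest, None),
    };
    UnixLocale { language, region, encoding, variant }
}

/// Build a language tag, dropping the encoding and (as a simplification)
/// placing any variant under the -u-va- extension.
pub fn to_language_tag(locale: &UnixLocale<'_>) -> String {
    let mut tag = locale.language.to_ascii_lowercase();
    if let Some(region) = locale.region {
        tag.push('-');
        tag.push_str(&region.to_ascii_uppercase());
    }
    if let Some(variant) = locale.variant {
        tag.push_str("-u-va-");
        tag.push_str(&variant.to_ascii_lowercase());
    }
    tag
}
```

Splitting from the right keeps the sketch short: the variant is removed first, then the encoding, and whatever remains is divided into language and region.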

Trait Implementations

impl<'a> Clone for LanguageRange<'a>

Returns a copy of the value.

Performs copy-assignment from source.

impl<'a> Debug for LanguageRange<'a>

Formats the value using the given formatter.

impl<'a> Eq for LanguageRange<'a>

impl<'a> Hash for LanguageRange<'a>

Feeds this value into the state given, updating the hasher as necessary.

Feeds a slice of this type into the state provided.

impl<'a> PartialEq for LanguageRange<'a>

This method tests for self and other values to be equal, and is used by ==.

This method tests for !=.

impl<'a> AsRef<str> for LanguageRange<'a>

Performs the conversion.

impl<'a> Display for LanguageRange<'a>

Formats the value using the given formatter.