Module icu_data::ucm

This module contains a UniCode Mapping (.ucm) file format parser and all of the data files available in the Unicode Consortium’s icu-data repository. For a list, see KNOWN_CHARSETS.

Most uses of this library should look like this:

use icu_data::ucm::{request_mapping_file, parser::parse as parse_ucm};

let f = request_mapping_file("java-EUC_JP-1.3_P").unwrap(); // holds the .ucm file as a String
let enc = parse_ucm(&f).unwrap(); // holds an `Encoding`
/* ... */

If you only want a single encoding, each one is available as a static in the mappings module. They are all lazy_static values, so each is evaluated only on first use. That evaluation can panic, because it runs the same code as above, unwrap() calls included. They all work on my machine, and should only ever panic if Brotli decompression or tar metadata parsing fails.

Example:

use icu_data::ucm::mappings;
assert_eq!(mappings::JAVA_EUC_JP_1_3_P.codepoints.len(), 13139);
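To illustrate what "only evaluated when used" means in practice, here is a minimal sketch of the same lazy-initialization pattern built on the standard library's OnceLock rather than lazy_static. DEMO_TABLE, demo_table and INIT_RUNS are hypothetical stand-ins invented for this example; they are not part of icu_data, and the real statics hold an Encoding rather than a Vec<u32>.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

// Counts how many times the initializer actually runs.
static INIT_RUNS: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for one entry in `mappings`.
static DEMO_TABLE: OnceLock<Vec<u32>> = OnceLock::new();

// Stand-in for "decompress, read and parse": the closure runs once, on first access.
fn demo_table() -> &'static Vec<u32> {
    DEMO_TABLE.get_or_init(|| {
        INIT_RUNS.fetch_add(1, Ordering::SeqCst);
        (0x41..=0x5A).collect()
    })
}

fn main() {
    // Nothing has been evaluated before the first access.
    assert_eq!(INIT_RUNS.load(Ordering::SeqCst), 0);
    let first = demo_table().len();
    let second = demo_table().len();
    assert_eq!(first, second);
    // The second access reused the value built by the first.
    assert_eq!(INIT_RUNS.load(Ordering::SeqCst), 1);
    println!("{first} entries, initializer ran once");
}
```

If the initializer panicked instead of returning, the panic would surface at the first access, which is where a failure in one of the mappings statics would show up as well.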

Modules

mappings

Lazily evaluated statics, each holding the Encoding for one charset

parser

A .ucm file format (UniCode Mapping) Pest parser

Structs

Codepoint

This represents a CHARMAP row in a .ucm (UniCode Mapping) file.

Encoding

This represents a single .ucm (UniCode Mapping) file.
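For readers unfamiliar with the format, a CHARMAP row in a .ucm file generally has the shape `<U0041> \x41 |0`: a Unicode code point, one or more `\xNN` bytes, and a digit after `|`. The sketch below parses that shape by hand. It is not the crate's Pest grammar, and Row with its field names is a hypothetical stand-in rather than the actual Codepoint definition. It also assumes a single code point per row and ignores comments.

```rust
/// Hypothetical stand-in for a parsed CHARMAP row; not the crate's `Codepoint`.
#[derive(Debug, PartialEq)]
struct Row {
    unicode: u32,   // value inside <U....>
    bytes: Vec<u8>, // the \xNN sequence
    precision: u8,  // digit after '|'
}

/// Parses one row of the form `<UXXXX> \xNN[\xNN...] |n`; returns None otherwise.
fn parse_row(line: &str) -> Option<Row> {
    let mut fields = line.split_whitespace();

    // First field: <U0041> -> 0x41
    let hex = fields.next()?.strip_prefix("<U")?.strip_suffix('>')?;
    let unicode = u32::from_str_radix(hex, 16).ok()?;

    // Second field: \xB0\xEC -> [0xB0, 0xEC]
    let byte_field = fields.next()?.strip_prefix("\\x")?;
    let bytes = byte_field
        .split("\\x")
        .map(|h| u8::from_str_radix(h, 16).ok())
        .collect::<Option<Vec<u8>>>()?;

    // Third field: |0 -> 0
    let precision = fields.next()?.strip_prefix('|')?.parse::<u8>().ok()?;

    Some(Row { unicode, bytes, precision })
}

fn main() {
    // Illustrative input, not copied from any data file.
    let row = parse_row(r"<U0041> \x41 |0").expect("well-formed row");
    println!("{row:?}");
}
```

The precision digit is kept as a raw number here so that the sketch makes no claim about how the crate names its variants.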

Enums

EquivalenceType

The “equivalence type” relating a Unicode codepoint to its bytestring in the Encoding. The equivalence types are defined by the Unicode Consortium, as listed in the EquivalenceType documentation.

IcuDataError

Error type. You should only ever expect to see UnknownMappingRequested unless you’re doing development on the library.

Statics

KNOWN_CHARSETS

This is a list of all of the encodings known to request_mapping_file, given without the .ucm extension.
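Because the names are stored without the extension, a file name such as `java-EUC_JP-1.3_P.ucm` needs its suffix removed before it can match. The following is a minimal sketch of that normalization step. DEMO_CHARSETS and lookup_name are hypothetical names made up for this example, and the slice of string literals is an assumption, since the actual type of KNOWN_CHARSETS is not shown here.

```rust
// Hypothetical stand-in for KNOWN_CHARSETS; the real list is much longer.
const DEMO_CHARSETS: &[&str] = &["java-EUC_JP-1.3_P"];

/// Strips a trailing ".ucm", if present, and returns the name only when it is listed.
fn lookup_name<'a>(input: &'a str, known: &[&str]) -> Option<&'a str> {
    let name = input.strip_suffix(".ucm").unwrap_or(input);
    known.contains(&name).then_some(name)
}

fn main() {
    for candidate in ["java-EUC_JP-1.3_P.ucm", "java-EUC_JP-1.3_P", "no-such-charset"] {
        match lookup_name(candidate, DEMO_CHARSETS) {
            Some(name) => println!("{candidate}: known as {name}"),
            None => println!("{candidate}: not in the list"),
        }
    }
}
```

Checking a name this way before calling request_mapping_file would let a caller report an unknown charset without relying on the error path.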

Traits

PestParser

A trait with a single method that parses strings.

Functions

request_mapping_file

Given the name of an encoding known to ICU, return its raw UCM data as a String.