icu_datagen is a library to generate data files that can be used in ICU4X data providers.
Data files can be generated either programmatically (e.g. in build.rs), or through a
command-line utility.
Also see our datagen tutorial.
§Examples
§Rust API
use icu_datagen::blob_exporter::*;
use icu_datagen::prelude::*;
use std::fs::File;

DatagenDriver::new()
    .with_keys([icu::list::provider::AndListV1Marker::KEY])
    .with_locales_and_fallback([LocaleFamily::FULL], Default::default())
    .export(
        &DatagenProvider::new_latest_tested(),
        BlobExporter::new_v2_with_sink(Box::new(
            File::create("data.postcard").unwrap(),
        )),
    )
    .unwrap();

§Command line
The command line interface can be installed through Cargo.
$ cargo install icu_datagen

Once the tool is installed, you can invoke it like this:

$ icu4x-datagen --keys all --locales de en-AU --format blob --out data.postcard

More details can be found by running --help.
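If the command-line tool needs to be driven from Rust (for example from a build script) rather than from a shell, the invocation above could be assembled with std::process::Command. The following is only a minimal sketch: the helper names datagen_args and run_datagen are hypothetical and not part of icu_datagen, and it assumes the icu4x-datagen binary is already installed and on PATH. It uses only the flags shown in the example above.

```rust
use std::io;
use std::process::{Command, ExitStatus};

/// Hypothetical helper: assembles the arguments for an `icu4x-datagen` call,
/// using only the flags shown in the command-line example.
pub fn datagen_args(keys: &str, locales: &[&str], format: &str, out: &str) -> Vec<String> {
    let mut args: Vec<String> = vec!["--keys".into(), keys.into(), "--locales".into()];
    // In the example, `--locales` is followed by a space-separated list,
    // so each locale is passed as its own argument.
    args.extend(locales.iter().map(|l| l.to_string()));
    args.extend(["--format", format, "--out", out].map(String::from));
    args
}

/// Hypothetical helper: runs the installed tool and returns its exit status.
/// Returns an `io::Error` if the binary cannot be started (e.g. not on PATH).
pub fn run_datagen(args: &[String]) -> io::Result<ExitStatus> {
    Command::new("icu4x-datagen").args(args).status()
}
```

A caller would pass the result of datagen_args("all", &["de", "en-AU"], "blob", "data.postcard") to run_datagen and check the returned status with success(). Keeping argument construction separate from process spawning makes the argument list easy to inspect without the tool being installed.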
§Cargo features
This crate has a lot of dependencies, some of which are not required for all operating modes. These default Cargo features can be disabled to reduce dependencies:
- baked_exporter
  - enables the baked_exporter module
  - enables the --format mod CLI argument
- blob_exporter
  - enables the blob_exporter module, a reexport of icu_provider_blob::export
  - enables the --format blob CLI argument
- fs_exporter
  - enables the fs_exporter module, a reexport of icu_provider_fs::export
  - enables the --format dir CLI argument
- networking
  - enables methods on DatagenProvider that fetch source data from the network
  - enables the --cldr-tag, --icu-export-tag, and --segmenter-lstm-tag CLI arguments that download data
- rayon
  - enables parallelism during export
- use_wasm / use_icu4c
  - see the documentation on icu_codepointtrie_builder
- bin
  - required by the CLI and enabled by default to make cargo install work
- legacy_api
  - enables the deprecated pre-1.3 API
  - enabled by default for semver stability
  - will be removed in 2.0
- icu_experimental
  - enables data generation for keys defined in the unstable icu_experimental crate
  - note that this feature affects the behaviour of all_keys
The meta-feature experimental_components is available to activate all experimental components.
Modules§
- baked_exporter: A data exporter that bakes the data into Rust code.
- blob_exporter: Data exporter that creates a binary blob for use with BlobDataProvider.
- fs_exporter: Data exporter that creates a file system structure for use with FsDataProvider.
- prelude: A prelude for using the datagen API.
- syntax (Deprecated): Out::Fs serialization formats.
Structs§
- DatagenDriver: Configuration for a data export operation.
- DatagenProvider: An ExportableProvider backed by raw CLDR and ICU data.
- FallbackOptions: Options bag configuring locale inclusion and behavior when runtime fallback is enabled.
- LocaleFamily: A family of locales to export.
- NoFallbackOptions: Options bag configuring locale inclusion and behavior when runtime fallback is disabled.
- SourceData (Deprecated): Bag of options for datagen.
Enums§
- CollationHanDatabase: Specifies the collation Han database to use.
- CoverageLevel: A language’s CLDR coverage level.
- DeduplicationStrategy: Choices for determining the deduplication of locales for exported data payloads.
- FallbackMode: Defines how fallback will apply to the generated data.
- Out (Deprecated): The output format for datagen.
- RuntimeFallbackLocation: Choices for the code location of runtime fallback.
Functions§
- all_keys: List of all keys that are available.
- all_keys_with_experimental (Deprecated): Same as all_keys.
- datagen (Deprecated): Runs data generation.
- is_missing_cldr_error (Deprecated): Identifies errors that are due to missing CLDR data.
- is_missing_icuexport_error (Deprecated): Identifies errors that are due to missing ICU export data.
- key: Parses a human-readable key identifier into a DataKey.
- keys: Parses a list of human-readable key identifiers and returns a list of DataKeys.
- keys_from_bin: Parses a compiled binary and returns a list of DataKeys that it uses at runtime.
- keys_from_file (Deprecated): Parses a file of human-readable key identifiers and returns a list of DataKeys.