Crate jmdict

The JMdict file is a comprehensive multilingual dictionary of the Japanese language. The original JMdict file, included in this repository (and hence in releases of this crate), comes as XML. Instead of embedding the XML in the binary directly, this crate parses the XML at compile time and generates an optimized representation that is compiled into the binary. The crate’s API affords type-safe access to this embedded database.

WARNING: Licensing on database files

The database files compiled into the crate are licensed from the Electronic Dictionary Research and Development Group under Creative Commons licenses. Applications linking this crate directly or indirectly must display appropriate copyright notices to users. Please refer to the EDRDG’s license statement for details.

Basic usage

The database is accessed through the entries() function which provides an iterator over all database entries compiled into the application. While traversing the database and its entries, you will find that, whenever you expect a list of something, you will get an iterator instead. These iterators provide an abstraction between you as the user of the library, and the physical representation of the database as embedded in the binary.

The following example looks up the reading for お母さん in the database:

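let kanji_form = "お母さん";

// Find the first entry that has a kanji element with exactly this text.
let entry = jmdict::entries().find(|e| {
    e.kanji_elements().any(|k| k.text == kanji_form)
}).unwrap();

// Take the first reading element of that entry and compare its text.
let reading_form = entry.reading_elements().next().unwrap().text;
assert_eq!(reading_form, "おかあさん");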
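
Because accessors hand out iterators rather than lists, code that needs a length or indexed access can collect the items first. The following is a minimal sketch of that pattern using stand-in types: `Readings` and `readings()` are hypothetical names invented for this illustration and are not part of this crate, and the kana strings are dummy data. It also shows why a cheaply copyable iterator is convenient, since a copy can be consumed without disturbing the original.

```rust
// Stand-in types for illustration only; these are NOT the crate's definitions.
#[derive(Clone, Copy)]
struct Readings {
    data: &'static [&'static str],
    pos: usize,
}

impl Iterator for Readings {
    type Item = &'static str;

    fn next(&mut self) -> Option<Self::Item> {
        let item = self.data.get(self.pos).copied();
        if item.is_some() {
            self.pos += 1;
        }
        item
    }
}

// Hypothetical accessor, playing the role of a method that returns an iterator.
fn readings() -> Readings {
    Readings { data: &["あ", "い", "う"], pos: 0 }
}

fn main() {
    let iter = readings();

    // Copying the iterator only duplicates a slice reference and an index.
    let mut copy = iter;
    assert_eq!(copy.next(), Some("あ"));

    // The original is unaffected and can be collected when a list is needed.
    let list: Vec<&str> = iter.collect();
    assert_eq!(list.len(), 3);
    println!("{:?}", list);
}
```

Collecting is only needed when list semantics are required; for a single pass, iterating directly avoids the allocation.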

Cargo features

Common configurations

  • The default feature includes the most common words (about 30000 entries) and only their English translations.
  • The full feature includes everything in the JMdict.

Entry selection

  • The scope-uncommon feature includes uncommon words and glosses.
  • The scope-archaic feature includes glosses with the “archaic” label. If disabled, the PartOfSpeech enum will not include variants that are only relevant for archaic vocabulary, such as obsolete conjugation patterns. (The AllPartOfSpeech enum always contains all variants.)

Target languages

At least one target language must be selected. Selecting a target language will include all available translations in that language. Entries that do not have any translation in any of the selected languages will be skipped.

  • translations-eng: English (included in default)
  • translations-dut: Dutch
  • translations-fre: French
  • translations-ger: German
  • translations-hun: Hungarian
  • translations-rus: Russian
  • translations-slv: Slovenian
  • translations-spa: Spanish
  • translations-swe: Swedish
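
As a rough illustration of how these features combine, a dependent crate's Cargo.toml might drop the default feature and select German translations together with uncommon words. The feature names are taken from the lists above; the version requirement is a placeholder rather than a recommendation, and the particular combination is only an assumption about what a project might want.

```toml
[dependencies.jmdict]
version = "*"              # placeholder; pin the release you actually use
default-features = false   # drops the default feature, and with it translations-eng
features = ["scope-uncommon", "translations-ger"]
```

With default features disabled, at least one translations-* feature has to be listed explicitly, as stated above.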

The GlossLanguage enum will only contain variants corresponding to the enabled target languages. For example, in the default configuration, GlossLanguage::English will be the only variant. (The AllGlossLanguage enum always contains all variants.)
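
The relationship between a feature-dependent enum and its All* counterpart can be pictured as a fallible conversion: every variant of the full enum either maps to an enabled variant or yields an error. The sketch below uses stand-in types (`AllLang`, `Lang` and `Disabled` are hypothetical names for this illustration, not the crate's own definitions) and pretends that only English was enabled at compile time.

```rust
use std::convert::TryFrom;

// Stand-in types for illustration only; NOT the crate's actual definitions.

/// Plays the role of an All* enum: every variant is always present.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AllLang {
    English,
    German,
}

/// Plays the role of the feature-dependent enum. Here we pretend that
/// only English translations were enabled at compile time.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Lang {
    English,
}

/// Plays the role of the error returned for a variant that is disabled.
#[derive(Debug, PartialEq, Eq)]
struct Disabled;

impl TryFrom<AllLang> for Lang {
    type Error = Disabled;

    fn try_from(value: AllLang) -> Result<Self, Self::Error> {
        match value {
            AllLang::English => Ok(Lang::English),
            // German exists in the full enum but was not compiled in.
            AllLang::German => Err(Disabled),
        }
    }
}

fn main() {
    assert_eq!(Lang::try_from(AllLang::English), Ok(Lang::English));
    assert_eq!(Lang::try_from(AllLang::German), Err(Disabled));
}
```

Code that must handle every language regardless of the build configuration would work with the full enum and convert only where an enabled variant is actually required.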

Crippled builds: db-minimal

When the db-minimal feature is enabled, only a severely reduced portion of the JMdict will be parsed (to be exact, only chunks 000, 100 and 999). This is completely useless for actual usage, but allows for quick edit-compile-test cycles while working on this crate’s code.

Crippled builds: db-empty

When the db-empty feature is enabled, downloading and parsing of the JMdict contents is disabled entirely. The crate is compiled as usual, but entries() will return an empty iterator. This is useful for documentation builds, such as those on docs.rs, where --all-features is given.

Structs

Dialects

An iterator providing fast access to objects in the database. Instances of this iterator can be copied cheaply.

DisabledVariant

Error type for all enum conversions of the form impl TryFrom<AllFoo> for Foo.

Entries

An iterator providing fast access to objects in the database. Instances of this iterator can be copied cheaply.

Entry

An entry in the JMdict dictionary.

Gloss

A particular translation or explanation for a Japanese word or phrase in a different language.

Glosses

An iterator providing fast access to objects in the database. Instances of this iterator can be copied cheaply.

KanjiElement

A representation of a dictionary entry using kanji or other non-kana scripts.

KanjiElements

An iterator providing fast access to objects in the database. Instances of this iterator can be copied cheaply.

KanjiInfos

An iterator providing fast access to objects in the database. Instances of this iterator can be copied cheaply.

LoanwordSource

A source word in another language from which a particular Sense of an Entry has been borrowed.

LoanwordSources

An iterator providing fast access to objects in the database. Instances of this iterator can be copied cheaply.

PartsOfSpeech

An iterator providing fast access to objects in the database. Instances of this iterator can be copied cheaply.

Priority

Relative priority of a ReadingElement or KanjiElement.

ReadingElement

A representation of a dictionary entry using only kana.

ReadingElements

An iterator providing fast access to objects in the database. Instances of this iterator can be copied cheaply.

ReadingInfos

An iterator providing fast access to objects in the database. Instances of this iterator can be copied cheaply.

Sense

The translational equivalent of a Japanese word or phrase.

SenseInfos

An iterator providing fast access to objects in the database. Instances of this iterator can be copied cheaply.

SenseTopics

An iterator providing fast access to objects in the database. Instances of this iterator can be copied cheaply.

Senses

An iterator providing fast access to objects in the database. Instances of this iterator can be copied cheaply.

Strings

An iterator providing fast access to objects in the database. Instances of this iterator can be copied cheaply.

Enums

AllGlossLanguage

The language of a particular Gloss. This enum contains all possible variants, including those that have been disabled by compile-time flags in enum GlossLanguage.

AllPartOfSpeech

Where a word can appear in a sentence for a particular Sense of the word. This enum contains all possible variants, including those that have been disabled by compile-time flags in enum PartOfSpeech.

Dialect

Dialect of Japanese in which a certain vocabulary occurs.

GlossLanguage

The language of a particular Gloss.

GlossType

Type of gloss.

KanjiInfo

Information regarding a certain KanjiElement.

PartOfSpeech

Where a word can appear in a sentence for a particular Sense of the word.

PriorityInCorpus

PriorityInCorpus appears in struct Priority. It describes how often a dictionary entry appears in a certain corpus of text.

ReadingInfo

Information regarding a certain ReadingElement.

SenseInfo

Information regarding a certain Sense.

SenseTopic

Field of study where a certain Sense originates.

Traits

Enum

Common methods provided by all enums in this crate.

Functions

entries

Returns an iterator over all entries in the database.