Module text


This module contains reusable components for encoding and decoding text in DICOM data structures, including support for character repertoires.

The following table lists the character sets known to this module and whether decoding and encoding are currently supported (✓ supported, x not supported):

| Character Set | decoding support | encoding support |
|---------------|------------------|------------------|
| ISO-IR 6 (default) | ✓ | ✓ |
| ISO-IR 13 (WINDOWS_31J): The JIS X 0201-1976 character set (Japanese single-byte) | ✓ | ✓ |
| ISO-IR 87 (ISO_2022_JP): The JIS X 0208-1990 character set (Japanese multi-byte) | ✓ | ✓ |
| ISO-IR 100 (ISO-8859-1): Right-hand part of the Latin alphabet no. 1, the Western Europe character set | ✓ | ✓ |
| ISO-IR 101 (ISO-8859-2): Right-hand part of the Latin alphabet no. 2, the Central/Eastern Europe character set | ✓ | ✓ |
| ISO-IR 109 (ISO-8859-3): Right-hand part of the Latin alphabet no. 3, the South Europe character set | ✓ | ✓ |
| ISO-IR 110 (ISO-8859-4): Right-hand part of the Latin alphabet no. 4, the North Europe character set | ✓ | ✓ |
| ISO-IR 126 (ISO-8859-7): The Latin/Greek character set | ✓ | ✓ |
| ISO-IR 127 (ISO-8859-6): The Latin/Arabic character set | ✓ | ✓ |
| ISO-IR 138 (ISO-8859-8): The Latin/Hebrew character set | ✓ | ✓ |
| ISO-IR 144 (ISO-8859-5): The Latin/Cyrillic character set | ✓ | ✓ |
| ISO-IR 148 (ISO-8859-9): Latin no. 5, the Turkish character set | x | x |
| ISO-IR 149 (WINDOWS_949): The KS X 1001 character set (Korean) | ✓ | ✓ |
| ISO-IR 159: The JIS X 0212-1990 character set (supplementary Japanese characters) | x | x |
| ISO-IR 166 (WINDOWS_874): The TIS 620-2533 character set (Thai) | ✓ | ✓ |
| ISO-IR 192: The Unicode character set based on the UTF-8 encoding | ✓ | ✓ |
| GB18030: The Simplified Chinese character set | ✓ | ✓ |
| GB2312: Simplified Chinese character set | ✓ | ✓ |
| GBK: Simplified Chinese character set | ✓ | ✓ |

These capabilities are available through SpecificCharacterSet.
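
To make the idea of a character repertoire concrete, here is a minimal, standard-library-only sketch of what decoding and encoding ISO-IR 100 (ISO-8859-1) text involves. The helpers `decode_latin1` and `encode_latin1` are hypothetical names used for illustration and are not part of this module; the sketch assumes the plain ISO-8859-1 mapping, in which every byte value equals the Unicode code point of the character it represents.

```rust
use std::convert::TryFrom;

/// Hypothetical helper (not this module's API): decode ISO-8859-1 bytes.
/// Each byte maps to the Unicode code point with the same numeric value,
/// so decoding can never fail.
fn decode_latin1(bytes: &[u8]) -> String {
    bytes.iter().map(|&b| char::from(b)).collect()
}

/// Hypothetical helper (not this module's API): encode text as ISO-8859-1.
/// Returns the first character that does not fit in a single byte.
fn encode_latin1(text: &str) -> Result<Vec<u8>, char> {
    text.chars()
        .map(|c| u8::try_from(u32::from(c)).map_err(|_| c))
        .collect()
}
```

Under these assumptions, the bytes `[0x4A, 0x6F, 0xE3, 0x6F]` decode to `João`, while encoding text that contains characters above U+00FF reports the offending character instead of silently dropping it.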

Structs

DefaultCharacterSetCodec
Data type representing the default character set.
GBKCharacterSetCodec
Data type for the GBK character set encoding.
Gb18030CharacterSetCodec
Data type for the GB18030 character set encoding.
IsoIr13CharacterSetCodec
Data type for the ISO_IR 13 character set encoding.
IsoIr87CharacterSetCodec
Data type for the ISO_IR 87 character set encoding.
IsoIr100CharacterSetCodec
Data type for the ISO_IR 100 character set encoding.
IsoIr101CharacterSetCodec
Data type for the ISO_IR 101 character set encoding.
IsoIr109CharacterSetCodec
Data type for the ISO_IR 109 character set encoding.
IsoIr110CharacterSetCodec
Data type for the ISO_IR 110 character set encoding.
IsoIr126CharacterSetCodec
Data type for the ISO_IR 126 character set encoding.
IsoIr127CharacterSetCodec
Data type for the ISO_IR 127 character set encoding.
IsoIr138CharacterSetCodec
Data type for the ISO_IR 138 character set encoding.
IsoIr144CharacterSetCodec
Data type for the ISO_IR 144 character set encoding.
IsoIr149CharacterSetCodec
Data type for the ISO_IR 149 character set encoding.
IsoIr166CharacterSetCodec
Data type for the ISO_IR 166 character set encoding.
SpecificCharacterSet
A descriptor for a specific character set, used in text encoding and decoding as per DICOM PS3.5, section 6.1.
Utf8CharacterSetCodec
Data type for the ISO_IR 192 character set encoding.

Enums

DecodeTextError
An error type for text decoding issues.
EncodeTextError
An error type for text encoding issues.
TextValidationOutcome
The result of a text validation procedure (please see validate_iso_8859).

Traits

TextCodec
A holder of encoding and decoding mechanisms for text in DICOM content, which, according to the standard, depend on the specific character set.
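
The following sketch illustrates the general shape of such a codec abstraction using only the standard library. Everything in it is an assumption made for illustration: the trait `SketchCodec`, the types `DefaultSketch` and `Utf8Sketch`, the `String` error type, and the lookup function `codec_for` are hypothetical and do not reproduce the actual definition of `TextCodec` or its implementors. The sketch also assumes that an empty defined term selects the default repertoire.

```rust
/// Hypothetical stand-in for a text codec (not the real `TextCodec`).
trait SketchCodec {
    /// The defined term that identifies the character set.
    fn name(&self) -> &'static str;
    /// Turn raw bytes into a string, or explain why that failed.
    fn decode(&self, bytes: &[u8]) -> Result<String, String>;
    /// Turn a string into raw bytes, or explain why that failed.
    fn encode(&self, text: &str) -> Result<Vec<u8>, String>;
}

/// Default repertoire (ISO_IR 6): this sketch accepts ASCII only.
struct DefaultSketch;

impl SketchCodec for DefaultSketch {
    fn name(&self) -> &'static str {
        "ISO_IR 6"
    }
    fn decode(&self, bytes: &[u8]) -> Result<String, String> {
        if bytes.is_ascii() {
            Ok(bytes.iter().map(|&b| char::from(b)).collect())
        } else {
            Err("byte outside the default repertoire".to_owned())
        }
    }
    fn encode(&self, text: &str) -> Result<Vec<u8>, String> {
        if text.is_ascii() {
            Ok(text.as_bytes().to_vec())
        } else {
            Err("character outside the default repertoire".to_owned())
        }
    }
}

/// Unicode in UTF-8 (ISO_IR 192).
struct Utf8Sketch;

impl SketchCodec for Utf8Sketch {
    fn name(&self) -> &'static str {
        "ISO_IR 192"
    }
    fn decode(&self, bytes: &[u8]) -> Result<String, String> {
        std::str::from_utf8(bytes)
            .map(str::to_owned)
            .map_err(|e| e.to_string())
    }
    fn encode(&self, text: &str) -> Result<Vec<u8>, String> {
        // Rust strings are already UTF-8, so encoding is a plain copy.
        Ok(text.as_bytes().to_vec())
    }
}

/// Select a codec from a defined term; unknown terms yield `None`.
fn codec_for(term: &str) -> Option<Box<dyn SketchCodec>> {
    match term.trim() {
        "" | "ISO_IR 6" => Some(Box::new(DefaultSketch)),
        "ISO_IR 192" => Some(Box::new(Utf8Sketch)),
        _ => None,
    }
}
```

Returning a boxed trait object keeps the call site independent of the concrete character set, which is the role the text above describes for a codec holder. A real implementation would likely use dedicated error types rather than `String`.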

Functions

validate_cs
Check whether the given byte slice contains only valid characters for a Code String value representation.
validate_da
Check whether the given byte slice contains only valid characters for a Date value representation.
validate_dt
Check whether the given byte slice contains only valid characters for a Date Time value representation.
validate_iso_8859
Check whether the given byte slice contains valid text from the default character repertoire.
validate_tm
Check whether the given byte slice contains only valid characters for a Time value representation.
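
As an illustration of character-level validation, the sketch below checks byte slices against the character sets that DICOM PS3.5 allows for the Code String (CS) and Date (DA) value representations. The functions `cs_chars_ok` and `da_chars_ok` are hypothetical, return a plain `bool`, and are not the functions listed above; they only test which characters appear, not the length or the calendar validity of the value.

```rust
/// Hypothetical check (not this module's `validate_cs`): a Code String may
/// contain only uppercase letters, digits, the space character and underscore.
fn cs_chars_ok(bytes: &[u8]) -> bool {
    bytes
        .iter()
        .all(|&b| matches!(b, b'A'..=b'Z' | b'0'..=b'9' | b' ' | b'_'))
}

/// Hypothetical check (not this module's `validate_da`): a Date in the
/// YYYYMMDD form contains only ASCII digits.
fn da_chars_ok(bytes: &[u8]) -> bool {
    bytes.iter().all(u8::is_ascii_digit)
}
```

With these definitions, `cs_chars_ok(b"ISO_IR 100")` and `da_chars_ok(b"20240131")` hold, whereas `b"lowercase"` and `b"2024-01-31"` are rejected. An empty slice passes both checks, since it contains no invalid characters.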