pub enum DataType {
Null,
Boolean,
Int8,
Int16,
Int32,
Int64,
UInt8,
UInt16,
UInt32,
UInt64,
Float16,
Float32,
Float64,
Timestamp(TimeUnit, Option<String>),
Date32,
Date64,
Time32(TimeUnit),
Time64(TimeUnit),
Duration(TimeUnit),
Interval(IntervalUnit),
Binary,
FixedSizeBinary(i32),
LargeBinary,
Utf8,
LargeUtf8,
List(Box<Field>),
FixedSizeList(Box<Field>, i32),
LargeList(Box<Field>),
Struct(Vec<Field>),
Union(Vec<Field>, Vec<i8>, UnionMode),
Dictionary(Box<DataType>, Box<DataType>),
Decimal(usize, usize),
Map(Box<Field>, bool),
}
The set of datatypes that are supported by this implementation of Apache Arrow.
The Arrow specification on data types includes some more types.
See also Schema.fbs for Arrow’s specification.
The variants of this enum include primitive fixed size types as well as parametric or nested types. Currently the Rust implementation supports the following nested types:
- List<T>
- Struct<T, U, V, ...>
Nested types can themselves be nested within other arrays. For more information on these types please see the physical memory layout of Apache Arrow.
Variants
Null
Null type
Boolean
A boolean datatype representing the values true and false.
Int8
A signed 8-bit integer.
Int16
A signed 16-bit integer.
Int32
A signed 32-bit integer.
Int64
A signed 64-bit integer.
UInt8
An unsigned 8-bit integer.
UInt16
An unsigned 16-bit integer.
UInt32
An unsigned 32-bit integer.
UInt64
An unsigned 64-bit integer.
Float16
A 16-bit floating point number.
Float32
A 32-bit floating point number.
Float64
A 64-bit floating point number.
Timestamp(TimeUnit, Option<String>)
A timestamp with an optional timezone.
Time is measured from the Unix epoch (00:00:00.000 on 1 January 1970, excluding leap seconds) as a 64-bit integer count of the given TimeUnit.
The time zone is a string indicating the name of a time zone, one of:
- As used in the Olson time zone database (the “tz database” or “tzdata”), such as “America/New_York”
- An absolute time zone offset of the form +XX:XX or -XX:XX, such as +07:30
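The absolute-offset form described above can be checked with a few lines of plain Rust. This is only a sketch: `parse_offset_minutes` is a hypothetical helper, not part of the arrow crate, and the rule that minutes must be below 60 is an assumption of the sketch rather than something this page states.

```rust
/// Parses an absolute offset such as "+07:30" into signed minutes from UTC.
/// Hypothetical helper for illustration; not an arrow API.
fn parse_offset_minutes(tz: &str) -> Option<i32> {
    let bytes = tz.as_bytes();
    // The form is exactly six bytes: sign, two digits, ':', two digits.
    if bytes.len() != 6 || bytes[3] != b':' {
        return None;
    }
    let sign = match bytes[0] {
        b'+' => 1,
        b'-' => -1,
        _ => return None,
    };
    // Turns two ASCII digit bytes into a number, or None if either is not a digit.
    let two_digits = |hi: u8, lo: u8| -> Option<i32> {
        if hi.is_ascii_digit() && lo.is_ascii_digit() {
            Some(i32::from(hi - b'0') * 10 + i32::from(lo - b'0'))
        } else {
            None
        }
    };
    let hours = two_digits(bytes[1], bytes[2])?;
    let minutes = two_digits(bytes[4], bytes[5])?;
    // Assumption of this sketch: reject minute values of 60 or more.
    if minutes >= 60 {
        return None;
    }
    Some(sign * (hours * 60 + minutes))
}
```

A string that fails this check, such as “America/New_York”, would then be treated as a tz database name instead.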
Date32
A 32-bit date representing the number of days elapsed since the UNIX epoch (1970-01-01).
Date64
A 64-bit date representing the number of milliseconds elapsed since the UNIX epoch (1970-01-01). Values are evenly divisible by 86400000.
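The relationship between the two date encodings is simple arithmetic: one day is 86400000 milliseconds. The sketch below uses hypothetical helper names (`days_to_millis`, `millis_to_days`) on raw integers, not arrow arrays, to show the conversion and the divisibility rule.

```rust
/// Milliseconds in one day; the divisor mentioned for Date64 values.
const MILLIS_PER_DAY: i64 = 86_400_000;

/// Converts a Date32-style day count into a Date64-style millisecond count.
fn days_to_millis(days: i32) -> i64 {
    i64::from(days) * MILLIS_PER_DAY
}

/// Converts a Date64-style millisecond count back to days.
/// Returns `None` when the value is not a whole number of days
/// or the day count does not fit in 32 bits.
fn millis_to_days(millis: i64) -> Option<i32> {
    if millis % MILLIS_PER_DAY != 0 {
        return None;
    }
    let days = millis / MILLIS_PER_DAY;
    if days < i64::from(i32::MIN) || days > i64::from(i32::MAX) {
        return None;
    }
    Some(days as i32)
}
```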
Time32(TimeUnit)
A 32-bit time representing the elapsed time since midnight in the unit of TimeUnit.
Time64(TimeUnit)
A 64-bit time representing the elapsed time since midnight in the unit of TimeUnit.
Duration(TimeUnit)
Measure of elapsed time in either seconds, milliseconds, microseconds or nanoseconds.
Interval(IntervalUnit)
A “calendar” interval which models types that don’t necessarily have a precise duration without the context of a base timestamp (e.g. days can differ in length during daylight saving time transitions).
Binary
Opaque binary data of variable length.
FixedSizeBinary(i32)
Opaque binary data of fixed size. Enum parameter specifies the number of bytes per value.
LargeBinary
Opaque binary data of variable length and 64-bit offsets.
Utf8
A variable-length string in Unicode with UTF-8 encoding.
LargeUtf8
A variable-length string in Unicode with UTF-8 encoding and 64-bit offsets.
List(Box<Field>)
A list of some logical data type with variable length.
FixedSizeList(Box<Field>, i32)
A list of some logical data type with fixed length.
LargeList(Box<Field>)
A list of some logical data type with variable length and 64-bit offsets.
Struct(Vec<Field>)
A nested datatype that contains a number of sub-fields.
Union(Vec<Field>, Vec<i8>, UnionMode)
A nested datatype that can represent slots of differing types. Components:
- A Field for each possible child type the Union can hold
- The corresponding type_id used to identify which Field
- The type of union (Sparse or Dense)
Dictionary(Box<DataType>, Box<DataType>)
A dictionary encoded array (key_type, value_type), where each array element is an index of key_type into an associated dictionary of value_type.
Dictionary arrays are used to store columns of value_type that contain many repeated values using less memory, but with a higher CPU overhead for some operations.
This type is mostly used to represent low cardinality string arrays or a limited set of primitive types as integers.
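To make the key/value split concrete, here is a minimal sketch of dictionary encoding over plain vectors. It does not use the arrow crate: `dictionary_encode` and `dictionary_decode` are hypothetical names, and the choice of i32 keys and String values is an assumption made for the example.

```rust
use std::collections::HashMap;

/// Dictionary-encodes `column`: returns one i32 key per element plus the
/// list of distinct values those keys index into (in first-seen order).
fn dictionary_encode(column: &[&str]) -> (Vec<i32>, Vec<String>) {
    let mut index_of: HashMap<&str, i32> = HashMap::new();
    let mut values: Vec<String> = Vec::new();
    let mut keys: Vec<i32> = Vec::with_capacity(column.len());
    for &item in column {
        // Reuse the existing key for a repeated value, otherwise append it.
        let key = *index_of.entry(item).or_insert_with(|| {
            values.push(item.to_string());
            (values.len() - 1) as i32
        });
        keys.push(key);
    }
    (keys, values)
}

/// Rebuilds the original column by looking each key up in `values`.
fn dictionary_decode(keys: &[i32], values: &[String]) -> Vec<String> {
    keys.iter().map(|&k| values[k as usize].clone()).collect()
}
```

Each repeated string is stored once, and the column itself shrinks to small integers. Decoding needs an extra lookup per element, which is one source of the CPU overhead mentioned above.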
Decimal(usize, usize)
Exact decimal value with precision and scale
- precision is the total number of digits
- scale is the number of digits past the decimal point
For example the number 123.45 has precision 5 and scale 2.
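The definition can be illustrated by counting digits in a decimal literal. `precision_and_scale` below is a hypothetical helper, not an arrow function, and it counts every digit as written (including leading zeros), which is a simplification of this sketch.

```rust
/// Returns (precision, scale) for a plain decimal literal such as "123.45".
/// Returns `None` if the text is anything other than ASCII digits with an
/// optional leading '-' and at most one '.'.
fn precision_and_scale(text: &str) -> Option<(usize, usize)> {
    let unsigned = text.strip_prefix('-').unwrap_or(text);
    // Split into the digits before and after the decimal point, if any.
    let (int_part, frac_part) = match unsigned.split_once('.') {
        Some((i, f)) => (i, f),
        None => (unsigned, ""),
    };
    let all_digits = |s: &str| s.chars().all(|c| c.is_ascii_digit());
    if !all_digits(int_part) || !all_digits(frac_part) {
        return None;
    }
    if int_part.is_empty() && frac_part.is_empty() {
        return None;
    }
    // Precision counts all digits; scale counts only the fractional ones.
    Some((int_part.len() + frac_part.len(), frac_part.len()))
}
```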
Map(Box<Field>, bool)
A Map is a logical nested type that is represented as
List<entries: Struct<key: K, value: V>>
The keys and values are each respectively contiguous. The key and value types are not constrained, but keys should be hashable and unique. Whether the keys are sorted can be set in the bool after the Field.
In a field with Map type, the field has a child Struct field, which then has two children: the first is the key type and the second is the value type. The names of the child fields may be respectively “entries”, “key”, and “value”, but this is not enforced.
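The List<entries: Struct<key, value>> shape can be pictured as one offsets vector over two contiguous child vectors. The sketch below is a stand-in built from plain `Vec`s. `MapColumn`, `build_map_column` and `lookup` are hypothetical names, and the String keys and i64 values are assumptions for the example, not the arrow crate's array types.

```rust
/// Stand-in for a Map column: row `i` owns the entries in
/// `offsets[i]..offsets[i + 1]` of the contiguous `keys` and `values`.
struct MapColumn {
    offsets: Vec<usize>,
    keys: Vec<String>,
    values: Vec<i64>,
}

/// Flattens per-row key/value pairs into the contiguous layout.
fn build_map_column(rows: &[Vec<(&str, i64)>]) -> MapColumn {
    let mut column = MapColumn {
        offsets: vec![0],
        keys: Vec::new(),
        values: Vec::new(),
    };
    for row in rows {
        for (k, v) in row {
            column.keys.push(k.to_string());
            column.values.push(*v);
        }
        // Record where this row's entries end.
        column.offsets.push(column.keys.len());
    }
    column
}

/// Linear scan of one row's entries; returns the value for `key` if present.
fn lookup(column: &MapColumn, row: usize, key: &str) -> Option<i64> {
    let start = *column.offsets.get(row)?;
    let end = *column.offsets.get(row + 1)?;
    (start..end)
        .find(|&i| column.keys[i] == key)
        .map(|i| column.values[i])
}
```

An empty map row simply repeats the previous offset, so it takes no space in the key and value vectors.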
Implementations
impl DataType
pub fn is_numeric(t: &DataType) -> bool
Returns true if this type is numeric: (UInt*, Int*, or Float*).
pub fn is_dictionary_key_type(t: &DataType) -> bool
Returns true if this type is valid as a dictionary key (e.g. super::ArrowDictionaryKeyType).
pub fn equals_datatype(&self, other: &DataType) -> bool
Compares the datatype with another, ignoring nested field names and metadata.
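The idea of comparing types while ignoring nested field names can be sketched with a tiny stand-in type tree. `MiniType`, `MiniField` and `equals_ignoring_names` are hypothetical and deliberately much smaller than the real enum. The sketch ignores only child names, and it makes no claim about how the crate treats metadata or nullability.

```rust
/// Stand-in for a small subset of data types (hypothetical, not arrow's enum).
#[derive(Debug, Clone, PartialEq)]
enum MiniType {
    Int32,
    Utf8,
    List(Box<MiniField>),
    Struct(Vec<MiniField>),
}

/// Stand-in for a named child field.
#[derive(Debug, Clone, PartialEq)]
struct MiniField {
    name: String,
    data_type: MiniType,
}

/// Structural comparison that looks through nested fields and skips their names.
fn equals_ignoring_names(a: &MiniType, b: &MiniType) -> bool {
    match (a, b) {
        (MiniType::List(x), MiniType::List(y)) => {
            equals_ignoring_names(&x.data_type, &y.data_type)
        }
        (MiniType::Struct(xs), MiniType::Struct(ys)) => {
            xs.len() == ys.len()
                && xs
                    .iter()
                    .zip(ys)
                    .all(|(x, y)| equals_ignoring_names(&x.data_type, &y.data_type))
        }
        // Leaf types, or mismatched variants, fall back to plain equality.
        _ => a == b,
    }
}
```

With this sketch, two structs whose children differ only in name compare unequal under `==` but equal under `equals_ignoring_names`.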
Trait Implementations
impl<'de> Deserialize<'de> for DataType
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error> where
__D: Deserializer<'de>,
Deserialize this value from the given Serde deserializer.
impl Ord for DataType
impl PartialOrd<DataType> for DataType
fn partial_cmp(&self, other: &DataType) -> Option<Ordering>
This method returns an ordering between self and other values if one exists.
fn lt(&self, other: &Rhs) -> bool
This method tests less than (for self and other) and is used by the < operator.
fn le(&self, other: &Rhs) -> bool
This method tests less than or equal to (for self and other) and is used by the <= operator.
impl TryFrom<&'_ DataType> for FFI_ArrowSchema
impl TryFrom<&'_ FFI_ArrowSchema> for DataType
fn try_from(c_schema: &FFI_ArrowSchema) -> Result<Self>
type Error = ArrowError
The type returned in the event of a conversion error.
impl TryFrom<DataType> for FFI_ArrowSchema
impl Eq for DataType
impl StructuralEq for DataType
impl StructuralPartialEq for DataType
Auto Trait Implementations
impl RefUnwindSafe for DataType
impl Send for DataType
impl Sync for DataType
impl Unpin for DataType
impl UnwindSafe for DataType
Blanket Implementations
impl<T> BorrowMut<T> for T where
T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<Q, K> Equivalent<K> for Q where
Q: Eq + ?Sized,
K: Borrow<Q> + ?Sized,
fn equivalent(&self, key: &K) -> bool
Compare self to key and return true if they are equal.