Elasticsearch Core Types
An implementation of Elasticsearch data types and document mapping.
This library provides tools for defining and using Elasticsearch type mappings, where correctness is enforced by Rust’s type system. The mapping information is then used when serialising and deserialising your types. Annotating type fields with mapping metadata has no impact at runtime.
This library makes extensive use of serde.
§Supported Versions
elastic_types | Elasticsearch |
---|---|
0.x | 5.x |
§Usage
This crate is on crates.io.
To get started, add elastic_types and elastic_types_derive to your Cargo.toml:
[dependencies]
elastic_types = "*"
elastic_types_derive = "*"
And reference it in your crate root:
#[macro_use]
extern crate elastic_types_derive;
extern crate elastic_types;
§Map Your Types
Derive ElasticType on your Elasticsearch-mappable types:
#[derive(Serialize, ElasticType)]
pub struct MyType {
    pub my_date: Date<DefaultDateMapping>,
    pub my_num: i32
}
You can then serialise your mapping as json using an IndexDocumentMapping wrapper:
let mapping = serde_json::to_string(&MyType::index_mapping()).unwrap();
This will produce the following result:
{
    "properties": {
        "my_date": {
            "type": "date",
            "format": "basic_date_time"
        },
        "my_num": {
            "type": "integer"
        }
    }
}
§Mapping structs as fields
Of course, structs that derive ElasticType can also be used as fields in other Elasticsearch types:
#[derive(Serialize, Deserialize, ElasticType)]
pub struct MyOtherType {
    pub my_type: MyType
}
Our mapping for MyOtherType then looks like:
{
    "properties": {
        "my_type": {
            "type": "nested",
            "properties": {
                "my_date": {
                    "type": "date",
                    "format": "basic_date_time"
                },
                "my_num": {
                    "type": "integer"
                }
            }
        }
    }
}
§Mapping Option and Vec
Elasticsearch mappings don’t distinguish between a single value, a nullable value, and a collection of values, so it’s also possible to derive a mapping for Option or Vec types:
#[derive(Serialize, Deserialize, ElasticType)]
pub struct MyType {
    pub my_date: Option<Date<DefaultDateMapping>>,
    pub my_num: Vec<i32>
}
This produces the same mapping as before.
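One way to picture why Option and Vec map the same way as the value they wrap is a pair of blanket implementations that delegate to the inner type. The sketch below uses only std; the FieldTypeSketch trait and its es_type function are hypothetical names chosen for illustration and are not the crate's real API.

```rust
// Hypothetical trait for illustration only; not part of elastic_types.
trait FieldTypeSketch {
    /// Name of the Elasticsearch datatype this Rust type maps to.
    fn es_type() -> &'static str;
}

impl FieldTypeSketch for i32 {
    fn es_type() -> &'static str { "integer" }
}

impl FieldTypeSketch for bool {
    fn es_type() -> &'static str { "boolean" }
}

// A nullable field maps exactly like the value it wraps.
impl<T: FieldTypeSketch> FieldTypeSketch for Option<T> {
    fn es_type() -> &'static str { T::es_type() }
}

// A collection maps exactly like its element type.
impl<T: FieldTypeSketch> FieldTypeSketch for Vec<T> {
    fn es_type() -> &'static str { T::es_type() }
}

fn main() {
    println!("{}", <Vec<i32> as FieldTypeSketch>::es_type());
    println!("{}", <Option<bool> as FieldTypeSketch>::es_type());
}
```

Because the delegation is generic, nested wrappers such as Option<Vec<i32>> resolve to the element's datatype as well.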
See the document module for more details.
§Overriding default mapping
You can override the default mapping for Elasticsearch’s core datatypes by implementing the appropriate trait. In the example below, we create a custom boolean mapping:
#[derive(Default)]
struct MyMapping;
impl BooleanMapping for MyMapping {
    fn boost() -> Option<f32> { Some(1.04) }
}
For more details about the supported core datatypes and how to use them, see here.
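The override pattern relies on trait methods that have default bodies, so a custom mapping only spells out what it changes. The following std-only sketch shows the idea under stated assumptions: BoolMappingSketch, its index option, and the hand-written render function are all hypothetical and do not reflect how the crate defines BooleanMapping or serialises it.

```rust
// Hypothetical stand-in for a mapping trait; not the crate's definition.
trait BoolMappingSketch {
    /// Field-level boost; `None` means the option is left out of the mapping.
    fn boost() -> Option<f32> { None }
    /// Whether the field is indexed; `None` means the option is left out.
    fn index() -> Option<bool> { None }
}

// Accepts every default.
struct DefaultSketch;
impl BoolMappingSketch for DefaultSketch {}

// Overrides a single option and inherits the rest.
struct Boosted;
impl BoolMappingSketch for Boosted {
    fn boost() -> Option<f32> { Some(1.04) }
}

/// Render the mapping as JSON by hand, skipping options that are unset.
fn render<M: BoolMappingSketch>() -> String {
    let mut out = String::from("{\"type\":\"boolean\"");
    if let Some(b) = M::boost() {
        out.push_str(&format!(",\"boost\":{}", b));
    }
    if let Some(i) = M::index() {
        out.push_str(&format!(",\"index\":{}", i));
    }
    out.push('}');
    out
}

fn main() {
    println!("{}", render::<DefaultSketch>());
    println!("{}", render::<Boosted>());
}
```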
§Serialise Your Types
Types that derive ElasticType are themselves serialisable, which can be very helpful when using types like date with special formats.
Take the following document:
{
    "id": 15,
    "timestamp": 1435935302478,
    "title": "my timestamped object"
}
Using the Date<DefaultDateMapping<EpochMillis>> type for the timestamp, we can correctly deserialise the document as a strongly typed object:
#[derive(Serialize, Deserialize, ElasticType)]
struct MyType {
    id: i32,
    timestamp: Timestamp,
    title: String
}
type Timestamp = Date<DefaultDateMapping<EpochMillis>>;
// `json` is a &str holding the document shown above
let de: MyType = serde_json::from_str(json).unwrap();
assert_eq!(2015, de.timestamp.year());
§A Complete Example
Before digging into the API, consider the following complete example for defining and mapping a type called Article.
As json, the Article type should look something like this:
{
    "id": 1,
    "title": "An article",
    "content": "Some prose for this article.",
    "timestamp": 1435935302478,
    "geoip": {
        "ip": "10.0.0.1",
        "loc": [ -71.34, 41.12 ]
    }
}
Our Cargo.toml specifies the dependencies as above:
[dependencies]
elastic_types = "*"
elastic_types_derive = "*"
And our main.rs contains the following:
#[macro_use]
extern crate serde_derive;
extern crate serde_json;
extern crate serde;
#[macro_use]
extern crate elastic_types_derive;
#[macro_use]
extern crate elastic_types;

use elastic_types::prelude::*;

// Our main datatype, `article`
#[derive(Serialize, Deserialize, ElasticType)]
struct Article {
    pub id: i32,
    pub title: String,
    pub content: Text<ContentMapping>,
    pub timestamp: Option<Date<TimestampMapping>>,
    pub geoip: GeoIp
}

// A second datatype, `geoip`
#[derive(Serialize, Deserialize, ElasticType)]
struct GeoIp {
    pub ip: ::std::net::Ipv4Addr,
    pub loc: GeoPoint<DefaultGeoPointMapping>
}

// Mappings for our datatype fields
#[derive(Default)]
struct ContentMapping;
impl TextMapping for ContentMapping {
    fn analyzer() -> Option<&'static str> {
        Some("content_text")
    }
}

#[derive(Default)]
struct TimestampMapping;
impl DateMapping for TimestampMapping {
    type Format = EpochMillis;
    fn null_value() -> Option<Date<Self>> {
        Some(Date::now())
    }
}

fn main() {
    println!("\"{}\":{{ {} }}",
        Article::name(),
        serde_json::to_string(&Article::index_mapping()).unwrap()
    );
}
The above example defines a struct called Article with a few fields:
- A default integer field called id
- A default string field called title
- A text field with a custom analyser called content
- A date field with the epoch_millis format that defaults to the time the index was created called timestamp
- An object field called GeoIp with default ip and geo_point fields.
Go ahead and run that sample and see what it outputs. In case you’re interested, it’ll look something like this (minus the whitespace):
"article": {
    "properties": {
        "id": {
            "type": "integer"
        },
        "title": {
            "type": "text",
            "fields": {
                "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                }
            }
        },
        "content": {
            "type": "text",
            "analyzer": "content_text"
        },
        "timestamp": {
            "type": "date",
            "format": "epoch_millis",
            "null_value": "1435935302478"
        },
        "geoip": {
            "type": "nested",
            "properties": {
                "ip": {
                    "type": "ip"
                },
                "loc": {
                    "type": "geo_point"
                }
            }
        }
    }
}
The mapping is constructed by inspecting the type parameters of the fields on Article and GeoIp at compile-time.
This mapping is then serialised by serde at runtime.
§Types
Types in Elasticsearch are a combination of source and mapping.
The source is the data (like 42 or "my string") and the mapping is metadata about how to interpret and use the data (like the format of a date string).
The approach elastic_types takes to types is to bundle the mapping up as a zero-sized type, which is then bound to a field type as a generic parameter. For example:
Boolean<MyMapping>
The source type is boolean and the mapping is MyMapping.
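To make the zero-sized mapping idea concrete, here is a std-only sketch of a value wrapper that carries its mapping purely at the type level. MappingSketch and BooleanSketch are hypothetical names; the crate's actual Boolean type may be structured differently.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

// Hypothetical names throughout; this is not the crate's real code.
trait MappingSketch {
    fn boost() -> Option<f32> { None }
}

// No fields, so the type occupies zero bytes.
struct MyMapping;
impl MappingSketch for MyMapping {
    fn boost() -> Option<f32> { Some(1.04) }
}

/// The source value plus a mapping that exists only in the type.
struct BooleanSketch<M: MappingSketch> {
    value: bool,
    _mapping: PhantomData<M>,
}

impl<M: MappingSketch> BooleanSketch<M> {
    fn new(value: bool) -> Self {
        BooleanSketch { value, _mapping: PhantomData }
    }

    /// Mapping metadata is read from the type, not from an instance.
    fn boost() -> Option<f32> { M::boost() }
}

fn main() {
    let b = BooleanSketch::<MyMapping>::new(true);
    println!("value = {}", b.value);
    println!("boost = {:?}", BooleanSketch::<MyMapping>::boost());
    println!(
        "wrapper: {} byte(s), bool: {} byte(s)",
        size_of::<BooleanSketch<MyMapping>>(),
        size_of::<bool>()
    );
}
```

The wrapper has the same size as a plain bool, which is the sense in which attaching a mapping adds no per-value storage.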
All document types implement DocumentType with an associated Mapping: DocumentMapping type.
The following table illustrates the types provided by elastic_types:
Elasticsearch Type | Rust Type (Default Mapping) | Crate | Rust Type (Custom Mapping) | Format Type |
---|---|---|---|---|
object | - | - | type implementing DocumentType<M> | - |
integer | i32 | std | Integer<M> | - |
long | i64 | std | Long<M> | - |
short | i16 | std | Short<M> | - |
byte | i8 | std | Byte<M> | - |
float | f32 | std | Float<M> | - |
double | f64 | std | Double<M> | - |
keyword | - | - | Keyword<M> | - |
text | String | std | Text<M> | - |
boolean | bool | std | Boolean<M> | - |
ip | Ipv4Addr | std | Ip<M> | - |
date | DateTime<Utc> | chrono | Date<M> | DateFormat |
geo_point | Point | geo | GeoPoint<M> | GeoPointFormat |
geo_shape | - | geojson | GeoShape<M> | - |
§Mapping
Having the mapping available at compile-time captures the fact that a mapping is static and tied to the data type.
Where there’s a std type that’s equivalent to an Elasticsearch type (like i32 for integer), a default mapping is implemented for that type.
That means you can use primitives in your structs and have them mapped to the correct type in Elasticsearch.
If you want to provide your own mapping for a std type, there’s also a struct provided by elastic_types that wraps the std type but also takes an explicit mapping (like Integer, which implements Deref<Target = i32>).
Where there isn’t a std type available (like date), an external crate is used and an implementation of that type is provided (like Date, which implements Deref<Target = chrono::DateTime<Utc>>).
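The practical effect of implementing Deref on a wrapper is that it can be used almost anywhere the underlying std value can. A minimal sketch, assuming a hypothetical IntegerSketch type rather than the crate's own Integer:

```rust
use std::marker::PhantomData;
use std::ops::Deref;

// Hypothetical wrapper for illustration; the crate's `Integer` may differ.
struct IntegerSketch<M> {
    value: i32,
    _mapping: PhantomData<M>,
}

impl<M> IntegerSketch<M> {
    fn new(value: i32) -> Self {
        IntegerSketch { value, _mapping: PhantomData }
    }
}

// Dereferencing exposes the wrapped std value.
impl<M> Deref for IntegerSketch<M> {
    type Target = i32;
    fn deref(&self) -> &i32 { &self.value }
}

// A placeholder mapping type; it carries no data.
struct MyIntegerMapping;

fn main() {
    let n = IntegerSketch::<MyIntegerMapping>::new(-7);
    // `abs` is defined on `i32` and is reached through auto-deref.
    println!("{}", n.abs());
    // An explicit deref yields the plain `i32` for arithmetic.
    println!("{}", *n + 1);
}
```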
§Formats
For some types (like Date), it’s helpful to have an extra generic parameter that describes the way the data can be interpreted. For most types the format isn’t exposed, because there aren’t any alternative formats available.
This is a particularly helpful feature for serialisation.
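As a rough illustration of why a format parameter helps with serialisation, the sketch below stores one internal representation (milliseconds since the epoch) and lets the format type decide how it is read and written. FormatSketch, DateSketch, and the two format structs are hypothetical; they are named after the Elasticsearch epoch_millis and epoch_second formats but are not the crate's DateFormat API.

```rust
use std::marker::PhantomData;

// Hypothetical trait: a format knows how to parse and print a date.
trait FormatSketch {
    fn name() -> &'static str;
    /// Parse the source text into milliseconds since the epoch.
    fn parse(raw: &str) -> Option<i64>;
    /// Write milliseconds since the epoch back out as source text.
    fn format(millis: i64) -> String;
}

struct EpochMillisSketch;
impl FormatSketch for EpochMillisSketch {
    fn name() -> &'static str { "epoch_millis" }
    fn parse(raw: &str) -> Option<i64> { raw.parse().ok() }
    fn format(millis: i64) -> String { millis.to_string() }
}

struct EpochSecondSketch;
impl FormatSketch for EpochSecondSketch {
    fn name() -> &'static str { "epoch_second" }
    fn parse(raw: &str) -> Option<i64> {
        raw.parse::<i64>().ok().map(|secs| secs * 1000)
    }
    fn format(millis: i64) -> String { (millis / 1000).to_string() }
}

/// One internal representation; the format is chosen by the type parameter.
struct DateSketch<F> {
    millis: i64,
    _format: PhantomData<F>,
}

impl<F: FormatSketch> DateSketch<F> {
    fn parse(raw: &str) -> Option<Self> {
        F::parse(raw).map(|millis| DateSketch { millis, _format: PhantomData })
    }

    fn to_source(&self) -> String { F::format(self.millis) }
}

fn main() {
    let a = DateSketch::<EpochMillisSketch>::parse("1435935302478").unwrap();
    let b = DateSketch::<EpochSecondSketch>::parse("1435935302").unwrap();
    println!("{}: {} -> {}", EpochMillisSketch::name(), a.millis, a.to_source());
    println!("{}: {} -> {}", EpochSecondSketch::name(), b.millis, b.to_source());
}
```

Picking the format through a type parameter means a mismatch between the declared mapping format and the serialised form cannot be introduced at a call site.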
Modules§
- Implementation of the Elasticsearch boolean types.
- Implementation of the Elasticsearch date type.
- Base requirements for indexable document mappings.
- Implementation of the Elasticsearch geo types.
- Implementation of the Elasticsearch ip type.
- Implementation of the Elasticsearch number types.
- Includes all data types.
- Implementation of the Elasticsearch keyword and text types.