Elasticsearch Core Types
An implementation of Elasticsearch data types and document mapping.
This library provides tools for defining and using Elasticsearch type mappings, where correctness is enforced by Rust's type system. The mapping information is then used when serialising and deserialising your types. Annotating type fields with mapping metadata has no impact at runtime.
This library makes extensive use of serde
.
Supported Versions
elastic_types |
Elasticsearch |
---|---|
0.x |
5.x |
Usage
This crate is on crates.io.
There are two ways to reference elastic_types
in your projects, depending on whether you're on
the stable
/beta
or nightly
channels.
Builds on nightly
benefit from compile-time codegen for better performance and easier
mapping definitions.
Nightly
To get started, add elastic_types
and elastic_types_derive
to your Cargo.toml
:
[dependencies]
elastic_types = { version = "*", features = "nightly" }
elastic_types_derive = "*"
And reference it in your crate root:
#![feature(proc_macro)]
#[macro_use]
extern crate elastic_types_derive;
#[macro_use]
extern crate elastic_types;
Stable
To get started, add elastic_types
to your Cargo.toml
:
[dependencies]
elastic_types = "*"
And reference it in your crate root:
#[macro_use]
extern crate elastic_types;
Map Your Types
This section shows you how to add mapping metadata on the nightly
channel.
For mapping on stable
, see here.
NOTE: Once Macros 1.1 is stabilised, you'll be able to use
elastic_types_derive
on thestable
channel.
Derive ElasticType
on your Elasticsearch-mappable types:
# #![feature(proc_macro)]
# #[macro_use]
# extern crate json_str;
# #[macro_use]
# extern crate serde_derive;
# #[macro_use]
# extern crate elastic_types_derive;
# #[macro_use]
# extern crate elastic_types;
# extern crate serde;
# use elastic_types::prelude::*;
#[derive(Serialize, ElasticType)]
pub struct MyType {
pub my_date: Date<DefaultDateFormat>,
pub my_num: i32
}
# fn main() {
# }
You can then serialise your mapping as json using a Document
wrapper:
# #![feature(proc_macro)]
# #[macro_use]
# extern crate json_str;
# #[macro_use]
# extern crate serde_derive;
# #[macro_use]
# extern crate elastic_types_derive;
# #[macro_use]
# extern crate elastic_types;
# extern crate serde_json;
# extern crate serde;
# use elastic_types::prelude::*;
# #[derive(Serialize, Deserialize, ElasticType)]
# pub struct MyType {
# pub my_date: Date<DefaultDateFormat>,
# pub my_string: String,
# pub my_num: i32
# }
# fn main() {
let mapping = serde_json::to_string(&Document::from(MyType::mapping())).unwrap();
# }
This will produce the following result:
# #![feature(proc_macro)]
# #[macro_use]
# extern crate json_str;
# #[macro_use]
# extern crate serde_derive;
# #[macro_use]
# extern crate elastic_types_derive;
# #[macro_use]
# extern crate elastic_types;
# extern crate serde_json;
# extern crate serde;
# use elastic_types::prelude::*;
# #[derive(Serialize, Deserialize, ElasticType)]
# pub struct MyType {
# pub my_date: Date<DefaultDateFormat>,
# pub my_num: i32
# }
# fn main() {
# let mapping = serde_json::to_string(&Document::from(MyType::mapping())).unwrap();
# let json = json_str!(
{
"properties": {
"my_date": {
"type": "date",
"format": "basic_date_time"
},
"my_num": {
"type": "integer"
}
}
}
# );
# assert_eq!(json, mapping);
# }
Mapping structs as fields
Of course, structs that derive ElasticType
can also be used as fields in other Elasticsearch types:
# #![feature(proc_macro)]
# #[macro_use]
# extern crate json_str;
# #[macro_use]
# extern crate serde_derive;
# #[macro_use]
# extern crate elastic_types_derive;
# #[macro_use]
# extern crate elastic_types;
# extern crate serde;
# use elastic_types::prelude::*;
# #[derive(Serialize, Deserialize, ElasticType)]
# pub struct MyType {
# pub my_date: Date<DefaultDateFormat>,
# pub my_num: i32
# }
#[derive(Serialize, Deserialize, ElasticType)]
pub struct MyOtherType {
pub my_type: MyType
}
# fn main() {
# }
Our mapping for MyOtherType
then looks like:
# #![feature(proc_macro)]
# #[macro_use]
# extern crate json_str;
# #[macro_use]
# extern crate serde_derive;
# #[macro_use]
# extern crate elastic_types_derive;
# #[macro_use]
# extern crate elastic_types;
# extern crate serde_json;
# extern crate serde;
# use elastic_types::prelude::*;
# #[derive(Serialize, Deserialize, ElasticType)]
# pub struct MyType {
# pub my_date: Date<DefaultDateFormat>,
# pub my_num: i32
# }
# #[derive(Serialize, Deserialize, ElasticType)]
# pub struct MyOtherType {
# pub my_type: MyType
# }
# fn main() {
# let mapping = serde_json::to_string(&Document::from(MyOtherType::mapping())).unwrap();
# let json = json_str!(
{
"properties": {
"my_type": {
"type": "nested",
"properties": {
"my_date": {
"type": "date",
"format": "basic_date_time"
},
"my_num": {
"type": "integer"
}
}
}
}
}
# );
# assert_eq!(json, mapping);
# }
Mapping Option
and Vec
Elasticsearch doesn't differentiate between nullable types or collections, so it's also possible
to derive mapping from Option
or Vec
types:
# #![feature(proc_macro)]
# #[macro_use]
# extern crate json_str;
# #[macro_use]
# extern crate serde_derive;
# #[macro_use]
# extern crate elastic_types_derive;
# #[macro_use]
# extern crate elastic_types;
# extern crate serde;
# use elastic_types::prelude::*;
#[derive(Serialize, Deserialize, ElasticType)]
pub struct MyType {
pub my_date: Option<Date<DefaultDateFormat>>,
pub my_num: Vec<i32>
}
# fn main() {
# }
This produces the same mapping as before.
See the object
mod for more details.
Overloading default mapping
You can override the default mapping for Elasticsearch's core datatypes by implementing
the appropriate trait. In the below example, we create a custom boolean
mapping:
# #![feature(proc_macro)]
# #[macro_use]
# extern crate json_str;
# #[macro_use]
# extern crate serde_derive;
# #[macro_use]
# extern crate elastic_types_derive;
# #[macro_use]
# extern crate elastic_types;
# extern crate serde;
# use elastic_types::prelude::*;
#[derive(Default)]
struct MyMapping;
impl BooleanMapping for MyMapping {
fn boost() -> Option<f32> { Some(1.04) }
}
# fn main() {
# }
For more details about the supported core datatypes and how to use them, see here.
Serialise Your Types
Types that derive ElasticType
are themselves serialisable, which can be very helpful when using
types like date
with special formats.
Take the following document:
{
"id": 15,
"timestamp": 1435935302478,
"title": "my timestamped object"
}
Using the Date<EpochMillis>
type for the timestamp
, we can correctly deserialise the document as a strongly typed
object:
# #![feature(proc_macro)]
# #[macro_use]
# extern crate json_str;
# #[macro_use]
# extern crate serde_derive;
# #[macro_use]
# extern crate elastic_types_derive;
# #[macro_use]
# extern crate elastic_types;
# extern crate serde;
# extern crate serde_json;
# use elastic_types::prelude::*;
#[derive(Serialize, Deserialize, ElasticType)]
struct MyType {
id: i32,
timestamp: Date<EpochMillis>,
title: String
}
# fn main() {
# let json = "{\"id\": 15,\"timestamp\": 1435935302478,\"title\": \"my timestamped object\"}";
let de: MyType = serde_json::from_str(json).unwrap();
assert_eq!(2015, de.timestamp.year());
# }
A Complete Example
Before digging in to the API, consider the following complete example for defining and mapping a
type called Article
on nightly
.
As json
, the Article
type should look something like this:
{
"id": 1,
"title": "An article",
"content": "Some prose for this article.",
"timestamp": 1435935302478,
"geoip": {
"ip": "10.0.0.1",
"loc": [ -71.34, 41.12 ]
}
}
Our Cargo.toml
specifies the dependencies as above:
[dependencies]
elastic_types = { version = "*", features = "nightly" }
elastic_types_derive = "*"
And our main.rs
contains the following:
#![feature(proc_macro)]
#[macro_use]
extern crate serde_derive;
extern crate serde_json;
extern crate serde;
#[macro_use]
extern crate elastic_types_derive;
#[macro_use]
extern crate elastic_types;
use elastic_types::prelude::*;
// Our main datatype, `article`
#[derive(Serialize, Deserialize, ElasticType)]
struct Article {
pub id: i32,
pub title: String,
pub content: Text<ContentMapping>,
pub timestamp: Option<Date<EpochMillis, TimestampMapping>>,
pub geoip: GeoIp
}
// A second datatype, `geoip`
#[derive(Serialize, Deserialize, ElasticType)]
struct GeoIp {
pub ip: ::std::net::Ipv4Addr,
pub loc: GeoPoint<DefaultGeoPointFormat>
}
// Mappings for our datatype fields
#[derive(Default)]
struct ContentMapping;
impl TextMapping for ContentMapping {
fn analyzer() -> Option<&'static str> {
Some("content_text")
}
}
#[derive(Default)]
struct TimestampMapping;
impl DateMapping for TimestampMapping {
type Format = EpochMillis;
fn null_value() -> Option<Date<EpochMillis, Self>> {
Some(Date::now())
}
}
fn main() {
println!("\"{}\":{{ {} }}",
Article::name(),
serde_json::to_string(&Document::from(Article::mapping())).unwrap()
);
}
The above example defines a struct
called Article
with a few fields:
- A default
integer
field calledid
- A default
string
field calledtitle
- A
text
field with a custom analyser calledcontent
- A
date
field with theepoch_millis
format that defaults to the time the index was created calledtimestamp
- An object field called
GeoIp
with defaultip
andgeo_point
fields.
Go ahead and run that sample and see what it outputs. In case you're interested, it'll look something like this (minus the whitespace):
"article": {
"properties": {
"id":{
"type": "integer"
},
"title": {
"type":"text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"content": {
"type": "text",
"analyzer": "content_text"
},
"timestamp": {
"type": "date",
"format": "epoch_millis",
"null_value": "1435935302478"
},
"geoip": {
"type": "nested",
"properties": {
"ip": {
"type": "ip"
},
"loc": {
"type": "geo_point"
}
}
}
}
}
The mapping is constructed by inspecting the type parameters of the fields on Article
and GeoIp
at compile-time.
This mapping is then serialised by serde
at runtime.
Types
Types in Elasticsearch are a combination of source and mapping.
The source is the data (like 42
or "my string"
) and the mapping is metadata about how to
interpret and use the data (like the format of a date string).
The approach elastic_types
takes to types is to bundle the mapping up as a Zero Sized Type,
which is then bound to a field type as a generic parameter. For example:
Boolean<MyMapping>
The source type is boolean
and the mapping is MyMapping
.
All Elasticsearch types implement the base FieldType<M: FieldMapping<F>, F>
trait
where M
is the mapping and F
is a type-specific format.
The following table illustrates the types provided by elastic_types
:
Elasticsearch Type | Rust Type (Default Mapping) | Crate | Rust Type (Custom Mapping) | Format Type |
---|---|---|---|---|
object |
- | - | type implementing DocumentType<M> |
- |
integer |
i32 |
std |
Integer<M> |
- |
long |
i64 |
std |
Long<M> |
- |
short |
i16 |
std |
Short<M> |
- |
byte |
i8 |
std |
Byte<M> |
- |
float |
f32 |
std |
Float<M> |
- |
double |
f64 |
std |
Double<M> |
- |
keyword |
- | - | Keyword<M> |
- |
text |
String |
std |
Text<M> |
- |
boolean |
bool |
std |
Boolean<M> |
- |
ip |
Ipv4Addr |
std |
Ip<M> |
- |
date |
DateTime<UTC> |
chrono |
Date<F, M> |
DateFormat |
geo_point |
Point |
geo |
GeoPoint<F, M> |
GeoPointFormat |
geo_shape |
- | geojson |
GeoShape<M> |
- |
Mapping
Having the mapping available at compile-time captures the fact that a mapping is static and tied to the data type.
Where there's a std
type that's equivalent to an Elasticsearch type (like i32
for integer
),
a default mapping is implemented for that type.
That means you can use primitives in your structs and have them mapped to the correct type in Elasticsearch.
If you want to provide your own mapping for a std
type, there's also a struct provided by elastic_types
that wraps the std
type but also takes an explicit mapping (like Integer
which implements Deref<Target = i32>
).
Where there isn't a std
type available (like date
), an external crate is used and an implementation of
that type is provided (like Date
, which implements Deref<Target = chrono::DateTime<UTC>>
).
Formats
For some types (like Date
), it's helpful to have an extra generic parameter that describes the
way the data can be interpreted. For most types the format isn't exposed, because there aren't any alternative formats available.
This is a particularly helpful feature for serialisation.