Crate elastic_types

Elasticsearch Core Types

An implementation of Elasticsearch data types and document mapping.

This library provides tools for defining and using Elasticsearch type mappings, where correctness is enforced by Rust's type system. The mapping information is then used when serialising and deserialising your types. Annotating type fields with mapping metadata has no impact at runtime.

This library makes extensive use of serde.

Supported Versions

elastic_types | Elasticsearch
0.x           | 5.x

Usage

This crate is on crates.io.

To get started, add elastic_types and elastic_types_derive to your Cargo.toml:

[dependencies]
elastic_types = "*"
elastic_types_derive = "*"

And reference it in your crate root:

#[macro_use]
extern crate elastic_types_derive;
extern crate elastic_types;

Map Your Types

Derive ElasticType on your Elasticsearch-mappable types:

#[derive(Serialize, ElasticType)]
pub struct MyType {
    pub my_date: Date<DefaultDateMapping>,
    pub my_num: i32
}

You can then serialise your mapping as json using an IndexDocumentMapping wrapper:

let mapping = serde_json::to_string(&MyType::index_mapping()).unwrap();

This will produce the following result:

{
    "properties": {
        "my_date": {
            "type": "date",
            "format": "basic_date_time"
        },
        "my_num": {
            "type": "integer"
        }
    }
}

Mapping structs as fields

Of course, structs that derive ElasticType can also be used as fields in other Elasticsearch types:

#[derive(Serialize, Deserialize, ElasticType)]
pub struct MyOtherType {
    pub my_type: MyType
}

Our mapping for MyOtherType then looks like:

{
    "properties": {
        "my_type": {
            "type": "nested",
            "properties": {
                "my_date": {
                    "type": "date",
                    "format": "basic_date_time"
                },
                "my_num": {
                    "type": "integer"
                }
            }
        }
    }
}
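
To make the nesting concrete, here is a minimal hand-written sketch of how a field's mapping can be composed from the mapping of the struct it contains. It does not use elastic_types or serde: the `Mapped` trait, `field_mapping` and `outer_index_mapping` are hypothetical names chosen for illustration, and the JSON is built as a plain string rather than serialised.

```rust
/// Hypothetical stand-in for the mapping information a derive could generate.
trait Mapped {
    /// Returns the mapping of this type, when used as a field, as a JSON fragment.
    fn field_mapping() -> String;
}

impl Mapped for i32 {
    fn field_mapping() -> String {
        r#"{"type":"integer"}"#.to_string()
    }
}

#[allow(dead_code)]
struct Inner {
    my_num: i32,
}

impl Mapped for Inner {
    // A struct used as a field embeds the mappings of its own fields.
    fn field_mapping() -> String {
        format!(
            r#"{{"type":"nested","properties":{{"my_num":{}}}}}"#,
            <i32 as Mapped>::field_mapping()
        )
    }
}

#[allow(dead_code)]
struct Outer {
    my_type: Inner,
}

/// Builds the top-level mapping for `Outer` by reusing `Inner`'s field mapping.
fn outer_index_mapping() -> String {
    format!(
        r#"{{"properties":{{"my_type":{}}}}}"#,
        <Inner as Mapped>::field_mapping()
    )
}
```

Calling `outer_index_mapping()` yields a `properties` object whose `my_type` entry contains the nested mapping for `Inner`, which mirrors the shape of the JSON above.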

Mapping Option and Vec

Elasticsearch doesn't differentiate between nullable types or collections, so it's also possible to derive mapping from Option or Vec types:

#[derive(Serialize, Deserialize, ElasticType)]
pub struct MyType {
    pub my_date: Option<Date<DefaultDateMapping>>,
    pub my_num: Vec<i32>
}

This produces the same mapping as before. See the document module for more details.
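
One way to picture why `Option` and `Vec` fields map the same as their inner type is a pair of blanket trait implementations that delegate to the wrapped type. The sketch below is an illustration only: the `FieldType` trait and its `es_type` function are hypothetical and are not taken from the crate.

```rust
/// Hypothetical trait naming the Elasticsearch field type for a Rust type.
trait FieldType {
    fn es_type() -> &'static str;
}

impl FieldType for i32 {
    fn es_type() -> &'static str {
        "integer"
    }
}

impl FieldType for bool {
    fn es_type() -> &'static str {
        "boolean"
    }
}

// A nullable field is mapped exactly like the value it may contain.
impl<T: FieldType> FieldType for Option<T> {
    fn es_type() -> &'static str {
        T::es_type()
    }
}

// A collection is mapped exactly like a single element.
impl<T: FieldType> FieldType for Vec<T> {
    fn es_type() -> &'static str {
        T::es_type()
    }
}
```

Because the implementations are generic, they also compose: under this sketch a `Vec<Option<bool>>` resolves to the same field type as a plain `bool`.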

Overriding default mapping

You can override the default mapping for Elasticsearch's core datatypes by implementing the appropriate trait. In the example below, we create a custom boolean mapping:

#[derive(Default)]
struct MyMapping;
impl BooleanMapping for MyMapping {
    fn boost() -> Option<f32> { Some(1.04) }
}

For more details about the supported core datatypes and how to use them, see the Types section below.
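
The override pattern relies on trait methods with default bodies: an implementor only writes the options it wants to change. The following is a rough, self-contained sketch of that idea. `BoolMapping`, `PlainMapping` and `describe` are hypothetical names, and leaving unset options out of the output is an assumption made for the illustration, not a description of the crate's serialiser.

```rust
/// Hypothetical mapping trait: every option has a default an implementor may override.
trait BoolMapping {
    fn boost() -> Option<f32> {
        None
    }
    fn index() -> Option<bool> {
        None
    }
}

/// Keeps every default.
struct PlainMapping;
impl BoolMapping for PlainMapping {}

/// Overrides a single option.
struct MyMapping;
impl BoolMapping for MyMapping {
    fn boost() -> Option<f32> {
        Some(1.04)
    }
}

/// Renders a mapping as JSON text, including only the options that are set.
fn describe<M: BoolMapping>() -> String {
    let mut parts = vec![r#""type":"boolean""#.to_string()];
    if let Some(boost) = M::boost() {
        parts.push(format!(r#""boost":{}"#, boost));
    }
    if let Some(index) = M::index() {
        parts.push(format!(r#""index":{}"#, index));
    }
    format!("{{{}}}", parts.join(","))
}
```

Here `describe::<PlainMapping>()` contains only the `type` entry, while `describe::<MyMapping>()` additionally carries the overridden `boost`.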

Serialise Your Types

Types that derive ElasticType are themselves serialisable, which can be very helpful when using types like date with special formats. Take the following document:

{
    "id": 15,
    "timestamp": 1435935302478,
    "title": "my timestamped object"
}

Using the Date<DefaultDateMapping<EpochMillis>> type for the timestamp, we can correctly deserialise the document as a strongly typed object:

#[derive(Serialize, Deserialize, ElasticType)]
struct MyType {
    id: i32,
    timestamp: Timestamp,
    title: String
}

type Timestamp = Date<DefaultDateMapping<EpochMillis>>;

// `json` is a &str holding the document shown above
let de: MyType = serde_json::from_str(json).unwrap();

assert_eq!(2015, de.timestamp.year());

A Complete Example

Before digging in to the API, consider the following complete example for defining and mapping a type called Article. As json, the Article type should look something like this:

{
    "id": 1,
    "title": "An article",
    "content": "Some prose for this article.",
    "timestamp": 1435935302478,
    "geoip": {
        "ip": "10.0.0.1",
        "loc": [ -71.34, 41.12 ]
    }
}

Our Cargo.toml specifies the dependencies as above:

[dependencies]
elastic_types = "*"
elastic_types_derive = "*"

And our main.rs contains the following:

#[macro_use]
extern crate serde_derive;
extern crate serde_json;
extern crate serde;

#[macro_use]
extern crate elastic_types_derive;
#[macro_use]
extern crate elastic_types;

use elastic_types::prelude::*;

// Our main datatype, `article`

#[derive(Serialize, Deserialize, ElasticType)]
struct Article {
    pub id: i32,
    pub title: String,
    pub content: Text<ContentMapping>,
    pub timestamp: Option<Date<TimestampMapping>>,
    pub geoip: GeoIp
}


// A second datatype, `geoip`

#[derive(Serialize, Deserialize, ElasticType)]
struct GeoIp {
    pub ip: ::std::net::Ipv4Addr,
    pub loc: GeoPoint<DefaultGeoPointMapping>
}


// Mappings for our datatype fields

#[derive(Default)]
struct ContentMapping;
impl TextMapping for ContentMapping {
    fn analyzer() -> Option<&'static str> {
        Some("content_text")
    }
}

#[derive(Default)]
struct TimestampMapping;
impl DateMapping for TimestampMapping {
    type Format = EpochMillis;

    fn null_value() -> Option<Date<Self>> {
        Some(Date::now())
    }
}

fn main() {
    println!("\"{}\":{{ {} }}",
        Article::name(),
        serde_json::to_string(&Article::index_mapping()).unwrap()
    );
}

The above example defines a struct called Article with a few fields:

  • A default integer field called id
  • A default string field called title
  • A text field called content with a custom analyser
  • A date field called timestamp with the epoch_millis format, whose null value defaults to the time the index was created
  • An object field called geoip, of type GeoIp, with default ip and geo_point fields.

Go ahead and run that sample and see what it outputs. In case you're interested, it'll look something like this (minus the whitespace):

"article": {
    "properties": {
        "id":{
            "type": "integer"
        },
        "title": {
            "type":"text",
            "fields": {
                "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                }
            }
        },
        "content": {
            "type": "text",
            "analyzer": "content_text"
        },
        "timestamp": {
            "type": "date",
            "format": "epoch_millis",
            "null_value": "1435935302478"
        },
        "geoip": {
            "type": "nested",
            "properties": {
                "ip": {
                    "type": "ip"
                },
                "loc": {
                    "type": "geo_point"
                }
            }
        }
    }
}

The mapping is constructed by inspecting the type parameters of the fields on Article and GeoIp at compile-time. This mapping is then serialised by serde at runtime.

Types

Types in Elasticsearch are a combination of source and mapping. The source is the data (like 42 or "my string") and the mapping is metadata about how to interpret and use the data (like the format of a date string).

The approach elastic_types takes to types is to bundle the mapping up as a Zero Sized Type, which is then bound to a field type as a generic parameter. For example:

Boolean<MyMapping>

The source type is boolean and the mapping is MyMapping.
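
As a rough illustration of the zero-sized-type approach, the sketch below binds a mapping to a wrapper through `PhantomData` and checks that the wrapper is no larger than the value it holds. The `Mapping` trait, this `Boolean` wrapper and `overhead_in_bytes` are hypothetical, written from scratch for the example; they are not the crate's definitions.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

/// Hypothetical mapping trait; implementors carry no data of their own.
trait Mapping {
    fn es_type() -> &'static str;
}

/// A unit struct has no fields, so it occupies zero bytes.
struct MyMapping;

impl Mapping for MyMapping {
    fn es_type() -> &'static str {
        "boolean"
    }
}

/// Hypothetical field wrapper: the source value plus a mapping that exists
/// only in the type system.
struct Boolean<M: Mapping> {
    value: bool,
    _mapping: PhantomData<M>,
}

impl<M: Mapping> Boolean<M> {
    fn new(value: bool) -> Self {
        Boolean { value, _mapping: PhantomData }
    }

    fn get(&self) -> bool {
        self.value
    }

    /// The mapping is reached through the type parameter, not through stored data.
    fn es_type(&self) -> &'static str {
        M::es_type()
    }
}

/// Extra bytes the wrapper adds on top of a bare `bool`.
fn overhead_in_bytes() -> usize {
    size_of::<Boolean<MyMapping>>() - size_of::<bool>()
}
```

In this sketch the mapping is looked up through the generic parameter at compile time, so the wrapper adds no bytes to the stored value.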

All document types implement DocumentType with an associated Mapping: DocumentMapping type.

The following table illustrates the types provided by elastic_types:

Elasticsearch Type | Rust Type (Default Mapping) | Crate   | Rust Type (Custom Mapping)        | Format Type
object             | -                           | -       | type implementing DocumentType<M> | -
integer            | i32                         | std     | Integer<M>                        | -
long               | i64                         | std     | Long<M>                           | -
short              | i16                         | std     | Short<M>                          | -
byte               | i8                          | std     | Byte<M>                           | -
float              | f32                         | std     | Float<M>                          | -
double             | f64                         | std     | Double<M>                         | -
keyword            | -                           | -       | Keyword<M>                        | -
text               | String                      | std     | Text<M>                           | -
boolean            | bool                        | std     | Boolean<M>                        | -
ip                 | Ipv4Addr                    | std     | Ip<M>                             | -
date               | DateTime<Utc>               | chrono  | Date<M>                           | DateFormat
geo_point          | Point                       | geo     | GeoPoint<M>                       | GeoPointFormat
geo_shape          | -                           | geojson | GeoShape<M>                       | -

Mapping

Having the mapping available at compile-time captures the fact that a mapping is static and tied to the data type.

Where there's a std type that's equivalent to an Elasticsearch type (like i32 for integer), a default mapping is implemented for that type. That means you can use primitives in your structs and have them mapped to the correct type in Elasticsearch. If you want to provide your own mapping for a std type, there's also a struct provided by elastic_types that wraps the std type but also takes an explicit mapping (like Integer which implements Deref<Target = i32>).

Where there isn't a std type available (like date), an external crate is used and an implementation of that type is provided (like Date, which implements Deref<Target = chrono::DateTime<Utc>>).
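
The wrapper-with-`Deref` idea can be sketched in a few lines. The `Integer` type and `MyIntegerMapping` below are hypothetical re-creations for illustration rather than the crate's own definitions; the point is only that a wrapper carrying an explicit mapping can still be used wherever an `i32` reference is expected.

```rust
use std::marker::PhantomData;
use std::ops::Deref;

/// Hypothetical mapping marker with no data.
struct MyIntegerMapping;

/// Hypothetical wrapper pairing an `i32` with an explicit mapping type.
struct Integer<M> {
    value: i32,
    _mapping: PhantomData<M>,
}

impl<M> Integer<M> {
    fn new(value: i32) -> Self {
        Integer { value, _mapping: PhantomData }
    }
}

// Dereferencing exposes the wrapped std value.
impl<M> Deref for Integer<M> {
    type Target = i32;

    fn deref(&self) -> &i32 {
        &self.value
    }
}

/// An ordinary function that knows nothing about mappings.
fn double(n: &i32) -> i32 {
    *n * 2
}
```

With this in place, `double(&wrapped)` compiles through deref coercion, and `i32` methods such as `pow` can be called directly on the wrapper.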

Formats

For some types (like Date), it's helpful to have an extra generic parameter that describes the way the data can be interpreted. For most types the format isn't exposed, because there aren't any alternative formats available. This is a particularly helpful feature for serialisation.
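
To show how a format parameter can change the way the same kind of value is read, here is a small sketch using only the standard library. The `Format` trait, the `Millis` and `Seconds` markers and the `Timestamp` wrapper are all hypothetical names for this example. The only facts borrowed from Elasticsearch are the built-in format names `epoch_millis` and `epoch_second`.

```rust
use std::marker::PhantomData;

/// Hypothetical format trait: names the format and parses source text.
trait Format {
    fn name() -> &'static str;
    /// Parses source text into milliseconds since the Unix epoch.
    fn parse(source: &str) -> Option<i64>;
}

/// Source text is already in milliseconds.
struct Millis;

impl Format for Millis {
    fn name() -> &'static str {
        "epoch_millis"
    }

    fn parse(source: &str) -> Option<i64> {
        source.parse().ok()
    }
}

/// Source text is in whole seconds and is scaled up to milliseconds.
struct Seconds;

impl Format for Seconds {
    fn name() -> &'static str {
        "epoch_second"
    }

    fn parse(source: &str) -> Option<i64> {
        source.parse::<i64>().ok()?.checked_mul(1000)
    }
}

/// Hypothetical date wrapper whose format is chosen by a type parameter.
struct Timestamp<F: Format> {
    millis: i64,
    _format: PhantomData<F>,
}

impl<F: Format> Timestamp<F> {
    fn parse(source: &str) -> Option<Self> {
        F::parse(source).map(|millis| Timestamp { millis, _format: PhantomData })
    }

    fn millis(&self) -> i64 {
        self.millis
    }
}
```

Under these assumptions, `Timestamp::<Millis>::parse("1435935302478")` and `Timestamp::<Seconds>::parse("1435935302")` read differently formatted input into the same internal representation, and the choice of parser is fixed by the type rather than by a runtime flag.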

Modules

boolean

Implementation of the Elasticsearch boolean type.

date

Implementation of the Elasticsearch date type.

document

Base requirements for indexable document mappings.

geo

Implementation of the Elasticsearch geo types.

ip

Implementation of the Elasticsearch ip type.

number

Implementation of the Elasticsearch number types.

prelude

Includes all data types.

string

Implementation of the Elasticsearch keyword and text types.