Crate bson


BSON, short for Binary JSON, is a binary-encoded serialization of JSON-like documents. Like JSON, BSON supports the embedding of documents and arrays within other documents and arrays. BSON also contains extensions that allow representation of data types that are not part of the JSON spec. For example, BSON has a datetime type and a binary data type.

// JSON equivalent
{"hello": "world"}

// BSON encoding
\x16\x00\x00\x00                   // total document size
\x02                               // 0x02 = type String
hello\x00                          // field name
\x06\x00\x00\x00world\x00          // field value
\x00                               // 0x00 = type EOO ('end of object')
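The byte layout above can be reproduced by hand. The following is a minimal std-only sketch, not this crate's encoder: the function name `encode_single_string_doc` is hypothetical, and it assumes a document with exactly one string field whose key contains no NUL byte.

```rust
/// Encodes a document holding exactly one string field, following the layout above.
/// Hypothetical helper for illustration; assumes `key` contains no NUL byte.
fn encode_single_string_doc(key: &str, value: &str) -> Vec<u8> {
    // Element: type byte, key as a NUL-terminated string, then the string value.
    let mut element = vec![0x02u8]; // 0x02 = type String
    element.extend_from_slice(key.as_bytes());
    element.push(0x00);

    // String values carry a little-endian int32 length that counts the trailing NUL.
    let value_len = (value.len() + 1) as i32;
    element.extend_from_slice(&value_len.to_le_bytes());
    element.extend_from_slice(value.as_bytes());
    element.push(0x00);

    // The total size counts its own four bytes plus the closing 0x00.
    let total = (4 + element.len() + 1) as i32;
    let mut out = Vec::with_capacity(total as usize);
    out.extend_from_slice(&total.to_le_bytes());
    out.extend_from_slice(&element);
    out.push(0x00); // 0x00 = type EOO ('end of object')
    out
}
```

Calling `encode_single_string_doc("hello", "world")` yields the 22 (0x16) bytes listed above.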

BSON is the primary data representation for MongoDB, and this crate is used in the mongodb driver crate in its API and implementation.

For more information about BSON itself, see bsonspec.org.

§Installation

§Requirements

  • Rust 1.81+

§Importing

This crate is available on crates.io. To use it in your application, simply add it to your project’s Cargo.toml.

[dependencies]
bson = "3.0.0"

Note that if you are using bson through the mongodb crate, you do not need to specify it in your Cargo.toml, since the mongodb crate already re-exports it.

§Feature Flags
Feature | Description | Default
chrono-0_4 | Enable support for v0.4 of the chrono crate in the public API. | no
jiff-0_2 | Enable support for v0.2 of the jiff crate in the public API. | no
uuid-1 | Enable support for v1.x of the uuid crate in the public API. | no
time-0_3 | Enable support for v0.3 of the time crate in the public API. | no
serde | Enable integration with the serde serialization/deserialization framework. | no
serde_with-3 | Enable serde_with type conversion utilities in the public API. | no
serde_path_to_error | Enable support for error paths via integration with serde_path_to_error. This is an unstable feature and any breaking changes to serde_path_to_error may affect usage of it via this feature. | no
compat-3-0-0 | Required for future compatibility if default features are disabled. | yes
large_dates | Increase the supported year range for some bson::DateTime utilities from +/-9,999 (inclusive) to +/-999,999 (inclusive). Note that enabling this feature can impact performance and introduce parsing ambiguities. | no
serde_json-1 | Enable support for v1.x of the serde_json crate in the public API. | no
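As an illustration, optional features are enabled in Cargo.toml through the standard `features` key. The particular combination below (serde plus chrono-0_4) is only an example choice, not a recommendation from this crate.

```toml
[dependencies]
# Default features stay on; serde integration and chrono conversions are added.
bson = { version = "3.0.0", features = ["serde", "chrono-0_4"] }

# Alternative: with default features disabled, compat-3-0-0 is listed explicitly,
# as the table above asks.
# bson = { version = "3.0.0", default-features = false, features = ["compat-3-0-0", "serde"] }
```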

§BSON values

Many different types can be represented as a BSON value, including 32-bit and 64-bit signed integers, 64-bit floating point numbers, strings, datetimes, embedded documents, and more. To see a full list of possible BSON values, see the BSON specification. The various possible BSON values are modeled in this crate by the Bson enum.

§Creating Bson instances

Bson values can be instantiated directly or via the bson! macro:

use bson::{bson, Bson};

let string = Bson::String("hello world".to_string());
let int = Bson::Int32(5);
let array = Bson::Array(vec![Bson::Int32(5), Bson::Boolean(false)]);

let string: Bson = "hello world".into();
let int: Bson = 5i32.into();

let string = bson!("hello world");
let int = bson!(5);
let array = bson!([5, false]);

bson! supports both array and object literals, and it automatically converts any values specified to Bson, provided they are Into<Bson>.

§Bson value unwrapping

Bson has a number of helper methods for accessing the underlying native Rust types. These helpers can be useful in circumstances in which the specific type of a BSON value is known ahead of time.

e.g.:

use bson::{bson, Bson};

let value = Bson::Int32(5);
let int = value.as_i32(); // Some(5)
let bool = value.as_bool(); // None

let value = bson!([true]);
let array = value.as_array(); // Some(&Vec<Bson>)

§BSON documents

BSON documents are ordered maps of UTF-8 encoded strings to BSON values. They are logically similar to JSON objects in that they can contain subdocuments, arrays, and values of several different types. This crate models BSON documents via the Document struct.

§Creating Documents

Documents can be created directly either from a byte reader containing BSON data or via the doc! macro:

use bson::{doc, Document};
use std::io::Read;

let mut bytes = hex::decode("0C0000001069000100000000").unwrap();
let doc = Document::from_reader(&mut bytes.as_slice()).unwrap(); // { "i": 1 }

let doc = doc! {
   "hello": "world",
   "int": 5,
   "subdoc": { "cat": true },
};

doc! works similarly to bson!, except that it always returns a Document rather than a Bson.
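To see why the hex string in the example above decodes to { "i": 1 }, the bytes can be walked by hand. This is a rough std-only sketch rather than the crate's decoder: `read_i32_le` and `decode_single_int32_doc` are hypothetical names, and the sketch only accepts a document with exactly one int32 field.

```rust
/// Reads a little-endian i32 at offset `at`, or None if out of bounds.
fn read_i32_le(bytes: &[u8], at: usize) -> Option<i32> {
    let b = bytes.get(at..at + 4)?;
    Some(i32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Decodes a document holding exactly one int32 field; returns None otherwise.
/// Hypothetical helper for illustration only.
fn decode_single_int32_doc(bytes: &[u8]) -> Option<(String, i32)> {
    // The first four bytes are the total size, which must match the input length.
    let size = read_i32_le(bytes, 0)?;
    if size < 0 || size as usize != bytes.len() || *bytes.last()? != 0x00 {
        return None;
    }
    // 0x10 is the element type for a 32-bit integer.
    if *bytes.get(4)? != 0x10 {
        return None;
    }
    // The key is a NUL-terminated string right after the type byte.
    let key_len = bytes[5..].iter().position(|&b| b == 0x00)?;
    let key = String::from_utf8(bytes[5..5 + key_len].to_vec()).ok()?;
    // The value follows the key's NUL; only the document terminator may follow it.
    let value_start = 5 + key_len + 1;
    if value_start + 4 + 1 != bytes.len() {
        return None;
    }
    let value = read_i32_le(bytes, value_start)?;
    Some((key, value))
}
```

For the twelve bytes 0C 00 00 00 10 69 00 01 00 00 00 00, the sketch returns the key "i" and the value 1.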

§Document member access

Document has a number of methods on it to facilitate member access:

use bson::doc;

let doc = doc! {
   "string": "string",
   "bool": true,
   "i32": 5,
   "doc": { "x": true },
};

// attempt to get values as untyped Bson
let none = doc.get("asdfadsf"); // None
let value = doc.get("string"); // Some(&Bson::String("string"))

// attempt to get values with explicit typing
let string = doc.get_str("string"); // Ok("string")
let subdoc = doc.get_document("doc"); // Ok(&Document({ "x": true }))
let error = doc.get_i64("i32"); // Err(...)

§Integration with serde

While it is possible to work with documents and BSON values directly, doing so often introduces a lot of boilerplate for verifying that the necessary keys are present and that their values have the correct types. Enabling the serde feature provides integration with the serde crate that maps BSON data into Rust data structures largely automatically, removing the need for that boilerplate.

e.g.:

use serde::{Deserialize, Serialize};
use bson::{bson, Bson};

#[derive(Serialize, Deserialize)]
struct Person {
    name: String,
    age: i32,
    phones: Vec<String>,
}

// Some BSON input data as a [`Bson`].
let bson_data: Bson = bson!({
    "name": "John Doe",
    "age": 43,
    "phones": [
        "+44 1234567",
        "+44 2345678"
    ]
});

// Deserialize the Person struct from the BSON data, automatically
// verifying that the necessary keys are present and that they are of
// the correct types.
let mut person: Person = bson::deserialize_from_bson(bson_data).unwrap();

// Do things just like with any other Rust data structure.
println!("Redacting {}'s record.", person.name);
person.name = "REDACTED".to_string();

// Get a serialized version of the input data as a [`Bson`].
let redacted_bson = bson::serialize_to_bson(&person).unwrap();

Any types that implement Serialize and Deserialize can be used in this way. Doing so helps separate the “business logic” that operates over the data from the (de)serialization logic that translates the data to/from its serialized form. This can lead to clearer and more concise code that is also less error-prone.

When serializing values that cannot be represented in BSON, or deserializing from BSON that does not match the format expected by the type, the default error will only report the specific field that failed. To aid debugging, enabling the serde_path_to_error feature will augment errors with the full field path from root object to failing field. This feature does incur a small CPU and memory overhead during (de)serialization and should be enabled with care in performance-sensitive environments.

§Embedding BSON Value Types

The serde feature also enables implementations of Serialize and Deserialize for the Rust types provided by this crate that represent BSON values, allowing them to be embedded in domain-specific structs as appropriate:

use serde::{Deserialize, Serialize};
use bson::{bson, Bson, oid::ObjectId};

#[derive(Serialize, Deserialize)]
struct Person {
    id: ObjectId,
    name: String,
    age: i32,
    phones: Vec<String>,
}

let bson_data: Bson = bson!({
    "id": ObjectId::new(),
    "name": "John Doe",
    "age": 43,
    "phones": [
        "+44 1234567",
        "+44 2345678"
    ]
});

let person: Person = bson::deserialize_from_bson(bson_data).unwrap();

§Encoding vs. Serialization

With the serde feature enabled, a BSON document can be converted to its wire-format byte representation in multiple ways:

use bson::{doc, serialize_to_vec};
let my_document = doc! { "hello": "bson" };
let encoded = my_document.to_vec()?;
let serialized = serialize_to_vec(&my_document)?;

We recommend that, where possible, documents be converted to byte form using the encoding methods (Document::to_vec/Document::to_writer); this is more efficient as it avoids the intermediate serde data model representation. This also applies to decoding; prefer Document::from_reader over deserialize_from_reader / deserialize_from_slice.

§Serializer Compatibility

The implementations of Serialize and Deserialize for BSON value types are tested with the serde [de]serializers provided by this crate and by the serde_json crate. Compatibility with formats provided by other crates is not guaranteed and the data produced by serializing BSON values to other formats may change when this crate is updated.

§Working with Extended JSON

MongoDB Extended JSON (extJSON) is a format of JSON that allows for the encoding of BSON type information. Normal JSON cannot unambiguously represent all BSON types losslessly, so an extension was designed to include conventions for representing those types.

For example, a BSON binary is represented by the following format:

{
   "$binary": {
       "base64": <base64 encoded payload as a string>,
       "subType": <subtype as a one or two character hex string>,
   }
}
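The shape above can be produced with a few lines of std-only Rust. This is a sketch for illustration, not this crate's implementation: `base64_encode` and `binary_to_extjson` are hypothetical helpers, and always writing the subtype as two hex characters is an assumption of the sketch.

```rust
const ALPHABET: &[u8] = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Standard base64 with '=' padding (hypothetical helper, std-only).
fn base64_encode(input: &[u8]) -> String {
    let mut out = String::with_capacity((input.len() + 2) / 3 * 4);
    for chunk in input.chunks(3) {
        // Pack up to three bytes into a 24-bit group; missing bytes count as zero.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;
        out.push(ALPHABET[((n >> 18) & 63) as usize] as char);
        out.push(ALPHABET[((n >> 12) & 63) as usize] as char);
        // Positions that map to missing input bytes are padded with '='.
        out.push(if chunk.len() > 1 { ALPHABET[((n >> 6) & 63) as usize] as char } else { '=' });
        out.push(if chunk.len() > 2 { ALPHABET[(n & 63) as usize] as char } else { '=' });
    }
    out
}

/// Renders a binary payload and subtype in the "$binary" object notation.
/// Assumption: the subtype is always written as two lowercase hex characters.
fn binary_to_extjson(payload: &[u8], subtype: u8) -> String {
    format!(
        "{{\"$binary\":{{\"base64\":\"{}\",\"subType\":\"{:02x}\"}}}}",
        base64_encode(payload),
        subtype
    )
}
```

For the payload `b"hello"` with subtype 0, this produces {"$binary":{"base64":"aGVsbG8=","subType":"00"}}.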

For more information on extJSON and the complete list of translations, see the official MongoDB documentation.

All MongoDB drivers and BSON libraries interpret and produce extJSON, so it can serve as a useful tool for communicating between applications where raw BSON bytes cannot be used (e.g. via JSON REST APIs). It’s also useful for representing BSON data as a string.

§Canonical and Relaxed Modes

There are two modes of extJSON: “Canonical” and “Relaxed”. They are the same except for the following differences:

  • In relaxed mode, all BSON numbers are represented by the JSON number type, rather than the object notation.
  • In relaxed mode, the string in the datetime object notation is RFC 3339 (ISO-8601) formatted (if the date is after 1970).

e.g.

let doc = bson!({ "x": 5, "d": bson::DateTime::now() });

println!("relaxed: {}", doc.clone().into_relaxed_extjson());
// relaxed: {"x":5,"d":{"$date":"2020-06-01T22:20:20.711Z"}}

println!("canonical: {}", doc.into_canonical_extjson());
// canonical: {"x":{"$numberInt":"5"},"d":{"$date":{"$numberLong":"1591050020711"}}}

Canonical mode is useful when BSON values need to be round tripped without losing any type information. Relaxed mode is more useful when debugging or logging BSON data.
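The difference for integers can be sketched without this crate. The `Mode` and `Number` enums and the `number_to_extjson` function below are hypothetical names used only for illustration; the sketch covers 32-bit and 64-bit integers and deliberately leaves out doubles and datetimes.

```rust
/// The two extJSON modes described above (hypothetical type for this sketch).
enum Mode {
    Canonical,
    Relaxed,
}

/// A small subset of BSON numbers (hypothetical type for this sketch).
enum Number {
    Int32(i32),
    Int64(i64),
}

/// Renders an integer as extJSON text in the requested mode.
fn number_to_extjson(n: Number, mode: Mode) -> String {
    match (n, mode) {
        // Canonical mode keeps the BSON type in an object wrapper with a string payload.
        (Number::Int32(v), Mode::Canonical) => format!("{{\"$numberInt\":\"{}\"}}", v),
        (Number::Int64(v), Mode::Canonical) => format!("{{\"$numberLong\":\"{}\"}}", v),
        // Relaxed mode emits a plain JSON number, so the 32/64-bit distinction is lost.
        (Number::Int32(v), Mode::Relaxed) => v.to_string(),
        (Number::Int64(v), Mode::Relaxed) => v.to_string(),
    }
}
```

Because relaxed mode drops the wrapper, `Int32(5)` and `Int64(5)` both render as `5`, which is why canonical mode is the one to use for round trips.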

§Deserializing from Extended JSON

Extended JSON can be deserialized into a Bson value using the TryFrom implementation for serde_json::Value. This implementation accepts both canonical and relaxed extJSON, and the two modes can be mixed within a single representation.

e.g.

let json_doc = json!({ "x": 5i32, "y": { "$numberInt": "5" }, "z": { "subdoc": "hello" } });
let bson: Bson = json_doc.try_into().unwrap(); // Bson::Document(...)

let json_date = json!({ "$date": { "$numberLong": "1590972160292" } });
let bson_date: Bson = json_date.try_into().unwrap(); // Bson::DateTime(...)

let invalid_ext_json = json!({ "$numberLong": 5 });
Bson::try_from(invalid_ext_json).expect_err("5 should be a string");

§Serializing to Extended JSON

A Bson value can be serialized into extJSON using the Bson::into_relaxed_extjson and Bson::into_canonical_extjson methods. The Into<serde_json::Value> implementation for Bson produces relaxed extJSON.

e.g.

let doc = bson!({ "x": 5i32, "_id": oid::ObjectId::new() });

let relaxed_extjson: serde_json::Value = doc.clone().into();
println!("{}", relaxed_extjson); // { "x": 5, "_id": { "$oid": <hexstring> } }

let relaxed_extjson = doc.clone().into_relaxed_extjson();
println!("{}", relaxed_extjson); // { "x": 5, "_id": { "$oid": <hexstring> } }

let canonical_extjson = doc.into_canonical_extjson();
println!("{}", canonical_extjson); // { "x": { "$numberInt": "5" }, "_id": { "$oid": <hexstring> } }

§Working with datetimes

The BSON format includes a datetime type, which is modeled in this crate by the DateTime struct. The Serialize and Deserialize implementations for this struct produce and parse BSON datetimes when serializing to or deserializing from BSON. The popular crate chrono also provides a DateTime type, but its Serialize and Deserialize implementations operate on strings instead, so when using it with BSON, the BSON datetime type is not used.

To work around this, the chrono-0_4 feature flag can be enabled. This flag exposes a number of convenient conversions between bson::DateTime and chrono::DateTime, including:

  • the serde_helpers::datetime::FromChrono04DateTime serde helper, which can be used to (de)serialize chrono::DateTimes to/from BSON datetimes;
  • the From<chrono::DateTime> implementation for Bson, which allows chrono::DateTime values to be used in the doc! and bson! macros.

e.g.

use serde::{Serialize, Deserialize};
use serde_with::serde_as;
use bson::doc;
use bson::serde_helpers::datetime;

#[serde_as]
#[derive(Serialize, Deserialize)]
struct Foo {
    // serializes as a BSON datetime.
    date_time: bson::DateTime,

    // serializes as an RFC 3339 / ISO-8601 string.
    chrono_datetime: chrono::DateTime<chrono::Utc>,

    // serializes as a BSON datetime.
    // this requires the "chrono-0_4" feature flag
    #[serde_as(as = "datetime::FromChrono04DateTime")]
    chrono_as_bson: chrono::DateTime<chrono::Utc>,
}

// this automatic conversion also requires the "chrono-0_4" feature flag
let query = doc! {
    "created_at": chrono::Utc::now(),
};

§Working with UUIDs

See the module level documentation for the uuid module.

§WASM support

This crate compiles to the wasm32-unknown-unknown target; when doing so, the js-sys crate is used for the current timestamp component of ObjectId generation.

§Minimum supported Rust version (MSRV)

The MSRV for this crate is currently 1.81. This will rarely be increased, and if it ever is, it will only happen in a minor or major version release.

Modules§

binary
Contains functionality related to BSON binary values.
datetime
Module containing functionality related to BSON DateTimes. For more information, see the documentation for the DateTime type.
de (feature: serde)
Deserializer
decimal128
BSON Decimal128 data type representation
document
A BSON document represented as an associative HashMap with insertion ordering.
error
Contains the error-related types for the bson crate.
oid
Module containing functionality related to BSON ObjectIds. For more information, see the documentation for the ObjectId type.
raw
An API for interacting with raw BSON bytes.
ser (feature: serde)
Serializer
serde_helpers (feature: serde)
Collection of helper functions for serializing to and deserializing from BSON using Serde.
spec
Constants derived from the BSON Specification Version 1.1.
uuid
UUID support for BSON.

Macros§

bson
Construct a bson::Bson value from a literal.
cstr
Construct a 'static &CStr. The validity will be verified at compile-time.
doc
Construct a bson::Document value.
rawbson
Construct a crate::RawBson value from a literal.
rawdoc
Construct a crate::RawDocumentBuf value.

Structs§

Binary
Represents a BSON binary value.
DateTime
Struct representing a BSON datetime. Note: BSON datetimes have millisecond precision.
DbPointer
Represents a DBPointer. (Deprecated)
Decimal128
Struct representing a BSON Decimal128 type.
Deserializer (feature: serde)
Deserializer for deserializing a Bson value.
Document
A BSON document represented as an associative HashMap with insertion ordering.
JavaScriptCodeWithScope
Represents a BSON code with scope value.
RawArray
A slice of a BSON document containing a BSON array value (akin to std::str). This can be retrieved from a RawDocument via RawDocument::get.
RawArrayBuf
An owned BSON array value (akin to std::path::PathBuf), backed by a buffer of raw BSON bytes. This type can be used to construct owned array values, which can be used to append to RawDocumentBuf or as a field in a Deserialize struct.
RawBinaryRef
A BSON binary value referencing raw bytes stored elsewhere.
RawDbPointerRef
A BSON DB pointer value referencing raw bytes stored elsewhere.
RawDeserializer (feature: serde)
Deserializer for deserializing raw BSON bytes.
RawDocument
A slice of a BSON document (akin to std::str). This can be created from a RawDocumentBuf or any type that contains valid BSON data, including static binary literals, Vec<u8>, or arrays.
RawDocumentBuf
An owned BSON document (akin to std::path::PathBuf), backed by a buffer of raw BSON bytes. This can be created from a Vec<u8> or a crate::Document.
RawJavaScriptCodeWithScope
A BSON “code with scope” value backed by owned raw BSON.
RawJavaScriptCodeWithScopeRef
A BSON “code with scope” value referencing raw bytes stored elsewhere.
RawRegexRef
A BSON regex referencing raw bytes stored elsewhere.
Regex
Represents a BSON regular expression value.
Serializer (feature: serde)
Serde Serializer
Timestamp
Represents a BSON timestamp value.
Utf8Lossy
Wrapper type for lossily decoding embedded strings with invalid UTF-8 sequences.
Uuid
A struct modeling a BSON UUID value (i.e. a Binary value with subtype 4).

Enums§

Bson
Possible BSON value types.
RawBson
A BSON value backed by owned raw BSON bytes.
RawBsonRef
A BSON value referencing raw bytes stored elsewhere.
UuidRepresentation
Enum of the possible representations to use when converting between Uuid and Binary. This enum is necessary because the different drivers used to have different ways of encoding UUIDs, with the BSON subtype: 0x03 (UUID old). If a UUID has been serialized with a particular representation, it MUST be deserialized with the same representation.

Functions§

deserialize_from_bson (feature: serde)
Deserialize a T from the provided Bson value.
deserialize_from_document (feature: serde)
Deserialize a T from the provided Document.
deserialize_from_reader (feature: serde)
Deserialize an instance of type T from an I/O stream of BSON.
deserialize_from_slice (feature: serde)
Deserialize an instance of type T from a slice of BSON bytes.
serialize_to_bson (feature: serde)
Encode a T Serializable into a Bson value.
serialize_to_buffer (feature: serde)
Serialize the given T as a BSON byte vector into the provided byte buffer. This allows reusing the same buffer for multiple serializations.
serialize_to_document (feature: serde)
Serialize a T Serializable into a BSON Document.
serialize_to_raw_document_buf (feature: serde)
Serialize the given T as a RawDocumentBuf.
serialize_to_vec (feature: serde)
Serialize the given T as a BSON byte vector.

Type Aliases§

Array
Alias for Vec<Bson>.