Tile Structure
The top-level structure of a tile is a sequence of layers, where each layer is a (size, tag, data) tuple:
- size: varint - size of the data block in bytes, including the size of the tag field
- tag: varint - identifies the block type, e.g. 0x01 = feature table v1, 0x02 = raster layer, 0x03 = routing table, etc. We only define 0x01 for now.
- data: u8[] - the actual data block of the specified size
This approach lets us extend the format in the future by adding new block types while keeping backward compatibility. Parsers can skip unknown block types by reading the size and advancing that many bytes. For now, we only define 0x01 for vector layers, and possibly a few more if needed.
Note the ordering: the tag comes after the size. Until more than 127 types are supported, a parser can treat the tag as a single byte, and it can efficiently skip unknown block types without doing more expensive varint parsing of the tag.
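The skipping behavior described above can be sketched as follows. This is a minimal illustration, not the crate's actual parser: the names Block, read_varint and parse_blocks are hypothetical, and the varint is assumed to be unsigned LEB128 (protobuf-style), which this document does not state explicitly.

```rust
/// One top-level block: its tag plus a borrowed view of its data.
/// (Hypothetical type, used only for this sketch.)
#[derive(Debug, PartialEq)]
struct Block<'a> {
    tag: u8,
    data: &'a [u8],
}

/// Reads an unsigned LEB128 varint (an assumption about the varint flavor).
/// Returns the value and the number of bytes consumed.
fn read_varint(input: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    // A u64 needs at most 10 LEB128 bytes.
    for (i, &byte) in input.iter().enumerate().take(10) {
        value |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}

/// Splits a tile into blocks, keeping only those with a known tag.
/// Returns None if the input is truncated or malformed.
fn parse_blocks(mut input: &[u8]) -> Option<Vec<Block<'_>>> {
    const FEATURE_TABLE_V1: u8 = 0x01;
    let mut blocks = Vec::new();
    while !input.is_empty() {
        let (size, used) = read_varint(input)?;
        input = &input[used..];
        // `size` covers the tag plus the data, so it must be at least 1
        // and cannot exceed what is left of the input.
        if size == 0 || size > input.len() as u64 {
            return None;
        }
        let (block, rest) = input.split_at(size as usize);
        // Single-byte tag: sufficient while every defined tag is below 128.
        let tag = block[0];
        if tag == FEATURE_TABLE_V1 {
            blocks.push(Block { tag, data: &block[1..] });
        }
        // Unknown tags need no further parsing: advancing by `size`
        // bytes lands on the next block's size field.
        input = rest;
    }
    Some(blocks)
}
```

For example, given the bytes 03 01 AA BB 02 7F CC, parse_blocks returns a single block with tag 0x01 and data AA BB. The second block (tag 0x7F) is skipped without its contents being inspected.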
Layer 0x01 - MVT compatibility
Structure of the data if the tag above is 0x01. We should focus this tag on MVT compatibility, offering exactly what we had in MVT, but allowing for a clearly defined set of encodings and other optimizations like tessellation. No new data formats (per vertex data, nested data, 3d geometries, etc.). No extendable encodings - once finalized, 0x01 will only allow what has been specified. This will ensure that if a decoder declares "0x01" support, it will parse every specification-compliant 0x01 layer. For any new features and encodings we will simply use a new tag ID, likely reusing most of the existing encoding/decoding code.
- name: string - Name of the layer
- columnCount: varint - Number of columns in the layer
- each column is defined as:
  - columnType: varint - same idea as tag above, e.g. 1 = id, 2 = geometry, 3 = int property, etc.
  - TODO...
Code Structure
Given the raw input bytes, the parser quickly runs over the input slice and only stores references to the streams and their metadata. Later, decoding can be done on demand, either for all columns or just for specific ones. This example is for Id, but the same idea applies to Geometry and Property entities.
- RawId struct contains references into the original input data. The values are not decoded, just some metadata is parsed. Most data is stored as Stream<'a> instances, which hold references to parts of the original input and tie to input lifetime.
- OwnedRawId struct is auto-generated with the borrowme crate - it has the same fields as the RawId struct, but owns its data. This is useful when you want to store a RawId struct beyond the lifetime of the original input slice, or when you want to modify it or store the result of the encoding before storing it into a file.
- DecodedId struct is used to store the decoded value. At the moment, only DecodedGeometry is implemented, but the same idea applies to other entities. The decoded values are stored in standard Rust types, e.g. Vec<u64> for IDs.
- Id enum contains Raw(RawId) and Decoded(DecodedId) variants, with values described above. This allows in-place decoding, e.g. it is possible to decode just one property column / ID / Geometry, while keeping the rest in their raw form. The enum also has a corresponding borrowme-generated OwnedId.
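The raw/decoded split described above might look roughly like this. It is a simplified sketch, not the crate's actual definitions: the field layout, the decode and materialize methods, and the assumption that IDs are stored as back-to-back LEB128 varints are all hypothetical. The borrowme-generated Owned types are left out.

```rust
/// Borrowed view of an undecoded stream (simplified stand-in for Stream<'a>).
#[derive(Debug, Clone, PartialEq)]
struct Stream<'a> {
    data: &'a [u8],
}

/// Undecoded IDs: only references into the input, tied to its lifetime.
#[derive(Debug, Clone, PartialEq)]
struct RawId<'a> {
    stream: Stream<'a>,
}

/// Decoded IDs in a standard Rust type.
#[derive(Debug, Clone, PartialEq)]
struct DecodedId(Vec<u64>);

/// Either form; each column can be decoded independently of the others.
#[derive(Debug, Clone, PartialEq)]
enum Id<'a> {
    Raw(RawId<'a>),
    Decoded(DecodedId),
}

impl RawId<'_> {
    /// Hypothetical decoding: treats the stream as back-to-back LEB128 varints.
    fn decode(&self) -> Option<DecodedId> {
        let mut ids = Vec::new();
        let mut value = 0u64;
        let mut shift = 0u32;
        for &byte in self.stream.data {
            if shift >= 64 {
                return None; // varint too long for a u64
            }
            value |= u64::from(byte & 0x7f) << shift;
            if byte & 0x80 == 0 {
                ids.push(value);
                value = 0;
                shift = 0;
            } else {
                shift += 7;
            }
        }
        // A trailing continuation byte means the stream was truncated.
        if shift == 0 { Some(DecodedId(ids)) } else { None }
    }
}

impl Id<'_> {
    /// Decodes in place: a Raw value is replaced by its Decoded form.
    /// On failure the value is left untouched in its Raw form.
    fn materialize(&mut self) -> Option<&DecodedId> {
        if let Id::Raw(raw) = self {
            let decoded = raw.decode()?;
            *self = Id::Decoded(decoded);
        }
        match &*self {
            Id::Decoded(decoded) => Some(decoded),
            Id::Raw(_) => None,
        }
    }
}
```

With this shape, a caller that only needs IDs calls materialize on that one column and leaves geometry and property columns as cheap borrowed references.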
Tools
See the mlt tool for various ways to interact with the parser and decoder, including a terminal-based visualizer for exploring MLT files.