Defines parser logic and iterators over OSM elements.
§Iterators
At your disposal there are:
- The BlobIterator - Iterate over .osm.pbf blob segments
- The BlockIterator - Iterate over Header/Primitive blocks
- The ElementIterator - Iterate over un-decoded Node, Way, Relation and DenseNodes set.
- The ProcessedElementIterator - Iterate over decoded Nodes and Ways.
Each iteration can be run in series, or in parallel wherever the Parallel trait is implemented.
§Encoding
To decide which iterator to use, it helps to understand how the file is laid out.
The file is split into multiple depth levels:
- Blob - Highest depth (contains multiple elements)
- Block - Medium depth
- Element - Lowest depth
The OSM protobuf structure nests elements in this way, and
this component provides an iterator for each level. This
allows finer control over each level, for example to permit
parallel processing, injections, etc.
blob item
└─┬─ block item - Two Block Variants
  ├─┬─ header block
  │ ├── ... metadata
  │ └── bounding box
  │
  └─┬─ primitive block
    ├── ... metadata
    └─┬─ primitive group[] - Elements
      ├── Node[]
      ├── DenseNodes
      ├── Way[]
      └── Relation[]

The above structure outlines the blob, block
and element subsections of the codec module.
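
The nesting in the diagram can be sketched as plain Rust types. This is only an illustration: every type and field name below is a hypothetical mirror of the tree, not the crate's generated protobuf definitions, and each element is reduced to a bare id.

```rust
// Hypothetical mirror types for the tree above. The crate's own
// definitions (see the model module) will differ in names and fields.
#[allow(dead_code)]
struct HeaderBlock {
    bounding_box: Option<[f64; 4]>,
}

struct DenseNodes {
    ids: Vec<i64>,
}

struct PrimitiveGroup {
    nodes: Vec<i64>,
    dense: Option<DenseNodes>,
    ways: Vec<i64>,
    relations: Vec<i64>,
}

struct PrimitiveBlock {
    groups: Vec<PrimitiveGroup>,
}

// A block item is one of the two block variants.
enum BlockItem {
    Header(HeaderBlock),
    Primitive(PrimitiveBlock),
}

// Walks block -> primitive group -> elements and counts every element.
// Header blocks carry metadata only, so they contribute nothing.
fn element_count(block: &BlockItem) -> usize {
    match block {
        BlockItem::Header(_) => 0,
        BlockItem::Primitive(primitive) => primitive
            .groups
            .iter()
            .map(|group| {
                group.nodes.len()
                    + group.dense.as_ref().map_or(0, |dense| dense.ids.len())
                    + group.ways.len()
                    + group.relations.len()
            })
            .sum(),
    }
}
```

Modelling the two block variants as an enum keeps the "header or primitive" distinction explicit at each match site.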
§Blob
The blob item describes the blob section. It refers to the
BlobItem structure, which records the size and position of
the file block it contains, as stored in the BlobHeader
it owns.
A BlobHeader can have two types:
- "OSMData" - Data Subcomponent
- "OSMHeader" - Metadata Subcomponent
An OSMHeader blob refers to a header block on the above
diagram, whilst an OSMData blob is a primitive block.
These blob items are considered “lightweight”: each carries only
a small header that indicates where the next header is, so
they can be used to quickly index the file without parsing
the data each blob contains.
This means we can parse our data blocks in parallel while knowing their file offsets, similar to a skip list.
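
The indexing idea can be sketched using only the standard library. The framing assumed here is the one the OSM PBF format describes: a 4-byte big-endian header length, a protobuf BlobHeader (field 1 is the type string, field 3 is datasize), then datasize bytes of blob payload. BlobIndexEntry, parse_blob_header and index_blobs are hypothetical names for illustration and are not part of this crate.

```rust
use std::io::{self, Read, Seek, SeekFrom};

// Hypothetical index entry; not a type from this crate.
#[derive(Debug, PartialEq)]
struct BlobIndexEntry {
    kind: String, // "OSMHeader" or "OSMData"
    offset: u64,  // file offset of the blob payload
    size: u64,    // payload size in bytes
}

// Decodes one protobuf varint, advancing `pos`.
fn read_varint(buf: &[u8], pos: &mut usize) -> Option<u64> {
    let (mut value, mut shift) = (0u64, 0u32);
    while *pos < buf.len() && shift < 64 {
        let byte = buf[*pos];
        *pos += 1;
        value |= u64::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some(value);
        }
        shift += 7;
    }
    None
}

// Extracts (type, datasize) from a serialized BlobHeader, skipping other fields.
fn parse_blob_header(buf: &[u8]) -> Option<(String, u64)> {
    let (mut pos, mut kind, mut size) = (0, None, None);
    while pos < buf.len() {
        let key = read_varint(buf, &mut pos)?;
        match (key >> 3, key & 7) {
            (1, 2) => {
                let len = read_varint(buf, &mut pos)? as usize;
                let end = pos.checked_add(len)?;
                kind = Some(String::from_utf8(buf.get(pos..end)?.to_vec()).ok()?);
                pos = end;
            }
            (3, 0) => size = Some(read_varint(buf, &mut pos)?),
            (_, 0) => {
                read_varint(buf, &mut pos)?;
            }
            (_, 2) => {
                let len = read_varint(buf, &mut pos)? as usize;
                pos = pos.checked_add(len)?;
            }
            _ => return None,
        }
    }
    Some((kind?, size?))
}

// Reads every header but seeks past every payload, so no block is decoded.
// For brevity, a truncated trailing length prefix is treated as end of file.
fn index_blobs<R: Read + Seek>(mut reader: R) -> io::Result<Vec<BlobIndexEntry>> {
    let mut entries = Vec::new();
    let mut len_buf = [0u8; 4];
    loop {
        match reader.read_exact(&mut len_buf) {
            Ok(()) => {}
            Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => break,
            Err(e) => return Err(e),
        }
        let mut header = vec![0u8; u32::from_be_bytes(len_buf) as usize];
        reader.read_exact(&mut header)?;
        let (kind, size) = parse_blob_header(&header)
            .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidData, "bad BlobHeader"))?;
        let offset = reader.stream_position()?;
        reader.seek(SeekFrom::Current(size as i64))?;
        entries.push(BlobIndexEntry { kind, offset, size });
    }
    Ok(entries)
}
```

With such an index in hand, each payload range could be handed to a separate worker, which is the property the skip-list comparison points at.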
§Block
In order to parse a block, we need to know the offset and size
of the block, which are conveniently stored in a BlobHeader.
This BlobHeader has a data field, which contains the data
of our block. The type of block we are decoding is stored as
a literal string, "OSMData" or "OSMHeader".
All blocks can be decoded using the block module, and
enumerated using the BlockIterator, which yields each block
in turn. This iterator returns a BlockItem, which is either
a PrimitiveBlock or a HeaderBlock.

use std::path::PathBuf;

// DISTRICT_OF_COLUMBIA is assumed to be a path to an .osm.pbf file.
let path = PathBuf::from(DISTRICT_OF_COLUMBIA);
let iter = BlockIterator::new(path)
    .expect("Failed to create iterator");

for block in iter {
    // Do something with the block...
}

To decode each individual item, such as a Node or Way,
we must iterate over the BlockItem’s elements, which
is done through the ElementIterator.
To determine where nodes are positioned, contained,
offset, etc., we utilise the HeaderBlock component, which
contains such data. More information can be found on the
OSM wiki.
§Element
Lastly, we have the elements themselves. These are served in two variants. The first has no extra parsing performed. It is better suited to applications which do not require a small memory footprint, or which do not need access to every node. That is likely an uncommon use-case, which is why we also have processed elements.
The processed variant unpacks DenseNodes into a vector of
nodes, and reformats Ways to drop unnecessary
information. This keeps the memory footprint small, as we never
allocate the entire file, only the segments
we are currently working on.
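
As a rough illustration of what unpacking DenseNodes involves, the sketch below assumes the layout described by the OSM PBF format: ids and coordinates are stored as running deltas, and a coordinate in degrees is 1e-9 * (offset + granularity * value). DenseColumns, PlainNode and unpack_dense are hypothetical stand-ins, not this crate's types, and tags and metadata are ignored.

```rust
// Hypothetical stand-in for a decoded node; not the crate's type.
#[derive(Debug, PartialEq)]
struct PlainNode {
    id: i64,
    lat: f64,
    lon: f64,
}

// Hypothetical mirror of the delta-encoded columns of a DenseNodes message.
struct DenseColumns {
    ids: Vec<i64>,
    lats: Vec<i64>,
    lons: Vec<i64>,
}

// Expands the delta-encoded columns into one node per entry.
// Each column holds differences from the previous entry, so we keep
// running sums and convert the summed units to degrees.
fn unpack_dense(
    dense: &DenseColumns,
    granularity: i64,
    lat_offset: i64,
    lon_offset: i64,
) -> Vec<PlainNode> {
    let (mut id, mut lat, mut lon) = (0i64, 0i64, 0i64);
    dense
        .ids
        .iter()
        .zip(&dense.lats)
        .zip(&dense.lons)
        .map(|((d_id, d_lat), d_lon)| {
            id += d_id;
            lat += d_lat;
            lon += d_lon;
            PlainNode {
                id,
                lat: 1e-9 * (lat_offset + granularity * lat) as f64,
                lon: 1e-9 * (lon_offset + granularity * lon) as f64,
            }
        })
        .collect()
}
```

Only one group's columns are expanded at a time here, which matches the point above about holding just the current segment in memory.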
If we were to use the ProcessedElementIterator, it would
look as follows.

use std::path::PathBuf;

// DISTRICT_OF_COLUMBIA is assumed to be a path to an .osm.pbf file.
let path = PathBuf::from(DISTRICT_OF_COLUMBIA);
let iter = ProcessedElementIterator::new(path).expect("Could not create iterator");

// Count the nodes: map each element to 0 or 1, then sum the results.
let nodes = iter.map_red(|item| {
    match item {
        ProcessedElement::Way(_) => 0,
        ProcessedElement::Node(_) => 1,
    }
}, |a, b| a + b, || 0);

Notice that we have non-standard functions available to us.
These differ from the standard map/reduce/... iterator methods. They can be found
in the Parallel trait.
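
To make the three arguments to map_red easier to read, here is a sequential sketch of what such a call is assumed to compute: a map over every element, followed by a reduction that starts from an identity value. This is an assumption about the semantics based on the call shape above, not the trait's actual implementation; map_red_sequential and the enum below are hypothetical stand-ins.

```rust
// Hypothetical stand-in for the crate's ProcessedElement; payloads are bare ids.
#[allow(dead_code)]
enum ProcessedElement {
    Node(i64),
    Way(i64),
}

// Sequential equivalent of the assumed map_red semantics:
// apply `map_op` to each item, then combine the results with `red_op`,
// starting from the value produced by `identity`.
fn map_red_sequential<I, T, M, R, D>(items: I, map_op: M, red_op: R, identity: D) -> T
where
    I: IntoIterator,
    M: Fn(I::Item) -> T,
    R: Fn(T, T) -> T,
    D: Fn() -> T,
{
    items.into_iter().map(map_op).fold(identity(), red_op)
}
```

A parallel version would presumably split the input into chunks, reduce each chunk from its own identity value, and then combine the partial results, which is why the identity is passed as a closure rather than a single value.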
Modules§
- blob - The Blob iterator and item definitions
- block - The Block iterator and item definitions
- element - Element and ProcessedElement iterator, and item definitions
- model - OpenStreetMap Protobuf Definitions
Structs§
Traits§
- Parallel - Defines the set of functions available on a parallel iterator. This allows for more efficient traversal of elements within a file.