Module tree


Module for the output tree structure.

This module provides the types for the tree structure that constitutes the output of the validator. The nodes in the tree are intended to correspond exactly to the protobuf messages, primitives, and YAML values (the latter actually using the JSON object model) that constitute the incoming plan. Likewise, the structure of the tree is the same as the input. However, unlike the input:

  • All nodes and the relations between them are encapsulated in generic types, independent from the corresponding messages/values in the original tree. This allows the tree to be traversed by generic code with no understanding of Substrait.
  • Additional information, such as diagnostic messages and data type information, can be attached to the nodes and edges, or interleaved between the edges.

The node type for the output trees is Node. This structure contains a single NodeType enum variant and an ordered sequence of zero or more NodeData enum variants, which together form the tree structure. NodeType holds information about the node itself, while the NodeData elements represent either edges to other nodes (Child) or contextual information. A subtree might look something like this:

                Node ---> ProtoMessage                   } Parent node
                 |
  .--------------'--------------.
  |         |         |         |
  v         v         v         v
Child  Diagnostic  Comment    Child                      } Edges
  |                             |
  v                             v
Node ---> ProtoPrimitive      Node ---> ProtoMessage     } Child nodes
           |                    |
           '-> PrimitiveData    :
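
The shape in the diagram can be approximated with a small, self-contained model. This is only a sketch: the type names mirror the ones used on this page, but the fields, variant payloads, and the `count_diagnostics` and `example_tree` helpers are hypothetical simplifications, not the crate's actual definitions. It illustrates how generic code can walk the tree without knowing anything about Substrait.

```rust
use std::sync::Arc;

/// Simplified stand-in for NodeType: what the node itself represents.
/// The payloads here are assumptions made for illustration.
#[allow(dead_code)]
#[derive(Debug)]
enum NodeType {
    ProtoMessage(&'static str),
    ProtoPrimitive(&'static str, i64),
}

/// Simplified stand-in for NodeData: an edge to a child node, or
/// contextual information interleaved between the edges.
#[allow(dead_code)]
#[derive(Debug)]
enum NodeData {
    Child(Child),
    Diagnostic(String),
    Comment(String),
}

/// Simplified stand-in for Child: the edge records which field of the
/// parent the child came from (a plain field name in this sketch).
#[allow(dead_code)]
#[derive(Debug)]
struct Child {
    field: &'static str,
    node: Arc<Node>,
}

/// Simplified stand-in for Node: one node type plus ordered node data.
#[allow(dead_code)]
#[derive(Debug)]
struct Node {
    node_type: NodeType,
    data: Vec<NodeData>,
}

/// Counts diagnostics in a subtree using only the generic structure.
fn count_diagnostics(node: &Node) -> usize {
    node.data
        .iter()
        .map(|data| match data {
            NodeData::Child(child) => count_diagnostics(&child.node),
            NodeData::Diagnostic(_) => 1,
            NodeData::Comment(_) => 0,
        })
        .sum()
}

/// Builds a subtree shaped like the diagram: Child, Diagnostic,
/// Comment, Child, where the second child carries its own diagnostic.
fn example_tree() -> Node {
    let leaf = Node {
        node_type: NodeType::ProtoPrimitive("uint32", 42),
        data: vec![],
    };
    let nested = Node {
        node_type: NodeType::ProtoMessage("Inner"),
        data: vec![NodeData::Diagnostic("nested warning".to_string())],
    };
    Node {
        node_type: NodeType::ProtoMessage("Outer"),
        data: vec![
            NodeData::Child(Child { field: "a", node: Arc::new(leaf) }),
            NodeData::Diagnostic("top-level warning".to_string()),
            NodeData::Comment("a note".to_string()),
            NodeData::Child(Child { field: "b", node: Arc::new(nested) }),
        ],
    }
}
```

Calling `count_diagnostics(&example_tree())` on this model yields 2: one diagnostic attached to the parent and one found by descending through the second child edge.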

Note that the Child struct includes information about how the child node relates to its parent (which field, array element, etc) via PathElement, such that the original tree structure could in theory be completely reconstructed.
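
To make the role of the path element concrete, the following sketch shows how a sequence of edges could be turned back into a location in the original plan. The `PathElement` enum here is a hypothetical, reduced stand-in with only two variants, and `render_path` is an illustrative helper; neither is claimed to match the crate's real definitions.

```rust
/// Hypothetical, simplified path element: how a child relates to its
/// parent. Only two kinds of relation are modeled in this sketch.
#[derive(Debug, Clone, PartialEq)]
enum PathElement {
    /// The child is stored in a named field of the parent.
    Field(String),
    /// The child is one element of a named repeated field.
    Repeated(String, usize),
}

/// Renders the edges from the root down to a node as a dotted path.
fn render_path(root: &str, elements: &[PathElement]) -> String {
    let mut out = String::from(root);
    for element in elements {
        match element {
            PathElement::Field(name) => {
                out.push('.');
                out.push_str(name);
            }
            PathElement::Repeated(name, index) => {
                out.push_str(&format!(".{}[{}]", name, index));
            }
        }
    }
    out
}
```

For example, passing the root name `"plan"` with the elements `Repeated("relations", 0)` followed by `Field("rel")` produces `plan.relations[0].rel`. Because every edge carries such an element, the position of any node in the input can be recovered by collecting the elements along the way from the root.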

Nevertheless, the conversion from protobuf/YAML to this tree structure is only intended to be a one-way street; indeed, the output tree is not intended to ever be treated as an executable query plan by a computer at all. It serves only as an intermediate format for documentation, debugging, and/or validation output. The export module deals with breaking this internal representation down further, into (file) formats that are not specific to the Substrait validator.

Structs§

Child
Reference to a child node in the tree.
FlattenedNodeDataIter
FlattenedNodeIter
Node
Node for a semi-structured documentation-like tree representation of a parsed Substrait plan. The intention is for this to be serialized into some human-readable format.
NodeReference
A reference to a node elsewhere in the tree.

Enums§

Class
Semantic information about a node.
NodeData
Information nodes for a parsed protobuf message.
NodeType
The original data type that the node represents, to (in theory) allow the original structure of the plan to be recovered from the documentation tree.