Crate re_types_builder

This crate implements Rerun’s code generation tools.

These tools translate language-agnostic IDL definitions (flatbuffers) into code.

They are invoked by pixi run codegen.

§Organization

The code generation process happens in 4 phases.

§1. Generate binary reflection data from flatbuffers definitions.

All this does is invoke the flatbuffers compiler (flatc) with the right flags in order to generate the binary dumps.

Look for compile_binary_schemas in the code.
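As a rough illustration of what this phase amounts to, the sketch below builds (without running) a flatc command line that dumps a binary reflection schema. The helper name flatc_reflection_command and its parameters are hypothetical. The flags are standard flatc options, but the exact set used by compile_binary_schemas is an assumption and may differ.

```rust
use std::path::Path;
use std::process::Command;

/// Builds (but does not run) a `flatc` invocation that emits a binary
/// reflection schema (`.bfbs`) for `entrypoint` into `out_dir`.
///
/// Hypothetical helper for illustration; not the crate's actual code.
fn flatc_reflection_command(include_dir: &Path, entrypoint: &Path, out_dir: &Path) -> Command {
    let mut cmd = Command::new("flatc");
    cmd.arg("-I") // where to resolve `include` statements from
        .arg(include_dir)
        .arg("-o") // where to write the generated files
        .arg(out_dir)
        .arg("--binary") // emit binary output rather than generated code
        .arg("--schema") // serialize the schema itself (yields a .bfbs file)
        .arg("--bfbs-comments") // keep doc comments in the reflection data
        .arg(entrypoint);
    cmd
}
```

Calling `.status()` on the returned command would actually run the compiler; a non-zero exit status should then abort the build.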

§2. Run the semantic pass.

The semantic pass transforms the low-level raw reflection data generated by the first phase into higher level objects that are much easier to inspect/manipulate and overall friendlier to work with.

Look for objects.rs.

§3. Fill the Arrow registry.

The Arrow registry keeps track of all type definitions and maps them to Arrow datatypes.

Look for type_registry.rs.
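To make the idea concrete, here is a minimal sketch of such a registry. ToyDataType and ToyRegistry are hypothetical stand-ins invented for this example; the real TypeRegistry and the data_type module cover far more cases.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for an Arrow datatype (hypothetical).
#[derive(Debug, Clone, PartialEq)]
enum ToyDataType {
    Float32,
    /// A list whose items all share one datatype.
    List(Box<ToyDataType>),
    /// Named fields, each with its own datatype.
    Struct(Vec<(String, ToyDataType)>),
}

/// Maps fully-qualified type names to their datatypes.
#[derive(Default)]
struct ToyRegistry {
    registry: BTreeMap<String, ToyDataType>,
}

impl ToyRegistry {
    /// Registers `fqname`. Panics if the name was already registered with a
    /// different datatype, since errors are fatal at build time anyway.
    fn register(&mut self, fqname: &str, datatype: ToyDataType) {
        if let Some(previous) = self.registry.insert(fqname.to_owned(), datatype.clone()) {
            assert_eq!(previous, datatype, "conflicting datatypes for {fqname}");
        }
    }

    /// Looks up the datatype registered under `fqname`, if any.
    fn get(&self, fqname: &str) -> Option<&ToyDataType> {
        self.registry.get(fqname)
    }
}
```

Keying on the fully-qualified name means a later pass can resolve a field that refers to another object by name alone.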

§4. Run the actual codegen pass for a given language.

We currently have two codegen passes implemented: Python & Rust.

Codegen passes use the semantic objects from phase two and the registry from phase three in order to generate user-facing code for Rerun’s SDKs.

These passes are intentionally implemented using a very low-tech, no-frills approach (stitch strings together, make liberal use of unimplemented!, etc.) that keeps them flexible in the face of ever-changing needs in the generated code.

Look for codegen/python.rs and codegen/rust.rs.
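The following sketch shows what the string-stitching style can look like in miniature. ToyType, ToyField, quote_type and quote_struct are hypothetical names made up for this example and are much simpler than the real generators.

```rust
/// A tiny, hypothetical set of field types.
#[derive(Debug)]
enum ToyType {
    Float32,
    Utf8,
    Map,
}

/// A field of a toy object.
struct ToyField {
    name: &'static str,
    typ: ToyType,
}

/// Maps a toy type to the Rust type that should appear in generated code.
fn quote_type(typ: &ToyType) -> &'static str {
    match typ {
        ToyType::Float32 => "f32",
        ToyType::Utf8 => "String",
        // Not needed yet: crash loudly instead of guessing.
        other => unimplemented!("no Rust mapping for {other:?} yet"),
    }
}

/// Stitches together the source code of a plain Rust struct.
fn quote_struct(name: &str, fields: &[ToyField]) -> String {
    let mut code = format!("pub struct {name} {{\n");
    for field in fields {
        code.push_str(&format!("    pub {}: {},\n", field.name, quote_type(&field.typ)));
    }
    code.push_str("}\n");
    code
}
```

Supporting a new case means adding one match arm, and anything not yet handled fails the build immediately rather than producing wrong code.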

§Error handling

Keep in mind: this is all build-time code that will never see the light of runtime. There is therefore no need for fancy error handling in this crate: all errors are fatal to the build anyway.

Make sure to crash as soon as possible when something goes wrong, and attach all the appropriate/available context using anyhow’s with_context (e.g. always include the fully-qualified name of the faulty type/field). Do that and you’re good to go.
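The sketch below illustrates the "attach context, then fail" pattern using only the standard library so that it stays self-contained. It is a stand-in: with anyhow, the map_err call would instead be a with_context call from the anyhow::Context trait. The parse_order helper and the attribute it parses are hypothetical.

```rust
use std::num::ParseIntError;

/// Parses the value of a hypothetical `order` attribute, attaching the
/// fully-qualified name of the offending field to any failure.
fn parse_order(fqname: &str, raw: &str) -> Result<u32, String> {
    raw.trim()
        .parse::<u32>()
        // Std-only stand-in for anyhow's `with_context(|| format!(...))`.
        .map_err(|err: ParseIntError| format!("invalid `order` attribute on {fqname}: {err}"))
}
```

A caller at build time would typically unwrap the result right away, so that the message naming the faulty field is the first thing shown when the build fails.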

§Testing

Same comment as with error handling: this code becomes irrelevant at runtime, and so testing it brings very little value.

Make sure to test the behavior of its output though: re_types!

§Understanding the subtleties of affixes

So-called “affixes” are effects applied to objects defined with the Rerun IDL that affect the way these objects behave and interoperate with each other (so, yes, monads. shhh.).

There are 3 distinct and very common affixes used when working with Rerun’s IDL: transparency, nullability and plurality.

Broadly, we can describe these affixes as follows:

  • Transparency allows for bypassing a single layer of typing (e.g. to “extract” a field out of a struct).
  • Nullability specifies whether a piece of data is allowed to be left unspecified at runtime.
  • Plurality specifies whether a piece of data is actually a collection of that same type.
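As one possible illustration, the sketch below shows how nullability and plurality might surface in native Rust types. The quote_field_type helper is hypothetical, and the mapping to Option and Vec is an assumption chosen for the example; the actual generated code depends on the object kind and environment, as discussed next. Transparency is not modeled here.

```rust
/// Renders the Rust type of a field with base type `base`, given two of its
/// affixes. Hypothetical helper: real generators handle many more cases.
fn quote_field_type(base: &str, is_nullable: bool, is_plural: bool) -> String {
    // Plurality: the field holds a collection of the base type.
    let typ = if is_plural {
        format!("Vec<{base}>")
    } else {
        base.to_owned()
    };
    // Nullability: the whole value may be left unspecified at runtime.
    if is_nullable {
        format!("Option<{typ}>")
    } else {
        typ
    }
}
```

Note that even this toy version has to pick an order in which to apply the two affixes: a nullable collection is not the same thing as a collection of nullable items.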

We say “broadly” here because the way these affixes ultimately affect objects in practice will actually depend on the kind of object that they are applied to, of which there are 3: archetypes, components and datatypes.

Not only that, but objects defined in Rerun’s IDL are materialized into 3 distinct environments: IDL definitions, Arrow datatypes and native code (e.g. Rust & Python).

These environments have vastly different characteristics, quirks, pitfalls and limitations, which once again lead to these affixes having different, sometimes surprising behavior depending on the environment we’re interested in. Also keep in mind that Flatbuffers and native code are generally designed around arrays of structures, while Arrow is all about structures of arrays!
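The difference between the two layouts can be sketched as follows. Point, PointColumns and to_columns are hypothetical names used only for this illustration; they are not types from this crate or from Arrow.

```rust
/// Array-of-structures element: how native code usually models a point.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: f32,
    y: f32,
}

/// Structure-of-arrays layout: one contiguous column per field, which is
/// roughly the shape a columnar format like Arrow works with.
#[derive(Debug, Default, PartialEq)]
struct PointColumns {
    xs: Vec<f32>,
    ys: Vec<f32>,
}

/// Converts a slice of points (AoS) into per-field columns (SoA).
fn to_columns(points: &[Point]) -> PointColumns {
    let mut columns = PointColumns::default();
    for p in points {
        columns.xs.push(p.x);
        columns.ys.push(p.y);
    }
    columns
}
```

Serialization code has to perform this kind of transposition in one direction and deserialization in the other, which is where many of the edge cases come from.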

All in all, these interactions between affixes, object kinds and environments lead to a combinatorial explosion of edge cases that can be very confusing when it comes to (de)serialization code, and even API design.

When in doubt, check out the rerun.testing.archetypes.AffixFuzzer IDL definitions, generated code and test suites for definitive answers.

Re-exports§

pub use self::report::Report;
pub use self::report::Reporter;

Modules§

data_type
This is a limited subset of arrow datatypes.
report

Structs§

Attributes
A collection of arbitrary attributes.
CppCodeFormatter
CppCodeGenerator
Docs
A high-level representation of the contents of a flatbuffer docstring.
DocsCodeGenerator
FbsBaseType
FbsEnum
FbsEnumVal
FbsField
FbsKeyValue
FbsObject
FbsSchema
FbsType
Object
A high-level representation of a flatbuffers object, which can be either a struct, a union or an enum.
ObjectField
A high-level representation of a flatbuffers field, which can be either a struct member or a union value.
Objects
The result of the semantic pass: an intermediate representation of all available object types; including structs, enums and unions.
PythonCodeFormatter
PythonCodeGenerator
RustCodeFormatter
RustCodeGenerator
SnippetsRefCodeGenerator
SourceLocations
TypeRegistry
Computes and maintains a registry of DataTypes for specified flatbuffers definitions.

Enums§

ElementType
The underlying element type for arrays/vectors/maps.
ObjectClass
Is this a struct, enum, or union?
ObjectKind
The kind of the object, as determined by its package root (e.g. rerun.components).
Type
The underlying type of an ObjectField.

Constants§

ATTR_ARROW_SPARSE_UNION
ATTR_ARROW_TRANSPARENT
ATTR_CPP_NO_DEFAULT_CTOR
ATTR_CPP_NO_FIELD_CTORS
ATTR_CPP_RENAME_FIELD
ATTR_DEFAULT
ATTR_DOCS_CATEGORY
ATTR_DOCS_UNRELEASED
ATTR_DOCS_VIEW_TYPES
ATTR_NULLABLE
ATTR_ORDER
ATTR_PYTHON_ALIASES
ATTR_PYTHON_ARRAY_ALIASES
ATTR_RERUN_COMPONENT_OPTIONAL
ATTR_RERUN_COMPONENT_RECOMMENDED
ATTR_RERUN_COMPONENT_REQUIRED
ATTR_RERUN_DEPRECATED_NOTICE
ATTR_RERUN_DEPRECATED_SINCE
ATTR_RERUN_LOG_MISSING_AS_EMPTY
ATTR_RERUN_OVERRIDE_TYPE
ATTR_RERUN_SCOPE
ATTR_RERUN_STATE
ATTR_RERUN_VIEW_IDENTIFIER
ATTR_RUST_CUSTOM_CLAUSE
ATTR_RUST_DERIVE
ATTR_RUST_DERIVE_ONLY
ATTR_RUST_NEW_PUB_CRATE
ATTR_RUST_OVERRIDE_CRATE
ATTR_RUST_REPR
ATTR_RUST_TUPLE_STRUCT
ATTR_TRANSPARENT

Traits§

CodeFormatter
Implements the formatting pass.
CodeGenerator
Implements the codegen pass.

Functions§

compile_binary_schemas
Compiles binary reflection dumps from flatbuffers definitions.
compute_re_types_builder_hash
This will automatically emit a rerun-if-changed clause for all the files that were hashed.
compute_re_types_hash
Also triggers a re-build if anything that affects the hash changes.
generate_cpp_code
Generates C++ code.
generate_docs
generate_fbs
Generate flatbuffers definition files.
generate_lang_agnostic
Handles the first 3 language-agnostic passes of the codegen pipeline.
generate_python_code
Generates Python code.
generate_rust_code
Generates Rust code.
generate_snippets_ref
root_as_schema
Verifies that a buffer of bytes contains a Schema and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use root_as_schema_unchecked.

Type Aliases§

GeneratedFiles
In-memory generated files.