This crate implements Rerun’s code generation tools.
These tools translate language-agnostic IDL definitions (flatbuffers) into code.
They are invoked by pixi run codegen.
§Organization
The code generation process happens in 4 phases.
§1. Generate binary reflection data from flatbuffers definitions.
All this does is invoke the flatbuffers compiler (flatc) with the right flags in order to generate the binary dumps.
Look for compile_binary_schemas in the code.
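As a rough illustration of what "invoking flatc with the right flags" can look like, here is a minimal sketch that only builds the command line. The helper name flatc_reflection_command and the exact set of flags are assumptions made for this example; the flags the crate actually passes may differ, so check compile_binary_schemas for the real invocation.

```rust
use std::path::Path;
use std::process::Command;

/// Hypothetical helper: builds (but does not run) a `flatc` invocation that
/// asks for a binary schema dump instead of generated code.
fn flatc_reflection_command(include_dir: &Path, out_dir: &Path, entrypoint: &Path) -> Command {
    let mut cmd = Command::new("flatc");
    cmd.arg("-I")
        .arg(include_dir) // where `include` statements are resolved from
        .arg("-o")
        .arg(out_dir) // where the binary dump is written
        .arg("--binary") // emit binary output...
        .arg("--schema") // ...of the schema itself (reflection data)
        .arg("--bfbs-comments") // keep doc comments in the dump
        .arg(entrypoint);
    cmd
}
```

Calling `.status()` on the returned command would run the compiler; the sketch stops short of that so it has no dependency on flatc being installed.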
§2. Run the semantic pass.
The semantic pass transforms the low-level raw reflection data generated by the first phase into higher level objects that are much easier to inspect/manipulate and overall friendlier to work with.
Look for objects.rs.
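To give a feel for what "lifting" raw reflection data into friendlier objects means, here is a small sketch. RawObject, SemanticObject, Kind and lift are all hypothetical, heavily simplified stand-ins and not the types defined in objects.rs; the only idea borrowed from the crate is that an object's kind is determined by its package root (e.g. rerun.components).

```rust
/// Hypothetical, simplified stand-in for the crate's `ObjectKind`.
#[derive(Debug, PartialEq, Eq)]
enum Kind {
    Archetype,
    Component,
    Datatype,
}

/// Hypothetical raw record, roughly what a reflection dump might give us.
struct RawObject<'a> {
    fqname: &'a str,
    field_names: Vec<&'a str>,
}

/// Hypothetical higher-level object that later passes would work with.
#[derive(Debug)]
struct SemanticObject {
    fqname: String,
    name: String,
    kind: Kind,
    fields: Vec<String>,
}

/// Derives the kind from the package path of a fully-qualified name.
fn kind_from_fqname(fqname: &str) -> Kind {
    for segment in fqname.split('.') {
        match segment {
            "archetypes" => return Kind::Archetype,
            "components" => return Kind::Component,
            "datatypes" => return Kind::Datatype,
            _ => {}
        }
    }
    // Build-time code: crash early, with the offending name attached.
    panic!("cannot determine object kind for {:?}", fqname);
}

/// Turns a raw record into a semantic object.
fn lift(raw: &RawObject<'_>) -> SemanticObject {
    SemanticObject {
        fqname: raw.fqname.to_owned(),
        name: raw.fqname.rsplit('.').next().unwrap_or(raw.fqname).to_owned(),
        kind: kind_from_fqname(raw.fqname),
        fields: raw.field_names.iter().map(|f| format!("{}#{}", raw.fqname, f)).collect(),
    }
}
```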
§3. Fill the Arrow registry.
The Arrow registry keeps track of all type definitions and maps them to Arrow datatypes.
Look for type_registry.rs.
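Conceptually, such a registry is a map from fully-qualified type names to datatypes. The sketch below shows that idea with a toy SketchDataType enum standing in for real Arrow datatypes; SketchRegistry and its methods are hypothetical and do not mirror the API of TypeRegistry.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for an Arrow datatype (assumption: the real registry stores
/// actual Arrow `DataType`s).
#[derive(Debug, Clone, PartialEq)]
enum SketchDataType {
    Float32,
    Utf8,
    List(Box<SketchDataType>),
}

/// Hypothetical registry: fully-qualified name -> datatype.
struct SketchRegistry {
    types: BTreeMap<String, SketchDataType>,
}

impl SketchRegistry {
    fn new() -> Self {
        SketchRegistry { types: BTreeMap::new() }
    }

    /// Registers a datatype, crashing on duplicates since this is build-time code.
    fn register(&mut self, fqname: &str, datatype: SketchDataType) {
        let previous = self.types.insert(fqname.to_owned(), datatype);
        assert!(previous.is_none(), "type {:?} was registered twice", fqname);
    }

    /// Looks a datatype up, crashing with the offending name if it is missing.
    fn get(&self, fqname: &str) -> &SketchDataType {
        self.types
            .get(fqname)
            .unwrap_or_else(|| panic!("type {:?} is not in the registry", fqname))
    }
}
```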
§4. Run the actual codegen pass for a given language.
We currently have two different codegen passes implemented: Python & Rust.
Codegen passes use the semantic objects from phase two and the registry from phase three in order to generate user-facing code for Rerun’s SDKs.
These passes are intentionally implemented using a very low-tech, no-frills approach (stitch strings together, make liberal use of unimplemented, etc.) that keeps them flexible in the face of ever-changing needs in the generated code.
Look for codegen/python.rs and codegen/rust.rs.
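The "stitch strings together" style can be sketched in a few lines. FieldSketch and quote_struct below are hypothetical and far simpler than anything in codegen/rust.rs; they only illustrate the approach of concatenating strings and bailing out with unimplemented! on anything not yet supported.

```rust
/// Hypothetical, minimal description of a field: a name plus an IDL type name.
struct FieldSketch {
    name: &'static str,
    typ: &'static str,
}

/// Emits a Rust struct definition by plain string concatenation.
fn quote_struct(name: &str, fields: &[FieldSketch]) -> String {
    let mut code = String::new();
    code.push_str(&format!("pub struct {} {{\n", name));
    for field in fields {
        let rust_type = match field.typ {
            "float" => "f32",
            "string" => "String",
            // Unsupported cases crash the build loudly, with context attached.
            other => unimplemented!("type {:?} in {}.{}", other, name, field.name),
        };
        code.push_str(&format!("    pub {}: {},\n", field.name, rust_type));
    }
    code.push_str("}\n");
    code
}
```

Adding support for a new type is then a matter of adding one match arm, which is the kind of flexibility the low-tech approach is after.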
§Error handling
Keep in mind: this is all build-time code that will never see the light of runtime. There is therefore no need for fancy error handling in this crate: all errors are fatal to the build anyway.
Make sure to crash as soon as possible when something goes wrong and to attach all the appropriate/available context using anyhow’s with_context (e.g. always include the fully-qualified name of the faulty type/field) and you’re good to go.
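The sketch below shows the spirit of this advice using only the standard library, so that it stays self-contained: map_err plays the role that anyhow’s with_context would play in the crate. The function parse_order_attr and the attribute it parses are made up for the example.

```rust
/// Hypothetical helper: parses the value of an ordering attribute attached to
/// a field, attaching the field's fully-qualified name to any error.
///
/// With anyhow, the equivalent would be roughly:
///   raw.trim().parse::<u32>().with_context(|| format!("..."))
fn parse_order_attr(field_fqname: &str, raw: &str) -> Result<u32, String> {
    raw.trim().parse::<u32>().map_err(|err| {
        format!(
            "invalid order attribute {:?} on field {}: {}",
            raw, field_fqname, err
        )
    })
}
```

A caller in build-time code would typically unwrap the result right away, so the build fails immediately with a message that names the faulty field.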
§Testing
Same comment as with error handling: this code becomes irrelevant at runtime, and so testing it brings very little value.
Make sure to test the behavior of its output though: re_types!
§Understanding the subtleties of affixes
So-called “affixes” are effects applied to objects defined with the Rerun IDL and that affect the way these objects behave and interoperate with each other (so, yes, monads. shhh.).
There are 3 distinct and very common affixes used when working with Rerun’s IDL: transparency, nullability and plurality.
Broadly, we can describe these affixes as follows:
- Transparency allows for bypassing a single layer of typing (e.g. to “extract” a field out of a struct).
- Nullability specifies whether a piece of data is allowed to be left unspecified at runtime.
- Plurality specifies whether a piece of data is actually a collection of that same type.
We say “broadly” here because the way these affixes ultimately affect objects in practice will actually depend on the kind of object that they are applied to, of which there are 3: archetypes, components and datatypes.
Not only that, but objects defined in Rerun’s IDL are materialized into 3 distinct environments: IDL definitions, Arrow datatypes and native code (e.g. Rust & Python).
These environments have vastly different characteristics, quirks, pitfalls and limitations, which once again lead to these affixes having different, sometimes surprising behavior depending on the environment we’re interested in. Also keep in mind that Flatbuffers and native code are generally designed around arrays of structures, while Arrow is all about structures of arrays!
All in all, these interactions between affixes, object kinds and environments lead to a combinatorial explosion of edge cases that can be very confusing when it comes to (de)serialization code, and even API design.
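To make the arrays-of-structures versus structures-of-arrays distinction concrete, here is a small sketch that turns nullable native rows into columns. Point, PointColumns and to_columns are hypothetical; the layout (one validity flag per row, with placeholder values in the columns for null rows) is a simplification of how columnar formats such as Arrow represent nullable structs, not a description of Rerun's serialization code.

```rust
/// Native, row-oriented view: one struct per element.
#[derive(Clone, Copy)]
struct Point {
    x: f32,
    y: f32,
}

/// Columnar view: one array per field, plus a validity flag per row.
struct PointColumns {
    validity: Vec<bool>,
    xs: Vec<f32>,
    ys: Vec<f32>,
}

/// Converts an array of (nullable) structures into a structure of arrays.
fn to_columns(rows: &[Option<Point>]) -> PointColumns {
    let mut cols = PointColumns {
        validity: Vec::new(),
        xs: Vec::new(),
        ys: Vec::new(),
    };
    for row in rows {
        match row {
            Some(p) => {
                cols.validity.push(true);
                cols.xs.push(p.x);
                cols.ys.push(p.y);
            }
            None => {
                // A null row still occupies a slot in every column.
                cols.validity.push(false);
                cols.xs.push(0.0);
                cols.ys.push(0.0);
            }
        }
    }
    cols
}
```

Note how nullability, which is a property of each element in the native view, ends up as a separate column in the columnar view; this is one reason the same affix can behave so differently across environments.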
When in doubt, check out the rerun.testing.archetypes.AffixFuzzer IDL definitions, generated code and test suites for definitive answers.
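As a very rough intuition for how nullability and plurality can compose in native Rust code, consider the sketch below. The function native_type and the mapping it implements (plural becomes Vec, nullable becomes Option, applied in that order) are assumptions chosen for illustration; the code the generator actually emits depends on the object kind and may differ, which is exactly why the AffixFuzzer definitions are the reference.

```rust
/// Hypothetical mapping from affixes to a native Rust type, as a string.
fn native_type(element: &str, is_plural: bool, is_nullable: bool) -> String {
    let mut typ = element.to_owned();
    if is_plural {
        // Plurality: the data is a collection of that same type.
        typ = format!("Vec<{}>", typ);
    }
    if is_nullable {
        // Nullability: the data may be left unspecified at runtime.
        typ = format!("Option<{}>", typ);
    }
    typ
}
```

Even in this toy version, three flags on a single field already give several distinct shapes, before object kinds and environments are taken into account.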
Re-exports§
Modules§
Structs§
- Attributes
  A collection of arbitrary attributes.
- CppCodeFormatter
- CppCodeGenerator
- Docs
  A high-level representation of the contents of a flatbuffer docstring.
- DocsCodeGenerator
- FbsBaseType
- FbsEnum
- FbsEnumVal
- FbsField
- FbsKeyValue
- FbsObject
- FbsSchema
- FbsType
- Object
  A high-level representation of a flatbuffers object, which can be either a struct, a union or an enum.
- ObjectField
  A high-level representation of a flatbuffers field, which can be either a struct member or a union value.
- Objects
  The result of the semantic pass: an intermediate representation of all available object types; including structs, enums and unions.
- PythonCodeFormatter
- PythonCodeGenerator
- RustCodeFormatter
- RustCodeGenerator
- SnippetsRefCodeGenerator
- SourceLocations
- TypeRegistry
  Computes and maintains a registry of DataTypes for specified flatbuffers definitions.
Enums§
- ElementType
  The underlying element type for arrays/vectors/maps.
- ObjectClass
  Is this a struct, enum, or union?
- ObjectKind
  The kind of the object, as determined by its package root (e.g. rerun.components).
- Type
  The underlying type of an ObjectField.
Constants§
- ATTR_ARROW_SPARSE_UNION
- ATTR_ARROW_TRANSPARENT
- ATTR_CPP_NO_DEFAULT_CTOR
- ATTR_CPP_NO_FIELD_CTORS
- ATTR_CPP_RENAME_FIELD
- ATTR_DEFAULT
- ATTR_DOCS_CATEGORY
- ATTR_DOCS_UNRELEASED
- ATTR_DOCS_VIEW_TYPES
- ATTR_NULLABLE
- ATTR_ORDER
- ATTR_PYTHON_ALIASES
- ATTR_PYTHON_ARRAY_ALIASES
- ATTR_RERUN_COMPONENT_OPTIONAL
- ATTR_RERUN_COMPONENT_RECOMMENDED
- ATTR_RERUN_COMPONENT_REQUIRED
- ATTR_RERUN_DEPRECATED_NOTICE
- ATTR_RERUN_DEPRECATED_SINCE
- ATTR_RERUN_LOG_MISSING_AS_EMPTY
- ATTR_RERUN_OVERRIDE_TYPE
- ATTR_RERUN_SCOPE
- ATTR_RERUN_STATE
- ATTR_RERUN_VIEW_IDENTIFIER
- ATTR_RUST_CUSTOM_CLAUSE
- ATTR_RUST_DERIVE
- ATTR_RUST_DERIVE_ONLY
- ATTR_RUST_NEW_PUB_CRATE
- ATTR_RUST_OVERRIDE_CRATE
- ATTR_RUST_REPR
- ATTR_RUST_TUPLE_STRUCT
- ATTR_TRANSPARENT
Traits§
- CodeFormatter
  Implements the formatting pass.
- CodeGenerator
  Implements the codegen pass.
Functions§
- compile_binary_schemas
  Compiles binary reflection dumps from flatbuffers definitions.
- compute_re_types_builder_hash
  This will automatically emit a rerun-if-changed clause for all the files that were hashed.
- compute_re_types_hash
  Also triggers a re-build if anything that affects the hash changes.
- generate_cpp_code
  Generates C++ code.
- generate_docs
- generate_fbs
  Generate flatbuffers definition files.
- generate_lang_agnostic
  Handles the first 3 language-agnostic passes of the codegen pipeline.
- generate_python_code
  Generates Python code.
- generate_rust_code
  Generates Rust code.
- generate_snippets_ref
- root_as_schema
  Verifies that a buffer of bytes contains a Schema and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use root_as_schema_unchecked.
Type Aliases§
- GeneratedFiles
  In-memory generated files.