1. cdefmt
cdefmt ("c de format", short for "c deferred formatting") is a highly efficient logging framework that targets resource-constrained devices, like microcontrollers.
Inspired by https://github.com/knurling-rs/defmt/
2. Table of Contents
3. What is this?
The idea is simple, we want to spend as little time as possible on creating logs, and have them take up as little space as possible.
cdefmt achieves that through a multi phased approach:
- Encode as much information as possible at compile time
Done via the macros in
cdefmt/include/cdefmt.hand some modifications to the compilation flags/linkerscript. All of the log strings, parameter format and size information are encoded at compile time into the binary, these can be stripped from the shipped binary. - Have minimal information encoded into each log - only an ID and the raw values of the arguments.
- Defer the formatting logic to post processing Performed by the rust library cdefmt-decoder, using information from the originally compiled binary to parse the ID and arguments into a full string.
For more technical details refer to Technical Details
4. Usage
Before using the logging framework, there are some setup steps necessary to integrate the encoding logic into your compiled binary.
Once that's done, you can use the cdefmt-decoder library to decode your logs.
4.1. Example
A working example is provided under:
- stdout generates the logs and prints them to
stdoutIfstdoutis aFIFO(aka shell pipe|), the output format changes to a more compact, easy to parse, but not human understandable format. - stdin takes the logs from it's
stdinand, using the original binary file, parses them into proper logs. It expects the input to be formatted in the same way asstdoutoutputs inFIFOmode.
The easiest way to run the example would be to build the project using cmake:
Then run example-stdout and pipe it's stdout into the example-stdin, while providing example-stdin a path to the originally compiled binary of example-stdout:
|
To further complicate matters, it's possible to strip example-stdout and run the stripped binary, logs will still work:
# Remove all symbols and debugging information from example-stdout
# Remove `.cdefmt` section from the stripped binary
# Run the example (note that the parser still needs the original un-stripped binary)
|
This means that it's possible to ship stripped binaries, while using the originals to parse the logs.
The stripping process removes the following from the binary:
- log strings
- debugging information
- symbol table
This has 2 advantages:
- reduced size
- obfuscation
4.2. Setup
4.2.1. Dependencies
cdefmt is a header only library, implemented using mostly macros, however, it's dependent on the boost preprocessor and VMD libraries.
These are header only libraries compatible with c and c++.
If working with CMake, and linking with the cdefmt library, you should automatically get the required dependencies.
Otherwise you'll have to provide your own, using your build system.
4.2.2. Integration
-
cdefmt encodes the log strings along with some metadata into a special section in the elf binary, we need to modify the project's linker script to generate that section: Add the cdefmt metadata section to the end of the linker script (or right before
/DISCARD/if it exists):/* CDEFMT: log metadata section */ .cdefmt 0 (INFO) : { KEEP(*(.cdefmt.init .cdefmt.init.*)) . = . + 8; KEEP(*(.cdefmt .cdefmt.*)) } -
cdefmt uses the GNU build-id to uniquely identify an elf file, and validate the compatibility of the parsed logs with the supplied elf: Update (or add if it doesn't exist) the
.note.gnu.build-idsection:.note.gnu.build-id : { PROVIDE(__cdefmt_build_id = .); *(.note.gnu.build-id) } -
Compile with the following flags:
-g- cdefmt uses debugging information to parse log arguments.-Wl,--build-id- link the build-id into the binary.
-
Copy the header
cdefmt_config.hto your project, place it in aconfigdirectory, and make sure that your project has it in it's include path.Example project structure:
project_root ├── src │ └── main.c └── include ├── config/cdefmt_config.h └── some_header.hMake sure that
proect_root/includeis your project's include path, ascdefmt.his looking for aconfig/cdefmt_config.hinclude file. -
Update your copy of
cdefmt_config.hto reflect your desired configuration. -
Include
cdefmt.hto use the logger. -
Implement cdefmt_log to forward the logs to your logging backend.
-
Call CDEFMT_GENERATE_INIT outside of a function scope.
-
Call cdefmt_init after your logging backend is initialized and
cdefmt_logcan be safely called. -
Enjoy.
See example project for reference.
5. Technical Details
We'll follow the process of generating and parsing/formatting a single log.
5.1. Generation
Starting off with a simple log taking two arguments:
;
This macro is expanded by the preprocessor into
1 do while ;
Let's go over each line and explain what's happening there:
- Standard
do { ... } while (0)macro mechanism:- Creates scope for local variables
- Requires user to add a semicolon after the macro call
- Protects against
if (...) macro_call(); else ...issues
- Define a stract type:
- This struct will hold the metadata for this log.
- The struct is packed to minimize the metadata size.
- The struct is static, allowing the compiler to assign it a permanent address, rather than placing it on the stack.
- The struct is const to ensure immutability.
- Define the metadata version (for future compatibility).
- Define a unique counter, the value is unique per log in each compilation unit. There might be multiple logs with the same value in different compilation units. This value will be used to identify the log when decoding the logs.
- Define the line number where the log is defined.
- Define the length of the filename string - used when parsing the metadata structure.
- Define the length of the format string - used when parsing the metadata structure.
- Define the amount of named arguments profided - used when parsing the metadata structure.
- Define the level of the log.
- Define the filename where the log is defined - this is used later to find the compilation unit.
- Define the format string.
- Define a sub-struct - containing the names of the arguments.
- Define a sub struct - containing the name of the first argument.
- Define the length of the first argument's name.
- Define the first argument's name.
- Close the struct.
- Define a sub struct - containing the name of the second argument.
- Define the length of the second argument's name.
- Define the second argument's name.
- Close the struct.
- Close the names struct.
- Name the metadata variable, and place it into the
.cdefmt.metadatasection.- Notice how the variable's name ends with
50, the same value that will be assigned to thecounterfield.
- Notice how the variable's name ends with
- Assign the version, currently it's 1.
- Assign the counter value.
- Assign the log line.
- Assign the filename length.
- Assign the format string length.
- Assign the amount of arguments.
- Assign the log's level (2 = INFO).
- Assign the filename.
- Assign the format string.
- Assign the names of the arguments.
- Assign the name of the first argument.
- Assign the length of the name.
- Assign the name itself.
- Closing scope.
- Assign the name of the second argumnet.
- Assign the length of the name.
- Assign the name itself.
- Closing scope.
- Closing scope.
- Closing scope.
- Define a struct for holding the log information that will be sent over the wire.
- The struct is packed to save space.
- Log ID - pointer to the log metadata structure.
- First argument.
- Second argument.
- Array for holding dynamically sized arguments.
- Closing scope.
- Assign dynamic size to 0, as we don't have any dynamic arguments.
- Create an instance of the log args structure.
- Assign the log_id, to the address of the metadata structure.
- Assign the dynamic offset to 0, we can increment it later if there's any dynamic arguments.
- See 1.
- Copy the first argument into the log args structure.
- See 1.
- See 1.
- Copy the second argument into the log args structure.
- See 1.
- Send the log args structure to the log function.
- Notice that the length should calculate the size of the log args structure, which in this case is
sizeof(log_args_t)minus the entire dynamic data as there's no dynamic data.
- Notice that the length should calculate the size of the log args structure, which in this case is
- Placeholder in case user opts in for a static/heap log buffer rather than a stack based one.
- See 1
⚠️ Notice how we don't access the log metadata structure at any time during runtime, only its address.
5.2. Metadata
Let's look at how the necessary metadata is stored in the binary.
5.2.1. linker script .cdefmt section
As we've seen in the previous section, the log metadata is stored in a special .cdefmt section, this section is manually defined in the linker script:
1. /* CDEFMT: log metadata section */
2. .cdefmt 0 (INFO) : {
3. KEEP(*(.cdefmt.init .cdefmt.init.*))
4. . = . + 8;
5. KEEP(*(.cdefmt .cdefmt.*))
6. }
- Comment
- Define a new section:
parameter explanation .cdefmtSection name 0Place the section at address 0.All the logs we insert into this section will start from address 0 (INFO)Set the section type as not allocatable, so that no memory is allocated for the section when the program is run. - Put the
.cdefmt.initinputs into this section, this comes first as we want the init log to be uniquely placed at address 0. - Advance the location by 8, this ensures that even if the init log wasn't compiled, none of the other logs will ever have id 0.
- Put the
.cdefmtinputs into this section, this is where all the log strings/metadata will be stored.
To gain a better understanding I recommend reading through the binutils ld docs.
When the project is compiled and linked, each log metadata struct is assigned a unique address by the linker, this address can be used as a unique ID to identify the log.
The compilation process embeds these addresses into the places that reference the log metadata, because, being marked INFO and static, they don't depend on any runtime linkage, and are hardcoded by the linker, this means we can completely remove the .cdefmt section from the shipped binary, without affecting any of the functionality.
Now we know where the metadata strings are stored, and how we can extract them using the log ID, however we still don't know how to parse the log's arguments, as they're in a packed struct and can have any size, type or structure. The next section will discuss exactly that.
5.2.2. Debugging information
As we've seen previously, the log's arguments are copied into a packed struct, we need to know it's structure to be able to parse the binary blob back into usable data.
This is where debugging information comes in handy.
One of the requirements of this framework is compiling the code with -g, this instructs the compiler/linker to generate debugging information and embed it into the elf.
ELF uses the DWARF debugging format to store the debugging information, we can use that to our advantage.
Short reminder from the Generation section, for each log we're doing 3 things that are useful here:
- Defining a new type:
struct cdefmt_log_args_t<counter>. - Inserting
counterinto the log's metadata string. - Inserting the source file path into the log's metadata string.
The compiler will generate debugging information that describes this new type, and will embed it into the binary.
Using the counter value, and the source file path, we can find the specific compilation unit that defines the structure of the log args struct, there we can find the debugging information to figure out how to correctly parse the log's arguments from a binary blob.
Since the debugging information is only needed at parsing time, it can be stripped from the shipped binary.
5.3. Parsing
All that's left is to parse the binary blobs generated by the logger:
- Load the original, unstripped binary - there we find the
.cdefmtsection and the debugging information. - Extract the log id from the binary blob. Since we have the elf, we know the size of a pointer in the target, so we know the size of the log id.
- Find the log metadata in the
.cdefmtsection, and parse the binary structure. - Use the counter and source file to find the debugging information describing the log's arguments.
- Parse the remaining fields of the log args structure.
- Format the log string with the parsed arguments.
- Profit :D
6. License
- MIT license (LICENSE or http://opensource.org/licenses/MIT)