# prototools
A collection of protobuf utilities written in Rust.
## Tools
### `prototext`
Lossless, bidirectional converter between binary protobuf wire format and an
annotated text representation.
**The core guarantee:** `binary → text → binary` is byte-for-byte identical for
any input — well-formed, malformed, non-canonical, or schema-unknown.
#### Text format
The text side is a superset of the
[protobuf text format](https://protobuf.dev/reference/protobuf/textformat-spec/)
as produced by `protoc`. Every field line carries an inline annotation comment
that encodes enough information to reconstruct the exact binary bytes on
re-encoding:
```
#@ prototext: protoc
doubleOp: 2.718 #@ optional double = 21;
floatRp: [1.5, 2.5, 3.5] #@ repeated float [packed=true] = 42;
messageOp { #@ optional SwissArmyKnife = 31;
int32Op: 200 #@ optional int32 = 25;
}
GroupOp { #@ GROUP; optional GroupOp = 30;
uint64Op: 111 #@ optional uint64 = 130;
}
999: 12345 #@ VARINT;
stringOp: "hello" #@ optional string = 29; tag_overhang_count: 2;
99: "\001\002" #@ TRUNCATED_BYTES; missing_bytes_count: 5; optional bytes = 99;
type: TYPE_STRING #@ Type(9) = 5
type: 99 #@ Type(99) = 5; ENUM_UNKNOWN
```
The annotation format is documented in
[`docs/annotation-format.md`](docs/annotation-format.md) (grammar reference
with annotated examples and a proposed v2 format).
Annotations can be suppressed with `--no-annotations`; the output is then
mostly\* compatible with `protoc --decode` but cannot be re-encoded losslessly.
#### Schema
A compiled `.pb` descriptor and a root message type can be provided to resolve
field names and proto types. Without a schema every field is treated as unknown
and rendered by field number.
`google.protobuf.*` types are available without supplying a descriptor (embedded
at compile time).
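To illustrate what "rendered by field number" relies on: every protobuf field starts with a tag whose low three bits hold the wire type and whose remaining bits hold the field number. The sketch below splits an already-decoded tag value into those two parts. The names `WireType` and `split_tag` are hypothetical and are not taken from the `prototext` source. Returning `None` for wire types 6 and 7 is a simplification; a lossless tool would have to preserve such input rather than reject it.

```rust
/// Wire types defined by the protobuf encoding specification.
#[derive(Debug, PartialEq)]
enum WireType {
    Varint,     // 0
    I64,        // 1
    Len,        // 2
    StartGroup, // 3
    EndGroup,   // 4
    I32,        // 5
}

/// Split a decoded tag value into (field number, wire type).
/// Hypothetical helper for illustration only.
fn split_tag(tag: u64) -> Option<(u64, WireType)> {
    let wire_type = match tag & 0x7 {
        0 => WireType::Varint,
        1 => WireType::I64,
        2 => WireType::Len,
        3 => WireType::StartGroup,
        4 => WireType::EndGroup,
        5 => WireType::I32,
        _ => return None, // 6 and 7 are not assigned by the specification
    };
    Some((tag >> 3, wire_type))
}

fn main() {
    // 0x08 is the tag for field 1 with wire type VARINT.
    assert_eq!(split_tag(0x08), Some((1, WireType::Varint)));
    // Field 999 with wire type VARINT, as in a schemaless `999: 12345` line.
    assert_eq!(split_tag(999 << 3), Some((999, WireType::Varint)));
    // Wire type 7 is unassigned.
    assert_eq!(split_tag(0x0f), None);
}
```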
#### Usage
```
prototext -d [-D descriptor.pb -t pkg.Message] [FILE ...] # binary → text
prototext -e [FILE ...] # text → binary
```
Key flags:

| Flag | Description |
| --- | --- |
| `-d` / `--decode` | Binary → text (exclusive with `-e`) |
| `-e` / `--encode` | Text → binary (exclusive with `-d`) |
| `-D` / `--descriptor PATH` | Compiled `.pb` descriptor file |
| `-t` / `--type NAME` | Root message type (e.g. `pkg.MyMessage`) |
| `--no-annotations` | Suppress inline annotations (not round-trippable) |
| `-o PATH` | Write output to file (single input) |
| `-O DIR` | Output root directory (batch mode) |
| `-I DIR` | Input root directory |
| `-i` / `--in-place` | Rewrite each input file in place |
| `-q` / `--quiet` | Suppress warnings |
#### Install
From [crates.io](https://crates.io/crates/prototext) (once published):
```
cargo install prototext
```
From the GitHub repository:
```
cargo install --git https://github.com/douzebis/prototools prototext
```
Both commands install the `prototext` binary into `~/.cargo/bin/` by default.
#### Quick start
The examples below use two tiny fixture files from `fixtures/cases/` in the
cloned repository. No schema file is needed — `prototext` decodes them
schemalessly, rendering each field by wire type and field number.
**Canonical encoding** — `fixtures/cases/qs_canonical.pb` holds the text
representation of a single varint field:
```
$ cat fixtures/cases/qs_canonical.pb
#@ prototext: protoc
1: 42 #@ varint
```
Encode to binary, inspect the two bytes, then round-trip back to text:
```
$ prototext -e fixtures/cases/qs_canonical.pb | od -A n -t x1
08 2a
$ prototext -e fixtures/cases/qs_canonical.pb | prototext -d
#@ prototext: protoc
1: 42 #@ varint
```
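The two bytes `08 2a` can be reproduced by hand from the protobuf wire format: a tag of `(field_number << 3) | wire_type`, followed by the value as a varint. The following is a minimal sketch of that arithmetic, not the `prototext` implementation; `encode_varint` and `encode_tag` are hypothetical helper names.

```rust
/// Encode `value` as a canonical (shortest-form) protobuf varint.
/// Hypothetical helper for illustration only.
fn encode_varint(mut value: u64) -> Vec<u8> {
    let mut out = Vec::new();
    loop {
        let byte = (value & 0x7f) as u8; // low 7 payload bits
        value >>= 7;
        if value == 0 {
            out.push(byte); // last byte: continuation bit clear
            return out;
        }
        out.push(byte | 0x80); // more bytes follow: continuation bit set
    }
}

/// Build a field tag: (field_number << 3) | wire_type, encoded as a varint.
fn encode_tag(field_number: u32, wire_type: u8) -> Vec<u8> {
    encode_varint((u64::from(field_number) << 3) | u64::from(wire_type))
}

fn main() {
    // Field 1, wire type 0 (VARINT), value 42.
    let mut bytes = encode_tag(1, 0);
    bytes.extend(encode_varint(42));
    assert_eq!(bytes, [0x08, 0x2a]);

    let hex: Vec<String> = bytes.iter().map(|b| format!("{b:02x}")).collect();
    println!("{}", hex.join(" "));
}
```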
**Non-canonical encoding** — `fixtures/cases/qs_noncanonical.pb` encodes the
same value with one redundant continuation byte (`val_ohb: 1`):
```
$ cat fixtures/cases/qs_noncanonical.pb
#@ prototext: protoc
1: 42 #@ varint; val_ohb: 1
```
The annotation tells the encoder to preserve the extra byte. The binary is
three bytes instead of two — same value, different encoding:
```
$ prototext -e fixtures/cases/qs_noncanonical.pb | od -A n -t x1
08 aa 00
$ prototext -e fixtures/cases/qs_noncanonical.pb | prototext -d
#@ prototext: protoc
1: 42 #@ varint; val_ohb: 1
```
Both fixtures round-trip cleanly: `text → binary → text` reproduces the original
text byte for byte, even for the non-canonical encoding.
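The difference between `2a` and `aa 00` can be sketched in a few lines. A redundant byte is added by setting the continuation bit on the current last byte and appending a zero-payload byte. A decoder can recover the count by comparing the encoded length with the shortest possible length. This is an illustration under the assumption that `val_ohb` counts such redundant bytes, as the example above suggests. The function names are hypothetical and the code is not taken from `prototext`.

```rust
/// Encode `value` as a varint, then append `overhang` redundant bytes.
/// Hypothetical helper; assumes `val_ohb: N` means N redundant bytes.
fn encode_varint_padded(mut value: u64, overhang: usize) -> Vec<u8> {
    let mut out = Vec::new();
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            break;
        }
        out.push(byte | 0x80);
    }
    for _ in 0..overhang {
        // Mark the current last byte as "more follows", then add a zero payload byte.
        let last = out.len() - 1;
        out[last] |= 0x80;
        out.push(0x00);
    }
    out
}

/// Decode one varint; returns (value, redundant byte count, bytes consumed).
/// Returns None if the input ends early or exceeds 10 bytes.
fn decode_varint(bytes: &[u8]) -> Option<(u64, usize, usize)> {
    let mut value = 0u64;
    let mut significant = 0; // 1-based position of the last nonzero payload byte
    for (i, &b) in bytes.iter().enumerate().take(10) {
        let payload = u64::from(b & 0x7f);
        value |= payload << (7 * i);
        if payload != 0 {
            significant = i + 1;
        }
        if b & 0x80 == 0 {
            let len = i + 1;
            let canonical_len = significant.max(1); // zero still needs one byte
            return Some((value, len - canonical_len, len));
        }
    }
    None
}

fn main() {
    assert_eq!(encode_varint_padded(42, 0), [0x2a]);
    assert_eq!(encode_varint_padded(42, 1), [0xaa, 0x00]);
    assert_eq!(decode_varint(&[0x2a]), Some((42, 0, 1)));
    assert_eq!(decode_varint(&[0xaa, 0x00]), Some((42, 1, 2)));
}
```

Both encodings decode to 42; only the redundant-byte count differs, which is the information the annotation has to carry for the text to re-encode to the same bytes.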
#### Shell completion
```bash
# bash (workaround for known clap_complete path-completion bugs):
source <(PROTOTEXT_COMPLETE=bash prototext | sed \
-e '/^\s*) )$/a\ compopt -o filenames 2>/dev/null' \