hhh 1.0.1

The hhh Binary File Processor
Documentation
# Motivation

**hhh** is a tool for manipulating binary files.  Specifically, it can do the following.

  * Generate a hex dump from a file ([described here](writing.md))
  * Generate a file from a binary description ([described here](reading.md))

Many other tools do this, but **hhh** was built to address specific needs in manipulating and constructing binary files.  If you want a nice, interactive hex editor, [010][] is a *great* one and is recommended.

**hhh** is a command-line tool and library, intended to be an easy-to-use hexadecimal processor that keeps simple things very simple but also makes fairly complex things possible.  For example, you can use hex, decimal, octal, binary, and UTF-8 strings to create a binary file.  Still, **hhh** is not a programming language, so the *really* complex stuff is beyond what it can do.  **hhh** aims for a sweet spot between existing hex editors and writing code to manipulate files.

It probably isn't for everyone.  Still, if you are into patching binaries, creating ELF files from scratch, or modifying the headers of PE files, this might be for you.  If you just want a program to create and read hex dumps, this is also for you.  Enjoy!

## Performance

On my computer, the [Pandoc][] executable is currently 172,058,352 bytes long.  On this system (x86-64 Linux 6.2.0-32, Intel Core i7-1260P), **hhh** requires about 3 seconds to generate a complete hex dump of this file without radix prefixes, and the parser requires about 15 seconds to regenerate the binary from the hex dump (which is about 7 times larger than the original file).

``` bash
$ time hhh $(which pandoc) -o pandoc.hex

real	0m2.863s
user	0m2.356s
sys	0m0.507s

$ time hhh -p pandoc.hex -o pandoc.bin

real	0m14.971s
user	0m14.468s
sys	0m0.502s
```

Using radix prefixes increases the generation time only slightly (to around 4 seconds) but approximately doubles the parsing time needed to turn the hex dump back into a binary.  This makes some sense, since the hex dump with prefixes is approximately twice the size of the one without.

On the same machine, `hexdump` requires about 7 seconds to generate the hex dump.  Generating a hex dump with `xxd` requires only 2 seconds, and generating a binary from that hex dump requires about 3.  Of course, neither tool has the crazy array of options present in **hhh**.

The bottom line is that performance of **hhh** is *good*, but *could possibly be improved*.

One reason for the difference is that **hhh** allows specifying file offsets *out of order* when parsing a hex dump (among other things).  This means it must build an internal representation of the file and then generate the final binary from the representation, a step the others do not really need.
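
To make the extra step concrete, here is a minimal Python sketch of the general idea.  It is *not* the actual **hhh** implementation: the `assemble` function, its `(offset, data)` chunk format, and the `fill` parameter are all hypothetical names chosen for illustration.  The sketch only shows why out-of-order offsets call for an intermediate representation before any output can be written.

```python
def assemble(chunks, fill=0x00):
    """Build a binary image from (offset, data) pairs given in any order.

    Hypothetical helper for illustration; not part of hhh.
    """
    # Internal representation: a sparse map from absolute offset to byte value.
    image = {}
    for offset, data in chunks:
        for i, byte in enumerate(data):
            image[offset + i] = byte  # a later chunk overwrites an earlier one

    if not image:
        return b""

    # Final pass: flatten the sparse map into one contiguous byte string,
    # padding any offsets that were never written with the fill byte.
    size = max(image) + 1
    out = bytearray([fill]) * size
    for position, byte in image.items():
        out[position] = byte
    return bytes(out)


chunks = [
    (4, b"\xde\xad"),  # listed first, but belongs at offset 4
    (0, b"\x7fELF"),   # listed second, belongs at offset 0
]
print(assemble(chunks).hex())  # prints 7f454c46dead
```

A tool that only accepts offsets in ascending order can stream bytes straight to the output as it reads them.  Once offsets may arrive in any order, the final size and content are not known until the whole input has been read, so some form of buffering like the map above is hard to avoid.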

## Automated Testing

The examples given in this book are automatically tested using a script found in the `etc` folder of the distribution: `hhh_doc_tests.py`.  See that script for details.


[010]: https://www.sweetscape.com/010editor/
[Pandoc]: https://pandoc.org/