MalwareDB Types
This crate contains the logic for parsing several executable and document formats, and for determining whether a Zip file is an MS Office document or a plain archive of files.
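As a rough illustration of the Zip check, the sketch below classifies a Zip file from its entry names alone. It relies on the fact that Office Open XML documents carry a `[Content_Types].xml` entry plus a `word/`, `xl/`, or `ppt/` directory. The names `ZipKind` and `classify_zip_entries` are hypothetical and are not this crate's API; reading the entry names out of the Zip central directory is assumed to happen elsewhere.

```rust
/// Coarse classification of a Zip container (hypothetical type, not this crate's API).
#[derive(Debug, PartialEq, Eq)]
pub enum ZipKind {
    Word,
    Excel,
    PowerPoint,
    /// Has `[Content_Types].xml` but no recognised Office directory.
    OtherOpenPackaging,
    /// No Office markers found: treat as a generic archive of files.
    Archive,
}

/// Classify a Zip file from the entry names listed in its central directory.
pub fn classify_zip_entries<'a, I>(names: I) -> ZipKind
where
    I: IntoIterator<Item = &'a str>,
{
    let mut has_content_types = false;
    let mut kind: Option<ZipKind> = None;

    for name in names {
        if name == "[Content_Types].xml" {
            has_content_types = true;
        } else if name.starts_with("word/") {
            // Keep the first Office directory seen.
            kind = kind.or(Some(ZipKind::Word));
        } else if name.starts_with("xl/") {
            kind = kind.or(Some(ZipKind::Excel));
        } else if name.starts_with("ppt/") {
            kind = kind.or(Some(ZipKind::PowerPoint));
        }
    }

    match (has_content_types, kind) {
        (true, Some(k)) => k,
        (true, None) => ZipKind::OtherOpenPackaging,
        // A `word/` directory without `[Content_Types].xml` is just an archive.
        (false, _) => ZipKind::Archive,
    }
}
```

Requiring both markers keeps an ordinary archive that happens to contain a `word/` folder from being reported as a document.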
Executable Types:
- ELF (feature flag `elf`, default)
- Mach-O and Fat Mach-O (feature flag `macho`, default)
- PE32 (feature flag `pe32`, default)
- PEF (feature flag `pef`)
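A minimal sketch of how Cargo feature flags like these can gate parsers at compile time. The function `enabled_formats` is hypothetical and only illustrates the mechanism; how this crate actually wires its features internally is an assumption.

```rust
/// List the executable formats compiled into this build (hypothetical helper).
/// `cfg!(feature = "...")` evaluates to `true` only when that Cargo feature is enabled.
pub fn enabled_formats() -> Vec<&'static str> {
    let mut formats = Vec::new();
    if cfg!(feature = "elf") {
        formats.push("ELF");
    }
    if cfg!(feature = "macho") {
        formats.push("Mach-O");
    }
    if cfg!(feature = "pe32") {
        formats.push("PE32");
    }
    if cfg!(feature = "pef") {
        formats.push("PEF");
    }
    formats
}
```

A non-default feature such as `pef` would be switched on by the dependent crate through the `features` list of its dependency entry.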
For each executable, the goal is to extract:
- Section information: names, sizes, entropy
- Import data
- Target: architecture, operating system, endianness, pointer size (32 vs 64-bit)
- Binary type (object file, executable, library, etc.)
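The per-section entropy mentioned above is commonly computed as Shannon entropy over byte frequencies. The sketch below is one way to do it; `SectionInfo` and `shannon_entropy` are hypothetical names, and whether this crate uses exactly this formula is an assumption.

```rust
/// Per-section data of the kind listed above (hypothetical struct).
#[derive(Debug, Clone, PartialEq)]
pub struct SectionInfo {
    pub name: String,
    pub size: usize,
    /// Shannon entropy in bits per byte, from 0.0 to 8.0.
    pub entropy: f64,
}

/// Shannon entropy of a byte slice, in bits per byte.
/// Returns 0.0 for empty input.
pub fn shannon_entropy(data: &[u8]) -> f64 {
    if data.is_empty() {
        return 0.0;
    }
    // Count how often each byte value occurs.
    let mut counts = [0u64; 256];
    for &byte in data {
        counts[byte as usize] += 1;
    }
    let len = data.len() as f64;
    // Sum -p * log2(p) over the byte values that actually occur.
    counts
        .iter()
        .filter(|&&count| count > 0)
        .map(|&count| {
            let p = count as f64 / len;
            -p * p.log2()
        })
        .sum()
}

/// Build a `SectionInfo` from a section's name and raw bytes.
pub fn describe_section(name: &str, bytes: &[u8]) -> SectionInfo {
    SectionInfo {
        name: name.to_string(),
        size: bytes.len(),
        entropy: shannon_entropy(bytes),
    }
}
```

Values close to 8.0 usually indicate compressed or encrypted data, which is why entropy is a useful signal for packed malware.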
Some complications:
- How can imports be extracted from ELF files? Go has this figured out, but I haven't been able to replicate it. See Goblin issue #363.
- Should I ditch the custom parsers in favor of Goblin? It would let me get Authenticode data from PE32 files, but I worry it won't tolerate malformed files (which malware often is).
- Fat Mach-O files have a set of sections and characteristics per embedded Mach-O. How should these be related?
Document Types:
- PDF via pdf (feature flag `pdf`, default)
There should be a simple way to represent the needed data so that the component which stores it in the database doesn't have to be aware of file formats.
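One possible shape for such a representation is a trait that every parser implements, so the storage side depends only on the trait. This is a sketch under assumptions: `ExecutableInfo`, `Section`, `Endianness`, `BinaryKind`, `summary_row`, and `StubElf` are all hypothetical names, not this crate's actual API.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Endianness {
    Little,
    Big,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BinaryKind {
    Object,
    Executable,
    Library,
    Other,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Section {
    pub name: String,
    pub size: u64,
    pub entropy: f64,
}

/// Format-agnostic view of a parsed executable (hypothetical trait).
pub trait ExecutableInfo {
    fn architecture(&self) -> &str;
    fn operating_system(&self) -> &str;
    fn endianness(&self) -> Endianness;
    fn pointer_bits(&self) -> u8;
    fn kind(&self) -> BinaryKind;
    fn sections(&self) -> &[Section];
    fn imports(&self) -> &[String];
}

/// Storage-side code sees only the trait, never ELF, PE32, or Mach-O details.
pub fn summary_row(file: &dyn ExecutableInfo) -> String {
    format!(
        "{} {} {}-bit {:?} {:?} sections={} imports={}",
        file.architecture(),
        file.operating_system(),
        file.pointer_bits(),
        file.endianness(),
        file.kind(),
        file.sections().len(),
        file.imports().len(),
    )
}

/// Stand-in for a real parser result, with hard-coded target values.
pub struct StubElf {
    pub sections: Vec<Section>,
    pub imports: Vec<String>,
}

impl ExecutableInfo for StubElf {
    fn architecture(&self) -> &str { "x86_64" }
    fn operating_system(&self) -> &str { "Linux" }
    fn endianness(&self) -> Endianness { Endianness::Little }
    fn pointer_bits(&self) -> u8 { 64 }
    fn kind(&self) -> BinaryKind { BinaryKind::Executable }
    fn sections(&self) -> &[Section] { &self.sections }
    fn imports(&self) -> &[String] { &self.imports }
}
```

Document formats such as PDF could get a similar, separate trait, which would keep executable-only fields like endianness out of the document path.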