Mach-O File Format Parser for Rust

Examples

use std::fs::File;
use std::io::{Cursor, Read};

use mach_object::{LoadCommand, MachCommand, OFile, CPU_TYPE_X86_64};

// Read the whole binary into memory, then parse it through a cursor.
let mut f = File::open("tests/helloworld").unwrap();
let mut buf = Vec::new();
let size = f.read_to_end(&mut buf).unwrap();
let mut cur = Cursor::new(&buf[..size]);

if let OFile::MachFile { ref header, ref commands } = OFile::parse(&mut cur).unwrap() {
    assert_eq!(header.cputype, CPU_TYPE_X86_64);
    assert_eq!(header.ncmds as usize, commands.len());

    // Each MachCommand pairs a load command with its size in the file.
    for &MachCommand(ref cmd, _cmdsize) in commands {
        if let &LoadCommand::Segment64 { ref segname, ref sections, .. } = cmd {
            println!("segment: {}", segname);

            for sect in sections {
                println!("  section: {}", sect.sectname);
            }
        }
    }
}

For more detail, please check the unit tests and the otool example.

Structs

the archive file header

A stream of BIND opcodes to bind all binding symbols.

An iterator over the BindOpCode

The mach binding symbol information

Flags for bind symbol

The build_version_command contains the min OS version on which this binary was built to run for its platform. The list of known platforms and tool values follows it.

The LC_DATA_IN_CODE load command uses a LinkEditData to point to an array of DataInCodeEntry entries.

Dynamically linked shared libraries are identified by two things.

a module table entry

a table of contents entry

The following are used on the flags byte of a terminal node in the export information.

For each architecture in the file, specified by a pair of cputype and cpusubtype, the FatArch describes the file offset, file size and alignment in the file of the architecture specific member.

The structures of the file format for “fat” architecture specific file (wrapper design). At the beginning of the file there is one FatHeader structure followed by a number of FatArch structures.
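
The layout described above (one FatHeader followed by a number of FatArch entries) can be sketched with the standard library alone. This is a minimal illustration, not the crate's implementation: `FatArchEntry`, `be_u32`, and `parse_fat` are hypothetical names, and the field order and big-endian encoding are assumed from Apple's `<mach-o/fat.h>`, covering only the 32-bit entry form.

```rust
/// Magic number of a fat file; all fat header fields are stored big-endian.
const FAT_MAGIC: u32 = 0xcafe_babe;

/// Hypothetical mirror of one per-architecture entry (not the crate's FatArch).
#[derive(Debug, PartialEq)]
struct FatArchEntry {
    cputype: u32,
    cpusubtype: u32,
    offset: u32, // file offset of this architecture's member
    size: u32,   // size of the member in bytes
    align: u32,  // alignment, expressed as a power of two
}

/// Reads a big-endian u32 at `pos`, or returns None if the buffer is too short.
fn be_u32(buf: &[u8], pos: usize) -> Option<u32> {
    let b = buf.get(pos..pos.checked_add(4)?)?;
    Some(u32::from_be_bytes([b[0], b[1], b[2], b[3]]))
}

/// Parses the fat header and the entries that directly follow it.
fn parse_fat(buf: &[u8]) -> Option<Vec<FatArchEntry>> {
    if be_u32(buf, 0)? != FAT_MAGIC {
        return None;
    }
    let nfat_arch = be_u32(buf, 4)? as usize;
    let mut archs = Vec::new();
    for i in 0..nfat_arch {
        let base = 8 + i * 20; // 8-byte header, then 20 bytes per entry
        archs.push(FatArchEntry {
            cputype: be_u32(buf, base)?,
            cpusubtype: be_u32(buf, base + 4)?,
            offset: be_u32(buf, base + 8)?,
            size: be_u32(buf, base + 12)?,
            align: be_u32(buf, base + 16)?,
        });
    }
    Some(archs)
}
```

Every read is bounds-checked, so a truncated or non-fat buffer yields `None` rather than a panic.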

Fixed virtual memory shared libraries are identified by two things.

A stream of BIND opcodes to bind all lazy symbols.

The mach lazy binding symbol information

A variable length string in a load command is represented by an LcString structure.
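
As a rough sketch of how such a variable-length string can be read: in Apple's `<mach-o/loader.h>` the string is referenced by an offset from the start of the load command and is NUL-terminated, with zero padding up to the command size. The helper name `read_lc_str` is hypothetical and is not part of this crate.

```rust
/// Reads the NUL-terminated string that starts `offset` bytes into the raw
/// bytes of a single load command. Returns None if the offset is out of
/// range or the bytes are not valid UTF-8.
fn read_lc_str(cmd: &[u8], offset: usize) -> Option<&str> {
    let tail = cmd.get(offset..)?;
    // Stop at the first NUL byte; fall back to the end of the command.
    let end = tail.iter().position(|&b| b == 0).unwrap_or(tail.len());
    std::str::from_utf8(&tail[..end]).ok()
}
```

For a dynamic linker command, for instance, the name would typically sit right after the 12-byte fixed part (cmd, cmdsize, and the string offset).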

The LinkEditData contains the offsets and sizes of a blob of data in the __LINKEDIT segment.

Wrap load command with size in the Mach-O file

The mach header appears at the very beginning of the object file

Structure of the __.SYMDEF table of contents for an archive.

A stream of REBASE opcodes

An iterator over the RebaseOpCode of a rebase information block.

The rebase symbol information

A segment is made up of zero or more sections.

Constants for the section attributes part of the flags field of a section structure.

The flags field of a section structure is separated into two parts: a section type and section attributes.
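
The split between section type and section attributes can be illustrated with two masks. The mask values below are assumed from Apple's `<mach-o/loader.h>` (`SECTION_TYPE` and `SECTION_ATTRIBUTES`); the function name `split_section_flags` is hypothetical.

```rust
/// Low 8 bits of the flags field: the section type.
const SECTION_TYPE: u32 = 0x0000_00ff;
/// High 24 bits of the flags field: the section attributes.
const SECTION_ATTRIBUTES: u32 = 0xffff_ff00;

/// Splits a raw section flags value into (type, attributes).
fn split_section_flags(flags: u32) -> (u32, u32) {
    (flags & SECTION_TYPE, flags & SECTION_ATTRIBUTES)
}
```

A value of `0x8000_0408`, for example, separates into type `0x08` and attribute bits `0x8000_0400`.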

Constants for the flags field of the segment_command

The packed version.

Symbol Iter

The encoded version.
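
A minimal sketch of one common encoding, assuming the X.Y.Z layout that Apple's `<mach-o/loader.h>` describes as nibbles `xxxx.yy.zz` (16 bits of major, 8 bits of minor, 8 bits of patch). Whether the struct above uses exactly this layout is an assumption, and the function names are hypothetical.

```rust
/// Decodes an X.Y.Z version stored as xxxx.yy.zz in a 32-bit value.
fn decode_version(v: u32) -> (u32, u32, u32) {
    (v >> 16, (v >> 8) & 0xff, v & 0xff)
}

/// Encodes major.minor.patch back into the same 32-bit layout.
/// Components that do not fit their field are truncated.
fn encode_version(major: u32, minor: u32, patch: u32) -> u32 {
    ((major & 0xffff) << 16) | ((minor & 0xff) << 8) | (patch & 0xff)
}
```

Under this layout, `decode_version(0x000a_0e00)` yields `(10, 14, 0)`.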

A stream of BIND opcodes to bind all weak binding symbols.

The mach weak binding symbol information

Enums

OpCode for the binding symbol

Bind or rebase symbol type

The min OS version on which this binary was built to run.

The load commands directly follow the mach header.

The abstract file block, including Mach-O file, fat/universal file, archive file and symdef block

OpCode for the rebasing symbol

the link-edit 4.3BSD “stab” style symbol

Constants

64 bit ABI

mask for architecture bits

64 bit libraries

mask for feature flags
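
The four masks above can be combined as in the following sketch. The numeric values are assumed from Apple's `<mach/machine.h>`; `u32` is used here for simplicity, which may differ from the integer type the crate uses, and the function names are hypothetical.

```rust
const CPU_ARCH_MASK: u32 = 0xff00_0000; // mask for architecture bits
const CPU_ARCH_ABI64: u32 = 0x0100_0000; // 64 bit ABI
const CPU_SUBTYPE_MASK: u32 = 0xff00_0000; // mask for feature flags
const CPU_SUBTYPE_LIB64: u32 = 0x8000_0000; // 64 bit libraries

/// True when the CPU type carries the 64-bit ABI bit.
fn is_64bit_abi(cputype: u32) -> bool {
    (cputype & CPU_ARCH_ABI64) != 0
}

/// Strips the architecture bits, leaving the base CPU family.
fn base_cpu_type(cputype: u32) -> u32 {
    cputype & !CPU_ARCH_MASK
}

/// Strips the feature flags, leaving the plain CPU subtype.
fn base_cpu_subtype(cpusubtype: u32) -> u32 {
    cpusubtype & !CPU_SUBTYPE_MASK
}

/// True when the subtype carries the 64-bit libraries flag.
fn uses_lib64(cpusubtype: u32) -> bool {
    (cpusubtype & CPU_SUBTYPE_LIB64) != 0
}
```

With these values, an x86_64 CPU type of `0x0100_0007` reports the 64-bit ABI and reduces to the base x86 family value `7`.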

build for platform min OS version

location of code signature

table of non-instructions in __text

string for dyld to treat like environment variable

compressed dyld information

compressed dyld information only

Code signing DRs copied from linked dylibs

dynamic link-edit symbol table info

encrypted segment information

64-bit encrypted segment information

compressed table of function start addresses

fixed VM file inclusion (internal use)

object identification info (obsolete)

fixed VM shared library identification

dynamically linked shared lib ident

dynamic linker identification

delay load of dylib until first use

optimization hints in MH_OBJECT files

linker options in MH_OBJECT files

load a specified fixed VM shared library

load a dynamically linked shared library

load a dynamic linker

load upward dylib

replacement for LC_UNIXTHREAD

arbitrary data included within a Mach-O file

prebind checksum

modules prebound for a dynamically linked shared library

prepage command (internal use)

load and re-export dylib

64-bit image routines

runpath additions

64-bit segment of this file to be mapped

location of info to split segments

source version used to build binary

sub client

sub framework

sub library

sub umbrella

link-edit gdb symbol table info (obsolete)

link-edit stab symbol table info

thread

two-level namespace lookup hints

unix thread (includes a stack)

the uuid

build for iPhoneOS min OS version

build for MacOSX min OS version

build for AppleTV min OS version

build for Watch min OS version

indicates that this binary binds to all two-level namespace modules of its dependent libraries. Only used when MH_PREBINDABLE and MH_TWOLEVEL are both set.

When this bit is set, all stacks in the task will be given stack execution privilege. Only used in MH_EXECUTE filetypes.

The code was linked for use in an application extension.

the final linked image uses weak symbols

dynamically bound bundle file

the binary has been canonicalized via the unprebind operation

NXSwapInt(MH_MAGIC)

NXSwapInt(MH_MAGIC_64)

core file

Only for use on dylibs. When linking against a dylib that has this bit set, the static linker will automatically not create an LC_LOAD_DYLIB load command to the dylib if no symbols are being referenced from the dylib.

companion file with only debug sections

dynamically bound shared library

shared library stub for static linking only, no section contents

dynamic link editor

demand paged executable file

the executable is forcing all images to use flat name space bindings

fixed VM shared library file

Contains a section of type S_THREAD_LOCAL_VARIABLES

the object file is the output of an incremental link against a base file and can’t be link edited again

x86_64 kexts

the shared library init routine is to be run lazily via catching memory faults to its writeable segments (obsolete)

the mach magic number

the 64-bit mach magic number
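
The four magic values (the two mach magic numbers and their byte-swapped forms) are enough to tell a file's word size and byte order from its first four bytes. The sketch below assumes the values from Apple's `<mach-o/loader.h>`; `classify_magic` is a hypothetical helper, not a function of this crate.

```rust
const MH_MAGIC: u32 = 0xfeed_face; // the mach magic number
const MH_CIGAM: u32 = 0xcefa_edfe; // NXSwapInt(MH_MAGIC)
const MH_MAGIC_64: u32 = 0xfeed_facf; // the 64-bit mach magic number
const MH_CIGAM_64: u32 = 0xcffa_edfe; // NXSwapInt(MH_MAGIC_64)

/// Classifies the first four bytes of a file.
/// Returns Some((is_64_bit, is_little_endian)) for a Mach-O header,
/// or None when the bytes match none of the four magic values.
fn classify_magic(first4: [u8; 4]) -> Option<(bool, bool)> {
    // Interpret the bytes as little-endian: an unswapped magic then means
    // the file itself is little-endian, a swapped one means big-endian.
    match u32::from_le_bytes(first4) {
        MH_MAGIC => Some((false, true)),
        MH_CIGAM => Some((false, false)),
        MH_MAGIC_64 => Some((true, true)),
        MH_CIGAM_64 => Some((true, false)),
        _ => None,
    }
}
```

Fixing the interpretation to little-endian keeps the result independent of the host's byte order.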

do not have dyld notify the prebinding agent about this executable

this umbrella guarantees no multiple definitions of symbols in its sub-images so the two-level namespace hints can always be used.

the object file has no undefined references

When this bit is set, the OS will run the main executable with a non-executable heap even on platforms (e.g. i386) that don’t require it. Only used in MH_EXECUTE filetypes.

When this bit is set on a dylib, the static linker does not need to examine dependent dylibs to see if any are re-exported

relocatable object file

When this bit is set, the OS will load the main executable at a random address. Only used in MH_EXECUTE filetypes.

the binary is not prebound but can have its prebinding redone. Only used when MH_PREBOUND is not set.

the file has its dynamic undefined references prebound.

preloaded executable file

When this bit is set, the binary declares it is safe for use in processes with uid zero

When this bit is set, the binary declares it is safe for use in processes when issetugid() is true

the file has its read-only and read-write segments split

safe to divide up the sections into sub-sections via symbols for dead code stripping

the image is using two-level name space bindings

the final linked image contains external weak symbols

The N_ALT_ENTRY bit of the n_desc field indicates that the symbol is pinned to the previous content.

The N_ARM_THUMB_DEF bit of the n_desc field indicates that the symbol is a definition of a Thumb function.

AST file path: name,,NO_SECT,0,0

begin common: name,,NO_SECT,0,0

include file beginning: name,,NO_SECT,0,sum

begin nsect sym: 0,,n_sect,0,address

The N_DESC_DISCARDED bit of the n_desc field never appears in a linked image, but is used in very rare cases by the dynamic link editor to mark an in-memory symbol as discarded and no longer used for linking.

end common (local name): 0,,n_sect,0,address

end common: name,,n_sect,0,0

include file end: name,,NO_SECT,0,0

end nsect sym: 0,,n_sect,0,address

alternate entry: name,,n_sect,linenumber,address

deleted include file: name,,NO_SECT,0,sum

procedure name (f77 kludge): name,,NO_SECT,0,0

procedure: name,,n_sect,linenumber,address

global symbol: name,,NO_SECT,type,0

left bracket: 0,,NO_SECT,nesting level,address

.lcomm symbol: name,,n_sect,type,address

second stab entry with length information

local sym: name,,NO_SECT,type,offset

The N_NO_DEAD_STRIP bit of the n_desc field only ever appears in a relocatable .o file (MH_OBJECT filetype), and is used to indicate to the static link editor that it is never to dead strip the symbol.

compiler -O level: name,,NO_SECT,0,0

emitted with gcc2 compiled and in gcc source

object file name: name,,0,0,st_mtime

compiler parameters: name,,NO_SECT,0,0

global pascal symbol: name,,NO_SECT,subtype,line

parameter: name,,NO_SECT,type,offset

right bracket: 0,,NO_SECT,nesting level,address

The N_REF_TO_WEAK bit of the n_desc field indicates to the dynamic linker that the undefined symbol should be resolved using flat namespace searching.

register sym: name,,NO_SECT,type,register

src line: 0,,n_sect,linenumber,address

source file name: name,,n_sect,0,address

#included file name: name,,n_sect,0,address

structure elt: name,,NO_SECT,type,struct_offset

static symbol: name,,n_sect,type,address

The N_SYMBOL_RESOLVER bit of the n_desc field indicates that the function is actually a resolver function and should be called to get the address of the real function to use. This bit is only available in .o files (MH_OBJECT filetype).

compiler version: name,,NO_SECT,0,0

The N_WEAK_DEF bit of the n_desc field indicates to the static and dynamic linkers that the symbol definition is weak, allowing a non-weak symbol to also be used which causes the weak definition to be discarded. Currently this is only supported for symbols in coalesced sections.

The N_WEAK_REF bit of the n_desc field indicates to the dynamic linker that the undefined symbol is allowed to be missing and is to have the address of zero when missing.
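
The n_desc bits described in this section can be tested with simple masks. The sketch below assumes the bit values from Apple's `<mach-o/nlist.h>`; the table and `describe_n_desc` are hypothetical and not part of this crate. Note that `0x0020` is shared by N_NO_DEAD_STRIP and N_DESC_DISCARDED, and `0x0080` by N_WEAK_DEF and N_REF_TO_WEAK, so the right reading depends on the file type and on whether the symbol is defined; this sketch reports only one name for each.

```rust
/// (bit, name) pairs for the n_desc flags; values assumed from <mach-o/nlist.h>.
const N_DESC_FLAGS: [(u16, &str); 6] = [
    (0x0008, "N_ARM_THUMB_DEF"),
    (0x0020, "N_NO_DEAD_STRIP"),
    (0x0040, "N_WEAK_REF"),
    (0x0080, "N_WEAK_DEF"),
    (0x0100, "N_SYMBOL_RESOLVER"),
    (0x0200, "N_ALT_ENTRY"),
];

/// Lists the names of the flag bits set in an n_desc value, in table order.
fn describe_n_desc(n_desc: u16) -> Vec<&'static str> {
    N_DESC_FLAGS
        .iter()
        .filter(|&&(bit, _)| (n_desc & bit) != 0)
        .map(|&(_, name)| name)
        .collect()
}
```

Bits outside the table, such as the low reference-type bits, are ignored by this helper.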

To simplify stripping of objects that are used with the dynamic link editor, the static link editor marks the symbols defined in an object that are referenced by a dynamically bound object (dynamic shared libraries, bundles). With this marking, strip knows not to strip these symbols.

To support the lazy binding of undefined symbols in the dynamic link-editor, the undefined symbols in the symbol table (the nlist structures) are marked with the indication if the undefined reference is a lazy reference or non-lazy reference. If both a non-lazy reference and a lazy reference are made to the same symbol, the non-lazy reference takes precedence. A reference is lazy only when all references to that symbol are made through a symbol pointer in a lazy symbol pointer section.

section with only 4 byte literals

section with only 8 byte literals

section with only 16 byte literals

section contains symbols that are to be coalesced

section with only literal C strings

section contains DTrace Object Format

zero fill on demand section (that can be larger than 4 gigabytes)

section with only pairs of function pointers for interposing

section with only lazy symbol pointers to lazy loaded dylibs

section with only lazy symbol pointers

section with only pointers to literals

section with only function pointers for initialization

section with only function pointers for termination

section with only non-lazy symbol pointers

regular section

section with only symbol stubs, byte size of stub in the reserved2 field

functions to call to initialize TLV values

template of initial values for TLVs

TLV descriptors

pointers to TLV descriptors

template of initial values for TLVs

zero fill on demand section

Statics

the real uninitialized data section no padding

the section common symbols are allocated in by the link editor

the real initialized data section no padding, no bss overlap

the fvmlib initialization section

the section following the fvmlib initialization section

the icon headers

the icons in tiff format

module information

string table

string table

symbol table

the real text part of the text

the traditional UNIX data segment

the icon segment

the segment for the self (dyld) modifying code stubs that has read, write and execute permissions

the segment containing all structs created and maintained by the link editor. Created with -seglinkedit option to ld(1) for MH_EXECUTE and FVMLIB file types only

objective-C runtime segment

the pagezero segment which has no protections and catches NULL references for MH_EXECUTE files

the traditional UNIX text segment

the unix stack segment

Traits

Read a fixed size string

Reference type and flags of symbol

Functions

Type Definitions