wholesym is a fully-featured library for fetching symbol files and for
resolving code addresses to symbols and debug info. It supports Windows, macOS
and Linux. It is lightning-fast and optimized for minimal time-to-first-symbol.
Use it as follows:
- Create a SymbolManager using SymbolManager::with_config.
- Load a SymbolMap with SymbolManager::load_symbol_map_for_binary_at_path.
- Look up an address with SymbolMap::lookup.
- Inspect the returned AddressInfo, which gives you the symbol name, and potentially file and line information, along with inlined function info.
Behind the scenes, wholesym loads symbol files much like a debugger would.
It supports symbol servers, collecting information from multiple files, and
all kinds of different ways to embed symbol information in various file formats.
§Example
use wholesym::{SymbolManager, SymbolManagerConfig, LookupAddress};
use std::path::Path;

// Note: the `.await?` calls below must run inside an async function that returns a Result.
let symbol_manager = SymbolManager::with_config(SymbolManagerConfig::default());
let symbol_map = symbol_manager
    .load_symbol_map_for_binary_at_path(Path::new("/usr/bin/ls"), None)
    .await?;

println!("Looking up 0xd6f4 in /usr/bin/ls. Results:");
if let Some(address_info) = symbol_map.lookup(LookupAddress::Relative(0xd6f4)).await {
    println!(
        "Symbol: {:#x} {}",
        address_info.symbol.address, address_info.symbol.name
    );
    // Frames are only present if debug info was found for this address.
    if let Some(frames) = address_info.frames {
        for (i, frame) in frames.into_iter().enumerate() {
            let function = frame.function.unwrap();
            let file = frame.file_path.unwrap().display_path();
            let line = frame.line_number.unwrap();
            println!(" #{i:02} {function} at {file}:{line}");
        }
    }
} else {
    println!("No symbol for 0xd6f4 was found.");
}
This example prints the following output on my machine:
Looking up 0xd6f4 in /usr/bin/ls. Results:
Symbol: 0xd5d4 gobble_file.constprop.0
#00 do_lstat at ./src/ls.c:1184
#01 gobble_file at ./src/ls.c:3403
The example demonstrates support for debuglink and debugaltlink. It gets the symbol
information from local debug files at
/usr/lib/debug/.build-id/63/260a3e6e46db57abf718f6a3562c6eedccf269.debug
and at /usr/lib/debug/.dwz/aarch64-linux-gnu/coreutils.debug, which were installed
by the coreutils-dbgsym package. If these files are not present, it will fall back
to whichever information is available.
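The first path above follows the usual build ID layout: the first two hex digits of the build ID name a directory, and the remaining digits plus a .debug suffix name the file. The following is a minimal sketch of that mapping using only the standard library. The function name build_id_debug_path is hypothetical and is not part of the wholesym API, and wholesym's actual search logic may consider more locations than this.

```rust
use std::path::PathBuf;

/// Hypothetical helper (not a wholesym API): maps a GNU build ID, given as a
/// hex string, to the conventional path below a debug root directory.
/// Returns None if the ID is too short or contains non-hex characters.
fn build_id_debug_path(debug_root: &str, build_id_hex: &str) -> Option<PathBuf> {
    if build_id_hex.len() < 3 || !build_id_hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    // The first byte (two hex digits) names the directory, the rest names the file.
    let (dir, rest) = build_id_hex.split_at(2);
    let mut path = PathBuf::from(debug_root);
    path.push(".build-id");
    path.push(dir);
    path.push(format!("{rest}.debug"));
    Some(path)
}

fn main() {
    // Build ID taken from the example path in the text above.
    let path = build_id_debug_path(
        "/usr/lib/debug",
        "63260a3e6e46db57abf718f6a3562c6eedccf269",
    );
    println!("{path:?}");
}
```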
§Features
§Windows
Supported symbol file sources:
- Local PDB files at the absolute PDB path that’s written down in the .exe / .dll
- PDB files on Windows symbol servers + _NT_SYMBOL_PATH environment variable
- Breakpad symbol files, local or on a server
- DWARF-in-PE debug info
- Fallback symbols from exported functions and function start addresses
Unsupported for now (patches accepted):
- Support for /DEBUG:FASTLINK PDB files (issue #53)
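To illustrate what the _NT_SYMBOL_PATH environment variable mentioned above typically contains, here is a rough sketch of a parser for its most common forms: plain directories and srv*cache*server entries, separated by semicolons. The type SymbolPathEntry and the function parse_nt_symbol_path are hypothetical names made up for this sketch. They do not show how wholesym parses the variable, and less common entry kinds (such as cache* or symsrv* entries) are simply skipped here.

```rust
/// Hypothetical representation of one _NT_SYMBOL_PATH entry (not a wholesym type).
#[derive(Debug, PartialEq)]
enum SymbolPathEntry {
    /// A plain local directory to search for symbol files.
    LocalDir(String),
    /// A symbol server, optionally with a local cache directory in front of it.
    Server { cache_dir: Option<String>, url: String },
}

/// Parses the common `dir` and `srv*[cache*]server` forms. Other forms are ignored.
fn parse_nt_symbol_path(value: &str) -> Vec<SymbolPathEntry> {
    let mut entries = Vec::new();
    for part in value.split(';').map(str::trim).filter(|p| !p.is_empty()) {
        let fields: Vec<&str> = part.split('*').collect();
        match fields.as_slice() {
            [dir] => entries.push(SymbolPathEntry::LocalDir(dir.to_string())),
            [kind, url] if kind.eq_ignore_ascii_case("srv") => {
                entries.push(SymbolPathEntry::Server {
                    cache_dir: None,
                    url: url.to_string(),
                });
            }
            [kind, cache, url] if kind.eq_ignore_ascii_case("srv") => {
                entries.push(SymbolPathEntry::Server {
                    // An empty cache field is treated as "no explicit cache" in this sketch.
                    cache_dir: (!cache.is_empty()).then(|| cache.to_string()),
                    url: url.to_string(),
                });
            }
            // Anything else is skipped in this sketch.
            _ => {}
        }
    }
    entries
}

fn main() {
    // Example value; the cache and local directories are made up for illustration.
    let value = r"srv*C:\symcache*https://msdl.microsoft.com/download/symbols;C:\local\pdbs";
    for entry in parse_nt_symbol_path(value) {
        println!("{entry:?}");
    }
}
```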
§macOS
Supported symbol file sources:
- Local dSYM bundles with symbol tables + DWARF, found in vicinity of the binary
- dSYM bundles found via Spotlight
- DWARF found in object files which are referred to from a linked binary (via OSO stabs symbols)
- Breakpad symbol files, local or on a server
- Symbols from the regular symbol table
- Fallback symbols from exported functions and function start addresses
Unsupported for now (patches accepted):
§Linux
Supported symbol file sources:
- DWARF and symbol tables in binaries
- DWARF and symbol tables in separate debug files found via build ID or debug link
- Symbol tables in MiniDebugInfo
- Combining multiple files with DWARF if debug info has been partially moved with dwz (using debugaltlink)
- debuginfod servers and the DEBUGINFOD_URLS environment variable
- Breakpad symbol files, local or on a server
- Symbols from the regular symbol table
- Fallback symbols from exported functions and function start addresses
- Split DWARF with .dwo files
- Split DWARF with .dwp files
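As an illustration of the debuginfod item above: DEBUGINFOD_URLS holds a whitespace-separated list of server URLs, and a debuginfod server exposes debug info for a build ID under /buildid/BUILDID/debuginfo. The sketch below builds the candidate URLs for one build ID. The function name debuginfod_debuginfo_urls is hypothetical and not part of wholesym, the second server URL in the test data is a placeholder, and how wholesym itself orders, caches, or retries these requests is not shown here.

```rust
/// Hypothetical helper (not a wholesym API): builds one debuginfo request URL
/// per server listed in a DEBUGINFOD_URLS-style string.
fn debuginfod_debuginfo_urls(debuginfod_urls: &str, build_id_hex: &str) -> Vec<String> {
    debuginfod_urls
        .split_whitespace()
        .map(|server| {
            // Avoid a double slash if the server URL already ends with '/'.
            let base = server.trim_end_matches('/');
            format!("{base}/buildid/{build_id_hex}/debuginfo")
        })
        .collect()
}

fn main() {
    // Fall back to an empty list if the variable is not set.
    let servers = std::env::var("DEBUGINFOD_URLS").unwrap_or_default();
    let build_id = "63260a3e6e46db57abf718f6a3562c6eedccf269";
    for url in debuginfod_debuginfo_urls(&servers, build_id) {
        println!("{url}");
    }
}
```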
§Performance
The most computationally intense part of symbol resolution is the parsing of debug info.
Debug info can be very large, for example 700MB to 1500MB for Firefox’s libxul.
wholesym uses the addr2line and pdb-addr2line crates for parsing DWARF and PDB, respectively. It also has its own code for parsing the
Breakpad sym format. All of these parsers have been optimized extensively
to minimize the time it takes to get the first symbol result, and to cache
things so that repeated lookups in the same functions are fast. This means:
- No expensive preprocessing happens when the symbol file is first loaded.
- Parsing is as lazy as possible: If possible, we only parse the bytes that are needed for the function which contains the looked-up address.
- The first parse is as shallow and fast as possible, and just builds an index.
- Strings (e.g. function names and file paths) are only looked up when needed.
- Symbol lists, line records, and inlines are cached in sorted structures, and queried via binary search.
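The last point can be sketched as follows: with symbols sorted by start address, the symbol containing an address is the last one whose start address is less than or equal to it, which a binary search finds in logarithmic time. This is a simplified illustration, not wholesym's actual data structure. The Symbol struct and lookup function are hypothetical, the neighboring symbols are made up, and a real implementation would also take symbol sizes into account.

```rust
/// Hypothetical, simplified symbol record (not the wholesym SymbolInfo type).
struct Symbol {
    address: u32,
    name: &'static str,
}

/// Finds the last symbol whose start address is <= `addr`.
/// `sorted` must be sorted by ascending address.
fn lookup(sorted: &[Symbol], addr: u32) -> Option<&Symbol> {
    // Binary search: index of the first symbol that starts after `addr`.
    let idx = sorted.partition_point(|s| s.address <= addr);
    // The containing symbol, if any, is the one just before that index.
    idx.checked_sub(1).map(|i| &sorted[i])
}

fn main() {
    // Only gobble_file.constprop.0 comes from the example output earlier in the
    // text. The other two entries are invented neighbors for illustration.
    let symbols = [
        Symbol { address: 0xd000, name: "hypothetical_previous_function" },
        Symbol { address: 0xd5d4, name: "gobble_file.constprop.0" },
        Symbol { address: 0xe000, name: "hypothetical_next_function" },
    ];
    match lookup(&symbols, 0xd6f4) {
        Some(sym) => println!("Symbol: {:#x} {}", sym.address, sym.name),
        None => println!("No symbol for 0xd6f4 was found."),
    }
}
```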
Re-exports§
pub use debugid;
pub use samply_symbols;
Structs§
- The lookup result for an address.
- The build ID for an ELF file (also called “GNU build ID”).
- Information to find an external file and an address within that file, to be passed to SymbolMap::lookup_external or ExternalFileSymbolMap::lookup.
- The debug information (function name, file path, line number) for a single frame at the looked-up address.
- Information about a library (“binary” / “module” / “DSO”) which allows finding symbol files for it. The information can be partial.
- The code ID for a Windows PE file.
- The path of a source file, as found in the debug info.
- Used in SymbolManager::load_external_file and returned by SymbolMap::symbol_file_origin.
- The symbol for a function.
- Allows obtaining SymbolMaps.
- The configuration of a SymbolManager.
- Contains the symbols for a binary, and allows querying them by address and iterating over them.
- The lookup result from lookup_sync.
Enums§
- An enum carrying an identifier for a binary. This stores the same information as a debugid::CodeId, but without projecting it down to a string.
- The error type used in this crate.
- Information to find an address within an external file, for debug info lookup.
- Information to find an external file with debug information.
- Contains address debug info (inlined functions, file names, line numbers) if available.
- An address that can be looked up in a SymbolMap.
- A special source file path for source files which are hosted online.
- In case the loaded binary contains multiple architectures, this specifies how to resolve the ambiguity. This is only needed on macOS.