Crate samply_symbols

Crate samply_symbols 

Source
Expand description

This crate allows obtaining symbol information from binaries and compilation artifacts.

You probably want to be using the wholesym crate instead. wholesym has a much more ergonomic API; it is a wrapper around samply-symbols.

More specifically, samply-symbols provides the low-level implementation of wholesym, while satisfying both native and WebAssembly consumers, whereas wholesym only cares about native consumers.

The main entry point of this crate is the SymbolManager struct and its async load_symbol_map method. With a SymbolMap, you can resolve raw code addresses to function name strings, and, if available, to file name + line number information and inline stacks.

§Design constraints

This crate operates under the following design constraints:

  • Must be usable from JavaScript / WebAssembly: The Firefox profiler runs this code in a WebAssembly environment, invoked from a privileged piece of JavaScript code inside Firefox itself. This setup allows us to download a wasm bundle on demand, rather than shipping it with Firefox, which would increase the Firefox download size for a piece of functionality that the vast majority of Firefox users don’t need.
  • Performance: We want to be able to obtain symbol data from a fresh build of a locally compiled Firefox instance as quickly as possible, without an expensive preprocessing step. The time between “finished compilation” and “returned symbol data” should be minimized. This means that symbol data needs to be obtained directly from the compilation artifacts rather than from, say, a dSYM bundle or a Breakpad .sym file.
  • Must scale to large inputs: This applies to both the size of the API request and the size of the object files that need to be parsed: The Firefox profiler will supply anywhere between tens of thousands and hundreds of thousands of different code addresses in a single symbolication request. Firefox build artifacts such as libxul.so can be multiple gigabytes big, and contain around 300000 function symbols. We want to serve such requests within a few seconds or less.
  • “Best effort” basis: If only limited symbol information is available, for example from system libraries, we want to return whatever limited information we have.

The WebAssembly requirement means that this crate cannot contain any direct file access. Instead, all file access is mediated through a FileAndPathHelper trait which has to be implemented by the caller. We cannot even use the std::path::Path / PathBuf types to represent paths, because the WASM bundle can run on Windows, and the Path / PathBuf types have! Unix path semantics in Rust-compiled-to-WebAssembly.

Furthermore, the caller needs to be able to find the right symbol files based on a subset of information about a library, for example just based on its debug name and debug ID. This is used when SymbolManager::load_symbol_map is called with such a subset of information. More concretely, this ability is used by samply-api when processing a JSON symbolication API call, which only comes with the debug name and debug ID for a library.

§Supported formats and data

This crate supports obtaining symbol data from PE binaries (Windows), PDB files (Windows), mach-o binaries (including fat binaries) (macOS & iOS), and ELF binaries (Linux, Android, etc.). For mach-o files it also supports finding debug information in external objects, by following OSO stabs entries. It supports gathering both basic symbol information (function name strings) as well as information based on debug data, i.e. inline callstacks where each frame has a function name, a file name, and a line number. For debug data we support both DWARF debug data (inside mach-o and ELF binaries) and PDB debug data.

§Example

use samply_symbols::debugid::DebugId;
use samply_symbols::{
    CandidatePathInfo, FileAndPathHelper, FileAndPathHelperResult, FileLocation,
    FramesLookupResult, LibraryInfo, LookupAddress, OptionallySendFuture, SymbolManager,
};

async fn run_query() {
    let this_dir = std::path::PathBuf::from(env!("CARGO_MANIFEST_DIR"));
    let helper = ExampleHelper {
        artifact_directory: this_dir.join("..").join("fixtures").join("win64-ci"),
    };

    let symbol_manager = SymbolManager::with_helper(helper);

    let library_info = LibraryInfo {
        debug_name: Some("firefox.pdb".to_string()),
        debug_id: DebugId::from_breakpad("AA152DEB2D9B76084C4C44205044422E1").ok(),
        ..Default::default()
    };
    let symbol_map = match symbol_manager.load_symbol_map(&library_info).await {
        Ok(symbol_map) => symbol_map,
        Err(e) => {
            println!("Error while loading the symbol map: {:?}", e);
            return;
        }
    };

    // Look up the symbol for an address.
    let lookup_result = symbol_map.lookup(LookupAddress::Relative(0x1f98f)).await;

    match lookup_result {
        Some(address_info) => {
            // Print the symbol name for this address:
            println!("0x1f98f: {}", address_info.symbol.name);

            // See if we have debug info (file name + line, and inlined frames):
            if let Some(frames) = address_info.frames {
                println!("Debug info:");
                for frame in frames {
                    println!(
                        " - {:?} ({:?}:{:?})",
                        frame.function, frame.file_path, frame.line_number
                    );
                }
            }
        }
        None => {
            println!("No symbol was found for address 0x1f98f.")
        }
    }
}

struct ExampleHelper {
    artifact_directory: std::path::PathBuf,
}

impl FileAndPathHelper for ExampleHelper {
    type F = Vec<u8>;
    type FL = ExampleFileLocation;

    fn get_candidate_paths_for_debug_file(
        &self,
        library_info: &LibraryInfo,
    ) -> FileAndPathHelperResult<Vec<CandidatePathInfo<ExampleFileLocation>>> {
        if let Some(debug_name) = library_info.debug_name.as_deref() {
            Ok(vec![CandidatePathInfo::SingleFile(ExampleFileLocation(
                self.artifact_directory.join(debug_name),
            ))])
        } else {
            Ok(vec![])
        }
    }

    fn get_candidate_paths_for_binary(
        &self,
        library_info: &LibraryInfo,
    ) -> FileAndPathHelperResult<Vec<CandidatePathInfo<ExampleFileLocation>>> {
        if let Some(name) = library_info.name.as_deref() {
            Ok(vec![CandidatePathInfo::SingleFile(ExampleFileLocation(
                self.artifact_directory.join(name),
            ))])
        } else {
            Ok(vec![])
        }
    }

   fn get_dyld_shared_cache_paths(
       &self,
       _arch: Option<&str>,
   ) -> FileAndPathHelperResult<Vec<ExampleFileLocation>> {
       Ok(vec![])
   }

    fn load_file(
        &self,
        location: ExampleFileLocation,
    ) -> std::pin::Pin<Box<dyn OptionallySendFuture<Output = FileAndPathHelperResult<Self::F>> + '_>> {
        Box::pin(async move { Ok(std::fs::read(&location.0)?) })
    }
}

#[derive(Clone, Debug)]
struct ExampleFileLocation(std::path::PathBuf);

impl std::fmt::Display for ExampleFileLocation {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        self.0.to_string_lossy().fmt(f)
    }
}

impl FileLocation for ExampleFileLocation {
    fn location_for_dyld_subcache(&self, suffix: &str) -> Option<Self> {
        let mut filename = self.0.file_name().unwrap().to_owned();
        filename.push(suffix);
        Some(Self(self.0.with_file_name(filename)))
    }

    fn location_for_external_object_file(&self, object_file: &str) -> Option<Self> {
        Some(Self(object_file.into()))
    }

    fn location_for_pdb_from_binary(&self, pdb_path_in_binary: &str) -> Option<Self> {
        Some(Self(pdb_path_in_binary.into()))
    }

    fn location_for_source_file(&self, source_file_path: &str) -> Option<Self> {
        Some(Self(source_file_path.into()))
    }

    fn location_for_breakpad_symindex(&self) -> Option<Self> {
        Some(Self(self.0.with_extension("symindex")))
    }

    fn location_for_dwo(&self, _comp_dir: &str, path: &str) -> Option<Self> {
        Some(Self(path.into()))
    }

    fn location_for_dwp(&self) -> Option<Self> {
        let mut s = self.0.as_os_str().to_os_string();
        s.push(".dwp");
        Some(Self(s.into()))
    }
}

Re-exports§

pub use pdb_addr2line::pdb;
pub use debugid;
pub use object;

Structs§

AddressInfo
The lookup result for an address.
BinaryImage
BreakpadIndex
BreakpadIndexParser
CompactSymbolTable
A “compact” representation of a symbol table. This is a legacy concept used by the Firefox profiler and kept for compatibility purposes. It’s called SymbolTableAsTuple in the profiler code.
ElfBuildId
The build ID for an ELF file (also called “GNU build ID”).
ExternalFileAddressRef
Information to find an external file and an address within that file, to be passed to SymbolMap::lookup_external or ExternalFileSymbolMap::lookup.
ExternalFileSymbolMap
FatArchiveMember
FileContentsWithChunkedCaching
FileContentsWrapper
A wrapper for a FileContents object. The wrapper provides some convenience methods and, most importantly, implements ReadRef for &FileContentsWrapper.
FrameDebugInfo
The debug information (function name, file path, line number) for a single frame at the looked-up address.
LibraryInfo
Information about a library (“binary” / “module” / “DSO”) which allows finding symbol files for it. The information can be partial.
PeCodeId
The code ID for a Windows PE file.
SourceFilePath
The path of a source file, as found in the debug info.
SymbolInfo
The symbol for a function.
SymbolManager
SymbolMap
SyncAddressInfo
The lookup result from lookup_sync.

Enums§

BreakpadParseError
BreakpadSymindexParseError
CandidatePathInfo
CodeByteReadingError
CodeId
An enum carrying an identifier for a binary. This is stores the same information as a debugid::CodeId, but without projecting it down to a string.
Error
The error type used in this crate.
ExternalFileAddressInFileRef
Information to find an address within an external file, for debug info lookup.
ExternalFileRef
Information to find an external file with debug information.
FramesLookupResult
Contains address debug info (inlined functions, file names, line numbers) if available.
LookupAddress
An address that can be looked up in a SymbolMap.
MappedPath
A special source file path for source files which are hosted online.
MultiArchDisambiguator
In case the loaded binary contains multiple architectures, this specifies how to resolve the ambiguity. This is only needed on macOS.

Traits§

DebugIdExt
FileAndPathHelper
This is the trait that consumers need to implement so that they can call the main entry points of this crate. This crate contains no direct file access - all access to the file system is via this trait, and its associated trait FileContents.
FileByteSource
FileContents
Provides synchronous access to the raw bytes of a file. This trait needs to be implemented by the consumer of this crate.
FileLocation
A trait which abstracts away the token that’s passed to the FileAndPathHelper::load_file trait method.
OptionallySendFuture
SymbolMapTrait

Functions§

debug_id_and_code_id_for_jitdump
This makes up an ID which looks like an ELF build ID, composed of information from the jitdump file header.
debug_id_for_object
Tries to obtain a DebugId for an object. This uses the build ID, if available, and falls back to hashing the first page of the text section otherwise. Returns None on failure.
demangle_any
Attempt to demangle the passed-in string. This tries a bunch of different demangling schemes.
load_external_file
relative_address_base
The “relative address base” is the base address which LookupAddress::Relative addresses are relative to. You start with an SVMA (a stated virtual memory address), you subtract the relative address base, and out comes a relative address.

Type Aliases§

FileAndPathHelperError
FileAndPathHelperResult