
This crate obtains symbol information from binaries and compilation artifacts. It maps raw code addresses to symbol strings and, if available, to file name and line number information. The API was designed for the Firefox profiler.

The main entry point of this crate is the async query_api function, which accepts a JSON string with the query input. The JSON API matches the API of the Mozilla symbolication server (“Tecken”). An alternative JSON-free API is available too, but it is not very ergonomic.

Design constraints

This crate operates under the following design constraints:

  • Must be usable from JavaScript / WebAssembly: The Firefox profiler runs this code in a WebAssembly environment, invoked from a privileged piece of JavaScript code inside Firefox itself. This setup allows us to download the profiler-get-symbols wasm bundle on demand, rather than shipping it with Firefox, which would increase the Firefox download size for a piece of functionality that the vast majority of Firefox users don’t need.
  • Performance: We want to be able to obtain symbol data from a fresh build of a locally compiled Firefox instance as quickly as possible, without an expensive preprocessing step. The time between “finished compilation” and “returned symbol data” should be minimized. This means that symbol data needs to be obtained directly from the compilation artifacts rather than from, say, a dSYM bundle or a Breakpad .sym file.
  • Must scale to large inputs: This applies to both the size of the API request and the size of the object files that need to be parsed. The Firefox profiler will supply anywhere between tens of thousands and hundreds of thousands of different code addresses in a single symbolication request. Firefox build artifacts such as libxul.so can be multiple gigabytes in size and contain around 300,000 function symbols. We want to serve such requests within a few seconds or less.
  • “Best effort” basis: If only limited symbol information is available, for example from system libraries, we want to return whatever limited information we have.

The WebAssembly requirement means that this crate cannot contain any direct file access. Instead, all file access is mediated through a FileAndPathHelper trait which has to be implemented by the caller. Furthermore, the API request does not carry any absolute file paths, so the resolution to absolute file paths needs to be done by the caller as well.

Supported formats and data

This crate supports obtaining symbol data from PE binaries (Windows), PDB files (Windows), Mach-O binaries, including fat binaries (macOS & iOS), and ELF binaries (Linux, Android, etc.). For Mach-O files it also supports finding debug information in external objects, by following OSO stabs entries. It supports gathering both basic symbol information (function name strings) and information based on debug data, i.e. inline callstacks where each frame has a function name, a file name, and a line number. For debug data we support both DWARF debug data (inside Mach-O and ELF binaries) and PDB debug data.

Example

use profiler_get_symbols::{
    FileContents, FileAndPathHelper, FileAndPathHelperResult, OptionallySendFuture,
    CandidatePathInfo, FileLocation
};
use profiler_get_symbols::debugid::DebugId;

async fn run_query() -> String {
    let this_dir = std::path::PathBuf::from(env!("CARGO_MANIFEST_DIR"));
    let helper = ExampleHelper {
        artifact_directory: this_dir.join("..").join("fixtures").join("win64-ci")
    };
    profiler_get_symbols::query_api(
        "/symbolicate/v5",
        r#"{
            "memoryMap": [
              [
                "firefox.pdb",
                "AA152DEB2D9B76084C4C44205044422E1"
              ]
            ],
            "stacks": [
              [
                [0, 204776],
                [0, 129423],
                [0, 244290],
                [0, 244219]
              ]
            ]
          }"#,
        &helper,
    ).await
}

struct ExampleHelper {
    artifact_directory: std::path::PathBuf,
}

impl<'h> FileAndPathHelper<'h> for ExampleHelper {
    type F = Vec<u8>;
    type OpenFileFuture =
        std::pin::Pin<Box<dyn std::future::Future<Output = FileAndPathHelperResult<Self::F>> + 'h>>;

    fn get_candidate_paths_for_binary_or_pdb(
        &self,
        debug_name: &str,
        _debug_id: &DebugId,
    ) -> FileAndPathHelperResult<Vec<CandidatePathInfo>> {
        Ok(vec![CandidatePathInfo::SingleFile(FileLocation::Path(self.artifact_directory.join(debug_name)))])
    }

    fn open_file(
        &'h self,
        location: &FileLocation,
    ) -> std::pin::Pin<Box<dyn std::future::Future<Output = FileAndPathHelperResult<Self::F>> + 'h>> {
        async fn read_file_impl(path: std::path::PathBuf) -> FileAndPathHelperResult<Vec<u8>> {
            Ok(std::fs::read(&path)?)
        }

        let path = match location {
            FileLocation::Path(path) => path.clone(),
            FileLocation::Custom(_) => panic!("Unexpected FileLocation::Custom"),
        };
        // `path` is already an owned PathBuf here, so no further copy is needed.
        Box::pin(read_file_impl(path))
    }
}

Re-exports

pub use debugid;
pub use object;
pub use pdb_addr2line::pdb;

Structs

A “compact” representation of a symbol table. This is a legacy concept used by the Firefox profiler and kept for compatibility purposes. It’s called SymbolTableAsTuple in the profiler code.
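To make the idea of a compact symbol table concrete, here is a minimal sketch. It assumes a three-array layout (sorted start addresses, offsets into a name buffer, and one shared byte buffer); the type name `CompactTable`, its fields, and both methods are hypothetical and are not taken from this crate, whose actual struct may differ.

```rust
/// Hypothetical stand-in for a compact symbol table. The field names and
/// layout are assumptions made for illustration only.
pub struct CompactTable {
    /// Function start addresses, sorted ascending.
    pub addr: Vec<u32>,
    /// Offsets into `buffer`; always has `addr.len() + 1` entries.
    pub index: Vec<u32>,
    /// All symbol names concatenated, without separators.
    pub buffer: Vec<u8>,
}

impl CompactTable {
    /// Builds a table from (address, name) pairs given in any order.
    pub fn from_symbols(mut symbols: Vec<(u32, &str)>) -> Self {
        symbols.sort_by_key(|&(address, _)| address);
        let mut table = CompactTable {
            addr: Vec::new(),
            index: vec![0],
            buffer: Vec::new(),
        };
        for (address, name) in symbols {
            table.addr.push(address);
            table.buffer.extend_from_slice(name.as_bytes());
            table.index.push(table.buffer.len() as u32);
        }
        table
    }

    /// Returns the name of the last symbol that starts at or before `address`.
    pub fn lookup(&self, address: u32) -> Option<&str> {
        // Number of symbols whose start address is <= `address`.
        let count = self.addr.partition_point(|&start| start <= address);
        let i = count.checked_sub(1)?;
        let (start, end) = (self.index[i] as usize, self.index[i + 1] as usize);
        std::str::from_utf8(&self.buffer[start..end]).ok()
    }
}
```

Storing names in one buffer avoids a separate allocation per symbol, and the sorted address array allows each lookup to be a binary search, which matters when a single request carries many thousands of addresses.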

A struct that wraps a number of parameters for various “get_symbolication_result” functions.

Enums

Traits

This is the trait that consumers need to implement so that they can call the main entry points of this crate. This crate contains no direct file access; all access to the file system goes through this trait and its associated trait FileContents.

Provides synchronous access to the raw bytes of a file. This trait needs to be implemented by the consumer of this crate.

A trait which allows many “get_symbolication_result” functions to share code between the implementation that constructs a full symbol table and the implementation that constructs a JSON response with data per looked-up address.

Functions

Tries to obtain a DebugId for an object. This uses the build ID, if available, and falls back to hashing the first page of the text section otherwise. Returns None on failure.
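The fallback order described above can be sketched without any object-file parsing. The helper below is hypothetical: it takes the build ID and the text section as plain byte slices, assumes a page size of 4096 bytes, and uses a simple XOR fold as a stand-in hash. None of these details are confirmed by this page, and the result is a raw 16-byte array rather than a real DebugId.

```rust
/// Size of the text-section prefix that gets hashed (assumed: one 4 KiB page).
const PAGE_SIZE: usize = 4096;

/// Hypothetical helper: derives 16 identifier bytes from a build ID if one
/// exists, otherwise from the first page of the text section.
pub fn identifier_bytes(
    build_id: Option<&[u8]>,
    text_section: Option<&[u8]>,
) -> Option<[u8; 16]> {
    if let Some(id) = build_id.filter(|id| !id.is_empty()) {
        // Preferred path: the first 16 bytes of the build ID, zero-padded.
        let mut out = [0u8; 16];
        let len = id.len().min(16);
        out[..len].copy_from_slice(&id[..len]);
        return Some(out);
    }
    // Fallback: XOR-fold the first page of the text section into 16 bytes.
    let text = text_section?;
    let first_page = &text[..text.len().min(PAGE_SIZE)];
    let mut hash = [0u8; 16];
    for (i, byte) in first_page.iter().enumerate() {
        hash[i % 16] ^= *byte;
    }
    Some(hash)
}
```

Because the fallback depends only on the code bytes, two copies of the same binary yield the same identifier even when no build ID was embedded at link time.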

Returns a symbol table in CompactSymbolTable format for the requested binary. FileAndPathHelper must be implemented by the caller, to provide file access.

A generic method which is used in the implementation of both get_compact_symbol_table and query_api. Allows obtaining symbol data for a given binary. The level of detail is determined by query.result_kind: The caller can either get a regular symbol table, or extended information for a set of addresses, if the information is present in the found files. See SymbolicationResultKind for more details.

This is the main API of this crate. It implements the “Tecken” JSON API, which is also used by the Mozilla symbol server. It’s intended to be used as a drop-in “local symbol server” which gathers its data directly from file artifacts produced during compilation (rather than consulting e.g. a database). The caller needs to implement the FileAndPathHelper trait to provide file system access. The return value is a JSON string.
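Callers that start from raw addresses need to produce the JSON request body themselves. The sketch below builds a body with the same shape as the request in the example on this page, for a single module and a single stack. The function `build_symbolication_request` is hypothetical and not part of this crate, and reading each frame as a pair of module index and address is an interpretation of that example.

```rust
/// Hypothetical helper: builds a request body for one module and one stack.
/// `debug_name` and `breakpad_id` are inserted verbatim, so they must not
/// contain characters that need JSON escaping.
pub fn build_symbolication_request(
    debug_name: &str,
    breakpad_id: &str,
    addresses: &[u32],
) -> String {
    // Each frame is written as [index into memoryMap, address]; the index is
    // always 0 because this sketch sends exactly one module.
    let frames: Vec<String> = addresses
        .iter()
        .map(|address| format!("[0,{}]", address))
        .collect();
    format!(
        r#"{{"memoryMap":[["{}","{}"]],"stacks":[[{}]]}}"#,
        debug_name,
        breakpad_id,
        frames.join(",")
    )
}
```

The resulting string can be passed as the second argument of query_api, in place of the literal used in the example. A real caller would more likely use a JSON library, which also handles escaping.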

Type Definitions