
Crate ud_arch_bpf


Linux eBPF + Solana SBF (sBPFv1 / sBPFv2) decoder + minimal lifter.

Every BPF “slot” is 8 bytes: a 1-byte opcode, a 1-byte pair of dst/src nibbles, a signed le16 offset, and a signed le32 immediate. One special instruction — lddw (load 64-bit immediate, opcode 0x18) — takes two consecutive slots: the first carries bits [31:0] in imm, the second has opcode 0 and bits [63:32] in its imm.
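The slot layout described above can be sketched in a few lines. This is an illustration only: `RawSlot`, `parse_slot`, and `lddw_imm64` are hypothetical names, not this crate's API, and the nibble order (dst in the low nibble, src in the high nibble) is assumed from the little-endian Linux eBPF convention.

```rust
/// Hypothetical illustration type; not the crate's `DecodedInsn`.
#[derive(Debug, PartialEq)]
struct RawSlot {
    opcode: u8,
    dst: u8,  // low nibble of byte 1 (assumed Linux little-endian convention)
    src: u8,  // high nibble of byte 1
    off: i16, // signed little-endian 16-bit offset
    imm: i32, // signed little-endian 32-bit immediate
}

/// Split one 8-byte slot into its fields.
fn parse_slot(b: &[u8; 8]) -> RawSlot {
    RawSlot {
        opcode: b[0],
        dst: b[1] & 0x0f,
        src: b[1] >> 4,
        off: i16::from_le_bytes([b[2], b[3]]),
        imm: i32::from_le_bytes([b[4], b[5], b[6], b[7]]),
    }
}

/// Combine the two halves of an lddw pair: the first slot carries
/// bits [31:0] in its imm, the second carries bits [63:32].
fn lddw_imm64(first: &RawSlot, second: &RawSlot) -> u64 {
    (first.imm as u32 as u64) | ((second.imm as u32 as u64) << 32)
}
```

Casting each `imm` through `u32` before widening avoids sign-extending the low half into the high half of the 64-bit result.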

Solana SBF (classic / sBPFv1) and Agave sBPFv2 reuse the same encoding with a handful of extra opcodes:

  • CALL_REG (0x8d) — register-indexed dynamic call (added in sBPFv1).
  • UDIV / SDIV / UREM / SREM PQR variants — sBPFv2 dedicated division/remainder ops (the Linux eBPF opcodes for these slots mean different things or are absent).
  • Explicit sign-extends (SXH/SXW/SXD) — sBPFv2.

The decoder is variant-gated. Opcodes we know the mnemonic for in the configured variant emit InsnKind::* with a readable text rendering; opcodes we don’t recognise emit InsnKind::Unknown and the raw 8 bytes are preserved verbatim — the round-trip property holds via byte identity regardless of whether we can name the instruction.
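A reduced sketch of variant gating with byte-identity round-tripping might look like the following. The `Variant`, `Kind`, and `Insn` types here are hypothetical stand-ins with only two variants and three kinds; they are not the crate's `BpfVariant`, `InsnKind`, or `DecodedInsn`. The `ja` opcode value 0x05 comes from the Linux eBPF instruction set; 0x8d for CALL_REG is the value listed above.

```rust
/// Hypothetical two-variant selector (the real crate has more variants).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Variant {
    LinuxEbpf,
    SbpfV1,
}

/// Hypothetical coarse classification.
#[derive(Debug, PartialEq)]
enum Kind {
    Ja,
    CallReg,
    Unknown,
}

/// The raw bytes are always kept, whatever the classification.
struct Insn {
    kind: Kind,
    raw: [u8; 8],
}

/// Classify one slot under the given variant.
fn decode_one(raw: [u8; 8], variant: Variant) -> Insn {
    let kind = match (raw[0], variant) {
        (0x05, _) => Kind::Ja,                    // shared opcode
        (0x8d, Variant::SbpfV1) => Kind::CallReg, // SBF-only opcode
        _ => Kind::Unknown,                       // name unknown in this variant
    };
    Insn { kind, raw }
}

/// Re-encoding returns the preserved bytes, so decode followed by
/// encode is the identity even for `Kind::Unknown`.
fn encode_one(insn: &Insn) -> [u8; 8] {
    insn.raw
}
```

Because `encode_one` never consults `kind`, the round-trip holds whether or not the variant can name the opcode.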

References:

  • Linux Kernel — eBPF Instruction Set, v6.5 docs.
  • solana_rbpf — text format and SBF-specific opcode set.

Structs§

BpfCodec
One codec per BPF variant.
DecodedInsn
One decoded BPF slot.

Enums§

AssembleError
Errors the assembler surfaces. Each one points at the specific shape that failed to parse / encode, so the decompile-time byte-drop pass can keep bytes pinned for the lines we can’t yet handle (typically symbolic forms).
BpfVariant
Variant selector. The bytes for shared opcodes are identical across variants; the variant only changes which opcodes we know the mnemonic for and which ones are legal per the runtime that consumes the bytecode.
Error
Errors specific to the BPF backend.
InsnKind
Coarse classification — enough to drive CFG construction and to pick the text rendering. The variant-specific mnemonic (e.g. udiv64 vs udiv32 for sBPFv2) is derived from the raw opcode byte at format time; we don’t carry a separate mnemonic field on DecodedInsn.

Constants§

EM_BPF
EM_BPF from the ELF spec (Linux eBPF).
EM_SBF
EM_SBF from Solana’s ELF extension (sBPFv1 / sBPFv2 — variant distinction needs e_flags).
INSN_SIZE
On-disk size of one BPF instruction slot (8 bytes).

Functions§

assemble_bpf
Assemble one BPF instruction text into its 8-byte slot encoding. The address argument is currently unused — BPF branch offsets are encoded as slot-relative i16 values taken directly from the text, so the assembler doesn’t need to know where in the function it lives.
assemble_bpf_ifblock_cond
Encode the jcc instruction that drives an ifblock / whileblock’s framing. Signed slot counts are parsed like parse_int but with a leading - accepted (used by the desymbolised call_internal form, whose imm may be negative when calling a function earlier in the section).
assemble_bpf_ja
Convenience: encode ja +offset / ja -offset. Used for then_tail_jmp (jumps over an else body) and tail_bytes (back-edge of a while loop). Always 8 bytes.
call_target
Compute the absolute byte-address target of a call <imm> instruction for a local call. The imm field on a BPF call is a signed slot offset relative to the next slot.
classify
Pure re-classifier — re-derives kind from opcode + the configured variant. Useful when something wants to re-walk a slice of decoded slots after the fact (matches the classify contract from other arch crates).
decode
Decode bytes as a BPF instruction stream starting at virtual address start. Buffer length must be a multiple of INSN_SIZE. The decoder recognises lddw (opcode 0x18) and emits two DecodedInsns for it — one Lddw carrying the 64-bit immediate, plus a LddwSecondHalf continuation — so each output DecodedInsn still has exactly 8 bytes.
desymbolize_bpf_text
Convert a symbolic BPF @asm text — the form crates/ud-translate/src/decompile/bpf.rs produces after applying label_<hex> and sub_<hex> rewrites — into the numeric form assemble_bpf accepts.
format_insn
Render a decoded instruction as text. Matches the solana_rbpf / llvm-objdump dialect closely enough that a reader who knows BPF will recognise everything.
jump_target
Compute the absolute byte-address target of a relative jump. BPF offsets are counted in slots (8 bytes each) and are relative to the instruction after this one.
lift_function
Lift a decoded instruction stream into a CFG.
register
Register the BPF codec factory with ud_arch_codec::registry.
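
The slot-relative arithmetic behind jump_target, call_target, and assemble_bpf_ja can be sketched as follows. The function names carry a `_sketch` suffix because they are hypothetical: the crate's real signatures are not shown on this page, and the use of wrapping arithmetic on `u64` addresses is an assumption. The `ja` opcode value 0x05 comes from the Linux eBPF instruction set.

```rust
/// Size of one slot in bytes (the documented value of INSN_SIZE).
const SLOT: u64 = 8;

/// Shared arithmetic: the target is the address of the next slot
/// plus a signed number of slots.
fn rel_target(insn_addr: u64, slots: i64) -> u64 {
    insn_addr
        .wrapping_add(SLOT)
        .wrapping_add((slots * SLOT as i64) as u64)
}

/// Jump: the signed 16-bit `off` field counts slots.
fn jump_target_sketch(insn_addr: u64, off: i16) -> u64 {
    rel_target(insn_addr, off as i64)
}

/// Local call: the signed 32-bit `imm` field counts slots.
fn call_target_sketch(insn_addr: u64, imm: i32) -> u64 {
    rel_target(insn_addr, imm as i64)
}

/// Encode an unconditional `ja` with the given slot offset.
/// Only the opcode and the le16 offset are set; all other bytes are zero.
fn encode_ja_sketch(off: i16) -> [u8; 8] {
    let mut b = [0u8; 8];
    b[0] = 0x05;
    b[2..4].copy_from_slice(&off.to_le_bytes());
    b
}
```

For example, a jump at address 0x100 with offset +2 lands at 0x118: the next slot starts at 0x108, and two more slots add 16 bytes.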

Type Aliases§

Result