pcap-toolkit 0.2.0

A blazing-fast, data-oriented PCAP manipulation, routing, and transformation tool written in Rust

A high-performance CLI for inspecting, filtering, sorting, modifying, replaying, and exporting PCAP captures — designed to handle everything from quick triage to TB-scale data pipeline ingestion.

Why pcap-toolkit?

tcpdump and tshark are powerful but stop short of the data engineering workflows that security analysts and threat hunters actually need: deterministic flow IDs for correlation, columnar export for DuckDB or Snowflake, timestamp shifting for lab replay, or sorting a months-long multi-file capture that doesn't fit in RAM.

pcap-toolkit fills that gap. Every operation streams packets with a minimal memory footprint and uses Rayon for multi-core throughput — so it stays fast whether your input is a 10 MB sample or a 2 TB archive.

Features

Inspection — info / stats

Extract a full capture summary in a single streaming pass, without loading payloads into RAM:

  • Start and end timestamps (millisecond precision)
  • Total packet count and byte volume
  • Unique source and destination IPs
  • Per-flow statistics keyed by 5-tuple (src_ip, dst_ip, src_port, dst_port, protocol)
  • Deterministic Flow ID (xxh3_64 hash) — bidirectional by default so A→B and B→A share one ID; --unidirectional for direction-aware keying
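As a rough illustration of the bidirectional keying described above, the sketch below orders the two endpoints canonically before hashing, so A→B and B→A produce the same ID. It is not the tool's actual implementation: the Endpoint type and flow_id function are hypothetical names, the byte layout fed to the hash is an assumption, and a hand-written FNV-1a stands in for xxh3_64 so the example needs no external crate.

```rust
use std::net::IpAddr;

/// One side of a flow: address plus transport port (hypothetical type).
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Endpoint {
    pub ip: IpAddr,
    pub port: u16,
}

/// FNV-1a 64-bit. ASSUMPTION: used only as a dependency-free stand-in
/// for xxh3_64; the real tool's hash values will differ.
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3);
    }
    hash
}

/// Append a version tag, the address bytes, and the big-endian port.
fn push_endpoint(buf: &mut Vec<u8>, e: &Endpoint) {
    match e.ip {
        IpAddr::V4(a) => {
            buf.push(4);
            buf.extend_from_slice(&a.octets());
        }
        IpAddr::V6(a) => {
            buf.push(6);
            buf.extend_from_slice(&a.octets());
        }
    }
    buf.extend_from_slice(&e.port.to_be_bytes());
}

/// Hash the 5-tuple. In bidirectional mode the endpoints are sorted first,
/// so both directions of a conversation map to the same ID.
pub fn flow_id(src: Endpoint, dst: Endpoint, protocol: u8, unidirectional: bool) -> u64 {
    let (a, b) = if unidirectional || src <= dst {
        (src, dst)
    } else {
        (dst, src)
    };
    let mut buf = Vec::with_capacity(40);
    push_endpoint(&mut buf, &a);
    push_endpoint(&mut buf, &b);
    buf.push(protocol);
    fnv1a64(&buf)
}
```

Because the hash input depends only on the packet's own fields, the same flow gets the same ID across files and runs, which is what makes the ID usable as a join key.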

Filtering

Composable filters applied after sorting, before any output or replay:

| Filter | CLI | Notes |
|---|---|---|
| Protocol | --proto tcp,udp,icmp | by name or IP protocol number |
| Source IP / CIDR | --src-ip 10.0.0.0/8 | exact or prefix, IPv4 and IPv6 |
| Destination IP / CIDR | --dst-ip 192.168.1.5 | |
| Either endpoint IP | --ip 10.0.0.0/8 | OR across src and dst |
| Source port / range | --src-port 1024-65535 | TCP and UDP only |
| Destination port / range | --dst-port 443 | |
| Either endpoint port | --port 80,443 | |
| Flow ID | --flow-id <hex> | one or more, comma-separated |
| Time window | --from / --to | RFC 3339 or ms epoch |
| TCP flags | --tcp-flags SYN,RST | exact or any match |
| Packet length | --min-len / --max-len | applied to captured length |
| BPF expression | --filter "tcp and dst port 443" | pure-Rust implementation, no libpcap required |

Rules of the same type are OR-ed; different types are AND-ed. Full boolean control (and / or / not) is available in the TOML configuration.
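The combination rule can be sketched as follows: values within one filter type are OR-ed, and the per-type results are AND-ed. This is a minimal illustration under assumptions, not the tool's code. PacketMeta, Cidr, and FilterSet are hypothetical types, only three filter types are modelled, and the CIDR check is IPv4-only.

```rust
use std::net::Ipv4Addr;

/// Minimal packet view with only the fields this sketch filters on.
pub struct PacketMeta {
    pub proto: u8,
    pub src_ip: Ipv4Addr,
    pub dst_port: u16,
}

/// An IPv4 CIDR prefix such as 10.0.0.0/8.
pub struct Cidr {
    pub base: Ipv4Addr,
    pub prefix_len: u8,
}

impl Cidr {
    pub fn contains(&self, ip: Ipv4Addr) -> bool {
        let len = u32::from(self.prefix_len.min(32));
        // A /0 prefix matches everything; shifting a u32 by 32 would overflow.
        let mask = if len == 0 { 0 } else { u32::MAX << (32 - len) };
        (u32::from(ip) & mask) == (u32::from(self.base) & mask)
    }
}

/// Hypothetical filter set; an empty list means "no constraint of this type".
#[derive(Default)]
pub struct FilterSet {
    pub protos: Vec<u8>,
    pub src_nets: Vec<Cidr>,
    pub dst_ports: Vec<u16>,
}

impl FilterSet {
    pub fn matches(&self, p: &PacketMeta) -> bool {
        // OR within each type ...
        let proto_ok = self.protos.is_empty() || self.protos.contains(&p.proto);
        let src_ok = self.src_nets.is_empty()
            || self.src_nets.iter().any(|n| n.contains(p.src_ip));
        let port_ok = self.dst_ports.is_empty() || self.dst_ports.contains(&p.dst_port);
        // ... AND across types.
        proto_ok && src_ok && port_ok
    }
}
```

With protos = [6, 17], src_nets = [10.0.0.0/8], and dst_ports = [443, 80], a TCP packet from 10.1.2.3 to port 443 matches, while the same packet from 192.168.1.5 does not.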

Two-Pass Sorting

Strict chronological ordering with a small, predictable RAM footprint (the index costs about 20 bytes per packet):

  1. First pass — build a (timestamp_ns, byte_offset, length) index. Kept in memory for normal files; streamed to a .idx sidecar on disk for TB-scale inputs (~20 MB index per 1 M packets).
  2. Second pass — sort the index, then seek-and-stream packets in order to the output pipeline.
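The two passes above can be sketched roughly as below. This is an illustration only: it assumes a classic little-endian, microsecond-resolution pcap file (24-byte global header, 16-byte record headers), keeps the index in memory rather than in a .idx sidecar, and uses hypothetical names (IndexEntry, build_index, write_sorted).

```rust
use std::io::{self, Read, Seek, SeekFrom, Write};

const GLOBAL_HEADER_LEN: u64 = 24;
const RECORD_HEADER_LEN: u64 = 16;

/// One index entry: 8 + 8 + 4 = 20 bytes of data per packet
/// (the in-memory struct may be padded; a packed sidecar would not be).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct IndexEntry {
    pub timestamp_ns: u64,
    pub byte_offset: u64, // offset of the record header in the input
    pub length: u32,      // record header plus captured bytes
}

fn le_u32(b: &[u8]) -> u32 {
    u32::from_le_bytes([b[0], b[1], b[2], b[3]])
}

/// First pass: walk the record headers and build the index,
/// seeking over payloads instead of reading them.
pub fn build_index<R: Read + Seek>(input: &mut R) -> io::Result<Vec<IndexEntry>> {
    let mut index = Vec::new();
    let mut offset = input.seek(SeekFrom::Start(GLOBAL_HEADER_LEN))?;
    let mut header = [0u8; RECORD_HEADER_LEN as usize];
    loop {
        match input.read_exact(&mut header) {
            Ok(()) => {}
            Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => break,
            Err(e) => return Err(e),
        }
        let ts_sec = le_u32(&header[0..4]);
        let ts_usec = le_u32(&header[4..8]);
        let incl_len = le_u32(&header[8..12]);
        index.push(IndexEntry {
            timestamp_ns: u64::from(ts_sec) * 1_000_000_000 + u64::from(ts_usec) * 1_000,
            byte_offset: offset,
            length: RECORD_HEADER_LEN as u32 + incl_len,
        });
        // The new position is the offset of the next record header.
        offset = input.seek(SeekFrom::Current(i64::from(incl_len)))?;
    }
    Ok(index)
}

/// Second pass: sort the index, then seek-and-stream records in order.
pub fn write_sorted<R: Read + Seek, W: Write>(
    input: &mut R,
    index: &mut [IndexEntry],
    out: &mut W,
) -> io::Result<()> {
    // Stable sort keeps capture order for packets with identical timestamps.
    index.sort_by_key(|e| e.timestamp_ns);
    let mut global = [0u8; GLOBAL_HEADER_LEN as usize];
    input.seek(SeekFrom::Start(0))?;
    input.read_exact(&mut global)?;
    out.write_all(&global)?;
    let mut buf = Vec::new();
    for entry in index.iter() {
        buf.resize(entry.length as usize, 0);
        input.seek(SeekFrom::Start(entry.byte_offset))?;
        input.read_exact(&mut buf)?;
        out.write_all(&buf)?;
    }
    Ok(())
}
```

Only one packet buffer is live at a time during the second pass, so memory use is dominated by the index rather than by packet data. A real implementation would also need to handle big-endian and nanosecond-resolution files, pcapng, and truncated trailing records.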

Sorted output can be time-sliced into separate files (hourly, daily, or any custom interval).
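One plausible way to assign packets to slices is to round each timestamp down to a multiple of the interval. The sketch below assumes slices are aligned to the Unix epoch; the function names and the file naming scheme are hypothetical and may not match what the tool writes.

```rust
use std::time::Duration;

/// Start of the slice containing `timestamp_ns`.
/// ASSUMPTION: slices are aligned to the Unix epoch.
pub fn slice_start_ns(timestamp_ns: u64, interval: Duration) -> u64 {
    let interval_ns = interval.as_nanos() as u64;
    assert!(interval_ns > 0, "slice interval must be non-zero");
    timestamp_ns - timestamp_ns % interval_ns
}

/// Hypothetical naming scheme: one file per slice, named by slice start in seconds.
pub fn slice_file_name(timestamp_ns: u64, interval: Duration) -> String {
    format!(
        "slice-{}.pcap",
        slice_start_ns(timestamp_ns, interval) / 1_000_000_000
    )
}
```

Because the input to this step is already sorted, a writer only needs to keep one output file open and switch to the next one when the slice start changes.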

Traffic Modification

Applied during the second pass, before writing or replaying:

  • Payload truncation (--max-payload-bytes N): keep only the first N bytes of the application payload, preserving all Ethernet / IP / transport headers. Shrinks storage while retaining full header fidelity for analysis.
  • Timestamp shifting — provide a target start datetime (ms epoch); all timestamps are shifted by the computed delta. Useful for re-anchoring old captures to a lab timeline.
  • IP address mapping — replace specific IPs with others (--replace-ip 10.0.0.1=192.168.1.1) or via a TOML mapping table. Checksums are automatically recomputed after any header change.
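Two of the modifications above lend themselves to a short sketch: the timestamp shift is a single delta applied to every packet, and an IP rewrite has to be followed by a checksum update. The code below is an illustration with hypothetical function names. It covers only the IPv4 header checksum (RFC 1071); TCP and UDP checksums also include the addresses through the pseudo-header and would need the same treatment, which is omitted here.

```rust
use std::net::Ipv4Addr;

/// Shift a timestamp so that the capture's first packet lands on `target_start_ms`.
pub fn shift_timestamp_ms(ts_ms: i64, first_ts_ms: i64, target_start_ms: i64) -> i64 {
    ts_ms + (target_start_ms - first_ts_ms)
}

/// RFC 1071 Internet checksum over an IPv4 header whose checksum field is zeroed.
pub fn ipv4_header_checksum(header: &[u8]) -> u16 {
    let mut sum: u32 = 0;
    for chunk in header.chunks(2) {
        // Pad a trailing odd byte with zero.
        let word = u16::from_be_bytes([chunk[0], *chunk.get(1).unwrap_or(&0)]);
        sum += u32::from(word);
    }
    // Fold carries back into the low 16 bits.
    while sum >> 16 != 0 {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    !(sum as u16)
}

/// Replace the source address if it equals `from`, then recompute the
/// header checksum. Returns true if the header was rewritten.
pub fn replace_src_ip(header: &mut [u8], from: Ipv4Addr, to: Ipv4Addr) -> bool {
    if header.len() < 20 {
        return false;
    }
    let ihl = usize::from(header[0] & 0x0f) * 4; // header length in bytes
    if ihl < 20 || header.len() < ihl || header[12..16] != from.octets()[..] {
        return false;
    }
    header[12..16].copy_from_slice(&to.octets());
    header[10..12].copy_from_slice(&[0, 0]); // zero the checksum field first
    let csum = ipv4_header_checksum(&header[..ihl]);
    header[10..12].copy_from_slice(&csum.to_be_bytes());
    true
}
```

For example, shifting a packet at 1,500 ms in a capture that starts at 1,000 ms to a target start of 10,000 ms yields 10,500 ms, so inter-packet gaps are preserved.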

Export

Convert filtered, sorted captures into modern data formats:

  • JSON — one document per packet with parsed layer fields, flow ID, and Base64/hex payload; optional Zstd payload compression.
  • Apache Parquet — typed columnar schema (timestamps, IPs as integers, ports, flags, flow ID, payload). Row groups encoded in parallel with Rayon.
  • Apache Avro — schema-first encoding; Avro schema file emitted alongside the data for self-describing datasets.

All formats integrate directly with DuckDB, Spark, Snowflake, and Elasticsearch.

Live Replay

Send a processed capture back onto a network interface:

  • Honour original inter-packet timing or apply a speed multiplier (--speed 2.0, --speed max)
  • Set the replay interface with the --interface flag or in the TOML config
  • Requires CAP_NET_RAW; missing capability is caught early with a clear error
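The timing behaviour can be sketched as dividing each original inter-packet gap by the speed multiplier, with "max" meaning no delay at all. The Speed enum and both functions below are hypothetical names for illustration; how the tool parses and applies --speed internally is not documented here.

```rust
use std::time::Duration;

/// Replay speed: a multiplier of the original timing, or as fast as possible.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Speed {
    Multiplier(f64),
    Max,
}

/// Parse the value of a --speed style option ("2.0", "max").
pub fn parse_speed(s: &str) -> Option<Speed> {
    if s == "max" {
        return Some(Speed::Max);
    }
    match s.parse::<f64>() {
        Ok(m) if m.is_finite() && m > 0.0 => Some(Speed::Multiplier(m)),
        _ => None,
    }
}

/// Delay to wait before sending the next packet, given both timestamps.
pub fn inter_packet_delay(prev_ns: u64, next_ns: u64, speed: Speed) -> Duration {
    match speed {
        Speed::Max => Duration::ZERO,
        Speed::Multiplier(m) => {
            // Out-of-order timestamps are treated as a zero gap.
            let gap = next_ns.saturating_sub(prev_ns);
            Duration::from_nanos((gap as f64 / m) as u64)
        }
    }
}
```

At --speed 2.0, two packets captured 2 ms apart would be sent 1 ms apart. A production sender would usually pace against an absolute clock rather than sleeping per gap, so that scheduling jitter does not accumulate.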

Usage

# Summarise a capture
pcap-toolkit info traffic.pcap

# Show per-flow statistics
pcap-toolkit stats traffic.pcap

# Filter to HTTPS traffic from a subnet and export to Parquet
pcap-toolkit export --proto tcp --dst-port 443 --src-ip 10.0.0.0/8 \
  --format parquet --output out.parquet traffic.pcap

# Sort a large capture and split into hourly files
pcap-toolkit sort --slice 1h --output sorted/ traffic.pcap

# Shift timestamps so the capture starts now, then replay at 2× speed
pcap-toolkit replay --shift now --speed 2.0 --interface eth0 traffic.pcap

# Extract a specific flow by ID
pcap-toolkit export --flow-id a3f2c1b0e4d5... --format json traffic.pcap

# Use a BPF expression for complex filtering
pcap-toolkit export --filter "tcp and dst port 443 and src net 10.0.0.0/8" traffic.pcap

Commands and flags are illustrative — see pcap-toolkit --help for the authoritative reference as the CLI stabilises.

Configuration

All options are available as CLI flags or in a TOML config file for repeatable pipelines:

# pcap-toolkit.toml

[[input]]
path = "captures/*.pcap"

[sort]
enabled = true
slice   = "1h"

[filter]
proto          = ["tcp", "udp"]
dst_port       = [443, 80]
src_ip         = ["10.0.0.0/8"]
unidirectional = false   # bidirectional flow IDs (default)

[[output]]
format = "parquet"
path   = "out/traffic.parquet"
compress_payload = true

[[output]]
format = "json"
path   = "out/traffic.json"

[replay]
interface = "eth0"
speed     = 1.0

CLI flags take precedence over the config file.
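A common way to implement this precedence is to model each option as optional at both layers and let the CLI value win whenever it is present. The sketch below uses a hypothetical ReplayOptions struct that mirrors the [replay] table above; it is not taken from the tool's source.

```rust
/// Subset of replay options; `None` means "not specified at this layer".
#[derive(Debug, Default, Clone, PartialEq)]
pub struct ReplayOptions {
    pub interface: Option<String>,
    pub speed: Option<f64>,
}

impl ReplayOptions {
    /// Overlay `cli` on top of `file`: a CLI value wins whenever it is present,
    /// otherwise the config file value (if any) is used.
    pub fn merged(cli: ReplayOptions, file: ReplayOptions) -> ReplayOptions {
        ReplayOptions {
            interface: cli.interface.or(file.interface),
            speed: cli.speed.or(file.speed),
        }
    }
}
```

With interface = "eth0" and speed = 1.0 in the file and only --speed 2.0 on the command line, the merged result replays on eth0 at 2.0.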

Installation

Pre-built binaries

Download the latest binary for your platform from the releases page.

From crates.io

cargo install pcap-toolkit

With Nix

nix run codeberg:slundi/pcap-toolkit

Or add it permanently to your NixOS configuration or home-manager:

inputs.pcap-toolkit.url = "git+https://codeberg.org/slundi/pcap-toolkit.git";

From source

git clone https://codeberg.org/slundi/pcap-toolkit.git
cd pcap-toolkit
cargo install --path .