nfprobe 0.0.1

## Workflow

### Build time

Build the userspace binary `nfprobe` and the BPF object.

### Run time

- At start time, `nfprobe` creates all the maps.

- Then `nfprobe` periodically reads metrics from the metric maps,
  deletes old maps, and creates new maps.

- To enable packet capture for an interface, use `nfprobe` to edit
  the BPF object file and resolve external variables such as ifindex.
  Then install the BPF program using `tc`.

## Tables

### Global maps

The global maps are created by `nfprobe`.  They are pinned to the
global namespace.

- *config map* stores runtime config.  Right now the only runtime
  config is the time offset, i.e. `epoch time - ktime`.  The time
  offset is used to align time buckets.

- *status map* stores miscellaneous metrics from the BPF code.  This
  map is a `PERCPU_ARRAY`.  The metrics include, for example, the
  total number of packets and the number of map lookup errors.
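
The role of the time offset can be sketched as follows.  This is a minimal illustration, not the crate's code: the nanosecond unit, the `BUCKET_NS` width, and the function names are assumptions, and it assumes bucket boundaries are expressed in epoch time.

```rust
/// Width of one time bucket in nanoseconds (hypothetical value: 60 s).
pub const BUCKET_NS: u64 = 60 * 1_000_000_000;

/// Offset stored in the config map: `epoch time - ktime`.
pub fn time_offset_ns(epoch_ns: u64, ktime_ns: u64) -> u64 {
    // ktime counts from boot, so it is expected to be the smaller value;
    // saturate instead of panicking if that assumption does not hold.
    epoch_ns.saturating_sub(ktime_ns)
}

/// Map a kernel timestamp to the `(start_ts, end_ts)` bucket it falls in.
/// The bucket is treated as the half-open interval `[start_ts, end_ts)`.
pub fn bucket_for(ktime_ns: u64, offset_ns: u64) -> (u64, u64) {
    let epoch_ns = ktime_ns + offset_ns;
    let start_ts = epoch_ns - epoch_ns % BUCKET_NS;
    (start_ts, start_ts + BUCKET_NS)
}
```

With the offset applied, the BPF side only needs its own clock reading plus one value from the config map to find the bucket, and userspace can compute the same boundaries from wall-clock time.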

### Netflow metrics maps

For each protocol, netflow metrics are stored in a two-level map
structure.

The top level is a bucket map of type `HASH_OF_MAPS`.  The key of the
bucket map, `(start_ts, end_ts)`, specifies the time bucket.  The
value of the bucket map points to a second-level map.

The bucket maps are created by `nfprobe`.  They are pinned to the
global namespace.

The second-level maps are metric maps that store the netflow metrics
of a bucket.  The key of a metric map has the following layout:

    (cpu, ifindex, direction).(protocol keys)

The protocol keys are protocol specific.  For example, the IP metric
key is `(saddr, daddr, protocol)`.  The value of the metric map stores
the netflow metrics.
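
As an illustration of the two-level layout, the sketch below models both levels with ordinary `HashMap`s in userspace.  It is not the BPF map definition: the field widths, the `Metric` fields (`packets`, `bytes`), and the `record` helper are assumptions made for the example.

```rust
use std::collections::HashMap;

/// Key of the top-level bucket map.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct BucketKey { pub start_ts: u64, pub end_ts: u64 }

/// Common prefix of every metric key: `(cpu, ifindex, direction)`.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct KeyPrefix { pub cpu: u32, pub ifindex: u32, pub direction: u8 }

/// Protocol-specific part for IP: `(saddr, daddr, protocol)`.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct IpKey { pub saddr: u32, pub daddr: u32, pub protocol: u8 }

/// Full key of an IP metric map: prefix followed by protocol keys.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct IpMetricKey { pub prefix: KeyPrefix, pub ip: IpKey }

/// Hypothetical metric value.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct Metric { pub packets: u64, pub bytes: u64 }

pub type MetricMap = HashMap<IpMetricKey, Metric>;
pub type BucketMap = HashMap<BucketKey, MetricMap>;

/// Account one packet.  Returns `false` when the bucket has no metric map,
/// which a BPF program would presumably count as a map lookup error.
pub fn record(buckets: &mut BucketMap, bucket: BucketKey, key: IpMetricKey, bytes: u64) -> bool {
    match buckets.get_mut(&bucket) {
        Some(metrics) => {
            let m = metrics.entry(key).or_default();
            m.packets += 1;
            m.bytes += bytes;
            true
        }
        None => false,
    }
}
```

Note that `record` never creates the inner map itself; in this model, as in the description above, creating metric maps is left to `nfprobe`.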

The metric maps are created by `nfprobe`.  Each map needs to be
created before the `start_ts` of its time bucket, and deleted after
`nfprobe` has collected all of its metrics.
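
The create-early, delete-late lifecycle might look like the following sketch.  The `Rotation` type, the `lookahead` setting, and the `String` standing in for a real map handle are all hypothetical; the sketch only shows the bookkeeping, not any BPF syscalls.

```rust
use std::collections::BTreeMap;

/// Userspace bookkeeping for metric maps, keyed by bucket `start_ts`.
pub struct Rotation {
    /// Bucket width, in the same unit as the timestamps.
    pub width: u64,
    /// Number of future buckets to keep pre-created (assumed setting).
    pub lookahead: u64,
    /// `start_ts` -> placeholder for a metric map handle.
    pub maps: BTreeMap<u64, String>,
}

impl Rotation {
    /// Make sure the current bucket and the next `lookahead` buckets have
    /// a map, so that each map exists before its `start_ts`.
    pub fn ensure_future(&mut self, now: u64) {
        let current = now - now % self.width;
        for i in 0..=self.lookahead {
            let start = current + i * self.width;
            self.maps
                .entry(start)
                .or_insert_with(|| format!("metrics_{}", start));
        }
    }

    /// Remove and return the maps whose bucket has already ended.  The
    /// caller collects their metrics and then drops (deletes) them.
    pub fn take_expired(&mut self, now: u64) -> Vec<(u64, String)> {
        let current = now - now % self.width;
        // `split_off` keeps keys < current in `self.maps` and returns the rest.
        let live = self.maps.split_off(&current);
        let expired = std::mem::replace(&mut self.maps, live);
        expired.into_iter().collect()
    }
}
```

A periodic loop would call `ensure_future` first and `take_expired` second, so that a map is only removed once its bucket can no longer receive packets.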

## Netflow Metric

Each netflow metric has a unique ID (UID) for de-duplication.  The UID
is:

    (hostname, start_ts, end_ts, metric_key)
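
A consumer could use the UID to drop records it has already seen, for instance after a retransmission.  The sketch below assumes the metric key is carried as raw bytes; the `MetricUid` type and the `dedup` helper are illustrative names, not part of the crate.

```rust
use std::collections::HashSet;

/// UID of a netflow metric: `(hostname, start_ts, end_ts, metric_key)`.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct MetricUid {
    pub hostname: String,
    pub start_ts: u64,
    pub end_ts: u64,
    /// Serialized metric key; the byte encoding is an assumption.
    pub metric_key: Vec<u8>,
}

/// Keep the first record for each UID and drop later duplicates,
/// preserving the original order.
pub fn dedup<T>(records: Vec<(MetricUid, T)>) -> Vec<(MetricUid, T)> {
    let mut seen = HashSet::new();
    records
        .into_iter()
        // `insert` returns false when the UID was already present.
        .filter(|(uid, _)| seen.insert(uid.clone()))
        .collect()
}
```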

Each packet is counted once.  For example, a TCP packet is counted in
TCP metrics, but not in IP metrics.
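
One way to get the count-once behavior is to pick a single table per packet, choosing the most specific protocol that has its own metrics.  In this sketch only the TCP and IP cases come from the text above; the UDP table and the `table_for` name are assumptions added for illustration.

```rust
/// Metric table a packet is accounted in.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Table {
    Tcp,
    Udp, // assumed: not mentioned in the text above
    Ip,
}

/// IANA protocol numbers as found in the IP header.
pub const IPPROTO_TCP: u8 = 6;
pub const IPPROTO_UDP: u8 = 17;

/// Choose exactly one table for a packet.  Protocols without their own
/// table fall back to the IP table.
pub fn table_for(ip_protocol: u8) -> Table {
    match ip_protocol {
        IPPROTO_TCP => Table::Tcp,
        IPPROTO_UDP => Table::Udp,
        _ => Table::Ip,
    }
}
```

Because the function returns a single value, a packet cannot end up in both the TCP and the IP metrics.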

## Data Enrichment

Use SQLite to store enrichment data.

## Error handling

### BPF errors

BPF errors are counted in the global status map.  Perf events are also
generated for errors.  The perf events are rate-limited to avoid
generating too many events.
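
The rate limiting could be as simple as a fixed-window counter.  The sketch below models that logic in Rust for illustration only; the actual BPF-side mechanism is not described here, and the `EventLimiter` type, its fields, and the window semantics are assumptions.

```rust
/// Fixed-window limiter: at most `max_per_window` events per window.
pub struct EventLimiter {
    pub max_per_window: u32,
    pub window_ns: u64,
    window_start: u64,
    emitted: u32,
    /// Number of events dropped by the limiter so far.
    pub suppressed: u64,
}

impl EventLimiter {
    pub fn new(max_per_window: u32, window_ns: u64) -> Self {
        EventLimiter { max_per_window, window_ns, window_start: 0, emitted: 0, suppressed: 0 }
    }

    /// Returns `true` if an event at time `now_ns` may be emitted.
    pub fn allow(&mut self, now_ns: u64) -> bool {
        // Start a new window once the current one has fully elapsed.
        if now_ns.saturating_sub(self.window_start) >= self.window_ns {
            self.window_start = now_ns;
            self.emitted = 0;
        }
        if self.emitted < self.max_per_window {
            self.emitted += 1;
            true
        } else {
            self.suppressed += 1;
            false
        }
    }
}
```

Errors that are suppressed here are not lost in this design, since every error is still counted in the status map regardless of whether an event is emitted.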

## TODO

- Use perf events for panics (see Cilium for reference).

- Document how the map reference count is managed.  `bpf_map_put` is
  called when:

  1. the file object is deleted (`OBJ_PIN`);
  2. the associated program is deleted: `bpf_prog_put()` leads to
     `free_used_maps(aux)`.