Module evaluate

Per-image evaluation orchestrator.

The bridge between the dataset layer (crate::CocoDataset / crate::CocoDetections) and the IoU-type-agnostic spine (crate::matching and crate::accumulate). Pycocotools fuses these in evaluate() (cocoeval.py:174-216); we keep the layers separate so the spine stays untouchable per ADR-0005.

The pass is generic over EvalKernel, a Similarity supertrait that adds the dataset-bridging methods that turn an (image, category) cell into kernel-typed annotations. Bbox and segm reuse the same orchestrator with BboxIou and SegmIou respectively; further kernels (OKS, Boundary IoU) plug in by adding a single impl EvalKernel for FooIou block, leaving match_image, accumulate, and summarize_* untouched.
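
The supertrait arrangement described above can be illustrated with a minimal, self-contained sketch. Every name in it (ToySimilarity, ToyEvalKernel, ToyBboxIou, BoxAnn, iou_matrix) is hypothetical and chosen so it cannot be mistaken for this crate's API; the real trait signatures are not shown on this page and may differ. The sketch also ignores crowd handling.

```rust
/// Toy stand-in for a similarity kernel: scores every (GT, DT) pair.
trait ToySimilarity {
    type Ann;
    /// Returns a GT x DT matrix (one row per GT) of similarity scores.
    fn compute(&self, gts: &[Self::Ann], dts: &[Self::Ann]) -> Vec<Vec<f64>>;
}

/// Toy stand-in for the bridging supertrait: turns raw dataset rows into
/// kernel-typed annotations. Assumption: one builder serves GTs and DTs.
trait ToyEvalKernel: ToySimilarity {
    fn build_anns(&self, raw: &[[f64; 4]]) -> Vec<Self::Ann>;
}

/// Axis-aligned box in COCO [x, y, w, h] layout.
#[derive(Clone, Copy)]
struct BoxAnn { x: f64, y: f64, w: f64, h: f64 }

struct ToyBboxIou;

fn iou(a: &BoxAnn, b: &BoxAnn) -> f64 {
    // Overlap extents, clamped at zero for disjoint boxes.
    let iw = ((a.x + a.w).min(b.x + b.w) - a.x.max(b.x)).max(0.0);
    let ih = ((a.y + a.h).min(b.y + b.h) - a.y.max(b.y)).max(0.0);
    let inter = iw * ih;
    let union = a.w * a.h + b.w * b.h - inter;
    if union > 0.0 { inter / union } else { 0.0 }
}

impl ToySimilarity for ToyBboxIou {
    type Ann = BoxAnn;
    fn compute(&self, gts: &[BoxAnn], dts: &[BoxAnn]) -> Vec<Vec<f64>> {
        gts.iter()
            .map(|g| dts.iter().map(|d| iou(g, d)).collect())
            .collect()
    }
}

impl ToyEvalKernel for ToyBboxIou {
    fn build_anns(&self, raw: &[[f64; 4]]) -> Vec<BoxAnn> {
        raw.iter()
            .map(|r| BoxAnn { x: r[0], y: r[1], w: r[2], h: r[3] })
            .collect()
    }
}

/// Generic orchestrator step: it only sees the traits, so a new kernel
/// needs nothing here beyond its own pair of impl blocks.
fn iou_matrix<K: ToyEvalKernel>(
    kernel: &K,
    gt_raw: &[[f64; 4]],
    dt_raw: &[[f64; 4]],
) -> Vec<Vec<f64>> {
    let gts = kernel.build_anns(gt_raw);
    let dts = kernel.build_anns(dt_raw);
    kernel.compute(&gts, &dts)
}
```

Calling iou_matrix(&ToyBboxIou, gt_rows, dt_rows) yields one row per GT and one column per DT. Swapping in a different kernel type changes only the type argument, which is the property the paragraph above attributes to the real orchestrator.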

§What this layer does

For each (image, category) cell:

  1. Gather GTs and DTs from the dataset indices.
  2. Pre-filter DTs to the top max_dets_per_image by score (the matching engine and accumulator both rely on this cap; smaller max_dets values are sliced downstream by accumulate).
  3. Build the kernel’s annotation slices via EvalKernel::build_gt_anns / EvalKernel::build_dt_anns and compute the GT × DT IoU matrix once via Similarity::compute.
  4. For each area range, build the per-call _ignore vector (quirk D3) from the dataset’s base ignore (D1) plus the area filter (D6/D7), run the crate::matching engine, apply quirk B7 by flipping dt_ignore for unmatched DTs whose area is outside the active range, and pack the result as a crate::accumulate::PerImageEval at [k][a][i].
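
Steps 2 and 4 above can be sketched in isolation. This is a simplified illustration and not the crate's code: Range, Gt, Dt, and the three functions are hypothetical names, the matching engine itself is omitted (its result is passed in as a plain matched slice), and the stable tie order in the sort is an assumption.

```rust
/// Hypothetical closed area bucket; both bounds are inclusive (D6/D7).
#[derive(Clone, Copy)]
struct Range { lo: f64, hi: f64 }

impl Range {
    fn contains(&self, area: f64) -> bool {
        self.lo <= area && area <= self.hi
    }
}

/// Hypothetical ground-truth row: area plus the dataset's base ignore (D1).
struct Gt { area: f64, base_ignore: bool }

/// Hypothetical detection row.
struct Dt { area: f64, score: f64 }

/// Step 2: keep the `max_dets` highest-scoring detections.
/// `sort_by` is stable, so equal scores keep their input order.
fn top_k_by_score(mut dts: Vec<Dt>, max_dets: usize) -> Vec<Dt> {
    dts.sort_by(|a, b| b.score.total_cmp(&a.score));
    dts.truncate(max_dets);
    dts
}

/// Step 4 (D3): build a fresh ignore vector for one area range.
/// The input slice is only read, so the dataset is never mutated.
fn gt_ignore(gts: &[Gt], range: Range) -> Vec<bool> {
    gts.iter()
        .map(|g| g.base_ignore || !range.contains(g.area))
        .collect()
}

/// Step 4 (B7): after matching, flag every unmatched detection whose
/// area falls outside the active range. Matched detections are untouched.
fn dt_ignore_after_match(
    dts: &[Dt],
    matched: &[bool],
    dt_ignore: &mut [bool],
    range: Range,
) {
    for ((d, &m), ig) in dts.iter().zip(matched).zip(dt_ignore.iter_mut()) {
        if !m && !range.contains(d.area) {
            *ig = true;
        }
    }
}
```

Because gt_ignore returns a new vector per call, the same GT slice can be reused across all area ranges, which is the point of computing the per-call ignore vector instead of writing it back onto the annotations.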

§Quirk dispositions handled here

  • D3 (aligned): per-call _ignore computed without mutating the dataset.
  • D6/D7 (strict): the area filter keeps an annotation when lo <= area <= hi, so both bounds are inclusive. This is the complement of the exclusion at cocoeval.py:251 (g['area'] < aRng[0] or g['area'] > aRng[1]). An annotation whose area equals a shared bucket boundary therefore lands in both adjacent buckets. The inequality direction matches the eval-time filter in pycocotools, not getAnnIds(areaRng=...).
  • B7 (strict): unmatched DTs whose area is out of range get dt_ignore=true so they do not contribute to the precision/recall curve in this area cell.
  • AA3 (strict, ADR-0026): when the dataset carries LVIS federated metadata and the current (image, category) cell is in not_exhaustive_category_ids[image], every unmatched DT in the cell has its dt_ignore set to true. Mirrors lvis-api eval.py:269-278’s OR into the area-bucket dt_ig_mask. The matching engine is unchanged: the flag piggybacks on the same dt_ignore field B7 already drives.
  • AA4 (strict, ADR-0026): on a federated dataset and with use_cats=true, a cell (image I, category C) is evaluated only when C ∈ pos[I] ∪ neg[I]. Cells with no GT (so C ∉ pos[I]) and no neg listing produce no eval_imgs entry — the existing Option<PerImageEval> distinction (None vs an empty cell) is the same one lvis-api’s eval.py:336 filter relies on.
  • L4 (aligned): use_cats=false collapses every category onto a single virtual k=0 bucket, with category_id carried through matching as a no-op.
  • E2 / J4 (strict): DTs never carry an is_crowd flag — the crate::dataset::CocoDetection type lacks the field. Only GT crowdness drives the E1 asymmetry inside the kernel.
  • J3 (strict): DT areas are read from crate::dataset::CocoDetection::area, which the dataset layer derives from the bbox at construction.
  • J2 (strict): under ParityMode::Strict, a DT lacking a segmentation field under iouType="segm" has its bbox synthesized into a 4-point rectangle polygon [[x1,y1, x1,y2, x2,y2, x2,y1]] and rasterized — bit-for-bit the path pycocotools/coco.py:341 follows. Under ParityMode::Corrected (the default for net-new users) the synthesis is refused with EvalError::InvalidAnnotation: silent coercion of bbox results to rectangle masks is a footgun, and users who want strict parity opt in.
  • J6 (corrected): per-entry dispatch. Every detection is inspected independently for the segm/bbox kind. Under ParityMode::Corrected, heterogeneous DT lists (some entries with segmentation, some without) are rejected up front rather than silently routed through the first-entry-decides dispatch that pycocotools follows at coco.py:330-363.
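
As an illustration of quirk J2, the following sketch synthesizes the rectangle polygon from a COCO [x, y, w, h] bbox in strict mode and refuses in corrected mode. Parity, MissingSegmentation, and bbox_to_polygon are hypothetical stand-ins; the crate's actual ParityMode and EvalError::InvalidAnnotation types are not reproduced here, and rasterization is left out.

```rust
/// Hypothetical stand-in for the crate's parity switch.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Parity {
    Strict,
    Corrected,
}

/// Hypothetical error for a segm-mode detection with no segmentation.
#[derive(Debug, PartialEq)]
struct MissingSegmentation;

/// Turn a COCO [x, y, w, h] bbox into a single 4-point polygon
/// [[x1,y1, x1,y2, x2,y2, x2,y1]], but only under strict parity.
fn bbox_to_polygon(
    bbox: [f64; 4],
    mode: Parity,
) -> Result<Vec<Vec<f64>>, MissingSegmentation> {
    match mode {
        // Corrected mode refuses the silent coercion.
        Parity::Corrected => Err(MissingSegmentation),
        Parity::Strict => {
            let [x1, y1, w, h] = bbox;
            let (x2, y2) = (x1 + w, y1 + h);
            Ok(vec![vec![x1, y1, x1, y2, x2, y2, x2, y1]])
        }
    }
}
```

For a bbox of [10, 20, 30, 40], strict mode produces the corners (10,20), (10,60), (40,60), (40,20) in that order, matching the vertex order quoted in the J2 entry.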

Structs§

ArchivedAreaRange
An archived AreaRange
ArchivedOwnedEvaluateParams
An archived OwnedEvaluateParams
AreaRange
Closed [lo, hi] area bucket — both bounds are inclusive per quirks D6/D7, so an annotation with area exactly equal to a bound lands in this bucket (and in the adjacent one when the boundary is shared).
AreaRangeResolver
The resolver for an archived AreaRange
BoundaryIouCached
Kernel used by evaluate_boundary and evaluate_boundary_cached — same semantics as BoundaryIou but threads a single BoundaryComputeScratch across every compute call (so the dataset-wide pass amortizes per-mask + per-cell allocations) and optionally consults a BoundaryGtCache for cross-call GT band reuse.
EvalGrid
Output of evaluate_bbox / evaluate_segm / evaluate_boundary — the flat (K, A, I) grid of PerImageEval cells the accumulator consumes, plus the dimensions needed to construct crate::accumulate::AccumulateParams.
EvalImageMeta
Pycocotools-shaped per-cell bookkeeping that the matching engine strips out when packing PerImageEval. Surfaced separately so the accumulator stays narrow per ADR-0005, and FFI / COCOeval drop-in consumers can reconstruct evalImgs dicts without re-running eval.
EvaluateParams
Inputs to evaluate_bbox / evaluate_segm / evaluate_boundary / evaluate_with. IoU-agnostic — kernel-specific configuration (sigmas, prefilter thresholds, …) lives on the EvalKernel passed alongside.
OwnedEvaluateParams
Owned counterpart to EvaluateParams.
OwnedEvaluateParamsResolver
The resolver for an archived OwnedEvaluateParams
SegmIouCached
Kernel used by evaluate_segm and evaluate_segm_cached — same semantics as SegmIou but threads a single SegmComputeScratch across every compute call (so the dataset-wide pass amortizes per-cell Vec allocations across the ~36 k anns of a val2017 pass) and optionally consults a SegmGtCache for cross-call GT bbox+area reuse.

Enums§

ArchivedKernelKind
An archived KernelKind
GtCacheRef
Either a borrowed or Arc-owned reference to a per-kernel GT cache.
KernelKind
Discriminator for the four kernel families on the IoU axis (per ADR-0012’s iou-type taxonomy). Carried in distributed-eval partials (ADR-0031) so a head-rank reconstruction refuses to merge bbox and segm partials silently.
KernelKindResolver
The resolver for an archived KernelKind

Constants§

AREA_UNBOUNDED
Sentinel upper bound for “unbounded” area buckets, mirroring the 1e10 pycocotools uses for all / large.
COLLAPSED_CATEGORY_SENTINEL
Sentinel category_id emitted on every cell when use_cats=false. Mirrors pycocotools’ p.catIds = [-1] collapse (quirk L4).

Traits§

EvalKernel
Bridges a CocoDataset / CocoDetections cell to a kernel’s annotation type.

Functions§

evaluate_bbox
Run the per-image bbox evaluation pass. Thin wrapper over evaluate_with with the BboxIou kernel.
evaluate_boundary
Run the per-image boundary-IoU evaluation pass (ADR-0010). Thin wrapper over evaluate_with with the BoundaryIou kernel.
evaluate_boundary_cached
Cached variant of evaluate_boundary: reuses GT bands across calls via a caller-owned BoundaryGtCache.
evaluate_keypoints
Run the per-image OKS (iouType="keypoints") evaluation pass per ADR-0012. Thin wrapper over evaluate_with with the OksSimilarity kernel.
evaluate_segm
Run the per-image segmentation-mask evaluation pass. Thin wrapper over evaluate_with with the SegmIou kernel.
evaluate_segm_cached
Cached variant of evaluate_segm: reuses GT bbox + area across calls via a caller-owned SegmGtCache.
evaluate_with
Run the per-image evaluation pass with the given EvalKernel.