avanalyze
Long-running Apple Vision.framework worker that analyses keyframes and emits
mediaschema-shaped detections.
§What it does
avanalyze wraps Apple’s Vision.framework with a synchronous Rust API. A
single VisionAnalyzer owns one of every supported request kind
(face / body-pose / body-pose-3D / hand-pose / classification /
saliency / aesthetics / barcode / text / horizon / animal / animal-body-pose
/ person-segmentation / person-instance-mask / document-segmentation)
at fixed, pinned revisions, and analyze_keyframe(...) runs them all
against a single JPEG and packages the results into one
mediaschema::domain::Keyframe.
The output is the validated domain shape — Keyframe<Uuid7> with
try_new-style detection value objects (BoundingBox, Confidence,
NormCoord, …). Serialisation to the wire / sqlx / mongodb backends
happens inside mediaschema, not at the engine boundary.
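The try_new-style value objects mentioned above can be illustrated with a minimal sketch. The types below are hypothetical stand-ins written for this example; the real mediaschema definitions (field names, error types, exact validation rules) are not shown in this documentation and may differ.

```rust
/// Hypothetical error type for this sketch; mediaschema's own error
/// enums are not reproduced here.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum RangeError {
    OutOfRange(f32),
}

/// Returns the value unchanged if it lies in [0.0, 1.0]. NaN is rejected
/// because every range comparison against NaN is false.
fn check_unit(value: f32) -> Result<f32, RangeError> {
    if (0.0..=1.0).contains(&value) {
        Ok(value)
    } else {
        Err(RangeError::OutOfRange(value))
    }
}

/// Stand-in for a validated confidence score.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Confidence(f32);

impl Confidence {
    pub fn try_new(value: f32) -> Result<Self, RangeError> {
        check_unit(value).map(Self)
    }

    pub fn get(self) -> f32 {
        self.0
    }
}

/// Stand-in for a normalised bounding box inside the unit square.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct BoundingBox {
    x: f32,
    y: f32,
    width: f32,
    height: f32,
}

impl BoundingBox {
    pub fn try_new(x: f32, y: f32, width: f32, height: f32) -> Result<Self, RangeError> {
        let (x, y) = (check_unit(x)?, check_unit(y)?);
        let (width, height) = (check_unit(width)?, check_unit(height)?);
        // The far edges must also stay inside the unit square.
        check_unit(x + width)?;
        check_unit(y + height)?;
        Ok(Self { x, y, width, height })
    }

    pub fn width(&self) -> f32 {
        self.width
    }
}
```

Because the only way to obtain a value is through try_new, any code holding a Confidence or BoundingBox can rely on the invariant without re-checking it.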
Note: feature_print detections previously emitted by Apple’s
VNGenerateImageFeaturePrintRequest are no longer part of the
keyframe payload — feature embeddings live in LanceDB keyed by the
keyframe id under the locked schema, so they are produced by a
separate downstream stage rather than at the Vision-engine boundary.
§Requirements
- macOS (Vision.framework is Apple-only).
- A working objc2 toolchain (Xcode command-line tools).
- Rust 1.95 or newer (edition 2024).
On non-macOS targets the cfg(target_os = "macos") gates make the
platform deps drop out entirely; the crate still compiles as a no-op so
downstream workspaces can keep avanalyze in their dep tree
unconditionally.
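The gating described above can be sketched as follows. This only illustrates the cfg pattern; the module and function names are hypothetical and are not taken from avanalyze's source.

```rust
// On macOS the real backend would live here, next to the platform-only
// dependencies. The body is a placeholder for this sketch.
#[cfg(target_os = "macos")]
mod platform {
    pub fn backend_name() -> &'static str {
        "vision"
    }
}

// Everywhere else the same module compiles to a no-op stand-in, so the
// crate still builds and exposes the same surface.
#[cfg(not(target_os = "macos"))]
mod platform {
    pub fn backend_name() -> &'static str {
        "noop"
    }
}

/// Same signature on every target; only the body differs.
pub fn backend_name() -> &'static str {
    platform::backend_name()
}

/// Compile-time check that callers can branch on.
pub fn is_supported() -> bool {
    cfg!(target_os = "macos")
}
```

A downstream crate can then call is_supported() at runtime rather than repeating the cfg attribute at every call site.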
§Status
Pre-release (0.0.0). The data plane — VisionAnalyzer::analyze_keyframe
— is functional. Service-framework integration (ThreadService,
Request / Reply, handle_message) is commented out pending the
external findit-service / findit-pipeline crates landing in the
workspace; once those exist the block at the top of src/lib.rs will
be re-enabled.
§Layout
- src/lib.rs: VisionAnalyzer, the request set, and the per-request extractors that translate VNObservations into mediaschema detections.
- src/options.rs: per-request configuration knobs (AppleVisionClassificationOptions, …BodyPoseOptions, …) and the top-level ServiceOptions.
- src/wire_ext.rs: local extension traits that give the mediaschema wire types ergonomic ::new(…) / .with_*(…) builder surfaces (the proto-generated structs ship as #[derive(Default)] records with public fields and no constructors).
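The extension-trait approach used in src/wire_ext.rs might look roughly like the sketch below. WireBox and WireBoxExt are hypothetical names invented for this example; the actual mediaschema wire types and the traits in wire_ext.rs are not reproduced here.

```rust
/// Hypothetical stand-in for a proto-generated record: public fields,
/// a Default derive, and no constructor of its own.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct WireBox {
    pub x: f32,
    pub y: f32,
    pub width: f32,
    pub height: f32,
    pub label: String,
}

/// Local extension trait that adds a constructor and a builder-style
/// setter without touching the generated type.
pub trait WireBoxExt: Sized {
    fn new(x: f32, y: f32, width: f32, height: f32) -> Self;
    fn with_label(self, label: impl Into<String>) -> Self;
}

impl WireBoxExt for WireBox {
    fn new(x: f32, y: f32, width: f32, height: f32) -> Self {
        // Fields not passed in keep their Default values.
        Self { x, y, width, height, ..Default::default() }
    }

    fn with_label(mut self, label: impl Into<String>) -> Self {
        self.label = label.into();
        self
    }
}

/// Example call site: reads like an inherent constructor as long as the
/// trait is in scope.
pub fn example_box() -> WireBox {
    WireBox::new(0.1, 0.2, 0.3, 0.4).with_label("face")
}
```

Keeping the trait local means regenerating the proto structs never overwrites the builder surface.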
§License
avanalyze is licensed under either of
at your option.
Long-running Apple Vision.framework service thread.
Each worker thread owns a VisionAnalyzer and processes keyframes
independently. Vision.framework is stateless per-request, so multiple
workers can run in parallel.
Input: Request via crossbeam bounded channel
Output: Reply via callback to the processor-local coordinator
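The worker model described above (bounded request channel in, callback out, one analyzer per thread) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the crate's service code: std's sync_channel stands in for the crossbeam bounded channel, and Request, Reply, Analyzer, and run_workers are hypothetical names defined only for this example.

```rust
use std::sync::mpsc::sync_channel;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical request: a keyframe id plus its JPEG bytes.
pub struct Request {
    pub keyframe_id: u64,
    pub jpeg: Vec<u8>,
}

/// Hypothetical reply; a real one would carry detections.
#[derive(Debug, PartialEq)]
pub struct Reply {
    pub keyframe_id: u64,
    pub bytes_seen: usize,
}

/// Placeholder analyzer; each worker thread owns its own instance.
struct Analyzer;

impl Analyzer {
    fn analyze(&self, req: &Request) -> Reply {
        Reply { keyframe_id: req.keyframe_id, bytes_seen: req.jpeg.len() }
    }
}

/// Spawns workers, feeds them through a bounded channel, and hands each
/// result to `on_reply`. Returns once every request has been processed.
pub fn run_workers<F>(workers: usize, requests: Vec<Request>, on_reply: F)
where
    F: Fn(Reply) + Send + Sync + 'static,
{
    // Capacity 4 is arbitrary; senders block while the queue is full.
    let (tx, rx) = sync_channel::<Request>(4);
    // std receivers are single-consumer, so share one behind a mutex.
    // (A crossbeam receiver could be cloned per worker instead.)
    let rx = Arc::new(Mutex::new(rx));
    let on_reply = Arc::new(on_reply);

    // At least one worker, otherwise sends would block forever.
    let handles: Vec<_> = (0..workers.max(1))
        .map(|_| {
            let rx = Arc::clone(&rx);
            let on_reply = Arc::clone(&on_reply);
            thread::spawn(move || {
                let analyzer = Analyzer;
                loop {
                    // The lock guard is dropped at the end of this
                    // statement, so analysis runs without holding it.
                    let next = rx.lock().unwrap().recv();
                    match next {
                        Ok(req) => (*on_reply)(analyzer.analyze(&req)),
                        Err(_) => break, // sender dropped: shut down
                    }
                }
            })
        })
        .collect();

    for req in requests {
        tx.send(req).unwrap();
    }
    drop(tx); // closing the channel lets the workers exit
    for handle in handles {
        handle.join().unwrap();
    }
}
```

Replies arrive in whatever order the workers finish, so a coordinator that needs ordering should key on the keyframe id rather than on arrival order.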
Structs§
- AppleVisionAestheticsOptions
- AppleVisionAnimalOptions
- AppleVisionAnimalPoseOptions
- AppleVisionBarcodeOptions
- AppleVisionBodyPose3DOptions
- AppleVisionBodyPoseOptions
- AppleVisionClassificationOptions
- AppleVisionDocumentSegmentationOptions
- AppleVisionFaceCaptureOptions
- AppleVisionFaceLandmarkOptions
- AppleVisionFaceRectangleOptions
- AppleVisionHandPoseOptions
- AppleVisionHorizonOptions
- AppleVisionHumanSubjectOptions
- AppleVisionPersonInstanceMaskOptions
- AppleVisionPersonSegmentationOptions
- AppleVisionSaliencyOptions
- AppleVisionTextOptions
- ServiceOptions
- VisionAnalyzer (Apple): Apple Vision analyzer, one per worker thread.