Crate apple_vision

§vision

Safe Rust bindings for Apple’s Vision framework — on-device OCR, object detection, face landmarks, and other computer vision tasks on macOS.

Status: experimental. v0.1 ships text recognition (OCR); object/face detection, classification, and barcode scanning land in v0.2.

§Quick start — OCR

use apple_vision::prelude::*;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let recognizer = TextRecognizer::new()
        .with_recognition_level(RecognitionLevel::Accurate)
        .with_language_correction(true);

    let observations = recognizer.recognize_in_path("/tmp/screenshot.png")?;
    for obs in &observations {
        println!("[{:.2}] '{}'", obs.confidence, obs.text);
    }
    Ok(())
}
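
A common next step after the quick start is to drop low-confidence results before passing the text on. The sketch below is one way to do that. It does not depend on this crate: `Observation` is a hypothetical stand-in that models only the two fields used above (`confidence` and `text`), and the field types and the `confident_text` helper are assumptions, not part of the documented API.

```rust
/// Hypothetical stand-in for a recognized-text observation. Only the two
/// fields used in the quick start are modelled; their types are assumed.
#[derive(Debug, Clone, PartialEq)]
pub struct Observation {
    pub confidence: f32,
    pub text: String,
}

/// Keep observations whose confidence is at least `min_confidence` and
/// join their text with newlines, preserving the original order.
pub fn confident_text(observations: &[Observation], min_confidence: f32) -> String {
    observations
        .iter()
        .filter(|obs| obs.confidence >= min_confidence)
        .map(|obs| obs.text.as_str())
        .collect::<Vec<_>>()
        .join("\n")
}
```

With the real crate, the same filter could presumably be applied directly to the `observations` returned by `recognize_in_path`, provided the fields have compatible types.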

§Composes with the rest of the doom-fish stack

screencapturekit-rs / capture ──► IOSurface / PNG ──► vision ──► text
                                                          │
                                                          ▼
                                                  foundation-models
                                                  ("summarise this")

§Feature flags

Feature                     Status
recognize_text (default)
detect_faces (default)
detect_rectangles           🚧 v0.4
classify_image              🚧 v0.4
detect_barcodes             🚧 v0.4

§Roadmap

  • VNRecognizeTextRequest (OCR) via TextRecognizer
  • VNDetectFaceRectanglesRequest via FaceDetector (returns bounding box + roll/yaw/pitch)
  • CGImage / CVPixelBuffer ingest (file path AND zero-copy CVPixelBuffer paths)
  • VNDetectFaceLandmarksRequest (face landmark points)
  • VNDetectRectanglesRequest
  • VNClassifyImageRequest
  • VNDetectBarcodesRequest
  • Async API (VNRequest completion handlers exposed via async fn)

§License

Licensed under either of Apache-2.0 or MIT at your option.


§API Documentation

Safe Rust bindings for Apple’s Vision framework — OCR, object detection, face landmarks, and other on-device computer vision tasks.

v0.1 ships text recognition (OCR) only. Object/face detection lands in v0.2.

§Re-exports

pub use error::VisionError;
pub use recognize_text::BoundingBox; // feature: recognize_text
pub use recognize_text::RecognitionLevel; // feature: recognize_text
pub use recognize_text::RecognizedText; // feature: recognize_text
pub use recognize_text::TextRecognizer; // feature: recognize_text
pub use detect_faces::DetectedFace; // feature: detect_faces
pub use detect_faces::FaceDetector; // feature: detect_faces
pub use detect_barcodes::detect_barcodes_in_path; // feature: detect_barcodes
pub use detect_barcodes::DetectedBarcode; // feature: detect_barcodes
pub use saliency::attention_saliency_in_path; // feature: saliency
pub use saliency::SalientRegion; // feature: saliency
pub use face_landmarks::detect_face_landmarks_in_path; // feature: face_landmarks
pub use face_landmarks::FaceWithLandmarks; // feature: face_landmarks
pub use face_landmarks::LandmarkPoint; // feature: face_landmarks
pub use body_pose::detect_human_body_pose_in_path; // feature: body_pose
pub use body_pose::DetectedBodyPose; // feature: body_pose
pub use body_pose::JointPoint; // feature: body_pose
pub use hand_pose::detect_human_hand_pose_in_path; // feature: hand_pose
pub use hand_pose::DetectedHandPose; // feature: hand_pose
pub use contours::detect_contours_in_path; // feature: contours
pub use contours::Contour; // feature: contours
pub use contours::ContourOptions; // feature: contours
pub use animals::recognize_animals_in_path; // feature: animals
pub use animals::RecognizedAnimal; // feature: animals
pub use classify::classify_image_in_path; // feature: classify
pub use classify::Classification; // feature: classify
pub use rectangles::detect_document_segmentation_in_path; // feature: rectangles
pub use rectangles::detect_rectangles_in_path; // feature: rectangles
pub use rectangles::RectangleObservation; // feature: rectangles
pub use rectangles::RectangleOptions; // feature: rectangles
pub use horizon::detect_horizon_in_path; // feature: horizon
pub use feature_print::generate_image_feature_print_in_path; // feature: feature_print
pub use feature_print::FeaturePrint; // feature: feature_print
pub use humans::detect_human_rectangles_in_path; // feature: humans
pub use humans::DetectedHuman; // feature: humans
pub use aesthetics::calculate_aesthetics_scores_in_path; // feature: aesthetics
pub use aesthetics::detect_face_capture_quality_in_path; // feature: aesthetics
pub use aesthetics::AestheticsScores; // feature: aesthetics
pub use aesthetics::FaceCaptureQuality; // feature: aesthetics
pub use segmentation::generate_foreground_instance_mask_in_path; // feature: segmentation
pub use segmentation::generate_person_segmentation_in_path; // feature: segmentation
pub use segmentation::InstanceMask; // feature: segmentation
pub use segmentation::SegmentationMask; // feature: segmentation
pub use segmentation::SegmentationQuality; // feature: segmentation
pub use optical_flow::generate_optical_flow_in_paths; // feature: optical_flow
pub use optical_flow::OpticalFlowAccuracy; // feature: optical_flow
pub use coreml::coreml_classify_in_path; // feature: coreml
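
Several of the re-exported types describe regions of an image (`BoundingBox`, `RectangleObservation`, `DetectedFace`). Apple's Vision framework reports such regions in normalized coordinates (0.0 to 1.0) with the origin at the lower-left corner. Whether this crate passes that convention through unchanged is not stated here, so treat the following as a sketch under that assumption. `NormalizedRect`, `PixelRect`, and `to_pixel_rect` are hypothetical names, not items from this crate.

```rust
/// Hypothetical normalized rectangle: every value lies in 0.0..=1.0 and
/// the origin is the lower-left corner of the image (Vision's convention).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct NormalizedRect {
    pub x: f64,
    pub y: f64,
    pub width: f64,
    pub height: f64,
}

/// Rectangle in pixels with the origin at the top-left corner, as most
/// image and UI libraries expect.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct PixelRect {
    pub x: f64,
    pub y: f64,
    pub width: f64,
    pub height: f64,
}

/// Scale a normalized rectangle to pixel units and flip the vertical axis
/// so that `y` measures the distance from the top edge of the image.
pub fn to_pixel_rect(r: NormalizedRect, image_width: f64, image_height: f64) -> PixelRect {
    PixelRect {
        x: r.x * image_width,
        // The top edge of the box sits at (y + height) in lower-left
        // coordinates; subtracting that from 1.0 flips the axis.
        y: (1.0 - r.y - r.height) * image_height,
        width: r.width * image_width,
        height: r.height * image_height,
    }
}
```

The flip matters when drawing boxes over a decoded PNG: without it, boxes appear mirrored vertically.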

§Modules

aesthetics (feature: aesthetics)
Aesthetics scoring (VNCalculateImageAestheticsScoresRequest) and face capture quality (VNDetectFaceCaptureQualityRequest).
animals (feature: animals)
Animal recognition (VNRecognizeAnimalsRequest).
body_pose (feature: body_pose)
Human body pose detection (VNDetectHumanBodyPoseRequest).
classify (feature: classify)
General-purpose image classification (VNClassifyImageRequest).
contours (feature: contours)
Edge contour detection (VNDetectContoursRequest).
coreml (feature: coreml)
CoreML inference via Vision (VNCoreMLRequest).
detect_barcodes (feature: detect_barcodes)
Barcode detection via VNDetectBarcodesRequest (Vision v0.4).
detect_faces (feature: detect_faces)
FaceDetector: wraps VNDetectFaceRectanglesRequest.
error
Errors from the Vision bridge.
face_landmarks (feature: face_landmarks)
detect_face_landmarks_in_path: wraps VNDetectFaceLandmarksRequest.
feature_print (feature: feature_print)
Image feature print (VNGenerateImageFeaturePrintRequest): a semantic image embedding for content-based similarity.
ffi
Raw FFI declarations matching the Swift bridge in swift-bridge/Sources/VisionBridge/Vision.swift.
hand_pose (feature: hand_pose)
Human hand pose detection (VNDetectHumanHandPoseRequest).
horizon (feature: horizon)
Horizon detection (VNDetectHorizonRequest).
humans (feature: humans)
Human-rectangle detection (VNDetectHumanRectanglesRequest): lightweight person bounding boxes without joint skeletons.
optical_flow (feature: optical_flow)
Optical flow generation (VNGenerateOpticalFlowRequest).
prelude
Common imports.
recognize_text (feature: recognize_text)
TextRecognizer: wraps VNRecognizeTextRequest for image-file OCR.
rectangles (feature: rectangles)
Rectangle + document-segmentation detection.
saliency (feature: saliency)
Attention-based saliency detection via VNGenerateAttentionBasedSaliencyImageRequest.
segmentation (feature: segmentation)
Segmentation mask generation: VNGeneratePersonSegmentationRequest and VNGenerateForegroundInstanceMaskRequest.
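
The feature_print module above is described as producing an embedding for content-based similarity. Comparing two embeddings usually comes down to a distance between vectors, which can be sketched as follows. This is a generic illustration: it assumes a feature print can be read out as a slice of `f32` values, which this page does not confirm. The `euclidean_distance` helper is hypothetical, and the metric Vision itself uses may differ.

```rust
/// Euclidean distance between two embedding vectors.
/// Returns `None` when the lengths differ, since vectors of different
/// dimensionality are not comparable.
pub fn euclidean_distance(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() {
        return None;
    }
    // Sum the squared component-wise differences, then take the root.
    let sum_sq: f32 = a
        .iter()
        .zip(b)
        .map(|(x, y)| (x - y) * (x - y))
        .sum();
    Some(sum_sq.sqrt())
}
```

A smaller distance means the two images are more alike. A threshold for "similar enough" depends on the embedding and has to be chosen empirically.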