face_id 0.3.0

Face ID

Easily run face detection, landmark prediction, facial recognition, and attribute estimation in Rust via ONNX Runtime.


Features

  • Detection: Face detection using SCRFD. Coordinates are relative to image dimensions (0.0 to 1.0).
  • Landmarks: Predict 5 facial keypoints (eyes, nose, mouth) for alignment (also relative).
  • Recognition: Generate 512-d embeddings using ArcFace for identity verification.
  • Attributes: Estimate gender and age.
  • Automatic Alignment: Built-in transforms to align faces to the pose required by recognition models.
  • HF Integration: Automatically download pre-trained models from HuggingFace.

Usage

The FaceAnalyzer is the easiest way to use this crate. It handles the entire pipeline: detecting faces, aligning them, and running recognition/attribute models in batches.

use face_id::analyzer::FaceAnalyzer;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Initialize the analyzer (downloads models from HuggingFace on first run)
    let analyzer = FaceAnalyzer::from_hf().build().await?;

    let img = image::open("assets/img/crowd.jpg")?;
    let faces = analyzer.analyze(&img)?;

    for (i, face) in faces.iter().enumerate() {
        println!("Face #{}", i);
        println!("  Score: {:.2}", face.detection.score);
        println!("  BBox (relative): {:?}", face.detection.bbox);

        if let Some(ga) = &face.gender_age {
            println!("  Gender: {:?}, Age: {}", ga.gender, ga.age);
        }

        if let Some(emb) = &face.embedding {
            println!("  Embedding (first 5 dims): {:?}", &emb[..5]);
        }
    }

    Ok(())
}

Usage: Individual Components

You can also use the components individually if you don't need the full pipeline.

Face Detection only

Given an image, the ScrfdDetector finds face bounding boxes. When using a _kps model, it also returns the locations of the eyes, nose, and mouth.

use face_id::detector::ScrfdDetector;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let mut detector = ScrfdDetector::from_hf().build().await?;

    let img = image::open("assets/img/crowd.jpg")?;
    let face_boxes = detector.detect(&img)?; // Relative coordinates [0, 1]

    // To get absolute pixel coordinates:
    let absolute_boxes: Vec<_> = face_boxes.iter()
        .map(|f| f.to_absolute(img.width(), img.height()))
        .collect();

    Ok(())
}

Facial Recognition (Embeddings)

Recognition requires aligned face crops, meaning the eyes, nose and mouth are warped to the spot the embedding model expects them. This crate provides face_align::norm_crop to transform a face based on its landmarks into the 112x112 format required by ArcFace.

use face_id::embedder::ArcFaceEmbedder;
use face_id::face_align::norm_crop;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let mut embedder = ArcFaceEmbedder::from_hf().build().await?;

    let img = image::open("assets/img/crowd.jpg")?;

    // Align the face to 112x112 using `landmarks`: the 5 facial keypoints
    // for one face, as returned by a `_kps` model in `ScrfdDetector`.
    let aligned_img = norm_crop(&img, &landmarks, 112);

    // Compute embedding
    let embedding = embedder.compute_embedding(&aligned_img)?;

    // Now you can compare whether two faces are the same person,
    // or cluster many face embeddings to group them.
    Ok(())
}
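Embeddings are compared with cosine similarity: close to 1.0 means the same person, close to 0.0 means different people. The helper below is a sketch, not part of the face_id API; ArcFace embeddings are already L2-normalized, so in practice the dot product alone suffices, but the full formula also handles unnormalized vectors:

```rust
// Sketch: cosine similarity between two embeddings (not part of face_id).
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}

fn main() {
    // Toy 3-d vectors standing in for 512-d ArcFace embeddings.
    let emb_a = vec![0.6f32, 0.8, 0.0];
    let emb_b = vec![0.6f32, 0.8, 0.0];
    let emb_c = vec![0.0f32, 0.0, 1.0];

    println!("same:      {:.2}", cosine_similarity(&emb_a, &emb_b)); // 1.00
    println!("different: {:.2}", cosine_similarity(&emb_a, &emb_c)); // 0.00
}
```

The decision threshold (e.g. "same identity if similarity > 0.4") is a heuristic and should be tuned on your own data.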

Clustering & Face Crop

You can cluster faces from many images and then extract cropped faces using two helper functions:

use face_id::analyzer::FaceAnalyzer;
use face_id::helpers::{cluster_faces, extract_face_thumbnail};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let analyzer = FaceAnalyzer::from_hf().build().await?;
    let paths = vec!["img1.jpg", "img2.jpg"];

    // Cluster faces across multiple images
    let clusters = cluster_faces(&analyzer, paths)
        .min_cluster_size(5)
        .call()?;

    for (cluster_id, faces) in clusters {
        println!("Cluster {cluster_id} has {} faces", faces.len());
        for (path, face) in faces {
            // Extract a square thumbnail with 60% padding
            let img = image::open(path)?;
            let thumbnail = extract_face_thumbnail(&img, &face.detection.bbox, 1.6, 256);
            // thumbnail.save(format!("cluster_{cluster_id}_{}.jpg", ...))?;
        }
    }
    Ok(())
}

Loading Local Models

If you want to use local ONNX model files instead of downloading from HuggingFace, use the builder method.

use face_id::analyzer::FaceAnalyzer;
fn main() -> anyhow::Result<()> {
    let analyzer = FaceAnalyzer::builder(
        "models/det.onnx", // Detector
        "models/rec.onnx", // Embedder (Recognition)
        "models/attr.onnx" // Gender/Age
    )
    .build()?;
    
    Ok(())
}

Customizing Hugging Face Models

You can mix and match specific model versions from Hugging Face repositories. For example, using the medium-complexity 10g_bnkps detector instead of the default:

use face_id::analyzer::FaceAnalyzer;
use face_id::model_manager::HfModel;
#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let analyzer = FaceAnalyzer::from_hf()
        // Specify a smaller detector model than the default:
        // > `embedder_model` and `gender_age_model` can also be specified in the builder.
        .detector_model(HfModel {
            id: "public-data/insightface".to_string(),
            file: "models/buffalo_l/det_10g.onnx".to_string(),
        })
        .detector_input_size((640, 640))
        .detector_score_threshold(0.5)
        .detector_iou_threshold(0.4)
        .build()
        .await?;
    Ok(())
}

Execution Providers (Nvidia, AMD, Intel, Mac, Arm, etc.)

Since this crate is implemented with ort, many execution providers are available for hardware acceleration. You can enable an execution provider in this crate with cargo features; a full list of execution providers is available in the ort documentation.
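For example, enabling the cuda feature in your Cargo.toml might look like this (a sketch; the version matches the release above, adjust to your setup):

```toml
[dependencies]
face_id = { version = "0.3", features = ["cuda"] }
```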

To use CUDA, add the cuda feature to your Cargo.toml and configure the builder:

use face_id::analyzer::FaceAnalyzer;
use ort::ep::{CUDA, TensorRT};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let analyzer = FaceAnalyzer::from_hf()
        .with_execution_providers(&[
            TensorRT::default().build(),
            CUDA::default().build(),
        ])
        .build()
        .await?;
    Ok(())
}

Model Details

Detection (SCRFD)

The naming convention for SCRFD models indicates complexity (FLOPs) and whether they include 5-point facial keypoints (_kps).

| Name            | Easy  | Medium | Hard  | FLOPs | Params (M) | Infer (ms) | BBox | Facial Keypoints |
|-----------------|-------|--------|-------|-------|------------|------------|------|------------------|
| 500m.onnx       | 90.57 | 88.12  | 68.51 | 500M  | 0.57       | 3.6        | ✓    |                  |
| 1g.onnx         | 92.38 | 90.57  | 74.80 | 1G    | 0.64       | 4.1        | ✓    |                  |
| 34g.onnx        | 96.06 | 94.92  | 85.29 | 34G   | 9.80       | 11.7       | ✓    |                  |
| 2.5g_bnkps.onnx | 93.80 | 92.02  | 77.13 | 2.5G  | 0.82       | 4.3        | ✓    | ✓                |
| 10g_bnkps.onnx  | 95.40 | 94.01  | 82.80 | 10G   | 4.23       | 5.0        | ✓    | ✓                |
| 34g_gnkps.onnx  | 96.17 | 95.19  | 84.88 | 34G   | 9.84       | 11.8       | ✓    | ✓                |

Recognition (ArcFace)

The default recognition model is w600k_r50.onnx (ResNet-50) from the InsightFace "Buffalo_L" bundle. It produces a 512-dimensional, L2-normalized vector.

Cargo Features

  • hf-hub (Default): Allows downloading models from Hugging Face.
  • copy-dylibs / download-binaries (Default): Simplifies ort setup.
  • serde: Enables serialization/deserialization for results.
  • clustering (Default): Enables face clustering using HDBSCAN.
  • Execution Providers: cuda, tensorrt, coreml, directml, openvino, etc.

Troubleshooting

Dynamic Linking

If you are using the load-dynamic feature and encounter library errors:

  1. Download the onnxruntime library from GitHub Releases.
  2. Set the ORT_DYLIB_PATH environment variable:
    # Linux/macOS
    export ORT_DYLIB_PATH="/path/to/libonnxruntime.so"
    # Windows (PowerShell)
    $env:ORT_DYLIB_PATH = "C:/path/to/onnxruntime.dll"