Crate cv_core

Rust CV Core

This library provides common abstractions and types for computer vision (CV) in Rust. Every crate in the rust-cv ecosystem that defines or consumes CV types depends on this crate. This includes camera model traits, bearings, poses, keypoints, and more. The crate is designed to be very small so that it adds negligible build time, and it pulls in only dependencies that would likely be brought in by computer vision code anyway. The core concept is that all CV crates can work with each other by using the abstractions and types specified in this crate.

The crate is designed to work with #![no_std], even without an allocator. libm is used (indirectly through num-traits) for all math algorithms that aren't present in std. Code that doesn't need to be shared across all CV crates does not belong in this repository. If there is a good reason to put code that only some crates may need into cv-core, it should be gated behind a feature.

Triangulation

Several of the traits within cv-core, such as TriangulatorObservances, must perform a process called triangulation. In computer vision this problem occurs quite often, as we frequently have the poses of two or more cameras along with a bearing from each camera that points approximately towards the same 3d point.

We have to take this data and produce a 3d point. Cameras have an optical center that all bearings protrude from. In a standard camera this is often referred to as the focal point, but in computer vision the term optical center is preferred, as it is a more general concept. What typically happens in triangulation is that we have (at least) two optical centers and a bearing (direction) out of each of those optical centers pointing approximately towards the 3d point. In an ideal world, these bearings would point exactly at the point, and triangulation would be achieved simply by solving for the point of intersection. Unfortunately, the real world throws a wrench in the works: because the bearings are based on noisy data, they won't actually intersect. This is why we need different triangulation algorithms, which deal with the error in different ways and have different characteristics.

Here is an example where we have two pinhole cameras A and B. The @ symbols mark the virtual image plane, which can be thought of as a surface in front of the camera through which light passes on its way to the optical center O. The points a and b are normalized image coordinates describing where on the virtual image plane the light passed through on its way from the point to the optical center of cameras A and B respectively. We know the exact pose (position and orientation) of each of these two cameras, and we also know the normalized image coordinates, from which we can compute a bearing. We are trying to solve for the point p whose rays of light pass through a and b on their way to each optical center O.

  • p the point we are trying to triangulate
  • a the normalized keypoint on camera A
  • b the normalized keypoint on camera B
  • O the optical center of a camera
  • @ the virtual image plane
                       @
                       @
              p--------b--------O
             /         @
            /          @
           /           @
          /            @
  @@@@@@@a@@@@@
        /
       /
      /
     O
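The situation above can be sketched in code. This is a minimal, self-contained illustration of one common approach, "midpoint" triangulation: find the closest point on each of the two rays and average them, which tolerates bearings that don't quite intersect. It uses plain [f64; 3] vectors and hypothetical function names, not cv-core's actual triangulator API.

```rust
type V3 = [f64; 3];

fn sub(a: V3, b: V3) -> V3 { [a[0] - b[0], a[1] - b[1], a[2] - b[2]] }
fn add(a: V3, b: V3) -> V3 { [a[0] + b[0], a[1] + b[1], a[2] + b[2]] }
fn scale(a: V3, s: f64) -> V3 { [a[0] * s, a[1] * s, a[2] * s] }
fn dot(a: V3, b: V3) -> f64 { a[0] * b[0] + a[1] * b[1] + a[2] * b[2] }
fn normalize(a: V3) -> V3 { scale(a, 1.0 / dot(a, a).sqrt()) }

/// "Midpoint" triangulation of two rays o + t*d with unit bearings d.
/// Returns None when the bearings are (nearly) parallel.
fn triangulate_midpoint(o1: V3, d1: V3, o2: V3, d2: V3) -> Option<V3> {
    let w = sub(o1, o2);
    let b = dot(d1, d2);
    let denom = 1.0 - b * b; // d1 and d2 are unit vectors
    if denom.abs() < 1e-12 {
        return None; // parallel rays: no unique closest points
    }
    let d = dot(d1, w);
    let e = dot(d2, w);
    // Parameters of the closest point on each ray (least-squares solution).
    let t1 = (b * e - d) / denom;
    let t2 = (e - b * d) / denom;
    let p1 = add(o1, scale(d1, t1)); // closest point on ray 1
    let p2 = add(o2, scale(d2, t2)); // closest point on ray 2
    Some(scale(add(p1, p2), 0.5))
}

fn main() {
    // Two optical centers observing the point (0.5, 0, 2) with exact bearings,
    // so the midpoint recovers the point itself.
    let o1 = [0.0, 0.0, 0.0];
    let o2 = [1.0, 0.0, 0.0];
    let p = [0.5, 0.0, 2.0];
    let d1 = normalize(sub(p, o1));
    let d2 = normalize(sub(p, o2));
    let tri = triangulate_midpoint(o1, d1, o2, d2).unwrap();
    for i in 0..3 {
        assert!((tri[i] - p[i]).abs() < 1e-9);
    }
    println!("triangulated: {:?}", tri);
}
```

With noisy bearings the rays become skew, and the midpoint is one of several reasonable answers; other algorithms instead minimize reprojection error onto the image planes.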

Re-exports

pub use nalgebra;
pub use sample_consensus;

Structs

CameraPoint

A 3d point relative to the camera's optical center and orientation, where the positive X axis points right, the positive Y axis points down, and the positive Z axis points forwards from the optical center of the camera. The unit of distance of a CameraPoint is unspecified and relative to the current reconstruction.

CameraToCamera

This contains a relative pose, which is a pose that transforms the CameraPoint of one image into the corresponding CameraPoint of another image. This transforms the point from the camera space of camera A to camera B.
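A relative pose is a rigid transformation. As a hedged sketch (not cv-core's actual types, which re-export nalgebra's), it acts on a point as p_b = R * p_a + t, where R and t express camera A's frame in camera B's frame:

```rust
type V3 = [f64; 3];
type M3 = [[f64; 3]; 3]; // row-major rotation matrix

/// Apply a rigid transform (rotation r then translation t) to a point.
fn transform(r: M3, t: V3, p: V3) -> V3 {
    [
        r[0][0] * p[0] + r[0][1] * p[1] + r[0][2] * p[2] + t[0],
        r[1][0] * p[0] + r[1][1] * p[1] + r[1][2] * p[2] + t[1],
        r[2][0] * p[0] + r[2][1] * p[1] + r[2][2] * p[2] + t[2],
    ]
}

fn main() {
    // Camera B sees camera A's frame rotated 90 degrees about +Y and
    // shifted one unit along B's own X axis.
    let r = [
        [0.0, 0.0, 1.0],
        [0.0, 1.0, 0.0],
        [-1.0, 0.0, 0.0],
    ];
    let t = [1.0, 0.0, 0.0];
    let p_a = [0.0, 0.0, 2.0]; // a point two units ahead of camera A
    let p_b = transform(r, t, p_a);
    assert_eq!(p_b, [3.0, 0.0, 0.0]);
    println!("point in camera B coordinates: {:?}", p_b);
}
```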

CameraToWorld

This contains a camera pose, which is a pose of the camera relative to the world. This transforms camera points (with depth as z) into world coordinates. This also tells you where the camera is located and oriented in the world.

FeatureMatch

Normalized keypoint match

FeatureWorldMatch

Normalized keypoint to world point match

KeyPoint

A point on an image frame. This type should be used when the point location is on the image frame in pixel coordinates. This means the keypoint is neither undistorted nor normalized.

Skew3

Contains a member of the Lie algebra so(3), a representation of the tangent space of 3d rotation. This is also known as the Lie algebra of the 3d rotation group SO(3).
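A member of so(3) can be stored as a 3-vector whose direction is the rotation axis and whose magnitude is the angle. The exponential map to SO(3) is the Rodrigues formula, R = I + sin(θ)K + (1 − cos(θ))K², where K is the skew-symmetric cross-product matrix of the unit axis. The following is an illustrative sketch of that map, not Skew3's actual implementation:

```rust
type M3 = [[f64; 3]; 3];

/// Exponential map from an axis-angle vector w in so(3) to a rotation
/// matrix in SO(3) via the Rodrigues formula.
fn so3_exp(w: [f64; 3]) -> M3 {
    let theta = (w[0] * w[0] + w[1] * w[1] + w[2] * w[2]).sqrt();
    if theta < 1e-12 {
        // A zero tangent vector maps to the identity rotation.
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]];
    }
    let (x, y, z) = (w[0] / theta, w[1] / theta, w[2] / theta);
    // K: the skew-symmetric cross-product matrix of the unit axis.
    let k: M3 = [
        [0.0, -z, y],
        [z, 0.0, -x],
        [-y, x, 0.0],
    ];
    let (s, c) = theta.sin_cos();
    let mut r = [[0.0; 3]; 3];
    for i in 0..3 {
        for j in 0..3 {
            // (K^2)_{ij} computed on the fly.
            let k2 = (0..3).map(|m| k[i][m] * k[m][j]).sum::<f64>();
            let id = if i == j { 1.0 } else { 0.0 };
            r[i][j] = id + s * k[i][j] + (1.0 - c) * k2;
        }
    }
    r
}

fn main() {
    // A quarter turn about the Z axis maps the +X basis vector to +Y.
    let r = so3_exp([0.0, 0.0, std::f64::consts::FRAC_PI_2]);
    let v = [r[0][0], r[1][0], r[2][0]]; // image of the X basis vector
    assert!(v[0].abs() < 1e-12 && (v[1] - 1.0).abs() < 1e-12 && v[2].abs() < 1e-12);
    println!("R = {:?}", r);
}
```

The tangent-space form is useful in optimization because it is a minimal (3-parameter) rotation representation with no constraints to maintain.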

WorldPoint

A point in "world" coordinates. This means that the real-world units of the pose are unknown, but the unit of distance and orientation are the same as the current reconstruction.

WorldToCamera

This contains a world pose, which is a pose of the world relative to the camera. This maps WorldPoint into CameraPoint, changing an absolute position into a vector relative to the camera.

Traits

Bearing

Describes the direction that the projection onto the camera's optical center came from. It is implemented on projection items from different camera models. It is also implemented for Unit<Vector3<f64>> if you want to pre-compute the normalized bearings for efficiency or to turn all camera models into a unified type.

CameraModel

Allows conversion between the point on an image and the internal projection which can describe the bearing of the projection out of the camera.
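As a concrete illustration of the idea (the parameter names and struct here are assumptions, not cv-core's actual CameraModel API), a pinhole model undoes the intrinsics — focal lengths fx, fy and principal point cx, cy — to turn a pixel keypoint into a normalized image coordinate, and from there into a unit bearing out of the optical center:

```rust
/// A hypothetical pinhole intrinsic model for illustration only.
struct Pinhole {
    fx: f64,
    fy: f64,
    cx: f64,
    cy: f64,
}

impl Pinhole {
    /// Pixel coordinates -> normalized image coordinates (on the z = 1 plane).
    fn normalize(&self, u: f64, v: f64) -> (f64, f64) {
        ((u - self.cx) / self.fx, (v - self.cy) / self.fy)
    }

    /// Pixel coordinates -> unit bearing out of the optical center.
    fn bearing(&self, u: f64, v: f64) -> [f64; 3] {
        let (x, y) = self.normalize(u, v);
        let norm = (x * x + y * y + 1.0).sqrt();
        [x / norm, y / norm, 1.0 / norm]
    }
}

fn main() {
    let cam = Pinhole { fx: 500.0, fy: 500.0, cx: 320.0, cy: 240.0 };
    // A keypoint at the principal point looks straight down the Z axis.
    assert_eq!(cam.bearing(320.0, 240.0), [0.0, 0.0, 1.0]);
    // 250 pixels right of center with fx = 500 gives x = 0.5.
    assert_eq!(cam.normalize(570.0, 240.0), (0.5, 0.0));
    println!("bearing: {:?}", cam.bearing(570.0, 240.0));
}
```

Real camera models also handle lens distortion, which is why the trait abstracts over the projection type rather than assuming this simple linear mapping.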

ImagePoint

Allows the retrieval of the point on the image the feature came from.

Pose

This trait is implemented by all the different poses in this library: CameraToCamera, CameraToWorld, and WorldToCamera.

Projective

This trait is implemented for homogeneous projective 3d coordinates.

TriangulatorObservances

This trait is for algorithms which allow you to triangulate a point from two or more observances. Each observance is a WorldToCamera and a Bearing.

TriangulatorRelative

This trait allows you to take one relative pose from camera A to camera B and two bearings a and b from their respective cameras to triangulate a point from the perspective of camera A.