§Rust CV Core
This library provides common abstractions and types for computer vision (CV) in Rust. All crates in the rust-cv ecosystem that have or depend on CV types depend on this crate. This includes things like camera model traits, bearings, poses, keypoints, etc. The crate is designed to be very small so that it adds negligible build time. It pulls in some dependencies that would probably be brought in by writing computer vision code normally anyway. The core concept is that all CV crates can work together with each other by using the abstractions and types specified in this crate.
The crate is designed to work with `#![no_std]`, even without an allocator. `libm` is used (indirectly through `num-traits`) for all math algorithms that aren't present in `std`. Any code that doesn't need to be shared across all CV crates should not belong in this repository. If there is a good reason to put code that some crates may need into cv-core, it should be gated behind a feature.
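For example, a downstream crate can itself be `#![no_std]` and still operate on cv-core types. A minimal sketch (the helper function is hypothetical, and the `KeyPoint` tuple-struct layout shown is an assumption; check the type's definition):

```rust
#![no_std]

use cv_core::KeyPoint;
use cv_core::nalgebra::Point2;

/// Hypothetical helper: mirror a pixel-coordinate keypoint about the
/// vertical line x = cx. No allocator or std is required for this.
pub fn mirror_x(kp: KeyPoint, cx: f64) -> KeyPoint {
    KeyPoint(Point2::new(2.0 * cx - kp.0.x, kp.0.y))
}
```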
§Triangulation
Several of the traits within cv-core, such as `TriangulatorObservances`, must perform a process called triangulation. In computer vision, this problem occurs quite often, as we often have some of the following data: the known pose of each camera, and a bearing, computed from a normalized image coordinate, pointing from each camera's optical center approximately towards the same 3d point.
We have to take this data and produce a 3d point. Cameras have an optical center from which all bearings protrude. This is often referred to as the focal point in a standard camera, but in computer vision the term optical center is preferred, as it is a generalized concept. What typically happens in triangulation is that we have (at least) two optical centers and a bearing (direction) out of each of those optical centers approximately pointing towards the 3d point. In an ideal world, these bearings would point exactly at the point, and triangulation would be achieved simply by solving for the point of intersection. Unfortunately, the real world throws a wrench at us: the bearings won't actually intersect, since they are based on noisy data. This is why we need different triangulation algorithms, which deal with the error in different ways and have different characteristics.
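To make the problem concrete, here is a sketch of one of the simplest approaches, the midpoint method: solve a 2×2 linear system for the closest point on each ray, then average the two. This is not the algorithm any particular rust-cv triangulator uses; it is only an illustration using the re-exported nalgebra types:

```rust
use cv_core::nalgebra::{Point3, Unit, Vector3};

/// Midpoint triangulation sketch: given two optical centers and a unit
/// bearing out of each, return the point halfway between the closest
/// points on the two rays, or None if the bearings are nearly parallel.
fn triangulate_midpoint(
    c1: Point3<f64>,
    d1: Unit<Vector3<f64>>,
    c2: Point3<f64>,
    d2: Unit<Vector3<f64>>,
) -> Option<Point3<f64>> {
    let r = c2 - c1;
    let a = d1.dot(&d2);
    let denom = 1.0 - a * a;
    if denom < 1e-12 {
        // The rays are (nearly) parallel; the midpoint is ill-defined.
        return None;
    }
    // Solve for the parameters along each ray that minimize the distance
    // between c1 + t1*d1 and c2 + t2*d2.
    let b1 = d1.dot(&r);
    let b2 = d2.dot(&r);
    let t2 = (a * b1 - b2) / denom;
    let t1 = b1 + a * t2;
    let p1 = c1 + t1 * d1.into_inner();
    let p2 = c2 + t2 * d2.into_inner();
    Some(Point3::from((p1.coords + p2.coords) * 0.5))
}
```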
Here is an example where we have two pinhole cameras `A` and `B`. The `@` characters are used to show the virtual image plane. The virtual image plane can be thought of as a surface in front of the camera through which the light passes from the point to the optical center `O`.
The points `a` and `b` are normalized image coordinates which describe the position on the virtual image plane through which the light passed from the point to the optical center on cameras `A` and `B` respectively. We know the exact pose (position and orientation) of each of these two cameras, and we also know the normalized image coordinates, which we can use to compute a bearing. We are trying to solve for the point `p` which would cause the ray of light to pass through points `a` and `b` followed by `O`.
- `p` the point we are trying to triangulate
- `a` the normalized keypoint on camera A
- `b` the normalized keypoint on camera B
- `O` the optical center of a camera
- `@` the virtual image plane
```text
                       @
                       @
              p--------b--------O
             /         @
            /          @
           /           @
          /            @
  @@@@@@@a@@@@@
        /
       /
      /
     O
```
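Under the pinhole convention this crate uses for camera points (+X right, +Y down, +Z forward), a normalized image coordinate (x, y) lies on the virtual image plane at depth z = 1, so the bearing through it is just the direction (x, y, 1) normalized. A sketch (the helper is hypothetical):

```rust
use cv_core::nalgebra::{Unit, Vector2, Vector3};

/// Turn a normalized image coordinate into a unit bearing out of the
/// optical center; the virtual image plane sits at z = 1.
fn bearing_from_normalized(a: Vector2<f64>) -> Unit<Vector3<f64>> {
    Unit::new_normalize(Vector3::new(a.x, a.y, 1.0))
}
```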
Re-exports§
pub use nalgebra;
pub use sample_consensus;
Structs§
- `CameraPoint`: A 3d point which is relative to the camera's optical center and orientation, where the positive X axis is right, the positive Y axis is down, and the positive Z axis is forwards from the optical center of the camera. The unit of distance of a `CameraPoint` is unspecified and relative to the current reconstruction.
- `CameraToCamera`: This contains a relative pose, which is a pose that transforms the `CameraPoint` of one image into the corresponding `CameraPoint` of another image. This transforms the point from the camera space of camera `A` to camera `B`.
- `CameraToWorld`: This contains a camera pose, which is a pose of the camera relative to the world. This transforms camera points (with depth as `z`) into world coordinates. This also tells you where the camera is located and oriented in the world.
- `FeatureMatch`: Normalized keypoint match.
- `FeatureWorldMatch`: Normalized keypoint to world point match.
- `KeyPoint`: A point on an image frame. This type should be used when the point location is on the image frame in pixel coordinates. This means the keypoint is neither undistorted nor normalized.
- `Skew3`: Contains a member of the Lie algebra so(3), a representation of the tangent space of 3d rotation. This is also known as the Lie algebra of the 3d rotation group SO(3).
- `WorldPoint`: A point in "world" coordinates. This means that the real-world units of the pose are unknown, but the unit of distance and orientation are the same as the current reconstruction.
- `WorldToCamera`: This contains a world pose, which is a pose of the world relative to the camera. This maps `WorldPoint` into `CameraPoint`, changing an absolute position into a vector relative to the camera (see the sketch after this list).
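As a sketch of how the pose types compose (the `from_isometry`, `from_point`, and `transform` methods shown here are assumed from the `Pose` and `Projective` traits; verify against the current API):

```rust
use cv_core::{Pose, Projective, WorldPoint, WorldToCamera};
use cv_core::nalgebra::{IsometryMatrix3, Point3, Rotation3, Vector3};

fn main() {
    // A pose of the world relative to the camera: rotate slightly about Y
    // and translate one unit along Z.
    let pose = WorldToCamera::from_isometry(IsometryMatrix3::from_parts(
        Vector3::new(0.0, 0.0, 1.0).into(),
        Rotation3::from_euler_angles(0.0, 0.1, 0.0),
    ));
    // A homogeneous world point, mapped into the camera's coordinate frame.
    let world = WorldPoint::from_point(Point3::new(0.5, -0.25, 3.0));
    let camera = pose.transform(world);
    println!("{:?}", camera);
}
```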
Traits§
- `Bearing`: Describes the direction that the projection onto the camera's optical center came from. It is implemented on projection items from different camera models. It is also implemented for `Unit<Vector3<f64>>` if you want to pre-compute the normalized bearings for efficiency or to turn all camera models into a unified type (see the sketch after this list).
- `CameraModel`: Allows conversion between the point on an image and the internal projection which can describe the bearing of the projection out of the camera.
- `ImagePoint`: Allows the retrieval of the point on the image the feature came from.
- `Pose`: This trait is implemented by all the different poses in this library: `CameraToCamera`, `CameraToWorld`, and `WorldToCamera`.
- `Projective`: This trait is implemented for homogeneous projective 3d coordinates.
- `TriangulatorObservances`: This trait is for algorithms which allow you to triangulate a point from two or more observances. Each observance is a `WorldToCamera` and a `Bearing`.
- `TriangulatorRelative`: This trait allows you to take one relative pose from camera `A` to camera `B` and two bearings `a` and `b` from their respective cameras to triangulate a point from the perspective of camera `A`.
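For instance, implementing `Bearing` for a custom normalized-coordinate type might look like the sketch below (the type is hypothetical, and the trait is assumed to require `bearing_unnormalized` with a provided `bearing` method; check the trait's actual definition):

```rust
use cv_core::Bearing;
use cv_core::nalgebra::{Vector2, Vector3};

/// A hypothetical normalized image coordinate on the z = 1 virtual
/// image plane.
struct NormalizedKeyPoint(Vector2<f64>);

impl Bearing for NormalizedKeyPoint {
    fn bearing_unnormalized(&self) -> Vector3<f64> {
        // Extend (x, y) with z = 1 to get a direction out of the camera.
        self.0.push(1.0)
    }
}
```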