BINOCULAR STEREO VISION BASED ON OPENCV
Ke Lu, Xiangyang Wang, Zhi Wang, Lei Wang
School of Communication and Information Engineering
Shanghai University
lukeshu09@gmail.com
Keywords: OpenCV, Camera calibration, Stereo
rectification.
Abstract
Stereo vision is an important branch of computer
vision research. Among stereo vision techniques,
binocular stereo vision, which is based on processing
two images, has received the most attention. It directly
simulates the way human eyes observe one scene from
two different viewpoints. In this paper, binocular stereo
vision technology is analyzed. Based on OpenCV, the
key algorithms of stereo vision are implemented and
the depth information of an object is recovered by our
program. By making full use of the functions of
OpenCV, this method achieves good computational
efficiency and accuracy as well as good portability,
and can be adopted in many types of computer
vision systems.
1. Introduction
Stereo vision, which aims to recover the depth
information of a scene, plays an important role in
computer vision. It can be applied to 3D reconstruction,
industrial automation systems, remote sensing and so
on. An integrated stereo vision system usually consists
of image capture, camera calibration, stereo
rectification, feature extraction, stereo matching and
depth computation. Among these parts, camera
calibration and stereo matching are the most important
and difficult problems [1-3].
OpenCV is an open source computer vision library
written in optimized C and C++ that runs under Linux,
Windows and Mac OS X. It was designed for
computational efficiency, with a strong focus on
real-time applications, and can take advantage of
multicore processors. It is mainly used for advanced
image processing tasks such as feature detection and
tracking, motion analysis, object segmentation and
recognition, and 3D reconstruction. Because of its high
performance, it is widely used as image processing
software [4].
2. Fundamental principle
The basic principle of binocular stereo vision is similar
to the way human eyes work: one scene is observed
from two different viewpoints. Using the principle of
triangulation, depth information is recovered from the
disparities.
Figure 1. A perfectly undistorted, aligned stereo rig
As shown in Figure 1, suppose that we have a
perfectly undistorted, aligned, and measured stereo rig:
two cameras whose image planes are exactly coplanar,
with exactly parallel optical axes that are a known
distance T apart, and with equal focal lengths f. Also
assume that the principal points have been calibrated
to have the same pixel coordinates in their respective
left and right images, and that the images are
row-aligned, so that every pixel row of one camera
aligns exactly with the corresponding row in the other
camera [4]. P is a point in the real world, and p and q
are its projections onto the two image planes, with
horizontal coordinates Xl and Xr.
In this simplified case, the disparity can be defined by
d=Xl-Xr and the depth is inversely proportional to the
disparity. By using similar triangles, the depth Z can
be derived as follows [5]:
$$\frac{T - (X_l - X_r)}{Z - f} = \frac{T}{Z} \;\Rightarrow\; Z = \frac{fT}{X_l - X_r} \qquad (1)$$
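As a numerical illustration of Equation (1), the following minimal
sketch computes Z = fT/(Xl - Xr) for an ideal, row-aligned rig. The
function name and the values of f, T, Xl and Xr are hypothetical and
are not taken from our experiment.

```python
def depth_from_disparity(f, T, xl, xr):
    """Depth Z = f*T / (xl - xr) for an ideal, row-aligned stereo rig.

    f      focal length in pixels
    T      baseline between the two optical axes (its unit sets the unit of Z)
    xl, xr horizontal pixel coordinates of the same point in the left/right image
    """
    d = xl - xr  # disparity in pixels
    if d <= 0:
        raise ValueError("disparity must be positive for a point in front of the rig")
    return f * T / d


# Hypothetical rig: f = 700 px, T = 6 cm.
z_near = depth_from_disparity(700.0, 6.0, 420.0, 350.0)  # disparity 70 px
z_far = depth_from_disparity(700.0, 6.0, 420.0, 385.0)   # disparity 35 px
```

Halving the disparity doubles the depth, which is the inverse
proportionality stated above.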
However, in the real world, cameras are almost never
exactly aligned in the frontal parallel configuration, so we need to
ICSSC 2011 74
process the image to map a real world camera setup
into a geometry that resembles this ideal arrangement.
2.1. Camera calibration
The purpose of camera calibration is to establish the
imaging model and determine the camera position and
attribute parameters, so as to establish the
correspondence between a 3D object point in world
coordinates and its 2D point in the image plane [6].
Camera calibration is one of the key procedures in
computer vision, and its accuracy is vital to the
accuracy of the whole system.
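The correspondence between a 3D point and its 2D image point can be
sketched with an ideal pinhole model. The sketch below is only an
illustration: it ignores lens distortion, and the parameter names
(fx, fy for the focal lengths in pixels, cx, cy for the principal
point) and all numeric values are assumptions, not calibration
results.

```python
def project_point(P_world, R, T, fx, fy, cx, cy):
    """Project a 3D world point to pixel coordinates (ideal pinhole, no distortion)."""
    # Extrinsic parameters: world coordinates -> camera coordinates, Pc = R * Pw + T
    Pc = [sum(R[i][j] * P_world[j] for j in range(3)) + T[i] for i in range(3)]
    X, Y, Z = Pc
    if Z <= 0:
        raise ValueError("point is behind the camera")
    # Intrinsic parameters: perspective division, then scale and shift to pixels
    u = fx * X / Z + cx
    v = fy * Y / Z + cy
    return u, v


# Hypothetical camera placed at the world origin, looking along +Z.
I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
u, v = project_point([0.1, -0.05, 2.0], I3, [0.0, 0.0, 0.0],
                     fx=800.0, fy=800.0, cx=320.0, cy=240.0)
```

Calibration solves the inverse problem: from many known 3D-2D pairs
it estimates fx, fy, cx, cy (intrinsic) and R, T (extrinsic).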
In general, camera calibration can be classified into
traditional camera calibration and camera
self-calibration [7]. Traditional camera calibration is
carried out with a prepared object whose 3D geometry
is already known: given a fixed camera model and
specific experimental conditions, the captured images
are processed and the intrinsic and extrinsic
parameters of the camera are computed through a
series of mathematical transformations. Camera
self-calibration, in contrast, is calibration with no user
intervention or calibration pattern, performed online
during the acquisition process [8].
In binocular stereo vision, apart from single camera
calibration, we must solve the relative position
between two cameras. Suppose that the extrinsic
parameters (rotation matrix and translation vector) of
two cameras are $R_l$, $T_l$ and $R_r$, $T_r$, which reflect the
relative position of two cameras in the world
coordinate system. Then the relationship between two
cameras can be expressed as follows [4]:
$$R = R_r (R_l)^{\mathsf{T}}$$
$$T = T_r - R\,T_l \qquad (2)$$
In this formula, R and T represent the rotation matrix
and the translation vector between the two cameras.
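Equation (2) can be checked numerically. The sketch below composes
R = Rr * Rl^T and T = Tr - R * Tl and verifies that a world point
mapped into the left camera frame and then through (R, T) lands on
the same coordinates as mapping it directly into the right camera
frame. The helper names and all extrinsic values are hypothetical.

```python
import math


def mat_mul(A, B):
    """3x3 matrix product A * B."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]


def mat_vec(A, v):
    return [sum(A[i][j] * v[j] for j in range(3)) for i in range(3)]


def rot_y(angle):
    """Rotation about the y axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def relative_pose(R_l, T_l, R_r, T_r):
    """Rotation R and translation T taking left-camera to right-camera coordinates."""
    R = mat_mul(R_r, transpose(R_l))
    RT_l = mat_vec(R, T_l)
    T = [T_r[i] - RT_l[i] for i in range(3)]
    return R, T


# Hypothetical extrinsic parameters of the two cameras.
R_l, T_l = rot_y(0.10), [0.2, 0.0, 1.0]
R_r, T_r = rot_y(-0.05), [-0.3, 0.1, 1.2]
R, T = relative_pose(R_l, T_l, R_r, T_r)

# Consistency check with an arbitrary world point P.
P = [0.5, -0.2, 3.0]
P_l = [a + b for a, b in zip(mat_vec(R_l, P), T_l)]       # P in the left frame
P_r = [a + b for a, b in zip(mat_vec(R_r, P), T_r)]       # P in the right frame
P_r_from_left = [a + b for a, b in zip(mat_vec(R, P_l), T)]
```

Here P_r and P_r_from_left agree up to floating-point rounding.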
2.2. Stereo rectification
After obtaining the intrinsic and extrinsic parameters
of the cameras and the rotation matrix and translation
vector between them, we need to
rectify the images so that the two image planes are
accurately row-aligned. As shown in Figure 2, Ěl and
Ěr are the original image planes and Ěl` and Ěr` are
the rectified image planes.
Figure 2. Stereo rectification
There are many ways to compute rectification terms
such as Hartley’s algorithm and Bouguet’s algorithm.
Hartley’s algorithm can yield uncalibrated stereo using
just the fundamental matrix while Bouguet’s algorithm
uses the rotation and translation parameters from two
calibrated cameras. Hartley’s algorithm can be used to
derive structure from motion recorded by a single
camera but may produce more distorted images than
Bouguet’s calibrated algorithm [4-5].
2.3. Stereo matching
Stereo matching, that is, matching a 3D point in the
two different camera views, can be computed only over
the visual areas in which the views of the two cameras
overlap [9]. Images of the same scene taken from
different viewpoints can differ greatly, because many
factors, such as lighting conditions, scene geometry
and physical properties, noise and distortion, affect the
image gray values. Therefore, accurately matching
images subject to so many adverse factors is very
difficult.
The existing techniques for general binocular stereo
matching can be classified into two categories: local
methods and global methods. Local methods use only
small areas surrounding the pixels, while global
methods optimize some global energy function [2].
Local methods, such as gradient-based optimization,
block matching, and feature matching, can be very
efficient, but they may mismatch in ambiguous regions.
Global methods, such as intrinsic curves, dynamic
programming, graph cuts, and belief propagation, can
be less sensitive to these problems. However, these
methods are computationally more expensive.
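To make the idea of a local method concrete, the following is a
minimal block matching sketch on a single pair of rectified image
rows. It uses the sum of absolute differences (SAD) as the matching
cost, which is one common choice and an assumption here; the window
size, the search range and the toy rows are hypothetical.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized windows."""
    return sum(abs(x - y) for x, y in zip(a, b))


def match_scanline(left_row, right_row, half_win=1, max_disp=4):
    """Per-pixel disparity for one row pair of rectified images (None at borders)."""
    n = len(left_row)
    disparities = [None] * n
    for x in range(half_win, n - half_win):
        window_l = left_row[x - half_win:x + half_win + 1]
        best_d, best_cost = None, None
        for d in range(max_disp + 1):
            xr = x - d  # the match lies at the same or a smaller x in the right row
            if xr - half_win < 0:
                break
            window_r = right_row[xr - half_win:xr + half_win + 1]
            cost = sad(window_l, window_r)
            if best_cost is None or cost < best_cost:
                best_d, best_cost = d, cost
        disparities[x] = best_d
    return disparities


# Toy rows: the right row is the left row shifted 2 pixels to the left.
left_row = [10, 10, 10, 50, 90, 200, 90, 50, 10, 10]
right_row = [10, 50, 90, 200, 90, 50, 10, 10, 10, 10]
disp = match_scanline(left_row, right_row)
```

Around the intensity peak the minimum-cost disparity is 2, the true
shift. In the flat part of the row several candidates can have
similar costs, which is the kind of ambiguous region where local
methods may mismatch.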
3. Experiment result
3.1. Camera calibration
We take 15 pairs of chessboard images from different
directions and extract the corner points of the
chessboard in every image. We then refine the corner
coordinates to sub-pixel accuracy and calculate the
intrinsic and extrinsic parameters of the cameras.
Figure 3 shows one such pair.
Figure 3. Chessboard corner extraction
3.2. Stereo rectification
With the intrinsic and extrinsic parameters, we can
calculate the row-aligned rectification rotations and the
reprojection matrix, and then compute the rectification
lookup maps for the left and right camera views. We
can then take pictures of the scene and obtain
row-aligned images by remapping them. Figure 4
shows the original left and right image pair (upper
panels) and the stereo rectified left and right image
pair (lower panels).
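The role of a lookup map in remapping can be illustrated with a
small sketch. For every pixel of the rectified image, the map stores
the source coordinates from which its value is taken. The function
below is a simplified, hypothetical stand-in that uses nearest
neighbour sampling on nested lists; in practice the maps come from
the calibration and rectification results and the remapping usually
interpolates between source pixels.

```python
def remap_nearest(src, map_x, map_y, fill=0):
    """Build dst with dst[y][x] = src[map_y[y][x]][map_x[y][x]] (nearest neighbour)."""
    h, w = len(map_x), len(map_x[0])
    src_h, src_w = len(src), len(src[0])
    dst = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sx = int(round(map_x[y][x]))
            sy = int(round(map_y[y][x]))
            # Pixels that map outside the source image keep the fill value
            if 0 <= sx < src_w and 0 <= sy < src_h:
                dst[y][x] = src[sy][sx]
    return dst


# Toy case: the content sits one row too low, so the map reads from one row below.
src = [[0, 0, 0],
       [1, 2, 3],
       [4, 5, 6]]
map_x = [[0, 1, 2], [0, 1, 2], [0, 1, 2]]
map_y = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
dst = remap_nearest(src, map_x, map_y)
```

In this toy case the remapped image is the source shifted up by one
row, with the last row left at the fill value.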
Figure 4. Stereo rectification
3.3. Corner detection and stereo matching
As shown in Figure 5, we extract corners from the left
image and find the corresponding corners in the right
image by using the epipolar constraint and gray-level
similarity. We can then compute the disparities from
multiple pairs of matched points.
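A minimal sketch of this kind of corner matching is given below. It
assumes rectified images, so the epipolar constraint reduces to
"same row, within a tolerance", and it measures gray-level similarity
with the sum of absolute differences over a small patch. The
similarity measure, the tolerance, the function names and the toy
data are assumptions for illustration and may differ from what our
program uses.

```python
def patch(img, x, y, r):
    """Gray values of the (2r+1) x (2r+1) patch centred on (x, y), row by row."""
    return [img[j][i] for j in range(y - r, y + r + 1)
            for i in range(x - r, x + r + 1)]


def match_corners(left_img, right_img, left_corners, right_corners, r=1, row_tol=1):
    """Return a list of (left_corner, right_corner, disparity) tuples."""
    matches = []
    for (xl, yl) in left_corners:
        pl = patch(left_img, xl, yl, r)
        best = None
        for (xr, yr) in right_corners:
            # Epipolar constraint on rectified images: nearly the same row,
            # and the right-image point cannot lie to the right of the left one.
            if abs(yr - yl) > row_tol or xr > xl:
                continue
            cost = sum(abs(a - b) for a, b in zip(pl, patch(right_img, xr, yr, r)))
            if best is None or cost < best[0]:
                best = (cost, (xr, yr))
        if best is not None:
            matches.append(((xl, yl), best[1], xl - best[1][0]))
    return matches


# Toy images: the right image is the left image shifted 2 pixels to the left.
left_img = [[0, 0, 0, 0, 0, 0, 0, 0],
            [0, 0, 0, 0, 9, 9, 0, 0],
            [0, 0, 0, 0, 9, 5, 0, 0],
            [0, 0, 0, 0, 0, 0, 0, 0],
            [0, 0, 0, 0, 0, 0, 0, 0]]
right_img = [row[2:] + [0, 0] for row in left_img]
matches = match_corners(left_img, right_img,
                        left_corners=[(4, 1)],
                        right_corners=[(3, 1), (2, 1), (5, 1), (2, 3)])
```

Candidates on other rows or to the right of the left corner are
rejected by the constraint, and the remaining candidates are ranked
by patch similarity.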
Figure 5. Corner detection and stereo matching
3.4. Reprojection
Using the disparities and the reprojection matrix, we
can compute the 3D coordinates of the corners.
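The reprojection step can be sketched as follows. The 4x4 matrix Q
is written in the form given in [4], under the assumption that Tx is
the x component of the translation between the cameras (negative
when the right camera lies to the right of the left one) and that
both principal points share the same x coordinate unless cx_right is
given. All numeric values are hypothetical.

```python
import math


def make_Q(f, cx, cy, Tx, cx_right=None):
    """Reprojection matrix for a rectified pair (form assumed from [4])."""
    if cx_right is None:
        cx_right = cx
    return [[1.0, 0.0, 0.0, -cx],
            [0.0, 1.0, 0.0, -cy],
            [0.0, 0.0, 0.0, f],
            [0.0, 0.0, -1.0 / Tx, (cx - cx_right) / Tx]]


def reproject(Q, x, y, d):
    """Map pixel (x, y) with disparity d to 3D coordinates (X/W, Y/W, Z/W)."""
    v = [x, y, d, 1.0]
    X, Y, Z, W = (sum(Q[i][j] * v[j] for j in range(4)) for i in range(4))
    if W == 0:
        raise ValueError("zero disparity: point at infinity")
    return X / W, Y / W, Z / W


# Hypothetical rig: f = 700 px, principal point (320, 240), baseline 6 cm.
Q = make_Q(f=700.0, cx=320.0, cy=240.0, Tx=-6.0)
corner_a = reproject(Q, 390.0, 240.0, 70.0)
corner_b = reproject(Q, 460.0, 240.0, 70.0)

# An edge length is the Euclidean distance between two reprojected corners.
edge_length = math.dist(corner_a, corner_b)
```

With these values both corners lie at a depth of about 60 cm, which
agrees with Z = fT/d for T = 6 cm and d = 70 px, and the edge
between them is about 6 cm long.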
In our experiment, we calculate the size of the box
according to the coordinates of its corners. The results
are shown in Table 1.
          Actual size (cm)    Measurement result (cm)
Length    25.5                23.1
Width     20.0                17.9
Height     7.5                 6.8
Table 1. Experiment result
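The relative error of each measurement follows directly from the
values in Table 1. The short sketch below performs this arithmetic;
the function and variable names are ours, and no data beyond the
table is used.

```python
# (actual size in cm, measured size in cm), copied from Table 1
measurements = {
    "Length": (25.5, 23.1),
    "Width": (20.0, 17.9),
    "Height": (7.5, 6.8),
}


def relative_error_percent(actual, measured):
    """Absolute relative error of a measurement, in percent."""
    return abs(measured - actual) / actual * 100.0


errors = {name: relative_error_percent(a, m)
          for name, (a, m) in measurements.items()}

for name, err in errors.items():
    print(f"{name}: {err:.1f} %")
```

The relative errors are roughly 9 to 11 percent, and in all three
dimensions the measured value is smaller than the actual size.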
4. Conclusion
As one of the important branches of computer vision,
the researches in binocular stereo vision are of great
significance in engineering application. In our paper,
we propose a method based on OpenCV by which
depth can be well recovered. Next, we will implement
a 3D reconstruction system by this method, which will
simplify 3D modelling.
5. Acknowledgement
This research work is supported by Shanghai
Municipal Natural Science Foundation (No.
09ZR1412300), National Natural Science Foundation
of China (NSFC, No. 60975024), National Natural
Science Foundation of China (60873130, 60872115),
and the Shanghai’s Leading Academic Discipline
Project of Shanghai Municipal Education Committee
(J50104).
References
[1] E. E. Hemayed, "A survey of camera self-calibration,"
in Proceedings. IEEE Conference on Advanced Video
and Signal Based Surveillance, 2003., 2003, pp. 351-
357.
[2] D. Scharstein, R. Szeliski, and R. Zabih, "A taxonomy
and evaluation of dense two-frame stereo
correspondence algorithms," in Stereo and Multi-
Baseline Vision, 2001. (SMBV 2001). Proceedings.
IEEE Workshop on, 2001, pp. 131-140.
[3] L. JunJie, A. Jakas, A. Al-Obaidi, and L. Yonghuai, "A
comparative study of different corner detection
methods," in Computational Intelligence in Robotics
and Automation (CIRA), 2009 IEEE International
Symposium on, 2009, pp. 509-514.
[4] G. Bradski and A. Kaehler, Learning OpenCV.
America: O'Reilly, 2009.
[5] S. Huihuang and H. Bingwei, "A Simple Rectification
Method of Stereo Image Pairs with Calibrated
Cameras," in Information Engineering and Computer
Science (ICIECS), 2010 2nd International Conference
on, 2010, pp. 1-4.
[6] Z. Zhang, "A flexible new technique for camera
calibration," Pattern Analysis and Machine
Intelligence, IEEE Transactions on, vol. 22, pp. 1330-
1334, 2000.
[7] Z. Hou, J. Zhao, L. Gu, and G. Lv, "Automatic
Calibration Method Based on Traditional Camera
Calibration Approach," in Information Science and
Engineering (ICISE), 2009 1st International
Conference on, 2009, pp. 1168-1171.
[8] Y. Yongjie, Z. Qidan, L. Zhuang, and C. Quanfu,
"Camera Calibration in Binocular Stereo Vision of
Moving Robot," in Intelligent Control and Automation,
2006. WCICA 2006. The Sixth World Congress on,
2006, pp. 9257-9261.
[9] T. Tangfei, K. Ja Choon, and C. Hyouk Ryeol, "A fast
block matching algorthim for stereo correspondence,"
in Cybernetics and Intelligent Systems, 2008 IEEE
Conference on, 2008, pp. 38-41