INSTITUTE OF PHYSICS PUBLISHING MEASUREMENT SCIENCE AND TECHNOLOGY
Meas. Sci. Technol. 17 (2006) 736–745 doi:10.1088/0957-0233/17/4/020
A robust lane detection and tracking
method based on computer vision
Yong Zhou, Rong Xu, Xiaofeng Hu and Qingtai Ye
640 Institute, School of Mechanical Engineering, Shanghai Jiao Tong University,
Shanghai 200 030, People’s Republic of China
E-mail: zhou@sjtu.edu.cn, rxu@sjtu.edu.cn, wshxf@sjtu.edu.cn and
yqingtai@sjtu.edu.cn
Received 2 October 2005, in final form 6 January 2006
Published 21 February 2006
Online at stacks.iop.org/MST/17/736
Abstract
This paper presents a robust method designed to detect and track a road lane
from images provided by an on-board monocular monochromatic camera.
The proposed lane detection approach makes use of a deformable template
model of the expected lane boundaries in the image, a maximum a posteriori
formulation of the lane detection problem, and a Tabu search algorithm to
maximize the posterior density. The model parameters completely
determine the position of the host vehicle within the lane, its heading
direction and the local structure of the lane ahead. Based on the lane
detection result in the first frame of the image sequence, a particle filter,
having multiple hypotheses capability and performing nonlinear filtering, is
used to recursively estimate the lane shape and the vehicle position in the
sequence of consecutive images. Experimental results reveal that the
proposed lane detection and tracking method is robust against broken lane
markings, curved lanes, shadows, strong distracting edges, and occlusions in
the captured road images.
Keywords: lane detection, lane tracking, MAP estimate, Tabu search, particle
filter, computer vision
1. Introduction
Lane detection and tracking, the process of estimating the
local geometry structure of the lane ahead and the position and
heading direction of the vehicle inside the lane, is primary and
essential for many intelligent vehicle applications, including
intelligent cruise control, lane departure warning, lateral
control and autonomous driving.
In most conditions, lane detection and tracking is
simplified into a problem of finding the lane boundaries in
the input road images. We argue that a distinction can
be made between lane detection and lane tracking. Lane
detection aims to determine the location of lane boundaries
in a single image without strong prior knowledge about the
lane position. Lane tracking involves determining the location
of lane boundaries in a sequence of consecutive images by
constraining the probable lane location in the current image
using information about the lane location in previous images.
The difficulty of lane detection and tracking lies in the fact
that the lane boundaries in an image can have relatively weak
local contrast, or strong distracting edges.
Recently, many lane detection methods have been
introduced [1–3], and they can be classified into region-based
methods and edge-based methods. Region-based methods
first label image pixels into road and non-road classes based
on particular features. Then, lane models are used to fit the
segmentation results. Usually, two particular features, namely
colour [4–7] and texture [8, 9], have been used to segment the
road region.
However, the colour and texture of roads differ considerably from
one road to another, and under uncontrolled illumination the colour
of a road also varies with time. Moreover, region-based
lane detection methods are time-consuming and have difficulty
locating the lane boundaries precisely.
Most lane detection methods are edge-based. After an
edge detection step, the edge-based methods organize the
detected edges into meaningful structure (lane marking) or
fit a lane model to the detected edges. Most of the edge-based
methods use simple lane models to characterize the lane
boundaries.
The simplest model treats the lane boundaries as straight
lines [10]. This technique is simple, but it introduces errors in
vehicle localization when the lane boundaries are not straight.
In [11], the adaptive randomized Hough transform is
used to detect the lane boundary. The three-dimensional
(3D) parametric space of the curve is reduced to the two-
dimensional (2D) and the one-dimensional (1D) space. The
paired parameters in two dimensions are estimated by gradient
directions and the last parameter in one dimension is used to
verify the estimated parameters by histogram.
In [12], the authors propose a B-Snake-based lane
detection and tracking algorithm without any camera
parameters. The Canny/Hough estimation of vanishing points
(CHEVP) is presented for providing a good initial position
for the B-Snake and minimum mean-square error (MMSE) is
proposed to determine the control points of the B-Snake model
by the overall image forces on two sides of the lane.
The LANA algorithm [13] uses frequency-domain
features to find the lane edge. The feature vectors are used to
compute the likelihood probability through fitting the detected
features to a lane model. In [14], the road edges are detected
using a model which characterizes the road edges in road scene
images and is initialized by an off-line training step. The model
is updated in a recursive way after each detection.
In [15], a lane-curve function (LCF) is used, obtained by
transforming a parabolic function defined in world coordinates
into image coordinates. A comparison
is carried out between the slope of an assumed LCF and
the phase angle of the edge pixels in the lane region of
interest constructed by the assumed LCF. The LCF with the
minimum difference in the comparison becomes the true LCF
corresponding to the lane curve.
LOIS in [16] and [17] uses a deformable template
model of the lane boundaries in the image plane to locate
lane boundaries without thresholding the intensity gradient
information. The Metropolis algorithm is used to maximize
a function which evaluates how well the image gradient data
support a given set of template deformation parameters.
GOLD [18] and RALPH [19] reproject the image ahead
of the vehicle onto the ground plane. In GOLD, lane markings
are modelled as narrow bright features against a darker
background. The lane width is identified by constructing a
histogram of the horizontal separation between all pairs of
potential lane edge points and finding the peak value. The lane
is found by identifying the pairs of edge points in each row
forming the longest continuous lane. RALPH determines the
road curvature and lateral offsets using a matching technique
that adaptively adjusts and aligns a template to the averaged
scan line intensity profile.
Edge-based methods are highly dependent on the methods
used to extract the edges corresponding to the lane boundaries.
Often in practice the strongest edges are not the road edges,
so that the detected edges do not necessarily fit a straight line
or a smoothly varying model. Edge-based methods often fail
to locate the lane boundaries in images with strong distracting
edges.
Lane tracking is the problem of recursively estimating the lane
shape parameters. In [20–23], Kalman filter estimators are used
whose observations are image edge features, and a controlled
search for these features allows edges that do not correspond
to useful road markings to be rejected [24]. These methods
need to reduce the effects of the outliers prior to road shape
estimation, which is a difficult task. When a tracking failure
occurs, Kalman-filter-based lane tracking has difficulty
recovering. Kalman filtering is inadequate because it is based on
Gaussian densities which, being unimodal, cannot represent
simultaneous alternative hypotheses. So a more robust lane
tracking method is needed.
To overcome the demerits of the lane detection and
tracking methods described above, a novel lane detection
and tracking method is developed in this paper. Firstly, a
deformable template model of the perspective projection of
the lane boundaries is introduced assuming that the lane
boundaries are parabolas in the ground plane. Then, the lane
detection problem is formulated as a maximum a posteriori
(MAP) estimate problem. Due to the non-concavity of the
function involved, a Tabu search algorithm is used to obtain the
global maxima. The model parameters calculated completely
determine the position of the vehicle inside the lane, its
heading direction, and the local structure of the lane. The
lane detection result in the first frame is used to initialize
a lane tracker. In this study, the lane shape and vehicle
position in the sequence of consecutive images are recursively
estimated using a particle filter, which has multiple hypotheses
capability and can perform nonlinear filtering. The proposed
lane detection approach can handle the situations where the
lane boundaries in an image have relatively weak local contrast
or where there are strong distracting edges.
The model proposed in this paper is similar to those
introduced by [17, 22] in form, but the meanings of our model’s
parameters are explicit. We give the relationships between the
model’s parameters and the vehicle position and orientation,
and the lane shape. The proposed lane detection method is
different from those in [17, 22]. Furthermore, this paper gives a
lane tracking method that has multiple hypotheses capability
and can filter the estimated parameters.
The rest of this paper is organized as follows: section 2
describes the proposed lane detection approach including the
lane model in the image coordinate system, a maximum
a posteriori (MAP) formulation of the lane detection problem,
and a Tabu search algorithm to maximize the posterior density.
Section 3 presents a particle filter based lane tracking method.
Experimental results are provided in section 4, and section 5
gives conclusions.
2. Lane detection based on the deformable template
model and MAP estimate
2.1. Lane model
The lane model plays an important role in lane detection and
tracking algorithms. With reference to figure 1, let O be the
optical centre of the camera, at a height H above the ground
plane. Let OXYZ be a world coordinate system with OZ
parallel to the tangent to the lane border at A, OX pointing
right, OY pointing downward. The camera axes are indicated
by xc, yc and zc, with zc being in the direction of the optical
axis, xc in the direction of the scan lines and yc downward
Figure 1. The coordinate systems for the computation of the
coordinates of a lane boundary point.
(in the direction of increasing scan line numbers). The pan of
the camera is the angle θ between the optical axis and the YZ
plane, the tilt of the camera is the angle φ between the optical
axis and the XZ plane.
Based on the fact that the curvatures of high speed
roads are small, the lane boundaries can be approximated by
parabolas in a flat ground plane in the length of the visible
road [20]. Let (η, ε) denote a point on the lane boundary in an
earth fixed coordinate system with its origin represented by a
reference point A on the road boundary and η, ε pointing front
(along the tangent to the lane boundary) and right, respectively
(see figure 1). For small changes of the road direction, that is,
for ε/η ≪ 1, the coordinates of the point on the lane boundary
with curvature C0 can be approximated by the parametric
equation (1):

    ε = C0l²/2,   η = l,   (1)
where l is the arc length of the lane boundary segment. We now
introduce the approximation θ ≪ 1 (cos θ ≈ 1, sin θ ≈ θ,
sin²θ ≈ 0), which is reasonable since the vehicle is
running along the lane. If the relationship among the earth
fixed coordinate system, the world coordinate system, the
camera coordinate system and the image coordinate system
is established, the earth fixed coordinate lane boundary model
(1) can be transformed to the image lane boundary model as
follows (see appendix for details):
    n = K(m − m0)⁻¹ + B(m − m0) + n0,   (2)
where m and n are the row number and column number of
the road image, respectively, m0 = −αv tan φ + u0 is the
row corresponding to the horizon of the ground plane in the
image coordinate system and can be determined by the camera
calibration. The coefficients K,B and n0 are related to the
curvature C0 of the lane, the position d and heading direction
θ of the vehicle relative to the lane boundary, respectively.
They are formulated as follows:
    K = αuαvC0H(2 cos³ φ)⁻¹   (3)
    B = αud cos φ(αvH)⁻¹   (4)
    n0 = −αuθ(cos φ)⁻¹ + v0,   (5)
where d is the horizontal distance from the camera to the lane
boundary, (u0, v0) are the image centre coordinates, (αu, αv)
are the horizontal scale factor and the vertical scale factor of the
camera, respectively. The parameters (u0, v0), (αu, αv),H , φ
and m0 can be determined by intrinsic and extrinsic camera
calibration procedures.
The model introduced above is similar to that introduced
by [17, 22] in form, but the meanings of our model’s
parameters are explicit, as shown by equations (3), (4) and (5).
Assuming that the left lane boundary is the shifted version of
the right lane boundary at a distance along the X-axis in the
ground plane, the left and right lane boundaries have equal
curvature and equal tangential orientation where they intersect
the X-axis; so K and n0 will be equal for the left and right lane
boundaries. As a result, the lane shape in an image can be
defined by the four parameters K,BL,BR and n0.
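As a concrete illustration, the image-plane lane model (2) together with the parameter relations (3)–(5) can be sketched as follows. The numeric camera and road values below are hypothetical placeholders chosen only to exercise the formulas, not values from the paper:

```python
import numpy as np

def lane_column(m, K, B, n0, m0):
    """Column n of a lane boundary at image row m, per model (2):
    n = K/(m - m0) + B*(m - m0) + n0."""
    return K / (m - m0) + B * (m - m0) + n0

def params_from_geometry(C0, d, theta, H, phi, alpha_u, alpha_v, u0, v0):
    """Map road curvature C0, lateral offset d and heading theta to the
    model parameters via equations (3)-(5), plus the horizon row m0."""
    K = alpha_u * alpha_v * C0 * H / (2.0 * np.cos(phi) ** 3)
    B = alpha_u * d * np.cos(phi) / (alpha_v * H)
    n0 = -alpha_u * theta / np.cos(phi) + v0
    m0 = -alpha_v * np.tan(phi) + u0
    return K, B, n0, m0

# Hypothetical camera calibration and road geometry, for illustration only.
K, B, n0, m0 = params_from_geometry(
    C0=1e-4, d=1.8, theta=0.02, H=1.2, phi=0.05,
    alpha_u=800.0, alpha_v=800.0, u0=120.0, v0=160.0)
rows = np.arange(int(m0) + 20, 240)       # image rows below the horizon
cols = lane_column(rows, K, B, n0, m0)    # lane boundary column per row
```

Evaluating the model at a set of rows in this way is all that is needed to draw a candidate boundary when scoring it against the image.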
2.2. The likelihood function
For a given deformable model, evaluating how well it fits
the road image is an important problem. Let x denote the
parameters of the deformable model, z denote the observed
road image. Here, we define a likelihood function p(z|x),
which specifies the probability of seeing the real lanes in the
input road images, when given a lane model with specific
parameters. In this study, the likelihood function measures
the quality or goodness-of-fit of the lane shape given by
x = [K,BL,BR, n0]T with the actual lane present in the
image. The likelihood function we propose here only uses
the edge information in the input image.
In many road images, it is difficult, if not impossible, to
select a suitable threshold that eliminates clutter edges without
eliminating many of the lane edge points of interest. To
preserve the useful information, a better alternative is to use a
whole gradient map. However, the computation cost is high
if all pixels are included. We argue that a very low threshold
value can ensure the existence of true edges corresponding
to the lane markings or road boundaries. So, we calculate
the gradient values of the input image using the 3 × 3 Sobel
operator with a very low threshold. Then, two images can be
obtained: a grey level edge magnitude map fm(u, v), denoting
the gradient magnitude of the input image, and a grey scale
edge direction map fd(u, v), denoting the ratio of vertical
and horizontal gradient magnitudes of the input image. Let
E denote the set of the pixels whose gradient magnitudes
are larger than the specified threshold. Based on the edge
information, the likelihood function is defined by equation
(6):
    p(z|x) ∝ Σ(u,v)∈E [fm(u, v)W(DL(u, v))|cos βL|
             + fm(u, v)W(DR(u, v))|cos βR|],   (6)
where DL(u, v) and DR(u, v) are the minimum distances
from the point (u, v) to the left lane boundary and to the
right lane boundary, respectively; βL and βR are the angles
between the gradient direction of the point (u, v) and the
tangential directions of the left lane boundary and the right lane
boundary at the point that has minimum distance from the point
(u, v), respectively. W(D) is a weighting function, and can
be interpreted as a fuzzy membership function with membership
radius R if W(0) = 1, W(D) decreases monotonically for
0 ≤ D < R, and W(D) = 0 for D ≥ R. W(D) is defined using
equation (7):

    W(D) = exp(−D²/σ²) for 0 ≤ D < R,   W(D) = 0 otherwise.   (7)
The definition of the likelihood function requires that the lane
model agree with the image edges not only in position, but
also in the tangential direction. The contribution made by a
pixel to the likelihood is the gradient magnitude at that pixel,
multiplied by a weighting function whose value decreases as
the pixel gets farther away from the lane boundary defined by
the given parameter x, and a function whose value decreases
as the gradient direction at that pixel moves farther away from
the tangential direction of the lane model. The higher this
likelihood, the better the lane model matches the edges in the
input image.
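The edge-based likelihood (6) with the weighting function (7) can be sketched as follows. This is a simplified illustration, not the authors' implementation: the point-to-boundary distances DL and DR are approximated by the nearest point of a dense sampling of each boundary curve, and the edge set E is passed in as a precomputed array:

```python
import numpy as np

def weight(D, sigma=4.0, R=10.0):
    """Fuzzy membership weight (7): exp(-D^2/sigma^2) for D < R, else 0."""
    return np.where(D < R, np.exp(-D ** 2 / sigma ** 2), 0.0)

def likelihood(edges, boundary_pts_L, boundary_pts_R, sigma=4.0, R=10.0):
    """Un-normalized likelihood (6).
    edges: (N, 4) array of (u, v, gradient magnitude, gradient direction).
    boundary_pts_*: (M, 3) arrays of sampled (u, v, tangent angle) points."""
    total = 0.0
    for u, v, fm, fd in edges:
        for pts in (boundary_pts_L, boundary_pts_R):
            # nearest sampled boundary point approximates the true distance
            dists = np.hypot(pts[:, 0] - u, pts[:, 1] - v)
            i = np.argmin(dists)
            beta = fd - pts[i, 2]      # gradient vs. boundary tangent angle
            total += fm * weight(dists[i], sigma, R) * abs(np.cos(beta))
    return total
```

An edge pixel lying on a boundary with a matching gradient direction contributes its full magnitude; pixels farther than R from both boundaries contribute nothing, which is what makes the score tolerant of clutter edges.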
2.3. MAP estimate
The problem of lane detection in grey level images can
be formulated as an equivalent problem of determining the
MAP hypotheses by using the Bayes theorem to calculate the
posterior probability of each candidate hypothesis. In this
study, finding the MAP hypothesis involves the maximization
of a function over a compact subspace of R4.
A prior probability density function (pdf) p(x) describes
the a priori constraints on the location of lane edges in the
images of the road scenes. A likelihood pdf p(z|x) denotes
the probability of observing the image in which the lane shape
is represented by x = [K,BL,BR, n0]T . Then, the MAP
estimate can be given by
    x∗ = arg max_{x∈R⁴} p(x|z).   (8)
By using the Bayes theorem,
    x∗ = arg max_{x∈R⁴} p(z|x)p(x).   (9)
Real world lanes are never too narrow or wide, so a prior
probability density function (pdf) p(x) is constructed over the
lane model parameters set to reflect a priori knowledge as
follows:
    p(x) ∝ exp(−(BR − BL − µR)²/σR²).   (10)
So

    x∗ = arg max_{x∈R⁴} exp(−(BR − BL − µR)²/σR²)
         × Σ(u,v)∈E [fm(u, v)W(DL(u, v))|cos βL|
         + fm(u, v)W(DR(u, v))|cos βR|].   (11)
The maximization, however, cannot be achieved by a
simple hill-climbing approach, due to the non-concavity of
the function involved. The simulated annealing algorithm or
Metropolis algorithm can be used to find global extremes.
But they suffer from slow convergence and low efficiency, and
the probability of reaching the global optimum is small [25].
Tabu (or Taboo) search is different from
the simulated annealing algorithm in that it has an explicit
memory component. At each iteration the neighbourhood of
the current solution is partially explored, and a move is made
to the best non-Tabu solution in that neighbourhood [26]. So
Tabu search can efficiently escape from the local maxima.
To obtain the global maximum, we use the Tabu search. In
the next section, the Tabu search based optimization of the
function involved in the MAP estimate is presented.
2.4. Tabu search based MAP estimate
Tabu search, first presented by Glover [27], is a stochastic
global optimization procedure which proves efficient for
solving various combinatorial optimization problems and
the optimization of multi-modal functions with continuous
variables. We refer the reader to [28, 29] for details. The Tabu
search algorithm can be summarized as follows: start with a
current solution, called a configuration, and evaluate the
criterion function for that solution. Generate a set of feasible
solutions in the neighbourhood of the current solution. If the
best of these solutions is not in the Tabu list, it becomes the
new current solution and is added to the Tabu list; the process
is repeated until a stopping criterion is met.
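Under the assumptions above, a basic Tabu search over the four model parameters can be sketched as follows. The neighbourhood generation, step sizes, Tabu-list policy, and the smooth toy stand-in for the image term of (11) are illustrative choices, not the authors' settings:

```python
import numpy as np

def log_prior(BL, BR, mu_R=0.5, sigma_R=0.1):
    """Log of the lane-width prior (10), up to a constant."""
    return -((BR - BL - mu_R) ** 2) / sigma_R ** 2

def tabu_search(objective, x0, steps, n_iter=200, n_neighbors=20,
                tabu_len=30, rng=None):
    """Maximize `objective` over R^4 by Tabu search: explore a random
    neighbourhood of the current solution each iteration and move to the
    best neighbour not on the Tabu list, even if it is worse."""
    rng = rng or np.random.default_rng(0)
    x = np.asarray(x0, float)
    best_x, best_f = x.copy(), objective(x)
    tabu = []                                   # recently visited solutions
    for _ in range(n_iter):
        cands = x + rng.normal(0.0, steps, size=(n_neighbors, x.size))
        # discard candidates too close to any Tabu entry
        ok = [c for c in cands
              if all(np.linalg.norm(c - t) > 1e-3 for t in tabu)]
        if not ok:
            continue
        x = max(ok, key=objective)              # best non-Tabu neighbour
        tabu.append(x.copy())
        tabu = tabu[-tabu_len:]                 # bounded explicit memory
        f = objective(x)
        if f > best_f:
            best_x, best_f = x.copy(), f
    return best_x, best_f

# Toy posterior: the prior (10) plus a smooth stand-in for the image term.
def posterior(x):
    K, BL, BR, n0 = x
    return log_prior(BL, BR) - (K - 1.0) ** 2 - (n0 - 2.0) ** 2

x_star, f_star = tabu_search(posterior, x0=[0, 0, 0, 0],
                             steps=[0.2, 0.1, 0.1, 0.2])
```

Because the search always moves to the best non-Tabu neighbour rather than only accepting improvements, it can walk out of a local maximum, while `best_x` retains the best solution seen overall.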