Real Time Facial Feature Points Tracking with Pyramidal Lucas-Kanade Algorithm

1. Introduction
Facial expression tracking is a fundamental problem in computer vision due to its important role in a variety of applications, including facial expression recognition, classification, and detection of emotional states, among others Huang et al. (2004). Research on face tracking has intensified because of its wide range of applications in psychological facial expression analysis and human-computer interaction. Recent advances in face video processing and compression have made face-to-face communication practical in real-world applications. However, higher bandwidth is still in high demand as communication becomes increasingly intensive, and after decades of work, robust and realistic real-time face tracking still poses a big challenge. The difficulty lies in a number of issues, including real-time facial feature tracking under a variety of imaging conditions (e.g., skin color, pose change, self-occlusion and multiple non-rigid feature deformations) Kim et al. (2007).
Our study aims to develop an automatic facial expression recognition system. This system analyzes the movement of the eyebrows, lips and eyes in video sequences to determine whether a person is happy, sad, disgusted or afraid.
In this paper, we concentrate on facial feature tracking. Our real-time facial feature tracking system is outlined in Figure 1; it consists of two main modules:
1. Extraction of the facial features, using a geometrical model and gradient projection Abdat et al. (2008).
2. Facial feature point tracking with optical flow (the pyramidal Lucas-Kanade algorithm) Bouguet (2000).
The organization of this paper is as follows: in section 2, we present a face detection algorithm based on Haar-like features. Facial feature point extraction with a geometrical model and gradient projection is described in section 3. The tracking of facial feature points with the pyramidal Lucas-Kanade algorithm is presented in section 4. Finally, concluding remarks are given in section 5.
2. Face detection
Face detection is the first step in our facial expression recognition system; it consists in delimiting the face area with a rectangle. For this, we have used a modified Viola & Jones face detector based on Haar-like features Viola & Jones (2001).
In the resulting cascade, simpler classifiers are applied first and are intended to reject the majority of sub-windows before more complex classifiers are called Viola & Jones (2001).
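As an illustration, a minimal sketch of this detection step using OpenCV's pretrained Haar cascade (the OpenCV API and the cascade file name are our assumptions; the chapter's modified detector is not reproduced here):

```python
import cv2

# Load a pretrained frontal-face Haar cascade shipped with OpenCV.
# The file name is an assumption; any frontal-face cascade works.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_face(frame):
    """Return the bounding rectangle (x, y, w, h) of the largest detected face,
    or None if no face is found."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    gray = cv2.equalizeHist(gray)  # reduce sensitivity to lighting
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    # Keep the largest detection, assumed to be the subject's face.
    return max(faces, key=lambda r: r[2] * r[3])
```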
2. Then, we find the maximum value of the projection curve, which corresponds to the line containing the eyes. This line contains many transitions: skin to sclera, sclera to iris, iris to pupil, and the same on the other side (high gradient).
The median axis is a vertical line which divides the frontal face into two equal sides; in other words, it is the line passing through the nose. To determine the median axis, we take the median of the bounding box of the face.
3. The mouth axis is located in the same way as the eye axis: we look for the maximum value of the projection curve in the lower part of the bounding box, below the eye axis (a sketch of this projection step is given after this list).
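A minimal sketch of the projection-based axis location described in steps 2 and 3 above, assuming a grayscale face crop, a Sobel-based horizontal gradient, and that the eyes lie in the upper half of the crop (all three are our assumptions; the chapter's exact gradient operator may differ):

```python
import cv2
import numpy as np

def locate_axes(face):
    """Locate the eye row, the mouth row and the median column of a grayscale
    face crop by projecting the horizontal gradient onto the vertical axis."""
    # Eyes and mouth produce many horizontal dark/light transitions,
    # so rows crossing them accumulate a large gradient sum.
    grad = np.abs(cv2.Sobel(face, cv2.CV_32F, 1, 0, ksize=3))
    proj = grad.sum(axis=1)                             # one value per image row
    h = face.shape[0]
    eye_row = int(np.argmax(proj[: h // 2]))            # maximum in the upper part
    mouth_row = h // 2 + int(np.argmax(proj[h // 2:]))  # maximum below the eyes
    median_col = face.shape[1] // 2                     # vertical axis through the nose
    return eye_row, mouth_row, median_col
```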
Once the eye and mouth axes are located, we use the geometric face model of Shih & Chuang (2004), which supposes that the vertical distance between the two eyes and the center of the nostrils is 0.6D, where D denotes the distance between the two eyes.
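For instance, if the eye axis is detected at row y_e and D = 100 pixels, the model places the nostril line near row y_e + 0.6 × 100 = y_e + 60.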
Fig. 8. Uniform distribution selection from the bounding box of the facial features.
However, equation (8) alone cannot determine the optical flow in a unique way. This indetermination of the optical flow is due to the absence of a global constraint in the preceding equations: only gradients, which are local measures, are taken into account. Lucas and Kanade added new constraints to ensure the uniqueness of the solution. Their method consists in applying a least-squares calculation over a predefined neighborhood of each point, optimizing so as to give a solution of the following system for n points:
\[
\begin{bmatrix}
I_x(q_1) & I_y(q_1) \\
\vdots & \vdots \\
I_x(q_n) & I_y(q_n)
\end{bmatrix}
\begin{bmatrix} V_x \\ V_y \end{bmatrix}
=
-\begin{bmatrix} I_t(q_1) \\ \vdots \\ I_t(q_n) \end{bmatrix}
\tag{9}
\]

where q_1, ..., q_n are the pixels of the considered neighborhood, I_x, I_y and I_t are the spatial and temporal derivatives of the image intensity, and (V_x, V_y) is the optical flow vector.
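Writing this system as $A\mathbf{v} = \mathbf{b}$, the least-squares solution follows from the normal equations (a standard result, stated here to make the minimization explicit):

\[
\mathbf{v} = (A^T A)^{-1} A^T \mathbf{b}, \qquad
A^T A = \begin{bmatrix}
\sum_i I_x^2(q_i) & \sum_i I_x(q_i)\, I_y(q_i) \\
\sum_i I_x(q_i)\, I_y(q_i) & \sum_i I_y^2(q_i)
\end{bmatrix},
\]

which is well defined whenever $A^T A$ is invertible, i.e. when the neighborhood exhibits gradient variation in two directions; this is precisely the property tested by the Shi and Tomasi criterion of section 4.2.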
4.1 Discussion
After feature point extraction using the uniform distribution, we used the pyramidal Lucas-Kanade algorithm to track those points, as shown in Figure 9. This algorithm is computationally inexpensive, so it is well suited to real-time applications. A motion caused by a real moving face should be highly correlated in the space and time domains; in other words, a moving face in a video sequence should be seen as the conjunction of several smoothed and coherent observations over time.
Tracking a set of interest points is based on techniques for estimating the movement between two consecutive images. To obtain reliable tracking, it is important that these points be discriminative in the image. For example, a point in the middle of a uniform region of the image cannot be identified precisely, because all the neighboring pixels are similar. Hence, an interest point is normally a point whose position in the image exhibits strong bidirectional changes. Point tracking consists in identifying a set of N interest points in order to model the region of interest, and then computing the location of each point from optical flow calculations.
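As an illustration, a minimal sketch of this tracking step using OpenCV's pyramidal Lucas-Kanade implementation (the window size, pyramid depth and termination criteria are illustrative assumptions, not the chapter's settings):

```python
import cv2

# Tracker parameters; these values are illustrative.
lk_params = dict(winSize=(15, 15), maxLevel=3,
                 criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT,
                           20, 0.03))

def track_points(prev_gray, gray, points):
    """Track points (an N x 1 x 2 float32 array) from prev_gray to gray with
    pyramidal Lucas-Kanade; return the surviving points and a success mask."""
    new_points, status, _err = cv2.calcOpticalFlowPyrLK(
        prev_gray, gray, points, None, **lk_params)
    ok = status.ravel() == 1
    return new_points[ok], ok
```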
Figure 9 shows an example of point tracking, where the points are selected using the uniform distribution. It can be seen that, from the second image onwards, the points begin to disperse in an arbitrary manner, diverging from their correct positions.
With the uniform distribution, we obtained poor results because these points do not have a strong bidirectional variation. To solve this problem, we search for strong points in the image; for this purpose, we use the good features to track of Shi & Tomasi (1994).
4.2 Good features to track of Shi and Tomasi:
In order to compare with the results obtained using the uniform distribution, we used the method of Shi and Tomasi for interest point extraction. This method is based on the general assumption that the luminance intensity of a point does not change between image acquisitions.
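A minimal sketch of this extraction step, using OpenCV's implementation of the Shi and Tomasi detector (the quality and distance thresholds are our assumptions):

```python
import cv2

def detect_strong_points(gray, max_points=100):
    """Shi-Tomasi detection: keep points where the smaller eigenvalue of the
    local gradient matrix is large, i.e. points with strong bidirectional
    intensity variation. Thresholds are illustrative."""
    return cv2.goodFeaturesToTrack(gray, maxCorners=max_points,
                                   qualityLevel=0.01, minDistance=7)
```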
4.3 Detection of facial feature points using the Shi and Tomasi method:
Figure 10 shows the results obtained for feature point detection with the method of Shi and Tomasi (video sequence, real-time acquisition) applied to the whole image. We can see good tracking of these points in the remainder of the sequence, unlike with the first method (uniform distribution), which proves that the pyramidal Lucas-Kanade feature tracker needs strong points to track.
Fig. 10. Extraction of feature points in the first frame, and feature point tracking using the pyramidal Lucas-Kanade feature tracker in the remainder of the sequence.
The method of Shi and Tomasi ensures good detection of points that have a strong gradient, and this good detection leads in turn to good tracking of these points.
4.4 Detection of facial feature points in the bounding box:
In the previous section, we presented the detection of interest points over the whole face; however, we only need the points which surround facial features such as the eyes, eyebrows and mouth. For this reason, we reject all the pixels outside the bounding boxes. Figure 11 shows the region of interest, which is used for the detection of points with the Shi and Tomasi method.
Fig. 11. The region of interest for feature point extraction in the first frame.
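A minimal sketch of this restriction, using a binary mask to reject pixels outside the feature bounding boxes (the helper and its (x, y, w, h) box format are assumptions for illustration):

```python
import cv2
import numpy as np

def detect_in_boxes(gray, boxes, max_points=100):
    """Detect Shi-Tomasi points only inside the given (x, y, w, h) boxes
    surrounding the eyes, eyebrows and mouth; pixels outside are rejected."""
    mask = np.zeros_like(gray)            # 0 everywhere: detection forbidden
    for (x, y, w, h) in boxes:
        mask[y:y + h, x:x + w] = 255      # 255 inside each box: detection allowed
    return cv2.goodFeaturesToTrack(gray, maxCorners=max_points,
                                   qualityLevel=0.01, minDistance=7, mask=mask)
```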
Figure 12 shows an example of point tracking in the bounding boxes, and the tracking performs very well. The first image presents the detection of points, using the Shi and Tomasi method, in the bounding boxes which delimit the facial features; these detected points correspond to pixels with a strong gradient. The following images present the 1st, 2nd, 22nd and 46th frames of the first video sequence, and the 1st, 2nd, 51st and 67th frames of the second sequence.
Fig. 12. Extraction of feature points in the bounding box for the first frame, and feature point tracking using pyramidal Lucas-Kanade in the remainder of the sequence.
Our system is implemented in VC.NET on a Pentium IV at 2 GHz under Windows XP. Table 1 presents the elapsed time for each step of our system. The frame size is 576 × 720 and the video sequence format is AVI I420.
For the first frame, the elapsed time is 0.281 s for face rectangle localization and 1.40 s for feature point detection. For the remainder of the sequence, the detected points are tracked in 0.031 s per frame.
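For completeness, a sketch of how such a detect-then-track loop could be assembled from the earlier sketches (detect_face, detect_strong_points and track_points refer to the hypothetical helpers above; webcam capture is an assumption):

```python
import cv2

cap = cv2.VideoCapture(0)                  # webcam capture is an assumption
prev_gray, points = None, None
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    if points is None or len(points) == 0:
        # First frame (or all points lost): detect the face, then strong points.
        face = detect_face(frame)
        if face is not None:
            x, y, w, h = face
            points = detect_strong_points(gray[y:y + h, x:x + w])
            if points is not None:
                points += (x, y)           # back to full-image coordinates
    else:
        # Remaining frames: track the existing points with pyramidal LK.
        points, _ = track_points(prev_gray, gray, points)
    prev_gray = gray
cap.release()
```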
6. References
Abdat, F., Maaoui, C. & Pruski, A. (2008). Real time facial feature points tracking with the pyramidal Lucas-Kanade algorithm, IEEE RO-MAN 2008, The 17th International Symposium on Robot and Human Interactive Communication, Germany.
Bouguet, J. (2000). Pyramidal implementation of the Lucas-Kanade feature tracker, Intel Corporation, Microprocessor Research Labs.
Huang, X., Zhang, S., Wang, Y., Metaxas, D. & Samaras, D. (2004). A hierarchical framework for high resolution facial expression tracking, 3rd IEEE Workshop on Articulated and Non-Rigid Motion (ANM 2004).
Kim, K.-S., Jang, D.-S. & Choi, H.-I. (2007). Real time face tracking with pyramidal Lucas-Kanade feature tracker, Computational Science and Its Applications (ICCSA 2007), LNCS 4705: 1074-1082.
Viola, P. & Jones, M. (2001). Rapid object detection using a boosted cascade of simple features, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2001).
Belaroussi, R. & Milgram, M. (2006). Face tracking and facial features detection with a webcam, CVMP 2006.
Shi, J. & Tomasi, C. (1994). Good features to track, IEEE Conference on Computer Vision and Pattern Recognition (CVPR'94), Seattle.
Shih, F. & Chuang, C. (2004). Automatic extraction of head and face boundaries and facial features, Information Sciences 158: 117-130.
Su, M. & Hsieh, Y. (2007). A simple approach to facial expression recognition, Proceedings of WSEAS 2007, Australia.
Viola, P. & Jones, M. (2001). Robust real-time object detection, 2nd International Workshop on Statistical and Computational Theories of Vision: Modeling, Learning, Computing, and Sampling, Vancouver, Canada.