A Survey On Object Detection and Tracking Algorithms (2013)
Thesis submitted in
June 2013
to the department of
Computer Science and Engineering
of
National Institute of Technology Rourkela
in partial fulfillment of the requirements
for the degree of
Master of Technology
by
Rupesh Kumar Rout
(Roll 211CS1049)
I am grateful to numerous local and global peers who have contributed towards
shaping this thesis. I am very much obliged to Prof. B. Majhi for his guidance,
advice, and support during my thesis work. I am very much indebted to Prof. Ashok
Kumar Turuk, Head-CSE, for his continuous encouragement and support. He is
always ready to help with a smile. I am also thankful to all the professors of the
department for their support. I am really thankful to all my friends. My sincere
thanks to everyone who has provided me with kind words, a welcome ear, new ideas,
useful criticism, or their invaluable time; I am truly indebted. I must acknowledge
the academic resources that I have received from NIT Rourkela. I would like to thank
the administrative and technical staff members of the Department, who have been kind
enough to advise and help in their respective roles. Last, but not the least, I would
like to dedicate this thesis to my family, for their love, patience, and understanding.
Certificate ii
Acknowledgement iii
Abstract iv
1 Introduction 1
1.1 Object Detection and Tracking . . . . . . . . . . . . . . . . . . . . . 1
1.2 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.4 Objective . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.5 Thesis Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2 Background 9
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2 Object Representation . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.3 Object Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.4 Object Tracking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.4.1 Feature Selection for Tracking . . . . . . . . . . . . . . . . . . 17
2.4.2 Single Camera Object Tracking: . . . . . . . . . . . . . . . . . 18
2.4.3 Segmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.4.4 Motion Segmentation . . . . . . . . . . . . . . . . . . . . . . . 19
2.4.5 Segmentation Methods . . . . . . . . . . . . . . . . . . . . . . 19
2.4.6 Foreground Segmentation . . . . . . . . . . . . . . . . . . . . 20
2.4.7 Moving object Detection . . . . . . . . . . . . . . . . . . . . . 20
2.4.8 Object Detection . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.4.9 Tracking Methods: . . . . . . . . . . . . . . . . . . . . . . . . 26
2.4.10 Prediction Methods: . . . . . . . . . . . . . . . . . . . . . . . 26
2.5 Object Tracking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.5.1 Block Matching Method . . . . . . . . . . . . . . . . . . . . . 27
2.5.2 Tracking Method . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.5.3 Feature Extraction . . . . . . . . . . . . . . . . . . . . . . . . 29
2.6 Kalman Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.6.1 General Application . . . . . . . . . . . . . . . . . . . . . . . 31
2.6.2 State Vector . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.6.3 The Discrete Kalman Filter . . . . . . . . . . . . . . . . . . . 32
2.6.4 Discrete Kalman Filter Algorithm . . . . . . . . . . . . . . . . 35
2.6.5 Algorithm Discussion . . . . . . . . . . . . . . . . . . . . . . . 36
2.6.6 Filter Parameter And Tuning . . . . . . . . . . . . . . . . . . 37
2.7 Motion model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3 Literature Survey 40
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.2 Literature Survey . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.3 Simulative Result of Detection and Tracking Algorithm . . . . . . . . 44
3.4 Motion Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.5 Frame difference method . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.6 Background subtraction . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.7 Simple Background Subtraction . . . . . . . . . . . . . . . . . . . . . 48
3.8 Running Average . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.9 Morphological Operation . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.10 Gaussian Mixture Model . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.11 W4 background subtraction . . . . . . . . . . . . . . . . . . . . . . 53
3.12 Background Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.12.1 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
3.13 Object Tracking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.13.1 Block Matching Method . . . . . . . . . . . . . . . . . . . . . 57
3.13.2 Tracking Method . . . . . . . . . . . . . . . . . . . . . . . . . 58
3.13.3 Feature Extraction . . . . . . . . . . . . . . . . . . . . . . . . 58
3.14 Motion-Based Multiple Object Tracking . . . . . . . . . . . . . . . . 59
3.15 Simulator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
3.16 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
4 Conclusions 63
4.1 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
Bibliography 64
List of Figures
Chapter 1
Introduction
Detection and tracking of a variable number of objects are crucial tasks for a wide
range of home, business, and industrial applications such as security, surveillance,
management of access points, urban planning, and traffic control. However, these
applications do not yet play an important part in consumer electronics. The main
reason is that they impose strong requirements to achieve satisfactory working
conditions: specialized and expensive hardware, complex installation and setup
procedures, and supervision by qualified workers. Some works have focused on
developing automatic detection and tracking algorithms that minimize the need for
supervision. They typically use a moving object likelihood function that evaluates
each hypothetical object configuration against the set of available detections without
explicitly computing their data association; thus, a considerable saving in
computational cost is achieved. In addition, the likelihood function has been designed
to account for noisy, false, and missing detections.
The field of machine (computer) vision is concerned with problems that involve
interfacing computers with their surrounding environment. One such problem,
surveillance, has the objective of monitoring a given environment and reporting
information about observed activity that is of significant interest. In this respect,
video surveillance usually utilizes electro-optical sensors (video cameras) to collect
information from the environment. In a typical surveillance system, these video
cameras are mounted in fixed positions or on pan-tilt devices and transmit video
streams to a certain location, called the monitoring room. The received video streams
are then monitored on displays and traced by human operators. However, the human
operators might face many issues while monitoring these sensors. One problem is
that the operator must navigate through the cameras as a suspicious object moves
between their limited fields of view, without missing any other object in the
meantime. Thus, monitoring becomes more and more challenging as the number of
sensors in such a surveillance network increases. Therefore, surveillance systems
must be automated to improve performance and eliminate such operator errors.
Ideally, an automated surveillance system should only require the objectives
1.2 Overview
For moving object detection, various background subtraction techniques available
in the literature were simulated. Background subtraction takes the absolute
difference between the current image and a reference background that is updated
over time. A good background subtraction technique should be able to overcome the
problems of varying illumination conditions, background clutter, shadows, camouflage,
and bootstrapping, while at the same time performing motion segmentation of
foreground objects in real time. It is hard to solve all of these problems in one
background subtraction technique, so the idea was to simulate the techniques and
evaluate their performance on various video data taken in complex situations.
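The absolute-difference scheme described above can be sketched as follows; the grayscale numpy representation and the threshold value of 30 are illustrative assumptions, not parameters taken from the thesis.

```python
import numpy as np

def subtract_background(frame, background, threshold=30):
    """Mark pixels as foreground (1) when the absolute difference
    between the current frame and the background model exceeds
    the threshold; everything else stays background (0)."""
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return (diff > threshold).astype(np.uint8)

# Toy 4x4 grayscale scene: a bright object enters the lower-right corner.
background = np.full((4, 4), 50, dtype=np.uint8)
frame = background.copy()
frame[2:, 2:] = 200
mask = subtract_background(frame, background)
print(int(mask.sum()))  # 4 foreground pixels
```

Updating `background` over time (for example, by running-average blending) is what distinguishes the adaptive schemes surveyed later from this static sketch.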
Object tracking is a very challenging task in the presence of variability in
illumination conditions, background motion, complex object shapes, and partial or
full object occlusions. In this thesis, modifications are made to overcome the
problems of illumination variation and background clutter, such as fake motion due
to the leaves of trees, flowing water, or a flag waving in the wind. Sometimes object
tracking involves tracking a single object of interest; this is done using the
normalized correlation coefficient and template updating.
This thesis also develops a framework to detect moving objects and generate reliable
tracks from surveillance video. After setting up a basic system that can serve as a
platform for further automatic tracking research, the question of variation in
distances between the camera and the objects in different parts of the scene (object
depth) in surveillance videos is tackled. A feedback-based solution that automatically
learns the distance variation in static-camera video scenes is implemented, based on
object motion in different parts of the scene. The focus is on the investigation of
detection and tracking of objects in video surveillance. Surveillance is the process
of monitoring behavior, activities, or other changing information, usually of people,
for the purpose of influencing, managing, directing, or protecting. Most surveillance
systems include a static camera and a fixed background, which
gives a clue for object detection in videos via the background subtraction technique.
A surveillance system involves three main steps: object detection, object tracking,
and recognition. Video processing poses several challenges: video analysis, video
segmentation, video compression, and video indexing. In video analysis there are
three key steps: detection of interesting moving objects, tracking of such objects
from frame to frame, and analysis of object tracks to recognize their behavior.
Video segmentation means separation of objects from the background; it also consists
of three important steps: object detection, object tracking, and object recognition.
This work focuses on the video analysis and video segmentation aspects.
1.3 Motivation
After studying the literature, it is found that detecting an object in a video
sequence and then tracking it is a really challenging task. Object tracking can
be a time-consuming process due to the amount of data contained in video.
From the literature survey it is found that many background subtraction
algorithms exist which work efficiently in both indoor and outdoor surveillance
systems. Julio et al. [3] proposed a background modeling technique and used another
algorithm to detect shadowed regions. However, the shadow removal technique is an
overhead for the object tracking algorithm. It would be better if the shadow could be
removed at the time of foreground object detection by designing an efficient
algorithm that properly classifies foreground objects and background, removing false
foreground pixels from the detection; then no extra computation would be needed for
shadow detection and removal.
Video surveillance of humans and vehicles is among the most active research topics
in computer vision. Here the aim is to develop an intelligent visual surveillance
system by replacing the age-old method of monitoring by human operators. The
motivation is to design a video surveillance system for motion detection and object
tracking.
The area of automated surveillance systems is currently of immense interest due
to its implications in the field of security. Surveillance of vehicular traffic and human
activities offers a context for the extraction of significant information such as scene
motion and traffic statistics, object classification, human identification, anomaly
detection, as well as the analysis of interactions between vehicles, between humans
or between vehicles and humans. A wide range of research possibilities is open in
relation to video surveillance and tracking.
1.4 Objective
This thesis aims to improve the performance of object detection and tracking
by contributing originally to two components (a) motion segmentation (b) object
tracking.
Automatic tracking of objects can be the foundation for many interesting
applications. An accurate and efficient tracking capability at the heart of such a
system is essential for building higher level vision-based intelligence. Tracking is not
a trivial task given the non-deterministic nature of the subjects, their motion, and
the image capture process itself. The objective of video tracking is to associate target
objects in consecutive video frames. The association can be especially difficult when
the objects are moving fast relative to the frame rate.
From the previous sections it is found that there are many problems in detecting
objects, tracking them, and recognizing them in a fixed-camera network.
The goal of the work in this thesis is twofold:
To analyze tracking methods for tracking single objects and multiple
objects.
Chapter 2: This chapter discusses the background concepts related to this
project work, including object segmentation in image sequences, background
modeling, and tracking approaches. The architecture and block diagram of the
tracking flow system are also explained in this chapter.
Chapter 3: The literature survey carried out during the research work is
discussed here. It provides a detailed survey of the literature related to motion
detection and object tracking. Existing and some new methods for detection and
tracking of objects are discussed and examined. This chapter also presents the
methodology and implementation of some existing methods, with experimental
results.
Chapter 4: This chapter provides concluding comments that can be made about
the project.
Chapter 2
Background
2.1 Introduction
Object tracking is an important job within the field of computer vision. Object
detection involves locating objects in the frames of a video sequence, while tracking
is the process of locating a moving object (or multiple objects) over time using
a camera. Technically, tracking is the problem of estimating the trajectory or path
of an object in the image plane as it moves around a scene. The availability of
high-powered computers and of high-quality, inexpensive video cameras, together
with the increasing need for automated video analysis, has generated a great deal of
interest in object tracking algorithms. There are three key steps in video analysis:
detection of interesting moving objects, tracking of such objects from frame to
frame, and analysis of object tracks to recognize their behavior.
So the question arises: where is object tracking suitable to apply? Mainly, the
use of object tracking is pertinent in tasks such as:
Motion-based recognition
Automated surveillance
Video indexing
Human-computer interaction
Traffic monitoring
Vehicle navigation
Noise in images,
Numerous approaches for object tracking have been proposed. These differ from each
other based on the way they approach the following questions:
– How should the motion, appearance, and shape of the object be modeled?
Figure: Object representations. (a) Centroid, (b) multiple points, (c) rectangular
patch, (d) elliptical patch, (e) part-based multiple patches, (f) object skeleton,
(g) complete object contour, (h) control points on object contour, (i) object
silhouette.
Similarly, there are various ways to represent the appearance features of objects.
It should be noted that shape representations can be combined with appearance
representations for tracking. Some common appearance representations in the case
of object tracking are described in [1].
Mostly, features are chosen manually by the user depending on the application.
The problem of automatic feature selection has received significant attention in the
pattern recognition community. Automatic feature selection methods can be divided
into filter methods and wrapper methods. Filter methods select features based on
general criteria, whereas wrapper methods select features based on their usefulness
in a specific problem domain.
So far, various concepts of object tracking have been discussed. It is necessary
to know how tracking occurs in front of a single fixed camera: proper tracking with
a single camera makes it easier to track across multiple cameras. Whenever tracking
is done using a single camera, various challenges need to be taken care of.
2.4.3 Segmentation
The first job in any surveillance application is to distinguish the target objects in
the video frame. Most pixels in the frame belong to the background and static
regions, and suitable algorithms are needed to detect individual targets in the
scene. Since motion is the key indicator of target presence in surveillance videos,
motion-based segmentation schemes are widely used. An effective and simple method
for approximating the background, enabling the detection of individual moving
objects in video frames, is utilized here.
These different segmentation techniques will be discussed in detail in Chapter 4.
There are numerous proposals for the solution of the moving object detection
problem in the surveillance system. Although there are numerous proposals for the
solution of the moving object detection problem in surveillance systems, some of
these methods have been found more promising by researchers in the field. These
methods include:
Foreground segmentation is the process of dividing a scene into two classes:
foreground and background. The background comprises regions such as roads,
buildings, and furniture. While the background is fixed, its appearance can be
expected to change over time due to factors such as changing weather or lighting
conditions. The foreground is any element of the scene that is moving or expected
to move; some foreground elements may actually be stationary for long periods of
time, such as parked cars, which may be stationary for hours at a time. It is also
possible that some elements of the background may actually move, such as trees
moving in a breeze.
The main approaches to locating foreground objects within the surveillance
system are
scene. These foreground regions have a significant role in subsequent actions, such
as tracking and event detection.
The objective of video tracking is to associate target objects in consecutive video
frames. The association can be especially difficult when the objects are moving fast
relative to the frame rate. Another situation that increases the complexity of the
problem is when tracked object changes orientation over time. For these situations
video tracking systems usually employ a motion model which describes how the
image of the target might change for the different possible motion of the objects.
Numerous approaches for object tracking have been proposed. These primarily differ
from each other based on the way they approach the following questions: Which
object representation is suitable for tracking? Which image features should be used?
How should the motion, appearance and shape of the object be modeled?
The answers to these questions depend on the context or environment in which
the tracking is performed and the end use for which the tracking information is being
sought.
Moving object segmentation is based on a comparison between the input frame and
a certain background model; regions that differ between the input and the model are
labeled as foreground. This comparison can be simple frame differencing if the
background is static (has no moving parts and is easier to model). However, more
complex comparison methods are required to segment foreground regions when
background scenes have dynamic parts, such as moving tree branches and bushes.
In the literature there are various algorithms that can cope with these situations;
they are discussed in the following sections [2].
Nearly every tracking system starts with motion detection. Motion detection
aims at separating the moving object regions from the background image. The first
step in motion detection is capturing image information using a video camera. The
motion detection stage includes some image preprocessing
steps such as gray-scaling and smoothing, reducing image resolution using a
low-resolution image technique, frame differencing, morphological operations, and
labeling. The preprocessing steps are applied to reduce image noise in order to
achieve a higher tracking accuracy. Smoothing is performed using a median filter.
The low-resolution image is computed over three successive frames to remove small
or fake motion in the background. Frame differencing is then performed on those
frames to detect the moving objects emerging in the scene. The next step applies
morphological operations such as dilation and erosion as filtering to reduce the
noise that remains in the moving objects. Connected component labeling is then
performed to give each moving object a different label.
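The labeling step above can be illustrated with a pure-Python sketch of 8-connected component labeling; the function name and the breadth-first-search approach are illustrative choices, not the thesis's implementation.

```python
import numpy as np
from collections import deque

def label_components(mask):
    """Assign a distinct positive label to each 8-connected
    foreground region of a binary mask (BFS flood fill)."""
    labels = np.zeros_like(mask, dtype=np.int32)
    current = 0
    h, w = mask.shape
    for i in range(h):
        for j in range(w):
            if mask[i, j] and labels[i, j] == 0:
                current += 1                 # start a new region
                labels[i, j] = current
                queue = deque([(i, j)])
                while queue:
                    y, x = queue.popleft()
                    for dy in (-1, 0, 1):    # visit the 8 neighbors
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and mask[ny, nx] and labels[ny, nx] == 0):
                                labels[ny, nx] = current
                                queue.append((ny, nx))
    return labels, current

mask = np.array([[1, 1, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 1, 1],
                 [0, 0, 1, 0]], dtype=np.uint8)
labels, n = label_components(mask)
print(n)  # 2 separate moving regions
```

Each labeled region can then be handed to the tracking stage as one candidate moving object.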
The second stage is tracking the moving object. In this stage, a block matching
technique is used to track only the interest moving object among the moving objects
emerging in the background. The blocks are defined by dividing the image frame into
non-overlapping square parts, and are built on a PISC (peripheral increment sign
correlation) image that considers the brightness change of all pixels in a block
relative to the considered pixel.
The last stage is object identification. For this purpose, spatial and color
information of the tracked object is used as the image feature. A feature queue
is created to save the features of the moving objects. When new objects appear
in the scene, they are tracked and labeled, and their features are extracted and
recorded in the queue. Once a moving object is detected, the system extracts its
features and identifies it against the objects already in the queue. A few details
of each stage are described as follows.
extraction of the foreground objects, making moving object detection a crucial part
of the system. In order to decide whether some regions in a frame are foreground
or not, there should be a model of the background intensities. Any change caused
by a new object should be detected by this model, whereas non-stationary
background regions, such as branches and leaves of a tree or a flag waving in the
wind, should be identified as part of the background. A method was proposed to
handle these problems.
The object detection method consists of two main steps. The first step is a
preprocessing step including gray-scaling, smoothing, reducing the image resolution,
and so on. The second step is filtering to remove the image noise contained in the
object. The filtering is performed by applying morphological filters such as dilation
and erosion. Finally, connected component labeling is performed on the filtered
image.
Pre-processing: In the preprocessing phase, the first step of the moving object
detection process is capturing image information using a video camera. In order
to reduce the processing time, a grayscale image is used for the entire process
instead of the color image; a grayscale image has only one 8-bit channel, while an
RGB image has three color channels. Image smoothing is performed to reduce noise
in the input image in order to achieve high accuracy in detecting the moving
objects. The smoothing is performed using a median filter of m × m pixels. Here,
non-stationary background elements, such as the branches and leaves of a tree, are
considered part of the background. Such a background often
appears as fake motion, distinct from the motion of the object of interest, and can
cause detection failures. To handle this problem, the resolution of the image is
reduced to a low-resolution image. This is done by reducing the spatial resolution
of the image while keeping the image size. The low-resolution image reduces
scattering noise and the small fake motion caused by non-stationary background
such as the leaves of a tree; noise with a small motion region disappears in the
low-resolution image.
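The smoothing and resolution-reduction steps can be sketched as follows; the 3 × 3 window, the reduction factor, and the block-averaging scheme are assumptions for illustration, since the thesis does not fix these parameters here.

```python
import numpy as np

def median_smooth(image, m=3):
    """Smooth with an m x m median filter; border pixels are left
    unchanged in this simple sketch."""
    out = image.copy()
    r = m // 2
    h, w = image.shape
    for i in range(r, h - r):
        for j in range(r, w - r):
            out[i, j] = np.median(image[i - r:i + r + 1, j - r:j + r + 1])
    return out

def lower_resolution(image, factor=2):
    """Reduce spatial resolution while keeping the image size:
    average each factor x factor block, then replicate it back.
    Sizes not divisible by the factor are cropped first."""
    h, w = image.shape
    small = image[:h - h % factor, :w - w % factor].reshape(
        h // factor, factor, w // factor, factor).mean(axis=(1, 3))
    return np.kron(small, np.ones((factor, factor)))

# A single salt-noise pixel is removed by the median filter.
img = np.zeros((5, 5), dtype=np.uint8)
img[2, 2] = 255
print(median_smooth(img)[2, 2])  # 0
```

Small, isolated motion (like the noise pixel above) is exactly what the low-resolution step suppresses in the real pipeline.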
Next comes the filtering phase. In order to fuse narrow breaks and long thin gulfs,
eliminate small holes, and fill gaps in the contour, a morphological operation is
applied to the image. As a result, small gaps between isolated segments are erased
and the regions are merged. To extract the bounding boxes of detected objects,
connected component analysis is used. The morphological operation fills small gaps
inside the moving object and reduces the noise remaining in it. The morphological
operators implemented are dilation followed by erosion. In dilation, each background
pixel that is touching an object pixel is changed into an object pixel; dilation adds
pixels to the boundary of the object and closes isolated background pixels. Dilation
can be expressed as:

$$f(x, y) = \begin{cases} 1, & \text{if one or more of the 8 neighbors are } 1 \\ 0, & \text{otherwise} \end{cases} \quad (2.1)$$
In erosion, each object pixel that is touching a background pixel is changed into
a background pixel; erosion removes isolated foreground pixels. Erosion can be
expressed as:

$$f(x, y) = \begin{cases} 0, & \text{if one or more of the 8 neighbors are } 0 \\ 1, & \text{otherwise} \end{cases} \quad (2.2)$$
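A numpy sketch of the dilation-then-erosion filtering of Eqs. (2.1) and (2.2); following the surrounding prose, object pixels are kept during dilation, and a pixel survives erosion only when all 8 of its neighbors are object pixels. The helper names are illustrative.

```python
import numpy as np

def neighbor_sum(img):
    """Sum over the 8 neighbors of every pixel (zero padding at borders)."""
    h, w = img.shape
    p = np.pad(img.astype(np.int32), 1)
    return sum(p[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
               for dy in (-1, 0, 1) for dx in (-1, 0, 1)
               if (dy, dx) != (0, 0))

def dilate(img):
    """Eq. (2.1): a pixel becomes 1 when one or more of its 8
    neighbors are 1 (object pixels themselves are kept)."""
    return ((img == 1) | (neighbor_sum(img) >= 1)).astype(np.uint8)

def erode(img):
    """Eq. (2.2): a pixel becomes 0 when one or more of its 8
    neighbors are 0, i.e. it survives only with 8 object neighbors."""
    return ((img == 1) & (neighbor_sum(img) == 8)).astype(np.uint8)

# A 3x3 blob: erosion keeps only its center; dilation restores the blob.
img = np.zeros((5, 5), dtype=np.uint8)
img[1:4, 1:4] = 1
print(int(erode(img).sum()), int(dilate(erode(img)).sum()))  # 1 9
```

Applying dilation before erosion, as the text prescribes, closes small holes without permanently thickening the object.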
An important part of a tracking system is the ability to predict where an object will
be in the next frame. This is needed to aid in matching tracks to detected objects
and to predict positions during occlusion. There are four common approaches to
predicting object positions:
1. Block matching
2. Kalman filters
3. Motion models
4. Particle filters
The entire process of tracking the moving object is illustrated in Fig. 2.6. The
block matching method applied here is well described in [4]. Block matching is a
technique for tracking the interest moving object among the moving objects emerging
in the scene. The blocks are defined by dividing the image frame into
non-overlapping square parts. The blocks are
made based on a peripheral increment sign correlation (PISC) image, which considers
the brightness change of all pixels in a block relative to the considered pixel.
The figure shows a block in the PISC image with a block size of 5 × 5 pixels;
therefore, one block consists of 25 pixels. The blocks of the PISC image in the
previous frame are defined in Eq. (2.3), and the blocks of the PISC image in the
current frame are defined in Eq. (2.4). To determine the matching criterion for
blocks in two successive frames, the evaluation uses the correlation value expressed
in Eq. (2.5), which is computed between a block in the previous frame and one in
the current frame over all pixels in the block. A high correlation value shows that
the blocks match each other. The interest moving object is determined when the
number of matching blocks between the previous and current frames is higher than
a certain threshold value, which is obtained experimentally.
$$b_{np} = \begin{cases} 1, & \text{if } f_{np} \ge f(i, j) \\ 0, & \text{otherwise} \end{cases} \quad (2.3)$$

$$b'_{np} = \begin{cases} 1, & \text{if } f'_{np} \ge f'(i, j) \\ 0, & \text{otherwise} \end{cases} \quad (2.4)$$

$$corr_n = \sum_{p=0}^{N} b_{np} \, b'_{np} + \sum_{p=0}^{N} (1 - b_{np})(1 - b'_{np}) \quad (2.5)$$

where $p$ and $p'$ index the blocks in the previous and current frames, $n$ is the
block number, and $N$ is the number of pixels in a block.
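The sign patterns and correlation of Eqs. (2.3)–(2.5) can be sketched as follows; the choice of the top-left pixel as the "considered" reference pixel and the 2 × 2 block contents are illustrative assumptions.

```python
import numpy as np

def pisc_pattern(block, considered):
    """Eqs. (2.3)/(2.4): 1 where a pixel's brightness is >= the
    considered pixel's brightness, else 0."""
    return (block >= considered).astype(np.uint8)

def correlation(b_prev, b_curr):
    """Eq. (2.5): number of positions where the two sign patterns
    agree; the maximum is N, the number of pixels per block."""
    return int(np.sum(b_prev * b_curr + (1 - b_prev) * (1 - b_curr)))

prev = np.array([[10, 60], [70, 80]])
curr = np.array([[12, 58], [75, 90]])  # same scene, mild brightness change
bp = pisc_pattern(prev, prev[0, 0])
bc = pisc_pattern(curr, curr[0, 0])
print(correlation(bp, bc))  # 4: all four positions agree, so a match
```

Because only the sign of the brightness increment is compared, the correlation is insensitive to the uniform illumination change between the two frames, which is the point of the PISC representation.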
The tracking method used here can be described as follows. The matching process
is illustrated in Fig. 2.6. First, blocks and the tracking area are made only in
the area of moving objects to reduce the processing time. The block (block A) is
made with 9 × 9 pixels in the previous frame. It is assumed that the object
appearing first will be tracked as the interest moving object. Block A searches for
a matching block among the blocks of the current frame using the correlation value
expressed in Eq. (2.5). In the current frame, the interest moving object is tracked
when the object has the maximum number of matching blocks. When the matching
criteria are not satisfied, the matching process is repeated by enlarging the
tracking area (the rectangle with the dashed line); the blocks are still made in
the area of moving objects. When the interest moving object still cannot be tracked,
the moving object is categorized as not the interest moving object (another object)
and the tracking process is restarted from the beginning.
The feature extracted in the spatial domain is the position of the tracked object.
This spatial information, combined with features in the time domain, represents the
trajectory of the tracked object, so the movement and speed of the tracked objects
can be estimated. Therefore, spatial-domain features are very important for object
identification. The bounding box defined in Eqs. (2.6) and (2.7) is used as the
spatial information of moving objects.
After the interest moving object is obtained, it is extracted using a bounding
box. The bounding box is determined by computing the maximum and minimum
x and y coordinates of the interest moving object according to the following
equations:
$$B^i_{min} = \{(x^i_{min}, y^i_{min}) \mid x, y \in O_i\} \quad (2.6)$$

$$B^i_{max} = \{(x^i_{max}, y^i_{max}) \mid x, y \in O_i\} \quad (2.7)$$
where $O_i$ denotes the set of coordinates of points in the interest moving object
$i$, $B^i_{min}$ is the top-left corner coordinate of the interest moving object $i$,
and $B^i_{max}$ is its bottom-right corner coordinate. Chapter 4 shows the bounding
box of the tracked object.
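Eqs. (2.6)–(2.7) amount to taking coordinate-wise minima and maxima over the object's pixel set, as the following sketch shows (the coordinates are invented for illustration):

```python
def bounding_box(object_pixels):
    """Eqs. (2.6)-(2.7): B_min is the left-top corner and B_max the
    right-bottom corner of the set O_i of (x, y) object coordinates."""
    xs = [x for x, _ in object_pixels]
    ys = [y for _, y in object_pixels]
    return (min(xs), min(ys)), (max(xs), max(ys))

pixels = [(3, 4), (5, 9), (4, 6), (7, 5)]  # O_i for one tracked object
print(bounding_box(pixels))  # ((3, 4), (7, 9))
```

The returned corner pair is the spatial feature recorded for each tracked object in the feature queue.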
The Kalman filter operates in two steps:
The prediction
The correction
In the first step, the state is predicted with the dynamic model; the prediction
step uses the state model to predict the new state of the variables:
$$\bar{X}^t = D X^{t-1} + W \quad (2.8)$$

$$\bar{\Sigma}^t = D \Sigma^{t-1} D^T + Q \quad (2.9)$$

where $\bar{X}^t$ and $\bar{\Sigma}^t$ are the state and covariance predictions at
time $t$, $D$ is the state transition matrix which defines the relation between the
state variables at times $t$ and $t-1$, and $Q$ is the covariance of the noise $W$.
Similarly, the correction step uses the current observation $Z^t$ to update the
object state:
$$K^t = \bar{\Sigma}^t M^T \left[ M \bar{\Sigma}^t M^T + R^t \right]^{-1} \quad (2.10)$$

$$X^t = \bar{X}^t + K^t \left[ Z^t - M \bar{X}^t \right] \quad (2.11)$$

$$\Sigma^t = \bar{\Sigma}^t - K^t M \bar{\Sigma}^t \quad (2.12)$$

where $K^t$ is the Kalman gain and $M$ is the measurement matrix.
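A minimal numpy sketch of the predict/correct cycle of Eqs. (2.8)–(2.12) for a one-dimensional constant-velocity target; the matrices, noise values, and measurement sequence are illustrative assumptions, not values from the thesis.

```python
import numpy as np

# State X = [position, velocity]; only position is observed.
D = np.array([[1.0, 1.0],    # state transition: position += velocity * dt
              [0.0, 1.0]])
M = np.array([[1.0, 0.0]])   # measurement matrix
Q = np.eye(2) * 1e-4         # process noise covariance (assumed)
R = np.array([[0.25]])       # measurement noise covariance (assumed)

def predict(X, S):
    """Eqs. (2.8)-(2.9): propagate the state and its covariance."""
    return D @ X, D @ S @ D.T + Q

def correct(X_pred, S_pred, Z):
    """Eqs. (2.10)-(2.12): gain, state update, covariance update."""
    K = S_pred @ M.T @ np.linalg.inv(M @ S_pred @ M.T + R)
    X = X_pred + K @ (Z - M @ X_pred)
    S = S_pred - K @ M @ S_pred
    return X, S

X, S = np.array([0.0, 0.0]), np.eye(2)
for z in [1.0, 2.1, 2.9, 4.2]:          # noisy positions of the target
    X, S = predict(X, S)
    X, S = correct(X, S, np.array([z]))
print(X)  # estimated [position, velocity] after four measurements
```

The estimate converges toward the true trajectory (position near the last measurement, velocity near one unit per frame), which is exactly the behavior the prediction step exploits during occlusion.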
The basic components of the Kalman filter are the state vector, the dynamic model,
and the observation model, which are described below.
The state vector contains the variables of interest. It describes the state of the dynamic system and represents its degrees of freedom. The variables in the state vector cannot be measured directly, but they can be inferred from values that are measurable. Elements of the state vector can be position, velocity, orientation angles, etc. A very simple example is a train driving with a constant velocity on a straight rail. In this case the train has two degrees of freedom: the distance travelled and the velocity. The state vector has two values at the same time: the predicted value before the update and the posterior value after the update.
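The train example can be written down as a tiny state-space model. The time step and the numeric values below are assumed for illustration only:

```python
import numpy as np

dt = 1.0                          # time step (assumed)
x = np.array([0.0, 3.0])          # state vector: [distance, velocity]

# Constant-velocity dynamic model: d' = d + v*dt, v' = v.
D = np.array([[1.0, dt],
              [0.0, 1.0]])

for _ in range(5):                # predict five steps ahead
    x = D @ x

print(x)                          # distance grows to 15.0, velocity stays 3.0
```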
In 1960, R.E. Kalman published his famous paper describing a recursive solution to the discrete data linear filtering problem. Since that time, due in large part to advances in digital computing, the Kalman filter has been the subject of extensive research and application, particularly in the area of autonomous or assisted navigation.
The main problem with Kalman filtering is that statistical models are required
for the system and the measurement instruments. Unfortunately, they are typically
not available, or difficult to obtain. The two most commonly recommended methods
of approaching this problem are:
where A is referred to as the state transition matrix and w is a noise term. This noise term is a Gaussian random variable with zero mean and covariance matrix Q, so its probability distribution is

p(w) ∼ N(0, Q)

The covariance matrix Q will be referred to as the process noise covariance matrix in the remainder of this report. It accounts for possible changes in the process between t and t + 1 that are not already accounted for in the state transition matrix. Another assumed property of w is that it is independent of the state x_t. The measurement is
y_t = C x_t + w_t    (2.14)
The Kalman filter estimates a process by using a form of feedback control: the filter estimates the process state at some time and then obtains feedback in the form of (noisy) measurements. As such, the equations for the Kalman filter fall into two groups: time update equations and measurement update equations. The time update equations are responsible for projecting forward (in time) the current state and error covariance estimates to obtain the a priori estimates for the next time step. The measurement update equations are responsible for the feedback, i.e. for incorporating a new measurement into the a priori estimate to obtain an improved a posteriori estimate. The time update equations can also be thought of as predictor equations, while the measurement update equations can be thought of as corrector equations. Indeed, the final estimation algorithm resembles a predictor-corrector algorithm, as shown in the figure.
The specific equations for the time and measurement updates are presented below.

Discrete Kalman Filter Time Update

x̂^−_k = A x̂_{k−1} + B u_{k−1}    (2.19)

P^−_k = A P_{k−1} A^T + Q    (2.20)
As shown above, the time update equations project the state and covariance estimates forward from time step k − 1 to time step k.
Discrete Kalman Filter Measurement Update

K_k = P^−_k H^T (H P^−_k H^T + R)^{−1}    (2.21)

x̂_k = x̂^−_k + K_k (z_k − H x̂^−_k)    (2.22)

P_k = (I − K_k H) P^−_k    (2.23)
The first task during the measurement update is to compute the Kalman gain K_k. The next step is to actually measure the process to obtain z_k, and then to generate an a posteriori state estimate by incorporating the measurement, as in equation (2.22). The final step is to obtain an a posteriori error covariance estimate via equation (2.23). After each time and measurement update pair, the process is repeated, with the previous a posteriori estimates used to project or predict the new a priori estimates. This recursive nature is one of the very appealing features of the Kalman filter: it makes practical implementations much more feasible than a filter designed to operate on all of the data directly for each estimate; the Kalman filter instead recursively conditions the current estimate on all of the past measurements. Figure (2.9) offers a complete picture of the operation of the filter, combining figure (2.10).
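Equations (2.19)-(2.23) can be sketched in code as follows. This is a minimal illustration on an assumed one-dimensional example (estimating a constant value from noisy readings); the function name and all numeric values are assumptions, and the control input B u_{k−1} is omitted:

```python
import numpy as np

def kalman_step(x, P, z, A, H, Q, R):
    """One predict/correct cycle of the discrete Kalman filter."""
    # Time update (predictor), Eqs. (2.19)-(2.20).
    x_pred = A @ x
    P_pred = A @ P @ A.T + Q
    # Measurement update (corrector), Eqs. (2.21)-(2.23).
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)   # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

# 1-D example: the true state is a constant near 1.0.
A = H = np.array([[1.0]])
Q, R = np.array([[1e-5]]), np.array([[0.1]])
x, P = np.array([[0.0]]), np.array([[1.0]])
for z in [0.9, 1.1, 1.0, 0.95, 1.05]:
    x, P = kalman_step(x, P, np.array([[z]]), A, H, Q, R)
print(x[0, 0], P[0, 0])   # the estimate approaches 1.0 and the covariance shrinks
```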
The simplest prediction model is P(t + 1) = P(t) + V(t), where P(t + 1) is the expected position at the next time step, P(t) is the position at the current time step, and V(t) is the velocity at the current time step.
Chapter 3
Literature Survey
3.1 Introduction
The research conducted so far on object detection and tracking in video surveillance systems is discussed in this chapter. The set of challenges outlined above spans several domains of research, and the majority of relevant work will be reviewed in the upcoming chapters. In this section, only representative video surveillance systems are discussed for a better understanding of the fundamental concepts. Tracking is the process of following an object of interest within a sequence of frames, from its first appearance to its last. The type of object and its description within the system depend on the application. During the time that it is present in the scene, it may be occluded by other objects of interest or fixed obstacles within the scene. A tracking system should be able to predict the position of any occluded objects.
Object tracking systems are typically geared towards surveillance applications where it is desired to monitor people or vehicles moving about an area. There are two distinct approaches to the tracking problem: top-down and bottom-up. Top-down methods are goal oriented, and the bulk of tracking systems are designed in this manner. These typically involve some sort of segmentation to locate regions of interest, from which objects and features can be extracted for the tracking process.
Lipton et al. [5] proposed frame differencing, which uses the pixel-wise difference between two frame images to extract the moving regions. In another work, Stauffer & Grimson [6] proposed a Gaussian mixture model based background model to detect the object. Liu et al. [7] proposed background subtraction to detect moving regions in an image by taking the difference between the current and a reference background image pixel by pixel. Collins et al. [8] developed a hybrid method that combines three-frame differencing with an adaptive background subtraction model for their VSAM (Video Surveillance and Monitoring) project. Desa & Salih [9] proposed a combination of background subtraction and frame difference that improved the previous results of background subtraction and frame difference. Sugandi et al. [10] proposed a new technique for object detection employing frame difference on low resolution images. Jacques et al. [3] proposed a background model incorporating a novel technique for shadow detection in grayscale video sequences. Satoh et al. [11] proposed a new technique for object tracking employing a block matching algorithm based on the PISC image. Sugandi et al. [12] proposed a technique for tracking moving persons using the camera peripheral increment sign correlation image. Beymer & Konolige [2] (1999) proposed stereo-camera-based object tracking, using a Kalman filter to predict the object's position and speed. Rosales & Sclaroff (1999) proposed the use of an extended Kalman filter to estimate the 3D trajectory of an object from its 2D motion.
In object detection, many researchers have developed their own methods. Liu et al. (2001) proposed background subtraction to detect moving regions in an image by taking the difference between the current and a reference background image pixel by pixel. It is extremely sensitive to changes in dynamic scenes arising from lighting, extraneous events, etc. In another work, Stauffer & Grimson (1997) proposed a Gaussian mixture model based background model to detect the object. Lipton et al. (1998) proposed frame differencing, which uses the pixel-wise differences between two frame images to extract the moving regions. This method is very adaptive to dynamic environments, but generally does a poor job of extracting
all the relevant pixels; e.g., there may be holes left inside moving entities. In order to overcome the disadvantage of two-frame differencing, in some cases three-frame differencing is used. For instance, Collins et al. (2000) developed a hybrid method that combines three-frame differencing with an adaptive background subtraction model for their VSAM (Video Surveillance and Monitoring) project. The hybrid algorithm successfully segments moving regions in video without the defects of temporal differencing and background subtraction. Desa & Salih (2004) proposed a combination of background subtraction and frame difference that improved the previous results of background subtraction and frame difference.
In object tracking methodology, this section describes region-based tracking in more detail. Region-based tracking algorithms track objects according to variations of the image regions corresponding to the moving objects. For these algorithms, the background image is maintained dynamically, and motion regions are usually detected by subtracting the background from the current image. Wren et al. (1997) explored the use of small blob features to track a single human in an indoor environment. In their work, a human body is considered as a combination of blobs representing various body parts such as the head, torso, and the four limbs. The pixels belonging to the human body are assigned to the different body parts' blobs. By tracking each small blob, the moving human is successfully tracked. McKenna et al. (2000) proposed an adaptive background subtraction method in which color and gradient information are combined to cope with shadows and unreliable color cues in motion segmentation. Tracking is then performed at three levels of abstraction: regions, people, and groups. Each region has a bounding box, and regions can merge and split. A human is composed of one or more regions grouped together under geometric structure constraints on the human body, and a human group consists of one or more people grouped together.
Cheng & Chen (2006) proposed a color feature and a spatial feature of the object to identify the tracked object. The spatial feature is extracted from the bounding box of the object, while the color features extracted are the mean and standard deviation of each object. Czyz et al. (2007) proposed the color distribution of the object as the observation model. The similarity between objects is measured using the Bhattacharyya distance; a low Bhattacharyya distance corresponds to a high similarity.
To overcome the problems described above, this thesis adopts a technique for object detection employing frame difference on low resolution images (Sugandi et al., 2007), object tracking employing a block matching algorithm based on the PISC image (Satoh et al., 2001), and object identification employing color and spatial information of the tracked object (Cheng & Chen, 2006).
Algorithm
This chapter gives an idea of the existing and some modified methods for the analysis of algorithms for the detection and tracking of objects. First, some existing algorithms for detecting objects, such as the frame difference method and the Gaussian mixture model, are discussed. Finally, background subtraction and background modelling are shown. After implementing all these existing algorithms, a modified model for background modelling is presented.
Tracking is the process of following an object of interest within a sequence of frames, from its first appearance to its last. The type of object and its description within the system depend on the application. During the time that it is present in the scene it may be occluded by other objects of interest or fixed obstacles within the scene. A tracking system should be able to predict the position of any occluded objects.

Object tracking systems are typically geared towards surveillance applications where it is desired to monitor people or vehicles moving about an area. The ball tracking system has become a standard feature of tennis and cricket broadcasts and uses object tracking techniques to locate and track the ball as it moves on the court.
First, an existing algorithm for tracking the object using the block matching method is implemented.
Frame differencing is the simplest moving object detection method. It is based on determining the difference between input frame intensities and the background model by using pixel-per-pixel subtraction.

Lipton et al. [5] proposed the frame difference method to detect moving objects. In this case, frame differencing is performed on three successive frames: between frames F_k and F_{k−1}, and between frames F_k and F_{k+1}. The output is two difference images d_{k−1} and d_{k+1}, expressed as
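The three-frame scheme above can be sketched as follows. Since the combining equation is not reproduced in the text, the threshold value and the AND combination of the two difference images are assumptions here:

```python
import numpy as np

def three_frame_difference(f_prev, f_curr, f_next, tau=25):
    """Moving-pixel mask from three successive frames.

    d_prev and d_next are the two difference images; a pixel is marked
    moving only if it changed with respect to BOTH neighbouring frames.
    The threshold tau and the AND combination are assumptions.
    """
    d_prev = np.abs(f_curr.astype(int) - f_prev.astype(int))   # d_{k-1}
    d_next = np.abs(f_next.astype(int) - f_curr.astype(int))   # d_{k+1}
    return (d_prev >= tau) & (d_next >= tau)
```

A pixel that appears only in the middle frame therefore differs from both neighbours and is flagged as moving.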
background image, which is updated over a period of time. It works well only with stationary cameras. The subtraction leaves only non-stationary or new objects, which include the entire silhouette region of an object. This approach is simple and computationally affordable for real-time systems, but it is extremely sensitive to dynamic scene changes from lighting, extraneous events, etc. Therefore it is highly dependent on a good background maintenance model.
In this chapter, a simulation of the different background subtraction techniques available in the literature for motion segmentation of objects is performed. Background subtraction detects moving regions in an image by taking the difference between the current image and a reference background image captured from a static background over a period of time. The subtraction leaves only non-stationary or new objects, which include the entire silhouette region of an object. The problem with background subtraction [14], [8] is to automatically update the background from the incoming video frames, and it should be able to overcome the following problems:
Memory: The background module should not use much resource, in terms of
computing power and memory.
where α is a learning rate. The binary motion detection mask D(x, y) is calculated as follows:

D(x, y) = { 1, if |I_t(x, y) − B(x, y)| ≥ τ ; 0, otherwise }    (3.6)
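The update and masking steps can be sketched as follows. The running-average update B ← (1 − α)B + α·I_t is the usual form of a learning-rate update and is stated here as an assumption, since the corresponding equation is not reproduced above; the α and τ values are also assumed:

```python
import numpy as np

def update_background(B, frame, alpha=0.05):
    """Running-average background update: B <- (1 - alpha)*B + alpha*I_t."""
    return (1.0 - alpha) * B + alpha * frame

def motion_mask(B, frame, tau=30):
    """Binary motion detection mask D(x, y) of Eq. (3.6)."""
    return (np.abs(frame - B) >= tau).astype(np.uint8)
```

A small α makes the background adapt slowly (robust to transient motion); a large α lets slow scene changes, such as lighting drift, be absorbed quickly.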
In erosion, each object pixel that is touching a background pixel is changed into a background pixel, so erosion removes isolated foreground pixels. The erosion of a set A by a structuring element B [7] is defined as:

A ⊖ B = ∩_{b ∈ B} (A)_{−b}    (3.8)
The number of pixels added to or removed from the objects in an image depends on the size and shape of the structuring element used to process the image. Morphological operations eliminate background noise and fill small gaps inside an object. There is no fixed limit on the number of times dilation and erosion are performed; in the given algorithm, dilation and erosion are used iteratively until the foreground object is completely segmented from the background. After the morphological operations, noise is removed from the frame difference and background subtraction results of the following frames.
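The erosion and dilation described above can be sketched with plain NumPy; a 3×3 square structuring element is assumed here for illustration:

```python
import numpy as np

def _shifted_stack(mask):
    # Collect the 3x3 neighbourhood of every pixel (zero padding at borders).
    p = np.pad(mask, 1)
    h, w = mask.shape
    return np.stack([p[i:i + h, j:j + w] for i in range(3) for j in range(3)])

def erode(mask):
    """A pixel stays foreground only if its whole 3x3 neighbourhood is."""
    return _shifted_stack(mask).min(axis=0)

def dilate(mask):
    """A pixel becomes foreground if any pixel in its 3x3 neighbourhood is."""
    return _shifted_stack(mask).max(axis=0)
```

Applying erosion followed by dilation (an opening) removes isolated noise pixels while roughly preserving the shape of larger foreground regions.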
where k is the number of Gaussian mixtures used. The number k varies
where

ρ = α η(p_t | μ_{i,t−1}, σ_{i,t−1})    (3.13)

In this case the quantity 1/α defines the speed at which the distribution parameters change. If the pixel p_t matches the i-th Gaussian, then the remaining (k−1) Gaussians are updated in the following manner. The values for the weight and variance vary based on the significance given to a pixel which is least likely to occur in a particular way. All the Gaussian weights are normalized after the update is performed. The k Gaussians are then reordered based on their likelihood of existence.
Then b distributions are modeled as the background and the remaining (k−b) distributions are modeled as the foreground for the next pixel. The value of b is determined by

B = argmin_b ( Σ_{i=1}^{b} ω_i > T )    (3.17)

where T is a threshold value which measures the proportion of the data that needs to match the background; the first B distributions are then chosen as the background model.
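The selection rule of Eq. (3.17) can be sketched as follows. The ranking by weight/σ follows the usual Stauffer-Grimson ordering, and the threshold value is an assumption for illustration:

```python
import numpy as np

def background_distributions(weights, sigmas, T=0.7):
    """Select the first B Gaussians as background, per Eq. (3.17).

    The k Gaussians are ranked by weight/sigma (their likelihood of being
    background); B is the smallest prefix whose cumulative weight exceeds T.
    """
    w = np.asarray(weights, float)
    order = np.argsort(-(w / np.asarray(sigmas, float)))   # most likely first
    B = int(np.searchsorted(np.cumsum(w[order]), T) + 1)
    return order[:B]   # indices of the background Gaussians
```

With weights (0.5, 0.3, 0.2) and T = 0.7, the first two distributions are selected, since 0.5 alone does not exceed the threshold but 0.5 + 0.3 does.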
The background cannot remain the same for a long period of time, so the initial background needs to be updated. W4 uses a pixel-based update and an object-based update method to cope with illumination variation and physical deposition of objects, and uses a change map for the background update. A detection support map (gS) computes the number of times the pixel (x, y) is classified as a background pixel:

gS_t(x, y) = { gS_{t−1}(x, y) + 1, if the pixel is background; gS_{t−1}(x, y), if the pixel is foreground }    (3.18)
A motion support map (mS) computes the number of times the pixel (x, y) is classified as a moving pixel:

mS_t(x, y) = { mS_{t−1}(x, y) + 1, if M_t(x, y) = 1; mS_{t−1}(x, y), if M_t(x, y) = 0 }    (3.20)

where

M_t(x, y) = { 1, if (|I_t(x, y) − I_{t+1}(x, y)| > 2σ) ∧ (|I_{t−1}(x, y) − I_t(x, y)| > 2σ); 0, otherwise }    (3.21)
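The per-pixel counting of Eqs. (3.18) and (3.20) vectorizes directly; the function and argument names below are assumptions:

```python
import numpy as np

def update_support_maps(gS, mS, is_background, is_moving):
    """Increment W4's support maps, Eqs. (3.18) and (3.20).

    is_background: boolean map, True where the pixel was classified background.
    is_moving:     boolean map, True where M_t(x, y) = 1.
    """
    gS = gS + is_background.astype(gS.dtype)   # detection support map
    mS = mS + is_moving.astype(mS.dtype)       # motion support map
    return gS, mS
```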
Figure 3.9: Output after background subtraction using the method of Jacques et al. [3]
3.12.1 Analysis
The entire process of tracking the moving object is illustrated in Fig. 2.6. The block matching method, well described in [4], is applied here. Block matching is a technique for tracking the moving object of interest among the moving objects emerging in the scene. Here, the blocks are defined by dividing the image frame into non-overlapping square parts. The blocks are made based on the peripheral increment sign correlation (PISC) image (Satoh et al., 2001; Sugandi et al., 2007), which considers the brightness change of all the pixels of a block relative to the considered pixel. Fig. 9 shows a block in the PISC image with a block size of 5×5 pixels; therefore, one block consists of 25 pixels. The blocks of the PISC image in the previous frame are defined as shown in Eq. (2.3). Similarly, the blocks of the PISC image in the current frame are defined in Eq. (2.4). To determine the matching criterion for the blocks in two successive frames, evaluation is done using the correlation value expressed in Eq. (2.5). This equation calculates the correlation value between a block in the previous frame and the corresponding block in the current one over all pixels in the block. A high correlation value shows that the blocks match each other. The moving object of interest is determined when the number of matching blocks in the previous and current frames is higher than a certain threshold value. The correlation value is
corr_n = Σ_{P=0}^{N} b_{nP} · b′_{nP} + Σ_{P=0}^{N} (1 − b_{nP}) · (1 − b′_{nP})    (3.24)

where b_n and b′_n are the blocks in the previous and current frames, n is the block number, and N is the number of pixels in a block.
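Eq. (3.24) simply counts the pixels whose sign bits agree (both 1 or both 0). A sketch with assumed names, taking equal-sized blocks of 0/1 values:

```python
import numpy as np

def block_correlation(b_prev, b_curr):
    """Correlation of Eq. (3.24) between two binary PISC blocks.

    Counts the pixels whose sign bits agree: both 1 or both 0.
    A perfect match scores N, the number of pixels in a block.
    """
    b_prev = np.asarray(b_prev)
    b_curr = np.asarray(b_curr)
    agree = b_prev * b_curr + (1 - b_prev) * (1 - b_curr)
    return int(agree.sum())
```

Identical blocks therefore score the full block size, and each flipped bit lowers the score by one.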
The tracking method used here can be described as follows. The matching process is illustrated in Fig. 2.6. Firstly, blocks and the tracking area are made only in the area of the moving object to reduce the processing time. The previous frame is divided into blocks (block A) of 9×9 pixels. It is assumed that the object appearing first will be tracked as the moving object of interest. Block A searches for the matching block in each block of the current frame by using the correlation value expressed in Eq. (2.5). In the current frame, the moving object of interest is tracked when the object has the maximum number of matching blocks. When that matching criterion is not satisfied, the matching process is repeated by enlarging the tracking area (the rectangle with the dashed line); the blocks are still made inside the area of the moving object. When the moving object of interest still cannot be tracked, the moving object is categorized as not the object of interest, or as another object, and the tracking process is started again from the beginning.
The feature of objects extracted in the spatial domain is the position of the tracked object. The spatial information, combined with features in the time domain, represents the trajectory of the tracked object, so the movement and speed of the moving objects that need to be tracked can be estimated. Therefore, the features of the spatial domain are very important for object identification. The bounding box defined in Eq. (2.4) is used as the spatial information of moving objects. After the moving object of interest is obtained, it is extracted by using a bounding box. The bounding box can be determined by computing the maximum and minimum values of the x and y coordinates of the moving object of interest according to the following equations:
B^i_min = { (x^i_min, y^i_min) | (x, y) ∈ O_i }    (3.25)

B^i_max = { (x^i_max, y^i_max) | (x, y) ∈ O_i }    (3.26)
where O_i denotes the set of coordinates of points in the moving object of interest i, B^i_min is the top-left corner coordinates of the moving object of interest i, and B^i_max is the bottom-right corner coordinates of the moving object of interest i. Chapter 4 shows the bounding box used in object tracking.
and automotive safety. The problem of motion-based object tracking can be divided into two parts.

Step 1: Compute the cost of assigning every detection to each track using the distance method. The cost takes into account the Euclidean distance between the predicted centroid of the track and the centroid of the detection. It also includes the confidence of the prediction, which is maintained by the Kalman filter.
Step 2: Solve the assignment problem represented by the cost matrix using the assignDetectionsToTracks function. The function takes the cost matrix and the cost of not assigning any detection to a track.

The value for the cost of not assigning a detection to a track depends on the range of values returned by the distance method of the Kalman filter. This value must be tuned experimentally. Setting it too low increases the likelihood of creating a new track and may result in track fragmentation; setting it too high may result in a single track corresponding to a series of separate moving objects.
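Since assignDetectionsToTracks is a MATLAB function, the two steps above can be illustrated in Python with a brute-force analogue for tiny cost matrices. The function name and the padding scheme are assumptions; a real implementation would use an efficient assignment algorithm such as the Hungarian method:

```python
from itertools import permutations

def assign_detections_to_tracks(cost, cost_unassigned):
    """Brute-force analogue (for tiny problems) of the assignment step.

    cost[t][d] is the cost of matching track t to detection d; every
    unmatched track or detection adds `cost_unassigned` to the total.
    Returns the cheapest (track, detection) pairs and their total cost.
    """
    n_tracks, n_dets = len(cost), len(cost[0])
    best_total, best_pairs = float("inf"), []
    # Slots >= n_dets represent "leave this track unassigned".
    for perm in permutations(range(n_dets + n_tracks), n_tracks):
        pairs, total = [], 0.0
        for t, d in enumerate(perm):
            if d < n_dets:
                pairs.append((t, d))
                total += cost[t][d]
            else:
                total += cost_unassigned          # unassigned track
        total += (n_dets - len(pairs)) * cost_unassigned  # unmatched detections
        if total < best_total:
            best_total, best_pairs = total, pairs
    return best_pairs, best_total
```

For the matrix [[1, 10], [10, 1]] with a non-assignment cost of 5, both tracks are matched to their cheap detections; for [[10]] with a non-assignment cost of 2, leaving both the track and the detection unmatched (total 4) beats the expensive match (10), illustrating the tuning trade-off described above.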
3.15 Simulator
Matlab is an event-driven simulation tool which provides a platform to analyze the static and dynamic nature of video processing. All experiments relevant to this thesis are carried out on a 2.81 GHz AMD Athlon 64 X2 Dual Core processor with 2 GB RAM. The experiments are simulated using Matlab Version 7.10.0.499 (R2012). In this chapter, the simulator and the simulation parameters that are used for the experiments are discussed.
3.16 Conclusion
This chapter reviews the literature surveyed during the research work, and the related work proposed by many researchers has been discussed. Research papers related to object detection and tracking published from 1998 to 2011 have been covered, discussing the different methods and algorithms used in tracking systems.
Chapter 4
Conclusions
4.1 Conclusions
In every chapter the object detection and tracking methods have been surveyed. This thesis has examined methods to improve the performance of motion segmentation algorithms and the block matching technique for object tracking applications, and has examined methods for multi-modal fusion in an object tracking system.

Motion segmentation is a key step in many tracking algorithms as it forms the basis of object detection. Improving segmentation results, as well as being able to extract additional information via frame difference, the Gaussian mixture model, and background subtraction, allows for improved object detection and thus tracking. A strength of the Kalman filter is its ability to track objects in adverse situations. Integrating a Kalman filter within a standard tracking system allows the filter to use progressively updated features, aids in maintaining the identity of the tracked object, and provides the tracking system with an effective means of prediction. The simulator and the simulation parameters used for the experiments have been discussed, and the simulation results have been shown in the form of images.
Bibliography
[1] Alper Yilmaz, Omar Javed, and Mubarak Shah. Object tracking: A survey. Acm Computing
Surveys (CSUR), 38(4):13, 2006.
[2] Gary Bishop and Greg Welch. An introduction to the kalman filter. Proc of SIGGRAPH,
Course, 8:27599–3175, 2001.
[3] J Cezar Silveira Jacques, Claudio Rosito Jung, and Soraia Raupp Musse. Background
subtraction and shadow detection in grayscale video sequences. In Computer Graphics and
Image Processing, 2005. SIBGRAPI 2005. 18th Brazilian Symposium on, pages 189–196. IEEE,
2005.
[4] Budi Sugandi, Hyoungseop Kim, Joo Kooi Tan, and Seiji Ishikawa. A block matching technique
for object tracking employing peripheral increment sign correlation image. In Computer and
Communication Engineering, 2008. ICCCE 2008. International Conference on, pages 113–117.
IEEE, 2008.
[5] Alan J Lipton, Hironobu Fujiyoshi, and Raju S Patil. Moving target classification and tracking
from real-time video. In Applications of Computer Vision, 1998. WACV’98. Proceedings.,
Fourth IEEE Workshop on, pages 8–14. IEEE, 1998.
[6] Chris Stauffer and W Eric L Grimson. Adaptive background mixture models for real-time
tracking. In Computer Vision and Pattern Recognition, 1999. IEEE Computer Society
Conference on., volume 2. IEEE, 1999.
[7] Ya Liu, Haizhou Ai, and Guang-you Xu. Moving object detection and tracking based on
background subtraction. In Multispectral Image Processing and Pattern Recognition, pages
62–66. International Society for Optics and Photonics, 2001.
[8] Changick Kim and Jenq-Neng Hwang. Fast and automatic video object segmentation and
tracking for content-based applications. Circuits and Systems for Video Technology, IEEE
Transactions on, 12(2):122–129, 2002.
[9] Shahbe Mat Desa and Qussay A Salih. Image subtraction for real time moving object
extraction. In Computer Graphics, Imaging and Visualization, 2004. CGIV 2004. Proceedings.
International Conference on, pages 41–45. IEEE, 2004.
[10] Budi Sugandi, Hyoungseop Kim, Joo Kooi Tan, and Seiji Ishikawa. Tracking of moving objects
by using a low resolution image. In Innovative Computing, Information and Control, 2007.
ICICIC’07. Second International Conference on, pages 408–408. IEEE, 2007.
[11] YUTAKA Sato, S Kaneko, and SATORU Igarashi. Robust object detection and segmentation
by peripheral increment sign correlation image. Trans. of the IEICE, 84(12):2585–2594, 2001.
[12] Mahbub Murshed, Md Hasanul Kabir, and Oksam Chae. Moving object tracking: an edge segment based approach. 2011.
[13] Weiming Hu, Tieniu Tan, Liang Wang, and Steve Maybank. A survey on visual surveillance
of object motion and behaviors. Systems, Man, and Cybernetics, Part C: Applications and
Reviews, IEEE Transactions on, 34(3):334–352, 2004.
[14] Zhan Chaohui, Duan Xiaohui, Xu Shuoyu, Song Zheng, and Luo Min. An improved moving
object detection algorithm based on frame difference and edge detection. In Image and
Graphics, 2007. ICIG 2007. Fourth International Conference on, pages 519–523. IEEE, 2007.
[15] Ismail Haritaoglu, David Harwood, and Larry S. Davis. W4: real-time surveillance of people and their activities. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 22(8):809–830, 2000.
[16] Deep J Shah, Deborah Estrin, and Afrouz Azari. Motion based bird sensing using frame
differencing and gaussian mixture. Undergraduate Research Journal, page 47, 2008.
[17] Budi Sugandi, Hyoungseop Kim, Joo Kooi Tan, and Seiji Ishikawa. Tracking of moving objects
by using a low resolution image. In Innovative Computing, Information and Control, 2007.
ICICIC’07. Second International Conference on, pages 408–408. IEEE, 2007.
[18] Intan Kartika and Shahrizat Shaik Mohamed. Frame differencing with post-processing
techniques for moving object detection in outdoor environment. In Signal Processing and
its Applications (CSPA), 2011 IEEE 7th International Colloquium on, pages 172–176. IEEE,
2011.
[19] Robert T Collins, Alan Lipton, Takeo Kanade, Hironobu Fujiyoshi, David Duggins, Yanghai
Tsin, David Tolliver, Nobuyoshi Enomoto, Osamu Hasegawa, Peter Burt, et al. A system
for video surveillance and monitoring, volume 102. Carnegie Mellon University, the Robotics
Institute Pittsburg, 2000.
[20] Chris Stauffer and W Eric L Grimson. Adaptive background mixture models for real-time
tracking. In Computer Vision and Pattern Recognition, 1999. IEEE Computer Society
Conference on., volume 2. IEEE, 1999.
[21] Pakorn KaewTraKulPong and Richard Bowden. An improved adaptive background mixture
model for real-time tracking with shadow detection. In Video-Based Surveillance Systems,
pages 135–144. Springer, 2002.
[22] Cláudio Rosito Jung. Efficient background subtraction and shadow removal for monochromatic
video sequences. Multimedia, IEEE Transactions on, 11(3):571–577, 2009.
[23] Yung-Gi Wu and Chung-Ying Tsai. The improvement of the background subtraction and
shadow detection in grayscale video sequences. In Machine Vision and Image Processing
Conference, 2007. IMVIP 2007. International, pages 206–206. IEEE, 2007.
[24] Muhammad Shoaib, Ralf Dragon, and Jorn Ostermann. Shadow detection for moving humans
using gradient-based background subtraction. In Acoustics, Speech and Signal Processing,
2009. ICASSP 2009. IEEE International Conference on, pages 773–776. IEEE, 2009.
[25] Jianhua Ye, Tao Gao, and Jun Zhang. Moving object detection with background subtraction
and shadow removal. In Fuzzy Systems and Knowledge Discovery (FSKD), 2012 9th
International Conference on, pages 1859–1863. IEEE, 2012.
[27] Pakorn KaewTraKulPong and Richard Bowden. An improved adaptive background mixture
model for real-time tracking with shadow detection. In Video-Based Surveillance Systems,
pages 135–144. Springer, 2002.
[28] John Brandon Laflen, Christopher R Greco, Glen W Brooksby, and Eamon B Barrett.
Objective performance evaluation of a moving object super-resolution system. In Applied
Imagery Pattern Recognition Workshop (AIPRW), 2009 IEEE, pages 1–8. IEEE, 2009.
[29] Lloyd L Coulter, Douglas A Stow, Yu Hsin Tsai, Christopher M Chavis, Richard W McCreight,
Christopher D Lippitt, and Grant W Fraley. A new paradigm for persistent wide area
surveillance. In Homeland Security (HST), 2012 IEEE Conference on Technologies for, pages
51–60. IEEE, 2012.
[30] Shalini Agarwal and Shaili Mishra. A study of multiple human tracking for visual surveillance.
International Journal, 5, 1963.
[31] Fatih Porikli. Achieving real-time object detection and tracking under extreme conditions.
Journal of Real-Time Image Processing, 1(1):33–40, 2006.
[32] Huchuan Lu, Ruijuan Zhang, and Yen-Wei Chen. Head detection and tracking by mean-shift
and kalman filter. In Innovative Computing Information and Control, 2008. ICICIC’08. 3rd
International Conference on, pages 357–357. IEEE, 2008.
[33] Greg Welch and Gary Bishop. An introduction to the kalman filter, 1995.
[34] Bastian Leibe, Konrad Schindler, Nico Cornelis, and Luc Van Gool. Coupled object detection
and tracking from static cameras and moving vehicles. Pattern Analysis and Machine
Intelligence, IEEE Transactions on, 30(10):1683–1698, 2008.
[35] Bastian Leibe, Konrad Schindler, Nico Cornelis, and Luc Van Gool. Coupled object detection
and tracking from static cameras and moving vehicles. Pattern Analysis and Machine
Intelligence, IEEE Transactions on, 30(10):1683–1698, 2008.