Abstract
In this thesis we develop unsupervised and on-line learning algorithms
for codebook-based visual recognition tasks. First, we study Probabilistic
Latent Semantic Analysis (PLSA), which is one instance of a codebook-based
recognition model. It has been successfully applied to visual recognition
tasks such as image categorization and action recognition. However, it has
mainly been learned in batch mode, and therefore cannot handle data that
arrives sequentially. We propose a novel on-line algorithm for learning
the parameters of PLSA in that setting. Our contributions are two-fold: (i)
an on-line learning algorithm that learns the parameters of the PLSA
model from incoming data; (ii) a codebook adaptation algorithm that
can capture the full characteristics of all features during the learn-
ing. Experimental results demonstrate that the proposed algorithm
can handle sequentially arriving data that the batch PLSA learning
cannot cope with.
We then look at the Implicit Shape Model (ISM) for object detection.
ISM is a codebook-based model in which object information is
retained in codebooks. Existing ISM-based methods require manual
labeling of training data. We propose an algorithm that can label the
training data automatically. We also propose a method for identify-
ing moving edges in video frames so that object hypotheses can be
generated only from the moving edges. We compare the proposed algorithm
with a background-subtraction-based moving object detection algorithm.
The experimental results demonstrate that the proposed algorithm achieves
performance comparable to its background-subtraction-based counterpart,
and even outperforms it in complex situations.
We then extend the aforementioned batch algorithm to on-line learning.
We propose an on-line training data collection algorithm and
an on-line codebook-based object detector. We evaluate the algorithm
on three video datasets. The experimental results demonstrate that
our algorithm outperforms the state-of-the-art on-line conservative
learning algorithm.