Learning Active Basis Model for Object Detection and Recognition

Wu, Ying Nian; Si, Zhangzhang; Gong, Haifeng; Zhu, Song-Chun

doi:10.1007/s11263-009-0287-0


Open access
Published: 26 August 2009

Volume 90, pages 198–235, (2010)

International Journal of Computer Vision


Ying Nian Wu¹, Zhangzhang Si¹, Haifeng Gong¹,² & Song-Chun Zhu¹,²

3175 Accesses
110 Citations

Abstract

This article proposes an active basis model, a shared sketch algorithm, and a computational architecture of sum-max maps for representing, learning, and recognizing deformable templates. In our generative model, a deformable template is in the form of an active basis, which consists of a small number of Gabor wavelet elements at selected locations and orientations. These elements are allowed to slightly perturb their locations and orientations before they are linearly combined to generate the observed image. The active basis model, in particular, the locations and the orientations of the basis elements, can be learned from training images by the shared sketch algorithm. The algorithm selects the elements of the active basis sequentially from a dictionary of Gabor wavelets. When an element is selected at each step, the element is shared by all the training images, and the element is perturbed to encode or sketch a nearby edge segment in each training image. The recognition of the deformable template from an image can be accomplished by a computational architecture that alternates the sum maps and the max maps. The computation of the max maps deforms the active basis to match the image data, and the computation of the sum maps scores the template matching by the log-likelihood of the deformed active basis.
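The alternation of max maps and sum maps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the Gabor responses are already available as an array `sum1[orientation, y, x]`, that perturbations range over a small square window of locations plus neighboring orientations, and that the template score is a plain weighted sum rather than the full log-likelihood. The names `max1_maps`, `sum2_score`, `shift`, and `dori` are hypothetical.

```python
import numpy as np

def max1_maps(sum1, shift=1, dori=1):
    """Max maps: for every (orientation, y, x), take the largest response
    among nearby locations and orientations (the allowed perturbations).
    Assumption: a square spatial window and circular orientation index."""
    n_ori, h, w = sum1.shape
    out = np.full(sum1.shape, -np.inf)
    for do in range(-dori, dori + 1):
        # Orientation wraps around, so a circular roll is used on axis 0.
        rolled = np.roll(sum1, do, axis=0)
        # Pad with -inf so out-of-image positions never win the max.
        padded = np.pad(rolled, ((0, 0), (shift, shift), (shift, shift)),
                        constant_values=-np.inf)
        for dy in range(2 * shift + 1):
            for dx in range(2 * shift + 1):
                out = np.maximum(out, padded[:, dy:dy + h, dx:dx + w])
    return out

def sum2_score(max1, elements, y, x):
    """Sum map at one template position: add up the perturbed responses of
    all basis elements. Each element is (orientation, dy, dx, weight).
    Assumption: a weighted sum stands in for the log-likelihood score."""
    return sum(wgt * max1[o, y + dy, x + dx] for o, dy, dx, wgt in elements)

# Toy data: 4 orientations on a 9x9 grid, with one strong response that is
# one pixel and one orientation step away from where the template expects it.
sum1 = np.zeros((4, 9, 9))
sum1[1, 4, 5] = 2.0
template = [(0, 0, 0, 1.0)]  # single element: orientation 0 at the center

max1 = max1_maps(sum1)
rigid_score = sum2_score(sum1, template, 4, 4)     # no perturbation allowed
deformed_score = sum2_score(max1, template, 4, 4)  # element may shift
```

In this toy case the rigid template misses the nearby edge response, while the deformed template picks it up through the max map, which is the role the abstract assigns to the max maps.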



Author information

Authors and Affiliations

Department of Statistics, University of California, Los Angeles, USA
Ying Nian Wu, Zhangzhang Si, Haifeng Gong & Song-Chun Zhu
Lotus Hill Research Institute, Ezhou, China
Haifeng Gong & Song-Chun Zhu

Corresponding author

Correspondence to Ying Nian Wu.

Rights and permissions

Open Access This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License (https://creativecommons.org/licenses/by-nc/2.0), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.


About this article

Cite this article

Wu, Y.N., Si, Z., Gong, H. et al. Learning Active Basis Model for Object Detection and Recognition. Int J Comput Vis 90, 198–235 (2010). https://doi.org/10.1007/s11263-009-0287-0


Received: 28 March 2008
Accepted: 03 August 2009
Published: 26 August 2009
Issue Date: November 2010
DOI: https://doi.org/10.1007/s11263-009-0287-0
