Simple method for high-performance digit recognition based on sparse coding

Kai Labusch; Erhardt Barth; Thomas Martinetz

doi:10.1109/TNN.2008.2005830

IEEE Trans Neural Netw. 2008 Nov;19(11):1985-9. doi: 10.1109/TNN.2008.2005830.

Authors

Kai Labusch¹, Erhardt Barth, Thomas Martinetz

Affiliation

¹ Neuro- and Bioinformatics, University of Lübeck, D-23538 Lübeck, Germany.

PMID: 19000969

Abstract

In this brief paper, we propose a method of feature extraction for digit recognition that is inspired by vision research: a sparse-coding strategy and a local maximum operation. We show that our method, despite its simplicity, yields state-of-the-art classification results on a highly competitive digit-recognition benchmark. We first employ the unsupervised Sparsenet algorithm to learn a basis for representing patches of handwritten digit images. We then use this basis to extract local coefficients. In a second step, we apply a local maximum operation to implement local shift invariance. Finally, we train a support vector machine (SVM) on the resulting feature vectors and obtain state-of-the-art classification performance in the digit recognition task defined by the MNIST benchmark. We compare the different classification performances obtained with sparse coding, Gabor wavelets, and principal component analysis (PCA). We conclude that the learning of a sparse representation of local image patches combined with a local maximum operation for feature extraction can significantly improve recognition performance.
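
The feature-extraction steps described in the abstract (local coefficients from a basis, followed by a local maximum operation) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the random `basis` is only a stand-in for a basis learned with Sparsenet, and the patch size, pooling size, the use of absolute linear projections as coefficients, and all function names are assumptions made for illustration. The final SVM training step is omitted.

```python
import numpy as np


def extract_patches(image, patch_size):
    """Collect every overlapping patch_size x patch_size patch, flattened."""
    h, w = image.shape
    out_h, out_w = h - patch_size + 1, w - patch_size + 1
    patches = np.empty((out_h, out_w, patch_size * patch_size))
    for i in range(out_h):
        for j in range(out_w):
            patches[i, j] = image[i:i + patch_size, j:j + patch_size].ravel()
    return patches


def local_coefficients(image, basis, patch_size):
    """Project each patch onto each basis function.

    Assumption: absolute linear projections stand in for the
    coefficients; the paper may compute them differently.
    """
    patches = extract_patches(image, patch_size)
    return np.abs(patches @ basis.T)  # shape: (out_h, out_w, n_basis)


def local_max(coeffs, pool):
    """Take the maximum over non-overlapping pool x pool regions,
    separately for each basis function. Small shifts of a response
    inside one region leave the output unchanged."""
    h, w, k = coeffs.shape
    h_trim, w_trim = h - h % pool, w - w % pool  # drop ragged borders
    c = coeffs[:h_trim, :w_trim]
    c = c.reshape(h_trim // pool, pool, w_trim // pool, pool, k)
    return c.max(axis=(1, 3))


def features(image, basis, patch_size=13, pool=4):
    """Feature vector for one image: coefficients, then local maximum."""
    return local_max(local_coefficients(image, basis, patch_size), pool).ravel()


rng = np.random.default_rng(0)
basis = rng.standard_normal((8, 13 * 13))  # placeholder, NOT a learned basis
image = rng.random((28, 28))               # stand-in for a 28 x 28 digit image
vec = features(image, basis)
```

In a full pipeline, vectors like `vec` would be computed for every training image and passed to an SVM classifier, as the abstract describes.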

MeSH terms

Algorithms*
Artificial Intelligence*
Electronic Data Processing / methods*
Handwriting*
Image Enhancement / methods
Image Interpretation, Computer-Assisted / methods*
Information Storage and Retrieval / methods*
Pattern Recognition, Automated / methods*
Reproducibility of Results
Sensitivity and Specificity