The ear has emerged as a biometric trait for recognizing humans from their profile faces. Its stability over the years, noninvasive capture process, expressionless appearance, and significant shape variation among individuals make the ear a suitable choice compared with other biometrics. The ability of a convolutional neural network (CNN) to learn discriminative features irrespective of image variation makes it better suited to ear recognition than local geometrical feature extraction methods. Factors that affect recognition performance, such as occlusion, image resolution, and head rotation during ear acquisition in an uncontrolled environment, make the task challenging, and reliance on spatial features alone limits a CNN's recognition rate. We explored various CNN configurations and propose a spectral–spatial feature-based CNN. An embedding algorithm is proposed to fuse multilevel spectral information from the image domain with the spatial features extracted at each deep layer of the CNN. The poor convergence of the CNN is addressed by introducing batch normalization after every convolution. Analysis of the results shows that the proposed framework is superior among all evaluated CNNs, achieving 98.15% and 78.88% accuracy on the Annotated Web Ears dataset (1000 images) and earVN1.0 (28,412 images), respectively. In addition to fine-tuning and configuring the layers of a CNN, fusing the missing spectral features with the spatial features can strengthen the CNN.
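The abstract describes fusing multilevel spectral information with spatial feature maps and normalizing after every convolution, but does not give the embedding algorithm itself. The sketch below is only an illustrative assumption of how such a fusion might look: it uses a single-level Haar wavelet transform as the spectral source (the keywords mention wavelets), a stand-in convolutional feature map, and channel-wise concatenation; the function names and the fusion-by-concatenation choice are hypothetical, not taken from the paper.

```python
import numpy as np

def haar_dwt2(img):
    """Single-level 2D Haar transform: returns (LL, LH, HL, HH) subbands.
    Assumes even height and width. Each subband is half-resolution."""
    a = (img[0::2, :] + img[1::2, :]) / 2.0   # row averages
    d = (img[0::2, :] - img[1::2, :]) / 2.0   # row details
    LL = (a[:, 0::2] + a[:, 1::2]) / 2.0
    LH = (a[:, 0::2] - a[:, 1::2]) / 2.0
    HL = (d[:, 0::2] + d[:, 1::2]) / 2.0
    HH = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return LL, LH, HL, HH

def batch_norm(x, eps=1e-5):
    """Per-channel normalization, mimicking the batch normalization the
    paper inserts after every convolution (no learned scale/shift here)."""
    mean = x.mean(axis=(1, 2), keepdims=True)
    var = x.var(axis=(1, 2), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def fuse_spectral_spatial(spatial_feats, img):
    """Hypothetical embedding step: stack the wavelet subbands of the
    input image with a spatial feature map of matching resolution."""
    subbands = np.stack(haar_dwt2(img))            # (4, H/2, W/2)
    assert subbands.shape[1:] == spatial_feats.shape[1:]
    return np.concatenate([batch_norm(spatial_feats), subbands], axis=0)

# Toy usage: a 32x32 "image" and a pretend conv output at half resolution.
img = np.random.rand(32, 32)
feats = np.random.rand(8, 16, 16)
fused = fuse_spectral_spatial(feats, img)
print(fused.shape)   # (12, 16, 16)
```

In a real network this concatenation would feed the next convolutional layer, and the wavelet decomposition would be repeated at each scale so every deep layer receives spectral subbands matching its spatial resolution.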
Keywords: Ear, Wavelets, Convolutional neural networks, Convolution, Biometrics, Data modeling, Image fusion