Icicct 2017 7975203
(ICICCT 2017)
Abstract— Optical Character Recognition (OCR) is the process of converting an input text image into a machine-encoded format. Different methods are used in OCR for different languages. The main steps of optical character recognition are pre-processing, segmentation and recognition. Recognizing handwritten text is harder than recognizing printed text. Convolutional Neural Networks (CNNs) have shown remarkable improvement in recognizing characters of other languages, but CNNs have not yet been implemented for Malayalam handwritten characters. The proposed system uses a convolutional neural network to extract features. This differs from the conventional approach, which relies on handcrafted features for describing the text. We have tested the network against a newly constructed dataset of six Malayalam characters.
Keywords— Feature Extraction, Classification, Machine recognition, Convolutional Neural Network, CNN, Malayalam.
I. INTRODUCTION
Deep learning techniques have achieved top-class performance in pattern recognition tasks. These include image recognition [1, 2], human face recognition [3], human pose estimation [4] and character recognition [5, 6]. These deep learning techniques have been shown to outperform traditional methods for pattern recognition. Deep learning enables automation of the feature extraction task. Traditional methods involve feature engineering, which has to be done manually. This task of crafting features is time-consuming and not very efficient. The features ultimately determine the effectiveness of the system. Deep learning methods outshine traditional methods by extracting features automatically.
The Convolutional Neural Network (CNN) is a popular deep learning method and is the state of the art for image recognition. CNNs achieved a breakthrough in the ImageNet challenge of 2012. The CNN used in the challenge was AlexNet, which gave an error rate of about 16%, compared with about 25% in 2011. From then on it was CNN all the way. The CNN is very suitable for representing image structure. The properties of the CNN that make this possible are the local connectivity strategy and the weight sharing strategy [7].
Handwritten character recognition is a difficult task, as characters vary in appearance according to the writer, the writing style and noise. Researchers have been trying to increase the accuracy rate by designing better features, using different classifiers and combining different classifiers. These attempts, however, are limited when compared to CNNs. CNNs can give better accuracy rates, but they have some problems that need to be addressed.
Malayalam is one of the twenty-two scheduled languages of India and is the official language of the state of Kerala, where more than 95% of people use Malayalam for communication [10]. Malayalam characters are complex due to their curved nature, and some characters are formed by combining two characters. These, along with the presence of ‘chillu’ characters, make recognizing Malayalam characters a challenging task.
We attempt to use a CNN to achieve a better accuracy rate in Malayalam handwritten character recognition. The rest of the paper is organized as follows. Section II is the literature survey, Section III gives the methodology and Section IV is the conclusion.
II. LITERATURE SURVEY
Shailesh Acharya et al. [8] proposed Deep Learning Based Large Scale Handwritten Devanagari Character Recognition, which used a convolutional neural network to classify Devanagari handwritten characters. They employed dataset increment and a dropout layer in order to reduce overfitting. They tested two models of the network: Model A consisted of three convolution layers and one fully connected layer, and Model B was a shallow network. The highest testing accuracy was 0.98471 for Model A and 0.982681 for Model B.
Prashanth Vijayaraghavan et al. [9] proposed a handwritten character recognition system for Tamil characters using a convolutional neural network. They augmented the
ConvNetJS library with stochastic pooling, probabilistic weighted pooling and local contrast normalization for learning features, obtaining an accuracy of 94.4% on the IWFHR-10 dataset. Anitha et al. [10] proposed a Multiple Classifier System for Offline Malayalam Character Recognition. The features used are gradient and density based features. The best combination ensemble, with an accuracy of 81.82%, was reported using the Product rule combination scheme.
G Raju et al. [11] proposed a Malayalam character recognition system using gradient based features and run length count. The authors have proposed another character recognition scheme using the fusion of global and local features for the recognition of isolated Malayalam characters. The authors have also applied gradient features for the recognition of Malayalam vowels. Arora et al. [12] proposed a multiple classifier system using chain code histogram and moment invariants for Devanagari character recognition.
Rajashekararadhya S. V et al. [13] suggest an efficient approach for handwritten numeral recognition in Kannada and Tamil. Projection distance was used as a feature, and a zone based method was additionally used for accurate recognition of numerals. For training and testing, they used a Nearest Neighbor classifier.
They divided the whole image into 25 equal zones and, for each zone, computed the pixel distance for each grid column from the image centroid. If more than one pixel is found in a column of that grid, the average pixel distance is computed and stored. This process is repeated for the entire grid to obtain 250 features (10 features per zone). An average accuracy of 95% was achieved.
Giridharan. R et al. [14] propose a zoning method to retrieve information from temple epigraphy. They decomposed the image in several ways: decompositions into vertical zones, horizontal equal zones, right diagonals, left diagonals, octants, diagonal quadrants, quadrants etc. were used as zoning. A total of 54 zones were obtained. In each zone, they calculated the density of black pixels. A perceptron was used for recognition; it takes multiple inputs and produces a single output using a linear combination of the input values. Another OCR system was advanced by M Abdul Rahman and M S Rajasree [15] using wavelet transform feature extraction techniques and neural networks.
There is no OCR system that yields 100% accuracy. CNNs have not yet been implemented for handwritten character recognition in Malayalam, although they have shown a great deal of improvement in accuracy rates for other languages.
The remainder of the paper is organized as follows. Section III discusses the proposed method. Section IV concludes the paper.
III. PROPOSED METHOD
There is no standard dataset for handwritten Malayalam characters. A CNN requires a large set of training images and achieves a high accuracy rate only if it is trained with a substantially large training set. This is one of the biggest challenges, given that the time period is short. However, from the literature survey, it is clear that there are techniques that can be applied to increase the number of dataset images. The overall architecture of the proposed system is shown in Figure 1. The input is first scanned using a scanner or taken as a photograph using a smartphone. The kernel weights are initialized using a Gaussian distribution.
The proposed method consists of the following stages:
A. Pre-processing
In the pre-processing stage, the character image is processed to remove all undesirable entities, which makes recognition easier. The input images are resized to a suitable size, which must be neither too large nor too small. If the image is too large, the amount of computation required will be high. If the image is too small, it will be difficult to fit it into a large network. Larger images are cropped, and padding is applied to smaller images, to achieve a standard size. Padding is the process of adding white pixels to an image, which enlarges the background of the image.
B. Dataset Creation
There is no open source dataset available for handwritten Malayalam characters. Hence it was necessary to build a new dataset from scratch. Creating a dataset is time-consuming and requires a lot of effort. To start with, we decided to build a dataset of the first six characters of Malayalam. Characters written by 112 different people were collected. A complete Malayalam dataset is being constructed. The complete dataset will have 3 times the variety of the current dataset.
C. Dataset Augmentation
A large dataset is required for training the CNN. To attain this, the images already obtained are modified and transformed to get a large number of variations. An affine transformation is a linear mapping method that preserves points, straight lines and planes. Sets of parallel lines remain parallel after an affine transformation. Translation, scaling, shearing and rotation are the four major affine transformations. Different translations are used to augment the dataset. Gaussian smoothing is the result of blurring an image by a Gaussian function. The visual effect of this blurring technique is a smooth blur resembling that of viewing the image through a translucent screen. Salt-and-pepper noise is a form of noise sometimes seen on images. It presents itself as sparsely occurring white and black pixels. The contrast and brightness levels of an image are also changed. After data augmentation, a dataset of nearly 2 lakh (200,000) images will be obtained.
D. CNN Modelling
This is the most important step. CNN modelling means designing the structure of the CNN. The number of convolution layers, max pooling layers, ReLU layers and fully connected
layers need to be chosen. It is not possible to determine in advance the exact number of layers that will yield the best outcome. Hence it is vital to try several configurations of the network and choose the one that suits best. Size of feature map = m - n + 1 per side (for a stride of 1 and no padding), n <= m, where m is the height and width of the image and n is the height and width of the convolution kernel.
The softmax layer is used to classify the character. Each output of the softmax function has a value between 0 and 1, and the outputs over all the classes sum to 1. The class with the maximum value is selected as the class for a particular input image.
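The two quantities discussed in this subsection can be illustrated with a minimal sketch. It assumes an unpadded, stride-1 convolution, for which the output side length is m - n + 1, and a plain softmax over raw class scores. The 32x32 input, the 5x5 kernel and the six score values are hypothetical numbers chosen for illustration only; they are not the configuration used in this work.

```python
import math


def feature_map_size(m: int, n: int) -> int:
    """Side length of the output of an unpadded, stride-1 convolution.

    m is the side length of the square input image and n is the side
    length of the square convolution kernel, with n <= m.
    """
    if not 1 <= n <= m:
        raise ValueError("kernel size n must satisfy 1 <= n <= m")
    return m - n + 1


def softmax(scores):
    """Map raw class scores to values in (0, 1) that sum to 1."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical example: a 32x32 input, a 5x5 kernel, and one raw score
# for each of six character classes.
side = feature_map_size(32, 5)
probs = softmax([2.0, 1.0, 0.1, -1.0, 0.5, 0.0])

# The predicted class is the index with the largest softmax output.
predicted_class = max(range(len(probs)), key=probs.__getitem__)
print(side, predicted_class)
```

With these illustrative numbers the feature map is 28 pixels per side and class 0 is selected, since it has the largest score.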
IV. CONCLUSION
OCR has a wide variety of real-time applications. It can be used for office automation. This work implements a handwritten Malayalam character recognition system. The proposed method uses a CNN to extract features from and classify Malayalam characters. Both sample generation and CNN modelling are time-consuming tasks, and the latter also requires a CUDA-enabled GPU for parallel processing.
Pre-processing helps to remove the undesired qualities of an image and hence can play an important role in increasing the accuracy. So does the sample generation process, which reduces overfitting. The dropout layer also reduces overfitting while decreasing the overall training time. CNNs have proved to be the state-of-the-art technique for other languages and hence offer the chance of a higher accuracy rate for Malayalam characters too.
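The pre-processing and sample generation steps of Section III could be sketched roughly as follows. This is an illustrative sketch in plain Python, not the implementation used in this work. Images are assumed to be 8-bit grayscale, stored as lists of rows, with a white background. Centre cropping, the shift offsets of 2 pixels, the 5% noise level and the contrast and brightness values are all assumptions made for the example. Gaussian smoothing is omitted to keep the sketch short.

```python
import random

WHITE, BLACK = 255, 0  # assumed 8-bit grayscale: white background, black ink


def fit_to_size(img, size, fill=WHITE):
    """Centre-crop larger images and pad smaller ones with white pixels."""
    h, w = len(img), len(img[0])
    top, left = (h - size) // 2, (w - size) // 2  # negative when padding
    out = [[fill] * size for _ in range(size)]
    for y in range(size):
        for x in range(size):
            sy, sx = y + top, x + left
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = img[sy][sx]
    return out


def translate(img, dx, dy, fill=WHITE):
    """Shift the image by (dx, dy) pixels; exposed pixels become `fill`."""
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = img[y][x]
    return out


def salt_and_pepper(img, amount, rng):
    """Set roughly a fraction `amount` of pixels to pure white or black."""
    out = [row[:] for row in img]
    for row in out:
        for x in range(len(row)):
            if rng.random() < amount:
                row[x] = rng.choice((WHITE, BLACK))
    return out


def adjust(img, contrast=1.0, brightness=0):
    """Apply contrast * pixel + brightness, clipped to the range 0..255."""
    return [[min(255, max(0, round(contrast * p + brightness))) for p in row]
            for row in img]


def augment(img, seed=0):
    """Return shifted, noisy and contrast-adjusted variants of one image."""
    rng = random.Random(seed)
    variants = [translate(img, dx, dy)
                for dx in (-2, 0, 2) for dy in (-2, 0, 2)
                if (dx, dy) != (0, 0)]
    variants.append(salt_and_pepper(img, 0.05, rng))
    variants.append(adjust(img, contrast=1.2, brightness=10))
    return variants


# Toy 5x5 "character": a vertical black stroke on a white background.
stroke = [[BLACK if x == 2 else WHITE for x in range(5)] for _ in range(5)]
sample = fit_to_size(stroke, 8)   # padded with white to 8x8
variants = augment(sample)        # 8 translations + 1 noisy + 1 adjusted
print(len(sample), len(sample[0]), len(variants))
```

Each collected image yields ten variants in this sketch. A larger set of offsets, noise levels and contrast settings would be needed to approach the dataset size mentioned in Section III.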
References
on Computer Vision and Pattern Recognition, pages 1701–1708. IEEE, 2014.
[4] J. J. Tompson, A. Jain, Y. LeCun, and C. Bregler, “Joint training of a convolutional network and a graphical model for human pose estimation”, in Advances in Neural Information Processing Systems, pages 1799–1807, 2014.
[5] P. Y. Simard, D. Steinkraus, and J. C. Platt, “Best practices for convolutional neural networks applied to visual document analysis”, in 2013 12th International Conference on Document Analysis and Recognition, volume 2, pages 958–958. IEEE Computer Society, 2003.
[6] D. C. Ciresan, U. Meier, L. M. Gambardella, and J. Schmidhuber, “Convolutional Neural Network Committees for Handwritten Character Classification”, pages 1135–1139. IEEE, Sept. 2011.
[7] Li Chen, Song Wang, Wei Fan, Jun Sun, Satoshi Naoi, “Beyond Human Recognition: A CNN-Based Framework for Handwritten Character Recognition”, 3rd IAPR Asian Conference on Pattern Recognition, 2015.
[8] I. J. Goodfellow, Y. Bulatov, J. Ibarz, S. Arnoud, and V. Shet, “Multi-digit number recognition from street view imagery using deep convolutional neural networks”, arXiv preprint arXiv:1312.6082, 2013.
[9] Shailesh Acharya, Ashok Kumar Pant, Prashnna Kumar Gyawali, “Deep Learning Based Large Scale Handwritten Devanagari Character Recognition”, 9th International Conference on Software, Knowledge, Information Management and Applications (SKIMA), 2015.
[10] Prashanth Vijayaraghavan and Misha Sra, “Handwritten Tamil Recognition using a Convolutional Neural Network”, MIT Media Lab.
[11] Anitha Mary M.O. Chacko, Dhanya P.M, “Multiple Classifier System for Offline Malayalam Character Recognition”, International Conference on Information and Communication Technologies (ICICT), 2014.
[12] Raju, Bindu S Moni, Madhu S. Nair, “A Novel Handwritten Character Recognition System Using Gradient Based Features and Run Length Count”, Sadhana Indian Academy of Sciences, Springer India, 2014, p. 1-23.
[13] Arora Sandhya, Bhattacharjee Debotosh, Nasipuri Mita, Basu Dipak Kumar and Kundu Mahantapas, “Combining multiple feature extraction techniques for handwritten Devnagari character recognition”, Third International Conference on Industrial and Information Systems ICIIS, 2008, p. 1-6.
[14] Abdul Rahiman M and Rajasree M S, “An Efficient Character Recognition System for Handwritten Malayalam Characters Based on Intensity Variations”, International Journal of Computer Theory and Engineering, Vol. 3, No. 3, June 2011.
[15] Panyam Narahari Sastry, T.R. Vijaya Lakshmi, N.V. Koteswara Rao, T.V. Rajinikanth, Abdul Wahab, “Telugu handwritten character recognition using zoning features”, IEEE, 2014.
[16] Manoj Kumar Mahto, Karamjit Bhatia, R. K. Sharma, “Combined horizontal and vertical projection feature extraction technique for Gurmukhi handwritten character recognition”, International Conference on Advances in Computer Engineering and Applications (ICACEA), 2015.
[17] Manju Manuel and Saidas S. R, “Handwritten Malayalam Character Recognition using Curvelet Transform and ANN”, International Journal of Computer Applications (0975 8887), Volume 121, No. 6, July 2015.
[18] Yann LeCun, Leon Bottou, Yoshua Bengio and Patrick Haffner, “Gradient-Based Learning Applied to Document Recognition”, Proc. of the IEEE, November 1998.