Sat - 23.Pdf - Handwritten Hindi Character Recognition Using CNN


ABSTRACT

The goal of this study is to employ a Convolutional Neural Network (CNN)-based
Deep Learning model to recognize handwritten Hindi characters. The identified
characters can then be saved digitally or used for other purposes. The data
comes from a Kaggle dataset of 92,000 images, split into two sets: training
(80 percent) and test (20 percent). It comprises a variety of handwritten
Devanagari characters produced by various people, and can be used to train and
test handwritten text recognizers as well as to conduct writer identification
and verification tests. The model is built with the Keras libraries on top of
a TensorFlow backend. For recognition, it has four CNN layers followed by
three fully connected layers. Grayscale images of handwritten characters are
used as input. At each layer, filters extract different features from the
images by means of the convolution operation. Pooling and flattening are the
other two major operations involved. The output of the CNN layers is passed to
the fully connected layers. Finally, a probability score is calculated for
each character, and the character with the highest score is displayed as the
output. The model achieves a recognition accuracy of 92.4%. Similar models
exist for this purpose, but the new model outperformed some of the older
models in terms of accuracy.

TABLE OF CONTENTS

ABSTRACT 5

TABLE OF CONTENTS 6

LIST OF FIGURES 7

LIST OF TABLES 7

ABBREVIATIONS 8

REFERENCES 9

KEYWORDS 10

1. Introduction 11
1.1. Introduction
1.2. Aim and Objective
1.3. Motivation
1.4. Software Requirements Specification
2. Literature Review 14
2.1. Different approaches used
2.2. Comparison of Existing methods with merits and demerits
3. Methodology 16
3.1. Models Used
3.2. Architecture diagram
4. Implementation 19

4.1. Data collection and analysis
4.2. Data Preprocessing
4.3. Tokenization
4.4. Feature Extraction
4.5. Splitting the dataset
4.6. Model Training and verification
5. Libraries Used 22
5.1. Numpy
5.2. Pandas
5.3. Sklearn
6. Coding and Testing 25
7. Conclusion and Enhancement 28

LIST OF FIGURES

● Architecture Diagram 18

LIST OF TABLES

● Literature Review 14
● Accuracy Table 28

ABBREVIATIONS

ML – Machine Learning
np – NumPy
pd – Pandas
CNN – Convolutional Neural Network


KEYWORDS

Handwritten Hindi Characters;
Devanagari;
Keras;
Convolutional Neural Network;

1. INTRODUCTION

1.1 INTRODUCTION:

Handwritten text recognition technology is extremely useful today. Physical
data is hard to store and retrieve, and it is at risk of damage or loss.
Producing physical records is also prone to errors. If the text is in a digital
format, however, error-checking methods and autocorrect tools can help store
the data correctly and efficiently. One of the most significant benefits of
digitally preserving text is that it can be retrieved from any location; you do
not need to be present where the data is stored. Data storage, access, and
analysis all become considerably easier as a result.

Handwritten text recognition is also one of the most difficult tasks in the
field of pattern recognition. A central problem in optical character
recognition is character classification, which is challenging because many
characters look alike. A character classifier is first trained on Hindi
characters and can then identify characters written by different users. Hindi
is one of the foremost of India's major national languages. It is spoken, read,
and written by many people across North India.

Character recognition can be online or offline. An online character
recognition system represents handwriting as the two-dimensional coordinates
of successive pen points and converts the text into digital form automatically
as it is written. Offline character recognition converts already written text
into letter codes.

Because a deep neural network has many hidden layers, it has a very large
number of trainable parameters, so a large set of examples is required to
prevent overfitting. One class of deep learning models is the Convolutional
Neural Network, a special type of neural network that is effective for image
recognition and classification. We use these techniques to recognize
handwritten Hindi characters.

1.2 AIM AND OBJECTIVES

• The aim of a handwritten character recognition system is to convert
handwritten characters into machine-readable formats.
• The main objective of this work is to provide effective and reliable
approaches for handwritten character recognition and to make handwritten
text easier for machines to read.
• Handwritten character recognition receives and interprets handwritten
input in the form of pictures or paper documents.

1.3 MOTIVATION

Handwritten character recognition has been around since the 1980s. The task of
handwritten character recognition using a classifier has many important uses,
such as online handwriting recognition on computer tablets, recognizing zip
codes for postal mail sorting, processing bank check amounts, and reading
numeric entries in hand-filled forms such as tax forms and bank forms. The
general problem we expected to face in this digit classification task was the
similarity between digits such as 1 and 7, 5 and 6, 3 and 8, and 9 and 8.
People also write the same digit in many different ways; the digits 1 and 7,
for example, each have several common written forms. Finally, the uniqueness
and variety of individual handwriting also influence the formation and
appearance of the digits.

1.4 SOFTWARE AND HARDWARE REQUIREMENTS SPECIFICATION:

• Any modern Intel or AMD processor
• 4 GB RAM
• 50 GB or more storage
• Stable Internet Connection
• OS – Windows 7 or Above
• Python 3.7
• Anaconda Navigator
2. LITERATURE REVIEW

2.1 DIFFERENT APPROACHES USED:

3. METHODOLOGY

3.1 MODELS USED:

● CONVOLUTIONAL NEURAL NETWORK


A Convolutional Neural Network (ConvNet) is a Deep Learning algorithm that can
take in an input image, assign importance to various aspects/objects in the
image, and differentiate one from the other. A ConvNet requires much less
pre-processing than other classification algorithms. While in primitive
methods filters are hand-engineered, ConvNets can learn these
filters/characteristics given enough training.

1. Convolutional Layer
This is the first layer used to extract features from the input images. In this
layer, the mathematical operation of convolution is performed between the input
image and a filter of a particular size MxM. The filter slides over the input
image, and at each position the dot product is taken between the filter and the
MxM part of the image it covers. The output is termed the feature map, which
gives information about the image such as its corners and edges. This feature
map is then fed to later layers, which learn further features of the input
image.
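
The sliding dot product described above can be sketched in a few lines of plain
Python. This is only an illustration, not the project's Keras code: the
function name convolve2d, the 4x4 image, and the 2x2 edge filter are all
hypothetical, and the sketch assumes a stride of 1 with no padding. Like most
CNN libraries, it does not flip the filter before sliding it.

```python
from typing import List

Matrix = List[List[float]]


def convolve2d(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide an M x M kernel over the image (stride 1, no padding)."""
    m = len(kernel)
    out_rows = len(image) - m + 1
    out_cols = len(image[0]) - m + 1
    feature_map = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # Dot product between the kernel and the M x M patch at (r, c).
            total = 0.0
            for i in range(m):
                for j in range(m):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Hypothetical 4x4 grayscale patch: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hypothetical 2x2 filter that responds to a dark-to-bright vertical edge.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, kernel)
for row in feature_map:
    print(row)  # each row is [0.0, 2.0, 0.0]: the edge sits in the middle
```

The 4x4 input shrinks to a 3x3 feature map, and only the middle column is
non-zero, which is where the filter overlaps the vertical edge.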

2. Pooling Layer
In most cases, a Convolutional Layer is followed by a Pooling Layer. The primary
aim of this layer is to decrease the size of the convolved feature map and so
reduce computational cost. It does this by reducing the connections between
layers, and it operates on each feature map independently. Depending on the
method used, there are several types of pooling operations. In Max Pooling, the
largest element is taken from each predefined section of the feature map.
Average Pooling calculates the average of the elements in each predefined
section. Sum Pooling computes the total sum of the elements in each predefined
section. The Pooling Layer usually serves as a bridge between the Convolutional
Layer and the FC Layer.
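
The three pooling variants differ only in how each section is reduced to a
single value, as the following plain-Python sketch may help show. The helper
names pool2d and average and the 4x4 feature map are hypothetical, and the
sketch assumes non-overlapping 2x2 sections, which is a common choice but is
not stated in this report.

```python
from typing import Callable, List, Sequence

Matrix = List[List[float]]


def pool2d(feature_map: Matrix, size: int,
           reduce: Callable[[Sequence[float]], float]) -> Matrix:
    """Apply `reduce` to each non-overlapping size x size section."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    pooled = []
    for r in range(rows):
        row = []
        for c in range(cols):
            # Collect the elements of the section whose top-left corner
            # is at (r * size, c * size).
            window = [
                feature_map[r * size + i][c * size + j]
                for i in range(size)
                for j in range(size)
            ]
            row.append(reduce(window))
        pooled.append(row)
    return pooled


def average(window: Sequence[float]) -> float:
    return sum(window) / len(window)


# Hypothetical 4x4 feature map.
fmap = [
    [1, 3, 2, 0],
    [5, 7, 4, 6],
    [9, 8, 1, 2],
    [3, 4, 5, 6],
]

max_pooled = pool2d(fmap, 2, max)      # [[7, 6], [9, 6]]
avg_pooled = pool2d(fmap, 2, average)  # [[4.0, 3.0], [6.0, 3.5]]
sum_pooled = pool2d(fmap, 2, sum)      # [[16, 12], [24, 14]]
```

In each case the 4x4 map is reduced to 2x2, so the next layer receives a
quarter as many values.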

3. Fully Connected Layer


The Fully Connected (FC) layer consists of weights and biases along with
neurons, and is used to connect the neurons between two different layers. These
layers are usually placed before the output layer and form the last few layers
of a CNN architecture. Here, the feature maps output by the previous layers are
flattened and fed to the FC layer. The flattened vector then passes through a
few more FC layers where
