A PROJECT REPORT
ON
“HANDWRITING RECOGNITION USING MACHINE LEARNING”
Submitted in partial fulfillment of the requirements for the award of degree of
BACHELOR OF ENGINEERING
in
ELECTRONICS AND COMMUNICATION ENGINEERING
By
G Taniya Sri 1BG16EC037
Meghana B N 1BG17EC407
Neha P 1BG17EC411
Shree Raksha B 1BG17EC417
B.N.M. Institute of Technology
An Autonomous Institution under VTU, Approved by AICTE, Accredited as Grade A Institution by NAAC.
All Eligible UG branches - CSE, ECE, EEE, ISE & Mech. Engg. - Accredited by NBA for academic years 2018-19 to 2021-22 & valid up to 30.06.2022
Post box no. 7087, 27th cross, 12th Main, Banashankari 2nd Stage, Bengaluru- 560070, INDIA
Ph: 91-80- 26711780/81/82 Email: [email protected], www.bnmit.org
CERTIFICATE
Certified that the project work entitled “HANDWRITING RECOGNITION USING MACHINE LEARNING” carried out by G Taniya Sri (1BG16EC037), Meghana B N (1BG17EC407), Neha P (1BG17EC411) and Shree Raksha B (1BG17EC417), bona fide students of VIII semester, is in partial fulfillment for the award of the Bachelor of Engineering degree in Electronics and Communication Engineering of the Visvesvaraya Technological University, Belagavi, during the year 2020-2021. It is certified that all corrections/suggestions indicated for internal assessment have been incorporated in the report deposited in the department library. The project report has been approved as it satisfies the academic requirements in respect of project work prescribed for the said degree.
External Viva:
ACKNOWLEDGEMENT
We would like to place on record our sincere thanks and gratitude to all the people whose suggestions and words of encouragement have been valuable. We express our heartfelt gratitude to BNM Institute of Technology for giving us the opportunity to pursue a degree in Electronics and Communication Engineering and helping us to shape our career. We take this opportunity to thank Prof. T. J. Rama Murthy, Director; Dr. S. Y. Kulkarni, Additional Director; Prof. Eishwar N Maanay, Dean; and Dr. Krishnamurthy G. N., Principal, for their support and encouragement to pursue this project. We would like to thank Dr. P. A. Vijaya, Professor and Head, Dept. of Electronics and Communication Engineering, for her support and encouragement.
We would like to thank our guide, Mrs. Anuradha J P, Assistant Professor, Dept. of Electronics and Communication Engineering, who has been the source of inspiration throughout our project work, has provided us with useful information at every stage of our project, and was kind enough to extend her help whenever the need arose.
Finally, we are thankful to all the teaching and non-teaching staff of the Department of Electronics and Communication Engineering for their help in the successful completion of our project. Last but not least, we would like to extend our sincere gratitude to our parents and all our friends, who were a constant source of inspiration.
G Taniya Sri
Meghana B N
Neha P
Shree Raksha B
ABSTRACT
The goal of handwriting recognition is to identify input characters or images correctly and pass them on to the many automated processing systems that follow. The system can be applied to detect writings of different formats. Handwriting recognition has grown more sophisticated, covering various kinds of handwritten characters such as digits, numerals, cursive script, and symbols, in English and other languages. The automatic recognition of handwritten text can be extremely useful in many applications where large volumes of handwritten data must be processed, such as recognition of addresses and postcodes on envelopes, interpretation of amounts on bank checks, document analysis, and verification of signatures. A computer therefore needs to be able to read documents and data for ease of document processing.
Key Terms: e.g. - for example; NLP - Natural Language Processing; CNN - Convolutional Neural Network
CONTENTS
Contents
List of Figures
Chapter 1 - Introduction
1.1 Why is Handwriting Recognition Needed?
1.2 Motivation
1.3 Problem Statement
1.4 Objective
1.5 Organization of the Report
Chapter 2 - Literature Survey
2.1 Paper 1
2.2 Paper 2
2.3 Paper 3
2.4 Paper 4
2.5 Paper 5
Chapter 3 - Project Description
3.1 Steps Involved
3.2 How Do Convolutional Neural Networks Work
3.2.1 Convolution of an Image
3.3 Market Potential
Chapter 4 - Software Requirements
4.1 Python 3
4.2 TensorFlow
4.3 NumPy
4.4 OpenCV
4.5 Applications
4.6 Functional Requirements
4.7 Non-Functional Requirements
Chapter 5 - Software Description
5.1 UML Diagram
5.2 System Testing
5.3 Software Code
5.4 Future Enhancement
Chapter 6 - Conclusion
References
LIST OF FIGURES
Chapter 1: INTRODUCTION
Neural networks are learning models used in machine learning. Their aim is to
simulate the learning process that occurs in an animal or human neural system.
Being one of the most powerful learning models, they are useful in automation
of tasks where the decision of a human being takes too long or is imprecise. A
neural network can be very fast at delivering results and may detect connections
between instances of data that a human cannot see.
1.1 AIM
The main aim of our project is to develop a handwritten character recognition
system. This project seeks to classify individual handwritten words so that
handwritten text can be translated to a digital form.
1.3 MOTIVATION
Several motivations led to the design of the proposed model. One is leveraging research
experience from other fields: a lot of research and good results have been reported with
convolutional neural networks on images (e.g. for object recognition) and LSTMs
on language (e.g. speech recognition, machine translation). Since the inputs of
our system are images and the outputs are sentences, it makes sense to use the
presented architecture.
A human can easily solve and recognize any problem, but the same is not true in
the case of a machine. Many techniques or methods must be implemented for a
machine to work like a human.
1.4 PROBLEM STATEMENT
• Despite the abundance of technological writing tools, many people still choose to
take their notes traditionally: with pen and paper. However, there are drawbacks
to handwriting text. It is difficult to store and access physical documents
efficiently, to search through them, and to share them with others.
• Thus, a lot of important knowledge gets lost or does not get reviewed
because documents never get transferred to digital format.
• We have thus decided to tackle this problem in our project because we
believe the significantly greater ease of management of digital text
compared to written text will help people more effectively access, search,
share, and analyze their records, while still allowing them to use their
preferred writing method.
• The purpose of the project is to take handwritten characters as input,
process the characters, and train the neural network effectively using the
algorithm to recognize the patterns. To classify, we need to use the best
template to compare with the segmented image and to determine how the
template will be used to compare with the image.
• Given a handwritten character, the system needs to predict the type of the
character. In other words, if we write the character “A”, the system
predicts whether it is truly “A”, whether the input character is nearer to
“A”, or whether it is something else.
1.5 ORGANIZATION OF THE REPORT
Chapter-1: Introduction - This chapter describes the project undertaken, the motivation
which was the cause for taking up this project, the problem statement, and the objective of the project.
Chapter-2: Literature Survey - This chapter summarizes the five chosen papers
which are relevant to the project.
Chapter-3: Project Description - This chapter gives the description of our
project.
Chapter-4: Software Requirements - This chapter describes the software used in
our project.
Chapter-5: Software Description - This chapter covers the testing of the system, the code,
and future enhancements.
Chapter-6: Conclusion - This chapter summarizes the information covered so
far. References are also listed towards the end of the report.
Chapter 2: LITERATURE SURVEY
2.2 PAPER 2:
Megha Agarwal, Shalika, Vinam Tomar, Priyanka Gupta:
“Handwritten Character Recognition using Neural Network and Tensor Flow”,
International Journal of Innovative Technology and Exploring Engineering (IJITEE),
ISSN: 2278-3075, Volume-8, Issue-6S4, April 2019.
The paper describes the best approach to get more than 90% accuracy in the
field of Handwritten Character Recognition (HCR). Plenty of research has been
done in the field of HCR, but it is still an open problem as we are still
lacking the best accuracy. In this paper, offline handwritten character
recognition is done using a convolutional neural network and TensorFlow.
A method called Softmax Regression is used for assigning probabilities to a
handwritten character being one of several characters, as it gives values
between 0 and 1 that sum up to 1. The purpose is to develop software with a
very high accuracy rate, minimal time and space complexity, and optimal
performance. Keywords: Handwritten Character Recognition, Convolutional Neural
Network, TensorFlow, Softmax Regression. Understanding handwritten
characters or typed documents is simple for human beings, as we have the
ability to learn. The same ability can be induced in machines by the use of
Machine Learning and Artificial Intelligence.
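The softmax step described above, which maps raw class scores to probabilities between 0 and 1 that sum to 1, can be sketched in NumPy (an illustrative sketch, not the paper's code; the scores are assumed values):

```python
import numpy as np

def softmax(scores):
    """Map raw class scores to probabilities in (0, 1) that sum to 1."""
    shifted = scores - np.max(scores)   # subtract the max for numerical stability
    exp_scores = np.exp(shifted)
    return exp_scores / np.sum(exp_scores)

# Hypothetical raw scores for three candidate characters, e.g. "A", "B", "C"
scores = np.array([2.0, 1.0, 0.1])
probs = softmax(scores)
print(probs)        # each value lies between 0 and 1
print(probs.sum())  # the probabilities sum to 1
```

The character with the highest probability would then be chosen as the recognized class.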
The field which deals with this problem is called OCR, or Optical Character
Recognition. It is an area of study spanning various fields such as pattern
recognition, image vision, and AI. It is a system for changing electronic and
image text into digital characters that can be read by machines. The time spent
entering data and the storage space required by documents can be greatly
reduced by the use of OCR; in other words, data can be retrieved fast.
By using OCR in the banking field, legal scenarios, etc., many important and sensitive
documents can be processed faster without human intervention. OCR can be divided
into two categories based on the type of text and the mode of document acquisition
(Figure 1): handwritten text recognition and PCR (Printed Character Recognition).
2.3 PAPER 3:
Karishma Verma, Dr. Manjeet Singh:
“A Literature Survey on Handwritten Character Recognition using Convolution Neural
Network”, International Journal of Management, Technology And Engineering,
ISSN: 2249-7455, Volume 8, Issue VII, July 2018.
The convolution step: Convolutional neural networks derive their name from the
“convolution” operator. The primary purpose of this operator in CNNs is
to extract features from the input image. This is the first layer in a CNN; the input
to this layer is a 3D array (32*32*3) of pixel values. Convolution is a mathematical
operation that merges two sets of information: in our case, the first is the input image
and the second is the filter.
On the left side we have the input image; the convolution filter, also called the
kernel, detector, or feature, is located in the center; and on the right side of figure
2.3 is the output of the convolution process, called the activation map (also called the
feature map or convolved feature). We perform the convolution operation by sliding this
filter over the input image. At every location we do an element-wise matrix
multiplication and sum the result; this resultant matrix is called the convolved
feature. The CNN learns the values of these filters on its own during the training process.
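The sliding multiply-and-sum operation described above can be sketched in plain NumPy (a minimal “valid” convolution with no padding; the image and kernel values are illustrative assumptions):

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide the kernel over the image; at each location take the element-wise
    product with the covered patch and sum it (a 'valid' convolution)."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out

image = np.array([[1, 1, 1, 0],
                  [0, 1, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]], dtype=float)
kernel = np.array([[1, 0],
                   [0, 1]], dtype=float)   # a hypothetical 2x2 filter
feature_map = convolve2d(image, kernel)
print(feature_map.shape)   # (3, 3)
```

In a real CNN the filter values are not hand-picked as here; they are learned during training.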
2.4 PAPER 4:
Tsehay Admassu Assegie, Pramod Sekharan Nair:
2.5 PAPER5:
KarishmaTyagi ,VedantRastogi: ”Survey on Character Recognition using OCR
Techniques”, International Journal of Engineering Research &Technology
(IJERT) Vol. 3 Issue 2, February – 2017 ISSN:2278-0181.
In this paper, several techniques such as OCR using the correlation method and OCR using
neural networks have been discussed. OCR has three processing steps: document scanning,
recognition, and verification. In the document scanning step, a scanner is used to scan
the handwritten or printed documents, so a scanner with high speed and good color
quality is desirable.
The basic idea of the BP (back-propagation) algorithm is that the learning process is divided into two phases:
PHASE I: Forward Propagation.
PHASE II: Back Propagation.
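The two phases can be illustrated with a small NumPy sketch that trains a hypothetical 2-4-1 network on the XOR problem (the network size, learning rate, and data are illustrative assumptions, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical input/target pairs (the XOR problem) for supervised learning.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Randomly initialized weights for a 2-4-1 network.
W1 = rng.normal(size=(2, 4)); b1 = np.zeros(4)
W2 = rng.normal(size=(4, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
losses = []
for _ in range(5000):
    # PHASE I: Forward propagation - compute the activations layer by layer.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    losses.append(np.mean((out - y) ** 2))
    # PHASE II: Back propagation - propagate the error backwards and
    # update the weights by gradient descent.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(axis=0)

print(losses[0], losses[-1])   # training error at the start and at the end
```

Each pass through the loop is one forward/backward cycle over the whole batch of input/target pairs.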
Typically, many such input/target pairs are used in this supervised learning to
train a network. An artificial neural network is used as the backend for
performing classification and recognition tasks. An algorithm that performs
handwriting recognition can acquire and detect characteristics from pictures or
touch-screen devices and convert them to a machine-readable form. There are
two basic types of handwriting recognition systems: online and offline. Both
types can be implemented in applications that progressively learn based on the
user’s feedback while performing offline learning on data in parallel.
Several approaches have been used in the online and offline handwriting recognition
fields, such as statistical methods, structural methods, neural networks and
syntactic methods.
Chapter 3: PROJECT DESCRIPTION
The NN used for our task consists of convolutional NN (CNN) layers, recurrent NN (RNN) layers, and a final Connectionist Temporal Classification (CTC) layer.
STEP 1: The input is a gray-value image. First, we copy the image into a
(white) target image of size 128×32. Finally, we normalize the gray values of
the image, which simplifies the task for the neural network.
Figure 3.2: Gray-value image simplifies the task for the NN.
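STEP 1 can be sketched in NumPy as follows (a minimal sketch under the 128×32 target size stated above; the handling of oversized images is an assumption, since the report does not specify it):

```python
import numpy as np

def preprocess(img, target_w=128, target_h=32):
    """Copy a gray-value word image into a white 128x32 target image and
    normalize its gray values to zero mean and unit variance, as in STEP 1.
    Oversized images are simply clipped here; a real pipeline would rescale."""
    img = img[:target_h, :target_w]
    h, w = img.shape
    target = np.full((target_h, target_w), 255.0)   # white canvas
    target[:h, :w] = img
    # normalizing the gray values simplifies the task for the neural network
    std = target.std()
    return (target - target.mean()) / std if std > 0 else target - target.mean()

word = np.zeros((20, 60))     # a hypothetical dark 20x60 word crop
batch_ready = preprocess(word)
print(batch_ready.shape)      # (32, 128)
```

The normalized image is then ready to be fed into the CNN layers of STEP 2.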
STEP 2: CNN (convolutional neural network): The input image is fed into the
CNN layers. These layers are trained to extract relevant features from the image.
CNN output: The figure shows the output of the CNN layers, which is a sequence of length 32.
Each entry contains 256 features. These features are further processed by the
RNN layers; however, some features already show a high correlation with certain high-
level properties of the input image: there are features which have a high correlation with
characters (e.g. “e”), with duplicate characters (e.g. “t”), or with character properties
such as loops (as contained in handwritten “l”s or “e”s).
STEP 3: RNN (recurrent neural network): The feature sequence contains 256 features
per time-step; the RNN propagates relevant information through this sequence.
Figure 4.2.3 shows a visualization of the RNN output matrix for an image
containing the text “little”. The matrix shown in the top-most graph contains the
scores for the characters, including the CTC blank label as its last (80th) entry.
The other matrix entries, from top to bottom, correspond to the following
characters: !"#&'()*+,/0123456789:ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz.
It can be seen that most of the time the characters are predicted exactly at the
position where they appear in the image (e.g. compare the position of the “i” in the
image and in the graph). Only the last character, “e”, is not aligned. But this is OK,
as the CTC operation is segmentation-free and does not care about absolute
positions. From the bottom-most graph, showing the scores for the
characters “l”, “i”, “t”, “e” and the CTC blank label, the text “little” can easily be decoded.
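The decoding described above can be illustrated with a best-path (greedy) CTC decoder; the tiny character set and score matrix below are illustrative assumptions, not the model's actual 80-label output:

```python
import numpy as np

def ctc_best_path(mat, charset):
    """Best-path (greedy) CTC decoding: take the most likely label at each
    time-step, collapse repeated labels, then drop the blank label."""
    blank = len(charset)             # blank is the last label index
    best = np.argmax(mat, axis=1)    # most likely label per time-step
    decoded, prev = [], None
    for label in best:
        if label != prev and label != blank:
            decoded.append(charset[label])
        prev = label
    return "".join(decoded)

# Hypothetical score matrix: one row per time-step, columns for
# "e", "i", "l", "t" and the blank label (the real model has 80 labels).
charset = "eilt"
mat = np.array([
    [0.1, 0.1, 0.6, 0.1, 0.1],   # "l"
    [0.1, 0.6, 0.1, 0.1, 0.1],   # "i"
    [0.1, 0.1, 0.1, 0.6, 0.1],   # "t"
    [0.1, 0.1, 0.1, 0.1, 0.6],   # blank separates the repeated "t"
    [0.1, 0.1, 0.1, 0.6, 0.1],   # "t"
    [0.1, 0.1, 0.6, 0.1, 0.1],   # "l"
    [0.6, 0.1, 0.1, 0.1, 0.1],   # "e"
])
text = ctc_best_path(mat, charset)
print(text)   # -> "little"
```

Note how the blank label between the two “t” time-steps is what allows the duplicate character to survive the collapsing step.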
Example of CNN:
Consider the image below:
The computer understands every pixel. In this case, the white pixels are said to be
-1 while the black ones are 1. This is just the way we have implemented it to
differentiate the pixels in a basic binary classification.
Pooling Layer: In this layer we shrink the image stack into a smaller size. Pooling
is done after passing through the activation layer.
We do this by implementing the following 4 steps:
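A pooling pass of this kind can be sketched in NumPy as follows (an illustrative sketch assuming a 2×2 window and a stride of 2; the window size and input values are assumptions, not taken from the report):

```python
import numpy as np

def max_pool(feature_map, size=2, stride=2):
    """Shrink the feature map: slide a size x size window across it with
    the given stride and keep only the maximum value in each window."""
    h, w = feature_map.shape
    out_h = (h - size) // stride + 1
    out_w = (w - size) // stride + 1
    out = np.zeros((out_h, out_w))
    for y in range(out_h):
        for x in range(out_w):
            window = feature_map[y * stride:y * stride + size,
                                 x * stride:x * stride + size]
            out[y, x] = window.max()
    return out

fm = np.array([[1, 3, 2, 4],
               [5, 6, 1, 2],
               [1, 2, 8, 7],
               [3, 4, 5, 6]], dtype=float)
pooled = max_pool(fm)
print(pooled)
# [[6. 4.]
#  [4. 8.]]
```

The 4×4 map shrinks to 2×2 while keeping the strongest activation from each window.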
3.3 MARKET POTENTIAL:
The most prominent factor fueling market growth is the rising demand from corporate
and government enterprises for effective document management.
A substantial share of government and corporate enterprises still rely on physical
documents and files.
Chapter 4: SOFTWARE REQUIREMENTS
The software tools used in this project are:
1. Python 3
2. TensorFlow 1.3
3. NumPy
4. OpenCV
4.1 PYTHON 3:
Python is an open-source programming language that was made to be easy to read.
A Dutch programmer named Guido van Rossum created Python in 1991. He named
it after the television show Monty Python's Flying Circus. Many Python examples
and tutorials include jokes from the show. Python is an interpreted language.
Interpreted languages do not need to be compiled to run. A program called an
interpreter runs Python code on almost any kind of computer.
This means that a programmer can change the code and quickly see the results.
This also means Python is slower than a compiled language like C, because it is
not running machine code directly. Python is a good programming language for
beginners. It is a high-level language, which means a programmer can focus on
what to do instead of how to do it. Writing programs in Python takes less time
than in some other languages. Python drew inspiration from other programming
languages like C, C++, Java, Perl, and Lisp.
Python's developers try to avoid changing the language to make it better until they
have a lot of things to change. Also, they try not to make small repairs, called
patches, to unimportant parts of the CPython reference implementation that
would make it faster.
Python 3.0 was released in 2008. Python is an interpreted language, i.e., it is not
compiled, and the interpreter checks the code line by line. Python makes
development and debugging fast.
4.2 TENSORFLOW:
TensorFlow is an open-source software library. TensorFlow was originally developed
by researchers and engineers working on the Google Brain Team within Google’s
Machine Intelligence research organization for the purposes of conducting machine
learning and deep neural network research, but the system is general enough to be
applicable in a wide variety of other domains as well.
TensorFlow is a free and open-source software library for machine learning. It can be
used across a range of tasks but has a particular focus on training and inference of deep
neural networks. It is used for both research and production at Google. Among the
applications for which TensorFlow is the foundation are automated image-captioning
software, such as DeepDream.
4.3 NUMPY:
NumPy is a library for the Python programming language, adding support for large,
multi-dimensional arrays and matrices, along with a large collection of high-level
mathematical functions to operate on these arrays. The ancestor of NumPy,
Numeric, was originally created by Jim Hugunin with contributions from several
other developers.
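As a small illustration of what NumPy provides for this project (the array values are arbitrary):

```python
import numpy as np

# An image is just a multi-dimensional array of pixel values; NumPy gives us
# high-level functions to reshape and summarize it.
img = np.arange(12).reshape(3, 4)   # a hypothetical 3x4 "image"
print(img.shape)                    # (3, 4)
print(img.mean())                   # 5.5
flat = img.flatten()                # flatten into a 1D vector, e.g. for a dense layer
print(flat.shape)                   # (12,)
```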
4.4 OPENCV:
OpenCV (Open Source Computer Vision Library) is an open-source library aimed at
real-time computer vision, including video capture and analysis, with features like
face detection and object detection. Officially launched in 1999, the OpenCV project
was initially an Intel Research initiative to advance CPU-intensive applications, part
of a series of projects including real-time ray tracing and 3D display walls. The main
contributors to the project included a number of optimization experts at Intel Russia,
as well as Intel's Performance Library Team. The goals of the project were laid out in
its early days.
The first alpha version of OpenCV was released to the public at the IEEE Conference on
Computer Vision and Pattern Recognition in 2000, and five betas were released between
2001 and 2005. The first 1.0 version was released in 2006. A version 1.1 "pre-release"
was released in October 2008. The second major release of OpenCV was in October
2009. OpenCV 2 includes major changes to the C++ interface, aiming at easier, more
type-safe patterns, new functions, and better implementations of existing ones in terms
of performance. Official releases now occur every six months and development is now
done by an independent Russian team supported by commercial corporations. In July
2020, OpenCV announced and began a Kickstarter campaign for the OpenCV AI Kit, a
series of hardware modules and additions to OpenCV supporting Spatial AI.
4.5 APPLICATIONS:
1. 2D and 3D feature toolkits.
2. Egomotion estimation.
3. Facial recognition systems.
4. Gesture recognition.
5. Human-computer interaction (HCI).
6. Mobile robotics.
7. Motion understanding.
8. Object detection.
9. National ID number recognition systems.
10. Postal office automation with code number recognition on envelopes.
11. Automatic license plate recognition.
12. Bank automation.
4.6 FUNCTIONAL REQUIREMENTS
• The system should support the three stages of the writing process: planning,
translation (writing), and review.
• It should include some spelling support and should incorporate file-handling
facilities. The recognition component should be able to work even when children
write slowly; it should be able to deal with ‘wobbly’ writing, and should be able to
recognize common misconstructions of characters.
• We have classified these functional requirements as follows:
1. Taking/choosing the desired text image.
2. Recognition of the text.
3. Copying the text for different uses.
4.7 NON-FUNCTIONAL REQUIREMENTS
1. Usability Requirements
The application shall be user friendly and shall not require any guidance to use.
In other words, the application has to be as simple as possible, so its users can
use it easily. The interface is quite simple and straightforward, so that anyone
can understand it.
2. Reliability Requirements
The application should not have any unexpected failure. In order to avoid any
failure occurrence, the specifications have been respected and followed
correctly. The only problem that may occur in some cases is that the application
does not recognize 100% of the characters in the picture.
3. Efficiency Requirements
Chapter 5: SOFTWARE DESCRIPTION
5.1 UML DIAGRAM
In the Unified Modeling Language (UML), a use case diagram can summarize the details
of your system's users (also known as actors) and their interactions with the system. To
build one, you'll use a set of specialized symbols and connectors. An effective use case
diagram can help your team discuss and represent:
5.4 FUTURE ENHANCEMENT
Firstly, to have more compelling and robust training, we could apply additional
pre-processing techniques such as jittering. We could also divide each pixel by its
corresponding standard deviation to normalize the data. Next, given time and
budget constraints, we were limited to 20 training examples for each given word
in order to efficiently evaluate and revise our model. Another method of
improving our character segmentation model would be to move beyond a greedy
search for the most likely solution. We would approach this by considering a more
exhaustive but still efficient decoding algorithm such as beam search. We can use
a character/word-based language model to add a penalty/benefit score to
each of the possible final beam-search candidate paths, along with their
combined individual softmax probabilities, representing the probability of the
sequence of characters/words. If the language model indicates that the most
likely candidate word according to the softmax layer and beam search is very
unlikely given the context so far, as opposed to some other likely candidate words,
then our model can correct itself accordingly.
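The beam-search idea above, keeping the most probable prefixes at each time-step instead of only the single best one, can be sketched as follows (a simplified illustration; a real decoder would also merge CTC paths and add the language-model score, which is where beam search pays off over the greedy best path; the alphabet and probabilities are assumed values):

```python
import numpy as np

def beam_search(probs, charset, beam_width=2):
    """Keep the `beam_width` most probable prefixes at every time-step
    instead of greedily keeping only the single best character."""
    beams = [("", 1.0)]                  # (prefix, probability)
    for step in probs:                   # one softmax vector per time-step
        candidates = []
        for prefix, p in beams:
            for idx, char_p in enumerate(step):
                candidates.append((prefix + charset[idx], p * char_p))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]  # prune to the best prefixes
    return beams[0][0]

# Hypothetical per-time-step softmax outputs over a 3-character alphabet.
charset = "abc"
probs = np.array([[0.5, 0.4, 0.1],
                  [0.1, 0.5, 0.4],
                  [0.2, 0.2, 0.6]])
decoded = beam_search(probs, charset)
print(decoded)   # -> "abc"
```

A language model would be applied at the pruning step, re-scoring each candidate prefix before the beams are cut down.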
Chapter 6: CONCLUSION
The recognition system first accepts a scanned image as an input. The image can
be in JPG or BMP format. Digital capture and conversion of an image often
introduces noise, which makes it hard to identify what is actually a part of the
object of interest. Considering the problem of character recognition, we want to
reduce as much noise as possible, while preserving the strokes of the characters,
since they are important for correct classification.
After reviewing the papers, it was observed that some techniques, like direction
feature extraction and diagonal feature extraction, proved to be better at
generating higher-accuracy results compared to the traditional horizontal
and vertical methods. Also, neural networks provide the added benefit of
higher tolerance to noise.
It was also observed that a bigger training data set helps in achieving a higher
accuracy rate when features are extracted from similar-looking characters. This is
highly beneficial: as handwritten characters can appear similar, good feature
extraction techniques need to be used to avoid such anomalies.
REFERENCES: