
A Project Skill Lab Report on

HANDWRITTEN DIGIT RECOGNITION


USING PYTHON

Submitted in partial fulfillment of the requirements for


the award of the degree of

Bachelor of Technology

By

19751A0599 S.Durga Prasad


19751A0592 R.Balaji
19751A0571 N.Likhit
19751A05A6 R.Ravi Kishore

Under the esteemed guidance of

Mr. E. Purushotham
Assistant Professor
At

Department of Computer Applications


SREENIVASA INSTITUTE OF TECHNOLOGY
AND
MANAGEMENT STUDIES (AUTONOMOUS)
(Affiliated to JNTU Anantapur, Anantapur)
Murukambattu, Chittoor – 517 127
2021-2022

SREENIVASA INSTITUTE OF TECHNOLOGY AND MANAGEMENT


STUDIES (Autonomous)

Institute Vision

To emerge as a Centre of Excellence for Learning and Research in the domains of


engineering, computing and management.

Institute Mission

 Provide congenial academic ambience with state-of-the-art resources for learning and
research.

 Ignite the students to acquire self-reliance in the latest technologies.

 Unleash and encourage the innate potential and creativity of students.

 Inculcate confidence to face and experience new challenges.

 Foster enterprising spirit among students.

 Work collaboratively with technical Institutes / Universities / Industries of National and
International repute.

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING


VISION
To contribute for the society through excellence in Computer Science and Engineering with a
deep passion for wisdom, culture and values.
MISSION
M 1: Provide congenial academic ambience with necessary infrastructure and learning
resources.
M 2: Inculcate confidence to face and experience new challenges from industry and society.
M 3: Ignite the students to acquire self-reliance in State-of-the-Art Technologies
M 4: Foster Enterprising spirit among students
PROGRAMME EDUCATIONAL OBJECTIVES (PEOs):
Graduates of Computer Science and Engineering shall
PEO1: Excel in the Computer Science and Engineering program through quality studies, enabling success in the computing industry. (Professional Competency)
PEO2: Surpass in one's career by critical thinking towards successful services and growth of the organization, or as an entrepreneur or in higher studies. (Successful Career Goals)
PEO3: Enhance knowledge by updating advanced technological concepts for facing the rapidly changing world and contribute to society through innovation and creativity. (Continuing Education and Contribution to Society)
PROGRAM SPECIFIC OUTCOMES (PSOs):
Students shall
PSO1: Have the ability to understand, analyze and develop computer programs in areas like algorithms, system software, web design, big data analytics, and networking.
PSO2: Deploy modern computer languages, environments, and platforms in creating innovative products and solutions.

PROGRAMME OUTCOMES (PO’s)


Computer Science and Engineering Graduates will be able to:
PO1 - Engineering knowledge: Apply the knowledge of mathematics, science, engineering
fundamentals, and an engineering specialization for the solution of complex engineering
problems.
PO2 - Problem analysis: Identify, formulate, research literature, and analyze complex
engineering problems reaching substantiated conclusions using first principles of mathematics,
natural sciences, and engineering sciences.
PO3 - Design/development of solutions: Design solutions for complex engineering problems
and design system components or processes that meet the specified needs with appropriate
consideration for public health and safety, and cultural, societal, and environmental
considerations.
PO4 - Conduct investigations of complex problems: Use research-based knowledge and
research methods including design of experiments, analysis and interpretation of data, and
synthesis of the information to provide valid conclusions.
PO5 - Modern tool usage: Ability to design and develop hardware and software in emerging
technology environments like cloud computing embedded products, real-time systems, Internet
of Things, Big Data etc.
PO6- Engineering and Society: Apply reasoning informed by the contextual knowledge to
assess societal, health, safety, legal and cultural issues and the consequent responsibilities
relevant to the professional engineering practice.
PO7- Environment and sustainability: Understand the impact of the professional engineering
solutions in societal and environmental contexts, and demonstrate the knowledge of, and need
for sustainable development.
PO8- Ethics: Apply ethical principles and commit to professional ethics and responsibilities
and norms of the engineering practice.
PO9 - Individual and team work: Function effectively as an individual, and as a member or
leader in diverse teams, and in multidisciplinary settings.
PO10 - Communication: Communicate effectively on complex engineering activities with the
engineering community and with the society at large, such as, being able to comprehend and
write effective reports and design documentation, make effective presentations, and give and
receive clear instructions.
PO11 - Project management and finance: Demonstrate knowledge and understanding of the
engineering and management principles and apply these to one’s own work, as a member and
leader in a team, to manage projects and in multidisciplinary environments.
PO12 - Life-long learning: Basic knowledge in hardware/software methods and tools for
solving real-life and R&D problems with an orientation to lifelong learning.
Course Outcomes for project work
On completion of project work we will be able to,
CO1. Demonstrate in-depth knowledge on the project topic.
CO2. Identify, analyze and formulate complex problem chosen for project work to attain
substantiated conclusions.
CO3. Design solutions to the chosen project problem.
CO4. Undertake investigation of project problem to provide valid conclusions.
CO5. Use the appropriate techniques, resources and modern engineering tools necessary for
project work.
CO6. Apply project results for sustainable development of the society.
CO7. Understand the impact of project results in the context of environmental sustainability.
CO8. Understand professional and ethical responsibilities while executing the project work.
CO9. Function effectively as individual and a member in the project team.
CO10. Develop communication skills, both oral and written for preparing and presenting
project report.
CO11. Demonstrate knowledge and understanding of cost and time analysis required for
carrying out the project.
CO12. Engage in lifelong learning to improve knowledge and competence in the chosen area
of the project.
CO – PO MAPPING

COs \ POs: each course outcome CO1 through CO12 is mapped (√) to a corresponding programme outcome.
Evaluation Rubrics for Project work:

Rubric (CO): Excellent (Wt = 3) / Good (Wt = 2) / Fair (Wt = 1)

Selection of Topic (CO1): Select a latest topic through complete knowledge of facts and concepts. / Select a topic through partial knowledge of facts and concepts. / Select a topic through improper knowledge of facts and concepts.

Analysis and Synthesis (CO2): Thorough comprehension through analysis/synthesis. / Reasonable comprehension through analysis/synthesis. / Improper comprehension through analysis/synthesis.

Problem Solving (CO3): Thorough comprehension about what is proposed in the literature papers. / Reasonable comprehension about what is proposed in the literature papers. / Improper comprehension about what is proposed in the literature.

Literature Survey (CO4): Extensive literature survey with standard references. / Considerable literature survey with standard references. / Incomplete literature survey with substandard references.

Usage of Techniques & Tools (CO5): Clearly identified and has complete knowledge of techniques & tools used in the project work. / Identified and has sufficient knowledge of techniques & tools used in the project work. / Identified and has inadequate knowledge of techniques & tools used in the project work.

Project work impact on Society (CO6): Conclusion of project work has strong impact on society. / Conclusion of project work has considerable impact on society. / Conclusion of project work has feeble impact on society.

Project work impact on Environment (CO7): Conclusion of project work has strong impact on environment. / Conclusion of project work has considerable impact on environment. / Conclusion of project work has feeble impact on environment.

Ethical attitude (CO8): Clearly understands ethical and social practices. / Moderate understanding of ethical and social practices. / Insufficient understanding of ethical and social practices.

Independent Learning (CO9): Did literature survey and selected topic with a little guidance. / Did literature survey and selected topic with considerable guidance. / Selected a topic as suggested by the supervisor.

Oral Presentation (CO10): Presentation in logical sequence with key points, clear conclusion and excellent language. / Presentation with key points, conclusion and good language. / Presentation with insufficient key points and improper conclusion.

Report Writing (CO10): Status report with clear and logical sequence of chapters using excellent language. / Status report with logical sequence of chapters using understandable language. / Status report not properly organized.

Time and Cost Analysis (CO11): Comprehensive time and cost analysis. / Moderate time and cost analysis. / Reasonable time and cost analysis.

Continuous learning (CO12): Highly enthusiastic towards continuous learning. / Interested in continuous learning. / Inadequate interest in continuous learning.
SREENIVASA INSTITUTE OF TECHNOLOGY AND MANAGEMENT
STUDIES (AUTONOMOUS)
CHITTOOR-517127

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING


BONAFIDE CERTIFICATE

This is to certify that the Project Skill Lab work entitled "HANDWRITTEN DIGIT
RECOGNITION USING PYTHON" is carried out by S. Durga Prasad (Reg. No. 19751A0599),
R. Balaji (Reg. No. 19751A0592), N. Likhit (Reg. No. 19751A0571) and S. Ravi Kishore
(Reg. No. 19751A05A6) under my supervision and guidance during the academic year
2020-2021, in partial fulfillment of the requirements for the award of the degree of Bachelor of
Technology.

Guide Head of the Department


Mr. E. Purushotham, M.Tech                                   Mr. Y. Sreeraman, M.Tech

Submitted for Project Skill Lab examination held on ___________________

Internal Examiner External Examiner


DECLARATION

We affirm that the Project Skill Lab work titled HANDWRITTEN DIGIT
RECOGNITION USING PYTHON, submitted in partial fulfillment for the award of
Bachelor of Technology, is the original work carried out by us. It has not formed
part of any other project work submitted for the award of any degree, either in this
or any other university.

(Name of the candidate) Register number Signature of the candidate


S. Durga Prasad 19751A0599
R.Balaji 19751A0592
N.Likhit 19751A0571
S.Ravi Kishore 19751A05A6
ACKNOWLEDGEMENT

Predominantly, our thanks go to the Founder, Late Sri D. K. AUDIKESAVULU Garu,
Late Mrs. D. A. SATHYAPRABHA Garu, and the Chairperson, Sri K. RANGANATHAM Garu,
SITAMS, for the extensive lab facilities provided in the college.
We would like to express our profound gratitude to our Principal, Dr. Saravanandh,
M.E., Ph.D., and Mr. Y. Sreeraman, M.Tech, HOD, CSE Dept., for offering us a chance
to serve in our reputed institution and providing all possible facilities throughout the
completion of our project work. We express our sincere thanks to our Project
Coordinator, Mr. E. Purushotham, CSE Dept., for offering us the opportunity to do this
work and for his benevolent advice and guidance at each step of the project.
Our sincere thanks to all teaching and non-teaching staff members of the CSE Dept.
for their cooperation and guidance.
We lack words to acknowledge the cooperation, motivation and support extended by
our parents and friends at every moment of this academic venture.
Finally, we extend our thanks to one and all who helped us, directly or indirectly,
in presenting this work in the most appropriate and attractive form.
Table of contents:

CHAPTER NO.  DETAILS  PAGE

ABSTRACT  i

LIST OF FIGURES  ii

LIST OF TABLES

LIST OF ABBREVIATIONS  iii

1  INTRODUCTION  01-06
   1.1 Aim and Objective
   1.2 Motivation
   1.3 Prerequisites
   1.4 The MNIST Dataset
   1.5 What is Deep Learning
   1.6 What is Machine Learning
   1.7 Deep Learning vs Machine Learning
2  INSTALLATIONS  07-13
   2.1 Python Installation on Windows
   2.2 Keras Installation on Windows
3  METHODOLOGY AND RELATED WORK  14-20
   3.1 Dataset
   3.2 Support Vector Machine
   3.3 Multilayered Perceptron
   3.4 Convolutional Neural Network
   3.5 Visualization
   3.6 Related Works
4  IMPLEMENTATION  21-24
   4.1 Pre-Processing
   4.2 Support Vector Machine
   4.3 Multilayered Perceptron
   4.4 Convolutional Neural Network
5  PROJECT BUILDING  25-33
   5.1 Import the Libraries and Load the Dataset
   5.2 Preprocess the Data
   5.3 Create the Model
   5.4 Train the Model
   5.5 Evaluate the Model
   5.6 Create GUI to Predict Digits
   5.7 Source Code
       5.7.1 Code for Model Training
       5.7.2 Code for GUI
   5.8 Output
6  RESULT  34-38
7  CONCLUSION AND FUTURE ENHANCEMENT  39-40
   7.1 Conclusion
   7.2 Future Enhancement
8  REFERENCES

ABSTRACT

The reliance of humans on machines has never been higher: from object
classification in photographs to adding sound to silent movies, much can now be
performed with the help of deep learning and machine learning algorithms.
Handwritten text recognition is likewise one of the significant areas of research
and development, with a growing number of possibilities that could be attained.
Handwriting recognition (HWR), also known as Handwritten Text Recognition (HTR),
is the ability of a computer to receive and interpret intelligible handwritten
input from sources such as paper documents, photographs, touch-screens and other
devices [1]. In this work, we have performed handwritten digit recognition on the
MNIST dataset using Support Vector Machine (SVM), Multi-Layer Perceptron (MLP) and
Convolutional Neural Network (CNN) models. Our main objective is to compare the
accuracy of the models stated above, along with their execution time, to find the
best possible model for digit recognition.

Keywords: Deep Learning, Machine Learning, Handwritten Digit Recognition, MNIST
dataset, Support Vector Machine (SVM), Multi-Layered Perceptron (MLP), Convolutional
Neural Network (CNN).

LIST OF FIGURES

Sl. No.  Figure  Page No.
1.1  Deep Learning  04
1.2  Machine Learning  05
3.1  Bar graph illustrating the MNIST handwritten digit training dataset  15
3.2  Plotting of some random MNIST handwritten digits  15
3.3  Working mechanism of SVM classification  16
3.4  The basic architecture of the Multilayer Perceptron  17
3.5  The architectural design of CNN layers  18
4.3.1  Sequential block diagram of the Multilayer Perceptron model  22
6.1  Bar graph depicting accuracy comparison  35
6.2  Bar graph showing execution time comparison of SVM, MLP and CNN  35
6.3  ANN - Loss rate vs number of epochs  36
6.4  ANN - Accuracy vs number of epochs  36
6.5  CNN - Loss rate vs number of epochs  37
6.6  CNN - Accuracy vs number of epochs  38

LIST OF ABBREVIATIONS

AHDR - Automatic Handwritten Digit String Recognition
HOG - Histogram of Oriented Gradients
SVM - Support Vector Machine
ANN - Artificial Neural Network
CNN - Convolutional Neural Network
SRM - Structural Risk Minimization principle
ERM - Empirical Risk Minimization
IP - Interconnection Point
BP - Base Point
CD - Connected Digits
OD - Overlapped Digits
DJD - Disjoint Digits

CHAPTER-I
INTRODUCTION

1.1 Aim and objective:

Aim:

The aim of a handwritten digit recognition system is to convert handwritten digits
into machine-readable form. The main applications are vehicle license-plate
recognition, postal letter-sorting services, Cheque Truncation System (CTS) scanning,
preservation of historical documents in archaeology departments, automation of old
documents in libraries and banks, etc.
Objective:
The main objective of this work is to ensure effective and reliable approaches
for the recognition of handwritten digits.

1.2 Motivation:

This work is conducted using machine learning concepts, so before going deep
into the topic we must know some of these concepts. Machine learning is a method
that trains the machine to do a job by itself without human interaction.
At a high level, machine learning is the process of teaching a computer system
to make accurate predictions when fed data; those predictions form the output.
There are many sub-branches of machine learning, such as neural networks and deep
learning [1].
Among these, deep learning is considered the most popular sub-branch of machine
learning. The idea behind machine learning came into existence in the 1950s with
the definition of the perceptron [2], the first machine capable of sensing and
learning. This was followed by the multilayer perceptron in the 1980s, with a
limited number of hidden layers.

However, the perceptron was not widely used because of its very limited learning
capability. Years later, in the early 2000s, neural networks with many hidden
layers came into existence [3].
After the emergence of neural networks, many machine learning concepts such as
deep learning came into force, with multiple levels of representation. Because of
these multiple levels of representation, it has become easier for machines to learn
and recognize patterns. The human brain is considered a reference for building deep
learning concepts, as it similarly processes information in multiple layers [4].
A human can easily solve and recognize many problems, but the same is not true
for a machine; many techniques and methods have to be implemented for a machine to
perform like a human. Apart from all the advancements that have been made in this
area, there is still a significant research gap that needs to be filled. Consider,
for example, online handwriting recognition versus offline recognition [5]. In
online handwriting recognition, letters are processed as they are written because
stroke information is captured dynamically [5], whereas in offline recognition the
strokes are not captured dynamically. Online handwriting recognition is therefore
more accurate than offline handwriting recognition, which suffers from this lack of
information [6]. Hence, research in this area can help improve offline handwriting
recognition.

1.3 PREREQUISITES
This Python project requires basic knowledge of Python programming, deep learning
with the Keras library, and the Tkinter library for building the GUI.

1.4 THE MNIST DATASET


This is probably one of the most popular datasets among machine
learning and deep learning enthusiasts. The MNIST dataset contains 60,000
training images of handwritten digits from zero to nine and 10,000 images for
testing. So, the MNIST dataset has 10 different classes. The handwritten digit
images are represented as 28×28 matrices where each cell contains a grayscale
pixel value.
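As a quick illustration (a minimal sketch, not part of the report's original code), the dataset can be loaded and inspected with the Keras loader used later in this report; the printed shapes match the 60,000/10,000 split described above.

from keras.datasets import mnist

# Load the MNIST training and test splits
(x_train, y_train), (x_test, y_test) = mnist.load_data()

print(x_train.shape)   # (60000, 28, 28) - 60,000 training images of 28x28 pixels
print(x_test.shape)    # (10000, 28, 28) - 10,000 test images
print(set(y_train))    # the 10 digit classes, 0 through 9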

1.5 WHAT IS DEEP LEARNING?

Deep learning is a subset of machine learning, which is essentially a neural


network with three or more layers. These neural networks attempt to simulate the
behavior of the human brain (albeit far from matching its ability), allowing them to “learn”
from large amounts of data. While a neural network with a single layer can still make
approximate predictions, additional hidden layers can help to optimize and refine for
accuracy.
Deep learning drives many artificial intelligence (AI) applications and services
that improve automation, performing analytical and physical tasks without human
intervention. Deep learning technology lies behind everyday products and services (such
as digital assistants, voice-enabled TV remotes, and credit card fraud detection) as well
as emerging technologies (such as self-driving cars).

Figure 1.1 Deep learning



1.6 WHAT IS MACHINE LEARNING?

Machine learning is a branch of artificial intelligence (AI) and computer science


which focuses on the use of data and algorithms to imitate the way that humans learn,
gradually improving its accuracy.
Machine learning is an important component of the growing field of data science.
Through the use of statistical methods, algorithms are trained to make classifications or
predictions, uncovering key insights within data mining projects. These insights
subsequently drive decision making within applications and businesses, ideally
impacting key growth metrics. As big data continues to expand and grow, the market
demand for data scientists will increase, requiring them to assist in the identification of
the most relevant business questions and subsequently the data to answer them.

Figure 1.2 Machine learning



1.7 DEEP LEARNING VS. MACHINE LEARNING

If deep learning is a subset of machine learning, how do they differ? Deep learning
distinguishes itself from classical machine learning by the type of data that it works with
and the methods in which it learns.
Machine learning algorithms leverage structured, labeled data to make
predictions—meaning that specific features are defined from the input data for the
model and organized into tables. This doesn’t necessarily mean that it doesn’t use
unstructured data; it just means that if it does, it generally goes through some
preprocessing to organize it into a structured format.
Deep learning eliminates some of the data pre-processing that is typically involved
with machine learning. These algorithms can ingest and process unstructured data, like
text and images, and they automate feature extraction, removing some of the dependency
on human experts. For example, let’s say that we had a set of photos of different pets,
and we wanted to categorize by “cat”, “dog”, “hamster”, et cetera. Deep learning
algorithms can determine which features (e.g. ears) are most important to distinguish
each animal from another. In machine learning, this hierarchy of features is established
manually by a human expert.
Then, through the processes of gradient descent and backpropagation, the deep
learning algorithm adjusts and fits itself for accuracy, allowing it to make predictions
about a new photo of an animal with increased precision.
Machine learning and deep learning models are capable of different types of learning as
well, which are usually categorized as supervised learning, unsupervised learning, and
reinforcement learning. Supervised learning utilizes labeled datasets to categorize or
make predictions; this requires some kind of human intervention to label input data
correctly. In contrast, unsupervised learning doesn’t require labeled datasets, and
instead, it detects patterns in the data, clustering them by any distinguishing
characteristics. Reinforcement learning is a process in which a model learns to become
more accurate for performing an action in an environment based on feedback in order
to maximize the reward.

CHAPTER 2 INSTALLATIONS

Python is a widely used high-level programming language. To write and execute


code in python, we first need to install Python on our system.

Installing Python on Windows takes a series of few easy steps.

STEP 1 − SELECT VERSION OF PYTHON TO INSTALL


Python has various versions available, with differences in syntax and behaviour
between them. We need to choose the version we want to use. There are different
versions of Python 2 and Python 3 available.

STEP 2 − DOWNLOAD PYTHON EXECUTABLE INSTALLER


In a web browser, open the official Python site (www.python.org) and move to
the Download for Windows section.

All the available versions of Python will be listed. Select the version you require
and click Download. Suppose we choose Python 3.9.1.

On clicking Download, the available executable installers for different operating
system specifications will be shown. Choose the installer that suits your operating
system and download it. Suppose we select the Windows 64-bit installer.

The download size is less than 30MB.



STEP 3 − RUN EXECUTABLE INSTALLER


We downloaded the Python 3.9.1 Windows 64-bit installer.

Run the installer. Make sure to select both checkboxes at the bottom and then click
Install Now.

On clicking Install Now, the installation process starts.



The installation process will take a few minutes to complete, and once the installation
is successful, the following screen is displayed.

STEP 4 − VERIFY PYTHON IS INSTALLED ON WINDOWS

To ensure that Python is successfully installed on your system, follow the given steps:

• Open the command prompt.



• Type ‘python’ and press Enter.

• The version of Python you have installed will be displayed if Python is
successfully installed on your Windows system.

STEP 5 − VERIFY PIP WAS INSTALLED


Pip is a powerful package management system for Python software packages. Thus,
make sure that you have it installed.

To verify if pip was installed, follow the given steps −

• Open the command prompt.

• Enter pip -V to check whether pip is installed.

• The following output appears if pip is installed successfully.

We have successfully installed python and pip on our Windows system.
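For example, a quick sanity check from the command prompt might look like this (the exact version numbers will vary with your installation):

python --version
pip -V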


KERAS-TENSORFLOW-GPU-WINDOWS-INSTALLATION

10 easy steps on the installation of TensorFlow-GPU and Keras in Windows

STEP 1: INSTALL NVIDIA DRIVER

Download the driver: select the appropriate version and click Search.

STEP 2: INSTALL ANACONDA (PYTHON 3.8 VERSION)



STEP 3: UPDATE ANACONDA

Open Anaconda Prompt to type the following command(s)

conda update conda
conda update --all

STEP 4: INSTALL CUDA TOOLKIT 10.0

Download: choose your version depending on your operating system.

STEP 5: DOWNLOAD CUDNN

Choose your version depending on your operating system; membership registration
is required. Put your unzipped folder on a local drive, for example:

D:\cudnn-10.1-windows10-x64-v7.5.0.56

STEP 6: ADD CUDNN INTO ENVIRONMENT PATH

Add the following path to your environment variables (adjust it to match your
installation path):

D:\cudnn-8.0-windows10-x64-v5.1\cuda\bin

You can follow the steps below (for Windows 10).
Step 6.1: Open Start Search and type "env".

Step 6.2: Choose "Edit environment variables for your account".

Step 6.3: Under the "User variables" section (the upper half), find the row with
"Path" in the first column and click Edit.

Now you can add the new path to the environment variable.

Turn off all the prompts. Open a new Anaconda Prompt to type the following
command(s)

echo %PATH%
You shall see that the new Environment PATH is there.

STEP 7: CREATE AN ANACONDA ENVIRONMENT WITH PYTHON=3.6

Open Anaconda Prompt to type the following command(s):

conda create -n keras-gpu python=3.6 numpy scipy keras-gpu

STEP 8: ACTIVATE THE ENVIRONMENT

Open Anaconda Prompt to type the following command(s):

activate keras-gpu

STEP 9: TESTING

Let's try running mnist_mlp.py in your prompt.

Open Anaconda Prompt to type the following command(s)

activate keras-gpu
python mnist_mlp.py

Congratulations! You have successfully run Keras (with Tensorflow backend) over
GPU on Windows!

In the event that you get a TensorFlow AttributeError, run the following and then
try again:

conda install -c anaconda tensorflow-estimator=2.1

STEP 10: DONE!

CHAPTER 3
METHODOLOGY AND RELATED WORK

The comparison of the algorithms (Support Vector Machine, Multilayer Perceptron
and Convolutional Neural Network) is based on the characteristics of each algorithm
on common grounds such as the dataset, the number of epochs, the complexity of the
algorithm, the accuracy of each algorithm, the specification of the device used to
execute the programs (Ubuntu 20.04 LTS, i5 7th-gen processor), and the runtime of
the algorithm under ideal conditions.

3.1.DATASET

Handwritten character recognition is an expansive research area that already


contains detailed ways of implementation which include major learning datasets,
popular algorithms, features scaling and feature extraction methods. MNIST dataset
(Modified National Institute of Standards and Technology database) is the subset of the
NIST dataset which is a combination of two of NIST’s databases: Special Database 1
and Special Database 3. Special Database 1 and Special Database 3 consist of digits
written by high school students and employees of the United States Census Bureau,
respectively. MNIST contains a total of 70,000 handwritten digit images (60,000 in
the training set and 10,000 in the test set), each in a 28x28 pixel bounding box and
anti-aliased. All these images have corresponding Y values which indicate what the
digit is.

Figure 3.1. Bar graph illustrating the MNIST handwritten digit training dataset
(Label vs Total number of training samples).

Figure 3.2. Plotting of some random MNIST Handwritten digits

3.2. SUPPORT VECTOR MACHINE

Support Vector Machine (SVM) is a supervised machine learning algorithm. In SVM,
we generally plot data items in an n-dimensional space, where n is the number of
features and each coordinate represents the value of a feature; we then perform
classification by finding the hyperplane that distinguishes the two classes. SVM
chooses the hyperplane that separates the classes correctly, and it selects the
extreme vectors that help in creating this hyperplane. These extreme cases are
called support vectors, and hence the algorithm is termed Support Vector Machine.
There are mainly two types of SVMs, linear and non-linear; in this work, we have
used a linear SVM for handwritten digit recognition [10].

Figure 3.3. This image describes the working mechanism of SVM Classification with
supporting vectors and hyperplanes.

3.3. MULTILAYERED PERCEPTRON

A multilayer perceptron (MLP) is a class of feedforward artificial neural


networks (ANN). It consists of three layers: input layer, hidden layer and output layer.

Each layer consists of several nodes, also formally referred to as neurons, and
each node is interconnected with every node of the next layer. In a basic MLP there
are three layers, but the number of hidden layers can be increased as the problem
demands, with no restriction on the number of nodes. The numbers of nodes in the
input and output layers depend on the number of attributes and apparent classes in
the dataset, respectively. The exact number of hidden layers, or of nodes per hidden
layer, is difficult to determine because of the model's erratic nature and is
therefore selected experimentally. Every hidden layer of the model can have a
different activation function for processing. For learning, the MLP uses a supervised
learning technique called backpropagation. Each connection between nodes carries a
weight that gets adjusted during the training process of the model [11].

Figure 3.4. This figure illustrates the basic architecture of the Multi layer perceptron
with variable specification of the network.

3.4. CONVOLUTIONAL NEURAL NETWORK

CNN is a deep learning algorithm that is widely used for image recognition and
classification. It is a class of deep neural networks that require minimum pre-processing.
It inputs the image in the form of small chunks rather than inputting a single pixel at a
time, so the network can detect uncertain patterns (edges) in the image more efficiently.
A CNN contains an input layer, an output layer, and multiple hidden layers, which
include convolutional layers, pooling layers (max and average pooling), fully
connected (FC) layers, and normalization layers [12]. CNN uses a filter (kernel)
which is an array of weights to extract features from the input image. CNN employs
different activation functions at each layer to add some non-linearity [13]. As we move
into the CNN, we observe the height and width decrease while the number of channels
increases. Finally, the generated column matrix is used to predict the output [14].

Figure 3.5. This figure shows the architectural design of CNN layers in the form of a Flow chart.

3.5. VISUALIZATION

In this research, we have used the MNIST dataset (i.e. the handwritten digit dataset)
to compare different deep and machine learning algorithms (i.e. SVM, ANN-MLP and CNN)
on the basis of execution time, complexity, accuracy rate, number of epochs and
number of hidden layers (in the case of deep learning algorithms). To
visualize the information obtained by the detailed analysis of algorithms we have used
bar graphs and tabular format charts using module matplotlib, which gives us the most
precise visuals of the step by step advances of the algorithms in recognizing the digit.
The graphs are given at each vital part of the programs to give visuals of each part to
bolster the outcome.
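As an illustrative sketch (assuming the Keras MNIST loader rather than the report's own plotting code), a label-frequency bar graph like Figure 3.1 can be produced with NumPy and matplotlib as follows.

import numpy as np
import matplotlib.pyplot as plt
from keras.datasets import mnist

# Count how many training samples exist for each digit label
(x_train, y_train), _ = mnist.load_data()
labels, counts = np.unique(y_train, return_counts=True)

# Bar graph of label vs. total number of training samples (cf. Figure 3.1)
plt.bar(labels, counts)
plt.xlabel("Label")
plt.ylabel("Total number of training samples")
plt.xticks(labels)
plt.show()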

3.6 RELATED WORKS:



The following are some of the terms and concepts used in this research. Our
comparison of the performance of machine learning methods, using a support vector
machine, an artificial neural network and a convolutional neural network for
handwritten digit recognition, is inspired by a few related works [55]. These works
applied the three classifiers (SVM, ANN and CNN) to recognizing digits with noise
and demonstrated that SVM, ANN and CNN systems can achieve high accuracy in
recognizing handwritten digits on document images [39]. These methods are used in
this work to find the best algorithm for handwritten digit recognition. A few
drawbacks were identified in this research area; hence it is important to conduct a
pre-study in order to understand the work that has already been done on classifying
the methods and to understand the limitations of existing machine learning methods
[10]. The results from the literature review point to a large body of existing
research on preprocessing, segmentation, feature extraction with specific techniques,
and classification to recognize the digits.

In the paper [93], the authors have conducted research related to "Handwritten Word
Recognition Using Multi-view Analysis". The major contribution of this research is a solution to the
problem of efficiently recognizing handwritten words from a limited size lexicon. The
authors developed a multiple-classifier system that analyzes the words from three
different approximation levels, in order to obtain a computational approach inspired
by the human reading process. The authors of the paper [94] have conducted research related
to “Handwriting Recognition On Form Document”. The author used Freeman Chain
Code, with the division of a region into nine sub-regions, histogram normalization of
chain code as feature extraction and Artificial Neural Networks, to classify the
characters on the form document. In the paper [95], the authors have conducted research
related to “Neural Networks for Handwritten English Alphabet Recognition.” They
have developed a system to recognize handwritten English alphabets by using neural
networks. In this system, each alphabet has been represented by binary values that are
used as an input to a simple feature extraction system, whose output is fed to the neural
network system. In the paper [96], The authors have extracted the features of numeral
and mathematical operators. They have used SVM for classification as well as to remove
the noise from the dataset. A feature extraction method has been used on NIST dataset
which consists of uppercase, lowercase, and merger of uppercase and lowercase.
The authors of the paper [97] “Sunspot drawings handwritten character recognition
method based on deep learning”, presented a deep learning method for scanned sunspot
drawings handwritten characters recognition. A Convolution Neural Network, which is
a type of deep learning algorithm and is truly successful in the training of multi-layer
network structure, is used to train the recognition model of handwritten character
images. The advantages of the proposed method by Chinese Academy Yunnan and the
experimental results show that the proposed method achieves a high recognition
accuracy rate. The authors of [98] “New approach for segmentation and recognition of
handwritten numeral strings” have proposed a new system for segmentation and
recognition of unconstrained handwritten numeral strings. The proposed system uses a
combination of foreground and background features for segmentation of touching digits.
In this paper [99], the authors have proposed a directional method for feature extraction
on English handwritten characters. The collected data has been classified based on the
similarity between the vector feature of data training and the vector feature of data
testing. The authors of the paper [100] “New efficient algorithm for recognizing
handwritten Hindi digits”, have presented a new algorithm for recognizing handwritten
Hindi digits, which is based on using the topological characters combined with
statistical properties of the given digits in order to extract a set of features that can be
used in the process of digit classification.

CHAPTER 4
IMPLEMENTATION

To compare the algorithms based on working accuracy, execution time, complexity,
and the number of epochs (in deep learning algorithms), we have used three different
classifiers:

• Support Vector Machine Classifier
• ANN - Multilayer Perceptron Classifier
• Convolutional Neural Network Classifier

We have discussed the implementation of each algorithm explicitly below, to create
a flow for this analysis and a fluent, accurate comparison.

4.1 PRE-PROCESSING

Pre-processing is an initial step in machine and deep learning which focuses on
improving the input data by reducing unwanted impurities and redundancy. To simplify
and break down the input data, we reshaped all the images in the dataset to 28x28x1.
Each pixel value lies between 0 and 255, so we normalized these pixel values by
converting the dataset to 'float32' and then dividing by 255.0 so that the input
features range between 0.0 and 1.0. Next, we performed one-hot encoding to convert
the y values into zeros and ones, making each number categorical; for example, an
output value of 4 is converted into an array of zeros and a one, i.e.
[0,0,0,0,1,0,0,0,0,0].

4.2 SUPPORT VECTOR MACHINE

The SVM in scikit-learn [16] supports both dense (numpy.ndarray and


convertible to that by numpy.asarray) and sparse (any scipy.sparse) sample vectors as
input. In scikit-learn, SVC, NuSVC and LinearSVC are classes capable of performing
multi-class classification on a dataset. In this work we have used LinearSVC for
classification of the MNIST dataset; it uses a linear kernel implemented with the
help of LIBLINEAR [17]. Various Python libraries such as NumPy, Matplotlib, pandas,
scikit-learn and seaborn have been used for the implementation. First, we download
the MNIST dataset, then load it and read the CSV files using pandas. After this,
some samples are plotted and converted into matrices, followed by normalization and
feature scaling. Finally, we create a linear SVM model and a confusion matrix that
is used to measure the accuracy of the model [9].
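A minimal sketch of the linear SVM pipeline described above, using scikit-learn's LinearSVC directly on the Keras copy of MNIST rather than on the CSV files mentioned in the text; the hyperparameter values are illustrative assumptions.

from keras.datasets import mnist
from sklearn.svm import LinearSVC
from sklearn.metrics import accuracy_score, confusion_matrix

(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Flatten each 28x28 image into a 784-dimensional vector and scale to [0, 1]
x_train = x_train.reshape(len(x_train), -1).astype("float32") / 255.0
x_test = x_test.reshape(len(x_test), -1).astype("float32") / 255.0

# Linear SVM (LIBLINEAR backend); C and max_iter are illustrative choices
clf = LinearSVC(C=1.0, max_iter=5000)
clf.fit(x_train, y_train)

y_pred = clf.predict(x_test)
print("Test accuracy:", accuracy_score(y_test, y_pred))
print(confusion_matrix(y_test, y_pred))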

4.3 MULTILAYERED PERCEPTRON

The implementation of Handwritten digits recognition by Multilayer perceptron [18]


which is also known as a feedforward artificial neural network, is done with the
help of the Keras module: we create an MLP model from the Sequential class and add
the respective hidden layers with different activation functions to take a
28x28-pixel image as input. After creating the Sequential model, we added Dense
layers of different specifications and Dropout layers, as shown in the block diagram
given below for reference. Once you have the training and test data, you can follow
these steps to train a neural network in Keras.

Figure 4.3.1. Sequential Block Diagram of Multi-layers perceptron model built with
the help of Keras module .

We used a neural network with 4 hidden layers and an output layer with 10 units (i.e.
total number of labels). The number of units in the hidden layers is kept to be 512. The
input to the network is the 784-dimensional array converted from the 28×28 image. We
used the Sequential model for building the network. In the Sequential model, we can
just stack up layers by adding the desired layer one by one. We used the Dense layer,
also called a fully connected layer since we are building a feedforward network in which
all the neurons from one layer are connected to the neurons in the previous layer. Apart
from the Dense layer, we added the ReLU activation function which is required to
introduce non-linearity to the model. This will help the network learn non-linear
decision boundaries. The last layer is a softmax layer as it is a multiclass classification
problem [19].
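A minimal Keras sketch of the MLP described above: four hidden Dense layers of 512 units with ReLU activations, Dropout layers, a 784-dimensional input and a 10-unit softmax output. The dropout rate and optimizer are illustrative assumptions, not values stated in this report.

from keras.models import Sequential
from keras.layers import Dense, Dropout

model = Sequential()
# Four hidden layers of 512 units each, with ReLU activations
model.add(Dense(512, activation='relu', input_shape=(784,)))
model.add(Dropout(0.2))   # dropout rate is an assumed value
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.2))
# Output layer: one unit per digit class, softmax for multiclass classification
model.add(Dense(10, activation='softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()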

4.4.CONVOLUTIONAL NEURAL NETWORK

The implementation of handwritten digit recognition by Convolutional Neural


Network [15] is done using Keras. It is an open-source neural network library that is
used to design and implement deep learning models. From Keras, we have used a
Sequential class which allowed us to create model layer-by-layer. The dimension of the
input image is set to 28(Height),
28(Width), 1(Number of channels). Next, we created the model whose first layer is a
Conv layer [20]. This layer uses a matrix to convolve around the input data across its
height and width and extract features from it. This matrix is called a Filter or Kernel.
The values in the filter matrix are weights. We have used 32 filters each of the
dimensions (3,3) with a stride of 1. Stride determines the number of pixels shifts.
Convolution of filter over the input data gives us activation maps whose dimension is
given by the formula: ((N + 2P - F)/S) + 1 where N= dimension of input image, P=
padding, F= filter dimension and S=stride. In this layer, Depth (number of channels) of
the output image is equal to the number of filters used. To increase the non-linearity,
we have used an activation function that is Relu [21]. Next, another convolutional layer
is used in which we have applied 64 filters of the same dimensions (3,3) with a stride of
1 and the Relu function. Following these layers, the pooling layer [22] is used, which
reduces the dimensionality of the image and computation in the network. We have
employed MAX-pooling which keeps only the maximum value from a pool. The depth
of the network remains unchanged in this layer. We have kept the pool-size (2,2) with
a stride of 2, so every 4 pixels will become a single pixel. To avoid overfitting in the
model, Dropout layer [23] is used which drops some neurons which are chosen
randomly so that the model can be simplified. We have set the probability of a node
getting dropped out to 0.25 or 25%. Following it, Flatten Layer [23] is used which
involves flattening i.e. generating a column matrix (vector) from the 2-dimensional
matrix. This column vector will be fed into the fully connected layer [24]. This layer
consists of 128 neurons with a dropout probability of 0.5 or 50%. After applying the
Relu activation function, the output is fed into the last layer of the model that is the
output layer. This layer has 10 neurons that represent classes (numbers from 0 to 9) and
the SoftMax function [25] is employed to perform the classification. This function
returns probability distribution over all the 10 classes. The class with the maximum
probability is the output.
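A minimal Keras sketch of the CNN described in this section: 32 and then 64 filters of size 3x3 with stride 1, 2x2 max pooling with stride 2, dropout of 0.25 and 0.5, a 128-neuron fully connected layer and a 10-way softmax output. For the first layer the output-size formula gives ((28 + 2*0 - 3)/1) + 1 = 26, i.e. 26x26x32 activation maps. The optimizer is an illustrative assumption.

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

model = Sequential()
# 32 filters of 3x3, stride 1: ((28 + 0 - 3)/1) + 1 = 26 -> 26x26x32 activation maps
model.add(Conv2D(32, kernel_size=(3, 3), strides=1, activation='relu', input_shape=(28, 28, 1)))
# 64 filters of 3x3, stride 1: 24x24x64
model.add(Conv2D(64, kernel_size=(3, 3), strides=1, activation='relu'))
# 2x2 max pooling with stride 2 halves height and width: 12x12x64
model.add(MaxPooling2D(pool_size=(2, 2), strides=2))
model.add(Dropout(0.25))                    # drop 25% of neurons against overfitting
model.add(Flatten())                        # flatten into a column vector
model.add(Dense(128, activation='relu'))    # fully connected layer
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))  # probability distribution over 10 classes

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()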

CHAPTER-5
PROJECT BUILDING

5.1 IMPORT THE LIBRARIES AND LOAD THE DATASET


First, we are going to import all the modules that we are going to need for training
our model. The Keras library already contains some datasets and MNIST is one of them.
So we can easily import the dataset and start working with it.
The mnist.load_data() method returns us the training data, its labels and also the testing
data and its labels.

5.2 PREPROCESS THE DATA

The image data cannot be fed directly into the model so we need to perform some
operations and process the data to make it ready for our neural network. The
dimension of the training data is (60000,28,28). The CNN model will require one more
dimension so we reshape the matrix to shape (60000,28,28,1).

5.3 CREATE THE MODEL

Now we will create our CNN model in Python data science project. A CNN model
generally consists of convolutional and pooling layers. It works better for data that are
represented as grid structures, this is the reason why CNN works well for image
classification problems. The dropout layer is used to deactivate some of the neurons
during training, which reduces overfitting of the model. We will then compile the
model with the Adadelta optimizer.

5.4 TRAIN THE MODEL

The model.fit() function of Keras will start the training of the model. It takes the
training data, validation data, epochs, and batch size.
It takes some time to train the model. After training, we save the weights and model
definition in the ‘mnist.h5’ file.

5.5 EVALUATE THE MODEL

We have 10,000 images in our dataset which will be used to evaluate how good
our model works. The testing data was not involved in the training of the data therefore,
it is new data for our model. The MNIST dataset is well balanced so we can get around
99% accuracy.

5.6 CREATE GUI TO PREDICT DIGITS

Now for the GUI, we have created a new file in which we build an interactive
window to draw digits on canvas and with a button, we can recognize the digit. The
Tkinter library comes in the Python standard library. We have created a function
predict_digit() that takes the image as input and then uses the trained model to predict
the digit.
Then we create the App class which is responsible for building the GUI for our app.
We create a canvas where we can draw by capturing the mouse event and with a button,
we trigger the predict_digit() function and display the results.

5.7 SOURCE CODE

5.7.1 HERE IS THE SOURCE FOR TRAINING THE MODEL:

import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K

# the data, split between train and test sets


(x_train,y_train),(x_test,y_test) = mnist.load_data()

print(x_train.shape,y_train.shape)

x_train=x_train.reshape(x_train.shape[0],28,28,1)
x_test=x_test.reshape(x_test.shape[0],28,28,1)
input_shape=(28,28,1)

# convert class vectors to binary class matrices


y_train=keras.utils.to_categorical(y_train,10)
y_test=keras.utils.to_categorical(y_test,10)

x_train=x_train.astype("float32")
x_test=x_test.astype("float32")

x_train/=255
x_test/=255

batch_size=128
num_classes=10
epochs=10

model=Sequential()
model.add(Conv2D(32,kernel_size=(5,5),activation='relu',input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2,2)))
# second convolutional layer: 64 filters of 5x5 (filter count follows Section 4.4)
model.add(Conv2D(64,kernel_size=(5,5),activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(128,activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(64,activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes,activation='softmax'))

model.compile(loss=keras.losses.categorical_crossentropy,optimizer=keras.optimizers.Adadelta(),metrics=['accuracy'])

hist=model.fit(x_train,y_train,batch_size=batch_size,epochs=epochs,verbose=1,validation_data=(x_test,y_test))
print("The model has successfully trained")

score = model.evaluate(x_test,y_test,verbose=0)
print('Test loss:',score[0])
print('Test accuracy:',score[1])

model.save('mnist.h5')
print("Saving the model as mnist.h5")

5.7.2 Here’s the full code for our gui_digit_recognizer.py file:


from keras.models import load_model
from tkinter import *
import tkinter as tk
import win32gui
from PIL import ImageGrab, Image
import numpy as np

model = load_model('mnist.h5')

def predict_digit(img):
#resize image to 28x28 pixels
img = img.resize((28,28))
#convert rgb to grayscale
img = img.convert('L')
img = np.array(img)

#reshaping to support our model input and normalizing


img = img.reshape(1,28,28,1)
img = img/255.0
#predicting the class
res = model.predict([img])[0]
return np.argmax(res), max(res)

class App(tk.Tk):
def __init__(self):
tk.Tk.__init__(self)

self.x = self.y = 0

# Creating elements
self.canvas = tk.Canvas(self, width=300, height=300, bg = "white",
cursor="cross")
self.label = tk.Label(self, text="Draw..",font=("Helvetica", 48))
self.classify_btn = tk.Button(self, text = "Recognise", command
=self.classify_handwriting)
self.button_clear = tk.Button(self, text = "Clear",command = self.clear_all)

# Grid structure
self.canvas.grid(row=0, column=0,pady=2, sticky=W, )
self.label.grid(row=0,column=1,pady=2, padx=2)
self.classify_btn.grid(row=1,column=1, pady=2, padx=2)
self.button_clear.grid(row=1, column=0, pady=2)

self.canvas.bind("<B1-Motion>", self.draw_lines)

def clear_all(self):
self.canvas.delete("all")

def classify_handwriting(self):
HWND = self.canvas.winfo_id() # get the handle of the canvas
rect = win32gui.GetWindowRect(HWND) # get the coordinate of the canvas
a,b,c,d = rect
rect=(a+4,b+4,c-4,d-4)
im = ImageGrab.grab(rect)

digit, acc = predict_digit(im)


self.label.configure(text=str(digit)+', '+ str(int(acc*100))+'%')

def draw_lines(self, event):


self.x = event.x
self.y = event.y
r=8
self.canvas.create_oval(self.x-r, self.y-r, self.x + r, self.y + r, fill='black')

app = App()
mainloop()

5.8 Output

Here are the screenshots of the output obtained from the code.

CHAPTER-6 RESULT

After implementing all the three algorithms that are SVM, MLP and CNN we
have compared their accuracies and execution time with the help of experimental graphs
for perspicuous understanding. We have taken into account the Training and Testing
Accuracy of all the models stated above. After executing all the models, we found that
SVM has the highest accuracy on training data while on testing dataset CNN
accomplishes the utmost accuracy. Additionally, we have compared the execution time
to gain more insight into the working of the algorithms. Generally, the running time of
an algorithm depends on the number of operations it has performed. So, we have trained
our deep learning models for up to 30 epochs and the SVM model according to standard
practice to get an apt outcome. SVM took the minimum execution time while CNN
accounted for the maximum running time.

TABLE I COMPARISON ANALYSIS OF DIFFERENT MODELS

MODEL   TRAINING RATE   TESTING RATE   EXECUTION TIME

SVM     99.98%          94.005%        1:35 min
MLP     99.92%          98.85%         2:32 min
CNN     99.53%          99.31%         44:02 min

This table represents the overall performance of each model. It contains four
columns: the first gives the model name, the second and third give the training and
testing accuracy of the model, and the fourth gives its execution time.

Figure 6.1. Bar graph depicting accuracy comparison

Figure 6.2. Bar graph showing execution time comparison of SVM,


MLP and CNN

Furthermore, we visualized the performance measure of deep learning


models and how they improved their accuracy and reduced the error rate with respect
to the number of epochs. The significance of sketching the graphs is to know where
we should apply early stopping so that we can avoid the problem of overfitting,
since after some epochs the change in accuracy becomes constant.

Figure 6.3. ANN-Loss rate v/s Number of epochs.

Figure 6.4 ANN-Accuracy v/s Number of epochs.



Figure 6.5. CNN-Loss rate v/s Number of epochs.

Figure 6.6. CNN-Accuracy v/s Number of epochs.



CHAPTER-7
CONCLUSION AND FUTURE ENHANCEMENT

7.1 CONCLUSION

In this research, we have implemented three models for handwritten digit recognition
on the MNIST dataset, based on deep and machine learning algorithms, and compared
them on their characteristics to appraise the most accurate model among them. The
support vector machine is one of the most basic classifiers, which is why it is
faster than most algorithms and, in this case, gives the maximum training accuracy
rate; but due to its simplicity it cannot classify complex and ambiguous images as
accurately as the MLP and CNN algorithms. We found that CNN gave the most accurate
results for handwritten digit recognition, which leads us to conclude that CNN is
best suited for prediction problems that take image data as input. Finally, by
comparing the execution times of the algorithms, we concluded that increasing the
number of epochs without changing the configuration of the algorithm is of little
use because of the limitations of a given model; we noticed that after a certain
number of epochs the model starts overfitting the dataset and gives biased
predictions.

7.2 FUTURE ENHANCEMENT

The future development of applications based on deep and machine learning
algorithms is practically boundless. In the future, we can work on denser or hybrid
algorithms, trained on more varied data, to solve many more problems. The
applications of these algorithms range from the general public to high-level
authorities: with further development we can attain high-level applications usable
by classified or government agencies as well as by common people. These algorithms
can be used in hospital applications for detailed medical diagnosis, treatment and
patient monitoring; in surveillance systems to keep track of suspicious activity;
in fingerprint and retinal scanners; in database-filtering applications; in
equipment checking for national forces; and in many more problems of both major and
minor categories. Advancement in this field can help us create an environment of
safety, awareness and comfort by using these algorithms in day-to-day and high-level
(corporate or government) applications. Applications based on artificial intelligence
and deep learning are the future of the technological world because of their accuracy
and their advantages over many major problems.

REFERENCES:

[1] "Handwriting recognition": https://en.wikipedia.org/wiki/Handwriting_recognition

[2] "What can a digit recognizer be used for?": https://www.quora.com/What-can-a-digit-recognizer-be-used-for

[3] "Handwritten Digit Recognition using Machine Learning Algorithms", S. M. Shamim, Mohammad Badrul Alam Miah, Angona Sarker, Masud Rana and Abdullah Al Jobair.

[4] "Handwritten Digit Recognition Using Deep Learning", Anuj Dutt and Aashi Dutt.

[5] "Handwritten recognition using SVM, KNN, and Neural networks", Norhidayu binti Abdul Hamid, Nilam Nur Binti Amir Sharif.

[6] "Recognition of Handwritten Digit using Convolutional Neural Network in Python with Tensorflow and Comparison of Performance for Various Hidden Layers", Fathma Siddique, Shadman Sakib, Md. Abu Bakr Siddique.

[7] "Advancements in Image Classification using Convolutional Neural Network", Farhana Sultana, Abu Sufian and Paramartha Dutta.

[8] "Comparison of machine learning methods for classifying mediastinal lymph node metastasis of non-small cell lung cancer from 18F-FDG PET/CT images", Hongkai Wang, Zongwei Zhou, Yingci Li, Zhonghua Chen, Peiou Lu, Wenzhi Wang, Wanyu Liu, and Lijuan Yu.

[9] https://github.com/rishikakushwah16/SVM_digit_recognition_using_MNIST_dataset

[10] "Support Vector Machine Algorithm": https://www.javatpoint.com/machine-learning-support-vector-machine-algorithm
