RaysRecog™
New handwriting analysis software using KohoNet™

Saranya Roy

Submitted to -

Dr. Prithwis Mukherjee
Head – Research

Contents

1. Buzzwords
2. Abstract
3. Introduction
4. The Software
5. Applications of the Software – A Business Perspective
6. Installation Process
7. Technical Overview
8. Conclusion & Future Prospects
9. References

Buzzwords

HMM: Hidden Markov Model


KohoNet: Kohonen Neural Network
OCR: Optical Character Recognition
WVS: Word Vector Space

ABSTRACT

As database technology and the data-mining paradigm advance rapidly, the storage and retrieval of information have become easier than ever imagined. But current information technology is still inadequate for the analysis of data: even though we use efficient technologies such as Java, .NET, SAS, SAP and advanced Excel, we are still re-entering data from handwritten forms into databases manually!

One remedy may be the introduction of a Handwritten Character Recognition (HWCR) system. But such a system is too complex to implement for commercial purposes.

Through this effort, I have tried to partially solve the problem by building new simulation software capable of performing basic OCR functionality. The principal idea behind this simulation software is to convert a scanned character image into the corresponding editable text. The workspace consists of a Kohonen Neural Network, which primarily guides the interface for character extraction and analysis.

INTRODUCTION

OCR stands for Optical Character Recognition. OCR is a paradigm that locates and extracts character(s) from a text image into computer-editable text. Text can be captured in several different ways, for example through a scanner or through a graphics mat (a device on which text is written using a photo pen) or any other similar means. The transformed text is produced either in ASCII format or in Unicode Text Format (UTF).
This paradigm can be beneficial in different ways:
- It is useful for large volumes of data entry.
- It is useful in word-processing systems for extracting characters from a graphics mat or scanned document.
- It can also be used for digital signature manipulation and in many other significant cases.
These systems can be fundamentally categorized as embedded applications, since they are intentionally developed for dedicated, special-purpose tasks. Here the OCR approach is defined for English character symbols, classified as upper-case alphabets (A-Z), lower-case alphabets (a-z) and numbers (0-9).

Here the Kohonen Neural Network is used to provide the training and analysis procedures for classifying the character pattern stages in the pattern matrix. The construction of words is handled by a probabilistic model, the Hidden Markov Model (HMM). The HMM scheme represents each word as consecutive frame(s), where each frame stands for a unit character matrix. For example, the word COBALT has six frames: F0: Mat(C), F1: Mat(O), F2: Mat(B), F3: Mat(A), F4: Mat(L) and F5: Mat(T). During the extraction stage each character pattern is stored in a set of six 5(C) x 7(R) matrices (i.e. Mat(?)), which together are considered a Word Vector Space (WVS).

THE SOFTWARE – RaysRecog™

Script File Instructions:

RaysRecog™ can run simple scripts to make it easier to recognize large amounts of handwriting at once. Scripts can import and export everything that the program can, and they can use the train and read buttons. The scripting system also has the ability to create an "accuracy file", which is simply the number of letter discrepancies between two text files, and it can combine accuracy files. This makes accuracy testing much easier and faster.

RaysRecog™ script files should be saved with the file extension .ssf. They can be run using the Tools > Run Script menu item in the program. Each command in an SSF file must be on a separate line. Empty lines are permitted, but comments are currently not supported. Each command may have one or more arguments. The format of a line is: the command name, a space (if there are any arguments), argument1 (if applicable), a comma (if there is a second argument), argument2 (if applicable), a comma (if there is a third argument), argument3, and so on. Note that there are no spaces between arguments, and that quotation marks are never necessary.
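The line format above can be illustrated with a small parser sketch. The file names in the examples are hypothetical, not paths shipped with the product:

```python
# Minimal sketch of the SSF line format: a command name, then (if any)
# a single space followed by comma-separated arguments with no spaces
# between them.
def parse_ssf_line(line):
    line = line.strip()
    if not line:                      # empty lines are allowed
        return None
    name, _, rest = line.partition(" ")
    args = rest.split(",") if rest else []
    return name, args

print(parse_ssf_line("read"))
print(parse_ssf_line("import_image samples/page1.bmp"))
print(parse_ssf_line("generate_accuracy_file acc.txt,out.txt,truth.txt"))
```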

Screen Shot:

Fig. 01: RaysRecog™ screenshot


List of commands:

train
read
run_script
import_image
import_text
export_image
export_text
generate_accuracy_file
combine_accuracy_files

Description of each command:

train
Arguments: 0
Description: This works just like pressing the train button. It takes the active area contents and the text area
contents and trains the selected writer using them.

read
Arguments: 0
Description: This works just like pressing the read button. It takes the active area contents, interprets them, and
outputs the result into the text area.

run_script
Arguments: 1: The path of an existing .ssf file.
Description: This takes another script file and runs it. Note that since conditional statements are currently not
possible, scripts should never be called recursively.

import_image
Arguments: 1: The path of an existing .bmp or .isd file.
Description: This takes any bitmap or ink stroke data file and loads it into the active area. It will automatically
detect the file type.

import_text
Arguments: 1: The path of an existing .txt file.
Description: This loads the given text file into the text area.

export_image
Arguments: 1: The path to an .isd file (does not need to exist).
Description: This exports the active area into an ink stroke data file. Note that exporting bitmaps is not yet
supported.

export_text
Arguments: 1: The path to a .txt file (does not need to exist).
Description: This exports the text area into a text file.

generate_accuracy_file
Arguments: 3: A .txt file for the accuracy file (does not need to exist), then two different .txt files that will be
compared.
Description: This is used to compare the program's output text with a text file containing the correct text. It compares the files given by arguments 2 and 3 letter by letter and finds the number of differences, then outputs the resulting letter and word accuracies into the file given by argument 1.
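The letter-by-letter comparison can be sketched as below. The exact counting rule RaysRecog uses is not specified, so treating extra or missing trailing letters as one difference each is an assumption:

```python
# Hedged sketch of the letter-by-letter difference count that
# generate_accuracy_file is described as performing.
def count_letter_diffs(text_a, text_b):
    diffs = sum(1 for a, b in zip(text_a, text_b) if a != b)
    return diffs + abs(len(text_a) - len(text_b))  # length mismatch counts too

print(count_letter_diffs("COBALT", "COBALT"))  # 0
print(count_letter_diffs("C0BALT", "COBALT"))  # 1
```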

combine_accuracy_files
Arguments: 3 or more: A .txt file for the output accuracy file, then any number of input accuracy files (should be
at least 2).
Description: This takes all of the input accuracy files, totals their numerators and denominators, and then recalculates the percentages. The result is written to the file given by the first argument.
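The pooling behaviour described above can be sketched as follows. The representation of an accuracy file as a (correct, total) pair is an assumption made for illustration:

```python
# Sketch of combining accuracy files: sum numerators and denominators
# across all files, then recompute the percentage from the totals.
def combine_accuracies(pairs):
    correct = sum(c for c, _ in pairs)
    total = sum(t for _, t in pairs)
    return 100.0 * correct / total

# Pooling (49/60 here) is not the same as averaging the two
# percentages (which would give 85.0):
print(combine_accuracies([(9, 10), (40, 50)]))
```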

APPLICATIONS OF THE SOFTWARE – A BUSINESS PERSPECTIVE

- It provides a human-computer interactive agent for handwritten character extraction.
- It maintains a character dictionary for the initial character sets as well as the experienced character sets.
- It provides a sample workspace in which to draw the characters that are to be recognized by the system.
- Using the KohoNet, it traces the edges of human handwritten characters.
- It builds a knowledge model that upgrades the experience level of the system for different varieties of patterns.

INSTALLATION PROCESS

1. Insert the installation media into your computer or mount the disk image supplied with your online order.
Double-click the Installer application icon to launch the Installer.

Fig. 02: RaysRecog™ installation icons

Follow the Installer instructions if using a .pkg installer. After a few seconds an installer dialog will appear:

Fig. 03: RaysRecog™ Installer screenshot

Enter the serial number printed on the serial number sticker, which is found either on the CD sleeve, on your order invoice, or on the inside cover of your product manual.

2. Under normal circumstances the installer will automatically locate the proper destination for your software.
Unless you have a special reason to install your RaysRecog software in a different location, use the default location
provided by the installer. If you need to install the software in a custom location, click the “Select Destination
Directory...” button to choose the software installation location.
A folder selection dialog box will appear:

Fig. 04: RaysRecog™ Installer screenshot2

Navigate until you have selected the appropriate folder to contain the software and then click the “Open” button at
the bottom of the dialog. The folder selection dialog box will close.

3. Now click the “Install” button in the installer dialog. Installation generally takes less than five minutes.
4. To receive complete support from us, fill in the following registration form, then press the “OK” button.

Fig. 05: RaysRecog™ Registration Information screenshot

5. Congratulations! Your software is now installed.

TECHNICAL OVERVIEW

CHARACTER ANALYSIS PROCEDURE:

This section gives an overview of the technical part of the software. As showing program code is beyond the scope of this paper, only the flow diagram and the core technology behind it are depicted. The Character Analysis Process (CAP) has three basic steps (out of seven in total), which serve as the 'basic building blocks' of all analysis systems. These three steps are:
Data Collection (Dictionary Management) Stage -
The first step deals with gathering a large amount of data, where system-specific data should be preserved for future training purposes. If irrelevant data are chosen, they may lead the system into a misguided training stage; if too much data is selected, the complexity of the system may increase drastically. So, in a nutshell, utmost care must be taken while selecting data.
Pre-processing Stage -
The next stage is pre-processing, where the image-processing procedures are performed. 'Grey-scale image conversion', 'binary image conversion' and 'skew correction' are three sub-processes that depend on the pre-processing stage, so great care should be taken when designing it. Activities such as pixel ratio analysis, thinning, edge detection, chain coding, pixel mapping and histogram analysis come into the picture at this stage. These steps are required to convert the set of raw data into trainable components. KohoNet is chosen as the classification scheme: a printed or handwritten English character is acquired by means of a scanner, and the corresponding bytes are taken as the raw data to the system for processing.

Processing Stage -
In the third stage, also known as the 'Processing Stage', the extracted characters may be found either in grey scale or in black & white. Pixels in grey scale must be converted to black & white during pre-processing, so that each can be represented by a truth value (0 or 1). By convention, the truth value 1 symbolizes a black pixel and 0 a white pixel. The pixels are then examined and mapped into a specific area, and a vector is extracted from the image containing the English word or character.
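The grey-scale to black & white conversion with the truth-value convention above can be sketched in a few lines. The threshold value of 128 is an assumption, since the document does not specify one:

```python
# Sketch of binarization: each grey pixel (0-255, 0 = black) becomes a
# truth value, 1 for black and 0 for white, as per the convention above.
def to_binary(grey_pixels, threshold=128):
    return [[1 if p < threshold else 0 for p in row] for row in grey_pixels]

print(to_binary([[0, 200], [255, 90]]))  # [[1, 0], [0, 1]]
```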

Fig. 06: Image Processing With Kohonen Network

IMPLEMENTATION OF CHARACTER RECOGNITION STEPS

Data Collection (Dictionary Management)


As the flexibility as well as the durability of the system depends entirely upon the correct selection of data, the upper-case English characters are taken for the experiment. A 7R x 5C (7 rows by 5 columns) matrix has been taken for each character.
for each character has been taken.

A 00110001100111001010111111100110001
B 11111100011000111111100111000110111
C 11111100001000010000100001100001111
D 11111100011000110000100011000111111
E 11111100001000011111100001000011111
F 11111100001000011110100001000010000
G 01110110001000010111100011000111111
I 10001100001000111001111111000110001
K 11111001000010000100001000010000111
L 11111001000010000100101001010011100
M 10001100111111011010100101001110011
N 10000100001000010000100001000011111
O 10000100001000010000100001000011111
P 11111110111000110001100001000110001
Q 11111100011000110000100011000111111
R 11111100011001111110100001000010000
S 01111110011000110001100111101101111
T 11111100011000111011111101001110001
U 01111110001100000111100001000011111
V 11111001000010000100001000010000100
W 10001100001000110001100011001111110
X 10001100011101101011011100111000110
Y 10101101011010110101101011011111111
Z 10011110100111001100111001011010010
Y 10001110110111001100010000100001000
Z 11111000110011001100110001000011111

Fig. 07: Char. Matrix Bit Pattern

As this experiment is based on an embedded application system and the characters are chosen in a fixed matrix pattern, there is no need to concentrate on the skew-correction technique. From Fig. 07 it is evident that each character has a 35-bit pattern. These bit patterns are required to represent the unit character matrices for the corresponding characters. For example, the character A has the following pattern –

Actual Character Pattern Matrix Character Matrix

Fig. 08: Sample Character Pattern
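The relationship between a 35-bit pattern and its 7 x 5 unit character matrix can be sketched as below; the bit string for A is copied as-is from the table in Fig. 07:

```python
# Sketch: slice a 35-bit pattern string into the 7-row x 5-column unit
# character matrix, row by row.
pattern_a = "00110001100111001010111111100110001"
matrix = [pattern_a[r * 5:(r + 1) * 5] for r in range(7)]
for row in matrix:
    print(row)
```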

Pre-processing Stage –

Here all the above characters are measured to represent a fixed ratio. During the design of the character entry palette, one thing should be kept in mind: the palette must be designed in a 7(H):5(W) proportion. If the size of the palette is not in this ratio, the recognition scheme can be mistaken, so it is necessary to follow the size ratio during the design of the character palette. When the character patterns are determined, a 1 in a cell of the matrix means the cell contains a black pixel, and a 0 a white pixel. During the secondary stage of Data Collection, a workspace is implemented from which any English character pattern of our preference can be entered. This editor allows one to enter an English character along with its pattern, which improves the recognition scheme's ability to recognize every type of English character. For example, if there is a need to recognize lower-case letters such as a, b, c, d, then the patterns can be entered into the dictionary before executing the recognition technique, and as a result they can be recognized properly. I have defined this scheme as Dictionary Collection Entry. The implemented workspace is as follows:

Fig. 09: New Character Pattern Creation Scheme Fig. 10: Wrong Character Pattern For 3

The example has been plotted for the new character 6. If the pattern for 6 is to be created, 6 should be entered in the Character field and its bit pattern, in terms of 0s and 1s, in the Bit Pattern field. While the pattern is being entered, the character pattern is simultaneously shown in the rightmost pattern matrix/palette. The pattern can be saved to the existing dictionary by clicking the Save button. In this way, any type of pattern can be stored for use in the subsequent recognition technique.
Finally, the collection scheme is completed by a pattern-editing functionality. What is pattern editing? As an example, suppose somebody intended to create the pattern for 3 but actually entered the pattern for 7.

To correct this, the pattern needs to be modified. The system provides a workspace in which one can edit the wrong pattern of a character by replacing the 0s and 1s to produce the required pattern.
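The pattern-editing idea can be sketched as a single bit replacement; the zero-based position indexing here is an assumption for illustration:

```python
# Sketch of pattern editing: correct a wrongly entered dictionary
# pattern by replacing individual 0s and 1s rather than re-entering
# all 35 bits.
def edit_pattern(pattern, position, bit):
    cells = list(pattern)
    cells[position] = bit
    return "".join(cells)

wrong = "11111"                      # shortened example pattern
print(edit_pattern(wrong, 2, "0"))   # 11011
```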

Processing Stage -

The fundamental difference between Training and Recognition:


The image-processing methodology is exactly the same for training and recognition. The only difference is that the training mechanism uses a certain number of samples to model a particular character or word, from which the model parameters are estimated, whereas the recognition operation uses the resulting model to identify a particular image character or word.

Recognizing Pattern with the help of Kohonen Neural Network:



This is the most sensitive stage of the Optical Character Recognition system. In this stage the system actually finds the bit pattern of the individual character matrices. After predicting the bit pattern, the system compares it with the existing character patterns stored in the character pattern dictionary. During this recognition process, if the input pattern matches 70% or more of a dictionary character pattern, it is recognized successfully. Otherwise, the input pattern is added to the character dictionary as a new character pattern, or as a variety of an existing pattern, depending on the user's preference. The steps involved in pattern recognition with KohoNet are as follows:

Pattern Vector creation:


Once the Hidden Markov Model has been derived by the system and the model is sampled, the vector has two distinct areas: one white and one black. For example, see Fig. 11, which shows the black and white areas of the character B.

Fig. 11: Character Vector of B Fig. 12: Pattern Vector of B

Next the system recognizes the pattern vector corresponding to the pattern area. For example, while retrieving the pattern from the above character (Ref. Fig. 11), if a white area is found in the input character matrix, the corresponding pattern vector element is marked as 0, and for a black area it is marked as 1. Hence the pattern vector of Fig. 11 will be as per Fig. 12.
Each of these vectors is then given a new unique ID and stored in the database so that they can be utilized during future recognition processes.
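The two steps above can be sketched together: build the pattern vector (white → 0, black → 1) and accept a dictionary match at 70% agreement or above. The bit strings come from the table in Fig. 07; everything else is an illustrative assumption:

```python
# Hedged sketch: pattern vector creation plus the 70% matching rule.
def to_vector(bits):
    return [int(b) for b in bits]

def match_score(vec, ref):
    # fraction of positions where the two vectors agree
    return sum(a == b for a, b in zip(vec, ref)) / len(ref)

dictionary = {"C": to_vector("11111100001000010000100001100001111")}
sample = to_vector("11111100001000010000100001100000111")  # one bit off
score = match_score(sample, dictionary["C"])
print(score >= 0.70)  # True: 34 of 35 bits agree
```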


Introduction to Kohonen Neural Network


So far we have been through a number of pre-processing stages to procure the pattern vector. The procured vector is now considered as the input to the Kohonen Neural Network. But the question may arise: what is the concept behind the Kohonen Neural Network, and what does it do?
The answer is not at all complex to understand. It is a neural network architecture invented by Dr. Teuvo Kohonen, and it differs from the classical neural network (i.e. the feed-forward back-propagation neural network) in several aspects:
- The way in which training and recall of the patterns are done.
- No activation function is required.
- No bias weight is used.
- The network can be trained in an unsupervised mode, i.e. no supporting supervision is required to train the network.
- It takes full responsibility for gathering data, i.e. if the input to the system does not provide a specification for the data, it can identify that automatically.
In the Kohonen Neural Network the input neurons do not all deliver outputs successively. The m-path detection scheme is used to analyze the pixel neurons whenever an input neuron is presented to the training algorithm. During this process one of the input neurons is selected as the "winner". Finally, when all the neurons have been traversed successfully, the winner neuron is considered the output of the system.
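The winner selection described above can be sketched with a standard nearest-weight rule: the output neuron whose weight vector lies closest to the input pattern vector wins. The random weights below are stand-ins, not trained values:

```python
# Hedged sketch of winner-takes-all selection in a Kohonen-style network.
import random

def winner(input_vec, weight_rows):
    def sq_dist(w):
        return sum((x - wi) ** 2 for x, wi in zip(input_vec, w))
    return min(range(len(weight_rows)), key=lambda i: sq_dist(weight_rows[i]))

random.seed(1)
inp = [random.random() for _ in range(35)]                          # 7 x 5 pattern
weights = [[random.random() for _ in range(35)] for _ in range(8)]  # 8 outputs
w = winner(inp, weights)
print(0 <= w < 8)  # True
```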
For example, consider the scheme for Fig. 12: here the input vector has 7 rows and 5 columns, so the system has 7 x 5 = 35 input pixels, and therefore 35² = 1225 input neurons. Now if the system selects 8 sets of neurons as outputs, the scheme is as follows (Ref. Fig. 13):

Fig. 13: Kohonet Output Selection Scheme



As shown in Fig. 13, only four neurons among the input pixels have been chosen as the winners.

CONCLUSION & FUTURE WORK

I have chosen the Kohonen Neural Network because I found it an innovative and independent pattern recognition process. So far I have tried to demonstrate the major functionalities involved in this system; alongside these there are also many related mathematical and technical theories to learn and develop.
The Kohonen scheme described above covers the basic principles of the architecture. Topics such as KohoNet structure, rate of accuracy, weight adjustment, height adjustment, and error detection and correction are still under research. There are also multiple ways to implement the OCR paradigm for recognizing characters of different languages, for digital signature recognition, etc., as well as new algorithms to reduce the present complexity issues.

REFERENCES

1. SAHANA official website for XForm (http://wiki.sahana.lk/doku.php?id=dev:sahana_xform)
2. SAHANA official website for OCR (http://wiki.sahana.lk/doku.php/sahanaocr)
3. N. Balakrishnan, "Universal Digital Library – Future Research Directions", Journal of Zhejiang University SCIENCE
4. A. J. Palkovic, "Improving Optical Character Recognition"
5. "OCR for AnyDoc Is Just What the Doctor Ordered", Dr. Leonard – AnyDOC Software
6. Xiaofan Lin, "DRR Research beyond COTS OCR Software: A Survey", HP Imaging Systems Laboratory
7. Anindya Chatterjee, "Optical Character Recognition System: KohoNet Approach"
