BIS RaysRecog
RaysRecog™
New handwriting analysis software using KohoNet™
Saranya Roy
Submitted to -
Dr. Prithwis Mukherjee
Head – Research
Contents
1 Buzzwords
2 Abstract
3 Introduction
4 The Software
5 Installation Process
6 Technical Overview
7 References
Buzzwords
ABSTRACT
As Database Technology and the Data Mining Paradigm advance rapidly, the storage and retrieval of information have become easier than ever imagined. But current Information Technology is still inadequate for the analysis of data: though we use efficient technologies like Java, .NET, SAS, SAP, Advanced Excel etc., we are still re-entering data from handwritten forms into databases manually.
Through this effort, I have tried to partially solve this problem by building a new simulation software capable of performing basic OCR functionality. The principal idea behind this simulation software is to convert a scanned character image into the corresponding editable text. The Work Space will consist of a Kohonen Neural Network, which will primarily guide the interface for character extraction and analysis.
INTRODUCTION
OCR is an abbreviation for Optical Character Recognition. OCR is basically a paradigm which locates and extracts character(s) from a text image into computer-editable text. There are several ways to capture text for extraction, such as through a scanner, through a graphics mat (a device on which text is written with a photo pen), or any other similar means. The transformed text is produced either in ASCII format or in Unicode Text Format (UTF).
This paradigm can be beneficial in different ways:
- It is useful in the case of huge amounts of data entry.
- It is useful in word processing systems to extract characters from the graphics mat or a scanned document.
- It can also be used for Digital Signature Manipulation and in many other significant cases.
These types of systems can fundamentally be categorized as Embedded Applications, as they are intentionally developed and dedicated to special tasks. Here the OCR approach is defined for English character symbols, classified as Upper Case Alphabets (A-Z), Lower Case Alphabets (a-z) and Numbers (0-9).
Here the Kohonen Neural Network will be used to provide the training and analysis procedures for classifying the character pattern stages in the pattern matrix. The building of words is done by a probabilistic model called the Hidden Markov Model (HMM). The HMM scheme represents each word via consecutive frame(s), in which each frame stands for a unit character matrix. For example, the word COBALT would have six frames: F0: Mat(C), F1: Mat(O), F2: Mat(B), F3: Mat(A), F4: Mat(L) and F5: Mat(T). During the extraction stage each character pattern is stored in a set of six 5(C) x 7(R) matrices (i.e. Mat(?)), which together are considered a Word Vector Space (WVS).
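The framing described above can be sketched as follows. The all-zero 5x7 matrices here are placeholders, since the real unit character matrices come from the character dictionary; the function names are mine, not the program's:

```python
# Sketch of the Word Vector Space (WVS) framing: each character of a word
# becomes one frame, and each frame is a 5(C) x 7(R) unit character matrix.
# The all-zero matrices are placeholders for the real dictionary patterns.

def empty_frame(cols=5, rows=7):
    """A blank 5x7 unit character matrix (list of rows)."""
    return [[0] * cols for _ in range(rows)]

def word_to_wvs(word):
    """Map a word to its Word Vector Space: one labelled frame per character."""
    return [(f"F{i}: Mat({ch})", empty_frame()) for i, ch in enumerate(word)]

wvs = word_to_wvs("COBALT")
for label, frame in wvs:
    print(label, f"({len(frame)}x{len(frame[0])})")
```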
RaysRecog™ is able to run simple scripts to make it easier to recognize large amounts of handwriting at once. Scripts can be used to import and export everything that the program can, and they can use the train and read buttons. The scripting system also has the ability to create an "accuracy file", which is simply the number of letter discrepancies between two text files, and it can combine accuracy files. This makes accuracy testing much easier and faster.
RaysRecog™ script files should be saved with the file extension .ssf. They can be run using the Tools > Run Script menu item in the program. Each command in an SSF file must be on a separate line. Empty lines are allowed, but comments are currently not supported. Each command may have one or more arguments. The format for a line is: the command name, then a space (if there are any arguments), then argument1, a comma (if there is a second argument), argument2, a comma (if there is a third argument), argument3, and so on. Note that there are no spaces between arguments, and quotation marks are never necessary.
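The line format just described can be sketched as a small parser. This is only an illustration of the stated format (the function name and the example file paths are mine):

```python
# Minimal sketch of parsing one line of an .ssf script: the command name,
# then (if any arguments) a single space and the arguments separated by
# commas with no surrounding spaces. Empty lines are allowed; comments are not.

def parse_ssf_line(line):
    """Return (command, [args]) for a script line, or None for an empty line."""
    line = line.rstrip("\n")
    if not line.strip():
        return None                      # empty lines are allowed
    if " " in line:
        command, arg_part = line.split(" ", 1)
        return command, arg_part.split(",")
    return line, []                      # command with no arguments

print(parse_ssf_line("import_image samples/a.bmp"))
print(parse_ssf_line("generate_accuracy_file out.txt,ref.txt,result.txt"))
print(parse_ssf_line("train"))
```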
train
read
run_script
import_image
import_text
export_image
export_text
generate_accuracy_file
combine_accuracy_files
train
Arguments: 0
Description: This works just like pressing the train button. It takes the active area contents and the text area
contents and trains the selected writer using them.
read
Arguments: 0
Description: This works just like pressing the read button. It takes the active area contents, interprets them, and
outputs the result into the text area.
run_script
Arguments: 1: The path of an existing .ssf file.
Description: This takes another script file and runs it. Note that since conditional statements are currently not
possible, scripts should never be called recursively.
import_image
Arguments: 1: The path of an existing .bmp or .isd file.
Description: This takes any bitmap or ink stroke data file and loads it into the active area. It will automatically
detect the file type.
import_text
Arguments: 1: The path of an existing .txt file.
Description: This loads the given text file into the text area.
export_image
Arguments: 1: The path to an .isd file (does not need to exist).
Description: This exports the active area into an ink stroke data file. Note that exporting bitmaps is not yet
supported.
export_text
Arguments: 1: The path to a .txt file (does not need to exist).
Description: This exports the text area into a text file.
7
generate_accuracy_file
Arguments: 3: A .txt file for the accuracy file (does not need to exist), then two different .txt files that will be
compared.
Description: This is used to compare the program output text with a text file containing the correct text. It compares
the files given by arguments 2 and 3 letter by letter and finds the number of differences. It then outputs the
resulting letter and word accuracies into the file given by argument 1.
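The letter-by-letter comparison behind this command can be sketched as follows; the counting convention for texts of unequal length is my assumption, not confirmed behaviour of the program:

```python
# Sketch of the letter-by-letter comparison behind generate_accuracy_file:
# count positions where the two texts differ (any extra tail letters count
# as differences) and report accuracy as matches / total.

def letter_accuracy(reference, output):
    """Return (matches, total) comparing two texts letter by letter."""
    total = max(len(reference), len(output))
    diffs = sum(1 for a, b in zip(reference, output) if a != b)
    diffs += abs(len(reference) - len(output))   # unmatched tail letters
    return total - diffs, total

matches, total = letter_accuracy("COBALT", "C0BALT")
print(f"{matches}/{total} = {matches / total:.0%}")
```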
combine_accuracy_files
Arguments: 3 or more: A .txt file for the output accuracy file, then any number of input accuracy files (there should
be at least 2).
Description: This takes all of the input accuracy files, totals their numerators and denominators, and then
recalculates the percentages. The result is put into the file defined by the first argument.
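The combining step amounts to summing the fractions' numerators and denominators and recomputing the percentage, which can be sketched as below (the file parsing is omitted and the sample fractions are illustrative):

```python
# Sketch of combine_accuracy_files: each accuracy file is modelled here as a
# (numerator, denominator) pair; the combined result sums numerators and
# denominators separately and recomputes the overall percentage.

def combine_accuracies(pairs):
    """Sum (matches, total) pairs and recompute the overall percentage."""
    matches = sum(m for m, _ in pairs)
    total = sum(t for _, t in pairs)
    return matches, total, 100.0 * matches / total

m, t, pct = combine_accuracies([(5, 6), (18, 20), (40, 50)])
print(f"{m}/{t} = {pct:.1f}%")
```

Note that summing numerators and denominators weights each file by how much text it covers, which is not the same as averaging the individual percentages.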
- It facilitates a Human and Computer Interactive Agent for Hand-Written Character Extraction.
- It maintains a Character Dictionary for the Initial Character Sets as well as Experienced Character Sets.
- It provides a Sample Work Space to play/draw the characters which are to be recognized by the system.
- Using the KohoNet Network, it tries to trace the edges of human handwritten characters.
- It builds a Knowledge Model that upgrades the experience level of the system for different varieties of patterns.
INSTALLATION PROCESS
1. Insert the installation media into your computer or mount the disk image supplied with your online order.
Double-click the Installer application icon to launch the Installer, and follow the Installer instructions if using a
.pkg installer. After a few seconds an installer dialog will appear.
Enter the serial number printed on the serial number sticker, which is found either on the CD sleeve, on
your order invoice, or on the inside covers of your product manual.
2. Under normal circumstances the installer will automatically locate the proper destination for your software.
Unless you have a special reason to install your RaysRecog software in a different location, use the default location
provided by the installer. If you need to install the software in a custom location, click the “Select Destination
Directory...” button to choose the software installation location.
A folder selection dialog box will appear.
Navigate until you have selected the appropriate folder to contain the software and then click the “Open” button at
the bottom of the dialog. The folder selection dialog box will close.
3. Now click the “Install” button in the installer dialog. Installation will generally take less than five minutes.
4. For complete support from us, fill in the registration form that appears, then press the “OK” button.
TECHNICAL OVERVIEW
An overview of the technical part of the software is now given. As showing program code is beyond the scope of
this paper, only the flow diagram and the core technology behind it are depicted. The Character Analysis Process
(CAP) has three basic steps (out of seven in total), which are called the ‘basic building blocks’ of all analysis
systems. These three steps are:
Data Collection (Dictionary Management) Stage -
The first step deals with gathering a large amount of data, where system-specific data should be
preserved for future training purposes. If irrelevant data are chosen, they may lead the system into a
misguided training stage; again, if overestimated data are selected, the complexity of the system may
increase drastically. So, in a nutshell, utmost care must be taken while selecting data.
Pre-processing Stage -
The next stage is the pre-processing stage, where mainly the image processing procedures are done. ‘Grey
scale image conversion’, ‘Binary image conversion’ and ‘Skew correction’ are three sub-processes on
which the pre-processing stage depends, so great care should be taken while designing it. Activities like
Pixel Ratio Analysis, Thinning, Edge Detection, Chain Code, Pixel Mapping, Histogram Analysis etc.
come into the picture at this stage. These steps are required to convert the set of raw data into trainable
components. KohoNet is chosen as the classification scheme: a printed or handwritten English character
is acquired by means of a scanner, and the corresponding bytes are taken as the raw data to the system
for processing.
Processing Stage -
Now at the 3rd stage, also known as the ‘Processing Stage’, the extracted characters may be found
either in grey scale or in black & white. Pixels which are in grey scale must be converted to black &
white during the Pre-processing stage, so that they can be represented by a truth value (0 or 1). By
convention, the truth value 1 symbolizes a black pixel and the truth value 0 a white pixel. The pixels are
then examined and mapped into a specific area and vector, which is extracted from the image containing
the English word or character.
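The grey-to-binary conversion described above can be sketched as follows; the 0-255 grey range and the threshold of 128 are my assumptions, not values taken from the program:

```python
# Sketch of the grey-to-binary conversion: grey-scale pixels (assumed here
# to run 0 = black .. 255 = white) are thresholded so that truth value 1
# represents a black pixel and 0 a white pixel, per the convention above.

def binarize(grey_pixels, threshold=128):
    """Map grey-scale values to truth values: 1 for black, 0 for white."""
    return [[1 if p < threshold else 0 for p in row] for row in grey_pixels]

grey = [[12, 240, 130],
        [200, 40, 255]]
print(binarize(grey))
```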
Fig. 07: 35-bit character patterns
A 00110001100111001010111111100110001
B 11111100011000111111100111000110111
C 11111100001000010000100001100001111
D 11111100011000110000100011000111111
E 11111100001000011111100001000011111
F 11111100001000011110100001000010000
G 01110110001000010111100011000111111
I 10001100001000111001111111000110001
K 11111001000010000100001000010000111
L 11111001000010000100101001010011100
M 10001100111111011010100101001110011
N 10000100001000010000100001000011111
O 10000100001000010000100001000011111
P 11111110111000110001100001000110001
Q 11111100011000110000100011000111111
R 11111100011001111110100001000010000
S 01111110011000110001100111101101111
T 11111100011000111011111101001110001
U 01111110001100000111100001000011111
V 11111001000010000100001000010000100
W 10001100001000110001100011001111110
X 10001100011101101011011100111000110
Y 10101101011010110101101011011111111
Z 10011110100111001100111001011010010
Y 10001110110111001100010000100001000
Z 11111000110011001100110001000011111
As this experiment is based on an embedded application system and the characters are chosen in a fixed matrix
pattern, there is no need to concentrate on the skew correction technique. From Fig. 07 it is evident that each
character has a 35-bit pattern. These bit patterns are required to represent the Unit Character Matrices of the
corresponding characters. For example, the character A has the 35-bit pattern shown in the first row of Fig. 07.
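As a check, the 35-bit string for A from Fig. 07 can be unpacked into its 7(R) x 5(C) unit character matrix; this short sketch (the helper name is mine) prints one row of 5 bits per line:

```python
# Unpack a 35-bit pattern from Fig. 07 into its 7(R) x 5(C) unit character
# matrix, row by row (1 = black pixel, 0 = white pixel).

def unpack_pattern(bits, cols=5):
    """Split a 35-bit pattern string into rows of `cols` bits."""
    return [bits[i:i + cols] for i in range(0, len(bits), cols)]

pattern_A = "00110001100111001010111111100110001"
for row in unpack_pattern(pattern_A):
    print(row)
```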
Pre-processing Stage –
Here all the above characters are measured against a fixed ratio. During the design of the Character Entry Palette,
one thing should be kept in mind: the palette must be designed in 7(H):5(W) proportion. If the size of the palette
is not in this ratio, the recognition scheme can be mistaken, so it is necessary to follow the size ratio while
designing the character palette. When the character patterns are determined, a cell in the matrix holding 1 means
the cell contains a black pixel, and 0 a white pixel. During the secondary stage of Data Collection, a Work Space
is implemented from which any English character pattern of our preference can be entered. This editor allows one
to enter an English character along with its pattern, which improves the scheme's ability to recognize every type
of English character. For example, if there is a need to recognize lower-case letters like a, b, c, d ... then the
pattern can be entered into the dictionary before the recognition technique is executed, so that it can be
recognized properly. I have named this scheme Dictionary Collection Entry. The implemented workspace is as
follows:
Fig. 09: New Character Pattern Creation Scheme Fig. 10: Wrong Character Pattern For 3
The example has been plotted for the new character 6. If the pattern for 6 is to be created, 6 should be entered in
the Character Field and its bit pattern into the Bit Pattern field in terms of 0s and 1s. While the pattern is being
entered, it is simultaneously shown in the rightmost pattern matrix/palette. The pattern can be saved to the
existing dictionary by clicking the Save button. In this way any type of pattern can be stored and used in the
further recognition technique.
Finally, the collection scheme is completed by a pattern editing functionality. What is pattern editing? To explain,
take an instance in which somebody intended to create the pattern of 3 but actually entered the pattern for 7.
To correct this, the pattern of 7 needs to be modified. The system provides a work space from which one can edit
a wrong character pattern by replacing its 0s and 1s to produce the required pattern.
Processing Stage -
This is the most responsive stage of the Optical Character Recognition system. In this stage the system actually
finds the bit pattern of the individual character matrices. After predicting the bit pattern, the system compares it
with the existing character patterns stored in the character pattern dictionary. During this recognition process, if
the input pattern matches about 70% or more of a dictionary character pattern, it is recognized successfully.
Otherwise the input pattern is added to the character dictionary, either as a new character pattern or as a variety
of an existing pattern, depending on the user's preference. The steps involved in Pattern Recognition with
KohoNet are as follows:
Next, the system recognizes the pattern vector corresponding to the pattern area. For example, while retrieving
the pattern from the above character (Ref. Fig. 10), if a white area is found in the input character matrix the
corresponding pattern vector element is marked 0, and for a black area it is marked 1. Hence the pattern
vector of Fig. 10 will be as per Fig. 11.
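The 70% dictionary match can be sketched as below, using two pattern strings from Fig. 07; the function names, the tiny two-entry dictionary, and the noisy input are illustrative, not the program's actual implementation:

```python
# Sketch of the dictionary lookup: the input 35-bit pattern vector is
# compared against every stored pattern, and a character is recognized when
# at least 70% of its bits agree with a dictionary pattern.

def match_ratio(a, b):
    """Fraction of positions where two equal-length bit strings agree."""
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)

def recognize(pattern, dictionary, threshold=0.70):
    """Return (char, ratio) for the best match at or above threshold, else None."""
    char = max(dictionary, key=lambda c: match_ratio(pattern, dictionary[c]))
    ratio = match_ratio(pattern, dictionary[char])
    return (char, ratio) if ratio >= threshold else None

dictionary = {
    "A": "00110001100111001010111111100110001",
    "C": "11111100001000010000100001100001111",
}
# A noisy 'A' with its first and last bits flipped still matches above 70%.
noisy = "10110001100111001010111111100110000"
print(recognize(noisy, dictionary))
```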
Each of these vectors is then given a new unique ID and stored in the database so
that it can be utilized during future recognition processes.
As per Fig. 12, only four neurons amongst the input pixels have been chosen as the winners.
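The winner selection in a Kohonen network can be sketched as follows; the random weights are placeholders, not values RaysRecog™ actually learns, so this only illustrates the winner-take-all idea behind choosing the winning neurons:

```python
# Minimal sketch of Kohonen winner selection: each output neuron holds a
# weight vector, and the winner is the neuron whose weight vector is closest
# (by Euclidean distance) to the input pattern vector.
import math
import random

random.seed(1)  # reproducible placeholder weights

def winner(inputs, weights):
    """Index of the neuron whose weight vector best matches the input."""
    def dist(w):
        return math.sqrt(sum((x - wi) ** 2 for x, wi in zip(inputs, w)))
    return min(range(len(weights)), key=lambda i: dist(weights[i]))

inputs = [1, 0, 1, 1, 0]                       # a 5-bit slice of a pattern
weights = [[random.random() for _ in range(5)] for _ in range(4)]
print("winning neuron:", winner(inputs, weights))
```

During training, the winner's weight vector is nudged toward the input, which is how the network gradually organizes itself around the character patterns it sees.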
I have chosen the Kohonen Neural Network because I have found it to be an innovative and independent pattern
recognition process. So far I have tried to demonstrate the major functionalities involved in this system; alongside
these there are also many other related mathematical and technical theories to learn and develop.
The Kohonen scheme described above covers only the basic principle of the architecture. Topics like KohoNet
Structure, Rate of Accuracy, Weight Adjustment, Height Adjustment, Error Detection and Correction etc. are still
under research, as are the multiple ways of implementing the OCR paradigm for recognizing characters of
different languages, digital signature recognition etc., and the introduction of new algorithms to reduce the
present complexity issues.
REFERENCES