Yerrijdnewpaper
INTRODUCTION
Advances in technology have led to increased use of smartphones, tablets and digital
cameras, resulting in large collections of heterogeneous data consisting of video images, natural
scene images and web-based images containing text. The text in these images can be used
for numerous applications such as machine translation, safe driving, license plate
tracking and recognition, navigation for the blind, spot identification and house number tracking from maps.
Text detection and recognition in video and scene images is therefore a major research issue: it provides
the basis and significant clues for content-based retrieval applications.
Given an image, the goal of text detection is to determine whether text is present and, if so, to return its
location. Text recognition then identifies and transcribes the text in these images. In
other words, the text detection task is to find a minimum-sized region of interest that contains all of the text
in the image. A combined text detection and recognition system finds all text areas in an image, marks
the boundaries of these areas and outputs the sequence of characters associated with each.
Scene text reading supports many computer vision applications such as image retrieval and intelligent
transportation.
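As an illustration of the "minimum-sized region" formulation above, the following minimal sketch computes the smallest axis-aligned rectangle enclosing a set of detected text boxes. The box convention (x_min, y_min, x_max, y_max), the function name enclosing_region and the sample coordinates are assumptions made for this example; they are not taken from any of the reviewed systems.

```python
from typing import Iterable, Optional, Tuple

# Assumed convention: a box is (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[int, int, int, int]


def enclosing_region(boxes: Iterable[Box]) -> Optional[Box]:
    """Return the smallest axis-aligned box containing every input box,
    or None when no text was detected."""
    boxes = list(boxes)
    if not boxes:
        return None
    x_min = min(b[0] for b in boxes)
    y_min = min(b[1] for b in boxes)
    x_max = max(b[2] for b in boxes)
    y_max = max(b[3] for b in boxes)
    return (x_min, y_min, x_max, y_max)


# Two hypothetical word boxes, as a detector might return them.
words = [(10, 20, 60, 40), (70, 22, 130, 45)]
region = enclosing_region(words)
```

Returning None for an empty input mirrors the first half of the detection task, namely deciding whether any text exists before reporting a location.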
Scene text varies in the following characteristics:
Style: Text in images appears either in printed block letters or in handwritten cursive form.
Size: Text in images can occupy any percentage of the image area.
Spacing: Inter-character and inter-word spacing, together with the size of the text, can make
detection difficult.
Colour: Text in scene images can have multiple font colours.
Background: Scene images have complex backgrounds in which text is embedded and with which it sometimes
merges. Hence text detection against a complex background at low resolution
is challenging.
Fig. 2: Scene text images with variation in background, font colour and orientation
Because of variation in text size, colour and font style, multiple text orientations, complex
backgrounds and geometric distortions in images, detection methods tend to miss true text
regions. Further, advances in Optical Character Recognition (OCR) systems have enabled
computers to read text from images. However, since images may contain many non-character textures,
it is difficult for OCR to read the text directly; character strings must first be extracted from the images.
In recent years, researchers have proposed several approaches for detecting and recognizing
text in natural scene images. In this paper, we first present an extensive review of the recent literature on
text detection and recognition in scene images. Secondly, the paper
details the various datasets used in these studies. Thirdly, we compare the performance of
various algorithms for scene text detection and recognition.
2. Structural problems of scene text recognition (STR):
Text detection:
Text Recognition:
End-to-end text system:
Arpit Jain et al. (2014) [17] proposed an end-to-end system for text detection and
recognition in videos. Maximally Stable Extremal Regions (MSER) are used to detect text even against poorly
illuminated backgrounds and under viewpoint variations. A support vector machine (SVM)
classifier then classifies the regions as text or non-text using shape descriptors, whose
dimensionality is reduced with the Partial Least Squares (PLS) technique to improve performance.
Finally, the detected text is binarized and sent to an OCR engine for recognition. The proposed
approach is effective in pixel-level text detection and in word recognition.
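The binarization step that precedes OCR in such a pipeline can be sketched as follows. The review does not state which binarization method [17] uses, so this example assumes Otsu's global threshold as a common stand-in; the function names, the 8-bit grey-level range and the choice of mapping brighter pixels to 1 are likewise assumptions for illustration.

```python
from typing import List, Sequence


def otsu_threshold(pixels: Sequence[int]) -> int:
    """Return the 8-bit grey level t that maximises the between-class
    variance when pixels are split into {<= t} and {> t}."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels at or below the candidate threshold
    sum0 = 0  # sum of grey levels at or below the candidate threshold
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # no pixels in the lower class yet
        w1 = total - w0
        if w1 == 0:
            break             # upper class is empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(region: List[List[int]]) -> List[List[int]]:
    """Map a grey-level text region to 0/1 using Otsu's threshold.
    Assumption: pixels brighter than the threshold become 1."""
    flat = [p for row in region for p in row]
    t = otsu_threshold(flat)
    return [[1 if p > t else 0 for p in row] for row in region]


# Hypothetical 2x3 grey-level crop with dark and bright pixels.
crop = [[10, 12, 200], [11, 210, 205]]
binary = binarize(crop)
```

Whether text corresponds to the 1s or the 0s depends on its polarity (dark on light or light on dark), which a real system would have to determine separately.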
Juli P et al. (2016) [5] used the stroke width transform (SWT) to detect text in natural scenes. A
deskewing algorithm is applied so that text can be detected in an image irrespective of its
orientation. The algorithm is able to detect text of any font, orientation, direction and scale.
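The idea of deskewing can be illustrated with a minimal sketch. The review does not describe how [5] estimates or corrects skew, so the approach below is an assumption: the skew angle is estimated by a least-squares line fit through hypothetical character centroids, and the points are then rotated back about their centroid. It requires at least two points and does not handle vertical text lines.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def estimate_skew(points: List[Point]) -> float:
    """Least-squares estimate of the text-line angle in radians.
    Assumes at least two points and a non-vertical line."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    # The fitted slope is sxy / sxx; atan2 avoids dividing by zero.
    return math.atan2(sxy, sxx)


def deskew(points: List[Point], angle: float) -> List[Point]:
    """Rotate the points by -angle about their centroid."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    c, s = math.cos(-angle), math.sin(-angle)
    return [
        (mx + c * (x - mx) - s * (y - my),
         my + s * (x - mx) + c * (y - my))
        for x, y in points
    ]


# Hypothetical character centroids lying on a line tilted by 45 degrees.
centroids = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
angle = estimate_skew(centroids)
levelled = deskew(centroids, angle)
```

After the rotation the centroids share the same vertical coordinate, which is the property a subsequent horizontal-text detector or recognizer would rely on.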