Arabic Handwritten Script Recognition Towards Generalization: A Survey
1
Arabic Handwritten Script Recognition Towards Generalization: A Survey
Authors:
Randa I. M. Elanwar, Assistant Researcher, Electronic Research Institute
Prof. Dr. Mohsen A. A. Rashwan, Professor of Digital Signal Processing, Electronic and Communication Dept., Cairo University
Prof. Dr. Samia A. A. Mashali, Head of Computers and Systems Dept., Electronic Research Institute
2
Presentation Contents
Introduction
Paper Objective
Arabic handwriting recognition problem
Main Challenges
Recent off-line Arabic handwriting recognition systems
Recent on-line Arabic handwriting recognition systems
Summary and Conclusion
3
Introduction
Handwriting recognition can be defined as the task of transforming text represented in the spatial form of graphical marks into its symbolic representation
The main components of a recognizer are:
1. Data capture & acquisition
2. Preprocessing & segmentation
3. Defining patterns and model selection
4. Feature Extraction
5. Training
6. Classification
4
Introduction
• First, the input device captures an image and converts it to a usable format
• The data is then preprocessed to eliminate noise and simplify it without losing relevant information, and may also be segmented into smaller data units
5
Introduction
• The information in each data unit is sent to a feature extractor, which reduces it by measuring certain “features” or “properties”
• Patterns (or classes) should be defined and models should be selected. These models are trained using the extracted features.
6
Introduction
• The model for a pattern may be a single specific set of features
• To recognize (or classify) a novel pattern means to recover the model that generated the pattern based on the extracted features
7
Introduction
The feature extractor has reduced the data unit to a point or feature vector X in a 2D feature space (or observation space)
Classification rule: Classify the input as Class I if its feature vector falls below the decision boundary shown, and as Class II otherwise.
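The rule above can be sketched in a few lines, assuming purely for illustration that the decision boundary is a straight line x2 = slope * x1 + intercept. The slide's actual boundary is only shown graphically, so the slope, intercept and the function name classify are hypothetical.

```python
def classify(x1, x2, slope=0.5, intercept=1.0):
    """Classify a 2D feature vector (x1, x2) against a straight-line boundary.

    The line x2 = slope * x1 + intercept is an assumed, illustrative boundary.
    Points strictly below the line are Class I, all others Class II.
    """
    boundary_height = slope * x1 + intercept
    return "Class I" if x2 < boundary_height else "Class II"


if __name__ == "__main__":
    print(classify(2.0, 1.0))  # below the line -> Class I
    print(classify(2.0, 3.0))  # above the line -> Class II
```

A more flexible boundary (a curve, or several pieces) would fit the training points more closely, which is exactly the generalization question raised on the next slide.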
8
Introduction
The problem is that a very complex recognizer is unlikely to generalize well, since it tends to be “tuned” to the particular training samples
The question is how to optimize this tradeoff: generalization versus a simple classifier?
9
Introduction
Usually there is an action taken based on the classification decision. Each action should be assigned a certain cost.
We design our decision boundary (classification rule) so that on the average, the Risk will be as small as possible.
The Risk (R) is the expected value of cost
Minimizing (R) leads to complex boundaries
The question is how to optimize this tradeoff: generalization versus minimum risk?
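As a small worked illustration of "Risk is the expected value of cost", the sketch below computes the risk of each action from an assumed cost table and assumed class probabilities, then picks the cheapest action. The action names, the cost values and the probabilities are all made up for the example.

```python
# Hypothetical cost table: COSTS[action][true_class]
COSTS = {
    "label_I": {"I": 0.0, "II": 10.0},   # calling a Class II sample "I" is expensive
    "label_II": {"I": 1.0, "II": 0.0},   # calling a Class I sample "II" is cheap
}


def expected_risk(costs, class_probs):
    """Return {action: expected cost} given P(class | observation)."""
    return {
        action: sum(cost[c] * class_probs[c] for c in class_probs)
        for action, cost in costs.items()
    }


def min_risk_action(costs, class_probs):
    """Pick the action with the smallest expected cost."""
    risks = expected_risk(costs, class_probs)
    return min(risks, key=risks.get)


if __name__ == "__main__":
    probs = {"I": 0.8, "II": 0.2}          # assumed posterior probabilities
    print(expected_risk(COSTS, probs))
    print(min_risk_action(COSTS, probs))   # label_II, although Class I is more probable
```

With unequal costs the minimum-risk decision can differ from the most probable class, which is why the boundary moves when costs are taken into account.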
10
Introduction
To achieve a general-purpose (unbiased) recognizer, we should have a sufficient number of training samples (N) for each class in the data set.
A theoretical estimate claims that
N ≅ 100 / P where P ≡ prob. of misclassification
I.e., for P ≈ 0.01, N ≈ 10,000 and for P ≈ 0.03, N ≈ 3,300
Such a large data set (if available) needs large storage and long processing time (time complexity)
The question is how to optimize this tradeoff: generalization versus complexity?
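The estimate N ≅ 100 / P is easy to tabulate. The helper below simply evaluates the rule of thumb quoted on this slide for a few target error rates; the function name samples_needed is hypothetical.

```python
def samples_needed(p_error):
    """Rule of thumb from the slide: N is about 100 / P training samples per class."""
    if not 0.0 < p_error <= 1.0:
        raise ValueError("p_error must be in (0, 1]")
    return round(100 / p_error)


if __name__ == "__main__":
    for p in (0.10, 0.03, 0.01):
        print(f"P = {p:.2f} -> N is about {samples_needed(p)} samples per class")
```

Each reduction of the target error rate by a factor of ten multiplies the required data per class by ten, which is where the storage and processing cost comes from.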
11
Paper Objective
Our concern in this paper is to:
1. provide a comprehensive review of recent off-line and on-line trends in Arabic cursive handwriting recognition (publications of the last 10 years)
2. clarify the challenges that stand in the way of a reliable, accurate, simple, general-purpose recognizer based on these trends.
12
Arabic Handwriting Recognition Problem
Arabic Script Recognition Systems are categorized as:
1. On-line or Off-line
2. Writer Dependent or Writer Independent
3. Open-vocabulary or closed-vocabulary
13
Arabic Handwriting Recognition Problem
Types of Recognition:
When the input device is a digitizer tablet that transmits the signal in real time or includes timing information together with pen position, this is mostly referred to as on-line or dynamic recognition
14
Arabic Handwriting Recognition Problem
Types of Recognition:
When the input device is a still camera or a scanner, which captures the position of digital ink on the page but not the order in which it was laid down, this is defined as off-line or image-based OCR
15
Arabic Handwriting Recognition Problem
Special Characteristics of Arabic Script:
Always written from right to left
An Arabic word consists of one or more portions; each has one or more characters
Many characters differ only in the position and number of the dots attached
16
Arabic Handwriting Recognition Problem
Special Characteristics of Arabic Script:
Every character has more than one shape, depending on its position
Characters overlap
17
Arabic Handwriting Recognition Problem
Special Characteristics of Arabic Script:
Existence of ligatures
Because of these special characteristics, Arabic handwriting recognition systems still need more research before they can be established commercially
18
Main Challenges
Feature Extraction
Noise
Model Selection and Complexity
Segmentation
Context
Evidence Pooling
Costs and Risks
Computational Complexity
Learning and Adaptation
22
Main Challenges
Feature Extraction:
A good feature set should help distinguish a class from other classes, be invariant to irrelevant differences, and contain no redundant information
… How do we know which features are most promising?
… Are there ways to automatically learn which features are best for a classifier?
It should be limited in number, for computational ease and to limit the amount of training data
… How many features should we use?
… How do we train or use a classifier when some features are missing?
24
Main Challenges
Noise:
Random error in a pixel value (deformation) due to signal-independent, signal-dependent and salt & pepper noise.
Noise cannot always be totally eliminated, but smoothing can reduce it
… Is the deformation in a signal noise, or natural variation in the true models?
… How can we use this information to improve our classifier?
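One common way to smooth salt & pepper noise is a 3x3 median filter. The sketch below is a plain-Python illustration on a tiny image stored as a list of rows; it is not taken from any of the surveyed systems, and the function name median_filter3 is hypothetical.

```python
from statistics import median_low


def median_filter3(image):
    """Replace each pixel by the median of its 3x3 neighbourhood.

    `image` is a list of equal-length rows of gray values. At the borders the
    neighbourhood is simply clipped to the image. median_low keeps the result
    inside the set of original pixel values.
    """
    height, width = len(image), len(image[0])
    smoothed = [row[:] for row in image]
    for r in range(height):
        for c in range(width):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - 1), min(height, r + 2))
                for cc in range(max(0, c - 1), min(width, c + 2))
            ]
            smoothed[r][c] = median_low(window)
    return smoothed


if __name__ == "__main__":
    noisy = [[0] * 5 for _ in range(5)]
    noisy[2][2] = 255                      # a single "salt" pixel
    print(median_filter3(noisy)[2][2])     # 0: the isolated spike is removed
```

The same filter would also erase an isolated dot that belongs to a character, which is the "noise or natural variation?" question in concrete form.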
26
Main Challenges
Model Selection and Complexity:
Determining the complexity of the model: not so simple that it cannot explain the differences between the categories, yet not so complex as to give poor classification on novel patterns.
… How do we know when to reject a class of models and try another one?
… Are there principled methods for finding the best complexity for a classifier?
… Or is it a matter of random trial & error, not even guided by expectations of performance?
28
Main Challenges
Segmentation:
Segmentation subdivides an image into its constituent regions or objects. Segmentation should stop when the objects of interest in an application have been isolated.
… How do we know where one character “ends” and the next one “begins”?
… Should we segment the images before they have been categorized, or categorize them before they have been segmented?
29
Main Challenges
Context:
The accuracy of automatic handwriting recognition systems based on purely visual information seems to have a ceiling
Incorporating semantic and syntactic knowledge sources into the automatic recognition of text can offer potential improvements in performance
… How, precisely, should we incorporate such information?
31
Main Challenges
Evidence Pooling:
For high classification performance or for increased class coverage, different classification tools are developed, either in parallel or sequentially
When there are several component classifiers and they agree on a particular pattern, there is no difficulty. But suppose they disagree!
… How should a “super” classifier pool the evidence from the component recognizers to achieve the best decision?
… How would the “super” classifier know when to base a decision on a minority opinion?
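Two simple pooling rules make the question concrete: a majority vote over the component decisions, and a sum rule that adds up the component confidences. The sketch below compares them on made-up confidences; the labels "beh" and "teh" and all numbers are illustrative assumptions, and neither rule is claimed to be the one used by any surveyed system.

```python
from collections import Counter


def majority_vote(labels):
    """Return the label chosen by most component classifiers."""
    return Counter(labels).most_common(1)[0][0]


def sum_rule(confidences):
    """Return the label with the largest summed confidence.

    `confidences` holds one {label: probability} dict per component classifier.
    """
    totals = Counter()
    for scores in confidences:
        totals.update(scores)   # adds the probabilities label by label
    return max(totals, key=totals.get)


if __name__ == "__main__":
    # Two classifiers weakly prefer "beh"; one is very sure of "teh".
    confidences = [
        {"beh": 0.55, "teh": 0.45},
        {"beh": 0.55, "teh": 0.45},
        {"beh": 0.05, "teh": 0.95},
    ]
    decisions = [max(scores, key=scores.get) for scores in confidences]
    print(majority_vote(decisions))  # beh
    print(sum_rule(confidences))     # teh: the confident minority wins
```

The two rules disagree here, so a "super" classifier that uses confidences can follow a minority opinion where a plain vote cannot.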
33
Main Challenges
Costs and Risks:
A classifier is generally used to recommend actions, each action having an associated cost or risk
We often design our classifier to recommend actions that minimize some total expected cost or risk
… How do we incorporate knowledge about such risks and how will they affect the classification decision?
… Is there a way to estimate the total risk and thus tell whether our classifier is acceptable even before we field it?
35
Main Challenges
Computational Complexity:
Although we might achieve error-free recognition, the time & storage requirements would be quite prohibitive
Some pattern recognition problems can be solved using algorithms that are highly impractical.
… What is the tradeoff between computational ease and performance?
… How can we optimize an excellent recognizer within the engineering constraints?
37
Main Challenges
Learning and Adaptation:
Any method that incorporates information from training samples in the design of a classifier employs learning
If the models are extremely complicated, the classifier will have complex decision boundaries
To overcome this, more training samples are needed to obtain a better estimate of the true underlying features
In the case of limited training samples, we should incorporate knowledge of the problem domain. The production representation is the “best” representation for classification.
… How many training samples are needed for good generalization?
… How can we ensure that the learning algorithm favors “simple” solutions rather than complicated ones?
38
Recent off-line Arabic handwriting recognition systems
Example: Pechwitz et al. [17]
Proposed a recognition system based on a semi-continuous 1-D HMM, using the IFN/ENIT database of handwritten Tunisian town/village names.
Preprocessing:
1. The image contour is extracted and noise-reduction filtering is performed.
2. Skeletonization and normalization are performed.
3. Baseline estimation and word length normalization are performed.
39
Recent off-line Arabic handwriting recognition systems
Example: Pechwitz et al. [17]
Feature Extraction:
1. A rectangular window is shifted from right to left across the normalized gray-level script image.
2. A Karhunen-Loève transform is performed on the gray values of each frame to reduce the number of features.
Modeling:
1. An HMM is generated for each character shape (all possible positions), giving up to 160 different HMMs.
2. Semi-continuous HMMs are used, with 7 states per character.
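The sliding-window step can be sketched as follows. This is only an illustration of shifting a window from right to left and collecting the gray values of each frame; the window width, the step and the function name frames_right_to_left are assumptions, and the Karhunen-Loève reduction that follows in the described system is not implemented here.

```python
def frames_right_to_left(image, width=3, step=1):
    """Cut a gray-level image into overlapping frames, starting at the right edge.

    `image` is a list of equal-length rows. Each frame is returned as a flat
    list of gray values, column by column, ready to be fed to a
    dimensionality-reduction step.
    """
    n_cols = len(image[0])
    frames = []
    start = n_cols - width
    while start >= 0:
        frame = [row[c] for c in range(start, start + width) for row in image]
        frames.append(frame)
        start -= step
    return frames


if __name__ == "__main__":
    tiny = [
        [1, 2, 3, 4],
        [5, 6, 7, 8],
    ]
    for frame in frames_right_to_left(tiny, width=2, step=1):
        print(frame)
```

Every frame has the same length (window width times image height), which is what lets a single transform and a single HMM observation model be used along the whole word.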
40
Recent off-line Arabic handwriting recognition systems
Example: Pechwitz et al. [17]
Database:
1. The database is split into four sets: A, B, C and D.
2. The four sets contain 26,459 images of segmented Tunisian town names (115,585 PAWs, i.e. parts of Arabic words) handwritten by 411 unique writers.
3. There are 946 unique word labels and 762 unique PAW labels.
4. Ground truth information is available for each image.
Lexicon:
The character-shape HMMs are combined into valid word models using a tree-structured lexicon containing all 946 Tunisian town/village names.
41
Recent off-line Arabic handwriting recognition systems
Example: Pechwitz et al. [17]
Recognition:
The standard Viterbi algorithm is used together with the lexicon.
The authors applied the recognition algorithm to the database twice: once using the baseline taken from the ground truth (GT) and once using the baseline they estimated.
Results:
Recognition rates of 82 to 89% are obtained using the estimated baseline
Recognition rates of 89 to 95% are obtained using the GT baseline
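For readers unfamiliar with the Viterbi algorithm mentioned under Recognition, here is a generic textbook sketch on a toy two-state left-to-right model. It does not include the lexicon or the semi-continuous observation model of the described system; the states, symbols and probabilities are invented for the example.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (most likely state path, its probability) for a discrete HMM."""
    # best[t][s] = (probability of the best path ending in s at time t, previous state)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p) for p in states
            )
        best.append(column)
    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path, best[-1][last][0]


if __name__ == "__main__":
    states = ["s1", "s2"]
    start_p = {"s1": 1.0, "s2": 0.0}
    trans_p = {"s1": {"s1": 0.5, "s2": 0.5}, "s2": {"s1": 0.0, "s2": 1.0}}
    emit_p = {"s1": {"a": 0.9, "b": 0.1}, "s2": {"a": 0.2, "b": 0.8}}
    path, prob = viterbi(["a", "a", "b", "b"], states, start_p, trans_p, emit_p)
    print(path)  # ['s1', 's1', 's2', 's2']
```

In a lexicon-driven recognizer the same search runs over the concatenated character models of each allowed word, so only state sequences that spell a lexicon entry are considered.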
45
Recent off-line Arabic handwriting recognition systems
Example: Pechwitz et al. [17]
Challenges:
1. Working on an available database sidesteps the limited-training-samples challenge.
2. It is not easy to generalize this classifier to open-vocabulary applications because it works on a limited lexicon of words (segmentation-free recognizer); otherwise, context will be a must.
3. Model selection & complexity: the same HMM structure is generated for all characters and ligatures. We think it would be much better to vary the model structure according to the requirements of each character (ض shouldn’t have the same model as ة, for example).
4. Feature Extraction: the idea of normalizing the word width in order to use a sliding-window feature extractor is good, except for its heavy dependency on baseline estimation, which is itself a major source of error.
46
Recent on-line Arabic handwriting recognition systems
Example: Biadsy et al. [24]
Preprocessing:
1. A geometrical processing phase minimizes handwriting variations.
2. A low-pass filter is used to reduce noise and remove imperfections caused by acquisition devices.
3. The writing speed is normalized by re-sampling the resulting point sequences.
Feature Extraction:
Mainly angles (with the x-axis) and loop presence
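The re-sampling and angle steps can be illustrated with a short sketch. It assumes that speed normalization means placing points at equal distances along the pen trajectory, and it computes only the angle of each segment with the x-axis; loop detection is not covered. The function names and the spacing value are hypothetical, not taken from the cited work.

```python
import math


def resample(points, spacing):
    """Return points placed every `spacing` units along the pen trajectory."""
    out = [points[0]]
    carried = 0.0  # distance travelled since the last emitted point
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        if seg == 0:
            continue  # repeated sample, pen did not move
        d = spacing - carried  # distance into this segment of the next new point
        while d <= seg:
            t = d / seg
            out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            d += spacing
        carried = seg - (d - spacing)
    return out


def direction_angles(points):
    """Angle in degrees between each consecutive segment and the x-axis."""
    return [
        math.degrees(math.atan2(y1 - y0, x1 - x0))
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    ]


if __name__ == "__main__":
    stroke = [(0, 0), (3, 0), (3, 4)]       # fast horizontal move, then vertical
    even = resample(stroke, spacing=1.0)
    print(len(even))                         # 8 points, one per unit of path length
    print([round(a) for a in direction_angles(even)])
```

After re-sampling, the number of points depends on the length of the stroke rather than on how fast it was written, so the angle sequence is comparable between writers.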
47
Recent on-line Arabic handwriting recognition systems
Example: Biadsy et al. [24]
Modeling:
1. The recognition framework uses discrete left-to-right HMMs to represent each Arabic letter shape (isolated, initial, medial, and final).
2. The number of states for each letter-shape model is based on the geometric complexity of the letter shape. It varies from 5 to 11 states.
For example: 11 states are assigned to isolated ش, and 5 states to isolated أ.
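A left-to-right HMM with a letter-specific number of states can be sketched by building its transition matrix. The sketch assumes the simplest topology, where each state may only repeat or move to the next state, and a uniform self-loop probability; whether the cited system allows skips or uses other initial values is not stated on the slide.

```python
def left_to_right_transitions(n_states, stay=0.5):
    """Build an n x n transition matrix for a strict left-to-right HMM.

    State i can stay where it is (probability `stay`) or advance to state
    i + 1. The last state only loops on itself. Both choices are assumptions
    made for illustration.
    """
    matrix = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states - 1):
        matrix[i][i] = stay
        matrix[i][i + 1] = 1.0 - stay
    matrix[-1][-1] = 1.0
    return matrix


if __name__ == "__main__":
    # State counts quoted on the slide: 11 for isolated ش, 5 for isolated أ.
    state_counts = {"isolated sheen": 11, "isolated alef with hamza": 5}
    for name, n in state_counts.items():
        a = left_to_right_transitions(n)
        print(name, len(a), "states, row sums:", {round(sum(row), 6) for row in a})
```

Giving a complex shape more states lets the model follow more stroke segments, while a simple shape stays compact and needs less training data.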
48
Recent on-line Arabic handwriting recognition systems
Example: Biadsy et al. [24]
Lexicon:
1. The Arabic dictionary D is subdivided into a set of sub-dictionaries {D1, D2, …, Dn} based on the number of word parts in each word.
2. Letter-shape models are embedded in a network that represents a word-part dictionary. The segmentation of word parts into letter-shapes and their recognition are performed simultaneously in an integrated process.
D = {وسام، هل، معلم، محمود، محمد، فادى، رواية، جامعة، ثقافة، التحدى، انسان}
Sub-dictionaries of D:
D1 = {هل، معلم، محمد}
D2 = {محمود، جامعة، ثقافة}
D3 = {وسام، فادى، التحدى، انسان}
D4 = {رواية}
Word-Part Dictionary for D3:
WPD3,1 = {و، فا، ا}
WPD3,2 = {سا، د، لتحد، نسا}
WPD3,3 = {م، ى، ن}
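The grouping in this example can be reproduced with a short sketch. It assumes that a word part ends after any letter that does not connect to the following letter (ا أ إ آ د ذ ر ز و ؤ); this rule and the function names are our illustration, not the authors' code.

```python
# Letters that never join to the letter that follows them (assumed rule).
NON_CONNECTING = set("اأإآدذرزوؤ")

D = ["وسام", "هل", "معلم", "محمود", "محمد", "فادى",
     "رواية", "جامعة", "ثقافة", "التحدى", "انسان"]


def word_parts(word):
    """Split a word into its connected parts, in writing order."""
    parts, current = [], ""
    for letter in word:
        current += letter
        if letter in NON_CONNECTING:
            parts.append(current)
            current = ""
    if current:
        parts.append(current)
    return parts


def sub_dictionaries(words):
    """Group words by their number of word parts: {n: [words with n parts]}."""
    groups = {}
    for word in words:
        groups.setdefault(len(word_parts(word)), []).append(word)
    return groups


if __name__ == "__main__":
    for n, words in sorted(sub_dictionaries(D).items()):
        print(f"D{n} =", words)
    print(word_parts("وسام"))
```

Collecting the i-th part of every word in D3 gives the word-part dictionaries WPD3,1 to WPD3,3 listed above.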
49
Recent on-line Arabic handwriting recognition systems
Example: Biadsy et al. [24]
Database:
1. 4 trainers are asked to write 800 selected words each.
2. For testing, 10 testers (the 4 trainers, in addition to 6 new volunteers) are asked to write 280 words not in the training data (2,358 words in total).
3. 5 different dictionary sizes (5K, 10K, 20K, 30K, and 40K words) selected from different Arabic websites are used. The 280 test words are present in all dictionary sizes.
Recognition:
Writer-dependent (WD) and writer-independent (WI) experiments are done, and average word recognition rates of 88 to 96% are obtained. The performance degrades as ambiguity (dictionary size) increases.
51
Recent on-line Arabic handwriting recognition systems
Example: Biadsy et al. [24]
Challenges:
1. Feature Extraction: The features they use are not sufficient for satisfactory classification of general unconstrained handwriting, so the system depends heavily on a limited vocabulary. The word parts must be present in the dictionary or they will not be recognized.
2. The database they use looks unnatural. Volunteers are asked to follow a restricted writing methodology, which affects their individual writing style. Besides, the system handles limited handwriting variety because of the small number of volunteers who wrote the database.
54
Summary and Conclusion
Recognizers for other languages have been on the market as commercial products for years, while Arabic recognizers still need more time.
In the case of Arabic handwritten words, many researchers use a specific, more or less small data set of their own, so it is impossible to compare different results, which would be important for improving existing methods
The complexity of the problem is greatly increased by noise and by the infinite variability of handwriting
57
Summary and Conclusion
Cursive script requires the segmentation of words into characters or parts of characters, i.e. graphemes, and then the detection of individual features.
Generally, the holistic approach can be used if the size of the vocabulary is small (such as the recognition of the legal amount in cheques)
The character-based approach is the preferred method for recognition applications that are unconstrained or involve large vocabularies, to ensure good generalization together with reasonable complexity
58
Thank You