Machine Printed Handwritten Text Discrimination
Using Radon Transform and SVM Classifier
ET-Tahir Zemouri¹ and Youcef Chibani²
Signal Processing Laboratory, Faculty of Electronic and Computer Sciences
University of Sciences and Technology Houari Boumediene
USTHB, EL-Alia, B.P. 32, 16111, Algiers, Algeria
¹ tzemouri@usthb.dz, ² [email protected]
Abstract—Discriminating machine printed from handwritten text is a major problem in the recognition of mixed texts. In this paper, we address the problem of identifying each type using the Radon transform and Support Vector Machines. The method proceeds in three steps: preprocessing, feature generation and classification. A new set of features is generated from each word using the Radon transform, and classification then distinguishes printed text from handwritten text. The proposed system is tested on the IAM database, where its recognition rate exceeds 98%.
Keywords-document analysis; machine printed and
handwritten text discrimination; Radon transform; Support
Vector Machines (SVM).
I. INTRODUCTION
Machine printed and handwritten text often appear together in application forms, question papers and mail, as well as in notes, corrections and instructions added to printed documents.
In all mentioned cases it is crucial to detect, distinguish and process differently the areas of handwritten and printed text (OCR for machine printed text and ICR for handwritten annotations) for obvious reasons such as: (a) retrieval of important information (identification of handwriting in application forms), (b) removal of unnecessary information (removal of handwritten notes from official documents), and (c) application of different recognition algorithms in each case.
The main difference between machine printed and handwritten text is their shape structure. Characters in machine printed text have a uniform shape, whereas handwritten characters follow arbitrary, curly allograph styles. This difference can be exploited to generate features by measuring how regular machine printed words are compared with handwritten words.
There exist a few papers on the discrimination of machine printed and handwritten text. Kuhnke et al. [1] proposed a neural network-based approach with straightness and symmetry as features. Pal and Chaudhuri [2] have used horizontal projection profiles for separating the printed and
handwritten lines in Bangla script. Guo and Ma [3] proposed an approach based on the vertical projection profile of the segmented words, which used a Hidden Markov Model (HMM) as the classifier. Zheng et al. [4] reported on printed and handwritten text segmentation using k-NN, Support Vector Machines (SVM) and Fisher classifier with features like pixel density, aspect ratio and Gabor features. Kandan et al. [5] used invariant moments, which are insensitive to translation, scale, mirroring and rotation as the feature for distinguishing the printed and handwritten elements and the SVM classifier.
We propose in this paper a new method for text discrimination by using the Radon transform and Support Vector Machines.
The Radon transform is well suited to detecting linear features. Hence, printed words generate more regular Radon coefficients than handwritten words, a property that can be used to distinguish between the two. The SVM, in turn, is well suited to the robust separation of two classes.
The paper is organized as follows. Section II describes the proposed system. Experiments and conclusions are presented in Sections III and IV, respectively.
II. THE PROPOSED SYSTEM
The system for discriminating between machine printed and handwritten text can be decomposed into three stages [1], as shown in Fig. 1. The first stage is preprocessing, in which the document is cleaned of noise components such as spurious dots and lines. In the second stage, features are generated using the Radon transform. In the third stage, the elements are classified as printed or handwritten using SVM classifiers.
A. Preprocessing stage
Because image data vary widely, preprocessing, which reduces these variations and produces a more consistent set of data, is essential for accurate character recognition. In our system, preprocessing includes filtering, binarization, skew angle correction, smoothing and word segmentation.
Figure 1. Block-diagram of the classification system.
1) Image filtering: An image acquired from a scanner generally contains noise, which can be reduced using a 3x3 Wiener filter [6].
2) Binarization: The text is separated from the background by automatic thresholding. The Wolf approach [7] is used to obtain the binary image.
3) Skew angle correction: Skew estimation and correction is an important step in any document analysis and recognition system. We use the projection profile to estimate the skew angle [8]: the profile is computed for different angles, and the angle giving the largest variations in magnitude is taken as the skew angle.
4) Smoothing: For binary document images, four filters [9] can be used to smooth the edges and remove small pieces of noise.
5) Segmentation: Segmentation aims to extract the words from the document. Segmentation is performed in two
consecutive steps: line segmentation and word segmentation.
Both steps make use of the projection profiles [10].
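The two segmentation steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in our experiments: the binary image is assumed to be a list of rows with 1 for ink and 0 for background, and the function names and the `word_gap` threshold are hypothetical.

```python
def projection_profile(image, axis):
    """Count ink pixels per row (axis=0) or per column (axis=1)."""
    if axis == 0:
        return [sum(row) for row in image]
    return [sum(col) for col in zip(*image)]


def runs_of_ink(profile, min_gap=1):
    """Return (start, end) pairs, end exclusive, of non-empty profile runs.

    Empty stretches shorter than min_gap are bridged, so that the small
    gaps between characters do not split a word.
    """
    runs = []
    start = None
    gap = 0
    for i, value in enumerate(profile):
        if value > 0:
            if start is None:
                start = i
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                runs.append((start, i - gap + 1))
                start = None
                gap = 0
    if start is not None:
        runs.append((start, len(profile) - gap))
    return runs


def segment_words(image, word_gap=2):
    """Split a page into lines (row profile), then lines into words
    (column profile). Returns boxes as (top, bottom, left, right)."""
    boxes = []
    for top, bottom in runs_of_ink(projection_profile(image, axis=0)):
        line = image[top:bottom]
        columns = projection_profile(line, axis=1)
        for left, right in runs_of_ink(columns, min_gap=word_gap):
            boxes.append((top, bottom, left, right))
    return boxes


if __name__ == "__main__":
    page = [
        [0, 0, 0, 0, 0, 0, 0, 0],
        [1, 1, 0, 0, 0, 1, 1, 0],
        [1, 0, 1, 0, 0, 0, 1, 0],
        [0, 0, 0, 0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0, 0, 0, 0],
    ]
    print(segment_words(page))
```

In practice the gap threshold would have to be chosen from the scan resolution and the typical inter-word spacing.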
B. Feature Generation
Many kinds of features can be generated to distinguish printed from handwritten text. Kuhnke et al. [1] proposed the straightness of vertically/horizontally oriented lines and symmetry relative to different points as features. Pal and Chaudhuri [2] used distinctive structural and statistical features. Guo and Ma [3] evaluated their scheme using the vertical projection profile. Zheng et al. [4] used features such as Gabor filter and run-length histogram features. Kandan et al. [5] used invariant moments, which are invariant under translation, scaling, rotation and reflection.
The main idea of our approach is to take advantage of the structural properties that help to discriminate printed from handwritten text. More precisely, the shape of the printed
characters is more or less stable within a text word. On the other hand, the distribution of the shape of handwritten characters is quite diverse.
The Radon transform has been used in many pattern recognition applications, such as shape recognition [11]. In our approach, the Radon transform is used as a tool for generating a feature vector. Hence, we briefly review its main properties.
1) Radon Transform
The Radon transform computes projections of an image along specified directions. A projection of a two-dimensional function $I(x, y)$ is a set of line integrals. The Radon transform computes the line integrals from multiple sources along parallel paths in a certain direction. To represent an image, the Radon transform takes multiple parallel projections of the image from different angles by rotating the source around the center of the image. Formally, the Radon transform of an image is defined as [12]:

$$T_R^I(\rho, \theta) = \int_x \int_y I(x, y)\, \delta(x\cos\theta + y\sin\theta - \rho)\, dx\, dy \tag{1}$$

where $\delta$ is the Dirac function, $\theta \in\ ]0, 180^\circ]$ and $\rho \in\ ]-\infty, +\infty[$. In other words, $T_R^I$ is the integral of $I(x, y)$ over the line defined by $\rho = x\cos\theta + y\sin\theta$.
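For a binary image, the integral in (1) reduces to counting the ink pixels that fall on each line. The following Python sketch illustrates this discrete form; it is an assumption-laden illustration rather than the implementation used in our experiments. The function name, the nearest-integer binning of $\rho$, the choice of the image center as origin, and the sampling of angles from 0 inclusive up to `max_angle` exclusive are all choices made for this example.

```python
import math


def radon_transform(image, n_theta, max_angle=180.0):
    """Discrete Radon transform of a binary image (list of rows, 1 = ink).

    Returns one projection per angle. Each projection is a list indexed by
    the integer rho bin, shifted so that rho = 0 sits at the middle index.
    """
    rows, cols = len(image), len(image[0])
    cy, cx = (rows - 1) / 2.0, (cols - 1) / 2.0
    # Largest possible |rho| for a pixel measured from the image center.
    r_max = int(math.ceil(math.hypot(cx, cy)))
    sinogram = []
    for k in range(n_theta):
        theta = math.radians(max_angle * k / n_theta)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        projection = [0.0] * (2 * r_max + 1)
        for y, row in enumerate(image):
            for x, value in enumerate(row):
                if value:
                    # rho = x cos(theta) + y sin(theta), relative to center.
                    rho = (x - cx) * cos_t + (y - cy) * sin_t
                    projection[int(round(rho)) + r_max] += value
        sinogram.append(projection)
    return sinogram


if __name__ == "__main__":
    # A horizontal stroke: spread out at 0 degrees, concentrated at 90.
    stroke = [[0] * 5, [0] * 5, [1] * 5, [0] * 5, [0] * 5]
    for projection in radon_transform(stroke, n_theta=4):
        print(projection)
```

Every projection sums to the number of ink pixels, which gives a quick sanity check on any implementation of this kind.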
The Radon transform has several useful properties, such as periodicity, symmetry, translation invariance, rotation invariance and scaling invariance.
In our approach, we are interested only in periodicity and symmetry. Fig. 2 shows an example of the Radon transform computed on printed and handwritten words.
Figure 2. A shape (a) and its Radon transform (b).
We can easily see that the Radon transform generates more coefficients for the handwritten word than for the printed word.
[Figure 1 block labels: Document image; Preprocessing (Filtering, Binarization, Skew correction, Smoothing, Segmentation); Feature generation; Classification (Handwritten, Machine printed)]
2) Feature vector generation
To generate features for printed and handwritten words, we fix the number of angular directions, denoted $N_\theta$ ($\theta \in\ ]0, 360^\circ]$). Since the Radon transform generates redundant coefficients (Fig. 2b), we keep only the positive radial projections and take all directions from 0 to 360°. For each of the $N_\theta$ directions, that is, for each column in the positive half of the Radon transform, the feature value is computed from the sum of the squared coefficients. The feature values $E_I(\theta)$ are defined as:

$$E_I(\theta) = \frac{1}{N_\rho} \sum_{\rho=1}^{N_\rho} \left[ T_R^I(\rho, \theta) \right]^2 \tag{2}$$

where $N_\rho$ is the number of positive radial positions.
Fig. 3 illustrates an example of the generated feature values, namely the Radon transform energy for each angle $\theta$.
Figure 3. Feature vector generation: (a) printed word and its Radon transform, (b) handwritten word and its Radon transform, (c) Radon energy versus angle.
We can see that the Radon-based energy is higher for the handwritten word than for the printed word.
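The energy feature of (2) can be sketched in Python as follows. This is an illustrative reading rather than the code used in our experiments: the input is assumed to be a list of projections, one per angle, each with $\rho = 0$ at its middle index, and "positive radial positions" is interpreted here as the bins with $\rho \geq 0$. The function name is hypothetical.

```python
def radon_energy(sinogram):
    """Return one energy value per angle from a list of projections.

    Each projection is a list of Radon coefficients whose middle index
    corresponds to rho = 0. Only bins with rho >= 0 are used (assumed
    reading of "positive radial projections"), and the energy is the
    mean of their squared values.
    """
    features = []
    for projection in sinogram:
        centre = len(projection) // 2
        positive = projection[centre:]
        features.append(sum(v * v for v in positive) / len(positive))
    return features


if __name__ == "__main__":
    # Two hand-made projections: one spread out, one concentrated.
    example = [
        [0, 1, 1, 1, 1, 1, 0],
        [0, 0, 0, 5, 0, 0, 0],
    ]
    print(radon_energy(example))  # [0.75, 6.25]
```

The resulting list has one entry per angular direction, so its length equals $N_\theta$.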
3) Feature vector normalization
In many practical situations, a designer is confronted with features whose values lie within different dynamic ranges. Thus, features with large values may have a larger influence on the cost function than features with small values, although this does not necessarily reflect their respective significance in the design of the classifier. The problem is overcome by normalizing the features so that their values lie within similar ranges. This is achieved by using a nonlinear transformation [13].
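The text above does not name the nonlinear transformation. One common choice described in pattern recognition textbooks is softmax scaling, which standardizes each feature and then squashes it into the interval (0, 1) with a logistic function. The Python sketch below shows that choice purely as an assumption; the function name and the parameter `r` are hypothetical.

```python
import math


def softmax_scale(samples, r=1.0):
    """Softmax scaling of feature vectors (assumed normalization).

    samples: list of feature vectors of equal length (at least two).
    r: user-chosen factor controlling the width of the linear region.
    Each feature is standardized with its sample mean and standard
    deviation, then mapped to (0, 1) by a logistic function.
    """
    n, d = len(samples), len(samples[0])
    means = [sum(s[j] for s in samples) / n for j in range(d)]
    stds = [
        math.sqrt(sum((s[j] - means[j]) ** 2 for s in samples) / (n - 1))
        for j in range(d)
    ]
    scaled = []
    for s in samples:
        row = []
        for j in range(d):
            if stds[j] == 0.0:
                row.append(0.5)  # constant feature: no information
            else:
                y = (s[j] - means[j]) / (r * stds[j])
                row.append(1.0 / (1.0 + math.exp(-y)))
        scaled.append(row)
    return scaled


if __name__ == "__main__":
    # Two features with very different dynamic ranges.
    data = [[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]]
    for row in softmax_scale(data):
        print(row)
```

In a train/validation/test setting, the means and standard deviations would normally be estimated on the training set only and reused for the other subsets.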
C. Classification
SVMs are supervised learning methods that have been widely and successfully used for pattern recognition in different applications, such as digit recognition [14]. The main concept of the SVM is to find a hyperplane that separates two classes while leaving the largest margin between the vectors of the two classes [14]. However, real-life problems may not be linearly separable. To deal with this, a nonlinear decision surface is obtained by lifting the feature space into a higher dimensional space. A linear separating hyperplane found in the higher dimensional space gives a nonlinear decision surface in the original feature space. The decision function of the SVM can be expressed as follows:

$$f(x) = \sum_i \alpha_i y_i K(x, x_i) + b \tag{3}$$

where $(x_i, y_i) \in \mathbb{R}^d \times \{\pm 1\}$ are the feature vectors and labels, respectively. In our case, the feature vectors $\{x_i\}$ correspond to the Radon energy, and the labels correspond to printed words $\{+1\}$ and handwritten words $\{-1\}$. The parameters $\alpha_i$ and $b$ are found by maximizing a quadratic function subject to some constraints [14]. $K(x, x_i)$ is the kernel function, which maps the feature vectors into a higher dimensional inner product space. In our case, we use the RBF (Radial Basis Function) kernel since it offers better discrimination than other kernels. The RBF kernel is defined as:

$$K(x, x_i) = \exp\left(-\frac{d(x, x_i)}{2\sigma^2}\right) \tag{4}$$

$$d(x, x_i) = \|x - x_i\|^2 \tag{5}$$

where $\sigma$ is user defined.
The optimization algorithm adopted for training the SVMs is Sequential Minimal Optimization (SMO), which provides practical advantages [15].
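To make (3) to (5) concrete, the following Python sketch evaluates the decision function for an already trained SVM. It does not perform training (no SMO), and the support vectors, multipliers and bias in the usage example are made-up values chosen only to show the mechanics. All function names are hypothetical.

```python
import math


def rbf_kernel(x, xi, sigma):
    """RBF kernel of (4), with d the squared Euclidean distance of (5)."""
    d = sum((a - b) ** 2 for a, b in zip(x, xi))
    return math.exp(-d / (2.0 * sigma ** 2))


def svm_decision(x, support_vectors, labels, alphas, b, sigma):
    """Decision function of (3): sum_i alpha_i * y_i * K(x, x_i) + b."""
    total = 0.0
    for xi, yi, alpha in zip(support_vectors, labels, alphas):
        total += alpha * yi * rbf_kernel(x, xi, sigma)
    return total + b


def classify(x, support_vectors, labels, alphas, b, sigma):
    """Return +1 (printed) or -1 (handwritten) from the sign of f(x)."""
    f = svm_decision(x, support_vectors, labels, alphas, b, sigma)
    return 1 if f >= 0 else -1


if __name__ == "__main__":
    # Made-up model with one support vector per class (illustration only).
    svs = [[0.0, 0.0], [4.0, 0.0]]
    ys = [+1, -1]
    alphas = [1.0, 1.0]
    print(classify([0.5, 0.0], svs, ys, alphas, b=0.0, sigma=1.0))
    print(classify([3.5, 0.0], svs, ys, alphas, b=0.0, sigma=1.0))
```

A small $\sigma$ makes each support vector influence only its immediate neighborhood, while a large $\sigma$ yields a smoother decision surface, which is why $\sigma$ is tuned on the validation set.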
III. EXPERIMENTAL RESULTS
A. Data set
To evaluate the performance of the proposed method, we use the IAM database (Institut für Informatik und angewandte Mathematik) [16]. The documents are scanned at a resolution of 300 dpi with 8 bits/pixel gray-scale and converted into binary images using the Wolf binarization method. The database consists of more than 1500 documents containing printed and handwritten text. An example document is shown in Fig. 4. Regions of printed and handwritten words are easily separable: the forms contain no auxiliary lines to be filled in with handwritten text. This characteristic facilitates the identification and classification of each type of word.
To test the performance of our system, 21 images are chosen and preprocessed. The set of words is divided into three subsets for training (1/3), validation (1/3) and testing (1/3), respectively. Table I summarizes the data set.
For each word, a vector of Radon-based energies is calculated. We use the recognition rate (RR) as the metric to evaluate the performance of our system, defined as:

$$RR = \frac{\text{\# of correctly classified words}}{\text{total \# of words}} \times 100\ (\%) \tag{7}$$
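The recognition rate is straightforward to compute from predicted and true labels. A minimal Python sketch follows, with a hypothetical function name and made-up labels in the usage example.

```python
def recognition_rate(predicted, actual):
    """Percentage of words whose predicted label equals the true label."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return 100.0 * correct / len(actual)


if __name__ == "__main__":
    # +1 = machine printed, -1 = handwritten (made-up labels).
    print(recognition_rate([1, -1, 1, 1], [1, -1, -1, 1]))  # 75.0
```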
Figure 4. IAM Database form.
TABLE I. DATA SET
Data set          Training   Validation   Testing
Machine printed   447        447          438
Handwritten       525        525          484
Total             972        972          922
B. System validation
To validate our system, various experiments are conducted to find the SVM regularization parameter (fixed at 10), the kernel parameter ($\sigma$) and the best number of angular directions ($N_\theta$). Fig. 5 shows the recognition rate obtained on the validation set for each number of angular directions. The RR is not very sensitive to the number of angular directions. However, the best performance (RR = 77.06%) is obtained for $N_\theta = 20$ and $\sigma = 2.1$.
Figure 5. Recognition rate using Radon transform
for the system validation.
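The parameter search described above amounts to an exhaustive search over candidate values of $N_\theta$ and $\sigma$, keeping the pair with the highest validation RR. The Python sketch below shows the search loop only. The `evaluate` callback is hypothetical: in a real experiment it would train the SVM on the training set and return the RR on the validation set, whereas the usage example substitutes a toy scoring function that has no relation to our results.

```python
def grid_search(evaluate, n_theta_values, sigma_values):
    """Return (best_rr, best_n_theta, best_sigma) over the full grid.

    evaluate(n_theta, sigma) must return the validation recognition
    rate for that parameter pair. Ties keep the first pair found.
    """
    best = None
    for n_theta in n_theta_values:
        for sigma in sigma_values:
            rr = evaluate(n_theta, sigma)
            if best is None or rr > best[0]:
                best = (rr, n_theta, sigma)
    return best


if __name__ == "__main__":
    # Toy score peaking at n_theta = 12, sigma = 1.5 (illustration only).
    def toy_score(n_theta, sigma):
        return -((n_theta - 12) ** 2) - (sigma - 1.5) ** 2

    print(grid_search(toy_score, [4, 8, 12, 16], [0.5, 1.0, 1.5, 2.0]))
```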
In order to improve the recognition rate, we concatenate statistical features to the Radon-based energy: the mean, the variance, the variance of the projection profile (vertical and horizontal) and the entropy. Fig. 6 shows the recognition rate versus the number of angular directions.
Figure 6. Recognition rate using Radon transform and statistical features.
The statistical features carry information well suited to discriminating machine printed from handwritten text: on the validation set, the RR improves to 92.8% for $N_\theta = 10$ and $\sigma = 2$.
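The statistical features listed above can be computed from a binary word image as in the Python sketch below. The exact definitions are not given in the text, so this sketch makes assumptions that should be read as one possible interpretation: the mean and variance are taken over pixel values, the projection profiles are row and column ink counts, and the entropy is the Shannon entropy of the ink/background pixel distribution. Function names are hypothetical.

```python
import math


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def statistical_features(word):
    """Return [mean, variance, var(horizontal profile),
    var(vertical profile), entropy] for a binary word image
    given as a list of rows with 1 = ink and 0 = background."""
    pixels = [p for row in word for p in row]
    mean = sum(pixels) / len(pixels)
    h_profile = [sum(row) for row in word]        # ink count per row
    v_profile = [sum(col) for col in zip(*word)]  # ink count per column
    entropy = 0.0
    for prob in (mean, 1.0 - mean):               # ink and background
        if prob > 0.0:
            entropy -= prob * math.log2(prob)
    return [mean, variance(pixels), variance(h_profile),
            variance(v_profile), entropy]


if __name__ == "__main__":
    word = [[1, 0], [1, 0]]
    print(statistical_features(word))  # [0.5, 0.25, 0.0, 1.0, 1.0]
```

Concatenating this list to the $N_\theta$ energy values gives the combined feature vector, which should be normalized as a whole since the two groups have different dynamic ranges.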
C. System testing
After the validation of the system, the testing set is used to evaluate its performance, using the optimal parameter values found during validation. The resulting recognition rate is 98.32%, which is encouraging compared with other works [1-5].
D. Comparison with other similar works
We compare our results with other published research works in terms of RR. Kuhnke et al. [1] proposed a neural network-based approach with the straightness of vertically/horizontally oriented lines and symmetry relative to different points as features; the system reached an RR of 78.5%. Pal and Chaudhuri [2] proposed an approach based on the distinctive structural and statistical features of machine printed and handwritten text lines in Bangla script; their classification scheme has an RR of 98.3%. Guo and Ma [3] evaluated their scheme using the vertical projection profile of the segmented word and obtained an RR of 92.86% using an HMM. Zheng et al. [4] obtained an RR of 96% using an SVM classifier and features such as Gabor filter and run-length histogram features. Kandan et al. [5] obtained an RR of 93.22% using an SVM classifier with invariant moments, which are invariant under translation, scaling, rotation and reflection, as features.
Our proposed method obtains an RR of 98.32% using the Radon transform, statistical features and an SVM classifier, which is encouraging compared with other works.
IV. CONCLUSION
In this paper, we proposed a new method for discriminating printed and handwritten text in document images using the Radon transform and SVM classifiers. The system was implemented and tested on the IAM database.
Our approach gives encouraging results by combining Radon energy and statistical features using SVM classifiers with the RBF kernel.
In the future, we plan to apply our methodology to distinguishing machine printed from handwritten text in Arabic and Latin scripts.
REFERENCES
[1] K. Kuhnke, L. Simoncini, and Z.M. Kovacs-V, “A System for Machine-Written and Hand-Written Character Distinction,” Proc. 3rd International Conference on Document Analysis and Recognition, vol. 2, pp. 811-814, 1995.
[2] U. Pal, and B. B. Chaudhuri, “Machine-printed and Hand-written
Text Line Identification,” Pattern Recognition Letters, vol. 22, n. 3-4, pp. 431-441, 2001.
[3] J. K. Guo, and M. Y. Ma, “Separating Handwritten Material from
Machine Printed Text Using Hidden Markov Models,” Proc. 6th International Conference on Document Analysis and Recognition, pp.
439-443, 2001.
[4] Y. Zheng, H. Li, and D. Doermann, “Machine Printed Text and Handwriting Identification in Noisy Document Images,” IEEE Trans
on Pattern Analysis and Machine Intelligence, vol. 26, n. 3, pp. 337-353, 2004.
[5] R. Kandan, N. K. Reddy, K. R. Arvind, and A. G. Ramakrishnan, “A
Robust Two Level Classification Algorithm for Text Localization in Documents,” Advances in Visual Computing, 3rd Int Symp, (ISVC
07), Part II, LNCS 4842, pp. 96–105, 2007.
[6] B. Gatos, I. Pratikakis and S. J. Perantonis, “Adaptive degraded
document image binarization,” Pattern Recognition, vol. 39, pp. 317-327, 2006.
[7] C. Wolf, and J.M. Jolion, “Extraction and recognition of artificial text in multimedia documents,” Pattern Analysis and Applications, vol. 6,
n. 4, pp. 309-326, 2003.
[8] T. Akiyama, and N. Hagita, “Automatic entry system for printed documents,” Pattern Recognition, vol. 23, n. 11, pp. 1141-1154, 1990.
[9] M. Cheriet, N. Kharma, C. L. Liu, and C. Suen, “Character Recognition Systems: A Guide for Students and Practitioners,” Wiley-Interscience, p. 321, 2007.
[10] E. Ataer, and P. Duygulu, “Retrieval of Ottoman Documents,” Proc
8th ACM international workshop on Multimedia information retrieval, pp. 155-162, 2006.
[11] S. Tabbone, L. Wendling, and J. P. Salmon, “A new shape descriptor defined on the Radon transform,” Computer Vision and Image Understanding, vol. 102, n. 1, pp. 42-51, 2006.
[12] S. R. Deans, “The Radon Transform and Some of Its Applications,” New York: Wiley, 1983.
[13] S. Theodoridis, and K. Koutroumbas, “Pattern Recognition,” 4th Ed,
Elsevier Inc, 2009.
[14] H. Nemmour, and Y. Chibani, “Handwritten digit recognition based on a neural-SVM combination,” Int Journal of Computers and Applications (Acta Press Editor), vol. 32, n. 1, pp. 104-109, 2010.
[15] H. Nemmour, Y. Chibani, “Integrating class-dependant tangent vectors into SVMs for handwritten digit recognition,” Int Conf on
Signals, Circuits and Systems (ICSCS), pp. 1-4, 2009.
[16] U.V. Marti, and H. Bunke, “The IAM-Database: an English sentence database for offline handwriting recognition,” International Journal on Document Analysis and Recognition, vol. 5, n. 1, pp. 39-46, 2002.