Bayesian Frameworks for Deformable Pattern Classification and Retrieval by Kwok-Wai Cheung January...
-
date post
21-Dec-2015 -
Category
Documents
-
view
215 -
download
0
Transcript of Bayesian Frameworks for Deformable Pattern Classification and Retrieval by Kwok-Wai Cheung January...
Bayesian Frameworks for Deformable Pattern Classificatio
n and Retrieval
by Kwok-Wai Cheung
January 1999
Model-Based Scene AnalysisKnowledge Input Output
An “H” model
Integrated Segmentation and Recognition
Template Matching: Limitation
ReferenceModels
MatchingScore of “H”
10/12
MatchingScore of “A”
10/12
Knowledge
Input
Output
Deformable Models
• A deformable model is mostly referred to as an object shape abstraction and possesses shape varying capability for modeling non-rigid objects.
A Deformable “6” Model
Modeling• Model representation Hj • Model shape parameter vector w
parameter space w1
w2 w3
Hj(w3)Hj(w2)
Hj(w1)
A Common Formulation
A Common FormulationMatching
•A search process (multi-criterion optimization)
parameter space w0
w1 wf
Model Deformation Criterion
Data Mismatch Criterion
Combined Criterion and Regularization
),;( wHwE jdef
),;( DHwE jmis
),;(),;( DHwEwHwE jmisjdef
Hj(wf)
Thesis OverviewReasoning:Bayesian Framework
Approach:Deformable Models
Problem: Deformable Pattern Classification
Problem: Deformable Pattern Retrieval
Application: Handwritten Digit Recognition
Application: Handwritten Word Retrieval
Presentation Outline
• A Bayesian framework for deformable pattern classification (applied to handwritten character recognition)
• Extensions of the framework
– A competitive mixture of deformable models
– Robust deformable matching
• A Bayesian framework for deformable pattern detection (applied to handwritten word retrieval)
• Conclusions and future works
A Bayesian Framework for Deformable Pattern Classification
with Application to Isolated Handwritten Character Recognition
Bayesian BackgroundPrior Distribution
w w
Posterior Distribution
Likelihood Function
w
Data Distribution
D
Bayesian FormulationShape Parameter Distribution
• Prior distribution (without data)
• Likelihood function
• Posterior distribution (with data)
))(exp()(
1)( wE
ZHw|p def
wi
)),;(exp()(
1)( DwE
ZHD|wp mis
Di
)),;()(exp(),(
1),,,( DwEwE
ZHw|Dp misdef
Mi
Bayesian Inference: Matching
• Matching by maximum a posteriori (MAP) estimation.
),|(max iw
def Hwpw ),,|(max iw
mis HwDpw
parameter space
),,,|(max ***i
wHDwpw
MAP estimate
),|,(max},{,
**iHDp
Bayesian Inference: Classification
• Classification by computing the model evidence (Laplacian approximation).
loglog
)()(
),()|(
**
**
Dw
Mi ZZ
ZHDp
Model Representation• Cubic B-splines for modeling handwritten cha
racter shape.• Shape parameter vector { w, A, T }
– w = spline control points (local deformation)– {A,T} = affine transform parameter (global
deformation)• Mixture of Gaussians for modeling black pixel
s.
Model Representation
6
1
2
3
8
4
5
7
Stroke width
Control points with sequence number
Spline curve
/1
Gaussian distributionsmodeling black pixels
Criterion Function Formulation
• Model Deformation Criterion
• Data Mismatch Criterion
)()(2
1)( 1 hwhwwE t
def
gN
j
lj
g
N
lmis
yTAwm
NDTAwE
1
2
1 2
,,exp
1log),;,,(
Mahalanobis distance
Negative log of product of a mixture of Gaussians
Matching
• MAP estimation for {w, A, T, , } using the expectation-maximization (EM) algorithm [Dempster et al. 1977].
• No closed form solutions and iterations between the estimation of {w, A, T} (linear) and that of {, are required.
Matching Results
= 3.54 ~ 0.9
deformed less
= 0.89 ~ 0.9
deformed more
~ 3.0 = 0.9
thinner stroke
~ 3.0 = 0.52
thicker stroke
Critical Factors for Higher Accuracy
• Size of the Model Set– how many models for each class?
• Model Flexibility Constraints
• Likelihood Inaccuracy– use prior only for the best few candidates.
Unconstrained Constrained
Critical Factors for Higher Accuracy
• Filtering Normalized “1”
• Sub-part Detection
These are the unmatched portions for matching model
“2” to data “0”.
For the NIST dataset we used, all the characters are normalized to 20x32. Some abnormal “1”s are observed.
Experiment
• Training Set (NIST SD-1)– 11,660 digits (32x32 by 100 writers)
• Test Set (NIST SD-1)– 11,791 digits (32x32 by 100 writers)
• Size of Model Set = 23 (manually created)
Experimental Results
Methods 1,000 digits 11,791 digits
B 78% -B+R 92.1% -
B+R+O 94.3% 93.8%B+R+O+P 95.1% 94.4%
B+R+O+P+S - 94.7%B+R+O+P+S+Rj-4.9 - 95.9%
Previous Works
# of Models Test Set Accuracy Reject[Wakahara1994]
10 Private (2,400) 96.8% 3%
10 CEDAR (2,711) 98.5% 0%[Revow et al.1996] ANN used 10 CEDAR (2,213) 96.8% 0%
[Jain et al.1997]
2000 NIST (2,000) 99.3% 0%
Our system 23 NIST (11,791) 94.7% 0%
Accuracy and Size of Model Set
No. of models
Accuracy
23 2000
94.7%
99.25% [Jain et al.1997]
[Our system]
Optimal accuracy curve
Nearest NeighborManual
Summary
• A unified framework based on Bayesian inference is proposed for modeling, matching and classifying non-rigid patterns with promising results for handwritten character recognition.
• Several critical factors related with the recognition accuracy are carefully studied.
Major Limitations of the Framework
• The Scale-up Problem– The classification time increases linearly with
the size of the model set.
• The Outlier Problem– The framework is very sensitive to the prese
nce of outlier data (e.g., strokes due to the adjacent characters)
The Scale-up Problem Solns.
• Hardware solution– Independent Matching Process -> Highly
Parallel Computing Architecture
• Software solution– Cutting down the unnecessary computation
by carefully designing the data structure and the implementation of the algorithm.
A Competitive Mixture of Deformable Models
• Let H = {H1, H2, … , HM, 1, 2, … , M} denote a mixture of M models.
Input data D
H1 H2 HM
1 2 M
M
iii HDpDp
1
)|()|( H
ii ,0
i
i 1
A Competitive Mixture of Deformable Models
• The Bayesian framework is extended and {i} can then be estimated using the EM algorithm.
• By maximizing p(D|H) and assuming the data D comes from Hi , the ideal outcome of {i} = [0 0 .. 0 1 0 .. 0]
i
Speed up: Elimination ProcessInput data D
H1 H2 HM
1 2 M
Iteration 1
2 ...
M
1 0.1 0.02 0.085
2 0.12 X 0.056
3 0.21 X 0.012
4 0.38 X X
Experiment
• Training Set (NIST SD-1)– 2,044 digits (32x32 by 30 writers)
• Test Set (NIST SD-1)– 1,427 digits (32x32 by 19 writers)
• Size of Model Set = 10 (manually created)
• Elimination Rule– After the first iteration, only best R models are
retained.
The Outlier Problem
• The mixture of Gaussians noise model fails when some gross errors (outliers) are present.
Badly Segmented InputWell Segmented Input
The Outlier Problem
• There is a necessity to distinguish between the true data and the outliers.
• Utilize true data and suppress outliers.
True data Outliers
Use of Robust Statistics
• Robust statistics takes into account the outliers by either:1) Modeling them explicitly using probability di
stributions, e.g. uniform distribution
2) Discounting their effect (M-estimation),
e.g. defining the data mismatch measure
(which is normally quadratic) such that(x)
o.w.x
kxc(x)
2x(x)
Robust Deformable Matching
x)(w,A,T;yE)l.,x,()l.,x,(x) lmiswwxwwx 350350(
• An M-estimator is proposed such that
Original Data Mismatch Criterion
Data Mismatch
Criterion with Robust
Statistics
Experiment
• Goal: To extract the leftmost characters from handwritten words.
• Test Set - CEDAR database
• Model Set - manually created
• Model Initialization– Chamfer matching based on a distance
transform.
Experimental Results
Initialization
Fixed WindowWidth 1
Fixed Window Width 2
Fixed Window Width 3
Robust Window
Summary
• The basic framework can be extended to a competitive mixture of deformable models where significant speedup can be achieved.
• The robust statistical approach is found to be an effective solution for robust deformable matching in the presence of outliers.
A Bayesian Framework for Deformable Pattern Detection with Application to Handwritten Wor
d Retrieval
The Bayesian Framework RevisitModel, Hi
(Uniform prior)
Shape parameter, w(Prior distribution of w)
Regularizationparameter,
(Uniform prior)
Data, D(Likelihood function of w)
Stroke widthparameter,
(Uniform prior)
MultivariateGaussian
Mixture ofGaussians
Direction of GenerationFrom Model to Data
Forward Framework
Model, Hi
Shape parameter, w
Regularizationparameter,
(Uniform prior)
Data, D(Uniform prior)
Model localizationparameter,
(Uniform prior)
MultivariateGaussian
Mixture of Gaussians(each data point is
a Gaussian center)
Direction of GenerationFrom Data to Model
New Criterion Function
• Sub-data Mismatch Criterion
N
l
ljN
jmissub
yTAwm
NDTAwE
g
1
2
1 2
,,exp
1log),;,,(
Negative log of product of a mixture of Gaussians
gN
j
lj
g
N
lmis
yTAwm
NDTAwE
1
2
1 2
,,exp
1log),;,,(
Old Data Mismatch Criterion
Forward Matching
• Matching– Optimal estimates {w*, A*, T*, *, *} are
obtained by maximizing
– The EM algorithm is used.
)|,(),|,,(),|,,( DpDTAwpHTAwp i
Pattern Detection
• Detection – by computing the forward evidence (Laplacia
n approximation)
loglog
)()(
),()|(
**
**
Dw
Mi ZZ
ZHDp
Formula for these three parts are different when
compared with the reverse evidence computation.
Comparison between Two Frameworks
• Shape Discriminating Properties
– The reverse evidence does not penalize models resting on the white space. [Proof: see Proposition 1]
– The forward evidence does penalize white space. [Proof: see Proposition 2] (The sub-part problem is solved implicitly.)
Comparison between Two Frameworks
• Shape Matching Properties
– Reverse matching is sensitive to outliers but possesses good data exploration capability. [Proof: see Proposition 3]
– Forward matching is insensitive to outliers but with weak data exploration capability. Thus, its effectiveness relies on some good initialization. [Proof: see Proposition 4] (The outlier problem is solved implicitly.)
Bidirectional Matching Algorithm
• A matching algorithm is proposed which possesses the advantages of the two frameworks.
• The underlying idea is to try to obtain a correspondence between the model and data such that the model looks like the data AND vice versa (i.e., the data mismatch measures for the two frameworks should both be small.).
Bidirectional Matching AlgorithmInitialization by Chamfer matching
Forward Matching
Compute the data mismatch measures
for the two frameworks, Emis and Esub-mis
ReverseMatching
Emis>Esub-mis
?
:=(1+)if:=4
Converge ?
yesno
no
Convergence Property
• The local convergence property of the bidirectional matching algorithm has been proved. [see Theorem 1]
Experiment (I)
• Goal: To extract the leftmost characters from handwritten words.
• Test Set - CEDAR database– 300 handwritten city name images
• Model Set - manually created• Model Initialization
– Chamfer matching based on a distance transform
Experimental Results
Matching Results 300 Images
Correct Matching 85.5%
Bad Initialization 9.7%
Bad Matching 3.4%
Others 1.4%
* Results are obtained by visual checking.
Experiment (II)
• Goal: To retrieve handwritten words with its leftmost character similar to an input shape query.
• Test Set - CEDAR database– 100 handwritten city name images
• Query Set
Performance Evaluation
candidatescorrect of no. Total
retrievedcorrectly of No.Recall
candidates retrieved of no. Total
retrievedcorrectly of No.Precision
More false positive cases, the precision rate decreases.
More true negative cases, the recall rate decreases.
Experimental Results
Evidence Thresholding
Recall = 65%
Precision = 45%
Averaged # of candidates = 12.7
Related Works
[Huttenlocher et al. 1993] Hausdorff matching for image comparison.
[Burl et al. 1998] Keyword spotting in on-line handwriting data.
[Jain et al. 1998] Shape-based retrieval of trademark images.
Summary
• A novel Bayesian framework is proposed for deformable pattern detection.
• By combining the two proposed frameworks, the bidirectional matching algorithm is proposed and applied to handwritten word retrieval. Both theoretical and experimental results show that the algorithm is robust against outliers and possesses good data exploration capability.
Summary of Contributions
• A comprehensive study on deformable pattern classification based on a unified framework.
• A competitive mixture of deformable models for alleviating the scale-up problem.
• A study on using non-linear robust estimation for alleviating the outlier problem.
Summary of Contributions
• A novel Bayesian framework for deformable pattern detection, the theoretical comparison between the frameworks and the newly proposed bidirectional matching algorithm.
• Portability to other shape recognition problems.