Bayesian Frameworks for Deformable Pattern Classification and Retrieval by Kwok-Wai Cheung, January 1999


Bayesian Frameworks for Deformable Pattern Classification and Retrieval

by Kwok-Wai Cheung

January 1999

Model-Based Scene Analysis

[Diagram: Knowledge + Input → Output]

An “H” model

Integrated Segmentation and Recognition

Template Matching: Limitation

Reference Models

Matching score of "H": 10/12

Matching score of "A": 10/12

[Diagram: Knowledge + Input → Output]

Deformable Models

• A deformable model is an abstraction of an object's shape that can vary its own shape, making it suitable for modeling non-rigid objects.

A Deformable “6” Model

A Common Formulation

Modeling

Classification    Retrieval

Matching

Modeling

• Model representation H_j
• Model shape parameter vector w

[Figure: points w1, w2, w3 in the parameter space generate the shapes H_j(w1), H_j(w2), H_j(w3)]

A Common Formulation

A Common Formulation: Matching

• A search process (multi-criterion optimization) through the parameter space, from an initial w0 through intermediate points w1, ... to a final wf, giving the matched shape H_j(wf).

• Model Deformation Criterion: E_def(w; H_j)

• Data Mismatch Criterion: E_mis(w; H_j, D)

• Combined Criterion and Regularization:

  E(w) = α E_def(w; H_j) + E_mis(w; H_j, D),  α = regularization parameter

A Common Formulation: Classification

A Common Formulation: Retrieval

Thesis Overview

Reasoning: Bayesian Framework

Approach:Deformable Models

Problem: Deformable Pattern Classification

Problem: Deformable Pattern Retrieval

Application: Handwritten Digit Recognition

Application: Handwritten Word Retrieval

Presentation Outline

• A Bayesian framework for deformable pattern classification (applied to handwritten character recognition)

• Extensions of the framework

– A competitive mixture of deformable models

– Robust deformable matching

• A Bayesian framework for deformable pattern detection (applied to handwritten word retrieval)

• Conclusions and future work

A Bayesian Framework for Deformable Pattern Classification

with Application to Isolated Handwritten Character Recognition

Bayesian Background

[Diagram: a prior distribution over w is combined with the likelihood function of w (from the data distribution over D) to give the posterior distribution over w]

Bayesian Formulation: Shape Parameter Distribution

• Prior distribution (without data):

  p(w | H, α) = (1/Z_w(α)) exp(−α E_def(w))

• Likelihood function:

  p(D | w, H, β) = (1/Z_D(β)) exp(−E_mis(w; D, β))

• Posterior distribution (with data):

  p(w | D, H, α, β) = (1/Z_M(α, β)) exp(−(α E_def(w) + E_mis(w; D, β)))
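As a toy one-dimensional illustration (not from the thesis), take quadratic energies E_def(w) = (w − w0)²/2 pulling w toward a model shape w0 and E_mis(w; D) = (w − d)²/2 pulling w toward a single datum d. The mode of the posterior above is then the precision-weighted mean (α·w0 + d)/(α + 1), which a simple grid search over the parameter space recovers:

```python
import numpy as np

# Toy 1-D illustration (assumed quadratic energies, not the thesis' criteria):
#   E_def(w) = (w - w0)^2 / 2   (prior pulls w toward the model shape w0)
#   E_mis(w) = (w - d)^2 / 2    (likelihood pulls w toward the datum d)
def map_estimate(w0, d, alpha):
    """Mode of exp(-(alpha*E_def + E_mis)) found by grid search.
    Closed form: the precision-weighted mean (alpha*w0 + d) / (alpha + 1)."""
    ws = np.linspace(-5.0, 5.0, 100001)
    energy = alpha * 0.5 * (ws - w0) ** 2 + 0.5 * (ws - d) ** 2
    return float(ws[np.argmin(energy)])
```

A large α (strong regularization) keeps the estimate near the model shape; a small α lets the data dominate.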

Bayesian Inference: Matching

• Matching by maximum a posteriori (MAP) estimation in the parameter space:

  w* = argmax_w p(w | D, H_i, α*, β*)   (MAP estimate)

  The prior alone would give argmax_w p(w | H_i, α); the likelihood alone would give argmax_w p(D | w, H_i, β). The hyperparameters are set by

  {α*, β*} = argmax_{α,β} p(D | α, β, H_i)

Bayesian Inference: Classification

• Classification by computing the model evidence (Laplace approximation):

  log p(D | H_i) ≈ log Z_M(α*, β*) − log Z_w(α*) − log Z_D(β*)
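Each of the partition functions above is an integral of exp(−E) over the parameters, which the Laplace approximation handles by a second-order expansion of the energy around its minimum. A minimal 1-D sketch (a generic helper, not the thesis code): Z = ∫ exp(−E(w)) dw ≈ exp(−E(w*)) √(2π / E''(w*)) at the minimizer w*.

```python
import math

def laplace_log_z(energy, w_star, h=1e-4):
    """Laplace approximation to log Z, where Z = integral of exp(-E(w)) dw.
    w_star must be the minimizer of the energy E; 1-D sketch only."""
    # numerical second derivative (curvature) of E at its minimum
    curv = (energy(w_star + h) - 2.0 * energy(w_star) + energy(w_star - h)) / h**2
    return -energy(w_star) + 0.5 * math.log(2.0 * math.pi / curv)
```

For the unit Gaussian energy E(w) = w²/2 the approximation is exact: log Z = log √(2π).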

Model Representation

• Cubic B-splines for modeling handwritten character shape.
• Shape parameter vector {w, A, T}
  – w = spline control points (local deformation)
  – {A, T} = affine transform parameters (global deformation)
• Mixture of Gaussians for modeling black pixels.
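The spline part of the representation can be sketched with a generic uniform cubic B-spline evaluator (an assumption: the thesis' exact knot setup and endpoint handling may differ):

```python
import numpy as np

def cubic_bspline_point(ctrl, i, t):
    """Evaluate one segment of a uniform cubic B-spline curve.

    ctrl : (n, 2) array of control points (the shape parameter w)
    i    : index of the segment's first control point (requires i + 3 < n)
    t    : local parameter in [0, 1]
    """
    # standard uniform cubic B-spline basis functions
    b0 = (1 - t) ** 3 / 6.0
    b1 = (3 * t**3 - 6 * t**2 + 4) / 6.0
    b2 = (-3 * t**3 + 3 * t**2 + 3 * t + 1) / 6.0
    b3 = t**3 / 6.0
    return b0 * ctrl[i] + b1 * ctrl[i + 1] + b2 * ctrl[i + 2] + b3 * ctrl[i + 3]
```

Because the basis functions sum to one, moving a control point deforms the curve only locally, which is what makes w a natural local-deformation parameter.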

Model Representation

[Figure: a deformable "6" model — a spline curve defined by eight control points with sequence numbers; Gaussian distributions placed along the curve model the black pixels, and their spread sets the stroke width]

Criterion Function Formulation

• Model Deformation Criterion (a Mahalanobis distance from the mean shape h):

  E_def(w) = ½ (w − h)ᵗ Σ⁻¹ (w − h)

• Data Mismatch Criterion (negative log of a product of mixtures of Gaussians, with the model points m_j as Gaussian centers):

  E_mis(w, A, T; D) = −(1/N) Σ_{l=1..N} log [ (1/N_g) Σ_{j=1..N_g} exp( −(β/2) ‖y_l − m_j(w, A, T)‖² ) ]
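A direct transcription of the two criteria (a sketch using the reconstructed symbols h, Σ⁻¹, β above, not the thesis implementation; constant terms are dropped):

```python
import numpy as np

def e_def(w, h, sigma_inv):
    """Model deformation criterion: squared Mahalanobis distance of the
    shape parameter w from the mean shape h (inverse covariance sigma_inv)."""
    d = w - h
    return 0.5 * float(d @ sigma_inv @ d)

def e_mis(model_pts, data_pts, beta):
    """Data mismatch criterion: negative mean log-likelihood of the data
    points y_l under a mixture of Gaussians centered at the model points m_j.
    beta acts as the inverse variance (related to stroke width)."""
    total = 0.0
    for y in data_pts:
        sq = np.sum((model_pts - y) ** 2, axis=1)   # squared distances to all m_j
        total += np.log(np.mean(np.exp(-0.5 * beta * sq)))
    return -total / len(data_pts)
```

Both criteria are zero in the ideal case: an undeformed shape (w = h) and data points sitting exactly on single model points.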

Matching

• MAP estimation for {w, A, T, α, β} using the expectation-maximization (EM) algorithm [Dempster et al. 1977].

• There are no closed-form solutions; the algorithm iterates between the estimation of {w, A, T} (linear) and that of {α, β}.

Matching Results

Simple Initialization → Affine Transform Initialization → Final Match

Matching Results

α = 3.54, β ≈ 0.9 → deformed less
α = 0.89, β ≈ 0.9 → deformed more
α ≈ 3.0, β = 0.9 → thinner stroke
α ≈ 3.0, β = 0.52 → thicker stroke

Classification

Best match: the one with the highest P(D | H_6). The output class is "Six".

Critical Factors for Higher Accuracy

• Size of the Model Set– how many models for each class?

• Model Flexibility Constraints

• Likelihood Inaccuracy– use prior only for the best few candidates.

Unconstrained Constrained

Critical Factors for Higher Accuracy

• Filtering Normalized "1"s
  – For the NIST dataset we used, all the characters are normalized to 20x32; some abnormal "1"s are observed.

• Sub-part Detection
  – [Figure: the unmatched portions when matching model "2" to data "0"]

Experiment

• Training Set (NIST SD-1)– 11,660 digits (32x32 by 100 writers)

• Test Set (NIST SD-1)– 11,791 digits (32x32 by 100 writers)

• Size of Model Set = 23 (manually created)

Experimental Results

Methods             1,000 digits   11,791 digits
B                   78%            -
B+R                 92.1%          -
B+R+O               94.3%          93.8%
B+R+O+P             95.1%          94.4%
B+R+O+P+S           -              94.7%
B+R+O+P+S+Rj-4.9    -              95.9%

Previous Works

                     # of Models     Test Set           Accuracy   Reject
[Wakahara 1994]      10              Private (2,400)    96.8%      3%
                     10              CEDAR (2,711)      98.5%      0%
[Revow et al. 1996]  10 (ANN used)   CEDAR (2,213)      96.8%      0%
[Jain et al. 1997]   2000            NIST (2,000)       99.3%      0%
Our system           23              NIST (11,791)      94.7%      0%

Accuracy and Size of Model Set

[Figure: accuracy vs. number of models, with an optimal accuracy curve — our system: 23 manually created models, 94.7%; Jain et al. 1997: 2000 nearest-neighbor models, 99.25%]

Summary

• A unified framework based on Bayesian inference is proposed for modeling, matching and classifying non-rigid patterns with promising results for handwritten character recognition.

• Several critical factors related to recognition accuracy are carefully studied.

Extensions of the Bayesian Framework

Major Limitations of the Framework

• The Scale-up Problem
  – The classification time increases linearly with the size of the model set.

• The Outlier Problem
  – The framework is very sensitive to the presence of outlier data (e.g., strokes from adjacent characters).

The Scale-up Problem: Solutions

• Hardware solution
  – The matching processes are independent → a highly parallel computing architecture.

• Software solution
  – Cut down unnecessary computation by carefully designing the data structure and the implementation of the algorithm.

A Competitive Mixture of Deformable Models

• Let H = {H₁, H₂, …, H_M, π₁, π₂, …, π_M} denote a mixture of M models with mixing weights π_i.

[Diagram: the input data D feeds all models H₁, H₂, …, H_M, weighted by π₁, π₂, …, π_M]

  p(D | H) = Σ_{i=1..M} π_i p(D | H_i),   π_i ≥ 0,   Σ_{i=1..M} π_i = 1

A Competitive Mixture of Deformable Models

• The Bayesian framework is extended and {π_i} can then be estimated using the EM algorithm.

• By maximizing p(D | H) and assuming the data D comes from H_i, the ideal outcome is {π_i} = [0 0 … 0 1 0 … 0], with the 1 at position i.
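One way to see the claimed one-hot outcome: with a single input D, repeatedly updating each π_i by its responsibility concentrates all the weight on the best-fitting model. A sketch with made-up likelihood values p(D | H_i):

```python
import numpy as np

def em_mixture_weights(likelihoods, n_iter=50):
    """Iteratively re-estimate the mixing weights {pi_i} for one input D.

    likelihoods : array of p(D | H_i) for each of the M models.
    With a single datum, each update multiplies pi_i by its likelihood and
    renormalizes, so the fixed point puts all weight on the model with the
    highest likelihood -- the 'ideal outcome' [0 .. 0 1 0 .. 0].
    """
    pis = np.full(len(likelihoods), 1.0 / len(likelihoods))
    for _ in range(n_iter):
        r = pis * likelihoods       # unnormalized responsibilities
        pis = r / r.sum()           # E-step + M-step for a single datum
    return pis
```

This is also why the elimination process works: models whose π_i collapses toward zero after the first iterations can be dropped early.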

Speedup: Elimination Process

[Diagram: the input data D feeds models H₁, H₂, …, H_M, weighted by π₁, π₂, …, π_M]

Iteration   π₁     π₂     …    π_M
1           0.1    0.02        0.085
2           0.12   X           0.056
3           0.21   X           0.012
4           0.38   X           X

(X = model eliminated)

Experiment

• Training Set (NIST SD-1)– 2,044 digits (32x32 by 30 writers)

• Test Set (NIST SD-1)– 1,427 digits (32x32 by 19 writers)

• Size of Model Set = 10 (manually created)

• Elimination Rule
  – After the first iteration, only the best R models are retained.

Experimental Results: Accuracy

[Figure: accuracy 92.7%, 94.2%, 95.1% for different values of R]

Experimental Results: Speedup

[Figure: speedup factors 2.1, 1.9, 1.4 for different values of R]

The Outlier Problem

• The mixture of Gaussians noise model fails when some gross errors (outliers) are present.

Well Segmented Input    Badly Segmented Input

The Outlier Problem

• There is a necessity to distinguish between the true data and the outliers.

• Utilize true data and suppress outliers.

True data Outliers

Use of Robust Statistics

• Robust statistics takes the outliers into account by either:

  1) Modeling them explicitly using probability distributions, e.g. a uniform distribution; or

  2) Discounting their effect (M-estimation), e.g. redefining the data mismatch measure ρ(x) (which is normally quadratic, ρ(x) = x²) such that

     ρ(x) = x²   if |x| ≤ k
     ρ(x) = c    otherwise

Use of Robust Statistics

• Suppressing the outliers' contribution:

  ρ(x) = x²   if |x| ≤ k
  ρ(x) = c    otherwise
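The truncated quadratic above can be written directly. Here k and c are the threshold and the constant outlier cost; choosing c = k² (an assumption, the thesis' constants may differ) keeps ρ continuous:

```python
def rho(x, k=1.0, c=1.0):
    """Truncated quadratic M-estimator: quadratic near zero, constant
    (and hence outlier-insensitive) beyond the threshold k."""
    return x * x if abs(x) <= k else c

# A single gross error no longer dominates the total cost:
residuals = [0.1, -0.2, 0.15, 50.0]           # the last one is an outlier
robust_cost = sum(rho(r) for r in residuals)   # outlier contributes only c
squared_cost = sum(r * r for r in residuals)   # dominated by 50.0**2 = 2500
```

This bounded contribution is exactly the property illustrated by the robust linear regression example on the next slide.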

Robust Linear Regression

Without RobustStatistics

With RobustStatistics

Robust Deformable Matching

• An M-estimator is proposed: each per-point term of the original data mismatch criterion E_mis(w, A, T; D) is passed through ρ(·), giving a data mismatch criterion with robust statistics that suppresses the outliers' contribution.

Experiment

• Goal: To extract the leftmost characters from handwritten words.

• Test Set - CEDAR database

• Model Set - manually created

• Model Initialization
  – Chamfer matching based on a distance transform.

Experimental Results

Initialization

Fixed Window Width 1

Fixed Window Width 2

Fixed Window Width 3

Robust Window

More Experimental Results

Summary

• The basic framework can be extended to a competitive mixture of deformable models where significant speedup can be achieved.

• The robust statistical approach is found to be an effective solution for robust deformable matching in the presence of outliers.

Deformable Pattern Detection

A Bayesian Framework for Deformable Pattern Detection with Application to Handwritten Word Retrieval

The Bayesian Framework Revisited

• Model, H_i (uniform prior)
• Shape parameter, w (prior distribution of w: multivariate Gaussian)
• Regularization parameter, α (uniform prior)
• Stroke width parameter, β (uniform prior)
• Data, D (likelihood function of w: mixture of Gaussians)

Direction of generation: from model to data.

A Dual View of Generativity

The Sub-part Problem

The Outlier Problem

Forward and Reverse Frameworks

[Diagram: both frameworks relate the model H_i, the shape parameter w, and the data D — the reverse framework generates from model to data, the forward framework from data to model]

Forward Framework

• Model, H_i
• Shape parameter, w (multivariate Gaussian)
• Regularization parameter, α (uniform prior)
• Model localization parameter (uniform prior)
• Data, D (uniform prior), modeled by a mixture of Gaussians in which each data point is a Gaussian center

Direction of generation: from data to model.

New Criterion Function

• Sub-data Mismatch Criterion (negative log of a product of mixtures of Gaussians, with each data point y_l as a Gaussian center, evaluated at the model points m_j):

  E_sub-mis(w, A, T; D) = −(1/N_g) Σ_{j=1..N_g} log [ (1/N) Σ_{l=1..N} exp( −(β/2) ‖y_l − m_j(w, A, T)‖² ) ]

• Old Data Mismatch Criterion:

  E_mis(w, A, T; D) = −(1/N) Σ_{l=1..N} log [ (1/N_g) Σ_{j=1..N_g} exp( −(β/2) ‖y_l − m_j(w, A, T)‖² ) ]
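The role swap between the two criteria can be made concrete. In this sketch the forward (sub-data) criterion simply reuses the reverse one with model and data points exchanged, and a gross outlier added to the data inflates E_mis far more than E_sub-mis — the forward framework's outlier insensitivity:

```python
import numpy as np

def e_mis(model_pts, data_pts, beta=1.0):
    """Reverse criterion: mixture of Gaussians centered at the model
    points m_j, evaluated at the data points y_l."""
    out = 0.0
    for y in data_pts:
        sq = np.sum((model_pts - y) ** 2, axis=1)
        out += np.log(np.mean(np.exp(-0.5 * beta * sq)))
    return -out / len(data_pts)

def e_sub_mis(model_pts, data_pts, beta=1.0):
    """Forward criterion: each data point is a Gaussian center -- the
    same computation with the roles of model and data exchanged."""
    return e_mis(data_pts, model_pts, beta)

model = np.array([[0.0, 0.0], [1.0, 0.0]])
clean = np.array([[0.0, 0.0], [1.0, 0.0]])
dirty = np.vstack([clean, [[5.0, 0.0]]])   # one outlier stroke point
```

An outlier data point must be explained by some model Gaussian (large E_mis penalty), but a model point need never sit near the outlier center (small E_sub-mis penalty).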

Forward Matching

• Matching
  – Optimal estimates {w*, A*, T*, α*, β*} are obtained by maximizing the forward posterior p(w, A, T, α, β | H_i, D).
  – The EM algorithm is used.

Pattern Detection

• Detection by computing the forward evidence (Laplace approximation):

  log p(D | H_i) ≈ log Z_M(α*, β*) − log Z_w(α*) − log Z_D(β*)

The formulas for these three partition functions differ from those in the reverse evidence computation.

Comparison between Two Frameworks

• Shape Discriminating Properties

– The reverse evidence does not penalize models resting on the white space. [Proof: see Proposition 1]

– The forward evidence does penalize white space. [Proof: see Proposition 2] (The sub-part problem is solved implicitly.)

Comparison between Two Frameworks

• Shape Matching Properties

– Reverse matching is sensitive to outliers but possesses good data exploration capability. [Proof: see Proposition 3]

– Forward matching is insensitive to outliers but with weak data exploration capability. Thus, its effectiveness relies on some good initialization. [Proof: see Proposition 4] (The outlier problem is solved implicitly.)

Bidirectional Matching Algorithm

• A matching algorithm is proposed which possesses the advantages of the two frameworks.

• The underlying idea is to obtain a correspondence between the model and the data such that the model looks like the data AND vice versa (i.e., the data mismatch measures for the two frameworks should both be small).

Bidirectional Matching Algorithm

1. Initialization by Chamfer matching.
2. Forward matching.
3. Compute the data mismatch measures for the two frameworks, E_mis and E_sub-mis.
4. If E_mis > E_sub-mis, perform reverse matching with an adjusted weighting parameter and go back to step 2; otherwise, stop once converged.
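The control flow of the steps above can be sketched as follows, with the matching updates and the parameter adjustment abstracted behind hypothetical `forward_step`/`reverse_step` callables (the exact update rule is a detail of the thesis):

```python
def bidirectional_match(state, forward_step, reverse_step, e_mis, e_sub_mis,
                        max_iter=50, tol=1e-6):
    """Sketch of the bidirectional matching loop. forward_step/reverse_step
    are hypothetical matching updates; e_mis and e_sub_mis evaluate the two
    frameworks' data mismatch measures for a given match state."""
    for _ in range(max_iter):
        new = forward_step(state)                  # outlier-insensitive step
        if e_mis(new) > e_sub_mis(new):            # data not yet explained
            new = reverse_step(new)                # exploratory reverse step
        if abs(e_mis(new) - e_mis(state)) < tol:   # mismatch has settled
            return new
        state = new
    return state
```

On a toy problem where both steps shrink the residual, the loop converges as Theorem 1 would require of the real algorithm:

```python
result = bidirectional_match(1.0, lambda x: 0.5 * x, lambda x: 0.5 * x,
                             lambda x: x * x, lambda x: 0.5 * x * x)
```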

Convergence Property

• The local convergence property of the bidirectional matching algorithm has been proved. [see Theorem 1]

Experiment (I)

• Goal: To extract the leftmost characters from handwritten words.

• Test Set - CEDAR database– 300 handwritten city name images

• Model Set - manually created
• Model Initialization

– Chamfer matching based on a distance transform

Experimental Results

Forward Matching

Reverse Matching

Bidirectional Matching

Experimental Results

Matching Results 300 Images

Correct Matching 85.5%

Bad Initialization 9.7%

Bad Matching 3.4%

Others 1.4%

* Results are obtained by visual checking.

Experiment (II)

• Goal: To retrieve handwritten words whose leftmost character is similar to an input shape query.

• Test Set - CEDAR database– 100 handwritten city name images

• Query Set

Performance Evaluation

  Recall = (No. of correctly retrieved) / (Total no. of correct candidates)

  Precision = (No. of correctly retrieved) / (Total no. of retrieved candidates)

The more false positives, the lower the precision; the more false negatives, the lower the recall.
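The two rates above translate directly into code (a generic implementation of the definitions, not tied to the thesis' evaluation scripts):

```python
def precision_recall(retrieved, relevant):
    """Precision and recall for one retrieval result.

    retrieved : candidates returned by the system
    relevant  : the ground-truth correct candidates
    """
    retrieved, relevant = set(retrieved), set(relevant)
    correct = len(retrieved & relevant)      # correctly retrieved
    precision = correct / len(retrieved) if retrieved else 0.0
    recall = correct / len(relevant) if relevant else 0.0
    return precision, recall
```

For example, retrieving candidates {1, 2, 3, 4} when the correct set is {1, 2, 5} gives precision 2/4 and recall 2/3.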

Experimental Results

Best N Approach

Recall = 59%

Precision = 43%

# of candidates = 10

Experimental Results

Evidence Thresholding

Recall = 65%

Precision = 45%

Averaged # of candidates = 12.7

Related Works

[Huttenlocher et al. 1993] Hausdorff matching for image comparison.

[Burl et al. 1998] Keyword spotting in on-line handwriting data.

[Jain et al. 1998] Shape-based retrieval of trademark images.

Summary

• A novel Bayesian framework is proposed for deformable pattern detection.

• By combining the two proposed frameworks, the bidirectional matching algorithm is proposed and applied to handwritten word retrieval. Both theoretical and experimental results show that the algorithm is robust against outliers and possesses good data exploration capability.

Conclusions & Future Work

Summary of Contributions

• A comprehensive study on deformable pattern classification based on a unified framework.

• A competitive mixture of deformable models for alleviating the scale-up problem.

• A study on using non-linear robust estimation for alleviating the outlier problem.

Summary of Contributions

• A novel Bayesian framework for deformable pattern detection, the theoretical comparison between the frameworks and the newly proposed bidirectional matching algorithm.

• Portability to other shape recognition problems.

Future Work

• Modeling a Dataset of Non-Rigid Shapes
  – On Model Representation Construction
  – On Model Set Construction

• Shape Discrimination and Model Initialization

• Fast Implementation
  – More on Model Competition
  – On Search Space Pruning and Deformable Shape Parameterization