Hubert CARDOT, J.-Y. RAMEL, Rashid-Jalal QURESHI

Université François Rabelais de Tours, Laboratoire d'Informatique

64, Avenue Jean Portalis, 37200 TOURS – France

Pascal workshop (June 14, 2007)

Graph Signature: A Simple Approach for Clustering Similar Graphs

Applied to Graphic Symbols Recognition

Plan

Introduction

Graph Based Symbol’s Representation

Graphics Primitives Extraction

Attributed Graph Generation

Proposed Graph Matching Methods

Graph Matching using G-Signature

Graph Mining for Feature Vector Extraction

Results & Conclusion

Perspectives (Future Works)

Introduction

Document Image Analysis: Text Part and Graphics Part

Recognition tasks: character recognition, lines recognition, symbols recognition, logos recognition

Professional software already exists

Graphic Symbols → Attributed Graph → Graph Matching Using G-Signature for Recognition

Introduction

Symbols can be simple 2D binary shapes composed of lines, arcs and filled areas that represent something in a specific application domain.

Examples shown: Electrical Symbols, Architectural Symbols

Graph Based Symbol’s Representation 1/6

Vectorization and Quadrilaterals

Symbol → Vectorization of contours → Quadrilaterals

For contour vectorization, we used the method suggested by K. Wall [13].

Quadrilaterals are built by matching corresponding vectors in terms of slope, distance and area criteria; i.e., vectors which are close to each other and have opposite directions are fused together to form a quadrilateral.

[13] K. Wall, P. Danielsson, “A fast sequential method for polygonal approximation of digitized curves”, Computer Vision, Graphics and Image Processing, vol. 28, 1984, pp. 220–221.
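The pairing test for two contour vectors can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and the thresholds `max_angle_dev` and `max_dist` are hypothetical, and only the direction and distance criteria are checked (the area criterion mentioned on the slide is left out).

```python
import math

def direction_deg(vec):
    """Direction of a vector ((x1, y1), (x2, y2)) in degrees, in [0, 360)."""
    (x1, y1), (x2, y2) = vec
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 360.0

def midpoint(vec):
    """Midpoint of a vector given by its two end points."""
    (x1, y1), (x2, y2) = vec
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

def can_fuse(v1, v2, max_angle_dev=10.0, max_dist=5.0):
    """True if v1 and v2 run in roughly opposite directions and lie close together.

    max_angle_dev (degrees) and max_dist (pixels) are assumed thresholds.
    """
    diff = abs(direction_deg(v1) - direction_deg(v2)) % 360.0
    diff = min(diff, 360.0 - diff)  # smallest angle between the directions, 0..180
    opposite = abs(diff - 180.0) <= max_angle_dev
    (mx1, my1), (mx2, my2) = midpoint(v1), midpoint(v2)
    close = math.hypot(mx1 - mx2, my1 - my2) <= max_dist
    return opposite and close

# Two sides of a thin horizontal stroke, traced in opposite directions.
top = ((0, 0), (10, 0))
bottom = ((10, 2), (0, 2))
fusable = can_fuse(top, bottom)
```

Comparing midpoints is a simplification chosen for brevity; a fuller version would measure the distance between the two segments themselves.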

Graph Based Symbol’s Representation 2/6

Linear graphics symbols and their representation by quadrilaterals

Graph Based Symbol’s Representation 3/6

Zone of Influence of a Quadrilateral

Each quadrilateral has attributes such as the length ($\ell$) of the median axis, the angles of the two vectors, the width on each side ($w_1$, $w_2$) and a zone of influence.

Zone of influence of a quadrilateral: $U_x = \ell / 4$, $U_y = \mathrm{avg}(w_1, w_2)$

Graph Based Symbol’s Representation 4/6

Zone of influence of quadrilaterals and their corresponding sub-graphs

Fusing the sub-graphs together gives a complete neighbourhood graph.

Graph Based Symbol’s Representation 5/6

Quadrilaterals → Nodes

Connection types: Intersection, Parallel, Junction, Successive

Nodes Attribute (Relative Length $\ell_i / \ell_{\max}$, the length of quadrilateral $i$ divided by the largest length in the symbol)

Edges Attributes (Connection Type, Relative Angle between quadrilaterals $i$ and $j$)
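A minimal sketch of such an attributed graph, under stated assumptions: the class and function names are hypothetical, the edge labels L, P, T, X, S are taken from the symbolic features listed later in the talk, and the relative angle is assumed here to be the unsigned angle between the two orientations folded into [0, 180], which the slides do not spell out.

```python
from dataclasses import dataclass, field

@dataclass
class Quadrilateral:
    length: float     # length of the median axis
    angle_deg: float  # orientation of the median axis, in degrees

@dataclass
class AttributedGraph:
    rel_length: dict = field(default_factory=dict)  # node id -> relative length
    edges: dict = field(default_factory=dict)       # (i, j) -> (label, relative angle)

def relative_angle(a_deg, b_deg):
    """Assumed definition: unsigned angle between two orientations, in [0, 180]."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)

def build_graph(quads, connections):
    """quads: {id: Quadrilateral}; connections: [(i, j, label)], label in L, P, T, X, S."""
    longest = max(q.length for q in quads.values())
    graph = AttributedGraph()
    for qid, q in quads.items():
        # Dividing by the longest quadrilateral makes the attribute scale invariant.
        graph.rel_length[qid] = q.length / longest
    for i, j, label in connections:
        # Using an angle difference makes the attribute rotation invariant.
        angle = relative_angle(quads[i].angle_deg, quads[j].angle_deg)
        graph.edges[(i, j)] = (label, angle)
    return graph

quads = {0: Quadrilateral(40.0, 0.0), 1: Quadrilateral(20.0, 90.0)}
graph = build_graph(quads, [(0, 1, "L")])
```

Because both attributes are relative, scaling every length by a constant or adding a constant to every orientation leaves the graph unchanged.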

Graph Based Symbol’s Representation 6/6

Attributed graph of quadrilaterals with symbolic and numeric attributes

Graph Matching

Motivation Behind Graph Signature

Error-tolerant methods (graph edit distance): + robust to vectorial distortion; - NP-complete in the worst case

Similarity-measure-based methods: + robust to noise/distortion; - sub-optimal solution

Graph isomorphism, subgraph isomorphism, maximum common subgraph: + optimal solution; - NP-complete; - no robustness to noise and distortion

Greedy Algorithm, Score of mappings

[Equation (1): the score of a mapping, $Sc_{Mp}$, built from vertex-to-vertex and edge-to-edge similarity terms]

Vertex-to-vertex similarity: computed by a function $f$ over the attributes of the two mapped vertices

Edge-to-edge similarity: computed by a function $g$ over the attributes of the two mapped edges, with angle values divided by 180

Splits as penalties

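Greedy Algorithm, SimGraph

Each candidate mapping is scored with $Sc_{Mp}$.

Example: one graph has vertices 1 and 2 with values 1.0 and 0.8 and an edge with angle 85; the other has vertices A, B and C with values 1.0, 0.9 and 0.5 and edges with angles 90 and 45.

Successive mappings and their values:

A-1: 2, 0, 0, 0, 2

A-1, B-2: 2 + 1.8 = 3.8, 0.98, 0, 0, 4.6

A-1, B-2, C-2: 3.8 + 1.4 = 5.2, 0.98, 2, 0, 4.18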
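The arithmetic of the last mapping in the example can be reproduced with a short sketch. It rests on two assumptions that are inferred from the numbers rather than stated on the slide: a vertex pair scores `2 * (1 - |difference|)`, which matches the values 2, 1.8 and 1.4, and the mapping score is the vertex total plus the edge total minus the split penalty, which matches 5.2 + 0.98 - 2 = 4.18. The function names are hypothetical, and the edge similarity 0.98 and the penalty 2 are copied from the slide rather than computed.

```python
def vertex_similarity(value_a, value_b):
    """Assumed form, chosen because it reproduces the slide's vertex scores."""
    return 2.0 * (1.0 - abs(value_a - value_b))

def mapping_score(vertex_sims, edge_sims, split_penalty):
    """Assumed combination: similarities add up, splits are subtracted."""
    return sum(vertex_sims) + sum(edge_sims) - split_penalty

graph_abc = {"A": 1.0, "B": 0.9, "C": 0.5}
graph_12 = {"1": 1.0, "2": 0.8}

# Final mapping of the example: B and C are both mapped onto vertex 2 (a split).
mapping = [("A", "1"), ("B", "2"), ("C", "2")]
vertex_sims = [vertex_similarity(graph_abc[a], graph_12[b]) for a, b in mapping]

score = mapping_score(vertex_sims, edge_sims=[0.98], split_penalty=2)
print(round(score, 2))  # 4.18
```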

SimGraph (continued)

Using 50 different symbols from the GREC2003 database, we generated a set of 1100 examples with different levels of distortion, geometric transformations and common noise.

Test            Model symbols   Query symbols   Detected correctly   Missed recognition   Rate
Rotation        50              150             150                  0                    100%
Scaling         50              100             100                  0                    100%
Noise Level-1   50              250             242                  8                    96.8%
Noise Level-2   50              250             238                  12                   95.2%
Noise Level-3   50              250             230                  20                   92.0%
Distortion      15              100             94                   6                    94.0%

The proposed novel similarity measure and the SimGraph algorithm are devised to perform inexact matching of attributed graphs in polynomial time, $O((|V_1| \cdot |V_2|)^2)$.

Graph Signature (G-Signature)

Graph Signature, or G-Signature, is the transformation of the graph representation of a graphic symbol into a one-dimensional feature vector, which is easy to store and manipulate.

Three types of discriminating features were extracted:

A. Quantitative Features

These consist of the number of vertices in the graph, the number of edges in the graph, and the number of vertices connected to 1, 2, 3, 4 or more than 4 vertices (i.e., the degree of the vertices).

B. Symbolic Features

These come from the symbolic attributes associated with edges: the number of edges having L, P, T, X or S as edge label.

C. Range Features

These features are based on the frequency of relative lengths (nodes) and relative angles (edges) falling within given intervals.

Graph Signature (G-Signature)

A. Quantitative Features
$f_1$: # of vertices in a graph
$f_2$: # of edges in a graph
$f_3$: # of vertices with degree 1
$f_4$: # of vertices with degree 2
$f_5$: # of vertices with degree 3
$f_6$: # of vertices with degree 4
$f_7$: # of vertices with degree > 4

B. Symbolic Features
$f_8$: # of edges having label “L”
$f_9$: # of edges having label “P”
$f_{10}$: # of edges having label “T”
$f_{11}$: # of edges having label “X”
$f_{12}$: # of edges having label “S”

C. Range Features
$f_{13}$: # of vertices with RL (0.0 - 0.2)
$f_{14}$: # of vertices with RL (0.2 - 0.4)
$f_{15}$: # of vertices with RL (0.4 - 0.6)
$f_{16}$: # of vertices with RL (0.6 - 0.8)
$f_{17}$: # of vertices with RL (0.8 - 1.0)
$f_{18}$: # of edges with RA (0° - 30°)
$f_{19}$: # of edges with RA (30° - 60°)
$f_{20}$: # of edges with RA (60° - 90°)
$f_{21}$: # of edges with RA (90° - 120°)
$f_{22}$: # of edges with RA (120° - 150°)
$f_{23}$: # of edges with RA (150° - 180°)
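The 23 features can be computed in a single pass over the graph, as in the sketch below. The input layout and the names `g_signature` and `bin_index` are assumptions for illustration. How a value lying exactly on an interval boundary is binned is not specified on the slide; here the binning uses floor division, so boundary values are subject to floating-point rounding, and the top of the range is clamped into the last interval.

```python
from collections import Counter

EDGE_LABELS = ["L", "P", "T", "X", "S"]

def bin_index(value, width, n_bins):
    """Index of the interval containing value; values at the top go to the last bin."""
    return min(int(value // width), n_bins - 1)

def g_signature(rel_lengths, edges):
    """rel_lengths: {vertex: RL in [0, 1]}; edges: [(u, v, label, RA in degrees)]."""
    degree = Counter()
    for u, v, _, _ in edges:
        degree[u] += 1
        degree[v] += 1

    # f3..f7: vertices of degree 1, 2, 3, 4 and more than 4.
    degree_counts = [0] * 5
    for vertex in rel_lengths:
        d = degree[vertex]
        if d >= 1:
            degree_counts[min(d, 5) - 1] += 1

    # f8..f12: edges per symbolic label.
    label_counts = [sum(1 for e in edges if e[2] == label) for label in EDGE_LABELS]

    # f13..f17: relative lengths in five intervals of width 0.2.
    rl_bins = [0] * 5
    for rl in rel_lengths.values():
        rl_bins[bin_index(rl, 0.2, 5)] += 1

    # f18..f23: relative angles in six intervals of width 30 degrees.
    ra_bins = [0] * 6
    for _, _, _, ra in edges:
        ra_bins[bin_index(ra, 30.0, 6)] += 1

    return [len(rel_lengths), len(edges)] + degree_counts + label_counts + rl_bins + ra_bins

signature = g_signature(
    {"a": 1.0, "b": 0.5, "c": 0.3},
    [("a", "b", "L", 85.0), ("b", "c", "T", 45.0)],
)
```

The result is a plain list of 23 integers, so two symbols can be compared without any graph matching.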

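Graph Signature (G-Signature)

$G_a = [a_1, a_2, a_3, \ldots, a_{23}]$, $G_b = [b_1, b_2, b_3, \ldots, b_{23}]$

$$d(G_a, G_b) = \sqrt{\sum_{i=1}^{n} (a_i - b_i)^2}$$

$$D = [d_{ij}] =
\begin{bmatrix}
d_{11} & d_{12} & d_{13} & \cdots & d_{1j} \\
d_{21} & d_{22} & d_{23} & \cdots & d_{2j} \\
d_{31} & d_{32} & d_{33} & \cdots & d_{3j} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
d_{i1} & d_{i2} & d_{i3} & \cdots & d_{ij}
\end{bmatrix}
=
\begin{bmatrix}
0 & d_{12} & d_{13} & \cdots & d_{1j} \\
d_{21} & 0 & d_{23} & \cdots & d_{2j} \\
d_{31} & d_{32} & 0 & \cdots & d_{3j} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
d_{i1} & d_{i2} & d_{i3} & \cdots & 0
\end{bmatrix}$$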

Graph Signature (G-Signature)

GREC-2003 Models

Distances of hand-drawn architectural and electrical symbols vs. their respective models

Graph Signature (G-Signature)

The nearest neighbour rule (NNR) is used for classification: two graphic symbols are considered similar if the Euclidean distance between their feature vectors is relatively small.

A query symbol $x$ is assigned to the model $S_k$ such that $d(S_k, x) = \min_i d(S_i, x)$.
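A minimal sketch of the nearest neighbour rule over signatures. The function names and the toy three-value signatures are illustrative assumptions; real G-signatures have 23 entries.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("signatures must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_model(query, models):
    """models: {name: signature}. Returns (name, distance) of the closest model."""
    distances = ((name, euclidean(sig, query)) for name, sig in models.items())
    return min(distances, key=lambda pair: pair[1])

models = {"door": [4, 4, 0], "resistor": [2, 1, 1]}
best = nearest_model([4, 3, 0], models)
```

Returning the distance along with the name lets a caller reject a query whose closest model is still far away.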

Results

Performance of the proposed G-signature

Improvement suggested

G-Signature → Cluster of Similar Symbols → Greedy Algorithm → Closest Matching Symbol
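This two-stage idea can be sketched as a cheap shortlist followed by a more expensive refinement. Everything named here is an assumption for illustration: `shortlist`, `recognise`, the shortlist size `k`, and `match_score`, a caller-supplied function standing in for the greedy graph matcher that returns a similarity for a given model name.

```python
import math

def shortlist(query_sig, model_sigs, k=3):
    """Names of the k models whose signatures are closest to the query signature."""
    ranked = sorted(model_sigs, key=lambda name: math.dist(model_sigs[name], query_sig))
    return ranked[:k]

def recognise(query_sig, model_sigs, match_score, k=3):
    """Run the expensive matcher only on the shortlisted cluster of similar symbols."""
    candidates = shortlist(query_sig, model_sigs, k)
    return max(candidates, key=match_score)

model_sigs = {"a": [0, 0], "b": [1, 0], "c": [10, 10]}
# Stand-in similarity scores; a real system would call the graph matcher here.
fake_scores = {"a": 0.2, "b": 0.9, "c": 1.0}
winner = recognise([0.4, 0], model_sigs, fake_scores.get, k=2)
```

In this toy run model "c" has the highest stand-in score but is never examined, because its signature is too far from the query to enter the shortlist.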

Conclusions

Because the attributes on the graph’s vertices and edges are relative, our graph-based symbol representations are invariant to rotation and scaling.

The technique is fairly general and can be used to cluster similar graphs.

G-signature is very fast to compute from an attributed graph.

Higher precision can be achieved when it is coupled with other polynomial-time graph matching algorithms.

A weighted distance measure, or some other statistical classifier, can also be used to improve performance (tests under study).

That's it!

The New Similarity Measure (continued)

[Equation: $\mathrm{Sim}(G, G')$, defined from $Sc_{Mp}$ and the cardinality function $C$ applied to the vertex and edge sets]

$Sc_{Mp}$: the score of the mapping computed

$C$: a cardinality function (# of vertices or edges)

Two further terms represent the number of attributes associated with a vertex and with an edge.