
1

Collective Classification for Network Data With Noise

Advisor: Prof. Sing Ling Lee
Student: Chao Chih Wang
Date: 2012.10.11


2

Outline

Introduction
    Network data
    Collective Classification
    ICA

Problem

Algorithm For Collective Inference With Noise

Experiments

Conclusions


3

Introduction – Network Data

traditional data: instances are independent of each other

network data: instances may be related to each other

applications: email, web pages, paper citations



4

Introduction – Collective Classification

Classify interrelated instances using relational features.

Related instances: the instances to be classified are linked to one another.

Classifier: the base classifier uses both content features and relational features.

Collective inference: iteratively update the class labels and recompute the relational feature values.


5

Introduction – ICA

ICA: Iterative Classification Algorithm

Initial:                                                      (step 1)
    train the local classifier using content features
    predict the unlabeled instances

Iterative {                                                   (step 2)
    for each unlabeled instance to predict {
        set the unlabeled instance's relational features
        use the local classifier to predict the unlabeled instance
    }
}

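To make the loop concrete, here is a minimal Python sketch of ICA under some assumptions of mine: a scikit-learn style classifier, the graph as an adjacency list, integer class labels 0..n_classes-1, and the usual count-based relational features (the fraction of neighbors currently assigned to each class). All names are illustrative; this is not taken from the thesis code.

import numpy as np
from sklearn.linear_model import LogisticRegression

def relational_features(node, labels, graph, n_classes):
    # fraction of the node's neighbors currently assigned to each class
    counts = np.zeros(n_classes)
    for nb in graph[node]:
        if labels.get(nb) is not None:
            counts[labels[nb]] += 1
    total = counts.sum()
    return counts / total if total > 0 else counts

def ica(X, y, graph, labeled, unlabeled, n_classes, n_iter=10):
    labels = {i: int(y[i]) for i in labeled}

    # Initial: a content-only classifier bootstraps labels for the unlabeled nodes.
    content_clf = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])
    for i in unlabeled:
        labels[i] = int(content_clf.predict(X[i].reshape(1, -1))[0])

    # Local classifier trained on content + relational features of the labeled nodes.
    rel = np.array([relational_features(i, labels, graph, n_classes) for i in labeled])
    local_clf = LogisticRegression(max_iter=1000).fit(np.hstack([X[labeled], rel]), y[labeled])

    # Iterative: recompute relational features, then re-predict each unlabeled node.
    for _ in range(n_iter):
        for i in unlabeled:
            r = relational_features(i, labels, graph, n_classes)
            labels[i] = int(local_clf.predict(np.hstack([X[i], r]).reshape(1, -1))[0])
    return labels

In practice the loop runs for a fixed number of iterations or until the predicted labels stop changing.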


6

Introduction – ICA Example

[Figure: example graph of nodes A–H; training nodes carry class labels 1, 2, or 3 and are linked to the unlabeled nodes to be classified]

Initial: use content features to predict the unlabeled instances

Iteration 1:
1. set each unlabeled instance's relational features
2. use the local classifier to predict the unlabeled instances


Iteration 2:
1. set each unlabeled instance's relational features
2. use the local classifier to predict the unlabeled instances


7

Problem – Noise

Noise: an instance is labeled with the wrong class, either by a mistake or because the correct class is difficult to judge.

[Figure: example graph; nodes A, B, and D are shown with class labels and are linked to nodes C, E, F, and G]

node   content feature a   content feature b   class
 C             0                   0             1
 D             0                   0             1
 E             1                   1             2
 F             1                   0             2
 G             1                   1             2

content feature a: 0 = woman, 1 = man
content feature b: 0 = age ≤ 20, 1 = age > 20
class: 1 = non-smoking, 2 = smoking

(one node is labeled class 1, but should it be 1 or 2?)


8

Problem – Using ICA

[Figure: unlabeled node A and its neighbors; one training neighbor is noise and carries label 1. A's true label is 2, but ICA predicts class 1 in every iteration]

Initial: use content features to predict the unlabeled instances

Iteration 1:
1. set each unlabeled instance's relational features
2. use the local classifier to predict the unlabeled instances


Iteration 2:
1. set each unlabeled instance's relational features
2. use the local classifier to predict the unlabeled instances

A's relational feature:

              Class 1    Class 2    Class 3
Iteration 1     2/3        1/3         0
Iteration 2     2/3        1/3         0

Two of A's three neighbors, including the noisy node, stay at class 1, so the relational feature never changes and ICA keeps predicting class 1 for A.
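For the example above, the count-based relational feature is just the class histogram of A's neighbors. A quick illustrative check in Python (the neighbor labels are read off the figure, so treat them as an assumption):

from collections import Counter

neighbor_labels = [1, 1, 2]           # A's three neighbors' current labels; one "1" comes from the noisy node
counts = Counter(neighbor_labels)
rel = [counts.get(c, 0) / len(neighbor_labels) for c in (1, 2, 3)]
print(rel)                            # [0.666..., 0.333..., 0.0]  ->  2/3, 1/3, 0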


9

ACIN

ACIN: Algorithm For Collective Inference With Noise

Initial:                                                             (step 1)
    train the local classifier using content features
    predict the unlabeled instances

Iterative {
    for each unlabeled instance to predict {
        for nb in the unlabeled instance's neighbors {
            if (nb needs to be predicted again)
                (class label, probability) = local classifier(nb)    (step 2)
        }
        set the unlabeled instance's relational features             (step 3)
        (class label, probability) = local classifier(instance)      (step 4)
    }
    retrain the local classifier                                     (step 5)
}

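Below is a rough, speculative Python sketch of this loop. It fills in the details (probability-weighted relational features, the re-prediction test, probability averaging, retraining) from the analysis slides that follow, so it is my reading of ACIN rather than the author's implementation. It assumes a scikit-learn style classifier, integer class labels 0..n_classes-1, and an adjacency-list graph; every name is illustrative.

import numpy as np
from sklearn.linear_model import LogisticRegression

def weighted_rel_features(node, pred, graph, n_classes):
    # Analysis #1: each neighbor adds its predicted probability to its predicted
    # class, and the vector is normalized by the sum of all the probabilities.
    w = np.zeros(n_classes)
    for nb in graph[node]:
        label, prob = pred[nb]
        w[label] += prob
    s = w.sum()
    return w / s if s > 0 else w

def acin(X, y, graph, labeled, unlabeled, n_classes, n_iter=10):
    # step 1: bootstrap with a content-only classifier; store (label, probability)
    pred = {i: (int(y[i]), 1.0) for i in labeled}
    content_clf = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])
    for i in unlabeled:
        p = content_clf.predict_proba(X[i].reshape(1, -1))[0]
        pred[i] = (int(p.argmax()), float(p.max()))
    recheck = set(unlabeled)                  # neighbors that must be predicted again

    def fit_local():
        rel = np.array([weighted_rel_features(i, pred, graph, n_classes) for i in labeled])
        return LogisticRegression(max_iter=1000).fit(np.hstack([X[labeled], rel]), y[labeled])

    def classify(clf, i):
        r = weighted_rel_features(i, pred, graph, n_classes)
        p = clf.predict_proba(np.hstack([X[i], r]).reshape(1, -1))[0]
        return int(p.argmax()), float(p.max())

    local_clf = fit_local()
    unlabeled_set = set(unlabeled)
    for _ in range(n_iter):
        for i in unlabeled:
            # step 2: re-predict this instance's flagged neighbors (Analysis #2)
            for nb in graph[i]:
                if nb in recheck and nb in unlabeled_set:
                    new_label, new_prob = classify(local_clf, nb)
                    old_label, old_prob = pred[nb]
                    if new_label == old_label:
                        pred[nb] = (old_label, (old_prob + new_prob) / 2)  # average, stop re-checking
                        recheck.discard(nb)
                    # otherwise keep the old prediction and leave nb flagged for the next iteration
            # steps 3-4: relational features for this instance, then predict it
            pred[i] = classify(local_clf, i)
        # step 5: retrain the local classifier with the updated relational features
        local_clf = fit_local()
    return pred

The handling of a neighbor whose re-prediction disagrees is simplified here to "keep the old prediction and re-check next iteration"; the later slides discuss the exact rule (Method 3).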


10

ACIN – Example

[Figure: the same example as on the ICA slide; unlabeled nodes A and B are linked to training nodes, including a noisy training node C labeled 1 whose true label is 2]

Initial: use content features to predict the unlabeled instances

Iteration 1:
1. predict the unlabeled instance's neighbors
2. set the unlabeled instance's relational features
3. use the local classifier to predict the unlabeled instances


Iteration 2:
1. re-predict the unlabeled instance's neighbors
2. set the unlabeled instance's relational features
3. use the local classifier to predict the unlabeled instances

A's relational feature:

              Class 1    Class 2    Class 3
Iteration 1    70/130     60/130        0
Iteration 2    60/120     60/120        0

[Figure: the neighbors' (label, probability) predictions across the two iterations, e.g. (1, 70%), (2, 60%), (1, 90%), (1, 60%); a neighbor whose re-predicted label changes is marked "predict again"]


11

ACIN – Analysis

Compared with ICA:
1. a different method for computing the relational features
2. the unlabeled instance's neighbors are predicted again
3. the local classifier is retrained


12

ACIN – Analysis #1: compute the relational features using probabilities

[Figure: node A with three neighbors predicted (1, 80%), (2, 60%), and (3, 70%)]

Our method:
Class 1: 80 / (80 + 60 + 70)
Class 2: 60 / (80 + 60 + 70)
Class 3: 70 / (80 + 60 + 70)

General method:
Class 1: 1/3
Class 2: 1/3
Class 3: 1/3
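The weighting is easy to reproduce; a tiny illustrative Python check with the numbers from this slide:

import numpy as np

neighbors = [(1, 0.80), (2, 0.60), (3, 0.70)]   # (predicted class, probability) per neighbor
w = np.zeros(3)
for label, prob in neighbors:
    w[label - 1] += prob                        # the slides number classes from 1
print(w / w.sum())                              # -> 80/210, 60/210, 70/210

The general method would give each class 1/3 regardless of how confident the three predictions are; the weighted version lets a confident neighbor count for more.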


13

ACIN – Analysis #2: predict the unlabeled instance's neighbors again

In the first iteration, every neighbor needs to be predicted again.

If the re-predicted label differs from the original label:
    the new prediction is not adopted in this iteration
    the neighbor must be predicted again in the next iteration

If the re-predicted label is the same as the original label:
    the two probabilities are averaged
    the neighbor does not need to be predicted again in the next iteration

Example:
[Figure: node A with neighbors B and C; one neighbor's re-predicted label changes, so it is marked "predict again", while the other keeps label 2 and its probabilities (2, 80%) and (2, 60%) are averaged to (2, 70%)]
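A minimal Python sketch of this rule, with illustrative names (the (label, probability) pairs and the "re-predict" flag are mine, not from the slides):

def reconcile(old, new):
    # old/new are (label, probability) pairs for one neighbor
    old_label, old_prob = old
    new_label, new_prob = new
    if new_label == old_label:
        # same label: average the probabilities, no further re-prediction needed
        return (old_label, (old_prob + new_prob) / 2), False
    # different label: keep the original this iteration, re-predict next iteration
    return old, True

print(reconcile((2, 0.80), (2, 0.60)))   # ((2, 0.7), False), i.e. (2, 80%) and (2, 60%) average to (2, 70%)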


14

ACIN – Analysis #2 (continued)

[Figure: node A's neighbors B, C, D with original predictions B (1, 60%), C (2, 70%), D (3, 60%); after re-prediction C becomes (2, 80%) and is averaged to (2, 75%), D stays at (3, 60%), and B's re-predicted label differs, so it is marked "predict again"]

A's relational feature:

             Class 1    Class 2    Class 3
original      60/190     70/190     60/190
Method 1      50/185     75/185     60/185
Method 2       0/195    135/195     60/195
Method 3       0/135     75/135     60/135

B:  Method 1: (1, 50%)   Method 2: (2, 60%)   Method 3: (1, 0%)

B's true label: 2

If B is noise:      Method 2 > Method 3 > Method 1
If B is not noise:  Method 1 > Method 3 > Method 2

Methods 1 and 2 are too extreme, so we choose Method 3.
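One reading of Method 3 that reproduces the numbers in the table is: a neighbor whose re-predicted label disagrees contributes nothing to the relational feature until it is resolved. An illustrative Python check (the neighbor values are taken from this example, but the rule itself is my interpretation, not stated explicitly on the slide):

import numpy as np

# (current label, probability, re-predicted label) for A's neighbors B, C, D
neighbors = [(1, 0.60, 2),    # B: re-predicted label disagrees
             (2, 0.75, 2),    # C: (2, 70%) and (2, 80%) averaged to (2, 75%)
             (3, 0.60, 3)]    # D: unchanged

w = np.zeros(3)
for label, prob, new_label in neighbors:
    if new_label == label:            # Method 3: drop disagreeing neighbors entirely
        w[label - 1] += prob
print(w / w.sum())                    # -> 0/135, 75/135, 60/135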


15

ACIN – Analysis #3: retrain the local classifier

Initial (ICA):

node   content features   relational feature (class1, class2, class3)   class
 A          1 0 1                 1/2,   1/2,   0                          1
 B          1 1 1                 1,     0,     0                          2
 C          1 0 1                 1,     0,     0                          1

[Figure: training nodes A, B, C and their neighbors D and E, whose current predictions include (1, 90%), (2, 60%), (2, 70%), (1, 80%), and (3, 70%)]

Retrain (with ACIN's probability-weighted relational features):

node   content features   relational feature (class1, class2, class3)   class
 A          1 0 1              90/290,  130/290,  70/290                   1
 B          1 1 1              80/140,   60/140,   0                       2
 C          1 0 1              80/150,    0,      70/150                   1
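The retrained relational features in the second table can be reproduced from the (label, probability) pairs shown in the figure. A quick illustrative Python check (the neighbor set of each node is inferred from the figure and table, so treat it as an assumption):

import numpy as np

def weighted_rel(neighbors, n_classes=3):
    w = np.zeros(n_classes)
    for label, prob in neighbors:
        w[label - 1] += prob          # classes numbered from 1 on the slides
    return w / w.sum()

print(weighted_rel([(1, 0.90), (2, 0.60), (2, 0.70), (3, 0.70)]))  # A: 90/290, 130/290, 70/290
print(weighted_rel([(1, 0.80), (2, 0.60)]))                        # B: 80/140, 60/140, 0
print(weighted_rel([(1, 0.80), (3, 0.70)]))                        # C: 80/150, 0, 70/150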


16

Experiments

Data sets: Cora, CiteSeer, WebKB


17

Conclusions