Heterogeneous Consensus Learning via Decision Propagation and Negotiation
Jing Gao† Wei Fan‡ Yizhou Sun†Jiawei Han†
†University of Illinois at Urbana-Champaign‡IBM T. J. Watson Research Center
KDD’09 Paris, France
Information Explosion
[Diagram: heterogeneous information sources about the same objects, e.g. fan sites, descriptions, pictures, videos, blogs, reviews]
Information explodes not only in scale, but also in the number of available sources!
Multiple Source Classification
• Image categorization: images, descriptions, notes, comments, albums, tags, …
• Like? Dislike? (movie interest prediction): movie genres, cast, director, plots, …; users' viewing history, movie ratings, …
• Research area prediction: publication and co-authorship network, published papers, …
Model Combination helps!
[Diagram: bipartite graph linking researchers (Tom, Jim, Lucy, Mike, Jack, Tracy, Cindy, Bob, Mary, Alice) to the conferences they publish in (SIGMOD, EDBT, VLDB; SDM, ICDM, KDD; ICML, AAAI), with both supervised and unsupervised models built over it]
• Some areas share similar keywords
• People may publish in relevant but different areas
• There may be cross-discipline co-operations
• Models may be supervised or unsupervised
Motivation
• Multiple sources provide complementary information
  – We may want to use all of them to derive a better classification solution
• Concatenating the information sources is impossible
  – Information sources have different formats
  – We may only have access to classification or clustering results due to privacy issues
• Ensemble of supervised and unsupervised models
  – Combine their outputs on the same set of objects
  – Derive a consolidated solution
  – Reduce errors made by individual models
  – More robust and stable
Consensus Learning
Related Work
• Ensemble of classification models
  – Bagging, boosting, …
  – Focus on how to construct and combine weak classifiers
• Ensemble of clustering models
  – Derive a consolidated clustering solution
• Semi-supervised (transductive) learning
• Link-based classification
  – Use link or manifold structure to help classification
  – One unlabeled source
• Multi-view learning
  – Construct a classifier from multiple sources
Problem Formulation
• Principles
  – Consensus: maximize the agreement among the supervised and unsupervised models
  – Constraints: label predictions should be close to the outputs of the supervised models
• Objective function: a consensus term plus a constraints term (NP-hard to optimize exactly!)
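The objective function itself did not survive the transcript; the following is only a generic sketch of the form a consensus-plus-constraints objective takes, with all symbols assumed rather than taken from the slide:

```latex
\min_{\{q_i\}} \;
\underbrace{\sum_{i,j} a_{ij}\,\lVert q_i - q_j \rVert^2}_{\text{consensus}}
\;+\; \lambda
\underbrace{\sum_{i \in \mathcal{L}} \lVert q_i - y_i \rVert^2}_{\text{constraints}}
```

where $q_i$ is the class-probability estimate of object or group $i$, $a_{ij}$ the similarity between $i$ and $j$, $y_i$ the initial labeling from the supervised models, $\mathcal{L}$ the set of initially labeled nodes, and $\lambda$ a trade-off parameter.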
Methodology
• Step 1: Group-level predictions (how to propagate and negotiate?)
• Step 2: Combine multiple models using local weights (how to compute the local model weights?)
Group-level Predictions (1)
• Groups:
  – Similarity: percentage of common members
  – Initial labeling: category information from the supervised models
Group-level Predictions (2)
• Principles
  – Conditional probability estimates should be smooth over the graph
  – They should not deviate too much from the initial labeling
[Figure: group graph with labeled nodes (e.g. estimate [0.93 0.07 0]) and unlabeled nodes (e.g. estimate [0.16 0.16 0.98])]
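The two principles above amount to an iterative propagation over the group graph; the sketch below is illustrative rather than the paper's exact update rule, and the function and parameter names are assumed:

```python
import numpy as np

def propagate_labels(W, Y, is_labeled, lam=1.0, iters=50):
    """Smooth class-probability estimates over a group graph.

    W: (n, n) symmetric similarity matrix between groups
       (e.g., percentage of common members).
    Y: (n, c) initial labeling from the supervised models
       (zero rows for groups without an initial label).
    is_labeled: (n,) boolean mask of groups with initial labels.
    lam: how strongly estimates are pulled toward the initial labeling.
    """
    Q = Y.astype(float).copy()
    for _ in range(iters):
        # Weighted average of the neighbors' current estimates...
        neighbor_avg = W @ Q
        denom = W.sum(axis=1, keepdims=True) + lam * is_labeled[:, None]
        # ...pulled back toward the initial labeling where one exists.
        Q = (neighbor_avg + lam * Y * is_labeled[:, None]) / np.maximum(denom, 1e-12)
    # Normalize rows so each is a valid probability estimate.
    return Q / np.maximum(Q.sum(axis=1, keepdims=True), 1e-12)
```

On a tiny chain of three groups where the two endpoints carry opposite initial labels, the middle group ends up undecided while each endpoint stays closest to its own label.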
Local Weighting Scheme (1)
• Principle
  – If model M makes a more accurate prediction on x, M's weight on x should be higher
• Difficulty
  – "Unsupervised" model combination: we cannot use cross-validation
Local Weighting Scheme (2)
• Methods
  – Consensus
    • To compute Mi's weight on x, use M1, …, Mi-1, Mi+1, …, Mr in turn as the "true" model, and compute the average accuracy
    • Use the consistency of two models' label predictions on x's neighbors to approximate accuracy
  – Random
    • Assign equal weights to all the models
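A minimal sketch of the consensus weighting idea, assuming the models' outputs have already been mapped into a common label space; the names and signature are illustrative, not from the paper:

```python
import numpy as np

def local_weights(preds, neighbors):
    """Per-example model weights via pairwise consensus.

    preds: (r, n) label predictions of r models on n examples.
    neighbors: list of n index arrays, the neighborhood of each example.
    Returns an (r, n) weight matrix whose columns sum to 1.
    """
    r, n = preds.shape
    w = np.zeros((r, n))
    for x in range(n):
        nb = neighbors[x]
        for i in range(r):
            # Treat every other model in turn as the "true" model and
            # measure agreement on x's neighbors; average the scores.
            agree = [np.mean(preds[i, nb] == preds[j, nb])
                     for j in range(r) if j != i]
            w[i, x] = np.mean(agree)
        s = w[:, x].sum()
        w[:, x] = w[:, x] / s if s > 0 else 1.0 / r
    return w
```

With three models of which two agree everywhere, the two agreeing models receive higher local weight than the outlier on every example.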
Algorithm and Time Complexity
Compute similarity and local consistency
for each pairs of groups
for each group
iterate f steps
Compute probability estimates based on the weighted average of neighbors
Compute local weights
for each example
for each model
Combine models’ predictions using local weights
O(s2)
O(fcs2)
O(rn)
linear in the number of examples!
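The final combination on this slide is essentially a locally weighted average of the models' probability predictions; a sketch under assumed array shapes:

```python
import numpy as np

def combine(probs, weights):
    """Consolidate model predictions with per-example weights.

    probs: (r, n, c) class-probability predictions of r models
           on n examples over c classes.
    weights: (r, n) local weights; each column sums to 1.
    Returns the (n,) consolidated label predictions.
    """
    # Weighted average of the models' predictions, example by example.
    combined = np.einsum('rn,rnc->nc', weights, probs)
    return combined.argmax(axis=1)
```

For an example where one model has all the weight, the combined prediction simply follows that model; where weights are split, the averaged probabilities decide.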
Experiments - Data Sets
• 20 Newsgroups
  – newsgroup message categorization
  – only text information available
• Cora
  – research paper area categorization
  – paper abstracts and citation information available
• DBLP
  – researcher area prediction
  – publication and co-authorship network, and publication content
  – conferences' areas are known
• Yahoo! Movies
  – user viewing interest analysis (favored movie types)
  – movie ratings and synopses
  – movie genres are known
Experiments - Baseline Methods
• Single models
  – 20 Newsgroups
    • logistic regression, SVM, K-means, min-cut
  – Cora
    • abstracts, citations (with or without a labeled set)
  – DBLP
    • publication titles, links (with or without labels from conferences)
  – Yahoo! Movies
    • movie ratings and synopses (with or without labels from movies)
• Ensemble approaches
  – majority-voting classification ensemble
  – majority-voting clustering ensemble
  – clustering ensemble on all four models
Experiments - Evaluation Measures
• Classification accuracy
  – Clustering algorithms: map each cluster to the best possible class label (gives the best accuracy the algorithm can achieve)
• Clustering quality
  – Normalized mutual information (NMI)
  – Build a "true" model from the ground-truth labels
  – Compute the shared information between the "true" model and each algorithm
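Normalized mutual information can be computed directly from the joint distribution of two labelings; a self-contained sketch using the geometric-mean normalization (the paper may normalize differently):

```python
import numpy as np

def nmi(labels_a, labels_b):
    """Normalized mutual information I(A;B) / sqrt(H(A) * H(B))."""
    a, b = np.asarray(labels_a), np.asarray(labels_b)
    eps = 1e-12
    # Joint distribution over (label-in-A, label-in-B) pairs.
    ua, ub = np.unique(a), np.unique(b)
    joint = np.array([[np.mean((a == x) & (b == y)) for y in ub] for x in ua])
    pa, pb = joint.sum(axis=1), joint.sum(axis=0)
    # Mutual information and marginal entropies (natural log).
    mi = np.sum(joint * np.log((joint + eps) / (np.outer(pa, pb) + eps)))
    ha = -np.sum(pa * np.log(pa + eps))
    hb = -np.sum(pb * np.log(pb + eps))
    return mi / max(np.sqrt(ha * hb), eps)
```

Because NMI is invariant to permuting cluster IDs, a clustering that matches the classes perfectly scores 1 even if the IDs are swapped, while an independent labeling scores 0.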
Empirical Results - Accuracy
[Bar chart: classification accuracy (y-axis 0.7 to 1.0) on 20 Newsgroups, Cora, and DBLP for SC1, SC2, UC1, UC2, SME, UME, MCLA, and CLSU]
Empirical Results - NMI
[Bar chart: normalized mutual information (y-axis 0 to 0.9) on 20 Newsgroups, Cora, and DBLP for SC1, SC2, UC1, UC2, SME, UME, MCLA, and CLSU]
Empirical Results - DBLP Data
Empirical Results - Yahoo! Movies
Empirical Results - Scalability
Conclusions
• Summary
  – We propose to integrate multiple information sources for better classification
  – We study the problem of consolidating the outputs of multiple supervised and unsupervised models
  – The proposed two-step algorithm solves the problem by propagating and negotiating among multiple models
  – The algorithm runs in linear time
  – Results on various data sets show the improvements
• Follow-up work
  – Algorithm and theory
  – Applications
Thanks!
• Any questions?
http://www.ews.uiuc.edu/~jinggao3/kdd09clsu.htm
Office: 2119B