Bridged Refinement for Transfer Learning
![Page 1: Bridged Refinement for Transfer Learning](https://reader034.fdocuments.net/reader034/viewer/2022051115/5681482b550346895db54e86/html5/thumbnails/1.jpg)
Bridged Refinement for Transfer Learning
XING Dikan, DAI Wenyuan, XUE Gui-Rong, YU Yong
Shanghai Jiao Tong University
{xiaobao,dwyak,grxue,yyu}@apex.sjtu.edu.cn
![Page 2: Bridged Refinement for Transfer Learning](https://reader034.fdocuments.net/reader034/viewer/2022051115/5681482b550346895db54e86/html5/thumbnails/2.jpg)
Outline
• Motivation
• Problem
• Solution
– Assumption
– Method
– Improvement and Final Solution
• Experiment
• Conclusion
![Page 4: Bridged Refinement for Transfer Learning](https://reader034.fdocuments.net/reader034/viewer/2022051115/5681482b550346895db54e86/html5/thumbnails/4.jpg)
Motivation
• Spam filtering: decide whether a given email is spam or not.
– Training Data
– Test Data
[Figure: mailbox owners A, B, C, D, … and Z, Y, each labeled with interests such as pop music, basketball, football, and classic music]
![Page 5: Bridged Refinement for Transfer Learning](https://reader034.fdocuments.net/reader034/viewer/2022051115/5681482b550346895db54e86/html5/thumbnails/5.jpg)
Motivation
• New events always occur.
– News in 2006: commercial or politics
– News in 2007: commercial or politics
• Solution?
– Labeling new data again and again is costly.
• Therefore, we try to utilize the old labeled data while taking the shift of distribution into consideration. [Transfer useful information]
![Page 7: Bridged Refinement for Transfer Learning](https://reader034.fdocuments.net/reader034/viewer/2022051115/5681482b550346895db54e86/html5/thumbnails/7.jpg)
Problem
• We want to solve a classification problem.
• The set of target categories is fixed.
• Main difference from traditional classification:
– The training data and test data are governed by two slightly different distributions.
• We do not need labeled data in the new test data distribution.
![Page 8: Bridged Refinement for Transfer Learning](https://reader034.fdocuments.net/reader034/viewer/2022051115/5681482b550346895db54e86/html5/thumbnails/8.jpg)
Illustrative Example
[Figure: + and - points in two regions labeled sports and music. Legend: + is normal mail, - is spam mail]
![Page 11: Bridged Refinement for Transfer Learning](https://reader034.fdocuments.net/reader034/viewer/2022051115/5681482b550346895db54e86/html5/thumbnails/11.jpg)
Assumption
• P(c|d) does not change: P_train(c|d) = P_test(c|d), since
– the set of target categories is fixed;
– each target category is definite.
• P(c|d_i) ~ P(c|d_j) when d_i ~ d_j, where ~ means “similar” or “close to each other”.
• Consistency
– Mutual Reinforcement Principle
![Page 13: Bridged Refinement for Transfer Learning](https://reader034.fdocuments.net/reader034/viewer/2022051115/5681482b550346895db54e86/html5/thumbnails/13.jpg)
Method: Refinement
• UConf_c: Unrefined Confidence score of category c, i.e. the coarse-grained scores of a base classifier.
• M: adjacency matrix. M_ij = 1 if d_i is a neighbor of d_j (then row L1-normalized).
• RConf_c: Refined Confidence score of category c.
• The mutual reinforcement principle yields:
RConf_c = α M RConf_c + (1-α) UConf_c
where α is a trade-off coefficient.
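The fixed point of the refinement formula can be approximated by repeated substitution. The following is a minimal sketch, not the authors' implementation: the function name `refine`, the default α = 0.7, the stopping tolerance, and the toy matrix are all assumptions made for illustration.

```python
import numpy as np

def refine(M, u_conf, alpha=0.7, tol=1e-9, max_iter=1000):
    """Iterate RConf <- alpha * M @ RConf + (1 - alpha) * UConf.

    M      : (n, n) row-L1-normalized neighbor matrix
    u_conf : (n,) unrefined confidence scores for one category
    alpha, tol, max_iter : illustrative defaults, not values from the slides
    """
    u_conf = np.asarray(u_conf, dtype=float)
    r_conf = u_conf.copy()                      # start from the unrefined scores
    for _ in range(max_iter):
        nxt = alpha * (M @ r_conf) + (1.0 - alpha) * u_conf
        if np.max(np.abs(nxt - r_conf)) < tol:  # stop once the update is negligible
            return nxt
        r_conf = nxt
    return r_conf

# Toy example (hypothetical data): 3 documents, each has the other two as neighbors.
M = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])
u_conf = np.array([1.0, 0.0, 1.0])
r_conf = refine(M, u_conf)
```

In this toy case the middle document, which the base classifier scored 0, is pulled upward by its two confident neighbors, while their scores are pulled down slightly.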
![Page 14: Bridged Refinement for Transfer Learning](https://reader034.fdocuments.net/reader034/viewer/2022051115/5681482b550346895db54e86/html5/thumbnails/14.jpg)
Method: Refinement
• Refinement can be regarded as reaching a consistency under the mixture distribution.
• Why not try to reach a consistency under the distribution of the test data?
![Page 15: Bridged Refinement for Transfer Learning](https://reader034.fdocuments.net/reader034/viewer/2022051115/5681482b550346895db54e86/html5/thumbnails/15.jpg)
Illustrative Example
[Figure: data points labeled + and -]
![Page 16: Bridged Refinement for Transfer Learning](https://reader034.fdocuments.net/reader034/viewer/2022051115/5681482b550346895db54e86/html5/thumbnails/16.jpg)
[Figure, continued: data points labeled + and -]
![Page 18: Bridged Refinement for Transfer Learning](https://reader034.fdocuments.net/reader034/viewer/2022051115/5681482b550346895db54e86/html5/thumbnails/18.jpg)
Method: Bridged Refinement
• Bridged Refinement
– First, refine towards the mixture distribution.
– Then, refine towards the target (test) distribution.
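The two steps can be sketched as two applications of the same refinement formula, first over training and test documents together and then over the test documents alone. This is an illustrative sketch under assumptions: the helper names `refine` and `bridged_refine`, the value of α, the use of a direct linear solve, and the toy matrices are not from the slides.

```python
import numpy as np

def refine(M, u_conf, alpha):
    # Solve RConf = alpha * M @ RConf + (1 - alpha) * UConf as a linear system.
    n = M.shape[0]
    return np.linalg.solve(np.eye(n) - alpha * M, (1.0 - alpha) * u_conf)

def bridged_refine(M_mix, M_test, u_conf_mix, test_idx, alpha=0.7):
    # Step 1: refine the scores of all documents (training + test),
    # using a neighbor matrix built over the mixture of both sets.
    r_mix = refine(M_mix, u_conf_mix, alpha)
    # Step 2: keep only the test documents and refine again with a neighbor
    # matrix built over the test set, starting from the step-1 scores.
    return refine(M_test, r_mix[test_idx], alpha)

# Toy example (hypothetical data): documents 0-1 are training, 2-3 are test.
M_mix = (np.ones((4, 4)) - np.eye(4)) / 3.0   # everyone neighbors everyone else
M_test = np.array([[0.0, 1.0],
                   [1.0, 0.0]])
u_conf_mix = np.array([1.0, 0.0, 0.6, 0.4])
test_idx = np.array([2, 3])
r_test = bridged_refine(M_mix, M_test, u_conf_mix, test_idx)
```

The only thing the second step inherits from the first is its starting scores, which matches the "different start point" remark made when comparing Test with Bridged.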
![Page 20: Bridged Refinement for Transfer Learning](https://reader034.fdocuments.net/reader034/viewer/2022051115/5681482b550346895db54e86/html5/thumbnails/20.jpg)
Experiment
• Data set
• Base classifiers
• Different refinement styles
• Performance
• Parameter sensitivity
![Page 21: Bridged Refinement for Transfer Learning](https://reader034.fdocuments.net/reader034/viewer/2022051115/5681482b550346895db54e86/html5/thumbnails/21.jpg)
Experiment: Data set
• Source
– SRAA
  • Simulated autos (simauto)
  • Simulated aviation (simaviation)
  • Real autos (realauto)
  • Real aviation (realaviation)
– 20 Newsgroups
  • Top-level categories: rec, talk, sci, comp
– Reuters-21578
  • Top-level categories: org, places, people
![Page 22: Bridged Refinement for Transfer Learning](https://reader034.fdocuments.net/reader034/viewer/2022051115/5681482b550346895db54e86/html5/thumbnails/22.jpg)
Experiment: Data set
• Re-construction
– 11 data sets
[Figure: subcategories A1, A2, B1, B2 arranged under Positive (+) and Negative (-), split into Training Data and Test Data]
![Page 23: Bridged Refinement for Transfer Learning](https://reader034.fdocuments.net/reader034/viewer/2022051115/5681482b550346895db54e86/html5/thumbnails/23.jpg)
Experiment: Base classifier
• Supervised
– Generative model: Naïve Bayes classifier
– Discriminative model: Support vector machines
• Semi-supervised
– Transductive support vector machines
![Page 24: Bridged Refinement for Transfer Learning](https://reader034.fdocuments.net/reader034/viewer/2022051115/5681482b550346895db54e86/html5/thumbnails/24.jpg)
Experiment: Refinement Style
• No refinement (Base)
• One step
– Refine directly on the test distribution (Test)
– Refine on the mixture distribution only (Mix)
• Two steps
– Bridged Refinement (Bridged)
![Page 25: Bridged Refinement for Transfer Learning](https://reader034.fdocuments.net/reader034/viewer/2022051115/5681482b550346895db54e86/html5/thumbnails/25.jpg)
Performance: On SVM
[Figure legend: Base, Test, Mix, Bridged]
• Test (2nd), Mix (3rd) vs. Base (1st)
• Test (2nd) vs. Bridged (1st):
– Different starting point
![Page 26: Bridged Refinement for Transfer Learning](https://reader034.fdocuments.net/reader034/viewer/2022051115/5681482b550346895db54e86/html5/thumbnails/26.jpg)
Performance: NB and TSVM
![Page 27: Bridged Refinement for Transfer Learning](https://reader034.fdocuments.net/reader034/viewer/2022051115/5681482b550346895db54e86/html5/thumbnails/27.jpg)
Parameter: K
Whether d_i is regarded as a neighbor of d_j is decided by checking whether d_i is in d_j’s k-nearest-neighbor set.
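The neighbor rule above, combined with the row L1 normalization from the Method slide, can be sketched as follows. The function name `knn_matrix`, the use of cosine similarity, and the choice to leave empty rows as zeros are assumptions for illustration; the slides do not specify them.

```python
import numpy as np

def knn_matrix(X, k):
    """Build M with M[i, j] = 1 if d_i is in d_j's k-nearest-neighbor set,
    then L1-normalize each row. X holds one document vector per row."""
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.where(norms == 0, 1.0, norms)
    sim = Xn @ Xn.T                      # cosine similarity (assumed measure)
    np.fill_diagonal(sim, -np.inf)       # a document is not its own neighbor
    n = X.shape[0]
    M = np.zeros((n, n))
    for j in range(n):
        nn = np.argsort(-sim[j])[:k]     # indices of d_j's k nearest neighbors
        M[nn, j] = 1.0
    row_sums = M.sum(axis=1, keepdims=True)
    # Rows with no entries (d_i is in nobody's k-NN set) stay all zero.
    return M / np.where(row_sums == 0, 1.0, row_sums)

# Toy example (hypothetical data): two pairs of near-duplicate documents.
X = np.array([[1.0, 0.0],
              [0.9, 0.1],
              [0.0, 1.0],
              [0.1, 0.9]])
M = knn_matrix(X, k=1)
```

With k = 1 each document here is linked only to its near-duplicate, so M pairs document 0 with 1 and document 2 with 3.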
![Page 28: Bridged Refinement for Transfer Learning](https://reader034.fdocuments.net/reader034/viewer/2022051115/5681482b550346895db54e86/html5/thumbnails/28.jpg)
Parameter: α
Error rate vs. different α
![Page 29: Bridged Refinement for Transfer Learning](https://reader034.fdocuments.net/reader034/viewer/2022051115/5681482b550346895db54e86/html5/thumbnails/29.jpg)
Convergence
The refinement formula can be solved in closed form or iteratively.
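Both routes follow directly from the refinement formula on the Method slide. The derivation below is a sketch; it assumes that M is nonnegative and row L1-normalized and that 0 ≤ α < 1, so that I - αM is invertible and the iteration converges. The iteration index t is notation introduced here.

```latex
\begin{align}
\mathrm{RConf}_c &= \alpha M\,\mathrm{RConf}_c + (1-\alpha)\,\mathrm{UConf}_c \\
(I - \alpha M)\,\mathrm{RConf}_c &= (1-\alpha)\,\mathrm{UConf}_c \\
\mathrm{RConf}_c &= (1-\alpha)\,(I - \alpha M)^{-1}\,\mathrm{UConf}_c
  && \text{(closed form)} \\
\mathrm{RConf}_c^{(t+1)} &= \alpha M\,\mathrm{RConf}_c^{(t)} + (1-\alpha)\,\mathrm{UConf}_c
  && \text{(iterative)}
\end{align}
```

Under the stated assumptions every eigenvalue of αM has magnitude at most α, so the error of the iterative form shrinks by roughly a factor of α per step.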
![Page 31: Bridged Refinement for Transfer Learning](https://reader034.fdocuments.net/reader034/viewer/2022051115/5681482b550346895db54e86/html5/thumbnails/31.jpg)
Conclusion
• Task: Transfer useful information from training data to the same classification task of the test data, while training and test data are governed by two different distributions.
• Approach: Take the mixture distribution as a bridge and make a two-step refinement.
![Page 32: Bridged Refinement for Transfer Learning](https://reader034.fdocuments.net/reader034/viewer/2022051115/5681482b550346895db54e86/html5/thumbnails/32.jpg)
Thank you
Please ask in slow and simple English
![Page 33: Bridged Refinement for Transfer Learning](https://reader034.fdocuments.net/reader034/viewer/2022051115/5681482b550346895db54e86/html5/thumbnails/33.jpg)
Backup 1: Transductive
• The boundary after either step of refinement is never actually calculated explicitly. It is hidden in the refined labels of the data points.
• I draw it explicitly in the examples only for clearer illustration.
![Page 34: Bridged Refinement for Transfer Learning](https://reader034.fdocuments.net/reader034/viewer/2022051115/5681482b550346895db54e86/html5/thumbnails/34.jpg)
Backup 2: n-step
• One important problem left unsolved by us:
– How to describe a distribution λ D_train + (1-λ) D_test?
– One solution is sampling in a generative manner. But this makes the result depend on each random number picked in the generative process, which may make the solution unstable and hard to reproduce.
![Page 35: Bridged Refinement for Transfer Learning](https://reader034.fdocuments.net/reader034/viewer/2022051115/5681482b550346895db54e86/html5/thumbnails/35.jpg)
Backup 3: Why the mutual reinforcement principle?
• If d_j has a high confidence of being in category c, then d_i, a neighbor of d_j, should also receive a high confidence score.