2 Task Embedding for Model Recommendation · Model recommendation Feature Extractor Zoo 3 X L(y)...

Post on 07-Aug-2020

15 views 0 download

Transcript of 2 Task Embedding for Model Recommendation · Model recommendation Feature Extractor Zoo 3 X L(y)...

TASK2VEC: Task Embedding for Model RecommendationSubhransu MajiCollege of Information and Computer SciencesUniversity of Massachusetts, Amherst

http://people.cs.umass.edu/smaji

February 19, 2019 @ ICERM, Brown University

https://arxiv.org/abs/1902.03545

Task Embedding for Model RecommendationAllesandro,Michael,Rahul,Avinash,Subhransu,Charless,Stefano,Pietro

2

Task={dataset,labels,loss}

Whataresimilartasks?WhatarchitectureshouldIuse?Whatpre-trainingdataset?Whathyperparameters?DoIneedmoretrainingdata?Howdifficultisthistask?...

IfwehaveauniversalvectorialrepresentationoftaskswecanframeallsortsofinterestingCVapplicationsengineeringproblemsasmeta-learningproblems

Model recommendation

3FeatureExtractorZoo

X L(y)

recommender

Input:Task=(dataset,loss)

ForeachfeatureextractorarchitectureF:1.TrainclassifieronF(dataset)2.Computevalidationperformance

Output:bestperformingmodel

BruteForce:

Input:Task=(dataset,loss)

1.Computetaskembeddingt=E(Task)2.PredictbestextractorF=M(t)2.TrainclassifieronF(dataset)3.Computevalidationperformance

Output:bestperformingmodel

Taskrecommendation:

1. Givenatask,trainaclassifierwiththetasklossonfeaturesfromageneric“probenetwork”

2. Computegradientsofprobenetworkparametersw.r.t.taskloss

3. Usestatisticsoftheprobeparametergradientsasthefixeddimensionaltaskembedding

4

Intuition:Fprovidesinformationaboutthesensitivityofthetaskperformancetosmallperturbationsofparametersintheprobenetwork

Ex⇠p̂

KLp

0(y|x)p✓

(y|x) = �✓ · F · �✓ + o(�✓2),

Task embedding using Fisher Information

5

Properties of TASK2VECembedding

(xi, yi), i = 1 . . . n, yi 2 {0, 1}Dataset:

Classifier:

FIMforcrossentropylossforthelastlayer:

Fw =1

N

X

i

pi(1� pi)�(xi)�(xi)T

pi = �

�w

T�(xi)

Twolayernetwork

x ! �(x)

1. Invariancetolabelspace2. Encodestaskdifficulty3. Encodestaskdomain4. Encodesusefulfeaturesforthetask

6

Properties of TASK2VECembedding

(xi, yi), i = 1 . . . n, yi 2 {0, 1}Dataset:

Classifier:

FIMforcrossentropylossforthelastlayer:

Fw =1

N

X

i

pi(1� pi)�(xi)�(xi)T

pi = �

�w

T�(xi)

D =1

N

X

i

�(xi)�(xi)T

Representativedomainembedding

Properties of TASK2VECembedding1. Binarytasksonunitsquare,

i.e.,eachtileisatask2. 10RandomReLUfeatures,i.e.,

φᵢ=max(0,aᵢx+bᵢy+cᵢ)3. T-SNEtomap10x10FIMto2D

7

Properties of TASK2VECembedding1. Binarytasksonunitsquare,

i.e.,eachtileisatask2. 10RandomReLUfeatures,i.e.,

φᵢ=max(0,aᵢx+bᵢy+cᵢ)3. T-SNEtomap10x10FIMto2D

8Polynomialdegree3

9

Robust Fisher Computation

1. ForrealisticCVtaskswewanttousedeepCNNs(e.g.,ResNet)andestimateFIMforalltheparameters.

2. Challenge:FIMcanbehardtoestimate(noisylosslandscape;highdimensions;smalltrainingset)

3. RobustFIM1. Restrictittoadiagonal2. Restrictitasinglevalueper

filter(CNNlayer)3. Robustestimationvia

perturbation

EstimateΛofaGaussianperturbation:

OptimalΛsatisfies:

“TrivialEmbedding”

10

Similarity measures on the space of tasks

TaskA TaskB

Task={dataset,labels,loss}

Similarity measures on the space of tasks

Domainsimilarity

11

Unbiasedlookatdatasetbias,TorralbaandEfros,CVPR11

Similarity measures on the space of tasks

Domainsimilarity

Range/labelsimilarity• e.g.,Taxonomicdistance

12

https://www.pinterest.com/pin/520799144386337065/

Similarity measures on the space of tasks

Domainsimilarity

Range/labelsimilarity• e.g.,Taxonomicdistance

Transfer“distance”• Fine-tuneonafollowedbyb

13

Taskonomy:DisentanglingTaskTransferLearning,AmirZamir,AlexanderSax,WilliamShen,LeonidasGuibas,JitendraMalik,SilvioSavarese,CVPR18

Distance measures on TASK2VECembedding

Symmetricdistance

Asymmetric“distance”

14

MODEL2VEC: Joint embedding of tasks and models

1. Sofarwehavebeenassociatingmodels(featureextractors)withthetaskstheyaretrainedon.

2. Howabout1. legacy/black-boxfeatureextractors?E.g.,SIFT,HOG,Fishervector2. modelsofdifferentcomplexitytrainedonthesamedataset

3. MODEL2VEC:Jointlyembedfeatureextractors(encodedasone-hot-vectors)andtaskssuchthatsimilarityreflectsameta-taskobjective.1. Needstrainingdata

15

• Tasks[1460]• iNaturalist[207]• CUB200[25]• iMaterialist[228]• DeepFashion[1000]

16

Task Zoo

• Tasks[1460]• iNaturalist[207]• CUB200[25]• iMaterialist[228]• DeepFashion[1000]

17

Task Zoo

• Tasks[1460]• iNaturalist[207]• CUB200[25]• iMaterialist[228]• DeepFashion[1000]

18

Task Zoo

• Tasks[1460]• iNaturalist[207]• CUB200[25]• iMaterialist[228]• DeepFashion[1000]

• Fewtasks>10Ktrainingsamplesbutmosthave100-1000samples

19

Task Zoo

Experiment: TASK2VEC recapitulates iNaturalist taxonomy

Taskembeddingcosinesimilarity

plants

reptilesbirdsmammals

insects

ResNettrainedonImageNetasprobenetwork

20

Experiment: TASK2VEC norm encodes task difficulty

ResNettrainedonImageNetasprobenetwork

21

Task Embeddings Domain Embeddings

Actinopterygii (n)

Amphibia (n)

Arachnida (n)

Aves (n)

Fungi (n)

Insecta (n)

Mammalia (n)

Mollusca (n)

Plantae (n)

Protozoa (n)

Reptilia (n)

Category (m)

Color (m)

Gender (m)

Material (m)

Neckline (m)

Pants (m)

Pattern (m)

Shoes (m)

Experiment: TASK2VEC vs DOMAIN2VEC

22

• Tasks[1460]• iNaturalist[207]• CUB200[25]• iMaterialist[228]• DeepFashion[1000]

23

• FeatureZoo[156experts]• ResNet-34pertainedonImageNet• Followedbyfine-tunedontaskswithenoughexamples

Task and Feature Zoo

FeatureExtractorZoo

X L(y)

recommender

The Matrix

24Featureextractors

Tasks The Matrix

25

iNaturalist+CUB

The Matrix

Experts

Tasks

26

ImageNet expert is usually good but on many tasks the best expert handily outperforms the ImageNet expert

27

Data efficiency of TASK2VEC

ImageNetfixed

ImageNetfinetune

Task2vecfinetune

Task2vecfixed

Bruteforcefixed

28

Choice of distance for TASK2VEC

Relativeerrorincreaseovertheoracle(bestchoice)

29

Choice of the probe network on TASK2VEC

Relativeerrorincreaseovertheoracle(bestchoice)

Thank you!

Task2Vec:TaskEmbeddingforMeta-Learning,AlessandroAchille,MichaelLam,RahulTewari,AvinashRavichandran,SubhransuMaji,CharlessFowlkes,StefanoSoatto,PietroPerona(https://arxiv.org/abs/1902.03545)

30