Machine Learning - Piazza
![Page 1: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/1.jpg)
Machine Learning
Slides: James Hays, Isabelle Guyon, Erik Sudderth, Mark Johnson, Derek Hoiem
Photo: CMU Machine Learning Department protests G20
![Page 2: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/2.jpg)
![Page 3: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/3.jpg)
Clustering: group together similar points and represent them with a single token
Key Challenges:
1) What makes two points/images/patches similar?
2) How do we compute an overall grouping from pairwise similarities?
Slide: Derek Hoiem
![Page 4: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/4.jpg)
How do we cluster?
• K-means
– Iteratively re-assign points to the nearest cluster center
• Agglomerative clustering
– Start with each point as its own cluster and iteratively merge the closest clusters
• Mean-shift clustering
– Estimate modes of pdf
• Spectral clustering
– Split the nodes in a graph based on assigned links with similarity weights
![Page 5: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/5.jpg)
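Clustering for Summarization
Goal: cluster to minimize variance in data given clusters
– Preserve information

$$c^{*}, \delta^{*} = \underset{c,\,\delta}{\operatorname{argmin}} \; \frac{1}{N} \sum_{j}^{N} \sum_{i}^{K} \delta_{ij} \left( c_i - x_j \right)^2$$

where $c_i$ is a cluster center, $x_j$ is a data point, and $\delta_{ij}$ indicates whether $x_j$ is assigned to $c_i$.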
Slide: Derek Hoiem
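As a rough illustration of the summarization objective, the sketch below evaluates the mean squared distance between each data point and the cluster center it is assigned to. The function name `kmeans_objective` and the representation of the assignment as a list of center indices (rather than a 0/1 indicator matrix) are assumptions made for this example.

```python
def kmeans_objective(points, centers, assignment):
    """Mean squared distance from each point to its assigned center.

    points:     list of equal-length tuples (the data x_j)
    centers:    list of equal-length tuples (the cluster centers c_i)
    assignment: assignment[j] is the index i of the center that
                point j is assigned to (a compact form of delta_ij)
    """
    total = 0.0
    for x, i in zip(points, assignment):
        c = centers[i]
        total += sum((ck - xk) ** 2 for ck, xk in zip(c, x))
    return total / len(points)


points = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
centers = [(1.0, 0.0), (10.0, 0.0)]
cost = kmeans_objective(points, centers, [0, 0, 1])
```

A lower value means the centers summarize the data with less loss, which is the quantity that K-means tries to reduce.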
![Page 6: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/6.jpg)
K-means algorithm
Illustration: http://en.wikipedia.org/wiki/K-means_clustering
1. Randomly select K centers
2. Assign each point to nearest center
3. Compute new center (mean) for each cluster
![Page 7: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/7.jpg)
K-means algorithm
Illustration: http://en.wikipedia.org/wiki/K-means_clustering
1. Randomly select K centers
2. Assign each point to nearest center
3. Compute new center (mean) for each cluster
Back to step 2 (repeat until the assignments stop changing)
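The three steps and the loop back to step 2 can be sketched in plain Python as follows. This is a minimal illustration, not a tuned implementation: the names `kmeans` and `squared_distance`, the choice of initial centers as K randomly sampled data points, the fixed `seed`, and the `max_iters` cap are all assumptions of this example.

```python
import random


def squared_distance(a, b):
    return sum((ak - bk) ** 2 for ak, bk in zip(a, b))


def kmeans(points, k, max_iters=100, seed=0):
    rng = random.Random(seed)
    # Step 1: randomly select K data points as the initial centers.
    centers = [tuple(p) for p in rng.sample(points, k)]
    assignment = None
    for _ in range(max_iters):
        # Step 2: assign each point to its nearest center.
        new_assignment = [
            min(range(k), key=lambda i: squared_distance(p, centers[i]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no assignment changed, so the centers are stable
        assignment = new_assignment
        # Step 3: recompute each center as the mean of its cluster.
        for i in range(k):
            members = [p for p, a in zip(points, assignment) if a == i]
            if members:  # an empty cluster keeps its previous center
                dim = len(members[0])
                centers[i] = tuple(
                    sum(m[d] for m in members) / len(members)
                    for d in range(dim)
                )
    return centers, assignment


data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
        (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, labels = kmeans(data, k=2)
```

Because the result depends on the random initialization, it is common to run the procedure several times and keep the run with the lowest objective.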
![Page 8: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/8.jpg)
Building Visual Dictionaries
1. Sample patches from a database
– E.g., 128 dimensional SIFT vectors
2. Cluster the patches
– Cluster centers are the dictionary
3. Assign a codeword (number) to each new patch, according to the nearest cluster
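Step 3 amounts to a nearest-neighbor lookup against the cluster centers. The sketch below assumes the dictionary is already available as a list of center vectors (for example, the output of K-means) and uses toy 2-D descriptors in place of 128-dimensional SIFT vectors. The names `assign_codeword` and `codeword_histogram` are hypothetical, and the histogram step is one common use of the codewords rather than something stated on this slide.

```python
from collections import Counter


def squared_distance(a, b):
    return sum((ak - bk) ** 2 for ak, bk in zip(a, b))


def assign_codeword(descriptor, dictionary):
    """Return the index of the dictionary entry nearest to the descriptor."""
    return min(
        range(len(dictionary)),
        key=lambda i: squared_distance(descriptor, dictionary[i]),
    )


def codeword_histogram(descriptors, dictionary):
    """Count how often each codeword occurs among the descriptors."""
    counts = Counter(assign_codeword(d, dictionary) for d in descriptors)
    return [counts[i] for i in range(len(dictionary))]


dictionary = [(0.0, 0.0), (10.0, 10.0)]           # toy cluster centers
patches = [(1.0, 1.0), (9.0, 9.0), (0.0, 2.0), (8.0, 11.0)]
codes = [assign_codeword(p, dictionary) for p in patches]
hist = codeword_histogram(patches, dictionary)
```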
![Page 9: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/9.jpg)
Examples of learned codewords
Sivic et al. ICCV 2005 http://www.robots.ox.ac.uk/~vgg/publications/papers/sivic05b.pdf
Most likely codewords for 4 learned “topics”
EM with multinomial (problem 3) to get topics
![Page 10: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/10.jpg)
Agglomerative clustering
![Page 11: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/11.jpg)
Agglomerative clustering
![Page 12: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/12.jpg)
Agglomerative clustering
![Page 13: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/13.jpg)
Agglomerative clustering
![Page 14: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/14.jpg)
Agglomerative clustering
![Page 15: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/15.jpg)
Agglomerative clustering
How to define cluster similarity?
- Average distance between points, maximum distance, minimum distance
- Distance between means or medoids
How many clusters?
- Clustering creates a dendrogram (a tree)
- Threshold based on max number of clusters or based on distance between merges
(Figure: dendrogram, axis label: distance)
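The merge loop and the two stopping rules (a target number of clusters, or a threshold on the distance between merges) can be sketched as below. This is a simple O(n^3)-style illustration under stated assumptions: the function names, the `linkage` keywords `"min"`, `"max"` and `"average"`, and the parameter names `num_clusters` and `max_merge_distance` are all hypothetical choices for this example.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((ak - bk) ** 2 for ak, bk in zip(a, b)))


def cluster_distance(points, a, b, linkage):
    """Distance between clusters a and b (lists of point indices)."""
    d = [euclidean(points[i], points[j]) for i in a for j in b]
    if linkage == "min":
        return min(d)
    if linkage == "max":
        return max(d)
    if linkage == "average":
        return sum(d) / len(d)
    raise ValueError("unknown linkage: " + linkage)


def agglomerative(points, num_clusters=1, max_merge_distance=None,
                  linkage="average"):
    # Start with each point as its own cluster.
    clusters = [[i] for i in range(len(points))]
    merge_distances = []  # heights of the dendrogram merges, in order
    while len(clusters) > num_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                dist = cluster_distance(points, clusters[a], clusters[b],
                                        linkage)
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        dist, a, b = best
        # Optional threshold on the distance between merges.
        if max_merge_distance is not None and dist > max_merge_distance:
            break
        merge_distances.append(dist)
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters, merge_distances


pts = [(0.0,), (1.0,), (10.0,), (11.0,), (30.0,)]
by_count, heights = agglomerative(pts, num_clusters=2, linkage="min")
by_threshold, _ = agglomerative(pts, max_merge_distance=5.0, linkage="min")
```

Recording the merge distances is what makes the dendrogram usable: cutting it at a different height gives a different number of clusters without re-running the merges.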
![Page 16: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/16.jpg)
Agglomerative clustering demo
http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/AppletH.html
![Page 17: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/17.jpg)
Conclusions: Agglomerative Clustering
Good
• Simple to implement, widespread application
• Clusters have adaptive shapes
• Provides a hierarchy of clusters
Bad
• May have imbalanced clusters
• Still have to choose number of clusters or threshold
• Need to use an “ultrametric” to get a meaningful hierarchy
![Page 18: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/18.jpg)
Mean shift segmentation
• Versatile technique for clustering-based segmentation
D. Comaniciu and P. Meer, Mean Shift: A Robust Approach toward Feature Space Analysis, PAMI 2002.
![Page 19: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/19.jpg)
Mean shift algorithm
• Try to find modes of this non-parametric density
![Page 20: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/20.jpg)
Kernel density estimation
Kernel density estimation function
Gaussian kernel
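The formulas on this slide did not survive extraction, so the sketch below uses the standard one-dimensional kernel density estimate, f(x) = 1/(n h) * sum_i K((x - x_i) / h), with a Gaussian kernel K. The restriction to 1-D data, the bandwidth name `h`, and the function names are assumptions of this example and may differ from the notation on the original slide.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde(x, samples, h):
    """Kernel density estimate at x from 1-D samples with bandwidth h."""
    n = len(samples)
    return sum(gaussian_kernel((x - xi) / h) for xi in samples) / (n * h)


samples = [-0.1, 0.0, 0.1, 5.0]
density_near_mode = kde(0.0, samples, h=0.5)
density_in_gap = kde(2.5, samples, h=0.5)
```

The estimate is high where samples are dense and low in the gaps between them, and its local maxima are the modes that mean shift looks for.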
![Page 21: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/21.jpg)
Region of interest
Center of mass
Mean Shift vector
Slide by Y. Ukrainitz & B. Sarel
Mean shift
![Page 22: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/22.jpg)
Region of interest
Center of mass
Mean Shift vector
Slide by Y. Ukrainitz & B. Sarel
Mean shift
![Page 23: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/23.jpg)
Region of interest
Center of mass
Mean Shift vector
Slide by Y. Ukrainitz & B. Sarel
Mean shift
![Page 24: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/24.jpg)
Region of interest
Center of mass
Mean Shift vector
Mean shift
Slide by Y. Ukrainitz & B. Sarel
![Page 25: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/25.jpg)
Region of interest
Center of mass
Mean Shift vector
Slide by Y. Ukrainitz & B. Sarel
Mean shift
![Page 26: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/26.jpg)
Region of interest
Center of mass
Mean Shift vector
Slide by Y. Ukrainitz & B. Sarel
Mean shift
![Page 27: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/27.jpg)
Region of interest
Center of mass
Slide by Y. Ukrainitz & B. Sarel
Mean shift
![Page 28: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/28.jpg)
Computing the Mean Shift
Simple Mean Shift procedure:
• Compute mean shift vector

$$m(x) = \left[ \frac{\sum_{i=1}^{n} x_i \, g\!\left( \left\| \frac{x - x_i}{h} \right\|^2 \right)}{\sum_{i=1}^{n} g\!\left( \left\| \frac{x - x_i}{h} \right\|^2 \right)} - x \right], \qquad g(x) = -k'(x)$$

• Translate the Kernel window by m(x)
Slide by Y. Ukrainitz & B. Sarel
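The mean shift vector m(x) is the kernel-weighted mean of the data minus the current window center x. The sketch below computes it for a Gaussian kernel profile k(r) = exp(-r/2), for which g(r) = -k'(r) = 0.5 * exp(-r/2). The choice of profile and the function names are assumptions of this example. Any constant factor in g cancels between numerator and denominator.

```python
import math


def g(r):
    """g = -k' for the Gaussian profile k(r) = exp(-r / 2)."""
    return 0.5 * math.exp(-0.5 * r)


def mean_shift_vector(x, points, h):
    """Weighted mean of the points minus x, with bandwidth h.

    Assumes at least one point is close enough to x that its weight
    does not underflow to zero.
    """
    weights = [
        g(sum(((xk - pk) / h) ** 2 for xk, pk in zip(x, p)))
        for p in points
    ]
    total = sum(weights)
    dim = len(x)
    mean = [
        sum(w * p[d] for w, p in zip(weights, points)) / total
        for d in range(dim)
    ]
    return tuple(mean[d] - x[d] for d in range(dim))


points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
x = (0.0, 0.0)
m = mean_shift_vector(x, points, h=1.0)
# Translate the window by m(x):
x = tuple(xk + mk for xk, mk in zip(x, m))
```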
![Page 29: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/29.jpg)
Attraction basin
• Attraction basin: the region for which all trajectories lead to the same mode
• Cluster: all data points in the attraction basin of a mode
Slide by Y. Ukrainitz & B. Sarel
![Page 30: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/30.jpg)
Attraction basin
![Page 31: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/31.jpg)
Mean shift clustering
• The mean shift algorithm seeks modes of the given set of points
1. Choose kernel and bandwidth
2. For each point:
a) Center a window on that point
b) Compute the mean of the data in the search window
c) Center the search window at the new mean location
d) Repeat (b,c) until convergence
3. Assign points that lead to nearby modes to the same cluster
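The procedure above can be sketched with a flat (uniform) window, where the mean in step 2b is the plain average of the points inside the window. The flat kernel, the convergence tolerance `tol`, the `merge_radius` used to decide which modes count as "nearby", and all function names are assumptions made for this illustration.

```python
def squared_distance(a, b):
    return sum((ak - bk) ** 2 for ak, bk in zip(a, b))


def seek_mode(start, points, bandwidth, tol=1e-6, max_iters=500):
    """Steps 2a-2d: move a window from `start` until it stops moving."""
    x = tuple(start)
    for _ in range(max_iters):
        # 2b: mean of the data inside the search window.
        window = [p for p in points
                  if squared_distance(p, x) <= bandwidth ** 2]
        if not window:
            return x
        mean = tuple(sum(p[d] for p in window) / len(window)
                     for d in range(len(x)))
        # 2d: stop once the window no longer moves.
        if squared_distance(mean, x) <= tol ** 2:
            return mean
        # 2c: re-center the window at the new mean.
        x = mean
    return x


def mean_shift_cluster(points, bandwidth, merge_radius=None):
    if merge_radius is None:
        merge_radius = bandwidth / 2.0
    modes, labels = [], []
    for p in points:
        m = seek_mode(p, points, bandwidth)
        # Step 3: points whose modes are close share a cluster.
        for idx, existing in enumerate(modes):
            if squared_distance(m, existing) <= merge_radius ** 2:
                labels.append(idx)
                break
        else:
            modes.append(m)
            labels.append(len(modes) - 1)
    return modes, labels


data = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2),
        (5.0, 5.0), (5.2, 5.0), (5.0, 5.2)]
modes, labels = mean_shift_cluster(data, bandwidth=1.0)
```

Unlike K-means, the number of clusters is not given in advance. It falls out of the bandwidth, which controls how many modes the density estimate has.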
![Page 32: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/32.jpg)
Segmentation by Mean Shift
• Compute features for each pixel (color, gradients, texture, etc)
• Set kernel size for features Kf and position Ks
• Initialize windows at individual pixel locations
• Perform mean shift for each window until convergence
• Merge windows that are within width of Kf and Ks
![Page 33: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/33.jpg)
http://www.caip.rutgers.edu/~comanici/MSPAMI/msPamiResults.html
Mean shift segmentation results
![Page 34: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/34.jpg)
http://www.caip.rutgers.edu/~comanici/MSPAMI/msPamiResults.html
![Page 35: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/35.jpg)
Mean-shift: other issues
• Speedups
– Binned estimation
– Fast search of neighbors
– Update each window in each iteration (faster convergence)
• Other tricks
– Use kNN to determine window sizes adaptively
• Lots of theoretical support
D. Comaniciu and P. Meer, Mean Shift: A Robust Approach toward Feature Space Analysis, PAMI 2002.
![Page 36: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/36.jpg)
Mean shift pros and cons
• Pros
– Good general-practice segmentation
– Flexible in number and shape of regions
– Robust to outliers
• Cons
– Have to choose kernel size in advance
– Not suitable for high-dimensional features
• When to use it
– Oversegmentation
– Multiple segmentations
– Tracking, clustering, filtering applications
![Page 37: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/37.jpg)
Spectral clustering
Group points based on links in a graph
A B
![Page 38: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/38.jpg)
Cuts in a graph
A B
Normalized Cut
• a cut penalizes large segments
• fix by normalizing for size of segments
• volume(A) = sum of costs of all edges that touch A
Source: Seitz
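The slide does not spell out the normalized cut formula, so the sketch below uses the standard definition from Shi and Malik: Ncut(A, B) = cut(A, B) / volume(A) + cut(A, B) / volume(B), where cut(A, B) is the total weight of edges between the two segments. Representing the graph as a dense symmetric weight matrix, computing volume as the sum of the weighted degrees of the nodes in a segment, and the function name are assumptions of this example.

```python
def normalized_cut(weights, in_a):
    """Normalized cut of the partition (A, B) of a weighted graph.

    weights: symmetric n x n list of lists of edge weights
    in_a:    in_a[i] is True if node i belongs to segment A
    Assumes both segments have at least one edge touching them.
    """
    n = len(weights)
    cut = sum(weights[i][j]
              for i in range(n) for j in range(n)
              if in_a[i] and not in_a[j])
    vol_a = sum(weights[i][j]
                for i in range(n) for j in range(n) if in_a[i])
    vol_b = sum(weights[i][j]
                for i in range(n) for j in range(n) if not in_a[i])
    return cut / vol_a + cut / vol_b


# Two tightly linked pairs (0,1) and (2,3) joined by weak links.
W = [
    [0.0, 1.0, 0.1, 0.0],
    [1.0, 0.0, 0.0, 0.1],
    [0.1, 0.0, 0.0, 1.0],
    [0.0, 0.1, 1.0, 0.0],
]
good_split = normalized_cut(W, [True, True, False, False])
bad_split = normalized_cut(W, [True, False, True, False])
```

Splitting along the weak links gives a much smaller value than splitting through the strong ones, which is the behavior the normalization is meant to reward.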
![Page 39: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/39.jpg)
Normalized cuts for segmentation
![Page 40: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/40.jpg)
Visual PageRank
• Determining importance by random walk
– What’s the probability that you will randomly walk to a given node?
• Create adjacency matrix based on visual similarity
• Edge weights determine probability of transition
Jing Baluja 2008
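The random-walk idea can be sketched with a small power iteration: row-normalize the similarity matrix into transition probabilities, then repeatedly propagate the visit probabilities. The damping factor of 0.85 (a uniform random jump, as in standard PageRank), the fixed iteration count, and the function name are assumptions of this example and are not taken from the slide.

```python
def random_walk_importance(similarity, damping=0.85, iters=100):
    """Approximate stationary visit probabilities of a damped random walk.

    similarity: n x n list of lists of non-negative edge weights
    """
    n = len(similarity)
    # Edge weights determine the probability of each transition.
    transition = []
    for row in similarity:
        s = sum(row)
        if s > 0:
            transition.append([w / s for w in row])
        else:
            # A node with no links jumps uniformly at random.
            transition.append([1.0 / n] * n)
    rank = [1.0 / n] * n
    for _ in range(iters):
        rank = [
            (1.0 - damping) / n
            + damping * sum(rank[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
    return rank


# Image 0 is visually similar to both other images.
S = [
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
]
importance = random_walk_importance(S)
```

The image that many others resemble collects the largest share of the walk, so it ranks as the most representative.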
![Page 41: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/41.jpg)
Which algorithm to use?
• Quantization/Summarization: K-means
– Aims to preserve variance of original data
– Can easily assign new point to a cluster
Quantization for computing histograms
Summary of 20,000 photos of Rome using “greedy k-means”
http://grail.cs.washington.edu/projects/canonview/
![Page 42: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/42.jpg)
Which algorithm to use?
• Image segmentation: agglomerative clustering
– More flexible with distance measures (e.g., can be based on boundary prediction)
– Adapts better to specific data
– Hierarchy can be useful
http://www.cs.berkeley.edu/~arbelaez/UCM.html
![Page 43: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/43.jpg)
Things to remember
• K-means useful for summarization, building dictionaries of patches, general clustering
• Agglomerative clustering useful for segmentation, general clustering
• Spectral clustering useful for determining relevance, summarization, segmentation
![Page 44: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/44.jpg)
Clustering
Key algorithm
• K-means
![Page 45: Machine Learning - Piazza](https://reader035.fdocuments.net/reader035/viewer/2022062303/62a5ea75b5fb8a1fa7749a8c/html5/thumbnails/45.jpg)