Lecture 7: Clustering


Transcript of Lecture 7: Clustering

Page 1: Lecture 7: Clustering

Machine Learning, Queens College

Lecture 7: Clustering

Page 2: Lecture 7: Clustering

Clustering

• Unsupervised technique.
• Task is to identify groups of similar entities in the available data.
  – And sometimes remember how these groups are defined for later use.

Pages 3-5: People are outstanding at this (image slides)

Pages 6-9: Dimensions of analysis (image slides)

Page 10: Lecture 7: Clustering

How would you do this computationally or algorithmically?

Page 11: Lecture 7: Clustering

Machine Learning Approaches

• Possible objective functions to optimize:
  – Maximum Likelihood / Maximum A Posteriori
  – Empirical Risk Minimization
  – Loss Function Minimization

What makes a good cluster?

Page 12: Lecture 7: Clustering

Cluster Evaluation

• Intrinsic Evaluation
  – Measure the compactness of the clusters,
  – or the similarity of data points that are assigned to the same cluster.
• Extrinsic Evaluation
  – Compare the results to some gold standard or labeled data.
  – Not covered today.

Page 13: Lecture 7: Clustering

Intrinsic Evaluation: Cluster Variability

• Intracluster Variability (IV)
  – How different are the data points within the same cluster?
• Extracluster Variability (EV)
  – How different are the data points in distinct clusters?
• Goal: Minimize IV while maximizing EV.
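The slides do not pin down formulas for IV and EV, but a simple pairwise version makes the goal concrete. The sketch below is illustrative only; the function name and the choice of mean squared Euclidean distance are assumptions of this example.

```python
import numpy as np

def cluster_variability(X, labels):
    """Rough IV/EV estimate: mean squared distance over pairs of points
    in the same cluster (IV) and over pairs in different clusters (EV).
    Illustrative definitions; the slides do not fix exact formulas."""
    same, diff = [], []
    n = len(X)
    for i in range(n):
        for j in range(i + 1, n):
            d = float(np.sum((X[i] - X[j]) ** 2))
            (same if labels[i] == labels[j] else diff).append(d)
    iv = np.mean(same) if same else 0.0
    ev = np.mean(diff) if diff else 0.0
    return iv, ev
```

A good clustering drives the first number down while keeping the second one large.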

Pages 14-15: Degenerate Clustering Solutions (image slides)

Page 16: Lecture 7: Clustering

Two approaches to clustering

• Hierarchical– Either merge or split clusters.

• Partitional– Divide the space into a fixed number of

regions and position their boundaries appropriately

16

Pages 17-28: Hierarchical Clustering (image slide sequence)

Pages 29-38: Agglomerative Clustering (image slide sequence)

Page 39: Lecture 7: Clustering

Dendrogram (image slide)
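The preceding slides step through the merge process in pictures; the dendrogram records that merge history as a tree. The sketch below is a naive bottom-up pass under stated assumptions: the function name, the single-linkage merge rule, and the squared Euclidean metric are all choices of this example, since the slides themselves show only images.

```python
import numpy as np

def agglomerative(X, k):
    """Naive single-linkage agglomerative clustering down to k clusters.
    Starts with every point in its own cluster and repeatedly merges the
    two clusters whose closest members are nearest each other."""
    clusters = [[i] for i in range(len(X))]
    merges = []  # merge history; this is what a dendrogram visualizes
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # single linkage: distance between the closest pair of members
                d = min(np.sum((X[i] - X[j]) ** 2)
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merges.append((list(clusters[a]), list(clusters[b])))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is still valid
    return clusters, merges
```

Each recorded merge corresponds to one join in the dendrogram; cutting the tree at a chosen level yields a flat clustering.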

Page 40: Lecture 7: Clustering

K-Means

• K-Means clustering is a partitional clustering algorithm.
  – Identify different partitions of the space for a fixed number of clusters.
  – Input: a value for K, the number of clusters.
  – Output: K cluster centroids.

Page 41: K-Means (image slide)

Page 42: Lecture 7: Clustering

K-Means Algorithm

• Given an integer K specifying the number of clusters.
• Initialize K cluster centroids, either by:
  – selecting K points from the data set at random, or
  – selecting K points from the space at random.
• For each point in the data set, assign it to the cluster centroid it is closest to.
• Update each centroid based on the points that are assigned to it.
• If any data point has changed clusters, repeat.
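These steps translate almost line for line into NumPy. The following is a minimal sketch, not the lecture's reference implementation; the function name and the use of the first initialization option (sampling K data points) are assumptions of the example.

```python
import numpy as np

def kmeans(X, k, seed=0):
    """Plain K-means following the steps above.
    X: (n, d) data array; returns (centroids, assignments)."""
    rng = np.random.default_rng(seed)
    # Initialize: select K points from the data set at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    assign = np.full(len(X), -1)
    while True:
        # Assign each point to its closest centroid (squared Euclidean).
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_assign = d2.argmin(axis=1)
        if np.array_equal(new_assign, assign):
            return centroids, assign  # no point changed clusters: converged
        assign = new_assign
        # Update each centroid from the points assigned to it.
        for c in range(k):
            if np.any(assign == c):  # guard against empty clusters
                centroids[c] = X[assign == c].mean(axis=0)
```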

Page 43: Lecture 7: Clustering

How does K-Means work?

• When an assignment is changed, the distance of the data point to its assigned centroid is reduced.
  – IV is lower.
• When a cluster centroid is moved, the mean distance from the centroid to its data points is reduced.
  – IV is lower.
• At convergence, we have found a local minimum of IV.
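In symbols, the quantity both steps decrease is the within-cluster sum of squared distances, a standard formalization of IV:

  J = \sum_{k=1}^{K} \sum_{x_i \in C_k} \lVert x_i - \mu_k \rVert^2

where C_k is the set of points assigned to cluster k and \mu_k is its centroid. Reassigning a point can only shrink its term, and moving \mu_k to the mean of C_k minimizes that cluster's inner sum, so J decreases monotonically and the algorithm settles at a local minimum.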

Pages 44-51: K-means Clustering (image slide sequence)

Page 52: Lecture 7: Clustering

Potential Problems with K-Means

• Optimal?
  – Is the K-means solution optimal?
  – K-means converges to a local minimum, but this is not guaranteed to be globally optimal.
• Consistent?
  – Different seed centroids can lead to different assignments.
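A common mitigation for both issues (standard practice, not something stated on the slide) is to run K-means from several seeds and keep the run with the lowest IV. A minimal sketch, reusing the hypothetical kmeans() from the Page 42 example:

```python
import numpy as np

def kmeans_restarts(X, k, n_restarts=10):
    """Run the earlier kmeans() sketch from several seeds and keep the
    lowest-IV result. Restarts reduce seed sensitivity but still give
    no guarantee of reaching the global optimum."""
    best_iv, best = np.inf, None
    for seed in range(n_restarts):
        centroids, assign = kmeans(X, k, seed=seed)
        # IV here: total squared distance of points to their centroids.
        iv = sum(float(np.sum((X[assign == c] - centroids[c]) ** 2))
                 for c in range(k))
        if iv < best_iv:
            best_iv, best = iv, (centroids, assign)
    return best
```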

Page 53: Suboptimal K-means Clustering (image slide)

Pages 54-57: Inconsistent K-means Clustering (image slide sequence)

Page 58: Lecture 7: Clustering

Soft K-means

• In K-means, each data point is forced to be a member of exactly one cluster.

• What if we relax this constraint?


Page 59: Lecture 7: Clustering

Soft K-means

• Still define a cluster by a centroid, but now calculate the centroid as a weighted center of all data points.
• Convergence is now based on a stopping threshold rather than on changed assignments.
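A minimal sketch of this idea follows. The particular weighting (a softmax over negative squared distances with a stiffness parameter beta) is an assumption of the example; the slide says only "weighted center."

```python
import numpy as np

def soft_kmeans(X, k, beta=1.0, tol=1e-6, seed=0):
    """Soft K-means: every point carries a responsibility for every
    cluster instead of a hard assignment. The softmax weighting and the
    stiffness beta are assumptions of this sketch."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    while True:
        # Responsibilities: softmax of negative squared distances
        # (shifted per row for numerical stability).
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        r = np.exp(-beta * (d2 - d2.min(axis=1, keepdims=True)))
        r /= r.sum(axis=1, keepdims=True)
        # Each centroid is the responsibility-weighted center of ALL points.
        new_centroids = (r.T @ X) / r.sum(axis=0)[:, None]
        # Converge on a stopping threshold, not on changed assignments.
        if np.max(np.abs(new_centroids - centroids)) < tol:
            return new_centroids, r
        centroids = new_centroids
```

As beta grows, the responsibilities approach hard 0/1 assignments and the procedure behaves like ordinary K-means.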

Page 60: Another problem for K-Means (image slide)

Page 61: Lecture 7: Clustering

Next Time

• Evaluation for Classification and Clustering
  – How do you know if you're doing well?