
Seeing the Big Picture: Segmenting Images to Create Data

15.071x – The Analytics Edge


Image Segmentation

15.071x – Seeing the Big Picture: Segmenting Images to Create Data 1

•  Divide digital images into salient regions/clusters corresponding to individual surfaces, objects, or natural parts of objects

•  Clusters should be uniform and homogeneous with respect to certain characteristics (color, intensity, texture)

•  Goal: Useful and analyzable image representation


Wide Applications


•  Medical Imaging
    •  Locate tissue classes, organs, pathologies and tumors
    •  Measure tissue/tumor volume

•  Object Detection
    •  Detect facial features in photos
    •  Detect pedestrians in surveillance video footage

•  Recognition tasks
    •  Fingerprint/Iris recognition


Various Methods


•  Clustering methods
    •  Partition the image into clusters based on differences in pixel colors, intensity or texture

•  Edge detection
    •  Based on the detection of discontinuity, such as an abrupt change in the gray level in gray-scale images

•  Region-growing methods
    •  Divide the image into regions, then sequentially merge sufficiently similar regions
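
The edge-detection idea (an abrupt change in gray level) can be illustrated on a single row of pixels. This is only a minimal sketch in Python, whereas the recitation itself works in R. The function name `find_edges`, the threshold of 0.3, and the sample row are all assumptions made for illustration; none of them come from the slides.

```python
def find_edges(row, threshold=0.3):
    """Return each index i where the gray level changes by more than
    `threshold` between pixel i and pixel i + 1 (hypothetical helper)."""
    return [
        i
        for i in range(len(row) - 1)
        if abs(row[i + 1] - row[i]) > threshold
    ]


# Made-up row of gray levels: dark, then a bright band, then dark again.
row = [0.1, 0.1, 0.2, 0.9, 0.9, 0.8, 0.1]
edges = find_edges(row)
print(edges)
```

Each reported index marks a boundary between two neighbouring pixels, so a bright band on a dark background yields one edge on each side.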


In this Recitation…


•  Review hierarchical and k-means clustering in R

•  Restrict ourselves to gray-scale images
    •  Simple example of a flower image (flower.csv)
    •  Medical imaging application with examples of transverse MRI images of the brain (healthy.csv and tumor.csv)

•  Compare the use, pros and cons of all analytics methods we have seen so far


Grayscale Images

•  Image is represented as a matrix of pixel intensity values ranging from 0 (black) to 1 (white)

•  At 8 bits per pixel (bpp), there are 256 gray levels

0 .2 .5 0 0 0 .2

.2 0 0 .5 .5 .2 0

0 .5 .7 0 0 0 .5

.5 0 1 .7 .7 .5 0

0 .5 .7 0 0 0 .5

.2 0 0 .5 .5 .2 0

0 .2 .5 0 0 0 .2

Image size: 7 pixels × 7 pixels
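
To connect the two bullets above, the following sketch maps 8-bit gray levels onto the 0 (black) to 1 (white) scale. It is written in Python for illustration (the recitation uses R). The helper name `to_unit_intensity` and the small sample image are assumptions, not part of the slides.

```python
def to_unit_intensity(level, bits=8):
    """Map an integer gray level to the range 0 (black) .. 1 (white)."""
    max_level = 2 ** bits - 1  # 255 when using 8 bits per pixel
    return level / max_level


levels = 2 ** 8  # number of distinct gray levels at 8 bpp

# Made-up 2 x 3 image stored as 8-bit integers.
image_8bit = [
    [0, 51, 128],
    [255, 0, 179],
]

# The same image as a matrix of intensities between 0 and 1.
image = [[to_unit_intensity(v) for v in row] for row in image_8bit]
print(levels)
print(image)
```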


Grayscale Image Segmentation

•  Cluster pixels according to their intensity values

(Figure: the 7 × 7 intensity matrix is unrolled into an intensity vector, shown against an intensity scale from 0 to 1 in steps of 0.1.)

•  Intensity vector of size n = 7 × 7

•  Pairwise distances: n(n-1)/2
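
The unrolling step and the pairwise-distance count can be checked numerically. The sketch below is in Python (the recitation uses R) and uses the 7 × 7 matrix from the Grayscale Images slide. Reading the matrix column by column and using the absolute difference as the distance between two intensities are assumptions made for this illustration.

```python
from itertools import combinations

# The 7 x 7 intensity matrix from the Grayscale Images slide.
image = [
    [0, .2, .5, 0, 0, 0, .2],
    [.2, 0, 0, .5, .5, .2, 0],
    [0, .5, .7, 0, 0, 0, .5],
    [.5, 0, 1, .7, .7, .5, 0],
    [0, .5, .7, 0, 0, 0, .5],
    [.2, 0, 0, .5, .5, .2, 0],
    [0, .2, .5, 0, 0, 0, .2],
]

# Unroll the matrix into one intensity vector (column by column here;
# the order does not affect which pairs exist).
vector = [image[r][c] for c in range(7) for r in range(7)]

n = len(vector)               # 7 * 7 = 49 pixels
num_pairs = n * (n - 1) // 2  # 49 * 48 / 2 = 1176 pairwise distances

# One distance per unordered pair of pixels.
distances = [abs(a - b) for a, b in combinations(vector, 2)]
print(n, num_pairs, len(distances))
```

Because the number of pairs grows roughly with the square of n, storing all of them becomes expensive for larger images.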


Dendrogram Example

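(Figure: dendrogram of six data observations, A through F, drawn across Level 1 to Level 5. Each node is a cluster.)

•  ABCDEF splits into A and BCDEF
•  BCDEF splits into BC and DEF
•  DEF splits into DE and F
•  BC splits into B and C; DE splits into D and E

Figure labels: Data Observations; Cluster; Distance between clusters BCDEF and BC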


Dendrogram Example

(Figure: the dendrogram over observations A through F, cut at three different heights. The clustering becomes coarser as the number of clusters decreases.)

•  A, BC, DE, F: 4 clusters
•  A, BC, DEF: 3 clusters
•  A, BCDEF: 2 clusters
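
Cutting a dendrogram amounts to replaying its merges from the bottom up and stopping once the desired number of clusters remains. The Python sketch below (the recitation uses R) encodes the example tree as a list of merges. The function name `cut_dendrogram` is hypothetical, and the relative order of the first two merges (B with C, D with E) is an assumption, since the slides do not say which happens first. That order does not affect the cuts shown.

```python
# Merges of the example dendrogram, listed from the bottom up.
# The order of the first two entries is assumed.
MERGES = [
    ("B", "C"),
    ("D", "E"),
    ("DE", "F"),
    ("BC", "DEF"),
    ("A", "BCDEF"),
]


def cut_dendrogram(leaves, merges, k):
    """Replay merges until only k clusters remain, then return them."""
    clusters = list(leaves)
    for left, right in merges:
        if len(clusters) <= k:
            break
        clusters.remove(left)
        clusters.remove(right)
        clusters.append("".join(sorted(left + right)))
    return sorted(clusters)


leaves = ["A", "B", "C", "D", "E", "F"]
for k in (4, 3, 2):
    print(k, cut_dendrogram(leaves, MERGES, k))
```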


Flower Dendrogram

15.071x – Predictive Diagnosis: Discovering Patterns for Disease Detection 4

(Figure: dendrogram of the flower image, with cuts marked for 4 clusters, 3 clusters, and 2 clusters.)
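
A dendrogram like this comes from agglomerative (bottom-up) hierarchical clustering of the pixel intensities. The following is a minimal Python sketch of that process, not the R code used in the recitation. Complete linkage, the function name `agglomerate`, and the short intensity vector are assumptions chosen to keep the example small.

```python
def agglomerate(values, k):
    """Merge the two closest clusters repeatedly until k clusters remain.

    Distance between clusters is complete linkage (the largest distance
    between any two members), which is an assumption of this sketch.
    """
    clusters = [[v] for v in values]  # start with one cluster per value
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = max(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge cluster j into i
        del clusters[j]
    return sorted(sorted(c) for c in clusters)


# Made-up intensities with three visible groups: dark, mid-gray, bright.
intensities = [0.0, 0.1, 0.5, 0.55, 0.9, 1.0]
print(agglomerate(intensities, 3))
print(agglomerate(intensities, 2))
```

The nested loops compare every pair of clusters at every step, which is one reason the method becomes slow on large images.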


k-Means Clustering

•  k-means clustering aims to partition the data into k clusters, such that each data point belongs to the cluster whose mean is nearest

k-Means Clustering Algorithm

1. Specify desired number of clusters k

2. Randomly assign each data point to a cluster

3. Compute cluster centroids

4. Re-assign each point to the closest cluster centroid

5. Re-compute cluster centroids

6. Repeat 4 and 5 until no improvement is made
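
The six steps above can be sketched for one-dimensional data such as pixel intensities. This is a minimal Python illustration, not the R `kmeans` call used in the recitation. The function name `kmeans_1d`, its parameters, and the sample points are assumptions. The example passes a fixed starting assignment instead of a random one so that the run is reproducible.

```python
import random


def kmeans_1d(points, k, assignment=None, seed=0, max_iter=100):
    """Hypothetical 1-D k-means following the six steps on the slide."""
    rng = random.Random(seed)
    # Step 2: random initial assignment, unless one is supplied.
    if assignment is None:
        assignment = [rng.randrange(k) for _ in points]
    centroids = [0.0] * k
    for _ in range(max_iter):
        # Steps 3 and 5: compute each cluster centroid (the mean).
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = sum(members) / len(members)
        # Step 4: re-assign each point to the closest centroid.
        new_assignment = [
            min(range(k), key=lambda c: abs(p - centroids[c]))
            for p in points
        ]
        # Step 6: stop once the assignment no longer changes.
        if new_assignment == assignment:
            break
        assignment = new_assignment
    return assignment, centroids


# Made-up intensities: three dark pixels and three bright pixels.
points = [0.0, 0.1, 0.2, 0.8, 0.9, 1.0]
labels, centers = kmeans_1d(points, k=2, assignment=[0, 1, 0, 1, 0, 1])
print(labels, centers)
```

Starting from a deliberately mixed assignment, one re-assignment pass separates the dark pixels from the bright ones, and the next pass confirms that nothing changes.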

(Figure sequence: each step of the algorithm illustrated in turn on an example with k = 2.)


Segmented MRI Images


T2 Weighted MRI Images


First Taste of a Fascinating Field

•  MRI image segmentation is a subject of ongoing research

•  k-means is a good starting point, but not enough
    •  Advanced clustering techniques such as the modified fuzzy k-means (MFCM) clustering technique
    •  Packages in R specialized for medical image analysis: http://cran.r-project.org/web/views/MedicalImaging.html


3D Reconstruction

•  3D reconstruction from 2D cross-sectional MRI images

•  Important in the medical field for diagnosis, surgical planning and biological research


Comparison of Methods

Linear Regression
•  Used for: Predicting a continuous outcome (salary, price, number of votes, etc.)
•  Pros: Simple, well recognized; works on small and large datasets
•  Cons: Assumes a linear relationship

Logistic Regression
•  Used for: Predicting a categorical outcome (Yes/No, Sell/Buy, Accept/Reject, etc.)
•  Pros: Computes probabilities that can be used to assess confidence of the prediction
•  Cons: Assumes a linear relationship

(Slide annotation: Y = a log(X) + b)


Comparison of Methods

CART
•  Used for: Predicting a categorical outcome (quality rating 1–5, Buy/Sell/Hold) or a continuous outcome (salary, price, etc.)
•  Pros: Can handle datasets without a linear relationship; easy to explain and interpret
•  Cons: May not work well with small datasets

Random Forests
•  Used for: Same as CART
•  Pros: Can improve accuracy over CART
•  Cons: Many parameters to adjust; not as easy to explain as CART


Comparison of Methods

Hierarchical Clustering
•  Used for: Finding similar groups; clustering into smaller groups and applying predictive methods on groups
•  Pros: No need to select number of clusters a priori; visualize with a dendrogram
•  Cons: Hard to use with large datasets

k-means Clustering
•  Used for: Same as Hierarchical Clustering
•  Pros: Works with any dataset size
•  Cons: Need to select number of clusters before running the algorithm