Words & Pictures


Transcript of Words & Pictures

Page 1: Words & Pictures

Words & Pictures

Clustering and Bag of Words Representations

Many slides adapted from Svetlana Lazebnik, Fei-Fei Li, Rob Fergus, and Antonio Torralba

Page 2: Words & Pictures

Announcements

• HW1 due Thurs, Sept 27 @ 12pm
– By email to [email protected]. No need to include shopping image set.
– Write-up can be webpage or pdf.

Page 3: Words & Pictures

Document Vectors

· Represent document as a “bag of words”

Page 4: Words & Pictures

Origin: Bag-of-words models

• Orderless document representation: frequencies of words from a dictionary Salton & McGill (1983)

Page 5: Words & Pictures

Origin: Bag-of-words models

US Presidential Speeches Tag Cloud
http://chir.ag/phernalia/preztags/

• Orderless document representation: frequencies of words from a dictionary Salton & McGill (1983)
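As a minimal sketch of the orderless representation described above, the snippet below counts how often each dictionary word occurs in a document. The function name `bag_of_words`, the tokenizer, and the tiny five-word dictionary are illustrative assumptions, not part of the original slides.

```python
import re
from collections import Counter

def bag_of_words(text, dictionary):
    """Return the frequency of each dictionary word in `text` (word order is ignored)."""
    # Assumed tokenizer: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    # Counter returns 0 for words that never occur.
    return [counts[word] for word in dictionary]

# Hypothetical dictionary and document, chosen only for illustration.
dictionary = ["bush", "garden", "rose", "statement", "tennis"]
doc = "President Bush makes a statement in the Rose Garden; Bush spoke briefly."
vector = bag_of_words(doc, dictionary)
print(vector)  # [2, 1, 1, 1, 0]
```

Because only counts are kept, two documents containing the same words in a different order map to the same vector.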

Page 6: Words & Pictures


Page 7: Words & Pictures


Page 8: Words & Pictures

Bag-of-features models

Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba

Page 9: Words & Pictures

Bags of features for image classification

1. Extract features

Page 10: Words & Pictures

1. Extract features
2. Learn “visual vocabulary”

Bags of features for image classification

Page 11: Words & Pictures

1. Extract features
2. Learn “visual vocabulary”
3. Quantize features using visual vocabulary

Bags of features for image classification

Page 12: Words & Pictures

1. Extract features
2. Learn “visual vocabulary”
3. Quantize features using visual vocabulary
4. Represent images by frequencies of “visual words”

Bags of features for image classification

Page 13: Words & Pictures

• Regular grid
– Vogel & Schiele, 2003
– Fei-Fei & Perona, 2005

1. Feature extraction

Page 14: Words & Pictures

• Regular grid
– Vogel & Schiele, 2003
– Fei-Fei & Perona, 2005

• Interest point detector
– Csurka et al. 2004
– Fei-Fei & Perona, 2005
– Sivic et al. 2005

1. Feature extraction

Page 15: Words & Pictures

• Regular grid
– Vogel & Schiele, 2003
– Fei-Fei & Perona, 2005

• Interest point detector
– Csurka et al. 2004
– Fei-Fei & Perona, 2005
– Sivic et al. 2005

• Other methods
– Random sampling (Vidal-Naquet & Ullman, 2002)
– Segmentation-based patches (Barnard et al. 2003)

1. Feature extraction
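The regular-grid option listed above can be sketched as follows. This is a rough illustration only: the image is assumed to be a 2-D list of grayscale values, raw pixel values stand in for a real descriptor, and the names `grid_patches`, `patch_size`, and `stride` are hypothetical.

```python
def grid_patches(image, patch_size, stride):
    """Collect (row, col, flattened_patch) tuples sampled on a regular grid.

    `image` is assumed to be a non-empty 2-D list of grayscale values.
    Patches that would extend past the image border are skipped.
    """
    height, width = len(image), len(image[0])
    patches = []
    for r in range(0, height - patch_size + 1, stride):
        for c in range(0, width - patch_size + 1, stride):
            # Flatten the patch row by row into a single feature vector.
            patch = [image[r + dr][c + dc]
                     for dr in range(patch_size)
                     for dc in range(patch_size)]
            patches.append((r, c, patch))
    return patches

# Synthetic 6x6 "image" whose pixel value encodes its position.
image = [[row * 6 + col for col in range(6)] for row in range(6)]
patches = grid_patches(image, patch_size=2, stride=2)
print(len(patches), patches[0])
```

A stride smaller than the patch size gives overlapping patches, which trades more features per image for more computation.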

Page 16: Words & Pictures

1. Feature extraction

Page 17: Words & Pictures

2. Learning the visual vocabulary

Page 18: Words & Pictures

2. Learning the visual vocabulary

Clustering

Slide credit: Josef Sivic

Page 19: Words & Pictures

2. Learning the visual vocabulary

Clustering

Slide credit: Josef Sivic

Visual vocabulary

Page 20: Words & Pictures

Clustering

– The assignment of objects into groups (called clusters) so that objects from the same cluster are more similar to each other than objects from different clusters.

– Often similarity is assessed according to a distance measure.

– Clustering is a common technique for statistical data analysis, which is used in many fields, including machine learning, data mining, pattern recognition, image analysis and bioinformatics.

Page 21: Words & Pictures
Page 22: Words & Pictures
Page 23: Words & Pictures

Any of the similarity metrics we talked about before (SSD, angle between vectors)
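The two measures named above, sum of squared differences (SSD) and the angle between vectors, can be written in a few lines. This is a sketch assuming features are equal-length, non-zero lists of numbers; the function names are illustrative.

```python
import math

def ssd(a, b):
    """Sum of squared differences: 0 for identical vectors, larger means less similar."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def angle_between(a, b):
    """Angle in radians between two non-zero vectors: 0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cosine)

print(ssd([1, 2, 3], [1, 2, 5]))        # 4
print(angle_between([1, 0], [0, 1]))    # a right angle, about pi / 2
```

SSD is sensitive to vector magnitude, while the angle ignores it, so the two can rank the same pair of features differently.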

Page 24: Words & Pictures

Feature Clustering

Clustering is the process of grouping a set of features into clusters of similar features.

Features within a cluster should be similar.

Features from different clusters should be dissimilar.

Page 25: Words & Pictures

source: Dan Klein

Page 26: Words & Pictures

K-means clustering

• Want to minimize sum of squared Euclidean distances between points x_i and their nearest cluster centers m_k

$$D(X, M) = \sum_{\text{cluster } k} \; \sum_{\text{point } i \text{ in cluster } k} (x_i - m_k)^2$$

source: Svetlana Lazebnik
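One common way to reduce this objective is the standard alternating scheme (Lloyd's algorithm): assign each point to its nearest center, then move each center to the mean of its points. The sketch below is a minimal version under assumptions not stated on the slide: random initialization from the data points, a fixed iteration count, and illustrative names (`kmeans`, `objective`).

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_center(point, centers):
    """Index of the center closest to `point`."""
    return min(range(len(centers)), key=lambda k: squared_distance(point, centers[k]))

def kmeans(points, k, iterations=20, seed=0):
    rng = random.Random(seed)
    # Assumed initialization: k distinct data points chosen at random.
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        assignments = [nearest_center(p, centers) for p in points]
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # keep the old center if a cluster ends up empty
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    assignments = [nearest_center(p, centers) for p in points]
    return centers, assignments

def objective(points, centers, assignments):
    """Sum of squared distances between points and their assigned centers."""
    return sum(squared_distance(p, centers[a]) for p, a in zip(points, assignments))

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, assignments = kmeans(points, k=2)
print(centers, assignments, objective(points, centers, assignments))
```

Neither step can increase the objective, but the result is only a local minimum in general and depends on the initialization.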

Page 27: Words & Pictures


Page 28: Words & Pictures
Page 29: Words & Pictures
Page 30: Words & Pictures
Page 31: Words & Pictures
Page 32: Words & Pictures
Page 33: Words & Pictures
Page 34: Words & Pictures
Page 35: Words & Pictures
Page 36: Words & Pictures
Page 37: Words & Pictures
Page 38: Words & Pictures
Page 39: Words & Pictures

source: Dan Klein

Page 40: Words & Pictures

source: Dan Klein

Page 41: Words & Pictures

Source: Hinrich Schutze

Page 42: Words & Pictures

Source: Hinrich Schutze

Page 43: Words & Pictures

Hierarchical clustering strategies

• Agglomerative clustering
– Start with each point in a separate cluster
– At each iteration, merge two of the “closest” clusters

• Divisive clustering
– Start with all points grouped into a single cluster
– At each iteration, split the “largest” cluster

source: Svetlana Lazebnik
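The agglomerative strategy above can be sketched directly. The slide leaves “closest” open; this sketch assumes single-link distance (the smallest distance between any pair of points from the two clusters), which is only one of several possible choices. The function name and stopping rule (`num_clusters`) are also assumptions.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def agglomerative(points, num_clusters):
    """Merge the two closest clusters until `num_clusters` remain."""
    if num_clusters < 1:
        raise ValueError("num_clusters must be at least 1")
    # Start with each point in a separate cluster.
    clusters = [[p] for p in points]
    while len(clusters) > num_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single-link distance (an assumption): closest pair across clusters.
                d = min(squared_distance(a, b)
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # Merge cluster j into cluster i; j > i, so index i stays valid.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

points = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
three = agglomerative(points, num_clusters=3)
two = agglomerative(points, num_clusters=2)
print(three)
print(two)
```

Recording the sequence of merges instead of stopping at a fixed count would give the full hierarchy, so the number of clusters can be chosen afterwards.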

Page 44: Words & Pictures

source: Dan Klein

Page 45: Words & Pictures

source: Dan Klein

Page 46: Words & Pictures

Divisive Clustering

• Top-down (instead of bottom-up as in Agglomerative Clustering)

• Start with all docs in one big cluster
• Then recursively split clusters
• Eventually each node forms a cluster on its own.

Source: Hinrich Schutze

Page 47: Words & Pictures

Flat or hierarchical clustering?

• For high efficiency, use flat clustering (e.g. k-means)
• For deterministic results: hierarchical clustering
• When a hierarchical structure is desired: hierarchical algorithm
• Hierarchical clustering can also be applied if K cannot be predetermined (can start without knowing K)

Source: Hinrich Schutze

Page 48: Words & Pictures

2. Learning the visual vocabulary

Clustering

Slide credit: Josef Sivic

Page 49: Words & Pictures

2. Learning the visual vocabulary

Clustering

Slide credit: Josef Sivic

Visual vocabulary

Page 50: Words & Pictures

From clustering to vector quantization

• Clustering is a common method for learning a visual vocabulary or codebook
– Unsupervised learning process
– Each cluster center produced by k-means becomes a codebook entry
– Codebook can be learned on separate training set
– Provided the training set is sufficiently representative, the codebook will be “universal”

• The codebook is used for quantizing features
– A vector quantizer takes a feature vector and maps it to the index of the nearest entry in the codebook
– Codebook = visual vocabulary
– Codebook entry = visual word
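The vector quantizer described above amounts to a nearest-neighbor lookup. A minimal sketch, assuming squared Euclidean distance and a hand-written three-entry codebook that stands in for learned cluster centers:

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def quantize(feature, codebook):
    """Map a feature vector to the index of the nearest codebook entry."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(feature, codebook[k]))

# Hypothetical codebook; in practice each entry would be a cluster center.
codebook = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
print(quantize((1, 2), codebook))  # 0
print(quantize((9, 8), codebook))  # 1
print(quantize((2, 9), codebook))  # 2
```

The linear scan costs one distance computation per codebook entry, which is why large vocabularies motivate faster structures such as trees.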

Page 51: Words & Pictures

Example visual vocabulary

Fei-Fei et al. 2005

Page 52: Words & Pictures

Visual vocabularies: Issues

• How to choose vocabulary size?
– Too small: visual words not representative of all patches
– Too large: quantization artifacts, overfitting

• Computational efficiency
– Vocabulary trees (Nister & Stewenius, 2006)

Page 53: Words & Pictures

3. Image representation

[Figure: histogram of codeword frequencies for an image; x-axis: codewords, y-axis: frequency]
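The image representation here is a histogram over codewords. As a sketch, the function below quantizes every feature of one image and counts how often each codebook entry is used. The codebook, the feature values, and the choice to normalize counts into relative frequencies are assumptions made for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def bag_of_features(features, codebook, normalize=True):
    """Histogram of visual-word occurrences for one image's features."""
    counts = [0] * len(codebook)
    for feature in features:
        # Quantize: index of the nearest codebook entry.
        nearest = min(range(len(codebook)),
                      key=lambda k: squared_distance(feature, codebook[k]))
        counts[nearest] += 1
    if normalize and features:
        # Relative frequencies, so images with different feature counts are comparable.
        return [c / len(features) for c in counts]
    return counts

# Hypothetical codebook and features for a single image.
codebook = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
features = [(1, 1), (0, 2), (9, 9), (1, 9)]
histogram = bag_of_features(features, codebook)
print(histogram)  # [0.5, 0.25, 0.25]
```

The resulting fixed-length vector is what a classifier would consume, regardless of how many features the image produced.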

Page 54: Words & Pictures

Image classification (next)

• Given the bag-of-features representations of images from different classes, how do we learn a model for distinguishing them?

Page 55: Words & Pictures

Clustering in Action

Page 56: Words & Pictures

President George W. Bush makes a statement in the Rose Garden while Secretary of Defense Donald Rumsfeld looks on, July 23, 2003. Rumsfeld said the United States would release graphic photographs of the dead sons of Saddam Hussein to prove they were killed by American troops. Photo by Larry Downing/Reuters

Names and Faces

Who’s in the picture?
T.L. Berg, A.C. Berg, J. Edwards, D.A. Forsyth

Page 57: Words & Pictures

Intuition

George Bush

Page 58: Words & Pictures

500k News Corpora

Producer and director Bruce Paltrow has died at the age of 58 in Rome, Italy, the U.S. Consulate said on October 3, 2002. Paltrow had suffered from throat cancer for several years, but the cause of his death was not immediately known. He is seen with his daughter actress Gwyneth Paltrow after the Academy Awards in Los Angles in March 21, 1999 file photo. (Fred Prouser/Reuters)

Actress Winona Ryder (news) reacts to remarks by prosecutor Ann Rundle during the sentencing hearing in her felony shoplifting case Friday, Dec. 6, 2002 at the Beverly Hills, Calif., courthouse. At right is Ryder's attorney Mark Geragos. Ryder was sentenced to three years of probation and was ordered to perform 480 hours of community service. (APPhoto/Steve Grayson, POOL)

Page 59: Words & Pictures

President George W. Bush makes a statement in the Rose Garden while Secretary of Defense Donald Rumsfeld looks on, July 23, 2003. Rumsfeld said the United States would release graphic photographs of the dead sons of Saddam Hussein to prove they were killed by American troops. Photo by Larry Downing/Reuters

Name & Face Extraction

Detected Faces

Page 60: Words & Pictures

President George W. Bush makes a statement in the Rose Garden while Secretary of Defense Donald Rumsfeld looks on, July 23, 2003. Rumsfeld said the United States would release graphic photographs of the dead sons of Saddam Hussein to prove they were killed by American troops. Photo by Larry Downing/Reuters

Name & Face Extraction

Detected Names: President George W. Bush, Defense Donald Rumsfeld, Saddam Hussein.

Detected Faces

Page 61: Words & Pictures

Each name in the dataset is a potential cluster. Want to simultaneously:

1.) Learn image model for each person.
2.) Learn depiction model across names.

Achieve both of these by considering a big assignment (clustering) problem.

Goal

Page 62: Words & Pictures

Assignment Problem

Page 63: Words & Pictures

Language indicates Depiction

President George W. Bush makes a statement in the Rose Garden while Secretary of Defense Donald Rumsfeld looks on, July 23, 2003. Rumsfeld said the United States would release graphic photographs of the dead sons of Saddam Hussein to prove they were killed by American troops. Photo by Larry Downing/Reuters

Cues - POS tags before and after name, location in caption, distance to closest: ( ) (L) (C) (R) left right center shown pictured above

P(Depicted | Context)

Yes/No multiple independent cues

Page 64: Words & Pictures

1.) Update assignments

2.) Update:
– appearance model for each person.
– language model of depiction across names.

Iterate 1-2

Method
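The alternation above can be illustrated with a heavily simplified toy. This is not the authors' implementation: it assumes each face is a short feature vector, the appearance model for a person is just the mean of the faces currently assigned to that name, every face must take one of the names from its own caption, and the language model of depiction is omitted entirely. All names and data below are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(c) / len(vectors) for c in zip(*vectors)]

def assign_names(faces, iterations=10):
    """faces: list of (feature_vector, candidate_names_from_caption)."""
    # Assumed initialization: each face takes the first name in its caption.
    assignments = [candidates[0] for _, candidates in faces]
    for _ in range(iterations):
        # Update appearance models: mean face vector per currently used name.
        models = {
            name: mean([f for (f, _), a in zip(faces, assignments) if a == name])
            for name in set(assignments)
        }
        # Update assignments: nearest model among the names in the face's caption.
        updated = []
        for feature, candidates in faces:
            modelled = [n for n in candidates if n in models]
            updated.append(min(modelled,
                               key=lambda n: squared_distance(feature, models[n])))
        if updated == assignments:
            break  # converged
        assignments = updated
    return assignments

# Toy data: two-dimensional "face features" with ambiguous captions.
faces = [
    ((0.0, 0.0), ["George Bush"]),
    ((5.0, 5.0), ["Donald Rumsfeld"]),
    ((0.2, 0.1), ["Donald Rumsfeld", "George Bush"]),
    ((4.9, 5.2), ["George Bush", "Donald Rumsfeld"]),
]
result = assign_names(faces)
print(result)
```

Faces with a single candidate name anchor the models, which is what lets the ambiguous faces be pulled toward the right person over the iterations.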

Page 65: Words & Pictures

Results

British director Sam Mendes and his partner actress Kate Winslet arrive at the London premiere of 'The Road to Perdition', September 18, 2002. The films stars Tom Hanks as a Chicago hit man who has a separate family life and co-stars Paul Newman and Jude Law. REUTERS/Dan Chung

World number one Lleyton Hewitt of Australia hits a return to Nicolas Massu of Chile at the Japan Open tennis championships in Tokyo October 3, 2002. REUTERS/Eriko Sugita

Page 66: Words & Pictures

US President George W. Bush (L) makes remarks while Secretary of State Colin Powell (R) listens before signing the US Leadership Against HIV /AIDS , Tuberculosis and Malaria Act of 2003 at the Department of State in Washington, DC. The five-year plan is designed to help prevent and treat AIDS, especially in more than a dozen African and Caribbean nations(AFP/Luke Frazza)

German supermodel Claudia Schiffer gave birth to a baby boy by Caesarian section January 30, 2003, her spokeswoman said. The baby is the first child for both Schiffer, 32, and her husband, British film producer Matthew Vaughn, who was at her side for the birth. Schiffer is seen on the German television show 'Bet It...?!' ('Wetten Dass...?!') in Braunschweig, on January 26, 2002. (Alexandra Winkler/Reuters)

Results

Page 67: Words & Pictures

Results

Without – CEO Summit
With – Martha Stewart

Without – James Bond
With – Pierce Brosnan

Without – Dick Cheney
With – George W. Bush

Model                          Accuracy of labeling
Vision model, No Lang model    67%
Vision model + Lang model      78%

Page 68: Words & Pictures

Face Dictionary

http://tamaraberg.com/faces/faceDict/NIPSdict/index.html

Page 69: Words & Pictures

Results - Depiction

Classifier                 % correct
Baseline (all pictured)    67%
Learned Lang Model         86%

IN - pictured, OUT - not pictured