1
ImageNet: A Large-Scale Hierarchical Image Database
Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li and Li Fei-Fei
Dept. of Computer Science, Princeton University, USA
CVPR 2009
3
Datasets in Computer Vision
UIUC Cars (2004) S. Agarwal, A. Awan, D. Roth
FERET Faces (1998) P. Phillips, H. Wechsler, J. Huang, P. Rauss
CMU/VASC Faces (1998) H. Rowley, S. Baluja, T. Kanade
MNIST digits (1998-10) Y. LeCun & C. Cortes
KTH human action (2004) I. Laptev & B. Caputo
Sign Language (2008) P. Buehler, M. Everingham, A. Zisserman
Segmentation (2001) D. Martin, C. Fowlkes, D. Tal, J. Malik
COIL Objects (1996) S. Nene, S. Nayar, H. Murase
4
WordNet
• WordNet is a large lexical database of English. Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept.
• WordNet as an ontology
5
ImageNet
• Image database organized according to the WordNet hierarchy, in which each node of the hierarchy is depicted by hundreds and thousands of images.
• Knowledge ontology: Taxonomy, Partonomy
6
Collect Candidate Images
• For each synset, the queries are the set of WordNet synonyms
• Accuracy of Internet image search results: 10%
– For 500-1000 clean images, need 10K images
• Query expansion
– Synonyms: German shepherd, German police dog, German shepherd dog, Alsatian
– Appending words from ancestors: sheepdog, dog
• Multiple languages
– Italian, Dutch, Spanish, Chinese
• More search engines
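A minimal sketch of the query expansion and candidate-count arithmetic on this slide. The function names (expand_queries, candidates_needed) and the rule for skipping redundant ancestor words are illustrative assumptions, not taken from the talk.

```python
from itertools import product


def expand_queries(synonyms, ancestor_words):
    """Return the synonyms plus each synonym with one ancestor word appended.

    Hypothetical helper: the talk only says that ancestor words are appended.
    """
    queries = list(synonyms)
    for synonym, ancestor in product(synonyms, ancestor_words):
        # Assumption: skip pairs where the ancestor word is already present,
        # e.g. avoid "German police dog dog".
        if ancestor.lower() not in synonym.lower().split():
            queries.append(f"{synonym} {ancestor}")
    # Remove duplicates while keeping the original order.
    return list(dict.fromkeys(queries))


def candidates_needed(clean_target, accuracy_percent):
    """Candidate images to download to expect `clean_target` clean ones."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-clean_target * 100 // accuracy_percent)


synonyms = ["German shepherd", "German police dog", "German shepherd dog", "Alsatian"]
ancestors = ["sheepdog", "dog"]
queries = expand_queries(synonyms, ancestors)

# With 10% search accuracy, 1000 clean images need about 10K candidates.
needed = candidates_needed(1000, 10)
```

The same expanded queries could then be translated into other languages or sent to additional search engines, as the slide lists.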
7
Clean Candidate Images
• Rely on humans to verify each candidate image for a given synset
• 19 years’ work
• No graduate students would want to do this project
• Amazon Mechanical Turk (AMT)
– 300 images: 0.02 dollars
– 14,197,122 images: 946 dollars
– 10 repetitions: 9,460 dollars
• July 2008 - April 2010: 11 million images, 15,000+ synsets
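The AMT cost figures follow from the per-task price. A small sketch of the arithmetic, assuming the cost scales linearly with the number of images and repetitions (the constant and function names are illustrative):

```python
IMAGES_PER_TASK = 300     # images covered by one payment (from the slide)
PRICE_PER_TASK = 0.02     # dollars per 300 images (from the slide)
TOTAL_IMAGES = 14_197_122
REPETITIONS = 10


def labeling_cost(n_images, repetitions=1):
    """Estimated cost in dollars, assuming purely linear pricing."""
    return n_images / IMAGES_PER_TASK * PRICE_PER_TASK * repetitions


single_pass = labeling_cost(TOTAL_IMAGES)
ten_passes = labeling_cost(TOTAL_IMAGES, REPETITIONS)

print(f"one pass:   about {int(single_pass)} dollars")
print(f"ten passes: about {int(ten_passes)} dollars")
```

The single-pass estimate truncates to the slide's 946 dollars; the slide's 9,460 dollars appears to be that rounded figure multiplied by 10.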
8
HIT Design
• HIT (Human Intelligence Task)
• Application
• Qualification Test
• Start tasks
– Learn about the keyword: Wiki, Google
– Definition quiz: multiple-choice question about the keyword
– Choose images that fit the keyword (Yes or No)
– Pass cheating detection
• Feedback
9
Quality Control System
• Human users make mistakes
• Not all users follow the instructions
• Users do not always agree with each other
– Subtle or confusing synsets, e.g. Burmese cat
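One simple way to handle disagreement between users is to collect several Yes/No votes per image and accept a label only when enough of them agree. The sketch below uses a fixed agreement threshold, which is an assumption for illustration and not necessarily the rule used in the ImageNet pipeline; the function name and the 0.7 default are hypothetical.

```python
def consensus(votes, min_agreement=0.7):
    """Aggregate Yes/No votes (booleans) for one image and one synset.

    Returns True or False when at least `min_agreement` of the voters agree,
    and None when the image stays undecided and needs more votes.
    """
    if not votes:
        return None
    yes_fraction = sum(votes) / len(votes)
    if yes_fraction >= min_agreement:
        return True
    if 1 - yes_fraction >= min_agreement:
        return False
    return None


clear_yes = consensus([True] * 8 + [False] * 2)   # strong agreement on Yes
clear_no = consensus([True] * 2 + [False] * 8)    # strong agreement on No
undecided = consensus([True] * 5 + [False] * 5)   # e.g. a confusing synset
```

For subtle synsets such as Burmese cat, a stricter threshold or more votes per image would be the natural knobs to adjust.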
10
Properties of ImageNet
• Scale
– 14,197,122 images, 21,841 synsets indexed
• Hierarchy
– Densely populated semantic hierarchy
12
ImageNet Applications
• Non-parametric Object Recognition
– NN-voting + noisy ImageNet
– NN-voting + clean ImageNet
– Naive Bayesian Nearest Neighbor (NBNN)
– NBNN-100
• Tree Based Image Classification
• Automatic Object Localization
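A toy sketch of nearest-neighbor voting, the idea behind the NN-voting entries above. The 2-D feature vectors, the Euclidean distance, and the value of k are assumptions chosen for illustration; the slide does not specify the image representation used.

```python
import math
from collections import Counter


def nn_vote(query, labeled_features, k=3):
    """Label `query` by majority vote among its k nearest labeled examples.

    `labeled_features` is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(
        labeled_features,
        key=lambda item: math.dist(query, item[0]),  # Euclidean distance
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data standing in for image features.
examples = [
    ((0.0, 0.0), "cat"), ((0.2, 0.1), "cat"), ((0.1, 0.3), "cat"),
    ((5.0, 5.0), "dog"), ((5.2, 4.9), "dog"), ((4.8, 5.1), "dog"),
]

near_cats = nn_vote((0.5, 0.5), examples)
near_dogs = nn_vote((4.5, 4.5), examples)
```

With a noisy database some of the neighbors carry wrong labels, which is why the slide distinguishes the noisy and clean ImageNet variants.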