Semantics in Digital Photos: A Contextual Analysis

Semantics in Digital Photos: A Contextual Analysis
Author / Pinaki Sinha, Ramesh Jain
Conference / The IEEE International Conference on Semantic Computing, 2008, pp. 58–65
Presenter / Meng-Lun, Wu

Description

Interpreting the semantics of an image is a hard problem. However, for storing and indexing large multimedia collections, it is essential to build systems that can automatically extract semantics from images. In this research we show how we can fuse content and context to extract semantics from digital photographs. Our experiments show that if we can properly model the context associated with media, we can interpret semantics using only a part of the high-dimensional content data.

Transcript of Semantics in Digital Photos: A Contextual Analysis

Page 1: Semantics in Digital Photos: A Contextual Analysis

Semantics in Digital Photos: A Contextual Analysis

Author / Pinaki Sinha, Ramesh Jain

Conference / The IEEE International Conference on Semantic Computing, 2008, pp. 58–65

Presenter / Meng-Lun, Wu

Page 2: Semantics in Digital Photos: A Contextual Analysis

Outline

Introduction
Related Work
The Optical Context Layer
Photo Clustering
Photo Classification
Annotation in Digital Photos
Results
Conclusion


Page 3: Semantics in Digital Photos: A Contextual Analysis

Introduction

Most research is concerned with extracting semantics using content information only.

Image search engines rely on the text associated with images to retrieve them.

The authors fuse the content of photos with two types of context using a probabilistic model.


Page 4: Semantics in Digital Photos: A Contextual Analysis

Introduction (cont.)


Page 5: Semantics in Digital Photos: A Contextual Analysis

Introduction (cont.)

This paper classifies photos into mutually exclusive classes and automatically tags new photos.

The authors collected the photo dataset from Flickr, which publishes popular tags.


Page 6: Semantics in Digital Photos: A Contextual Analysis

Related Work

Most research uses content-based pixel features, such as global or local features.

Image search using an example input image, or querying with low-level features, can be difficult and unintuitive for most people.

Correlations among image features and human tags or labels have been studied.

The semantic gap in image retrieval can’t be overcome using pixel features alone.


Page 7: Semantics in Digital Photos: A Contextual Analysis

Related Work (cont.)

Recent research has used the optical context layer to classify photos.

Boutell and Luo [3] use pixel values and optical metadata for classification.

Duygulu et al. [6] model annotation as machine translation and fuse an ontology.

[3] M. Boutell and J. Luo. Bayesian fusion of camera metadata cues in semantic scene classification. In Proc. IEEE CVPR, 2004.

[6] P. Duygulu, K. Barnard, N. de Freitas, and D. Forsyth. Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary. In Proc. ECCV, 2002.


Page 8: Semantics in Digital Photos: A Contextual Analysis

The Optical Context Layer

The Exchangeable Image File Format (EXIF) standard specifies the camera parameters recorded with each photo.

Fundamental parameters: Exposure Time, Focal Length, F-number, Flash, Metering Mode, and ISO.


Page 9: Semantics in Digital Photos: A Contextual Analysis

Photo Clustering

The LogLight metric has a small value when the ambient light is high.

Similarly, it has a large value when the ambient light is low.

LogLightMetric = K · lg((ET · AA · ISO) / FL²)

where ET is the exposure time, AA the aperture area, ISO the film speed, FL the focal length, and K a constant.
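As a minimal sketch, the Log-Light metric on this slide can be computed directly from EXIF fields; the function name, the choice of a base-2 logarithm, and the default constant K = 1 are assumptions, not from the paper:

```python
import math

def log_light(exposure_time, aperture_area, iso, focal_length, k=1.0):
    # Reconstructed Log-Light metric: K * lg(ET * AA * ISO / FL^2).
    # Larger values correspond to darker shooting environments
    # (long exposures, wide apertures, high ISO).
    return k * math.log2((exposure_time * aperture_area * iso) / focal_length ** 2)

# A bright daylight shot scores lower than a dim night shot:
daylight = log_light(1 / 500, 10.0, 100, 50)   # short exposure, low ISO
night = log_light(1 / 8, 10.0, 800, 50)        # long exposure, high ISO
```

Under this reconstruction, `daylight < night`, matching the slide's claim that the metric is small when ambient light is high.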


Page 10: Semantics in Digital Photos: A Contextual Analysis

Photo Clustering (cont.)

The Log-Light distribution of photos shot with flash and without flash is modeled as a mixture of Gaussians.

Bayesian model selection is used to find the optimal model, and the Expectation-Maximization (EM) algorithm fits the model parameters.
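A self-contained sketch of the EM fitting step for a two-component 1-D mixture over Log-Light values (the paper selects the number of components via Bayesian model selection; here the two-component case is hard-coded for brevity, and the initialization and variance floor are pragmatic choices):

```python
import math

def gauss_pdf(x, mu, var):
    # Density of N(mu, var) at x.
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_two_gaussians(data, iters=100):
    # Fit a two-component 1-D Gaussian mixture (e.g. flash vs. no-flash
    # Log-Light values) with EM.
    mu1, mu2 = min(data), max(data)
    var1 = var2 = (max(data) - min(data)) ** 2 / 4 or 1.0
    w = 0.5
    for _ in range(iters):
        # E-step: responsibility of component 1 for each point.
        r = [w * gauss_pdf(x, mu1, var1) /
             (w * gauss_pdf(x, mu1, var1) + (1 - w) * gauss_pdf(x, mu2, var2))
             for x in data]
        # M-step: re-estimate mixture weight, means, and variances.
        n1 = sum(r)
        n2 = len(data) - n1
        w = n1 / len(data)
        mu1 = sum(ri * x for ri, x in zip(r, data)) / n1
        mu2 = sum((1 - ri) * x for ri, x in zip(r, data)) / n2
        var1 = sum(ri * (x - mu1) ** 2 for ri, x in zip(r, data)) / n1 + 1e-6
        var2 = sum((1 - ri) * (x - mu2) ** 2 for ri, x in zip(r, data)) / n2 + 1e-6
    return (w, mu1, var1), (1 - w, mu2, var2)
```

On two well-separated groups of Log-Light values, the fitted means converge to the two group centers, which is how the flash and no-flash populations are pulled apart.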



Page 11: Semantics in Digital Photos: A Contextual Analysis

Photo Clustering (cont.)

Using the above method, we generated 8 clusters.

We choose 3,500 tagged photos. We find the probability of each photo under each cluster, assign the photo to the cluster having maximum probability, and assign all tags of the photo to that particular cluster.


Page 12: Semantics in Digital Photos: A Contextual Analysis

Photo Clustering (cont.)

Cluster with High Exposure Time Shots

Cluster with No Flash


Page 13: Semantics in Digital Photos: A Contextual Analysis

Photo Clustering (cont.)

Cluster with Indoor Shots


Page 14: Semantics in Digital Photos: A Contextual Analysis

Photo Classification

The intent of the photographer is somehow hidden in the optical data.

These classes are outdoor day, outdoor night, and indoor.

The classes represent different lighting conditions in the LogLight metric.


Page 15: Semantics in Digital Photos: A Contextual Analysis

Photo Classification (cont.)

The classification problem is solved using optical context only, and also using optical context combined with thumbnail pixel features.

The classification algorithm is a decision tree.
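The learned decision tree itself is not given in the transcript; as a hand-written stand-in, a tree over the Flash flag and the Log-Light metric might look like the following (the class boundaries and threshold values are invented for illustration):

```python
def classify_shot(flash_fired, log_light, day_thresh=-6.0, night_thresh=-2.0):
    # Hand-written stand-in for the paper's learned decision tree.
    # Lower log_light means a brighter scene (short exposure, low ISO).
    if log_light < day_thresh:
        return "outdoor day"        # plenty of ambient light
    if flash_fired and log_light > night_thresh:
        return "indoor"             # dim scene lit by flash
    return "outdoor night"
```

A real decision-tree learner would induce comparable threshold splits from labeled EXIF data rather than using fixed constants.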


Page 16: Semantics in Digital Photos: A Contextual Analysis

Photo Classification (cont.)


Page 17: Semantics in Digital Photos: A Contextual Analysis

Annotation in Digital Photos

The goal for automatic annotation is to predict words for tagging untagged photos.

The relevance-model approach has become quite popular for automatic annotation and retrieval of images.

Automatic annotation is modeled as a language-translation problem.

The baseline is the continuous relevance model (CRM).


Page 18: Semantics in Digital Photos: A Contextual Analysis

Annotation in Digital Photos (cont.)

We divided the whole image into rectangular blocks.

For each block, we compute color, texture and shape features.

Each feature vector has 42 dimensions.
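A sketch of the block partitioning: the paper computes 42-dimensional color, texture, and shape features per block, while this toy version computes only mean intensity on a grayscale pixel grid to show the decomposition (the block counts are assumptions):

```python
def block_features(image, rows=4, cols=6):
    # Split a 2-D grid of grayscale pixels into rows x cols rectangular
    # blocks and compute one feature (mean intensity) per block.
    # The paper's real features are 42-dimensional per block.
    h, w = len(image), len(image[0])
    bh, bw = h // rows, w // cols
    feats = []
    for r in range(rows):
        for c in range(cols):
            block = [image[y][x]
                     for y in range(r * bh, (r + 1) * bh)
                     for x in range(c * bw, (c + 1) * bw)]
            feats.append(sum(block) / len(block))
    return feats
```

Each block's feature vector then becomes one of the observed variables b_i used by the annotation model.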


Page 19: Semantics in Digital Photos: A Contextual Analysis

Annotation in Digital Photos (cont.)

The goal is to predict the words W associated with an untagged image based on its blocks B.

B is the observed variable. We compute the conditional probability of a word given a set of blocks.
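In the standard CRM formulation of Lavrenko et al. (the paper's stated baseline; this equation is quoted from that line of work, not from the slides), the joint probability of a word and the observed blocks is an average over the training images J in the training set T:

```latex
P(w, b_1, \ldots, b_n) = \sum_{J \in T} P(J)\, P(w \mid J) \prod_{i=1}^{n} P(b_i \mid J)
```

The annotation score is then P(w | B) ∝ P(w, b_1, …, b_n), and the top-ranked words become the predicted tags.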


Page 20: Semantics in Digital Photos: A Contextual Analysis

Annotation in Digital Photos (cont.)

During the clustering process, we learn the optical clusters; cluster membership can be computed even for an untagged image.

Whenever a new image X arrives, we assign it to the cluster Oj having the maximum value of P(X|Oj).

We then compute the probability of a word given the pixel-feature blocks and the optical context information.
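The assignment step is just a maximum-likelihood argmax; in this sketch, `likelihood(x, oj)` stands for any scoring function P(X | O_j), whose concrete form (e.g. a fitted Gaussian mixture component) is an assumption here:

```python
def assign_cluster(x, clusters, likelihood):
    # Assign image x to the optical cluster O_j maximizing P(x | O_j).
    return max(clusters, key=lambda oj: likelihood(x, oj))
```

The chosen cluster's tag statistics are then available as extra context for annotating the image.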


Page 21: Semantics in Digital Photos: A Contextual Analysis

Results

Experimental datasets (from Flickr): train, evaluation, and test sets.

Performance evaluation: precision and recall.

Recall for a tag = (number of correctly predicted tags) / (number of photos annotated with that tag in the real data).

Precision for a tag = (number of correctly predicted tags) / (number of predicted tags).
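A sketch of the tally those counts imply; the micro-averaged aggregation and the dict-of-sets interface are invented for illustration:

```python
def tag_precision_recall(predicted, ground_truth):
    # predicted / ground_truth map photo id -> set of tags.
    # Tally true positives, false positives, and false negatives
    # over the whole collection, then micro-average.
    tp = fp = fn = 0
    for photo, truth in ground_truth.items():
        pred = predicted.get(photo, set())
        tp += len(pred & truth)
        fp += len(pred - truth)
        fn += len(truth - pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```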


Page 22: Semantics in Digital Photos: A Contextual Analysis

Results (cont.)

Prediction for tag "wildlife":

Optical Context: 0.71
Image Features (CRM): 0.16
Thumbnail-Context: 0.44


Page 23: Semantics in Digital Photos: A Contextual Analysis

Using Ontology to Improve Tagging

CIDE word-similarity ontology. Wu-Palmer similarity between two tags:

Sim(x, y) = 2 · d(p) / (d(x) + d(y))

where p is the deepest common ancestor of x and y in the ontology and d(·) denotes node depth.
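A sketch of Wu-Palmer similarity over a toy taxonomy; the real system uses a word ontology, and the tiny `parent`/`depth` tables here are invented:

```python
def wu_palmer(x, y, parent, depth):
    # Wu-Palmer similarity: Sim(x, y) = 2 * d(p) / (d(x) + d(y)),
    # where p is the deepest common ancestor of x and y.
    # `parent` maps node -> parent; `depth` maps node -> depth (root = 1).
    def ancestors(n):
        chain = [n]
        while n in parent:
            n = parent[n]
            chain.append(n)
        return chain
    ax = ancestors(x)
    ay = set(ancestors(y))
    p = next(a for a in ax if a in ay)   # deepest common ancestor
    return 2 * depth[p] / (depth[x] + depth[y])
```

For siblings like "dog" and "cat" under "animal", the similarity is 2·2 / (3 + 3) = 2/3, and identical tags score 1.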


Page 24: Semantics in Digital Photos: A Contextual Analysis

Using Ontology to Improve Tagging (cont.)

Shrink this estimate using semantic similarity:

P(W|I) = λ · P_MLE(W|I) + (1 − λ) · Sim(W, ·)

where λ is a smoothing weight between the maximum-likelihood estimate and the ontology similarity.
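The interpolation itself is a one-liner; the weight `lam` and the form of the similarity term are reconstructed assumptions:

```python
def shrink(p_mle, sim_term, lam=0.8):
    # P(W|I) = lam * P_MLE(W|I) + (1 - lam) * Sim(W, .)
    # lam near 1 trusts the CRM estimate; lam near 0 trusts the ontology.
    return lam * p_mle + (1 - lam) * sim_term
```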


Page 25: Semantics in Digital Photos: A Contextual Analysis

Results (cont.)


Page 26: Semantics in Digital Photos: A Contextual Analysis

Conclusion

Optical context data is only a small fraction of the photo data, yet it carries invaluable information about the photo-shooting environment.

Fusing ontological models of the semantics of photos also improves precision.

Future work: fuse other types of context with the content and optical context features.
