Expectation Maximization Method

Expectation Maximization Method: Effective Image Retrieval Based on Hidden Concept Discovery in Image Database. By Sanket Korgaonkar, Masters Computer Science. Thesis - Machine Learning.


Page 1: Expectation Maximization Method

Expectation Maximization Method

Effective Image Retrieval Based on Hidden Concept Discovery in Image Database

By

Sanket Korgaonkar

Masters Computer Science

Thesis - Machine Learning

Page 2: Expectation Maximization Method

Problem Statement

• The paper addresses content-based image retrieval and concentrates on extracting hidden semantic concepts from the given data.

• Key concepts:

- Content-based image retrieval

- Hidden semantic concepts

Page 3: Expectation Maximization Method

An overview of the logic used

All Images -> (Segmentation) -> Homogeneous Regions -> (SOM learning strategy) -> Visual Token Catalog

Segmentation procedure: see [1]

SOM: Self Organizing Map [2]

Features Extracted for each region:

- Color

- Shape

- Texture

Each image is segmented into homogeneous regions and similar regions are given the same token. An observation is the occurrence of a token in an image. An image is the raw data file or jpg file that is used for creating the database.

Page 4: Expectation Maximization Method

Graphic Representation of the Procedure

[Figure: Raw Image -> Regions -> Tokens; example token label: Sky]

Page 5: Expectation Maximization Method

Salient Details

1. SOM projects high-dimensional feature vectors into 2D space, grouping similar features together and separating different features.

2. Each token represents a set of visually similar regions (in terms of shape, texture, and color).

3. The number of tokens to generate must be chosen empirically, by testing which value gives the best efficiency and accuracy.

4. For each region, the index of its corresponding token is stored and the original features are discarded.

5. For a new image, the regions are extracted first; then each region's features are replaced by the closest token.
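The token lookup in steps 4 and 5 can be sketched as a nearest-neighbor search over the token catalog. This is a minimal illustration, assuming each token is stored as one representative feature vector and that "closest" means smallest Euclidean distance; the names `nearest_token`, `tokenize_image`, and the toy 2-D vectors are hypothetical and not taken from the paper.

```python
import math

def nearest_token(features, catalog):
    """Return the index of the catalog vector closest to `features`.

    Assumption: closeness is Euclidean distance in feature space.
    """
    return min(range(len(catalog)), key=lambda t: math.dist(features, catalog[t]))

def tokenize_image(region_features, catalog):
    """Replace each region's feature vector by the index of its closest token."""
    return [nearest_token(f, catalog) for f in region_features]

# Toy catalog of three tokens and one image with three regions (made-up values).
catalog = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
regions = [(0.1, -0.2), (4.6, 5.3), (0.9, 1.2)]

tokens = tokenize_image(regions, catalog)
print(tokens)  # only the token indices are kept; the features can be discarded
```

Storing only the index is what lets every image be described over the same fixed vocabulary of tokens.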

Page 6: Expectation Maximization Method

Salient Details (Cont.)

6. N - total number of raw images.

7. M - total number of regions (M >> N).

8. An M x N matrix is generated, where each column represents an image and each row corresponds to a token. If entry (i, j) is 5, the ith token was observed 5 times in the jth image.

9. Since M >> N, this matrix has many zeros; hence the name: uniform-sparse matrix.

10. A probabilistic model is generated from this matrix; token-image pairs are assumed to be i.i.d.

11. The matrix is assumed to represent a mixture of M probability models, and EM is used to estimate the parameters of this mixture density model.
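The count matrix from step 8 can be sketched as follows. This is an illustrative construction only, assuming each image has already been reduced to a list of token indices; `build_count_matrix` and the toy data are hypothetical names and values.

```python
def build_count_matrix(image_tokens, num_tokens):
    """Build a token-by-image count matrix as a list of rows.

    image_tokens[j] is the list of token indices observed in image j.
    Entry [i][j] of the result is how often token i occurs in image j.
    """
    n_images = len(image_tokens)
    counts = [[0] * n_images for _ in range(num_tokens)]
    for j, toks in enumerate(image_tokens):
        for i in toks:
            counts[i][j] += 1
    return counts

# Three toy images described by token indices drawn from a catalog of 4 tokens.
image_tokens = [[0, 2, 2], [1], [2, 2, 2, 2, 2]]
counts = build_count_matrix(image_tokens, num_tokens=4)

# Fraction of zero entries, illustrating why the matrix is sparse.
zeros = sum(row.count(0) for row in counts)
sparsity = zeros / (len(counts) * len(counts[0]))
print(counts[2][2], sparsity)
```

Here `counts[2][2]` is 5, meaning token 2 was observed 5 times in image 2, which matches the reading of entry (i, j) given in step 8.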

Page 7: Expectation Maximization Method

Probabilistic Data Model

1. Each token-image pair is associated with a semantic concept variable Z. Z is assumed to have K dimensions, each corresponding to a concept class k.

2. The authors assume that the variables r and g are independent given z.

3. Further mathematical calculations result in the following log likelihood formula:
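A log likelihood for a model of this kind typically takes the standard aspect-model form shown here. This is a sketch under assumptions not stated in the text: r_i denotes the ith token, g_j the jth image, z_k the kth concept class, and n(r_i, g_j) the count in entry (i, j) of the token-image matrix. The exact expression used by the authors may differ.

```latex
% Joint probability of a token-image pair, using independence of r and g given z
P(r_i, g_j) = \sum_{k=1}^{K} P(z_k)\, P(r_i \mid z_k)\, P(g_j \mid z_k)

% Log likelihood of the i.i.d. token-image observations
\mathcal{L} = \sum_{i=1}^{M} \sum_{j=1}^{N} n(r_i, g_j)\, \log P(r_i, g_j)
```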

Page 8: Expectation Maximization Method

Model Fitting with EM
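A minimal sketch of EM for a mixture of this kind, assuming the standard aspect-model factorization P(r, g) = sum over k of P(z_k) P(r | z_k) P(g | z_k). The function name `em_aspect_model`, the random initialization, and the fixed iteration count are illustrative choices, not the authors' implementation.

```python
import random

def em_aspect_model(counts, K, iters=50, seed=0):
    """Fit P(z), P(r|z), P(g|z) to a token-by-image count matrix with EM.

    counts[i][j] = occurrences of token i in image j (at least one nonzero).
    Returns (p_z, p_r_z, p_g_z) with p_r_z[k][i] = P(r_i | z_k) and
    p_g_z[k][j] = P(g_j | z_k).
    """
    rng = random.Random(seed)
    M, N = len(counts), len(counts[0])

    def random_distribution(n):
        v = [rng.random() + 0.1 for _ in range(n)]
        s = sum(v)
        return [x / s for x in v]

    p_z = [1.0 / K] * K
    p_r_z = [random_distribution(M) for _ in range(K)]
    p_g_z = [random_distribution(N) for _ in range(K)]

    for _ in range(iters):
        new_z = [0.0] * K
        new_r = [[0.0] * M for _ in range(K)]
        new_g = [[0.0] * N for _ in range(K)]
        for i in range(M):
            for j in range(N):
                n = counts[i][j]
                if n == 0:
                    continue  # zero entries contribute nothing
                # E-step: posterior P(z_k | r_i, g_j), up to normalization.
                joint = [p_z[k] * p_r_z[k][i] * p_g_z[k][j] for k in range(K)]
                total = sum(joint)
                if total == 0.0:
                    continue
                # Accumulate expected counts for the M-step.
                for k in range(K):
                    w = n * joint[k] / total
                    new_z[k] += w
                    new_r[k][i] += w
                    new_g[k][j] += w
        # M-step: renormalize expected counts into probabilities.
        grand = sum(new_z)
        for k in range(K):
            if new_z[k] > 0.0:
                p_r_z[k] = [x / new_z[k] for x in new_r[k]]
                p_g_z[k] = [x / new_z[k] for x in new_g[k]]
            p_z[k] = new_z[k] / grand
    return p_z, p_r_z, p_g_z

# Toy 4-token x 4-image matrix with two obvious blocks (made-up data).
counts = [[4, 3, 0, 0],
          [5, 2, 0, 0],
          [0, 0, 6, 1],
          [0, 0, 2, 7]]
p_z, p_r_z, p_g_z = em_aspect_model(counts, K=2)
print(p_z)
```

Skipping zero entries keeps each iteration proportional to the number of nonzero counts, which is where the sparsity of the matrix pays off.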

Page 9: Expectation Maximization Method

Image Retrieval based on posterior probability
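One possible reading of posterior-based retrieval is sketched here: compute each image's concept posterior P(z | g) from the fitted parameters by Bayes' rule, then rank database images by how similar their posteriors are to the query's. The use of cosine similarity, the assumption that the query is already in the database, and all names and parameter values are assumptions for illustration, not the authors' stated procedure.

```python
import math

def concept_posterior(p_z, p_g_z, j):
    """P(z_k | g_j), proportional to P(z_k) * P(g_j | z_k)."""
    joint = [p_z[k] * p_g_z[k][j] for k in range(len(p_z))]
    s = sum(joint)
    return [x / s for x in joint]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def rank_by_concepts(p_z, p_g_z, query):
    """Return database image indices sorted by concept similarity to `query`."""
    q = concept_posterior(p_z, p_g_z, query)
    scores = [(cosine(q, concept_posterior(p_z, p_g_z, j)), j)
              for j in range(len(p_g_z[0])) if j != query]
    return [j for _, j in sorted(scores, reverse=True)]

# Hand-set parameters for 2 concepts and 4 images (made-up values).
p_z = [0.5, 0.5]
p_g_z = [[0.5, 0.4, 0.1, 0.0],   # P(g_j | z_0)
         [0.0, 0.1, 0.4, 0.5]]   # P(g_j | z_1)

ranking = rank_by_concepts(p_z, p_g_z, query=0)
print(ranking)
```

Comparing images in concept space rather than token space is what lets two images match even when they share few identical tokens.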

Page 10: Expectation Maximization Method

Graphic Representation of the Procedure

[Figure: Raw Image -> Regions -> Tokens; example token label: Sky]

Page 11: Expectation Maximization Method

Results and Conclusion

The authors experimented with 10000 general images from the COREL database collection, drawn from 96 categories. To evaluate the image retrieval algorithm, they used a query set of 1500 images randomly selected from all categories. To demonstrate its effectiveness, they compared its performance with the algorithm proposed by Chen and Wang (fuzzified region representation) and report higher precision for their own method.

Page 12: Expectation Maximization Method

References

1. R. Zhang and Z. M. Zhang, “Toward more effective and efficient image retrieval”, ACM Multimedia Syst. J., vol. 11, 2006.

2. S. Kaski, K. Lagus, J. Salojarvi, J. Honkela, V. Paatero, and A. Saarela, “Self organization of a massive document collection”, IEEE Trans. Neural Netw., vol. 11, no. 3, pp. 1025-1048, May 2000.