Tools for Image Retrieval in Large Multimedia Databases

61
Tools for Image Retrieval in Large Multimedia Databases by Carles Ventura Royo Directors: Verónica Vilaplana Xavier Giró Tutor: Ferran Marqués Barcelona, September 2011 1

Transcript of Tools for Image Retrieval in Large Multimedia Databases

Page 1: Tools for Image Retrieval in Large Multimedia Databases

Tools for Image Retrievalin Large Multimedia Databases

by Carles Ventura Royo

Directors: Verónica Vilaplana

Xavier Giró

Tutor:Ferran Marqués

Barcelona, September 2011

1

Page 2: Tools for Image Retrieval in Large Multimedia Databases

Index

Identifying the problem State of art: indexing techniques State of art: Hierarchical Cellular Tree (HCT) Modifications to the original HCT Experimental results Implemented tools Conclusions and future work lines

2

Page 3: Tools for Image Retrieval in Large Multimedia Databases

3

Identifying the problem (I)

“House”

DATABASE

Page 4: Tools for Image Retrieval in Large Multimedia Databases

4

Identifying the problem (II)

QUERY

TARGETS

Visual + textual descriptors

“ho

use

“gar

den”

K

Page 5: Tools for Image Retrieval in Large Multimedia Databases

Identifying the problem (III)

K nearest neighbor problem Solution: Sequential scan

Drawback: Computational time for large databases (10 s for a 200,000 elements)

Approximate K nearest neighbor problem Indexing techniques

5

Page 6: Tools for Image Retrieval in Large Multimedia Databases

Requirements of the solution

Dynamic approach Multimedia databases are not static Insertions and deletions

High dimensional feature spaces “curse of dimensionality” problem MPEG-7 visual descriptors are high-dimensional

feature vectors

6

Page 7: Tools for Image Retrieval in Large Multimedia Databases

Index

Identifying the problem State of art: indexing techniques State of art: Hierarchical Cellular Tree (HCT) Modifications to the original HCT Experimental results Implemented tools Conclusions and future work lines

7

Page 8: Tools for Image Retrieval in Large Multimedia Databases

Indexing techniques (I)

Hierarchical data structures Spatial Access Methods (SAMs)

K-d tree, R-tree, R*-tree, TV-tree, etc. Drawbacks:

Items have to be represented in an N-dimensional feature space

Dissimilarity measure based on a Lp metric

SAMs do not scale up well to high dimensional spaces

8

Page 9: Tools for Image Retrieval in Large Multimedia Databases

Indexing techniques (II)

Hierarchical data structures Metric Access Methods (MAMs)

VP-tree, MVP-tree, GNAT, M-tree, etc. More general approach than SAMs

Assuming only a similarity distance function

MAMs scale up well to high dimensional spaces Drawbacks:

Static MAMs do not support dynamic changes Dependence on pre-fixed parameters

9

Page 10: Tools for Image Retrieval in Large Multimedia Databases

Indexing techniques (III)

Locality Sensitive Hashing It uses hash functions Nearby data points are hashed into the same

bucket with a high probability Points faraway are hashed into the same bucket

with a low probability Drawback:

It does not solves the K nearest neighbor problem, but the Є-near neighbor problem.

10

Page 11: Tools for Image Retrieval in Large Multimedia Databases

Index

Identifying the problem State of art: indexing techniques State of art: Hierarchical Cellular Tree (HCT) Modifications to the original HCT Experimental results Implemented tools Conclusions and future work lines

11

Page 12: Tools for Image Retrieval in Large Multimedia Databases

Solution adopted Hierarchical Cellular Tree (HCT)

MAM-based indexing scheme Hierarchical structure Self-organized tree Incremental construction

in a bottom-up fashion Unbalanced tree Not dependence on a maximum capacity Preemptive cell search algorithm for insertion Dynamic approach[KG07] S. Kiranyaz and M.Gabbouj, Hierarchical Cellular Tree: An efficient

indexing scheme for content-based retrieval on multimedia databases. 12

Page 13: Tools for Image Retrieval in Large Multimedia Databases

HCT: Cell Structure (I)

Basic container structure Undirected graph Minimum Spanning Tree (MST) Cell nucleus Covering radius

13

Page 14: Tools for Image Retrieval in Large Multimedia Databases

HCT: Cell Structure (II)

Cell compactness

Maturity size Mitosis operation

14

Page 15: Tools for Image Retrieval in Large Multimedia Databases

HCT: Level Structure

Representatives for each cell from the lower level

Responsible for maximizing the compactness of its cells

Compactness threshold

15

Page 16: Tools for Image Retrieval in Large Multimedia Databases

HCT Operations (I)

Item insertion Find the most

suitable cell

Most Similar Nucleus vs Preemptive Cell Search

16

Page 17: Tools for Image Retrieval in Large Multimedia Databases

HCT Operations (II)

Item insertion Find the most

suitable cell

Most Similar Nucleus vs Preemptive Cell Search

C2: d(O,O2N) - r(O2

N) < dmin

C3: d(O,O3N) - r(O3

N) > dmin 17

Page 18: Tools for Image Retrieval in Large Multimedia Databases

HCT Operations (III)

Item insertion Find the most suitable cell Append the element Generic post-processing check

Mitosis operation Nucleus change

18

Page 19: Tools for Image Retrieval in Large Multimedia Databases

HCT Operations (IV)

Item removal Cell search algorithm not required Remove the element Generic post-processing check

Mitosis operation Nucleus change

19

Page 20: Tools for Image Retrieval in Large Multimedia Databases

HCT: Retrieval scheme Progressive Query

Periodical subqueries over database subsets Query Path formation

Based on Most Similar Nucleus

Ranking agreggation

20

Page 21: Tools for Image Retrieval in Large Multimedia Databases

Index

Identifying the problem State of art: indexing techniques State of art: Hierarchical Cellular Tree (HCT) Modifications to the original HCT Experimental results Implemented tools Conclusions and future work lines

21

Page 22: Tools for Image Retrieval in Large Multimedia Databases

Modifications to the HCT (I)

Covering radius Original definition gives an approximation by

defect Consider all the elements belonging to the

subtree High computational cost

Approximation by excess

22

Page 23: Tools for Image Retrieval in Large Multimedia Databases

Modifications to the HCT (II)

Covering radius

23

Page 24: Tools for Image Retrieval in Large Multimedia Databases

Modifications to the HCT (III)

Covering radius

24

rC (C9) = max

• d(E,B)+rC(C1)

• rC(C2)

• d(E,H)+rC(C3)

Page 25: Tools for Image Retrieval in Large Multimedia Databases

Modifications to the HCT (III)

HCT construction Preemptive Cell Search over all the levels A method for updating the covering radius

To reduce the searching time It can be performed after the HCT construction or

periodically

25

Page 26: Tools for Image Retrieval in Large Multimedia Databases

Modifications to the HCT (IV)

Searching techniques PQ fails in solving the KNN problem efficiently New searching techniques

Most Similar Nucleus Preemptive Cell Search Hybrid

Number of cells to be considered Minimum number of cells Cells hosting 2·K elements Cellular structure is not kept

26

Page 27: Tools for Image Retrieval in Large Multimedia Databases

Index

Identifying the problem State of art: indexing techniques State of art: Hierarchical Cellular Tree (HCT) Modifications to the original HCT Experimental results Implemented tools Conclusions and future work lines

27

Page 28: Tools for Image Retrieval in Large Multimedia Databases

Experimental results

CCMA image database of 216,317 elements HCT building evaluation

Construction time

Retrieval system evaluation Retrieval time Elements retrieved

28

Page 29: Tools for Image Retrieval in Large Multimedia Databases

HCT building evaluation

29

Page 30: Tools for Image Retrieval in Large Multimedia Databases

Retrieval system evaluation (I)

30

Page 31: Tools for Image Retrieval in Large Multimedia Databases

Retrieval sytem evaluation (V)

31

Page 32: Tools for Image Retrieval in Large Multimedia Databases

Retrieval system evaluation (VI)

32

Page 33: Tools for Image Retrieval in Large Multimedia Databases

Retrieval system evaluation (II)

33

Evaluation with respect to exhaustive search Mean Competitive Recall

Elements in common

Mean Normalized Aggregate Goodness

Kendall distance Number of exchanges needed in a bubble sort

Query set of 1,082 images

Page 34: Tools for Image Retrieval in Large Multimedia Databases

Retrieval system evaluation (III)

34

Preemptive Cell Search

Page 35: Tools for Image Retrieval in Large Multimedia Databases

Retrieval system evaluation (IV)

Searching techniques comparative

35

Page 36: Tools for Image Retrieval in Large Multimedia Databases

Index

Identifying the problem State of art: indexing techniques State of art: Hierarchical Cellular Tree (HCT) Modifications to the original HCT Experimental results Implemented tools Conclusions and future work lines

36

Page 37: Tools for Image Retrieval in Large Multimedia Databases

Implemented tools (I)

database_indexing tool Tool for indexing an image database HCT is stored at disk

hct_query tool Tool for carrying out a search over an indexed

database HCT is read from disk and load at main memory

37

Page 38: Tools for Image Retrieval in Large Multimedia Databases

Implemented tools (II)

A server/client architecture Based on a messaging system: KSC

38

Page 39: Tools for Image Retrieval in Large Multimedia Databases

Index

Identifying the problem State of art: indexing techniques State of art: Hierarchical Cellular Tree (HCT) Modifications to the original HCT Experimental results Implemented tools Conclusions and future work lines

39

Page 40: Tools for Image Retrieval in Large Multimedia Databases

Conclusions Hierarchical Cellular Tree implementation

To improve the retrieval times Generic implementation for any kind of data Modifications proposed

HCT evaluation Measures extracted from literature

Preemptive Cell Search technique gives the best performance It is essential not to use an underestimated value

for the covering radius 40

Page 41: Tools for Image Retrieval in Large Multimedia Databases

Future work lines

Very large databases Not using only main memory

Region-based CBIR system Each image can be represented by a set of

regions

Browser application based on HCT Take advantage of the hierarchical structure Alternative way to retrieve elements

41

Page 42: Tools for Image Retrieval in Large Multimedia Databases

Thanks for your attention

42

Page 43: Tools for Image Retrieval in Large Multimedia Databases

HCT Operations (IV)

HCT construction

43

Page 44: Tools for Image Retrieval in Large Multimedia Databases

HCT Operations (V)

HCT construction

44

Page 45: Tools for Image Retrieval in Large Multimedia Databases

A tool for image DB indexing

MPEG-7 visual descriptors Indexing a database

Inserting new elements to an indexed database

45

Page 46: Tools for Image Retrieval in Large Multimedia Databases

A tool for image retrieval

It requires an indexed database Image query

Image file XML file with the visual descriptors TXT file with several image queries

Interactive mode Performing a search

46

Page 47: Tools for Image Retrieval in Large Multimedia Databases

A server/client architecture

Based on a messaging system: KSC

47

Page 48: Tools for Image Retrieval in Large Multimedia Databases

Retrieval system evaluation (IV)

48

Most Similar Nucleus

Preemptive: 0.8319 seconds (retrieval time)

27.51 (CR) and 99.26% retrieved queries

Page 49: Tools for Image Retrieval in Large Multimedia Databases

Retrieval system evaluation (V)

49

Progressive Query (Most Similar Nucleus)

Preemptive: 0.8319 seconds (retrieval time)

27.51 (CR) and 99.26% retrieved queries

Page 50: Tools for Image Retrieval in Large Multimedia Databases

Retrieval system evaluation (VI)

50

Hybrid Preemptive Cell Search over 8 uppermost levels

Preemptive: 0.8319 seconds (retrieval time)

27.51 (CR) and 99.26% retrieved queries

Page 51: Tools for Image Retrieval in Large Multimedia Databases

Retrieval system evaluation

51

Page 52: Tools for Image Retrieval in Large Multimedia Databases

Retrieval system evaluation

52

Page 53: Tools for Image Retrieval in Large Multimedia Databases

Retrieval system evaluation

53

Page 54: Tools for Image Retrieval in Large Multimedia Databases

Retrieval system evaluation

54

Page 55: Tools for Image Retrieval in Large Multimedia Databases

Retrieval system evaluation

55

Page 56: Tools for Image Retrieval in Large Multimedia Databases

Retrieval system evaluation

56

Page 57: Tools for Image Retrieval in Large Multimedia Databases

Retrieval system evaluation

57

Page 58: Tools for Image Retrieval in Large Multimedia Databases

Retrieval system evaluation

58

Page 59: Tools for Image Retrieval in Large Multimedia Databases

Retrieval system evaluation

59

Page 60: Tools for Image Retrieval in Large Multimedia Databases

Retrieval system evaluation

60

Page 61: Tools for Image Retrieval in Large Multimedia Databases

Retrieval system evaluation

61