Multimedia content based retrieval slideshare.ppt

Multimedia Content Based Retrieval, Govindaraju Hujigal

description

Information retrieval for text and multimedia content has become an important research area. Content based retrieval in multimedia is a challenging problem since multimedia data needs detailed interpretation from pixel values. In this presentation, an overview of content based retrieval is presented along with the different strategies in terms of syntactic and semantic indexing for retrieval. The matching techniques used and learning methods employed are also analyzed.

Transcript of Multimedia content based retrieval slideshare.ppt

Page 1: Multimedia content based retrieval slideshare.ppt

Multimedia Content Based Retrieval

Govindaraju Hujigal

Page 2: Multimedia content based retrieval slideshare.ppt

Content based retrieval in multimedia

An important research area

A challenging problem, since multimedia data needs detailed interpretation from pixel values

Different strategies in terms of syntactic and semantic indexing for retrieval

Page 3: Multimedia content based retrieval slideshare.ppt

Why do we need multimedia content based retrieval (MCBR)?

How do I find what I’m looking for?!

Page 4: Multimedia content based retrieval slideshare.ppt

Multimedia content retrieval

Advances in multimedia and storage technology have led to the building of large repositories of digital image, video, and audio data.

Compared to text search, any assignment of text labels is a massively labor-intensive effort.

The focus is on calculating statistics that can be approximately correlated with the content features without costly human interaction.

Page 5: Multimedia content based retrieval slideshare.ppt

Multimedia content retrieval

Search based on syntactic features

Shape, texture, color histogram

Relatively undemanding

Search based on semantic features

Human perception

“List all dogs that look like a cat”, “City”, “Landscape”, “Cricket”

Page 6: Multimedia content based retrieval slideshare.ppt

Syntactic indexing

Use syntactic features as the basis for matching, and employ either a query-through-dialog or a query-by-example box to interface with the user.

Query-through-dialog: enter the words describing the image.

Query-through-dialog is not convenient, as the user needs to know the exact details of attributes such as shape, color, and texture.

Page 7: Multimedia content based retrieval slideshare.ppt

Image descriptors – Color

Apples are red …

… but tomatoes are too!
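The apple/tomato point can be made concrete with a coarse color histogram. The following is a minimal sketch, assuming images are given as flat lists of (r, g, b) pixels; the function names, the bin count, and the toy pixel data are illustrative assumptions, not something taken from the slides.

```python
from collections import Counter

def color_histogram(pixels, bins_per_channel=4):
    """Normalized coarse RGB histogram of (r, g, b) pixels with values 0-255."""
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = sum(counts.values())
    return {bin_: n / total for bin_, n in counts.items()}

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means identical color distributions."""
    return sum(min(v, h2.get(k, 0.0)) for k, v in h1.items())

# Hypothetical toy "images" (not real photographs).
apple = [(200, 30, 40)] * 90 + [(60, 120, 40)] * 10    # mostly red, a little green
tomato = [(210, 40, 35)] * 88 + [(55, 125, 45)] * 12   # also mostly red
lawn = [(40, 160, 50)] * 100                           # all green

apple_vs_tomato = histogram_intersection(color_histogram(apple), color_histogram(tomato))
apple_vs_lawn = histogram_intersection(color_histogram(apple), color_histogram(lawn))
print(apple_vs_tomato, apple_vs_lawn)
```

In this toy data the apple and the tomato are almost indistinguishable by color alone, while the lawn is clearly different, which is why color is usually combined with other descriptors.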

Page 8: Multimedia content based retrieval slideshare.ppt

Image descriptors – Texture

Texture differentiates between a lawn and a forest

Page 9: Multimedia content based retrieval slideshare.ppt

Syntactic indexing

Query by example

The user is shown example images and chooses the closest one.

Various features of the chosen image, such as color, shape, texture, and spatial distribution, are evaluated and matched against the images in the database.

Matching uses a similarity or distance metric.

In video, key frames of the video clips that are close to the user query are shown.
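Query by example reduces to ranking database items by their distance to the chosen example. A minimal sketch follows, assuming every image has already been reduced to a fixed-length feature vector; the Euclidean metric, the file names, and the feature values are assumptions made for illustration.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def query_by_example(query_features, database, top_k=3):
    """Return the top_k (name, distance) pairs closest to the query example."""
    scored = [(name, euclidean(query_features, feats))
              for name, feats in database.items()]
    return sorted(scored, key=lambda pair: pair[1])[:top_k]

# Hypothetical feature vectors (e.g. color, texture, shape scores in [0, 1]).
database = {
    "sunset.jpg": [0.9, 0.4, 0.1],
    "forest.jpg": [0.1, 0.8, 0.2],
    "beach.jpg": [0.8, 0.6, 0.3],
    "lawn.jpg": [0.2, 0.9, 0.1],
}

results = query_by_example([0.9, 0.45, 0.1], database, top_k=2)
for name, distance in results:
    print(name, round(distance, 3))
```

A linear scan like this compares the query against every item; large collections would normally need an index structure, which is outside the scope of this sketch.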

Page 10: Multimedia content based retrieval slideshare.ppt
Page 11: Multimedia content based retrieval slideshare.ppt

Syntactic indexing

Query by example: limitations

An image can be annotated and interpreted in many ways. For example, one user may be interested in a waterfall, another in a mountain, and yet another in the sky, although all of them may be present in the same image.

The user may wonder "why do these two images look similar?" or "what specific parts of these images are contributing to the similarity?". The user is required to know the search structure and other details to search the database efficiently.

It requires many comparisons, and there may be too many results, depending on the threshold.

Page 12: Multimedia content based retrieval slideshare.ppt

Semantic indexing

• Match human perception and cognition.

• Semantic content contains high-level concepts such as objects and events.

• As humans think in terms of events and remember different events and objects after watching a video, these high-level concepts are the most important cues in content-based retrieval. Take a soccer game as an example: humans usually remember goals, interesting actions, red cards, etc.

Page 13: Multimedia content based retrieval slideshare.ppt

Semantic indexing

There exists a relationship between the degree of action and the structure of visual patterns that constitute a movie.

Movies can be classified into four broad categories: Comedies, Action, Dramas, or Horror films. Inspired by cinematic principles, four computable video features (average shot length, color variance, motion content and lighting key) are combined in a framework to provide a mapping to these four high-level semantic classes.
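Of the four features named above, average shot length is the simplest to compute once shot boundaries are known. The sketch below assumes the cut positions are already available as frame indices; the function name, the frame rate, and the sample numbers are illustrative assumptions, and no mapping to genres is attempted here.

```python
def average_shot_length(num_frames, cut_indices, fps=25.0):
    """Mean shot duration in seconds, given frame indices where new shots start."""
    boundaries = [0] + sorted(cut_indices) + [num_frames]
    lengths = [end - start for start, end in zip(boundaries, boundaries[1:])]
    return sum(lengths) / len(lengths) / fps

# Hypothetical clip: 3000 frames at 25 fps with three cuts.
asl_seconds = average_shot_length(3000, [250, 1000, 2250], fps=25.0)
print(asl_seconds)
```

A short average shot length would suggest fast-paced editing, but how the four features are actually combined into a genre decision is not specified in the slides.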

Page 14: Multimedia content based retrieval slideshare.ppt

Motion feature as indexing cue

Spatial scene analysis on video can be fully transferred from content-based image retrieval (CBIR), but temporal analysis is what makes video unique.

Temporal information induces the concept of motion for the objects present in the document.

Page 15: Multimedia content based retrieval slideshare.ppt

Motion feature as indexing cue

Frame level: each frame is treated separately. There is no temporal analysis at this level.

Shot level: a shot is a set of contiguous frames all acquired through a continuous camera recording. Only the temporal information is used.

Scene level: a scene is a set of contiguous shots having a common semantic significance.

Video level: the complete video object is treated as a whole.

Page 16: Multimedia content based retrieval slideshare.ppt

Motion feature as indexing cue

The three types of shot-level transitions are as follows:

Cut: a sharp boundary between shots. This generally implies a peak in the difference between the color or motion histograms of the two frames surrounding the cut.

Dissolve: the content of the last images of the first shot is continuously mixed with that of the first images of the second shot.

Wipe: the images of the second shot continuously cover or push out of the display those of the first shot.
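The description of a cut as a peak in the histogram difference between neighboring frames can be sketched as a simple threshold test. This is a minimal illustration, assuming each frame has already been reduced to a normalized histogram; the L1 difference, the threshold value, and the sample data are assumptions. Gradual transitions such as dissolves and wipes spread the change over many frames and would not be caught by this test.

```python
def histogram_difference(h1, h2):
    """L1 difference between two normalized histograms of equal length."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def detect_cuts(frame_histograms, threshold=0.5):
    """Return indices i where a cut is assumed between frame i-1 and frame i."""
    cuts = []
    for i in range(1, len(frame_histograms)):
        diff = histogram_difference(frame_histograms[i - 1], frame_histograms[i])
        if diff > threshold:
            cuts.append(i)
    return cuts

# Hypothetical 3-bin histograms for five consecutive frames.
frames = [
    [0.80, 0.10, 0.10],
    [0.79, 0.11, 0.10],
    [0.78, 0.12, 0.10],
    [0.10, 0.20, 0.70],  # abrupt change: a new shot starts here
    [0.10, 0.21, 0.69],
]

cuts = detect_cuts(frames)
print(cuts)
```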

Page 17: Multimedia content based retrieval slideshare.ppt

Motion feature as indexing cue

It is often through motion that the content of a video is expressed and the attention of viewers is captivated.

Query techniques

A set of motion vector trajectories is mapped to a set of objects. The visual query can be ‘player’. [Dimitrova]

Use an animated sketch to formulate queries. Motion and temporal duration are the key attributes assigned to each object in the sketch, in addition to the usual attributes such as shape, color, and texture. [VideoQ]

Page 18: Multimedia content based retrieval slideshare.ppt

Matching techniques

A method of finding the similarity between two sets of multimedia data, which can be either images or videos.

Search based on features like location, colors, and concepts, examples of which are ‘mostly red’, ‘sunset’, ‘yellow flowers’, etc.

The user specifies relative weights for the features or assigns equal weight to all of them.

Automatically identifying the relevance of the features is under active research.
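User-specified feature weights can be illustrated with a weighted sum of per-feature distances. This is a sketch under stated assumptions: each feature is reduced to a single score in [0, 1], the per-feature distance is the absolute difference, and the feature names and values are hypothetical.

```python
def weighted_distance(query, candidate, weights=None):
    """Weighted mean of per-feature distances; equal weights if none are given."""
    features = list(query)
    if weights is None:
        weights = {f: 1.0 for f in features}
    total_weight = sum(weights[f] for f in features)
    weighted_sum = sum(weights[f] * abs(query[f] - candidate[f]) for f in features)
    return weighted_sum / total_weight

# Hypothetical per-feature scores for a query and two candidate images.
query = {"color": 0.9, "texture": 0.2, "location": 0.5}
image_a = {"color": 0.9, "texture": 0.8, "location": 0.5}  # same color, other texture
image_b = {"color": 0.5, "texture": 0.2, "location": 0.5}  # same texture, other color

equal_a = weighted_distance(query, image_a)
equal_b = weighted_distance(query, image_b)

color_first = {"color": 4.0, "texture": 1.0, "location": 1.0}
color_a = weighted_distance(query, image_a, color_first)
color_b = weighted_distance(query, image_b, color_first)
print(equal_a, equal_b, color_a, color_b)
```

With equal weights image_b ranks closer; once color is weighted more heavily, image_a does. The ranking therefore depends directly on the weights the user chooses.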

Page 19: Multimedia content based retrieval slideshare.ppt

Learning methods in retrieval

The user generates both positive and negative retrieval examples (relevance feedback).

Each image can represent multiple concepts. To resolve these ambiguities, each image is modeled as a bag of instances (sub-blocks in the image).

A bag is labeled as a positive example of a concept if at least one of its instances represents the concept, which could be a car or a waterfall scene. If no such instance exists, the bag is labeled as a negative example.

The concept is learned from a small collection of positive and negative examples and is then used to retrieve images containing a similar concept from the database.
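The bag-of-instances idea can be sketched in a few lines. The code below is not the learning method used in the work the slides summarize; it is a deliberately crude stand-in, assuming each instance is a 2-D feature point and that the concept is the positive-bag instance closest to all positive bags and farthest from all negative bags. All names, the radius, and the data are hypothetical.

```python
import math

def min_distance(point, bag):
    """Distance from a point to the nearest instance in a bag."""
    return min(math.dist(point, inst) for inst in bag)

def learn_concept(positive_bags, negative_bags):
    """Pick the positive-bag instance near every positive bag, far from negatives."""
    candidates = [inst for bag in positive_bags for inst in bag]

    def score(c):
        near_pos = sum(min_distance(c, bag) for bag in positive_bags)
        near_neg = sum(min_distance(c, bag) for bag in negative_bags)
        return near_pos - near_neg  # lower is better

    return min(candidates, key=score)

def is_positive(bag, concept, radius=0.5):
    """A bag is positive if at least one instance lies within radius of the concept."""
    return any(math.dist(inst, concept) <= radius for inst in bag)

# Hypothetical bags: each tuple is the feature point of one image sub-block.
positive_bags = [[(1.0, 1.0), (5.0, 5.0)], [(1.1, 0.9), (9.0, 2.0)]]
negative_bags = [[(5.1, 4.9), (8.8, 2.1)]]

concept = learn_concept(positive_bags, negative_bags)
print(concept, is_positive([(1.05, 1.0), (7.0, 7.0)], concept))
```

Only the region shared by both positive bags and absent from the negative bag survives as the concept; a new image is retrieved if any one of its sub-blocks falls near that region.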

Page 20: Multimedia content based retrieval slideshare.ppt

Learning methods in retrieval

The ability to infer high-level understanding from multimedia content has proven to be a difficult goal to achieve.

Example: the category “John eating ice cream”.

Such categories might require sophisticated scene understanding algorithms along with an understanding of the spatio-temporal relationships between entities (for instance, the behavior of eating can be characterized as repeatedly putting something edible in the mouth).

Page 21: Multimedia content based retrieval slideshare.ppt

Structure in multimedia content

To achieve efficiency in content production, and due to the limited number of available resources, standard techniques are employed.

The intention of video making is to represent an action or to evoke emotions using various storytelling methods. Figure 1 gives an analysis of the basic techniques of shot transitions that are used to convey particular intentions.

Page 22: Multimedia content based retrieval slideshare.ppt
Page 23: Multimedia content based retrieval slideshare.ppt

Structure in multimedia content

News has a special structure of ‘begin shot’, ‘newscaster shot’, ‘interview’, ‘weather forecast’, etc., from which a video model of news can be built.

A car-race video has unusual zoom-ins and zoom-outs; basketball has left-panning and right-panning that last for a certain maximum duration.

The motion activity in interesting shots in sports is higher than in the surrounding shots, and so on.

Page 24: Multimedia content based retrieval slideshare.ppt

Future of CBR systems

There is ambiguity in making such conclusions. For example, a dissolve can be due to either a ‘flashback’ or a ‘time lapse’; if the number of dissolves is two, it is most probably a ‘flashback’.

MPEG-7, the “Multimedia Content Description Interface”: specify a standard set of descriptors that can be used to describe various types of multimedia information.

Make a collaborative effort to tag multimedia content.

Page 25: Multimedia content based retrieval slideshare.ppt

Commercial systems – Like.com

Page 26: Multimedia content based retrieval slideshare.ppt

Commercial systems – Like.com

Page 27: Multimedia content based retrieval slideshare.ppt

Commercial systems – Like.com

Page 28: Multimedia content based retrieval slideshare.ppt

Commercial systems – Like.com

Page 29: Multimedia content based retrieval slideshare.ppt

Conclusions

Systematic exploration of the construction of high-level indexes is lacking.

None of the work has considered exploring features close to human perception.

In summary, there is a great need to extract semantic indices to make CBR systems serviceable to the user. Though extracting all such indices might not be possible, there is great scope for furnishing the semantic indices with a certain well-established structure.

Page 30: Multimedia content based retrieval slideshare.ppt

Conclusions

Content-based video indexing and retrieval is an active area of research with continuing contributions from several domains, including image processing, computer vision, database systems, and artificial intelligence.