Pratyusha Koduri Anish Reddy Devireddy Akaash Vankamamidi 1.
1
Video Data Retrieval
Pratyusha Koduri
Anish Reddy Devireddy
Akaash Vankamamidi
2
Outline
Introduction
Problem Statement
Our Contributions
Related Work
Methodology
Comparison
Evaluation
Future Work
Conclusion
References
Introduction
Problem: The amount of video data is growing. Video data takes the form of news video, film archives, surveillance footage, user-generated content, distance learning, video conferencing, medical applications, and sports.
Video data is dynamic.
With the development of multimedia data types and available bandwidth, there is a huge demand for video retrieval systems.
Digital video information can be stored on tapes, CD-ROMs, DVDs, or any similar device.
Goal: Effective video retrieval.
3
4
Problem Statement
All the papers we worked on relate to the retrieval of video data, and to how retrieval can be done on compressed video data.
In content-based video retrieval systems, how do we choose features that reflect real human interest, and how does feature extraction affect video retrieval?
5
Our Contributions
First, we identify video retrieval approaches based on spatial and temporal analysis.
We focus on content-based video retrieval systems and video retrieval in compressed data.
We classify the methods and summarize the future trends and open problems of video retrieval.
6
Related Work
Because the amount of data is large, it is compressed.
Data is retrieved from the compressed data without the processing overhead of decompression.
To index and retrieve semantic data, we use semantic indexing of video data based on generalized n-ary operators.
Dominant regions are used in video indexing and retrieval, which include all types of users.
The main objective is to provide concurrency control for virtual editing of video data among different users.
7
Related Work
A framework for semantic retrieval from a video database.
Each frame of a video clip, characterized by its HSV (hue-saturation-value) color feature, is first projected onto the spatial principal components.
An efficient video retrieval method takes user feedback on the relevance of retrieved videos and iteratively reformulates the input query feature vector (QFV) for improved video retrieval.
8
Key Concepts
QFV reformulation is performed by an optimization method based on the Simultaneous Perturbation Stochastic Approximation (SPSA) technique.
Relevance feedback (RF) is a popular technique in the area of content-based image retrieval (CBIR).
Tracking semantic objects in a video and then modeling spatio-temporal events based on object trajectories and object interactions.
Mining spatio-temporal data.
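The SPSA idea behind QFV reformulation can be illustrated with a minimal sketch. The update rule below is the standard SPSA recursion, but the loss function is an assumption made for illustration only: `feedback_loss` (a hypothetical name) measures how far the query vector lies from the clips a user marked as relevant. The cited work's actual objective is not given in these slides.

```python
import random

def feedback_loss(query, relevant):
    """Hypothetical loss: mean squared distance from the query feature
    vector to the feature vectors the user marked as relevant."""
    total = 0.0
    for vec in relevant:
        total += sum((q - v) ** 2 for q, v in zip(query, vec))
    return total / len(relevant)

def spsa_minimize(loss, theta, iterations=200, a=0.2, c=0.1, seed=0):
    """Minimize loss(theta) with Simultaneous Perturbation Stochastic Approximation."""
    rng = random.Random(seed)
    theta = list(theta)
    for k in range(1, iterations + 1):
        a_k = a / k ** 0.602  # decaying step size (commonly used SPSA exponent)
        c_k = c / k ** 0.101  # decaying perturbation size
        # Perturb every coordinate at once with random +/-1 signs.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c_k * d for t, d in zip(theta, delta)]
        minus = [t - c_k * d for t, d in zip(theta, delta)]
        diff = loss(plus) - loss(minus)
        # Two loss evaluations yield a gradient estimate for all coordinates.
        grad = [diff / (2.0 * c_k * d) for d in delta]
        theta = [t - a_k * g for t, g in zip(theta, grad)]
    return theta

# Usage: pull a poor initial query toward the clips marked relevant.
relevant_clips = [[0.2, 0.4, 0.6], [0.4, 0.6, 0.8]]
new_query = spsa_minimize(lambda q: feedback_loss(q, relevant_clips), [0.9, 0.1, 0.1])
```

SPSA needs only two loss evaluations per iteration regardless of the feature dimension, which is one reason it suits feedback loops where each evaluation is costly.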
9
Video Retrieval
Useful in:
Historical archives
Forensic documents
Fingerprint & DNA matching
Security usage
Retrieval granularity is also important. How do users want to retrieve materials? What is the purpose of retrieval? What is the user's expertise?
10
Content Based Video Retrieval
Content-based video retrieval systems automatically index video material by segmenting it into clips and extracting features such as text, color, texture, and motion from each clip to support search.
As digital video collections become more widely available, content-based video retrieval tools will likely grow in importance for an even wider group of users.
A CBVR system aims to assist a human operator (user) in retrieving a sequence (target) from a potentially large database.
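A minimal sketch of feature-based search over clips, assuming each clip is already reduced to a list of (r, g, b) pixel tuples. The coarse color histogram and the histogram-intersection score are common choices picked here for illustration; the slides do not prescribe a particular feature or similarity measure, and all function names are hypothetical.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized RGB histogram of pixels given as (r, g, b) tuples in 0..255."""
    hist = [0.0] * bins_per_channel ** 3
    for r, g, b in pixels:
        r_bin = r * bins_per_channel // 256
        g_bin = g * bins_per_channel // 256
        b_bin = b * bins_per_channel // 256
        hist[(r_bin * bins_per_channel + g_bin) * bins_per_channel + b_bin] += 1.0
    total = sum(hist) or 1.0
    return [count / total for count in hist]

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means the normalized histograms are identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def rank_clips(query_pixels, database):
    """Return clip names ordered from most to least similar to the query."""
    query = color_histogram(query_pixels)
    scores = {name: histogram_intersection(query, color_histogram(pixels))
              for name, pixels in database.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Usage: a greenish query should rank the green clip first.
database = {
    "grass": [(10, 200, 10)] * 4,
    "sky": [(10, 10, 220)] * 4,
    "mixed": [(10, 200, 10)] * 2 + [(10, 10, 220)] * 2,
}
ranking = rank_clips([(12, 210, 8)] * 4, database)
```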
11
Content Based Video Retrieval
The selection of extracted features plays an important role in content-based video retrieval.
Content-Based Video Indexing and Retrieval (CBVIR) is an extension of the image retrieval problem.
"Content-based" means that the search analyzes the actual content of the video. The term "content" in this context might refer to colors, shapes, or textures.
These systems aim to access video by its content, namely its spatial-temporal information.
12
Methodology
The first step in video-content analysis and in content-based video browsing and retrieval is partitioning a video sequence into shots.
Once key frames are extracted, the next step is to extract features.
Breakdown: Sequence->scene->shot->frame->object
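The shot-partitioning step can be sketched as follows, assuming each frame is a flat list of grayscale values in 0..255. Comparing histograms of consecutive frames against a threshold is one simple, widely used cut detector; the threshold value and the choice of the middle frame as key frame are assumptions for illustration, not the method of any cited paper.

```python
def gray_histogram(frame, bins=8):
    """Normalized intensity histogram of a frame (flat list of values in 0..255)."""
    hist = [0.0] * bins
    for value in frame:
        hist[value * bins // 256] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]

def detect_shot_boundaries(frames, threshold=0.5):
    """Return each index i at which frames[i] starts a new shot."""
    boundaries = []
    previous = None
    for i, frame in enumerate(frames):
        current = gray_histogram(frame)
        if previous is not None:
            # L1 distance between normalized histograms lies in [0, 2].
            distance = sum(abs(a - b) for a, b in zip(previous, current))
            if distance > threshold:
                boundaries.append(i)
        previous = current
    return boundaries

def split_into_shots(frames, threshold=0.5):
    """Cut the frame sequence at the detected boundaries."""
    cuts = [0] + detect_shot_boundaries(frames, threshold) + [len(frames)]
    return [frames[a:b] for a, b in zip(cuts, cuts[1:])]

def key_frame(shot):
    """Pick the middle frame of a shot as its key frame (an assumed heuristic)."""
    return shot[len(shot) // 2]

# Usage: two dark frames followed by two bright frames form two shots.
frames = [[20] * 16, [25] * 16, [230] * 16, [230] * 16]
shots = split_into_shots(frames)
key_frames = [key_frame(shot) for shot in shots]
```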
13
Features
Two types:
Low-level
High-level
Low-level features such as object motion, color, shape, texture, loudness, power spectrum, bandwidth, and pitch are extracted directly from video in the database.
High-level features are also called semantic features. Features such as timbre, rhythm, instruments, and events involve different degrees of semantics contained in the media.
14
Issues
One of the key issues in CBVR is bridging the "semantic gap", which refers to the gap between low-level features and the high-level semantic meaning of content.
Low-level features such as color and texture are easy to measure and compute.
But it is a challenge to connect the low-level features to a semantic meaning, especially one involving intellectual and emotional aspects of the human operator (user).
Another issue is how to efficiently access the rich content of video information; this involves video content and the spatial and temporal analysis of videos.
15
Generalized n-ary relation
The principal component of video data is the spatial/temporal semantics associated with it.
Generalization in both the spatial and temporal domains simplifies the description of complex spatial or temporal events.
For the spatial domain, the operands represent the physical locations of the objects.
In the temporal case, they represent the duration of a certain temporal event.
N-Ary
Spatial event: consider a player holding the ball in a basketball game. A frame contains the event "player holding the ball". This is characterized by six of the n-ary relations in both the x and y coordinates: M, O, C, S, CO, E.
Spatial events can serve as the low-level (fine-grain) indexing mechanism for video data.
A temporal event is an extension of the spatial event "holding a ball" to "passing of a ball between two players".
B is the before n-ary operation, and d(Events) are the durations of the spatial events.
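A minimal sketch of the n-ary before operation (B) on temporal events, under stated assumptions: the `Event` class, its frame-number fields, and the example values are hypothetical, and "before" is read as each event ending strictly before the next one starts. The other six relations named on the slide are not implemented here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    name: str
    start: int  # frame number at which the event begins
    end: int    # frame number at which the event ends

def duration(event):
    """d(Event): length of the event in frames."""
    return event.end - event.start

def before(*events):
    """n-ary 'before': every event ends strictly before the next one starts."""
    return all(a.end < b.start for a, b in zip(events, events[1:]))

# Usage: a pass modeled as one player's hold occurring before another's.
hold_a = Event("player A holding the ball", 0, 30)
hold_b = Event("player B holding the ball", 45, 90)
is_pass_candidate = before(hold_a, hold_b)
```

Because `before` accepts any number of operands, a longer play (for example three consecutive holds) can be checked with a single call rather than a chain of binary comparisons.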
17
Architecture
The system is hierarchical and supports a multi-level indexing and searching mechanism by modeling information at various levels of semantic granularity. It can therefore process content-based queries without processing raw image or video data.
18
Retrieval In Compressed Data
To avoid the processing overhead of decompressing the video stream into individual frames, it is better to detect these features directly from the compressed video data.
Spatio-temporal data extracted from compressed video can include dominant regions, color information, and motion.
Dominant regions are used in video indexing and retrieval; they are extracted from intensity data.
DC Image Data->Quantization->Filtering->Simplified Data->Flat Regions->Watershed Algorithm->Dominant Regions
Retrieval In Compressed Data (continued)
Color information is computed from an HSV quantized table.
Camera motion is detected from region-based segmented data.
Based on the above features, we can extract semantic information about the video content.
This information can be useful in content-based video indexing and retrieval.
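The first stages of the pipeline above can be sketched under stated assumptions. In a block-DCT codec the DC coefficient of each 8x8 block is proportional to the block's mean intensity, so this sketch approximates a DC image by averaging 8x8 tiles of a decoded frame; a real compressed-domain system would read the DC coefficients from the bitstream instead. The bin counts for intensity and HSV quantization are illustrative guesses (the slides do not give the quantization table), and the filtering and watershed stages are not shown.

```python
import colorsys

def dc_image(frame, block=8):
    """Approximate DC image: one mean intensity per block x block tile.
    `frame` is a list of rows of grayscale values in 0..255."""
    rows, cols = len(frame), len(frame[0])
    result = []
    for top in range(0, rows - block + 1, block):
        out_row = []
        for left in range(0, cols - block + 1, block):
            total = sum(frame[y][x]
                        for y in range(top, top + block)
                        for x in range(left, left + block))
            out_row.append(total / (block * block))
        result.append(out_row)
    return result

def quantize(image, levels=4):
    """Map 0..255 intensities onto `levels` coarse levels (assumed bin count)."""
    return [[min(int(v) * levels // 256, levels - 1) for v in row] for row in image]

def quantize_hsv(r, g, b, h_bins=8, s_bins=2, v_bins=2):
    """Quantize an RGB color (0..255 per channel) into coarse HSV bins."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return (min(int(h * h_bins), h_bins - 1),
            min(int(s * s_bins), s_bins - 1),
            min(int(v * v_bins), v_bins - 1))

# Usage: a frame that is dark on the left and bright on the right.
frame = [[0] * 8 + [200] * 8 for _ in range(8)]
coarse = quantize(dc_image(frame))
```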
20
Comparison Study Summary
Key issues we noticed in this study are:
1) Bridging the semantic gap
To annotate automatically or semi-automatically, we need to bridge the "semantic gap", i.e., to find algorithms that infer high-level semantic concepts (sites, objects, events) from low-level image/video features that can be easily extracted from the data (color, texture, shape, structure, etc.).
One sub-problem is Audio Scene Analysis. Researchers have worked on Visual Scene Analysis (Computer Vision) for many years, but Audio Scene Analysis is still in its infancy and remains an under-explored field.
21
Comparison Study Summary
2) Human intelligence and machine intelligence
One advantage of information retrieval is that in most scenarios there is a human (or humans) in the loop. One prominent example of human-computer interaction is Relevance Feedback.
3) New Query Paradigms
For image/video retrieval, people have tried query by keywords, similarity, sketching an object, sketching a trajectory, painting a rough image, etc. Can we think of useful new paradigms?
4) Data Mining
Searching for interesting/unusual patterns and correlations in video has many important applications, including Web search engines and dealing with intelligence data. Work to date on data mining has been mainly on text data.
22
Comparison Study Summary
5) Unlabeled Data
Can we use the large number of unlabeled samples in the database to help?
Another problem related to image/video data annotation is Label Propagation. Can we label a small set of data and let the labels propagate to the unlabeled samples?
6) Incremental Learning
In most applications, we keep adding new data to the database. We should be able to update the parameters of the retrieval algorithms incrementally, without starting from scratch every time new data arrives.
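The label propagation question in item 5 can be made concrete with a minimal sketch. This is one naive scheme chosen for illustration, not a method from the cited papers: labels spread hop by hop to unlabeled samples that lie within an assumed `radius` of an already labeled sample in feature space. The function name, the radius, and the example labels are all hypothetical.

```python
import math

def propagate_labels(features, labels, radius):
    """Spread labels to unlabeled samples (label None) that lie within
    `radius` of a labeled sample; repeat until nothing changes."""
    labels = list(labels)
    changed = True
    while changed:
        changed = False
        updates = {}
        for i, label in enumerate(labels):
            if label is not None:
                continue
            best = None  # (distance, label) of the nearest labeled neighbor
            for j, other in enumerate(labels):
                if other is None:
                    continue
                d = math.dist(features[i], features[j])
                if d <= radius and (best is None or d < best[0]):
                    best = (d, other)
            if best is not None:
                updates[i] = best[1]
        # Apply after the scan so each pass advances labels by one hop.
        for i, label in updates.items():
            labels[i] = label
            changed = True
    return labels

# Usage: two labeled clips seed labels for their nearby neighbors.
features = [[0.0], [0.1], [0.2], [1.0], [1.1], [5.0]]
seed_labels = ["goal", None, None, "foul", None, None]
all_labels = propagate_labels(features, seed_labels, radius=0.15)
```

Samples that are far from every labeled cluster stay unlabeled, which is usually preferable to forcing a label onto an outlier.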
23
Comparison Study Summary
7) Using Virtual Reality Visualization To Help
Can we use 3D audio/visual visualization techniques to help a user navigate through the data space to browse and retrieve?
8) Structuring Very Large Databases
Researchers in audio/visual scene analysis and those in databases and information retrieval should collaborate closely to find good ways of structuring very large video databases for efficient retrieval and search.
24
Comparison Study Summary
9) Applications of Video Retrieval
Few real applications of video retrieval have been accepted by the general public so far. Is the web video search engine going to be the next killer application? It remains to be seen. With no clear answer to this question, it is still a challenge to do research that is appropriate for real applications.
25
Conclusion & future work
Despite the considerable progress of academic research in video retrieval, content-based video retrieval research has had relatively little impact on commercial applications, with some niche exceptions such as video segmentation.
Choosing features that reflect real human interest remains an open issue. One promising approach is to use meta-learning.
Low-to-high-level semantic gap: Visual feature-based techniques at the low level of abstraction, mostly contributed by the signal processing and computer vision communities, have been explored in the literature.
Current research efforts are more inclined towards high-level description and retrieval of visual content.
The techniques that bridge this semantic gap between pixels and predicates are a field of growing interest.
Intelligent systems are needed that take low-level feature representation of the visual media and provide a model for the high-level object representation of the content.
26
References
http://research.microsoft.com/en-us/um/people/yongrui/ps/sigproc06.pdf
Day, Y.F.; Dagtas, S.; Iino, M.; Khokhar, A.; Ghafoor, A., "Spatio-temporal modeling of video data for on-line object-oriented query processing," Multimedia Computing and Systems, 1995., Proceedings of the International Conference on , vol., no., pp.98,105, 15-18 May 1995
Hang-Bong Kang, "Spatio-temporal feature extraction from compressed video data," TENCON 99. Proceedings of the IEEE Region 10 Conference , vol.2, no., pp.1339,1342 vol.2, Dec 1999
Sze-Man Chan, S.; Li, Qing, "VideoMAP*: a Web-based architecture for a spatio-temporal video database management system," Web Information Systems Engineering, 2000. Proceedings of the First International Conference on , vol.1, no., pp.393,400 vol.1, 2000
Xia, J.; Wang, Y., "A spatio-temporal video analysis system for object segmentation," Image and Signal Processing and Analysis, 2003. ISPA 2003. Proceedings of the 3rd International Symposium on , vol.2, no., pp.812,815 Vol.2, 18-20 Sept. 2003
Bo Geng; Hong Lu; Xiangyang Xue, "Incremental Spatio-Temporal Feature Extraction and Retrieval for Large Video Database," Circuits and Systems, 2007. ISCAS 2007. IEEE International Symposium on, vol., no., pp.961,964, 27-30 May 2007
Velusamy, S.; Bhatnagar, S.; Basavaraja, S. V.; Sridhar, V., "SPSA based feature relevance estimation for video retrieval," Multimedia Signal Processing, 2008 IEEE 10th Workshop on , vol., no., pp.598,603, 8-10 Oct. 2008
Xin Chen; Chengcui Zhang, "An Interactive Semantic Video Mining and Retrieval Platform--Application in Transportation Surveillance Video for Incident Detection," Data Mining, 2006. ICDM '06. Sixth International Conference on , vol., no., pp.129,138, 18-22 Dec. 2006
Mehmet Emin Dönderler; Özgür Ulusoy; Ugur Güdükbay, "Rule-based spatiotemporal query processing for video databases," The VLDB Journal - The International Journal on Very Large Data Bases, Volume 13 Issue 1, January 2004, Pages 86-103
Fudong Sun; Minyong Shi; Weiguo Lin, "Feature Label Extraction of Online Video," Computer Science and Electronics Engineering (ICCSEE), 2012 International Conference on , vol.3, no., pp.211,214, 23-25 March 2012
Divakaran, A.; Vetro, A.; Asai, K.; Nishikawa, H., "Video browsing system based on compressed domain feature extraction," Consumer Electronics, IEEE Transactions on , vol.46, no.3, pp.637,644, Aug 2000
Al-Salih, A.A.M.; Ahson, S.I., "Object detection and features extraction in video frames using direct thresholding," Multimedia, Signal Processing and Communication Technologies, 2009. IMPACT '09. International , vol., no., pp.221,224, 14-16 March 2009
Sifei Lu; Li, R.M.; Tjhi, W.-C.; Kee Khoon Lee; Long Wang; Xiaorong Li; Di Ma, "A Framework for Cloud-Based Large-Scale Data Analytics and Visualization: Case Study on Multiscale Climate Data," Cloud Computing Technology and Science (CloudCom), 2011 IEEE Third International Conference on , vol., no., pp.618,622, Nov. 29 2011-Dec. 1 2011
27
Thank you