A Segmentation based Sequential Pattern Matching for Efficient Video Copy Detection


A Segmentation based Sequential Pattern Matching for Efficient Video Copy Detection
Introduction, Motivation, Problem Statement, Aim and Objectives, Literature Survey, System Architecture, Proposed System, Improvements, References

Content

Introduction
Rapid growth of the Internet; ease of acquiring and distributing digital media.

As digital videos can be copied and modified easily, protecting the copyright of digital media has become a matter of concern.

Criteria for selecting a copy-detection algorithm:
Accuracy, measured in terms of false positive and false negative rates

Computational requirements: processing time and storage

The rapid growth of the World Wide Web has allowed netizens to acquire and share digital media in a relatively simple way, thanks to improvements in data transfer and processing capabilities. With the wide use of digital devices such as smartphones and cameras, more and more images and videos are produced by netizens and uploaded to the Internet for community sharing. As digital images can be copied and modified easily, safeguarding the copyright of digital media, especially digital images, has become a matter of concern.

We evaluate the methods by two important, yet potentially contradicting, criteria: first by accuracy, typically measured in terms of false positive and false negative rates or their close counterparts precision and recall; second by computational requirements, i.e. by measuring indicators for usage of main memory, hard disk storage, and processing times for image description and image matching. Given the purpose sketched above, we are especially concerned with the scalability of the detection methods with respect to these measurements. Many studies have focused on test sets with a size in the range of 10,000 to 40,000 images [4, 11-15], but in the context of web search there is clearly a need to use larger test sets.
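As a concrete illustration of the accuracy criterion, the hedged sketch below computes precision and recall from raw true positive, false positive, and false negative counts; the counts and function name are illustrative, not taken from any of the surveyed papers.

    def precision_recall(true_positives: int, false_positives: int, false_negatives: int):
        """Precision: fraction of reported copies that are genuine copies.
        Recall: fraction of genuine copies that were reported."""
        reported = true_positives + false_positives
        actual = true_positives + false_negatives
        precision = true_positives / reported if reported else 0.0
        recall = true_positives / actual if actual else 0.0
        return precision, recall

    # Example: 90 correctly detected copies, 10 false alarms, 30 missed copies.
    p, r = precision_recall(true_positives=90, false_positives=10, false_negatives=30)
    print(f"precision={p:.2f}, recall={r:.2f}")  # precision=0.90, recall=0.75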

Definition: exclusive rights granted by the State for inventions, new and original designs, trademarks, new plant varieties, and artistic and literary works.

Goals for IPR security: detection and retrieval of authentic content; protection of content from fraudulent alterations.

Common types of intellectual property rights include copyrights, trademarks, patents, and industrial design rights.

Intellectual Property Rights
World Intellectual Property Organization (WIPO)

Copyright is a legal concept, enacted by most governments, that grants the creator of an original work exclusive rights to its use and distribution, usually for a limited time.

An interest point is a point in an image which has a well-defined position and can be robustly detected. Local features vs. global features. Types of local features: edges, corners, blobs. Interest points are associated with a significant change of one or more image properties (e.g. intensity, color) and are used to find corresponding points between images, which is very useful for numerous applications.

What is an interest point / feature?
Feature detection is a low-level image processing operation. It is usually performed as the first operation on an image, and it examines every pixel to see if a feature is present at that pixel. In practice, edges are usually defined as sets of points in the image which have a strong gradient magnitude. Furthermore, some common algorithms will then chain high-gradient points together to form a more complete description of an edge. These algorithms usually place some constraints on the properties of an edge, such as shape, smoothness, and gradient value.


Courtesy: Kristen Grauman
Local features: main components
An image gradient is a directional change in the intensity or color in an image. Image gradients may be used to extract information from images.
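As a minimal, hedged sketch of the gradient idea behind edge-like features: the code below estimates the gradient magnitude of a frame with Sobel filters and keeps the strong-gradient points. OpenCV and NumPy are assumed to be available, and the image path and threshold value are purely illustrative.

    import cv2
    import numpy as np

    # Load a frame as grayscale (path is illustrative).
    frame = cv2.imread("keyframe.png", cv2.IMREAD_GRAYSCALE)

    # Horizontal and vertical derivatives via Sobel filters.
    gx = cv2.Sobel(frame, cv2.CV_64F, 1, 0, ksize=3)
    gy = cv2.Sobel(frame, cv2.CV_64F, 0, 1, ksize=3)

    # Gradient magnitude: large values mark strong intensity change (edge-like points).
    magnitude = np.sqrt(gx ** 2 + gy ** 2)

    # Keep points whose gradient magnitude exceeds an illustrative threshold.
    edge_points = np.argwhere(magnitude > 100.0)
    print(f"{len(edge_points)} strong-gradient points found")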

Typical photometric and geometric transformations

Motivation
Main concern: a considerable number of videos are illegal copies or manipulated versions of existing media, making copyright management a complicated process.
Call for change: today's widespread video copyright infringement calls for the development of fast and accurate copy-detection algorithms. As video is the most complex type of digital media, it has so far received the least attention regarding copyright management.
Protect data: content-based copy detection (CBCD) is a promising technique for video monitoring and copyright protection.

Tens of thousands of videos are being uploaded to the Internet and shared every day. A considerable number of these videos are illegal copies or manipulated versions of existing media, making copyright management on the Internet a complicated process. Today's widespread video copyright infringement calls for the development of fast and accurate copy-detection algorithms. As video is the most complex type of digital media, it has so far received the least attention regarding copyright management.

Content-based copy detection (CBCD) has recently emerged as a promising technique for video monitoring and copyright protection.

Problem Statement

To design a copy-detection algorithm that is robust enough to detect severely deformed copies and to localize the copied segment with high accuracy.

Aim and Objectives of the Project
The aim of the project is to provide security for multimedia content and a platform to safeguard the copyright of digital media.

Objectives of video copy detection:

To decide whether a query video segment is a copy of a video from the video data set.

If the system finds a matching video segment, to retrieve the name of the copied video in the video database and the time stamp from which the query segment was copied.

Literature Survey
Fast and Robust Short Video Clip Search for Copy Detection:

Selects ordinal and color features as the signature for the copy-detection task. Designs a global visual feature (a fixed-size 144-d signature) combining spatio-temporal and color range information.

Drawbacks: the temporal order of frames within a video sequence has not been exploited sufficiently; the lack of frame ordering information may make the signatures less distinguishable.

The temporal order of frames within a video sequence has not yet been exploited sufficiently in OPD, and also in CCD. Although the signatures are useful for those applications irrespective of shot order (such as the commercial detection in [13]), the lack of frame ordering information may make the signatures less distinguishable.

The results of a 64-element DCT transform are 1 DC coefficient and 63 AC coefficients. The DC coefficient represents the average color of the 8×8 region. The 63 AC coefficients represent color change across the block. Low-numbered coefficients represent low-frequency color change, or gradual color change across the region. High-numbered coefficients represent high-frequency color change, or color which changes rapidly from one pixel to another within the block. These 64 results are written in a zig-zag order, with the DC coefficient first, followed by the AC coefficients in order of increasing frequency.
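To make the zig-zag ordering concrete, here is a hedged sketch that computes the 2D DCT of an 8×8 block and reads out the DC coefficient followed by the AC coefficients; SciPy and NumPy are assumed, and the block values are random placeholders.

    import numpy as np
    from scipy.fft import dctn

    def zigzag_indices(n: int = 8):
        """Return (row, col) pairs of an n x n block in zig-zag scan order."""
        return sorted(
            ((r, c) for r in range(n) for c in range(n)),
            key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
        )

    # Illustrative 8x8 luminance block.
    block = np.random.default_rng(0).integers(0, 256, size=(8, 8)).astype(float)

    # 2D DCT of the block (type II, orthonormal).
    coeffs = dctn(block, norm="ortho")

    # Zig-zag scan: the first value is the DC coefficient (average intensity),
    # the remaining 63 are AC coefficients of increasing spatial frequency.
    scanned = np.array([coeffs[r, c] for r, c in zigzag_indices()])
    dc, ac = scanned[0], scanned[1:]
    print("DC:", dc, "first 5 AC:", ac[:5])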

Literature Survey
Content-Based Video Copy Detection Using Discrete Wavelet Transform:

The Daubechies wavelet transform is used to obtain a feature descriptor from video frames. The wavelet coefficients of all frames of the same segment are extracted, and the variance and mean of the coefficients are estimated to describe each segment of a video. Global feature-based approaches are simple and efficient for retrieval in large databases.

Drawbacks: performance on detecting copies with complicated transformations (e.g. cropping, shifting, and camcording) is not satisfactory.

Absence of sequence matching using the video's temporal information.

A representative feature vector is created for each segment by taking the mean of 25 consecutive feature vectors (here, the frame rate of the video is 25 frames per second and the duration of each segment is 1 second). Thus, each video in the reference video database is represented as a cluster consisting of a set of representative feature vectors, whose number equals the duration of the video in seconds. To facilitate the similarity search, a cluster centre is determined for each cluster by calculating the mean of the representative feature vectors of that cluster.
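A hedged sketch of this per-second averaging and cluster-centre computation, assuming the per-frame features are already stacked in a NumPy array at 25 fps; the array contents and dimensionality are illustrative.

    import numpy as np

    FPS = 25  # frames per second, as assumed in the summary above

    # Illustrative stack of per-frame feature vectors: (num_frames, dim).
    frame_features = np.random.default_rng(1).random((250, 64))

    # One representative vector per 1-second segment = mean of 25 consecutive frame vectors.
    num_segments = len(frame_features) // FPS
    representatives = frame_features[: num_segments * FPS].reshape(num_segments, FPS, -1).mean(axis=1)

    # Cluster centre for the whole video = mean of its representative vectors,
    # used to narrow down candidate videos before the finer per-segment comparison.
    cluster_centre = representatives.mean(axis=0)
    print(representatives.shape, cluster_centre.shape)  # (10, 64) (64,)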

Literature Survey
Fast and Accurate Content-Based Video Copy Detection Using Bag-of-Global Visual Features:
Uses the bag-of-visual-words (BoVW) framework; multiple assignment in the temporal domain; multiple assignment in the spatial domain; a DCT-sign-based feature vector (a 2D DCT is performed and the top-v AC coefficients in zig-zag scan order are used as the feature vector); segment-level matching by inverted-index lookups. Two segments are matched if and only if they share the same visual word(s) in at least one block (see the sketch below).
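A hedged sketch of the segment-level matching by inverted-index lookups described above. The visual-word assignments are made up for illustration, and "share the same visual word in at least one block" is interpreted here as the same word appearing in the same block position of both segments.

    from collections import defaultdict

    # Reference data: segment id -> {block id: visual word id} (illustrative values).
    reference_segments = {
        "ref_seg_0": {0: 17, 1: 42, 2: 17, 3: 8},
        "ref_seg_1": {0: 99, 1: 42, 2: 5, 3: 63},
    }

    # Inverted index: visual word id -> set of (reference segment id, block id).
    index = defaultdict(set)
    for seg_id, blocks in reference_segments.items():
        for block_id, vw in blocks.items():
            index[vw].add((seg_id, block_id))

    def matching_segments(query_blocks):
        """A query segment matches a reference segment iff they share the same
        visual word in at least one corresponding block."""
        matches = set()
        for block_id, vw in query_blocks.items():
            for ref_seg, ref_block in index.get(vw, ()):
                if ref_block == block_id:
                    matches.add(ref_seg)
        return matches

    # Both illustrative reference segments match this query segment.
    print(matching_segments({0: 17, 1: 7, 2: 3, 3: 63}))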

Drawbacks: the BoVW model relies on a simple count of visual word occurrences in the images, so any spatial relations between words are lost (using spatial information helps discriminate visual words and therefore improves precision). It is also inherently difficult for global features to handle geometric alterations (rotation, flip, scale, etc.).

The bag-of-visual-words (BoVW) framework, which is usually used for local features, is adapted to global features: multiple global features are extracted from predefined windows in a keyframe. The top-left corner entry of the DCT result, with its rather large magnitude, is the DC coefficient; the remaining 63 coefficients are the AC coefficients. The advantage of the DCT is its tendency to aggregate most of the signal in one corner of the result. Reference and query video clips are divided into short segments with fixed durations in the temporal domain. From each segment, a fixed number of frames is subsampled at a uniform interval (multiple assignment in the temporal domain). These subsampled frames are then divided into blocks (multiple assignment in the spatial domain). Finally, feature vectors are extracted from these blocks: a DCT-sign-based feature, where a 2D DCT is performed and the top-v AC coefficients in zig-zag scan order are used as the feature vector.

Architecture for Video Copy Detection
Main components: change-based threshold for video segmentation; feature extraction with SIFT from keyframes; similarity-based matching between SIFT feature point sets; graph-based video sequence matching.
Evaluation criteria: copy location accuracy; computational time cost; recall and precision.

Architecture for Video Copy Detection

1. Change-based threshold for video segmentation
2. Feature extraction with Binary SIFT from keyframes
3. Similarity-based matching between SIFT feature point sets
4. Graph-based video sequence matching

1. Change-based Threshold for Video Segmentation
(Figure: a reference video clip along the time direction is cut by the threshold Th into Segment 1 and Segment 2.)
The method cuts continuous video frames into video segments by eliminating the temporal redundancy of the visual information of consecutive frames.
Threshold for detecting abrupt changes in the visual information of frames: Th = μ + α·σ, where μ and σ are the mean and standard deviation of the difference values between consecutive frames, and α is suggested to be between 5 and 6.
Threshold for detecting gradual changes in the visual information of frames: Tl = b · Th, where b is selected from the range 0.1-0.5.
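A hedged sketch of this dual-threshold segmentation, assuming the per-frame difference values are already computed as a 1-D array; the accumulation rule for gradual changes is a simplification for illustration, not the paper's exact procedure.

    import numpy as np

    def segment_boundaries(frame_diffs, alpha=5.0, b=0.3):
        """Return frame indices where a new segment starts, using
        Th = mu + alpha * sigma (abrupt change) and Tl = b * Th (gradual change)."""
        mu, sigma = frame_diffs.mean(), frame_diffs.std()
        th = mu + alpha * sigma      # abrupt-change threshold
        tl = b * th                  # gradual-change threshold
        cuts = [0]
        accumulated = 0.0
        for i, d in enumerate(frame_diffs[1:], start=1):
            if d > th:               # abrupt change: cut immediately
                cuts.append(i)
                accumulated = 0.0
            elif d > tl:             # gradual change: accumulate until it looks abrupt
                accumulated += d
                if accumulated > th:
                    cuts.append(i)
                    accumulated = 0.0
            else:
                accumulated = 0.0
        return cuts

    # Illustrative frame-difference values.
    diffs = np.abs(np.random.default_rng(2).normal(size=500))
    print(segment_boundaries(diffs))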

2. Feature Extraction with SIFT from Keyframes
(Figure: a keyframe is selected from Segment 1 and from Segment 2.)
SIFT detection:
1. Find scale-space extrema.
2. Keypoint localization and filtering: improve keypoints, throw out bad ones.

SIFT description:
3. Orientation assignment: remove the effects of rotation and scale.
4. Create the descriptor: using histograms of orientations.
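A minimal sketch of SIFT keypoint detection and description using OpenCV; it assumes opencv-python 4.4 or later, where SIFT is available as cv2.SIFT_create, and the file name is illustrative.

    import cv2

    # Load a keyframe in grayscale (path is illustrative).
    keyframe = cv2.imread("keyframe.png", cv2.IMREAD_GRAYSCALE)

    # detectAndCompute covers all four steps: scale-space extrema detection,
    # keypoint localization/filtering, orientation assignment, descriptor creation.
    sift = cv2.SIFT_create()
    keypoints, descriptors = sift.detectAndCompute(keyframe, None)

    # Each descriptor is a 128-dimensional histogram of gradient orientations.
    print(len(keypoints), descriptors.shape)  # e.g. 450 (450, 128)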

3. Similarity-based Matching between SIFT Feature Sets

For each Binary SIFT descriptor extracted, compute its nearest neighbor in the dictionary.

Cluster the set of descriptors (using k-means, for example) into k clusters. The cluster centers act as the dictionary's visual words.

Given a test feature (Binary SIFT), hierarchical k-NN search is used to find the nearest visual word.
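A hedged sketch of the dictionary-building and quantization steps, using plain k-means and a flat nearest-neighbour search over float SIFT descriptors; the hierarchical k-NN over Binary SIFT used in the paper is replaced here by this simpler stand-in, and all data are random placeholders.

    import numpy as np
    from scipy.cluster.vq import kmeans2, vq

    # Illustrative training descriptors: (num_descriptors, 128) SIFT vectors.
    train_desc = np.random.default_rng(3).random((2000, 128))

    # Build the visual-word dictionary: the k cluster centers act as visual words.
    k = 64
    codebook, _ = kmeans2(train_desc, k, minit="points")

    # Quantize query descriptors: each one is assigned the id of its nearest word.
    query_desc = np.random.default_rng(4).random((300, 128))
    word_ids, _ = vq(query_desc, codebook)
    print(word_ids[:10])  # visual-word index per query descriptor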

4. Graph-based Video Sequence Matching

Time direction consistency: for two matched pairs Mij and Mlm, if (i - l) * (j - m) > 0, then Mij and Mlm satisfy time direction consistency (the ordering of the query-side frames agrees with the ordering of the reference-side frames).

Time jump degree: for Mij and Mlm, a time jump degree t is defined between them (the defining formula is not reproduced in this transcript).

If the following two conditions are satisfied, there exists an edge between two vertices:
1. The two vertices satisfy time direction consistency.

2. The time jump degree t < T (T is a preset threshold).
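A hedged sketch of this edge rule. The time direction consistency test follows the condition above; the time jump degree is an assumed definition (the difference between the query-side and reference-side frame gaps), since the transcript does not reproduce the exact formula, and the matched pairs are made up for illustration.

    from itertools import combinations

    # Each matched pair (i, j) links query keyframe i to reference keyframe j.
    matches = [(0, 10), (1, 11), (2, 15), (3, 12)]

    T = 3  # preset threshold on the time jump degree

    def time_jump_degree(a, b):
        """Assumed definition (not from the slides): difference between the
        query-side gap and the reference-side gap of two matched pairs."""
        (i, j), (l, m) = a, b
        return abs(abs(i - l) - abs(j - m))

    edges = []
    for a, b in combinations(matches, 2):
        (i, j), (l, m) = a, b
        direction_ok = (i - l) * (j - m) > 0   # time direction consistency
        if direction_ok and time_jump_degree(a, b) < T:
            edges.append((a, b))

    print(edges)  # vertex pairs connected in the matching-result graph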

Evaluation criteria:
Copy location accuracy: this measure aims to assess the accuracy of finding the exact extent of the copy in the reference video.
Precision: what percent of your predictions were correct?
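One common way to make copy location accuracy concrete is the temporal overlap (intersection over union) between the predicted copy segment and the ground-truth segment; the hedged sketch below illustrates that idea and is not necessarily the exact measure used in the paper.

    def temporal_iou(pred, truth):
        """Intersection-over-union of two (start, end) time intervals in seconds."""
        inter = max(0.0, min(pred[1], truth[1]) - max(pred[0], truth[0]))
        union = (pred[1] - pred[0]) + (truth[1] - truth[0]) - inter
        return inter / union if union > 0 else 0.0

    # Predicted copy at 12-30 s vs. ground truth at 10-28 s.
    print(f"location accuracy (IoU): {temporal_iou((12.0, 30.0), (10.0, 28.0)):.2f}")  # 0.80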

Improvements in the Existing System
Segmentation based on dual threshold:
Basic problem: matching based on the SIFT descriptor is computationally expensive because of the large number of points and the descriptor's high dimensionality. To reduce the computational complexity, the dual-threshold method is used to segment the videos into segments with homogeneous content and extract keyframes from each segment.

Binary SIFT descriptor for feature matching:
Basic problem: global or local descriptor? SIFT not only has good tolerance to scale changes, illumination variations, and image rotations, but is also robust to changes of viewpoint, additive noise, logo insertion, shifting or cropping, and complicated edits. Compared with methods based on global descriptors, methods based on the local SIFT descriptor have better detection performance. The memory cost of Binary SIFT is low, making it feasible to store the whole Binary SIFT in the index list.

Graph-based video sequence matching:
Basic problems: hard thresholds? Exhaustive search? To resolve these problems, the graph-based video sequence matching method offers high accuracy in locating copies, reduced detection time cost, and the ability to simultaneously locate more than one copy between two compared video sequences.

References
A Segmentation and Graph-Based Video Sequence Matching Method for Video Copy Detection, Hong Liu, Hong Lu, and Xiangyang Xue, IEEE Transactions on Knowledge and Data Engineering, Vol. 25, No. 8, August 2013.

Visual Word Expansion and BSIFT Verification for Large-Scale Image Search, Wengang Zhou, Houqiang Li, Yijuan Lu, Meng Wang, and Qi Tian, Springer, 2013.

Content-Based Video Copy Detection Using Discrete Wavelet Transform, Gitto George Thampi and D. Abraham Chandy, Proceedings of the IEEE Conference on Information and Communication Technologies, 2013.

Fast and Accurate Content-Based Video Copy Detection Using Bag-of-Global Visual Features, Yusuke Uchida, Koichi Takagi, and Shigeyuki Sakazawa, IEEE, 2012.

Fast and Robust Short Video Clip Search for Copy Detection, Junsong Yuan, Ling-Yu Duan, Qi Tian, Surendra Ranganath, and Changsheng Xu, 2005.

Distinctive Image Features from Scale-Invariant Keypoints, D. G. Lowe, International Journal of Computer Vision, Vol. 60, No. 2, pp. 91-110, 2004.