Analysis of visual similarity in news videos with robust and memory efficient image retrieval
-
Upload
mediamixercommunity -
Category
Technology
-
view
485 -
download
1
description
Transcript of Analysis of visual similarity in news videos with robust and memory efficient image retrieval
![Page 1: Analysis of visual similarity in news videos with robust and memory efficient image retrieval](https://reader036.fdocuments.net/reader036/viewer/2022081401/55911ab51a28ab8b758b46ce/html5/thumbnails/1.jpg)
11Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
Analysis of Visual Similarity in News Videos Analysis of Visual Similarity in News Videos with Robust and Memorywith Robust and Memory--Efficient Efficient
Image RetrievalImage Retrieval
David Chen, Peter Vajda, Sam Tsai, Maryam Daneshi, Matt Yu, Huizhong Chen, Andre Araujo, Bernd Girod
Image, Video, and Multimedia Systems GroupStanford University
![Page 2: Analysis of visual similarity in news videos with robust and memory efficient image retrieval](https://reader036.fdocuments.net/reader036/viewer/2022081401/55911ab51a28ab8b758b46ce/html5/thumbnails/2.jpg)
22Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
![Page 3: Analysis of visual similarity in news videos with robust and memory efficient image retrieval](https://reader036.fdocuments.net/reader036/viewer/2022081401/55911ab51a28ab8b758b46ce/html5/thumbnails/3.jpg)
33Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
![Page 4: Analysis of visual similarity in news videos with robust and memory efficient image retrieval](https://reader036.fdocuments.net/reader036/viewer/2022081401/55911ab51a28ab8b758b46ce/html5/thumbnails/4.jpg)
44Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
![Page 5: Analysis of visual similarity in news videos with robust and memory efficient image retrieval](https://reader036.fdocuments.net/reader036/viewer/2022081401/55911ab51a28ab8b758b46ce/html5/thumbnails/5.jpg)
55Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
Plays 30 second clip around query phrase match
Would benefit from accurate segmentation of stories
Would benefit from reliable generation of summary clips
![Page 6: Analysis of visual similarity in news videos with robust and memory efficient image retrieval](https://reader036.fdocuments.net/reader036/viewer/2022081401/55911ab51a28ab8b758b46ce/html5/thumbnails/6.jpg)
66Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
Applications of Anchor DetectionApplications of Anchor Detection1. Provide strong cues for story segmentation
2. Extract news story summaries/previews
3. Identify anchors for general person recognition
TURNING TO TECH, SHARES OF RESEARCH IN MOTION REBOUNDED FROM A ONE MONTH LOW. THE COMPANY'S NEXT GENERATION BLACKBERRY-10 PRODUCT LINE IS EXPECTED TO BE UNVEILED IN JUST A FEW WEEKS. YOU MAY REMEMBER SHARES SOLD OFF LAST WEEK AFTER THE COMPANY ISSUED A CAUTIOUS OUTLOOK FOR ITS FOURTH QUARTER RESULTS. BUT TODAY SHARES BOUNCED BACK: UP 11.5% TO A UNDER $12.
Anchor Brian Williams
Anchor Susie Gharib
Don’t confuse anchors with other people in the videos
![Page 7: Analysis of visual similarity in news videos with robust and memory efficient image retrieval](https://reader036.fdocuments.net/reader036/viewer/2022081401/55911ab51a28ab8b758b46ce/html5/thumbnails/7.jpg)
77Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
Applications of Preview MatchingApplications of Preview Matching1. Provide strong cues for story segmentation
2. Extract news story summaries/previews
3. Indicate the most important stories in a broadcast
JUST A MESS. IN WASHINGTON, LAWMAKERS LEAVE TOWN FOR THE HOLIDAYS. THE CLOCK TICKS DOWN TO THE SO-CALLED FISCAL CLIFF. LATE TODAY, THE PRESIDENT HASTILY APPEARS TO ASK IF SOME OF THIS BUSINESS CAN BE FINISHED SOON.
![Page 8: Analysis of visual similarity in news videos with robust and memory efficient image retrieval](https://reader036.fdocuments.net/reader036/viewer/2022081401/55911ab51a28ab8b758b46ce/html5/thumbnails/8.jpg)
88Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
OutlineOutline• Related work in news video analysis• Long-range visual similarity• Anchor detection algorithm• Preview matching algorithm• Experimental results
![Page 9: Analysis of visual similarity in news videos with robust and memory efficient image retrieval](https://reader036.fdocuments.net/reader036/viewer/2022081401/55911ab51a28ab8b758b46ce/html5/thumbnails/9.jpg)
99Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
Related Work in News Video AnalysisRelated Work in News Video Analysis• Model-based anchor detection
[Zhang et al., 1998] [Hanjalic et al., 1998] [Liu et al., 2000]
• Model-free anchor detection[Gao et al., 2002] [De Santo et al., 2006] [D’Anna et al., 2007] [Ma et al., 2008] [Broilo et al., 2011]
• Spatio-temporal slices for reporter detection[Liu et al., 2007] [Zheng et al., 2010]
• Classification of news video shots[Bertini et al., 2001] [Xiao et al., 2010] [Lee et al., 2011]
![Page 10: Analysis of visual similarity in news videos with robust and memory efficient image retrieval](https://reader036.fdocuments.net/reader036/viewer/2022081401/55911ab51a28ab8b758b46ce/html5/thumbnails/10.jpg)
1010Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
LongLong--Range Visual SimilarityRange Visual Similarity
Frame Number
Fram
e N
umbe
r
1 501 1001 1501 2001 2501 3001
1
501
1001
1501
2001
2501
3001
0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
0.4
0.45
0.5
![Page 11: Analysis of visual similarity in news videos with robust and memory efficient image retrieval](https://reader036.fdocuments.net/reader036/viewer/2022081401/55911ab51a28ab8b758b46ce/html5/thumbnails/11.jpg)
1111Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
LongLong--Range Visual SimilarityRange Visual Similarity
Frame Number
Fram
e N
umbe
r
1 501 1001 1501 2001 2501 3001 3501
1
501
1001
1501
2001
2501
3001
3501 0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
0.4
0.45
0.5
What causes these long-range visual similarities?What causes these long-range visual similarities?
![Page 12: Analysis of visual similarity in news videos with robust and memory efficient image retrieval](https://reader036.fdocuments.net/reader036/viewer/2022081401/55911ab51a28ab8b758b46ce/html5/thumbnails/12.jpg)
1212Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
LongLong--Range Visual SimilarityRange Visual Similarity
NBC Nightly News on Dec. 21, 2012
![Page 13: Analysis of visual similarity in news videos with robust and memory efficient image retrieval](https://reader036.fdocuments.net/reader036/viewer/2022081401/55911ab51a28ab8b758b46ce/html5/thumbnails/13.jpg)
1313Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
LongLong--Range Visual SimilarityRange Visual SimilarityAnchor: Brian
Williams
Analyst: David
Gregory
Reporter: Kelly
O’Donnell
Reporter: Andrea Mitchell
![Page 14: Analysis of visual similarity in news videos with robust and memory efficient image retrieval](https://reader036.fdocuments.net/reader036/viewer/2022081401/55911ab51a28ab8b758b46ce/html5/thumbnails/14.jpg)
1414Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
LongLong--Range Visual SimilarityRange Visual Similarity
![Page 15: Analysis of visual similarity in news videos with robust and memory efficient image retrieval](https://reader036.fdocuments.net/reader036/viewer/2022081401/55911ab51a28ab8b758b46ce/html5/thumbnails/15.jpg)
1515Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
Anchor Detection PipelineAnchor Detection Pipeline
Exclude Frames
Without Faces
Extract Image Signatures
Compare Image
Signatures
Form Initial Anchor
Candidates
Prune Away False
Candidates
Include Temporally
Nearby Candidates
Keyframes
Detections
Similarity Matrix
Count number of long-range local peaks in the current row of the similarity matrix and pick initial
candidates from high-count rows
Compare initial candidates to one another and prune out candidates which are not very similar to
the other initial candidates
From pruned set of candidates, expand to include temporally nearby candidates which are also very
similar in appearance
![Page 16: Analysis of visual similarity in news videos with robust and memory efficient image retrieval](https://reader036.fdocuments.net/reader036/viewer/2022081401/55911ab51a28ab8b758b46ce/html5/thumbnails/16.jpg)
1616Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
IntraIntra--Episode vs. InterEpisode vs. Inter--EpisodeEpisode• Intra-episode: compare frames within a single
episode of a news program• Inter-episode: compare frames between different
episodes of a news program
![Page 17: Analysis of visual similarity in news videos with robust and memory efficient image retrieval](https://reader036.fdocuments.net/reader036/viewer/2022081401/55911ab51a28ab8b758b46ce/html5/thumbnails/17.jpg)
1717Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
Preview Matching PipelinePreview Matching Pipeline
Detect and Recognize
Text
Adaptively Crop to Preview Region
Extract Image Signature
Compare Image
Signatures
Verify Geometry in
Shortlist
JUST A MESS
JUST A MESS
COMING UP
COMING UP
Database of Image Signatures
Frame
Matches
![Page 18: Analysis of visual similarity in news videos with robust and memory efficient image retrieval](https://reader036.fdocuments.net/reader036/viewer/2022081401/55911ab51a28ab8b758b46ce/html5/thumbnails/18.jpg)
1818Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
REVV: Residual Enhanced Visual VectorREVV: Residual Enhanced Visual Vector
Extract Local Features
Vector Quantize to
Visual Words
Visual Codebook
Perform Mean Aggregation of Residuals
Query Image
……
Regularize with Power
Law
Reduce Dimensions
by LDA
Binarize Components
from Sign
Compute Weighted
Correlations
Database Signatures
Ranked List1.741.751.791.801.831.84
…
![Page 19: Analysis of visual similarity in news videos with robust and memory efficient image retrieval](https://reader036.fdocuments.net/reader036/viewer/2022081401/55911ab51a28ab8b758b46ce/html5/thumbnails/19.jpg)
1919Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
Experimental SetupExperimental Setup• Anchor detection
– Training on 12 episodes of NBC Nightly News (1 anchor/episode), ABC World News (1 anchor/episode), Nightly Business Report (2 anchors/episode)
– Testing on 21 episodes of same three programs– Measure precision / recall / F-score
• Preview matching– Testing on 10 episodes of NBC Nightly News and ABC
World News– Measure precision / recall / F-score
• Comparison of two memory-efficient signatures– GIST: 66 MB/episode [Oliva et al., 2001] [Douze et al., 2009]– REVV: 10 MB/episode [Chen et al., 2013]
![Page 20: Analysis of visual similarity in news videos with robust and memory efficient image retrieval](https://reader036.fdocuments.net/reader036/viewer/2022081401/55911ab51a28ab8b758b46ce/html5/thumbnails/20.jpg)
2020Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
Anchor Detection ResultsAnchor Detection Results
Recall Precision F-ScoreGIST Intra 0.53 0.84 0.65REVV Intra 0.87 0.90 0.88
![Page 21: Analysis of visual similarity in news videos with robust and memory efficient image retrieval](https://reader036.fdocuments.net/reader036/viewer/2022081401/55911ab51a28ab8b758b46ce/html5/thumbnails/21.jpg)
2121Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
Anchor Detection ResultsAnchor Detection Results
Recall Precision F-ScoreREVV Intra 0.87 0.90 0.88
REVV Intra + Inter 0.90 0.91 0.90
![Page 22: Analysis of visual similarity in news videos with robust and memory efficient image retrieval](https://reader036.fdocuments.net/reader036/viewer/2022081401/55911ab51a28ab8b758b46ce/html5/thumbnails/22.jpg)
2222Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
Preview Matching ResultsPreview Matching Results
Recall Precision F-ScoreGIST 0.48 1.00 0.65REVV 0.90 1.00 0.95
Type A: Preview occurs at beginning of broadcast
![Page 23: Analysis of visual similarity in news videos with robust and memory efficient image retrieval](https://reader036.fdocuments.net/reader036/viewer/2022081401/55911ab51a28ab8b758b46ce/html5/thumbnails/23.jpg)
2323Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
Preview Matching ResultsPreview Matching Results
Recall Precision F-ScoreGIST 0.62 1.00 0.77REVV 0.93 1.00 0.96
Type B: Preview occurs prior to a commercial
![Page 24: Analysis of visual similarity in news videos with robust and memory efficient image retrieval](https://reader036.fdocuments.net/reader036/viewer/2022081401/55911ab51a28ab8b758b46ce/html5/thumbnails/24.jpg)
2424Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
ConclusionsConclusions• Long-range visual similarity in news videos provides
a general and effective method for anchor detection and preview matching
• A robust image signature is required to handle challenging appearance variations throughout a newscast
• The image signature should be memory-efficient to enable parallelized processing of large video archives
![Page 25: Analysis of visual similarity in news videos with robust and memory efficient image retrieval](https://reader036.fdocuments.net/reader036/viewer/2022081401/55911ab51a28ab8b758b46ce/html5/thumbnails/25.jpg)
Thank YouThank YouThank [email protected]@stanford.edu