Semantics And Multimedia
![Page 1: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/1.jpg)
Advances in Semantic Analysis of Multimedia
Dr. Gerald Friedland
International Computer Science Institute, Berkeley
[email protected]
![Page 2: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/2.jpg)
The Internet Today
![Page 3: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/3.jpg)
Internet Use Today
Raphaël Troncy: Linked Media: Weaving non-textual content into the Semantic Web, MozCamp, 03/2009.
![Page 4: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/4.jpg)
Types of Videos
![Page 5: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/5.jpg)
Addressable Market for Enterprise Video Applications
• Security: $1.2 Billion (Total Market $7.8B, 2005; Source: JP Freeman) ($7B in 06; Source: Lehman)
• Asset Tracking: $480m by 2010 (RFID in 2006 2.4B) (Total Asset protection $14.7B) (Source: Lehman report 2006)
• QA/Operational Efficiency: $700m (Source: Envysion, Arrowsight, corporate analysis)
• Training: $600m (Source: Forrester Enterprise Software report 2005)
• Compliance: $450m (Source: JP Freeman)
• BI: $400m (Reporting and Analysis 4B) (Total BI market $13.3B) (Source: IDC BI tools 03-08)
• Intelligent Marketing: $200m (Source: T3CI corporate analysis)
• Government (Intelligence, Defense, Homeland Security)
$4.0 Billion Commercially
![Page 6: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/6.jpg)
Multimedia Capabilities: 1985
• Record
• Store
• Play
• Random Seek
• Annotate Manually
![Page 7: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/7.jpg)
Multimedia Capabilities: 2009
• Record
• Store
• Stream
• Play
• Random Seek
• Annotate Manually
![Page 8: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/8.jpg)
Multimedia Capabilities: Wanted
• Semantic Navigation
• Search
• Content Compare
• Object Cut & Paste
• Annotate Automatically
• Infer over Content
=> Make multimedia “understandable” for computers.
![Page 9: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/9.jpg)
Problems
• Multimedia data is very dense, so manual annotation is not feasible.
• Multimedia content analysis is difficult and rarely good enough to create reliable products.
![Page 10: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/10.jpg)
My Research...
Features
Recognition
Understanding
Filtering
Machine Learning
Context
Audio Images Video Text
Semantic Computing
Artificial Intelligence
Signal/Text Processing
Knowledge Network
Semantic Web
![Page 11: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/11.jpg)
My Research...
Hypotheses:
• Multimedia content analysis works better when every cue is taken into account (e.g. video AND audio).
• Semantics is enabled through context. Converts AI research into products.
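The first hypothesis can be illustrated with a small late-fusion sketch. It is only an illustration under assumptions: the slides do not say how cues are combined, so the weighted log-linear rule, the function name `fuse_scores`, and the `audio_weight` parameter are all hypothetical.

```python
import math

def fuse_scores(audio_probs, video_probs, audio_weight=0.5):
    """Combine per-class probabilities from two modalities.

    Assumption: a weighted log-linear (geometric-mean style) fusion rule.
    This is one common choice, not the method used in the talk.
    """
    eps = 1e-12  # avoids log(0) when a modality rules a class out entirely
    fused = [
        math.exp(audio_weight * math.log(a + eps)
                 + (1.0 - audio_weight) * math.log(v + eps))
        for a, v in zip(audio_probs, video_probs)
    ]
    total = sum(fused)
    return [f / total for f in fused]  # renormalize so the scores sum to 1
```

In this toy setting, an audio classifier that cannot tell class 0 from class 1 and a video classifier that cannot tell class 0 from class 2 still agree on class 0 once their scores are fused.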
![Page 12: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/12.jpg)
Context
Sources of Context:
• Inclusion of prior knowledge
• Combination of algorithms
• Multimodality:
  – audio+video+...
  – extra hardware
• Human interaction
• ...
![Page 13: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/13.jpg)
Context as Key: Example 1
Visual Object Extraction
Cut
Paste
Horse
Meadow^V
![Page 14: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/14.jpg)
Simple Interactive Object Extraction (SIOX)
Image → User Input → Output
Context delivered by human interaction
![Page 15: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/15.jpg)
SIOX: Algorithm Idea
Color Signatures from image retrieval:
Y. Rubner, C. Tomasi, and L. J. Guibas: The Earth Mover’s Distance as a Metric for Image Retrieval. Int. Journal of Computer Vision, 40(2):99–121, 2000.
Idea: Instead of searching an image database, use Color Signatures to search inside an image.
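A minimal sketch of this idea, under stated assumptions: colors at user-marked foreground and background positions are each summarized as a small set of representative colors (a color signature), and every pixel is then assigned to whichever signature holds the closest color. The coarse RGB binning, the Euclidean distance, and all names (`color_signature`, `segment`, `bin_size`) are illustrative choices, not the published SIOX algorithm.

```python
import math
from collections import defaultdict

def color_signature(colors, bin_size=32):
    """Summarize colors as the mean color of each occupied coarse RGB bin.

    Assumption: grid binning stands in for the clustering step; it is a
    simplification chosen for brevity.
    """
    bins = defaultdict(list)
    for color in colors:
        key = tuple(channel // bin_size for channel in color)
        bins[key].append(color)
    return [
        tuple(sum(channel) / len(members) for channel in zip(*members))
        for members in bins.values()
    ]

def distance_to_signature(color, signature):
    """Distance from a color to the nearest representative in a signature."""
    return min(math.dist(color, rep) for rep in signature)

def segment(image, fg_seeds, bg_seeds):
    """Classify each pixel as foreground (True) or background (False).

    image: rows of (r, g, b) tuples.
    fg_seeds, bg_seeds: (row, col) positions marked by the user.
    """
    fg_sig = color_signature([image[r][c] for r, c in fg_seeds])
    bg_sig = color_signature([image[r][c] for r, c in bg_seeds])
    return [
        [distance_to_signature(px, fg_sig) < distance_to_signature(px, bg_sig)
         for px in row]
        for row in image
    ]
```

The user input supplies the context here: a handful of marked pixels is enough to build both signatures, and the rest of the image is searched against them.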
![Page 16: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/16.jpg)
SIOX in GIMP
SIOX Button
G. Friedland, K. Jantz, T. Lenz, F. Wiesel, R. Rojas: “Object Cut and Paste in Images and Videos”, International Journal of Semantic Computing Vol 1, No 2, pp. 221-247, World Scientific, USA, June 2007.
![Page 17: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/17.jpg)
SIOX in Inkscape
![Page 18: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/18.jpg)
SIOX in Blender
![Page 19: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/19.jpg)
Extensions
Extracting multiple similar objects at once:
![Page 20: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/20.jpg)
Sub-Pixel Refinement
Problem: Spill colors and foreground disappearance
[Figure panels: Original, SIOX, GraphCut]
![Page 21: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/21.jpg)
Sub-Pixel Refinement
Detail Refinement Brush: Coarse Interaction
![Page 22: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/22.jpg)
VideoSIOX
1st Frame:
Subsequent Frames:
![Page 24: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/24.jpg)
Shoesurfer
![Page 25: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/25.jpg)
Shoesurfer
![Page 26: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/26.jpg)
Shoesurfer
![Page 27: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/27.jpg)
Shoesurfer
![Page 28: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/28.jpg)
Shoesurfer
![Page 29: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/29.jpg)
Context as Key: Example 2
![Page 30: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/30.jpg)
Speaker Diarization: Who Spoke When?
Audiotrack:
Segmentation:
Clustering:
G. Friedland, O. Vinyals, Y. Huang, C. Müller: “Prosodic and other Long-Term Features for Speaker Diarization”, IEEE Transactions on Audio, Speech, and Language Processing, Vol 17, No 5, pp. 985-993, July 2009.
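The two stages named on this slide, segmentation and clustering, can be sketched in a toy form. This is a sketch under assumptions, not the system from the cited paper: the track is cut into fixed-length segments, each cluster is represented by the mean of its feature vectors, and the closest clusters are merged until their distance exceeds a threshold. The names `diarize`, `seg_len`, and `threshold` are hypothetical.

```python
import math

def mean_vector(frames):
    """Per-dimension mean of a list of equal-length feature vectors."""
    return [sum(dim) / len(frames) for dim in zip(*frames)]

def diarize(frames, seg_len=10, threshold=1.0):
    """Return one speaker label per segment ("who spoke when").

    frames: list of feature vectors, one per audio frame.
    Assumption: uniform segmentation and centroid distance replace the
    statistical models a real diarization system would use.
    """
    # Segmentation: cut the track into fixed-length chunks.
    segments = [frames[i:i + seg_len] for i in range(0, len(frames), seg_len)]
    # Clustering: start with one cluster per segment, then merge bottom-up.
    clusters = [[i] for i in range(len(segments))]

    def centroid(cluster):
        return mean_vector([f for i in cluster for f in segments[i]])

    while len(clusters) > 1:
        pairs = [
            (math.dist(centroid(clusters[a]), centroid(clusters[b])), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        ]
        dist, a, b = min(pairs)
        if dist > threshold:
            break  # remaining clusters are treated as different speakers
        clusters[a] += clusters[b]
        del clusters[b]

    # Number the speakers in order of first appearance.
    labels = [0] * len(segments)
    for speaker, cluster in enumerate(sorted(clusters, key=min)):
        for i in cluster:
            labels[i] = speaker
    return labels
```

For a 60-frame track whose features are constant at 0, then 5, then 0 again (20 frames each), `diarize` with the default settings returns `[0, 0, 1, 1, 0, 0]`: the first speaker is recognized again when they return.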
![Page 31: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/31.jpg)
Analyzing Meetings
![Page 32: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/32.jpg)
Dominance Estimation
![Page 33: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/33.jpg)
I Know You...
http://www.icsi.berkeley.edu/~fractor/ioda_demo.avi
![Page 34: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/34.jpg)
Narrative Theme Navigation
G. Friedland, L. Gottlieb, A. Janin: “Joke-o-mat: Browsing Sitcoms Punchline by Punchline”, Proceedings of ACM Multimedia, Beijing, China, October 2009.
![Page 35: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/35.jpg)
Joke-O-Mat: Demo
http://www.youtube.com/watch?v=1qfa84Ulm5s
![Page 36: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/36.jpg)
Connecting Multimedia and Semantic Technologies
GStreamer
Source Recorder
User Component 1
User Component 2
...
User Component n
Appscio
File
Device
Driver
![Page 37: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/37.jpg)
Semantic Media Framework
Custom Event Source 1
Custom Event Source 2
...
Custom Event Source n
C/C++/Java Interface
Pipeline Framework
Video Application Server
Scripting & Logic Engine
Web Technology Interface
Events
Integrated Development Environment
Services Connector
Code
http://www.appscio.com
![Page 38: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/38.jpg)
Semantic Analysis of Multimedia Data
• enables automatic logical inference on perceptually encoded data
• enables more “natural” interaction with the computer: “do what the user means”
• interfaces nicely with Semantic Web technologies
![Page 39: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/39.jpg)
A note...
James A. Hendler
![Page 40: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/40.jpg)
MySTT
Open-Source, open-model, state-of-the-art speech recognizer for multiparty conversations.
Release Date: February 2010
![Page 41: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/41.jpg)
4th IEEE International Conference on Semantic Computing 2010
Paper Deadline: May 3rd, 2010
![Page 42: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/42.jpg)
Upcoming...
![Page 43: Semantics And Multimedia](https://reader036.fdocuments.net/reader036/viewer/2022081413/54983f9bac7959222e8b55c6/html5/thumbnails/43.jpg)
Thank You!
Questions?
Contact:
Dr. Gerald Friedland
International Computer Science Institute, Berkeley, CA
http://[email protected]