Department of Computer Science and Engineering, CUHK
Final Year Project 2003/2004 LYU0302
PVCAIS – Personal Video Conference Archives Indexing System
Supervisor: Prof Michael Lyu
Presented by: Lewis Ng, Philip Chan
25 November 2003
Outline
Introduction
Motivation
Architecture of PVCAIS
- Media Acquisition Module
- Archive Indexing Module
- Videoconference Accessing Module
Implementation in First Term
Future Work
Conclusion
Introduction
PVCAIS stands for Personal Video Conference Archives Indexing System
A system that gives videoconferencing users convenient searching and browsing of past videoconference archives
Introduction
What is a videoconference?
A real-time communication technology that combines different media:
audio, video, text chat, file transfer, whiteboard and shared communications
- More precisely, it is a “multimedia conference”
Motivation
– Videoconferencing is becoming popular in education, business and personal communication
– Participants wish to keep videoconference archives for later reference
– Normal video and audio files are neither searchable nor helpful for recalling their contents
– Indexing of videoconference archives has not been investigated till now
Architecture of PVCAIS
Consists of 3 modules:
- Media Acquisition Module
- Archive Indexing Module
- Videoconference Accessing Module
[Figure: Media acquisition -> Raw videoconference archives -> Archive indexing -> Indexed videoconference archives -> Videoconference accessing]
Media Acquisition
Extracts channel data and forms media files
Videoconferencing physically contains 4 types of channels: Audio, Video, Data and Control
Audio and Video channels: transmit incoming/outgoing audio and video information
Data channel: carries information for user applications such as Text Chat, Whiteboard and File Transfer
Control channel: transmits system control information such as Member Information
Media Acquisition
Video-in and Video-out channel
– Reduce redundancy: just store key frames
– Detect scene changes in real time
– Each key-frame picture is stored with a timestamp
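A minimal sketch of how key frames might be selected. It assumes each frame is a flat list of grayscale pixel values and that a scene change is declared when the mean absolute difference from the last stored key frame exceeds a threshold. The frame representation, the `threshold` value and the function names are illustrative assumptions; the slides do not state which scene-change measure PVCAIS uses.

```python
from typing import Iterable, List, Sequence, Tuple

Frame = Sequence[int]  # hypothetical: flat list of grayscale pixels, 0-255


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average per-pixel absolute difference between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_key_frames(
    frames: Iterable[Tuple[float, Frame]], threshold: float = 30.0
) -> List[Tuple[float, Frame]]:
    """Keep the first frame, then every frame that differs enough from the
    most recently kept key frame. Each kept frame carries its timestamp."""
    key_frames: List[Tuple[float, Frame]] = []
    for timestamp, frame in frames:
        if not key_frames or mean_abs_diff(frame, key_frames[-1][1]) > threshold:
            key_frames.append((timestamp, frame))
    return key_frames


stream = [
    (0.0, [10, 10, 10, 10]),
    (1.0, [12, 11, 10, 9]),       # small change: same scene
    (2.0, [200, 210, 190, 205]),  # large change: new scene
    (3.0, [198, 209, 191, 204]),
]
key_times = [t for t, _ in select_key_frames(stream)]
```

Comparing against the last stored key frame, rather than the immediately preceding frame, is one possible design choice: it keeps slow drifts from being missed, at the cost of occasionally storing a frame during a gradual transition.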
Media Acquisition
Audio-in and Audio-out channel
– Mixed into one stream after the videoconference
– Will be used for Speech Recognition
Text Chat channel
– Sender, receiver
– Message
– Stored with timestamp
Media Acquisition
Whiteboard channel
– Consists of a text-based index file and a number of snapshot pictures
– Index file records the timestamp of each whiteboard update event and the path of the corresponding snapshot picture
– An update on this channel takes place over a period of time, so we need to detect when an update begins and ends by monitoring data transfer in this channel
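One way to detect where a whiteboard update begins and ends is sketched below. It assumes the arrival time of each data packet on the whiteboard channel can be observed, and that an update is over once the channel has been idle for `idle_gap` seconds. The function name and the `idle_gap` value are hypothetical; the slides do not give the actual rule.

```python
from typing import Iterable, List, Tuple


def group_updates(
    packet_times: Iterable[float], idle_gap: float = 2.0
) -> List[Tuple[float, float]]:
    """Group packet arrival times (seconds) into (begin, end) update intervals.

    A new update begins when the gap since the previous packet exceeds
    idle_gap; idle_gap is an assumed tuning parameter.
    """
    updates: List[Tuple[float, float]] = []
    for t in sorted(packet_times):
        if updates and t - updates[-1][1] <= idle_gap:
            updates[-1] = (updates[-1][0], t)  # extend the current update
        else:
            updates.append((t, t))  # start a new update
    return updates


intervals = group_updates([10.0, 10.4, 11.1, 25.0, 25.3, 60.0])
```

Under this scheme a snapshot picture would be taken at the end of each interval and recorded in the index file with that timestamp.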
Media Acquisition
File Transfer channel
– A copy of each sent/received file is saved to the archive directory, together with an index file
– Index file includes the sender’s and recipient’s user names and the paths of the files
Control channel
– Contains the timestamp and information of each event, such as member joined and member left
Media Acquisition
Paradigm of storing the videoconference archives.
[Figure: the Video_in, Video_out, Audio_in, Audio_out, Text_chat, Whiteboard, File_in, File_out and Control channels are recorded along a common time axis starting at 0:00:00 and stored as the Video_in, Video_out, Audio, Text chat, Whiteboard, Document and Control archives]
Archive Indexing
7 raw files are extracted by the Media Acquisition Module
Indexing functions are needed to retrieve more information from them
These include: Face Detection, Face Recognition, Speech Recognition, OCR, Time-based Text Merging, Keyword Selection, Title Generation
Archive Indexing
Face Detection
- Distinguish between slides and faces
- If a face is detected, locate the face region
[Figure labels: Slide, Face]
Archive Indexing
Face Recognition
- Associate human faces in Video-in with names
- Need to keep a face base of known faces
- If there is no match in the face base, ask the remote user to enter the name
Archive Indexing
Speech Recognition
- Generate a speech script from the audio archive
- The speech in a videoconference carries most of the information
- Can use a commercial library: Microsoft SAPI, IBM ViaVoice
OCR
- Takes the slide archive as input and recognizes the text in it
- Need to identify and localize text on a complex background
Archive Indexing
Time-based Text Merging
- Merge the speech transcript, chat script, whiteboard script and slide text archive into a single Text source according to their timestamps
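A minimal sketch of time-based merging, assuming every text archive is already sorted by time and that each entry is a `(timestamp, source, text)` tuple. The tuple layout and the sample entries are assumptions made for illustration, not the PVCAIS archive format.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

Entry = Tuple[float, str, str]  # (timestamp in seconds, source name, text)


def merge_text_sources(*sources: Iterable[Entry]) -> Iterator[Entry]:
    """Merge individually time-sorted streams into one time-ordered stream."""
    return heapq.merge(*sources, key=lambda entry: entry[0])


speech = [
    (5.0, "speech", "welcome to the project meeting"),
    (42.0, "speech", "next slide please"),
]
chat = [(12.5, "chat", "can you hear me?")]
slides = [(40.0, "slide", "Architecture of PVCAIS")]

text_source: List[Entry] = list(merge_text_sources(speech, chat, slides))
sources_in_order = [source for _, source, _ in text_source]
```

`heapq.merge` works lazily and relies on each input already being sorted, so it suits archives written in chronological order; unsorted input would need a full sort first.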
Keyword Selection
- Takes the Text source as input
- Generates keywords for the videoconference
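The slides do not say how keywords are chosen. As a simple baseline, the sketch below ranks words by frequency after removing stop words. The stop-word list, the `top_n` parameter and the sample text are illustrative assumptions.

```python
import re
from collections import Counter
from typing import List

# Illustrative stop-word list; a real system would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "is", "in", "for", "on", "we", "it"}


def select_keywords(text: str, top_n: int = 3) -> List[str]:
    """Return the top_n most frequent non-stop-words in text."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return [word for word, _ in counts.most_common(top_n)]


text_source = (
    "The indexing module builds the index. Indexing of the archive is "
    "needed for searching the archive. The archive is stored on disk."
)
keywords = select_keywords(text_source, top_n=2)
```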
Archive Indexing
Title Generation
- Takes the Text source as input
- Automatically generates a title for the videoconference
Generate XML index file
- Integrates all the archives
- Stores all the related files of a videoconference in a single directory
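A sketch of what generating such an XML index could look like, using Python's standard `xml.etree.ElementTree`. The element names, attribute names and file names are hypothetical; the slides do not define the index schema.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def build_index(title: str, keywords: List[str], members: List[str],
                archives: Dict[str, str]) -> ET.Element:
    """Build an XML index; element and attribute names are hypothetical."""
    root = ET.Element("videoconference")
    ET.SubElement(root, "title").text = title
    kw = ET.SubElement(root, "keywords")
    for word in keywords:
        ET.SubElement(kw, "keyword").text = word
    mem = ET.SubElement(root, "members")
    for name in members:
        ET.SubElement(mem, "member").text = name
    arch = ET.SubElement(root, "archives")
    for kind, path in archives.items():
        # Paths are relative to the single directory holding the conference.
        ET.SubElement(arch, "archive", {"type": kind, "path": path})
    return root


index = build_index(
    title="Project meeting",
    keywords=["archive", "indexing"],
    members=["Alice", "Bob"],
    archives={"audio": "audio.wav", "text_chat": "chat.txt"},
)
xml_text = ET.tostring(index, encoding="unicode")
```

Storing relative paths keeps the conference directory self-contained, so it can be moved or copied without rewriting the index.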
Videoconference Accessing
Provides an interface for users to manage, search and review all indexed conferences.
Allows users to modify the content of a conference, such as editing its title or keywords, or to delete a conference.
Allows users to search for a conference by different criteria, such as member name or keyword.
Allows users to review a conference by playing back the audio or the key frames.
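Searching by member name or keyword can be sketched as a filter over the indexed conferences. The `ConferenceIndex` class and its fields are hypothetical stand-ins for whatever the XML index actually holds.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ConferenceIndex:
    """Hypothetical in-memory view of one indexed conference."""
    title: str
    members: List[str] = field(default_factory=list)
    keywords: List[str] = field(default_factory=list)


def search(conferences: List[ConferenceIndex],
           member: Optional[str] = None,
           keyword: Optional[str] = None) -> List[ConferenceIndex]:
    """Return conferences matching every criterion that was supplied."""
    results = []
    for conf in conferences:
        if member is not None and member.lower() not in [m.lower() for m in conf.members]:
            continue
        if keyword is not None and keyword.lower() not in [k.lower() for k in conf.keywords]:
            continue
        results.append(conf)
    return results


library = [
    ConferenceIndex("Project meeting", ["Alice", "Bob"], ["archive", "indexing"]),
    ConferenceIndex("Demo rehearsal", ["Alice", "Carol"], ["whiteboard"]),
]
by_member = [c.title for c in search(library, member="alice")]
by_both = [c.title for c in search(library, member="Alice", keyword="whiteboard")]
```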
Implementation
NetMeeting 3.0
– A Windows feature that provides Internet conferencing functions.
– Supports video, audio and data conferencing, including application sharing, chat, whiteboard and file transfer.
– Other features include remote desktop sharing.
Implementation
NetMeeting 3.0 SDK
– An extension of NetMeeting that provides an interface for programmers and Web developers to integrate conferencing capabilities into their applications.
– The API is in the form of COM interfaces and functions.
Implementation
A simple NetMeeting-compatible videoconference program was built on top of the NetMeeting 3.0 SDK.
Supports:
– Video
– Audio
– Text message
– File Transfer
– Whiteboard
Implementation
By directly using the functions of the API, the following raw data can be obtained:
– the members information
– file transfer record
– text messages record
Video, audio and whiteboard data cannot be directly obtained.
Implementation
Video
– Create a thread to check the display of the video windows
– If a scene change is detected, the video is captured and stored as a still image.
– The stored images are the key frames of the conference and will be used for face detection and recognition after the conference.
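The polling thread can be sketched as follows. The actual program reads the NetMeeting video window; here `grab_frame` is a hypothetical callable that returns the current frame as a flat list of pixel values, or `None` when the conference has ended, and the difference threshold is an assumed value.

```python
import threading
from typing import Callable, List, Optional, Sequence, Tuple

Frame = Sequence[int]  # hypothetical: flat list of grayscale pixels


def watch_video(grab_frame: Callable[[], Optional[Frame]],
                key_frames: List[Tuple[int, Frame]],
                stop: threading.Event,
                threshold: float = 30.0,
                poll_interval: float = 0.0) -> None:
    """Poll grab_frame until it returns None or stop is set; keep changed frames."""
    tick = 0
    while not stop.is_set():
        frame = grab_frame()
        if frame is None:
            break
        last = key_frames[-1][1] if key_frames else None
        if last is None or sum(abs(a - b) for a, b in zip(frame, last)) / len(frame) > threshold:
            key_frames.append((tick, frame))  # tick stands in for a timestamp
        tick += 1
        stop.wait(poll_interval)  # sleep, but wake early if asked to stop


frames = iter([[10, 10], [11, 9], [200, 200], [201, 199]])
captured: List[Tuple[int, Frame]] = []
stop_flag = threading.Event()
worker = threading.Thread(
    target=watch_video, args=(lambda: next(frames, None), captured, stop_flag)
)
worker.start()
worker.join()
captured_ticks = [tick for tick, _ in captured]
```

Running the check in its own thread keeps capture independent of the conferencing user interface; the `stop` event gives the main program a clean way to end polling when the conference closes.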
Implementation
Audio
– Create a thread to record the local audio from the microphone.
– When a certain amount of audio data has been recorded, send the audio data to all members of the conference.
– All the received audio files and locally recorded audio files are combined to generate a single audio file.
– The final audio file is used for voice recognition; the voice engine used is Microsoft SAPI.
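Combining the local and received audio into a single track can be sketched as sample-wise mixing. This assumes signed 16-bit PCM samples at a common sample rate, with all tracks aligned at the start of the conference; the slides do not describe how the files are actually combined, so the function below is only an illustration.

```python
from itertools import zip_longest
from typing import List, Sequence


def mix_pcm16(*tracks: Sequence[int]) -> List[int]:
    """Sum aligned samples from all tracks and clip to the signed 16-bit range."""
    mixed = []
    for samples in zip_longest(*tracks, fillvalue=0):
        total = sum(samples)
        mixed.append(max(-32768, min(32767, total)))
    return mixed


local = [1000, -2000, 30000]
remote = [500, 500, 10000, 42]
combined = mix_pcm16(local, remote)
```

Shorter tracks are padded with silence, and sums that overflow are clipped rather than wrapped, which avoids loud artefacts when several members speak at once.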
Implementation
Whiteboard
– Cannot capture the NetMeeting whiteboard information because the format of the data is not stated in the API.
– Solution: create our own whiteboard function and data format.
Conclusion
We developed a videoconferencing agent
All channel data except whiteboard can be collected.
Speech Recognition and Face Detection & Recognition are integrated into the system, but accuracy needs to be improved
Simple searching can be performed on stored archives
Future Work
Whiteboard
Improve accuracy of Voice Recognition
XML
Better searching method
OCR for slides in video
Improve User Interface
Q & A Session