User Benefits of Non-Linear Time Compression
Liwei He and Anoop Gupta
Microsoft Research
Introduction
Time compression: key to browsing AV content
We focus on informational content
Audio time compression algorithms
Linear: speed up audio uniformly
Non-linear: exploit fine-grained structure of human speech (e.g. pauses, phonemes)
How much more do users gain from more complex algorithms?
Methodology
Conduct user listening test
One Linear TC algorithm
Two Non-linear TC algorithms
Simple: Pause-removal followed by Linear TC
Sophisticated: Adaptive TC
Compare objective and subjective measurements
Time Compression Algorithms
Linear Time Compression
Classic algorithms
Overlap Add (OLA) and Synchronized OLA (SOLA)
We use SOLA
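To illustrate the idea behind synchronized overlap-add, here is a minimal pure-Python sketch. It is not the implementation used in the study. Frames are taken from the input at a hop scaled by the speed factor and cross-faded into the output at a fixed hop, after a small search for the offset whose overlap region best correlates with the current output tail. For simplicity the search is done on the input side, whereas classic SOLA formulations shift the frame on the output side. The function name `sola_compress` and the `frame`, `overlap`, and `search` defaults are assumptions chosen for illustration.

```python
import math

def sola_compress(samples, speed, frame=400, overlap=100, search=50):
    """Return `samples` time-compressed by `speed` (e.g. 1.5 plays 1.5x faster).

    Window sizes are in samples; the defaults are illustrative assumptions.
    """
    hop_out = frame - overlap                # fixed output (synthesis) hop
    hop_in = int(round(hop_out * speed))     # input (analysis) hop
    out = list(samples[:frame])
    pos = hop_in
    while pos + search + frame <= len(samples):
        tail = out[-overlap:]
        # Pick the input offset whose first `overlap` samples best match
        # the output tail (normalized cross-correlation).
        best_shift, best_score = 0, float("-inf")
        for shift in range(search + 1):
            head = samples[pos + shift : pos + shift + overlap]
            dot = sum(a * b for a, b in zip(tail, head))
            norm = math.sqrt(sum(a * a for a in tail) * sum(b * b for b in head))
            score = dot / norm if norm > 0 else 0.0
            if score > best_score:
                best_shift, best_score = shift, score
        chunk = samples[pos + best_shift : pos + best_shift + frame]
        # Linear cross-fade over the overlap region, then append the rest.
        for i in range(overlap):
            w = i / overlap
            out[-overlap + i] = (1 - w) * out[-overlap + i] + w * chunk[i]
        out.extend(chunk[overlap:])
        pos += hop_in
    return out
```

Calling `sola_compress(x, 2.0)` on one second of audio returns roughly half as many samples. Plain OLA corresponds to skipping the search (`search=0`), which is cheaper but can produce audible phase discontinuities at the frame joins.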
Non-Linear Time Compression
Algorithm 1: Pause removal plus TC
Energy and Zero Crossing Rate analysis
Leave pauses of 150 ms or less untouched
Shorten pauses longer than 150 ms to 150 ms
Apply SOLA algorithm
PR shortens speech by 10-25%
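The pause-removal step can be sketched as follows, before any linear TC is applied. This is a minimal sketch under stated assumptions, not the study's detector: a frame counts as a pause when both its mean energy and its zero-crossing rate fall below thresholds, and only the first 150 ms of each pause run is kept. The 10 ms frame size, the threshold values, and the names `frame_features` and `remove_pauses` are hypothetical.

```python
def frame_features(frame):
    """Return (mean energy, zero-crossing rate) for one frame of samples."""
    energy = sum(s * s for s in frame) / len(frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return energy, crossings / len(frame)

def remove_pauses(samples, rate=8000, frame_ms=10, max_pause_ms=150,
                  energy_thresh=1e-4, zcr_thresh=0.25):
    """Shorten every pause longer than `max_pause_ms` to `max_pause_ms`.

    The thresholds are illustrative assumptions and would need tuning
    for real recordings.
    """
    n = rate * frame_ms // 1000              # samples per frame
    keep_frames = max_pause_ms // frame_ms   # pause frames kept per run
    out, silent_run = [], 0
    for start in range(0, len(samples), n):
        frame = samples[start:start + n]
        energy, zcr = frame_features(frame)
        # Low energy with a high ZCR may be an unvoiced consonant, so keep it.
        is_pause = energy < energy_thresh and zcr < zcr_thresh
        silent_run = silent_run + 1 if is_pause else 0
        if silent_run <= keep_frames:
            out.extend(frame)
    return out
```

The output of `remove_pauses` would then be passed to the linear algorithm. Because pause removal already shortens the audio, the linear stage needs a lower rate than the overall target: if pauses account for 20% of the duration, an overall 1.5x needs only 1.5 × 0.8 = 1.2x from the linear stage.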
Non-Linear Time Compression (cont.)
Algorithm 2: Adaptive TC
Mimics how people speak when talking fast
Pauses and silences are compressed the most
Stressed vowels are compressed the least
Consonants are compressed more than vowels
Consonants are compressed based on neighboring vowels
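The ordering above (pauses compressed most, then consonants, then vowels, with stressed vowels compressed least) can be illustrated with a toy duration allocator. This is a sketch only, not the adaptive algorithm itself: it assumes the speech is already segmented and labeled, uses made-up compressibility weights, and does not model the dependence of consonants on neighboring vowels. The names `WEIGHTS` and `allocate_durations` are hypothetical.

```python
# Hypothetical relative compressibility per segment class (assumed values).
WEIGHTS = {"pause": 1.0, "consonant": 0.6, "vowel": 0.4, "stressed_vowel": 0.2}

def allocate_durations(segments, target_speed, weights=WEIGHTS):
    """Shrink labeled segments so the total plays `target_speed` times faster.

    `segments` is a list of (label, duration_seconds). Each segment gives up
    time in proportion to its weight, so heavily weighted classes shrink most.
    """
    total = sum(d for _, d in segments)
    to_remove = total - total / target_speed
    capacity = sum(weights[label] * d for label, d in segments)
    scale = to_remove / capacity
    if scale * max(weights[label] for label, _ in segments) >= 1.0:
        raise ValueError("target speed too high for these weights")
    return [(label, d * (1.0 - scale * weights[label])) for label, d in segments]

segments = [("pause", 0.30), ("consonant", 0.10), ("stressed_vowel", 0.20),
            ("consonant", 0.10), ("vowel", 0.15), ("pause", 0.15)]
compressed = allocate_durations(segments, 1.5)
```

With these assumed weights the overall rate is exactly 1.5x, while the local rate varies by segment class: pauses are shortened the most and stressed vowels the least.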
System Implications
Computational complexity
Adaptive TC 10x more costly than Linear TC
Complexity in client-server implementation
Buffer management required for non-linear TC
Audio-video synchronization quality
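One reason non-linear TC complicates buffering and audio-video synchronization is that playback time no longer maps to source time by a constant factor. The slides do not describe how this was handled; the sketch below shows one assumed approach, a piecewise-linear time map built from per-segment (source duration, output duration) pairs, which a player could use to find the video timestamp for the current audio position. The names `build_time_map` and `source_time` are hypothetical.

```python
import bisect

def build_time_map(segments):
    """Build cumulative boundaries from (source_duration, output_duration) pairs."""
    src, out = [0.0], [0.0]
    for s, o in segments:
        src.append(src[-1] + s)
        out.append(out[-1] + o)
    return src, out

def source_time(time_map, t_out):
    """Map a playback (output) time back to the matching source time."""
    src, out = time_map
    t_out = min(max(t_out, 0.0), out[-1])
    # Last boundary at or before t_out; this skips zero-length (removed) segments.
    i = min(bisect.bisect_right(out, t_out) - 1, len(out) - 2)
    span = out[i + 1] - out[i]
    frac = (t_out - out[i]) / span if span > 0 else 1.0
    return src[i] + frac * (src[i + 1] - src[i])

# 1.0 s of speech played at 2x, a 0.4 s pause removed, 1.0 s played at 1x.
time_map = build_time_map([(1.0, 0.5), (0.4, 0.0), (1.0, 1.0)])
```

With linear TC the same lookup collapses to a single multiplication by the speed factor, which is part of why the linear case is simpler to implement in a client-server system.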
User Study Method
User Study Goals
Highest intelligible speed
Comprehension
Subjective preference
Sustainable speed
Experiment Method
24 subjects
4 tasks for each subject
3 time compression algorithms
Linear TC using SOLA (Linear)
Pause removal plus Linear TC (PR-Lin)
Adaptive TC (Adapt)
Each test takes approximately 30 minutes
Highest Intelligible Speed Task
3 clips from technical talks
Find the highest speed at which most words are understandable
Comprehension Task
3 clips at 1.5x and 3 clips at 2.5x
Clips from TOEFL listening test
Answer 4 multiple choice questions
Subjective Preference Task
3 pairs of clips at 1.5x
3 pairs of clips at 2.5x
Each pair contains the same clip compressed with 2 of the 3 TC algorithms
Indicate preference on 3-point scale
Sustainable Speed Task
3 clips, each 8 minutes long
Clips from a CD audio book
Find the maximum comfortable speed
Write a 4-5 sentence summary at the end
User Study Results
Highest Intelligible Speed Task
PR-Lin is significantly better than Adapt (p<.01)
[Figure: bar chart of compression rate (axis 0 to 3) for Linear, PR-Lin, and Adapt]
Comprehension Task
[Figure: bar chart of comprehension score (%, axis 0 to 90) for Linear, PR-Lin, and Adapt at 1.5x and 2.5x]
Adapt scores higher than PR-Lin at 2.5x, but only marginally (p=.083)
Preference Task at 1.5x
Slight preference for PR-Lin (p=.093)
| 1.5x | Prefer Former | Prefer None | Prefer Latter |
| --- | --- | --- | --- |
| Linear vs. PR-Lin | 6 | 5 | 13 |
| PR-Lin vs. Adapt | 13 | 5 | 6 |
| Adapt vs. Linear | 8 | 8 | 8 |
Preference Task at 2.5x
PR-Lin and Adapt do significantly better than Linear
| 2.5x | Prefer Former | Prefer None | Prefer Latter |
| --- | --- | --- | --- |
| Linear vs. PR-Lin | 2 | 8 | 14 |
| PR-Lin vs. Adapt | 4 | 9 | 11 |
| Adapt vs. Linear | 21 | 3 | 0 |
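The slides do not say which statistical test was applied to the preference counts. As one common choice, the sketch below runs an exact two-sided sign test on the 2.5x counts, dropping the "Prefer None" answers. It is an assumption for illustration, and its p-values need not match the ones reported in the talk. The function name `sign_test` is hypothetical.

```python
from math import comb

def sign_test(prefer_a, prefer_b):
    """Exact two-sided sign test p-value; ties must be excluded by the caller."""
    n = prefer_a + prefer_b
    k = min(prefer_a, prefer_b)
    # Probability of a split at least this lopsided under a fair coin.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# 2.5x preference counts from the table above: (former, latter).
p_linear_vs_prlin = sign_test(2, 14)
p_prlin_vs_adapt = sign_test(4, 11)
p_adapt_vs_linear = sign_test(21, 0)
```

Under this test both comparisons against Linear fall below 0.01, which is consistent with the slide's statement, while the PR-Lin vs. Adapt split (about 0.12) does not reach conventional significance.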
Sustainable Speed Task
[Figure: bar chart of compression rate (axis 0 to 2.5) for Linear, PR-Lin, and Adapt]
Conclusions
Previous Works
Mach1 (Covell et al., ICASSP 98)
Comprehension and preference tasks
Comparing Linear and Mach1 (Adapt) at 2.6-4.2x
Comprehension scores 17% better with Mach1
95% prefer Mach1 to Linear
No data below 2.0x
Other works (Harrigan, Omoigui, Li, Foulke)
1.2-1.7x is the sustainable listening speed
Conclusions
The trade-off among TC algorithms is task-dependent
Listening: Linear TC is sufficient
Fast Forwarding: Non-linear TC is more suitable
Adapt TC is close to the way people talk fast
The limit lies in human listening and comprehension