Dia: AutoDirective Audio Capturing Through a Synchronized …sur/papers/Dia_MobiSys14_slides.pdf ·...

29
Dia: AutoDi rective A udio Capturing Through a Synchronized Smartphone Array Sanjib Sur Teng Wei and Xinyu Zhang University of Wisconsin - Madison 1

Transcript of Dia: AutoDirective Audio Capturing Through a Synchronized …sur/papers/Dia_MobiSys14_slides.pdf ·...

Page 1: Dia: AutoDirective Audio Capturing Through a Synchronized …sur/papers/Dia_MobiSys14_slides.pdf · 2018. 7. 29. · Multimedia applications in smartphones Standalone smartphones

Dia: AutoDirective Audio Capturing

Through a Synchronized

Smartphone Array

Sanjib Sur

Teng Wei and Xinyu Zhang

University of Wisconsin - Madison

1

Page 2: Dia: AutoDirective Audio Capturing Through a Synchronized …sur/papers/Dia_MobiSys14_slides.pdf · 2018. 7. 29. · Multimedia applications in smartphones Standalone smartphones

Multimedia applications in smartphones

Standalone smartphones are not suitable for

demanding multimedia applications

Growing mobile multimedia applications due to

pervasiveness of smartphones

Most apps are standalone

2

Page 3: Dia: AutoDirective Audio Capturing Through a Synchronized …sur/papers/Dia_MobiSys14_slides.pdf · 2018. 7. 29. · Multimedia applications in smartphones Standalone smartphones

Demanding multimedia recording applications

Smart conferencing

3

Autonomous lecture

recording

Page 4: Dia: AutoDirective Audio Capturing Through a Synchronized …sur/papers/Dia_MobiSys14_slides.pdf · 2018. 7. 29. · Multimedia applications in smartphones Standalone smartphones

Existing solutions

Polycom CX5000 Polycom QDX 6000

Cost ~ $4000 - $5000

Weight ~ 3 Kg.

Requires dedicated infrastructure (lacks portability)

4

Page 5: Dia: AutoDirective Audio Capturing Through a Synchronized …sur/papers/Dia_MobiSys14_slides.pdf · 2018. 7. 29. · Multimedia applications in smartphones Standalone smartphones

Synchronized smartphone array

Goal

Leverage multiple smartphones’ microphones to

realize smart conferencing/lecture room recording

Challenge

Precise audio I/O synchronization

5

Practical audio beamforming

Robust speaker localization

Page 6: Dia: AutoDirective Audio Capturing Through a Synchronized …sur/papers/Dia_MobiSys14_slides.pdf · 2018. 7. 29. · Multimedia applications in smartphones Standalone smartphones

Dia: System overview

6

Calibration parameters

Master

Smartphone Array

Audio signals

Server

Sync

Audio beam forming

Speaker tracking

Page 7: Dia: AutoDirective Audio Capturing Through a Synchronized …sur/papers/Dia_MobiSys14_slides.pdf · 2018. 7. 29. · Multimedia applications in smartphones Standalone smartphones

Why synchronization?

Combined

signals

7

Page 8: Dia: AutoDirective Audio Capturing Through a Synchronized …sur/papers/Dia_MobiSys14_slides.pdf · 2018. 7. 29. · Multimedia applications in smartphones Standalone smartphones

Why synchronization?

8

Combined

signals

Page 9: Dia: AutoDirective Audio Capturing Through a Synchronized …sur/papers/Dia_MobiSys14_slides.pdf · 2018. 7. 29. · Multimedia applications in smartphones Standalone smartphones

Independent CPU and audio I/O clocks

……

……

9 Synchronize

Page 10: Dia: AutoDirective Audio Capturing Through a Synchronized …sur/papers/Dia_MobiSys14_slides.pdf · 2018. 7. 29. · Multimedia applications in smartphones Standalone smartphones

Clock synchronization

Problem

Because of independent CPU and Audio I/O clock,

synchronizing only CPU clock at application level is not

sufficient

10

Page 11: Dia: AutoDirective Audio Capturing Through a Synchronized …sur/papers/Dia_MobiSys14_slides.pdf · 2018. 7. 29. · Multimedia applications in smartphones Standalone smartphones

Our solution: Two-level synchronization

Observations

CPU clock Global clock

Audio I/O clock CPU clock

Smartphone1

Smartphone2

11

Page 12: Dia: AutoDirective Audio Capturing Through a Synchronized …sur/papers/Dia_MobiSys14_slides.pdf · 2018. 7. 29. · Multimedia applications in smartphones Standalone smartphones

Two-level synchronization

s1

lc1

gc

lc2

s2

12

Page 13: Dia: AutoDirective Audio Capturing Through a Synchronized …sur/papers/Dia_MobiSys14_slides.pdf · 2018. 7. 29. · Multimedia applications in smartphones Standalone smartphones

Estimating the timing models

Observe first n tuples,

Run a linear regression to estimate

Observe m tuples,

Run another linear regression to estimate

13

Page 14: Dia: AutoDirective Audio Capturing Through a Synchronized …sur/papers/Dia_MobiSys14_slides.pdf · 2018. 7. 29. · Multimedia applications in smartphones Standalone smartphones

Implementation

Implemented in 8 Samsung Galaxy

Nexus smartphones

Modified Broadcom wireless drivers

and Tinyalsa audio drivers in Linux

kernel of Android OS

14

Used Desktop PC as server to process the synchronized

audio signals

Page 15: Dia: AutoDirective Audio Capturing Through a Synchronized …sur/papers/Dia_MobiSys14_slides.pdf · 2018. 7. 29. · Multimedia applications in smartphones Standalone smartphones

Synchronization performance

Accuracy within 2 ~ 3 samples at 16 kHz = 187.5 µS

15

Page 16: Dia: AutoDirective Audio Capturing Through a Synchronized …sur/papers/Dia_MobiSys14_slides.pdf · 2018. 7. 29. · Multimedia applications in smartphones Standalone smartphones

Synchronization performance

Only 500 beacons, initial

setup time ≈ 50 sec. Audio sampling rate

invariant

16

Impact of initialization time

Impact of audio sampling frequency

Page 17: Dia: AutoDirective Audio Capturing Through a Synchronized …sur/papers/Dia_MobiSys14_slides.pdf · 2018. 7. 29. · Multimedia applications in smartphones Standalone smartphones

Application to autodirective audio capturing

17

Page 18: Dia: AutoDirective Audio Capturing Through a Synchronized …sur/papers/Dia_MobiSys14_slides.pdf · 2018. 7. 29. · Multimedia applications in smartphones Standalone smartphones

Audio Beamforming

Phone 1

Phone 2

+ = Scale and shift

Scale and shift

Desired signal Noise signal

18

Page 19: Dia: AutoDirective Audio Capturing Through a Synchronized …sur/papers/Dia_MobiSys14_slides.pdf · 2018. 7. 29. · Multimedia applications in smartphones Standalone smartphones

Audio Beamforming Algorithm

Minimum Variance Distortionless Response (MVDR)

Steers the “beam” to the desired direction

Minimizes the output energy of noise signals 19

MVDR theorem: to find the W* that

W*

Page 20: Dia: AutoDirective Audio Capturing Through a Synchronized …sur/papers/Dia_MobiSys14_slides.pdf · 2018. 7. 29. · Multimedia applications in smartphones Standalone smartphones

Speaker tracking using Time Difference Of Arrival

TDOA estimation

The phase maximizing the cross-correlation corresponds to the

TDOA

Solution: Adaptive time-domain linear filter

Problem:

Instability of the estimation result

Remove TDOA outliers

Tradeoff between Robustness

and Latency

Generalized cross-correlation with phase transformation (GCC-PHAT)

20

Page 21: Dia: AutoDirective Audio Capturing Through a Synchronized …sur/papers/Dia_MobiSys14_slides.pdf · 2018. 7. 29. · Multimedia applications in smartphones Standalone smartphones

Speaker tracking: Binary Mapping Algorithm

How do we find the location of speaker without

knowing the distance between smartphones?

Region Signature S: unique binary vector for each region

Use minimum Euclidean distance for region signature matching

21

Page 22: Dia: AutoDirective Audio Capturing Through a Synchronized …sur/papers/Dia_MobiSys14_slides.pdf · 2018. 7. 29. · Multimedia applications in smartphones Standalone smartphones

Beamforming performance

Beamforming gain from a smartphone array

Gain scales with the growing number of microphones

11dB of PSNR

improvement!

22

6dB of

improvement!

Page 23: Dia: AutoDirective Audio Capturing Through a Synchronized …sur/papers/Dia_MobiSys14_slides.pdf · 2018. 7. 29. · Multimedia applications in smartphones Standalone smartphones

Accuracy of speaker tracking

In round-table conference scenario

In lecture room scenario

~90 - 93% of accuracy!

23

Page 24: Dia: AutoDirective Audio Capturing Through a Synchronized …sur/papers/Dia_MobiSys14_slides.pdf · 2018. 7. 29. · Multimedia applications in smartphones Standalone smartphones

Conclusion

Practical autodirective audio capturing and speaker

tracking by ad-hoc cooperative smartphones

Cooperative audio and visual recording system

Future work

Precise audio sample synchronization enables many

distributed audio sensing applications in smartphones

24

Page 25: Dia: AutoDirective Audio Capturing Through a Synchronized …sur/papers/Dia_MobiSys14_slides.pdf · 2018. 7. 29. · Multimedia applications in smartphones Standalone smartphones

Thank you!

Page 26: Dia: AutoDirective Audio Capturing Through a Synchronized …sur/papers/Dia_MobiSys14_slides.pdf · 2018. 7. 29. · Multimedia applications in smartphones Standalone smartphones

Backup slides

Page 27: Dia: AutoDirective Audio Capturing Through a Synchronized …sur/papers/Dia_MobiSys14_slides.pdf · 2018. 7. 29. · Multimedia applications in smartphones Standalone smartphones

How precise?

Human voice can range from 500 Hz – 2 kHz

Equivalently the synchronization timing offset between

two microphones ≤ 1/(2 * 2000) = 250 µS

Requirement: Audio synchronization

timing offset should be below 250 µS!

27

Page 28: Dia: AutoDirective Audio Capturing Through a Synchronized …sur/papers/Dia_MobiSys14_slides.pdf · 2018. 7. 29. · Multimedia applications in smartphones Standalone smartphones

Independent CPU and audio I/O clocks

……

……

Synchronize Re-synchronize 28

Page 29: Dia: AutoDirective Audio Capturing Through a Synchronized …sur/papers/Dia_MobiSys14_slides.pdf · 2018. 7. 29. · Multimedia applications in smartphones Standalone smartphones

Independent CPU and audio I/O clocks

……

……

Synchronize Re-synchronize

……

……

De-synchronized audio samples

29