12.5 - Low Power Speech Enhancement

David Halupka, Ph.D. Candidate, Electronics Group

June 24th, 2005

June 24th, 2005 University of Toronto 2 of 6

Motivation

Today’s recognition systems can achieve a 95%+ recognition accuracy after extensive training

Research systems: same accuracy with no training

Typically: 10% accuracy in the presence of noise, reverberations, and conflicting conversations

Humans are equipped to deal with noisy environments

Two ears → let us localize and focus on a single speaker

Complex noise: one sensor doesn’t cut it

Multiple microphones → superhuman noise filtering

Step 1: Sound Localization

[Figure: two microphone signals, m1(t) and m2(t), plotted against time t; geometry labels d, x, and x + τν]

Time-Based Cross-Correlation
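As a rough illustration of time-based cross-correlation, the sketch below estimates the delay between two microphone signals by finding the lag at which their cross-correlation peaks. It is a minimal NumPy example, not the implementation from the talk: the function name `estimate_delay`, the `max_lag` parameter, and the white-noise test signal are all assumptions made for illustration.

```python
import numpy as np

def estimate_delay(m1, m2, max_lag):
    """Return the lag (in samples) by which m2 trails m1 (hypothetical helper)."""
    m1 = np.asarray(m1, dtype=float)
    m2 = np.asarray(m2, dtype=float)
    # Full cross-correlation: corr[k] pairs m2[n + lag] with m1[n].
    corr = np.correlate(m2, m1, mode="full")
    lags = np.arange(-(len(m1) - 1), len(m2))
    # Only lags within +/- max_lag are physically plausible for a given array.
    keep = np.abs(lags) <= max_lag
    return int(lags[keep][np.argmax(corr[keep])])

# Synthetic test: the same noise-like source reaches m2 seven samples after m1.
rng = np.random.default_rng(0)
source = rng.standard_normal(1000)
m1 = source[20:520]
m2 = source[13:513]          # m2[n] == m1[n - 7]
delay_samples = estimate_delay(m1, m2, max_lag=20)
```

Dividing `delay_samples` by the sampling rate gives the delay τ in seconds; the sampling rate is not stated on the slide and would have to be supplied.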

Step 2: Speech Enhancement

A Hard Case for Hardware

Localization is an exhaustive linear search

Gradient search, etc. not applicable

Each time delay must be checked

Each likelihood can be evaluated in parallel

1 GHz Intel Pentium III needed just for real-time localization → consumes 35 W

Speech interface is beneficial for handheld devices, but battery life is limited

Palm M100 → 150 mW
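The exhaustive search described on this slide can be sketched as follows. Every candidate delay gets its own score, and each score depends only on its own lag, which is why the evaluations could run side by side in hardware. This is a sketch under stated assumptions: a plain correlation sum stands in for the likelihood (the slide does not give the actual measure), and the names `delay_score` and `exhaustive_search` are hypothetical.

```python
import numpy as np

def delay_score(m1, m2, lag):
    """Score one candidate delay; uses only its own lag (assumed measure)."""
    n = len(m1)
    if lag >= 0:
        return float(np.dot(m1[: n - lag], m2[lag:]))
    return float(np.dot(m1[-lag:], m2[: n + lag]))

def exhaustive_search(m1, m2, max_lag):
    """Check every delay in [-max_lag, max_lag] and return the best one."""
    assert len(m1) == len(m2), "sketch assumes equal-length frames"
    candidates = range(-max_lag, max_lag + 1)
    # Each call below is independent of the others, so the scores could be
    # computed in parallel; here they are simply computed one after another.
    scores = [delay_score(m1, m2, lag) for lag in candidates]
    best = candidates[int(np.argmax(scores))]
    return best, scores

# Synthetic frames: m2 is m1 delayed by seven samples.
rng = np.random.default_rng(1)
source = rng.standard_normal(1000)
frame1 = source[20:520]
frame2 = source[13:513]
best_lag, all_scores = exhaustive_search(frame1, frame2, max_lag=20)
```

The cost grows linearly with the number of candidate delays, since none can be skipped, but no score waits on any other.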

Results – 0.18 μm CMOS

Die Size: 2.51 mm x 2.51 mm
Power Utilization: 29 mW

Die Size: 1.51 mm x 1.38 mm
Power Utilization: 3.45 mW
FPGA: 184 mW
DSP: 650 mW