
A Real-time, Open, Portable, Extensible Speech Lab
Harinath Garudadri, UCSD

In this work, we will develop an open, reconfigurable, non-proprietary, wearable, real-time speech processing system suitable for audiologists and hearing aid researchers investigating new hearing aid algorithms in lab and field studies. Through active collaboration between engineers and scientists, we aim to accelerate hearing healthcare research and facilitate the translation of technological advances into widespread clinical use.

Figure 1 depicts the signal processing chain supporting hearing aid functions, including subband amplification, dynamic range compression, feedback cancellation, and remote control for investigating self-fitting methodologies. The interface to ear-level assemblies is provided through a custom interface board and an off-the-shelf audio interface box, as depicted in Figure 2. The remote control is implemented on an Android device, and the protocol stack is extensible beyond controlling the gain/compression parameters. The system has been implemented in ANSI C and runs in real time with less than 10 ms latency on a high-end MacBook. We plan to release the software as source code, along with PCB schematics, Gerber files, parts list, etc., in 1Q-2017.

The system described above is suitable for studies in the lab and also serves as a reference design for porting to an embedded platform. The embedded platform enables field studies, including data collection in the field. We are considering the Snapdragon family of processors from Qualcomm for the wearable system. These processors are optimized for low power and combine a powerful DSP, a general-purpose CPU, and multiple wireless connectivity options. In addition, they benefit from economies of scale due to their adoption in many smartphones, tablets, and Internet of Things (IoT) devices. The DSP can support up to 2 GFLOPS and the CPU up to 4000 MIPS. The hearing aid tasks depicted in Fig. 1 are estimated to consume about 20% of the DSP resources, leaving adequate headroom for advanced signal processing functions on the wearable device.

We are committed to provide hardware, software, and technical support to at least 3 outside labs engaged in improving hearing healthcare. In collaboration with San Diego State University (Dr. Carol Mackersie and Dr. Arthur Boothroyd) we are investigating self-fitting methodologies. Our signal processing expertise and interests are well suited for (i) investigating intelligibility improvements in multiple noise environments, (ii) binaural processing and (iii) objective metrics to characterize hearing loss and to quantify improvements in intelligibility. We are motivated and qualified to develop tools for investigating other aspects hearing healthcare. We are seeking active collaborations to guide us from the audiology and hearing science aspects, provide technology requirements and investigate approaches to improve hearing healthcare. Figure 3 depicts the system architecture we conceived to support the above research questions.

[Figure 1: Signal processing chain for basic hearing aid functionality. ADC → mic array processing → resample 3:1 (96 kHz to 32 kHz) → six subbands, each with WDRC → sum → feedback cancellation with feedback path estimation (filter taps updated from s(n) and e(n)) → resample 1:3 → DAC. 96 kHz domain: input/output buffer size = 96 samples (1 ms). 32 kHz domain: block size = 32 samples (1 ms); HA process latency = 3 ms; subband filter length = 193; feedback filter length = 128.

Latency budget:
Input buffer ............... 1 ms
Mic array processing ....... 0.03 ms
Resample 3:1 ............... 0.125 ms
HA process ................. 3 ms
Resample 1:3 ............... 0.125 ms
Output buffer .............. 1 ms
H/W - OS (measured) ........ 4.7 ms
Total latency .............. 9.98 ms]

[Diagram: remote control link. The OSP layer on the laptop/wearable connects over TCP/IP to the OSP layer of the hearing aid control running on an Android device; it communicates subband compression/gain parameters to the HA device and receives data and diagnostics.]


Figure 1: Signal processing chain for basic hearing aid functionality.
Figure 2: Hardware for interfacing ear-level assemblies to a laptop.

Figure 3: Architecture of the proposed system. The top pane shows real-time processes that consume no more than 10 ms end-to-end latency. The middle pane shows low latency processes that can take 200-400 ms and provide noise suppression, speech enhancement, binaural processing, field data logging, etc. The bottom pane shows medium latency processes running on a user device such as a smartphone or a tablet for communicating with the wearable device.

[Figure 3 diagram. DSP (real-time processes): left-ear-assembly ADC/DAC, beamforming, adaptive feedback cancellation, and subband amplification with WDRC. CPU (low latency processes): noise suppression, speech enhancement tools, SNR boost tools, binaural processing, data logging, decision logic, and foreground/background classification, coupled to the DSP through shared memory on the wearable device. User device (smartphone/PC, medium latency processes): graphical user interface, self-fitting protocols, learnt settings, and predefined settings, linked to the wearable over user device communication.]