Speech/Audio Signal Processing in MATLAB/Simulink

J.-S. Roger Jang (張智星)

CS Dept, Tsing-Hua Univ, Taiwan (清華大學 資訊系)

http://www.cs.nthu.edu.tw/~jang

[email protected]

2006 Speech/Audio Signal Processing in MATLAB/Simulink

Outline

• Wave file manipulation: reading, writing, recording ...
• Time-domain processing: delay, filtering, sptool ...
• Frequency-domain processing: spectrogram
• Pitch determination: auto-correlation, SIFT, AMDF, HPS ...
• Others: formant estimation, speech coding

Toolbox/Blockset Used

MATLAB

Simulink

Signal Processing Toolbox

DSP Blockset

MATLAB Primer

Before you start, you need to get familiar with MATLAB. Please read “MATLAB Primer” at the following page:

http://neural.cs.nthu.edu.tw/jang/demo/demoDownload.asp

Exercise:

1. Please plot two curves y=sin(2*t) and y=cos(3*t) in the same figure.

2. Please plot x vs. y where x=sin(2*t) and y=cos(3*t).

To Read a Wave File

To read a MS .wav file (PCM format only): wavread

y = wavread(file)

[…] = wavread(file, [n1, n2])

[y, fs, nbits, opts] = wavread(file)

[…] = wavread(file, n)

[y, fs, nbits] = wavread(file)

If the wav file is stereo, y will be a two-column matrix.
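In current MATLAB releases `wavread` is no longer available, and `audioread` is the documented replacement. The following is a minimal sketch of the same reading steps with `audioread`. Because the files used in these slides are not at hand, it first writes a short synthetic stereo tone to a temporary file; the tone frequencies, the sampling rate, and the temporary file name are assumptions made for illustration only.

```matlab
% Create a short stereo test file so the sketch is self-contained.
fs = 8000;                                   % sampling rate in Hz (assumed)
t = (0:fs-1)'/fs;                            % one second of time stamps
stereo = 0.5*[sin(2*pi*440*t), sin(2*pi*660*t)];
wavFile = [tempname, '.wav'];                % hypothetical temporary file
audiowrite(wavFile, stereo, fs);

% Read the whole file: y has one column per channel.
[y, fsRead] = audioread(wavFile);

% Bit depth and other header fields come from audioinfo.
info = audioinfo(wavFile);

% Read only samples 101..200, similar to wavread(file, [n1, n2]).
ySeg = audioread(wavFile, [101, 200]);

delete(wavFile);                             % clean up the temporary file
```

For a stereo file, `y(:,1)` and `y(:,2)` are the two channels, matching the two-column convention described for `wavread`.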

To Read a Wav File

Example (wavRead01.m):

[y, fs] = wavread('singapore.wav');
plot((1:length(y))/fs, y);
xlabel('Time in seconds');
ylabel('Amplitude');

Exercise:

1. Plot the waveform of “rrrrr.wav”. Use MATLAB’s “zoom” button to find where the consecutive rolled “R” sounds occur.
2. Plot the two-channel waveform in “flanger.wav”.

Solution to the Previous Exercise

wavRead02.m:

[y, fs] = wavread('flanger.wav');
subplot(2,1,1), plot((1:length(y))/fs, y(:,1));
subplot(2,1,2), plot((1:length(y))/fs, y(:,2));

To Play Wav Files

To play sound using the Windows audio output device: wavplay, sound, soundsc

• wavplay(y, fs)
• wavplay(y, fs, 'async'): non-blocking call
• wavplay(y, fs, 'sync'): blocking call
• sound(y, fs)
• soundsc(…): autoscale the sound

Example (wavPlay01.m):

[y, fs] = wavread('rrrrr.wav');
wavplay(y, fs);

Exercise: Follow the example to play “flanger.wav”.

To Read/Play Using DSP Blocks

To read/play sound using DSP Blockset:

• DSP Blockset/DSP Sources/From Wave File
• DSP Blockset/DSP Sinks/To Wave Device

Example: [Figure: Simulink model]

Exercise: Create a model as shown above.

Frame-based operation!

Solution

Solution to the previous exercise:

slWavFilePlay01.mdl

To Write a Wave File

To write MS wave files: wavwrite

wavwrite(y, fs, nbits, wavefile)

• “nbits” must be 8 or 16.
• “y” must have two columns for stereo data.
• Amplitude values outside [-1,1] are clipped.

Example (wavWrite01.m):

[y, fs] = wavread('rrrrr.wav');
wavwrite(y, fs*1.2, 8, 'testout.wav');
!start testout.wav

Exercise: Try out the above example.
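In current MATLAB releases `wavwrite` has been replaced by `audiowrite`. The sketch below mirrors the example above under some assumptions: the input is a synthetic tone rather than “rrrrr.wav”, the output goes to a temporary file, and the signal is clipped to [-1, 1] explicitly before writing so that the clipping step is visible in the code.

```matlab
fs = 8000;                                   % original sampling rate (assumed)
t = (0:fs-1)'/fs;
y = 1.5*sin(2*pi*440*t);                     % deliberately exceeds [-1, 1]

% Clip explicitly so the written data stays within the valid range.
y = max(min(y, 1), -1);

% Write 8-bit data with a 20% higher sampling rate (faster, higher pitch).
outFile = [tempname, '.wav'];                % hypothetical temporary file
audiowrite(outFile, y, round(1.2*fs), 'BitsPerSample', 8);

% Read the file back to check what was stored.
[z, fsOut] = audioread(outFile);
delete(outFile);
```

The sampling rate is rounded because the file header stores an integer rate.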

To Record a Wave File

To record wave files:

1. Use the recording utility under WinXP.
2. Use “wavrecord” under MATLAB.
3. Use “From Wave Device” under Simulink, under “DSP Blocksets/Platform Specific IO/Windows (Win32)”.

Example:

1. Go ahead and try the WinXP recording utility!
2. Try “wavRecord01.m”.
3. Try “slWavFileRecord01.mdl”.

Exercise: Try out the above examples.

Time-Domain Speech Signals

A typical time-domain plot of speech signals:

Amplitude: volume or intensity

Frequency: pitch

Changing Wave Playback Param.

To control the play of a sound:

• Normal: wavplay(y, fs)
• High volume: wavplay(2*y, fs)
• Low volume: wavplay(0.5*y, fs)
• High pitch (and faster): wavplay(y, 1.2*fs)
• Low pitch (and slower): wavplay(y, 0.8*fs)

Exercise:

• Try “wavPlay01.m” and trace the code.
• Create “wavPlay02.m” such that you can record your own voice on the fly.
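Changing the playback rate alters pitch and duration together, because the same samples are simply played faster or slower. The following sketch works this out numerically for an assumed 200 Hz test tone; no audio device is needed. In current MATLAB releases `wavplay` is no longer available, so the commented-out playback call uses `sound` instead.

```matlab
fs = 8000;                                   % recording rate in Hz (assumed)
f0 = 200;                                    % tone frequency in Hz (assumed)
y = sin(2*pi*f0*(0:2*fs-1)'/fs);             % a 2-second test tone

rates = [1.0, 1.2, 0.8];                     % normal, faster, slower playback
durations = numel(y) ./ (rates*fs);          % playback time in seconds
pitches = f0 * rates;                        % perceived frequency in Hz

% On a machine with an audio device, the faster version could be played with:
% sound(y, 1.2*fs);
```

Playing at 1.2*fs shortens the 2-second tone to about 1.67 seconds and raises it from 200 Hz to 240 Hz, which is why a higher pitch with the same time span needs a different technique.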

Time-Domain Signal Processing

Take-home exercise: How can you raise the pitch while keeping the same time span?

Synthetic Sounds

Use a sine wave generator (under DSP blocksets) to produce sounds

Single frequency:

Multiple frequencies:

Amplitude modulation:

Exercise:

Create the above models.
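The three kinds of synthetic sound can also be generated directly in MATLAB, without Simulink. This is a minimal sketch; the sampling rate, the tone frequencies, and the modulation depth are assumptions chosen only for illustration.

```matlab
fs = 8000;                                   % sampling rate in Hz (assumed)
t = (0:fs-1)'/fs;                            % one second of time stamps

% Single frequency: a 440 Hz sine wave.
single = sin(2*pi*440*t);

% Multiple frequencies: sum of three sines, scaled to stay within [-1, 1].
multi = (sin(2*pi*440*t) + sin(2*pi*554*t) + sin(2*pi*659*t))/3;

% Amplitude modulation: a 5 Hz envelope applied to a 440 Hz carrier.
envelope = 1 + 0.5*sin(2*pi*5*t);            % varies between 0.5 and 1.5
am = envelope .* sin(2*pi*440*t) / 1.5;      % divide by the envelope maximum

% Each signal can be played with sound(single, fs), sound(multi, fs), etc.
```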

Solution

Solution to the previous exercise:

sineSource01

sineSource02

sineSource03

Delay in Speech/Audio

What is a delay in a signal?

y(n) --> y(n-k)

What effects can delay generate?

• Echo
• Reverberation
• Chorus
• Flanging

Single Delay in Audio Signal

Block diagram: [Figure: the input u(n) is added to a copy of itself that is delayed by z^-k and scaled by the gain a, giving the output y(n)]

y(n) = u(n) + a*u(n-k)

Simulink model: [Figure: Simulink model]

Exercise: Create the above model.
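The single-delay relation y(n) = u(n) + a*u(n-k) is an FIR filter, so it can be tried in plain MATLAB with `filter`. The sketch below feeds in a unit impulse so the echo is easy to see; the gain, the delay time, and the sampling rate are assumed values.

```matlab
fs = 8000;                                   % sampling rate in Hz (assumed)
k = round(0.25*fs);                          % delay of 0.25 s, in samples
a = 0.6;                                     % echo gain (assumed)

u = [1; zeros(fs-1, 1)];                     % unit impulse as test input

% Numerator 1 + a*z^-k implements y(n) = u(n) + a*u(n-k).
b = [1, zeros(1, k-1), a];
y = filter(b, 1, u);

% y is zero except for the direct sound at n = 1 and the echo at n = k+1.
```

Replacing `u` with a speech signal read from a file gives a single audible echo after k/fs seconds.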

Multiple Delay in Audio Signal

How to create “karaoke” effects:

[Figure: block diagram with input u(n), output y(n), and a feedback path through the delay z^-k and the gain a]

y(n) = u(n) + a*u(n-k) + a^2*u(n-2k) + a^3*u(n-3k) + ...

Simulink model: [Figure: Simulink model]

Multiple Delay in Audio Signal

Parameter values:

• Feedback gain a < 1
• Actual delay time = k/fs

Exercise:

• Create the above model and change some parameters to see their effects.
• Modify the model to take microphone input (so you can start singing karaoke now!)
• Use a “configurable subsystem” to include all possible input files and the microphone. (See next page.)
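A series of echoes whose amplitudes decay by the feedback gain a at every multiple of k samples is what the recursion y(n) = u(n) + a*y(n-k) produces. The sketch below implements that recursion with `filter` and a unit impulse input; the values of fs, k, and a are assumptions.

```matlab
fs = 8000;                                   % sampling rate in Hz (assumed)
k = 2000;                                    % delay in samples (k/fs = 0.25 s)
a = 0.5;                                     % feedback gain, must satisfy a < 1

u = [1; zeros(fs-1, 1)];                     % unit impulse as test input

% Denominator 1 - a*z^-k implements y(n) = u(n) + a*y(n-k).
den = [1, zeros(1, k-1), -a];
y = filter(1, den, u);

% The impulse response has echoes 1, a, a^2, a^3, ... spaced k samples apart.
echoes = y(1:k:end);
```

With a gain of 1 or more the echoes would not decay, which is why the feedback gain has to stay below 1.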

Multiple Delay in Audio Signal

How to use the “configurable subsystem” block?

1. Create a library (say, wavinput.mdl).
2. Get a “configurable subsystem” block.
3. Fill the dialog box with the library name.

Audio Flanging

Flanging sound:

• A sound similar to the sound of a jet plane flying overhead, or a "whooshing" sound
• “Pitch modulation” due to a variable delay

Simulink demo:

• dspafxf.mdl (all platforms)
• dspafxf_nt.mdl (for 95/98/NT)
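The variable delay behind flanging can be sketched in a few lines of plain MATLAB. This is not the algorithm of the Simulink demos; it is a simplified illustration in which the delay is swept by a slow sine wave and rounded to whole samples. The maximum delay, sweep rate, mixing gain, and test tone are all assumptions.

```matlab
fs = 8000;                                   % sampling rate in Hz (assumed)
t = (0:fs-1)'/fs;
u = sin(2*pi*440*t);                         % test input (assumed)

maxDelay = round(0.003*fs);                  % sweep the delay over 0..3 ms
lfoRate = 1;                                 % sweep rate in Hz (assumed)
g = 0.7;                                     % gain of the delayed copy

% Time-varying delay in whole samples, between 0 and maxDelay.
d = round((maxDelay/2) * (1 + sin(2*pi*lfoRate*t)));

y = u;
for n = 1:numel(u)
    m = n - d(n);                            % index of the delayed sample
    if m >= 1
        y(n) = u(n) + g*u(m);                % mix direct and delayed signal
    end
end
y = y / (1 + g);                             % keep the output within [-1, 1]
```

Rounding the delay to whole samples keeps the sketch short; a smoother result would interpolate between neighboring samples.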

Audio Flanging

Simulink model: [Figure: Simulink model]

[Figures: original spectrogram and modified spectrogram]

Signal Processing Using sptool

To invoke sptool, type “sptool”.

Speech Production

How is speech produced?

Speech is produced when air is forced from the lungs through the vocal cords (glottis) and along the vocal tract.

Analogy to system theory:

• Input: air forced into the vocal cords
• Output: media vibration
• System (or filter): vocal tract
• Pitch frequency: frequency of the input
• Formant frequency: resonant frequency

Source Filter Model of Speech

The source-filter model of speech production:

• Speech is split into a rapidly varying excitation signal and a slowly varying filter. The envelope of the power spectra contains the vocal tract information.
• Two important characteristics of the model are the fundamental (pitch) frequency (f0) and the formants (F1, F2, F3, …).

Frame Analysis of Speech Signal

[Figure: a speech waveform, zoomed in to show consecutive frames and the overlap between neighboring frames]
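Splitting a signal into overlapping frames can be written as a short loop. The sketch below is one possible way to do it; the frame size, the overlap, and the test signal are assumptions, and samples left over at the end that do not fill a whole frame are dropped.

```matlab
fs = 8000;                                   % sampling rate in Hz (assumed)
s = sin(2*pi*200*(0:fs-1)'/fs);              % stand-in for a speech signal

frameSize = 256;                             % samples per frame (assumed)
overlap = 128;                               % samples shared by neighbors
hop = frameSize - overlap;                   % shift between frame starts

% Number of complete frames that fit into the signal.
numFrames = floor((numel(s) - overlap)/hop);

frames = zeros(frameSize, numFrames);        % one frame per column
for i = 1:numFrames
    start = (i-1)*hop + 1;
    frames(:, i) = s(start:start+frameSize-1);
end
```

With this layout the second half of each frame is identical to the first half of the next one, which is the overlap.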

Spectrogram

Spectrogram (specgram.m) displays short-time frequency contents:

[Figures: waveform and its spectrogram]
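A spectrogram is built by taking the magnitude of the FFT of each windowed frame. In newer versions of the Signal Processing Toolbox, `specgram` has been superseded by `spectrogram`; the sketch below avoids both and uses only `fft`, so that the steps are visible. The test tone, frame size, hop size, and the Hamming window written out by hand are assumptions.

```matlab
fs = 8000;                                   % sampling rate in Hz (assumed)
t = (0:fs-1)'/fs;
s = sin(2*pi*1000*t);                        % 1000 Hz test tone (assumed)

frameSize = 256;                             % samples per frame
hop = 128;                                   % shift between frames
win = 0.54 - 0.46*cos(2*pi*(0:frameSize-1)'/(frameSize-1));  % Hamming window

numFrames = floor((numel(s) - frameSize)/hop) + 1;
S = zeros(frameSize/2 + 1, numFrames);       % magnitude, one column per frame
for i = 1:numFrames
    start = (i-1)*hop + 1;
    X = fft(s(start:start+frameSize-1) .* win);
    S(:, i) = abs(X(1:frameSize/2 + 1));     % keep non-negative frequencies
end

freqs = (0:frameSize/2)' * fs/frameSize;     % frequency of each row in Hz
[~, peakBin] = max(S(:, 1));
peakFreq = freqs(peakBin);                   % strongest frequency in frame 1

% To display: imagesc(1:numFrames, freqs, 20*log10(S + eps)); axis xy;
```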

Real-time Spectrogram

Try “dspstfft_win32”:

[Figures: spectrogram and spectrum]

Pitch and Formants

Pitch and formants can be defined visually:

[Figure: annotated with the first formant F1, the second formant F2, and the pitch period = 1/f0]

Spectrogram Reading

Spectrogram Reading:

• http://cslu.cse.ogi.edu/tutordemos/SpectrogramReading/spectrogram_reading.html

[Figures: waveform and spectrogram, labeled “compute”]

Pitch Determination Algorithms

Time-domain:

• Auto-correlation
• AMDF (Average Magnitude Difference Function)
• Gold-Rabiner algorithm (1969)

Frequency-domain:

• Cepstrum (Noll 1964)
• Harmonic product spectrum (Schroeder 1968)

Others:

• SIFT (Simple inverse filter tracking)
• Maximum likelihood
• Neural network approach
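As an illustration of one of the time-domain methods listed above, here is a minimal AMDF sketch. The AMDF at lag tau is the mean of |s(n) - s(n-tau)|, and it dips toward zero when tau equals the pitch period. The synthetic frame, its 200 Hz fundamental, and the lag search range are assumptions; a real pitch tracker would also need voicing decisions and smoothing.

```matlab
fs = 8000;                                   % sampling rate in Hz (assumed)
f0 = 200;                                    % true pitch: period of 40 samples
n = (0:255)';
s = sin(2*pi*f0*n/fs) + 0.5*sin(2*pi*2*f0*n/fs);   % synthetic voiced frame

minLag = 20;                                 % search range in samples,
maxLag = 70;                                 % roughly 114 Hz to 400 Hz
lags = (minLag:maxLag)';

amdf = zeros(size(lags));
for i = 1:numel(lags)
    tau = lags(i);
    amdf(i) = mean(abs(s(1+tau:end) - s(1:end-tau)));
end

[~, iMin] = min(amdf);                       % deepest valley of the AMDF
pitchPeriod = lags(iMin);                    % in samples
pitchHz = fs / pitchPeriod;
```

The search range is kept below twice the expected period so that the valley at two periods does not compete with the one at the true period.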

Autocorrelation of Each Frame

Let s(k) be a frame of size 128.

[Figure: the frame s(k) and its shifted copy s(k-τ) for τ = 30]

x(30) = dot product of the overlapped parts = sum(s(31:128).*s(1:98))

[Figure: the autocorrelation x(τ); the position of its peak gives the pitch period]
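The autocorrelation of a 128-sample frame and the pitch period read from its peak can be computed as in the sketch below. The synthetic frame with a 250 Hz fundamental and the minimum lag used to skip the peak at lag zero are assumptions.

```matlab
fs = 8000;                                   % sampling rate in Hz (assumed)
f0 = 250;                                    % true pitch: period of 32 samples
n = (0:127)';
s = sin(2*pi*f0*n/fs) + 0.3*sin(2*pi*3*f0*n/fs);   % frame of size 128

N = numel(s);
maxLag = 100;
acf = zeros(maxLag+1, 1);                    % acf(tau+1) holds x(tau)
for tau = 0:maxLag
    % Dot product of the overlapped parts of the frame and its shifted copy.
    acf(tau+1) = sum(s(tau+1:N) .* s(1:N-tau));
end

minLag = 16;                                 % ignore lags below 2 ms
[~, iMax] = max(acf(minLag+1:end));
pitchPeriod = minLag + iMax - 1;             % lag of the highest peak
pitchHz = fs / pitchPeriod;
```

For tau = 30 the loop evaluates sum(s(31:128).*s(1:98)), the same overlapped dot product described for x(30).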

Autocorrelation via DSP Blockset

Real-time autocorrelation demo: [Figure: Simulink model]

Exercise: Construct the above model and try it.

Pitch Tracking via Autocorrelation

Real-time pitch tracking via autocorrelation: pitch2.mdl

Formant Analysis

Characteristics of formants:

• Formants are perceptually defined.
• The corresponding physical property is the set of resonance frequencies of the vocal tract.
• Formant analysis is useful because the positions of the first two formants largely identify a vowel.

Computation methods:

• Peak picking on the smoothed spectrum
• Peak picking on the LP spectrum
• Factoring for the LP roots
• Fitting of mixture of Gaussians
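Of the computation methods above, factoring for the LP roots is the easiest to sketch in base MATLAB. The example below fits a linear predictor by the autocorrelation method (solving the normal equations directly rather than calling the toolbox function `lpc`) and converts the angle of each complex root into a frequency. The test signal is the impulse response of a single assumed resonance at 1000 Hz, so the expected answer is known; real speech needs a higher LP order, windowing, and pre-emphasis.

```matlab
fs = 8000;                                   % sampling rate in Hz (assumed)
formantHz = 1000;                            % resonance to recover (assumed)
r = 0.95;                                    % pole radius (assumed)
theta = 2*pi*formantHz/fs;
aTrue = [1, -2*r*cos(theta), r^2];           % two-pole resonator
s = filter(1, aTrue, [1; zeros(399, 1)]);    % its impulse response

p = 2;                                       % LP order: one resonance
N = numel(s);
rxx = zeros(p+1, 1);                         % autocorrelation at lags 0..p
for tau = 0:p
    rxx(tau+1) = sum(s(tau+1:N) .* s(1:N-tau));
end

% Normal equations R*c = r for the predictor s(n) ~ sum_k c(k)*s(n-k).
c = toeplitz(rxx(1:p)) \ rxx(2:p+1);

% Roots of A(z) = 1 - sum_k c(k) z^-k; keep one of each conjugate pair.
poles = roots([1; -c]);
poles = poles(imag(poles) > 0);
formants = sort(angle(poles) * fs/(2*pi));   % root angles as frequencies in Hz
```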

Formant Analysis

Track Draw:

• A package for formant synthesis with options to sketch formant tracks on a spectrogram.
• http://www.utdallas.edu/~assmann/TRACKDRAW/trackdraw.html

Formant Location Algorithm:

• MATLAB code by Michelle Jamrozik
• http://ece.clemson.edu/speech/files.htm

Speech Waveform Coding

Time domain coding:

• PCM: Pulse Code Modulation
• DPCM: Differential PCM
• ADPCM: Adaptive Differential PCM (dspadpcm.mdl)

Frequency domain coding:

• Sub-band coding
• Transform coding

Speech Coding in MATLAB: http://www.eas.asu.edu/~speech/education/educ1.html
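The difference between PCM and DPCM can be shown with a small sketch. It is not taken from dspadpcm.mdl or from the linked course material; it is a simplified illustration with a uniform quantizer, a previous-sample predictor, and assumed bit depths and step sizes.

```matlab
fs = 8000;                                   % sampling rate in Hz (assumed)
t = (0:fs-1)'/fs;
x = 0.8*sin(2*pi*200*t);                     % test signal (assumed)

% PCM: quantize every sample uniformly with 8 bits over [-1, 1).
nbits = 8;
step = 2/2^nbits;
pcmCodes = max(min(round(x/step), 2^(nbits-1)-1), -2^(nbits-1));
xPcm = pcmCodes*step;                        % decoded PCM signal

% DPCM: quantize the difference to the previously decoded sample (4 bits).
dBits = 4;
dStep = 0.02;                                % difference step size (assumed)
dCodes = zeros(size(x));
xDpcm = zeros(size(x));
prev = 0;                                    % decoder state, also kept here
for n = 1:numel(x)
    d = x(n) - prev;                         % prediction error
    dCodes(n) = max(min(round(d/dStep), 2^(dBits-1)-1), -2^(dBits-1));
    prev = prev + dCodes(n)*dStep;           % what the decoder reconstructs
    xDpcm(n) = prev;
end
```

Here DPCM spends 4 bits per sample instead of 8 because successive samples differ only slightly. ADPCM additionally adapts the step size to the signal.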

Conclusions

Ideal tools for speech/audio signal processing:

• MATLAB
• Simulink
• Signal Processing Toolbox
• DSP Blockset

Advantages:

• Reliable functions: well-established and tested
• Visible graphical algorithm design tools
• High-level programming language yet C-compatible
• Powerful visualization capabilities
• Easy debugging
• Integrated environment

References

[1] “Discrete-Time Processing of Speech Signals”, by Deller, Proakis and Hansen, Prentice Hall, 1993

[2] “Fundamentals of Speech Recognition”, by Rabiner and Juang, Prentice Hall, 1993

[3] “Effects Explained”, http://www.harmony-central.com/Effects/effects-explained.html

[4] “TrackDraw”, http://www.utdallas.edu/~assmann/TRACKDRAW/trackdraw.html

[5] “Speech Coding in MATLAB”, http://www.eas.asu.edu/~speech/education/educ1.html