Speech/Audio Signal Processing in MATLAB/Simulink
J.-S. Roger Jang (張智星)
CS Dept, Tsing-Hua Univ, Taiwan (清華大學 資訊系)
http://www.cs.nthu.edu.tw/~jang
[email protected]@cs.nthu.edu.tw
2006 Speech/Audio Signal Processing in MATLAB/Simulink
112/04/09 3
Outline
Wave file manipulation: reading, writing, recording ...
Time-domain processing: delay, filtering, sptool ...
Frequency-domain processing: spectrogram
Pitch determination: auto-correlation, SIFT, AMDF, HPS ...
Others: formant estimation, speech coding
Toolbox/Blockset Used
MATLAB
Simulink
Signal Processing Toolbox
DSP Blockset
MATLAB Primer
Before you start, you need to get familiar with MATLAB. Please read “MATLAB Primer” at the following page:
http://neural.cs.nthu.edu.tw/jang/demo/demoDownload.asp
Exercise:
1. Please plot two curves y=sin(2*t) and y=cos(3*t) in the same figure.
2. Please plot x vs. y where x=sin(2*t) and y=cos(3*t).
To Read a Wave File
To read a MS .wav file (PCM format only): wavread
y = wavread(file)
[…] = wavread(file, [n1, n2])
[y, fs, nbits, opts] = wavread(file)
[…] = wavread(file, n)
[y, fs, nbits] = wavread(file)
If the wav file is stereo, y will be a two-column matrix.
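In recent MATLAB releases wavread and wavwrite have been removed in favor of audioread and audiowrite. The following is a minimal sketch of the same reading patterns, assuming a release that provides these two functions. The test tone is synthetic and the file is a temporary one, not one of the course files.

```matlab
fs = 8000;                                       % sampling rate in Hz (assumed value)
t = (0:fs-1)'/fs;                                % one second of time stamps
ySrc = 0.5*[sin(2*pi*440*t), sin(2*pi*660*t)];   % synthetic stereo tone, two columns
wavFile = [tempname, '.wav'];                    % temporary file name (illustrative)
audiowrite(wavFile, ySrc, fs);                   % 16-bit PCM by default
[y, fsRead] = audioread(wavFile);                % whole file, one column per channel
yPart = audioread(wavFile, [1001, 2000]);        % only samples 1001..2000
delete(wavFile);                                 % clean up the temporary file
```

As with wavread, a stereo file comes back as a two-column matrix, and a two-element range selects a block of samples.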
To Read a Wav File
Example (wavRead01.m):
[y, fs] = wavread('singapore.wav');
plot((1:length(y))/fs, y);
xlabel('Time in seconds');
ylabel('Amplitude');
Exercise:
1. Plot the waveform of “rrrrr.wav”. Use MATLAB’s “zoom” button to find where the consecutive rolled “R” sounds occur.
2. Plot the two-channel waveform in “flanger.wav”.
Solution to the Previous Exercise
wavRead02.m:
[y, fs] = wavread('flanger.wav');
subplot(2,1,1), plot((1:length(y))/fs, y(:,1));
subplot(2,1,2), plot((1:length(y))/fs, y(:,2));
To Play Wav Files
To play sound using the Windows audio output device: wavplay, sound, soundsc
wavplay(y, fs)
wavplay(y, fs, 'async'): non-blocking call
wavplay(y, fs, 'sync'): blocking call
sound(y, fs)
soundsc(…): autoscale the sound
Example (wavPlay01.m):
[y, fs] = wavread('rrrrr.wav');
wavplay(y, fs);
Exercise: Follow the example to play “flanger.wav”.
To Read/Play Using DSP Blocks
To read/play sound using DSP Blockset:
DSP Blockset/DSP Sources/From Wave File
DSP Blockset/DSP Sinks/To Wave Device
Example: [Figure: Simulink model]
Exercise: Create a model as shown above.
Frame-based operation!
Solution
Solution to the previous exercise:
slWavFilePlay01.mdl
To Write a Wave File
To write MS wave files: wavwrite
wavwrite(y, fs, nbits, wavefile)
“nbits” must be 8 or 16.
“y” must have two columns for stereo data.
Amplitude values outside [-1,1] are clipped.
Example (wavWrite01.m):
[y, fs] = wavread('rrrrr.wav');
wavwrite(y, fs*1.2, 8, 'testout.wav');
!start testout.wav
Exercise: Try out the above example.
To Record a Wave File
To record wave files:
1. Use the recording utility under WinXP.
2. Use “wavrecord” under MATLAB.
3. Use “From Wave Device” under Simulink, under “DSP Blockset/Platform Specific IO/Windows (Win32)”
Example:
1. Go ahead and try the WinXP recording utility!
2. Try “wavRecord01.m”
3. Try “slWavFileRecord01.mdl”
Exercise: Try out the above examples.
Time-Domain Speech Signals
A typical time-domain plot of speech signals:
Amplitude: volume or intensity
Frequency: pitch
Changing Wave Playback Parameters
To control the playback of a sound:
• Normal: wavplay(y, fs)
• High volume: wavplay(2*y, fs)
• Low volume: wavplay(0.5*y, fs)
• High pitch (and faster): wavplay(y, 1.2*fs)
• Low pitch (and slower): wavplay(y, 0.8*fs)
Exercise:
• Try “wavPlay01.m” and trace the code.
• Create “wavPlay02.m” such that you can record your own voice on the fly.
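Playing the same samples at a different rate scales duration and perceived pitch together. The sketch below works this out numerically for the three rates listed above. The 200 Hz test tone and the 8 kHz rate are assumed values chosen for illustration.

```matlab
fs = 8000;                          % original sampling rate in Hz (assumed)
f0 = 200;                           % frequency of the test tone in Hz (assumed)
t = (0:2*fs-1)'/fs;                 % two seconds of samples
y = sin(2*pi*f0*t);

ratios = [1.0, 1.2, 0.8];           % playback rate relative to fs
duration = length(y) ./ (ratios*fs);    % seconds of playback at each rate
pitch = f0 * ratios;                    % perceived frequency at each rate

for i = 1:numel(ratios)
    fprintf('rate %.1f*fs: %.3f s, %.0f Hz\n', ratios(i), duration(i), pitch(i));
end
% To listen on a machine with audio output: sound(y, 1.2*fs)
```

Raising the rate by a factor of 1.2 raises the pitch by 1.2 and shortens the sound by the same factor, which is why pitch and speed cannot be changed independently this way.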
Time-Domain Signal Processing
Take-home exercise: How to get a higher pitch with the same time span?
Synthetic Sounds
Use a sine wave generator (under DSP blocksets) to produce sounds
Single frequency:
Multiple frequencies:
Amplitude modulation:
Exercise:
Create the above models.
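The three kinds of synthetic sound can also be sketched directly in MATLAB code instead of Simulink blocks. The frequencies, modulation depth, and scaling below are assumed values for illustration, not the settings of the course models.

```matlab
fs = 8000;                          % sampling rate in Hz (assumed)
t = (0:fs-1)'/fs;                   % one second of time stamps

% Single frequency: one sine wave at 440 Hz
single = sin(2*pi*440*t);

% Multiple frequencies: sum of three sine waves, scaled to stay within [-1, 1]
multi = (sin(2*pi*440*t) + sin(2*pi*660*t) + sin(2*pi*880*t)) / 3;

% Amplitude modulation: a 5 Hz envelope applied to a 440 Hz carrier
envelope = 1 + 0.5*sin(2*pi*5*t);
am = envelope .* sin(2*pi*440*t) / 1.5;

% To listen on a machine with audio output: sound(am, fs)
```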
Solution
Solution to the previous exercise:
sineSource01
sineSource02
sineSource03
Delay in Speech/Audio
What is a delay in a signal?
y(n) --> y(n-k)
What effects can delay generate?
Echo
Reverberation
Chorus
Flanging
Single Delay in Audio Signal
Block diagram: [Figure: the input u(n) is added to a copy of itself passed through the gain a and the delay z^(-k)]
Output: y(n) = u(n) + a*u(n-k)
Simulink model: [Figure: Simulink model]
Exercise: Create the above model.
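The single-delay equation y(n) = u(n) + a*u(n-k) is an FIR filter, so it can be sketched with MATLAB's filter function. The delay of 0.25 s and the gain of 0.6 are assumed values, and a unit impulse stands in for the audio input so the echo is easy to see.

```matlab
fs = 8000;                          % sampling rate in Hz (assumed)
k = round(0.25*fs);                 % delay in samples, i.e. 0.25 s (assumed)
a = 0.6;                            % echo gain (assumed)

u = [1; zeros(3*k-1, 1)];           % unit impulse as a stand-in for the input
b = [1; zeros(k-1, 1); a];          % numerator: 1 + a*z^(-k)
y = filter(b, 1, u);                % y(n) = u(n) + a*u(n-k)

echoPositions = find(y ~= 0)';      % samples where the output is nonzero
```

The output contains the original impulse at sample 1 and one scaled copy k samples later. Replacing u with a recorded signal gives a single echo.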
Multiple Delay in Audio Signal
How to create “karaoke” effects:
Block diagram: [Figure: input u(n) and output y(n), with a feedback path through the delay z^(-k) and the gain a]
y(n) = u(n) + a*u(n-k) + a^2*u(n-2k) + a^3*u(n-3k) + ...
Simulink model: [Figure: Simulink model]
Multiple Delay in Audio Signal
Parameter values:
• Feedback gain a < 1
• Actual delay time = k/fs
Exercise:
• Create the above model and change some parameters to see their effects.
• Modify the model to take microphone input (so you can start singing karaoke now!)
• Use a “configurable subsystem” to include all possible input files and the microphone. (See next page.)
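A delay with feedback gain a produces a train of echoes whose amplitudes decay geometrically. A minimal sketch, assuming the feedback form y(n) = u(n) + a*y(n-k) and illustrative values for k and a, with a unit impulse as the input:

```matlab
fs = 8000;                          % sampling rate in Hz (assumed)
k = 1000;                           % delay in samples, i.e. k/fs = 0.125 s (assumed)
a = 0.5;                            % feedback gain, must satisfy a < 1

u = [1; zeros(4*k, 1)];             % unit impulse as a stand-in for the input
den = [1; zeros(k-1, 1); -a];       % denominator: 1 - a*z^(-k)
y = filter(1, den, u);              % y(n) = u(n) + a*y(n-k)

taps = y(1:k:end)';                 % output sampled once every k samples
delaySeconds = k/fs;                % actual delay time
```

The sampled values come out as 1, a, a^2, a^3, a^4. With a >= 1 the echoes would not die out, which is why the gain is kept below 1.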
Multiple Delay in Audio Signal
How to use the “configurable subsystem” block?
1. Create a library (say, wavinput.mdl)
2. Get a block of “configurable subsystem”
3. Fill the dialog box with the library name
Audio Flanging
Flanging sound:
• A sound similar to the sound of a jet plane flying overhead, or a "whooshing" sound
• “Pitch modulation” due to a variable delay
Simulink demo:
• dspafxf.mdl (all platforms)
• dspafxf_nt.mdl (for 95/98/NT)
Audio Flanging
Simulink model: [Figure: Simulink model]
[Figures: original spectrogram and modified spectrogram]
Signal Processing Using sptool
To invoke sptool, type “sptool”.
Speech Production
How is speech produced?
Speech is produced when air is forced from the lungs through the vocal cords (glottis) and along the vocal tract.
Analogy to system theory:
Input: air forced into the vocal cords
Output: vibration of the medium (air)
System (or filter): vocal tract
Pitch frequency: frequency of the input
Formant frequency: resonant frequency of the system
Source Filter Model of Speech
The source-filter model of speech production:
Speech is split into a rapidly varying excitation signal and a slowly varying filter. The envelope of the power spectrum contains the vocal tract information.
Two important characteristics of the model are the fundamental (pitch) frequency (f0) and the formants (F1, F2, F3, …).
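The source-filter idea can be sketched by driving a resonant filter with a pulse train. This is a toy illustration only: the source is an ideal impulse train at an assumed f0 of 100 Hz, and the "vocal tract" is a single two-pole resonator at an assumed first formant of 700 Hz, whereas real speech has several formants that change over time.

```matlab
fs = 8000;                          % sampling rate in Hz (assumed)
f0 = 100;                           % pitch of the source in Hz (assumed)
F1 = 700;                           % single formant frequency in Hz (assumed)
r = 0.97;                           % pole radius, controls formant bandwidth (assumed)

N = fs/2;                           % half a second of samples
period = round(fs/f0);              % pitch period in samples
source = zeros(N, 1);
source(1:period:end) = 1;           % excitation: one pulse per pitch period

theta = 2*pi*F1/fs;                 % pole angle for the formant
den = [1, -2*r*cos(theta), r^2];    % two-pole resonator as the filter
speechLike = filter(1, den, source);

% To listen on a machine with audio output: soundsc(speechLike, fs)
```

Changing f0 changes the perceived pitch without moving the resonance, and changing F1 changes the timbre without changing the pitch, which is the separation the model describes.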
Frame Analysis of Speech Signal
[Figure: speech waveform, zoomed in to show a frame and the overlap between adjacent frames]
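Frame analysis cuts the signal into short, overlapping segments. A minimal sketch using plain indexing follows. The frame size of 256 and the overlap of 128 samples are assumed values, and the input is a synthetic tone.

```matlab
fs = 8000;                          % sampling rate in Hz (assumed)
y = sin(2*pi*200*(0:fs-1)'/fs);     % one second of a synthetic tone

frameSize = 256;                    % samples per frame (assumed)
overlap = 128;                      % samples shared by adjacent frames (assumed)
hop = frameSize - overlap;          % step between frame starts

frameCount = floor((length(y) - overlap) / hop);
frames = zeros(frameSize, frameCount);   % one frame per column
for i = 1:frameCount
    startIdx = (i-1)*hop + 1;
    frames(:, i) = y(startIdx : startIdx + frameSize - 1);
end
```

Each column can then be analyzed on its own, for example for its spectrum or its pitch. Leftover samples that do not fill a whole frame are dropped here.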
Spectrogram
Spectrogram (specgram.m) displays short-time frequency contents:
[Figure: waveform and its spectrogram]
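What a spectrogram computes can be sketched with the core fft function alone: window each frame, take its FFT, and keep the magnitudes. The frame size, hop, and Hamming window formula below are assumed choices for illustration, not the defaults of specgram.

```matlab
fs = 8000;                          % sampling rate in Hz (assumed)
t = (0:fs-1)'/fs;
y = sin(2*pi*1000*t);               % synthetic 1000 Hz tone

frameSize = 256;                    % samples per frame (assumed)
hop = 128;                          % step between frame starts (assumed)
win = 0.54 - 0.46*cos(2*pi*(0:frameSize-1)'/(frameSize-1));   % Hamming window

frameCount = floor((length(y) - frameSize)/hop) + 1;
S = zeros(frameSize/2 + 1, frameCount);      % magnitude, frequency by time
for i = 1:frameCount
    seg = y((i-1)*hop + (1:frameSize)) .* win;
    X = fft(seg);
    S(:, i) = abs(X(1:frameSize/2 + 1));     % keep non-negative frequencies
end

freqAxis = (0:frameSize/2)' * fs/frameSize;  % frequency of each row in Hz
[~, peakBin] = max(S(:, 1));
peakFreq = freqAxis(peakBin);                % strongest frequency in frame 1
% To display: imagesc((0:frameCount-1)*hop/fs, freqAxis, 20*log10(S + eps)); axis xy
```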
Real-time Spectrogram
Try “dspstfft_win32”:
[Figure: spectrogram and spectrum displays]
Pitch and Formants
Pitch and formants can be defined visually:
[Figure: plot annotated with the first formant F1, the second formant F2, and the pitch period = 1/f0]
Spectrogram Reading
Spectrogram Reading
• http://cslu.cse.ogi.edu/tutordemos/SpectrogramReading/spectrogram_reading.html
[Figure: waveform and spectrogram labeled “compute”]
Pitch Determination Algorithms
Time-domain:
• Auto-correlation
• AMDF (Average Magnitude Difference Function)
• Gold-Rabiner algorithm (1969)
Frequency-domain:
• Cepstrum (Noll 1964)
• Harmonic product spectrum (Schroeder 1968)
Others:
• SIFT (Simple inverse filter tracking)
• Maximum likelihood
• Neural network approach
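As one concrete instance from the time-domain group, AMDF can be sketched in a few lines: for each candidate lag, average the absolute difference between the frame and its shifted copy, then pick the lag with the smallest value. The frame length, the lag search range, and the synthetic 200 Hz input are assumptions for illustration.

```matlab
fs = 8000;                          % sampling rate in Hz (assumed)
f0 = 200;                           % true pitch of the test signal in Hz (assumed)
n = (0:255)';
s = sin(2*pi*f0*n/fs);              % one frame of a synthetic periodic signal

lags = 20:60;                       % candidate pitch periods in samples (assumed range)
amdf = zeros(size(lags));
for i = 1:numel(lags)
    tau = lags(i);
    amdf(i) = mean(abs(s(1+tau:end) - s(1:end-tau)));   % average magnitude difference
end

[~, idx] = min(amdf);               % the deepest valley marks the pitch period
pitchPeriod = lags(idx);            % in samples
pitchHz = fs / pitchPeriod;
```

The search range matters in practice: a range wide enough to include twice the true period would also contain a valley there, so real pitch trackers add rules for choosing among valleys.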
Autocorrelation of Each Frame
Let s(k) be a frame of size 128.
[Figure: the frame s(k) and its shifted copy s(k-tau), shown for tau = 30]
x(30) = dot product of the overlapped parts = sum(s(31:128).*s(1:98))
[Figure: autocorrelation x(tau), with the pitch period marked]
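The per-frame autocorrelation and the resulting pitch estimate can be sketched as follows. The frame size of 128 follows the slide; the sampling rate, the synthetic 250 Hz input, and the lag search range are assumptions for illustration.

```matlab
fs = 8000;                          % sampling rate in Hz (assumed)
f0 = 250;                           % true pitch of the test signal in Hz (assumed)
N = 128;                            % frame size, as on the slide
n = (1:N)';
s = sin(2*pi*f0*n/fs);              % one frame of a synthetic periodic signal

x = zeros(N, 1);                    % x(tau+1) holds the autocorrelation at lag tau
for tau = 0:N-1
    x(tau+1) = sum(s(tau+1:N) .* s(1:N-tau));   % dot product of the overlapped parts
end

lags = 16:100;                      % search range, skipping the peak at lag 0 (assumed)
[~, idx] = max(x(lags + 1));
pitchPeriod = lags(idx);            % in samples
pitchHz = fs / pitchPeriod;
```

For tau = 30 the loop evaluates sum(s(31:128).*s(1:98)), the expression on the slide. Because fewer samples overlap at larger lags, the peaks shrink as tau grows, so the first strong peak after lag 0 is the one that gives the pitch period.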
Autocorrelation via DSP Blockset
Real-time autocorrelation demo: [Figure: Simulink model]
Exercise: Construct the above model and try it.
Pitch Tracking via Autocorrelation
Real-time pitch tracking via autocorrelation: pitch2.mdl
Formant Analysis
Characteristics of formants:
• Formants are perceptually defined.
• The corresponding physical property is the set of resonance frequencies of the vocal tract.
• Formant analysis is useful because the positions of the first two formants largely identify a vowel.
Computation methods:
• Peak picking on the smoothed spectrum
• Peak picking on the LP spectrum
• Factoring for the LP roots
• Fitting of mixture of Gaussians
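The "factoring for the LP roots" method can be sketched with core MATLAB functions: estimate linear-prediction coefficients from the autocorrelation, factor the prediction polynomial, and convert the pole angles to frequencies. This is a toy example under strong assumptions: the signal is the impulse response of a single two-pole resonator at 700 Hz, the LP order is 2, and the normal equations are solved directly rather than with the Signal Processing Toolbox.

```matlab
fs = 8000;                          % sampling rate in Hz (assumed)
fTrue = 700;                        % resonance used to build the test signal (assumed)
r = 0.95;                           % pole radius of the test resonator (assumed)
theta = 2*pi*fTrue/fs;
x = filter(1, [1, -2*r*cos(theta), r^2], [1; zeros(399, 1)]);   % test signal

p = 2;                              % LP order: two poles per formant
N = length(x);
R = zeros(p+1, 1);                  % autocorrelation at lags 0..p
for k = 0:p
    R(k+1) = sum(x(1+k:N) .* x(1:N-k));
end

% Normal equations of the autocorrelation method: toeplitz(R(0..p-1)) * a = -R(1..p)
aHat = [1; -(toeplitz(R(1:p)) \ R(2:p+1))];

poles = roots(aHat);                % factor the prediction polynomial
poles = poles(imag(poles) > 0);     % keep one pole of each conjugate pair
formantHz = angle(poles) * fs/(2*pi);
```

On real speech a higher order is needed (a common rule of thumb is about two poles per expected formant plus a few extra), and poles with wide bandwidths are usually discarded before reading off formants.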
Formant Analysis
TrackDraw:
• A package for formant synthesis with options to sketch formant tracks on a spectrogram.
• http://www.utdallas.edu/~assmann/TRACKDRAW/trackdraw.html
Formant Location Algorithm:
• MATLAB code by Michelle Jamrozik
• http://ece.clemson.edu/speech/files.htm
Speech Waveform Coding
Time domain coding:
• PCM: Pulse Code Modulation
• DPCM: Differential PCM
• ADPCM: Adaptive Differential PCM (dspadpcm.mdl)
Frequency domain coding:
• Sub-band coding
• Transform coding
Speech Coding in MATLAB:
http://www.eas.asu.edu/~speech/education/educ1.html
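The difference between PCM and DPCM can be made concrete with a toy DPCM loop: instead of quantizing each sample, quantize the difference between the sample and the decoder's previous output. This sketch assumes the simplest possible predictor (the previous reconstructed sample), a fixed uniform step, and no limit on the code range, so it illustrates the idea rather than any standard codec.

```matlab
fs = 8000;                          % sampling rate in Hz (assumed)
t = (0:199)'/fs;
x = 0.8*sin(2*pi*200*t);            % synthetic input signal

step = 0.05;                        % quantizer step size (assumed)
code = zeros(size(x));              % integer codes that would be transmitted
xHat = zeros(size(x));              % signal rebuilt by the decoder
prev = 0;                           % decoder state shared by both sides

for n = 1:numel(x)
    code(n) = round((x(n) - prev) / step);   % quantize the prediction error
    prev = prev + code(n)*step;              % decoder adds the decoded difference
    xHat(n) = prev;
end

maxErr = max(abs(x - xHat));        % reconstruction error
maxCode = max(abs(code));           % largest code magnitude needed
```

Because the encoder predicts from the reconstructed signal, quantization errors do not accumulate and the error stays within half a step. ADPCM goes one step further by adapting the step size to the signal.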
Conclusions
Ideal tools for speech/audio signal processing:
• MATLAB
• Simulink
• Signal Processing Toolbox
• DSP Blockset
Advantages:
• Reliable functions: well-established and tested
• Visible graphical algorithm design tools
• High-level programming language yet C-compatible
• Powerful visualization capabilities
• Easy debugging
• Integrated environment
References
[1] “Discrete-Time Processing of Speech Signals”, by Deller, Proakis and Hansen, Prentice Hall, 1993
[2] “Fundamentals of Speech Recognition”, by Rabiner and Juang, Prentice Hall, 1993
[3] “Effects Explained”, http://www.harmony-central.com/Effects/effects-explained.html
[4] “TrackDraw”, http://www.utdallas.edu/~assmann/TRACKDRAW/trackdraw.html
[5] “Speech Coding in MATLAB”, http://www.eas.asu.edu/~speech/education/educ1.html