Post on 02-Aug-2015
Image Processing Techniques for Speech Recognition
Presented by
Amr Medhat, Mostafa Fathy and Sameh Serag
Supervised by
Dr. Magda Fayek
Date: 5-5-2004
Agenda
• Introduction
• Audio Visual Modeling
• Spectrogram Reading
• Spectrogram Filtering
Introduction
• What is Speech Recognition?
• Speech Recognition Problems
– noise
– inter- and intra-speaker variation
– continuous speech: no boundaries between words
• Image Processing is a possible solution
Audio Visual Speech Modeling
• Reading speech from facial and lip movements.
• Categorizing mouth shapes into visual phonemes (visemes)
  phoneme: the smallest distinctive unit of speech sound
• Why?
– distinguish between easily confused phonemes (like f, s and m, n)
– improve recognition performance in noisy environments
Demo
Ready for the challenge?
• Listen to this audio and try to understand the speech content: vox_mix[1].mov
• Listen to speech with video image: dig_tranexp[1].mov
• Did you understand the content? Get a prize from IBM
• Play the answer: vox_clean[1].mov (5893642)
How?
Audio Feature Extraction
Visual Feature Extraction
Audio-Visual Integration
Visual Features
• Geometric lip dimensions
– Lip shape: height/width of the inner/outer lip
• Visibility of the tongue/teeth
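The geometric lip dimensions listed for the visual features can be illustrated with a minimal sketch. It assumes the lip contours are already available as lists of (x, y) points; the function name `lip_shape_features` and the input format are hypothetical and not taken from the slides.

```python
def lip_shape_features(outer, inner):
    """Width, height and height/width ratio for the outer and inner lip.

    `outer` and `inner` are lists of (x, y) contour points (assumed input,
    e.g. the output of some lip tracker).
    """
    def dims(points):
        xs = [p[0] for p in points]
        ys = [p[1] for p in points]
        width = max(xs) - min(xs)
        height = max(ys) - min(ys)
        # Guard against a degenerate contour with zero width.
        ratio = height / width if width else 0.0
        return width, height, ratio

    ow, oh, o_ratio = dims(outer)
    iw, ih, i_ratio = dims(inner)
    return {
        "outer_width": ow, "outer_height": oh, "outer_ratio": o_ratio,
        "inner_width": iw, "inner_height": ih, "inner_ratio": i_ratio,
    }
```

A feature vector built this way would be computed once per video frame.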
Audio-Visual Integration
• Feature Fusion
• Synchronization Problem
• Use low-resolution image
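One possible reading of feature fusion with synchronization is sketched below. It assumes that audio features arrive at a higher frame rate than visual features, that the visual stream is linearly interpolated up to the audio rate, and that fusion means concatenating the two vectors per frame. None of these choices is stated in the slides, and the names `upsample_linear` and `fuse_features` are hypothetical.

```python
def upsample_linear(frames, target_len):
    """Linearly interpolate a list of feature vectors to target_len frames."""
    n = len(frames)
    if n == 1 or target_len == 1:
        return [list(frames[0]) for _ in range(target_len)]
    out = []
    for i in range(target_len):
        pos = i * (n - 1) / (target_len - 1)  # fractional source index
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        w = pos - lo
        out.append([(1 - w) * a + w * b for a, b in zip(frames[lo], frames[hi])])
    return out


def fuse_features(audio_frames, visual_frames):
    """Concatenate audio and (upsampled) visual features frame by frame."""
    visual_up = upsample_linear(visual_frames, len(audio_frames))
    return [list(a) + v for a, v in zip(audio_frames, visual_up)]
```

For example, three audio frames and two visual frames yield three fused frames, with the middle visual vector interpolated between its neighbours.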
Spectrograms
• A Spectrogram:
– Translation of speech into the visual domain
[Figure: spectrogram; axes: frequency, time]
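To make the translation into the visual domain concrete, here is a minimal sketch of a magnitude spectrogram computed with a short-time discrete Fourier transform. The frame length, hop size and Hann window are illustrative assumptions rather than values from the slides, and the direct DFT is used only to keep the example dependency-free.

```python
import cmath
import math


def spectrogram(signal, frame_len=64, hop=32):
    """Magnitude spectrogram: rows are time frames, columns are frequency bins."""
    # Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        mags = []
        # Only the non-negative frequencies are needed for a real signal.
        for k in range(frame_len // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            mags.append(abs(acc))
        frames.append(mags)
    return frames
```

Each row can then be drawn as one column of pixels, with brightness proportional to magnitude, which gives the time-frequency image that the later slides treat with image processing operations.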
Spectrogram Reading
Waveform and Spectrogram of the word: "phonetician"
Spectrogram Filtering
Required:
How?
Using: Morphological Processing
Morphological Processing
• Based on the theory of Mathematical Morphology ?!!
• Stresses the role of shape in image preprocessing used for region identification.
• Important Morphological operations:
– Erosion
– Dilation
– Opening
– Closing
Erosion & Dilation
• Erosion: the meaning
– Used to shrink objects.
• Dilation: the meaning
– Dual of erosion.
– Used to fill small gaps or valleys between shapes.
• Both are irreversible
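The two operations can be sketched on a binary image stored as a list of rows of 0/1 values. The 3x3 square structuring element and the treatment of pixels outside the image as background are assumptions made for this example; the slides do not specify either.

```python
def _window(img, r, c):
    """Values in the 3x3 window around (r, c); outside pixels count as 0."""
    rows, cols = len(img), len(img[0])
    return [
        img[rr][cc] if 0 <= rr < rows and 0 <= cc < cols else 0
        for rr in range(r - 1, r + 2)
        for cc in range(c - 1, c + 2)
    ]


def erode(img):
    """A pixel stays 1 only if every pixel in its 3x3 window is 1."""
    return [[1 if all(_window(img, r, c)) else 0
             for c in range(len(img[0]))] for r in range(len(img))]


def dilate(img):
    """A pixel becomes 1 if any pixel in its 3x3 window is 1."""
    return [[1 if any(_window(img, r, c)) else 0
             for c in range(len(img[0]))] for r in range(len(img))]
```

Eroding a 3x3 block of ones leaves only its centre pixel, while dilating it grows it by one pixel in every direction. Irreversibility is easy to see here: once erosion has removed an object entirely, no dilation can bring it back.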
Opening & Closing
• Both are used for smoothing an object contour
• Opening: Erosion followed by dilation
– smooths the object contour from the inside and separates objects
• Closing: Dilation followed by erosion
– smooths the object contour from the outside and fills in small holes
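Opening and closing follow directly from composing the two basic operations. The sketch below again assumes a binary image as a list of rows, a 3x3 square structuring element, and background outside the image; these choices are illustrative only.

```python
def _apply(img, combine):
    """Apply `combine` (all or any) to every 3x3 window; outside pixels are 0."""
    rows, cols = len(img), len(img[0])

    def window(r, c):
        return [img[rr][cc] if 0 <= rr < rows and 0 <= cc < cols else 0
                for rr in (r - 1, r, r + 1) for cc in (c - 1, c, c + 1)]

    return [[int(combine(window(r, c))) for c in range(cols)]
            for r in range(rows)]


def erode(img):
    return _apply(img, all)


def dilate(img):
    return _apply(img, any)


def opening(img):
    """Erosion followed by dilation: removes specks smaller than the element."""
    return dilate(erode(img))


def closing(img):
    """Dilation followed by erosion: fills holes smaller than the element."""
    return erode(dilate(img))
```

With these definitions, opening deletes an isolated pixel next to a larger block and leaves the block unchanged, whereas closing fills a one-pixel hole inside a block.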
[Figures: two erosion examples, two dilation examples]
Spectrogram Filtering
[Diagram: spectrogram filtering pipeline; stage labels: convert, thresholding, dilation, erosion, ANDing, convert]
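The stages named for spectrogram filtering can be combined in a rough sketch. The ordering used here (threshold, then dilation, then erosion, then AND with the original values) and the 3x3 structuring element are assumptions about what the diagram showed, and the two "convert" steps are left out because the slides do not say what they convert between. The function name `filter_spectrogram` is hypothetical.

```python
def _apply(img, combine):
    """Apply `combine` (all or any) to every 3x3 window; outside pixels are 0."""
    rows, cols = len(img), len(img[0])

    def window(r, c):
        return [img[rr][cc] if 0 <= rr < rows and 0 <= cc < cols else 0
                for rr in (r - 1, r, r + 1) for cc in (c - 1, c, c + 1)]

    return [[int(combine(window(r, c))) for c in range(cols)]
            for r in range(rows)]


def filter_spectrogram(spec, threshold):
    """Keep spectrogram values only where a morphologically cleaned mask is set."""
    # Thresholding: binary mask of the strong time-frequency cells.
    mask = [[1 if v >= threshold else 0 for v in row] for row in spec]
    # Dilation then erosion (a closing) bridges small gaps in the mask.
    mask = _apply(_apply(mask, any), all)
    # ANDing: original values survive only under the mask (assumed meaning).
    return [[v if m else 0.0 for v, m in zip(row, mrow)]
            for row, mrow in zip(spec, mask)]
```

In this reading, weak cells far from any strong region are zeroed, while a weak cell that sits in a small gap inside a strong band keeps its original value.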