Audio Visual Speech Recognition
Image Processing Techniques for Speech Recognition
Presented by
Amr Medhat, Mostafa Fathy and Sameh Serag
Supervised by
Dr. Magda Fayek
Date: 5-5-2004
Agenda
• Introduction
• Audio Visual Modeling
• Spectrogram Reading
• Spectrogram Filtering
Introduction
• What is Speech Recognition?
• Speech Recognition Problems
– noise
– inter- and intra-speaker variation
– continuous speech: no boundaries between words
• Image Processing is a possible solution
Audio Visual Speech Modeling
• Reading speech from facial and lip movements.
• Categorizing mouth shapes into visual phonemes (visemes)
– phoneme: the smallest distinctive unit of speech sound
• Why?
– distinguish between easily confused phonemes (such as f vs. s and m vs. n)
– improve recognition performance in noisy environments
Demo
Ready for the challenge?
• Listen to this audio and try to understand the speech content: vox_mix[1].mov
• Listen to the speech with the video image: dig_tranexp[1].mov
• Did you understand the content? Get a prize from IBM
• Play the answer: vox_clean[1].mov (5893642)
How?
Audio Feature Extraction
Visual Feature Extraction
Audio-Visual Integration
Visual Features
• Geometric lip dimensions
– Lip shape: height/width of the inner/outer lip
• Visibility of the tongue/teeth
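The geometric lip features named on this slide can be illustrated with a small sketch. It assumes that some lip tracker (not described in the slides) already provides six landmark points per video frame; the `LipLandmarks` layout and the `lip_features` helper are hypothetical names introduced only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class LipLandmarks:
    """Hypothetical landmark layout: (x, y) pixel coordinates in one frame."""
    left_corner: tuple
    right_corner: tuple
    outer_top: tuple
    outer_bottom: tuple
    inner_top: tuple
    inner_bottom: tuple


def _dist(p, q):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def lip_features(lm):
    """Return width, outer/inner height, and height/width ratios of the lips."""
    width = _dist(lm.left_corner, lm.right_corner)
    outer_height = _dist(lm.outer_top, lm.outer_bottom)
    inner_height = _dist(lm.inner_top, lm.inner_bottom)
    return {
        "width": width,
        "outer_height": outer_height,
        "inner_height": inner_height,
        # Ratios are less sensitive to how far the speaker is from the camera.
        "outer_ratio": outer_height / width if width else 0.0,
        "inner_ratio": inner_height / width if width else 0.0,
    }


# Made-up landmark coordinates for a single frame.
frame = LipLandmarks((10, 50), (70, 50), (40, 35), (40, 65), (40, 45), (40, 55))
features = lip_features(frame)
```

One such feature dictionary per frame would form the visual feature stream; tongue/teeth visibility would need an additional detector that is not sketched here.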
Audio-Visual Integration
• Feature Fusion
• Synchronization Problem
• Use low-resolution image
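One simple way to picture feature fusion and the synchronization problem is to pair every audio feature frame with the video frame that covers the same instant, then concatenate the two vectors. The sketch below assumes 100 audio frames per second and 25 video frames per second; these rates and the `fuse_features` helper are illustrative assumptions, not values taken from the slides.

```python
def fuse_features(audio_frames, visual_frames, audio_rate=100, video_rate=25):
    """Concatenate each audio vector with the visual vector of the same instant.

    audio_rate and video_rate are integer frames per second (assumed values).
    Audio frame i starts at time i / audio_rate, which falls inside video frame
    floor(i * video_rate / audio_rate); integer arithmetic avoids rounding errors.
    """
    if not visual_frames:
        raise ValueError("at least one visual frame is required")
    fused = []
    for i, audio_vec in enumerate(audio_frames):
        j = min(i * video_rate // audio_rate, len(visual_frames) - 1)
        fused.append(list(audio_vec) + list(visual_frames[j]))
    return fused


# Eight audio frames (80 ms) against two video frames (80 ms), made-up values.
audio = [[i, i + 100] for i in range(8)]
visual = [[0.3], [0.6]]
fused = fuse_features(audio, visual)
```

Here audio frames 0 to 3 are paired with the first video frame and frames 4 to 7 with the second, so each fused vector has three components.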
Spectrograms
• A spectrogram:
– A translation of speech into the visual domain
[Figure: spectrogram with frequency and time axes]
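To make the idea of translating speech into the visual domain concrete, the sketch below computes a small magnitude spectrogram with a short-time Fourier transform written in plain Python. The frame length, hop size, Hann window, and test tone are all assumptions chosen for illustration; the slides do not say how their spectrograms were produced.

```python
import cmath
import math


def spectrogram(samples, frame_len=64, hop=32):
    """Return one list of DFT magnitudes (bins 0..frame_len//2) per time frame."""
    # Hann window to reduce leakage between frequency bins.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        chunk = [samples[start + n] * window[n] for n in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


# A 1000 Hz test tone sampled at 8000 Hz stands in for a speech signal.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(512)]
spec = spectrogram(tone)

# Strongest frequency bin in the first frame, converted back to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
```

Each inner list is one column of the image (one time step), and each entry is the brightness of one frequency bin; a real system would normally use an FFT library instead of this direct DFT.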
Spectrogram Reading
Waveform and Spectrogram of the word: "phonetician"
Spectrogram Filtering
Required:
How?
Using: Morphological Processing
Morphological Processing
• Based on the theory of mathematical morphology
• Stresses the role of shape in image preprocessing; used for region identification
• Important morphological operations:
– Erosion
– Dilation
– Opening
– Closing
Erosion & Dilation
• Erosion:
– Used to shrink objects
• Dilation:
– Dual of erosion
– Used to fill small gaps or valleys between shapes
• Both are irreversible
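The two operations can be sketched on a tiny binary image. This is a minimal pure-Python version that assumes a 3 x 3 square structuring element and ignores neighbours that fall outside the image; both choices are assumptions, since the slides do not specify them.

```python
def _neighbors(image, y, x, r):
    """Pixel values in the (2r+1) x (2r+1) window around (y, x), clipped to the image."""
    h, w = len(image), len(image[0])
    return [image[yy][xx]
            for yy in range(max(0, y - r), min(h, y + r + 1))
            for xx in range(max(0, x - r), min(w, x + r + 1))]


def erode(image, size=3):
    """A pixel stays 1 only if every in-bounds neighbour in the window is 1."""
    r = size // 2
    return [[int(all(_neighbors(image, y, x, r))) for x in range(len(image[0]))]
            for y in range(len(image))]


def dilate(image, size=3):
    """A pixel becomes 1 if any in-bounds neighbour in the window is 1."""
    r = size // 2
    return [[int(any(_neighbors(image, y, x, r))) for x in range(len(image[0]))]
            for y in range(len(image))]


# A 3 x 3 block plus one isolated pixel two columns to its right.
image = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

eroded = erode(image)      # the block shrinks to its centre; the lone pixel vanishes
dilated = dilate(image)    # the gap between the block and the lone pixel is filled
restored = dilate(eroded)  # the block comes back, the lone pixel does not
```

The last line illustrates the irreversibility point: dilating the eroded image recovers the block but not the isolated pixel, so the original image cannot be reconstructed.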
Opening & Closing
• Both are used for smoothing an object's contour
• Opening: erosion followed by dilation
– smooths the contour from the inside; separates objects
• Closing: dilation followed by erosion
– smooths the contour from the outside; fills in small holes
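Opening and closing are just the two possible orderings of erosion and dilation, which the following sketch demonstrates. It assumes a 3 x 3 square structuring element and ignores neighbours outside the image; the helper names and both test images are made up for illustration.

```python
def _morph(image, size, combine):
    """Apply `combine` (all = erosion, any = dilation) over each clipped window."""
    r = size // 2
    h, w = len(image), len(image[0])
    return [[int(combine(image[yy][xx]
                         for yy in range(max(0, y - r), min(h, y + r + 1))
                         for xx in range(max(0, x - r), min(w, x + r + 1))))
             for x in range(w)] for y in range(h)]


def erode(image, size=3):
    return _morph(image, size, all)


def dilate(image, size=3):
    return _morph(image, size, any)


def opening(image, size=3):
    """Erosion followed by dilation."""
    return dilate(erode(image, size), size)


def closing(image, size=3):
    """Dilation followed by erosion."""
    return erode(dilate(image, size), size)


# Two 3 x 3 blocks joined by a one-pixel-thick bridge.
bridged = [[0] * 11 for _ in range(7)]
for y in range(2, 5):
    for x in list(range(1, 4)) + list(range(7, 10)):
        bridged[y][x] = 1
for x in range(4, 7):
    bridged[3][x] = 1

# A 3 x 3 block with a one-pixel hole in its centre.
holed = [[0] * 7 for _ in range(7)]
for y in range(2, 5):
    for x in range(2, 5):
        holed[y][x] = 1
holed[3][3] = 0

opened = opening(bridged)  # bridge removed, the two blocks are separated
closed = closing(holed)    # hole filled, outer shape unchanged
```

In this example opening deletes the thin bridge while leaving both blocks intact, and closing fills the hole without growing the block.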
[Figures: examples of erosion and dilation]
Spectrogram Filtering
[Diagram: processing steps labeled convert, thresholding, dilation, erosion, ANDing, convert]
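The diagram's step labels suggest a pipeline along the following lines. This sketch is only one possible reading: it assumes the steps run in the order thresholding, dilation, erosion, ANDing, that "ANDing" means masking the original spectrogram with the cleaned binary image, and that a 3 x 3 structuring element is used. The two "convert" steps (to and from an image representation) are not shown, and `filter_spectrogram` is a hypothetical name.

```python
def threshold(image, level):
    """Binarize: 1 where the magnitude reaches `level`, otherwise 0."""
    return [[1 if v >= level else 0 for v in row] for row in image]


def _morph(image, size, combine):
    """Apply `combine` (all = erosion, any = dilation) over each clipped window."""
    r = size // 2
    h, w = len(image), len(image[0])
    return [[int(combine(image[yy][xx]
                         for yy in range(max(0, y - r), min(h, y + r + 1))
                         for xx in range(max(0, x - r), min(w, x + r + 1))))
             for x in range(w)] for y in range(h)]


def filter_spectrogram(spec, level, size=3):
    """Threshold, dilate, erode, then mask the original magnitudes."""
    mask = threshold(spec, level)
    mask = _morph(mask, size, any)   # dilation: bridges small gaps in the mask
    mask = _morph(mask, size, all)   # erosion: restores the original thickness
    # "ANDing" (assumed meaning): keep a magnitude only where the mask is 1.
    return [[v if m else 0.0 for v, m in zip(row, mask_row)]
            for row, mask_row in zip(spec, mask)]


# Made-up 5 x 9 spectrogram: weak background, one strong band with a dropout.
noisy = [[0.1] * 9 for _ in range(5)]
noisy[2] = [0.9, 0.8, 0.9, 0.2, 0.9, 0.8, 0.9, 0.8, 0.9]
filtered = filter_spectrogram(noisy, level=0.5)
```

With these made-up values the weak background is zeroed, and the strong band survives as one continuous track, including the dropout at column 3 that plain thresholding would have cut out.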