SPEECH RECOGNITION 2
DAY 15 – SEPT 30, 2013
Brain & Language
LING 4110-4890-5110-7960
NSCI 4110-4891-6110
Harry Howard
Tulane University
Course organization
• The syllabus, these slides and my recordings are available at http://www.tulane.edu/~howard/LING4110/.
• If you want to learn more about EEG and neurolinguistics, you are welcome to participate in my lab. This is also a good way to get started on an honors thesis.
• The grades are posted to Blackboard.
9/30/13 Brain & Language, Harry Howard, Tulane University
REVIEW
Review
• Pitch shows fundamental frequency (F0)
• Spectrogram shows formants (F1-3)
• Sound wave
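To make the link between the sound wave and the pitch track concrete, here is a minimal Python sketch that estimates F0 from a waveform by autocorrelation. It is an illustration only, not a description of how Praat computes pitch. The function names, the synthetic test signal, and the 75-500 Hz search range are all assumptions made for this example.

```python
import math

def synth_tone(f0, sample_rate, n_samples):
    """Build a periodic test signal: a fundamental plus two weaker harmonics."""
    samples = []
    for n in range(n_samples):
        t = n / sample_rate
        samples.append(
            math.sin(2 * math.pi * f0 * t)
            + 0.5 * math.sin(2 * math.pi * 2 * f0 * t)
            + 0.25 * math.sin(2 * math.pi * 3 * f0 * t)
        )
    return samples

def estimate_f0(samples, sample_rate, f_min=75.0, f_max=500.0):
    """Return sample_rate / lag for the lag with the largest autocorrelation.

    Only lags that correspond to frequencies between f_min and f_max
    (an assumed search range) are considered.
    """
    min_lag = int(sample_rate / f_max)
    max_lag = int(sample_rate / f_min)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Multiply the signal by a copy of itself shifted by `lag` samples.
        score = sum(samples[n] * samples[n + lag]
                    for n in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

# Synthetic 200 Hz tone, 0.1 s long at 8000 samples per second.
signal = synth_tone(f0=200.0, sample_rate=8000, n_samples=800)
f0_estimate = estimate_f0(signal, sample_rate=8000)
print(f0_estimate)
```

The signal repeats every 40 samples, so the shifted copy lines up best at a lag of 40, and 8000 / 40 recovers the fundamental even though the signal also contains energy at 400 Hz and 600 Hz.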
SPEECH RECOGNITION
Ingram §5
• use Praat in class
Vowel articulation
• Tongue height: high, (mid), low
  • Put your hand under your jaw and say the vowel of:
    • mat, met, mate, mitt, meat
    • meat, mitt, mate, met, mat
• Tongue advancement: front, central, back
• Lip configuration: rounded, neutral, retracted
Vowel description
                     Front       Central     Back
High                 i ɪ                     u ʊ
(Mid)                e ɛ         ɝ ə ɚ ʌ     o ɔ
Low                  æ           a
Lip configuration    Retracted   Neutral     Rounded
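The vowel chart can also be read as a lookup table from a symbol to its height and advancement. The Python sketch below encodes it that way. The names `VOWELS`, `describe` and `vowels_with` are invented for this example, and placing [a] in the central column is an assumption, since the extracted slide text does not make its column unambiguous.

```python
# Height and advancement for each vowel symbol, read off the chart.
# Assumption: [a] is treated as low central.
VOWELS = {
    "i": ("high", "front"), "ɪ": ("high", "front"),
    "u": ("high", "back"), "ʊ": ("high", "back"),
    "e": ("mid", "front"), "ɛ": ("mid", "front"),
    "ɝ": ("mid", "central"), "ə": ("mid", "central"),
    "ɚ": ("mid", "central"), "ʌ": ("mid", "central"),
    "o": ("mid", "back"), "ɔ": ("mid", "back"),
    "æ": ("low", "front"), "a": ("low", "central"),
}

def describe(vowel):
    """Return a label such as 'high front' for a vowel symbol."""
    height, advancement = VOWELS[vowel]
    return f"{height} {advancement}"

def vowels_with(height=None, advancement=None):
    """List the vowels that match the given height and/or advancement."""
    return [
        symbol
        for symbol, (h, a) in VOWELS.items()
        if (height is None or h == height)
        and (advancement is None or a == advancement)
    ]

print(describe("æ"))
print(vowels_with(height="high", advancement="back"))
```

Two vowels can share the same pair of values (for example [i] and [ɪ]), so height and advancement alone do not distinguish every vowel in the chart.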
Sample vowel spectrograms
• Wide band spectrograms of the vowels of American English in a /b__d/ context.
• Top row, left to right: [i, ɪ, eɪ, ɛ, æ]. Bottom row, left to right: [ɑ, ɔ, o, ʊ, u].
Acoustic cues and distinctive features
• Three problems
  a. Input signal
  b. Internal representation
  c. Interface between (a) and (b)
• Lexical information retrieval
  • but we only need the phonological form of a lexical item
Why speech recognition is difficult
• The segmentation problem
• The variability problem
  • coarticulation
• The speaking environment
• Speakers’ vocal tracts
• Speech rate and style
• Rate of information transmission
Lexical retrieval
• Speech perception involves phonological parsing prior to lexical access
  • It is not enough to know the lexicon beforehand.
• Phonetic forms and phonological representations
• Speech/speaker normalization
• Distinctive features and acoustic cues
• Underspecified vs. fully specified
• Discrete vs. continuous
• Hierarchical organization vs. entrainment
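As one possible illustration of speaker normalization, the Python sketch below applies per-speaker z-scoring to formant values, an approach often called Lobanov normalization. This is a swapped-in example technique, not necessarily the one the course or Ingram has in mind. The F1 values are invented for illustration, and the second "speaker" is simply the first with every value raised by 20%, a crude stand-in for a shorter vocal tract.

```python
import statistics

def z_score(values):
    """Rescale values to mean 0 and standard deviation 1 within one speaker."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

# Hypothetical F1 values in Hz for three vowels (invented, not measured).
speaker_a_f1 = [300.0, 500.0, 700.0]

# A second hypothetical speaker whose formants are uniformly 20% higher.
speaker_b_f1 = [1.2 * v for v in speaker_a_f1]

norm_a = z_score(speaker_a_f1)
norm_b = z_score(speaker_b_f1)

# The raw values differ between speakers, but the normalized ones coincide.
print(norm_a)
print(norm_b)
```

Because z-scores are unchanged by uniform scaling, the two speakers' vowels land on the same normalized values. Real speakers do not differ by a single scale factor, so this only sketches the idea.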
NEXT TIME
Finish Ingram §6.
☞ Go over questions at end of chapter.