Source: mlsp.cs.cmu.edu/courses/fall2009/class17/mlsp_pitch.pdf
Machine Learning in Signal Processing
Pitch and Intonation
F0 and Intonation
• What is F0
• What it typically looks like
• How to extract it from speech
• How to model it
• How to model what it means
Prosody
• How the phonemes will be said
• Four aspects of prosody
  • Phrasing: where the breaks will be
  • Intonation: pitch accents and F0 generation
  • Duration: how long the phonemes will be
  • Power: energy in signal
Intonation
• The fundamental tune
• Accents (highlighting important parts)
• F0 generation (the tune itself)
Intonation Contour
Intonation Information
• Large pitch range (female)
• Authoritative, since it goes down at the end
  • News reader
• Emphasis for "Finance" (H*)
• Final has a rise: more information to come
• Female American newsreader from WBUR (Boston University Radio)
Intonation Examples
• Fixed durations, flat F0
• Declining F0
• "Hat" accents on stressed syllables
• Accents and end tones
• Statistically trained
F0 Examples
Finding Pitch
F0 Example
Creaky Voice
Pitch Doubling
Pitch Halving
Finding Pitch
• Know what you are looking for and look
• Low pass filter
  • Pitch will be in range 60-300 Hz
• LPC and residual
  • Peaks will be clearer in residual
• Use autocorrelation
  • Find common frequency
  • Though pitch changes over time
• Use *my* method, it works best
  • ESPS get_f0
  • PDA
  • TEMPO (YIN)
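The autocorrelation step can be sketched as follows. This is a minimal illustration, not any of the named tools: it skips the low pass filter and LPC residual steps, works on a single frame, and the function name, frame length, and test tone are assumptions made for the example. The only value taken from the slide is the 60-300 Hz search range.

```python
import numpy as np

def estimate_f0_autocorr(frame, sample_rate, f0_min=60.0, f0_max=300.0):
    """Estimate F0 (Hz) for one frame from the strongest autocorrelation peak."""
    frame = frame - np.mean(frame)  # remove any DC offset
    # Full autocorrelation; keep only the non-negative lags.
    acf = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    # Only search lags that correspond to the expected pitch range.
    lag_min = int(sample_rate / f0_max)
    lag_max = min(int(sample_rate / f0_min), len(acf) - 1)
    best_lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return sample_rate / best_lag

# Usage on a synthetic 125 Hz tone (40 ms at 16 kHz), a stand-in for voiced speech.
sample_rate = 16000
t = np.arange(int(0.04 * sample_rate)) / sample_rate
frame = np.sin(2 * np.pi * 125.0 * t)
f0 = estimate_f0_autocorr(frame, sample_rate)
print(f"estimated F0: {f0:.1f} Hz")
```

Because pitch changes over time, a real tracker would run this on short overlapping frames and would also need a voicing decision, which this sketch leaves out.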
Use Electroglottograph
• EGG/Laryngograph
EGG
EGG (continued)
What do you do with it?
• We'd like to model it
• Predict it from text
• Use it to find "focus" in speech
• Normalize it
• Interpolate through unvoiced regions
• Smooth it
• Parameterize it
One strategy
• Find pitch periods
  • Low pass filter, use LPC residual
  • Use autocorrelation
  • Prune in expected range
• Interpolate through unvoiced regions
• Convert to F0
  • 1/pitch period
• Smooth
  • Or curve fit
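The last three steps of this strategy (interpolate, convert, smooth) can be sketched as below. The input format (one pitch period per frame, in seconds, with 0.0 marking unvoiced frames), the linear interpolation, and the moving-average smoother are all assumptions chosen for illustration; the slide does not say which interpolation or smoothing method was used.

```python
import numpy as np

def periods_to_f0_contour(periods_s, smooth_frames=5):
    """Turn per-frame pitch periods (seconds, 0.0 = unvoiced) into a smooth F0 contour.

    smooth_frames is assumed to be odd so the output keeps the input length.
    """
    periods = np.asarray(periods_s, dtype=float)
    voiced = periods > 0
    if not voiced.any():
        raise ValueError("no voiced frames to interpolate from")
    frames = np.arange(len(periods))
    # Interpolate linearly through unvoiced frames; np.interp holds the
    # nearest voiced value at the start and end of the utterance.
    filled = np.interp(frames, frames[voiced], periods[voiced])
    f0 = 1.0 / filled  # convert to F0: 1 / pitch period
    # Moving-average smoothing, with edge padding to preserve length.
    pad = smooth_frames // 2
    padded = np.pad(f0, pad, mode="edge")
    kernel = np.ones(smooth_frames) / smooth_frames
    return np.convolve(padded, kernel, mode="valid")

# Usage: two voiced stretches (100 Hz and 125 Hz) separated by two unvoiced frames.
periods = [0.01, 0.01, 0.0, 0.0, 0.008, 0.008]
contour = periods_to_f0_contour(periods)
print(np.round(contour, 1))
```

A curve fit, the alternative named on the slide, would replace the moving average with a fitted function over the same interpolated values.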
F0 Generation
• Contour from accents (and durations)
• Piece together shapes of different accents
• Generated
  • By rule
  • Trained from data
Three Point Model
• Find F0 at
  • Syllable start
  • Voicing onset
  • Syllable end
• Predict these values with
  • CART/Linear Regression
• Sort of reasonable
  • RMS: 34.8
  • Correlation: 0.62
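To make the two evaluation figures concrete, here is a small sketch of fitting a linear regression for one F0 target and scoring it with RMS error and correlation. The data is synthetic and the three features (stress, accent, position in phrase) are hypothetical stand-ins; none of it comes from the lecture's experiment, and the numbers it prints are not comparable to the slide's.

```python
import numpy as np

def rmse(pred, target):
    """Root mean square error between predicted and observed values."""
    return float(np.sqrt(np.mean((pred - target) ** 2)))

def correlation(pred, target):
    """Pearson correlation between predicted and observed values."""
    return float(np.corrcoef(pred, target)[0, 1])

rng = np.random.default_rng(0)
n = 200
# Hypothetical per-syllable features: stressed (0/1), accented (0/1),
# relative position in the phrase (0..1).
features = np.column_stack(
    [rng.integers(0, 2, n), rng.integers(0, 2, n), rng.random(n)]
)
# Synthetic F0 target for one of the three points (for example syllable start).
f0_target = 120.0 + features @ np.array([10.0, 25.0, -30.0]) + rng.normal(0.0, 15.0, n)

X = np.column_stack([np.ones(n), features])  # add an intercept column
train, test = slice(0, 150), slice(150, None)
weights, *_ = np.linalg.lstsq(X[train], f0_target[train], rcond=None)
pred = X[test] @ weights

print(f"RMSE: {rmse(pred, f0_target[test]):.1f}")
print(f"Correlation: {correlation(pred, f0_target[test]):.2f}")
```

The same fit would be repeated for each of the three points; a CART would take the place of the least-squares step.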
Find Structures/Shapes in F0
• Tilt Theory of Intonation
  • Describe shapes with 5 parameters
• Moeller Vector Quantized Shapes
  • 8 shapes
• Klabbers et al., Superpositional model
  • Parameters per "foot"
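As an illustration of describing a shape with a few numbers, the sketch below computes Tilt-style values for one accent from its rise and fall parts. It follows the commonly published definition of the Tilt model, where tilt is the average of an amplitude ratio and a duration ratio. The slide does not list its five parameters, so this only derives three shape values, and the function and argument names are hypothetical.

```python
def tilt_parameters(rise_amp, fall_amp, rise_dur, fall_dur):
    """Summarize one rise-fall accent as amplitude, duration, and tilt.

    Amplitudes are F0 excursions (e.g. Hz), durations are in seconds.
    Tilt is +1 for a pure rise, -1 for a pure fall, 0 for a symmetric shape.
    """
    amp_total = abs(rise_amp) + abs(fall_amp)
    dur_total = rise_dur + fall_dur
    if amp_total == 0 or dur_total == 0:
        raise ValueError("accent has no amplitude or no duration")
    tilt_amp = (abs(rise_amp) - abs(fall_amp)) / amp_total
    tilt_dur = (rise_dur - fall_dur) / dur_total
    return {
        "amplitude": amp_total,
        "duration": dur_total,
        "tilt": 0.5 * (tilt_amp + tilt_dur),
    }

# Usage: a 30 Hz rise over 0.12 s followed by a 10 Hz fall over 0.08 s.
shape = tilt_parameters(rise_amp=30.0, fall_amp=-10.0, rise_dur=0.12, fall_dur=0.08)
print(shape)
```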
Intonational Phonology
• Accents and Boundaries
  • Where are the important changes in F0
• Accents on syllables
  • Identifies "important" words
    • It will be RAINY today in Boston
    • It will be rainy TODAY in Boston
    • It will BE rainy today IN Boston (strange)
Where do the accents go?
• On important words
• First approximation
  • On stressed syllables in content words
    • It WILL be RAINY TODAY in BOSTON
  • About 80% correct on news reader speech
• CART training on more features
  • Content, proper nouns, POS, position in text
  • (not semantic information)
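The first approximation can be sketched at the word level as below. The function-word list is a hypothetical, deliberately tiny stand-in for a real content/function distinction (it is chosen so that the slide's example comes out the same, including "WILL"), and the sketch marks whole words rather than stressed syllables, which would need a pronunciation lexicon.

```python
# Hypothetical function-word list; a real system would use POS tags or a lexicon.
FUNCTION_WORDS = {"a", "an", "the", "it", "be", "is", "in", "on", "of", "to", "and"}

def mark_accents(sentence):
    """Upper-case every word that is not in the function-word list."""
    marked = []
    for word in sentence.split():
        if word.lower() in FUNCTION_WORDS:
            marked.append(word)
        else:
            marked.append(word.upper())
    return " ".join(marked)

print(mark_accents("It will be rainy today in Boston"))
# prints: It WILL be RAINY TODAY in BOSTON
```

The CART approach on the slide replaces the fixed word list with a classifier over the listed features.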
ToBI
• Tones and Break Indices
• A labeling scheme for intonation (English)
• Different accent types
  • H*, !H*, L*, L+H*
• Different boundary types
  • L-L%, L-H%, H-H%
ToBI examples
Using real contours
• From a database of different contours
  • Select most appropriate one
• Record lots of different intonation examples
  • He DID then KNOW what HAD occurred
  • TARZAN and JANE raised THEIR heads
  • …
• Label them and select the contours when you want emphasis
Emphasis Synthesis
• This is a short example
• THIS is a short example
• This IS a short example
• This is A short example
• This is a SHORT example
• This is a short EXAMPLE
Extracting F0 from "real" speech
Summary
• Extracting F0 from speech
• Modeling F0
  • Low level to high level
• Intonational accents
  • How to predict where they go
• Problems in moving from lab to real speech