A Simplified System for Analyzing Stop Consonant Acoustics By Zach Polen Advisor: Helen Hanson.
Background
A current study by Hanson (Union College), Shattuck-Hufnagel (MIT), and Demuth (Brown) is looking at medial stops in children's speech.
The stops in this case are the consonants g, t, p, k, d, and b.
Why? To develop models of speech development in children.
What purpose does this serve?
Provides insight into how language is stored and processed in the brain
Investigates how speech utterances are planned and produced
Possible applications to diagnose and treat speech disorders
Speech recognition and speech synthesis
Labeling Speech Events in Children's Speech Utterances
Compute Durations From Labeled Data
Voice Bar
Vowel Duration
Onset noise
Vowel Final Noise
Analysis
Compare across:
Subjects (10 children/mothers)
Voicing (voiced/voiceless)
Place of articulation (where in the mouth the closure is made)
Current Method
The current approach uses several programs to get the required data:
Acoustic recording
TextGrid formed in Praat
Script extracts labels/times
Excel computes durations/averages
Excel data to .TXT
MATLAB statistical analysis
Output to Excel/Word
Current Process Explained
1. The first step is to take the acoustic data and label it using TextGrids in Praat. This has to be repeated for every file.
2. The computation/measurement script has to be run once for each measurement you are looking for. To do so, you have to know which acoustic labels in the TextGrids formed above need to be subtracted from each other, and you have to insert these values manually. (Created by a research student at MIT)
3. The averages and standard deviations then have to be found in Excel for each type of measurement and for the acoustic tokens you want to account for. (Done by a summer research student at Union)
4. The data needs to be put into a form MATLAB can read, so in this case the Excel file is converted to a text file. (Done by myself during summer research)
5. The data is read into MATLAB and arranged so that the statistical analysis can be run. (Done by myself during summer research)
6. The data is not stored/saved in MATLAB, so the data and tables have to be output to either Word or Excel.
Problems with Current Method
Piece-meal method: although it works, it leaves a lot of room for human error.
Examples:
Uniform labeling system not enforced
Requires numerous conversions between data formats
Excel analysis not automated
Confusing/complex
Researcher’s Requests
-Simplify system (time, process, # programs)
-Reduce opportunity for human error
-Enforce label uniformity
Design Considerations
1. Semi-automate labeling of speech events using Praat software
2. Import the Praat data into a MATLAB database
3. Perform the statistical analysis of data in MATLAB
4. Make the system flexible enough that it can be used for other studies
Projected Block Diagram
Acoustic recordings -> Labeling -> Labels/times -> Analysis/Computations -> Tables of statistics/durations
Design Alternatives
Current piece-meal system
Single program that labels waveforms, reads in data, analyzes data, and outputs statistics
Hybrid system: Praat for labeling acoustic waveforms, MATLAB for extracting/analyzing/outputting data
Final Design: Implement Hybrid Process
Completely redoing the entire system would require more time.
Why we chose to use MATLAB:
Experience: the programming language most familiar to both student and advisor
Users have access
Design
First we had to decide how we wanted to save the data in MATLAB.
We decided to create a database structure.
This way the data could be conveniently stored and accessed.
Breaking Down the Titles
From the TextGrid filenames we were able to get a lot of classification information for our database:
ID
Playsession
Word
Token
Place
Sound/type
Voicing
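A minimal sketch of pulling these fields out of a filename. The slides do not show the actual naming scheme, so the underscore-delimited name, the field order, and the struct `rec` below are all assumptions made for illustration. Place and voicing are looked up from the stop itself.

```matlab
% Hypothetical filename; the real naming scheme is not given in the slides.
fname = 's01_ps2_doggy_g_t3.TextGrid';

[~, base] = fileparts(fname);     % drop the folder and the extension
parts = strsplit(base, '_');      % assumed order: id, playsession, word, sound, token

rec = struct();
rec.id          = parts{1};
rec.playsession = parts{2};
rec.word        = parts{3};
rec.sound       = parts{4};
rec.token       = parts{5};

% Place of articulation follows from the stop consonant.
switch rec.sound
    case {'p', 'b'}
        rec.place = 'labial';
    case {'t', 'd'}
        rec.place = 'alveolar';
    case {'k', 'g'}
        rec.place = 'velar';
    otherwise
        rec.place = 'unknown';
end

% b, d, and g are voiced; p, t, and k are voiceless.
if any(strcmp(rec.sound, {'b', 'd', 'g'}))
    rec.voicing = 'voiced';
else
    rec.voicing = 'voiceless';
end
```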
Getting TextGrid Information
Ran a succession of while loops
The outermost loop was a for loop that went through all the records
The first while loop found the number of tiers
The next while loop then processed each tier, with another four while loops embedded:
First finds the name of the tier
Second finds the number of points in the tier
Third finds the label of the speech event
Fourth finds the time associated with that label
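A condensed sketch of the tier-processing idea, not the project's actual code. It assumes Praat's long text format, in which each point of a point tier carries a `number = ...` line (the time) and a `mark = "..."` line (the label). The excerpt in `txt` is made up, and a single while loop with regular expressions stands in for the nested loops described above.

```matlab
% Hypothetical excerpt of one point tier from a TextGrid file.
txt = { ...
    '        class = "TextTier"', ...
    '        name = "events"', ...
    '        points: size = 2', ...
    '        points [1]:', ...
    '            number = 0.512', ...
    '            mark = "v1_begin"', ...
    '        points [2]:', ...
    '            number = 0.640', ...
    '            mark = "v1_end"'};

tierName = '';
nPoints  = 0;
labels   = {};
times    = [];

k = 1;
while k <= numel(txt)
    tok = regexp(txt{k}, 'name = "(.*)"', 'tokens', 'once');
    if ~isempty(tok), tierName = tok{1}; end

    tok = regexp(txt{k}, 'points: size = (\d+)', 'tokens', 'once');
    if ~isempty(tok), nPoints = str2double(tok{1}); end

    tok = regexp(txt{k}, 'number = ([\d.]+)', 'tokens', 'once');
    if ~isempty(tok), times(end+1) = str2double(tok{1}); end

    tok = regexp(txt{k}, 'mark = "(.*)"', 'tokens', 'once');
    if ~isempty(tok), labels{end+1} = tok{1}; end

    k = k + 1;
end
```

In a real run, `txt` would hold the lines of one TextGrid file, and the whole block would sit inside the for loop over records.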
Breakdown
Directory of records
Loop: cycle through records (one TextGrid at a time)
# Tiers while loop: number of tiers
Tiers processed while loop: name of tier
Size while loop: number of points
Point process loop: name of point, time at point
Troubleshooting Problems
The problem with manual labeling is that even when everyone uses the exact same format, mistakes still happen. To account for this, we added checks in MATLAB that make all labels lower case (if the label was supposed to be v1_begin and you typed V1_BEgIn, MATLAB auto-corrects it), remove symbols like “?” (if you typed V1-?begin, MATLAB corrects it to v1_begin), and make sure every “-” becomes “_”.
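These checks could look something like the sketch below. The helper name `cleanLabel` is hypothetical, and stripping every character other than letters, digits, "_" and "-" is an assumption about what "symbols like ?" covers.

```matlab
% Hypothetical helper: lower-case the label, strip stray symbols,
% then turn every "-" into "_".
cleanLabel = @(s) strrep(regexprep(lower(strtrim(s)), '[^a-z0-9_\-]', ''), '-', '_');

a = cleanLabel('V1_BEgIn');    % mixed case
b = cleanLabel('V1-?begin');   % stray "?" and "-" instead of "_"
c = cleanLabel(' v1-begin ');  % surrounding spaces
```

All three inputs come out as `v1_begin`, so later comparisons against the expected label names no longer depend on how carefully the label was typed.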
How to save our data?
We wanted to save our data dynamically:
Data is available on the spot
We don’t just get the last result of each loop; the data is saved for each iteration of the for loop
Data is still accessible and usable later
Saving does not depend on our particular method, so it can also be used in other studies
The Answer?
An “eval” statement!
This was harder to conceptualize, because it is less obvious what an eval is doing than it would be in longer, more explicit code that saves the same data.
An eval statement takes a string, here built by joining several strings, and executes it as if it had been typed as MATLAB code.
Sample:
eval([DurType{p} '_avg' '.' alist{m} '.' slist{i} '.' vlist{n} '.' plist{l} '=' 'mean(SAVP);'])
Another great thing about eval!
We could have saved the data and created our database using a bunch of switch statements, for loops, etc. But had we done it that way, the code would be specific to our study.
The eval lets you keep a list of whatever durations/events you want, and executes each specific one from that list.
So you could have the label names a, b, c and I could have d, e, f, and either way the eval will pick up the names and compute the desired measurements.
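To make the sample concrete, here is the same eval line run on made-up lists. Every value below (`'vowel'`, `'age2'`, `'s01'`, and so on) is hypothetical; only the variable names come from the sample. The last lines show MATLAB's dynamic field names, which give the same result without eval. That is offered as an alternative, not as the method used in the project.

```matlab
% Hypothetical lists; the real ones would come from the labeled data.
DurType = {'vowel'};
alist = {'age2'};      % age groups
slist = {'s01'};       % subjects
vlist = {'voiced'};    % voicing
plist = {'velar'};     % place
SAVP  = [0.10 0.12 0.14];   % durations (s) for one subject/age/voicing/place
p = 1; m = 1; i = 1; n = 1; l = 1;

% Build the command string, show it, then run it.
cmd = [DurType{p} '_avg' '.' alist{m} '.' slist{i} '.' vlist{n} '.' plist{l} '=' 'mean(SAVP);'];
disp(cmd)     % vowel_avg.age2.s01.voiced.velar=mean(SAVP);
eval(cmd);

% Alternative without eval: dynamic field names on a struct.
db = struct();
db.([DurType{p} '_avg']).(alist{m}).(slist{i}).(vlist{n}).(plist{l}) = mean(SAVP);
```

Either way, the names in the lists must be valid MATLAB identifiers, which is one more reason to clean the labels first.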
Duration Computation
Using a for loop and our saved database structure, we then used if statements to derive the computations
What if speech events weren’t marked?
Then we found the unique labels for subject, age, voicing, and place (what we want to average across)
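A sketch of one duration computation with a check for unmarked events. The label names and times are hypothetical, and returning NaN for a missing event is an assumption, not necessarily what the project code does.

```matlab
% Hypothetical labels and times (s) for one token.
labels = {'v1_begin', 'v1_end', 'closure_end'};
times  = [0.512 0.640 0.702];

% Vowel duration = time of v1_end minus time of v1_begin.
iStart = find(strcmp(labels, 'v1_begin'), 1);
iStop  = find(strcmp(labels, 'v1_end'), 1);
if isempty(iStart) || isempty(iStop)
    vowelDur = NaN;                          % event not marked in this token
else
    vowelDur = times(iStop) - times(iStart);
end

% A measurement whose start label was never marked.
jStart = find(strcmp(labels, 'burst_begin'), 1);
jStop  = find(strcmp(labels, 'closure_end'), 1);
if isempty(jStart) || isempty(jStop)
    burstDur = NaN;
else
    burstDur = times(jStop) - times(jStart);
end
```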
Duration Computations Cont.
We wanted to average across token in subject group:
Needed to find where subject and age intersected
Found where subject, age, voicing intersected: averaged durations across place
Found where subject, age, voicing, place intersected: averaged durations across token
Dynamically saved using eval statements
Duration Computations Cont.
We then wanted to average across subject in age group:
Found where age, voicing intersected: averaged across subject/place
Found where age, voicing, place intersected: averaged across subject
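The "find where the categories intersect, then average" step can be sketched with logical indexing. The flat arrays below are hypothetical stand-ins for the database, only two categories are shown to keep it short, and dynamic field names are used for saving in place of the eval statements.

```matlab
% Hypothetical records: one entry per token.
subject = {'s01', 's01', 's01', 's02'};
voicing = {'voiced', 'voiced', 'voiceless', 'voiced'};
dur     = [0.10 0.14 0.20 0.30];     % durations in seconds

subjects = unique(subject);
voicings = unique(voicing);

avg = struct();
for i = 1:numel(subjects)
    for n = 1:numel(voicings)
        % Tokens where this subject and this voicing intersect.
        hit = strcmp(subject, subjects{i}) & strcmp(voicing, voicings{n});
        if any(hit)
            avg.(subjects{i}).(voicings{n}) = mean(dur(hit));
        end
    end
end
```

Adding age and place means adding one more `strcmp` term to `hit` per category; dropping a term averages across that category.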
Complications
What if there were no events for a given duration?
Find the locations
Remove the locations
Why?
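One likely answer to the "why", sketched under the assumption that missing durations are stored as NaN: in MATLAB, the mean of a vector that contains a NaN is itself NaN, so a single missing event would wipe out the whole average unless it is removed first.

```matlab
% Hypothetical durations (s); NaN marks tokens where the event was not labeled.
dur = [0.10 NaN 0.14 NaN];

rawAvg = mean(dur);            % NaN, because of the missing entries

missing = find(isnan(dur));    % find locations
dur(missing) = [];             % remove locations
cleanAvg = mean(dur);          % average of the remaining tokens
```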
What else we’d like to get done…
Create a script that outputs durations in a neatly formatted table
Create a simple GUI that lets the user pick which durations they want formatted in the table
Using our durations, create a script to compute standard deviations and run statistical (ANOVA) analysis
Praat script that automatically inserts the wanted labels into the TextGrid (eliminates hand errors in writing labels and keeps all labeling uniform)
Advantage of Our Method
We set out to reduce time without reducing accuracy, and hopefully even to increase accuracy!
Our process reduces time, as the block diagram shows: the user takes two steps, versus six in the old process.
Accuracy improves because there is less manual involvement and more automation. Checks in the code also correct common labeling errors, which improves accuracy further.
Data does not move from program to program, so there is less chance of a copying error. Formulas for the statistical analysis are already built in, so the user does not have to worry about making a mistake with them in Excel.
All the user has to do is label their TextGrids, and the outputs they are looking for will be produced.
Future Work
Develop an automated labeling script
Simplify the MATLAB code into a series of functions
Acknowledgements
Questions?