2010
School of Electrical Engineering & Telecommunications
UNSW ENGINEERING @
UNSW
1. Introduction
Crying is the only means infants have to express their physical and psychological status. For many years, paediatricians have been searching for non-invasive tools to measure the brain function of infants; meanwhile, there is growing evidence that infants with medical complications have identifiable cries. Hence, cry signals may carry informative features that reflect the medical status of an infant. Moreover, cry analysis can be automated with the advent of digital signal processing. Being cheap, easy to perform and completely non-invasive, automatic acoustic analysis of infant crying is a potential prognostic or diagnostic tool for certain pathologies in the future.
2. Aim
To implement an automatic cry recogniser which classifies cries of normal and pathological (asphyxiated and hearing-impaired) infants by extracting relevant acoustic features.
3. Background
3.1. The Infant Cry Production Model
The infant cry production mechanism can be described using the physio-acoustic model (Figure 1), which consists of the infant control organiser and the independent source-filter model.
The infant control organiser performs three-level processing. The upper processor is the Central Nervous System (CNS), which determines the states of action. The middle processors represent all vegetative states, e.g. crying. The lower processors coordinate different groups of muscles.
5. Simulation Results
Average accuracy of classification is 94.1%.
4. Methodology and Experiments
Implementation of automatic classification involves two phases.
Figure 3: Automatic infant cry recogniser
Class      Normal   Asphyxia   Hearing Impairment
Accuracy   84.2%    92.3%      98.1%
Figure 1: Physio-acoustic infant cry production model
Figure 4: Process of decision making
4.3. Leave-One-Out Training and Testing
Training samples are randomly selected from 16 (out of 17) infants. The samples of the remaining infant are used for accuracy testing. Tests are repeated, holding out a different infant each time.
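The leave-one-out procedure can be sketched as follows. This is a minimal illustration, not the implementation used for the reported results: the function name `leave_one_infant_out`, the infant identifiers and the file names are hypothetical, and for simplicity the sketch trains on all samples of the remaining infants rather than a random selection.

```python
from typing import Dict, Iterator, List, Tuple


def leave_one_infant_out(
    samples_by_infant: Dict[str, List[str]],
) -> Iterator[Tuple[str, List[str], List[str]]]:
    """Yield (held_out_infant, training_samples, test_samples) per infant."""
    for held_out in sorted(samples_by_infant):
        # Training set: every sample from every infant except the held-out one.
        train = [
            sample
            for infant, samples in samples_by_infant.items()
            if infant != held_out
            for sample in samples
        ]
        # Test set: only the samples of the held-out infant.
        test = list(samples_by_infant[held_out])
        yield held_out, train, test


# Hypothetical toy data: 3 infants instead of the 17 in the database.
toy = {
    "infant_a": ["a1.wav", "a2.wav"],
    "infant_b": ["b1.wav"],
    "infant_c": ["c1.wav", "c2.wav", "c3.wav"],
}
folds = list(leave_one_infant_out(toy))
```

Splitting by infant rather than by recording keeps all cries of the test infant out of the training set, so the accuracy reflects performance on an unseen infant.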
4.4. Model Training
Models are trained using Gaussian mixture modelling (GMM).
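As an illustration of GMM training, the sketch below fits a mixture by expectation-maximisation. It rests on several assumptions not stated in the poster: a single scalar feature instead of a feature vector, two components, a fixed number of iterations, and a deterministic initialisation. The names `fit_gmm_1d` and `gaussian_pdf` and the toy values are hypothetical and are not measurements.

```python
import math
from typing import List, Tuple

Gmm = List[Tuple[float, float, float]]  # (weight, mean, variance) per component


def gaussian_pdf(x: float, mean: float, var: float) -> float:
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data: List[float], k: int, n_iter: int = 50,
               var_floor: float = 1e-6) -> Gmm:
    """Fit a k-component 1-D GMM with expectation-maximisation (EM)."""
    ordered = sorted(data)
    n = len(ordered)
    mean_all = sum(ordered) / n
    var_all = max(sum((x - mean_all) ** 2 for x in ordered) / n, var_floor)
    # Deterministic start: means spread across the sorted data, shared variance.
    gmm = [(1.0 / k, ordered[(2 * j + 1) * n // (2 * k)], var_all)
           for j in range(k)]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            p = [w * gaussian_pdf(x, m, v) for w, m, v in gmm]
            total = sum(p) or 1e-300  # guard against numerical underflow
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weight, mean and variance of each component.
        new_gmm = []
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            mean = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - mean) ** 2 for r, x in zip(resp, data)) / nj
            new_gmm.append((nj / n, mean, max(var, var_floor)))
        gmm = new_gmm
    return gmm


# Arbitrary toy feature values forming two clusters (not real cry data).
toy_features = [1.9, 2.0, 2.1, 1.95, 2.05, 7.9, 8.0, 8.1, 7.95, 8.05]
model = fit_gmm_1d(toy_features, k=2)
```

In a recogniser of this kind, one such model would be trained per class from the feature values of that class.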
4.5. Decision Making
Classifications are made using the maximum likelihood criterion.
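A maximum likelihood decision over per-class GMMs might look like the sketch below. It assumes a scalar feature per frame, independent frames and equal class priors. The model parameters are arbitrary numbers chosen only so that the example runs; they are not taken from the trained models, and the names `log_likelihood` and `classify` are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Gmm = List[Tuple[float, float, float]]  # (weight, mean, variance) per component


def log_likelihood(frames: List[float], gmm: Gmm) -> float:
    """Total log-likelihood of the frames under a 1-D GMM."""
    total = 0.0
    for x in frames:
        # Log of each weighted component density.
        logs = [
            math.log(w) - 0.5 * math.log(2.0 * math.pi * v)
            - 0.5 * (x - m) ** 2 / v
            for w, m, v in gmm
        ]
        # Log-sum-exp keeps the mixture sum numerically stable.
        peak = max(logs)
        total += peak + math.log(sum(math.exp(value - peak) for value in logs))
    return total


def classify(frames: List[float], models: Dict[str, Gmm]) -> str:
    """Return the class whose model gives the highest log-likelihood."""
    return max(models, key=lambda label: log_likelihood(frames, models[label]))


# Arbitrary illustrative parameters, not the poster's trained models.
models = {
    "normal": [(0.5, -1.0, 1.0), (0.5, 1.0, 1.0)],
    "asphyxia": [(0.5, 4.0, 1.0), (0.5, 6.0, 1.0)],
    "hearing impairment": [(0.5, 9.0, 1.0), (0.5, 11.0, 1.0)],
}
decision = classify([4.5, 5.2, 6.1], models)
```

Summing log-likelihoods over frames is equivalent to multiplying the frame likelihoods, but avoids underflow on long recordings.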
4.1. Database
A set of cry recordings from 5 normal, 6 asphyxiated and 6 hearing-impaired infants (17 infants in total) has been recorded and labelled by paediatricians.
4.2. Feature Extraction
Fundamental frequency f0: f0, or pitch, is the quasi-periodic vibration rate of the vocal folds in the larynx, f0 = 1 / T0, where T0 is the period of one vibration cycle.
Formants Fi: Formants (F1, F2, F3) represent the resonant frequencies of the vocal tract.
Spectral centroids SCi: A spectral centroid (e.g. SC2) indicates the dominant frequency of a given frequency sub-band and is calculated as the average frequency weighted by amplitudes.
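Two of these features can be sketched in a few lines. The example below estimates f0 = 1 / T0 from the autocorrelation peak and computes a sub-band spectral centroid as the amplitude-weighted mean frequency, i.e. the sum of f_k * A_k divided by the sum of A_k over the bins k inside the sub-band. The autocorrelation method, the search range, the sub-band edges and the function names are assumptions for illustration; the poster does not state how the features were actually computed. Formant estimation is not covered here.

```python
import math
from typing import Sequence


def estimate_f0(signal: Sequence[float], sample_rate: float,
                f_min: float, f_max: float) -> float:
    """Estimate f0 = 1 / T0 from the autocorrelation peak in [f_min, f_max]."""
    lag_min = int(sample_rate / f_max)
    lag_max = int(sample_rate / f_min)
    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Autocorrelation of the signal with itself shifted by `lag` samples.
        score = sum(signal[n] * signal[n + lag]
                    for n in range(len(signal) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    period_s = best_lag / sample_rate  # T0 in seconds
    return 1.0 / period_s              # f0 = 1 / T0


def spectral_centroid(freqs_hz: Sequence[float], amplitudes: Sequence[float],
                      band_lo_hz: float, band_hi_hz: float) -> float:
    """Amplitude-weighted mean frequency of bins with band_lo <= f < band_hi."""
    in_band = [(f, a) for f, a in zip(freqs_hz, amplitudes)
               if band_lo_hz <= f < band_hi_hz]
    total = sum(a for _, a in in_band)
    if total == 0:
        raise ValueError("no spectral energy in the requested sub-band")
    return sum(f * a for f, a in in_band) / total


# Synthetic 400 Hz tone sampled at 8 kHz (test signal, not a cry recording).
tone = [math.sin(2.0 * math.pi * 400.0 * n / 8000.0) for n in range(800)]
f0 = estimate_f0(tone, sample_rate=8000.0, f_min=250.0, f_max=1000.0)

# Hypothetical spectrum bins; only the first three fall in 1000-2500 Hz.
sc = spectral_centroid([1000.0, 1500.0, 2000.0, 3000.0],
                       [1.0, 2.0, 1.0, 5.0], 1000.0, 2500.0)
```

For the synthetic tone the period is 20 samples, so the estimate is 400 Hz; the centroid of the example sub-band is 1500 Hz.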
The CNS controls the sub-glottal (respiratory system), glottal (larynx) and supra-glottal (nasal and vocal tracts) muscle groups independently. It is assumed that pathologies affect the functioning of the CNS. Consequently, a malfunction in any of these muscle groups is directly reflected in the cry sound produced, so acoustic anomalies can be correlated with physiological pathologies.
3.2. Infant Vocalisation Modes
Infant cry signals present both voiced and unvoiced structures:
Voiced cry, when the vocal folds are vibrating.
o Phonation: vibration rate < 700 Hz
o Hyper-phonation: vibration rate > 700 Hz
Unvoiced cry, when the vocal folds are inactive.
o Dys-phonation
[Figure 2 segment labels, in order: Hyper-phonation, Dys-phonation, Phonation, Phonation, Phonation, Dys-phonation]
Figure 2: The three basic infant cry modes
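The three modes can be expressed as a simple labelling rule. This is a minimal sketch: the function name `cry_mode` is hypothetical, the voicing decision is assumed to be supplied by an earlier processing step, and assigning a rate of exactly 700 Hz to hyper-phonation is an assumption, since the boundary case is not specified above.

```python
def cry_mode(voiced: bool, vibration_rate_hz: float = 0.0) -> str:
    """Label a cry segment with one of the three basic cry modes."""
    if not voiced:
        # Vocal folds inactive: unvoiced cry.
        return "dys-phonation"
    # Assumption: exactly 700 Hz is treated as hyper-phonation.
    return "phonation" if vibration_rate_hz < 700.0 else "hyper-phonation"


labels = [cry_mode(True, 450.0), cry_mode(True, 1200.0), cry_mode(False)]
```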
Automatic Infant Cry Analysis ~ An Acoustic Approach ~
Author: Voon Hian Lee
Student ID: 3195964
Supervisor: Dr. Hadis M. Nosratighods
Assessor: Dr. Julien Epps
6. Conclusion and Future Work
The average accuracy attained is 94.1%. The results show that automatic classification of infant cry signals via acoustic analysis is feasible. In the future, we will investigate other acoustic features to identify other pathologies and to determine the causes of crying (e.g. pain, hunger and discomfort).