Higher Order Cepstral Moment Normalization (HOCMN) for Robust Speech Recognition
Speaker: Chang-wen Hsu
Advisor: Lin-shan Lee
2007/02/08
Outline
• Introduction
  • CMS/CMVN/HEQ
• Higher Order Cepstral Moment Normalization (HOCMN)
  • Even order HOCMN
  • Odd order HOCMN
  • Cascade system
  • Fundamental principles
• Experimental Results
• Conclusions
Introduction
• Feature normalization in the cepstral domain is widely used in robust speech recognition:
  • CMS: normalizes the first moment
  • CMVN: normalizes the first and second moments
  • Cepstrum Third-order Normalization (CTN): normalizes the first three moments (Electronics Letters, 1999)
  • HEQ: normalizes the full distribution (moments of all orders)
• How about normalizing only a few higher order moments?
  • Higher order moments are dominated more strongly by samples with large values
  • Normalizing only a few higher order moments may be good enough, while avoiding over-normalization
Introduction
• Cepstral Normalization
  • CMS: $X_{CMS}(n) = X(n) - E[X(n)]$
  • CMVN: $X_{CMVN}(n) = \frac{X(n) - E[X(n)]}{\sigma_X}$
[Figure labels: "Time", "progressively"]
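The two formulas above can be sketched as follows. This is a minimal illustration, assuming the expectation is taken as a time average over one utterance; the function names, the `eps` guard and the synthetic data are hypothetical and not from the slides.

```python
import numpy as np

def cms(features):
    """Cepstral mean subtraction: remove the per-coefficient time average."""
    return features - features.mean(axis=0, keepdims=True)

def cmvn(features, eps=1e-12):
    """Cepstral mean and variance normalization over one utterance."""
    centered = cms(features)
    std = centered.std(axis=0, keepdims=True)
    return centered / (std + eps)  # eps (assumed) guards against a zero std

rng = np.random.default_rng(0)
# Hypothetical utterance: 200 frames x 13 cepstral coefficients (C0~C12).
mfcc = 3.0 + 2.0 * rng.standard_normal((200, 13))
normalized = cmvn(mfcc)
```

After this step each coefficient track has zero mean and unit variance over the utterance.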
Introduction
• Histogram Equalization
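A rough sketch of histogram equalization for a single coefficient track is given here. The slide does not state the procedure, so the rank-based empirical CDF and the standard Gaussian reference distribution are assumptions; `heq_to_gaussian` is a hypothetical name.

```python
import numpy as np
from statistics import NormalDist

def heq_to_gaussian(x):
    """Map samples to N(0,1) by matching empirical CDF ranks to Gaussian quantiles."""
    x = np.asarray(x, dtype=float)
    n = x.size
    ranks = np.argsort(np.argsort(x))   # 0 .. n-1, smallest sample gets rank 0
    cdf = (ranks + 0.5) / n             # empirical CDF, strictly inside (0, 1)
    inv = NormalDist().inv_cdf          # inverse CDF of the assumed reference N(0,1)
    return np.array([inv(p) for p in cdf])

rng = np.random.default_rng(0)
skewed = rng.exponential(scale=1.0, size=2000)  # synthetic, strongly skewed track
equalized = heq_to_gaussian(skewed)
```

Because every sample is moved onto the reference distribution, moments of all orders are normalized at once, which is the "full distribution" behaviour contrasted with CMS and CMVN.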
Higher Order Cepstral Moment Normalization
• If the distribution of the cepstral coefficients can be assumed to be quasi-Gaussian:
  • Odd order moments can be normalized to zero
  • Even order moments can be normalized to some specific values
• Define notation:
  • X(n): a certain cepstral coefficient of the n-th frame
  • X[k](n): X(n) with the k-th moment normalized
  • X[k,l](n): X(n) with both the k-th and l-th moments normalized
  • X[k,l,m](n): X(n) with the k-th, l-th and m-th moments normalized
  • HOCMN[k,l,m]: an operator normalizing the k-th, l-th and m-th moments
• For example: CMVN = HOCMN[1,2]
Cepstral Moment Normalization
• Moment estimation: time average of MFCC parameters
• Purpose:
  • For odd order L: $E[X_{[L]}^{L}(n)] = 0$
  • For even order N: the N-th moment is normalized to a specific value
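Moment estimation by time averaging can be sketched as below. The helper name `moment` and the synthetic Gaussian track are illustrative assumptions.

```python
import numpy as np

def moment(x, k):
    """k-th moment about the origin, estimated as a time average over frames."""
    return np.mean(np.asarray(x, dtype=float) ** k)

rng = np.random.default_rng(1)
c1 = rng.standard_normal(50_000)  # stand-in for one cepstral coefficient track
m3 = moment(c1, 3)  # odd order: close to 0 for a Gaussian
m4 = moment(c1, 4)  # even order: close to 3 for N(0,1)
```

For a quasi-Gaussian track, the odd order estimates sit near zero and the even order estimates near fixed values, which is what the normalization targets rely on.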
Even order HOCMN
• Only the moment of a single even order N can be normalized, and CMS can always be performed in advance
• Therefore, the new feature coefficients can be expressed as
• Let the desired value of the N-th moment of the new feature coefficient be , that is
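One way to realize this step is sketched below. The slide's own equations did not survive in this transcript, so two things here are assumptions: that the new coefficient is a scaled version of the CMS output, and that the desired N-th moment defaults to that of N(0,1), which is (N-1)!!. All names are hypothetical.

```python
import numpy as np

def gaussian_even_moment(N):
    """N-th moment of N(0,1) for even N: (N-1)!! = 1*3*5*...*(N-1)."""
    return float(np.prod(np.arange(1, N, 2)))

def hocmn_even(x, N, target=None):
    """Sketch of HOCMN[1,N]: CMS followed by a scale that sets the N-th moment."""
    if N % 2 != 0:
        raise ValueError("N must be even")
    if target is None:
        target = gaussian_even_moment(N)  # assumed desired value
    centered = x - x.mean()               # CMS: first moment becomes zero
    a = (target / np.mean(centered ** N)) ** (1.0 / N)  # assumed scaling form
    return a * centered

rng = np.random.default_rng(2)
x = 5.0 + 0.3 * rng.standard_normal(4000)
y = hocmn_even(x, 4)
```

With N=2 and a target of 1 this reduces to ordinary CMVN, which matches the identity CMVN = HOCMN[1,2].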
Even order HOCMN
[Figure: Aurora 2, clean condition training, word accuracy averaged over 0~20dB and all types of noise (sets A,B,C); CMVN=HOCMN[1,2]]
Even order HOCMN
• Evaluation of the expectation value for the moments
  • Sample average over a reference interval
    • Full utterance
    • Moving window of l frames
[Figure: moving window of l frames over …… X(n-3) X(n-2) X(n-1) X(n) X(n+1) X(n+2) X(n+3) ……, with the frame to be normalized marked]
[Figure: word accuracy (Acc., axis from 80.40 to 82.40) versus window length l (60 to 120), curve labeled [1,100]]
• l=86 is best
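The moving-window estimate can be sketched with CMVN as the normalizer. This is an illustration only: centering the window on the frame being normalized and truncating it at the utterance edges are assumptions, and `windowed_cmvn` is a hypothetical name.

```python
import numpy as np

def windowed_cmvn(x, l=86, eps=1e-12):
    """Normalize each frame with mean/std from a window of about l frames around it."""
    x = np.asarray(x, dtype=float)
    half = l // 2
    out = np.empty_like(x)
    for n in range(x.size):
        lo, hi = max(0, n - half), min(x.size, n + half + 1)  # truncate at the edges
        seg = x[lo:hi]
        out[n] = (x[n] - seg.mean()) / (seg.std() + eps)
    return out

rng = np.random.default_rng(3)
# Hypothetical track whose offset drifts slowly over time.
track = rng.standard_normal(1000) + np.linspace(0.0, 5.0, 1000)
normalized = windowed_cmvn(track, l=86)
```

Unlike a full-utterance average, the local statistics follow the slow drift, so the drift is removed from the normalized track.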
Experimental results
[Figure: Aurora 2, clean condition training, word accuracy averaged over 0~20dB and all types of noise (sets A,B,C); legend: CMVN (l=86), CMVN (full-utterance)]
Odd order HOCMN (1/3)
• Besides the first moment (CMS), only one additional moment, of a single odd order L, can be normalized
• The L-th order HOCMN can be obtained from the (L-1)-th order HOCMN (L-1 is an even number, as discussed previously)
• Then, the new feature coefficients can be expressed as
• “a” and “c” are to be solved
Odd order HOCMN (2/3)
• To solve for “a” and “c”:
  • The first moment is set to zero
  • The L-th moment is set to zero
• After some mathematics and approximation
Odd order HOCMN (3/3)
• Because the formula for “a” above is only an approximation, a recursive solution can be obtained in about two iterations
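The recursive idea can be illustrated with a small sketch. The slide's expression for the new coefficients and its formula for “a” are not preserved in this transcript, so the quadratic form y = a·x² + x + c and the first-order estimate of “a” used below are assumptions chosen for illustration, not the authors' derivation. “c” is fixed by requiring a zero first moment.

```python
import numpy as np

def hocmn_odd(x, L=3, iterations=2):
    """Sketch of HOCMN[1,L] with an assumed quadratic correction y = a*x^2 + x + c."""
    if L % 2 != 1 or L < 3:
        raise ValueError("L must be an odd integer >= 3")
    y = np.asarray(x, dtype=float)
    y = y - y.mean()  # CMS first
    for _ in range(iterations):
        m2 = np.mean(y ** 2)
        # First-order (approximate) estimate of "a" that drives E[y^L] toward zero.
        denom = L * (np.mean(y ** (L + 1)) - m2 * np.mean(y ** (L - 1)))
        a = -np.mean(y ** L) / denom
        c = -a * m2  # keeps the first moment at zero
        y = a * y ** 2 + y + c
    return y

rng = np.random.default_rng(4)
z = rng.standard_normal(20_000)
x = z + 0.2 * (z ** 2 - 1.0)  # synthetic right-skewed track
y = hocmn_odd(x, L=3)
```

Because “a” is only approximate, a single pass leaves a residual L-th moment; repeating the update shrinks it further, which is the role of the iterations.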
Cascade system
• Cascading an odd order operator HOCMN[1,L] (L is an odd number) and an even order operator HOCMN[1,N] (N is an even number) yields an operator HOCMN[1,L,N]
Experimental results
[Figure: Aurora 2, clean condition training, word accuracy averaged over 0~20dB and all types of noise (sets A,B,C); legend: CMVN, CTN=HOCMN[1,2,3], CMVN (l=86)]
Skewness and Kurtosis
• Skewness
  • Third moment about the mean, normalized by the standard deviation
  • Measures the departure of the pdf from symmetry
    • Positive/negative values indicate skew to the right/left
    • Zero indicates symmetry
• Kurtosis
  • Fourth moment about the mean, normalized by the standard deviation
  • Peaked or “flat with tails of large size” as compared to the standard Gaussian
    • “3” is the fourth moment of N(0,1)
    • Positive/negative values indicate flatter/more peaked
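Both quantities can be computed directly from sample moments, as in this sketch. Subtracting 3 so that the Gaussian scores zero follows the remark about the fourth moment of N(0,1); the function names and the synthetic data are hypothetical.

```python
import numpy as np

def skewness(x):
    """Third central moment divided by the cube of the standard deviation."""
    centered = x - x.mean()
    return np.mean(centered ** 3) / x.std() ** 3

def excess_kurtosis(x):
    """Fourth central moment over std^4, minus 3 (the value for a Gaussian)."""
    centered = x - x.mean()
    return np.mean(centered ** 4) / x.std() ** 4 - 3.0

rng = np.random.default_rng(5)
gauss = rng.standard_normal(100_000)
right_skewed = rng.exponential(size=100_000)  # long tail to the right
```

The Gaussian sample gives values near zero for both measures, while the exponential sample gives clearly positive skewness and excess kurtosis.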
Skewness and Kurtosis
• The 1st moment is always normalized
• Define: generalized skewness of odd order L
  • $S_L = E[X^L]$, L: an odd integer
  • L is not necessarily 3
  • Similar meaning to skewness (skew to the right or left), except in the sense of the L-th moment
• Define: generalized kurtosis of even order N
  • N is not necessarily 4
  • Similar meaning to kurtosis (peaked or flat), except in the sense of the N-th moment
Skewness and Kurtosis
• Normalizing an odd order moment constrains the pdf to be symmetric about the origin
  • Except in the sense of the L-th moment
• Normalizing an even order moment constrains the pdf to be “equally flat with tails of equal size”
  • Except in the sense of the N-th moment
HOCMN with non-integer moment orders
• The orders of the normalized moments are not necessarily integers
• Generalized moments
  • Type 1:
    • Reduces to the odd order moment when u is an odd integer L (e.g., L=1 or 3)
  • Type 2:
    • Reduces to the even order moment when u is an even integer N (e.g., N=2 or 4)
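The two definitions themselves are missing from this transcript. One pair of forms that has the stated reduction properties is E[sign(X)·|X|^u] for Type 1 and E[|X|^u] for Type 2; the sketch below uses them purely as an assumption, with hypothetical function names.

```python
import numpy as np

def generalized_moment_type1(x, u):
    """Assumed Type 1 form: E[sign(X) * |X|^u]; equals E[X^u] for odd integer u."""
    return np.mean(np.sign(x) * np.abs(x) ** u)

def generalized_moment_type2(x, u):
    """Assumed Type 2 form: E[|X|^u]; equals E[X^u] for even integer u."""
    return np.mean(np.abs(x) ** u)

rng = np.random.default_rng(6)
x = rng.standard_normal(10_000)
g1 = generalized_moment_type1(x, 2.5)  # non-integer order, sign-preserving
g2 = generalized_moment_type2(x, 3.5)  # non-integer order, magnitude only
```

Taking the absolute value before raising to the power u keeps both quantities real for non-integer u, which an ordinary moment of a signed variable would not be.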
Experimental Setup
• Aurora 2 database
  • Training: clean condition training
  • Testing: sets A, B and C
  • Development: all from clean training data
• 39-dimensional feature coefficients
  • C0~C12 MFCC, Δ, Δ²
• Normalization performed on C0~C12
Experimental Results
• Higher order moments can yield more robust features
• Normalizing only three orders of moments is better than normalizing the full distribution
PDF Analysis
• HEQ
  • Overfits to a Gaussian
  • Loses the original statistics
• HOCMN
  • Fits the generalized skewness and kurtosis
  • Retains more of the nature of speech
[Figure panels: HEQ, HOCMN, Original; C0 & C1]
Distance Analysis
• Distance definition:
• HOCMN yields a smaller distance between clean and noisy speech
• The distance reduction follows a trend similar to that of the error rate reduction
Experimental Results
• Slight improvement for HOCMN with non-integer order moments
  • Especially for lower SNR values
• Other robust techniques can be combined with it
Experimental Results
• For multi-condition training:
  • HOCMN performs better than CMVN for all SNR values
  • Better than HEQ for higher SNR values
Conclusions
• We proposed a unified framework for higher order cepstral moment normalization
• Normalizing higher order moments gives more robust features
• The parameter set can be appropriately selected using a development set
• Skewness/kurtosis/distance analysis further illustrates the concepts behind the normalization techniques