Pitch Estimation and Tracking with Harmonic Emphasis on ... · Bayesian Pitch Estimator Motivation...

22
Aalborg Universitet Pitch Estimation and Tracking with Harmonic Emphasis On The Acoustic Spectrum Karimian-Azari, Sam; Mohammadiha, Nasser; Jensen, Jesper Rindom; Christensen, Mads Græsbøll Published in: I E E E International Conference on Acoustics, Speech and Signal Processing. Proceedings DOI (link to publication from Publisher): 10.1109/ICASSP.2015.7178788 Publication date: 2015 Document Version Early version, also known as pre-print Link to publication from Aalborg University Citation for published version (APA): Karimian-Azari, S., Mohammadiha, N., Jensen, J. R., & Christensen, M. G. (2015). Pitch Estimation and Tracking with Harmonic Emphasis On The Acoustic Spectrum. I E E E International Conference on Acoustics, Speech and Signal Processing. Proceedings, 4330-4334. https://doi.org/10.1109/ICASSP.2015.7178788 General rights Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights. ? Users may download and print one copy of any publication from the public portal for the purpose of private study or research. ? You may not further distribute the material or use it for any profit-making activity or commercial gain ? You may freely distribute the URL identifying the publication in the public portal ? Take down policy If you believe that this document breaches copyright please contact us at [email protected] providing details, and we will remove access to the work immediately and investigate your claim. Downloaded from vbn.aau.dk on: January 23, 2021

Transcript of Pitch Estimation and Tracking with Harmonic Emphasis on ... · Bayesian Pitch Estimator Motivation...

Page 1: Pitch Estimation and Tracking with Harmonic Emphasis on ... · Bayesian Pitch Estimator Motivation I The ML Estimators are statistically efficient, e.g., the non-linear least-squares

Aalborg Universitet

Pitch Estimation and Tracking with Harmonic Emphasis On The Acoustic Spectrum

Karimian-Azari, Sam; Mohammadiha, Nasser; Jensen, Jesper Rindom; Christensen, MadsGræsbøllPublished in:I E E E International Conference on Acoustics, Speech and Signal Processing. Proceedings

DOI (link to publication from Publisher):10.1109/ICASSP.2015.7178788

Publication date:2015

Document VersionEarly version, also known as pre-print

Link to publication from Aalborg University

Citation for published version (APA):Karimian-Azari, S., Mohammadiha, N., Jensen, J. R., & Christensen, M. G. (2015). Pitch Estimation andTracking with Harmonic Emphasis On The Acoustic Spectrum. I E E E International Conference on Acoustics,Speech and Signal Processing. Proceedings, 4330-4334. https://doi.org/10.1109/ICASSP.2015.7178788

General rightsCopyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright ownersand it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights.

? Users may download and print one copy of any publication from the public portal for the purpose of private study or research. ? You may not further distribute the material or use it for any profit-making activity or commercial gain ? You may freely distribute the URL identifying the publication in the public portal ?

Take down policyIf you believe that this document breaches copyright please contact us at [email protected] providing details, and we will remove access tothe work immediately and investigate your claim.

Downloaded from vbn.aau.dk on: January 23, 2021

Page 2: Pitch Estimation and Tracking with Harmonic Emphasis on ... · Bayesian Pitch Estimator Motivation I The ML Estimators are statistically efficient, e.g., the non-linear least-squares

Pitch Estimation and Tracking withHarmonic Emphasis on the Acoustic Spectrum

April 23, 2015

Sam Karimian-Azari1, Nasser Mohammadiha2,Jesper R. Jensen1, and Mads G. Christensen1

[email protected]

1Audio Analysis Lab, AD:MT, Aalborg University2University of Oldenburg

Page 3: Pitch Estimation and Tracking with Harmonic Emphasis on ... · Bayesian Pitch Estimator Motivation I The ML Estimators are statistically efficient, e.g., the non-linear least-squares

19

Pitch Estimation andTracking

Sam Karimian-Azariet al.

IntroductionHarmonic Signal Model

Noisy Signal Approx.

ML Pitch Estimation

Proposed MethodMotivation-Hypothesis

1- Discrete state-space:HMM

2- Continuous state-space:Kalman Filter

Numerical Results

Conclusion

Audio Analysis Lab, AD:MT,Aalborg University, Denmark

Agenda

I IntroductionI Noisy Harmonic Signal ApproximationI ML Pitch Estimate from UFE

I Bayesian MethodsI MotivationI HMMI Kalman Filter

I Numerical ResultsI Conclusion

Page 4: Pitch Estimation and Tracking with Harmonic Emphasis on ... · Bayesian Pitch Estimator Motivation I The ML Estimators are statistically efficient, e.g., the non-linear least-squares

19

Pitch Estimation andTracking

Sam Karimian-Azariet al.

Introduction2 Harmonic Signal Model

Noisy Signal Approx.

ML Pitch Estimation

Proposed MethodMotivation-Hypothesis

1- Discrete state-space:HMM

2- Continuous state-space:Kalman Filter

Numerical Results

Conclusion

Audio Analysis Lab, AD:MT,Aalborg University, Denmark

IntroductionHarmonic Signal Model

Page 5: Pitch Estimation and Tracking with Harmonic Emphasis on ... · Bayesian Pitch Estimator Motivation I The ML Estimators are statistically efficient, e.g., the non-linear least-squares

19

Pitch Estimation andTracking

Sam Karimian-Azariet al.

Introduction3 Harmonic Signal Model

Noisy Signal Approx.

ML Pitch Estimation

Proposed MethodMotivation-Hypothesis

1- Discrete state-space:HMM

2- Continuous state-space:Kalman Filter

Numerical Results

Conclusion

Audio Analysis Lab, AD:MT,Aalborg University, Denmark

IntroductionHarmonic Signal Model

Page 6: Pitch Estimation and Tracking with Harmonic Emphasis on ... · Bayesian Pitch Estimator Motivation I The ML Estimators are statistically efficient, e.g., the non-linear least-squares

19

Pitch Estimation andTracking

Sam Karimian-Azariet al.

Introduction4 Harmonic Signal Model

Noisy Signal Approx.

ML Pitch Estimation

Proposed MethodMotivation-Hypothesis

1- Discrete state-space:HMM

2- Continuous state-space:Kalman Filter

Numerical Results

Conclusion

Audio Analysis Lab, AD:MT,Aalborg University, Denmark

IntroductionHarmonic Signal Model

Page 7: Pitch Estimation and Tracking with Harmonic Emphasis on ... · Bayesian Pitch Estimator Motivation I The ML Estimators are statistically efficient, e.g., the non-linear least-squares

19

Pitch Estimation andTracking

Sam Karimian-Azariet al.

Introduction5 Harmonic Signal Model

Noisy Signal Approx.

ML Pitch Estimation

Proposed MethodMotivation-Hypothesis

1- Discrete state-space:HMM

2- Continuous state-space:Kalman Filter

Numerical Results

Conclusion

Audio Analysis Lab, AD:MT,Aalborg University, Denmark

IntroductionHarmonic Signal Model

Harmonic Signal Model:

s(n) =

L(n)∑l=1

αl e j (ωl (n) n+ϕl ), (1)

where ωl (n) = lω0(n) for l = 1, . . . ,L(n),

L(n) : number of sinusoidsαl : real magnitudesω0 : fundamental frequencyϕl : phases of harmonics

50 100 150 200 250−2

0

2

4

6

n

s(n)

0 π/4 π/2 3π/4 π0

0.5

1

ω

|S(ω

)|

Page 8: Pitch Estimation and Tracking with Harmonic Emphasis on ... · Bayesian Pitch Estimator Motivation I The ML Estimators are statistically efficient, e.g., the non-linear least-squares

19

Pitch Estimation andTracking

Sam Karimian-Azariet al.

IntroductionHarmonic Signal Model

6 Noisy Signal Approx.

ML Pitch Estimation

Proposed MethodMotivation-Hypothesis

1- Discrete state-space:HMM

2- Continuous state-space:Kalman Filter

Numerical Results

Conclusion

Audio Analysis Lab, AD:MT,Aalborg University, Denmark

Signal ModelAdditive noise

The observed signal can be written as a sum of a desiredsignal s(n) and a noise signal v(n), i.e.,

x(n) =s(n) + v(n) (2)

=L∑

l=1

αl e j (ωl n+ϕl ) + v(n).

Page 9: Pitch Estimation and Tracking with Harmonic Emphasis on ... · Bayesian Pitch Estimator Motivation I The ML Estimators are statistically efficient, e.g., the non-linear least-squares

19

Pitch Estimation andTracking

Sam Karimian-Azariet al.

IntroductionHarmonic Signal Model

7 Noisy Signal Approx.

ML Pitch Estimation

Proposed MethodMotivation-Hypothesis

1- Discrete state-space:HMM

2- Continuous state-space:Kalman Filter

Numerical Results

Conclusion

Audio Analysis Lab, AD:MT,Aalborg University, Denmark

Signal ModelPhase noise

At a high narrowband SNR, the harmonic frequency ωl isperturbed with a real-valued phase-noise [S.Tretter 1985],which has a normal distribution with zero mean and thevariance

E∆ω2l (n) =

σ2

2α2l

(3)

We can approximate x(n) =∑L

l=1 αl e j (ωl n+ϕl ) + v(n) like

x(n) ≈L∑

l=1

αl e j (ωl n+∆ωl (n)+ϕl ) (4)

Page 10: Pitch Estimation and Tracking with Harmonic Emphasis on ... · Bayesian Pitch Estimator Motivation I The ML Estimators are statistically efficient, e.g., the non-linear least-squares

19

Pitch Estimation andTracking

Sam Karimian-Azariet al.

IntroductionHarmonic Signal Model

8 Noisy Signal Approx.

ML Pitch Estimation

Proposed MethodMotivation-Hypothesis

1- Discrete state-space:HMM

2- Continuous state-space:Kalman Filter

Numerical Results

Conclusion

Audio Analysis Lab, AD:MT,Aalborg University, Denmark

Signal ModelUnconstrained frequency estimates (UFE)

Unconstrained frequency estimates (UFE) of the constrainedfrequencies:

Ω(n) =[ω1(n), ω2(n), . . . , ωL(n)

]T (5)= dL(n)ω0(n) + ∆Ω(n), (6)

where

dL(n) =[

1,2, . . . , L(n)]T (7)

∆Ω(n), =[

∆ω1(n), ∆ω2(n), . . . , ∆ωL(n)]T, (8)

and

R∆Ω(n) = E∆Ω(n)∆ΩT(n) (9)

=σ2

2diag

1α2

1,

1α2

2, . . . ,

1α2

L

.

Page 11: Pitch Estimation and Tracking with Harmonic Emphasis on ... · Bayesian Pitch Estimator Motivation I The ML Estimators are statistically efficient, e.g., the non-linear least-squares

19

Pitch Estimation andTracking

Sam Karimian-Azariet al.

IntroductionHarmonic Signal Model

Noisy Signal Approx.

9 ML Pitch Estimation

Proposed MethodMotivation-Hypothesis

1- Discrete state-space:HMM

2- Continuous state-space:Kalman Filter

Numerical Results

Conclusion

Audio Analysis Lab, AD:MT,Aalborg University, Denmark

Max. Likelihood (ML) Pitch Estimator

For the time-frame x(n) =[

x(n), x(n−1), . . . , x(n−M−1)]T ,

the PDF of the UFE is

P(Ω(n)|ω0(n)) ∼ N (dL(n)ω0(n),R∆Ω(n)). (10)

The ML pitch estimator:

ω0(n) = arg maxω0(n)

log P(Ω(n)|ω0(n)) (11)

=[

dTL (n)R−1

∆Ω(n)dL(n)]−1

dTL (n)R−1

∆Ω(n)Ω(n) (12)

Page 12: Pitch Estimation and Tracking with Harmonic Emphasis on ... · Bayesian Pitch Estimator Motivation I The ML Estimators are statistically efficient, e.g., the non-linear least-squares

19

Pitch Estimation andTracking

Sam Karimian-Azariet al.

IntroductionHarmonic Signal Model

Noisy Signal Approx.

ML Pitch Estimation

Proposed Method10 Motivation-Hypothesis

1- Discrete state-space:HMM

2- Continuous state-space:Kalman Filter

Numerical Results

Conclusion

Audio Analysis Lab, AD:MT,Aalborg University, Denmark

Bayesian Pitch EstimatorMotivation

I The ML Estimators are statistically efficient, e.g., thenon-linear least-squares (NLS), and the weighted leastsquares (WLS) [H.Li, et al. 2000], but the minimumvariance is limited by the number of samples.

I Consecutive pitch values are estimated independently

Page 13: Pitch Estimation and Tracking with Harmonic Emphasis on ... · Bayesian Pitch Estimator Motivation I The ML Estimators are statistically efficient, e.g., the non-linear least-squares

19

Pitch Estimation andTracking

Sam Karimian-Azariet al.

IntroductionHarmonic Signal Model

Noisy Signal Approx.

ML Pitch Estimation

Proposed Method11 Motivation-Hypothesis

1- Discrete state-space:HMM

2- Continuous state-space:Kalman Filter

Numerical Results

Conclusion

Audio Analysis Lab, AD:MT,Aalborg University, Denmark

Bayesian Pitch EstimatorMotivation

I Pitch values are usually correlated in a sequence, i.e.,

P(ω0(n)|ω0(n−1), ω0(n−2), · · · ), (13)

that motivate Bayesian methods to minimize an errorincorporating prior distributions.

I State-of-the-art methods mostly track pitch estimates in asequential process without concerning noise statistics.

Page 14: Pitch Estimation and Tracking with Harmonic Emphasis on ... · Bayesian Pitch Estimator Motivation I The ML Estimators are statistically efficient, e.g., the non-linear least-squares

19

Pitch Estimation andTracking

Sam Karimian-Azariet al.

IntroductionHarmonic Signal Model

Noisy Signal Approx.

ML Pitch Estimation

Proposed Method12 Motivation-Hypothesis

1- Discrete state-space:HMM

2- Continuous state-space:Kalman Filter

Numerical Results

Conclusion

Audio Analysis Lab, AD:MT,Aalborg University, Denmark

Bayesian Pitch EstimatorHypothesis

1- Jointly estimate and track pitch incorporating both theharmonic constraints and noise characteristics.2- Estimate the state ω0(n) through a series of noisyobservations:

P(ω0(n)|Ω(n), Ω(n−1), · · · ) (14)

3- Recursively update the prior distribution of the pitch value.

Page 15: Pitch Estimation and Tracking with Harmonic Emphasis on ... · Bayesian Pitch Estimator Motivation I The ML Estimators are statistically efficient, e.g., the non-linear least-squares

19

Pitch Estimation andTracking

Sam Karimian-Azariet al.

IntroductionHarmonic Signal Model

Noisy Signal Approx.

ML Pitch Estimation

Proposed MethodMotivation-Hypothesis

13 1- Discrete state-space:HMM

2- Continuous state-space:Kalman Filter

Numerical Results

Conclusion

Audio Analysis Lab, AD:MT,Aalborg University, Denmark

Bayesian Pitch EstimatorDiscrete state-space (HMM)

ω0(n) : Discrete random variable (Hidden states)

P(ω0(n)|ω0(n−1)) : Transition probability in a 1st-order Markov model,

i.e.,∑ω0(n)

P(ω0(n)|ω0(n−1)) = 1

ω0(n) = arg maxω0(n)

log P(ω0(n)|Ω(n), Ω(n−1), · · · ) (15)

= arg maxω0(n)

log P(Ω(n)|ω0(n)) + log P(ω0(n)|Ω(n−1), · · · ).

The priori distribution is defined recursively like

P(ω0(n)|Ω(n−1), Ω(n−2), · · · ) = (16)∑ω0(n−1)

P(ω0(n)|ω0(n−1))P(ω0(n−1)|Ω(n−1), · · · ),

where P(ω0(n−1)|Ω(n−1), · · · ) is the past estimate.

Page 16: Pitch Estimation and Tracking with Harmonic Emphasis on ... · Bayesian Pitch Estimator Motivation I The ML Estimators are statistically efficient, e.g., the non-linear least-squares

19

Pitch Estimation andTracking

Sam Karimian-Azariet al.

IntroductionHarmonic Signal Model

Noisy Signal Approx.

ML Pitch Estimation

Proposed MethodMotivation-Hypothesis

1- Discrete state-space:HMM

14 2- Continuous state-space:Kalman Filter

Numerical Results

Conclusion

Audio Analysis Lab, AD:MT,Aalborg University, Denmark

Bayesian Pitch Estimatorstate-space representation of the pitch continuity

Continuous state-space:

ω0(n) = ω0(n−1) + δ(n)

Ω(n) = dL(n)ω0(n) + ∆Ω(n),

where δ(n) ∼ N (0, σ2t ) and ∆Ω(n) ∼ N (0,R∆Ω(n)) are the

state evolution and observation noise, respectively.

Page 17: Pitch Estimation and Tracking with Harmonic Emphasis on ... · Bayesian Pitch Estimator Motivation I The ML Estimators are statistically efficient, e.g., the non-linear least-squares

19

Pitch Estimation andTracking

Sam Karimian-Azariet al.

IntroductionHarmonic Signal Model

Noisy Signal Approx.

ML Pitch Estimation

Proposed MethodMotivation-Hypothesis

1- Discrete state-space:HMM

15 2- Continuous state-space:Kalman Filter

Numerical Results

Conclusion

Audio Analysis Lab, AD:MT,Aalborg University, Denmark

Bayesian Pitch EstimatorContinuous state-space (Kalman filter)

First, a pitch estimate is predicted using the past estimates as

ω0(n|n−1) = ω0(n−1|n−1) (17)

with the variance

σ2K (n|n−1) = σ2

K (n−1|n−1) + σ2t . (18)

Second, the pitch estimate is updated with the error of

e(n) = Ω(n)− dL(n) ω0(n|n−1). (19)

Then, the predicted estimate is updated:

ω0(n|n) = ω0(n|n−1) + hK(n)e(n) (20)

hK(n) = σ2K (n|n−1)dT

L (n)[

ΠL(n)σ2K (n|n−1) + R∆Ω(n)

]−1, (21)

where ΠL(n) = dL(n)dTL (n), and update

σ2K (n|n) =

[1− hK(n)dL(n)

]σ2

K (n|n−1). (22)

Page 18: Pitch Estimation and Tracking with Harmonic Emphasis on ... · Bayesian Pitch Estimator Motivation I The ML Estimators are statistically efficient, e.g., the non-linear least-squares

19

Pitch Estimation andTracking

Sam Karimian-Azariet al.

IntroductionHarmonic Signal Model

Noisy Signal Approx.

ML Pitch Estimation

Proposed MethodMotivation-Hypothesis

1- Discrete state-space:HMM

16 2- Continuous state-space:Kalman Filter

Numerical Results

Conclusion

Audio Analysis Lab, AD:MT,Aalborg University, Denmark

Covariance MatrixEstimation

The ML estimator of the covariance matrix among N estimates:

R∆Ω(n) = E∆Ω(n)∆ΩT(n)

=1N

n∑i=n−N+1

∆Ω(i)∆ΩT(i), (23)

where ∆Ω(n) = Ω(n)− µ(n), and µ(n) = EΩ(n).

Exponential moving average:

µ(n) = λ Ω(n) + (1−λ) µ(n−1) (24)

The forgetting factor 0 < λ < 1 recursively updates thetime-varying mean value.

Page 19: Pitch Estimation and Tracking with Harmonic Emphasis on ... · Bayesian Pitch Estimator Motivation I The ML Estimators are statistically efficient, e.g., the non-linear least-squares

19

Pitch Estimation andTracking

Sam Karimian-Azariet al.

IntroductionHarmonic Signal Model

Noisy Signal Approx.

ML Pitch Estimation

Proposed MethodMotivation-Hypothesis

1- Discrete state-space:HMM

2- Continuous state-space:Kalman Filter

17 Numerical Results

Conclusion

Audio Analysis Lab, AD:MT,Aalborg University, Denmark

Numerical ResultsSynthetic signal

A linear chirp signal (r = 100 Hz/s) with L = 5 harmonics,random phases, and identical amplitudes during 0.1 s.

0 2 4 6 8 10 12 14 16 18 20

10−7

10−6

10−5

10−4

SNR [ dB ]

MSE

ω1

ω0,WLSω0,MLω0,HMMω0,KF

M = 80, ω0(1) = 400π/fs , fs = 8.0 kHz, σt =√

2πr/f2s , and for the HMM-based pitch estimator, the frequency rangeω ∈ [150, 280] × (2π/fs ) was discretized into Nd = 1000 samples.

Page 20: Pitch Estimation and Tracking with Harmonic Emphasis on ... · Bayesian Pitch Estimator Motivation I The ML Estimators are statistically efficient, e.g., the non-linear least-squares

19

Pitch Estimation andTracking

Sam Karimian-Azariet al.

IntroductionHarmonic Signal Model

Noisy Signal Approx.

ML Pitch Estimation

Proposed MethodMotivation-Hypothesis

1- Discrete state-space:HMM

2- Continuous state-space:Kalman Filter

18 Numerical Results

Conclusion

Audio Analysis Lab, AD:MT,Aalborg University, Denmark

Numerical ResultsReal signal

Speech signal + Car noise at SNR= 5 dB.Freq.[Hz]

0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10

500

1000

1500

2000

2500

3000

3500

4000

Time [ s ]

Freq.[Hz]

0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1100

150

200

250

300

350

400

WLSMLKFHMM

The MAP order estimation [Djuric 1998], M = 240, λ = 0.9, and N = 150.

Page 21: Pitch Estimation and Tracking with Harmonic Emphasis on ... · Bayesian Pitch Estimator Motivation I The ML Estimators are statistically efficient, e.g., the non-linear least-squares

19

Pitch Estimation andTracking

Sam Karimian-Azariet al.

IntroductionHarmonic Signal Model

Noisy Signal Approx.

ML Pitch Estimation

Proposed MethodMotivation-Hypothesis

1- Discrete state-space:HMM

2- Continuous state-space:Kalman Filter

Numerical Results

19 Conclusion

Audio Analysis Lab, AD:MT,Aalborg University, Denmark

Conclusion

I For pitch estimation, we have formulated the ML estimatefrom the UFE.

I For pitch estimation and tracking, we have proposedHMM- and KF-based methods.

I Experimental results showed that both HMM- andKF-based methods outperform the corresponding ML pitchestimators.

I The KF-based method statistically performs better thanthe HMM-based method, while the it tracks pitch changesmore accurate than the KF-based method.

Page 22: Pitch Estimation and Tracking with Harmonic Emphasis on ... · Bayesian Pitch Estimator Motivation I The ML Estimators are statistically efficient, e.g., the non-linear least-squares

Thank you!