Spectral Estimation & Examples of Signal Analysis · 2017-07-14
Spectral Estimation & Examples of Signal Analysis
Examples from the research of Kyoung Hoon Lee, Aaron Hastings, Don Gallant,
Shashikant More, and Weonchan Sung (Herrick graduate students)
Estimation: Bias, Variance and Mean Square Error
Let φ denote the quantity that we are trying to estimate, and let φ̂ denote the result of an estimation based on one data set with N pieces of information. Each data set used for estimation yields a different estimate of φ.

Bias: true value minus the average of all possible estimates:

b(φ̂) = φ − E[φ̂]

Variance: a measure of the spread of the estimates about the mean of all estimates:

σ² = E[ (φ̂ − E[φ̂])² ]

Mean square error:

m.s.e. = E[ (φ̂ − φ)² ] = b² + σ²
Estimation: Some definitions
Estimate is consistent if, when we use more data to form the estimate, the mean square error is reduced.
If we have two ways of estimating the same thing, we say that the estimator that leads to the smaller mean square error is more efficient than the other estimator.
[Figure: scatter of estimates φ̂ = (a, b) in the (a, b) plane, showing the individual estimates, the mean of all estimates, the true value, and the bias between the true value and the mean of the estimates.]
Examples
Bias and variance of an estimate of the mean:
Estimate of the mean:

μ̂ = (1/N) Σₙ₌₁ᴺ Xₙ

E[μ̂] = E[ (1/N) Σₙ₌₁ᴺ Xₙ ] = (1/N) Σₙ₌₁ᴺ E[Xₙ] = (1/N) Σₙ₌₁ᴺ μ = μ (unbiased)
σ²_μ̂ = E[ (μ̂ − E[μ̂])² ] = E[ ( (1/N) Σₙ₌₁ᴺ Xₙ − μ )² ] = E[ ( (1/N) Σₙ₌₁ᴺ (Xₙ − μ) )² ]

= (1/N²) E[ Σₘ₌₁ᴺ (Xₘ − μ) · Σₙ₌₁ᴺ (Xₙ − μ) ]

= (1/N²) { (N² − N) E[ (Xₙ − μ)(Xₘ − μ) ] + N E[ (Xₙ − μ)² ] }

= (1/N²) { (N² − N) E[ Xₙ − μ ] E[ Xₘ − μ ] + N E[ (Xₙ − μ)² ] }

= (1/N²) N E[ (Xₙ − μ)² ] = σ²ₓ / N
The derivation assumes that the samples Xₙ are independent of one another. The double sum is separated into terms where n ≠ m and terms where n = m.
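The result above can be checked numerically. A minimal numpy sketch (the distribution, sample size, and trial count are illustrative choices, not from the slides): draw many independent data sets, estimate the mean of each, and compare the spread of the estimates with σ²ₓ/N.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, N, trials = 2.0, 50, 20000

# Draw many independent data sets of N samples and estimate the mean of each.
estimates = rng.normal(0.0, sigma, size=(trials, N)).mean(axis=1)

bias = estimates.mean() - 0.0   # true mean is 0, so this should be ~0 (unbiased)
var = estimates.var()           # spread of the estimates about their mean
print(bias, var, sigma**2 / N)  # var should be close to sigma^2 / N = 0.08
```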
Examples
Biased estimate of the variance of a set of N measurements:

(1/N) Σₙ₌₁ᴺ (Xₙ − μ̂)²

Here we first estimate the mean and then use that estimate in the calculation (one degree of freedom has been lost).

Unbiased estimates of the variance of a set of N measurements:

1/(N−1) Σₙ₌₁ᴺ (Xₙ − μ̂)²   and   (1/N) Σₙ₌₁ᴺ (Xₙ − μ)²

The second form applies in the special case where the mean is known and does not need to be estimated from the data.
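The bias of the 1/N form can be seen by simulation. A short numpy sketch (sample size and trial count are arbitrary illustrative values): with N = 5 the 1/N estimator averages to (N−1)/N · σ² = 0.8 σ², while the 1/(N−1) form and the known-mean form both average to σ².

```python
import numpy as np

rng = np.random.default_rng(1)
sigma, N, trials = 1.0, 5, 100000

data = rng.normal(0.0, sigma, size=(trials, N))
biased = data.var(axis=1, ddof=0).mean()    # divide by N, mean estimated -> biased
unbiased = data.var(axis=1, ddof=1).mean()  # divide by N-1 -> unbiased
known_mean = (data**2).mean()               # mean known (= 0): divide by N is fine

# E[biased] = (N-1)/N * sigma^2 = 0.8 here; the other two average to sigma^2 = 1.
print(biased, unbiased, known_mean)
```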
Estimation of Autocovariance functions
Two methods of estimating Rxx(τ) from T seconds of data:
1. Dividing by the integration time, T − |τ|: the estimate is unbiased but has very high variance, particularly when τ is close to T.
2. Dividing by the total time, T: the estimate is biased (asymptotically unbiased). This is equivalent to multiplying the first estimate by a triangular window, (T − |τ|)/T, which attenuates the high-variance estimates.
[Figure: x(t) over T seconds, showing the lag τ between x(t) and x(t+τ). Calculating the average value of x(t)·x(t+τ) from T seconds of data.]
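The two normalizations can be compared directly. A minimal numpy sketch (white noise at unit rate, so "T seconds" is T samples; all parameter values are illustrative): the divide-by-T estimate equals the divide-by-(T−|τ|) estimate multiplied by the triangular window.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 2000                        # number of samples ("T seconds" at unit rate)
x = rng.normal(size=T)          # zero-mean white noise: Rxx(0)=1, Rxx(tau!=0)=0

def autocov(x, lag, divide_by_total=True):
    """Average of x(t)*x(t+lag) over the available overlap."""
    s = np.sum(x[:len(x) - lag] * x[lag:])
    return s / len(x) if divide_by_total else s / (len(x) - lag)

r_unbiased = autocov(x, 5, divide_by_total=False)  # method 1: divide by T-|tau|
r_biased = autocov(x, 5, divide_by_total=True)     # method 2: divide by T
# Method 2 equals method 1 times the triangular window (T-|tau|)/T.
print(r_unbiased, r_biased)
```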
Estimation of Cross Covariance
Same issues as for autocovariance: the bigger τ is, the less averaging is possible for finite T.
[Figure: x(t) and y(t) over T seconds, showing the lag τ between x(t) and y(t+τ).]

x(t) and y(t) are zero-mean, weakly stationary random processes; we estimate the average value of x(t)·y(t+τ). Additional problem: T must be made large enough to accommodate system delays.
Estimation of Covariance
With fast computation of spectra, covariance functions are now more usually estimated by inverse Fourier transforming the power and cross spectral density estimates. The inverse transform of a raw PSD or CSD estimate is equivalent to Method 2 for calculating covariance functions, i.e., a triangular window is applied to data segments of length Tr.
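The equivalence can be demonstrated with numpy (a sketch, not the slides' Matlab workflow; the zero-padding to 2N is needed to make the FFT correlation linear rather than circular): the inverse FFT of a raw PSD reproduces the divide-by-total-time autocovariance estimate.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 1024
x = rng.normal(size=N)

# Direct biased estimate (Method 2: divide by the total length N).
lags = np.arange(64)
direct = np.array([np.dot(x[:N - k], x[k:]) / N for k in lags])

# Via the raw PSD: zero-pad to avoid circular wrap-around, then inverse FFT.
X = np.fft.fft(x, 2 * N)
via_psd = np.fft.ifft(np.abs(X)**2).real[:64] / N

print(np.max(np.abs(direct - via_psd)))  # agree to numerical precision
```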
Power Spectral Density Estimation

Definition:

Sxx(f) = lim_{T→∞} E[ X*T XT / T ] = ∫₋∞⁺∞ Rxx(τ) e^{−j2πfτ} dτ.

Estimation:
1. Could Fourier transform the autocorrelation function estimate (not computationally efficient).
2. Could use the frequency domain definition directly.

Raw estimate:

Ŝxx(f) = X*T XT / T

No averaging! Extremely poor variance characteristics: the variance is S²xx(f) and is unaffected by T, the length of data used.
Power Spectral Density Estimation (Continued)
Smoothed estimate from segment averaging.
1. Break the signal up into Nseg segments, each Tr seconds long.
2. For each segment:
   1. Apply a window to smooth the transitions at the ends of the segment.
   2. Fourier transform the windowed segment → XTr(f).
   3. Calculate a raw power spectral density estimate: |XTr(f)|² / Tr.
3. Average the results from each segment to get the smoothed estimate, applying a power compensation for the window used:

S̃xx(f) = (1/NSEG) Σᵢ₌₁ᴺˢᴱᴳ Pcomp Ŝxxᵢ(f),   Pcomp = 1 / ( (1/Tr) ∫ w²(t) dt )

[Figure: x(t) divided into segments of length Tr, each multiplied by a window w(t).]
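The segment-averaging procedure can be sketched in a few lines of numpy (a hand-rolled illustration of the steps above, not the Matlab `pwelch` mentioned later; sampling rate, segment length, and segment count are arbitrary choices). For unit-variance white noise sampled at fs, the two-sided PSD should come out flat at 1/fs.

```python
import numpy as np

rng = np.random.default_rng(4)
fs, Tr_samples, nseg = 1000, 256, 40
x = rng.normal(size=Tr_samples * nseg)   # unit-variance white noise

w = np.hanning(Tr_samples)               # window each segment
pcomp = 1.0 / np.mean(w**2)              # power compensation for the window

segs = x.reshape(nseg, Tr_samples) * w
X = np.fft.rfft(segs, axis=1)
raw = (np.abs(X)**2) / (Tr_samples * fs)  # raw PSD estimate per segment
psd = pcomp * raw.mean(axis=0)            # average raw estimates, compensate window

# White noise with variance 1 sampled at fs has a flat two-sided PSD of 1/fs.
print(psd[5:-5].mean(), 1 / fs)
```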
Power Spectral Density Estimation (Continued)
Smoothed estimate from segment averaging.
[Figure: x(t) divided into overlapping windowed segments of length Tr.]

Overlap: for some windows, segment overlap makes sense. With a Hann window and 50% overlap, data de-emphasized in one windowed segment is strongly emphasized in the next window (and vice versa).

Bias: note that the PSD estimate bias is controlled by the window length (Tr), which controls the frequency resolution (1/Tr). A larger window and smoother transitions → less power leakage → less bias.
Power Spectral Density (PSD) Estimation (Continued)
We argued that the distribution of the smoothed PSD is related to that of a chi-squared random variable (χ²ᵥ) with ν = 2·NSEG degrees of freedom, provided Tr is large enough that bias errors can be ignored. Therefore we can control the variance by averaging more segments. Note that shorter segments mean larger bias, so for a fixed T seconds of data there is a trade-off between segment length (Tr), which controls the bias, and number of segments (NSEG), which controls the variance: T = Tr·NSEG.
Variance[ 2·Nseg·S̃xx / Sxx ] = (4·N²seg / S²xx)·Variance[ S̃xx ] = 2·(2·Nseg)

and rearranging:

Variance[ S̃xx ] = S²xx / Nseg
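The 1/Nseg scaling of the variance is easy to verify numerically. A numpy sketch (white noise, so the true PSD is flat and the spread across frequency bins stands in for the estimator variance; all sizes are illustrative): quadrupling Nseg should cut the variance by roughly a factor of four.

```python
import numpy as np

rng = np.random.default_rng(5)
Tr, fs = 512, 1.0

def smoothed_psd(nseg):
    x = rng.normal(size=(nseg, Tr))            # independent white-noise segments
    X = np.fft.rfft(x, axis=1)
    return (np.abs(X)**2 / (Tr * fs)).mean(axis=0)

# Relative variance of the smoothed estimate at interior bins, for two Nseg values.
v10 = smoothed_psd(10)[1:-1].var()
v40 = smoothed_psd(40)[1:-1].var()
print(v10 / v40)   # roughly 4: variance scales as 1/Nseg
```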
Cross Spectral Density (CSD)

Definition:

Sxy(f) = lim_{T→∞} E[ X*T YT / T ] = ∫₋∞⁺∞ Rxy(τ) e^{−j2πfτ} dτ.

Estimation: could Fourier transform the cross-correlation function estimate (not computationally efficient), or use the frequency domain definition directly.

Raw estimate:

Ŝxy(f) = X*T YT / T

As with the PSD, this has extremely poor variance characteristics, so:
– divide the time histories into segments,
– generate a raw estimate from each segment, and
– average to reduce variance and produce a smoothed estimate.
Cross Spectral Density Estimation: Segment Averaging
[Figure: x(t) and y(t) divided into windowed segments of length Tr.]

Fourier transform of the windowed segments → XTr(f) and YTr(f).

Raw estimate from the i-th segment: Ŝxyᵢ(f) = X*Tr(f) YTr(f) / Tr

Smoothed estimate: S̃xy(f) = (1/Nseg) Σᵢ₌₁ᴺˢᵉᵍ Ŝxyᵢ(f)
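The same segment-averaging idea in numpy (a sketch with an artificial system: y is x through a gain of 2 plus uncorrelated noise; all values are illustrative). Averaging the raw cross spectra makes the uncorrelated noise term die away, leaving Sxy ≈ 2·Sxx.

```python
import numpy as np

rng = np.random.default_rng(6)
Tr, nseg, fs = 256, 100, 1.0
x = rng.normal(size=(nseg, Tr))
y = 2.0 * x + 0.5 * rng.normal(size=(nseg, Tr))  # y = 2x plus uncorrelated noise

X = np.fft.rfft(x, axis=1)
Y = np.fft.rfft(y, axis=1)
raw = np.conj(X) * Y / (Tr * fs)    # raw CSD estimate from each segment
Sxy = raw.mean(axis=0)              # smoothed estimate: average over segments

# Uncorrelated noise averages out of the cross spectrum: |Sxy| -> 2 * Sxx = 2.
print(np.abs(Sxy[1:-1]).mean())
```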
Issues with Cross Spectral Density Estimates
1. Reduce bias by choosing the segment length (Tr) as large as possible. (Bias greatest where the phase changes rapidly.)
2. Reduce variance by averaging many segments.
3. Might require a large amount of averaging to reduce noise effects.
4. Time delays between x and y cause problems if the time delay (t0) is greater than a small fraction of the segment length (Tr). Can estimate t0 and offset the y segments, but this requires T + t0 seconds of data.
ym(t) = y(t) + n(t) = h(t) ∗ x(t) + n(t), where x(t) and n(t) are zero-mean, weakly stationary, uncorrelated random processes.

SNRym = Syy / Snyny = |H(f)|² Sxx / Snyny

S̃xy ≈ H(f) S̃xx + S̃xn → H(f) S̃xx
Cross Spectral Density Estimation: Segment Averaging with System Delays
[Figure: x(t) and y(t) divided into windowed segments of length Tr, with the y segments offset by the estimated delay.]

Fourier transform of the windowed segments → XTr(f) and YTr(f).

Offsetting the y segments essentially removes most of the delay from the estimated frequency response function. The delay effects can be put back in by multiplying the estimate of H(f) by e^{−j2πf t̂0}, where t̂0 is the estimated delay.
Coherence Function Estimation: Substitute in Smoothed Estimates of Spectral Densities
Definition: γ²xy(f) = |Sxy(f)|² / (Sxx(f)·Syy(f)).   Estimate: γ̃²xy(f) = |S̃xy(f)|² / (S̃xx(f)·S̃yy(f)).

Coherence takes values in the range 0 to 1.
– Substituting raw spectral density estimates into the formula yields exactly 1 at every frequency. A result where the coherence = 1 at all frequencies from measured signals should therefore be treated with a high degree of suspicion.
– The estimate is highly sensitive to bias in the spectral density estimates, which is particularly bad where the phase of the cross spectral density changes rapidly (at maxima and minima in |Sxy|).
– Coherence → 0 because of: noise; nonlinearity; bias errors in estimation; a very weak linear relationship between the signals.
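The raw-estimate pitfall is worth seeing once. A numpy sketch (y is x plus equal-power uncorrelated noise, so the true coherence is 0.5; all sizes are illustrative): with a single segment the estimated coherence is identically 1, while the smoothed estimate recovers the true value.

```python
import numpy as np

rng = np.random.default_rng(7)
Tr, nseg = 256, 50
x = rng.normal(size=(nseg, Tr))
y = x + rng.normal(size=(nseg, Tr))      # y = x plus equal-power noise

X, Y = np.fft.rfft(x, axis=1), np.fft.rfft(y, axis=1)
Sxx, Syy = np.abs(X)**2, np.abs(Y)**2
Sxy = np.conj(X) * Y

# Raw estimate (single segment): coherence is identically 1 at every frequency.
raw = np.abs(Sxy[0])**2 / (Sxx[0] * Syy[0])

# Smoothed estimate: average the spectra over segments first, then form the ratio.
smooth = np.abs(Sxy.mean(0))**2 / (Sxx.mean(0) * Syy.mean(0))

print(raw.min(), raw.max())   # both ~1
print(smooth[1:-1].mean())    # ~0.5 for SNR = 1
```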
Example: System with Some Nonlinearities (cubic stiffness) and Noisy Measurements
[Figure: PSD and coherence for the system. The nonlinearity causes a spread of energy around the excitation frequency and around 3× and 5× that frequency (nonlinear mode), and broad dips in the coherence function; if you drive the system harder these regions become wider. Narrow dips are due to bias errors, and poor SNR on the output lowers the coherence elsewhere.]
Example: Linear System with Noisy Output Measurements
[Figure: coherence for three cases: high SNR with Tr = 512/fs; high SNR with Tr = 2048/fs; low SNR on the output with Tr = 512/fs. Dips are mainly due to bias, and thus get smaller as the resolution increases; bias is greatest where the phase change is fastest. With low output SNR the dips fill in with noise, and SNRy also lowers the coherence elsewhere. The longer-window case involves less averaging than the N = 512 case (fewer segments → greater variance), but smaller bias effects.]
H1 and H2 Estimates of H: Effects of Noise
If the system is linear and there is no noise (ignoring all other estimation errors):

H(f) = Sxy(f)/Sxx(f)  (H1 approach)  =  Syy(f)/Syx(f)  (H2 approach)

Cases with noise (assume estimation errors are small, i.e., Tr and Nseg both large):

H1 estimate = Sxmym/Sxmxm = [Sxy(f)/Sxx(f)] / [1 + Snxnx/Sxx] = H(f) / [1 + Snxnx/Sxx]

Noise on the input adversely affects this estimate of H. Theory: |H1 estimate| < |H|.

H2 estimate = Symym/S*xmym = [Syy(f)/S*xy(f)]·[1 + Snyny/Syy] = H(f)·[1 + Snyny/Syy]

Noise on the output adversely affects this estimate of H. Theory: |H2 estimate| > |H|.

Note that with bias errors due to windowing (Tr not as large as you would like) these inequalities may not hold, but |H1 estimate| < |H2 estimate|.
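The inequalities can be illustrated with a trivial system (a numpy sketch with a frequency-independent gain H = 3 and noise on the output only; all values are assumptions for the demo). With output noise, H1 stays near the true gain while H2 is pushed up by the factor 1 + Snyny/Syy.

```python
import numpy as np

rng = np.random.default_rng(8)
Tr, nseg, H_true = 256, 200, 3.0
x = rng.normal(size=(nseg, Tr))
ym = H_true * x + 1.0 * rng.normal(size=(nseg, Tr))  # noise on the output only

X, Y = np.fft.rfft(x, axis=1), np.fft.rfft(ym, axis=1)
Sxx = (np.abs(X)**2).mean(0)
Syy = (np.abs(Y)**2).mean(0)
Sxy = (np.conj(X) * Y).mean(0)

H1 = np.abs(Sxy / Sxx)[1:-1].mean()           # unaffected by output noise
H2 = np.abs(Syy / np.conj(Sxy))[1:-1].mean()  # biased upward by output noise

# Here Syy(clean) = 9, Snyny = 1, so H2 -> 3 * (1 + 1/9) = 10/3.
print(H1, H2)
```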
Estimation of H

Frequency response function estimates are extremely sensitive to bias errors, which are worst at peaks and troughs. Large segment sizes are required to overcome the bias, but this means fewer segments to average, and thus higher variance. Note that

E[H̃] = E[ S̃xy / S̃xx ] ≠ E[S̃xy] / E[S̃xx].

A low coherence function does not necessarily imply a poor frequency response function estimate. If the coherence function is low because of noise on the response (input), then the H1 (H2) frequency response estimate should be accurate, provided sufficient averaging was done to reduce the variance of the estimates.
Calibration of PSD and CSD in MatLab
psd – old program; pwelch – new program; cpsd – gives the complex conjugate of what you want.

The mean square value of the time signal (variance) should equal the integral of the PSD (Parseval's theorem). Check whether you are getting a two-sided or a one-sided PSD:
– One-sided: negative and positive frequency contributions are added (except for the components at f = 0 and fs/2, which should be zero anyway). This is what Matlab does.
– Two-sided: when you integrate the spectrum from 0 to fs/2 you will get about half of what you expect (no addition of positive and negative frequency contributions has occurred).

Matlab also doubles the CPSD from 0 to fs/2, which does not make sense on its own, but it is convenient when you estimate the frequency response function because the doubling cancels.
Calibration (continued)
Power Spectral Density Estimates Using DFTs: recall that for −fs/2 < f < fs/2,

XT(f)|_{f = k·fs/N} ≈ Δ·DFT( w(nΔ)·x(nΔ), n = 0, 1, …, N−1 ) = Δ·Xk

Ŝxx(fk) = X*T(fk) XT(fk) / (T·wcomp) ≈ Δ²·X*k Xk / (NΔ·wcomp) = Δ·|Xk|² / (N·wcomp)
Calibration Continued: Energy Spectral Density

We sometimes have segments that each contain a single transient (e.g., tap testing of structures), and we average the raw spectra from the segments to remove noise effects. [Be careful when applying this random process theory to different types of signals: each segment used in the estimation should contain similar information.] If we choose a different Tr, i.e., allow a shorter or longer time between successive transients (the transient should have died away within the segment), the PSD will change because of the division by Tr in the formula.

To overcome this problem we estimate an Energy Spectral Density (ESD) by removing the division by Tr from the raw PSD estimate:

Raw ESD estimate = |XTr(f)|² ≈ Δ²·|Xk|²  (Volts/Hz)²

[You also need to be careful with the window choice here so as not to distort the transient.]
Calibration Continued: Power Spectrum
Segment averaging is often applied to signals that have both periodic and random components.

Power spectrum (works well for periodic signals): as the resolution increases (frequency spacing gets smaller), the noise floor decreases. Total power = sum of the power at each spectral component. Recall: Ck = Xk/N, if you synchronize, do not alias, and there is no noise.

Power spectral density (ideal for random signals): the level is unaffected by changes in frequency resolution (window size). Total power = integral of the PSD = sum of PSD values × frequency resolution.

Power estimate = |Xk|²/N² = raw PSD estimate × frequency resolution = (Δ·|Xk|²/N)·(fs/N) V²
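The two scalings can be checked on a bin-centered sine (a numpy sketch; amplitude, frequency, and fs are arbitrary, and the tone is synchronized so it falls exactly on a DFT bin). The one-sided power at the tone bin should equal A²/2, whether read from the power spectrum directly or from the PSD times the frequency resolution.

```python
import numpy as np

fs, N, A, f1 = 1024.0, 1024, 2.0, 100.0
t = np.arange(N) / fs
x = A * np.sin(2 * np.pi * f1 * t)   # synchronized: f1 falls exactly on a bin

Xk = np.fft.rfft(x)
power = np.abs(Xk)**2 / N**2         # power spectrum estimate |Xk|^2 / N^2
psd = np.abs(Xk)**2 / (N * fs)       # raw PSD estimate = Delta * |Xk|^2 / N

k = int(f1 * N / fs)                 # bin holding the tone
# One-sided: the tone's power A^2/2 splits between the +f1 and -f1 bins.
print(power[k] * 2, A**2 / 2)            # both ~ A^2/2 = 2
print(psd[k] * 2 * (fs / N), A**2 / 2)   # PSD x bandwidth recovers the same power
```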
PSDs for Sines + Noise

The power spectral density of a sinusoid A·sin(2πf1t) is:

(A²/2)·δ(f − f1) + (A²/2)·δ(f + f1)

• But by using windows Tr seconds long, the delta functions become sinc or sinc-like functions whose maximum height is affected by the window size Tr.
• If Tr is too small the sinc functions will be buried in the noise, but as Tr is increased the sinc functions begin to emerge from the noise.
• So if you suspect that a peak in your spectrum is due to a sine wave, increase the window size (better frequency resolution) and see if the peak gets larger, as you would expect if it were truly a sine wave.
[Figure: level (dB) versus frequency, 1000 to 1300 Hz, comparing the original signal and simulated sines + noise.]
Sxmxm = Sxx + Snn = | (A·Tr/2)·sinc(π(f − f1)Tr) + (A·Tr/2)·sinc(π(f + f1)Tr) |² / Tr + Snn

≈ (A²·Tr/4)·sinc²(π(f − f1)Tr) + (A²·Tr/4)·sinc²(π(f + f1)Tr) + Snn
Note here we have assumed averaging is sufficient to make cross terms small compared to the terms retained.
Sines + Broadband Random Noise
[Figure: PSD (V²/Hz) versus frequency for Tr = NΔ with N = 4,096, 8,192, and 16,384. The sinc function emerges from the noise as Tr increases, but the variation in the estimated PSD grows due to lack of averaging: Tr larger ⇒ Nseg smaller, ∴ larger variance.]
Frequency Variations in Sinusoidal Components
When there are frequency variations, sinusoidal power is spread over a group of frequencies and the amplitudes are reduced. This is sometimes more noticeable at the higher harmonics when the variations are small, as in signal 17 below, versus signal 22, where there appears to be very little frequency variation (the sinusoidal components are narrow, even at high frequencies).
[Figure: power spectral densities (dB) versus frequency for signals 17 and 22.]
Fast Modulations in Frequency Modulated (FM) Sounds
FM tones of the form (randomized variation of tones):

y(t) = A·sin( 2πf0·t + 2πB·∫₀ᵗ r(t′) dt′ )

– r(t): random noise passed through a 4th-order Butterworth filter
– fc: filter cutoff frequency
– B: the range of frequency modulation
– f0: center frequency (700 Hz)
– Sampling frequency: 44.1 kHz
[Figure: instantaneous frequency f(t) = f0 + B·r(t) versus time, varying about f0 = 700 Hz between 700 − B and 700 + B.]
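The synthesis formula above is straightforward to implement. A numpy sketch (for reproducibility the filtered noise r(t) is replaced here by a slow sinusoid stand-in; the slides use white noise through a 4th-order Butterworth low-pass): the instantaneous frequency f(t) = f0 + B·r(t) stays within [f0 − B, f0 + B].

```python
import numpy as np

fs, dur = 44100, 1.0
f0, B = 700.0, 50.0
t = np.arange(int(fs * dur)) / fs

# Stand-in for the filtered noise r(t): a slow sinusoid bounded in [-1, 1].
r = np.sin(2 * np.pi * 5.0 * t)

# y(t) = A sin(2 pi f0 t + 2 pi B * integral of r), integral via cumulative sum.
phase = 2 * np.pi * f0 * t + 2 * np.pi * B * np.cumsum(r) / fs
y = np.sin(phase)

# Instantaneous frequency f(t) = f0 + B r(t) stays within [f0 - B, f0 + B].
f_inst = np.diff(phase) / (2 * np.pi) * fs
print(f_inst.min(), f_inst.max())
```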
Sounds – Power Spectra of a Frequency Modulated Tone
[Figure: 4×4 grid of power spectra; columns fc = 10, 50, 100, 200 Hz; rows B = 25, 50, 75, 100 Hz.]
Spectral estimation parameters: Hann window, fs = 44.1 kHz, Δf = 1 Hz, 100 segments, 50% overlap.

fc, the filter cut-off frequency, controls the frequency content of the frequency variation; B controls the range of the frequency modulation.
Power Spectra of a FM Tone with Trackable FMs Made Stationary
[Figure: 4×4 grid of power spectra, as on the previous slide (fc = 10, 50, 100, 200 Hz; B = 25, 50, 75, 100 Hz), after the trackable frequency modulation has been removed.]

Spectral estimation parameters: Hann window, fs = 44.1 kHz, Δf = 1 Hz, 100 segments, 50% overlap.
Another Example of Spectral Manipulation to Help in Estimation of Tonality Metrics
Recording (> 5 s) → finely resolved spectrum.
Signal decomposition: (1) significant tones, (2) insignificant tones, (3) noise floor.
Signal reconstruction: (2) + (3) → (4) new noise floor; then (1) + (4) and inverse DFT to give the sound.
Time-Varying (Non-Stationary) Signals
Spectrograms: Apply stationary spectral methods over short periods of time with overlapping windows
– limits averaging for the random parts of signals
– short windows mean more bias, and tones are not so prominent
[Spectrogram: time 0 to 3 seconds, frequency 0 to 2000 Hz, of a motor-driven device; humming/whining visible.]
Spectrograms: Sliding Spectral Estimates
Have to "play" with window sizes:
a. Listen to see if there are any obvious variations you can track; try a window size about 1/10 of a variation "period" (Ta). In Matlab: nfft = nearest power of 2 to (0.1·fs·Ta). Typically we choose a Hann window with 50% overlap.
b. Identify the fundamental frequencies of tone complexes to identify the lowest desirable frequency resolution. Based on frequency analysis and an understanding of the repetition rates in your machine, the minimum window size should be the inverse of (fundamental frequency / 7) for a Hann window. (One harmonic series example.)
c. Make the window smaller (if the harmonics remain well separated) to see if there are faster fluctuations. As you continue to make the windows smaller, the frequency resolution in Hz (the inverse of the window size in seconds) will get bigger. Eventually the harmonic separation and spectral resolution will merge (not good).
d. There is always a trade-off between spectral and temporal resolution.
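The sliding-window procedure can be sketched by hand in numpy (an illustration, not a Matlab `spectrogram` call; the test signal, window size, and hop are arbitrary choices): a tone that steps from 500 Hz to 1500 Hz shows up as a moving peak across the time slices.

```python
import numpy as np

fs = 8000
t = np.arange(0, 2.0, 1 / fs)
# Test signal: a tone that steps from 500 Hz to 1500 Hz halfway through.
x = np.sin(2 * np.pi * np.where(t < 1.0, 500.0, 1500.0) * t)

nfft = 256                                   # window size (here: a power of 2)
hop = nfft // 2                              # 50% overlap
w = np.hanning(nfft)

frames = [x[i:i + nfft] * w for i in range(0, len(x) - nfft, hop)]
S = np.abs(np.fft.rfft(frames, axis=1))**2   # one raw spectrum per time slice
freqs = np.arange(S.shape[1]) * fs / nfft

# The peak frequency in the first and last time slices tracks the tone.
print(freqs[S[0].argmax()], freqs[S[-1].argmax()])
```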