SP10Barbe-Wech Method Revisited

download SP10Barbe-Wech Method Revisited

of 13

description

adsp

Transcript of SP10Barbe-Wech Method Revisited

  • IEEE

    Proo

    f

    IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 58, NO. 2, FEBRUARY 2010 1

    Welch Method Revisited: Nonparametric PowerSpectrum Estimation Via Circular Overlap

    Kurt Barb, Rik Pintelon, and Johan Schoukens

    AbstractThe objective of this paper is twofold. The first partprovides further insight in the statistical properties of the Welchpower spectrum estimator. A major drawback of the Welchmethod reported in the literature is that the variance is not amonotonic decreasing function of the fraction of overlap. Selectingthe optimal fraction of overlap, which minimizes the variance,is in general difficult since it depends on the window used. Weshow that the explanation for the nonmonotonic behavior of thevariance, as reported in the literature, does not hold.

    In the second part, this extra insight allows one to eliminate thenonmonotonic behavior of the variance for the Welch power spec-trum estimator (PSE) by introducing a small modification to theWelch method. The main contributions of this paper are providingextra insight in the statistical properties of the Welch PSE; modi-fying the Welch PSE to circular overlapthe variance is a mono-tonically decreasing function of the fraction of overlap, making themethod more user friendly; and an extra reduction of variancewith respect to the Welch PSE without introducing systematic er-rorsthis reduction in variance is significant for a small numberof data records only.

    Index TermsNonparametric, overlap, power spectrum, systemidentification, variance reduction, Welchs method, windowing,WOSA method.

    I. INTRODUCTION

    A N important application area of digital signal processing(DSP) is the power spectral estimation of periodic andrandom signals. Speech recognition problems use spectrumanalysis as a preliminary measurement to perform speech band-width reduction and further acoustic processing. Sonar systemsuse sophisticated spectrum analysis to locate submarines andsurface vessels. Spectral measurements in radars are used toobtain target location and velocity information [1]. The vastvariety of measurements that spectrum analysis encompassesis perhaps limitless, and thus, the subject received a lot ofattention over the last five decades [2].

    Another important area of application is system identi-fication. Frequency-domain system identification offers a

    Manuscript received November 27, 2008; revised August 14, 2009. . Theassociate editor coordinating the review of this manuscript and approving itfor publication was Dr. Tryphon T. Georgiou. This work was supported bythe Flemish Government under Methusalem Grant METH-1, the Belgian State,Prime Ministers Office, Science Policy programming under the Belgian Pro-gram on Interuniversity Poles IUAP VI/4DYSCO, and the Research Councilof Vrije Universiteit Brussel (OZR).

    The authors are with Vrije Universiteit Brussel, 1050 Brussels, Belgium(e-mail: [email protected]; [email protected]; [email protected];[email protected]).

    Digital Object Identifier 10.1109/TSP.2009.2031724

    tool to identify the transfer function of a linear dy-namic time-invariant system measured at angular frequencies

    , where . Theuser wants to identify the transfer function on a frequency bandof interest such that a high-frequency resolution is desired. Ahigh-frequency resolution implies long and costly experiments.When the noise characteristics are unknown, these are to be es-timated from the input/output data. One can use nonparametricnoise models to obtain this information, [3], which is closelyrelated to nonparametric power spectrum estimation (PSE).

    There are mainly two classes of power spectrum estimators:the parametric and the nonparametric estimators. The class ofparametric PSE [4], [5] tries to fit a parametric model (AR,ARMA, MA, etc.) to the signal by minimizing a given costfunction; for example: Burgs entropy method [6] and theYuleWalker method [7]. In contrast to parametric methods,nonparametric methods do not make any assumptions on thedata-generating process or model (e.g., autoregressive model,presence of a periodic component). There are five commonnonparametric PSE available in the literature: the periodogram[8], the modified periodogram [2], Bartletts method [9],BlackmanTukey [7], and Welchs method [10]. However, allthese nonparametric PSEs are modifications of the classicalperiodogram method introduced by Schuster [8].

    The periodogram is defined as [2]

    It is well known that the periodogram is asymptotically un-biased but inconsistent because the variance does not tend tozero for large record lengths. One can show [2], [11] that thevariance on the periodogram of an ergodic weakly sta-tionary signal [12], for is asymptoti-cally proportional to , the square of the true power at fre-quency bin . The periodogram uses a rectangular time-window,a weighting function to restrict the infinite time signal to a finitetime horizon, [13]. The modified periodogram uses a nonrect-angular time window [14].

    A way to enforce a decrease of the variance is averaging.Bartletts method [9] divides the signal of length intosegments of length each. The periodogram methodis then applied to each of the segments. The average of theresulting estimated power spectra is taken as the estimatedpower spectrum. One can show that the variance is reduced bya factor , but the spectral resolution is also decreased by afactor , [2]. The Welch method [10] eliminates the tradeoffbetween spectral resolution and variance in the Bartlett methodby allowing the segments to overlap; see Fig. 1. Furthermore,

    1053-587X/$26.00 2009 IEEE

  • IEEE

    Proo

    f

    2 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 58, NO. 2, FEBRUARY 2010

    Fig. 1. Regular overlap for and fraction of overlap .

    Fig. 2. Circular overlap for and fraction of overlap .

    the time window can also vary. Essentially, the modified peri-odogram method is applied to each of the overlapping segmentsand averaged out.

    Like in system identification, if a high-frequency resolutionis desired, one can only split the record in a small number ofsegments of length . In system identification, the number ofsegments is typically 2,3, 6; see [15] and [16]. Unfortu-nately, a small number of segments implies a higher varianceof the estimated power spectrum. Therefore, it is worth lookinginto methods using overlapping subrecords for reducing thevariance as the main processing algorithm remains unchanged.

    For more than three decades, many successful applicationsof the Welch PSE have been reported throughout the literature.However, there are still some open questions regarding the un-derlying theory of the Welch method. Indeed, only a handfulof papers extend beyond the analysis of Welch [10]. In [10],the mean of the Welch PSE and the variance for a fraction ofoverlap and were studied. In [13], the variance ofthe Welch PSE was discussed for a fraction of overlap .In [17], the probability density function of the Welch PSE wasinvestigated. A couple of papers [18], [19] reported a drawbackconnected to the Welch PSE: the variance is a nonmonotoni-cally decreasing function of the fraction of overlap. The vari-ance starts to grow when the fraction of overlap becomes toolarge. This makes the method difficult to apply, as it is not clearwhich fraction of overlap to use since the optimum also dependson the type of the window used.

    It is explained intuitively [18], [19] that this is a result ofthe competing effects of more segments (lowering the variance)and more overlap among segments (increasing the variance).In this paper, we show that this intuitive explanation does nothold. In particular, we illustrate that by using circular overlap(Fig. 2), the variance of the PSE is monotonically decreasingas the fraction of overlap increases. We explain that Welchsmethod suffers from the nonmonotonicity due to the unequalweighting of the measured samples, increasing the uncertaintyof the PSE considerably.

    Furthermore, we show that the variance of the PSE can bereduced up to a factor , with the fraction of overlapand the number of independent segments, with respect to thevariance of the Welch PSE.

    1isiknowledge.com.

    Fig. 3. independent segments of the data record.

    However, it is clear by using circular overlap that the concate-nation of the end of the record with the beginning of the recordcreates a discontinuity in the signal. We need to examine the in-fluence of this discontinuity on the statistical properties of thePSE with circular overlap. It is shown that no extra systematicerrors are introduced by the discontinuity, implying that the PSEwith circular overlap is asymptotically unbiased. However, theconvergence rate is slightly slower. Furthermore, we show thatthe bias is of the same order of magnitude as the bias of Welchsmethod.

    The user can gain these properties for free, as the algorithmfor computing the PSE via circular overlap is essentially thesame as computing the Welch PSE, taking into account someminor modifications.

    In summary, the main contributions of this paper are: providing further insight in the statistical properties of the

    Welch PSE; improving the Welch PSE, which implies a decrease in un-

    certainty up to a factor ; guaranteeing a monotonic decrease in variance of the PSE

    via circular overlap as a function of the fraction of overlap.

    II. ASSUMPTIONS AND NOTATIONS

    We assume that the signal withis a weakly stationary Gaussian process. The signal is di-vided into segments, as in Fig. 3, such that every segmenthas length . Further, we extract different overlappingsubrecords by following the scheme of Fig. 2. The th overlap-ping subrecord of the signal satisfies

    (1)

    with , anddenotes modulo imposing circularoverlap. We need to impose that circular overlap implies an in-teger number of overlapping subrecords; further, the differentsubrecords need to overlap an integer number of time samples.One can show that these two properties are, respectively, satis-fied if the following two conditions hold simultaneously:

    forfor (2)

    The first expression in (2) implies that the number of subrecordsextracted via circular overlap is an integer. The last expressionimplies that only an integer number of points are overlapped.

    In the beginning of the section, we assumed the signalto be weakly stationary. Next, we apply Wolds decompositiontheorem [12], which states that a weakly stationary Gaussiansignal can be interpreted as filtered white Gaussian noise,where the filter has a square summable impulse response, butnot necessarily stable in the bounded-input bounded-output

  • IEEE

    Proo

    f

    BARB et al.: WELCH METHOD REVISITED: NONPARAMETRIC POWER SPECTRUM ESTIMATION VIA CIRCULAR OVERLAP 3

    Fig. 4. The Wold decomposition of a weakly stationary signal.

    sense, which requires an absolutely summable impulse re-sponse; see Fig. 4. In the sequel of this paper, we shall assumethe following.

    Assumption 1: The signal with is aweakly stationary Gaussian process, such that

    , where is zero-mean white Gaussian noise with unitvariance and denotes the impulse response of the cor-responding filter. We assume that the sequence satisfies

    .

    Assumption 1 implies that the filter characteristic iscontinuously differentiable. This condition can be relaxed tocontinuously filter characteristics, while the conclusions of thispaper remain valid, except the convergence rate of the bias,which drops from to ; see [11] for more de-tails.

    The main objective is to estimate the power spectrum of thesignal . The true power spectrum at frequency ofa weakly stationary signal is defined as [2]

    (3)

    where denotes the expectation and the last equality is a di-rect consequence of Wolds decomposition [12].

    Assumption 2: The time window is assumed to be zerofor . The windowing function is either a rectan-gular window or nonnegative, unimodal,and symmetrical at , such that .

    Due to the fact that the time window is unimodal, Assumption2 implies that the Riemann integral of the window functionexists [20]. Furthermore, it is important to note that Assumption2 captures all windows analyzed in [13].

    We define the windowed discrete Fourier transform (DFT) ofthe th subrecord at frequency bin as

    (4)

    with . We corrected the phase, for the delay, to refer all subrecords to the same time origin.

    III. DIRECT GENERALIZATION OF THE PERIODOGRAM TOCIRCULAR OVERLAP

    In this section, we propose a direct generalization of the pe-riodogram method and Welch method to circular overlappingsubrecords and make a detailed analysis of both methods.

    For every subrecord as in Fig. 2, the estimated power for sub-record of the signal is defined as

    (5)

    By averaging over the different overlapping subrecords in (5),an estimate of the power spectrum of the signal is definedas

    (6)

    Before studying the bias and the variance of the estimator (6),we prove the following property.

    Theorem 1: Under the conditions of Assumptions 1 and 2, theequation shown at the bottom of the page holds asymptotically

    . Further, denotes the DFT of the filter ,as explained in Section II and

    with .The proof can be found in Appendix A.Remark 1: In Theorem 1, the first convergence rate holds for the non-

    circular segments and the second convergence rate holdsfor the circular segments.

    Note that since the function is a windowingfunction, and if , the function

    for .Furthermore, under Assumption 1, the filter charac-teristic is continuously differentiable. Hence,one can show, by applying [21, Theorem 3.15], that

    .

    Theorem 1 further implies the following.Corollary 1: The first two moments of equal

    for

    elsewhere

  • IEEE

    Proo

    f

    4 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 58, NO. 2, FEBRUARY 2010

    with

    The proof can be found in Appendix B.We can analyze the bias and the variance of the PSE (6). The

    expected value of the PSE (6) is given by

    where the last equality comes from the fact that, for every

    This indicates that the bias is given by

    Bias (7)

    and the variance is given by

    (8)

    where the last approximate equality holds asymptotically by ap-plication of Corollary 1. Since nonparametric PSEs are incon-sistent, the variance does not vanish asymptotically but is dom-inated by (8).

    A. Collating the MSE of the PSE (5) With the MSE of theWelch Method

    To study the differences between the MSE of (6) andthe Welch method, we discuss first the bias contribution

    and secondly the variance contribu-tion separately.

    1) The Bias Contribution: In the case of circular overlap,the bias contribution (7) is dominated by and

    , which is not the magnitude of the DFT of the wholewindowing function, where for the Welch PSE the bias contri-bution is dominated by . This observation indicates thatan extra leakage contribution is introduced by the circular seg-ments. Therefore, by breaking the window, the good propertiesof the windowing function are ruined in the circular segments.

    2) The Variance Contribution: Next, we study the variancecontribution in the MSE of (6). Expression (8) reveals that thevariance can be approximated by

    By taking circular overlap (2) into account, it can easily be seenthat

    (9)

    where correlation . Welch [10] showedthat the variance of the Welch PSE equals

    (10)

    with the number of overlappingsubrecords for regular overlap. Note that the expression forthe variance of the Welch PSE uses an unequal weighting forthe correlation between two overlapping subrecords. Circularoverlap weights the correlation between two overlappingsubrecords equally. Although the correlation between two over-lapping subrecords does not depend on the method of overlap(circular overlap or regular overlap), the nonmonotonicityof the variance reported in [18] and [19] is not observed forcircular overlap.

    Although we are not able to formally prove this claim, wecan analytically show that the variance is a monotonically de-creasing function of the fraction of overlap when a rectangularwindow is used. For a rectangular window and , thevariance (9) equals

    (11)

    The computation of (11) is found in Appendix C.Hence, the variance of the PSE via circular overlap is a

    parabola with minimum for . Extensive simulations,without a formal proof, revealed that for the windows describedby Assumption 2, the following conjectured claim holds if thewindow function is -times continuously differentiable:

    (12)

  • IEEE

    Proo

    f

    BARB et al.: WELCH METHOD REVISITED: NONPARAMETRIC POWER SPECTRUM ESTIMATION VIA CIRCULAR OVERLAP 5

    with the lower bound for tending to one.Next, we compare the variance expression (9) for circular

    overlap with (10), the Welch method, as follows:

    Since the correlation is positive, the following inequalityholds:

    (13)

    An important consequence of (13) is that the vari-ance can be reduced as low as

    . Hence, circular overlapcan reduce the variance of the Welch PSE up to factor 1 .

    It is known that the variance of the Welch PSE is not mono-tonically decreasing as a function of overlap [18], [19]. A prioricomparing (9) and (10) suggests that the reason the PSE via cir-cular overlap outperforms the Welch PSE is due to the fact thatmore segments are averaged out. However, the fact that the vari-ance of the Welch PSE is not monotonically decreasing impliesthat increasing the fraction of overlap will not provide a reduc-tion in variance.

    Next, we illustrate these results with a numerical example. InSection IV, we address the question of whether it is possible toeliminate this biasvariance tradeoff.

    B. Numerical ExampleIn this section, we illustrate the method of circular overlap

    as described in the last section. In particular, we shall illustratethe variance reduction (13) and the MSE. In a second example,we illustrate that the variance of the PSE via circular overlapis monotonically decreasing where the Welch PSE suffers fromnonmonotonicity as reported in [19] and [18].

    1) Example: Keeping in mind the requirement of a maximalfrequency resolution (e.g., system identification), we illustratethe method for (two data records only). We used a type1 digital Chebychev filter of order two, a stopband ripple of 20dB, and a cutoff frequency at . A zero-mean whiteGaussian noise sequence with unit variance of pointswas generated. The sequence was partitioned in two records of

    points, and a Hanning window was applied. The lightgray curve in the first plot shows one realization of the PSE viacircular overlap, and the dark gray curve shows one realizationof the PSE of Welch. The second plot shows the correspondingroot mean squared error (RMSE) of the PSE via circular overlapand the PSE of Welch, respectively, both estimated on 10 000Monte Carlo simulations. The third and fourth plots zoom inat the resonance. In Fig. 5, we see that we gain approximately 2

    Fig. 5. PSE via circular overlap for a type1 digital Chebychev filterof order two. The first plot shows the estimated power spectrum; the second plotshows the root mean squared error of the estimated PSE of the first plot. Thethird and fourth plots show a zoom-in at the resonance of the first and secondplots, respectively. The dashed curve is the true power spectrum, the light graycurve is the PSE via circular overlap, and the dark gray curve is the Welch PSE.

    dB in RMSE in the passband of the estimator, which is in agree-ment with a factor of (13). However, for the higherfrequencies, it is clear that the PSE with circular overlap has dif-ficulties following the slope towards infinity, where the WelchPSE does not suffer from this effect. Due to the fact that the vari-ance of the PSE is proportional to the true power, the MSE forthe higher frequencies is dominated by the bias contribution. Weconclude that the variance of the PSE via circular overlap is re-duced, but at the expense of an increase in bias. By introducinga small modification for the PSE via circular overlap, we showin Section IV that this extra bias can be suppressed.

    2) Example: In this example, we use the same setup as in theprevious example. We used a type1 digital Chebychev filter oforder two, a stopband ripple of 20 dB, and a cutoff frequencyat 0.15 . A zero-mean white Gaussian noise sequence withunit variance of points was generated via Matlab.The sequence was partitioned in records ofpoints, and a Hanning window was applied.

    We compute the variances of the PSE via circular overlapand the PSE of Welch as a function of the fraction of overlap.In particular, we choose all fractions of overlap between zeroand 9 such that, for the Welch PSE, the number of segments is

  • IEEE

    Proo

    f

    6 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 58, NO. 2, FEBRUARY 2010

    Fig. 6. The black curve indicates the theoretical normalized variance of theWelch PSE, and the gray curve indicates the theoretical variance of the PSE viacircular overlap as a function of the fraction in the top plot and as a function ofthe number of overlapping subrecords in the bottom plot. The x-markers arethe simulated normalized variance at the resonance frequency on 6000 MonteCarlo simulations of the PSE of a type1 digital Chebychev filter of order two.

    an integer . For the PSE via circularoverlap, we choose all the possible fractions satisfying (2). Thisis illustrated in the top plot of Fig. 6.

    Secondly, the top plot of Fig. 6 was redrawn, such thatthe -axis is a function of the number of overlapping seg-ments. If the Welch PSE was computed for the fractions

    , we plotted the variance of the Welch PSEas a function of

    and similarly for the PSE via circular overlap. This isillustrated in the bottom plot of Fig. 6. In both figures, wenormalized the computed variances by dividing these by thetrue power.

    It is clear that in both plots of Fig. 6, the variance of the WelchPSE (indicated in black) is not a decreasing function of both thefraction of overlap and the number of overlapping subrecords.The variance of the Welch PSE has a minimum that depends onthe window used. In general, the question arises of which frac-tion of overlap to use in order to minimize the variance. In thecase of the PSE via circular overlap, the variance (indicated bythe gray curve) does not suffer from this effect. The varianceis a monotonically decreasing function of both the fraction of

    overlap and the number of overlapping subrecords. It is impor-tant to stress that Fig. 6 illustrates the conjectured claim (12) asvariance for the PSE with circular overlap decreases monotoni-cally to a lower bound for tending to one.

    Note that the x-markers coincide with the theoretic curveswithin the accuracy of the Monte Carlo simulation (

    with 95% confidence).Furthermore, this example shows that the common explana-

    tion for the nondecreasing behavior of the variance of the WelchPSEthat a tradeoff needs to be made between increasing thefraction of overlap (more correlation between two overlappingsubrecords) and more subrecords to averagedoes not hold.The correlation, in the first plot of Fig. 6, between two sub-records is the same for both the Welch PSE and the PSE viacircular overlap. However, the variance of the PSE via circularoverlap is monotonically decreasing.

    In the next section, we discuss a way to obtain the reductionin variance without increasing the bias.

    IV. IMPROVED POWER SPECTRUM ESTIMATION VIA CIRCULAROVERLAP

    In Section III, it was shown that the bias of the PSE viacircular overlap is a function of and .We observed that the good properties of the windowing func-tion were ruined by using circular overlap since, for instance,

    is a function of ,which is not the magnitude of the DFT of the whole windowingfunction. This led to significant bias for the frequency lineswhere the true power , as illustrated in exampleB.1. The following modification solves this problem withoutlosing the variance reduction.

    We introduce the following notation, which is needed in themodification. Let be the windowing function used; thenwe define

    forfor

    For the noncircular subrecords , nomodification of the periodogram is needed. For the circular sub-records , we define

    (14)

    where is defined as in (4), where the windowwith is used. The onlydifference between (5) and (14) is the use of the windowfor the circular segments, as illustrated in Fig. 7. The improvedPSE via circular overlap becomes

    (15)

  • IEEE

    Proo

    f

    BARB et al.: WELCH METHOD REVISITED: NONPARAMETRIC POWER SPECTRUM ESTIMATION VIA CIRCULAR OVERLAP 7

    Fig. 7. Illustration of the applied windows for each subrecord for and .

    A. Properties of (15)Following the scheme of Fig. 7, it is clear that the window

    applied to the circular segments will not lose its good proper-ties. Applying Theorem 1 and repeating the same discussion asin Section III, the bias and the variance of (15) can be approx-imated by

    Bias

    (16)(17)

    where

    Fortunately, due to the fact that, for instance,is the magni-

    tude of the DFT of the whole window function, the goodproperties of the windowing function remain valid.

    For the variance analysis of in (17), it was notpossible to find an analytical expression. However, extensivesimulations showed that the following inequality holds asymp-totically:

    (18)

    This is what we expect due to the fact that we actually onlychanged the resolution of the window in the scheme in Fig. 7but not the type of window.

    In conclusion, the improved estimator (15) hasapproximately the same variance as the estimator (6).However, the bias of the estimator is significantlyreduced with respect to the bias of . This is mainlydue to fact that the good properties of windowing for leakagereduction are ruined in , whereas this is not the case forthe estimator .

    B. Numerical ExampleIn this section, we illustrate the modified PSE via circular

    overlap by repeating example III.B. In particular, we illustratethat the bias due to the modification is of the same magnitude

    as the bias of the Welch PSE and that the variance reduction(18) is approximately achieved. In the second example, we showthat the variance of the modified PSE via circular overlap re-mains a monotonically decreasing function of both the fractionof overlap and the number of overlapping subrecords.

    1) Example: In the first example, we show that the modifica-tion works where a Hanning window was applied. In particular,we show on the same simulation example as in Section III-Bthat the approximately the same RMSE reduction is achievedand this without paying the price in bias. The light gray curvein the first plot shows one realization of the PSE via circularoverlap, and the dark gray curve shows one realization of thePSE of Welch. The second plot shows the corresponding RMSEof the PSE via circular overlap and the PSE of Welch, respec-tively, both estimated on 10 000 Monte Carlo simulations. Thethird and fourth plots zoom in on the resonance frequency. InFig. 5, we see that we gain approximately 2 dB in RMSE in thepassband of the estimator, which is in agreement with the factor

    of (18).Fortunately, we observe that the bias of the Modified PSE

    via circular overlap is of the same magnitude as the bias of theWelch PSE, as indicated by the previous discussion. In the nextexample, we shall look into the approximating expression forthe variance (18).

    2) Example: In this example, we use the same setup as in theexample as in Section III-B2. A zero-mean white Gaussian noisesequence with unit variance of points was generatedvia Matlab. The sequence was partitioned in four records of

    points, and a Hanning window was applied.In this example, we want to illustrate that the approximate

    expression for the variance (18) holds. We compute the vari-ances of the modified PSE via circular overlap and the PSE ofWelch as a function of the fraction of overlap. In particular,we choose all fractions of overlap between zero and 9 suchthat, for the Welch PSE, the number of segments is an integer

    . For the modified PSE via circularoverlap, we choose all the possible fractions satisfying (2). Thisis illustrated in the top plot of Fig. 9.

    Secondly, the top plot of Fig. 9 was redrawn, such thatthe -axis is a function of the number of overlapping seg-ments. If the Welch PSE was computed for the fractions

    , we plotted the variance of the Welch PSEas a function of

    and similarly for the modified PSE via circular overlap.This is illustrated in the bottom plot of Fig. 9. In both figures,we normalized the computed variances by dividing these by thetrue power.

    We see from both plots in Fig. 9 that the simulated variancecomputed from 1000 Monte Carlo simulations (indicated withthe cross-markers) follows the theoretical curves (indicated bythe solid lines) within the uncertainty of the simulation (

    with 95% confidence). This illustrates theclaim that the variance of the modified PSE via circular overlapfollows approximately (18). Therefore, we conclude that we ap-proximately can achieve a reduction in variance up to a factor

    , and this without increasing the bias.

  • IEEE

    Proo

    f

    8 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 58, NO. 2, FEBRUARY 2010

    Fig. 8. PSE via circular overlap for a type1 digital Chebychev filterof order two. The first plot shows the estimated power spectrum; the second plotshows the root mean squared error of the estimated PSE of the first plot. Thethird and fourth plots show a zoom-in at the resonance of the first and secondplots, respectively. The dashed curve is the true power spectrum, the light graycurve is the PSE via circular overlap, and the dark gray curve is the Welch PSE.

    V. CONCLUSION

    The main goal of this paper was to investigate some aspectsof the Welch PSE in more detail. In particular, we tried to finda theoretical explanation of the drawback (the nonmonotonicityof the variance of the Welch PSE) reported in the literature. Ouranalysis revealed the following.

    Circular overlap revealed that the nonmonotonicity of thevariance of the Welch PSE is not due to the followingtradeoff: increasing the fraction of overlap results in moresubrecords to average but increases the correlation betweentwo subrecords.

    The nonmonotonicity of the variance of the Welch PSE isdue to the unequal weighting of the data, which impliesthat the covariance between two overlapping subrecords isweighted with a triangular window.

    The variance of the improved PSE via circular overlap (15)is monotonically decreasing without introducing extra sys-tematic errors.

    The improved PSE via circular overlap is user-friendlysince choosing an optimal percentage of overlap is no

    Fig. 9. The black curve indicates the theoretical normalized variance of theWelch PSE, and the gray curve indicates the theoretical variance of the PSE viacircular overlap as a function of the fraction in the top plot and as a function ofthe number of overlapping subrecords in the bottom plot. The cross-markers arethe simulated normalized variance at the resonance frequency on 1000 MonteCarlo simulations of the PSE of a type 1 digital Chebychev filter of order 2.

    longer an issue for the improved PSE via circular overlap.As a rule of thumb, we propose a fraction of overlap of

    .

    The PSE via circular overlap provides the user with a re-duction of the variance with respect to the variance of theWelch PSE up to a factor . However, this reduc-tion in variance is significant for small values of only.

    The improved PSE via circular overlap is a small modification ofthe Welch PSE but has some better properties than the classicalWelch PSE; however, theoretically, the bias vanishes slightlyslower than the bias for the Welch PSE. The user can receivethese extra properties for free as the main processing algorithmremains unchanged.

    APPENDIX

    Throughout the Appendix, we shall use the notation in-stead of for simplicity if there is no confusion possible.

    A. Proof of Theorem 1We shall only prove Theorem 1 for the subrecords

    (see Fig. 10), but the proof for the other

  • IEEE

    Proo

    f

    BARB et al.: WELCH METHOD REVISITED: NONPARAMETRIC POWER SPECTRUM ESTIMATION VIA CIRCULAR OVERLAP 9

    Fig. 10. The subrecord used in the proof is denoted by the dashed line.

    subrecords is similar. Before proving the main result, weformulate two lemmas.

    Lemma 1: Under Assumption 1, the following holds forand

    and any arbitrary :

    Proof: The proof follows immediately from the fact that, implying that . We give

    the proof for claim i), but the proof for claim ii) is completelysimilar. We denote as a time sample inthe beginning of the record and

    at the end of the measured record. As indicated by Fig. 10,the points are separated by at least time samples.We compute

    Using the fact that andimplies that

    Since , the proof is complete.Lemma 2: Under Assumptions 1 and 2, the following holds

    asymptotically :

    Proof: We compute

    (19)

    Further, using (1), we compute the different terms in the ex-pected value (19). Separately, for

    (i)

    where denotes the real part of . Applying Lemma 1,the last term becomes . Furthermore, if we denote

    , then we find after renum-bering the first sum with , where

    , and the second sum with

    where we used the WienerKhintchine theorem [2] to obtain thesecond equality.

    In a completely similar way, we obtain for the other terms inthe expected value (19)

  • IEEE

    Proo

    f

    10 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 58, NO. 2, FEBRUARY 2010

    The third term becomes

    (iii)

    The fourth term becomes

    (iv)

    Combining results i)iv) completes the proof.

    Next, we compute

    (20)

    (21)

    (22)

    where in (20) and (21), we used the fact that the Fourier co-efficients are circular complex Gaussian dis-tributed. See [22], which implies that the second term in (21)equals zero; and we used Lemma 2 for (22).

    B. Proof of Corollary 1We start with a small technical lemma, which indicates that

    the sum of the squares of the windowing function is proportionalto the number of data points.

    Lemma 3: Under Assumption 2, the following holds for largerecord lengths and a nonzero constant

    Proof: Under Assumption 2 the window function isRiemann-integrable [20]. Due to the fact that is uniformlybounded by the constant one, it is easy to see that the windowfunction is quadratic integrable. It follows that

    Let and choose such that, for ,the following holds:

    which completes the proof.Now, we are able to prove claim (i) of Corollary 1. For the firstclaim, it is sufficient to study . We compute

    with . This follows immediately from results(i) and (iv) in the proof of Lemma 2. Using the triangular in-equality together with Lemma 3 results in

  • IEEE

    Proo

    f

    BARB et al.: WELCH METHOD REVISITED: NONPARAMETRIC POWER SPECTRUM ESTIMATION VIA CIRCULAR OVERLAP 11

    which completes the proof of the first claim.To show the second claim, it is sufficient to study

    . The proof of the second claim of Corollary1 is based on a very important corollary of mean square con-vergence. The corollary is found in [23, p. 41]. However, weshall slightly generalize the result.

    Lemma 4: Let four sequences of (complex)random variables, such that are equivalent in meansquare ( if ) and areequivalent in mean square. Then the following holds: if thesecond-order moments of the random variablesare uniformly bounded, then

    Remark 2: In [23], the result is formulated for andfor every . However, this makes no difference for the

    proof.Proof: It is easy to verify that

    . Using thetriangular inequality in combination with the CauchySchwartz(or Hlders) inequality, we compute

    This completes the proof of this lemma sinceare assumed uniformly bounded.To prove claim (ii) of Corollary 1, we compute

    (23)

    where the last equality follows from the following rule [22, p.123]: for complex circular Gaussian random variables,the following rule applies:

    To complete the proof, we apply the previously shown claim (i)for the first term in (23) such that

    For the second term in (23), we apply Lemma 4 withand

    , and where . This immediately proves theresult since Theorem 1 together with Remark 1 implies that

    forlarge enough.

    C. Proof of (11)In this section, we shall prove that the variance expression (9)

    for a rectangular window equals

    First, we study in more detai,l where we usedthe following fact [3] for Gaussian processes :

  • IEEE

    Proo

    f

    12 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 58, NO. 2, FEBRUARY 2010

    (24)

    Now, we use the fact that the windowing function andsuch that we obtain

    (25)

    where the last equality (25) holds asymptoticallysince for

    , [11]. Hence, we can compute the variance explicitly

    (26)

    where we used (25) to obtain (26). Next, we use following com-putation rules [24]:

    Finally, we obtain

    which completes the proof.

    REFERENCES

    [1] G. M. Jenkins and D. G. Watts, Spectral Analysis and Its Applica-tions. San Francisco, CA: Holden-Day, 1968.

    [2] J. G. Proakis and D. G. Manolakis, Digital Signal Processing: Princi-ples, Algorithms, and Applications, 3rd ed. Upper Saddle River, N.J.:Prentice-Hall, 1996.

    [3] R. Pintelon and J. Schoukens, System Identification: A Frequency Do-main Approach. New York: IEEE, 2001.

    [4] P. Broersen, Automatic Autocorrelation and Spectral Anal-ysis. Berlin, Germany: Springer, 2006.

    [5] G. Box and G. Jenkins, Time Series Analysis: Forecasting and Con-trol. San Francisco, CA: Holden-Day, 1970.

    [6] J. P. Burg, The relationship between maximum entropy spectra andmaximum likelihood spectra, Geophysics, vol. 37, pp. 375376, Apr.1972.

    [7] P. Stoica and R. Moses, Introduction to Spectral Analysis. UpperSaddle River, NJ: Prentice-Hall, 1997.

    [8] A. Schuster, On the investigation of hidden periodicities, Terrestr.Magn., vol. 3, pp. 1341, 1898.

    [9] M. S. Bartlett, Smoothing periodograms from time series with contin-uous spectra, Nature, vol. 161, pp. 686687, 1948.

    [10] P. Welch, The use of fast fourier transform for the estimation of powerspectra: A method based on time averaging over short, modified peri-odograms, IEEE Trans. Audio Electroacoust., vol. AE-15, pp. 7073,Jun. 1967.

    [11] D. Brillinger, Time Series: Data Analysis and Theory. London, U.K.:Holt, Rinehart and Winston, 1975.

    [12] B. Porat, Digital Processing of Random Signals: Theory andMethods. Upper Saddle River, NJ: Prentice-Hall, 1994.

    [13] F. J. Harris, On the use of windows for harmonic analysis with thediscrete fourier transform, Proc. IEEE, vol. 66, no. 1, pp. 5183, 1978.

    [14] D. J. Thompson, Spectrum estimation and harmonic analysis, Proc.IEEE, vol. 70, no. 9, pp. 10551096, 1982.

    [15] J. Schoukens, R. Pintelon, G. Vandersteen, and P. Guillaume, Fre-quency-domain system identification using nonparametric noisemodels estimated from a small number of data sets, Automatica, vol.33, no. 6, pp. 10731086, 1997.

    [16] K. Barb, J. Schoukens, and R. Pintelon, Frequency domain errors-in-variables estimation of linear dynamic systems using data from over-lapping sub-records, IEEE Trans. Instrum. Meas., vol. 57, no. 8, pp.15291536, 2008.

    [17] P. Johnson and D. Long, The probability density of spectral estimatesbased on modified periodogram averages, IEEE Trans. Signal Process., pp. 12551261, 1999.

    [18] T. P. Bronez, On the performance advantage of multitaper spectralanalysis, IEEE Trans. Signal Process., vol. 40, no. 12, pp. 29412946,1992.

    [19] H. Jokinen, J. Ollila, and O. Aumala, On windowing effects in es-timating averaged periodograms of noisy signals, Measurement, vol.28, no. 3, pp. 197207, 2000.

    [20] G. E. Shilov and B. L. Gurevich, Integral, Measure and Derivative: AUnified Approach. Englewood Cliffs: Prentice-Hall, 1966.

    [21] A. Zygmund, Trigonometric Series. Cambridge, U.K.: Univ. Press,19681979.

    [22] B. Pincinbono, Random Signals and Systems. Englewood Cliffs, NJ:Prentice-Hall, 1993.

    [23] E. Lukacs, Stochastic Convergence. London, U.K.: Academic, 1975.[24] R. Grimaldi, Discrete and Combinatorial Mathematics. Reading,

    MA: Addison-Wesley, 1999.

  • IEEE

    Proo

    f

    BARB et al.: WELCH METHOD REVISITED: NONPARAMETRIC POWER SPECTRUM ESTIMATION VIA CIRCULAR OVERLAP 13

    Kurt Barb received the degree in mathematics(option statistics) from the Vrije Universiteit Brussel(VUB), Brussels, Belgium, in 2005.

    He is currently a Ph.D. Researcher with the ELECDepartment, VUB. His main interests are in the fieldof system identification, time series analysis, and sta-tistics.

    Rik Pintelon received the engineer degree and thedoctoral degree in applied science from Vrije Uni-versiteit Brussel (VUB), Brussels, Belgium, in 1982and 1988, respectively.

    He is presently a Professor in the ELEC Depart-ment, VUB. His main research interests are in thefield of parameter estimation/system identificationand signal processing.

    Johan Schoukens received the engineer degree andthe doctoral degree in applied sciences from VrijeUniversiteit Brussel (VUB), Brussels, Belgium, in1980 and 1985, respectively.

    He is presently a Professor at VUB. The prime fac-tors of his research are in the field of system identifi-cation for linear and nonlinear systems.