
Nonlin. Processes Geophys., 14, 89–101, 2007
www.nonlin-processes-geophys.net/14/89/2007/
© Author(s) 2007. This work is licensed under a Creative Commons License.

Nonlinear Processes in Geophysics

An implementation of the Local Ensemble Kalman Filter in a quasigeostrophic model and comparison with 3D-Var

M. Corazza (1,2), E. Kalnay (1), and S. C. Yang (1)

(1) University of Maryland, Department of Meteorology, College Park, MD, USA
(2) ARPAL – CFMI-PC, V.le Brigate Partigiane 2, 16129 Genova, Italy

Received: 21 August 2006 – Revised: 16 February 2007 – Accepted: 16 February 2007 – Published: 22 February 2007

Abstract. We perform data assimilation experiments with a widely used quasi-geostrophic channel model and compare the Local Ensemble Kalman Filter (LEKF) with a 3D-Var developed for this model. The LEKF shows a large improvement, especially in correcting the fast growing modes of the analysis errors, with a mean square error equal to about half that of the 3D-Var. The improvement obtained in the analysis is maintained in the forecasts, implying that the system is capable of correcting the initial errors responsible for later forecast error growth.

Different configurations of the LEKF are tested and compared. We find that, for this system, adding random perturbations after every analysis step is more effective than the standard variance inflation in avoiding the underestimation of the background error covariance and the consequent filter divergence.

Experiments indicate that optimal results are obtained with a relatively small number of vectors (∼30) in the ensemble. The LEKF is characterized by the "localization" of the analysis process over local domains surrounding each gridpoint of the model grid. We find that, when using a fixed number of ensemble vectors, there is an optimal size of the local horizontal domain beyond which the results do not change further.

1 Introduction

Numerical weather prediction has substantially improved in recent years due both to an enhancement in the dynamics and physical parameterizations in models and to new methods to generate initial conditions in the data assimilation processes (e.g., Kalnay, 2003). It is well known that the atmospheric flow is a chaotic system and that numerical forecasts are sensitive to small changes in the initial conditions, with a rapid growth of the initial errors that leads, in a relatively short time, to deviations that limit the predictability of the flow to a week or so. For this reason, during the last decades a great effort has been devoted to the study of new methods aimed at improving the description of the predictability of the flow and, as a consequence, of the initial conditions.

Correspondence to: M. Corazza (matteo.corazza@arpal.org)

Presently, several operational centers use 3-Dimensional Variational data assimilation systems (3D-Var; e.g., Parrish and Derber, 1992; Lorenc, 1986) to generate the analyses required to provide initial conditions to the model. This method is a statistical interpolation between the new observations and a short-range forecast (typically 3 to 12 h), which is used as first guess or background. In this process the corrections of the background toward the observations, i.e. the analysis increments, are strongly dependent on the inverse of their error covariances and only take place within the subspace spanned by the background error covariance. Therefore a good representation of the observation and background error covariances is one of the most important aspects of the present efforts to develop data assimilation systems. In 3D-Var methods the background error covariance $B$ is given by a statistical average of the error structures and is kept constant in time. That is, there is no accounting for the variations of the atmospheric state and the consequent day-to-day variability of the background errors due to the actual state of the flow. These flow-dependent errors are hereafter referred to as the "errors of the day", the importance of which has been described in several recent papers and books (e.g., Courtier et al., 1994; Kalnay, 2003; Corazza et al., 2003).
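As an illustration of the statistical interpolation described above, the following minimal sketch applies the standard linear analysis update $x^a = x^b + B H^T (H B H^T + R)^{-1} (y - H x^b)$, which coincides with the minimum of the 3D-Var cost function when the observation operator is linear. The function name and the three-variable toy problem are hypothetical and are not taken from the system of Morss (1998).

```python
import numpy as np

def statistical_interpolation(xb, B, y, H, R):
    """Return xa = xb + B H^T (H B H^T + R)^{-1} (y - H xb)."""
    innovation = y - H @ xb                  # observation minus background
    S = H @ B @ H.T + R                      # innovation covariance
    return xb + B @ H.T @ np.linalg.solve(S, innovation)

# Hypothetical toy problem: three state variables, one observation of the first.
xb = np.zeros(3)
B = np.array([[1.0, 0.5, 0.0],
              [0.5, 1.0, 0.5],
              [0.0, 0.5, 1.0]])              # fixed background covariance
H = np.array([[1.0, 0.0, 0.0]])
R = np.array([[1.0]])
y = np.array([2.0])

xa = statistical_interpolation(xb, B, y, H, R)
```

In this toy case the increment is spread to the second variable only through the off-diagonal term of $B$, and the third variable, which has zero background covariance with the observed one, is left unchanged. With a constant $B$ this spreading pattern is the same on every day.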

Published by Copernicus GmbH on behalf of the European Geosciences Union and the American Geophysical Union.

Kalman filtering (see Daley, 1993), in which $B$ is predicted ($B_t = L B_{t-1} L^T$, where $L$ is the tangent linear model, $L^T$ its transpose, and $t$ is time), can be considered the natural and complete approach to the problem of representing the variability of $B$, but the computational cost related to this formulation makes its operational application impossible without major simplifications, even considering the potential development of the computational facilities during the next decade. There are other methods that try to account for the "errors of the day" in the forecast error covariance, such as the modification of $B$ in 3D-Var (Purser, 2005; De Pondeca et al., 2006; Corazza et al., 2002), 4-dimensional variational data assimilation (4D-Var; e.g., Courtier et al., 1994; Rabier et al., 2000; Klinker et al., 2000), ensemble Kalman Filtering (EnKF; e.g., Houtekamer and Mitchell, 1998; Hamill and Snyder, 2000), and the method of representers (Bennett et al., 1996).
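To make the cost argument concrete, the sketch below applies the covariance prediction $B_t = L B_{t-1} L^T$ to a hypothetical 2×2 example and counts the entries of a full covariance matrix for a grid of 64×32 columns with 7 variables each (the size of the model used in this work). The function name and the toy matrices are illustrative assumptions only.

```python
import numpy as np

def forecast_covariance(B_prev, L):
    """Propagate a covariance matrix with the tangent linear model L."""
    return L @ B_prev @ L.T

# Hypothetical tangent linear model: one growing and one decaying direction.
L = np.array([[2.0, 0.0],
              [0.0, 0.5]])
B0 = np.eye(2)
B1 = forecast_covariance(B0, L)      # variance grows along the first direction

# Size of the full covariance for a 64 x 32 grid with 7 variables per column.
n_state = 64 * 32 * 7
n_entries = n_state ** 2             # every entry must be propagated by L
```

Even for this small channel model the full matrix has about 2 × 10^8 entries, and propagating it requires applying the tangent linear model to each of its columns.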

Recently several studies have been carried out in order to replace the standard $B$ in 3D-Var with a more complex covariance matrix, no longer constant but capable of taking into account the time-dependent flow. In particular, a general method to implement any arbitrary anisotropy, and therefore also flux-dependent anisotropies, has been implemented at NOAA/NCEP, as described by Purser (2005) regarding the theoretical geometrical implementation and by De Pondeca et al. (2006) concerning different implementations of the filter. All the implementations produced an improvement with respect to the standard 3D-Var, further confirming the importance of including the errors of the day in the data assimilation process. Corazza et al. (2002) experimented with a simpler method based on the use of the information provided by an ensemble of bred vectors in $B$, using the same system as in this work, and also showed significant improvements with respect to 3D-Var.

4D-Var is a powerful approach that reduces the computational cost of Kalman Filtering by estimating the initial conditions such that the model forecast best fits the observations within a data assimilation window. 4D-Var produces the same results as Kalman Filtering under the assumptions of a perfect linear model framework, of Gaussian errors, and of a Kalman Filter representation of the background error covariance matrix at the beginning of the computation. Although 4D-Var is computationally affordable with the most powerful available computers, its cost is still high compared to 3D-Var. In addition, it requires the development and maintenance of the adjoint of the model.

Another approximation to Kalman Filtering is Ensemble Kalman Filtering (EnKF), where the background error covariance is estimated by the sample covariance from the background ensemble vectors. As a result, it has the advantage of being model independent, i.e. no tangent linear and adjoint models are needed, since EnKF only requires as input an ensemble of forecasts and their corresponding predictions of the observations, and from that it produces a new ensemble of analyses. In recent implementations that allow for the use of a relatively small number of ensemble members, EnKF has been shown to be less computationally expensive than 4D-Var (Evensen, 1994). A number of studies have been carried out with promising results (Houtekamer and Mitchell, 1998; Hamill and Snyder, 2000; Tippett et al., 2002; Anderson, 2001; Etherton and Bishop, 2004), and the system developed by Houtekamer et al. (2005) has been implemented in Canada to provide initial ensemble perturbations.
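The sample covariance mentioned above can be sketched in a few lines. The example is a minimal illustration with hypothetical random data standing in for an ensemble of forecasts. It also shows that $k$ members span at most $k-1$ independent error directions, which is why the analysis increments are confined to the ensemble subspace.

```python
import numpy as np

def sample_covariance(ensemble):
    """ensemble: (n, k) array with one forecast per column."""
    k = ensemble.shape[1]
    pert = ensemble - ensemble.mean(axis=1, keepdims=True)   # deviations from the mean
    return pert @ pert.T / (k - 1)

rng = np.random.default_rng(0)
n, k = 50, 5                                  # hypothetical state size and ensemble size
ensemble = rng.standard_normal((n, k))        # stand-in for k model forecasts

Pb = sample_covariance(ensemble)
rank = np.linalg.matrix_rank(Pb, tol=1e-8)    # at most k - 1
```

No tangent linear or adjoint model appears anywhere: only the forecasts themselves are needed.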

Kalnay and Toth (1994) pointed out that the breeding technique used for ensemble forecasting describes the local pattern of the "errors of the day", and proposed a method to use the information made available by an ensemble of bred vectors to, in effect, approximate the background covariance matrix. In their work they argued that the similarity between breeding (Toth and Kalnay, 1993, 1997) and data assimilation suggests that the background errors should have local structures similar to those of bred vectors, and this conjecture has been confirmed by Corazza et al. (2003) with the same quasi-geostrophic model used in this work.

More recently, based on the results of Patil et al. (2001), Ott et al. (2002, 2004) developed a new system called the Local Ensemble Kalman Filter (LEKF), in which the Kalman Filter equations are solved within the space of the ensemble forecasts but locally in space. This system has been tested on a simple 40-variable Lorenz model (Ott et al., 2004) and on the NCEP Global Forecasting System (Szunyogh et al., 2004, 2005). It is one of the possible implementations of the square-root ensemble Kalman filters, where the analysis increments and errors are computed using a background error covariance derived from the ensemble of vectors, so that the increments lie in the subspace spanned by these vectors. This gives the major advantage of capturing the dominant error growing directions, thus providing a large improvement to the information available from the observational errors. Other square-root filters derive their computational efficiency from the assimilation of one observation at a time (Tippett et al., 2002), whereas the LEKF computes the analysis independently at each grid point, using all the observations available within a local volume surrounding it. This is advantageous in the presence of a large number of observations, such as satellite data, and it allows for a very efficient parallel implementation.

In this work we test different configurations of the LEKF by implementing it on a quasi-geostrophic channel model described in Rotunno and Bao (1996). The results are compared with those obtained with an optimized 3D-Var system developed by Morss (1998). A perfect model framework is assumed, so that the conclusions of this work are not necessarily valid for a more complex model with errors. However, this assumption allows us to explicitly define the "true state of the atmosphere" (by integrating the model from a given initial state) and therefore to perform a direct comparison in which the analysis and forecast errors are explicitly represented.

The outline of the work is as follows. In Sect. 2 the implementation of the quasi-geostrophic model and of the 3D-Var data assimilation system are briefly reviewed. In Sect. 3 a description of the implementation of the Local Ensemble Kalman Filter is presented. Results are discussed in Sect. 4, and a summary and conclusions close the work.


2 Model and 3D-Var data assimilation system

This study uses the model developed by Rotunno and Bao (1996) and used by Morss (1998) and Morss et al. (2001) to develop a 3D-Var scheme. It is a quasi-geostrophic mid-latitude β-plane finite-difference channel model, periodic in longitude and with impermeable walls on the north and south boundaries. The bottom and the top boundaries are treated as rigid lids. It has no orography, land-sea contrast, or seasonal cycle. Quasi-geostrophic potential vorticity is conserved except for Ekman pumping at the surface, $\nabla^4$ horizontal diffusion, and forcing by relaxation to a baroclinically unstable zonal mean state. All the experiments described in this work have been performed over a domain corresponding to an area of roughly 16 000 × 8000 × 9 km, with a grid composed of 64 points east-west, 32 south-north, and 5 vertical interior layers plus the bottom and top of the domain, where potential temperature is forecast to provide vertical boundary conditions for the streamfunction.

As indicated in the introduction, the "true" state of the atmosphere is created by a long integration of the model. The 3D-Var data assimilation system used in this work as the reference data assimilation scheme is based on the code of the scheme originally described in Morss (1998). "Rawinsonde" observations of $u$, $v$ and $T$ are obtained from the true state of the atmosphere at a fixed number of grid points in order to avoid interpolation errors. Random Gaussian noise is generated and added to the observations to simulate observational errors compatible with those of the real radiosonde stations of the global operational network.
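The generation of simulated observations in a perfect-model experiment can be sketched as follows. This is only an illustration of the idea: the one-dimensional "truth", the station indices and the error standard deviation are hypothetical values, not those of the experiments reported here.

```python
import numpy as np

def simulate_observations(truth, station_idx, obs_std, rng):
    """Sample the true field at station grid points and add Gaussian noise."""
    clean = truth[station_idx]                       # no interpolation: stations sit on grid points
    noise = rng.normal(0.0, obs_std, size=clean.shape)
    return clean + noise

rng = np.random.default_rng(42)
truth = np.sin(np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False))   # stand-in "true" field
station_idx = np.array([3, 17, 40, 58])                              # hypothetical station positions
y = simulate_observations(truth, station_idx, obs_std=0.1, rng=rng)
```

Because the truth is known, the analysis error can later be computed directly as the difference between the analysis and `truth`.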

Analyses are performed every 12 h, and the 12 h forecast is used as background for the 3D-Var. The system has been optimized for the number of stations used in this work by tuning the amplitude of the background covariance matrix $B$ originally developed by Morss (1998), so as to minimize the analysis and forecast errors of the 3D-Var. The optimization has been limited to a careful tuning of the amplitude of $B$ because Morss (1998) showed that the spectral coefficients of $B$ obtained for the original system are not sensitive to the density of the observational network.

In the original formulation, Morss (1998) defines the observation operator $H$ as the operator that transforms the model variables (grid point components of the potential vorticity and potential temperature) into "rawinsonde observations". The Poisson equation solver and its inverse (required to transform model potential vorticity into observed winds and temperatures and vice versa) are conveniently computed using a global spectral approach within the 3D-Var. As we will see in the next section, this method cannot be directly applied in the framework of the LEKF formulation, since the analysis is computed on local domains, and a different approach has been developed.

3 The implementation of the Local Ensemble Kalman Filter data assimilation system

3.1 The localization

The implementation of the Local Ensemble Kalman Filter has been performed following Ott et al. (2002). In the LEKF the representation of the state of the flow is decomposed over local domains around each grid point. A detailed description of the LEKF can be found in Ott et al. (2004), so here we only provide an overview of the approach, following the notation adopted in their paper.

In the 2-dimensional horizontal grid of the model, the state of the flow may be represented by means of a vector field $x(r, t)$, where $r$ runs over the grid points $r_{mn}$. At each grid point, $x$ is a vector of the state variables of the model (in our case potential vorticity and temperature) at all vertical levels. Let $U$ denote the dimension of $x(r, t)$ at a fixed $r$. In the framework of the QG model used in this work, where the forecast variables are potential vorticity in the five internal layers and potential temperature at the top and the bottom levels, $U=7$. However, as discussed in Sect. 3.5, the method adopted to locally handle the observation operator requires instead the alternate use of $u$, $v$, $T$ at all levels, so that in the implementation used in this work $U$ (in observation space) is equal to 21. We could have also localized in the vertical, as done in Szunyogh et al. (2005), but because of the low vertical resolution we have included a full column in each local volume.

Following the work of Patil et al. (2001) and Ott et al. (2002), we introduce at each grid point a local vector $x_{mn}$ defined as $x(r_{m+m',\,n+n'}, t)$ for $-l \le (m', n') \le l$. That is, $x_{mn}(t)$ contains the information of the state of the model over the local area of $(2l+1)$ by $(2l+1)$ grid points centered at $r_{mn}$, with a dimension equal to $(2l+1)^2 U$. Considering an ensemble of $k$ forecasts started from the previous analysis step, it is possible to construct the local vectors associated with each member of the ensemble (denoted by $x^b_{mn}$, where the superscript $b$ stands for "background"). As in Ott et al. (2002), $x^b_{mn}$ at time $t$ can be described by a probability distribution function $F_{mn}(x^b_{mn}, t)$, approximated by a Gaussian distribution identified by a local background error covariance matrix $P^b_{mn}$ and the most probable state $\bar{x}^b_{mn}$. Assuming that the dimension of the null space of $P^b_{mn}$ is large compared with the $k$ dimensional subspace orthogonal to the null space, and since $P^b_{mn}$ is symmetric, it is possible to compute an orthonormal set of $k$ eigenvectors $u^{(j)}_{mn}(t)$ with a corresponding set of $k$ non-negative eigenvalues $\lambda^{(j)}_{mn}(t)$, generally distinct and such that $\lambda^{(j)}_{mn}(t) > 0$.
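The construction of a local vector from a gridded field can be sketched as below. This is a minimal illustration, assuming the field is stored as an array of shape (east-west, south-north, $U$) and using periodicity only in the zonal index. How patches that would cross the northern or southern wall are treated is not specified in the text, so the hypothetical helper simply refuses them.

```python
import numpy as np

def local_vector(field, m, n, l):
    """Flatten the (2l+1) x (2l+1) patch of columns centred at grid point (m, n).

    field has shape (nx, ny, U); index m is periodic (longitude), index n is not.
    """
    nx, ny, U = field.shape
    if n - l < 0 or n + l >= ny:
        raise ValueError("patch crosses a wall; boundary handling is not sketched here")
    xs = np.arange(m - l, m + l + 1) % nx            # wrap around in the zonal direction
    patch = field[xs][:, n - l:n + l + 1, :]         # shape (2l+1, 2l+1, U)
    return patch.reshape(-1)                          # dimension (2l+1)^2 * U

# Grid and column sizes quoted in the text: 64 x 32 points, U = 21, l = 3.
field = np.arange(64 * 32 * 21, dtype=float).reshape(64, 32, 21)
x_local = local_vector(field, m=0, n=10, l=3)
```

With $l=3$ and $U=21$ the local vector has $49 \times 21 = 1029$ components, to be compared with the 30 ensemble members used to span its error space.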

In terms of $u^{(j)}_{mn}$ and $\lambda^{(j)}_{mn}$ the covariance matrix is given by

$$P^b_{mn}(t) = \sum_{j=1}^{k} x^{b(j)}_{mn}(t) \left( x^{b(j)}_{mn}(t) \right)^T \qquad (1)$$



where

$$x^{b(j)}_{mn}(t) = \sqrt{\lambda^{(j)}_{mn}(t)}\; u^{(j)}_{mn}(t). \qquad (2)$$

Working in the $k$ dimensional space $S_{mn}$ spanned by the vectors $u^{(j)}_{mn}(t)$ is particularly advantageous, since it is possible to represent $P^b_{mn}(t)$ as a diagonal matrix equal to

$$\hat{P}^b_{mn}(t) = \mathrm{diag}\left[ \lambda^{(1)}_{mn}, \lambda^{(2)}_{mn}, \ldots, \lambda^{(k)}_{mn} \right] \qquad (3)$$

so that its inverse is trivial (here and in the following text the hat represents vectors or matrices in the $S_{mn}$ space).
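Equations (1) to (3) can be checked numerically with a small sketch. The covariance matrix below is a hypothetical rank-$k$ matrix built from random numbers; only the algebra is meant to correspond to the text.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 12, 4                                   # hypothetical local dimension and rank
A = rng.standard_normal((n, k))
Pb = A @ A.T                                   # symmetric, rank-k stand-in for P^b_mn

eigvals, eigvecs = np.linalg.eigh(Pb)          # eigh returns eigenvalues in ascending order
lam = eigvals[::-1][:k]                        # the k leading (positive) eigenvalues
Q = eigvecs[:, ::-1][:, :k]                    # columns are the eigenvectors u^(j)

xb_modes = Q * np.sqrt(lam)                    # Eq. (2): column j is sqrt(lam_j) u^(j)
Pb_rebuilt = xb_modes @ xb_modes.T             # Eq. (1)

Pb_hat = Q.T @ Pb @ Q                          # Eq. (3): diagonal in the space S_mn
Pb_hat_inv = np.diag(1.0 / lam)                # the "trivial" inverse
```

The inverse needed by the analysis step is therefore obtained from $k$ scalar divisions rather than from the inversion of a large, singular matrix.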

Patil et al. (2001) used 30 pairs of ensembles of bred vectors from the NCEP system (Toth and Kalnay, 1993, 1997) and found that forecast errors in the mid-latitude extra-tropics tend to lie in a low dimensional subset of the $(2l+1)^2 U$ dimensional local vector space. For the system used in this work this result was also very apparent in Corazza et al. (2003). Thus it is possible to make the hypothesis that the dimension of $S_{mn}$ is rather low and that the error variance in all other directions is negligibly small compared to the variance

$$\sum_{j=1}^{k} \lambda^{(j)}_{mn} \qquad (4)$$

in the directions $u^{(j)}_{mn}$, $j = 1, 2, \ldots, k$.

3.2 Data assimilation

Following Ott et al. (2002), let $x^a_{mn}$ denote the random variable (analysis) at time $t$ representing the local vector after using the observations. If $y^o_{mn}(t)$ is the vector of observations within the local region, with errors normally distributed with covariance matrix $R_{mn}(t)$, the probability distribution of $x^a_{mn}$ is also Gaussian and is identified by the most probable state $\bar{x}^a_{mn}$ and the associated covariance matrix $P^a_{mn}(t)$. The data assimilation step determines $\bar{x}^a_{mn}$ (the local analysis) and $P^a_{mn}(t)$ (the local analysis covariance matrix). Similarly to a global 3D-Var data assimilation system, the solution to this problem can be written in the ensemble subspace $S_{mn}$ as follows:

$$\hat{\bar{x}}^a_{mn}(t) = \hat{\bar{x}}^b_{mn}(t) + \left[ \left(\hat{P}^b_{mn}(t)\right)^{-1} + \hat{H}^T_{mn} R^{-1}_{mn} \hat{H}_{mn} \right]^{-1} \hat{H}^T_{mn} R^{-1}_{mn} \left( y^o_{mn}(t) - \hat{H}_{mn} \hat{\bar{x}}^b_{mn}(t) \right) \qquad (5)$$

and

$$\hat{P}^a_{mn}(t) = \left[ \left(\hat{P}^b_{mn}(t)\right)^{-1} + \hat{H}^T_{mn} R^{-1}_{mn} \hat{H}_{mn} \right]^{-1} \qquad (6)$$

where $\hat{H}_{mn}$ is the representation in $S_{mn}$ of the local observation operator $H_{mn}$ (assumed to be linear), which maps the local vector $x_{mn}(t)$ to the local observations $y^o_{mn}(t)$. Going back to the local space representation we have

$$\bar{x}^a_{mn}(t) = Q_{mn}(t)\, \hat{\bar{x}}^a_{mn}(t) \qquad (7)$$

$$P^a_{mn}(t) = Q_{mn}(t)\, \hat{P}^a_{mn}(t)\, Q^T_{mn}(t) \qquad (8)$$

where $Q_{mn}$ is the $(2l+1)^2 U$ by $k$ matrix

$$Q_{mn}(t) = \left\{ u^{(1)}_{mn}(t) \,|\, u^{(2)}_{mn}(t) \,|\, \cdots \,|\, u^{(k)}_{mn}(t) \right\} \qquad (9)$$
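The local analysis of Eqs. (5) to (9) can be sketched as follows. Two choices in this sketch are assumptions made for the illustration and are not stated in the text: the observation operator in $S_{mn}$ is taken as $\hat{H}_{mn} = H_{mn} Q_{mn}$, and the $S_{mn}$ coordinates are measured relative to the background mean, so that the background mean maps to the origin and the analysis is returned as background plus increment. The function name and the random test data are hypothetical.

```python
import numpy as np

def local_analysis(xb_mean, Q, lam, H, R, yo):
    """Analysis mean and covariance computed in the k-dimensional ensemble space.

    Q: (n, k) orthonormal eigenvectors, lam: (k,) eigenvalues of the background
    covariance, H: (p, n) linear observation operator, R: (p, p), yo: (p,).
    """
    H_hat = H @ Q                                        # assumed form of the operator in S_mn
    R_inv = np.linalg.inv(R)
    Pa_hat = np.linalg.inv(np.diag(1.0 / lam) + H_hat.T @ R_inv @ H_hat)   # Eq. (6)
    dxa_hat = Pa_hat @ H_hat.T @ R_inv @ (yo - H @ xb_mean)                # increment of Eq. (5)
    xa_mean = xb_mean + Q @ dxa_hat                      # back to the local space, cf. Eq. (7)
    Pa = Q @ Pa_hat @ Q.T                                # Eq. (8)
    return xa_mean, Pa

rng = np.random.default_rng(0)
n, k, p = 10, 3, 4                                       # hypothetical sizes
Q, _ = np.linalg.qr(rng.standard_normal((n, k)))         # orthonormal columns
lam = np.array([3.0, 2.0, 1.0])
H = rng.standard_normal((p, n))
R = 0.5 * np.eye(p)
xb_mean = rng.standard_normal(n)
yo = rng.standard_normal(p)

xa_mean, Pa = local_analysis(xb_mean, Q, lam, H, R, yo)
```

Only $k \times k$ matrices are inverted, and the result agrees with the usual Kalman filter update computed with the background covariance $Q\,\mathrm{diag}(\lambda)\,Q^T$.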

3.3 Updating the background field

The analysis information $P^a_{mn}$ and $\bar{x}^a_{mn}$ can be used to obtain an ensemble of global analysis fields $x^{a(i)}(r, t)$, $i = 1, 2, \ldots, k$, as described below (see Ott et al., 2002). These fields represent the ensemble of initial conditions for the atmospheric model. By integrating the ensemble of global fields forward in time to the next analysis time $t + \Delta t$, it is possible to obtain the background ensemble $x^{b(i)}(r, t + \Delta t)$. This completes the analysis cycle which, if the procedure is stable, can be repeated for as long as desired. Thus at each analysis step a global initial condition is available to be used to compute forecasts of the desired duration.

The remaining task is to specify the global analysis fields for the members of the ensemble, $x^{a(i)}(r, t)$, starting from the analysis information $P^a_{mn}$ and $\bar{x}^a_{mn}$. Let

$$x^{a(i)}_{mn}(t) = \bar{x}^a_{mn}(t) + \delta x^{a(i)}_{mn}(t) \qquad (10)$$

denote $k$ local analysis vectors, where

$$\sum_{i=1}^{k} \delta x^{a(i)}_{mn}(t) = 0 \qquad (11)$$

and

$$P^a_{mn} = (k-1)^{-1} \sum_{i=1}^{k} \delta x^{a(i)}_{mn}(t) \left[ \delta x^{a(i)}_{mn}(t) \right]^T \qquad (12)$$

By Eq. (11) the local analysis state $\bar{x}^a_{mn}(t)$ represents the mean over the local analysis ensemble $x^{a(i)}_{mn}(t)$, and by Eq. (12) the error covariance matrix is computed from the vectors $\delta x^{a(i)}_{mn}(t)$. The vectors $\delta x^{a(i)}_{mn}(t)$ can be represented in the form

$$\delta x^{a(i)}_{mn}(t) = Y_{mn}(t)\, \delta x^{b(i)}_{mn}(t) \qquad (13)$$

where the matrix $Y_{mn}(t)$ can be thought of as a generalized "rescaling" of the original background fields, and $\delta x^{b(i)}_{mn}(t)$ are defined as

$$\delta x^{b(i)}_{mn}(t) = x^{b(i)}_{mn}(t) - \bar{x}^b_{mn}(t) \qquad (14)$$

and are such that

$$P^b_{mn}(t) = (k-1)^{-1} \sum_{i=1}^{k} \delta x^{b(i)}_{mn}(t) \left( \delta x^{b(i)}_{mn}(t) \right)^T \qquad (15)$$

Fig. 1. (a) Time evolution of the analysis error (rms) of potential vorticity at mid-level for the regular 3D-Var system (filled dots). The line with empty dots represents the 48 h forecast error. (b) Same as (a) for the reference LEKF system. Here, as in the other figures, variables are nondimensionalized following the relations described, for instance, in Rotunno and Bao (1996, page 1058, Sect. 3c).

There are many possible choices for $Y_{mn}(t)$ due to the nonuniqueness of the square root of matrices. Here we follow the choice made by Ott et al. (2004), making the hypothesis that the original background fields are a good representation of a physical state, so as to minimize the difference between the new analysis and the background ensembles. As a result, the solution for $Y_{mn}(t)$ is

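$$Y_{mn} = \left[ I + X^{bT}_{mn} \left(P^b_{mn}\right)^{-1} \left(P^a_{mn} - P^b_{mn}\right) \left(P^b_{mn}\right)^{-1} X^b_{mn} \right]^{1/2} \qquad (16)$$

where

$$X^b_{mn} = (k-1)^{-1/2} \left\{ \delta x^{b(1)}_{mn} \,|\, \delta x^{b(2)}_{mn} \,|\, \cdots \,|\, \delta x^{b(k)}_{mn} \right\} \qquad (17)$$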
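A numerical sketch of Eqs. (16) and (17) is given below. It rests on assumptions that go beyond the text and should be read with that in mind: the $k \times k$ matrix $Y_{mn}$ is applied to the ensemble index (the matrix of analysis perturbations is the matrix of background perturbations multiplied on the right by $Y_{mn}$), the inverse of the rank-deficient sample covariance is replaced by a pseudo-inverse, and the symmetric square root is used. The function names and test data are hypothetical.

```python
import numpy as np

def sym_sqrt(M):
    """Symmetric square root of a symmetric positive semi-definite matrix."""
    w, V = np.linalg.eigh((M + M.T) / 2.0)
    w = np.clip(w, 0.0, None)                    # remove round-off negatives
    return (V * np.sqrt(w)) @ V.T

def analysis_perturbations(dXb, Pa):
    """dXb: (n, k) background perturbations (rows sum to zero); Pa: (n, n)."""
    k = dXb.shape[1]
    Xb = dXb / np.sqrt(k - 1)                    # Eq. (17)
    Pb = Xb @ Xb.T                               # Eq. (15)
    Pb_pinv = np.linalg.pinv(Pb, rcond=1e-10)    # assumption: pseudo-inverse of singular Pb
    M = np.eye(k) + Xb.T @ Pb_pinv @ (Pa - Pb) @ Pb_pinv @ Xb
    Y = sym_sqrt(M)                              # Eq. (16)
    return dXb @ Y                               # assumption: Y acts on the ensemble index

rng = np.random.default_rng(2)
n, k, p = 8, 4, 3                                # hypothetical sizes
ens = rng.standard_normal((n, k))
dXb = ens - ens.mean(axis=1, keepdims=True)
Pb = dXb @ dXb.T / (k - 1)
H = rng.standard_normal((p, n))
R = np.eye(p)
Pa = Pb - Pb @ H.T @ np.linalg.inv(H @ Pb @ H.T + R) @ H @ Pb   # Kalman analysis covariance

dXa = analysis_perturbations(dXb, Pa)
```

Under these assumptions the new perturbations reproduce the analysis covariance of Eq. (12) and still sum to zero as required by Eq. (11).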

We can therefore build an ensemble of global fields $x^{a(i)}(r, t)$ that can be propagated forward in time to the next analysis time. For this purpose it is possible to set

$$x^{a(i)}(r_{mn}, t) = J x^{a(i)}_{mn}(t) \qquad (18)$$

where $J$ maps the $(2l+1)^2 U$ dimensional local vector to the $U$ dimensional vector updating the vertical profile of the model ($U=7$) at the grid point $r_{mn}$ at the center of the patch. Equation (18) is applied at every grid point $(m, n)$ of the atmospheric model. Thus Eq. (18) defines an ensemble of $k$ global analysis fields $x^{a(i)}(r, t)$ for $i = 1, 2, \cdots, k$.

3.4 Handling the stability of the system

It is known that a frequent problem that can arise from an ensemble of vectors that is not globally orthogonalized is its tendency to collapse toward a subspace that is too small. As a result, even for a perfect model, the background error covariance tends to be underestimated. This in turn underestimates the forecast error and therefore gives too little weight to the observations, which can lead to the divergence of the filter (Anderson, 2001). In order to avoid this problem, two different methods aimed at separating and enlarging the ensemble perturbations have been tested and compared in the implementation of the Local Ensemble Kalman Filter.

1. Multiplicative inflation: the amplitude of each perturbation vector is increased by a factor larger than 1. By applying this method no change in direction is added at the analysis step, but the amplitude of the forecast perturbation vectors is kept sufficiently large (Anderson, 2001).

2. Additive random perturbations: a global $H^T$ operator is applied to a set of $k$ random perturbation vectors in the observational space, with an amplitude comparable with the errors of the observations. The resulting fields are added to the global vectors of the ensemble. This makes it possible to "refresh" the ensemble at every analysis step (Annan, 2004) by introducing perturbations in new random directions, and therefore increasing the Local Ensemble Dimension (Patil et al., 2001; Oczkowski et al., 2005), which represents the effective number of independent directions present in an ensemble. This method (Corazza et al., 2002) effectively avoids the excessive convergence of bred vectors into a small dimensional space (Wang and Bishop, 2003) and is therefore an alternative solution to the rank-deficiency problem that causes filter divergence. We note that in practice this additive inflation can be implemented without requiring the use of the transpose of the observation operator (e.g., Hunt, 2005).

3. Methods 1) and 2) can be used at the same time in what we call "the combined system".

Fig. 2. (a) Shaded: example of background error for potential vorticity at mid-level at a fixed time (day=123) for the regular 3D-Var system. The contour plot represents the analysis increments and the red dots indicate the position of the radiosonde stations. (b) Same as (a) but for the LEKF. Note the different scales in the two figures.
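The difference between the two strategies listed above can be illustrated with a minimal sketch. It is a simplification: the random fields are drawn directly in state space rather than obtained by applying $H^T$ to perturbations in observation space, and the collapsed ensemble is a hypothetical example whose perturbations span only two directions. The factor 1.02 and the relative amplitude 0.05 are the values quoted in Table 1.

```python
import numpy as np

def inflate(ensemble, factor):
    """Multiplicative inflation: scale the perturbations about the ensemble mean."""
    mean = ensemble.mean(axis=1, keepdims=True)
    return mean + factor * (ensemble - mean)

def add_random_perturbations(ensemble, rel_amplitude, rng):
    """Add random fields with rms equal to rel_amplitude times the rms perturbation."""
    pert = ensemble - ensemble.mean(axis=1, keepdims=True)
    rms = np.sqrt(np.mean(pert ** 2))
    noise = rng.standard_normal(ensemble.shape)
    noise -= noise.mean(axis=1, keepdims=True)          # leave the ensemble mean unchanged
    noise *= rel_amplitude * rms / np.sqrt(np.mean(noise ** 2))
    return ensemble + noise

def perturbation_rank(ensemble):
    """Number of independent directions spanned by the ensemble perturbations."""
    pert = ensemble - ensemble.mean(axis=1, keepdims=True)
    return np.linalg.matrix_rank(pert, tol=1e-8)

rng = np.random.default_rng(7)
n, k = 20, 6
# Hypothetical collapsed ensemble: all perturbations lie in a 2-dimensional subspace.
ensemble = 5.0 + rng.standard_normal((n, 2)) @ rng.standard_normal((2, k))

inflated = inflate(ensemble, 1.02)
refreshed = add_random_perturbations(ensemble, 0.05, rng)
```

Inflation enlarges the spread but leaves the number of directions at two, whereas the additive perturbations restore the maximum of $k-1$ directions, which is the behavior invoked in the text to explain the different long-term stability of the two methods.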

3.5 Handling the Poisson Solver in the local domain

As mentioned in Sect. 2, the observation operator $H$ performs the transformation from potential vorticity and potential temperature in the model space to horizontal wind velocity and temperature in the observation space. The introduction of $H$ and $H^T$ requires the solution of a Poisson equation that, in the original system used by Morss (1998), is solved by means of global spectral transforms with two boundary conditions: (1) periodic solution in the zonal direction and (2) rigid lateral boundaries at the northern and southern walls. When we work locally, as in the LEKF, this global spectral approach is not feasible. A solution to this problem is found by splitting the observation operator into two parts

$$H = \tilde{H}\,\bar{H} \qquad (19)$$

where $\tilde{H}$ is the operator mapping $u$, $v$ and $T$ defined on the model grid to the observation locations, and $\bar{H}$ is the operator transforming potential vorticity and potential temperature into $u$, $v$ and $T$ over the global model grid.

Localizing $\tilde{H}$ is trivial, since only an interpolation process has to be carried out. In our experiments this process is even simpler, since the observation locations are on the grid points. Thus it is possible to apply $\bar{H}$ once at the beginning of the data assimilation step to all the vectors, in order to compute global vectors for $u$, $v$ and $T$ over the model domain. The computation of the new analysis and perturbations can therefore be carried out considering these as the model variables. At the end of the process the new global vectors in $u$, $v$ and $T$ are transformed back to the potential vorticity and potential temperature fields needed by the quasi-geostrophic model.
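The splitting of the observation operator can be illustrated with a small sketch in which a random matrix stands in for the global transformation (the actual transformation requires the Poisson solver and is not reproduced here) and a selection of grid points stands in for the interpolation. All names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n_grid = 16
H_global = rng.standard_normal((n_grid, n_grid))   # stand-in for the global transformation
station_idx = np.array([2, 7, 11])                  # observations located on grid points

# Interpolation part: with stations on grid points it is a pure selection matrix.
H_select = np.zeros((station_idx.size, n_grid))
H_select[np.arange(station_idx.size), station_idx] = 1.0

q = rng.standard_normal(n_grid)       # model variables (potential vorticity, temperature)
uvt = H_global @ q                    # global part, applied once per assimilation step
y_local = uvt[station_idx]            # local part: plain indexing inside each local domain
y_full = (H_select @ H_global) @ q    # the unsplit operator, for comparison
```

The two results coincide, so the expensive global step can be done once for every ensemble member, after which each local analysis only needs to pick the values it requires.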

Table 1 summarizes the parameters used for the configuration of the LEKF. The optimal parameters shown in this table are used unless otherwise noted.

4 Results

4.1 Comparison between the 3D-Var and the LEKF

The time evolution of the analysis root mean square error of the potential vorticity, as well as of the 48 h forecast error, for the optimized version of the 3D-Var scheme is presented in Fig. 1a, plotted every 12 h for about a year (711 values). The system is stable in time, even though the presence of particularly large maxima (e.g., around days 50 and 120) indicates significant variability of the errors in time. This is an example of the importance of the "errors of the day", i.e. the variability of the errors in space and time discussed in the introduction, which can be considered an intrinsic property of the stability of the flow. The results shown in Fig. 1a are consistent with those obtained using the original 3D-Var system developed for this quasi-geostrophic model (see Morss, 1998).

The correspondence of large forecast errors with peaks in the analysis errors valid at the same time is a clear indication that these errors are linked to the most unstable modes of the flow. This implies that their correction is important not only to reduce the amplitude of the fluctuations of the analysis errors, but also to delay the fast growth of the most unstable (and therefore most important) errors of the flow. The corresponding results for the LEKF (Fig. 1b), discussed in the next subsection, clearly indicate that most large "errors of the day" are eliminated.

Table 1. List of the optimal settings of the LEKF system used for the simulations presented in this work, unless otherwise noted. In particular, we have adopted a local domain characterized by a horizontal base of 7×7 points ($l=3$) with no vertical localization (7 layers). Since the data assimilation variables are $u$, $v$ and $T$, $U=21$ (number of variables in a column, Sect. 3.1). Reference simulations have been performed using 30 ensemble members ($k=30$) and constructing the final global fields as an average of all the analysis fields available from different local domains ("multi column" method, Sect. 4). The average amplitude of the random perturbations added to the ensemble members at the end of the analysis step is 5% of the average amplitude of the ensemble vectors (Sects. 3.4 and 4). Finally, when multiplicative inflation is used in the combined method, the ensemble vectors are multiplied by a factor equal to 1.02 at the end of the data assimilation step (Sect. 3.4).

Horizontal grid                                             64×32 points
Vertical levels                                             5 plus top and bottom
Horizontal dimension of the local domain                    l=3
Number of variables in a column (u, v and T in 7 layers)    U=21
Number of members in the ensemble                           k=30
Method to update the global field                           average ("multi column")
Amplitude of the random perturbations                       ∼5% of the average amplitude of the ensemble vectors
Inflation factor for the combined system                    1.02

The Local Ensemble Kalman Filter implementation (Sect. 3) uses the same observational settings as the optimized 3D-Var system. Unless otherwise noted, the number of ensemble members is equal to 30. Two methods for updating the new global vectors were tested (see Sect. 3.3). In the first one, only the central column of the local domain is updated, so that each point of the global domain is modified only once ("central column method"). In the second method, all the points within the domain of each local assimilation are updated, so that the final analysis global vector is represented by an average of the $(2l+1)^2$ values ("multi column method"). Only results for the multi column method are presented, since they were slightly better than those of the central column method.
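The two ways of assembling a global field from the local analyses can be sketched as follows. The data layout (a dictionary of patches indexed by their central grid point) and the treatment of the walls (patch points that fall outside the channel are simply ignored) are assumptions of this illustration, as are the small grid sizes.

```python
import numpy as np

def assemble_global(local_analyses, nx, ny, l, method="multi"):
    """Build a global field from local patches of shape (2l+1, 2l+1, U).

    local_analyses[(m, n)] is the patch centred at grid point (m, n).
    """
    U = next(iter(local_analyses.values())).shape[-1]
    total = np.zeros((nx, ny, U))
    count = np.zeros((nx, ny, 1))
    for (m, n), patch in local_analyses.items():
        if method == "central":
            total[m, n] += patch[l, l]               # only the central column is used
            count[m, n] += 1
            continue
        for a in range(-l, l + 1):
            for b in range(-l, l + 1):
                j = n + b
                if 0 <= j < ny:                      # ignore points beyond the walls
                    i = (m + a) % nx                 # periodic in the zonal direction
                    total[i, j] += patch[a + l, b + l]
                    count[i, j] += 1
    return total / count                             # average of all available values

# Consistency check with patches cut from a known, hypothetical field.
rng = np.random.default_rng(5)
nx, ny, U, l = 8, 6, 2, 1
field = rng.standard_normal((nx, ny, U))
patches = {}
for m in range(nx):
    for n in range(ny):
        p = np.zeros((2 * l + 1, 2 * l + 1, U))
        for a in range(-l, l + 1):
            for b in range(-l, l + 1):
                if 0 <= n + b < ny:
                    p[a + l, b + l] = field[(m + a) % nx, n + b]
        patches[(m, n)] = p

central = assemble_global(patches, nx, ny, l, method="central")
multi = assemble_global(patches, nx, ny, l, method="multi")
```

When the local analyses are mutually consistent, as in this check, both methods return the same field. In an actual assimilation neighbouring local analyses differ slightly, and the averaging of the multi column method smooths those differences.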

Several tests were performed using different values of multiplicative inflation, and we found that this method was initially able to improve significantly the simulation results (during the first 100 analysis steps). However, a slow increase of the error was observed in the second part of the simulation for a wide range of inflation values, suggesting that, at least for this system, inflation alone is not enough to keep the dimension of the space spanned by the ensemble vectors sufficiently large. This result is not necessarily representative of the behavior of a real, more complex system (e.g., Szunyogh et al., 2005), where multiplicative inflation was satisfactory. In the present quasi-geostrophic system it is possible that the amplitude of the vectors is sufficiently small to keep their behavior close to linear. Under this assumption the inflation method gives more weight to the observations, but only within the same subspace of the ensemble perturbations, which with time may become too small.

A marked improvement in the stability of the LEKF with time was obtained by the use of additive random perturbations instead of inflation. Unless otherwise noted, all the results shown hereafter are obtained using this method. Tests using the combination of additive perturbations and inflation were slightly better compared to those of the random perturbations alone. A similar strategy was found optimal in Yang et al. (2007)¹.

Fig. 3. Shaded: example of background error for potential vorticity at mid-level at the same time as Figs. 2a and b (day=123) for the LEKF system. The contour plots represent one vector of the ensemble.

¹ Yang, S. C., Corazza, M., Carrassi, A., Kalnay, E., and Miyoshi, T.: Comparison of ensemble-based and variational-based data assimilation schemes in a quasi-geostrophic model, in preparation, 2007.

Fig. 4. (a) Scatterplot of analysis error of potential vorticity at mid-level for the regular 3D-Var and for the LEKF; (b) same as (a) for streamfunction; (c) same as (a) for the 72 h forecast.

Fig. 5. Forecast evolution of the average squared error for the 3D-Var and the LEKF system. Horizontal axis: forecast time (0 to 72); vertical axis: RMS error; curves: 3D-Var, LEKF 20, LEKF 30.

Optimal results have been obtained by adding random perturbations to the analyses at the observation points, following the procedure described in Sect. 3.4. The standard deviation of the perturbations is equal to about 5% of the average amplitude of the ensemble vectors. Since the average error of the analysis decreases with the density of the observational network, and the amplitude of the ensemble vector spread decreases as well, it might be thought that, in order to keep the system optimal, the amplitude of the random perturbations also needs to change with observational density. However, an important result of this work is that the ratio between the amplitudes of the optimal random perturbations and the ensemble vector spread does not depend on the observational density and is equal to about 5%.
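A minimal sketch of this amplitude rule follows, assuming that the "average amplitude" is the rms of the ensemble perturbations and that the noise is Gaussian. The function name, the `obs_mask` argument and the default `ratio` are illustrative choices; the procedure of Sect. 3.4 is not reproduced here.

```python
import numpy as np

def add_random_perturbations(analysis_ens, ratio=0.05, obs_mask=None, rng=None):
    """Add zero-mean Gaussian noise to each analysis member.

    analysis_ens: array (n_state, n_members).
    ratio: noise standard deviation as a fraction of the rms amplitude
           of the ensemble perturbations (assumed definition of amplitude).
    obs_mask: optional boolean array (n_state,), True where noise is added
              (a stand-in for "at the observation points").
    """
    rng = np.random.default_rng() if rng is None else rng
    perts = analysis_ens - analysis_ens.mean(axis=1, keepdims=True)
    amplitude = np.sqrt(np.mean(perts**2))
    noise = ratio * amplitude * rng.standard_normal(analysis_ens.shape)
    if obs_mask is not None:
        noise = noise * np.asarray(obs_mask, dtype=float)[:, None]
    return analysis_ens + noise, amplitude
```

Because the noise is tied to the ensemble spread through a fixed ratio, its absolute amplitude shrinks automatically when a denser network reduces the spread.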

As indicated before, Fig. 1b shows the same analysis and 2-day forecast errors as Fig. 1a, but for the LEKF system. Compared to 3D-Var there is a marked improvement in both the analysis and the forecasts. In particular, the analysis error peaks are reduced, with the consequence that the corresponding forecast errors are much smaller as well. As will be discussed further, this indicates that the LEKF system is capable of correcting the errors lying in the directions of the fast growing modes.

Figure 2a shows an example of background errors (color shaded) for potential vorticity at mid-level obtained by means of the regular 3D-Var scheme. The corresponding analysis increment is shown by contours. The location of the rawinsonde observations is marked by red dots. This time (day=123) has been chosen because it is representative of relatively large error maxima for both the 3D-Var and the LEKF systems (see Figs. 1a and b, respectively). As expected, the analysis increments generally correct the errors according to the information provided by the observations, with the amplitude of the increments decreasing with the distance to the observation. However, the 3D-Var corrections do not take into consideration the shape of the background error, since the constant 3D-Var background error covariance matrix is built as the time average of many error estimations.

The performance of the LEKF in computing the analysis increments is shown in Fig. 2b. As anticipated at the beginning of this section, the representation of the background covariance matrix by means of the ensemble of vectors allows the creation of analysis increments with a shape similar to that of the background error. This is in contrast to 3D-Var, where the analysis increments are isotropic rather than stretched toward the errors of the day, because the background error covariance matrix has been obtained from a time average of estimated forecast errors (the "NMC method" of Parrish and Derber, 1992). In the LEKF the matrix is locally built from the vectors of the ensemble, each of which contributes to the estimation of the errors of the day. This can be seen in Fig. 3, where one of the vector perturbations after a data assimilation step is plotted on top of the background error. The similarity of the analysis increment shown in Fig. 2b with this vector is apparent.
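The contrast between the two kinds of increment can be reproduced in a one-dimensional toy problem. The sketch below uses the textbook Kalman gain with a single observation; all sizes, the observation position, the error variance `r` and the Gaussian length scale of the static matrix `B` are assumptions made for illustration, not values from the experiments.

```python
import numpy as np

rng = np.random.default_rng(1)
n_state, n_members = 40, 6            # illustrative sizes only

# Background ensemble perturbations (columns), scaled so that Pb = Xb Xb^T.
Xb = rng.standard_normal((n_state, n_members))
Xb -= Xb.mean(axis=1, keepdims=True)
Xb /= np.sqrt(n_members - 1)
Pb = Xb @ Xb.T

# One observation of state element 10 with error variance r (hypothetical).
H = np.zeros((1, n_state))
H[0, 10] = 1.0
r = 0.1
innovation = np.array([0.5])          # y_o - H x_b, made up for the example

# Ensemble-based gain K = Pb H^T (H Pb H^T + R)^-1 and its increment.
K = Pb @ H.T @ np.linalg.inv(H @ Pb @ H.T + r * np.eye(1))
increment = K @ innovation

# The increment is a linear combination of the ensemble perturbations.
coeffs, *_ = np.linalg.lstsq(Xb, increment, rcond=None)
residual = np.linalg.norm(Xb @ coeffs - increment)

# Static, isotropic covariance (3D-Var-like): Gaussian in distance only.
idx = np.arange(n_state)
B = np.exp(-0.5 * ((idx[:, None] - idx[None, :]) / 3.0) ** 2)
K_static = B @ H.T / (H @ B @ H.T + r)
inc_static = K_static @ innovation    # symmetric about the observation
```

The ensemble increment inherits the shape of the perturbations, while the static increment depends only on the distance from the observation.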

Figure 4a is a scatterplot of the space averaged analysis errors at mid-level for the LEKF and the 3D-Var, respectively. It shows a consistent reduction in the LEKF analysis errors compared to 3D-Var. Most of the large errors from 3D-Var are reduced or completely removed. The only exception in this figure, where errors from the two schemes have similar amplitude, corresponds to the first part of the simulation, which can be considered a transition phase (see Figs. 1a and b). A large part of the improvement (about 40% reduction in rms error) is due to the reduction of the error maxima. These results are similar for the space averaged analysis errors in streamfunction at mid-level reported in Fig. 4b. For this variable the improvement is even larger than for potential vorticity: the larger error peaks are significantly reduced and the relative average improvement in the rms error is about 50%. In Fig. 4c the scatterplot of the forecast errors for the 72 h forecast is shown for LEKF and 3D-Var. It is important to notice that the improvement gained at the analysis is maintained during the forecast (Fig. 5), further confirming that the errors corrected by the LEKF are those responsible for the large growth of forecast errors. In Fig. 1 it can be seen that the peaks in analysis errors and 48 h forecast errors occur at the same time. This indicates that fast growing "errors of the day" are present in the initial conditions 48 h before attaining their large amplitude, and that the LEKF evidently succeeds in correcting these small amplitude, fast growing analysis errors before they have a chance to grow into the forecast.

Fig. 6. Power spectrum of the true state (top panels) for potential enstrophy, kinetic energy and streamfunction, respectively. The same quantities for the analysis errors of 3D-Var and LEKF are represented in the bottom panels. Figures are computed using data from step 361 to step 720 (180 days). Panel titles: PV, KE and SF of the true state (top); PV, KE and SF of the errors (bottom). Horizontal axes: global wavenumber (Nwv); vertical axes: power spectrum; curves in the bottom panels: LEKF, 3DVAR.

The top panels of Fig. 6 represent the power spectrum coefficients for potential enstrophy, kinetic energy and streamfunction, computed from the true state, plotted against the global wavenumber. Following Morss et al. (2001), the global wavenumbers are computed as

$$N_{WV} = \left( (2.5 \times k)^2 + (5.2 \times 0.5 \times l)^2 \right)^{1/2},$$

where k is the zonal wavenumber and l is the meridional half wavenumber, while the coefficients 2.5 and 5.2 are introduced in order to make the global wavenumber comparable to that of the "real" global atmosphere. The distributions of the power spectrum coefficients of the three quantities are similar. Most of the spectrum amplitude is concentrated at wavenumbers smaller than 20. The corresponding spectrum coefficients for the analysis errors are shown in the bottom panels of Fig. 6. LEKF has smaller errors than 3D-Var for all wavenumbers and shows a very large improvement for all scales in the kinetic energy field. The improvement for potential vorticity is distributed particularly in mid to small scales, while the improvement in the streamfunction appears at the larger scales. The large improvement in the KE field benefits directly from the fact that the assimilation variables are u, v and T (Sect. 3). The 3D-Var background covariance matrix is able to successfully correct large wavelengths, but its performance becomes much worse as the wavenumber increases. This is due to the fact that information on small scales in the 3D-Var background error covariance is lost in the statistical average. By contrast, LEKF is able to reduce errors at all wavelengths, so that its error spectrum is similar to the spectrum of the true state. The forecast error is dominated by local structures, and the results suggest that the background error covariance in LEKF represents these errors well.
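As a worked illustration of the global wavenumber, the sketch below evaluates it for a few (k, l) pairs. It assumes that the coefficients read 2.5 and 5.2 and that the factor 0.5 converts the meridional half wavenumber into a full wavenumber; the function and parameter names are hypothetical.

```python
import math

def global_wavenumber(k, l, zonal_scale=2.5, merid_scale=5.2):
    """Global wavenumber from the zonal wavenumber k and the meridional
    half wavenumber l (the factor 0.5 turns l into a full wavenumber)."""
    return math.hypot(zonal_scale * k, merid_scale * 0.5 * l)

# A purely zonal wave with k = 4 maps to global wavenumber 10.
for k, l in [(4, 0), (0, 2), (4, 2)]:
    print(k, l, round(global_wavenumber(k, l), 2))
```

Under these assumptions, the threshold of global wavenumber 20 quoted in the text corresponds to a zonal wavenumber of 8 for purely zonal waves.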



Fig. 7. (a) Analysis and forecast errors averaged in space and time for potential vorticity at mid-level for the LEKF system as the number of vectors used in the ensemble varies. Cases for 10, 15, 20, 25, 30 and 40 vectors are reported. (b) Same as (a), but the forecast errors are shown as the horizontal dimension of the local domains varies. Cases for l=2, 3, 4 and 5 are reported. In both figures the 3D-Var analysis error is shown for reference. Horizontal axes: number of vectors (a) and dimension of the local domain l (b); vertical axes: RMS error; curves: analysis, 24 h forecast, 48 h forecast, 72 h forecast, 3D-Var analysis.

4.2 Sensitivity of the LEKF to choice of parameters

As could be expected, LEKF results are sensitive to the number of vectors forming the ensemble. In Fig. 7a the error averaged in space and time for potential vorticity at mid-level is shown for the LEKF system running with 10, 15, 20, 25, 30 and 40 vectors, using a fixed size of the local domain (l equal to 3). The results with 3D-Var are also shown for reference. It can be seen that, for this system, 10 vectors are not sufficient to improve upon 3D-Var. The performance of the LEKF improves with the number of ensemble members, but only up to 30 (chosen as reference in the previous results). The reduction of errors with ensemble size converges, and no further significant improvements are observed with 40 members. An ensemble of 30 members has far fewer degrees of freedom than the model; this result supports the hypothesis that the local subspace of the most unstable modes is small, and that the LEKF could be used for operational purposes with a reasonable computational effort.

Another important parameter in the LEKF is the size of the local volumes ("patches") used in the localization. Figure 7b shows the analysis and forecast errors for different patch sizes (l=2, 3, 4, 5, corresponding to squares of 5×5, 7×7, 9×9 and 11×11 grid points, respectively) using an ensemble of 30 members. The analysis errors improve with size up to l=4 (9×9 grid points) and then they become slightly worse. This can be explained by the fact that, when the number of points of each local domain increases, the local dimension of the system also increases, and the number of vectors required to describe well the local instabilities of the system eventually needs to be larger. In other words, the larger the local dimension, the larger the sampling errors in the representation of the background errors with a limited number of ensemble perturbations.
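The growth of the local dimension with l can be made explicit with a short calculation. The sketch below assumes square patches of side 2l+1 and 5 vertical levels per patch (the 7×7×5 domain quoted for l=3), and counts one value per grid point; the function names are illustrative.

```python
def patch_side(l):
    """Side length, in grid points, of the square patch for parameter l."""
    return 2 * l + 1

def local_dimension(l, n_levels=5):
    """Number of grid-point values in one local domain, all levels included."""
    return patch_side(l) ** 2 * n_levels

n_members = 30
for l in (2, 3, 4, 5):
    dim = local_dimension(l)
    # Ratio of local dimension to ensemble size grows quadratically with l.
    print(l, f"{patch_side(l)}x{patch_side(l)}", dim, round(dim / n_members, 1))
```

With 30 members, the number of local values per ensemble vector roughly quadruples between l=2 and l=5, which is consistent with larger sampling errors for the largest patches.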

This interpretation is supported by an analysis of the ability of the ensemble of vectors to describe the background error. In Fig. 8 the error variance explained by the ensemble (relative amplitude of the projection of the background error onto the subspace spanned by the ensemble of vectors) is shown. This is done by projecting, at each local domain, the local vector of the values of the error in potential vorticity onto the subspace spanned by the local vectors representing the n members of the ensemble. The ratio between the amplitude of this projection and the total amplitude of the error is computed, then averaged over all the local domains and represented in the plots. The process is then repeated at each data assimilation step. Two cases are shown: in Fig. 8a the variation of the projected component is represented using a fixed dimension of the local domain with a varying number of vectors; in Fig. 8b the number of vectors is fixed and the size of the local domains is varied. The percentage of the error represented by the ensemble is on average of the order of 90%, and it increases with the number of ensemble members and decreases with the dimension of the local domain.
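The projection diagnostic can be sketched with a least-squares fit, which gives the orthogonal projection of the error onto the span of the local ensemble vectors. The function names and array layouts below are assumptions made for the example, not the code used for Fig. 8.

```python
import numpy as np

def explained_fraction(error, ens_perts):
    """Ratio |projection of error onto span(ens_perts)| / |error|.

    error: (n_local,) background error in one local domain.
    ens_perts: (n_local, n_members) local ensemble vectors as columns.
    """
    coeffs, *_ = np.linalg.lstsq(ens_perts, error, rcond=None)
    projection = ens_perts @ coeffs
    return np.linalg.norm(projection) / np.linalg.norm(error)

def mean_explained_fraction(errors, perts):
    """Average of the ratio over a sequence of local domains."""
    return float(np.mean([explained_fraction(e, X) for e, X in zip(errors, perts)]))
```

For a given error vector, adding members to the ensemble can only enlarge the subspace, so the ratio cannot decrease, in line with the dependence on ensemble size described above.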

It is interesting to see how the ensemble's ability to explain the background error depends on the amplitude of the error. In Fig. 9 the relative amplitude of the projection of the background error at a fixed time (the same as in Figs. 2a, 2b and 3) is compared with the total amplitude of the error (contour plot). It clearly shows that the percentage of the vector projected onto the subspace varies substantially in space, and that the larger the error, the better its projection on the ensemble. This important property of the LEKF indicates that the projection of the largest errors of the system onto the ensemble is even larger than what is suggested by the average values shown in Fig. 8, and explains the success of the LEKF in reducing the analysis errors even with relatively few ensemble members (in agreement with Szunyogh et al., 2005).



Fig. 8. Time evolution of the projection of the background error in potential vorticity on the subspace spanned by the ensemble of local vectors. (a) The dimension of the local domain is constant (l=3) and the number of vectors varies (N=20, 30, 40). (b) The number of vectors is constant (N=30) and the dimension of the local domain varies (l=2, 3, 4). Horizontal axes: time (days 1 to 351); vertical axes: projected fraction (0 to 1); curves: N30L3, N20L3, N40L3 (a) and N30L3, N30L2, N30L4 (b).

The sequential methods for assimilation of observations (Houtekamer and Mitchell, 1998; Tippett et al., 2002) use a "localization" of the error covariance obtained by multiplying the correlation between the forecast errors at grid points and at observation points by a Gaussian function of the distance between these points. This has the effect of reducing the covariance sampling errors at long distances. By contrast, the LEKF uses a step function (1 within the local domain and zero outside). Miyoshi (2005) found that this made the LEKF slightly worse than the Ensemble Square Root Filter (EnSRF) of Whitaker and Hamill for the SPEEDY global model (Molteni, 2003). He followed a suggestion of Hunt (2005) to apply instead the inverse of the Gaussian localization operator to the R matrix, in such a way that

$$\tilde{R} = e^{\frac{(r - r_o)^2}{2\sigma^2}} \circ R \qquad (20)$$

where σ is of the order of l, r − r_o is the distance from the center of the domain, and the symbol ∘ indicates a Schur (or Hadamard) product. The effect of this is to increase the effective observation error for observations far away from the central analysis point, achieving the same result as with the standard localization. We tested this approach and found a very small improvement (about 2%) when using the "multi-column" approach.
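A minimal sketch of this inverse-Gaussian weighting is given below. For a diagonal R it simply multiplies each observation error variance by exp((r − r_o)²/(2σ²)). The extension to off-diagonal entries through the weight sqrt(w_i w_j), like the function name, is an assumption made here to keep the result symmetric; it is not taken from Eq. (20).

```python
import numpy as np

def localize_obs_error(R, distances, sigma):
    """Inflate observation errors with distance from the patch centre.

    R: (p, p) observation error covariance of the local domain.
    distances: (p,) distance r - r_o of each observation from the centre.
    sigma: length scale, of the order of l in the text.
    """
    d = np.asarray(distances, dtype=float)
    w = np.exp(d**2 / (2.0 * sigma**2))   # inverse Gaussian weight, >= 1
    W = np.sqrt(np.outer(w, w))           # assumed symmetric weight matrix
    return W * np.asarray(R)              # Schur (element-wise) product

R = 0.1 * np.eye(3)
print(np.diag(localize_obs_error(R, distances=[0.0, 3.0, 6.0], sigma=3.0)))
```

An observation at the centre keeps its original error variance, while one at a distance of 2σ has its variance multiplied by e², so it contributes much less to the local analysis.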

Finally, we discuss the vertical localization of the domain. In all the experiments performed in the framework of this work, the local domain has been computed including all the vertical levels of the model, without vertical decomposition of the domain. Nevertheless, it is interesting to explore whether further improvements could be expected from vertical decomposition of the local domain. For this decomposition to be useful, it is necessary to show that there is not a high correlation between the background errors at different levels. In Table 2 the average vertical correlation between the background errors at each pair of levels of the local domain is shown. The correlation is computed considering each local domain and then averaged in space and time. The values presented in Table 2 do not depend significantly on the detailed implementation of the system. Results clearly show a significant correlation between each level and its closest neighbors, while the correlation with the second neighbor and levels further away is much smaller. This is a clear indication that, for an optimal configuration of this system, vertical localization should be introduced, with local domains based on 3 vertical levels.
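The averaging procedure can be sketched as follows, assuming the background errors are stored as one array per local domain and time, with levels as rows. The function name and the synthetic data, in which each level partly inherits the error of the level below, are purely illustrative and do not reproduce the values of Table 2.

```python
import numpy as np

def mean_vertical_correlation(errors):
    """Average level-to-level correlation of background errors.

    errors: array (n_samples, n_levels, n_points); each sample is one local
    domain at one time, with n_points horizontal points per level.
    Returns an (n_levels, n_levels) matrix averaged over the samples.
    """
    corr = [np.corrcoef(sample) for sample in errors]   # rows are levels
    return np.mean(corr, axis=0)

# Toy data: 5 levels, 49 points per level (one 7x7 patch).
rng = np.random.default_rng(0)
n_samples, n_levels, n_points = 200, 5, 49
errors = np.empty((n_samples, n_levels, n_points))
errors[:, 0] = rng.standard_normal((n_samples, n_points))
for k in range(1, n_levels):
    fresh = rng.standard_normal((n_samples, n_points))
    errors[:, k] = 0.5 * errors[:, k - 1] + np.sqrt(0.75) * fresh

C = mean_vertical_correlation(errors)
print(np.round(C, 2))
```

In this toy case the correlation decays with level separation, which is the kind of structure that would motivate local domains limited to a few adjacent levels.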

5 Summary and conclusions

In this work we have implemented the Local Ensemble Kalman Filter (LEKF) described in Ott et al. (2004) for a quasi-geostrophic model widely used in the literature. Results obtained with the LEKF have been compared to those obtained by means of an optimized version of a 3D-Var, showing a marked improvement both in the analyses and the forecasts. As pointed out in the introduction, the optimization of the 3D-Var did not include an attempt to generate a flow dependent background error covariance matrix B, which could have reduced the improvement of the LEKF. A more general comparison between different data assimilation systems, including 4D-Var and hybrid systems, is presented in Yang et al. (2007, see footnote 1). In order to adapt to the local framework of LEKF in the presence of model global spectral transforms, we proposed an alternative split of the observation operator that can be easily implemented in operational setups.

Different configurations of the LEKF system have been considered and compared to the 3D-Var. With an ensemble of 30 vectors, best results have been obtained using a horizontal dimension of the local domains of 9×9 grid points, which allows for the inclusion of a sufficient number of observations while leaving the dimension of the local space small enough to be described by a small set of vectors. The fact that, using a relatively small ensemble, an optimal value of the horizontal dimension of the local domain does exist is an indication of the need to find a compromise between including the maximum observational information and limiting the dimension of the local vectors, in order to let the ensemble members provide an optimal description of the instabilities of the system. By contrast, using a fixed local domain (l=3), results show a monotonic improvement as the number of ensemble members increases, with saturation at about 30 members, indicating that a relatively small ensemble is sufficient to provide a good description of the instabilities and therefore of the errors.

Fig. 9. Example showing the relationship between the percentage projection of the background error onto the subspace spanned by the local vectors (colors) and the background error (contours). The time chosen is the same as that of Figs. 2a and b and 3, the number of vectors used is n=30, and the dimension of the local domain is 7×7×5 grid points (l=3).

In order to get stable results, it is necessary to prevent the forecast perturbations from collapsing with time into a subspace which is too small. For the quasi-geostrophic model, the use of additive random perturbations after each data assimilation step gives much better results than multiplicative inflation, which in fact fails to prevent filter divergence.

For this system, the introduction of the localization of the observational error covariance matrix R tested by Miyoshi (2005), based on a suggestion of Hunt (2005), does not result in a significant improvement. This result may not be general and may be associated with the fact that we obtained the best simulations by means of the "multi-column" method, so that the contribution of the localization is less important than in the case of the "central column update" method.

It is important that the entire relative improvement obtained for the analysis is maintained throughout the 3-day forecasts, indicating that the system is indeed correcting the analysis errors lying in the directions of the fast growing modes.

Table 2. Time and space average of the vertical correlation of the background errors in potential vorticity at different levels of the local domain. Level 1 represents the bottom level, level 5 the top level.

Level   1      2      3      4      5
1       1.00   0.46   0.17   0.10   0.10
2       0.46   1.00   0.35   0.17   0.14
3       0.17   0.35   1.00   0.47   0.33
4       0.10   0.17   0.47   1.00   0.63
5       0.10   0.14   0.33   0.63   1.00

The vertical correlation of background errors (Table 2) shows that the correlation drops quickly beyond the closest level. This suggests that the LEKF system used in this work could be improved by introducing a vertical localization of the domain, in particular considering local domains based on 3 vertical levels.

Our results are very encouraging, indicating that LEKF is much more accurate than 3D-Var even with a rather small ensemble size. Some areas that still require further study are:

1. the use of a flow-dependent amplitude of the additive random perturbations;

2. the performance of the system during the initial spinup, especially if the starting point of the simulation is far from the true state of the flow;

3. the relationship between additive or multiplicative inflation and the observational density, in particular when the observational network is not homogeneous.

Acknowledgements. We are grateful to the University of Maryland Chaos/Weather group, and in particular to E. Ott, for suggesting the application of the LEKF to the QG model. Rebecca Morss kindly made available the model and 3D-Var system. This research was partially supported by the European Project WERMED, Interreg IIIb.

Edited by: Z. Toth
Reviewed by: two referees

References

Anderson, J. S.: An ensemble adjustment filter for data assimilation, Mon. Wea. Rev., 129, 2884–2903, 2001.

Annan, J. D.: On the orthogonality of Bred Vectors, Mon. Wea. Rev., 132, 843–849, 2004.

Bennett, A. F., Chua, B. S., and Leslie, L. M.: Generalized inversion of a global NWP model, Meteorol. Atmos. Phys., 60, 165–178, 1996.

Corazza, M., Kalnay, E., Patil, D. J., Ott, E., Yorke, J., Szunyogh, I., and Cai, M.: Use of the breeding technique in the estimation of the background error covariance matrix for a quasigeostrophic model, in: AMS Symposium on Observations, Data Assimilation and Probabilistic Prediction, pp. 154–157, Orlando, Florida, 2002.

Corazza, M., Kalnay, E., Patil, D. J., Yang, S.-C., Morss, R., Cai, M., Szunyogh, I., Hunt, B. R., and Yorke, J. A.: Use of the breeding technique to estimate the structure of the analysis "errors of the day", Nonlin. Processes Geophys., 10, 233–243, 2003, http://www.nonlin-processes-geophys.net/10/233/2003/.

Courtier, P., Thepaut, J.-N., and Hollingsworth, A.: A strategy for operational implementation of 4D-Var, using an incremental approach, Quart. J. Roy. Meteorol. Soc., 120, 1367–1387, 1994.

Daley, R.: Atmospheric data analysis, Cambridge University Press, New York, 1993.

De Pondeca, M. S. F. V., Purser, R. J., Parrish, D. F., and Derber, J. C.: Comparison of strategies for the specification of anisotropies in the covariances of a three-dimensional atmospheric data assimilation system, NOAA/NCEP Office Note 452, 13 pp., 2006.

Etherton, B. J. and Bishop, C. H.: Resilience of Hybrid Ensemble/3DVAR Analysis Schemes to Model Error and Ensemble Covariance Error, Mon. Wea. Rev., 132, 1065–1080, 2004.

Evensen, G.: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics, J. Geophys. Res., 99, 10 143–10 162, 1994.

Hamill, T. M. and Snyder, C.: A Hybrid Ensemble Kalman Filter-3D Variational Analysis Scheme, Mon. Wea. Rev., 128, 2905–2919, 2000.

Houtekamer, P. L. and Mitchell, H. L.: Data assimilation using an ensemble Kalman filter technique, Mon. Wea. Rev., 126, 796–811, 1998.

Houtekamer, P. L., Mitchell, H. L., Pellerin, G., Buehner, M., Charron, M., Spacek, L., and Hansen, B.: Atmospheric Data Assimilation with an Ensemble Kalman Filter: Results with Real Observations, Mon. Wea. Rev., 133, 604–620, 2005.

Hunt, B. R.: Efficient Data Assimilation for Spatiotemporal Chaos: a Local Ensemble Transform Kalman Filter, http://www.citebase.org/cgi-bin/citations?id=oai:arXiv.org:physics/0511236, 2005.

Kalnay, E.: Atmospheric Modeling, Data Assimilation and Predictability, Cambridge University Press, 340 pp., 2003.

Kalnay, E. and Toth, Z.: Removing growing errors in the analysis cycle, in: Tenth Conference on Numerical Weather Prediction, pp. 212–215, Amer. Meteorol. Soc., 1994.

Klinker, E., Rabier, F., Kelly, G., and Mahfouf, J.-F.: The ECMWF operational implementation of four-dimensional variational assimilation. III: Experimental results and diagnostics with operational configuration, Quart. J. Roy. Meteorol. Soc., 126, 1191–1216, 2000.

Lorenc, A. C.: Analysis methods for numerical weather prediction, Quart. J. Roy. Meteorol. Soc., 112, 1177–1194, 1986.

Miyoshi, T.: Ensemble Kalman Filter Experiments with a primitive equation global model, PhD thesis, University of Maryland, College Park, MD, 2005.

Molteni, F.: Atmospheric simulations using a GCM with simplified physical parametrizations. I: model climatology and variability in multi-decadal experiments, Clim. Dyn., 20, 175–191, 2003.

Morss, R. E.: Adaptive observations: Idealized sampling strategies for improving numerical weather prediction, PhD thesis, Massachusetts Institute of Technology, 225 pp., 1998.

Morss, R. E., Emanuel, K. A., and Snyder, C.: Idealized adaptive observation strategies for improving numerical weather prediction, J. Atmos. Sci., 58, 210–234, 2001.

Oczkowski, M., Szunyogh, I., and Patil, D. J.: Mechanisms for the Development of Locally Low-Dimensional Atmospheric Dynamics, J. Atmos. Sci., 62, 1135–1156, 2005.

Ott, E., Hunt, B. R., Szunyogh, I., Corazza, M., Kalnay, E., Patil, D. J., and Yorke, J. E.: Exploiting low-dimensionality of the atmospheric dynamics for efficient ensemble Kalman Filtering, http://arXiv:physics/0203058, 2002.

Ott, E., Hunt, B. R., Szunyogh, I., Zimin, A. V., Kostelich, E. J., Corazza, M., Kalnay, E., Patil, D. J., and Yorke, J.: A local ensemble Kalman filter for atmospheric data assimilation, Tellus, 56A, 415–428, 2004.

Parrish, D. F. and Derber, J. C.: The National Meteorological Center's Spectral Statistical-Interpolation Analysis System, Mon. Wea. Rev., 120, 1747–1763, 1992.

Patil, D. J. S., Hunt, B. R., Kalnay, E., Yorke, J. A., and Ott, E.: Local Low Dimensionality of Atmospheric Dynamics, Phys. Rev. Lett., 86, 5878–5881, 2001.

Purser, R. J.: A geometrical approach to the synthesis of smooth anisotropic covariance operators for data assimilation, NOAA/NCEP Office Note 447, 60 pp., 2005.

Rabier, F., Jarvinen, H., Klinker, E., Mahfouf, J.-F., and Simmons, A.: The ECMWF operational implementation of four-dimensional variational physics, Quart. J. Roy. Meteorol. Soc., 126, 1143–1170, 2000.

Rotunno, R. and Bao, J. W.: A case study of cyclogenesis using a model hierarchy, Mon. Wea. Rev., 124, 1051–1066, 1996.

Szunyogh, I., Kostelich, E. J., Gyarmati, G., Hunt, B. R., Ott, E., Zimin, A. V., Kalnay, E., Patil, D. J., and Yorke, J. A.: A local ensemble Kalman filter for the NCEP GFS model, no. 94 J, in: Proceedings of the AMS meeting, Seattle, Washington, 2004.

Szunyogh, I., Kostelich, E. J., Gyarmati, G., Patil, D. J., Hunt, B. R., Kalnay, E., Ott, E., and Yorke, J. A.: Assessing a local ensemble Kalman filter: Perfect model experiments with the National Centers for Environmental Prediction global model, Tellus, 57A, 528–545, 2005.

Tippett, M. K., Anderson, J. L., Bishop, C. H., Hamill, T. M., and Whitaker, J. S.: Ensemble square-root filters, Mon. Wea. Rev., 131, 1485–1490, 2002.

Toth, Z. and Kalnay, E.: Ensemble forecasting at NMC: the generation of perturbations, Bull. Amer. Meteorol. Soc., 74, 2317–2330, 1993.

Toth, Z. and Kalnay, E.: Ensemble forecasting at NCEP: the breeding method, Mon. Wea. Rev., 125, 3297–3318, 1997.

Wang, X. and Bishop, C. H.: A comparison of Breeding and Ensemble Transform Kalman Filter Ensemble Forecast Schemes, J. Atmos. Sci., 60, 1140–1158, 2003.



this formulation makes its operational application impossible without major simplifications, even considering the potential development of the computational facilities during the next decade. There are other methods that try to account for the "errors of the day" in the forecast error covariance, such as the modification of B in 3D-Var (Purser, 2005; De Pondeca et al., 2006; Corazza et al., 2002), 4-dimensional variational data assimilation (4D-Var, e.g., Courtier et al., 1994; Rabier et al., 2000; Klinker et al., 2000), ensemble Kalman Filtering (EnKF, e.g., Houtekamer and Mitchell, 1998; Hamill and Snyder, 2000) and the method of representers (Bennett et al., 1996).

Recently, several studies have been carried out in order to substitute the standard B in 3D-Var with a more complex covariance matrix, no longer constant but capable of taking into account the time dependent flow. In particular, a general method to implement any arbitrary anisotropy, and therefore also flow-dependent anisotropies, has been implemented at NOAA/NCEP, as described by Purser (2005) regarding the theoretical geometrical implementation and by De Pondeca et al. (2006) concerning different implementations of the filter. All the implementations produced an improvement with respect to the standard 3D-Var, further confirming the importance of the inclusion of the errors of the day in the data assimilation process. Corazza et al. (2002) experimented with a simpler method based on the use of the information provided by an ensemble of bred vectors in B, using the same system as in this work, and also showed significant improvements with respect to 3D-Var.

4D-Var is a powerful approach that reduces the computational cost of Kalman Filtering by estimating the initial conditions such that the model forecast best fits the observations within a data assimilation window. 4D-Var produces the same results as Kalman Filtering under the assumptions of a perfect linear model framework, of Gaussian errors, and of a Kalman Filter representation of the background error covariance matrix at the beginning of the computation. Although 4D-Var is computationally affordable with the most powerful available computers, its cost is still high compared to 3D-Var. In addition, it requires the development and maintenance of the adjoint of the model.

Another approximation to Kalman Filtering is Ensemble Kalman Filtering (EnKF), where the background error covariance is estimated by the sample covariance from the background ensemble vectors. As a result, it has the advantage of being model independent, i.e., no tangent linear and adjoint model are needed, since EnKF only requires as input an ensemble of forecasts and their corresponding predictions of the observations, and from that it produces a new ensemble of analyses. In recent implementations that allow for the use of a relatively small number of ensemble members, EnKF has been shown to be less computationally expensive than 4D-Var (Evensen, 1994). A number of studies have been carried out with promising results (Houtekamer and Mitchell, 1998; Hamill and Snyder, 2000; Tippett et al., 2002; Anderson, 2001; Etherton and Bishop, 2004), and the system developed by Houtekamer et al. (2005) has been implemented in Canada to provide initial ensemble perturbations.
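The sample covariance at the heart of the EnKF takes only a few lines. The sketch below assumes the ensemble is stored with one member per column and uses the usual 1/(n−1) normalization, which may differ from the convention adopted in a particular implementation; the function name is illustrative.

```python
import numpy as np

def ensemble_covariance(ensemble):
    """Sample covariance of an ensemble stored as (n_state, n_members)."""
    n_members = ensemble.shape[1]
    perts = ensemble - ensemble.mean(axis=1, keepdims=True)
    return perts @ perts.T / (n_members - 1)

rng = np.random.default_rng(0)
ens = rng.standard_normal((20, 6))        # toy ensemble: 20 variables, 6 members
Pb = ensemble_covariance(ens)
print(Pb.shape, np.linalg.matrix_rank(Pb))
```

The rank of the estimate is at most the number of members minus one, which is why a small ensemble can only represent errors within a low-dimensional subspace.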

Kalnay and Toth (1994) have pointed out that the breeding technique used for ensemble forecasting describes the local pattern of the "errors of the day", and proposed a method to use the information made available by an ensemble of bred vectors to, in effect, approximate the background covariance matrix. In their work they argued that the similarity between breeding (Toth and Kalnay, 1993, 1997) and data assimilation suggests that the background errors should have local structures similar to those of bred vectors, and this conjecture has been confirmed by Corazza et al. (2003) with the same quasi-geostrophic model used in this work.

More recently, based on the results of Patil et al. (2001), Ott et al. (2002, 2004) developed a new system called the Local Ensemble Kalman Filter (LEKF), in which the Kalman Filter equations are solved within the space of the ensemble forecasts, but locally in space. This system has been tested on a simple 40-variable Lorenz model (Ott et al., 2004) and on the NCEP Global Forecasting System (Szunyogh et al., 2004, 2005). It is one of the possible implementations of the square-root ensemble Kalman filters, where the analysis increments and errors are computed using a background error covariance derived from the ensemble of vectors, so that the increments lie in the subspace spanned by these vectors. This gives the major advantage of capturing the dominant error growing directions, thus providing a large improvement to the information available from the observational errors. Other square-root filters derive their computational efficiency from the assimilation of one observation at a time (Tippett et al., 2002), whereas the LEKF computes the analysis independently at each grid point using all the observations available within a local volume surrounding it. This is advantageous in the presence of a large number of observations, such as satellite data, and it allows for a very efficient parallel implementation.
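The selection of the observations that enter each local analysis can be sketched as follows. The function name, the integer grid-index layout of the observations and the optional zonal periodicity (`nx`) are assumptions made for the example; the patch is taken as a square of half-width l grid points.

```python
import numpy as np

def local_obs_indices(center, obs_ij, l, nx=None):
    """Indices of the observations inside the (2l+1)x(2l+1) patch.

    center: (i, j) grid indices of the analysis point.
    obs_ij: (p, 2) integer grid indices of the observations.
    nx: number of zonal grid points if the domain is zonally periodic
        (an assumption of this sketch), else None.
    """
    d = np.abs(np.asarray(obs_ij) - np.asarray(center))
    if nx is not None:
        d[:, 0] = np.minimum(d[:, 0], nx - d[:, 0])   # wrap around in i
    return np.flatnonzero(np.all(d <= l, axis=1))

obs = [(10, 10), (13, 10), (14, 10), (7, 13), (0, 0)]
print(local_obs_indices((10, 10), obs, l=3))          # [0 1 3]
```

Since each grid point only needs its own subset of observations, the analyses at different grid points do not depend on one another and can be distributed across processors.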

In this work we test different configurations of the LEKF by implementing it on a quasi-geostrophic channel model described in Rotunno and Bao (1996). The results are compared with those obtained with an optimized 3D-Var system developed by Morss (1998). A perfect model framework is assumed, so that the conclusions of this work are not necessarily valid for a more complex model with errors. However, this assumption allows us to explicitly define the "true state of the atmosphere" (by integrating the model from a given initial state) and therefore perform a direct comparison in which the analysis and forecast errors are explicitly represented.

The outline of the work is as follows. In Sect. 2 the implementation of the quasi-geostrophic model and of the 3D-Var data assimilation system are briefly reviewed. In Sect. 3 a description of the implementation of the Local Ensemble Kalman Filter is presented. Results are discussed in Sect. 4, and a summary and conclusions close the work.










31 The localization


















Pbmn(t) =

nsumj=1

xb(j)mn (t)

(x

b(j)mn (t)

)T

(1)



where

xb(j)mn (t) =

radicλ

(j)mn(t)u

(j)mn(t) (2)




Pbmn(t) = diag

[λ(1)

mn λ(2)mn λ

(k)mn

] (3)



ksumj=1

λ(j)mn (4)











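$$\hat{\mathbf{x}}^a_{mn}(t) = \hat{\mathbf{x}}^b_{mn}(t) + \left[\left(\hat{\mathbf{P}}^b_{mn}\right)^{-1} + \hat{\mathbf{H}}^T_{mn}\mathbf{R}^{-1}_{mn}\hat{\mathbf{H}}_{mn}\right]^{-1} \times \hat{\mathbf{H}}^T_{mn}\mathbf{R}^{-1}_{mn}\left(\mathbf{y}^o_{mn} - \hat{\mathbf{H}}_{mn}\hat{\mathbf{x}}^b_{mn}(t)\right) \tag{5}$$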

and

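$$\hat{\mathbf{P}}^a_{mn}(t) = \left[\left(\hat{\mathbf{P}}^b_{mn}\right)^{-1} + \hat{\mathbf{H}}^T_{mn}\mathbf{R}^{-1}_{mn}\hat{\mathbf{H}}_{mn}\right]^{-1} \tag{6}$$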
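
A scalar example may help to read the analysis update. The sketch below assumes the standard Kalman filter form in which the analysis covariance is the inverse of the sum of the inverse background covariance and H^T R^{-1} H; with a single state variable and a single observation all matrices reduce to numbers. The function name `scalar_analysis` and the input values are hypothetical.

```python
def scalar_analysis(xb, pb, y, h, r):
    """One-variable analysis step in the inverse-covariance (information) form.

    xb : background value          pb : background error variance
    y  : observed value            r  : observation error variance
    h  : scalar observation operator
    """
    # Analysis variance: pa = (pb^-1 + h * r^-1 * h)^-1
    pa = 1.0 / (1.0 / pb + h * h / r)
    # Analysis value: background plus weighted innovation (y - h * xb)
    xa = xb + pa * h / r * (y - h * xb)
    return xa, pa


# Equal background and observation variances: the analysis lands halfway
# between background and observation, and the variance is halved.
xa, pa = scalar_analysis(xb=0.0, pb=1.0, y=2.0, h=1.0, r=1.0)
```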




mn(t) (7)

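$$\mathbf{P}^a_{mn}(t) = \mathbf{Q}_{mn}(t)\,\hat{\mathbf{P}}^a_{mn}(t)\,\mathbf{Q}^T_{mn}(t) \tag{8}$$

$$\mathbf{Q}_{mn}(t) = \left\{\mathbf{u}^{(1)}_{mn}(t)\,|\,\mathbf{u}^{(2)}_{mn}(t)\,|\,\cdots\,|\,\mathbf{u}^{(k)}_{mn}(t)\right\} \tag{9}$$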





… and $\mathbf{x}^a_{mn}$. Let

$$\mathbf{x}^{a(i)}_{mn}(t) = \mathbf{x}^a_{mn}(t) + \delta\mathbf{x}^{a(i)}_{mn}(t) \tag{10}$$



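$$\sum_{i=1}^{k} \delta\mathbf{x}^{a(i)}_{mn}(t) = \mathbf{0} \tag{11}$$

and

$$\ldots\ \sum_{i=1}^{k} \delta\mathbf{x}^{a(i)}_{mn}(t)\left[\delta\mathbf{x}^{a(i)}_{mn}(t)\right]^T \tag{12}$$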
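
The two conditions on the analysis perturbations (they sum to zero, and their outer products build the covariance) can be checked on a toy ensemble. This is a sketch under stated assumptions: the three two-component members are invented, and the normalisation of the outer-product sum is passed in as a parameter because it cannot be recovered here; k - 1 is used below purely for illustration.

```python
def perturbations(members):
    """Return the ensemble mean and the perturbations about it."""
    k = len(members)
    n = len(members[0])
    mean = [sum(m[d] for m in members) / k for d in range(n)]
    perts = [[m[d] - mean[d] for d in range(n)] for m in members]
    return mean, perts


def outer_product_sum(perts, norm):
    """Sum of the outer products of the perturbations, divided by `norm`."""
    n = len(perts[0])
    return [
        [sum(p[a] * p[b] for p in perts) / norm for b in range(n)]
        for a in range(n)
    ]


# Invented ensemble of k = 3 members with n = 2 components each.
members = [[1.0, 0.0], [3.0, 2.0], [2.0, 4.0]]
mean, perts = perturbations(members)

# The perturbations sum to zero by construction.
pert_sum = [sum(p[d] for p in perts) for d in range(2)]

# Covariance estimate; the k - 1 normalisation is an assumption.
cov = outer_product_sum(perts, norm=len(members) - 1)
```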





b(i)mn (t) (13)


mn (t)

are defined as



and are such that


$$\ldots\ \sum_{i=1}^{k} \delta\mathbf{x}^{b(i)}_{mn}(t)\left(\delta\mathbf{x}^{b(i)}_{mn}(t)\right)^T \tag{15}$$







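$$\mathbf{Y}_{mn} = \left[\mathbf{I} + \mathbf{X}^{bT}_{mn}\left(\hat{\mathbf{P}}^b_{mn}\right)^{-1}\left(\hat{\mathbf{P}}^a_{mn} - \hat{\mathbf{P}}^b_{mn}\right)\left(\hat{\mathbf{P}}^b_{mn}\right)^{-1}\mathbf{X}^b_{mn}\right]^{1/2} \tag{16}$$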

where


12

δxb(1)


mn

(17)

















H = H H (19)




4 Results





















[Figure: RMS error (axis range 0 to 350) as a function of forecast time (0 to 72).]











[Figure: power spectra as a function of wavenumber (Nwv, 5 to 100), including panels for the PV, KE and SF of the errors, with curves for the LEKF and the 3D-Var.]




NWV = ((25 × k)^2 …





[Figure, panels (a) and (b): RMS error; the horizontal axis of the first panel is the number of vectors (5 to 45).]









[Figure: time series over days 1 to 351 (vertical axis 0 to 1) for (a) configurations N30L3, N20L3 and N40L3, and (b) configurations N30L3, N30L2 and N30L4.]



$$\mathbf{R} = e^{\frac{(r - r_o)^2}{2\sigma^2}}\,\mathbf{R} \tag{20}$$
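
The distance-dependent factor multiplying R can be sketched as follows. The symbols r, r_o and σ are taken from the expression above; the function names and sample values are hypothetical, and the sketch applies the factor to every distance it is given, leaving to the caller the choice of which observations to treat this way.

```python
import math


def inflation_factor(r, r0, sigma):
    """Multiplicative factor exp((r - r0)^2 / (2 sigma^2)) for distance r."""
    return math.exp((r - r0) ** 2 / (2.0 * sigma ** 2))


def inflate_variances(variances, distances, r0, sigma):
    """Scale each observation error variance by the factor for its distance."""
    return [
        v * inflation_factor(d, r0, sigma)
        for v, d in zip(variances, distances)
    ]


# Invented example: unit variances at distances 1 and 3, with r0 = 1, sigma = 2.
inflated = inflate_variances([1.0, 1.0], [1.0, 3.0], r0=1.0, sigma=2.0)
```

A larger error variance gives an observation less weight in the analysis, so the factor progressively reduces the influence of observations as r moves away from r_o.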















Level   1     2     3     4     5
1       1.00  0.46  0.17  0.10  0.10
2       0.46  1.00  0.35  0.17  0.14
3       0.17  0.35  1.00  0.47  0.33
4       0.10  0.17  0.47  1.00  0.63
5       0.10  0.14  0.33  0.63  1.00









References























































































