Three-dimensional spatial interpolation of surface ... · SPATIAL INTERPOLATION OF METEOROLOGICAL...

15
METEOROLOGICAL APPLICATIONS Meteorol. Appl. (2008) Published online in Wiley InterScience (www.interscience.wiley.com) DOI: 10.1002/met.76 Three-dimensional spatial interpolation of surface meteorological observations from high-resolution local networks Francesco Uboldi, a * Cristian Lussana b and Marta Salvati b a Novate Milanese, Milano, Italy b ARPA Lombardia, Milano, Italy ABSTRACT: An objective analysis technique is applied to a local, high-resolution meteorological observation network in the presence of complex topography. The choice of optimal interpolation (OI) makes it possible to implement a standard spatial interpolation algorithm efficiently. At the same time OI constitutes a basis to develop, in perspective, a full multivariate data assimilation scheme. In the absence of a background model field, a simple and effective de-trending procedure is implemented. Three-dimensional correlation functions are used to account for the orographic distribution of observing stations. Minimum-scale correlation parameters are estimated by means of the integral data influence (IDI) field. Hourly analysis fields of temperature and relative humidity are routinely produced at the Regional Weather Service of Lombardia. The analysis maps show significant informational content even in the presence of strong gradients and infrequent meteorological situations. Quantitative evaluation of the analysis fields is performed by systematically computing their cross validation (CV) scores and by estimating the analysis bias. Further developments concern the implementation of an automatic quality control procedure and the improvement of error covariance estimation. Copyright 2008 Royal Meteorological Society KEY WORDS objective analysis; data assimilation; local weather services; observation influence; cross validation Received 5 September 2007; Revised 4 March 2008; Accepted 7 March 2008 1. Introduction In many areas of the world, departmental environmental agencies or other sub-national institutions manage local meteorological observation networks. These are charac- terized by high resolution, both in space and in time, and a high degree of automation. In principle, they consti- tute a rich source of meteorological information, but they present specific problems that have limited their exploita- tion and that need to be addressed. One issue is the non-uniform quality of local obser- vational networks which are often managed by organi- zations whose primary function might not be meteorol- ogy. These observations are not, in general, included in the Global Observation System, so observation methods and quality control procedures might not fulfil WMO standards (WMO, 1996). Even if data centres com- ply with quality control requirements, observations from local networks can be affected by errors of represen- tativity. This can be due to various reasons, including practical constraints (such as funding, observation sites chosen with non-meteorological priorities, administrative boundaries), and the density of the observational net- work itself. The representativity of observations from * Correspondence to: Francesco Uboldi, Novate Milanese, Milano, Italy. E-mail: [email protected] local networks is rarely studied, so the consequences on analysed fields of their inhomogeneous spatial distribu- tion are seldom controlled. In fact, most of the objective analysis techniques were developed in the context of Numerical Weather Prediction (NWP) and data assim- ilation in centres that, until recently, focussed on larger dynamic scales (synoptic and meso-α). With good reason the major NWP centres have generally chosen to discard these data. Recently, however, hydrostatic global models have reached very high resolutions (about 20 km for the ECMWF model since 2006). Operational non-hydro- static models nowadays have a resolution that is even higher than that of local meteorological networks (about 2.2 km for the German Lokal Modell in 2007). Even so, and in spite of the achievements of data assimilation methods at larger scales, it is not yet a common practice to merge dense observational networks and local mod- els. Objective difficulties arise from differences between model topography and real topography, from approxi- mations in the parameterization of surface and boundary layer processes, and from the dynamic behaviour of short and fast (meso-γ and convective) scales. A further practi- cal difficulty is that model fields are not always available, with their full resolution and frequency, to the small organizations that manage high resolution meteorological networks. Copyright 2008 Royal Meteorological Society

Transcript of Three-dimensional spatial interpolation of surface ... · SPATIAL INTERPOLATION OF METEOROLOGICAL...

Page 1: Three-dimensional spatial interpolation of surface ... · SPATIAL INTERPOLATION OF METEOROLOGICAL DATA FROM LOCAL NETWORKS R is a diagonal matrix, i.e. observational errors at differ-

METEOROLOGICAL APPLICATIONSMeteorol. Appl. (2008)Published online in Wiley InterScience(www.interscience.wiley.com) DOI: 10.1002/met.76

Three-dimensional spatial interpolation of surfacemeteorological observations from high-resolution local

networks

Francesco Uboldi,a* Cristian Lussanab and Marta Salvatiba Novate Milanese, Milano, Italyb ARPA Lombardia, Milano, Italy

ABSTRACT: An objective analysis technique is applied to a local, high-resolution meteorological observation networkin the presence of complex topography. The choice of optimal interpolation (OI) makes it possible to implement astandard spatial interpolation algorithm efficiently. At the same time OI constitutes a basis to develop, in perspective, afull multivariate data assimilation scheme. In the absence of a background model field, a simple and effective de-trendingprocedure is implemented. Three-dimensional correlation functions are used to account for the orographic distributionof observing stations. Minimum-scale correlation parameters are estimated by means of the integral data influence (IDI)field. Hourly analysis fields of temperature and relative humidity are routinely produced at the Regional Weather Serviceof Lombardia. The analysis maps show significant informational content even in the presence of strong gradients andinfrequent meteorological situations. Quantitative evaluation of the analysis fields is performed by systematically computingtheir cross validation (CV) scores and by estimating the analysis bias. Further developments concern the implementationof an automatic quality control procedure and the improvement of error covariance estimation. Copyright 2008 RoyalMeteorological Society

KEY WORDS objective analysis; data assimilation; local weather services; observation influence; cross validation

Received 5 September 2007; Revised 4 March 2008; Accepted 7 March 2008

1. Introduction

In many areas of the world, departmental environmentalagencies or other sub-national institutions manage localmeteorological observation networks. These are charac-terized by high resolution, both in space and in time, anda high degree of automation. In principle, they consti-tute a rich source of meteorological information, but theypresent specific problems that have limited their exploita-tion and that need to be addressed.

One issue is the non-uniform quality of local obser-vational networks which are often managed by organi-zations whose primary function might not be meteorol-ogy. These observations are not, in general, included inthe Global Observation System, so observation methodsand quality control procedures might not fulfil WMOstandards (WMO, 1996). Even if data centres com-ply with quality control requirements, observations fromlocal networks can be affected by errors of represen-tativity. This can be due to various reasons, includingpractical constraints (such as funding, observation siteschosen with non-meteorological priorities, administrativeboundaries), and the density of the observational net-work itself. The representativity of observations from

* Correspondence to: Francesco Uboldi, Novate Milanese, Milano,Italy. E-mail: [email protected]

local networks is rarely studied, so the consequences onanalysed fields of their inhomogeneous spatial distribu-tion are seldom controlled. In fact, most of the objectiveanalysis techniques were developed in the context ofNumerical Weather Prediction (NWP) and data assim-ilation in centres that, until recently, focussed on largerdynamic scales (synoptic and meso-α). With good reasonthe major NWP centres have generally chosen to discardthese data.

Recently, however, hydrostatic global models havereached very high resolutions (about 20 km for theECMWF model since 2006). Operational non-hydro-static models nowadays have a resolution that is evenhigher than that of local meteorological networks (about2.2 km for the German Lokal Modell in 2007). Evenso, and in spite of the achievements of data assimilationmethods at larger scales, it is not yet a common practiceto merge dense observational networks and local mod-els. Objective difficulties arise from differences betweenmodel topography and real topography, from approxi-mations in the parameterization of surface and boundarylayer processes, and from the dynamic behaviour of shortand fast (meso-γ and convective) scales. A further practi-cal difficulty is that model fields are not always available,with their full resolution and frequency, to the smallorganizations that manage high resolution meteorologicalnetworks.

Copyright 2008 Royal Meteorological Society

Page 2: Three-dimensional spatial interpolation of surface ... · SPATIAL INTERPOLATION OF METEOROLOGICAL DATA FROM LOCAL NETWORKS R is a diagonal matrix, i.e. observational errors at differ-

F. UBOLDI ET AL.

Any group that manages an observational networkmay, however, desire to produce analysis fields, even ifit does not operationally run a model into which datacan be assimilated. A spatial analysis procedure can, inprinciple, extract from the network’s measurements itsfull informational content and filter sub-scale noise fromobservations. It also allows development of robust dataquality control procedures. Besides, analysis maps arerequired for a variety of operational and practical appli-cations, such as weather forecasting, forecast verification,environmental monitoring, land and water management,and fire prevention. On the other hand, the high reso-lution of non-hydrostatic models itself may require highresolution maps of surface variables, both for a detaileddefinition of surface parameters, and for model forecastverification. For all these purposes, local high-resolutionobservational networks may provide a source of usefulinformation, with the condition that the quality of thedata is reliably controlled.

This work describes an optimal interpolation (OI)-based analysis scheme, suitable for high-resolution localnetworks. Reliable analysis maps can be obtained even inthe absence of a model-based first guess, by using a back-ground field obtained from the observations themselvesthrough a de-trending procedure. The scheme is appliedto temperature and relative humidity observations fromthe meteorological network of Lombardia, located in thesouthern side of the Alps (Figure 1). For these variablesthe analysis method proved to be quite effective. More-over, the analysis scheme represents a solid step towardsimplementing a complete multivariate data assimilationscheme, possibly including all observations and modeldynamics.

This article is organized as follows. Section 2 describesthe interpolation scheme. Section 3 describes the imple-mentation choices made to interpolate surface meteo-rological observations from Lombardia’s high-resolutionnetwork. In Section 4 the method is applied to two verydifferent test cases: a thermal inversion with persistentfog over the Po Plain and a north foehn event with strongthermal gradients. A quantitative evaluation of the inter-polation method, with its cross validation (CV) score anda discussion of bias, is presented in Section 5.

2. Interpolation method

2.1. Optimal interpolation

The role of spatial interpolation algorithms in dataassimilation is extensively described in various books onthe subject (Daley, 1991; Kalnay, 2003). The methodused in the present work is an implementation of OI(Gandin, 1963) synthetically described here with itsknown relation to other methods.

The analysis field of a meteorological variable may berepresented by its values at appropriate locations, typi-cally at the nodes of a regular grid that covers an area ofinterest. These I values are stored as components of theanalysis state vector xa . It is assumed that two types of

information are available. The first consists of M obser-vations, the components of the vector yo. The second isthe background (or first-guess) field, available at gridpoints, xb, and at the observation locations (stations) yb.The possibility that an independent background field isnot available is considered in Section 3.1, where a de-trending technique is presented.

The OI analysis is obtained by a linear relation betweenthe analysis increment xa − xb and the innovationyo − yb:

xa = xb + K(yo − yb

)(1)

Minimizing the variance of the analysis error, the OIexpression becomes

xa = xb + G(S + R)−1 (yo − yb)

(2)

where the gain matrix K = G(S + R)−1 is expressed bymeans of the error covariance matrices

G =⟨(

xb − xt) (

yb − yt)T⟩

S =⟨(

yb − yt) (

yb − yt)T⟩

(3)

R =⟨(

yo − yt) (

yo − yt)T⟩

where xt is the unknown ‘true’ state vector (so thatxa − xt is the analysis error vector) and the angularbrackets 〈〉 represent the expectation value with respectto an appropriately defined statistical ensemble. Eachcomponent of the (I , M) matrix G is the covariancebetween the background error at a grid point and thebackground error at a station point; each component ofthe (M , M) matrix S is the background error covariancebetween a couple of station points; and the (M , M) matrixR is the observation error covariance matrix.

The OI analysis on station points is

ya = yb + S(S + R)−1 (yo − yb)

(4)

where W = S(S + R)−1 is the influence matrix, whichlinearly relates the station analysis increment to theinnovation. It is worth remarking that, just as the ‘true’state is unknown, so are all the covariance matrices: G,S, and R. The estimates of these matrices determine thecharacteristics of the analysis field.

For the purpose of spatial interpolation, the compo-nents of the matrices G and S are specified by meansof analytical correlation functions. The parameters of thecorrelation function may be chosen to fit the availablestatistics on the field variabilities and errors. In the lit-erature and in the interpolation practice, many differenttypes of functions are used to model the error correla-tions, mostly combinations of exponentials, Gaussian andpolynomials. In the applications presented here, Gaussianfunctions are used (Section 3.2).

For point-wise observations taken by different instru-ments, it is a widely used approximation to assume that

Copyright 2008 Royal Meteorological Society Meteorol. Appl. (2008)DOI: 10.1002/met

Page 3: Three-dimensional spatial interpolation of surface ... · SPATIAL INTERPOLATION OF METEOROLOGICAL DATA FROM LOCAL NETWORKS R is a diagonal matrix, i.e. observational errors at differ-

SPATIAL INTERPOLATION OF METEOROLOGICAL DATA FROM LOCAL NETWORKS

R is a diagonal matrix, i.e. observational errors at differ-ent station locations are uncorrelated. Observed values,besides the instrumental error, may be affected by repre-sentativity error, due to small scales and local phenomenathat cannot be represented by the state vector. This means,on the one hand, that if these phenomena are of interestfor the final user, the chosen representation of the fieldis inadequate. On the other hand, if the chosen represen-tation of the field is considered adequate, then the smallscales have no interest: they should be then consideredas noise and filtered out. In this case, it is appropriate toinclude an estimate of the representativity error covari-ance in the observation error covariance R.

2.2. Optimal interpolation and other methods

In this paragraph, the interpolation algorithm is presentedbriefly in the more general context of data assimilationmethods. In this way, a univariate spatial interpolationalgorithm may, in perspective, become a basic componentof an advanced multivariate data assimilation scheme,able to account for physical and dynamical constraints.Moreover, even for the limited purpose of spatial inter-polation, the advanced development of data assimilationmethods can be exploited to obtain robust tools for theimplementation of objective analysis techniques and thequantitative evaluation of the produced fields.

The formal equivalence between OI and other interpo-lation algorithms such as kriging and smoothing splines(Wahba and Wendelberger, 1980; Steinacker et al., 2000)is well known (Lorenc, 1986; Daley, 1991), as is the con-vergence to OI of iterative methods such as successivecorrections (SCs) (Bratseth, 1986; Daley, 1991; Uboldiand Buzzi, 1994). In order to discuss the role of OI in dataassimilation further, the concept of observation operatorneeds to be introduced.

Analysis and background fields xa and xb are modelstates. An observation operator is a function H which,starting from the background model state xb, computesthe observational estimates yb = H

(xb)

that can bedirectly compared with the observations yo. Generallyspeaking, the I state variables and the M observationsmay be very different. They are referred to differentspatial locations; they may have different physical dimen-sions; and may be referred to different times in the caseof four-dimensional algorithms. As a consequence, theobservation operator H may be very complicated. It maybe a non-linear function and it may involve integrals,thus permitting the assimilation of remote sensing obser-vations. The concept of observation operator is fullyexploited in variational algorithms, and it is explicitlyemployed in Kalman-filter-based sequential algorithms.When explicit use is made of an observation operator H ,the OI analysis becomes

xa = xb + BHT (HBHT + R)−1 (

yo − H(xb))

(5)

where H is the Jacobian matrix of the vector function H ,i.e. the matrix containing its first derivatives with respect

to the state components: [H]m,i = ∂Hm/∂xi . Equation (5)also represents a linearized (or incremental ) 3D-Varanalysis (Daley, 1991; Kalnay, 2003). The matrix B isthe background error covariance matrix, defined in a wayanalogous to Equation (3):

B =⟨(

xb − xt) (

xb − xt)T⟩

(6)

The components of the (I , I ) symmetric matrix B aredefined between couples of state variables. By comparingEquation (5) with Equation (2), it can be seen that H isused to transform the background error covariances:

G = BHT (7)

S = HBHT (8)

As a consequence, the relation between the influencematrix W and the gain matrix K is

W = HK (9)

If the concept of observation operator is extended toinclude a model integration, Equation (5) may also rep-resent a linearized (or incremental ) 4D-Var, assimilatingdata that are distributed in space and in time (Uboldiand Kamachi, 2000; Bennett, 2002; Kalnay, 2003). Theanalysis step in a Kalman filter also has the same formas Equation (5), with the important difference that thestationary matrix B used here is replaced with the dynam-ically evolving forecast error covariance matrix Pf . Therepresentativity error may be formally interpreted as theobservation operator error.

2.3. Integral data influence and cross validation

This section introduces some useful diagnostic and vali-dation tools that are used in Section 3.2 and in Section 5:the integral data influence (IDI) field and the CV score.

Once the covariance matrices are chosen, the IDI fieldis defined as the analysis field obtained when all observedvalues are set to 1 and all background values are set to 0.By doing this in Equation (2) and in Equation (4), the IDIfield can be defined both on station points, yIDI, and ongrid points, xIDI (Figure 3). The IDI field can be formallydefined from the influence matrix W (Appendix A), butit has a straightforward interpretation. The observationalinformation is that the field is uniform and it has a value1, while the background information is that the field isuniform but it has a value 0. In station dense areas, theanalysis field is almost uniform and close to 1. Sincethe correlation function approaches zero for increasingdistance, far from all stations the analysis is dominatedby the background field, resulting in an approximatelyuniform zero field. In intermediate areas the analysedvalue is between 0 and 1, but gradients are present. Valueslarger than 1 are possible and they are due to strong non-uniformities in the data distribution. The IDI field maybe used to study the effects on the analysis of removing

Copyright 2008 Royal Meteorological Society Meteorol. Appl. (2008)DOI: 10.1002/met

Page 4: Three-dimensional spatial interpolation of surface ... · SPATIAL INTERPOLATION OF METEOROLOGICAL DATA FROM LOCAL NETWORKS R is a diagonal matrix, i.e. observational errors at differ-

F. UBOLDI ET AL.

or adding stations; hence, it represents a useful tool foroptimizing an observational network, as long as the co-variance estimates are considered significant.

The CV analysis, yam, is defined as the analysis estimate

obtained for the mth observation by using all otherobservations, but without using the mth observation itself.On the basis of the CV analysis, the CV score (Wahbaand Wendelberger, 1980; Daley, 1991) is defined as

CVscore =√√√√ 1

M

M∑m=1

(ya

m − yom

)2(10)

The CV score represents an estimate of the analysiserror based on the idea that each observation is usedas an independent verification of the analysis field.The estimate is conservative because of the implicitdegradation of the local resolution of the observationalnetwork. The CV analysis ya

m is also useful for qualitycontrol tests on the mth observation.

CV and data influence can be combined by definingthe CV-IDI parameter yIDI

m in a way analogous to theCV analysis, so that the difference between isolatedstations and densely observed areas is enhanced. Infact, for a completely isolated station (i.e. having zerocorrelation with all the other stations), yIDI

m = 0 andyIDI

m = σ 2b,m/

(σ 2

b,m + σ 2o,m

). Conversely, yIDI

m = 1 impliesyIDI

m = 1. Details of the relations between IDI, the CVanalysis and other diagnostic tools are discussed inAppendix A.

The direct knowledge of the components of the inversematrix (S + R)−1 and, in particular, of its diagonalelements allows efficient computation of the CV score.A data quality control procedure (work currently inprogress) can also make use of the CV analysis ya

m ateach station. For these reasons a technique, describedin Appendix B, has been implemented for updatingthe inverse matrix (S + R)−1 when new data becomeavailable or some data are occasionally missing, replacinga previous version of the interpolation procedure, inwhich an iterative method was used at each analysis timeto solve the linear system.

3. Implementation of the analysis method fortemperature and relative humidity fields

The OI scheme described above is applied to hourlyobservations of surface temperature and relative humiditycollected by Lombardia’s local meteorological networkof automatic weather stations.

The observational network is managed by the RegionalWeather Service, operational since 2004, and has beencreated by merging previously existing networks, not allof them designed for meteorological applications (maincomponents are air quality and hydrology networks).About 270 stations measure temperature and precipita-tion, and about half of them also measure other param-eters (pressure, relative humidity, wind speed and direc-tion, solar radiation). This seemingly high density of the

Figure 1. Orography and station locations in the Lombardia domain.Triangles: temperature stations. Circles: relative humidity stations. Thebold black line is the administrative boundary. The mountain rangesare the Alps in the northern part of the map and the Appennines inthe southern part, with the Po Plain in between. The inset showsthe geographical location of Lombardia, longitude 8.5 to 11.5 °E,latitude 44.6 to 45.7 °N. This figure is available in colour online at

www.interscience.wiley.com/ma

network, though, may be misleading. The station dis-tribution is not homogeneous, with several sites locatedin urban areas. Non-standardized observation sites oftenlead to low-quality observations (i.e. high representativ-ity error). Moreover, the analysis area is on the southernslope of the Alpine range and it is characterized by verycomplex terrain; hence the vertical distribution must alsobe considered: the elevation of observation sites rangesfrom 0 to 3000 m above mean sea level. Figure 1 showsstation sites and Lombardia’s orography.

The general characteristics of the observation networkand the availability of an appropriate first-guess fieldstrongly influence the choices that must be made inimplementing the interpolation scheme, namely, the back-ground xb and yb and the error correlation matrices G,S, R.

3.1. Pseudo-background field built from theobservations (de-trending)

The use of a background or first guess field is oneof the strong points of OI schemes, which in generalaim to the optimal merging (in a statistical sense) ofinformation from different sources. A model field is oftenused for this purpose, because it represents a sourceof information that is consistent with the atmospheric

Copyright 2008 Royal Meteorological Society Meteorol. Appl. (2008)DOI: 10.1002/met

Page 5: Three-dimensional spatial interpolation of surface ... · SPATIAL INTERPOLATION OF METEOROLOGICAL DATA FROM LOCAL NETWORKS R is a diagonal matrix, i.e. observational errors at differ-

SPATIAL INTERPOLATION OF METEOROLOGICAL DATA FROM LOCAL NETWORKS

dynamics. This work presents a different possibility thatcan be used if a model-based analysis is undesired (asit may be in verification studies) or impossible (if modelfields are unavailable). Besides, an analysis scheme whichonly makes use of observations can provide insight onthe network characteristics. This knowledge can then beprofitably used in the integration of observations withother sources of information.

The choice presented here is to build, from thedata themselves, a ‘pseudo’ background field, meantto represent the main spatial trends present in thefield and detected by the observations. The backgroundfield is calculated as a linear function of the spatialcoordinates, by performing a least-square minimization ateach observation time. The scheme allows for a changein the vertical derivative; this has a particular relevancefor the case of thermal inversions that frequently occuron the Po Plain.

The vertical dependence of an atmospheric field result-ing from measurements taken at observing stationslocated at the surface cannot be considered, in general,an information on the vertical structure of the free atmo-sphere. On the other hand, when an important thermalinversion is present over the area it does have an influ-ence on observations taken near the surface, particularlyon those taken at stations located on the orography fac-ing a wide valley such as the Po Plain. This can easilybe seen in Figure 2, showing the temperature observa-tions and the background estimates on station points asa function of station elevation above sea level (on the y

axis) for the two application cases discussed in Section4. In both cases a thermal inversion (of different nature)is evident. At higher elevations, the vertical dependenceis very clear in Figure 2(b) (north foehn case), whereasit appears more dispersed in Figure 2(a) (persistent fogcase).

If high-resolution topographic information is availablefor the area, the de-trending scheme allows the choice ofany grid size which can be customized to the intended useof the analysis. The fields presented here are computedon a 174 × 177 grid, with 1.5 km resolution. The gridpoint elevation has been regularly sampled from a 250 m

digital elevation model without any smoothing. Thechoice is made to minimize the difference between gridpoint and station elevation, while, at the same time,retaining a reasonable grid size in terms of computationaltime. As a consequence, the grid orography is verydiscontinuous. For the present application of the analysismaps, which is purely diagnostic, the choice appearsjustified. If the interpolated fields are needed as inputin numerical integrations then different choices must bemade.

It is worth remarking that the grid resolution doesnot correspond to the analysis scale resolution, whichis determined by the network density distribution, by theproperties of the analysed variable, and by the choicesmade in specifying the error covariance matrices.

3.2. High resolution topography and three-dimensionalcorrelations

It is assumed that the observation error covariance matrixR is diagonal and all observation errors have the samevariance σ 2

o :R = σ 2

o I (11)

where I is the identity matrix. A diagonal matrix impliesthe assumption, reasonable for point-wise measurements,that different observation sites have independent errors.On the other hand, assuming the same error variance forall observations is quite a drastic simplification: a morerealistic representation of observational error is currentlyunder study.

The function of horizontal and vertical distances thatis used to estimate the background error correlation is

γ (d, �z) = e−1

2

(d

Dh

)2

e−1

2

(�zDz

)2

(12)

where d is the horizontal distance between the two points,and �z is the difference between their elevations abovesea level. Dh and Dz are the de-correlation distancesin the horizontal direction and vertical direction, respec-tively. If the background error variance σ 2

b is assumed to

0

500

1000

1500

2000

2500

3000(a) (b)

-10 -5 0 5 10 15

elev

atio

n (m

am

sl)

temperature (°C) temperature (°C)

0

500

1000

1500

2000

2500

3000

-20 -15 -10 -5 0 5 10

elev

atio

n (m

am

sl)

Figure 2. Station elevation on the y axis, and temperature on the x axis: (a) persistent fog case: 20 January 2006, 1200 h (1100 UTC); (b) northfoehn and snow case: 12 March 2006, 0500 h (0400 UTC). Triangles: observations. Circles: background estimates on station points. This figure

is available in colour online at www.interscience.wiley.com/ma

Copyright 2008 Royal Meteorological Society Meteorol. Appl. (2008)DOI: 10.1002/met

Page 6: Three-dimensional spatial interpolation of surface ... · SPATIAL INTERPOLATION OF METEOROLOGICAL DATA FROM LOCAL NETWORKS R is a diagonal matrix, i.e. observational errors at differ-

F. UBOLDI ET AL.

Figure 3. Integral data influence (IDI) field (non-dimensional) fortemperature observations in the fog case of 20 January 2006, 1200 h(1100 UTC) obtained with Dh = 20 km, Dz = 500 m, and ε2 = 0.5.

The field is masked out below the value 0.5.

be uniform, then the background error correlation matri-ces are defined as G ≡ σ−2

b G and S ≡ σ−2b S. The analysis

is then obtained from Equation (2) as

xa = xb + G(

S + ε2I)−1 (

yo − yb)

(13)

where the scalar ε2 = σ 2o /σ 2

b is the ratio between thebackground and the observation error variances. Inthis way the components of the gain matrix, K =G(

S + ε2I)−1

, only depend on the three parameters Dh,

Dz, and ε2. From the definition of ε2, it is clear thatassuming ε2 = 0 implies assuming perfect observations,hence exact interpolation. On the other hand, settingε2 > 1 implies a greater confidence in the backgroundfield rather than in the observations.

The OI algorithm may also be seen as a numericallow-pass filter, whose cut-off wavelength depends on theparameters Dh, Dz, and ε2. This is discussed in moredetail, in an ideal case, in Appendix C.

The values of the scale parameters of the correlationfunction, Dh and Dz, must be set large enough tofilter out short scales that cannot be resolved by theobservational network. Intuitively, Dh and Dz cannot bechosen too small with respect to the typical distancesthat characterize the data distribution. Above this lowerbound, the values of Dh and Dz can be chosen in order toretain the relevant scales for the meteorological variableunder consideration.

The non-dimensional IDI field, defined in Section 2.3and shown in Figure 3, is very useful in choosing theminimum values for these scale parameters. In fact, givena set of parameter values Dh, Dz, and ε2; the IDI fieldonly depends on the data distribution and may change intime only because of missing data.

The values Dh = 20 km, Dz = 500 m, and ε2 = 0.5have been chosen for the temperature field so that theIDI field appears rather uniform and above 0.9 in thePlain and it is only marginally noisy in the mountainareas. Dh and Dz have been chosen by progressivelyincreasing their values until an acceptable IDI field isobtained. Given the filtering properties of the interpo-lation, described in Appendix C for a continuous datadistribution, an homogeneous IDI field can be interpretedas an indication that the non-uniformity existing in thedata distribution only marginally affects the filter. Thechoice of larger Dh and Dz values would essentiallyincrease the cut-off wavelength and would filter out scalesthat can actually be detected by the observational network(though with some deformation). Figure 4 shows how adecrease in these values corresponds to a drastic increasein non-uniformity, while increasing the scale lengths bythe same amount leads to smaller differences in the IDIfield uniformity.

The trade-off realized for the value of ε2 is also shownin Figure 4. Smaller ε2 values imply more weight to theobservations, and correspond to a larger IDI, but withmaxima (above 1) and gradients. Larger ε2 values implymore weight to the background field, smaller IDI values,and a more uniform field.

The analysis of the IDI field is intended to estimate alower bound, imposed by the station spatial distribution,for scales that can be detected and resolved by theobservational network. Above these minimum scales,however, it is possible to optimize the values of Dh,Dz, and ε2 by means of statistical parameter estimationmethods (Dee et al., 1999; Dee and Da Silva, 1999;Tarantola, 2005). These methods, by making use of theactual observations over the period chosen to calculateexpectation values, allow optimization of correlationscales with respect to spatial scales typically present inthe real atmospheric fields. Work on statistical parameterestimation is under way for temperature and it is plannedfor other variables. Preliminary results (not shown) seemto confirm the values chosen for temperature by means ofthe IDI field, with a remarkable variability for ε2 that mayperhaps be attributed to the representativity component ofobservational error.

3.3. Relative humidity

The procedure used for relative humidity analysis is asfollows. Since a temperature observation is always avail-able in correspondence to each relative humidity obser-vation (Figure 1), the ‘observed’ dew-point temperaturecan always be obtained. These dew-point temperatureobservations are homogeneous to temperature and can besimilarly treated. In particular, sharp vertical derivativechanges also occur for dew-point temperature profilesin the free atmosphere, affecting surface measurementsin orographic areas. A background field is then cal-culated from dew-point temperature observations, usingthe procedure described for temperature in Section 3.1.The subsequent interpolation is carried out for dew-point

Copyright 2008 Royal Meteorological Society Meteorol. Appl. (2008)DOI: 10.1002/met

Page 7: Three-dimensional spatial interpolation of surface ... · SPATIAL INTERPOLATION OF METEOROLOGICAL DATA FROM LOCAL NETWORKS R is a diagonal matrix, i.e. observational errors at differ-

SPATIAL INTERPOLATION OF METEOROLOGICAL DATA FROM LOCAL NETWORKS

(a) (b)

(c) (d)

(e) (f)

Figure 4. Temperature observations on 20 January 2006, 1200 h (1100 UTC). IDI fields (non-dimensional) obtained by varying the parameters withrespect to the chosen values Dh = 20 km, Dz = 500 m, ε2 = 0.5 (grey scale as in Figure 3): (a) Dh = 15 km; (b) Dh = 25 km; (c) ε2 = 0.3;

(d) ε2 = 0.7; (e) Dz = 400 m; (f) Dz = 600 m. All fields are masked out below the value 0.5.

temperature, with error covariance matrices estimated asdescribed in Section 3.2. In particular, the parametervalues are determined by making use of the OI filter-ing properties and of the IDI field: the resulting valuesare Dh = 30 km, Dz = 600 m, and ε2 = 0.5. These val-ues are larger than those found for temperature because

the relative humidity observation distribution is coarser,particularly in the Alpine area. A saturation check is per-formed both on the background and on the analysis field,by using the known temperature analysis and allowingfor over-saturation up to a maximum relative humidityof 103%. The dew-point temperature and temperature

Copyright 2008 Royal Meteorological Society Meteorol. Appl. (2008)DOI: 10.1002/met

Page 8: Three-dimensional spatial interpolation of surface ... · SPATIAL INTERPOLATION OF METEOROLOGICAL DATA FROM LOCAL NETWORKS R is a diagonal matrix, i.e. observational errors at differ-

F. UBOLDI ET AL.

analysis fields are then used to compute a relative humid-ity value on each grid point. Some small-scale details,unresolved by the hygrometer network, may appear inrelative humidity maps thanks to the higher resolution oftemperature observations.

3.4. Discussion

The interpolation scheme is applied to hourly-averagedobservations, as all stations in the network record andreport at least at hourly intervals. For temperature ithas been used as an operational tool, monitored daily,since January 2006. Relative humidity analysis mapshave been produced operationally since February 2007.The interpolation scheme has proved to be robust andsensitive to details and has correctly described majormesoscale features of temperature and relative humidityfields in the area.

There are known limits in the choices made, the mainbeing the unavailability of an independent backgroundfield. The background field built from the data is howevereffective in estimating the main trends present overthe area. Considering the relatively small size of thearea under consideration, these may be approximatelyattributed to the synoptic and meso-α scales. A betterrepresentation of the larger scales, such as that presentin a model field for example, would certainly improvethe quality of the analysis. However, as long as a modelfield is not available, imposing a more complicatedspatial dependence on the background field would berather arbitrary. It must be stressed, though, that sucha simple linear dependence on spatial coordinates wouldbe inadequate, even for de-trending purposes, in the caseof a larger area.

When a pseudo-background field is built from thedata, the assumption that background and observationalerror are independent may be put into question. Howeverthe (larger) scales present in the background field andthose (shorter) resolved in the analysis field may beconsidered uncorrelated. Parrish and Derber (1992) useda covariance matrix diagonal in the spectral space, thusassuming de-correlation between different scales; they

obtained a non-diagonal background error covariancematrix after transformation to the grid point space.

4. Application examples in strong gradient cases

This section shows two examples of temperature and rel-ative humidity analysis obtained through the OI schemein different synoptic situations. The cases discussed hererepresent a good test for the scheme for the presenceof strong temperature and humidity gradients across thearea. Resolving gradients and fronts poses a challengefor all analysis techniques and, in fact, also for obser-vational networks. In both cases the phenomena couldbe investigated from independent information (satelliteimages, soundings, synoptic analysis, SYNOP reports,surface observation network).

The quantitative diagnostic tools available to estimateanalysis error variance and bias and the results oftheir application to temperature analysis are presented inSection 5.

4.1. Strong ground based temperature inversion

Figure 5 refers to a winter case of high pressure subsi-dence causing persistent fog on the Po Plain, 20 January2006. The fog, of radiative origin, is associated with amarked ground based inversion, apparent also from the0600 UTC Linate radio sounding and corroborated bySYNOP reports (not shown). Figure 5(a) shows the tem-perature analysis field at 1200 h (1100 UTC, local time isUTC + 1 h). An intense temperature gradient is presentin the plain area. Figure 5(c) shows the relative humid-ity analysis field at the same time. In Figure 5(b) thecorresponding METEOSAT satellite image in the HRV(High Resolution Visible) channel is displayed: this isan informational source that is completely independent,and shows the presence of dense fog or low-level stra-tus clouds in the plain area. The correspondence betweenthe cloudy area of Figure 5(b) and the cold, humid areadefined by the strong temperature gradient in Figure 5(a)and by the relative humidity field in Figure 5(c) is quiteremarkable. The interpolation scheme takes advantage,

(b)(a) (c)

Figure 5. Fog on the Po Plain, 20 January 2006, 1200 h (1100 UTC): (a) temperature analysis (in °C, colour scale in Figure 6); (b) METEOSATsatellite image, visible channel; (c) relative humidity analysis (%).

Copyright 2008 Royal Meteorological Society Meteorol. Appl. (2008)DOI: 10.1002/met

Page 9: Three-dimensional spatial interpolation of surface ... · SPATIAL INTERPOLATION OF METEOROLOGICAL DATA FROM LOCAL NETWORKS R is a diagonal matrix, i.e. observational errors at differ-

SPATIAL INTERPOLATION OF METEOROLOGICAL DATA FROM LOCAL NETWORKS

in this case, of the high observation density in theplain, particularly for relative humidity. Moreover, thepseudo-background field, allowing for a vertical deriva-tive change, is particularly well suited to describe situa-tions where a main thermal inversion is present.

The CV score for the temperature analysis field shownin Figure 5 is 2.43 °C. This value, higher than the averagevalue of 1.49 °C (the CV score distribution is shown inFigure 8), is determined by contributions from stationslocated in the Alpine area. In fact, if the CV scorecalculation is restricted to the plain and to the orographicarea immediately facing the plain (thus including thegradient and the warm belt appearing in Figure 5(a),a value very close to the average is obtained, 1.61 °C.On the one hand, as it is apparent in Figure 2(a), theobserved temperature values appear quite dispersed forelevations higher than 500 m: the background field, inthis case, can only be considered a rough approximationfor the temperature field in the Alpine region. On theother hand, many observations are missing, almost allrelative to stations located in the mountain area: only206 stations provided useful measurements, while, forcomparison, 242 observations are used for the map ofFigure 6 (274 temperature stations are nominally active,August 2007). This lack of data limits the correctionthat the interpolation can perform to the backgroundfield. In conclusion, the temperature analysis field isquite satisfactory in the Po Plain and its immediatesurroundings, but it cannot be considered very accurate inthe Alpine areas in this particular case. Presumably, thesame is also true for relative humidity, due to the smallernumber of humidity observations and to the dependenceof the relative humidity field on the temperature analysis.

4.2. North foehn

Figures 6 and 7 refer to a completely different meteoro-logical situation. In this case, 12 March 2006, the west-ern part of Lombardia is under the influence of intensenortherly winds (caused by a meridional intensificationof the jet stream) which assumed the character of (north)foehn during the last part of the night and in the earlymorning. Heating due to adiabatic compression, a rela-tively frequent phenomenon in the mountain valleys inthe north-western part of Lombardia (Valchiavenna), wasalso detected in some western plain areas in this case.In the eastern part of the analysis domain, the high-level circulation assumed a cyclonic character. Very lowtemperature values were observed in the mountain areasto the northeast and light snowfall occurred during themorning in the south-eastern plain.

The temperature analysis field influenced by all thesephenomena is shown in Figure 6. The symbols on themap mark the locations of the four observing stationswhose measurements are plotted versus time in Figure 7.In the time plots of Figure 7(a)–(c), it is easy to see thefoehn onset, marked by the sudden increase in wind andtemperature and the strong decrease in relative humidity.

Figure 6. Foehn and snow case, 12 March 2006, 0500 h (0400 UTC):temperature analysis field. Symbols mark the four stations whose timeplots are shown in Figure 7. Closed circle in the northwest: Samolaco,elevation 210 m amsl, in the Alpine valley Valchiavenna; open squarein the western Po Plain: Minoprio, elevation 320 m; closed square inthe south-western Po Plain: Castello, elevation 110 m; closed triangle

in the south-eastern Po Plain: Mantova, elevation 20 m.

In Figure 6, mild temperatures (up to +9 °C) areevident in the western areas affected by the subsi-dence warming (foehn): Valchiavenna, where the symbolmarks Samolaco, Figure 7(a); the λ-shaped Como Lake;the western plain, where symbols mark Minoprio andCastello, respectively in Figure 7(b) and (c). It is worthnoting that in Castello’s time plot, Figure 7(c), foehnonsets about 2 h after Figure 6’s analysis time. In fact,in the map, a cold area still surrounds the observingstation: the progressive foehn expansion can be clearlyseen through the sequence of hourly analysis maps (notshown).

At the same time, the map of Figure 6 shows verylow temperatures (below −15 °C) in the north-easternmountains (Alta Valtellina) and a cold area (about 0 °C) inthe south-eastern plain, where a symbol marks Mantova,Figure 7(d). This last time plot shows very differentweather compared to the other stations of the same figure:weak winds and high relative humidity almost constantduring the whole morning.

This complex situation puts the approximations used,in particular those present in the background field, closeto their limits. The field shown in Figure 6 appears quiteacceptable, however, for diagnostic purposes. Its CVscore is 1.68 °C, a value near the average.

5. Diagnostics

The hourly maps produced by the algorithm appeardetailed and consistent with independent meteorologicalinformation. However, to understand the characteristics

Copyright 2008 Royal Meteorological Society Meteorol. Appl. (2008)DOI: 10.1002/met

Page 10: Three-dimensional spatial interpolation of surface ... · SPATIAL INTERPOLATION OF METEOROLOGICAL DATA FROM LOCAL NETWORKS R is a diagonal matrix, i.e. observational errors at differ-

F. UBOLDI ET AL.

0 8 12

–5

0

5

10

15

20Samolaco

0

20

40

60

80

100

4

0 8 124

0 8 124

0 8 124

–5

0

5

10

15

20Minoprio

0

20

40

60

80

100

–5

0

5

10

15

20Castello

0

20

40

60

80

100

–5

0

5

10

15

20Mantova

0

20

40

60

80

100

(a) (b)

(c) (d)

Figure 7. Foehn and snow case, 12 March 2006. Time plot of temperature (dashed line), relative humidity (thick solid line), and wind speed(thin solid line) observed by the four stations marked in Figure 6. Left y-axis: temperature (°C) and wind speed (ms−1); right y-axis: relativehumidity (%). Time on the x-axis is local time (UTC + 1). In all graphs the vertical line indicates the time corresponding to the map of Figure 6.

This figure is available in colour online at www.interscience.wiley.com/ma

of the interpolation scheme, a quantitative analysis hasbeen carried out for temperature using CV and biasestimates.

5.1. Cross validation score

The CV score, defined in Section 2.3, represents anestimate of the analysis root-mean-square error based onthe idea that each observation is used as an independentverification of the analysis field.

The CV score is routinely computed for temperatureanalysis fields using Equation (10). The values obtainedfor the application cases have been presented and dis-cussed in Section 4. A global CV score was computedfor temperature analysis fields by averaging the individ-ual hourly values over the whole year 2006, giving themean value CVscore = 1.49 °C. The 2006 hourly CV scoredistribution is shown in Figure 8.

CV SCORE [°C]

0.0 0.5 1.0 1.5 2.0 2.5 3.00

400

800

1200

Figure 8. Distribution of hourly temperature analysis CV scores for theyear 2006. The mean is CVscore = 1.49 °C.

0.0

1.0

2.0

<0.45 0.45 – 0.85 >0.85

CV–IDI

Figure 9. Box-plot of absolute values of CV residuals,∣∣ya

m − yom

∣∣ (°C),versus CV-IDI, yIDI

m (non-dimensional), for hourly 2006 temperatureanalysis.

A systematic check of cross validation residuals∣∣yam − yo

m

∣∣ for each observation site was carried out. Thebox-plot of Figure 9 shows the distribution of the CVresiduals (calculated in absolute value for each station)against the corresponding CV-IDI value yIDI

m (Section2.3). This parameter measures the data influence due tothe other stations: isolated stations have yIDI

m ≈ 0, whilestations located in densely observed areas have yIDI

m ≈ 1.In other words, the CV-IDI may be interpreted as ameasure of observational coverage (dependent on thecorrelation estimates). It can then be seen that the proba-bility of large errors, represented by the upper box height,decreases with increasing observational coverage. Themedian of the CV residuals is below 2 °C for all areas.For data-dense areas, characterized by yIDI

m > 0.85, themedian is smaller, about 1 °C: it is however larger thanzero because local scales affecting the observations are

Copyright 2008 Royal Meteorological Society Meteorol. Appl. (2008)DOI: 10.1002/met

Page 11: Three-dimensional spatial interpolation of surface ... · SPATIAL INTERPOLATION OF METEOROLOGICAL DATA FROM LOCAL NETWORKS R is a diagonal matrix, i.e. observational errors at differ-

SPATIAL INTERPOLATION OF METEOROLOGICAL DATA FROM LOCAL NETWORKS

filtered out by the interpolation. It is worth remarkingthat in data-sparse areas, corresponding to small CV-IDI values yIDI

m < 0.45, the analysis is determined by thebackground field. In this case the analysis error can beinterpreted as background error: its rather low averagevalues support the assumptions of Section 3.1.

5.2. Analysis bias

OI provides an unbiased analysis under the hypothesisthat both the observations and the background fieldare unbiased estimates of the atmospheric state. Inpractice, both background and observations can be biased(observation bias is mostly due to representativity error).The resulting analysis bias should then be estimated.

On station points, the (station dependent) backgroundbias

⟨εb⟩

and the observational/representativity bias 〈εo〉appear as components of the average innovation

〈d〉 = ⟨yo − yb

⟩ = ⟨εo⟩− ⟨εb

⟩(14)

The distribution of the 〈d〉 components evaluated onthe whole 2006 dataset is practically normal (Figure 10),with mean 0.02 °C and standard deviation 1.72 °C. Formost stations the average innovation is almost zero,but in some cases a background bias or representativitybias is present. A closer inspection shows featuresthat have intuitive explanations. Systematically negativeinnovations are mainly located in mountain valleys,because the background estimate on stations at lowelevation in valley floors is influenced by the warmerand more numerous stations in the Po Plain. On the otherhand, most of the stations with positive 〈d〉 are located inlarge urban areas, subject to the urban heat island effect.

Warm anomalies in urban areas and cold anomalies inmountain valleys are also visible in Figure 11, that showsthe analysis increment averaged over the year 2006. Thisfield can be written as

⟨xa − xb

⟩ = K(⟨εo⟩− ⟨εb

⟩)(15)

If the observations and the background were bothactually unbiased, this average field should be practicallyzero: the non-zero features can be attributed to thepresence of a background or a representativity bias,reduced or amplified by the interpolation (the effect ofthe gain matrix K). For example, the extension of thewarm anomaly observed by stations located in the urbanarea of Pavia (square marker in Figure 11) is amplifiedand extended in the south-eastern direction to a largedata-void area. Such an effect is typical of isotropiccorrelation functions in the presence of inhomogeneousdata distribution.

The urban heat island effect is presently neglectedboth in the background field and in the covarianceestimates. In the case of large urban areas, these warmanomalies should be considered as background errors,and their correction could be achieved by including landuse information in the covariance model. On the other

innovation [°C]

−10 −5 0 5 100

100000

250000

Figure 10. Frequency distribution of innovation (d = yo − yb , °C) onstation points for the year 2006.

Figure 11. Average temperature (°C) analysis increment⟨xa − xb

⟩for

the year 2006. The black dots represent the station sites. The squareencloses the urban area of Pavia.

hand, bias effects due to smaller scale features should beconsidered as representativity errors and filtered out. Thiscould be obtained by means of a station-dependent ε2.

Existing techniques to estimate and correct the back-ground bias in data assimilation (Dee and Da Silva, 1998;Dee and Todling, 2000) will be taken into considerationin future work.

6. Conclusions

This document presents a spatial interpolation schemebased on OI and suitable for observations from high-resolution local networks. Despite the presence of impor-tant representativity errors and the complex topography of

Copyright 2008 Royal Meteorological Society Meteorol. Appl. (2008)DOI: 10.1002/met

Page 12: Three-dimensional spatial interpolation of surface ... · SPATIAL INTERPOLATION OF METEOROLOGICAL DATA FROM LOCAL NETWORKS R is a diagonal matrix, i.e. observational errors at differ-

F. UBOLDI ET AL.

Lombardia, the observations do provide useful mesoscaleinformation, that is correctly recovered by the interpola-tion.

The resulting temperature and relative humidity analy-sis maps are presented in two example cases, chosen forthe presence of strong gradients that stress the approxima-tions used in the implementation. Independent observa-tions show that the analysis maps correctly describe rel-evant meteorological phenomena. The accuracy of tem-perature analysis fields is quantitatively evaluated throughthe distribution of hourly CV scores over the whole year2006 resulting in rather low values, always below 3 °Cand with average of about 1.5 °C. The presence of bias inthe analysis fields is evaluated by computing the averageanalysis increment, and it is discussed with reference torepresentativity bias and background bias.

The presented scheme has been running daily for overa year, producing hourly maps of temperature and relativehumidity. The analysis maps, quite satisfactory in allweather situations, are used for the daily activity ofweather analysis and forecasting at the Regional WeatherService of Lombardia. The quality of the hourly maps isgenerally analogous to that of the two examples shown(in fact it is often higher, as the CV scores of the examplefields are larger than the average).

A better knowledge on reliability and errors of theobservational network also resulted from the studies car-ried out to implement the interpolation scheme. An auto-matic quality control procedure based on the analysis ofCV residuals (Lorenc, 1984; Gandin, 1988) is currentlyunder development and will be the subject of a sepa-rate communication. By means of the spatial consistencycheck, it is possible to automatically detect and discardrough errors that inevitably affect the observations. TheIDI field, presented and used here to tune the correla-tion coefficients, also represents a useful tool for networkplanning and management: its changes allow quick eval-uation of the effects of adding, moving, or removingobservation sites.

Some limitations are necessarily present in any interpo-lation procedure that does not account for the atmospheredynamics. Even so, the simplifications employed in theimplementation of the interpolation scheme very seldomresulted in unrealistic features, always very local andwith small amplitude. Qualitative and quantitative diag-nostics indicate that, among the assumptions used in theimplementation, the least realistic appear to be the use ofisotropic correlation functions and the hypothesis of unbi-ased background and observations. Consequently, workis at present devoted to

• estimating the background-to-observation error ratio ε2

for each station, including local effects as representa-tivity error;

• refining the covariance estimates, with the aim ofexplicitly accounting for local terrain features (urbanheat islands, orientation of orographic slopes, andproximity to surface water masses) to attenuate thecorrelation between differently characterized sites;

• studying the feasibility of a bias estimation and correc-tion procedure (Dee and Da Silva, 1998) even thoughin the absence of a model background field, correctionsin data sparse areas may remain difficult;

• extending the quantitative CV and bias evaluation torelative humidity.

Improvements to the analysis scheme are regularlychecked by evaluating CV scores and by studying inno-vation and analysis increment statistics.

The realization of a full four-dimensional data assim-ilation system, including model dynamics and remotesensing information, may be beyond the immediate reachof small working groups with operational priorities. How-ever, there are plans to progressively merge all the infor-mation available into a multivariate analysis scheme, byextending the interpolation procedure to other variablesmeasured near the soil surface. Physical relations amongdifferent variables can then be included as weak or strong(Lagrangian) constraints, by implementing a variationalformulation of the analysis algorithm.

Appendix A: Observation influence

The ‘Degrees of Freedom for Signal’ (DFS) parameter isdefined as the trace of the influence matrix W. It has beenused with four-dimensional data assimilation schemes byCardinali et al. (2004) and Fourrie et al. (2006), whoused it to assess the impact of different observationalcomponents in the North Atlantic THORPEX RegionalCampaign. From Equation (4),

ya = Wyo + (I − W)yb (A.1)

Each element of the matrix W can be interpreted asthe first-order ‘sensitivity’ of a particular analysed valuewith respect to an observation

∂yam

∂yok

= [W]m,k (A.2)

SinceI − W = R(S + R)−1 (A.3)

if R is diagonal: R = diag(σ 2

o,1, . . . , σ2o,M

), the following

relation between the mth diagonal elements is obtained

1 − [W]m,m = σ 2o,m

[(S + R)−1]

m,m(A.4)

These diagonal elements are useful to obtain the CVanalysis. When R is diagonal one obtains

(ya

m − yom

) = σ 2o,m

[(S + R)−1]

m,m

(ya

m − yom

)(A.5)

A similar equation has been used by Cardinali et al.(2004).

Copyright 2008 Royal Meteorological Society Meteorol. Appl. (2008)DOI: 10.1002/met

Page 13: Three-dimensional spatial interpolation of surface ... · SPATIAL INTERPOLATION OF METEOROLOGICAL DATA FROM LOCAL NETWORKS R is a diagonal matrix, i.e. observational errors at differ-

SPATIAL INTERPOLATION OF METEOROLOGICAL DATA FROM LOCAL NETWORKS

The IDI for station m is re-defined here as the sumof all observational sensitivity contributions to the mthanalysed value:

yIDIm =

M∑k=1

∂yam

∂yok

(A.6)

where all the terms ∂yam/∂yo

k are non-dimensionalbecause, in the present univariate context, all observa-tions have the same physical dimensions. By making useof Equations (A.2) and (4), the IDI parameter can be eas-ily seen as the analysis obtained on the mth station whenall observed values are set to the value 1 and all back-ground values are set to 0, i.e. the intuitive definitiongiven in Section 2.3.

By making use of the CV-IDI parameter yIDIm , defined

in Section 2.3, a relation analogous to Equation (A.5)between IDI and the diagonal elements of the influencematrix (appearing in the DFS definition) is obtained whenR is diagonal:

(yIDI

m − 1) = (

1 − [W]m,m

) (yIDI

m − 1)

(A.7)

In principle, the IDI definition can be generalized toother assimilation algorithms, at least for each dimen-sionally homogeneous component of an observationalnetwork.

Appendix B: Updating the inverse matrix

As long as observations are taken at stations having fixedlocations, and covariance estimates are stationary in time,the error covariance matrices change only because ofoccasionally missing data. Moreover, direct knowledgeof the diagonal elements of the matrices (S + R)−1 andS(S + R)−1 is useful for diagnostic purposes and forquality control. It is then useful and efficient to computethe inverse matrix (S + R)−1 for a typical data set, tostore it in memory, and to update it at each analysistime to account for changes in the data set. For the caseof a missing datum this means to discard a row and acolumn of the matrix (S + R); for the case of a newdatum, absent or invalid at the previous analysis time,this means inserting an additional row and column (theyare inserted at the end here: they can be sorted afterwards,if needed).

Consider the block decomposition of the (M ,M) sym-metric matrix S + R:

S + R =[

A bbT c

](B.1)

Here the scalar c = [S + R]M,M is the last diagonalelement of S + R; b is a column vector of length M − 1,so that

[bT c

]is the Mth row of the symmetric matrix S +

R (and the transpose of its Mth column); A is the (M − 1,M − 1) symmetric sub-matrix obtained from S + R by

taking its last column and row away. Proceeding in thesame way, the inverse of S + R is

(S + R)−1 =[

X qqT z

](B.2)

where X, q, and z have respectively the same dimensionsas A, b, c. Define I as the identity matrix of order M − 1,0 as the column vector, and 0T the corresponding rowvector, composed of (M − 1) zeroes. The inverse matrixrelation is then

[A bbT c

] [X qqT z

]=[

I 00T 1

](B.3)

that is to say

AX + bqT = I (B.4)

Aq + bz = 0 (B.5)

bTX + cqT = 0T (B.6)

bTq + cz = 1 (B.7)

From M − 1 to M

In case the inverse A−1 of A is known and one newobservation becomes available, then the inverse of orderM can be computed by defining the auxiliary vector p as

p = A−1b (B.8)

and proceeding as follows:

z = 1

bTp − c(B.9)

q = −zp (B.10)

X = A−1 − pqT (B.11)

This can be done under the condition that A−1 existsand that c �= bTA−1b.

From M to M − 1

When the inverse of order M is known and the mthobservation is missing, then the inverse of order M − 1,A−1, can be computed, if zm �= 0, as

A−1 = Xm + 1

zm

qmqTm (B.12)

zm is here the mth diagonal element of (S + R)−1;qm is the mth column; qm

T the mth row (without thediagonal element zm); and Xm is what remains of theinverse matrix when the mth row and column are takenaway.

Copyright 2008 Royal Meteorological Society Meteorol. Appl. (2008)DOI: 10.1002/met

Page 14: Three-dimensional spatial interpolation of surface ... · SPATIAL INTERPOLATION OF METEOROLOGICAL DATA FROM LOCAL NETWORKS R is a diagonal matrix, i.e. observational errors at differ-

F. UBOLDI ET AL.

Appendix C: Response function of the optimalinterpolation for an ideal data distribution

The OI analysis increment, when R = ε2I is

δx = G(S + ε2I

)−1d (C.1)

It is well known that this expression can be obtainedas the limit of the SC iteration (Bratseth, 1986; Daley,1991; Pedder, 1993; Uboldi and Buzzi, 1994). In fact,the inverse matrix can be written as the sum of aseries [converging when all eigenvalues of (S + ε2I) havevalues between 0 and 2]:

δx = G∞∑

j=0

[I − (S + ε2I

)]jd (C.2)

where each partial sum

δxn = Gn∑

j=0

[I − (S + ε2I

)]jd = G

n∑j=0

sj (C.3)

can be interpreted as an n-step SC scheme (Uboldi andBuzzi, 1994), with

sj = [I − (S + ε2I

)]sj−1 (C.4)

s0 = d (C.5)

Barnes (1964, 1973) calculated the response functionof the SC scheme in the idealized case of a one-dimensional continuous data distribution and Gaussiancorrelation functions. By proceeding in the same way,the OI response function can be calculated as the limit.In the same idealized condition, each field is a functionof the space coordinate ξ , and the matrix multiplicationbecomes a convolution:

δx(ξ) =∞∑

j=0

gj (ξ) (C.6)

gj (ξ) = 1√2πD

∫e−1

2

(ξ − η

D

)2

sj (η)dη (C.7)

sj (ξ) = (1 − ε2) sj−1(ξ) − gj−1(ξ) (C.8)

s0(ξ) = d(ξ) (C.9)

The innovation, the SC analysis at step n, and the OIanalysis (the limit) are then written as Fourier integrals.Respectively,

d(ξ) = 1√2π

∫e−ikξω(k)dk (C.10)

δxn(ξ) = 1√2π

∫e−ikξ γn(k)dk (C.11)

δxOI (ξ) = 1√2π

∫e−ikξ γ∞(k)dk (C.12)

0

0.2

0.4

0.6

0.8

1

1.2

0 0.5 1 1.5 2

λ/D

Figure 12. Response function for OI (solid), and for different SC steps:1 (dashed), 2 (short-dashed), 5 (dotted). The SC curves approach theOI curve as the step number increases. The independent variable iswavelength λ in unit D. The horizontal line corresponds to the limit

value 11 + ε2 .

The response function for SC at step n is calculatedby making use of the properties of the Fourier transformand of the Gaussian functions. Written as a function ofwavelength λ = 1

kit is

n

(λ; ε2, D

) ≡ γn

ω=1 −

1 −

ε2 + e

−12

(Dλ

)2

n+1

e−1

2

(Dλ

)2

ε2 + e−1

2

(Dλ

)2 (C.13)

The limit, i.e. the response function for OI is then

OI

(λ; ε2, D

) ≡ γ∞ω

= e−1

2

(Dλ

)2

ε2 + e−1

2

(Dλ

)2

= 1 − ε2

ε2 + e−1

2

(Dλ

)2 (C.14)

These functions are plotted in Figure 12: they appearas approximations of a step function. The SC and OIanalysis are then interpreted as realizing a low-pass filterwhose cut-off wavelength is approximately D.

Acknowledgements

The authors thank Maria Ranci who participated indiscussions regarding the statistical evaluation of theinterpolation scheme and Umberto Pellegrini for hishelp in the meteorological characterization of the casespresented here. The data used in the presented resultswere provided by ARPA Lombardia. A previous version

Copyright 2008 Royal Meteorological Society Meteorol. Appl. (2008)DOI: 10.1002/met

Page 15: Three-dimensional spatial interpolation of surface ... · SPATIAL INTERPOLATION OF METEOROLOGICAL DATA FROM LOCAL NETWORKS R is a diagonal matrix, i.e. observational errors at differ-

SPATIAL INTERPOLATION OF METEOROLOGICAL DATA FROM LOCAL NETWORKS

of the analysis technique was implemented for ARPAPiemonte and another version was recently adapted forthe weather service of P. A. Trento. Part of this work hasbeen performed within the project ‘FORALPS – Meteo-hydrological forecast and observations for improvedwater resource management in the Alps’, supported bythe European Union through the European RegionalDevelopment Fund under the Interreg III B ‘AlpineSpace’ Initiative.

References

Barnes SL. 1964. A technique for maximizing details in numerical mapanalysis. Journal of Applied Meteorology 3: 395–409.

Barnes SL. 1973. Mesoscale objective map analysis using weightedtime-series observations. Technical report ERL NSSL-62, NationalOceanic and Atmospheric Administration (NOAA): Washington,DC.

Bennett AF. 2002. Inverse Modeling of the Ocean and Atmosphere.Cambridge University Press: New York.

Bratseth A. 1986. Statistical interpolation by means of successivecorrections. Tellus 38A: 439–447.

Cardinali C, Pezzulli S, Andersson E. 2004. Influence-matrix diagnos-tic of a data assimilation system. Quarterly Journal of the RoyalMeteorological Society 130: 2767–2786.

Daley R. 1991. Atmospheric Data Analysis. Cambridge UniversityPress: Cambridge, UK.

Dee PD, Da Silva A. 1998. Data assimilation in the presence of forecastbias. Quarterly Journal of the Royal Meteorological Society 124:269–295.

Dee PD, Da Silva A. 1999. Maximum-Likelihood estimation of fore-cast and observation error covariance parameters. Part I:methodology. Monthly Weather Review 127: 1822–1834.

Dee PD, Gaspari G, Redder C, Rukhovets L, Da Silva A. 1999.Maximum-Likelihood estimation of forecast and observation errorcovariance parameters. Part II: Applications. Monthly WeatherReview 127: 1835–1849.

Dee PD, Todling R. 2000. Data assimilation in the presence of forecastbias: the GEOS moisture analysis. Monthly Weather Review 128:3268–3282.

Fourrie N, Marchal D, Rabier F, Chapnik B, Desroziers G. 2006.Impact study of the 2003 North Atlantic THORPEX RegionalCampaign. Quarterly Journal of the Royal Meteorological Society132: 275–295.

Gandin LS. 1963. Objective Analysis of Meteorological Fields ,Gidromet, Leningrad. English translation by Israeli Program forScientific Translations, Jerusalem.

Gandin LS. 1988. Complex quality control of meteorologicalobservations. Monthly Weather Review 116: 1137–1156.

Kalnay E. 2003. Atmospheric Modeling, Data Assimilation andPredictability. Cambridge University Press: Cambridge, UK.

Lorenc A. 1984. Analysis methods for the quality control ofobservations. In Proceedings of ECMWF Workshop on the useand Quality Control of Meteorological Observations for NumericalWeather Prediction. 6–9 November 1984 , Reading, 397–428.

Lorenc A. 1986. Analysis methods for numerical weather prediction.Quarterly Journal of the Royal Meteorological Society 112:1177–1194.

Parrish D, Derber J. 1992. The national meteorological centre’sspectral statistical-interpolation analysis system. Monthly WeatherReview 120: 1747–1763.

Pedder MA. 1993. Interpolation and filtering of spatial observationsusing successive corrections and Gaussian filters. Monthly WeatherReview 121: 2889–2902.

Steinacker R, Haberli C, Pottschacher W. 2000. A transparent methodfor the analysis and quality evaluation of irregularly distributed andnoisy observational data. Monthly Weather Review 128: 2303–2316.

Tarantola A. 2005. Inverse Problem Theory and Methods for ModelParameter Estimation. SIAM: Philadelphia, PA.

Uboldi F, Buzzi A. 1994. Successive-correction methods applied tomesoscale meteorological analysis. Il Nuovo Cimento 17C: 745–761.

Uboldi F, Kamachi M. 2000. Time-space weak constraint dataassimilation with nonlinear models. Tellus 52A: 412–421.

Wahba G, Wendelberger J. 1980. Some new mathematical methodsfor variational objective analysis using splines and cross validation.Monthly Weather Review 108: 1122–1143.

WMO. 1996. Guide to Meteorological Instruments and Methodsof Observation , World Meteorological Organization. WMO-No.8Edition.

Copyright 2008 Royal Meteorological Society Meteorol. Appl. (2008)DOI: 10.1002/met