Transcript of 9780511626166_9780521583121

Engineers and applied geophysicists routinely encounter interpolation and estimation problems when analyzing data from field observations. Introduction to Geostatistics presents practical techniques for the estimation of spatial functions from sparse data. The author's unique approach is a synthesis of classic and geostatistical methods, with a focus on the most practical linear minimum-variance estimation methods, and includes suggestions on how to test and extend the applicability of such methods.

The author includes many useful methods often not covered in other geostatistics books, such as estimating variogram parameters, evaluating the need for a variable mean, parameter estimation and model testing in complex cases (e.g., anisotropy, variable mean, and multiple variables), and using information from deterministic mathematical models.

Well illustrated with exercises and worked examples taken from hydrogeology, Introduction to Geostatistics assumes no background in statistics and is suitable for graduate-level courses in earth sciences, hydrology, and environmental engineering and also for self-study.


    INTRODUCTION TO GEOSTATISTICS


    STANFORD-CAMBRIDGE PROGRAM

The Stanford-Cambridge Program is an innovative publishing venture resulting from the collaboration between Cambridge University Press and Stanford University and its Press.

The Program provides a new international imprint for the teaching and communication of pure and applied sciences. Drawing on Stanford's eminent faculty and associated institutions, books within the Program reflect the high quality of teaching and research at Stanford University.

The Program includes textbooks at undergraduate and graduate level, and research monographs, across a broad range of the sciences.

Cambridge University Press publishes and distributes books in the Stanford-Cambridge Program throughout the world.


INTRODUCTION TO GEOSTATISTICS: Applications to Hydrogeology

P. K. KITANIDIS
Stanford University

CAMBRIDGE UNIVERSITY PRESS


PUBLISHED BY THE PRESS SYNDICATE OF THE UNIVERSITY OF CAMBRIDGE
The Pitt Building, Trumpington Street, Cambridge, United Kingdom

CAMBRIDGE UNIVERSITY PRESS
The Edinburgh Building, Cambridge CB2 2RU, UK http://www.cup.cam.ac.uk
40 West 20th Street, New York, NY 10011-4211, USA http://www.cup.org
10 Stamford Road, Oakleigh, Melbourne 3166, Australia

© P. K. Kitanidis 1997

This book is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First published 1997
Reprinted 1999

    A catalogue record for this book is available from the British Library

    Library of Congress Cataloguing-in-Publication data

Kitanidis, P. K. (Peter K.)
Introduction to geostatistics : applications to hydrogeology / P. K. Kitanidis
p. cm. Includes bibliographical references
1. Hydrogeology - Statistical methods. I. Title
GB1001.72.S7K57 1997
551.49'072-dc20 96-28608 CIP

ISBN 0 521 58312 8 hardback
ISBN 0 521 58747 6 paperback

    Transferred to digital printing 2003


viii Contents

3.3 Intrinsic isotropic model 51
3.4 Common models 54
3.4.1 Stationary models 54
3.4.2 Intrinsic nonstationary models 61
3.4.3 Model superposition 62
3.4.4 Special topic: microstructure 64
3.5 Interpolation using kriging 65
3.6 Kriging system 67
3.7 Kriging with moving neighborhood 71
3.8 Nonnegativity 72
3.9 Derivation 72
3.10 The function estimate 74
3.11 Conditional realizations 76
3.11.1 Point simulation 76
3.11.2 Function simulation 77
3.12 Properties of kriging 77
3.13 Uniqueness 79
3.14 Kriging as exact interpolator 79
3.15 Generalized covariance functions 80
3.16 Key points of Chapter 3 82
4 Variogram fitting 83
4.1 The problem 83
4.2 Prior information 84
4.3 Experimental variogram 85
4.4 Residuals 86
4.4.1 The concept 86
4.4.2 Residuals in kriging 89
4.4.3 Distribution of residuals 90
4.5 Model validation 91
4.5.1 Q1 statistic 93
4.5.2 Q2 statistic 95
4.5.3 Normality 96
4.5.4 No correlation 96
4.5.5 Ordering 96
4.6 Variogram fitting 96
4.7 On modeling 99
4.8 Estimation simulator 101
4.8.1 Synthetic realizations 101
4.8.2 Example 1 102
4.8.3 Example 2 105
4.9 Key points of Chapter 4 109


Contents ix

5 Anisotropy 110
5.1 Examples of anisotropy 110
5.2 Overall approach 112
5.3 Directional variogram 113
5.4 Geoanisotropy 114
5.4.1 General 114
5.4.2 Two dimensions 116
5.4.3 Three dimensions 117
5.5 Practical considerations 118
5.6 Key points of Chapter 5 119
6 Variable mean 120
6.1 Limitations of constant mean 120
6.2 The linear model 123
6.3 Estimation with drift 125
6.4 Generalized covariance function 127
6.5 Illustration of the GCF concept 130
6.6 Polynomial GCF 133
6.7 Stationary-increment processes 134
6.8 Splines 135
6.9 Validation tests 136
6.10 Parameter estimation 139
6.11 On model selection 140
6.12 Data detrending 143
6.13 An application 143
6.14 Variance-ratio test 145
6.15 Key points of Chapter 6 148
7 More linear estimation 150
7.1 Overview 150
7.1.1 Kriging with known mean 151
7.1.2 Estimation of drift coefficients 151
7.1.3 Estimation of continuous part 151
7.1.4 Spatial averaging 152
7.2 Estimation with known mean 152
7.2.1 Known drift 152
7.2.2 Simple kriging 154
7.3 Estimation of drift coefficients 156
7.4 Continuous part 158
7.4.1 Rationale 158
7.4.2 Kriging equations 159


x Contents

7.5 Spatial averaging 163
7.5.1 Stationary 163
7.5.2 Intrinsic 165
7.5.3 Variable mean 166
7.6 Spatial averaging implementation 166
7.6.1 Nugget effect 166
7.6.2 Numerical quadrature 167
7.6.3 Examples 168
7.7 Key points of Chapter 7 171
8 Multiple variables 172
8.1 Joint analysis 172
8.2 Second-moment characterization 173
8.3 Best linear unbiased estimation 175
8.4 Another cokriging example 177
8.5 Model selection 179
8.6 Method of auxiliary variables 180
8.7 Matrix form 182
8.7.1 Linear model 182
8.7.2 BLUE 182
8.7.3 Parameter estimation 183
8.8 Key points of Chapter 8 183
9 Estimation and GW models 184
9.1 Groundwater models 184
9.2 Using a mathematical model 187
9.2.1 Equations and conditions 187
9.2.2 Log-transmissivity 188
9.2.3 Derivation of joint moments 188
9.3 Monte Carlo simulations 189
9.3.1 The approach 189
9.3.2 Example 190
9.4 First-order approximation 196
9.4.1 The principle 196
9.4.2 Linearization 196
9.4.3 Moment derivation 198
9.5 Large-domain analytical solutions 205
9.6 Numerical small-perturbation method 212
9.7 An inverse problem 214
9.8 Key points of Chapter 9 219
9.9 Addendum 220


    Contents xi

    A Probability theory review 221A.I Introduction 221

    A. 1.1 Experiments 221A. 1.2 Probability 222A. 1.3 Random variables 222

    A.2 Jointly distributed random variables 223A.3 Expectation and mom ents 225A.4 Functions of random variables 227A.5 Further reading 228A.6 Review exercises 228

    B Lagrange multipliers 232B.I The method 232B.2 Explanation 233

    C Generation of realizations 235C.I Background 235C.2 Linear transformation 236C.3 Random vector 237C.4 Stationary random functions 238

    References 239Index 247


List of tables

1.1 Transmissivity data for example in this section 3
1.2 Porosity versus location (depth) 9
2.1 Head observations in a regional confined aquifer 14
2.2 TCE concentrations in groundwater in a vertical cross section 15
2.3 Summary statistics for transmissivity data of Table 1.1 21
2.4 Summary statistics for head data of Table 2.1 21
2.5 Summary statistics for concentration data of Table 2.2 22
4.1 The 0.025 and 0.975 percentiles of the Q2 distribution 94
4.2 Coefficients for the Filliben test for departures from normality 95
4.3 Optimizing the fit 107
6.1 Data for illustrative example 121
7.1 Data (observation and location) 162
7.2 Coordinates of the center of each element 170
9.1 Measured transmissivity and coordinates of observation points 214
9.2 Measured head and coordinates of observation points 214


List of figures

1.1 Location of transmissivity measurements (o) and unknown (x) 4
1.2 Distribution of nearly normal data 6
2.1 Plan view of aquifer showing location of head data 13
2.2 Vertical cross section with location of TCE measurements 14
2.3 Histogram of transmissivity data 16
2.4 Histogram of head data 17
2.5 Histogram of TCE concentration data 17
2.6 Ogive of transmissivity data 18
2.7 Ogive of head data 18
2.8 Ogive of concentration data 19
2.9 Box plot of transmissivity data 22
2.10 Box plot of head data 23
2.11 Box plot of TCE concentration data 23
2.12 Histogram and theoretical distribution of normal data 24
2.13 Box plot of the logarithm of the concentration data 25
2.14 Plot of data versus the first spatial coordinate 28
2.15 Plot of data versus the second spatial coordinate 29
2.16 Plot showing the location and relative magnitude of data (o < median, x > median). When outliers are present, they are indicated by * 29
2.17 Draftsman's display of some three-dimensional data 30
2.18 Raw and experimental variogram of transmissivity data 31
2.19 Experimental variogram of the head data 32
2.20 Plot of the measurements for the discontinuous case 34
2.21 Experimental variogram for the discontinuous data 35
2.22 Plot of the data for the parabolic case 36
2.23 Plot of the slopes for the parabolic case 36
2.24 Experimental variogram for the parabolic case 37


xvi List of figures

2.25 The data for the linear case 37
2.26 Slope of data for linear case 38
2.27 Experimental variogram for linear case 39
2.28 Experimental variogram indicative of stationary and nonstationary behavior 40
3.1 The interpolation problem. Observations are indicated by the symbol o. 42
3.2 Five realizations from a family of z(x) functions 43
3.3 Five realizations from another family of functions 44
3.4 Five realizations from yet another family of functions 45
3.5 Plot of covariance function (periodic) 48
3.6 Five sample functions 49
3.7 Conditional mean, given z(0) = 0.5 50
3.8 Conditional variance, given z(0) = 0.5 50
3.9 Sample function and model for Gaussian variogram and covariance function 55
3.10 Sample function and model for exponential variogram and covariance function 57
3.11 Sample function and model for spherical variogram and covariance function 58
3.12 Sample function and model for hole-effect variogram and covariance function 59
3.13 Sample function and model for nugget-effect variogram and covariance function 60
3.14 Sample function for power variogram with exponent 0.4 62
3.15 Sample function for linear variogram 63
3.16 Experimental variogram and model 64
4.1 Variogram (continuous line) fitted to experimental variogram (dashed line) 85
4.2 Fitting a straight line to data 87
4.3 Fitting a straight line to data that would better fit a quadratic 88
4.4 Probability density function of Q1 statistic for n = 20 92
4.5 Probability density function of Q2 for n = 20 94
4.6 Recommended procedure to estimate variogram parameters 99
4.7 List for checking the fit of the model 100
4.8 Plot of observations versus locations for Example 1 102
4.9 Experimental variogram and preliminary fit 103
4.10 Experimental variogram and fit from residuals 103
4.11 Experimental variogram and fitted model for Example 1 104
4.12 Data for Example 2 105


List of figures xvii

4.13 Experimental variogram and preliminary fit for Example 2 106
4.14 Experimental variogram and preliminary fit for Example 2 using another discretization 106
4.15 Experimental variogram and best fit estimate of variogram 107
4.16 Orthonormal residuals pass normality test 108
4.17 Experimental variogram of residuals in nugget effect 108
5.1 Vertical cross section of alluvial formation 111
5.2 Variograms in horizontal and vertical directions 111
5.3 Streamlines and iso-concentration contour lines 112
5.4 Vertical dimension stretched by a factor 12 115
5.5 Rotation of coordinate system (new system is denoted by *) 116
6.1 Experimental and model variogram for intrinsic case 122
6.2 Comparison of best estimates 122
6.3 Comparison of mean square error of estimation 123
6.4 Interpolation through data (shown as o) using cubic and thin-plate splines 136
6.5 How to proceed in developing a model with variable mean 137
6.6 Experimental (semi)variogram of original data. (Adapted after [83].) 144
6.7 Experimental (semi)variogram of detrended data. (Adapted after [83].) 144
6.8 Experimental variogram and fitted equation (exponential GCF) using detrended data. (Adapted after [83].) 145
7.1 Simple kriging lambda coefficient as function of distance from observation 155
7.2 Mean square estimation error 156
7.3 Variogram of z versus variograms of zc and η 159
7.4 Results from ordinary kriging 162
7.5 Results from continuous-part kriging 163
7.6 Map of area, showing subdivision into blocks and location of observations 169
9.1 Example of a simple flow domain 187
9.2 Mean head from Monte Carlo simulations 192
9.3 Variance of head from Monte Carlo simulations 192
9.4 Correlation of head at x with head at x = 0.25 193
9.5 Cross-covariance of head and log-conductivity at same location 193
9.6 Correlation of log-conductivity at x with head at x = 0.25 194
9.7 Log-conductivity variance from Monte Carlo simulations 194
9.8 Comparison of head variance from two sets, each with 400 realizations 195


xviii List of figures

9.9 Comparison of head-log-conductivity covariance (same location) from two sets, each with 400 realizations 195
9.10 Comparison for mean head (σ² = 1) 201
9.11 Comparison for head variance (σ² = 1) 201
9.12 Comparison of φ(0.25) to φ(x) correlation (σ² = 1) 202
9.13 Comparison of cross-covariance (σ² = 1) 202
9.14 Comparison of φ(0.25) to Y(x) correlation (σ² = 1) 203
9.15 Comparison of mean head (σ² = 2) 203
9.16 Comparison of head variance (σ² = 2) 204
9.17 Comparison of φ(0.25) to φ(x) correlation (σ² = 2) 204
9.18 Comparison of cross-covariance (σ² = 2) 205
9.19 Comparison of φ(0.25) to Y(x) correlation (σ² = 2) 205
9.20 Influence function for one dimension. Positive η indicates that the head fluctuation is downgradient of the log-conductivity perturbation 207
9.21 Influence function in two dimensions 208
9.22 Contour plot of cross-covariance 209
9.23 The cross-covariance in the direction of flow 210
9.24 The cross-covariance in the two principal directions; r1 is the distance in the direction of flow 210
9.25 Contour plot of the variogram 211
9.26 True transmissivity and location of transmissivity observations 215
9.27 True head and location of head observations 216
9.28 Location of transmissivity data and estimate of transmissivity obtained using only these data 216
9.29 Location of head data and estimate of head obtained using only these data 217
9.30 Estimate of transmissivity obtained from transmissivity and head observations 218
9.31 Estimate of head obtained from transmissivity and head observations 219
B.1 Contour lines of constant MSE and the line of the constraint 234


Preface

This book grew out of class notes for a course that served two purposes:

1. To familiarize graduate students in hydrogeology and environmental engineering with some practical methods for solving interpolation and related estimation problems. The course emphasized geostatistical methods.

2. To illustrate how one may use data to develop empirical models, that is, to provide an introduction to applied statistical modeling.

Engineers and applied geophysicists routinely encounter estimation problems: From data, they must surmise the values of unknown variables. A case in point is drawing the map of the free surface (elevation of water table) in an aquifer from observations of the free-surface elevation at a few wells in addition to other information. A practitioner faces such estimation problems much more often than some other problems that are covered in the academic curriculum. And yet most practitioners have received no formal instruction on estimation and, in many cases, are unable to judge the applicability of a method or to interpret the results produced by software they use. Their efforts to develop a grasp of the subject are frustrated by unfamiliar jargon. Indeed, the type of training one receives in mathematics and physics does not help one to develop the skills needed for using data to build empirical models. I believe that it is absurd to expect one to "pick up" estimation methods without some systematic training and a fair degree of effort.

After teaching the course for about ten years, I realized that there might be room for a textbook that presumes no background in statistics and that uses common sense to motivate and justify methods used for estimation of spatial functions. This book resulted from this realization. As it propounds methods and tools that the practitioner is likely to use, the book discusses in plain terms the reasoning involved in building up empirical models and fitting parameters. Jargon and mathematical abstraction have been avoided as much


as possible. Nevertheless, the student is expected to have had a calculus-based course in probability theory and to have at least a rudimentary knowledge of linear algebra.

The book could have been much shorter if a more abstract approach had been followed. However, to write a single equation that describes ten different applications does not mean that one has understood at once all these applications! To proceed from the general to the specific is mathematically elegant but more appropriate for advanced texts, because it requires some degree of familiarity with the methods. For an introductory textbook, particularly on a subject so foreign to the intended audience, my experience has taught me that the only approach that works is to proceed from the simple and specific to the more complex and general. The same concepts are discussed several times, every time digging a bit deeper into their meaning.

Because statistical methods rely to a great extent on logical arguments, it is particularly important to study the book from the beginning. Although this book may appear to be full of equations, it is not mathematically difficult provided again that one starts from the beginning and becomes familiar with the notation. The book is intended for a one-semester course for graduate-level engineers and geophysicists and also can be used for self-study. The material is limited to linear estimation methods: That is, we presume that the only statistics available are mean values and covariances. I cannot overemphasize the point that the book was never meant to be a comprehensive review of available methods or an assessment of the state of the art.

Every effort has been made to catch mistakes and typographical errors, but for those that are found after the publication of the book, a list of errata will be maintained at

http://www-ce.stanford.edu/cive/faculty/Kitanidis.html

I thank all my co-workers and students at Stanford and elsewhere who with their comments have assisted me so much in writing this book.

Peter K. Kitanidis
Palo Alto, California
September 1996


1

Introduction

1.1 Introduction

It is difficult and expensive to collect the field observations that an environmental engineer or hydrogeologist needs to answer an engineering or scientific question. Therefore, one must make the best use of available data to estimate the needed parameters. For example, a large number of measurements are collected in the characterization of a hazardous-waste site: water-surface level in wells, transmissivity and storativity (from well tests), conductivity from permeameter tests or borehole flowmeters, chemical concentrations measured from water and soil samples, soil gas surveys, and others. However, because most subsurface environments are complex, even a plethora of data is not sufficient to resolve with accuracy the distribution of the properties that govern the rates of flow, the rates of transport and transformation of chemicals, or the distribution of concentrations in the water and the soil. The professionals who analyze the data must fill in the gaps using their understanding of the geologic environment and of the flow, transport, or fate mechanisms that govern the distribution of chemicals.

However, process understanding is itself incomplete and cannot produce a unique or precise answer. Statistical estimation methods complement process understanding and can bring one closer to an answer that is useful in making rational decisions. Their main contribution is that they suggest how to weigh the data to compute best estimates and error bounds on these estimates. Statistics has been aptly described as a guide to the unknown; it is an approach for utilizing observations to make inferences about an unmeasured quantity. Rather than the application of cookbook procedures, statistics is a rational methodology to solve practical problems. The purpose of this book is to provide some insights into this methodology while describing tools useful in solving estimation problems encountered in practice. Two examples of such problems are point estimation and averaging.


In point estimation one uses measurements of a variable at certain points to estimate the value of the same variable at another point. For example, consider measurements of concentration from the chemical analysis of soil samples from borings. The question is how to estimate the concentration at the many other locations where soil samples are unavailable. Another example is drawing lines of constant transmissivity (in other words, contour lines of transmissivity) from the results of pumping tests at a number of nonuniformly spaced wells. Drawing a contour map is equivalent to interpolating the values of the transmissivity on a fine mesh. Examples can be found in references [54, 55, 105, 106, 119, 116, 141, and 15].

In averaging one uses point estimates of concentration to determine the average concentration over a volume of soil; this estimate is needed for evaluation of the total mass of contaminants. Another example, drawn from surface hydrology, is the estimation of mean areal precipitation over a watershed from measurements of rainfall at a number of rain gages.

Due to complexity in the spatial variability of the variables involved, one cannot obtain exact or error-free estimates of the unknowns. Statistical methods weigh the evidence to compute best estimates as well as error bars that describe the potential magnitude of the estimation error. Error bars, or information about how much faith to put in the estimates, are essential in making engineering decisions. Statistical methods are applied with increased frequency to evaluate compliance with regulatory requirements because the best one can do is to provide a reasonable degree of assurance that certain criteria have been met. Also, using statistics one can anticipate the impact of additional measurements on error reduction before the measurements are taken. Thus, statistics is useful in deciding whether the present data base is adequate for detecting all important sources of contamination and, if not, where to collect the additional measurements so that the objectives of monitoring (such as demonstrating regulatory compliance) are met in the most cost-effective way.

Once one masters the application of the statistical methodology to relatively simple problems, such as those above, one can tackle more complicated problems such as estimating one variable from measurements of another. It is often convenient to use a variable that can be easily observed or computed to estimate another variable that is difficult or expensive to measure. For example: (a) land topography may be used to estimate the phreatic surface elevation of a surficial aquifer; (b) overburden and aquifer thickness may correlate and can be used to estimate the transmissivity of a confined permeable layer; and (c) hydraulic head measurements may provide information about the transmissivity and vice versa.

Such problems can be rather challenging, particularly if one integrates data analysis with mathematical models that simulate geologic, flow, or transport


processes. This book deals with relatively simple applications, but the same general methodology applies to complicated problems as well. In fact, the power of the methods to be described becomes most useful when utilizing measurements of different types, combining these with deterministic flow and transport models, and incorporating geological information to achieve the best characterization possible. However, one is advised not to attempt to solve complex problems before developing a sound understanding of statistical techniques, which is obtained only through practice starting with simpler problems. Statistical methods are sometimes misapplied because professionals who use them have received no training and apply them without awareness of the implicit assumptions or a firm grasp of the meaning of the results. (It has been said that they are often used the way a drunk uses a lamppost: for support rather than illumination.) Blindly following methods one does not understand provides countless opportunities for misapplication.

1.2 A simple example

This first example will be used to provide context and motivate the application of some of the techniques that will follow. (To understand all the details, some readers will find it useful to go through the review of basic probability theory presented in Appendix A.)

Well tests were conducted at eight wells screened in a confined aquifer providing values of the transmissivity. (The location of the wells on plan view is shown as o in Figure 1.1. The values are given in Table 1.1.) The question is:

Given the information currently available, if a well were drilled at another location (indicated by an x in Figure 1.1) and a similar pumping test were conducted, what value of the transmissivity would be observed?

Table 1.1. Transmissivity data for example in this section

    T (m2/day)    x (km)    y (km)
    2.9           0.876     0.138
    2.5           0.188     0.214
    4.7           2.716     2.119
    4.2           2.717     2.685
    4.2           3.739     0.031
    2.1           1.534     1.534
    2.4           2.078     0.267
    5.8           3.324     1.670


Figure 1.1 Location of transmissivity measurements (o) and unknown (x).

Assume also that the driller's logs indicate that all wells were drilled in the same formation and geologic environment. There are good reasons to believe that the formation is a confined aquifer bounded by nearly impermeable layers above and below. Beyond that, however, and despite considerable geological information available at the site, the variability among transmissivity measurements cannot be explained in terms of other measurable quantities in a manner useful in extrapolation to unmeasured locations. If we actually admit that we cannot explain this variability, how can we extrapolate from the sample of the eight observations to the unknown value? The point is that, because we cannot come up with a deterministic mechanism that explains variability, we postulate a probabilistic model, i.e., a set of mathematical equations that summarize what we know and are useful in making predictions.

The simplest approach is to compute the frequency distribution of the data and then to use it to describe the odds that the transmissivity at the location of interest will have a certain value. The premise is that "each transmissivity observation is randomly and independently sampled from the same probability distribution." It is like saying that every time we measure the transmissivity we perform an experiment whose outcome is a random variable with given probability distribution. Of course, this experiment is only a convenient concept; this simple model is not meant to represent the physical reality of what transmissivity is or how it is measured, but rather, it constitutes a practical and reasonable way to use what we know in order to make predictions.


    A simple example 5

    We are still faced w ith the problem of estimating the probability distribution.We may approximate it with the experimental probability distribution (i.e., the

    distribution of the data); that is, we assume that the transmissivity takes any ofthe eight observed values with equal probability 1/8. N ow we are able to makepredictions. According to the model that we constructed, we predict that thevalue of transmissivity at point x isT\, T 2 , ..., or Tg with equal probability 1/8.

    Such a model may appear crude, but it is a rational way to use experience as aguide to the unknown. In fact, this simple model is adequate for some practicalapplications. Also, the reader should not be left with the impression that theapproach boils down to subjective judgement. The questions of the validity

    or suitability of a model and of the sensitivity of the prediction to modelingassumptions can be addressed, but this is left for later in this book.In many applications, what is needed is to determine a good estimate of

    the unknown,TQ,and a measure of the error. Anestimator is a procedure tocompute 7b from the data. Even though we cannot foretell the actual error in theestimate, we can describe its probability distribution. It is common to measurethe anticipated error by the quantity known as themean square error, i.e.,the expectedvalue (average weightedby the probab ility) of the square differenceof the estimate from the true value:

    ol = ^(f0 - Ti)2 + ^(t 0 - T2)

    2 + + ^(t 0 - 78)2. (1.1)

    After some algebraic manipulation to rearrange terms, the expression for themean square error becomes

    o - 1 + 2 t " + j +%(T? + Ti + --- +Ti)

    (1.2)

Equation (1.2) demonstrates that the value of T̂_0 that makes the mean square error as small as possible is the arithmetic mean of the observations,

    T̂_0 = (T_1 + T_2 + · · · + T_8)/8,    (1.3)

which is the estimate with mean square error

    σ_0² = (1/8)(T_1² + T_2² + · · · + T_8²) − [(T_1 + T_2 + · · · + T_8)/8]².    (1.4)

An estimator with minimum mean square error will be referred to as a best or minimum-variance estimator.
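For readers who prefer to verify the result numerically, a short sketch (the transmissivity values below are hypothetical) confirms that the arithmetic mean minimizes the mean square error of Equation (1.1) and attains the value of Equation (1.4):

```python
# Sketch (illustrative data): the mean-square-error criterion and a check
# that the sample mean minimizes it.
def mean_square_error(t0, values):
    """Average squared difference between candidate estimate t0 and the data."""
    return sum((t0 - t) ** 2 for t in values) / len(values)

T = [2.9, 2.5, 4.7, 4.1, 3.0, 2.2, 3.6, 4.5]  # hypothetical transmissivity data
t_best = sum(T) / len(T)                      # arithmetic mean, as in Eq. (1.3)

# variance form of the minimum mean square error, as in Eq. (1.4)
mse_min = sum(t ** 2 for t in T) / len(T) - t_best ** 2

assert abs(mean_square_error(t_best, T) - mse_min) < 1e-12
assert mean_square_error(t_best + 0.1, T) > mse_min  # any other candidate does worse
```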


Using Equation (1.3), the expected value of the estimation error is

    T̂_0 − (1/8)(T_1 + T_2 + · · · + T_8) = 0.    (1.5)

When the expected value of the estimation error is zero, the estimator is called unbiased.

Notice that the estimate was obtained from a formula that looks like

    T̂_0 = (1/8)T_1 + (1/8)T_2 + · · · + (1/8)T_8,    (1.6)

which, using the summation sign notation, can be written as

    T̂_0 = Σ_{i=1}^{n} λ_i T_i,    (1.7)

where n is the number of measurements and λ_1, λ_2, ..., λ_n are coefficients or weights. In this example, the weight λ_i can be conveniently interpreted as the probability that the unknown equals the value of the i-th measurement, assuming that the only possible values are the observed ones. This expression,

Equation (1.7), is known as a linear estimator. In practice, the most useful class of estimators comprises best (minimum-variance) linear unbiased estimators (affectionately known as BLUEs), which is the subject of this book.
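A sketch of the linear estimator of Equation (1.7) follows; the data and the nonuniform weights are hypothetical, and the check that the weights sum to one (the condition under which, in this model, the expected estimation error is zero) is an assumption made explicit here:

```python
# Sketch (illustrative data and weights): the linear estimator of Eq. (1.7).
def linear_estimate(weights, values):
    assert abs(sum(weights) - 1.0) < 1e-12, "weights should sum to 1 for unbiasedness"
    return sum(w * t for w, t in zip(weights, values))

T = [2.9, 2.5, 4.7, 4.1, 3.0, 2.2, 3.6, 4.5]  # hypothetical data
n = len(T)

uniform = [1.0 / n] * n  # the weights used in this chapter's example
# hypothetical nonuniform weights, e.g., favoring observations near the unknown
nonuniform = [0.3, 0.2, 0.1, 0.1, 0.1, 0.1, 0.05, 0.05]

# with uniform weights the linear estimator reduces to the arithmetic mean
assert abs(linear_estimate(uniform, T) - sum(T) / n) < 1e-12
print(linear_estimate(nonuniform, T))
```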

As already mentioned, for any estimator, the error is a random variable, i.e., it is described through a probability distribution. In the example, the error is (T̂_0 − T_i), for i from 1 to 8, with equal probability 1/8. If these errors are symmetrically distributed about a central value, as are those of Figure 1.2, and

    Figure 1.2 Distribution of nearly normal data.


follow a bell-shaped distribution that resembles a normal (Gaussian) distribution, then the mean of the distribution is the most representative value and the mean square value is the definitive measure of the spread. However, in the case of transmissivities or hydraulic conductivities and concentrations, the histogram of these errors usually indicates that the distribution is not symmetric. In this case, there is no unequivocal representative value. In addition to the mean, there are other possibilities, such as the value that minimizes the mean absolute error or the median (the value exceeded by half the values). For this reason, the minimum-variance estimators are most suitable when we have reason to believe that the frequency distribution of the estimation errors may resemble the normal distribution. In the case of transmissivities or concentrations, it is common to use linear estimators with the logarithm of transmissivity, Y = ln(T), instead of with T. That is, instead of analyzing T we analyze Y. Examples will be seen in other chapters.
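The log-transform mentioned above can be sketched as follows (the skewed values are illustrative); note that back-transforming the mean of Y = ln(T) gives the geometric mean of the data, not the arithmetic mean:

```python
import math

# Sketch (illustrative, skewed values): estimating in log space, Y = ln(T),
# as is common for transmissivity and concentration data.
T = [2.0, 3.0, 50.0, 4.0, 2.5, 120.0, 5.0, 3.5]  # hypothetical, skewed data
Y = [math.log(t) for t in T]

y_hat = sum(Y) / len(Y)   # best (minimum-variance) estimate in log space
t_back = math.exp(y_hat)  # back-transformed value: the geometric mean of T

geom_mean = math.prod(T) ** (1 / len(T))
assert abs(t_back - geom_mean) < 1e-9
```

For skewed data the back-transformed estimate is pulled toward the bulk of the small values, which is often a more representative central value than the arithmetic mean.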

The weights in the linear estimator used in this example are equal; that is, the probability that the unknown is equal to a measured value is presumed to be the same no matter how far or in what direction the unknown is located from the location of the observation. Also, the locations of the other measurements have no effect on the selection of the weight. In many situations, however, the transmissivity varies gradually in space in such a way that the unknown is more likely to resemble an observation near its location than one far from it. Therefore, the weights should be nonuniform (larger for nearby observations). This book describes methods that analyze data for clues on how to compute these weights in a way that reflects the spatial variability of the quantity of interest as well as the location of measurements and the unknown.

    1.3 Statistics

First, let us clear up a common misunderstanding. The word statistics (plural) means averages of numerical data, such as the batting average of a player or the median of a batch of hydraulic conductivity measurements. However, data can be misleading when improperly analyzed and presented. The word statistics (singular) refers to a methodology for the organization, analysis, and presentation of data. In particular,

    statistical modeling is an approach for fitting mathematical equations to data in order to predict the values of unknown quantities from measurements.

Hydrogeologists and environmental and petroleum engineers, like scientists and engineers everywhere, use such methods on an almost daily basis, so that some knowledge of statistics is essential today. Those who casually dismiss


statistical methods are the ones most likely to misuse them or to be misled by them.

Basically, we are concerned with estimation problems in which the value of an unknown needs to be inferred from a set of data. It is convenient to subdivide the methodology into the following steps:

1. Postulate a model that describes the variability of the data and can be used to extrapolate from the data to the unknown.
2. If the model involves undetermined parameters, these parameters are estimated or fitted.
3. The model is put to the test or validated.
4. If the model is deemed acceptable, it is used to predict the unknown.

The postulated model is probabilistic. Parameter fitting, model validation, and prediction involve computations of probability distributions or moments (such as the mean, the variance, etc.). These models must be reasonably simple or else the computations may be just too complicated for the approach to be of practical use. The computations in the methods presented in this book are reasonably simple and involve only mean values, variances, and correlation coefficients. However, there are even more important reasons for selecting simple probabilistic models, as will be discussed later.

Conceptually, the part that novices in statistics have the most trouble understanding is the selection of the empirical model, i.e., the model that is introduced to fit the data. So let us say a few things on this subject. How do we know that we have the right model? The truth is that one cannot (and may not even need to) prove that the postulated model is the right one, no matter how many observations are available. There is nothing that anyone can do about this basic fact, which is not a limitation of statistics but to various degrees affects all sciences that rely on empirical data. In the example of Section 1.2, we cannot prove the assumption that the data were somehow generated randomly from the same distribution, much less can we prove that the unknown was generated from the same distribution. However,

1. unless there is evidence to the contrary, it is not an unreasonable assumption, and

    2. one can check whether the data discredit the assumption.

It is best to approach empirical models from a utilitarian perspective and see them as a practical means to:

1. summarize past experience and
2. find patterns that may help us to extrapolate.


Table 1.2. Porosity versus location (depth)

    Porosity    x (ft)
    0.39        -24
    0.41        -26
    0.39        -28
    0.37        -30
    0.36        -32
    0.29        -34
    0.38        -36
    0.34        -38
    0.31        -40
    0.28        -42
    0.3?        -44
    0.30        -46

    A m odel should be judged on the basis of information that is available at thetime when the model is constructed. Thus, a model that looks right with10 mea-surements may be rejected in favor of another model when 100 measurementshave been collected. It will be seen that, in all cases, the simplest empiricalmodel consistent with the data is likely to be best for estimation purposes (aprinciple known asOccam's razor).Furthermore, it will be seen that one of themost important practical contributions of statistical estimation methods is tohighlight the fine distinction be tween fitting (a model to the data) and obtaininga model that we m ay trust to some degree for m aking predictions.

Exercise 1.1 What are the basic characteristics of an estimation problem? Describe an estimation problem with which you are familiar.

Exercise 1.2 Describe a common-sense approach to utilize observations (known facts) to make extrapolations or predictions (about unknown facts). Describe two examples, one from your everyday life and experience and one from scientific research (e.g., put yourselves in Newton's shoes and try to imagine how he came up with the basic law of motion F = ma). Outline the steps you follow in a systematic way. (You may find it useful to review what is known as the scientific method and discuss its generality and relevance to everyday life.)

Exercise 1.3 Consider observations of porosity in a borehole (first column is measured porosity, second column is depth) as shown in Table 1.2. Find the best estimate and standard error¹ at location x = −37 ft using the simple model of Section 1.2. What is the significance of the standard error? Discuss the pros and cons of this simple model and whether it seems that this model is a reasonable description for this data set.

¹ The standard error of estimation is the square root of the mean square error of estimation.


    1.4 Geostatistics

In applied statistical modeling (including regression and time series), least squares or linear estimation is the most widely used approach. Matheron [94 and 95] and his co-workers advanced an adaptation of such methods that is well suited to the solution of estimation problems involving quantities that vary in space. Examples of such quantities are conductivity, hydraulic head, and solute concentration. This approach is known as the theory of regionalized variables or simply geostatistics. Popularized in mining engineering in the 1970s, it is now used in all fields of earth science and engineering, particularly in the hydrologic and environmental fields. This book is an introductory text to geostatistical linear estimation methods.

The geostatistical school has made important contributions to the linear estimation of spatial variables, including the popularizing of the variogram and the generalized covariance function. Geostatistics is well accepted among practitioners because it is a down-to-earth approach to solving problems encountered in practice using statistical concepts that were previously considered recondite. The approach is described in books such as references [24, 36, 37, 70, 73, 121, and 76], with applications mainly in mining engineering, petroleum engineering, and geology. Articles on geostatistics in hydrology and hydrogeology include [7 and 102], and chapters can be found in [13 and 41]. A book on spatial statistics is [30]. Software can be found in references [50, 145, and 43], and trends in research can be discerned in reference [44].

The approach presented in this book departs from that of the books cited earlier (which, for the sake of convenience, will be called "mining geostatistics") in consequential ways. For the readers who are already familiar with mining geostatistics, here is a list of the most important differences:

1. The estimation of the variogram in mining geostatistics revolves around the experimental variogram; sometimes, the variogram is selected solely on the basis that it fits the experimental variogram. This approach is simple to apply but unsatisfactory in most other respects. In contrast, in the approach followed in this book, the variogram is selected so that it fits the data, i.e., the approach relies more on the minimization of a criterion of agreement between the data and the predictions than on the experimental variogram.

2. Unlike mining geostatistics, which again relies on the experimental variogram to select the geostatistical model, the approach preferred in this work is to apply an iterative three-step approach involving: 1. exploratory analysis that suggests a model; 2. parameter estimation; and 3. model validation, which may show the way to a better model. Model validation is implemented differently and has a much more important role than in mining geostatistics.


3. Ordinary kriging, which describes spatial variability only through a variogram and is the most popular method in mining geostatistics, can lead to large mean square errors of estimation. In many environmental applications, one may be able to develop better predictive models by judiciously describing some of the "more structured" or "large-scale" variability through drift functions. The error bars can be further reduced by making use of additional information, such as from the modeling of the processes. This additional information can be introduced in a number of ways, some of which will be seen in this book.

1.5 Key points of Chapter 1

This book is a primer of geostatistical estimation methods with applications in contaminant hydrogeology. Statistics is a methodology for utilizing data and other information to make inferences about unmeasured quantities. Statistical methods complement deterministic process understanding to provide estimates and error bars that are useful in making engineering decisions. The methods in this book are an adaptation and extension of linear geostatistics.


    2

    Exploratory data analysis

The analysis of data typically starts by plotting the data and calculating statistics that describe important characteristics of the sample. We perform such an exploratory analysis to:

1. familiarize ourselves with the data and
2. detect patterns of regularity.

Graphical methods are useful to portray the distribution of the observations and their spatial structure. Many graphical methods are available, and even more can be found and tailored to a specific application. The modest objective of this chapter is to review common tools of frequency analysis as well as the experimental variogram. Exploratory analysis is really a precursor to statistical analysis.

    2.1 Exploratory analysis scope

Before computers, hydrogeologists used to spend hours transcribing and plotting their data. Although time consuming, labor intensive, and subject to human errors, one cannot deny that this process enhanced familiarity with data to the extent that the analyst could often discern patterns or spot "peculiar" measurements. This intimacy with one's data might appear lost now, a casualty of the electronic transcription of data and the extensive use of statistical computer packages that perform the computations.

However, data analysis and interpretation cannot be completely automated, particularly when making crucial modeling choices. The analyst must use judgment and make decisions that require familiarity with the data, the site, and the questions that need to be answered. It takes effort to become familiar with data sets that are often voluminous and describe complex sites or processes. Instead of striving for blind automation, one should take advantage of available



computers and computer graphics to organize and display data in ways unimaginable using manual methods (for a review of basic ideas see, for example, [20]).

Measurements may vary over a wide range. In most cases it is impossible, and often useless, for any person to remember every single measurement individually. One may start by summarizing in a convenient way the behavior of measurements that act similarly and by pointing out the measurements that behave differently from the bulk of the data. What is the best way to organize and display the data? What are the measures that summarize the behavior of a bunch of data? And could it be that a certain data transformation can simplify the task of summarizing the average behavior of a data batch? These are some of the issues to be discussed. But first, here are three basic principles:

• It does little good to just look at tabulated data. However, the human eye can recognize patterns from graphical displays of the data.
• It is important to look at the data in many ways and to keep in mind that some techniques implicitly favor a certain model. During exploratory data analysis one should make as few assumptions as possible.
• Conclusions made during exploratory analysis are tentative. A model cannot be accepted on the basis of exploratory analysis only but should be corroborated or tested.

To illustrate the objectives and usage of exploratory data analysis, consider the following data sets:

1. Measurements of transmissivity of a thin sand-and-gravel aquifer at 8 locations (see Table 1.1 and Figure 1.1).

2. Measurements of potentiometric head at 29 locations in a regional confined sandstone aquifer (see Table 2.1 and Figure 2.1).

    Figure 2.1 Plan view of aquifer showing location of head data.


3. Measurements at 56 locations of concentration of trichloroethylene (TCE) in groundwater on a transect in a fine-sand surficial aquifer (see Table 2.2 and Figure 2.2).

Table 2.1. Head observations in a regional confined aquifer

    Head (ft)    x       y          Head (ft)    x       y
    1061         6.86    6.41       662          13.23   10.18
    1194         4.32    5.02       685          13.56   9.74
    1117         5.98    6.01       1023         8.06    5.76
    880          11.61   4.99       998          10.95   3.72
    1202         5.52    3.79       584          14.71   11.41
    757          10.87   8.27       611          16.27   7.27
    1038         8.61    3.92       847          12.33   6.87
    817          12.64   6.77       745          13.01   7.05
    630          14.70   10.43      725          13.56   7.42
    617          13.91   10.91      688          13.76   8.35
    986          9.47    5.62       676          12.54   9.04
    625          14.36   11.03      768          8.97    8.6
    840          8.99    7.31       782          9.22    8.55
    847          11.93   6.78       1022         9.64    3.38
    645          11.75   10.8

Figure 2.2 Vertical cross section with location of TCE measurements.


Table 2.2. TCE concentrations in groundwater in a vertical cross section

    No  TCE (ppb)  x (ft)  y (ft)      No  TCE (ppb)  x (ft)  y (ft)
    1   1.00e+01   0       -45         29  6.63e+02   150     -50
    2   1.00e+01   0       -50         30  3.75e+03   150     -55
    3   6.40e+01   0       -55         31  8.76e+03   150     -60
    4   5.42e+02   0       -60         32  1.40e+04   150     -65
    5   3.46e+02   0       -65         33  1.61e+04   150     -70
    6   1.91e+02   0       -70         34  1.28e+04   150     -75
    7   8.50e+01   0       -75         35  7.63e+02   150     -80
    8   3.00e+01   0       -80         36  2.89e+02   150     -85
    9   7.40e+00   0       -85         37  1.14e+01   200     -45
    10  1.46e+01   0       -90         38  1.16e+03   200     -50
    11  1.20e+01   70      -45         39  4.27e+03   200     -55
    12  2.30e+01   70      -50         40  9.87e+03   200     -60
    13  1.87e+02   70      -55         41  7.32e+03   200     -65
    14  1.64e+02   70      -60         42  4.03e+03   200     -70
    15  7.01e+02   70      -65         43  2.73e+02   200     -75
    16  2.13e+04   70      -70         44  1.90e+02   200     -80
    17  1.86e+04   70      -75         45  5.22e+02   200     -85
    18  6.22e+02   70      -80         46  6.70e+01   270     -45
    19  1.39e+03   70      -85         47  5.50e+01   270     -50
    20  4.00e+01   110     -45         48  1.36e+02   270     -55
    21  4.20e+01   110     -50         49  7.08e+02   270     -60
    22  6.55e+02   110     -55         50  8.82e+02   270     -65
    23  2.16e+04   110     -60         51  9.39e+02   270     -70
    24  6.77e+04   110     -65         52  3.08e+02   270     -75
    25  3.89e+04   110     -70         53  1.40e+01   370     -55
    26  5.84e+02   110     -75         54  7.00e+00   370     -65
    27  2.54e+03   110     -80         55  2.50e+00   370     -70
    28  6.00e+00   150     -45         56  2.50e+00   370     -75

    2.2 Experimental distribution

We want to describe the frequency distribution of a set of n measurements (such as those in Tables 1.1, 2.1, and 2.2) without regard to their location. We call this distribution "experimental" or "empirical" because it depends only on the data.


We can describe the distribution of a set of data through the histogram, the ogive, and the box plot. We will also review numbers that represent important characteristics of a data set, such as central value, spread, and degree of asymmetry.

2.2.1 Histogram

The histogram is a common way to represent the experimental distribution of the data. Consider a batch with n measurements that are sorted in increasing order, z_1 ≤ z_2 ≤ · · · ≤ z_n. The interval between the largest and the smallest value is divided into m bins (intervals) by the points a_0, a_1, ..., a_{m−1}, a_m. The intervals are usually of equal length and selected so that the histogram is relatively free from abrupt ups and downs. A measurement z belongs to the k-th bin if a_{k−1} ≤ z < a_k.
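The binning rule just described can be sketched in a few lines (a minimal version; plotting is left out, and the sample values are illustrative):

```python
# Sketch of equal-width binning: z falls in bin k when a_{k-1} <= z < a_k,
# with the bins spanning the range of the data.
def histogram_counts(values, m):
    lo, hi = min(values), max(values)
    width = (hi - lo) / m
    counts = [0] * m
    for z in values:
        k = min(int((z - lo) / width), m - 1)  # put z == max into the last bin
        counts[k] += 1
    return counts

# illustrative porosity-like values
z = [0.28, 0.29, 0.30, 0.31, 0.32, 0.34, 0.36, 0.37, 0.38, 0.39, 0.39, 0.41]
print(histogram_counts(z, 4))
```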


Figure 2.4 Histogram of head data.

Figure 2.5 Histogram of TCE concentration data.

The histogram, however, depends on the choice of intervals and may portray in a misleading way the experimental distribution of the data, particularly if the number of measurements is small (e.g., less than 50).

2.2.2 Ogive

The ogive is the experimental cumulative distribution, computed from the sorted data z_1 ≤ z_2 ≤ · · · ≤ z_n.
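As a sketch of the idea (one common convention; some texts plot (i − 1/2)/n rather than i/n against the sorted values):

```python
# Sketch: the empirical cumulative distribution (ogive) as a list of
# (sorted value, cumulative fraction) pairs.
def ogive(values):
    z = sorted(values)
    n = len(z)
    return [(zi, (i + 1) / n) for i, zi in enumerate(z)]

points = ogive([4.0, 1.0, 3.0, 2.0])
print(points)
```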

Figure 2.16 Plot showing the location and relative magnitude of data (o < median, x > median). When outliers are present, they are indicated by *.

Three-dimensional graphics are gradually becoming available with perspective, movement, and shading for better visualization. Data are represented as spheres or clouds of variable size or color indicative of the magnitude of the observation.


Figure 2.17 Draftsman's display of some three-dimensional data.

    2.6 The experimental variogram

The remaining sections of this chapter deal with the experimental variogram (or semivariogram), which plays a leading role in practical geostatistics.

Consider the case of n measurements z(x_1), z(x_2), ..., z(x_n). The bold letter x stands for the array of coordinates of the point where these measurements were taken. Plot the square difference (1/2)[z(x_i) − z(x_j)]² against the separation distance ||x_i − x_j|| for all measurement pairs (where || · || means the length of a vector). For n measurements, there are n(n − 1)/2 such pairs, which form a scatter plot known as the raw variogram (represented by the dots in Figure 2.18 for the head data). The experimental variogram is a smooth line through this scatter plot.

In the common method of plotting the experimental variogram, the axis of separation distance is divided into consecutive intervals, similarly as for the histogram. The k-th interval is [h_k^l, h_k^u] and contains N_k pairs of measurements


[z(x_i), z(x_j)]. Then, compute

    γ(h_k) = (1/(2 N_k)) Σ_{l=1}^{N_k} [z(x_i) − z(x_j)]_l²,    (2.11)

where the index l refers to each pair of measurements z(x_i) and z(x_j) for which

    h_k^l ≤ ||x_i − x_j|| ≤ h_k^u.


Figure 2.19 Experimental variogram of the head data.

1. Use three to six intervals.
2. Include more pairs (use longer intervals) at distances where the raw variogram is spread out.

As an exploratory analysis tool, the experimental variogram has the drawback that the graph depends on the selected intervals. It may also be somewhat affected by the method of averaging. For example, some analysts prefer to use for h_k the median of the pair distances in the interval, while others prefer to use the midpoint of the interval, i.e.,

    h_k = (1/2)(h_k^l + h_k^u),    (2.14)

resulting in slightly different plots.

The experimental variogram presented above is a measure of spatial correlation independent of orientation. In some cases, however, better predictions can be made by taking into account the anisotropy in the structure of the unknown function; for example, conductivities in a layered medium are more correlated in a horizontal direction than in the vertical. The variogram should then depend on the orientation as well as the separation distance (anisotropic model). The issue of anisotropy will be discussed in Chapter 5.
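Equation (2.11) is straightforward to implement. The sketch below (with hypothetical two-dimensional coordinates and bin edges) averages the half squared differences of all n(n − 1)/2 pairs by separation-distance interval:

```python
import math

# Sketch of the experimental variogram of Eq. (2.11): for each distance
# interval, average half the squared difference over the pairs it contains.
def experimental_variogram(coords, values, edges):
    """coords: list of (x, y); values: z at those points; edges: bin boundaries."""
    nbins = len(edges) - 1
    sums = [0.0] * nbins
    counts = [0] * nbins
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):  # the n(n-1)/2 pairs of the raw variogram
            h = math.dist(coords[i], coords[j])
            for k in range(nbins):
                if edges[k] <= h < edges[k + 1]:
                    sums[k] += 0.5 * (values[i] - values[j]) ** 2
                    counts[k] += 1
                    break
    return [s / c if c else float("nan") for s, c in zip(sums, counts)]

# toy illustration with hypothetical points
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 2.0)]
values = [1.0, 1.2, 0.9, 2.0]
print(experimental_variogram(coords, values, [0.0, 1.5, 3.0]))
```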

    2.7 Meaning of experimental variogram

    2.7.1 Scale

The experimental variogram is the graph that is most commonly used in applied geostatistics to explore spatial interdependence. It contains information about


the scale of fluctuations of the variable, as we will explain in this section. Some readers may prefer to skip this section at first reading and come back to it after Chapter 3 or 4.

To grasp the concept of scale, consider the function z that varies over a one-dimensional domain x. It is useful to approximate z(x) by the sum of cosines

    z(x) ≈ Σ_i A_i cos(2πx/L_i + φ_i),    (2.15)

where A_i is a coefficient of the trigonometric series, L_i is a spatial scale, and φ_i is a phase shift. Without getting too technical and disregarding pathological cases, practically any function can be sufficiently approximated over a finite domain by a few terms of a trigonometric, or Fourier, series. If the length of the domain is L, then 1/L_i = i/L, where i = 0, 1, 2, .... When A_i is relatively large, we say that a proportionally large part of the variability is "at scale L_i." One can reconstruct z(x) from knowledge of the triplets (L_i, A_i, φ_i).


The presence of variability at a scale comparable to the sampling domain. This depends on the behavior of the experimental variogram at large distances.

2.7.2 Near the origin

The behavior of the variogram at small separation distances determines whether the spatial function appears continuous and smooth. We will consider three examples, which are intended to give you an intuitive feeling about what we mean by continuity and smoothness in a practical context.

2.7.2.1 Discontinuous

First, consider the case that the actual variable is

    z_1(x) = cos(2πx/0.001).    (2.16)

That is, all the variability is at scale 0.001. Consider now that z_1(x) is sampled at 100 locations randomly distributed in the interval between 0 and 1. (The same sampling locations will be used in all three examples.) Note that the average sampling interval (i.e., distance between measurement locations), 0.01, is ten times larger than the scale of fluctuations of the variable. As a result, two adjacent measurements are about as different as two distant measurements. At the scale of the sampling interval, the variable z_1 is discontinuous because it changes abruptly from one sampling point to the next, as shown in Figure 2.20.



Figure 2.21 Experimental variogram for the discontinuous data.

The experimental variogram, shown in Figure 2.21, is approximately a straight horizontal line. Because the experimental variogram does not seem to converge to zero as the separation decreases, we say that there is a discontinuity of the experimental variogram at the origin, or a nugget effect.

In general, a discontinuity at the origin in the experimental variogram is indicative of fluctuations at a scale smaller than the sampling interval, called microvariability. It may also be due to random observation error, as we will discuss further elsewhere.

2.7.2.2 Parabolic

As a second example, consider

    z_2(x) = cos(2πx/2).    (2.17)

All the variability is at a scale much larger than the scale of the sampling intervals. Figure 2.22 is a plot of the data z_2(x_m) versus x_m, and Figure 2.23 is a plot of the slopes of the data, [z_2(x_m) − z_2(x_{m−1})]/(x_m − x_{m−1}) versus x_{m−1}, for all m = 2, ..., 100. The changes in measured values are so gradual that both z_2 and its slope are observed to vary continuously. The experimental variogram, shown in Figure 2.24, has parabolic behavior near the origin; that is, it is proportional to h² for small values of h. Generally, parabolic behavior near the origin is indicative of a quantity that is smooth at the scale of the measurements, so that it is differentiable (i.e., it has a well-defined slope).


Figure 2.22 Plot of the data for the parabolic case.

Figure 2.23 Plot of the slopes for the parabolic case.

2.7.2.3 Linear

As a third example, we will consider a case in between the first two examples. Consider that

    z_3(x) = cos(2πx/0.2) + 0.10 cos(2πx/0.02)    (2.18)


with the same sampling points. Note that this variable has most of its variability at a scale larger than the average sampling interval but also some variability at a scale comparable to that of the measurement spacing. The changes in the value of z_3 between adjacent sampling points are gradual, as shown in Figure 2.25, so z_3 is practically continuous at the scale of the measurements.

Figure 2.24 Experimental variogram for the parabolic case.

Figure 2.25 The data for the linear case.


Figure 2.26 Slope of the data for the linear case.

However, the slope changes rather abruptly between adjacent intervals, as seen in Figure 2.26, so that z_3 does not appear to have continuous derivatives at the scale of the measurements. The reason is that the derivative

    dz_3/dx = −10π sin(2πx/0.2) − 10π sin(2πx/0.02)    (2.19)

has as much variability at scale 0.2 as at the small scale 0.02. The experimental variogram, shown in Figure 2.27, has approximately linear behavior near the origin, i.e., the variogram is proportional to h for small values of h.

    An exampleof a function that is continuous but not differentiableis the pathof a small particlein a fluid (knownas "Brownian mo tion"). The particle getshit by molecules so often thatit constantly changes direction and speed. Thus,although the particle trajectoryis continuous, the particle speedmay changeinstantaneously andis thus not continuous.

    In summary, we have seen that the behavior of the experimental variogram at the origin (at short distances) reveals the degree of smoothness of the function. We distinguished among parabolic behavior, which characterizes a smoothly changing variable with continuous slope; linear behavior, which characterizes a continuous variable without continuous derivatives (such as a Brownian motion); and discontinuous behavior, which characterizes a discontinuous variable (such as random "noise").


    Meaning of experimental variogram 39

    Figure 2.27 Experimental variogram for linear case (γ(h) versus h).

    2.7.3 Large-scale behavior

    The behavior of the variogram at distances comparable to the size of the domain determines whether the function is stationary. We will later give a technical meaning to the term stationary; intuitively, a function is stationary if it consists of small-scale fluctuations (compared to the size of the domain) about some well-defined mean value. For such a function, the experimental variogram should stabilize around a value, called the sill, as shown in Figure 2.28. For a stationary function, the length scale at which the sill is obtained describes the scale at which two measurements of the variable become practically uncorrelated. This length scale is known as the range or correlation length.

    Otherwise, the variogram keeps on increasing even at a distance comparable to the maximum separation distance of interest, as shown in Figure 2.28.
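A quick synthetic illustration of this contrast (not from the book): for a stationary series the experimental variogram stabilizes near the sill, while for a random walk it keeps growing with the lag. The data and sample sizes below are arbitrary choices.

```python
import numpy as np

# Sketch (synthetic): variograms of a stationary vs. a nonstationary series.
rng = np.random.default_rng(0)
n = 4000
noise = rng.standard_normal(n)   # stationary: fluctuations about a constant mean
walk = np.cumsum(noise)          # nonstationary: a random-walk ("Brownian") path

def variogram(z, nlags):
    # gamma at lag k = half the mean squared increment (regular spacing)
    return np.array([0.5 * np.mean((z[k:] - z[:-k]) ** 2)
                     for k in range(1, nlags + 1)])

g_stat = variogram(noise, 50)    # levels off near the sill (variance ~ 1)
g_walk = variogram(walk, 50)     # grows roughly linearly with the lag
print(g_stat[[0, 24, 49]])
print(g_walk[[0, 24, 49]])
```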

    Exercise 2.1 Consider two functions describing a quantity that varies along the spatial coordinate x in the interval [0, 4]:

    z1(x) = cos(πx/2) + (1/3) cos(3πx/2) + (1/5) cos(5πx/2)

    z2(x) = cos(πx/2 + φ1) + (1/3) cos(3πx/2 + φ3) + (1/5) cos(5πx/2 + φ5),


    Figure 2.28 Experimental variogram indicative of stationary and nonstationary behavior.

    where φ1, φ3, φ5 are numbers picked completely randomly from the interval [0, 2π]. Plot these two functions and discuss, based on what you have read in this chapter, how the experimental variograms of these two functions are expected to differ. Based on this example, discuss the strengths and limitations of the experimental variogram as an exploratory analysis tool.

    2.8 Key points of Chapter 2

    The objective of exploratory analysis is to familiarize the analyst with the important characteristics of the data. The analyst should keep an open mind and avoid techniques that may be misleading if certain assumptions are not met. We start by analyzing the distribution of data independently of their location in space; this distribution may be portrayed using the histogram, the ogive, and the box plot. Important summary statistics are the median and the mean, the interquartile range and the standard deviation, and the skewness coefficient. We discussed the practical advantages of working with symmetric and nearly normal distributions and how transformations can be used to achieve this goal. Spatial variability can be analyzed using graphical techniques, but the difficulty increases significantly from variability in one dimension to variability in three dimensions. The experimental variogram is an important tool that provides information about the distribution of spatial variability with respect to scales. Finally, note that conclusions reached during an exploratory analysis are usually tentative. The next step is to use the ideas created during exploratory analysis to select tentatively an "equation to fit to the data."


    3

    Intrinsic model

    We preview the general methodology underlying geostatistical modeling and apply it to the most common model, which is known as the intrinsic isotropic model and is characterized by the variogram. This chapter introduces kriging, which is a method for evaluating estimates and mean square estimation errors from the data, for a given variogram. The discussion in this chapter is limited to isotropic correlation structures (same correlation in all directions) and focuses on the methodology and the basic mathematical tools. Variogram selection and fitting will be discussed in the next chapter.

    3.1 Methodology overview

    Consider that we have measured porosity along a borehole at several locations (see Figure 3.1). To estimate the value of the porosity at any location from the measured porosity values, we need a mathematical expression (or "equation" or "model") that describes how the porosity varies with depth in the borehole. In other words, we need a model of spatial variability.

    However, hydrologic and environmental variables change from location to location in complex and inadequately understood ways. In most applications, we have to rely on the data to guide us in developing an empirical model. The model involves the concept of probability in the sense that spatial variability is described coarsely by using averages. For example, the best we can do might be to specify that the porosity fluctuates about some mean value and to come up with a formula to correlate the fluctuations at two locations depending on their separation distance. This is often the most practical scheme to summarize incomplete information or erratic data.

    Consider the porosity or any other spatially variable quantity, such as chemical concentration or precipitation; this quantity is a function of the spatial coordinates and may be represented as z(x1), z(x1, x2), or z(x1, x2, x3) depending


    Figure 3.1 The interpolation problem. Observations are indicated by the symbol o.

    on whether it varies in one, two, or three dimensions. For brevity, the notation z(x) will be used to include all three cases, where x is the location index (a vector with one, two, or three components). Thus,

    x = x1,  x = (x1, x2),  or  x = (x1, x2, x3).    (3.1)

    The function z(x), known as a regionalized or field variable, is not known everywhere but needs to be estimated from available observations and, perhaps, additional information.

    We are now ready to discuss the logical underpinnings of the approach. If statistical modeling is new to you and you wonder what it means, pay particular attention to this part.

    In practice, our objective is to estimate a field variable z(x) over a region. Usually, because of scarcity of information, we cannot find a unique solution. It is useful to think of the actual unknown z(x) as one out of a collection (or ensemble) of possibilities z(x; 1), z(x; 2), .... This ensemble defines all possible solutions to our estimation problem. The members of the ensemble are known as realizations or sample functions.

    Consider, for example, Figures 3.2, 3.3, and 3.4. Each figure contains five realizations from a different ensemble (family of functions). Notice that despite the differences among the realizations in each figure, they share some general structural characteristics. The functions in Figure 3.2 are all "smooth" curves with well-defined slope at every point. The functions in Figures 3.3 and 3.4 are


    Figure 3.2 Five realizations from a family of z(x) functions.

    continuous but rough curves with ill-defined slopes. The curves in Figures 3.2 and 3.3 have fluctuations with much smaller periods than the fluctuations in Figure 3.4.

    Assume for argument's sake that we have selected an ensemble and that we have computed the probability that a realization is the actual unknown, i.e., we can specify that the probability that z(x) = z(x; i), for any i, is Pi.


    Figure 3.3 Five realizations from another family of functions.

    Mathematically, we write

    Pi = Pr[z(x) = z(x; i)].    (3.2)

    (We will see later how we can assign these probabilities.)

    The ensemble of realizations with their assigned probabilities defines what is known as a random function (or random field or spatial stochastic process). We


    Figure 3.4 Five realizations from yet another family of functions.

    are interested in calculating averages over all possible realizations. Expectation, denoted by the symbol E, is the process of computing a probability-weighted average over the ensemble. Thus, the expected value of z at location x is

    E[z(x)] = P1 z(x; 1) + P2 z(x; 2) + ··· = Σi Pi z(x; i).    (3.3)


    Instead of specifying all possible solutions and their probabilities, it is more convenient to specify and to work with ensemble averages or statistical moments. In linear estimation, we use the first two statistical moments of the random field, which are

    1. the mean function (first moment), which gives the expected value at any point x,

    m(x) = E[z(x)],    (3.4)

    2. and the covariance function (second moment), which is the covariance for any pair x and x′,

    R(x, x′) = E[(z(x) − m(x))(z(x′) − m(x′))].    (3.5)
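A minimal numerical sketch of Equations (3.3) through (3.5), using a made-up three-member ensemble with assumed probabilities; the realizations and the values of P are purely illustrative.

```python
import numpy as np

# Sketch: the mean and covariance functions as probability-weighted
# averages over a small, made-up ensemble of realizations.
x = np.linspace(0.0, 1.0, 5)                  # a few locations
ensemble = np.array([np.sin(2 * np.pi * x),   # realization z(x; 1)
                     np.cos(2 * np.pi * x),   # realization z(x; 2)
                     np.ones_like(x)])        # realization z(x; 3)
P = np.array([0.5, 0.3, 0.2])                 # Pr[z = realization i]

m = P @ ensemble                              # m(x) = sum_i P_i z(x; i)   (Eq. 3.3/3.4)
dev = ensemble - m                            # z(x; i) - m(x)
R = np.einsum("i,ij,ik->jk", P, dev, dev)     # R(x, x') = E[(z - m)(z' - m')]  (Eq. 3.5)
print(m)
print(np.diag(R))                             # variances at each location
```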

    Important structural characteristics of each ensemble can be described through the mean and covariance functions. For example, the degree of smoothness of the realizations or the scale of the fluctuations can be described nicely. In a general sense, we use the first two moments instead of the probabilities P1, P2, .... Thus,

    the model of spatial structure consists of the mathematical expressions chosen to describe the mean function and the covariance function.

    It is important to grasp the meaning of the term structure in the context of estimation of spatial functions: It is the information that characterizes the ensemble of plausible solutions that our unknown function belongs to!

    A statistical analysis of spatial data is conveniently subdivided into two phases: structural analysis and best linear unbiased estimation (BLUE).

    Structural analysis is the selection and fitting of mathematical expressions for the required first two moments of the regionalized variable. The form of these expressions comprises the model. Many expressions can be used to represent these moments. Some are commonly used general-purpose models, such as the intrinsic model, which we will see in this chapter; others are special-purpose models developed by you, the user, for a specific application. A model is selected based on the analysis of data and other information, including experience with data at similar sites and geologic and hydrologic information. From those, the analyst must decide whether the unknown function belongs, for example, in the ensemble of Figure 3.2, 3.3, or 3.4. Typically, model selection is an iterative procedure consisting of

    (a) exploratory data analysis (see Chapter 2), on the basis of which a model is tentatively selected;

    (b) parameter estimation, such as selection of numerical values for the parameters of the expressions of the mean and covariance function; and


    (c) model validation or diagnostic checking, which involves careful examination of the performance of the model in test cases.

    Best linear unbiased estimation deals with taking into account specific observations. Specifically, we look for estimates that are as representative and accurate as possible, using the model developed during structural analysis and the specific observations.

    The basic idea is that we proceed to figure out an unknown function (e.g., the concentration over a cross section) in two stages.

    1. During the first stage, structural analysis, the choice is narrowed down to the functions sharing certain characteristics, collectively known as structure.

    2. During the second stage, the choice is narrowed down further by requiring that all possible solutions honor the data.

    These ideas will become clearer after we study some examples. In this chapter, after Section 3.2, we will present one of the most commonly used geostatistical models, the intrinsic isotropic model.

    3.2 Illustrative example

    In this section, we will practice the concepts that we saw in the previous section by working on an example.

    Consider the following family of functions:

    z(x; u) = sin(2πx + u),    (3.6)

    where x is the spatial location (one-dimensional) and u is a random variable uniformly distributed between 0 and 2π. That is, the probability distribution of u is:

    p(u) = 1/(2π), if 0 ≤ u < 2π;  p(u) = 0, otherwise.    (3.7)

    We are asked to perform the following tasks:

    (a) Justify why this information fully defines a random function z(x).

    (b) Compute the mean function m(x).

    (c) Compute the covariance function R(x, x′). Note that the covariance function depends only on ||x − x′||, where || · || indicates the distance between x and x′. Plot the covariance function.

    (d) Using a program that generates random variables (such as functions rand and randn in MATLAB) generate and plot five realizations of the random function.


    (e) Now assume that you also measure the value 0.5 for z at x = 0. Conditional on this information: What are the possible solutions for z(x) and their probabilities? What is the mean function? What is the covariance function?

    The answers are given below:

    (a) The random function z(x) is fully defined because we have specified the way to generate all possible realizations or sample functions and the corresponding probability. For example, to generate a set of M equally likely realizations, generate M variates uniformly distributed between 0 and 2π (see Appendix C) and apply these in Equation (3.6).

    (b) The mean at z(x) is the weighted average over all values of z at this location. Intuitively, we expect that the mean will be zero. More systematically,

    m(x) = E[z(x)] = ∫₀^{2π} (1/(2π)) sin(2πx + u) du = 0,    (3.8)

    where 1/(2π) is the pdf (probability density function) of u in the interval between 0 and 2π. Thus, the mean is the same everywhere, 0. Note that the mean function is much simpler than any of the realizations.

    (c) The covariance function is

    R(x, x′) = E[sin(2πx + u) sin(2πx′ + u)] = ∫₀^{2π} (1/(2π)) sin(2πx + u) sin(2πx′ + u) du = (1/2) cos(2π(x − x′)).    (3.9)

    Thus, we can see that the covariance function depends only on the distance ||x − x′||. The variance, R(x, x) = 1/2, is the same everywhere. See Figure 3.5.

    Figure 3.5 Plot of covariance function (periodic).


    (d) We generate five random phases using the MATLAB command

    2*pi*rand(5,1).

    This picks five numbers between 0 and 2π. See Figure 3.6 for examples of sample functions.
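The results in (b) and (c) can also be checked by Monte Carlo averaging. The sketch below (in Python rather than the MATLAB of the text) draws many random phases; the locations x = 0.3 and x′ = 0.5 and the sample size are arbitrary choices.

```python
import numpy as np

# Sketch: generate many realizations of z(x; u) = sin(2*pi*x + u) and
# check Eqs. (3.8)-(3.9) by Monte Carlo averaging over the ensemble.
rng = np.random.default_rng(1)
u = rng.uniform(0.0, 2.0 * np.pi, size=100_000)   # random phases
x, xp = 0.3, 0.5                                  # two arbitrary locations

zx = np.sin(2 * np.pi * x + u)
zxp = np.sin(2 * np.pi * xp + u)

print(zx.mean())                          # ~ 0, as in Eq. (3.8)
print(np.mean(zx * zxp))                  # ~ 0.5*cos(2*pi*(x - xp)), Eq. (3.9)
print(0.5 * np.cos(2 * np.pi * (x - xp)))
```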

    (e) Given the available information, there are only two possible values for u: π/6 and 5π/6; each is equally probable. Then, the only realizations that honor this measurement are

    z1(x) = sin(2πx + π/6),    (3.10)

    z2(x) = sin(2πx + 5π/6).    (3.11)

    The conditional mean is

    mc(x) = (1/2) sin(2πx + π/6) + (1/2) sin(2πx + 5π/6) = (1/2) cos(2πx).    (3.12)

    See Figure 3.7. Note that the conditional mean function is more complex than the prior mean, which is zero everywhere. Also, note that the

    Figure 3.6 Five sample functions z1(x), ..., z5(x).


    Figure 3.7 Conditional mean, given z(0) = 0.5.

    conditional mean is smoother than either of the two possible solutions. The covariance function is

    Rc(x, x′) = (1/2)(sin(2πx + π/6) − mc(x))(sin(2πx′ + π/6) − mc(x′))
              + (1/2)(sin(2πx + 5π/6) − mc(x))(sin(2πx′ + 5π/6) − mc(x′))
              = (1/4)(sin(2πx + π/6) − sin(2πx + 5π/6))(sin(2πx′ + π/6) − sin(2πx′ + 5π/6))
              = (3/4) sin(2πx) sin(2πx′).    (3.13)

    Note that the conditional covariance function is a function of two locations, x and x′. The conditional variance, Rc(x, x), is plotted in Figure 3.8.
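Equations (3.12) and (3.13) can be verified directly from the two admissible realizations; a short Python sketch, with an illustrative grid of locations:

```python
import numpy as np

# Sketch: verify Eqs. (3.12)-(3.13) from the two equally likely
# realizations that honor the measurement z(0) = 0.5.
x = np.linspace(0.0, 1.0, 101)
z1 = np.sin(2 * np.pi * x + np.pi / 6)
z2 = np.sin(2 * np.pi * x + 5 * np.pi / 6)

mc = 0.5 * (z1 + z2)                                   # conditional mean
assert np.allclose(mc, 0.5 * np.cos(2 * np.pi * x))    # Eq. (3.12)

# conditional covariance Rc(x, x') evaluated on the grid
Rc = 0.5 * np.outer(z1 - mc, z1 - mc) + 0.5 * np.outer(z2 - mc, z2 - mc)
expected = 0.75 * np.outer(np.sin(2 * np.pi * x), np.sin(2 * np.pi * x))
assert np.allclose(Rc, expected)                       # Eq. (3.13)
print(mc[0])   # reproduces the measured value 0.5 at x = 0
```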

    Figure 3.8 Conditional variance, given z(0) = 0.5.


    Exercise 3.1 Consider the random function z defined on one dimension x by

    z(x) = cos(2πx + u1) + c ...

    where u1 and u2 are two independently distributed random variables with uniform distribution in the interval 0 to 2π.

    (1) Find the mean function and the covariance function of z. (Plot results, if possible.)

    (2) Generate and plot five realizations of this random function over the interval [0, 10].

    3.3 Intrinsic isotropic model

    In the outline of the general approach, it was mentioned that we need to write expressions for the mean and the covariance function. One of the simplest models is:

    The mean is constant and the two-point covariance function depends only on the distance between the two points.

    That is,

    E[z(x)] = m    (3.14)

    and

    E[(z(x) − m)(z(x′) − m)] = R(h),    (3.15)

    where

    h = ||x − x′|| = √((x1 − x1′)² + (x2 − x2′)² + (x3 − x3′)²)    (3.16)

    is the distance between sampling locations x and x′, x = [x1, x2, x3], and x′ = [x1′, x2′, x3′]. Equations (3.14) and (3.15) comprise the stationary model; a random function z(x) satisfying these conditions is called stationary.¹ This model is also isotropic because it uses only the length and not the orientation of the linear segment that connects the two points. In this chapter we will focus on isotropic models; we will see anisotropic models in other chapters.

    The value of the covariance at h = 0 is known as the variance or the sill of the stationary function.

    Exercise 3.2 For the random function z(x) defined in Section 3.2:

    (1) Is z(x) stationary?

    ¹ Technically, this is known as "wide-sense" or "second-moment" stationary.


    (2) Compute the ensemble average

    γ(x, x′) = (1/2) E[(z(x) − z(x′))²].

    (3) Show that γ(x, x′) depends only on the distance ||x − x′|| and plot the expression γ(||x − x′||) versus the separation distance ||x − x′||. Show also that

    γ(||x − x′||) + R(||x − x′||) = R(0)

    is a constant value.

    Mathematically speaking, there are several types of functions that satisfy Equations (3.14) and (3.15). The covariance function may be periodic (such as a cosine), aperiodic but consisting of a number of sinusoids, or a function that is none of the above but has a continuous power spectrum (from Fourier analysis) and finite variance. In estimation applications, it is the last type that is of interest so that, unless otherwise stated, we will assume that we deal with this type. All we need to know for now, without getting into Fourier analysis, is that for a stationary function (of the type we are interested in), the sill R(0) is finite and the value R(h) vanishes or tends to vanish when h exceeds a value called the range.

    In order to apply this model in interpolation, we need to find the parameter m and to select an expression for the covariance function and find its parameters, such as the expression R(h) = v exp(−h/l) with parameters v and l. Then, it is possible to extrapolate from the locations of the observations.
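As a sketch of how such an expression is evaluated in practice, the exponential covariance mentioned above can be tabulated together with the variogram it implies through the relation γ(h) = R(0) − R(h) derived in Equation (3.19); the parameter values v = 1.2 and l = 0.5 below are purely illustrative.

```python
import numpy as np

# Sketch: the exponential covariance model R(h) = v * exp(-h / l),
# with assumed (illustrative) parameter values.
v, l = 1.2, 0.5           # variance (sill) and length-scale parameters
h = np.linspace(0.0, 3.0, 7)

R = v * np.exp(-h / l)    # equals v at h = 0, nearly 0 beyond a few times l
gamma = v - R             # implied variogram, gamma(h) = R(0) - R(h)
print(np.round(R, 4))
print(np.round(gamma, 4))
```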

    In most cases, the mean is not known beforehand but needs to be inferred from the data; to avoid this trouble, it may be more convenient to work with the variogram. The variogram is defined as

    γ(h) = (1/2) E[(z(x) − z(x′))²].    (3.17)

    (Originally, the term variogram was used for 2γ(h), and γ(h) was called the semivariogram. Since we only use γ(h), we will call it the variogram.) To underline the distinction between the experimental variogram, which is computed from the data, and the variogram, which is a mathematical expression, the latter is sometimes called the theoretical variogram.

    For a stationary function, the relation between the variogram and the covariance function is

    γ(h) = (1/2) E[(z(x) − z(x′))²] = (1/2) E[((z(x) − m) − (z(x′) − m))²]
         = −E[(z(x) − m)(z(x′) − m)] + (1/2) E[(z(x) − m)²] + (1/2) E[(z(x′) − m)²]
         = −R(h) + R(0).    (3.18)


    That is, the variogram is minus the covariance function plus a constant (which happens to be the variance):

    γ(h) = −R(h) + R(0).    (3.19)

    Using the variogram, consider the model:

    The mean is constant but unspecified and the two-point mean square difference depends only on the distance between the two locations.

    That is,

    E[z(x) − z(x′)] = 0    (3.20)

    (1/2) E[(z(x) − z(x′))²] = γ(h),    (3.21)

    where h = ||x − x′|| is the length of the separation vector. Equations (3.20) and (3.21) comprise the intrinsic isotropic model.

    At first, the reader may be unable to distinguish the intrinsic model from the stationary one. The difference is slight but important. Note, to begin with, that the stationary and intrinsic models differ in the parameters needed to characterize them as well as in mathematical generality. It takes less information to characterize the intrinsic model than the stationary model. Whereas both assume constant mean, in the intrinsic model we avoid ever using a numerical value for the mean. Furthermore, the stationary model may use the covariance function, which cannot be reconstructed only from the variogram over a distance smaller than the range. In a sense, to specify the covariance function one needs the variogram plus an extra number, the variance.

    The intrinsic model is mathematically more inclusive (i.e., general) than the stationary one. If z(x) is stationary, then it is also intrinsic because Equation (3.20) follows from (3.14) and Equation (3.21) follows from (3.15) by using (3.19). For a stationary function, the variogram at large distances equals the sill, γ(∞) = R(0) = σ². However, the important point is that not all intrinsic functions are stationary. As a practical rule, an intrinsic function is nonstationary if its variogram tends to infinity as h tends to infinity. For example, the variogram

    γ(h) = h    (3.22)

    characterizes an intrinsic function that is not stationary.

    It is the intrinsic model that we will use in the remainder of this chapter. Invariably, the question asked is: What is the practical significance and meaning of this model? The practical significance is that it is a simple model that is useful in:

    1. summarizing incomplete information and patterns in noisy data; and

    2. allowing us to interpolate from observations of z(x), as we will soon see.
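The linear variogram of Equation (3.22) can be illustrated with a simulated Brownian-motion-like path (an intrinsic but nonstationary function). In this sketch the grid spacing and sample size are arbitrary, and the increments are scaled so that γ(h) = h.

```python
import numpy as np

# Sketch: simulate a Brownian-motion-like path whose increments over a
# step dx have variance 2*dx, so that gamma(h) = h as in Eq. (3.22).
rng = np.random.default_rng(2)
dx, n = 0.01, 20_000
z = np.cumsum(rng.normal(0.0, np.sqrt(2 * dx), size=n))

lags = np.array([10, 50, 100])                 # lags in grid steps
gamma = np.array([0.5 * np.mean((z[k:] - z[:-k]) ** 2) for k in lags])
for k, g in zip(lags, gamma):
    print(f"h = {k * dx:.2f}  gamma ~ {g:.3f}")  # close to h; no finite sill
```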


    The meaning of this model and its applicability range will be appreciated after we see examples and apply it to describe data.

    Exercise 3.3 If z(x) is a stationary random function with mean m and covariance function R(h), then find the mean and variance of

    z(x) − z(x′)

    and

    (z(x) + z(x′))/2,

    where x and x′ are two locations. Check each of these cases to see whether you can find the variance in terms of the variogram γ(h) = R(0) − R(h) at separation distance ||x − x′||, without necessarily specifying the variance of the process.

    3.4 Common models

    There are mathematical restrictions on which functions may be used as covariance functions or variograms. The reason is rather mundane: The variance of a linear combination of values of z(x) at a number of points can be expressed in terms of covariance functions or variograms (see Exercise (3.3)); the models we use should be such that this expression cannot become negative. Criteria are discussed in references [21 and 30].

    In practice, variograms describing the spatial structure of a function are formed by combining a small number of simple mat