submitted to Geophys. J. Int.

Heterogeneity spectrum inversion from a stochastic

analysis of global surface-wave delay-time data

S. Della Mora1, L. Boschi2, T. W. Becker3, D. Giardini4

1 E.T.H. Zurich, Institute for Geophysics. E-mail: [email protected]


3 University of Southern California, Los Angeles. E-mail: [email protected]


SUMMARY

The wavelength of mantle heterogeneity reflects the nature of Earth’s dynamics, and

constraining it on the basis of seismic data is useful to evaluate the likelihood of

different proposed models of mantle convection. We neglect the geographic distribu-

tion of mantle heterogeneity, inverting global delay-time data to determine directly

the heterogeneity spectrum of the Earth. Inverting for the spectrum is in principle

(fewer unknowns) more robust than inverting for three-dimensional (3D) structure:

as a result, this approach should ultimately help us to constrain the properties of

planetary structure at wavelengths shorter than those of current 3D models. The

linearized algorithm that we employ is based on the work of Gudmundsson and co-

workers in the early 1990s: seismic rays starting from sources in the same cell and

arriving at receivers in the same cell are collected, and the variance of the associ-

ated delay times is calculated; this exercise is repeated varying the grid size. The

dependence of calculated variance on the inter-source and inter-receiver distance can

then be linked to the heterogeneity spectrum via a linearized least-squares inversion.

Here, we limit ourselves to a two-dimensional problem, analysing surface-wave dispersion

in the membrane-wave approximation. Besides inverting global seismic data,

we have conducted a number of synthetic tests to evaluate the resolving power of the

method and its robustness, and the dependence of inversion results on the amount

of inverted data and on the level of complexity that we allow for. We find that the

algorithm can reconstruct realistic spectra, independently of the data coverage, but

the linearized approach involves some approximations that degrade accuracy. We

infer that a fully nonlinear inversion might be necessary to directly constrain the

Earth’s spectrum up to relatively high harmonic degrees.

Key words: Inverse Theory, Surface Waves and Free Oscillations, Planetary Inte-

riors, Seismic Tomography.

1 INTRODUCTION

After two decades of efforts to map the geographic distribution of the features of planetary

structure, the convergence between tomography and geodynamics results is only partial, lim-

ited to the large-scale features, and the small-scale components of Earth’s structure are not

well constrained. With this study we formulate a stochastic approach alternative to tomogra-

phy and aimed at using seismic data to constrain only the “statistical” properties of planetary

interiors. In practice, this amounts to inverting seismic observations to determine the depth-

dependent harmonic spectrum of planetary structure, ignoring the geographic distribution of

heterogeneity. A similar approach, in the context of global seismology, was explored in

the early 1990s (Gudmundsson et al. 1990, hereafter GDC90) and later abandoned.

We believe that this approach should be revived and improved for two reasons: (i) it involves

a reduction of the dimensionality of the solution space, which will limit the non-uniqueness

of the inverse problem, so that, particularly at high spherical harmonic degrees, the spectrum

can in principle be constrained more robustly than it is now; if harmonic degrees up to 40 are

considered, inverting for the harmonic spectrum rather than the 3D structure of the Earth

amounts to a two-order-of-magnitude reduction of the number of dimensions in the solution

space. (ii) The statistical properties, rather than exact geographic locations of structural fea-

tures, are the piece of information that is most valuable for many geodynamic questions. A

correct model of global mantle dynamics should match the thermo-chemical structure of the

mantle as imaged by seismology (e. g., Bunge et al. 2003; Conrad and Gurnis 2003), while

predicting the motion of tectonic plates, but geodynamicists’ attempts to reproduce the exact

pattern of tomography have only had limited success. Slab trajectory models (e. g., Lithgow-

Bertelloni and Richards 1998; Steinberger 2000), for example, are affected by several input

parameters, some still poorly constrained, such as changes in plate motion configurations (e.

g., Bunge and Grand 2000; Tan et al. 2002): correlations with global tomography are therefore

easily degraded, e. g., by small offsets, and the match between fast tomographic anomalies and

the expected locations of deep slabs only holds at the largest scale lengths (Becker and Boschi

2002). Models of deep mantle chemical piles (McNamara et al. 2005) are similarly affected by

prescribed plate motions, and the problem becomes even worse for upwelling plumes (Boschi

et al. 2007, 2008).

In terms of overall mantle dynamics, it is useful to consider the planet’s heterogeneity

spectrum. By analysing the temperature spectrum of a spherical convection computation and

comparing it with tomography (e. g., Megnin et al. 1997), one can ask if the geodynamic model

matches the Earth in a statistically averaged sense. One of the long-standing issues that can

thus be addressed is how the long wavelength structure of the mantle, and in particular the

dominance of degree two, is generated (e. g., Bunge et al. 1996). The organizing motion of

the plates (Buffett et al. 1994) and viscosity stratification (e. g., Bunge et al. 1998) play a

role, though the dependence of viscosity on temperature may change the picture, and the

super-continental cycle may in fact be dominant (Zhong et al. 2007). It is only recently that

spherical convection computations were shown to produce plate-like motion (e. g., van Heck

and Tackley 2008; Foley and Becker 2009) and spectra can be used to diagnose such models

(e. g., Nakagawa et al. 2009; Yoshida 2008). Other important processes are also likely to

leave a strong imprint in the spectral heterogeneity, and could be better understood if the

latter were more robustly constrained by tomography: for example, transient impediments

to vertical mass flux at the 660 km phase change (Tackley 1996), a hot, abyssal layer which

may be invisible to tomography but would affect power spectra (Tackley 2002), and the

post-perovskite phase change (Tackley et al. 2007). A direct inversion for the power spectra might

therefore be a powerful tool to limit the range of possible Earth models.

The approach of GDC90 lends itself to two possible future developments: (i) the size of

the solution space being comparatively small, one can envisage fully nonlinear, numerical

inversions, still impossible for classical, 3-D tomography, but needed to properly constrain

small-scale structure. (ii) Because it neglects the geographical pattern of heterogeneities, our

method should work also in conditions of poor data coverage: it will be useful to map the

structure of extra-terrestrial bodies (e. g., Wieczorek 2009; Lognonne 2008), where too few

seismometers can be deployed for tomographic techniques to be applicable.


We design an inversion algorithm similar to that of GDC90, describing the theory in more

detail and with, at least in our view, a simpler notation. This helps us to better understand its

fundamental assumptions and their implications. We apply the algorithm to surface waves,

treated as propagating on a zero thickness spherical membrane along the shortest great-circle

arc connecting source and receiver. In this framework, synthetic data can be generated in a

short amount of time, and the GDC90 algorithm has an easier formulation.

2 STOCHASTIC FORMULATION

A theory of wave propagation defines a mathematical relationship between anomalies in the

properties of the Earth (e. g., the reference velocity v of a seismic phase) δv(r, θ, ϕ) (with

r, θ, ϕ radius, colatitude and longitude, respectively) and anomalies δt in the travel-time of

seismic phases. Neglecting non-linear effects, this relationship has the most general form

$$\delta t = -\frac{1}{v^2}\int_V K(r,\theta,\phi)\,\delta v(r,\theta,\phi)\,dV, \tag{1}$$

where V denotes the volume of the Earth, and the function K, dubbed sensitivity kernel (or

partial derivative, Frechet derivative, etc.), depends on the source-station geometry associated

with the datum δt. Given phase and frequency, there exists one kernel per source-station

couple (e. g., Cheney 2001). If a 1-D Earth is used as reference, the form of K depends only on

epicentral distance (e. g., Zhou et al. 2004). If ray theory is used to describe wave propagation,

K is nonzero on the ray path (traced in the reference model), and zero everywhere else (e.

g., Nowack and Lyslo 1989). If some form of finite-frequency theory is used, K becomes more

complicated (e. g., Tromp et al. 2005). If the assumption of linearity is dropped, e. g. if we

care about multiple-scattering, then no function K can be defined, and equation (1) ceases to

be valid. Typically, equation (1) is used to set up an inverse problem with δv(r, θ, ϕ) as the

unknowns, and a set of observations of δt as data.

Equation (1) can be re-written per observed value of δt, each with its corresponding kernel

function K. δv is then expressed as a sum of unknown coefficients multiplied by some known

“basis functions”, e. g.

$$\delta v(r,\theta,\phi) = \sum_{l=0}^{L}\sum_{m=-l}^{l}\sum_{n=1}^{N} A_{lmn}\,Y_{lm}(\theta,\phi)\,R_n(r), \tag{2}$$

with Ylm denoting the real scalar spherical harmonic of degree l and order m

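$$Y_{lm}(\theta,\phi) = \begin{cases} \sqrt{(2-\delta_{0,m})\,\dfrac{2l+1}{4\pi}\,\dfrac{(l-m)!}{(l+m)!}}\;P_l^{m}(\cos\theta)\cos(m\phi) & m \ge 0 \\[2ex] \sqrt{2\,\dfrac{2l+1}{4\pi}\,\dfrac{(l-|m|)!}{(l+|m|)!}}\;P_l^{|m|}(\cos\theta)\sin(|m|\phi) & m < 0 \end{cases} \tag{3}$$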


where δi,j denotes here the Kronecker function, i. e. δi,j = 1 if i = j, δi,j = 0 otherwise,

Pml (cos θ) is the associated Legendre function of degree l and order m (e. g., Hildebrand

1956), and Rn(r) some vertical basis function. The largest angular degree L and the total

number of vertical functions N are selected depending on the resolution that one expects

to achieve. Replacing (2) into (1) once per observation, we have a mixed-determined inverse

problem with unknown coefficients Almn, while all other values can be calculated (e. g., Boschi

and Dziewonski 1999).

The main idea of GDC90 and Davies et al. (1992) is to set up an inverse problem whose

unknowns are not the coefficients Almn, but the spectral power per unit area

$$Q_{ln} = \frac{1}{2l+1}\sum_{m=-l}^{l} A_{lmn}^2. \tag{4}$$
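As an illustration of definition (4), the following minimal sketch computes the spectrum from a set of coefficients for one radial basis function. The nested-list layout of `coeffs` and the function name are assumptions made for this example only; they are not part of GDC90's or our implementation.

```python
def spectral_power(coeffs):
    """Return Q_l = (1 / (2l + 1)) * sum_m A_lm**2 for each degree l.

    Hypothetical layout: coeffs[l] lists the 2l + 1 coefficients A_lm
    (m = -l, ..., l) associated with a single radial basis function n.
    """
    spectrum = []
    for l, row in enumerate(coeffs):
        if len(row) != 2 * l + 1:
            raise ValueError(f"degree {l} needs {2 * l + 1} coefficients")
        spectrum.append(sum(a * a for a in row) / (2 * l + 1))
    return spectrum


# Arbitrary illustrative coefficients for degrees 0, 1 and 2.
example = [[2.0], [1.0, 0.0, -1.0], [0.5] * 5]
print(spectral_power(example))
```

Note that the spectrum retains one number per degree, whatever the distribution of power among the 2l + 1 orders.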

GDC90 proceed by dividing a global database of δt observations into smaller sets, each

including data characterized by sources located in one longitude-latitude-depth bin, and re-

ceivers located in one longitude-latitude-depth bin: in the framework of ray theory, these data

subsets are dubbed “ray bundles”, or “summary rays”. The horizontal and vertical extent of

the bins are denoted Θ and Z, respectively. GDC90 then introduce the function

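$$\sigma^2(\Theta,\Delta,Z) = \left(\sum_{k=1}^{n_S} n_k\right)^{-1}\sum_{k=1}^{n_S} n_k \sum_{i=1}^{n_k}\frac{\left[\delta t_i - \mathrm{mean}_k(\delta t)\right]^2}{n_k - 1}, \tag{5}$$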

where k is the ray bundle index, from a total of nS ray bundles, nk is the number of actual rays

collected in the k-th ray bundle, and meank(δt) is the mean of all measurements of δt within

that bundle. For a given binning scheme (i. e., given values of Θ and Z), σ2 is a function of

the epicentral distance ∆. Through (5), the numerical values of σ2 can be determined directly

from the observations of δt. We limit the summation over ray bundles only to those that

include more than 10 rays, i. e. nk > 10.
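The estimate in equation (5) can be sketched in a few lines. In this hedged example, each ray bundle is represented simply as a list of delay times (a hypothetical data layout), and the parameter `min_rays` encodes the criterion n_k > 10 stated above.

```python
def bundle_variance(bundles, min_rays=11):
    """Equation (5): ray-count-weighted mean of within-bundle sample variances.

    `bundles` is a list of lists of delay times, one inner list per ray
    bundle. Bundles with fewer than `min_rays` rays are skipped, which
    corresponds to the n_k > 10 criterion of the text.
    """
    total_rays = 0
    weighted_sum = 0.0
    for delays in bundles:
        n_k = len(delays)
        if n_k < min_rays:
            continue
        mean_k = sum(delays) / n_k
        # Unbiased sample variance within the k-th bundle.
        var_k = sum((dt - mean_k) ** 2 for dt in delays) / (n_k - 1)
        total_rays += n_k
        weighted_sum += n_k * var_k
    if total_rays == 0:
        raise ValueError("no bundle has enough rays")
    return weighted_sum / total_rays
```

In practice the function would be called once per combination of bin size and epicentral-distance range, so that the dependence of the variance on those parameters can be tabulated.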

$\sigma^2$ as defined by equation (5) can be thought of as the average, calculated over all ray bundles in the same $\Delta$-$Z$ bin, of the variance of $\delta t$ calculated within each ray bundle. In practice, if we introduce the expected value operators $E_C = \frac{1}{n_k-1}\sum_{i=1}^{n_k}(\ldots)$ (sum extended over all rays within a bundle) and $E = \left(\sum_{k=1}^{n_S} n_k\right)^{-1}\sum_{k=1}^{n_S} n_k(\ldots)$ (sum over all bundles in the same $\Delta$-$Z$ bin), equation (5) takes the more compact form

$$\sigma^2(\Theta,\Delta,Z) = E\,E_C\left[(\delta t - E_C(\delta t))^2\right]. \tag{6}$$

GDC90’s theoretical treatment (pages 28 through 34) consists of showing that expression


(6) can also be written as an integral function of the harmonic spectrum (4) of δv as a function

of depth. That way, a linear inverse problem can be set up, whose unknowns are the coefficients

Qln themselves. In the following, we shall first rewrite all the theory for body waves in a more

extensive way than GDC90 did (Section 3) and then reformulate it for surface waves (Section

4).

3 FORMULATION OF THE INVERSE PROBLEM FOR A 3-D EARTH

3.1 Theory

Following GDC90, we take a “stochastic” approach (e. g., Chernov 1960) and think of each

ray bundle (for given ∆, Z, Θ) as a different realization of the same experiment. Each time,

a different planet is sampled, with different coefficients Almn.

Equation (6) becomes, after some algebra,

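$$\begin{aligned}
\sigma^2(\Theta,\Delta,Z) &= E\,E_C\left[(\delta t - E_C(\delta t))^2\right] \\
&= E\,E_C\left[(\delta t)^2 - 2\,\delta t\,E_C(\delta t) + (E_C(\delta t))^2\right] \\
&= E\left\{E_C\left[(\delta t)^2\right] - 2E_C(\delta t)E_C(\delta t) + (E_C(\delta t))^2\right\} \\
&= E\left\{E_C\left[(\delta t)^2\right] - \left[E_C(\delta t)\right]^2\right\} \\
&= E\,E_C\left[(\delta t)^2\right] - E\left[E_C(\delta t)\right]^2.
\end{aligned} \tag{7}$$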

The second term of the latter expression can be rewritten

$$E\left[E_C(\delta t)\right]^2 = E\left[\frac{1}{A}\int_A \delta t(\mathbf{x})\,d\mathbf{x}\right]^2, \tag{8}$$

$A$ being the area of the grid cell of radius $\Theta$. Considering that the expected value of the sum equals the sum of the expected values, and applying Fubini's theorem

$$\left(\int_B f(y)\,dy\right)\left(\int_C g(z)\,dz\right) = \int_B\int_C f(y)\,g(z)\,dy\,dz, \tag{9}$$

valid provided that $B$ and $C$ are complete measure spaces (e. g., Thomas and Finney 1996), we find

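$$E\left[E_C(\delta t)\right]^2 = \frac{1}{A^2}\int_A\int_A E\left[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_2)\right]d\mathbf{x}_1\,d\mathbf{x}_2. \tag{10}$$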

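Let us now consider the other term in equation (7),

$$E\,E_C\left[(\delta t)^2\right] = \frac{1}{A}\int_A E\left[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_1)\right]d\mathbf{x}_1. \tag{11}$$

Since $\frac{1}{A}\int_A d\mathbf{x}_2 = 1$,

$$E\,E_C\left[(\delta t)^2\right] = \frac{1}{A}\int_A E\left[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_1)\right]d\mathbf{x}_1\left(\frac{1}{A}\int_A d\mathbf{x}_2\right), \tag{12}$$

and again based on Fubini's theorem,

$$E\,E_C\left[(\delta t)^2\right] = \frac{1}{A^2}\int_A\int_A E\left[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_1)\right]d\mathbf{x}_1\,d\mathbf{x}_2. \tag{13}$$

If we define $\rho = |\mathbf{x}_2 - \mathbf{x}_1|$, then $\delta t(\mathbf{x}_1) = \lim_{\rho\to 0}\delta t(\mathbf{x}_2)$, and

$$E\,E_C\left[(\delta t)^2\right] = \frac{1}{A^2}\int_A\int_A \lim_{\rho\to 0} E\left[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_2)\right]d\mathbf{x}_1\,d\mathbf{x}_2. \tag{14}$$

Substituting (10) and (14) into (7), we find

$$\sigma^2(\Theta,\Delta,Z) = \frac{1}{A^2}\int_A\int_A \lim_{\rho\to 0} E\left[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_2)\right]d\mathbf{x}_1\,d\mathbf{x}_2 - \frac{1}{A^2}\int_A\int_A E\left[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_2)\right]d\mathbf{x}_1\,d\mathbf{x}_2. \tag{15}$$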

In Sections 3.2 and 3.3 we will write the expression $E[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_2)]$, which appears in both terms at the right-hand side of (15), in a simpler form; we will also find that this expression depends explicitly only on the distance $\rho = |\mathbf{x}_1 - \mathbf{x}_2|$, and not on $\mathbf{x}_1$ and $\mathbf{x}_2$ themselves. In Section 3.4 we will exploit this finding to simplify the form of the double integral $\int_A\int_A$ in equation (15), after having further simplified our expression for $E[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_2)]$ via the ray-theory approximation. This procedure will lead to a relatively simple expression for $\sigma^2$ (whose value we can also determine from the data through equation (5)) in terms of the harmonic spectrum of the Earth, $Q_l = \sum_{n=1}^{N} Q_{ln}R_n(r)$. This equation constitutes the basis of GDC90's and our formulation of the inverse problem.

3.2 Relation between delay-time variance within ray bundles, and the

harmonic spectrum of Earth’s structure

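We next show how $E[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_2)]$ can be written in terms of the Earth's structure coefficients $A_{lmn}$. Using equation (1),

$$E\left[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_2)\right] = E\left[\frac{1}{v^4}\int_V\int_V K_1(\mathbf{r}_1)\,K_2(\mathbf{r}_2)\,\delta v(\mathbf{r}_1)\,\delta v(\mathbf{r}_2)\,d\mathbf{r}_1\,d\mathbf{r}_2\right], \tag{16}$$

with $\mathbf{r}_i = (r_i,\theta_i,\phi_i)$ ($i = 1, 2$) a position 3-vector defined within the Earth's volume $V$, and $d\mathbf{r}_i$ the corresponding infinitesimal volume element. The kernel function $K_i$ is the one associated with the position 2-vector $\mathbf{x}_i$ ($i = 1, 2$) defined over the surface $A$, or the portion of Earth's surface swept by the ray bundle.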


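Substituting equation (2) into (16), and denoting for simplicity $A_{lm}(r) = \sum_{n=1}^{N} A_{lmn}R_n(r)$,

$$E\left[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_2)\right] = E\left[\frac{1}{v^4}\int_V\int_V K_1(\mathbf{r}_1)\,K_2(\mathbf{r}_2)\sum_{l,m}A_{lm}(r_1)Y_{lm}(\theta_1,\phi_1)\sum_{p,q}A_{pq}(r_2)Y_{pq}(\theta_2,\phi_2)\,d\mathbf{r}_1\,d\mathbf{r}_2\right], \tag{17}$$

where $\sum_{l,m}$ is short for $\sum_{l=0}^{L}\sum_{m=-l}^{l}$.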

Recall now that, in our stochastic approach, the expectation operator E involves an

average made over all sampled planets. Each ray-bundle, in fact, can be thought of as a

repetition of the same experiment, with constant geometry and different structure coefficients

Alm(r). We can then rewrite equation (17),

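$$E\left[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_2)\right] = \frac{1}{v^4}\int_V\int_V K_1(\mathbf{r}_1)\,K_2(\mathbf{r}_2)\sum_{l,m}\sum_{p,q}\left[E\left[A_{lm}(r_1)A_{pq}(r_2)\right]Y_{lm}(\theta_1,\phi_1)\,Y_{pq}(\theta_2,\phi_2)\right]d\mathbf{r}_1\,d\mathbf{r}_2. \tag{18}$$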

Following GDC90, we then make the assumptions (i) that coefficients at different harmonic degrees and orders be uncorrelated, i. e. that the expected value of their product be zero:

$$E\left[A_{lm}(r_1)A_{pq}(r_2)\right] = E\left[A_{lm}(r_1)A_{pq}(r_2)\right]\delta_{l,p}\,\delta_{m,q} \tag{19}$$

(with no implicit summation over $l$, $m$), and (ii) that at each harmonic degree $l$, $E[A_{lm}A_{lm}]$ be constant with respect to $m$, so that we can write

$$E\left[A_{lm}(r_1)A_{lm}(r_2)\right] = \frac{1}{2l+1}\sum_{m=-l}^{l}A_{lm}(r_1)A_{lm}(r_2). \tag{20}$$

For $r_1 \approx r_2$, if we replace $r_1$ and $r_2$ in equation (20) with $(r_1 + r_2)/2$,

$$E\left[A_{lm}(r_1)A_{lm}(r_2)\right] \approx Q_l\!\left(\frac{r_1+r_2}{2}\right), \tag{21}$$

with $Q_l(r) = \sum_{n=1}^{N}Q_{ln}R_n(r)$ for any $r$, and $Q_{ln}$ defined by equation (4). As the absolute value $|r_1 - r_2|$ grows, we assume that correlation (expected value of product) will change (decrease) in a similar way for all $l$, $m$. We introduce a function $c(|r_1 - r_2|)$ describing the vertical dependence of correlation, and correct (21) as follows:

$$E\left[A_{lm}(r_1)A_{lm}(r_2)\right] \approx Q_l\!\left(\frac{r_1+r_2}{2}\right)c(|r_1-r_2|), \tag{22}$$

which is valid for any $r_1$, $r_2$.


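If we substitute equation (22) into (19), and the resulting expression for $E[A_{lm}(r_1)A_{pq}(r_2)]$ into (18), we find

$$E\left[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_2)\right] \approx \frac{1}{v^4}\int_V\int_V K_1(\mathbf{r}_1)\,K_2(\mathbf{r}_2)\sum_{l}\left[Q_l\!\left(\frac{r_1+r_2}{2}\right)c(|r_1-r_2|)\sum_{m}Y_{lm}(\theta_1,\phi_1)\,Y_{lm}(\theta_2,\phi_2)\right]d\mathbf{r}_1\,d\mathbf{r}_2. \tag{23}$$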

The summation over m can be carried out analytically, via the addition theorem (e. g.,

Dahlen 1998, equation B.74)

$$\sum_{m=-l}^{l}Y_{lm}(\theta_1,\phi_1)\,Y_{lm}(\theta_2,\phi_2) = \frac{2l+1}{4\pi}P_l(\cos\rho), \tag{24}$$

with Pl the Legendre polynomial of degree l, implicit in the definition of harmonics Ylm, and

ρ the angular distance between (θ1, ϕ1) and (θ2, ϕ2), given by

cos(ρ) = cos(θ1) cos(θ2) + sin(θ1) sin(θ2) cos(ϕ1 − ϕ2). (25)
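Equations (24) and (25) can be checked numerically in the simplest non-trivial case, degree l = 1, for which the real orthonormal harmonics have elementary closed forms. The sketch below is only an illustration under that restriction; the function names and the two test points are arbitrary choices made for the example.

```python
import math


def angular_distance(theta1, phi1, theta2, phi2):
    """Angular distance rho from equation (25); angles in radians."""
    cos_rho = (math.cos(theta1) * math.cos(theta2)
               + math.sin(theta1) * math.sin(theta2) * math.cos(phi1 - phi2))
    # Clip to [-1, 1] to guard against rounding before taking acos.
    return math.acos(max(-1.0, min(1.0, cos_rho)))


def degree_one_harmonics(theta, phi):
    """Real orthonormal harmonics of degree 1, ordered m = -1, 0, 1.

    An overall sign convention on the m = +/-1 terms does not affect
    the products summed in equation (24).
    """
    norm = math.sqrt(3.0 / (4.0 * math.pi))
    return [norm * math.sin(theta) * math.sin(phi),
            norm * math.cos(theta),
            norm * math.sin(theta) * math.cos(phi)]


point_1 = (0.7, 1.1)   # (colatitude, longitude), arbitrary
point_2 = (2.0, -0.4)
lhs = sum(a * b for a, b in zip(degree_one_harmonics(*point_1),
                                degree_one_harmonics(*point_2)))
rho = angular_distance(*point_1, *point_2)
rhs = 3.0 / (4.0 * math.pi) * math.cos(rho)   # (2l+1)/(4 pi) P_1(cos rho)
print(lhs, rhs)
```

The two printed values should agree to rounding error, since for l = 1 the left-hand side of (24) reduces to 3/(4π) times the right-hand side of (25).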

Substituting equation (24) into (23),

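$$E\left[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_2)\right] \approx \frac{1}{4\pi v^4}\int_V\int_V K_1(\mathbf{r}_1)\,K_2(\mathbf{r}_2)\sum_{l}\left[(2l+1)\,Q_l\!\left(\frac{r_1+r_2}{2}\right)c(|r_1-r_2|)\,P_l(\cos\rho)\right]d\mathbf{r}_1\,d\mathbf{r}_2. \tag{26}$$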

3.3 Simplification by application of ray theory, and the assumption that ray

paths forming a ray bundle are parallel

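As noted at the beginning of Section 2, in the ray-theory approximation the velocity kernel $K(\mathbf{r}) = 1$ if $\mathbf{r}$ belongs to the ray path, and $K(\mathbf{r}) = 0$ otherwise. Denoting by ray1 and ray2 the ray paths corresponding respectively to the locations $\mathbf{x}_1$ and $\mathbf{x}_2$ within $A$, equation (26) can be rewritten

$$E\left[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_2)\right] \approx \frac{1}{4\pi v^4}\int_{\mathrm{ray}_1}\int_{\mathrm{ray}_2}\sum_{l}(2l+1)\,Q_l\!\left(\frac{r_1(s_1)+r_2(s_2)}{2}\right)c\left(|r_1(s_1)-r_2(s_2)|\right)P_l(\cos\rho)\,ds_1\,ds_2, \tag{27}$$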

where s1 and s2 denote length along ray1 and ray2, respectively, and r1 = r1(s1), r2 = r2(s2)

are (part of) the parametric equations of the ray paths.

Neglecting, at first, the effects of spherical geometry, GDC90 show in detail how the double integral $\int_{\mathrm{ray}_1}\int_{\mathrm{ray}_2}$ can be reduced to a single integral along one reference ray. Their procedure requires the assumption that “all the rays have the same ray parameter and randomly distributed endpoints in the two grid cells defining the summary ray. This implies that the rays are approximately parallel and simply shifted horizontally with respect to each other” [GDC90, page 32]. We shall try to overcome this approximation in Appendix C.

Figure 1. Visual explanation of variables $s_2$, $\tau$ and $d$ in equation (28). $\rho$ is the horizontal distance between $(\theta_1,\phi_1)$ and $(\theta_2,\phi_2)$ introduced in equation (24).

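Consider now a point $\mathbf{r}_1$ on ray1. Let us call $P$ its projection on ray2, and $d$ the distance between $\mathbf{r}_1$ and $P$ (i. e., by definition, the minimum distance between ray1 and ray2). Given a point $\mathbf{r}_2$ on ray2, and denoting by $s_2$ the distance between $P$ and $\mathbf{r}_2$ along ray2, the distance $\tau$ between $\mathbf{r}_1$ and $\mathbf{r}_2$ equals

$$\tau = \sqrt{d^2 + s_2^2}, \tag{28}$$

which implies

$$ds_2 = \frac{\tau\,d\tau}{\sqrt{\tau^2-d^2}} \tag{29}$$

(see Figure 1 for a visual explanation of $\tau$, $d$ and $s_2$). Then, for a function $g(\mathbf{r}_1,\mathbf{r}_2)$ that depends only on the distance $\tau$ between $\mathbf{r}_1$ and $\mathbf{r}_2$,

$$\int_{\mathrm{ray}_1}\int_{\mathrm{ray}_2} g(\mathbf{r}_1,\mathbf{r}_2)\,ds_1\,ds_2 = 2\int_{\mathrm{ray}_1}\int_d^{\infty}\frac{\tau}{\sqrt{\tau^2-d^2}}\,g(\tau)\,ds_1\,d\tau. \tag{30}$$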

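The integral over $\tau$ in equation (30) is particularly easy if $g(\tau) = g_0\exp(-\tau^2/\alpha^2)$ for some $\alpha$, implying

$$\int_{\mathrm{ray}_1}\int_{\mathrm{ray}_2} g(\mathbf{r}_1,\mathbf{r}_2)\,ds_1\,ds_2 = \sqrt{\pi}\int_{\mathrm{ray}_1}\alpha\,g(d)\,ds_1. \tag{31}$$

Denoting by $x_{1/2}$ the value of $d$ such that $g(x_{1/2}) = g_0\exp(-x_{1/2}^2/\alpha^2) = g_0/2$ (i. e., $x_{1/2}$ is the “half-width” of the gaussian $g$), it can be shown that $x_{1/2} = \alpha\sqrt{\ln 2}$, and

$$\int_{\mathrm{ray}_1}\int_{\mathrm{ray}_2} g(\mathbf{r}_1,\mathbf{r}_2)\,ds_1\,ds_2 = \sqrt{\frac{\pi}{\ln 2}}\int_{\mathrm{ray}_1}x_{1/2}\,g(d)\,ds_1. \tag{32}$$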
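The step from (30) to (31) can be verified numerically for a single point on ray1. The sketch below is a hedged cross-check, not part of the original derivation: it evaluates the inner integral of (30) by a midpoint rule after the substitution τ = d cosh t (an assumption of this example, chosen because it removes the integrable singularity at τ = d), and compares it with the closed form √π α g(d). The truncation `t_max` and step count `n` are arbitrary numerical choices, and d must be strictly positive.

```python
import math


def inner_integral(d, alpha, g0=1.0, t_max=6.0, n=20000):
    """Estimate 2 * int_d^inf tau / sqrt(tau^2 - d^2) * g(tau) dtau
    for g(tau) = g0 * exp(-tau^2 / alpha^2), with d > 0.

    With tau = d*cosh(t), tau / sqrt(tau^2 - d^2) dtau = d*cosh(t) dt,
    so the integrand becomes smooth on [0, t_max].
    """
    h = t_max / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        tau = d * math.cosh(t)
        total += d * math.cosh(t) * g0 * math.exp(-(tau / alpha) ** 2)
    return 2.0 * total * h


d, alpha = 0.5, 1.0
numerical = inner_integral(d, alpha)
closed_form = math.sqrt(math.pi) * alpha * math.exp(-(d / alpha) ** 2)
print(numerical, closed_form)
```

Replacing α with x_{1/2}/√(ln 2) in `closed_form` gives the integrand of equation (32).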

This procedure can be applied to the integral at the right-hand side of equation (27), assuming that its argument, a function of the distance $\rho$ between points on ray1 and ray2, be close to gaussian. Then

$$E\left[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_2)\right] \approx \frac{1}{4v^4\sqrt{\pi\ln 2}}\int_{\mathrm{ray}}x_{1/2}(r)\sum_{l}\left[(2l+1)\,Q_l(r)\,P_l(\cos\rho)\right]ds, \tag{33}$$

with r = (r1 + r2)/2. Notice that the arbitrary function c(|r1 − r2|) in equation (27) becomes

unnecessary, once the assumption that the integrand be approximately gaussian has been

made. Still, the assumption of very slow changes in the argument of the integral in (27) with

respect to location is required, in order for changes in x1/2 with r not to pose a problem. It

should also be noted that, under the assumption of parallel rays, the horizontal distance ρ has

been systematically approximated with the generic distance d between ray paths, assumed

constant along the ray paths themselves.

3.4 Writing the double surface integral as a single integral over distance

Recall the form of both terms at the right-hand side of equation (15), $\frac{1}{A^2}\int_A\int_A E[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_2)]\,d\mathbf{x}_1\,d\mathbf{x}_2$. In Sections 3.2 and 3.3 we have rewritten the integrand $E[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_2)]$, expressing it in terms of the Earth's harmonic spectrum $Q_l(r)$, and reducing it to the simple form (33), a function only of the constant distance between approximately parallel ray paths. Before using this result to set up an inverse problem, with $Q_l(r)$ as unknown and $\sigma^2$ as datum, we reduce analytically the double surface integral $\int_A\int_A$ in (15) to a single, one-dimensional integral.

3.4.1 Cartesian case

Consider the integral

$$I = \int_A d\mathbf{x}_1\int_A f(\mathbf{x}_1,\mathbf{x}_2)\,d\mathbf{x}_2, \tag{34}$$

with A a circular surface of radius Θ, x1 and x2 Cartesian 2-vectors spanning A, and dx1,

dx2 the corresponding infinitesimal surface elements.

Now, let the function f (x1,x2) depend only on the distance ρ between the points x1 and

x2. We shall later make use of this property of f to simplify the double surface integral in

equation (34).

First, replace $\int_A d\mathbf{x}_1$ with an integral over the polar coordinates $s$ (length) and $\chi$ (angle), defined with respect to the centre of $A$. It follows that

$$I = \int_0^{\Theta}s\,ds\int_0^{2\pi}d\chi\int_A f(s,\chi,\mathbf{x}_2)\,d\mathbf{x}_2. \tag{35}$$

For each location $(s,\chi)$, we must integrate again over all points $\mathbf{x}_2$ within $A$. Let us replace the Cartesian coordinates $\mathbf{x}_2$ with polar coordinates $\rho$ (length) and $\psi$ (angle), defined with respect to the location $\mathbf{x}_1$, or $(s,\chi)$. By definition, $\rho$ then coincides with the distance between $\mathbf{x}_1$ and $\mathbf{x}_2$. As we accordingly rewrite the integral $\int_A d\mathbf{x}_2$ in (35), we must specify the limits of integration in $\rho$ and $\psi$. $\rho$ ranges between 0 and $s + \Theta$. For each $\rho$, the interval of values of $\psi$ for which $(\rho,\psi)$ falls within $A$ must be determined (and integrated over). If $s + \rho < \Theta$, that is $\rho < \Theta - s$, this interval is $]0, 2\pi[$. If $\rho > \Theta - s$, the length $\varphi$ of the arc of $\psi$ to be integrated over can be determined using the cosine rule,

$$\Theta^2 = s^2+\rho^2-2s\rho\cos(\varphi/2), \qquad \varphi = 2\cos^{-1}\!\left(\frac{s^2+\rho^2-\Theta^2}{2s\rho}\right), \tag{36}$$

where $\cos^{-1}(\ldots)$ is the inverse cosine function (see Figure 2 for a visual explanation of $s$, $\rho$ and $\Theta$).

Figure 2. Visual explanation of variables $s$, $\rho$ and $\Theta$ in equation (36). $A$ is the surface over which the integrals in equation (34) are evaluated, $C$ is the centre of $A$, and $\Theta$ and $s$ are the same as in equation (36).
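The case distinction leading to equation (36) can be summarized in a short function. This is a minimal sketch in the flat (Cartesian) geometry of this section, assuming 0 ≤ s ≤ Θ; the function name is an assumption of the example.

```python
import math


def arc_angle(s, rho, big_theta):
    """Angular width (radians) of the part of the circle of radius rho,
    centred at distance s from the centre of a disc of radius big_theta,
    that lies inside the disc. Assumes 0 <= s <= big_theta.
    """
    if rho <= big_theta - s:
        return 2.0 * math.pi      # circle entirely inside the disc
    if rho >= big_theta + s:
        return 0.0                # circle entirely outside the disc
    # Cosine rule, equation (36); clipping guards against rounding.
    cos_half = (s * s + rho * rho - big_theta * big_theta) / (2.0 * s * rho)
    cos_half = max(-1.0, min(1.0, cos_half))
    return 2.0 * math.acos(cos_half)


print(arc_angle(1.0, 1.0, 1.0))   # point on the rim, circle of radius Theta
```

The expression (36) is continuous at both limits: it tends to 2π as ρ approaches Θ − s and to 0 as ρ approaches Θ + s, which is why the integration over ρ in what follows can be split at ρ = Θ − s without special treatment.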

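Equation (35) now becomes

$$I = \int_0^{\Theta}s\,ds\int_0^{2\pi}d\chi\left[\int_0^{\Theta-s}\rho\,d\rho\int_0^{2\pi}f(s,\chi,\rho,\psi)\,d\psi + \int_{\Theta-s}^{\Theta+s}\rho\,d\rho\int_0^{\varphi}f(s,\chi,\rho,\psi)\,d\psi\right], \tag{37}$$

where we have made use of the fact that $f$ depends only on $\rho$, so the actual values taken by $\psi$ do not matter: only the length of the arc it spans does. For the same reason we can write $f(s,\chi,\rho,\psi) = f(\rho)$, and

$$I = \int_0^{\Theta}2\pi s\,ds\left[\int_0^{\Theta-s}2\pi\rho f(\rho)\,d\rho + \int_{\Theta-s}^{\Theta+s}2\rho\cos^{-1}\!\left(\frac{s^2+\rho^2-\Theta^2}{2s\rho}\right)f(\rho)\,d\rho\right]. \tag{38}$$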

To further simplify the expression for $I$, it is convenient to change the order of integration over $s$ and $\rho$. The first term at the right-hand side of (38) is

$$I_1 = 4\pi^2\int_0^{\Theta}s\,ds\int_0^{\Theta-s}\rho f(\rho)\,d\rho = 4\pi^2\int_0^{\Theta}\rho f(\rho)\,d\rho\int_0^{\Theta-\rho}s\,ds. \tag{39}$$

We express the second term at the right hand side of (38) as the sum of two terms: one

denoted I2, containing an integral over ρ between Θ−s and Θ, the other denoted I3, containing

an integral over ρ between Θ and Θ + s. Changing the order of integration, we find

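$$\begin{aligned}
I_2 &= 2\pi\int_0^{\Theta}s\,ds\int_{\Theta-s}^{\Theta}2\rho f(\rho)\cos^{-1}\!\left(\frac{s^2+\rho^2-\Theta^2}{2s\rho}\right)d\rho \\
&= 2\pi\int_0^{\Theta}\rho f(\rho)\,d\rho\int_{\Theta-\rho}^{\Theta}2s\cos^{-1}\!\left(\frac{s^2+\rho^2-\Theta^2}{2s\rho}\right)ds,
\end{aligned} \tag{40}$$

$$\begin{aligned}
I_3 &= 2\pi\int_0^{\Theta}s\,ds\int_{\Theta}^{\Theta+s}2\rho f(\rho)\cos^{-1}\!\left(\frac{s^2+\rho^2-\Theta^2}{2s\rho}\right)d\rho \\
&= 2\pi\int_{\Theta}^{2\Theta}\rho f(\rho)\,d\rho\int_{\rho-\Theta}^{\Theta}2s\cos^{-1}\!\left(\frac{s^2+\rho^2-\Theta^2}{2s\rho}\right)ds.
\end{aligned} \tag{41}$$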

If one now combines $I = I_1 + I_2 + I_3$, equation (5) of GDC90 is reproduced. GDC90 further simplify the form of $I$, carrying out analytically the integrations over $s$. This is straightforward for the $s$-integral within $I_1$, but more complicated for those within $I_2$ and $I_3$. For the latter, the following identity holds for $x > 0$ and $b > 0$:

$$\int x\cos^{-1}\!\left(\frac{x^2+a}{bx}\right)dx = \frac{1}{2}x^2\cos^{-1}\!\left(\frac{x^2+a}{bx}\right) + \frac{-2\sqrt{-z}+(b^2-4a)\tan^{-1}\!\left(\dfrac{b^2-2x^2-2a}{2\sqrt{-z}}\right)}{8} \tag{42}$$

with $z = x^4 - b^2x^2 + 2ax^2 + a^2$. This formula is not provided by GDC90, who jump directly to the final result

$$I = 4\pi\Theta^2\int_0^{2\Theta}\left[\cos^{-1}\!\left(\frac{\rho}{2\Theta}\right) - \frac{\rho}{2\Theta}\sqrt{1-\left(\frac{\rho}{2\Theta}\right)^2}\right]\rho f(\rho)\,d\rho. \tag{43}$$
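Equation (43) lends itself to two simple sanity checks, sketched below under the flat-geometry assumption of this subsection. For f = 1 the double integral (34) must equal the squared area of the disc, (πΘ²)²; for f(ρ) = ρ² it must equal the squared area times the mean squared distance between two points drawn uniformly in the disc, which is Θ². The midpoint quadrature and the value of Θ are arbitrary choices made for the example.

```python
import math


def disc_pair_integral(f, big_theta, n=20000):
    """Midpoint-rule evaluation of the right-hand side of equation (43)."""
    h = 2.0 * big_theta / n
    total = 0.0
    for i in range(n):
        rho = (i + 0.5) * h
        x = rho / (2.0 * big_theta)          # always strictly inside (0, 1)
        bracket = math.acos(x) - x * math.sqrt(1.0 - x * x)
        total += bracket * rho * f(rho)
    return 4.0 * math.pi * big_theta ** 2 * total * h


big_theta = 0.3
area = math.pi * big_theta ** 2
print(disc_pair_integral(lambda rho: 1.0, big_theta), area ** 2)
print(disc_pair_integral(lambda rho: rho ** 2, big_theta),
      area ** 2 * big_theta ** 2)
```

The first check also shows that the bracketed factor in (43), multiplied by 4ρ/(πΘ²), integrates to one over [0, 2Θ], i. e. it acts as a normalized weight on inter-point distance.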

3.4.2 Spherical case

To derive equation (43) we have treated the surface of the Earth, and of the area A spanned

by a ray bundle, as flat. When their curvature is taken into account, equation (43) is replaced

by equation (27) of GDC90

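$$I = A^2\int_0^{2\Theta}\omega(\Theta,\rho)\,f(\rho)\,d\rho, \tag{44}$$

where

$$\omega(\Theta,\rho) = \begin{cases} \dfrac{\pi - 4\cos\Theta\cos^{-1}\alpha + \cos^{-1}\beta_1 + \cos^{-1}\beta_2}{2\pi(1-\cos\Theta)^2}\,\sin\rho & \text{if } 0 < \rho < \Theta \\[2ex] \dfrac{\pi - 4\cos\Theta\cos^{-1}\alpha + \sin^{-1}\beta_1 - \sin^{-1}\beta_2}{2\pi(1-\cos\Theta)^2}\,\sin\rho & \text{if } \Theta < \rho < 2\Theta, \end{cases} \tag{45}$$

and $\alpha$, $\beta_1$, $\beta_2$ are defined as

$$\alpha = \frac{\cos\Theta\,(1-\cos\rho)}{\sin\Theta\sin\rho}, \tag{46}$$

$$\beta_1 = \frac{(1-\cos\rho)\left[1+\cos\rho-\cos\Theta(1+\cos\Theta)\right]}{(1-\cos\Theta)\sin\Theta\sin\rho}, \tag{47}$$

and

$$\beta_2 = \frac{(1-\cos\rho)\left[1+\cos\rho+\cos\Theta(1-\cos\Theta)\right]}{(1+\cos\Theta)\sin\Theta\sin\rho}. \tag{48}$$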

3.5 Relation between variance σ2, and the harmonic spectrum of Earth’s

structure

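Equations (16)-(33) show that $E[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_2)]$ is a function of the distance $\rho = |\mathbf{x}_2 - \mathbf{x}_1|$ between the two rays, assuming a parallel-ray approximation (i. e. the horizontal distance between two rays of the same ray bundle is approximately the same as the generic distance between the rays along their path).

Equations (34)-(48) show that, for $E[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_2)] = f(\rho)$, the double integral over surface is reduced to a single integral over distance by means of a weight function $\omega(\Theta,\rho)$, for both the flat and the spherical Earth. Replacing $f(\rho)$ with $E[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_2)]$ in equation (44),

$$\frac{1}{A^2}\int_A\int_A E\left[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_2)\right]d\mathbf{x}_1\,d\mathbf{x}_2 = \int_0^{2\Theta}\omega(\Theta,\rho)\,E\left[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_2)\right]d\rho. \tag{49}$$

After substituting the term $E[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_2)]$ with its explicit expression in (33), equation (49) becomes

$$\frac{1}{A^2}\int_A\int_A E\left[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_2)\right]d\mathbf{x}_1\,d\mathbf{x}_2 = \frac{1}{4v^4\sqrt{\pi\ln 2}}\int_0^{2\Theta}\omega(\Theta,\rho)\left\{\int_{\mathrm{ray}}x_{1/2}(r)\sum_{l}\left[(2l+1)\,Q_l(r)\,P_l(\cos\rho)\right]ds\right\}d\rho. \tag{50}$$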

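Let us now consider the first term at the right-hand side of equation (15). Using equation (33) with $\rho = 0$, we find

$$\begin{aligned}
E\left[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_1)\right] &= \lim_{\rho\to 0}E\left[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_2)\right] \\
&= \frac{1}{4v^4\sqrt{\pi\ln 2}}\int_{\mathrm{ray}}x_{1/2}(r)\sum_{l=0}^{L}\left[(2l+1)\,Q_l(r)\,P_l(1)\right]ds \\
&= \frac{1}{4v^4\sqrt{\pi\ln 2}}\int_{\mathrm{ray}}x_{1/2}(r)\sum_{l=0}^{L}\left[(2l+1)\,Q_l(r)\right]ds,
\end{aligned} \tag{51}$$

where we have used the fact that $P_l(1) = 1$ for every $l$.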


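Substituting (50) and (51) into (15),

$$\begin{aligned}
\sigma^2(\Theta,\Delta,Z) &= E\,E_C\left[(\delta t)^2\right] - E\left[E_C(\delta t)\right]^2 \\
&= \frac{1}{A^2}\int_A\int_A\lim_{\rho\to 0}E\left[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_2)\right]d\mathbf{x}_1\,d\mathbf{x}_2 - \frac{1}{A^2}\int_A\int_A E\left[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_2)\right]d\mathbf{x}_1\,d\mathbf{x}_2 \\
&= \frac{1}{4v^4\sqrt{\pi\ln 2}}\int_0^{2\Theta}\omega(\Theta,\rho)\int_{\mathrm{ray}}x_{1/2}(r)\sum_{l=0}^{L}(2l+1)\,Q_l(r)\left[1-P_l(\cos\rho)\right]ds\,d\rho,
\end{aligned} \tag{52}$$

where $\sigma^2(\Theta,\Delta,Z)$ can be evaluated numerically from the data, and at the right-hand side everything but $Q_l(r)$ is known. This equation constitutes the basis of the inverse problem solved by GDC90.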
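The degree dependence of the integrand in (52) enters through the factor (2l + 1)[1 − P_l(cos ρ)]. A minimal sketch of how this factor could be tabulated is given below; it uses Bonnet's recurrence for the Legendre polynomials, which is a standard choice and not necessarily the one adopted by GDC90. Function names are assumptions of the example.

```python
import math


def legendre(l_max, x):
    """Return [P_0(x), ..., P_lmax(x)] via Bonnet's recurrence
    (n + 1) P_{n+1} = (2n + 1) x P_n - n P_{n-1}."""
    values = [1.0]
    if l_max >= 1:
        values.append(x)
    for n in range(1, l_max):
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1])
                      / (n + 1))
    return values


def degree_factor(l_max, rho):
    """Factor (2l + 1) * (1 - P_l(cos rho)) of equation (52), l = 0..l_max."""
    return [(2 * l + 1) * (1.0 - p)
            for l, p in enumerate(legendre(l_max, math.cos(rho)))]


print(degree_factor(4, 0.1))
```

Two properties follow directly from P_0 = 1 and P_l(1) = 1: the degree-zero term never contributes to the variance in (52), and all terms vanish as ρ tends to zero. For small ρ the factor grows with l approximately as (2l + 1) l(l + 1)ρ²/4, so that small bins are comparatively more sensitive to high harmonic degrees.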

4 FORMULATION OF THE INVERSE PROBLEM IN A 2-D

DESCRIPTION OF SURFACE-WAVE PROPAGATION

4.1 Projection to two dimensions

In a JWKB ray-theory description of surface-wave propagation (e. g., Ekstrom et al. 1997),

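$$\delta t(\mathbf{x},\omega) = -\frac{1}{[v(\omega)]^2}\int_S K(\theta,\phi)\,\delta v(\theta,\phi,\omega)\,d\Omega, \tag{53}$$

where $\delta v(\theta,\phi,\omega)$ denotes lateral heterogeneities in surface-wave phase velocity at angular frequency $\omega$. $\delta v(\theta,\phi,\omega)$ can naturally be rewritten as a linear combination of real spherical harmonics

$$\delta v(\theta,\phi) = \sum_{l,m}A_{lm}Y_{lm}(\theta,\phi). \tag{54}$$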

Here and in the following we shall neglect the dependence of δv, v and δt on ω; in practice,

we shall always consider different frequencies separately.

Equations (4)-(15) remain valid in the same form as above, provided that the radial basis-

function index n, and the bin vertical extent Z, which are now meaningless, be removed. In

a spherical reference frame, equation (16) can be rewritten

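$$E\left[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_2)\right] = E\left[\frac{1}{v^4}\int_S\int_S K_1(\theta_1,\phi_1)\,K_2(\theta_2,\phi_2)\,\delta v(\theta_1,\phi_1)\,\delta v(\theta_2,\phi_2)\,d\Omega_1\,d\Omega_2\right]. \tag{55}$$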


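For surface waves, equations (17) and (18) can also be simplified, and

$$\begin{aligned}
E\left[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_2)\right] &= E\left[\frac{1}{v^4}\int_S\int_S K_1(\theta_1,\phi_1)\,K_2(\theta_2,\phi_2)\sum_{l,m}A_{lm}Y_{lm}(\theta_1,\phi_1)\sum_{p,q}A_{pq}Y_{pq}(\theta_2,\phi_2)\,d\Omega_1\,d\Omega_2\right] \\
&= \frac{1}{v^4}\int_S\int_S K_1(\theta_1,\phi_1)\,K_2(\theta_2,\phi_2)\sum_{l,m}\sum_{p,q}E\left[A_{lm}A_{pq}\right]Y_{lm}(\theta_1,\phi_1)\,Y_{pq}(\theta_2,\phi_2)\,d\Omega_1\,d\Omega_2.
\end{aligned} \tag{56}$$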

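Assuming, in analogy with the 3-D case, that coefficients at different degrees and orders be uncorrelated,

$$E\left[A_{lm}A_{pq}\right] = E\left[A_{lm}A_{pq}\right]\delta_{l,p}\,\delta_{m,q}, \tag{57}$$

and substituting equation (57) into (56),

$$E\left[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_2)\right] = \frac{1}{v^4}\int_S\int_S K_1(\theta_1,\phi_1)\,K_2(\theta_2,\phi_2)\sum_{l,m}E\left[A_{lm}A_{lm}\right]Y_{lm}(\theta_1,\phi_1)\,Y_{lm}(\theta_2,\phi_2)\,d\Omega_1\,d\Omega_2. \tag{58}$$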

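We then introduce the spectrum $Q_l = E[A_{lm}A_{lm}]$, which now does not depend on $r$, and equation (58) reduces to

$$E\left[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_2)\right] = \frac{1}{v^4}\int_S\int_S K_1(\theta_1,\phi_1)\,K_2(\theta_2,\phi_2)\sum_{l,m}Q_l\,Y_{lm}(\theta_1,\phi_1)\,Y_{lm}(\theta_2,\phi_2)\,d\Omega_1\,d\Omega_2. \tag{59}$$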

There is no need to introduce the vertical correlation function c as in the previous section,

since in the surface-wave case the spectrum does not depend on depth. Using (24), (25) and

(59), equation (26) becomes

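$$E\left[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_2)\right] = \frac{1}{4\pi v^4}\int_S\int_S K_1(\theta_1,\phi_1)\,K_2(\theta_2,\phi_2)\sum_{l=0}^{L}(2l+1)\,Q_l\,P_l(\cos\rho)\,d\Omega_1\,d\Omega_2 \tag{60}$$

where $\rho$ denotes, again, the angular distance between $\mathbf{x}_1$ and $\mathbf{x}_2$. In the ray-theory approximation,

$$E\left[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_2)\right] = \frac{1}{4\pi v^4}\int_{\mathrm{ray}_1}\int_{\mathrm{ray}_2}\sum_{l=0}^{L}(2l+1)\,Q_l\,P_l(\cos\rho)\,ds_1\,ds_2. \tag{61}$$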

If the rays propagating along the Earth's surface are parallel, then a distance $d$ between the two rays can be defined, and

$$\rho = \sqrt{d^2 + s_2^2}, \tag{62}$$

which implies

$$ds_2 = \frac{\rho\,d\rho}{\sqrt{\rho^2-d^2}} \tag{63}$$

(see Figure 3 for a visual explanation of $\rho$, $s_2$ and $d$), and equation (61) becomes

$$E\left[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_2)\right] = \frac{1}{4\pi v^4}\int_{\mathrm{ray}}\int_d^{\pi}\sum_{l=0}^{L}(2l+1)\,Q_l\,\frac{\rho\,P_l(\cos\rho)}{\sqrt{\rho^2-d^2}}\,ds\,d\rho. \tag{64}$$

We shall try to overcome this approximation in Appendix C.

Figure 3. Visual explanation of variables $s_2$, $\rho$ and $d$ in equation (62).

Integration over ρ in equation (64) is conducted between d and π because, in all the

applications that we are going to show, the data are taken only for the first orbit (that is,

the shorter of the two arcs on the great-circle path connecting source and receiver), and, as a

result, the largest possible value of ρ is π.

Since the quantity (2l + 1)Ql does not depend on ρ, it can be taken out of the integral

together with the summation over l, and equation (64) takes the form

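$$E\left[\delta t(\mathbf{x}_1)\,\delta t(\mathbf{x}_2)\right] = \frac{1}{4\pi v^4}\sum_{l=0}^{L}(2l+1)\,Q_l\int_{\mathrm{ray}}\int_d^{\pi}\frac{\rho}{\sqrt{\rho^2-d^2}}\,P_l(\cos\rho)\,ds\,d\rho. \tag{65}$$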

GDC90 simplified the 3-D counterpart of (65) (equation (27) here), by assuming that the integrand be approximately gaussian (see Section 3.3 above). In the simple, 2-D case, we are able to carry out analytically the integral at the right-hand side of (65), which in the following we shall denote

$$\gamma_l(d) = \int_d^{\pi}\frac{\rho}{\sqrt{\rho^2-d^2}}\,P_l(\cos\rho)\,d\rho. \tag{66}$$

The argument of the integral in (66) is singular at $\rho = d$. We therefore proceed with the analytical, rather than numerical, integration, whose details are shown in Appendix A.
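Although the integration of (66) is carried out analytically in Appendix A, a numerical cross-check is possible once the singularity is removed by a change of variable. The sketch below is such a cross-check and not the Appendix A procedure: it assumes the substitution u = √(ρ² − d²), for which ρ dρ/√(ρ² − d²) = du, so that γ_l(d) becomes the integral of P_l(cos √(u² + d²)) for u between 0 and √(π² − d²). Function names and the number of quadrature points are arbitrary.

```python
import math


def legendre_p(l, x):
    """Legendre polynomial P_l(x) from Bonnet's recurrence."""
    p_prev, p = 1.0, x
    if l == 0:
        return p_prev
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p


def gamma_l(l, d, n=20000):
    """Midpoint-rule estimate of equation (66) for 0 <= d < pi,
    after the substitution u = sqrt(rho^2 - d^2)."""
    u_max = math.sqrt(math.pi ** 2 - d ** 2)
    h = u_max / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        rho = math.sqrt(u * u + d * d)
        total += legendre_p(l, math.cos(rho))
    return total * h


print(gamma_l(0, 1.0), math.sqrt(math.pi ** 2 - 1.0))
print(gamma_l(2, 0.0), math.pi / 4.0)
```

For l = 0 the integral is elementary, γ_0(d) = √(π² − d²), and for d = 0 and l = 2 it equals π/4; both values provide reference points against which an analytical implementation can be compared.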


4.1.1 Simplified expression for E[δt(x1)δt(x2)] in the surface-wave case.

Using definition (66), equation (65) becomes

$$E\left[\delta t(x_1)\,\delta t(x_2)\right] = \frac{1}{4\pi v^4}\sum_{l=0}^{L}(2l+1)\,Q_l\int_{\mathrm{ray}}\gamma_l(d)\,ds. \tag{67}$$

Since rays within a bundle are assumed parallel, the inter-distance d can be replaced with the

distance ρ between the two receivers. With this assumption, equation (67) becomes

$$E\left[\delta t(x_1)\,\delta t(x_2)\right] = \frac{1}{4\pi v^4}\sum_{l=0}^{L}(2l+1)\,Q_l\int_{\mathrm{ray}}\gamma_l(\rho)\,ds. \tag{68}$$

4.2 Relation between variance σ2(Θ,∆) and the harmonic spectrum of

surface-wave phase velocity heterogeneity

As noted in Section 4.1, equation (15) is valid, in the same form, in both the body-wave and

surface-wave cases.

Making use of equations (65)-(66), the second term at the right-hand side of (15) can be

rewritten

$$E\left[E_C(\delta t)\right]^2 = \frac{1}{4\pi v^4 A^2}\sum_{l=0}^{L}(2l+1)\,Q_l\int_A\int_A\int_{\mathrm{ray}}\gamma_l(\rho)\,ds. \tag{69}$$

According to equation (14), the first term at the right-hand side of (15) coincides with

the limit of the second as ρ→ 0. Let us focus first on the integrand at the right-hand side of

(14):

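$$\begin{aligned}
E\left[\delta t(x_1)\,\delta t(x_1)\right] &= \lim_{\rho\to 0} E\left[\delta t(x_1)\,\delta t(x_2)\right] \\
&= \lim_{\rho\to 0}\frac{1}{4\pi v^4}\sum_{l=0}^{L}(2l+1)\,Q_l\int_{\mathrm{ray}}\gamma_l(\rho)\,ds \\
&= \frac{1}{4\pi v^4}\sum_{l=0}^{L}(2l+1)\,Q_l\int_{\mathrm{ray}}\lim_{\rho\to 0}\gamma_l(\rho)\,ds.
\end{aligned} \tag{70}$$

Details about the calculation of $\lim_{\rho\to 0}\gamma_l(\rho)$ are shown in Appendix A.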

Substituting (A.14) into (70), and the resulting expression, together with (69), into (15),

we are left with

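$$\sigma^2(\Theta,\Delta) = \frac{1}{4\pi v^4 A^2}\sum_{l=0}^{L}(2l+1)\,Q_l\int_A\int_A\int_{\mathrm{ray}}\left\{\lim_{\rho\to 0}\left[\gamma_l(\rho)\right] - \gamma_l(\rho)\right\}ds. \tag{71}$$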

The results of Section 3.4 apply, and the double integral over A can be rewritten as a

single integral over distance, by means of a weighting function ω(Θ, ρ), so that equation (71)


becomes

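$$\sigma^2(\Theta,\Delta) = \frac{1}{4\pi v^4}\sum_{l=0}^{L}(2l+1)\,Q_l\int_{0}^{2\Theta}\omega(\Theta,\rho)\int_{\mathrm{ray}}\left\{\lim_{\rho\to 0}\left[\gamma_l(\rho)\right] - \gamma_l(\rho)\right\}ds\,d\rho. \tag{72}$$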

Since γl(ρ) does not depend on s,

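$$\begin{aligned}
\sigma^2(\Theta,\Delta) &= \frac{1}{4\pi v^4}\sum_{l=0}^{L}(2l+1)\,Q_l\int_{0}^{2\Theta}\omega(\Theta,\rho)\left\{\lim_{\rho\to 0}\left[\gamma_l(\rho)\right] - \gamma_l(\rho)\right\}\left(\int_{\mathrm{ray}}ds\right)d\rho \\
&= \frac{\Delta}{4\pi v^4}\sum_{l=0}^{L}(2l+1)\,Q_l\int_{0}^{2\Theta}\omega(\Theta,\rho)\left\{\lim_{\rho\to 0}\left[\gamma_l(\rho)\right] - \gamma_l(\rho)\right\}d\rho,
\end{aligned} \tag{73}$$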

which constitutes the basis of the surface-wave inverse problem that we shall solve in the

following.

5 SOLUTION OF THE INVERSE PROBLEM

5.1 Discretization

We approximate the integral in equation (73) with a discrete summation, to find

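$$\sigma^2(\Theta,\Delta) = \frac{\Delta}{4\pi v^4}\sum_{l=0}^{L}(2l+1)\,Q_l\sum_{m=1}^{M}\omega(\Theta,\rho_m)\left\{\lim_{\rho\to 0}\left[\gamma_l(\rho)\right] - \gamma_l(\rho_m)\right\}\delta\rho, \tag{74}$$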

with

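$$\delta\rho = \frac{2\Theta}{M}, \qquad \rho_m = (m-1)\,\delta\rho + \frac{\delta\rho}{2} = \left(m - \frac{1}{2}\right)\delta\rho = \left(m - \frac{1}{2}\right)\frac{2\Theta}{M}. \tag{75}$$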
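The midpoint discretization of equations (74)-(75) can be sketched as follows. This is a minimal illustration under stated assumptions: the weighting function ω(Θ, ρ) of Section 3.4 and the bracketed difference lim γl − γl(ρm) are passed in as user-supplied callables (here called `omega` and `gamma_gap`, both hypothetical names), since their explicit forms are given elsewhere in the paper.

```python
def midpoint_grid(theta, m_count):
    """Return (delta_rho, [rho_1, ..., rho_M]) on [0, 2*theta], as in eq. (75)."""
    delta_rho = 2.0 * theta / m_count
    rho = [(m - 0.5) * delta_rho for m in range(1, m_count + 1)]
    return delta_rho, rho


def discrete_distance_sum(theta, m_count, omega, gamma_gap):
    """Inner sum of eq. (74): sum_m omega(theta, rho_m) * gamma_gap(rho_m) * delta_rho.

    omega(theta, rho) and gamma_gap(rho) are placeholders supplied by the caller.
    """
    delta_rho, rho = midpoint_grid(theta, m_count)
    return sum(omega(theta, r) * gamma_gap(r) for r in rho) * delta_rho
```

With ω = 1 and a linear `gamma_gap`, the midpoint rule is exact, which gives a quick sanity check: the sum over [0, 2Θ] of ρ δρ equals 2Θ².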

We next discretize values of cell size Θ and angular distance ∆, and equation (74) takes the

form

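$$\sigma^2(\Theta_i,\Delta_j) = \frac{\Delta_j}{4\pi v^4}\sum_{l=0}^{L}(2l+1)\,Q_l\sum_{m=1}^{M}\omega(\Theta_i,\rho_{m,i})\left\{\lim_{\rho\to 0}\left[\gamma_l(\rho)\right] - \gamma_l(\rho_{m,i})\right\}\delta\rho_i. \tag{76}$$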

A one-to-one correspondence can then be established between couples i, j and the values of a

single index n, and

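$$\sigma^2_n = \sum_{l,m}\frac{\Delta_n}{4\pi v^4}\,\omega(\Theta_n,\rho_{m,n})\left\{\lim_{\rho\to 0}\left[\gamma_l(\rho)\right] - \gamma_l(\rho_{m,n})\right\}\delta\rho_n\,(2l+1)\,Q_l, \tag{77}$$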

which, after denoting

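$$D_n = \sigma^2_n, \qquad F_{nl} = \sum_{m=1}^{M}\frac{\Delta_n}{4\pi v^4}\,(2l+1)\,\omega(\Theta_n,\rho_{m,n})\left\{\lim_{\rho\to 0}\left[\gamma_l(\rho)\right] - \gamma_l(\rho_{m,n})\right\}\delta\rho_n, \tag{78}$$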


takes the simpler form

$$D_n = \sum_{l=0}^{L} F_{nl}\,Q_l \tag{79}$$

or, in tensorial notation,

$$\mathbf{D} = \mathbf{F}\cdot\mathbf{Q}, \tag{80}$$

where the vector D and the matrix F are defined in equation (78) and Q is defined by (21).

After measuring the datum D, equation (79) can be solved in a least-squares sense to find the spectrum Ql. We shall only consider the values of $\sigma^2_n$ associated with more than 10 values of δt, that is, recalling equation (5), with nk > 10.

5.2 Least-squares solution and norm damping

In the following, we shall discretize Θ, ∆ so that the largest value of n, from now on denoted by N, exceeds L. Equation (80) is then an overdetermined problem which admits a least-squares solution, found by solving

$$\mathbf{F}^T\cdot\mathbf{D} = \mathbf{F}^T\cdot\mathbf{F}\cdot\mathbf{Q} \tag{81}$$

(e. g., Trefethen and Bau 1997), where $\mathbf{F}^T$ is the transpose of F.

Because seismic data are always polluted by measurement errors and their coverage is not

uniform, the problem is ill-conditioned, i. e. the solution is not reliable unless equation (81)

is regularized (e. g., Menke 1989).

As a regularization constraint, we impose that the “size” of the solution be minimum (Levenberg 1944; Marquardt 1963). The least-squares equation (81) becomes

$$\left(\mathbf{F}^T\cdot\mathbf{F} + \lambda^2\mathbf{I}\right)\cdot\mathbf{Q} = \mathbf{F}^T\cdot\mathbf{D} \tag{82}$$

(with λ a regularization or “damping” parameter to be selected), which in index notation gives

$$\sum_{l=0}^{L}\left[\sum_{n=1}^{N}F_{ni}F_{nl} + \lambda^2\delta_{il}\right]Q_l = \sum_{n=1}^{N}F_{ni}D_n. \tag{83}$$

We solve equation (82) by means of Cholesky factorization (e. g., Press et al. 2001) (from now on LS) and of a non-negative least-squares inversion (from now on NNLS) (Lawson and Hanson 1974). The solution Q of the problem is non-negative by definition, and NNLS guarantees that this constraint is satisfied, while Cholesky factorization has the advantage of being an exact method.
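The LS branch of this procedure can be illustrated with a short, dependency-free sketch that assembles the damped normal equations (82) and solves them by Cholesky factorization. The function names and the list-of-lists matrix layout are assumptions made for illustration only; the NNLS branch is not shown, and a non-negative solver such as SciPy's `scipy.optimize.nnls` would be one possible choice for it, which is likewise an assumption rather than the implementation used here.

```python
import math


def damped_normal_equations(F, D, lam):
    """Build A = F^T F + lam^2 I and b = F^T D, as in eq. (82)."""
    n_rows, n_cols = len(F), len(F[0])
    A = [[sum(F[n][i] * F[n][l] for n in range(n_rows)) for l in range(n_cols)]
         for i in range(n_cols)]
    for i in range(n_cols):
        A[i][i] += lam ** 2
    b = [sum(F[n][i] * D[n] for n in range(n_rows)) for i in range(n_cols)]
    return A, b


def cholesky_solve(A, b):
    """Solve A x = b for symmetric positive-definite A using A = C C^T."""
    size = len(A)
    C = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i + 1):
            s = A[i][j] - sum(C[i][k] * C[j][k] for k in range(j))
            C[i][j] = math.sqrt(s) if i == j else s / C[j][j]
    # Forward substitution: C y = b
    y = [0.0] * size
    for i in range(size):
        y[i] = (b[i] - sum(C[i][k] * y[k] for k in range(i))) / C[i][i]
    # Back substitution: C^T x = y
    x = [0.0] * size
    for i in reversed(range(size)):
        x[i] = (y[i] - sum(C[k][i] * x[k] for k in range(i + 1, size))) / C[i][i]
    return x


def damped_least_squares(F, D, lam):
    """Return the damped least-squares estimate of Q for a given lambda."""
    A, b = damped_normal_equations(F, D, lam)
    return cholesky_solve(A, b)
```

On a small consistent system the undamped call recovers the input exactly, while a non-zero λ shrinks the norm of the solution, which is the trade-off that the choice of λ has to balance.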

We follow the L-curve criterion (Hansen 1992) to select an adequate value of λ: after


defining the solution norm

$$\nu(\lambda) = \sqrt{\sum_{l=0}^{L}\left(Q_l^{(\lambda)}\right)^2} \tag{84}$$

(where the superscript λ identifies the solution found with the corresponding value of the

damping parameter), we divide it by the norm of the undamped solution ν(0), thus defining

the normalized norm

$$\nu(\lambda) = \frac{1}{\nu(0)}\sqrt{\sum_{l=0}^{L}\left(Q_l^{(\lambda)}\right)^2}. \tag{85}$$

For each λ, we also define the solution misfit

$$\zeta(\lambda) = \frac{\sum_{n=1}^{N}\left[(\mathbf{F}\cdot\mathbf{Q})_n - D_n\right]^2}{\sum_{n=1}^{N}D_n^2}, \tag{86}$$

and we can build the L-curve by plotting the pairs (ν(λ), ζ(λ)). Our preferred value of λ is the one corresponding to the maximum curvature of the L-curve (Hansen and O’Leary 1993).
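The two quantities plotted on the L-curve can be computed as in the following sketch, which evaluates the normalized norm of equation (85) and the misfit of equation (86) for a set of already computed solutions. The function names and the dictionary layout `{lambda: Q}` are illustrative assumptions; locating the point of maximum curvature is deliberately left out.

```python
import math


def solution_norm(Q):
    """Euclidean norm of a spectrum, eq. (84)."""
    return math.sqrt(sum(q * q for q in Q))


def relative_misfit(F, Q, D):
    """Misfit of eq. (86): squared residual of F.Q normalized by the squared data."""
    predicted = [sum(f * q for f, q in zip(row, Q)) for row in F]
    residual = sum((p - d) ** 2 for p, d in zip(predicted, D))
    return residual / sum(d * d for d in D)


def l_curve_points(F, D, solutions, undamped):
    """Return [(lambda, normalized norm, misfit)] sorted by lambda.

    solutions maps each damping value to its solution Q; undamped is the
    solution for lambda = 0, used for the normalization of eq. (85).
    """
    nu0 = solution_norm(undamped)
    return [(lam, solution_norm(Q) / nu0, relative_misfit(F, Q, D))
            for lam, Q in sorted(solutions.items())]
```

Each returned triple is one point of the curve; increasing λ moves the point towards smaller norm and larger misfit.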

6 ANALYSIS OF THE DATUM: AVERAGE TRAVEL-TIME VARIANCE

σ2(Θ,∆) WITHIN A RAY BUNDLE

Following GDC90, we use the function σ2(Θ,∆) to describe the statistics of travel-time data;

values of σ2(Θ,∆), computed for a discrete set of Θ,∆, constitute the datum that we linearly

invert to determine the spectrum of Earth heterogeneity. Before proceeding with the damped

least-squares solution of equation (80), we verify that a correspondence exists between the

properties of σ2(Θ,∆) and the spectrum Q, i. e. that the problem is indeed linear.

We shall analyse in detail the dependence of σ²(Θ,∆) on Θ,∆ for different travel-time delay data sets, both from synthetically generated databases (see Sections 6.1-6.4 for details) and from real measurements (Section 6.5), to understand how variations in Ql are mapped into σ²(Θ,∆).

We shall compare both the pattern and the actual values of different pairs of σ²(Θ,∆). We employ Pearson’s correlation coefficient r (e. g., Acton 1966) for the former task, while for the latter we define a “norm percentage difference” between two different variances $\sigma_1^2(\Theta_i,\Delta_i)$ and $\sigma_2^2(\Theta_i,\Delta_i)$,

$$\mu = \frac{\sum_i\left[\sigma_2^2(\Theta_i,\Delta_i) - \sigma_1^2(\Theta_i,\Delta_i)\right]^2}{\sum_i\left[\sigma_1^2(\Theta_i,\Delta_i)\right]^2}. \tag{87}$$

From the definition of µ, its values can vary between 0 (perfect equality between $\sigma_1^2$ and $\sigma_2^2$) and +∞ (each $\sigma_1^2 \ll \sigma_2^2$). If $\sigma_1^2 \gg \sigma_2^2$, it holds µ ≈ 1.
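The two similarity measures can be sketched in a few lines. The code below assumes that the two variances have already been sampled on the same list of (Θi, ∆i) pairs and stored as plain Python lists; the function names are illustrative.

```python
import math


def pearson_r(a, b):
    """Pearson's correlation coefficient between two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def norm_percentage_difference(sigma1, sigma2):
    """mu of eq. (87): squared difference normalized by the squared sigma1 values."""
    numerator = sum((s2 - s1) ** 2 for s1, s2 in zip(sigma1, sigma2))
    denominator = sum(s1 ** 2 for s1 in sigma1)
    return numerator / denominator
```

Doubling every value of a variance leaves r equal to 1 but gives µ = 1, which illustrates why the two measures are used together: r is sensitive to the pattern only, µ to the actual values.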

6.1 Comparing synthetic data from different models (i): same spectrum

We generate a synthetic set of surface-wave travel-time delays δt in a ray-theory approximation: surface waves are treated as propagating on a zero-thickness spherical membrane (e. g., Tanimoto et al. 1990) along the shortest great-circle arc connecting source and receiver (i. e. no ray tracing). The linear relationship (53) is assumed between the phase-velocity anomaly δv and δt, with K ≠ 0 only on the ray path. We calculate the delays δt associated with a surface-wave phase-velocity distribution characterized by a monochromatic spectrum and a realistic set of seismic sources and receivers (see Figure 4a for the geographical representation of the model and of sources and stations), forming a large (∼ 65,000 observations) synthetic database. We finally compute the corresponding σ²(Θ,∆) implementing equation (5).

We repeat this exercise for two models with the same spectrum $Q_l = \delta_{l,9}/(2l+1)$, but with different harmonic coefficients: $A_{lm} = \delta_{l,9}\cdot\delta_{m,5}$ for the first one (Figure 4a) and $A_{lm} = \delta_{l,9}\cdot\delta_{m,-5}$ for the second (Figure 4b).
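That the two coefficient sets share one spectrum can be checked with the sketch below. It assumes that the spectrum is the sum over m of the squared coefficients divided by 2l + 1, a relation inferred here from the example values quoted above (the defining equation (21) is not reproduced in this section), and it stores coefficients in a dictionary keyed by (l, m), which is purely an illustrative choice.

```python
def spectrum_from_coefficients(coeffs, l_max):
    """Return [Q_0, ..., Q_lmax] from a dict {(l, m): A_lm}.

    Assumption: Q_l = sum_m A_lm^2 / (2l + 1), consistent with the
    monochromatic example Q_l = delta_{l,9} / (2l + 1).
    """
    power = [0.0] * (l_max + 1)
    for (l, m), a in coeffs.items():
        power[l] += a * a
    return [p / (2 * l + 1) for l, p in enumerate(power)]
```

Under this assumption a single unit coefficient at (l, m) = (9, 5) and one at (9, −5) both give a spectrum that vanishes everywhere except at l = 9, where it equals 1/19.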

Figure 4 shows that σ²(Θ,∆) increases with ∆ and Θ, as is to be expected: σ²(Θ,∆) is a measure of the differences between rays in the same Θ, ∆ bin, and the larger Θ and ∆ are, the more rays sampling different structures are collected in the same bin.

To quantify the similarity between σ²(Θ,∆) in Figure 4a and Figure 4b, we plot in Figure 5, for each Θ and ∆, the value of σ² from Figure 4a versus the one from Figure 4b. We find a value of r = 0.91, indicating that velocity distributions with the same spectrum give rise to approximately equal σ²(Θ,∆), even if the geographical distribution of heterogeneity changes. We also evaluate the quantity µ defined in equation (87) and find a value of 0.39.

If the relationship between σ²(Θ,∆) and the spectrum were indeed linear, we would find r = 1 and µ = 0. The differences from these ideal values can be ascribed to the non-uniform sampling of the rays. In the following, we shall use the values obtained in this section as a reference to estimate the similarity or lack thereof between σ²(Θ,∆) as found from different databases.

6.2 Comparing synthetic data from different models (ii): different spectra

We repeat the procedure described in Section 6.1, now assuming a short-wavelength phase-

velocity distribution, with all power at degree 20, i. e. Al,m = α · δl,20 · δm,12 (Figure 6, top


Figure 4. a) Left, σ2(Θ,∆) from a synthetic data set of ∼ 65000 events, generated with a model

whose harmonic coefficients are Alm = δl,9 · δm,5; right, geographical representation of the input model

and of sources (red dots) and stations (green dots). b) Same as a), but the harmonic coefficients of the

model are Alm = δl,9 · δm,−5


Figure 5. Values of σ² from Figure 4b (horizontal axis) against those from Figure 4a (vertical axis) at the same pair Θ,∆ (grey dots); both axes range from 0 to 20000. The red line is the linear regression curve associated with the grey dots, with a correlation of 0.91.

right panel), and we compare the resulting σ2(Θ,∆) (Figure 6, left panel) with that of Figure

4b.

As in Section 6.1, we calculate r and µ from Figure 4b vs Figure 6. We obtain the values r = 0.867 and µ = 0.97. The correlation is slightly lower than our threshold value, indicating that the two variances can still be distinguished; since the values of $\sigma_1^2$ (associated with the

Figure 6. As in Figure 4a, but the input model has all the power at l = 20, that is Alm = 0.2·δl,20·δm,12.


Figure 7. a) Correlation between σ2(Θ,∆) associated with the 40 different monochromatic models of

Section 6.3: the central values of the compared spectra are on the horizontal and vertical coordinates,

the colour scale represents the correlation. b) µ for the same models: the colour scale saturates for

µ > 1.

model in Figure 4b) are much larger than those of $\sigma_2^2$ (associated with the model in Figure 6), we find µ ≈ 1 as expected from equation (87).

6.3 Systematic comparison of synthetic data from a family of models (i):

monochromatic spectra

We calculate σ2(Θ,∆) for 40 different models with a monochromatic spectrum, similar to those

in figures 4 and 6: the amplitude of the spectra is always the same, only the harmonic degree

where the spectrum is nonzero changes for each model, and all the possible monochromatic

spectra from l = 1 to l = 40 are considered.

Figure 7 shows r (panel a) and µ (panel b) for all possible pairs of our 40 monochromatic-spectrum models. The correlation of a given σ²(Θ,∆) with itself is always exactly 1, corresponding to the values on the diagonal of Figure 7a. Off-diagonal values are < 1 and decrease with increasing distance from the diagonal, i. e., for models associated with increasingly different spectra. From the contour plot in Figure 7a, we can notice that there is a large region (27% of the off-diagonal points) where the correlation is > 0.91, and, in comparison with the results of Section 6.1, the corresponding σ²(Θ,∆) can be considered correlated: for example, σ²(Θ,∆) associated with the model with all power at l = 25 has approximately the same pattern as the l = 20 one. The correlation for low-harmonic-degree models is generally < 0.91, so we can infer that the patterns of the associated variances are different.

µ of a model with respect to itself is 0, corresponding to the values on the diagonal of


Figure 8. Same as Figure 7b, but the RMS is set so that the peak-to-peak distance of the velocity map is the same for all the models (lowest value of the velocity: −0.2 km/s; largest value: 0.2 km/s).

Figure 7b, and quickly grows with increasing distance from the diagonal. 23% of the off-diagonal points are associated with a value of µ < 0.39, indicating that the variances that cannot be distinguished on the basis of r or µ are approximately the same. The parameter µ is a measure of the difference between the actual values assumed by two different variances: it may be objected that we compared spectra with the same amplitude, and that comparing the variances from two models with the same RMS is a limited test. In addition, monochromatic models with the same RMS but different central value of l are associated with a geographic map whose largest value increases with l, as can be seen from the definition of real spherical harmonics in equation (3).

We then repeat the same exercise for 40 models with variable spectral amplitude: this value is set so that the peak-to-peak distance of the checkerboard velocity maps is the same for all the models (0.4 km/s). We show the resulting plot of µ in Figure 8: 28% of the points are associated with a value µ < 0.39, a percentage similar to that of the previous test. We infer that σ²(Θ,∆) cannot discriminate between high-harmonic-degree models, but can be used to distinguish between two models if one of the two is dominated by low degrees. We shall see that this has important effects on the resolving power of the algorithm in recovering such models.


Figure 9. Left, σ²(Θ,∆) from a synthetic data set of ∼ 65000 events, generated with a model with a random spectrum tapered by an exponential curve of the form $e^{-l/2}$; top-right, geographical representation of the input model and of sources (red dots) and stations (green dots); bottom-right, the associated harmonic spectrum.

6.4 Systematic comparison of synthetic data from a family of models (ii):

realistic Earth models

We repeat the exercise of Section 6.3 for 100 different models with a relatively realistic spectrum (Figure 9). For each model, we generate 40 randomly distributed coefficients Ql. Each spectrum is then multiplied by a taper of the form $e^{-l/2}$, and normalized so that the largest value of Ql is always equal to 0.02. To generate the model coefficients Alm from the spectrum, we then proceed as described in Section 6.1. The general spectral properties of the resulting models resemble those observed for the Earth from seismology (e. g., Becker and Boschi 2002).
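The construction of one such spectrum can be sketched as follows. Several details are assumptions made only for this illustration, since the text does not specify them: the random coefficients are drawn uniformly in [0, 1), the 40 degrees are taken to be l = 1, ..., 40, the taper is read as exp(−l/2), and a seed is used for reproducibility.

```python
import math
import random


def random_tapered_spectrum(l_max=40, peak=0.02, seed=0):
    """Return [Q_1, ..., Q_lmax]: random values times exp(-l/2), rescaled to a given peak.

    Assumptions: uniform random draws in [0, 1) and degrees l = 1..l_max.
    """
    rng = random.Random(seed)
    raw = [rng.random() * math.exp(-0.5 * l) for l in range(1, l_max + 1)]
    scale = peak / max(raw)
    return [q * scale for q in raw]


# One hundred spectra, one per synthetic model (seeds chosen arbitrarily).
spectra = [random_tapered_spectrum(seed=k) for k in range(100)]
```

Because of the taper, most of the power of each spectrum sits at the lowest degrees, while the random factors make the 100 spectra differ in detail.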

The models have the same spectral properties, but only a small fraction (5%) of off-diagonal points in Figure 10a are associated with a correlation r > 0.85, and 99% of the points in Figure 10b have µ > 0.39. This indicates that most of the models employed here can be distinguished from each other on the grounds of the variance. This is an encouraging indication that the algorithm should satisfactorily recover models of this kind.


Figure 10. a) Correlation of σ2(Θ,∆) between couples taken from 100 different “random” models

with an exponential taper: on the horizontal and vertical axes the indexes of each model are indicated.

Notice that in this case the indexes have no correlation with model properties. b) µ from the same

models as in panel a; colour scale saturates for µ > 1.

6.5 σ2(Θ,∆) from different, real databases

Real data are affected by many deleterious effects such as, e. g., measurement errors or uneven data coverage. We therefore want to verify whether σ²(Θ,∆) is different for data sets associated with different periods and wave types. We generate σ²(Θ,∆) from two real surface-wave databases, i. e. ∼ 16,000 travel-time delays for Love waves at 150 s period from Trampert and Woodhouse (1996), and ∼ 65,000 travel-time delays for Rayleigh waves at 50 s period from Ekstrom et al. (1997). Figure 11 shows the result of this procedure.

We calculate both r and µ, using only the couples Θ, ∆ where variance is defined in both

cases. We find that r = 0.49 and µ = 2.81, which indicate, according to the criterion introduced

in Section 6.1, that the two variances are different. At least for this particular example,

involving profoundly different surface-wave modes, σ2(Θ,∆) contains enough information to

discriminate between the different real data sets.

7 RESOLUTION TESTS

We generate synthetic travel-time data as described in Section 6, constructing a suite of data

sets based on different models and different catalogues of sources and stations. With synthetics

resulting from this procedure, we build the input datum σ2(Θ,∆) based on equation (5), the

matrix F based on equation (78) and solve the inverse problem as described in Section 5.2.

We then compare the spectrum of the “input” model used to generate the synthetics with the


Figure 11. a) Left panel, σ²(Θ,∆) from a real database of travel-time delays of Love waves at 150 s from Trampert and Woodhouse (1996); top right panel, the phase velocity distribution as reconstructed from tomography; bottom right panel, the harmonic spectrum of the tomographic reconstruction. b) Same as in a, but for a database of Rayleigh waves at 50 s from Ekstrom et al. (1997).

one reconstructed by the inversion: their similarity is a measure of our algorithm’s resolving

power. No resolution test had been conducted by GDC90 and Davies et al. (1992).


Figure 12. a) Results of inversion of model in Figure 4a for data set A, whose sources and stations

geographical distribution is represented in Figure 4a; the spectrum of the input model is plotted in red,

the solution obtained with LS factorization in black, the one obtained with NNLS in blue. b) As panel

a, but for data set B, which is made of 3,000,000 couples of sources and stations uniformly distributed

on the Earth surface.

7.1 “Checkerboard” models (monochromatic spectra)

We first address the questions of (i) whether the algorithm is able to resolve a monochromatic

phase velocity distribution, and (ii) how it is affected by non-uniformity in data coverage.

We address (i) by generating synthetic data from the input model defined in Figure 4a, calculating ∼ 65,000 travel-time delays (data set A, see its geographic distribution in Figure 4a) and inverting them with our algorithm. The results are shown in Figure 12a: both LS and NNLS reconstruct the position of the main peak of the input spectrum but not its amplitude; they also fail to reconstruct the impulsive nature of the input peak, which is smeared in both solutions; the LS solution is also characterized by smooth oscillations, including some physically meaningless negative values.

To address (ii), we next generate 3,000,000 travel-time delays (data set B) associated with uniformly distributed sources and stations. The results of the subsequent inversion, shown in Figure 12b, are comparable with those of Figure 12a: we infer that even a tremendous growth in data coverage does not improve the performance of our algorithm.

We conclude that our algorithm effectively recovers the position of the peak of a monochro-

matic spectrum but not its amplitude nor its impulsive nature. In addition, a uniform data

coverage is not needed. This was expected: because the input datum σ2(Θ,∆) depends on

the epicentral distance ∆ but not on the source-station locations, coverage can be considered

good as long as a large portion of the possible range of epicentral distances is spanned. Since


Figure 13. a) Model derived from tomographic inversion of Love waves at 150 s taken from Trampert

and Woodhouse (1996), with sources represented by red squares and stations by green triangles. b)

Spectrum of the model in panel a (red curve) and reconstructed solution (blue curve).

from Figure 12 it can be seen that the LS and NNLS solutions are similar, in the next sections we shall show only the results of NNLS inversions.

7.2 Smooth spectra

We proceed as follows to generate a “realistic” synthetic σ²(Θ,∆). We perform a tomographic inversion of ∼ 16,000 travel-time delay observations for Love waves at 150 s taken from Trampert and Woodhouse (1996). We employ the algorithm described in Boschi and Dziewonski (1999), with the regularization requirement that the norm of the model be minimum; we choose the solution following the L-curve criterion (Hansen 1992), picking the solution corresponding to the maximum curvature of the L-curve (Hansen and O’Leary 1993). We then obtain a set of values of harmonic coefficients Alm up to the highest harmonic degree L = 40; the associated phase velocity distribution is plotted in Figure 13a, and is compatible with the bottom left panel of Figure 1 of Trampert and Woodhouse (1996).

We then use the obtained values of Alm to perform the procedure described in Section

6.1, first calculating a set of ∼ 16,000 travel-time delays (whose associated source-station

distribution is that of Figure 13a), and then deriving σ2(Θ,∆) for them.

We invert the generated data with our algorithm, and compare the resulting spectrum

with that obtained from tomography (Figure 13b). Our algorithm does not distinguish the

two peaks at l = 2 and l = 5, but the amplitudes now are better reconstructed and also

the position of the maximum is well recreated. We infer that our algorithm is not able to

reproduce unrealistically sharp spectra, but it may work with realistic smoother spectra.


Figure 14. a) As in Figure 13, but noise has been added to the data. b) Same as in panel a, but the

maximum harmonic degree resolved in the inversion is L = 40.

We repeat the exercise after adding noise to the synthetic data. To generate the noise, we select the standard deviation $s_{\delta\Phi} = 0.194$ for the phase delay δΦ of the first class of Love-wave data at 150 s in Ekstrom et al. (1997). We add gaussian noise by calculating a set of phase delay errors $\varepsilon_{\delta\Phi}$. To convert the phase delay errors to time delay errors $\varepsilon_{\delta t}$, we employ the formula

$$\varepsilon_{\delta t} = \frac{\varepsilon_{\delta\Phi}\cdot T}{2\pi}, \tag{88}$$

where T is the period of the wave. We finally add each value of $\varepsilon_{\delta t}$ to the set of δt formerly generated, build the input datum σ²(Θ,∆) from this new database and invert for the spectrum (Figure 14).
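The noise-generation step can be sketched as below. The code assumes that the phase-delay errors are zero-mean gaussian values expressed in radians, which is what the factor 2π in equation (88) suggests but is not stated explicitly; the function names and the seed argument are illustrative.

```python
import math
import random


def phase_to_time_error(eps_phase, period):
    """Convert a phase-delay error to a time-delay error, eq. (88)."""
    return eps_phase * period / (2.0 * math.pi)


def add_phase_noise(delays, period, phase_std, seed=0):
    """Add gaussian noise, drawn in phase and converted to time, to each delay.

    Assumption: zero-mean phase errors with standard deviation phase_std (radians).
    """
    rng = random.Random(seed)
    return [dt + phase_to_time_error(rng.gauss(0.0, phase_std), period)
            for dt in delays]


# Example with the values quoted in the text: T = 150 s, standard deviation 0.194.
noisy = add_phase_noise([10.0, -4.0, 2.5], period=150.0, phase_std=0.194)
```

Since the conversion is linear, the standard deviation of the time-delay noise is simply that of the phase noise multiplied by T/2π.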

Comparing Figures 14a and b with 13b, we notice that adding noise to the synthetic data

alters the behaviour of the solution around the maximum harmonic degree: a spurious peak

appears, more pronounced for the inversions with L = 40 (Figure 14a); increasing the largest

harmonic degree L in the inversion reduces the noise (Figure 14b). Since the noise that we

employed is uncorrelated with the model properties, it adds high harmonic degree oscillations

to the recovered models. Assuming that the applied procedure is realistic, we might expect

this effect also in the inversion of real data.

8 INVERSION OF REAL DATA

We apply the algorithm to real observations, and compare the results with those obtained from a tomographic procedure such as the one described in Section 7.2.

We show the results of the inversion of a data set of ∼ 65,000 travel-time delays for


Figure 15. a) Phase velocity map derived by tomography from Rayleigh waves observations at 50 s

taken from Ekstrom et al. (1997). Red squares represent sources and green triangles represent stations.

b) Spectrum of the tomographic model obtained in panel a and solution obtained with NNLS (blue

curve).

Rayleigh waves at 50 s (Figure 15) and for a catalogue of ∼ 37,000 recordings of Love waves at 100 s (Figure 16). Both databases are taken from Ekstrom et al. (1997). We obtain a map for Rayleigh waves at 50 s in agreement with plate 1.b of Ekstrom et al. (1997).

The NNLS inversions shown in Figures 15 and 16 are in qualitative agreement with the spectrum obtained by means of tomography: the region where the spectrum is nonzero is the same in both cases, but the reconstructed amplitudes are smaller than those obtained by tomography; we also see the same spurious peak that we found in Section 7.2 and attributed to noise.

We encounter the same features in the inversion of other travel-time delays taken from

Ekstrom et al. (1997).

Figure 16. Same as Figure 15, but for 100 s Love-wave data from Ekstrom et al. (1997).


9 DISCUSSION AND CONCLUSIONS

We have developed an algorithm for the linearized inversion of the local variance of global

seismic data, to identify the spherical harmonic spectrum of planetary structure. Direct in-

versions for spectrum (i. e. the level of complexity of planetary structure), rather than 3-D

structure, involve an order-of-magnitude reduction in the size of the solution space: they are

therefore convenient for their limited cost and the possibility, in principle, of turning a mixed-

or under-determined problem into an over-determined one. Our method should in theory be

effective even in the presence of a very limited number of stations. So far as the Earth is con-

cerned, robust measurements of its spectrum would constitute an important new constraint

to understand its dynamics.

Our algorithm is based on the linearized, ray-theory-based approach of GDC90. We con-

ducted an independent theoretical derivation, showing in detail the approximations that the

linearized approach involves, and their implications on the results. We provided a quantitative

assessment of the method’s stability and resolution, which was missing from GDC90 and the

subsequent study of Davies et al. (1992). We also evaluated, independent of the inversion

algorithm, whether measures of seismic data variance are a priori sufficient to constrain the

planet’s complexity. Unlike GDC90, we focused on upper-mantle structure and surface-wave

data: a simpler, and better understood problem than that of whole-mantle imaging.

In the presence of a station-earthquake coverage similar to that found on Earth, we found

that the resolution of our new method is lower than that achieved by tomography (i. e.,

mapping 3-D structure by models): compare, e. g., Figure 13 here with plates 2 and 3 of Boschi

and Dziewonski (1999). In general, our synthetic inversions identify the harmonic degrees of

highest spectral energy, but tend to underestimate spectral power, and cannot distinguish

neighbouring spectral peaks. A comparison of GDC90’s results (see in particular their Figure

17) and typical global tomography spectra (Becker and Boschi 2002, for a review) confirms

this conclusion. Particularly for the upper mantle, the robustness of the Earth’s spectrum up

to degree ca. 12 is proved by, e. g., Carannante and Boschi (2005), who found highly correlated

results from completely independent databases. The lack of resolution of our algorithm may

be ascribed (i) to the definition of variance σ2 or (ii) to the approximations required to obtain

the linear relationship (79) between σ2 and the Earth’s spectrum.

To address issue (i) we have studied the sensitivity of σ2 to the Earth’s spectrum, monitor-

ing the changes in σ2 (expressed as a function of epicentral distance and the geographic extent

of the source-station bins over which variance is calculated) caused by changes in the planet’s

complexity. We found that measuring σ2 is insufficient to satisfactorily discriminate between


different monochromatic spectra, unless spectral power is concentrated at relatively low har-

monic degrees (figures 7 and 8). Earth models characterized by smoother, non-monochromatic

spectra, can also be distinguished from one another based on the associated σ² (Figures 10 and 11). In future work we shall investigate whether changes in the definition of σ², or more generally of the statistics of seismic observations that we use to constrain the planetary spectrum, could help enhance the method’s resolution.

We addressed issue (ii) by studying the performance of the algorithm when applied to synthetically generated data. No synthetic test had been conducted by GDC90 or Davies et al. (1992). The results of Section 7 confirm that the algorithm does not have enough resolving power to distinguish sharp spectra (Figure 12); it can instead reconstruct smoother and more realistic models from both noisy (Figure 14) and non-noisy synthetic data (Figure 13). The resolving

power is lower than that of classical tomography (e. g., Figure 13): we suggest that this could

be a consequence of the parallel-ray approximation introduced in Section 4.1 in analogy with

GDC90. This approximation is critical for ray bundles associated to large values of Θ (see

Appendix C for an attempt to overcome this problem). It is necessary in order for the inverse

problem to be linearized, but could be abandoned by tackling the inverse problem in a fully

nonlinear way: this is made possible by the relatively small size of the solution space, and will

be, again, the subject of our future work.

In conclusion, our algorithm in its current form may be used effectively to directly invert for the main features of planetary spectra in a cheaper way than tomography. The considerable reduction in the size of the solution space makes the algorithm useful in situations of poor data coverage. Future applications of this or similar methods in planetary seismology (e. g., Lognonne 2008; Wieczorek 2009), where coverage is particularly poor, can be envisaged. In the near future, we plan to focus our research on methodological improvements that are likely to enhance spectral resolution: replacement of σ² with a better statistic of global seismic data, and development of a fully nonlinear inversion algorithm.

REFERENCES

Acton, F. S., 1966, Analysis of straight-line data, Dover Publications, New York.

Becker, T. W. and Boschi, L., 2002, A comparison of tomographic and geodynamic mantle models,

Geochem. Geophys. Geosyst. 3, 1003, doi:10.1029/2001GC000168

Boschi, L. and Dziewonski A. M., 1999, High and low resolution images of the Earth’s mantle -

Implications of different approaches to tomographic modeling, J. Geophys. Res., 104 (B11), 25,567-

25,594, doi:10.1029/1999JB900166.


Boschi, L., Becker, T. W. and Steinberger, B., 2007, Mantle plumes: dynamic models and seismic images, Geochem. Geophys. Geosyst., 8, Q10006, doi:10.1029/2007GC001733.

Boschi, L., Becker, T. W. and Steinberger, B., 2008, On the statistical significance of correlations

between synthetic mantle plumes and tomographic models, Phys. Earth Planet. Int., 167, 230-238,

doi:10.1016/j.pepi.2008.03.009.

Buffett, B., Gable, C. W., and O’Connell, R. J., 1994, Marginal stability of a layered fluid with mobile surface plates, J. Geophys. Res., 99 (B10), 19,885-19,900, doi:10.1029/94JB01556.

Bunge, H.-P. and Grand, S. P., 2000, Mesozoic plate-motion history below the northeast Pacific Ocean

from seismic images of the subducted Farallon slab, Nature, 405, 337-340, doi:10.1038/35012586.

Bunge, H.-P., Hagelberg, C. R. and Travis B. J., 2003, Mantle circulation models with variational data assimilation: inferring past mantle flow and structure from plate motion histories and seismic tomography, Geophys. J. Int., 152, 280-301, doi: 10.1046/j.1365-246X.2003.01823.x.

Bunge, H.-P., Richards M. A., Lithgow-Bertelloni C., Baumgardner, J. R., Grand, S. P., and Ro-

manowicz, B., 1998, Time scales and heterogeneous structure in geodynamic earth models, Science,

280, 91-95, doi: 10.1126/science.280.5360.91.

Bunge, H.-P., Richards, M. A. and Baumgardner, J. R., 1996, Effect of depth-dependent viscosity on

the planform of mantle convection, Nature, 379, 436-438, doi:10.1038/379436a0.

Carannante, S. and Boschi, L., 2005, Databases of surface wave dispersion, Annals of Geophysics, 48, 945-955.

Cheney, W., 2001, Analysis for applied mathematics, Springer, New York.

Chernov, L.A., 1960, Wave propagation in a random medium, McGraw-Hill, New York.

Conrad, C. P. and Gurnis, M., 2003, Seismic tomography, surface uplift, and the breakup of Gond-

wanaland: integrating mantle convection backwards in time, Geochem., Geophys., Geosyst., 4, 1031,

doi: 10.1029/2001GC000299.

Dahlen, F. A., and Tromp, J., 1998, Theoretical Global Seismology, Princeton University Press, Princeton.

Davies, J. H., Gudmundsson, O. and Clayton, R. W., 1992, Spectra of mantle shear wave velocity

structure, Geophys. J. Int., 108, 865-882, doi: 10.1111/j.1365-246X.1992.tb03476.x.

Ekström, G., Tromp, J. and Larson, E. W. F., 1997, Measurements and global models of surface wave

propagation, J. Geophys. Res., 102(B4), 8137-8157, doi: 10.1029/96JB03729.

Foley, B. and Becker, T. W., 2009, Generation of plate-like behavior and mantle heterogene-

ity from a spherical, visco-plastic convection model, Geochem., Geophys., Geosyst., 10, doi:

10.1029/2009GC002378.

Gudmundsson, O., Davies, J. H. and Clayton, R. W., 1990, Stochastic analysis of global traveltime

data: mantle heterogeneity and random errors in the ISC data, Geophys. J. Int., 102, 25-43, doi:

10.1111/j.1365-246X.1990.tb00528.x.

Hansen, P. C., 1992, Analysis of Discrete Ill-Posed Problems by Means of the L-Curve, SIAM Review,

34, 561-580, doi: 10.1137/1034115.

Hansen, P. C. and O’Leary, D. P., 1993, The use of the L-curve in the regularization of discrete

ill-posed problems, SIAM J. Sci. Comput., 14, 1487-1503, doi: 10.1137/0914086.

Hildebrand, F. B., 1956, Introduction to Numerical Analysis, McGraw-Hill, New York.

Lawson, C. L. and Hanson, R. J., 1974, Solving Least Squares Problems, Prentice-Hall (Englewood

Cliffs, NJ).

Levenberg, K., 1944, A method for the solution of certain non-linear problems in least squares, Quart.

Appl. Math., 2, 164-168.

Lithgow-Bertelloni, C. and Richards, M. A., 1998, The dynamics of Cenozoic and Mesozoic plate

motions, Rev. Geophys., 36, 27-78, doi: 10.1029/97RG02282.

Lognonné, P., 2008, Seismology on Mars with the SEIS/HUMBOLDT experiment on the ExoMars

Mission, 37th COSPAR Scientific Assembly p. 1822.

Marquardt, D., 1963, An algorithm for least-squares estimation of nonlinear parameters, SIAM Journal

on Applied Mathematics, 11, 431-441, doi: 10.1137/0111030.

McNamara, A. K. and Zhong, S., 2005, Thermochemical piles under Africa and the Pacific, Nature,

437, 1136-1139, doi: 10.1038/nature04066.

Menke, W., 1989, Geophysical Data Analysis: Discrete Inverse Theory, rev. ed., Academic, San Diego,

California.

Megnin, C., Bunge, H.-P., Romanowicz, B. and Richards, M. A., 1997, Imaging 3-D spherical convection

models: what can seismic tomography tell us about mantle dynamics?, Geophys. Res. Lett., 24,

1299-1302, doi: 10.1029/97GL01256.

Nakagawa, T., Tackley, P. J., Deschamps, F. and Connolly, J. A. D., 2009, Incorporating

self-consistently calculated mineral physics into thermo-chemical mantle convection simulations in a

3D spherical shell and its influence on seismic anomalies in Earth’s mantle, Geochem., Geophys.,

Geosyst., 10, Q03004, doi: 10.1029/2008GC002280.

Nowack, R. L. and Lyslo, J.A., 1989, Frechet derivatives for curved interfaces in the ray approximation,

Geophys. J. Int., 97, 497-509, doi: 10.1111/j.1365-246X.1989.tb00519.x.

Press, W. H., Teukolsky, S. A., Vetterling, W. T., and Flannery, B. P., 2001, Numerical Recipes in

Fortran 77: The Art of Scientific Computing, Cambridge University Press, Cambridge, 2nd edition.

Steinberger, B., 2000, Slabs in the lower mantle - results of dynamic modelling compared with

tomographic images and the geoid, Phys. Earth Planet. Int., 118, 241-257, doi: 10.1016/S0031-

9201(99)00172-7.

Tackley, P. J., 2002, Strong heterogeneity caused by deep mantle layering, Geochemistry, Geophysics,

Geosystems, 3, doi: 10.1029/2001GC000167.

Tackley, P. J., 1996, On the ability of phase transitions and viscosity layering to induce long-

wavelength heterogeneity in the mantle, Geophys. Res. Lett., 23, 1985-1988, doi:10.1029/96GL01980.

Tackley, P. J., Nakagawa, T. and Hernlund, J. W., 2007, Influence of the post-perovskite transition

on thermal and thermo-chemical mantle convection, in Hirose, K., Brodholt, J., Lay, T., and Yuen,

D. A., editors, The Last Phase Transition, AGU Geophysical Monograph. AGU, Washington, D.C.

Tan, E., Gurnis, M. and Han, L., 2002, Slabs in the lower mantle and their modulation of plume

formation, Geochem., Geophys., Geosyst., 3, doi: 10.1029/2001GC000238.

Tanimoto, T., 1990, Modelling curved surface wave paths: membrane surface waves synthetics, Geo-

phys. J. Int., 102, 89-100, doi: 10.1111/j.1365-246X.1990.tb00532.x.

Thomas, G. B. Jr. and Finney, R. L., 1996, Calculus and Analytic Geometry, 8th edition, Reading,

MA: Addison-Wesley, p. 919.

Trefethen, L. N., and Bau, D. III, 1997, Numerical Linear Algebra, SIAM, Philadelphia, Pennsylvania.

Tromp, J., Tape, C. and Liu, Q., 2005, Seismic tomography, adjoint methods, time reversal and

banana-doughnut kernels, Geophys. J. Int., 160, 195-216, doi: 10.1111/j.1365-246X.2004.02453.x.

Trampert, J. and Woodhouse, J. H., 1996, High resolution global phase velocity distributions,

Geophys. Res. Lett., 23(1), 21-24, doi:10.1029/95GL03391.

Van Heck, H. J. and Tackley, P. J., 2008, Planforms of self-consistently generated plates in 3D spherical

geometry, Geophys. Res. Lett., 35.

Weisstein, Eric W., 2009, Double Factorial. From MathWorld–A Wolfram Web Resource.

http://mathworld.wolfram.com/DoubleFactorial.html

Wieczorek, M., 2009, The interior structure of the Moon: what does geophysics have to say?, Elements,

5(1), 35-40, doi: 10.2113/gselements.5.1.35.

Whittaker, E. T. and Watson, G. N., 1927, A Course of Modern Analysis, 4th edition, Cambridge

University Press, Cambridge.

Yoshida, M., 2008, Mantle convection with longest wavelength thermal heterogeneity in a 3-D spher-

ical model: Degree one or two?, Geophys. Res. Lett., 35, L23302, doi:10.1029/2008GL036059.

Zhong, S., Zhang, N., Li, Z.-X. and Roberts, J. H., 2007, Supercontinent cycles, true polar

wander, and very long wavelength mantle convection, Earth Planet. Sci. Lett., 261, 551-564,

doi: 10.1016/j.epsl.2007.07.049.

Zhou, Y., Dahlen, F. A. and Nolet, G., 2004, Three-dimensional sensitivity kernels for surface waves

observables, Geophys. J. Int., 158, 142-168, doi:10.1111/j.1365-246X.2004.02324.x.


APPENDIX A: ANALYTICAL INTEGRATION OF γl(d)

We have seen in Section 4.1 that an important part of the algorithm is the calculation of γl(d).

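If $l = 0$ then equation (66) becomes

$$
\begin{aligned}
\gamma_0(d) &= \int_d^{\pi} \frac{\rho}{\sqrt{\rho^2 - d^2}}\, P_0(\cos\rho)\, \mathrm{d}\rho
= \int_d^{\pi} \frac{\rho}{\sqrt{\rho^2 - d^2}}\, \mathrm{d}\rho \\
&= \left[ \sqrt{\rho^2 - d^2} \right]_{\rho=d}^{\rho=\pi}
= \sqrt{\pi^2 - d^2}.
\end{aligned}
\tag{A.1}
$$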

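If $l \neq 0$, after integration by parts of equation (66),

$$
\begin{aligned}
\gamma_l(d) &= \left[ \sqrt{\rho^2 - d^2}\, P_l(\cos\rho) \right]_{\rho=d}^{\rho=\pi}
- \int_d^{\pi} \sqrt{\rho^2 - d^2}\, \frac{\mathrm{d}P_l(\cos\rho)}{\mathrm{d}\rho}\, \mathrm{d}\rho \\
&= (-1)^l \sqrt{\pi^2 - d^2}
- \int_d^{\pi} \sqrt{\rho^2 - d^2}\, \frac{\mathrm{d}P_l(\cos\rho)}{\mathrm{d}\rho}\, \mathrm{d}\rho.
\end{aligned}
\tag{A.2}
$$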

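The integral in (A.2) can be calculated with the approximate formula

$$
\int_d^{\pi} \sqrt{\rho^2 - d^2}\, \frac{\mathrm{d}P_l(\cos\rho)}{\mathrm{d}\rho}\, \mathrm{d}\rho
\simeq \sum_{w=1}^{W} \sqrt{\rho_w^2 - d^2}\,
\left. \frac{\mathrm{d}P_l(\cos\rho)}{\mathrm{d}\rho} \right|_{\rho=\rho_w} \delta\rho,
\tag{A.3}
$$

where

$$
\delta\rho = \frac{\pi - d}{W}, \qquad
\rho_w = d + (w-1)\,\delta\rho + \frac{\delta\rho}{2}
= d + \left( w - \frac{1}{2} \right) \delta\rho
= d + \left( w - \frac{1}{2} \right) \frac{\pi - d}{W}.
\tag{A.4}
$$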

Equation (A.3) is a good approximation of (A.2) if $W$ is large enough. The oscillations of $\mathrm{d}P_l(\cos\rho)/\mathrm{d}\rho$ increase with $l$, so $W$ must also increase with $l$ if we want equation (A.3) to be a good approximation of (A.2).
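The midpoint sum (A.3)-(A.4) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the paper: the function names `legendre_derivative` and `gamma_midpoint`, the default `W = 2000`, and the choice of a three-term recurrence to evaluate the Legendre derivative are all hypothetical.

```python
import math

def legendre_derivative(l, x):
    """Return dP_l/dx at x, using P'_{n+1} = P'_{n-1} + (2n+1) P_n."""
    if l == 0:
        return 0.0
    p_prev, p = 1.0, x        # P_0, P_1
    dp_prev, dp = 0.0, 1.0    # P_0', P_1'
    for n in range(1, l):
        p_next = ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
        dp_next = dp_prev + (2 * n + 1) * p
        p_prev, p = p, p_next
        dp_prev, dp = dp, dp_next
    return dp

def gamma_midpoint(l, d, W=2000):
    """Approximate gamma_l(d) from (A.2), with the integral replaced by (A.3)."""
    boundary = (-1) ** l * math.sqrt(math.pi ** 2 - d ** 2)
    if l == 0:
        return boundary                       # equation (A.1)
    delta = (math.pi - d) / W                 # step, equation (A.4)
    total = 0.0
    for w in range(1, W + 1):
        rho = d + (w - 0.5) * delta           # midpoint rho_w, equation (A.4)
        # chain rule: dP_l(cos rho)/d rho = -sin(rho) * P_l'(cos rho)
        dP_drho = -math.sin(rho) * legendre_derivative(l, math.cos(rho))
        total += math.sqrt(rho ** 2 - d ** 2) * dP_drho * delta
    return boundary - total
```

For d = 0 the result can be checked by hand, since the integrand of (66) then reduces to the Legendre polynomial itself: the l = 1 value vanishes and the l = 2 value equals π/4. Increasing `W` with `l`, as recommended above, is left to the caller.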

From Whittaker and Watson (1927),

$$
P_l(\cos\rho) = \sum_{k=0}^{l}
\frac{(2k-1)!!}{(2k)!!}\,
\frac{[2(l-k)-1]!!}{[2(l-k)]!!}\,
\cos[(l-2k)\rho].
\tag{A.5}
$$

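Using the formula

$$
n!! = 2^{[1 + 2n - (-1)^n]/4}\; \pi^{[(-1)^n - 1]/4}\; \Gamma\!\left( 1 + \frac{n}{2} \right)
\tag{A.6}
$$

(Weisstein 2009), we find

$$
\begin{aligned}
(2k-1)!! &= \frac{2^k\, \Gamma(k + \frac{1}{2})}{\sqrt{\pi}}, \\
(2k)!! &= 2^k\, \Gamma(1 + k), \\
[2(l-k)-1]!! &= \frac{2^{(l-k)}\, \Gamma(l - k + \frac{1}{2})}{\sqrt{\pi}}, \\
[2(l-k)]!! &= 2^{(l-k)}\, \Gamma(l - k + 1),
\end{aligned}
\tag{A.7}
$$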

where $\Gamma(z)$ denotes Euler’s gamma function (Whittaker and Watson 1927)

$$
\Gamma(z) = \int_0^{+\infty} e^{-t}\, t^{z-1}\, \mathrm{d}t.
\tag{A.8}
$$
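As a quick sanity check, the gamma-function form of the double factorial in (A.6) and the identities (A.7) can be compared with a direct product. The sketch below is illustrative only; the function names are hypothetical, and the convention (-1)!! = 0!! = 1 is assumed so that the k = 0 term of (A.5) is well defined.

```python
import math

def double_factorial(n):
    """n!! by direct product, with the convention (-1)!! = 0!! = 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result

def double_factorial_gamma(n):
    """n!! from the gamma-function expression (A.6)."""
    sign = -1 if n % 2 else 1                 # (-1)^n
    return (2.0 ** ((1 + 2 * n - sign) / 4)
            * math.pi ** ((sign - 1) / 4)
            * math.gamma(1 + n / 2))

def odd_double_factorial(k):
    """(2k-1)!! from the first identity in (A.7)."""
    return 2.0 ** k * math.gamma(k + 0.5) / math.sqrt(math.pi)

def even_double_factorial(k):
    """(2k)!! from the second identity in (A.7)."""
    return 2.0 ** k * math.gamma(1 + k)
```

The remaining two identities in (A.7) follow from the first two by replacing k with l - k.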

Substituting (A.7) into (A.5), the latter equation reduces to

$$
P_l(\cos\rho) = \frac{1}{\pi} \sum_{k=0}^{l}
\frac{\Gamma(k + \frac{1}{2})}{\Gamma(k + 1)}\,
\frac{\Gamma(l - k + \frac{1}{2})}{\Gamma(l - k + 1)}\,
\cos[(l-2k)\rho].
\tag{A.9}
$$
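The cosine series (A.9) is easy to cross-check against the standard three-term recurrence for Legendre polynomials. The following sketch does so with the Python standard library only; the function names are hypothetical, and the use of `math.lgamma` (rather than `math.gamma`) to form the gamma ratios is an assumption made here to avoid overflow at large l.

```python
import math

def legendre_recurrence(l, x):
    """P_l(x) from (n+1) P_{n+1} = (2n+1) x P_n - n P_{n-1}."""
    if l == 0:
        return 1.0
    p_prev, p = 1.0, x
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p

def gamma_ratio(k):
    """Gamma(k + 1/2) / Gamma(k + 1), formed through log-gamma."""
    return math.exp(math.lgamma(k + 0.5) - math.lgamma(k + 1.0))

def legendre_cosine_series(l, rho):
    """P_l(cos rho) from the gamma-function cosine series (A.9)."""
    total = sum(gamma_ratio(k) * gamma_ratio(l - k) * math.cos((l - 2 * k) * rho)
                for k in range(l + 1))
    return total / math.pi
```

For l = 1 the two terms of the series each contribute (π/2) cos ρ, so the sum divided by π recovers cos ρ, as expected.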

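After differentiating equation (A.9) with respect to $\rho$ and substituting the result into (A.2), we obtain

$$
\gamma_l(d) = (-1)^l \sqrt{\pi^2 - d^2}
+ \frac{1}{\pi} \sum_{k=0}^{l} (l-2k)\,
\frac{\Gamma(k + \frac{1}{2})}{\Gamma(k + 1)}\,
\frac{\Gamma(l - k + \frac{1}{2})}{\Gamma(l - k + 1)}
\int_d^{\pi} \sqrt{\rho^2 - d^2}\, \sin[(l-2k)\rho]\, \mathrm{d}\rho.
\tag{A.10}
$$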

In the following, we shall compact the notation by defining

$$
h_{l-2k}(d) \overset{\mathrm{def}}{=} \int_d^{\pi} \sqrt{\rho^2 - d^2}\, \sin[(l-2k)\rho]\, \mathrm{d}\rho.
\tag{A.11}
$$

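Combining equations (A.1) and (A.10) we find the following expression for the integral (66):

$$
\gamma_l(d) =
\begin{cases}
\sqrt{\pi^2 - d^2} & \text{if } l = 0, \\[6pt]
(-1)^l \sqrt{\pi^2 - d^2} + \dfrac{1}{\pi} \displaystyle\sum_{k=0}^{l} (l-2k)\,
\dfrac{\Gamma(k + \frac{1}{2})}{\Gamma(k + 1)}\,
\dfrac{\Gamma(l - k + \frac{1}{2})}{\Gamma(l - k + 1)}\, h_{l-2k}(d) & \text{if } l \neq 0.
\end{cases}
\tag{A.12}
$$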

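Finally, equation (71) requires the calculation of $\lim_{\rho\to 0} \gamma_l(\rho)$. From the definition of $\gamma_l(\rho)$,

$$
\lim_{\rho\to 0} \gamma_l(\rho)
= \lim_{\rho\to 0} \left[ \int_{\rho}^{\pi} \frac{\tau}{\sqrt{\tau^2 - \rho^2}}\, P_l(\cos\tau)\, \mathrm{d}\tau \right]
= \int_0^{\pi} P_l(\cos\tau)\, \mathrm{d}\tau.
\tag{A.13}
$$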


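Using the expression (A.9) for $P_l(\cos\rho)$, this reduces to

$$
\begin{aligned}
\lim_{\rho\to 0} \gamma_l(\rho)
&= \frac{1}{\pi} \sum_{k=0}^{l}
\frac{\Gamma(k + \frac{1}{2})}{\Gamma(k + 1)}\,
\frac{\Gamma(l - k + \frac{1}{2})}{\Gamma(l - k + 1)}
\int_0^{\pi} \cos[(l-2k)\tau]\, \mathrm{d}\tau \\
&= \sum_{k=0}^{l}
\frac{\Gamma(k + \frac{1}{2})}{\Gamma(k + 1)}\,
\frac{\Gamma(l - k + \frac{1}{2})}{\Gamma(l - k + 1)}\, \delta_{l,2k} \\
&=
\begin{cases}
\left[ \dfrac{\Gamma\!\left( \frac{l+1}{2} \right)}{\Gamma\!\left( \frac{l+2}{2} \right)} \right]^2 & l \text{ even}, \\[10pt]
0 & l \text{ odd}.
\end{cases}
\end{aligned}
\tag{A.14}
$$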
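The closed form (A.14) can be checked against a direct quadrature of the integral on the right-hand side of (A.13). The sketch below is a minimal illustration with hypothetical function names; the midpoint rule and the number of nodes `N = 2000` are choices made here, not taken from the text.

```python
import math

def legendre(l, x):
    """P_l(x) from the standard three-term recurrence."""
    if l == 0:
        return 1.0
    p_prev, p = 1.0, x
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p

def limit_closed_form(l):
    """Limit of gamma_l(rho) for rho -> 0, from the closed form (A.14)."""
    if l % 2:
        return 0.0
    return (math.gamma((l + 1) / 2) / math.gamma((l + 2) / 2)) ** 2

def limit_by_quadrature(l, N=2000):
    """Midpoint-rule estimate of the integral of P_l(cos tau) over [0, pi]."""
    h = math.pi / N
    return h * sum(legendre(l, math.cos((j - 0.5) * h)) for j in range(1, N + 1))
```

For l = 0 both functions give π, and for l = 2 both give π/4, which can be confirmed by integrating (3 cos²τ - 1)/2 by hand.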

APPENDIX B: NUMERICAL IMPLEMENTATION - VALIDATION OF OUR

ANALYTICAL EXPRESSION FOR γl

Before the inverse problem (83) is solved, we must calculate the numerical values of the matrix

entries Fnl. This involves the calculation of the function γl according to its analytically derived

expression (A.12).

We validate our analytical integration of equation (66), carried out in Appendix A, by comparing its result (A.10) with the values of $\gamma_l$ found from equation (A.1) ($l = 0$) and by direct numerical integration of equation (A.2). We do not integrate equation (66) numerically because its integrand is singular at $\rho = d$. Substituting equation (A.3) in (A.2),

$$
\gamma_l(d) = (-1)^l \sqrt{\pi^2 - d^2}
- \sum_{w=1}^{W} \sqrt{\rho_w^2 - d^2}\,
\left. \frac{\mathrm{d}P_l(\cos\rho)}{\mathrm{d}\rho} \right|_{\rho=\rho_w} \delta\rho.
\tag{B.1}
$$

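In our approach, we substitute in equation (A.12) the definition of $h_{l-2k}(d)$, obtaining

$$
\gamma_l(d) =
\begin{cases}
\sqrt{\pi^2 - d^2} & \text{if } l = 0, \\[6pt]
(-1)^l \sqrt{\pi^2 - d^2} + \dfrac{1}{\pi} \displaystyle\sum_{k=0}^{l} (l-2k)\,
\dfrac{\Gamma(k + \frac{1}{2})}{\Gamma(k + 1)}\,
\dfrac{\Gamma(l - k + \frac{1}{2})}{\Gamma(l - k + 1)}
\displaystyle\int_d^{\pi} \sqrt{\rho^2 - d^2}\, \sin[(l-2k)\rho]\, \mathrm{d}\rho & \text{if } l \neq 0.
\end{cases}
\tag{B.2}
$$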

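Finally, we calculate $h_{l-2k}(d)$ numerically with the approximate formula

$$
h_{l-2k}(d) \simeq \sum_{j=1}^{J} \sqrt{\rho_j^2 - d^2}\, \sin[(l-2k)\rho_j]\, \delta\rho,
\tag{B.3}
$$

where

$$
\delta\rho = \frac{\pi - d}{J}, \qquad
\rho_j = d + (j-1)\,\delta\rho + \frac{\delta\rho}{2}
= d + \delta\rho \left( j - \frac{1}{2} \right)
= d + \frac{\pi - d}{J} \left( j - \frac{1}{2} \right),
\tag{B.4}
$$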

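so that the resulting expression of $\gamma_l(d)$ is

$$
\gamma_l(d) =
\begin{cases}
\sqrt{\pi^2 - d^2} & \text{if } l = 0, \\[6pt]
(-1)^l \sqrt{\pi^2 - d^2} + \dfrac{1}{\pi} \displaystyle\sum_{k=0}^{l} (l-2k)\,
\dfrac{\Gamma(k + \frac{1}{2})}{\Gamma(k + 1)}\,
\dfrac{\Gamma(l - k + \frac{1}{2})}{\Gamma(l - k + 1)}
\displaystyle\sum_{j=1}^{J} \sqrt{\rho_j^2 - d^2}\, \sin[(l-2k)\rho_j]\, \delta\rho & \text{if } l \neq 0.
\end{cases}
\tag{B.5}
$$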


Figure A1. Validation of our analytical expression (A.12) for γl; values of γl (left panel) with the

associated numerical error defined as the difference between a numerical and an analytical (equation

(A.12)) implementation of (A.1)-(A.2).

The advantage of our approach is that the term $\mathrm{d}P_l(\cos\rho)/\mathrm{d}\rho$ oscillates much more strongly than $\sin[(l-2k)\rho]$, so that integrating the latter function numerically causes fewer problems than integrating the former.

The result of this comparison is summarized in Figure A1: the error is largest at the same points where the exact function, as given by equations (A.1)-(A.2), is largest, and the colour scales of the two plots show that the error is approximately six orders of magnitude smaller than the exact value.
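A minimal sketch of how expression (B.5) might be evaluated is given below. It is not the implementation used to produce Figure A1: the function names, the default `J = 4000`, and the use of log-gamma to form the gamma ratios are assumptions made for illustration. The k with l - 2k = 0 is skipped explicitly, since its prefactor vanishes.

```python
import math

def gamma_ratio(k):
    """Gamma(k + 1/2) / Gamma(k + 1), formed through log-gamma."""
    return math.exp(math.lgamma(k + 0.5) - math.lgamma(k + 1.0))

def h_midpoint(m, d, J):
    """Midpoint-rule estimate (B.3)-(B.4) of h_m(d), with m = l - 2k."""
    delta = (math.pi - d) / J
    total = 0.0
    for j in range(1, J + 1):
        rho = d + (j - 0.5) * delta
        total += math.sqrt(rho * rho - d * d) * math.sin(m * rho)
    return total * delta

def gamma_series(l, d, J=4000):
    """gamma_l(d) from expression (B.5)."""
    boundary = (-1) ** l * math.sqrt(math.pi ** 2 - d ** 2)
    if l == 0:
        return boundary
    series = 0.0
    for k in range(l + 1):
        m = l - 2 * k
        if m == 0:
            continue                          # the factor (l - 2k) is zero
        series += m * gamma_ratio(k) * gamma_ratio(l - k) * h_midpoint(m, d, J)
    return boundary + series / math.pi
```

At d = 0 the output can be compared with the closed-form limit (A.14), for example π/4 for l = 2 and zero for odd l. A comparison in the spirit of Figure A1 would instead evaluate this function and the direct sum (B.1) on a grid of l and d values and map their difference.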

APPENDIX C: WEIGHTED VARIANCE

We stated in Sections 3.3 and 4.1 that our algorithm assumes that rays propagating in the same ray bundle are parallel. In practice, especially for large values of $\Theta$, diverging or converging rays may be collected in the same bundle. In this Appendix we give a new definition of $\sigma^2(\Theta,\Delta)$ to overcome this problem.

Figure A2. a) As in Figure 13b, but the variance is defined according to equation (C.3); b) Same as in panel a, but input model and data are those used in Section 7.1.

We calculate an average azimuth $\alpha_k$ for the $k$-th ray bundle from the $n_k$ azimuths $\alpha_i$ of its rays,

$$
\alpha_k = \frac{1}{n_k} \sum_{i=1}^{n_k} \alpha_i,
\tag{C.1}
$$

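and then calculate a weighted average $\delta t_k$

$$
\delta t_k = \left( \sum_{i=1}^{n_k} w_i \right)^{-1} \sum_{i=1}^{n_k} w_i\, \delta t_i
\tag{C.2}
$$

and a weighted variance $\sigma_k^2$

$$
\sigma_k^2 = \left( \sum_{i=1}^{n_k} w_i \right)^{-1} \sum_{i=1}^{n_k} w_i \left( \delta t_i - \delta t_k \right)^2,
\tag{C.3}
$$

with the weights $w_i$ defined by $w_i = 1/(\alpha_i - \alpha_k)^2$.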
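Equations (C.1)-(C.3) amount to a few lines of arithmetic per ray bundle, sketched below. The function name `weighted_bundle_stats` and its argument names are hypothetical. The floor `eps` on the squared azimuth deviation is an assumption added here, not part of the definition above: without it the weight is undefined for a ray whose azimuth coincides with the bundle average. Azimuths are averaged arithmetically as in (C.1), with no treatment of wrap-around.

```python
def weighted_bundle_stats(delays, azimuths, eps=1e-6):
    """Return (mean azimuth, weighted mean delay, weighted variance) of one bundle.

    delays   -- delay times of the rays in the bundle
    azimuths -- azimuths of the same rays, in radians
    eps      -- assumed floor on (alpha_i - alpha_k)^2, to keep the weights finite
    """
    n = len(delays)
    mean_az = sum(azimuths) / n                                   # equation (C.1)
    weights = [1.0 / max((a - mean_az) ** 2, eps) for a in azimuths]
    wsum = sum(weights)
    mean_dt = sum(w * t for w, t in zip(weights, delays)) / wsum  # equation (C.2)
    var = sum(w * (t - mean_dt) ** 2
              for w, t in zip(weights, delays)) / wsum            # equation (C.3)
    return mean_az, mean_dt, var
```

With two rays at equal angular distance from the mean azimuth, the weights coincide and the result reduces to the unweighted mean and variance; rays far from the mean azimuth are progressively down-weighted.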

We insert this definition of $\sigma_k^2$ in equation (5) and take the result as the input data of our algorithm, to test whether giving more weight to the data associated with almost parallel rays improves the resolution of our method.

We apply this new definition of $\sigma_k^2$ to the synthetic data sets used in Sections 7.1 and 7.2: the results are shown in Figure A2.

Comparing Figure A2 with Figures 12 and 13, no significant improvement can be seen; on the contrary, in Figure A2b the NNLS solution is worse than the one in Figure 13. We infer that the data-weighting procedure introduced here does not significantly improve the final result.
