Two Sample Problem for Functional Data -...

28
Introduction Testing Equality of Mean Functions Testing Equality of Covariance Operators Bibliography Two Sample Problem for Functional Data Radek Hendrych Stochastic Modelling in Economics and Finance 1 November 25, 2013 Radek Hendrych Two Sample Problem for Functional Data

Transcript of Two Sample Problem for Functional Data -...

Page 1: Two Sample Problem for Functional Data - …msekce.karlin.mff.cuni.cz/~vorisek/Seminar/1314z/1314z_Hendrych.pdf · Introduction Testing Equality of Mean Functions Testing Equality

IntroductionTesting Equality of Mean Functions

Testing Equality of Covariance OperatorsBibliography

Two Sample Problem for Functional Data

Radek Hendrych

Stochastic Modelling in Economics and Finance 1

November 25, 2013

Radek Hendrych Two Sample Problem for Functional Data

Page 2: Two Sample Problem for Functional Data - …msekce.karlin.mff.cuni.cz/~vorisek/Seminar/1314z/1314z_Hendrych.pdf · Introduction Testing Equality of Mean Functions Testing Equality

IntroductionTesting Equality of Mean Functions

Testing Equality of Covariance OperatorsBibliography

Outline

1 Introduction

2 Testing Equality of Mean Functions

3 Testing Equality of Covariance Operators

4 Bibliography

Radek Hendrych Two Sample Problem for Functional Data

Page 3: Two Sample Problem for Functional Data - …msekce.karlin.mff.cuni.cz/~vorisek/Seminar/1314z/1314z_Hendrych.pdf · Introduction Testing Equality of Mean Functions Testing Equality

IntroductionTesting Equality of Mean Functions

Testing Equality of Covariance OperatorsBibliography

Two Sample Problem for Functional Data

testing the equality of the means in two independent samples

testing the equality of the covariance operators in two independentsamples

Asymptotic procedures will be introduced.

Radek Hendrych Two Sample Problem for Functional Data

Page 4: Two Sample Problem for Functional Data - …msekce.karlin.mff.cuni.cz/~vorisek/Seminar/1314z/1314z_Hendrych.pdf · Introduction Testing Equality of Mean Functions Testing Equality

IntroductionTesting Equality of Mean Functions

Testing Equality of Covariance OperatorsBibliography

Outline

1 Introduction

2 Testing Equality of Mean Functions

3 Testing Equality of Covariance Operators

4 Bibliography

Radek Hendrych Two Sample Problem for Functional Data

Page 5: Two Sample Problem for Functional Data - …msekce.karlin.mff.cuni.cz/~vorisek/Seminar/1314z/1314z_Hendrych.pdf · Introduction Testing Equality of Mean Functions Testing Equality

IntroductionTesting Equality of Mean Functions

Testing Equality of Covariance OperatorsBibliography

Testing Equality of Mean Functions

Model

Consider two samples X1, . . . ,XN and X ∗1 , . . . ,X∗M satisfying the model

Xi (t) = µ(t) + εi (t), i = 1 . . . ,N, (1)

X ∗j (t) = µ∗(t) + ε∗j (t), j = 1 . . . ,M. (2)

Assumptions

(A1) the two samples are independent

(A2) ε1, . . . , εN are i.i.d. with Eε1(t) = 0 and E||ε1||4 <∞(A3) ε∗1 , . . . , ε

∗M are i.i.d. with Eε∗1(t) = 0 and E||ε∗1 ||4 <∞

The ε1 and ε∗1 do not have to follow the same distribution.

Radek Hendrych Two Sample Problem for Functional Data

Page 6: Two Sample Problem for Functional Data - …msekce.karlin.mff.cuni.cz/~vorisek/Seminar/1314z/1314z_Hendrych.pdf · Introduction Testing Equality of Mean Functions Testing Equality

IntroductionTesting Equality of Mean Functions

Testing Equality of Covariance OperatorsBibliography

Main Goal

Testing Hypothesis

H0 : µ = µ∗ in L2 against H1 : ¬H0

Radek Hendrych Two Sample Problem for Functional Data

Page 7: Two Sample Problem for Functional Data - …msekce.karlin.mff.cuni.cz/~vorisek/Seminar/1314z/1314z_Hendrych.pdf · Introduction Testing Equality of Mean Functions Testing Equality

IntroductionTesting Equality of Mean Functions

Testing Equality of Covariance OperatorsBibliography

Method I

XN(t) =1

N

N∑i=1

Xi (t) and X ∗M(t) =1

M

M∑j=1

X ∗j (t)

are unbiased estimators for µ(t) and µ∗(t), respectively.

It is natural to reject the null hypothesis if

UN,M =NM

N + M

∫ 1

0

(XN(t)− X ∗M(t)

)2dt (3)

is large.

Radek Hendrych Two Sample Problem for Functional Data

Page 8: Two Sample Problem for Functional Data - …msekce.karlin.mff.cuni.cz/~vorisek/Seminar/1314z/1314z_Hendrych.pdf · Introduction Testing Equality of Mean Functions Testing Equality

IntroductionTesting Equality of Mean Functions

Testing Equality of Covariance OperatorsBibliography

Convergence of UN,M under H0

Theorem

If H0 and the assumptions (A1), (A2) and (A3) hold, and

N

N + M→ θ, for some θ ∈ [0, 1], as N →∞,

then

UN,Md−→∫ 1

0

Γ2(t)dt, N,M →∞, (4)

where {Γ(t), t ∈ [0, 1]} is a Gaussian process satisfying EΓ(t) = 0 and

E [Γ(t)Γ(s)] = (1− θ)c(t, s) + θc∗(t, s),

with c(t, s) = cov(X1(t),X1(s)) and c∗(t, s) = cov(X ∗1 (t),X ∗1 (s)).

Proof. See [1].

Radek Hendrych Two Sample Problem for Functional Data

Page 9: Two Sample Problem for Functional Data - …msekce.karlin.mff.cuni.cz/~vorisek/Seminar/1314z/1314z_Hendrych.pdf · Introduction Testing Equality of Mean Functions Testing Equality

IntroductionTesting Equality of Mean Functions

Testing Equality of Covariance OperatorsBibliography

The limit distribution of UN,M in (4) depends on the unknown covariancefunctions c and c∗.

According to the Karhunen-Loeve expansion, one can suppose that

Γ(t) =∞∑k=1

τ1/2k Nkϕk(t),

where Nk , k ∈ N, are independent N(0, 1) random variables,τ1 ≥ τ2 ≥ . . . and ϕ1, ϕ2, . . . are the eigenvalues and eigenfunctions ofthe operator determined by (1− θ)c + θc∗.

Radek Hendrych Two Sample Problem for Functional Data

Page 10: Two Sample Problem for Functional Data - …msekce.karlin.mff.cuni.cz/~vorisek/Seminar/1314z/1314z_Hendrych.pdf · Introduction Testing Equality of Mean Functions Testing Equality

IntroductionTesting Equality of Mean Functions

Testing Equality of Covariance OperatorsBibliography

Since ∫ 1

0

Γ2(t)dt =∞∑k=1

τkN2k ,

to provide a reasonable approximation for∫ 1

0Γ2(t)dt, one only need to

estimate τk .

This can be done using τk , the eigenvalues of the empirical covariancefunction

zN,M(t, s) =M

N + M

1

N

N∑i=1

(Xi (t)− XN(t)

) (Xi (s)− XN(s)

)+

N

N + M

1

M

M∑j=1

(X ∗j (t)− X ∗M(t)

) (X ∗j (s)− X ∗M(s)

).

Thus, the sum∑d

k=1 τkN2k offers an approximation to the limit

distribution in (4) if d is large enough.

Radek Hendrych Two Sample Problem for Functional Data

Page 11: Two Sample Problem for Functional Data - …msekce.karlin.mff.cuni.cz/~vorisek/Seminar/1314z/1314z_Hendrych.pdf · Introduction Testing Equality of Mean Functions Testing Equality

IntroductionTesting Equality of Mean Functions

Testing Equality of Covariance OperatorsBibliography

Asymptotic Consistency of Method I

Theorem

If the assumptions (A1), (A2) and (A3) hold,

N

N + M→ θ, for some θ ∈ [0, 1], as N →∞,

and ∫ 1

0

(µ(t)− µ∗(t))2 dt > 0,

then UN,MP−→∞, as N,M →∞.

Proof. See [1].

Radek Hendrych Two Sample Problem for Functional Data

Page 12: Two Sample Problem for Functional Data - …msekce.karlin.mff.cuni.cz/~vorisek/Seminar/1314z/1314z_Hendrych.pdf · Introduction Testing Equality of Mean Functions Testing Equality

IntroductionTesting Equality of Mean Functions

Testing Equality of Covariance OperatorsBibliography

Method II

The method is a projection version of the first procedure based on UN,M .

It does not require the numerical evaluation of the integral in thedefinition of UN,M (⇒ an easier implementation).

Consider projections onto the space determined by the leadingeigenfunctions of the operator Z = (1− θ)C + θC∗.

In particular, assume that the eigenvalues of Z satisfy

τ1 > τ2 > · · · > τd > τd+1. (5)

One wants to project observations onto the space spanned by ϕ1, . . . , ϕd ,i.e. the corresponding eigenfunctions.

Radek Hendrych Two Sample Problem for Functional Data

Page 13: Two Sample Problem for Functional Data - …msekce.karlin.mff.cuni.cz/~vorisek/Seminar/1314z/1314z_Hendrych.pdf · Introduction Testing Equality of Mean Functions Testing Equality

IntroductionTesting Equality of Mean Functions

Testing Equality of Covariance OperatorsBibliography

In fact, the functions ϕ1, . . . , ϕd are unknown.

The corresponding eigenfunctions of ZN,M , denoted by ϕi , are used.

This delivers a projection of XN − X ∗M into the linear space spanned byϕ1, . . . , ϕd . Let

a = (a1, . . . , ad)> , where ai = 〈XN − X ∗M , ϕi 〉.

Under the conditions of the first theorem, it can be shown that√NM/(N + M)a has approximately d-variate normal distribution (up to

some random signs) with the asymptotic variance Q = {Q(i , j)}di,j=1,

Q(i , j) = (1−θ)E〈X1−µ, ϕi 〉〈X1−µ, ϕj〉+θE〈X ∗1 −µ∗, ϕi 〉〈X ∗1 −µ∗, ϕj〉.

Thus, Q(i , i) = τi and Q(i , j)i 6=j= 0.

Radek Hendrych Two Sample Problem for Functional Data

Page 14: Two Sample Problem for Functional Data - …msekce.karlin.mff.cuni.cz/~vorisek/Seminar/1314z/1314z_Hendrych.pdf · Introduction Testing Equality of Mean Functions Testing Equality

IntroductionTesting Equality of Mean Functions

Testing Equality of Covariance OperatorsBibliography

According to the mentioned facts, testing procedures can be based on

T(1)N,M =

NM

N + M

d∑k=1

a2kτk, (6)

T(2)N,M =

NM

N + M

d∑k=1

a2k . (7)

Radek Hendrych Two Sample Problem for Functional Data

Page 15: Two Sample Problem for Functional Data - …msekce.karlin.mff.cuni.cz/~vorisek/Seminar/1314z/1314z_Hendrych.pdf · Introduction Testing Equality of Mean Functions Testing Equality

IntroductionTesting Equality of Mean Functions

Testing Equality of Covariance OperatorsBibliography

Convergence of T(1)N,M and T

(2)N,M under H0

Theorem

If H0, the assumptions (A1), (A2), (A3) and (5) hold, and

N

N + M→ θ, for some θ ∈ [0, 1], as N →∞,

then

T(1)N,M

d−→ χ2d and T

(2)N,M

d−→d∑

k=1

τkN2k , N,M →∞, (8)

where N1, . . . ,Nk are independent standard normal random variables.

Proof. See [1].

T(2)N,M is a projection version of UN,M , where only first d terms in L2 expansion

of XN − X ∗M are used. The statistic T

(1)N,M is an asymptotically distribution free

modification of T(2)N,M (and hence of UN,M).

Radek Hendrych Two Sample Problem for Functional Data

Page 16: Two Sample Problem for Functional Data - …msekce.karlin.mff.cuni.cz/~vorisek/Seminar/1314z/1314z_Hendrych.pdf · Introduction Testing Equality of Mean Functions Testing Equality

IntroductionTesting Equality of Mean Functions

Testing Equality of Covariance OperatorsBibliography

Asymptotic Consistency of Method II

Theorem

If the assumptions (A1), (A2), (A3) and (5) hold,

N

N + M→ θ, for some θ ∈ [0, 1], as N →∞,

and µ− µ∗ is not orthogonal to the linear span of ϕ1, . . . , ϕd , then

T(1)N,M

P−→∞ and T(2)N,M

P−→∞, as N,M →∞.

Proof. See [1].

Radek Hendrych Two Sample Problem for Functional Data

Page 17: Two Sample Problem for Functional Data - …msekce.karlin.mff.cuni.cz/~vorisek/Seminar/1314z/1314z_Hendrych.pdf · Introduction Testing Equality of Mean Functions Testing Equality

IntroductionTesting Equality of Mean Functions

Testing Equality of Covariance OperatorsBibliography

Empirical Example

Data

The data set consists of egg-laying trajectories of Mediterraneanfruit flies (medflies).

Consider 534 egg-laying curves of flies who lived at least 30 days.

Each function is defined over an interval [0, 30].

Its value on day t ∈ [0, 30] is the number of eggs laid by the fly i onthat day.

The 534 medflies are classified into:

256 short-lived (died before the end of the 44th day after birth),278 long-lived (lived longer than 44 days).

Two Samples

Xi (t), t ∈ [0, 30], i = 1, . . . , 256 (for short-lived medflies)

X ∗j (t), t ∈ [0, 30], j = 1, . . . , 278 (for long-lived medflies)

Radek Hendrych Two Sample Problem for Functional Data

Page 18: Two Sample Problem for Functional Data - …msekce.karlin.mff.cuni.cz/~vorisek/Seminar/1314z/1314z_Hendrych.pdf · Introduction Testing Equality of Mean Functions Testing Equality

IntroductionTesting Equality of Mean Functions

Testing Equality of Covariance OperatorsBibliography

Figure: Randomly selected egg-laying curves for short- and long-lived medflies.

Radek Hendrych Two Sample Problem for Functional Data

Page 19: Two Sample Problem for Functional Data - …msekce.karlin.mff.cuni.cz/~vorisek/Seminar/1314z/1314z_Hendrych.pdf · Introduction Testing Equality of Mean Functions Testing Equality

IntroductionTesting Equality of Mean Functions

Testing Equality of Covariance OperatorsBibliography

Figure: The estimated mean functions for the medfly data: short-lived by thesolid line, long-lived by the dashed line.

Radek Hendrych Two Sample Problem for Functional Data

Page 20: Two Sample Problem for Functional Data - …msekce.karlin.mff.cuni.cz/~vorisek/Seminar/1314z/1314z_Hendrych.pdf · Introduction Testing Equality of Mean Functions Testing Equality

IntroductionTesting Equality of Mean Functions

Testing Equality of Covariance OperatorsBibliography

d 1 2 3 4 5 6 7 8 9

T(1)256,278 1.0 2.2 3.0 5.7 10.3 15.3 3.2 2.7 5.0

T(2)256,278 1.0 1.0 1.0 1.1 1.1 1.0 1.0 1.1 1.1

Table: The p-values (in %) of the tests based on the statistics T(1)256,278 and

T(2)256,278 applied to the medfly data.

The p-values for T(2)256,278 are much more stable.

The test based on T(1)N,M is easier to apply (due to the standard

chi-squared critical values), the test based on T(2)N,M may be more reliable.

Radek Hendrych Two Sample Problem for Functional Data

Page 21: Two Sample Problem for Functional Data - …msekce.karlin.mff.cuni.cz/~vorisek/Seminar/1314z/1314z_Hendrych.pdf · Introduction Testing Equality of Mean Functions Testing Equality

IntroductionTesting Equality of Mean Functions

Testing Equality of Covariance OperatorsBibliography

Outline

1 Introduction

2 Testing Equality of Mean Functions

3 Testing Equality of Covariance Operators

4 Bibliography

Radek Hendrych Two Sample Problem for Functional Data

Page 22: Two Sample Problem for Functional Data - …msekce.karlin.mff.cuni.cz/~vorisek/Seminar/1314z/1314z_Hendrych.pdf · Introduction Testing Equality of Mean Functions Testing Equality

IntroductionTesting Equality of Mean Functions

Testing Equality of Covariance OperatorsBibliography

Testing Equality of Covariance Operators

Model Assumptions

Consider two samples X1, . . . ,XN and X ∗1 , . . . ,X∗M satisfying:

(A1) the two samples are independent,

(A2) X1, . . . ,XN are i.i.d. elements of L2 with EX1(t) = 0,

(A3) X ∗1 , . . . ,X∗M are i.i.d. elements of L2 with EX ∗1 (t) = 0.

The X1 and X ∗1 do not have to follow the same distribution.

Radek Hendrych Two Sample Problem for Functional Data

Page 23: Two Sample Problem for Functional Data - …msekce.karlin.mff.cuni.cz/~vorisek/Seminar/1314z/1314z_Hendrych.pdf · Introduction Testing Equality of Mean Functions Testing Equality

IntroductionTesting Equality of Mean Functions

Testing Equality of Covariance OperatorsBibliography

Main Goal

Testing Hypothesis

Suppose the variance operators

C (x) = E[〈X , x〉X ], C∗(x) = E[〈X ∗, x〉X ∗], x ∈ L2,

where X has the same distribution as X1, and X ∗ has the samedistribution as X ∗1 .

One wants to test

H0 : C = C∗ against H1 : ¬H0.

Radek Hendrych Two Sample Problem for Functional Data

Page 24: Two Sample Problem for Functional Data - …msekce.karlin.mff.cuni.cz/~vorisek/Seminar/1314z/1314z_Hendrych.pdf · Introduction Testing Equality of Mean Functions Testing Equality

IntroductionTesting Equality of Mean Functions

Testing Equality of Covariance OperatorsBibliography

Let R be the empirical covariance operator of the pooled data, i.e.

R(x) =1

N + M

N∑i=1

〈Xi , x〉Xi +M∑j=1

〈X ∗j , x〉X ∗j

= θC (x)+(1− θ)C∗(x),

where C and C∗ are the empirical counterparts of C and C∗, x ∈ L2, andθ = N

N+M .

The operator R has N + M eigenfunctions (denoted by φk).

Set

λk =1

N

N∑i=1

〈Xi , φk〉2 and λ∗k =1

M

M∑j=1

〈X ∗j , φk〉2.

These are the sample variances of the Fourier coefficients of X and X ∗

with respect to the orthonormal system {φk , k = 1, . . . ,N + M}.

Radek Hendrych Two Sample Problem for Functional Data

Page 25: Two Sample Problem for Functional Data - …msekce.karlin.mff.cuni.cz/~vorisek/Seminar/1314z/1314z_Hendrych.pdf · Introduction Testing Equality of Mean Functions Testing Equality

IntroductionTesting Equality of Mean Functions

Testing Equality of Covariance OperatorsBibliography

Convergence of T under H0 and Gaussianity

The test statistic T is defined by

T =N + M

2θ(1− θ)

N+M∑i,j=1

〈(C − C∗)φi , φj〉2(θλi + (1− θ)λ∗i

)(θλj + (1− θ)λ∗j

) .Theorem

Suppose X and X ∗ are Gaussian elements of L2 such that E||X ||4 <∞and E||X ∗||4 <∞. Suppose also that θ → θ ∈ (0, 1), as N →∞. If H0

holds, then

Td−→ χ2

(N+M)(N+M+1)/2, N,M →∞. (9)

Proof. See [1].

Radek Hendrych Two Sample Problem for Functional Data

Page 26: Two Sample Problem for Functional Data - …msekce.karlin.mff.cuni.cz/~vorisek/Seminar/1314z/1314z_Hendrych.pdf · Introduction Testing Equality of Mean Functions Testing Equality

IntroductionTesting Equality of Mean Functions

Testing Equality of Covariance OperatorsBibliography

Outline

1 Introduction

2 Testing Equality of Mean Functions

3 Testing Equality of Covariance Operators

4 Bibliography

Radek Hendrych Two Sample Problem for Functional Data

Page 27: Two Sample Problem for Functional Data - …msekce.karlin.mff.cuni.cz/~vorisek/Seminar/1314z/1314z_Hendrych.pdf · Introduction Testing Equality of Mean Functions Testing Equality

IntroductionTesting Equality of Mean Functions

Testing Equality of Covariance OperatorsBibliography

Bibliography

Horvath, L., Kokoszka, P.: Inference for Functional Data withApplications. New York: Springer, 2012. Chapter 5.

Radek Hendrych Two Sample Problem for Functional Data

Page 28: Two Sample Problem for Functional Data - …msekce.karlin.mff.cuni.cz/~vorisek/Seminar/1314z/1314z_Hendrych.pdf · Introduction Testing Equality of Mean Functions Testing Equality

IntroductionTesting Equality of Mean Functions

Testing Equality of Covariance OperatorsBibliography

Thank you for your attention.

Radek Hendrych Two Sample Problem for Functional Data