Functional ANOVA with random functional effects: an application to ...



STATISTICS IN MEDICINE
Statist. Med. 2006; 25:3718–3739
Published online 22 December 2005 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/sim.2464

Functional ANOVA with random functional effects: an application to event-related potentials modelling for electroencephalograms analysis

Céline Bugli1,∗,† and Philippe Lambert1,2

1Institut de Statistique, Université catholique de Louvain, B-1348 Louvain-la-Neuve, Belgium
2Unité d'épidémiologie, Biostatistique et méthodes opérationnelles, Faculté de Médecine, Université catholique de Louvain, B-1200 Bruxelles, Belgium

SUMMARY

The differential effects of basic visual or auditory stimuli on electroencephalograms (EEG), named event-related potentials (ERPs), are often used to evaluate the impact of treatments on brain performances. In the present paper, we propose a P-splines based model that can be used to evaluate treatment effect on the timing and the amplitude of some peaks of the ERP curves. Functional ANOVA is an adaptation of the linear model or analysis of variance to analyse functional observations. The changes in the functional effects of interest are generally described using smoothing splines. Eilers and Marx proposed to work with P-splines, a combination of B-splines and difference penalties on coefficients. We define a P-splines model for ERP curves combined with random effects. In particular, we show that it is a useful alternative to classical strategies requiring the visual and usually imprecise localization of specific ERP peaks from curves with a low signal-to-noise ratio. Copyright © 2005 John Wiley & Sons, Ltd.

KEY WORDS: EEG; ERP; P300; mixed model; P-splines; regression; functional ANOVA

∗Correspondence to: C. Bugli, Institut de Statistique, Université catholique de Louvain, Voie du Roman Pays, 20, B-1348 Louvain-la-Neuve, Belgium.
†E-mail: [email protected]

Contract/grant sponsor: Federal Office for Scientific, Technical and Cultural Affairs; contract/grant number: P5/24
Contract/grant sponsor: Fonds National pour la Recherche Scientifique (FNRS), Belgium
Contract/grant sponsor: Eli Lilly and Company

Received 24 October 2005; Accepted 24 October 2005

1. INTRODUCTION

The analysis of longitudinal curve data is a methodological and computational challenge for statisticians. Such data are often generated in biomedical studies. Most of the time, the statistical analysis focuses on simple summary measures, thereby discarding potentially important information. One way to model these curves is to use non-parametric regression techniques based on splines. Because of its simplicity and flexibility, penalized spline regression is a popular smoothing technique. It uses a fixed spline basis and is associated with a penalty for model complexity. This penalty is often based on the second derivative of the fitted function [1]. As an alternative, Eilers and Marx [2] proposed to use P-splines: a combination of B-splines and difference penalties on coefficients. We shall present this approach when the curves to model are extracted from electroencephalogram (EEG) data.

EEG is an important diagnostic tool in clinical neurophysiology. However, EEGs are not often used in clinical studies because of intrinsic problems like the huge quantity of data to analyse. In this paper, we shall describe statistical tools to detect and to quantify the effect of drugs on the brain.

We use data kindly made available by the pharmaceutical company Eli Lilly (Lilly Clinical Operations S.A., Louvain-la-Neuve, Belgium). These EEGs were recorded during a cross-over study designed to observe the effects of a drug (a benzodiazepine known for its negative action on memory) on such signals.

The paper is organized as follows. In Section 2, we start by introducing notations for P-splines smoothing and P-splines regression. Next, we present functional ANOVA with P-splines. A description of the study is given in Section 3, where an interesting characteristic of EEG signals named event-related potential (ERP) and in particular the P300 peak are introduced. In Section 4, we introduce a mixed model with random subject effect to analyse the ERP curves. Estimation of the fixed effects, prediction of the random effects and specification of the smoothing parameters are described. Pointwise and simultaneous confidence bands are obtained, and model selection is addressed. Some computational issues are also discussed. Finally, we apply functional ANOVA with P-splines to the analysis of EEG in Section 5.

2. SPLINE SMOOTHING AND REGRESSION

2.1. Literature review

There are several competing approaches to non-parametric modelling: series-based smoothers, including wavelets [3]; kernel methods [4, 5], including local regression [6, 7]; and spline methods, including smoothing splines (e.g. References [8–10]), regression splines [11, 12] and B-splines [13, 14]. All these methods can be used efficiently. The nature of the data should play a role in the choice among them.

There is growing interest in modelling longitudinal data with curves, in the style of ANOVA. Ramsay and Silverman [15] discuss functional ANOVA in their book. Brumback and Rice [16] and Rice and Wu [17] present interesting applications. All these papers use spline regression. The approach to regression used in this paper is based on B-splines.

2.2. P-spline smoothing and functional ANOVA with P-splines

B-splines are attractive for non-parametric modelling, but the choice of the optimal number and positions of knots can be difficult. A solution is to use a large number of equidistant knots and a penalty to control the smoothness of the fitted curve (see References [18, 19]). Eilers and Marx [2] proposed to use B-splines with a difference penalty on the spline coefficients.
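To make the idea of a difference penalty concrete, the following minimal Python sketch builds a difference matrix and evaluates the penalty on a coefficient vector. The function names, the coefficient values and the use of NumPy are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np


def difference_matrix(n_coef: int, order: int) -> np.ndarray:
    """Return the (n_coef - order) x n_coef matrix D of order-`order` differences."""
    # Differencing the rows of the identity gives the difference operator as a matrix.
    return np.diff(np.eye(n_coef), n=order, axis=0)


def difference_penalty(theta: np.ndarray, order: int, lam: float) -> float:
    """Evaluate lam * sum of squared order-`order` differences = lam * theta' D'D theta."""
    D = difference_matrix(theta.size, order)
    return float(lam * theta @ D.T @ D @ theta)


if __name__ == "__main__":
    theta_linear = np.arange(6, dtype=float)                 # coefficients on a straight line
    theta_rough = np.array([0.0, 3.0, -1.0, 4.0, 0.0, 2.0])  # made-up wiggly coefficients
    # A second-order penalty ignores linear trends and charges the wiggly vector.
    print(difference_penalty(theta_linear, order=2, lam=1.0))
    print(difference_penalty(theta_rough, order=2, lam=1.0))
```

A penalty of order 2 leaves linear sequences of coefficients unpenalized, which is why increasing the smoothing parameter pulls the fit towards a smooth low-order trend rather than towards zero.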


Suppose that we observe a smooth function $y(t)$ at $t_1, t_2, \ldots, t_N$, yielding $N$ pairs of observations $\{(t_n, y_n) : n = 1, \ldots, N\}$. We approximate $y(t)$ by

$$\sum_{s=0}^{S} \theta_s \phi_s(t) \tag{1}$$

where $\phi_0, \phi_1, \ldots, \phi_S$ are known functions of $t$ and $\theta = (\theta_0, \ldots, \theta_S)^T$ is an appropriate vector of coefficients. Examples of the $\phi_s$ might be B-splines, tensor products of B-splines, thin plate regression spline basis functions, the truncated power basis for cubic splines, or some radial basis functions (see Reference [20] for examples). In this paper, we use B-spline basis functions.

A B-spline of degree $q$ is a piecewise polynomial function with $q+1$ pieces of degree $q$. At the $q$ joining knots, derivatives up to order $q-1$ are continuous (see References [13, 14]). The coefficients $\theta$ in equation (1) can be estimated by minimizing:

$$\sum_{n=1}^{N}\left(y_n - \sum_{s=0}^{S}\theta_s\phi_s(t_n)\right)^2 + \mathrm{PEN}(\theta)$$

where $\mathrm{PEN}(\theta)$ is a roughness penalty depending on a smoothing parameter $\lambda$.

Examples of this penalty are the thin plate spline penalty, the integrated squared second derivative penalty for cubic splines and the various difference penalties used in P-spline smoothing. Eilers and Marx introduced a penalized B-spline regression model with equidistant knots and a convenient roughness penalty, called P-splines. The penalty is based on the differences of adjacent B-spline coefficients, yielding the penalized likelihood:

$$\sum_{n=1}^{N}\left(y_n - \sum_{s=0}^{S}\theta_s\phi_s(t_n)\right)^2 + \lambda\sum_{s=d+1}^{S}(\Delta^d\theta_s)^2$$

where $\Delta^d$ is the difference operator of order $d$.

In the case of longitudinal data, we have several curves $y_i(t)$, $i = 1, \ldots, I$, observed at multiple occasions, say $t = (t_1, t_2, \ldots, t_N)$. We shall now use multiple regression to model these curves and their association with factors. For example, suppose that $P$ binary covariates $(b_{i1}, \ldots, b_{iP})$ are associated with the $i$th curve. (The model can be extended to non-binary variables.) Each curve can be decomposed as the sum of several curves: one reference curve $f(t)$ that we shall name the ‘mean curve’ and $P$ correction curves $g_p(t)$ $(p = 1, \ldots, P)$ (named ‘effect curves’) summarizing the effect of each factor:

$$E(Y_i(t_n)) = \mu_i^n = f(t_n) + \sum_{p=1}^{P} b_{ip}\, g_p(t_n)$$

These curves can be modelled using B-splines (see Reference [21]).

As with spline smoothing, the evolution of the curvature of a B-spline fit is strongly dependent on the number and the positions of the knots. Instead of trying to optimize the number and the position of the knots (which is a difficult numerical problem), we use a relatively large number of equidistant knots and avoid overfitting by adding a difference penalty on the coefficients of adjacent B-splines. This is the P-spline approach recommended by Eilers [22]. The full model for a B-spline basis $\{\phi_s(t) : s = 1, \ldots, S\}$ is, for curve $y_i(t)$:

$$E(y_i(t)) = f(t) + \sum_{p=1}^{P} b_{ip}\, g_p(t) = \sum_{s=1}^{S}\theta_{0s}\phi_s(t) + \sum_{p=1}^{P}\left(b_{ip}\sum_{s=1}^{S}\theta_{ps}\phi_s(t)\right) \tag{2}$$


If $Y$ is the matrix such that $Y_{in} = y_i(t_n)$, then equation (2) can be written as

$$E(Y) = X B \Phi^T$$

where $X$ is the matrix indicating presence or absence of the factors of interest $(p = 1, \ldots, P+1)$ for each curve $(i = 1, \ldots, I)$, $\Phi$ is the matrix associated to the B-spline basis (such that $\Phi_{ns} = \phi_s(t_n)$, $s = 1, \ldots, S$; $n = 1, \ldots, N$) and $B$ is the matrix of the B-spline coefficients (such that $B_{ks} = \theta_{k-1,s}$, $k = 1, \ldots, P+1$; $s = 1, \ldots, S$). In the following, we shall transform the notations of the above problem to obtain an index notation which conveys the same information but which is intended for mathematical convenience rather than machine calculation. We define the vectorial form of this model by using the $\mathrm{vec}(\cdot)$ operator, which stacks the columns of a matrix: $x = \mathrm{vec}(X)$ is such that $x_{(j-1)I+i} = X_{ij}$ $(i = 1, \ldots, I;\ j = 1, \ldots, J)$.

For the above spline regression model, we have

$$\mathrm{vec}(E(Y)^T) = \mathrm{vec}((X B \Phi^T)^T) = \mathrm{vec}(\Phi B^T X^T)$$

Henderson and Searle [23] showed that

$$\mathrm{vec}(\Phi B^T X^T) = (X \otimes \Phi)\,\mathrm{vec}(B^T)$$

where $(X \otimes \Phi)$ is the Kronecker product of the matrices $X$ and $\Phi$. Therefore, equation (2) can be rewritten as

$$E(y) = \mathcal{X} b$$

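with $(X \otimes \Phi) = \mathcal{X}$ and $\mathrm{vec}(B^T) = b$.

If $D$ is the difference matrix associated to the difference operator of order $d$ such that $\sum_{s=d+1}^{S}(\Delta^d\theta_{ps})^2 = \theta_p^T D^T D\,\theta_p$, with $\theta_p = (\theta_{p1}, \ldots, \theta_{pS})^T$, then the estimates of the spline parameters will be obtained by minimizing

$$(y - \mathcal{X} b)^T (y - \mathcal{X} b) + \sum_{p=0}^{P} \lambda_p\, \theta_p^T D^T D\,\theta_p$$

where $\lambda_p$ is the penalty parameter associated to the $p$th curve. The penalty term in that penalized likelihood can be rewritten as $b^T(\Lambda_f \otimes D^T D)\,b$ where $\Lambda_f = \mathrm{diag}\{\lambda_0, \lambda_1, \ldots, \lambda_P\}$. Conditionally on the penalty parameters, we get the spline parameter estimators:

$$\hat{b} = (\mathcal{X}^T\mathcal{X} + \Lambda_f \otimes D^T D)^{-1}\mathcal{X}^T y$$

and the fitted values

$$\hat{y} = \mathcal{X}\hat{b} = \mathcal{X}(\mathcal{X}^T\mathcal{X} + \Lambda_f \otimes D^T D)^{-1}\mathcal{X}^T y = H y$$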

The model was presented in a least-squares setting, to keep the explanation simple. Because of its regression structure, it can be extended straightforwardly to a generalized additive model (see Reference [22]).
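The penalized least-squares estimator of this section can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' implementation (the paper reports using Matlab): the function names, the Cox-de Boor construction of the equidistant B-spline basis, the simulated curves and the smoothing values are all hypothetical choices.

```python
import numpy as np


def bspline_basis(t, n_intervals, degree=3):
    """Equidistant B-spline basis on [t.min(), t.max()] via the Cox-de Boor recursion."""
    lo, hi = float(t.min()), float(t.max())
    h = (hi - lo) / n_intervals
    # Extend the knot sequence by `degree` knots on each side of the domain.
    knots = lo + h * np.arange(-degree, n_intervals + degree + 1)
    # Degree-0 basis: indicator functions of the knot intervals.
    B = ((t[:, None] >= knots[:-1]) & (t[:, None] < knots[1:])).astype(float)
    for q in range(1, degree + 1):
        left = (t[:, None] - knots[:-q - 1]) / (q * h)
        right = (knots[q + 1:] - t[:, None]) / (q * h)
        B = left * B[:, :-1] + right * B[:, 1:]
    return B  # shape (len(t), n_intervals + degree)


def fit_functional_anova(Y, X, Phi, lambdas, diff_order=2):
    """Penalized least squares for E(Y) = X B Phi'; returns B_hat of shape (P+1, S)."""
    S = Phi.shape[1]
    D = np.diff(np.eye(S), n=diff_order, axis=0)      # difference matrix
    X_big = np.kron(X, Phi)                           # Kronecker design, (I*N) x ((P+1)*S)
    penalty = np.kron(np.diag(lambdas), D.T @ D)      # Lambda_f (x) D'D
    y = Y.ravel()                                     # vec(Y'): curves stacked one after another
    b_hat = np.linalg.solve(X_big.T @ X_big + penalty, X_big.T @ y)
    return b_hat.reshape(X.shape[1], S)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.linspace(0.0, 1.0, 60)
    Phi = bspline_basis(t, n_intervals=10)
    group = np.repeat([0.0, 1.0], 10)                 # one binary covariate, 20 curves
    X = np.column_stack([np.ones(20), group])
    mean_curve = np.sin(2 * np.pi * t)                # made-up 'mean curve'
    effect_curve = 0.5 * np.cos(2 * np.pi * t)        # made-up 'effect curve'
    Y = mean_curve + np.outer(group, effect_curve) + 0.1 * rng.standard_normal((20, 60))
    B_hat = fit_functional_anova(Y, X, Phi, lambdas=[1.0, 1.0])
    f_hat, g1_hat = Phi @ B_hat[0], Phi @ B_hat[1]
    print(np.abs(f_hat - mean_curve).max(), np.abs(g1_hat - effect_curve).max())
```

Forming the Kronecker product densely is only reasonable for small examples; for the data sizes discussed later in the paper a sparse representation would be needed.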


3. STUDY PRESENTATION

3.1. Study design

The data of interest come from a randomized, double-blind, placebo-controlled cross-over study performed with 15 healthy male subjects. These data were kindly made available by the pharmaceutical company Eli Lilly (Lilly Clinical Operations S.A., Louvain-la-Neuve, Belgium). The study is designed to assess the effects of a benzodiazepine (Lorazepam) versus placebo on the cognitive functions through the analysis of EEGs. Lorazepam is extensively used as a sedative and anti-anxiety agent in clinical practice [24]. High concentrations of Lorazepam cause memory disorders [25]. Two periods are scheduled, separated by a wash-out period of at least 7 days: in each period, one of the two randomized treatments is administered once a day to each of the 15 volunteers. For each treatment and each subject, 12 EEGs (corresponding to 12 periods in the day) are recorded during 3 min. EEG recordings start 1.5, 1 and 0.5 h before the drug administration and 0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5 and 25.5 h after the Lorazepam (or placebo) administration.

For each EEG recording, 28 EEG leads are recorded using an ear-linked reference (see Reference [26] for details about lead positions). Each EEG is recorded with a sampling frequency of 250 Hz for the 28 electrodes.

EEGs were recorded while the subjects were submitted to auditory stimuli and asked to perform some task in response to the stimuli. This is the standard auditory ‘oddball’ paradigm (see Reference [27]). Subjects have to listen to a series of stimuli involving two types of tones: frequent tones at 500 Hz and infrequent tones at 2000 Hz. Subjects are asked to count the infrequent tones. The tones are presented as a randomized sequence with the infrequent tones representing 15 per cent of the 130 submitted stimuli.

The goal of our work is to detect and to quantify the effects of the drug on the brain through the analysis of the recorded EEGs. More precisely, we shall use the analysis of event-related potentials to detect a treatment effect.

3.2. Event-related potentials

EEG activity is present in a spontaneous way. It is affected by external stimuli (e.g. tone or light flash). The alteration of the ongoing EEG due to stimuli is named an event-related potential (ERP). In this section, we shall introduce some notions about ERP. The amplitude of an ERP is low compared to the ongoing EEG. ERP can be visualized during a short period following the stimulation time, with a response pattern which is more or less predictable under similar conditions. We name an ERP episode the EEG signals measured during a few seconds after a stimulation. It is common practice to average several ERP episodes observed after the same stimulation to increase the signal-to-noise ratio of the evoked activity.

The averaged ERP episodes present some well-known peaks. The peaks usually pointed out are the P100 or P1 (observed approximately 100 ms after stimulation), the P200 or P2 (∼200 ms), and, when the stimulus is a cognitive task, the P300 or P3 (∼300 ms) peak (see Figures 1(a) and (b)). The P100 and P200 peaks can be observed after frequent and infrequent stimulations. Since these peaks do not depend on the task, they have relatively short latency. P300 is only observed after the infrequent stimulations because it is related to a cognitive task. Since this response is task dependent and has a long latency, it is traditionally related to cognitive processes such as signal matching, decision making, attention, memory updating, etc. See References [28, 29] for more details about event-related potentials.

Figure 1. (a) Typical peaks in ERP; and (b) example of a time-averaged P300 curve for electrode Cz.

The P300 peak is an indicator of brain performance. It is often studied by neurophysiologists, as amplitude changes or a delay in the occurrence of the peak (this delay is named latency: see Figure 1(a)) are signs of memory problems (as in Alzheimer's disease) or an indication that a drug is affecting the brain. We shall focus on the analysis of the P300 peaks extracted from EEG signals to quantify the treatment effect on the brain.

P300 can be considered as an expression of the central nervous system (CNS) activity involved in the processing of new information when attention is solicited to update memory representations.

3.3. Results and limitations of a standard analysis

The classical approach consists in summarizing the P300 data of each subject into one statistic like the amplitude or the time of occurrence of the peak. The corresponding summary data are then treated using analysis of (co)variance models to compare groups after correction for important covariates. In Reference [30], we have used an ANOVA model with random effects [31] to account for subject heterogeneity and to analyse the effects of treatments on the amplitudes and latencies of the P300 peaks. The considered fixed effects correspond to treatment, period of recording and the interaction ‘treatment × period of recording’. We found the same qualitative results for electrodes Fz and Cz with a significant decrease of the mean amplitude and a non-significant increase of the latency of the associated P300 peak under Lorazepam.

That analysis also pointed to a significant evolution of these mean quantities with the time elapsed since treatment administration (= period of recording or, for short, period) but no significant interaction between period and treatment.

The analysis of the coefficients shows that the amplitude (the latency) is lowest (largest) when Lorazepam reaches its peak plasmatic concentration and largest (lowest) at the end of the experiment.

The increase (decrease) of the mean amplitude (latency) begins approximately after the theoretical Lorazepam concentration starts decreasing in the plasma (i.e. 3 h after administration).

The problem with this method is that the amplitude and latency of the P300 peak are difficult to compute because the signal is discrete and noisy.


Here, we propose to use P-splines in a functional ANOVA framework to model the P300 curves under placebo and under Lorazepam. The effect of Lorazepam will be assessed from the whole curve and not only from the amplitude and latency of the P300 peak.
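For contrast with the whole-curve approach, the classical peak summaries of Section 3.3 can be sketched as a search for the largest sample in a time window. The function name, the search window and the simulated curve below are assumptions for illustration; only the 250 Hz sampling rate comes from the study description. With 4 ms between samples the latency can only take values on that grid, and added noise can move the detected peak, which illustrates why the summaries are hard to compute reliably.

```python
import numpy as np


def peak_amplitude_latency(curve, sampling_hz, window_ms):
    """Return (amplitude, latency in ms) of the largest sample inside a search window."""
    start = int(np.floor(window_ms[0] * sampling_hz / 1000.0))
    stop = int(np.ceil(window_ms[1] * sampling_hz / 1000.0)) + 1
    idx = start + int(np.argmax(curve[start:stop]))
    return float(curve[idx]), 1000.0 * idx / sampling_hz


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t_ms = 1000.0 * np.arange(128) / 250.0                     # 4 ms between samples at 250 Hz
    clean = 5.0 * np.exp(-0.5 * ((t_ms - 300.0) / 50.0) ** 2)  # hypothetical P300-like bump
    noisy = clean + rng.standard_normal(128)
    print(peak_amplitude_latency(clean, 250.0, (200.0, 400.0)))
    print(peak_amplitude_latency(noisy, 250.0, (200.0, 400.0)))
```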

4. MODELLING RANDOM PATIENT EFFECTS

As shown in Section 2, each curve could be modelled as the sum of a mean curve and some other effect curves associated to $P$ different factor levels. We shall also allow the functions to vary with the individuals seen as a random factor. We propose the following functional ANOVA model, for each curve $i$:

$$E(Y_i(t)) = \mu_i = f(t) + \sum_{p=1}^{P} b_{ip}\, g_p(t) + h_{d(i)}(t)$$

where $i$ indexes curves, $p$ indicates the factor levels and $d(i)$ indicates the subject associated to curve $i$ ($d(\cdot) = 1, 2, \ldots, \rho$ subjects).

The functions $g_p(\cdot)$ are associated to fixed factor levels; $h_{d(i)}$ corrects the reference functional $f(t)$ to give the reference curve for subject $d(i)$. Semiparametric models for subject-specific curves are available in the literature. For example, Durban et al. [32] suggested individual curves modelled as penalized splines with random intercept. In this paper, the random effect is a curve ($h_{d(i)}$) instead of a random intercept. We adopt linear combinations of B-splines to describe these functionals. The vector of B-spline coefficients associated to $h_{d(i)}$ is assumed to have mean zero as in classical random coefficients models.

Therefore, using the notations introduced in Section 2, we propose the linear mixed model formulation:

$$y = \mathcal{X} b + Z u + \varepsilon$$

with

$$\begin{pmatrix} u \\ \varepsilon \end{pmatrix} \sim N\left[\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} G & 0 \\ 0 & R \end{pmatrix}\right]$$

where $Zu$ corresponds to the individual random curves.

4.1. Estimation of fixed effects

One can show that the marginal distribution of $y$ is

$$y \sim N(\mathcal{X} b, V)$$

where $V = Z G Z^T + R$.

A direct minimization of $(y - \mathcal{X} b)^T V^{-1} (y - \mathcal{X} b)$ to obtain estimates for $b$ is an ill-posed problem because $\mathcal{X}$ is not full rank (see Appendix C for example). A simple solution is to add a penalty $\kappa\, b^T b$ with a small value for $\kappa$ (say $10^{-13}$). Combined with the roughness penalties for each of the constituting curves, we end up with the penalized log-likelihood

$$(y - \mathcal{X} b)^T V^{-1} (y - \mathcal{X} b) + b^T(\Lambda_f \otimes D^T D)\,b + \kappa\, b^T b$$


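where $\Lambda_f = \mathrm{diag}\{\lambda_0, \lambda_1, \ldots, \lambda_P\}$ is the diagonal matrix of the smoothing parameters for the fixed effects curves. The corresponding estimates for $b$ are

$$\hat{b} = (\mathcal{X}^T V^{-1} \mathcal{X} + \Lambda_f \otimes D^T D + \kappa I)^{-1} \mathcal{X}^T V^{-1} y$$

and the associated fitted curves

$$\hat{y} = \mathcal{X}\hat{b} = \mathcal{X}(\mathcal{X}^T V^{-1} \mathcal{X} + \Lambda_f \otimes D^T D + \kappa I)^{-1} \mathcal{X}^T V^{-1} y = H y$$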
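As a hedged illustration of the estimator above, the following Python sketch solves the penalized generalized least-squares system for given variance matrices. The function name and the toy dimensions are assumptions; the matrix passed as `penalty` stands for the Kronecker term $\Lambda_f \otimes D^T D$, and the small ridge value mirrors the one quoted in the text.

```python
import numpy as np


def estimate_fixed_effects(X_big, y, V, penalty, kappa=1e-13):
    """Return b_hat = (X'V^-1 X + penalty + kappa I)^-1 X'V^-1 y."""
    Vinv_X = np.linalg.solve(V, X_big)                  # V^-1 X without forming V^-1
    lhs = X_big.T @ Vinv_X + penalty + kappa * np.eye(X_big.shape[1])
    # V is symmetric, so (V^-1 X)' y equals X' V^-1 y.
    return np.linalg.solve(lhs, Vinv_X.T @ y)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p, q = 30, 4, 3                                  # toy sizes, not those of the study
    X_big = rng.standard_normal((n, p))
    Z = rng.standard_normal((n, q))
    G, R = 0.5 * np.eye(q), 2.0 * np.eye(n)
    V = Z @ G @ Z.T + R                                 # marginal covariance of y
    D = np.diff(np.eye(p), n=2, axis=0)
    y = rng.standard_normal(n)
    print(estimate_fixed_effects(X_big, y, V, penalty=1.0 * D.T @ D))
```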

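4.2. Prediction of random effects

We know that

$$y \mid u \sim N(\mathcal{X} b + Z u, R), \qquad u \sim N(0, G)$$

The corresponding conditional log-likelihood function is

$$-\tfrac{1}{2}\left[\log(|R|) + (y - \mathcal{X} b - Z u)^T R^{-1} (y - \mathcal{X} b - Z u)\right]$$

Combined with the normal density of the random effects $u$, we obtain the joint likelihood

$$-\tfrac{1}{2}\left[\log(|R|) + (y - \mathcal{X} b - Z u)^T R^{-1} (y - \mathcal{X} b - Z u) + \log(|G|) + u^T G^{-1} u\right]$$

from which one can derive the prediction for the random effects:

$$\hat{u} = \left(Z^T \hat{R}^{-1} Z + (\Lambda_r \otimes D^T D) + \kappa I + \hat{G}^{-1}\right)^{-1} Z^T \hat{R}^{-1} (y - \mathcal{X}\hat{b})$$

where $\Lambda_r = \lambda^{(r)} I_{(\rho)}$ and $\lambda^{(r)}$ is the common smoothing parameter for the random subject effect curves.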
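The prediction formula can be sketched in the same way. The function below is a hypothetical illustration with toy matrices; `penalty_r` stands for $\Lambda_r \otimes D^T D$. When the roughness penalty and the ridge term are set to zero, the formula should reduce to the usual best linear unbiased predictor $G Z^T V^{-1}(y - \mathcal{X}\hat{b})$, which gives a convenient sanity check.

```python
import numpy as np


def predict_random_effects(Z, residual, R, G, penalty_r, kappa=1e-13):
    """Return u_hat = (Z'R^-1 Z + penalty_r + kappa I + G^-1)^-1 Z'R^-1 residual.

    `residual` is y minus the fitted fixed part X b_hat.
    """
    Rinv_Z = np.linalg.solve(R, Z)
    lhs = Z.T @ Rinv_Z + penalty_r + kappa * np.eye(Z.shape[1]) + np.linalg.inv(G)
    return np.linalg.solve(lhs, Rinv_Z.T @ residual)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, q = 30, 5                                        # toy sizes, not those of the study
    Z = rng.standard_normal((n, q))
    residual = rng.standard_normal(n)
    R, G = 1.5 * np.eye(n), 0.8 * np.eye(q)
    D = np.diff(np.eye(q), n=2, axis=0)
    print(predict_random_effects(Z, residual, R, G, penalty_r=2.0 * D.T @ D))
```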

4.3. Selection of the penalty parameters

One common measure of ‘goodness of fit’ is the residual sum of squares (RSS). However, since RSS is minimized at the interpolants, minimization of this criterion to select the penalty parameters will lead to the spline fit that is closest to interpolation. For P-spline smoothing, this corresponds to zero for the smoothing parameters. Generalized cross-validation§ gets around this problem. One can select the $P + 2$ smoothing parameters

$$\lambda = (\lambda_0, \lambda_1, \ldots, \lambda_P, \underbrace{\lambda^{(r)}, \ldots, \lambda^{(r)}}_{\rho\ \text{elements}})$$

by minimizing the generalized cross-validation criterion:

$$\mathrm{GCV}(\lambda) = \sum_{i=1}^{I}\sum_{n=1}^{N}\left(\frac{y_i(t_n) - \hat{y}_i(t_n; \lambda)}{n_y - \mathrm{trace}(H(\lambda))}\right)^2$$

§Leave-one-out cross-validation also gets around this problem. However, with a huge quantity of data, the computation of the CV criterion is quite long. To decrease computational time, Ruppert et al. [20] proposed to use GCV instead of CV.


We use the Matlab function fminsearch to minimize $\mathrm{GCV}(\lambda)$. It relies on the simplex search method (see Reference [33]). This is a direct search method that does not use numerical or analytic gradients.

The parameters involved in matrices $G$ and $R$ are estimated using their maximum likelihood estimators, as obtained by a numerical optimization of the log-likelihood where the fixed and random effects are replaced by their conditional maximum likelihood estimators, see Sections 4.1 and 4.2. That conditional log-likelihood depends on the smoothing parameters, and GCV depends on the current estimates of the parameters in the $G$ and $R$ variance–covariance matrices. So, a good practice is to alternate the estimation of the smoothing and of the variance parameters until convergence is attained. That procedure can be time consuming depending on the initial values.
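A minimal sketch of GCV-based selection for a single curve and a single smoothing parameter is given below. It deliberately departs from the paper in two labelled ways: a simple grid search replaces the simplex search, and an identity basis stands in for the B-spline basis so that the example stays short. All function names and simulated values are assumptions.

```python
import numpy as np


def hat_matrix(Phi, D, lam):
    """H(lam) = Phi (Phi'Phi + lam D'D)^-1 Phi' for a single penalized curve."""
    return Phi @ np.linalg.solve(Phi.T @ Phi + lam * D.T @ D, Phi.T)


def gcv(y, H):
    """GCV criterion as written in the text: sum of (residual / (n_y - trace(H)))^2."""
    residual = y - H @ y
    return float(np.sum((residual / (y.size - np.trace(H))) ** 2))


def select_lambda(y, Phi, D, grid):
    """Return the grid value minimizing GCV, together with all scores."""
    scores = np.array([gcv(y, hat_matrix(Phi, D, lam)) for lam in grid])
    return float(grid[int(np.argmin(scores))]), scores


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 100
    t = np.linspace(0.0, 1.0, n)
    y = np.sin(2 * np.pi * t) + 0.3 * rng.standard_normal(n)   # made-up noisy curve
    Phi = np.eye(n)                     # identity basis as a stand-in for a B-spline basis
    D = np.diff(np.eye(n), n=2, axis=0)
    grid = 10.0 ** np.arange(-2, 6)     # strictly positive values, so trace(H) < n
    best, scores = select_lambda(y, Phi, D, grid)
    print(best)
```

With several smoothing parameters a grid becomes impractical, which is presumably why a derivative-free optimizer is used in the paper.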

Algorithm

• Initial conditions: Estimate the parameters in the $G$ and $R$ variance–covariance matrices for smoothing parameters equal to zero. That means that we define $\lambda^{(0)} = 0$ and compute the corresponding $\hat{G}^{(0)}$ and $\hat{R}^{(0)}$. Set $i = 1$.

At iteration $i$,
• Step 1: Find the optimal smoothing parameters $\lambda^{(i)}$ by minimizing GCV for the current values of the variance–covariance matrices $\hat{G}^{(i-1)}$ and $\hat{R}^{(i-1)}$.
• Step 2: Estimate the parameters in the $\hat{G}^{(i)}$ and $\hat{R}^{(i)}$ variance–covariance matrices for given smoothing parameters $\lambda^{(i)}$.

Iterate, incrementing $i$, until convergence.

In the examples that are considered, fewer than 10 iterations were required to reach convergence. See details about computation time for our application in Section 5.2.
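The alternating scheme above can be written as a generic skeleton. In the sketch below the two steps are passed in as callables, because the actual GCV minimization and maximum likelihood step depend on the model; the toy callables in the demonstration are arbitrary contraction mappings chosen only so that the loop has a fixed point, and every name is hypothetical.

```python
import numpy as np


def alternate(select_smoothing, estimate_variance, tol=1e-8, max_iter=50):
    """Alternate Step 1 (smoothing parameters) and Step 2 (variance parameters).

    select_smoothing(variance) and estimate_variance(lam) are user-supplied callables
    standing in for the GCV minimization and the maximum likelihood step.
    """
    lam = 0.0                                    # initial condition: smoothing parameters at zero
    variance = estimate_variance(lam)            # corresponding initial variance estimates
    for iteration in range(1, max_iter + 1):
        new_lam = select_smoothing(variance)             # Step 1
        new_variance = estimate_variance(new_lam)        # Step 2
        converged = np.allclose(new_lam, lam, rtol=tol, atol=tol) and np.allclose(
            new_variance, variance, rtol=tol, atol=tol)
        lam, variance = new_lam, new_variance
        if converged:
            return lam, variance, iteration
    raise RuntimeError("no convergence within max_iter iterations")


if __name__ == "__main__":
    # Toy contraction mappings, purely illustrative (not the paper's GCV or likelihood).
    lam, variance, n_iter = alternate(lambda v: 1.0 + 0.5 * v, lambda l: 2.0 + 0.1 * l)
    print(lam, variance, n_iter)
```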

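4.4. Confidence bands

The distribution for $\hat{y}$ is

$$\hat{y} \sim N(\mathcal{X} b + Z u,\ \mathrm{var}(\mathcal{X}\hat{b}) + \mathrm{var}(Z\hat{u}))$$

with

$$\begin{bmatrix} \mathrm{var}(\mathcal{X}\hat{b}) \\ \mathrm{var}(Z\hat{u}) \end{bmatrix} \approx C\,(C^T \hat{R}^{-1} C + B_G + (\Lambda \otimes D^T D) + \kappa I)^{-1} C^T$$

where $C = [\mathcal{X}, Z]$, $\Lambda = \mathrm{diag}(\lambda_0, \lambda_1, \ldots, \lambda_P, \underbrace{\lambda^{(r)}, \ldots, \lambda^{(r)}}_{\rho\ \text{elements}})$ and

$$B_G = \begin{bmatrix} 0_{(S(P+1))} & 0 \\ 0 & G^{-1} \end{bmatrix}$$

(see Appendix A for details).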


So, a $100(1-\alpha)$ per cent confidence interval for $E(Y_i(t_n))$ is

$$\hat{y}(N(i-1)+n) \pm z(1-\alpha/2)\,\hat{\sigma} \tag{3}$$

where $\hat{\sigma}$ is the standard deviation of $\hat{y}_i(t_n)$ estimated by

$$\hat{\sigma}^2 = c_i^n\,(C^T \hat{R}^{-1} C + B_G + (\Lambda \otimes D^T D) + \kappa I)^{-1} c_i^{nT}$$

where $c_i^n$ is the $(N(i-1)+n)$th row of $C$.

One can also compute confidence bands for each effect curve $g_p$ associated to any level $p$ of a factor: simply use equation (3), substituting for $\hat{\sigma}$ the standard deviation of $\hat{g}_p(t_n)$ estimated by

$$\hat{\sigma} = \sqrt{c_{g_p}^{n}\,(C^T \hat{R}^{-1} C + B_G + (\Lambda \otimes D^T D) + \kappa I)^{-1} c_{g_p}^{nT}}$$

where $c_{g_p}^{n}$ is the $n$th row of the matrix $(e_{(p+1)}^T \otimes \Phi)$ with $e_{(p+1)}$ a column vector of length $(P+1+\rho)$ with 1 at the $(p+1)$th position and zeros elsewhere. Then, a $100(1-\alpha)$ per cent simultaneous confidence band for $y_i(t_n)$ is

$$\hat{y}_i(t_n) \pm m_{1-\alpha}\,\hat{\sigma}$$

where $m_{1-\alpha}$ is the $(1-\alpha)$ quantile of the random variable:

$$\sup_{t_n;\ n=1,\ldots,N}\left|\frac{\hat{y}_i(t_n) - y_i(t_n)}{\hat{\sigma}}\right| \approx \max_{n=1,\ldots,N}\left|\frac{x_i^n[\hat{b} - b]}{\hat{\sigma}}\right| \tag{4}$$

where $x_i^n$ is the $(N(i-1)+n)$th row of $\mathcal{X}$.

That quantile is obtained by simulating $\hat{b} - b$ using:

$$\hat{b} - b \sim N(0, (\mathcal{X}^T V^{-1} \mathcal{X} + \Lambda_f \otimes D^T D + \kappa I)^{-1})$$

in combination with equation (4). This process is repeated a large number of times, say 10 000. These simulated values are sorted from the smallest to the largest, and the one with rank $(1-\alpha)$ times the number of simulations is used to approximate $m_{1-\alpha}$.

Confidence intervals presented here are conditional on the choice of the smoothing parameters. A way to overcome this problem is to use the bootstrap. However, this solution is not possible for many examples because of computational time. Another way to take into account the uncertainty of the smoothing parameters is to use Bayesian methods [20].
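The simulation of the simultaneous quantile can be sketched as follows. This is an illustration under assumptions: the function name is hypothetical, the covariance of $\hat{b} - b$ is passed in directly, and each simulated deviation is standardized by the standard deviation implied by that same covariance, which is a simplification of the $\hat{\sigma}$ used in the text. The polynomial rows in the demonstration are arbitrary stand-ins for rows of the design matrix.

```python
import numpy as np


def simultaneous_quantile(X_rows, cov_b, alpha=0.05, n_sim=10_000, seed=0):
    """Approximate the (1 - alpha) quantile of max_n |x_n (b_hat - b)| / sigma_n."""
    rng = np.random.default_rng(seed)
    # Pointwise standard deviations sigma_n = sqrt(x_n cov_b x_n').
    sigma = np.sqrt(np.einsum("ij,jk,ik->i", X_rows, cov_b, X_rows))
    # Draw b_hat - b ~ N(0, cov_b) and standardize the resulting curve deviations.
    draws = rng.multivariate_normal(np.zeros(cov_b.shape[0]), cov_b, size=n_sim)
    sup_stat = np.abs(draws @ X_rows.T / sigma).max(axis=1)
    return float(np.quantile(sup_stat, 1.0 - alpha))


if __name__ == "__main__":
    t = np.linspace(0.0, 1.0, 50)
    X_rows = np.vander(t, 5)            # arbitrary smooth design rows for illustration
    m = simultaneous_quantile(X_rows, 0.1 * np.eye(5))
    print(m)                            # expected to exceed the pointwise normal quantile
```

Because the maximum is taken over all time points, the simulated quantile is at least as large as the pointwise normal quantile, so simultaneous bands are wider than pointwise ones.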

4.5. Model selection

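One can select between two competing models using an F-test:

$$F_{\mathrm{obs}} = \frac{(\mathrm{RSS}_1 - \mathrm{RSS}_2)/(\nu_2 - \nu_1)}{\mathrm{RSS}_2/(N - \nu_2)} \sim F_{\nu_2-\nu_1,\ N-\nu_2}$$

where RSS indicates the residual sum of squares, $N$ is the number of observations and

$$\nu_j = 2\,\mathrm{trace}(H_j) - \mathrm{trace}(H_j H_j^T) \quad (j = 1, 2)$$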


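The corresponding p-value is defined by

$$p\text{-value} = 1 - P(X \le F_{\mathrm{obs}})$$

where $X \sim F_{\nu_2-\nu_1,\ N-\nu_2}$.

This can be improved by applying a correction to both the numerator and the denominator of $F_{\mathrm{obs}}$ (see Reference [21]):

$$F_{\mathrm{obs}} = \frac{(\mathrm{RSS}_1 - \mathrm{RSS}_2)/\nu_1}{\mathrm{RSS}_2/\delta_1} \sim F_{\nu_1^2/\nu_2,\ \delta_1^2/\delta_2}$$

where, in this corrected test, $\nu_1$ and $\nu_2$ are redefined as follows:

$$R_j = (I - H_j)^T (I - H_j) \quad (j = 1, 2)$$

$$\delta_1 = \mathrm{trace}(R_2)$$

$$\delta_2 = \mathrm{trace}(R_2^2)$$

$$\nu_1 = \mathrm{trace}(R_1 - R_2)$$

$$\nu_2 = \mathrm{trace}((R_1 - R_2)^2)$$

Notice that if the two corrections $\nu_1/\nu_2$ and $\delta_1/\delta_2$ are equal to one, then we recover the first approximation given above. These corrections are difficult to compute, but can be approximated (see Appendix B).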
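The first (uncorrected) F statistic can be sketched directly from hat matrices. In the illustration below, unpenalized least-squares fits are used as toy smoothers because their effective degrees of freedom are known (the number of columns), which makes the formula easy to check; the function names and simulated data are assumptions, and the p-value step is omitted since it requires an F distribution function.

```python
import numpy as np


def hat_ols(X):
    """Hat matrix of an unpenalized least-squares fit (used here only as a toy smoother)."""
    return X @ np.linalg.solve(X.T @ X, X.T)


def effective_df(H):
    """Effective degrees of freedom: 2 trace(H) - trace(H H')."""
    return float(2.0 * np.trace(H) - np.trace(H @ H.T))


def f_statistic(y, H1, H2):
    """F_obs for nested smoothers H1 (smaller model) and H2 (larger model)."""
    rss1 = float(np.sum((y - H1 @ y) ** 2))
    rss2 = float(np.sum((y - H2 @ y) ** 2))
    nu1, nu2 = effective_df(H1), effective_df(H2)
    f_obs = ((rss1 - rss2) / (nu2 - nu1)) / (rss2 / (y.size - nu2))
    return f_obs, nu2 - nu1, y.size - nu2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 50
    t = np.linspace(0.0, 1.0, n)
    y = 1.0 + 2.0 * t + 0.5 * rng.standard_normal(n)     # made-up data with a linear trend
    H1 = hat_ols(np.ones((n, 1)))                        # constant model
    H2 = hat_ols(np.column_stack([np.ones(n), t]))       # constant plus linear term
    print(f_statistic(y, H1, H2))
```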

4.6. Computational details

The sizes of the data vector $y$ and of the matrices generated to implement the model (matrices $\mathcal{X}$ and $Z$) are huge in EEG analysis. Indeed, we have $I$ curves to model ($I = 360$ in our application), each with $N$ time points ($N = 128$ in our application). That means that the vector $y$ is of length $n_y = I \times N$. Matrices $\mathcal{X}$ and $Z$ have the same number of rows. In addition, we have $P$ factor levels and $\rho$ subjects. That means that we obtain a design matrix $\mathcal{X}$ with columns containing $P + 1$ blocks of B-spline bases with $S$ elements for each time point $t_n$, $n = 1, \ldots, n_y$, and a design matrix $Z$ with $\rho \times S$ columns. Consequently, we decided to use Matlab to be able to deal with large matrices. Matlab provides functions to manipulate sparse matrices like $\mathcal{X}$ and $Z$ using less memory, a decisive element to make the implementation of the above presented models feasible here. Moreover, Eilers [22] presented formulas to exploit the sparse structure of the equations.

For the generalized cross-validation criterion (see Section 4.3) and the degrees of freedom in the model selection (see Section 4.5), we have to compute the trace of the matrix:

$$H = C\,(C^T \hat{R}^{-1} C + B_G + (\Lambda \otimes D^T D) + \kappa I)^{-1} C^T \hat{R}^{-1}$$

such that

$$C\begin{bmatrix} \hat{b} \\ \hat{u} \end{bmatrix} = C\,(C^T \hat{R}^{-1} C + B_G + (\Lambda \otimes D^T D) + \kappa I)^{-1} C^T \hat{R}^{-1} y = H y$$


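As $\mathrm{trace}(AB) = \mathrm{trace}(BA)$ (for conformable matrices), it is computationally advantageous to use:

$$\begin{aligned}\mathrm{trace}(H) &= \mathrm{trace}(C\,(C^T \hat{R}^{-1} C + B_G + (\Lambda \otimes D^T D) + \kappa I)^{-1} C^T \hat{R}^{-1}) \\ &= \mathrm{trace}((C^T \hat{R}^{-1} C + B_G + (\Lambda \otimes D^T D) + \kappa I)^{-1} C^T \hat{R}^{-1} C)\end{aligned}$$

The latter expression involves a matrix with smaller dimensions than $H$ since matrix $C$ has more rows ($n_y$ = length of $y$ = $N \times I$) than columns ($(P+1+\rho)$ blocks × the number of elements in the B-spline basis, i.e. $(P+1+\rho)S$). Thus $C$ is an $(n_y \times (P+1+\rho)S)$ matrix yielding an $H$ matrix of dimension $(n_y \times n_y)$ whereas $(C^T \hat{R}^{-1} C + B_G + (\Lambda \otimes D^T D) + \kappa I)^{-1} C^T \hat{R}^{-1} C$ is only $((P+1+\rho)S \times (P+1+\rho)S)$ with $(P+1+\rho)S \ll n_y$.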
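The saving from the trace identity can be checked numerically on toy matrices. In the sketch below, `penalty` stands for the sum $B_G + (\Lambda \otimes D^T D) + \kappa I$, and the sizes are deliberately tiny compared with the $n_y = 360 \times 128$ rows of the application; the function names are hypothetical.

```python
import numpy as np


def trace_hat_direct(C, R_inv, penalty):
    """trace(H) with H = C (C'R^-1 C + penalty)^-1 C'R^-1 formed explicitly (n_y x n_y)."""
    M = C.T @ R_inv @ C + penalty
    H = C @ np.linalg.solve(M, C.T @ R_inv)
    return float(np.trace(H))


def trace_hat_small(C, R_inv, penalty):
    """Same trace via trace(AB) = trace(BA); only a small square matrix is formed."""
    CtRinvC = C.T @ R_inv @ C
    return float(np.trace(np.linalg.solve(CtRinvC + penalty, CtRinvC)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_y, n_col = 400, 12                                 # toy sizes with n_col << n_y
    C = rng.standard_normal((n_y, n_col))
    R_inv = np.diag(1.0 / rng.uniform(0.5, 2.0, n_y))    # inverse of a diagonal R
    penalty = 0.3 * np.eye(n_col)
    print(trace_hat_direct(C, R_inv, penalty), trace_hat_small(C, R_inv, penalty))
```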

5. APPLICATION TO EVENT-RELATED POTENTIALS MODELLING

5.1. Practical issues and model selection

We have $\rho = 15$ subjects receiving two treatments (placebo and Lorazepam) with recordings at 12 periods. We obtained $I = 2 \times 15 \times 12 = 360$ observed curves at $N = 128$ time points. Thus, the length of $y$ is $n_y = 46080$. We have $S = 34$ elements in the B-spline basis of degree 3 corresponding to 32 knots (= 1/4 of the 128 time points).

In our event-related potentials analysis, we have two fixed factors: treatment and period of recording after the treatment administration. With the placebo and the first period as reference, the corresponding model is

$$E(Y_i(t)) = \mu_i = f(t) + \sum_{p=1}^{12} b_{ip}\, g_p(t) + h_{d(i)}(t)$$

where $i = 1, \ldots, 360$ (observed curves), $d(\cdot) = 1, \ldots, \rho$ (15 subjects). $g_1$ is associated to the Lorazepam effect and $g_2, g_3, \ldots, g_{12}$ are, respectively, associated to period $2, 3, \ldots, 12$. That means that the ‘mean curve’ $f(t)$ corresponds to the curve under placebo recorded at period 1 (for a ‘typical’ subject). The sum of the curves $f(t) + g_1(t)$ corresponds to the curve under Lorazepam recorded at period 1. To obtain the estimated curves at period $p$, we must still add the effect curve $g_p(t)$. Testing the above model, we saw that the period effect is not significant. Moreover, as the drug concentration in the brain is evolving over time (i.e. with period), we also expect a treatment–period interaction. This treatment–period interaction is significant in spite of the non-significant period effect because there is an evolution over time under Lorazepam only (there is no significant period effect under placebo).

The three first periods (recorded before treatment administration) correspond to the same level of concentration of the drug. The nine following periods correspond to nine other levels of concentration of the drug in the brain. To facilitate the interpretation of the results, we shall use the concentration of Lorazepam as covariate instead of period in the above model. With the placebo and the first level of concentration of Lorazepam as reference, the model becomes:

$$E(Y_i(t)) = \mu_i = f(t) + \sum_{p=1}^{11} b_{ip}\, g_p(t) + h_{d(i)}(t)$$


where $g_1$ is associated with the Lorazepam effect and $g_2, g_3, \ldots, g_{11}$ are, respectively, associated with the 10 different levels of concentration of Lorazepam. So, we have $P = 1 + 10 = 11$ factor levels. The 'mean curve' $f(t)$ corresponds to the curve under placebo recorded at each period. The curve $g_1(t)$ corresponds to the effect of Lorazepam. To obtain the estimated curves for level $p$ of concentration of Lorazepam, we must still add the effect curve $g_{p+1}(t)$. See Appendix C for construction of the design matrices.

The following block-diagonal forms are taken for $G$ and $R$:

$$G = (I_{(\Delta)} \otimes G_0)$$

where

$$G_0 = \sigma_u^2 I_{(S)} = \sigma_u^2 I_{(34)}$$

and

$$R = (I_{(I)} \otimes R_0) = (I_{(360)} \otimes R_0)$$

where $R_0$ has an 'AR(1)' structure with elements

$$r_{ij} = \sigma_\varepsilon^2\, \rho^{|j-i|}$$

This structure allows for correlation between measurements coming from the same curve.

Finally, we decided to use three smoothing parameters for the fixed effects and one smoothing parameter for the random subject effect (see Reference [34]). For the fixed effects, $\Lambda_f$ is defined by

$$\Lambda_f = \mathrm{diag}\{\lambda_0, \lambda_1, \underbrace{\lambda_2, \ldots, \lambda_2}_{10\ \text{elements}}\}$$

with the three smoothing parameters $\lambda_0$, $\lambda_1$ and $\lambda_2$ used to add a penalty to the 'mean curve', to the effect curve corresponding to the effect of Lorazepam and to the effect curves for the levels of concentration of Lorazepam, respectively. We obtain:

$$(\Lambda_f \otimes D^T D) = \begin{bmatrix} \lambda_0 D^T D & 0 & 0 \\ 0 & \lambda_1 D^T D & 0 \\ 0 & 0 & \lambda_2 (I_{(10)} \otimes D^T D) \end{bmatrix}$$

For the random subject effect, $\Lambda_r$ is defined by

$$\Lambda_r = \mathrm{diag}\{\underbrace{\lambda^{(r)}, \lambda^{(r)}, \ldots, \lambda^{(r)}}_{\Delta\ \text{elements}}\}$$

with the same smoothing parameter $\lambda^{(r)}$ for all the $\Delta = 15$ subjects. We have

$$(\Lambda_r \otimes D^T D) = \lambda^{(r)} (I_{(\Delta = 15)} \otimes D^T D)$$
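To make the error covariance structure described above concrete, the following plain-Python sketch builds one AR(1) block with elements equal to a variance times a correlation raised to the power $|j - i|$, and then repeats it along the diagonal, which is what the Kronecker product with an identity matrix does. The function names `ar1_block` and `block_diag_repeat`, the block size 4, and the values 2.0 and 0.5 are illustrative assumptions, not the sizes or estimates of the paper (where each block is 128 x 128 and there are 360 blocks).

```python
def ar1_block(n, sigma2, rho):
    """n x n AR(1) covariance block: element (i, j) is sigma2 * rho**|j - i|."""
    return [[sigma2 * rho ** abs(j - i) for j in range(n)] for i in range(n)]


def block_diag_repeat(block, copies):
    """Dense version of kron(identity(copies), block)."""
    n = len(block)
    size = n * copies
    out = [[0.0] * size for _ in range(size)]
    for c in range(copies):
        for i in range(n):
            for j in range(n):
                out[c * n + i][c * n + j] = block[i][j]
    return out


# Toy sizes: 3 curves of 4 time points each (assumed values for illustration).
R0 = ar1_block(4, sigma2=2.0, rho=0.5)
R = block_diag_repeat(R0, copies=3)

# Measurements within a curve are correlated; different curves are not.
print(R[0][1], R[0][4])
```

In practice such a matrix would be stored in sparse form, since all entries outside the diagonal blocks are zero.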


5.2. Computation time and initial conditions

The problem with our application is the huge quantity of data to manage. Indeed, computation time increases with the size of the matrices involved. We took initial conditions (that is, initial values of the smoothing parameters $\lambda$ and of the parameters in the $G$ and $R$ variance–covariance matrices) equal to zero. Fewer than 10 iterations of the algorithm presented in Section 4.3 were required to reach convergence. The determination of the parameters takes less than 2 h. This leads to quite acceptable computation times for moderately sized data sets. Moreover, the choice of better initial conditions can probably decrease computation time.

To verify the efficacy of the off-the-shelf optimizer provided by Matlab, we tested different initial conditions. We computed GCV (or, respectively, the likelihood) on a grid. We tested 10 values of each parameter, that is, $10^4$ (or, respectively, $10^3$) combinations of parameters. This makes it possible to test different initial values for the optimization with the Matlab function fminsearch. For our example, the determination of the parameters takes approximately 24 h and provides the same values as before using zero initial values. This supports the accuracy of the optimizer.
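The grid check described above can be sketched as follows. This is only an illustration in plain Python under stated assumptions: `toy_criterion` is a hypothetical stand-in for the GCV criterion (the real criterion requires the full model fit), and the grid values are arbitrary. It shows how 10 values for each of 4 parameters give $10^4$ combinations, each of which could serve as a starting point for the optimizer.

```python
from itertools import product


def toy_criterion(params):
    """Hypothetical smooth criterion with a single minimum at (1, 2, 3, 4)."""
    target = (1.0, 2.0, 3.0, 4.0)
    return sum((p - t) ** 2 for p, t in zip(params, target))


# 10 candidate values per parameter (assumed grid: 0.0, 0.5, ..., 4.5).
grid_values = [0.5 * k for k in range(10)]

# All combinations for 4 parameters: 10**4 grid points.
grid = list(product(grid_values, repeat=4))

# Evaluate the criterion on the grid and keep the best point.
best = min(grid, key=toy_criterion)
print(len(grid), best)
```

If an optimizer started from several of these grid points returns the same solution as the run started from zero, this gives some reassurance that the solution is not a local artefact of the starting values.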

5.3. Results

We model the P300 curve for the frontal electrode Fz for the 15 subjects under placebo and under Lorazepam during the 12 periods of recordings. Using the model in equation (1), we obtained the following estimated values of the parameters:

$$\sigma_u^2 = 1.5125, \qquad \rho = 0.9677, \qquad \sigma_\varepsilon^2 = 4.3307$$

5.3.1. Significant Lorazepam effect. For the above model, we tested the necessity to have a correction curve for treatment effect using the test presented in Section 4.5 and we obtained a correction factor $\nu_1/\nu_2$ higher than 1. That means that the uncorrected 95 per cent F-test has a higher significance level than the desired one. The sum of squared differences between groups equals 16 875 with 42.55 degrees of freedom and the sum of squared differences within groups equals 214 500 with 42 640 degrees of freedom. We obtained a value of $F_{obs}$ equal to 78.83, which is much larger than the 95 per cent quantile 1.33. This corresponds to a p-value smaller than 0.001, suggesting that the Lorazepam effect is significant. The correction factors $\nu_1/\nu_2$ and $\delta_1/\delta_2$ equal 1.3128 and 0.8774, indicating that the correct significance level is 96 per cent. All the included effects were found to be highly significant.
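As a quick arithmetic check of the reported statistic, the sketch below recomputes the observed F value as a ratio of mean squares from the sums of squares and degrees of freedom quoted above. It assumes the usual form of the statistic (between-groups mean square divided by within-groups mean square); the variable names are illustrative.

```python
# Values quoted in the text.
ss_between, df_between = 16875.0, 42.55
ss_within, df_within = 214500.0, 42640.0

# Ratio of mean squares (assumed form of the observed F statistic).
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_obs = ms_between / ms_within

print(round(f_obs, 2))
```

The result agrees with the reported 78.83 up to rounding of the inputs.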

5.3.2. Simultaneous confidence bands. For example, we obtained $m_{0.95} \approx 2.7001$ for the mean curve of the above model. That means that the simultaneous confidence bands are $2.7001/1.96 = 1.38$ times wider than the pointwise intervals. The $m_{0.95}$ quantile can be approximated very accurately. Based on 100 independent simulations of 1000 draws each, we obtained the results summarized for some effect curves in Table I.

Values of $m_{0.95}$ are similar for other time effect curves and other subject effect curves. For each fitted curve $\hat{y}$, $m_{0.95}$ is always approximately 2.82. That means that the simultaneous confidence bands are $2.82/1.96 = 1.44$ times wider than the pointwise intervals.
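The width ratios quoted in this subsection and in Table I follow directly from dividing each mean value of $m_{0.95}$ by the pointwise normal quantile 1.96. A small sketch of that arithmetic, using the values reported in Table I (the dictionary and variable names are illustrative):

```python
POINTWISE_QUANTILE = 1.96  # 97.5 per cent standard normal quantile

# Mean of m_0.95 for some effect curves, as reported in Table I.
mean_m95 = {
    "Mean": 2.7001,
    "Treatment": 2.7083,
    "Period 6": 2.7292,
    "Period 7": 2.7178,
    "Subject 1": 2.7763,
}

# Ratio between simultaneous and pointwise band widths, rounded to 2 decimals.
ratios = {name: round(m / POINTWISE_QUANTILE, 2) for name, m in mean_m95.items()}
print(ratios)
```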

5.3.3. Estimated effect curves. We obtain the results summarized in Figure 2. The curve in Figure 2(a) is the mean curve under placebo, as the non-active treatment was taken to be the reference. We retrieve the specific pattern of an ERP curve with three characteristic positive


Table I. Mean of the estimates of m0.95 for some of the effect curves, standard deviation and approximate ratio between simultaneous and pointwise confidence bands.

Effect       Mean of m0.95   Standard deviation of m0.95   Mean ratio: m0.95/1.96
Mean         2.7001          0.0487                        2.7001/1.96 = 1.38
Treatment    2.7083          0.0388                        2.7083/1.96 = 1.38
Period 6     2.7292          0.0429                        2.7292/1.96 = 1.39
Period 7     2.7178          0.0504                        2.7178/1.96 = 1.39
Subject 1    2.7763          0.0360                        2.7763/1.96 = 1.42

[Figure 2: three panels (a), (b) and (c); horizontal axis: time (sec), from 0 to 0.5; vertical axis from −6 to 6.]

Figure 2. (a) Mean curve under placebo (solid line), pointwise confidence bands (light grey) and simultaneous confidence bands (dark grey); (b) curve representing treatment effect (solid line), pointwise confidence bands (light grey) and simultaneous confidence bands (dark grey); and (c) mean curve under treatment (solid line), pointwise confidence bands (light grey) and simultaneous confidence bands (dark grey). We must still add the concentration effect.

peaks (P100, P200 and P300). The treatment effect is modelled by the curve in Figure 2(b). So, the mean curve under (the active) Lorazepam in Figure 2(c) is the sum of the mean curve in Figure 2(a) and the curve representing the treatment effect in Figure 2(b). This mean curve under Lorazepam is the average of the curves under Lorazepam for all the concentrations. We observe a significant decrease of the amplitude of the P300 peak and an increase of the P200 peak under Lorazepam. To obtain the mean curve under Lorazepam for a period, we must still add the concentration effect curve corresponding to the level of concentration for this period (see Figure 3 for an illustration): the decrease of the amplitude of the P300


[Figure 3: three panels (a), (b) and (c); horizontal axis: time (sec), from 0 to 0.5; vertical axis from −6 to 6.]

Figure 3. (a) Mean curve under treatment; (b) concentration effect curves for level 4 (corresponding to the concentration of Lorazepam for period 6, i.e. 2.5 h after Lorazepam administration, thin solid line) and level 5 (corresponding to the concentration of Lorazepam for period 7, i.e. 3.5 h after Lorazepam administration, bold solid line) and the corresponding pointwise confidence bands; and (c) the sum of the mean curve in (a) and the concentration effect curves in (b) gives the mean curve under Lorazepam for concentration level 4 (thin solid line) and 5 (bold solid line) and the pointwise confidence bands (dark grey for level 4 and light grey for level 5): the P300 peak almost disappears.

peak under Lorazepam is most important for level 5 (corresponding to the concentration of Lorazepam for period 7, i.e. 3.5 h after Lorazepam administration).

The last recording (period 12, level of concentration 10) is performed 25.5 h after the Lorazepam administration. At that time, the concentration of Lorazepam (level 10) in the plasma is almost the same as during periods 1, 2 and 3 (level of concentration 1). This is confirmed in Figure 4, which shows very close concentration effects for levels 1 and 10 (corresponding to periods 1, 2, 3 and 12).

Figure 3 suggests that Lorazepam decreases the amplitude of the P300 peak but also increases its latency. This effect on latency was not visible on the treatment effect curve because the decrease of amplitude on this curve corresponds exactly to the time of the P300 peak on the mean curve. This effect of Lorazepam on latency requires further analysis because it is not included in the treatment main effect curve, but is only visible in the period by treatment effect curve.

5.4. Diagnostics

One can check the assumptions underlying our model. The Q–Q plot in Figure 5(a) suggests a slight deviation from normality. Plots of the residuals against the predicted values do not show major abnormalities associated with any factor level: there is no evidence of non-homogeneity of variance. Figure 5(b) shows some examples of these plots. There is no structure associated with any factor level.


[Figure 4: three panels (a), (b) and (c); horizontal axis: time (sec), from 0 to 0.5; vertical axis from −6 to 6.]

Figure 4. (a) Mean curve under treatment; (b) concentration effect curves for level 1 (corresponding to the concentration of Lorazepam for periods 1, 2 and 3, i.e. before Lorazepam administration, thin solid line) and level 10 (corresponding to the concentration of Lorazepam for period 12, i.e. 25.5 h after Lorazepam administration, bold solid line) and the corresponding pointwise confidence bands; and (c) the sum of the mean curve in (a) and the concentration effect curves in (b) gives the mean curve under Lorazepam for concentration level 1 (thin solid line) and 10 (bold solid line) and the pointwise confidence bands (dark grey for level 1 and light grey for level 10): the curve is very similar to the mean curve under placebo (pointwise confidence interval in black).

[Figure 5: panel (a) 'QQ Plot of Sample Data versus Standard Normal', horizontal axis: Standard Normal Quantiles, vertical axis: Quantiles of Input Sample; panel (b) four plots with horizontal axis: fitted, vertical axis: residuals.]

Figure 5. (a) Q–Q plot of the residuals: slight deviation from normality; and (b) plots of the residuals against predicted values for subjects 1, 8, 10 and 12: no evidence of non-homogeneity of variance.


Moreover, we wondered whether there are outlier curves. Considering a curve containing an outlier point as an outlier curve would probably be too restrictive. Adapting outlier detection criteria to the functional ANOVA method would require supplementary developments. However, we found (using leave-one-out) that no single data curve had an important effect on the fitted curves: the fitted curves were hardly affected.

5.5. Interpretation and discussion

The reduction of the amplitude of the P300 peak under Lorazepam is not surprising. Indeed, the amplitude of the P300 peak is related to the process of revision of the representations kept in working memory. So, the amplitude decreases when there is less updating of the working memory (see Reference [35]). In that sense, Lorazepam would bring about cognitive impairment. This result was already confirmed in another analysis of these data where the amplitude and the latency of the peak were directly modelled [30].

We can conclude that functional ANOVA with P-splines is a powerful tool to model curves. Indeed, the model can be easily and clearly stated. In the field of EEG analysis, it provides information about the modification of the shape of the peak and not only about some specific characteristics, like the amplitude or latency of one particular peak, that are difficult to identify from noisy EEG signals. Moreover, as mentioned in Reference [36], P-spline regression presents the advantage that all the parameters can be estimated simultaneously through penalized likelihood, contrary to methods like the backfitting algorithm, which is time consuming [37].

Functional ANOVA using splines is already used to model curves with individual behaviour. For example, Durban et al. [32] suggested individual curves modelled as penalized splines with random intercept. The originality of our work is that the random effect is a curve instead of a random intercept.

We are currently working on improving the selection of the smoothing parameters. Bayesian estimation of $\lambda$ using linear mixed models (see Reference [38]) is probably the solution to take into account efficiently the uncertainty of the smoothing parameters in the computation of the confidence intervals. These results will be reported elsewhere.

APPENDIX A: VARIANCE OF THE FIXED AND RANDOM EFFECTS

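$$\mathrm{var}(X\hat{b}) = X\, \mathrm{var}(\hat{b})\, X^T$$

Substituting the inverse Fisher information matrix for $\mathrm{var}(\hat{b})$, one gets

$$\mathrm{var}(X\hat{b}) \approx X\,\bigl(X^T V^{-1} X + (\Lambda_f \otimes D^T D) + \kappa I\bigr)^{-1} X^T$$

Using similar arguments,

$$\mathrm{var}(Z\hat{u}) = Z\, \mathrm{var}(\hat{u})\, Z^T \approx Z\,\bigl(Z^T R^{-1} Z + (\Lambda_r \otimes D^T D) + \kappa I + G^{-1}\bigr)^{-1} Z^T$$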


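Finally,

$$\begin{bmatrix} \mathrm{var}(X\hat{b}) \\ \mathrm{var}(Z\hat{u}) \end{bmatrix} \approx C\,\bigl(C^T R^{-1} C + B_G + (\Lambda \otimes D^T D) + \kappa I\bigr)^{-1} C^T$$

where $C = [X, Z]$, $\Lambda = \mathrm{diag}(\lambda_0, \lambda_1, \ldots, \lambda_P, \underbrace{\lambda^{(r)}, \ldots, \lambda^{(r)}}_{\Delta\ \text{elements}})$ and

$$B_G = \begin{bmatrix} 0_{(S(P+1))} & 0 \\ 0 & G^{-1} \end{bmatrix}$$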

APPENDIX B: COMPUTATION OF THE CORRECTION FACTORS IN F-TEST

Hastie and Tibshirani [21] showed using bootstrap that one can estimate $\mathrm{trace}(HH^T)$ by $0.75\,\mathrm{trace}(H) + 0.5$ for a symmetric matrix $H$. Thus, we have:

$$\delta_1 = N - 2\,\mathrm{trace}(H_2) + \mathrm{trace}(H_2 H_2^T)$$

$$\delta_2 = N - 4\,\mathrm{trace}(H_2) + 6\,\mathrm{trace}(H_2 H_2^T) - 4\,\mathrm{trace}(H_2 H_2^T H_2) + \mathrm{trace}(H_2 H_2^T H_2 H_2^T)$$

$$\nu_1 = N - 2\,\mathrm{trace}(H_1) + \mathrm{trace}(H_1 H_1^T) - \delta_1$$

$$\nu_2 = 0.75\,\nu_1 + 0.5$$

where

$$\mathrm{trace}(H_i H_i^T) = 0.75\,\mathrm{trace}(H_i) + 0.5$$

$$\mathrm{trace}(H_i H_i^T H_i H_i^T) = 0.75\,\mathrm{trace}(H_i H_i^T) + 0.5$$

Using bootstrap, we obtained the following additional approximation:

$$\mathrm{trace}(H_i H_i^T H_i) \approx -0.0043\,\mathrm{trace}(H_i H_i^T)^2 + 0.9157\,\mathrm{trace}(H_i H_i^T) - 0.0002$$

thereby showing that only $\mathrm{trace}(H_i)$ must be calculated to obtain the correction factors.
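The chain of approximations in this appendix can be written as a short function that takes only $N$ and the two traces $\mathrm{trace}(H_1)$ and $\mathrm{trace}(H_2)$ as inputs. The sketch below is in plain Python; the function names and the names of the returned quantities (`first_h2`, `second_h2`, `first_h1`, `second_h1`, listed in the order in which the four quantities are defined above) are hypothetical, and the input values in the usage line are made up for illustration.

```python
def approx_trace_hht(trace_h):
    """Approximation trace(H H^T) = 0.75 * trace(H) + 0.5."""
    return 0.75 * trace_h + 0.5


def approx_trace_hhth(trace_hht):
    """Approximation of trace(H H^T H) from trace(H H^T)."""
    return -0.0043 * trace_hht ** 2 + 0.9157 * trace_hht - 0.0002


def correction_terms(n, trace_h1, trace_h2):
    """Return the four quantities of Appendix B, in their order of definition."""
    hht1 = approx_trace_hht(trace_h1)
    hht2 = approx_trace_hht(trace_h2)
    hhth2 = approx_trace_hhth(hht2)
    hhthht2 = 0.75 * hht2 + 0.5  # approximation of trace(H2 H2^T H2 H2^T)

    first_h2 = n - 2 * trace_h2 + hht2
    second_h2 = n - 4 * trace_h2 + 6 * hht2 - 4 * hhth2 + hhthht2
    first_h1 = n - 2 * trace_h1 + hht1 - first_h2
    second_h1 = 0.75 * first_h1 + 0.5
    return first_h2, second_h2, first_h1, second_h1


# Made-up inputs, for illustration only.
terms = correction_terms(n=100, trace_h1=5.0, trace_h2=10.0)
print(terms)
```

Since every term is a function of the two traces, the cost of the correction is dominated by the trace computation itself.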


APPENDIX C: CONSTRUCTION OF THE DESIGN MATRICES FOR THE ERP EXAMPLE

We shall now show how to construct the design matrices $X$ and $Z$. Consider, for example, only two subjects ($\Delta = 2$ instead of $\Delta = 15$), two treatments (placebo, as reference, and Lorazepam) and three levels of concentration of Lorazepam (instead of 10). We get for equation (1) in matrix notation:

$$\mu = \left( \begin{bmatrix}
1 & 0 & 1 & 0 & 0 \\
1 & 0 & 0 & 1 & 0 \\
1 & 0 & 0 & 0 & 1 \\
1 & 1 & 1 & 0 & 0 \\
1 & 1 & 0 & 1 & 0 \\
1 & 1 & 0 & 0 & 1 \\
1 & 0 & 1 & 0 & 0 \\
1 & 0 & 0 & 1 & 0 \\
1 & 0 & 0 & 0 & 1 \\
1 & 1 & 1 & 0 & 0 \\
1 & 1 & 0 & 1 & 0 \\
1 & 1 & 0 & 0 & 1
\end{bmatrix} \otimes \Phi \right) \mathrm{vec}(B^T)
+ \left( \begin{bmatrix}
1 & 0 \\
1 & 0 \\
1 & 0 \\
1 & 0 \\
1 & 0 \\
1 & 0 \\
0 & 1 \\
0 & 1 \\
0 & 1 \\
0 & 1 \\
0 & 1 \\
0 & 1
\end{bmatrix} \otimes \Phi \right) \mathrm{vec}(U^T)$$

$$= \begin{bmatrix}
\Phi & 0 & \Phi & 0 & 0 \\
\Phi & 0 & 0 & \Phi & 0 \\
\Phi & 0 & 0 & 0 & \Phi \\
\Phi & \Phi & \Phi & 0 & 0 \\
\Phi & \Phi & 0 & \Phi & 0 \\
\Phi & \Phi & 0 & 0 & \Phi \\
\Phi & 0 & \Phi & 0 & 0 \\
\Phi & 0 & 0 & \Phi & 0 \\
\Phi & 0 & 0 & 0 & \Phi \\
\Phi & \Phi & \Phi & 0 & 0 \\
\Phi & \Phi & 0 & \Phi & 0 \\
\Phi & \Phi & 0 & 0 & \Phi
\end{bmatrix}
\begin{bmatrix}
\theta^{(\mathrm{mean})} \\
\theta^{(\mathrm{Lorazepam})} \\
\theta^{(\mathrm{level}=1)} \\
\theta^{(\mathrm{level}=2)} \\
\theta^{(\mathrm{level}=3)}
\end{bmatrix}
+ \begin{bmatrix}
\Phi & 0 \\
\Phi & 0 \\
\Phi & 0 \\
\Phi & 0 \\
\Phi & 0 \\
\Phi & 0 \\
0 & \Phi \\
0 & \Phi \\
0 & \Phi \\
0 & \Phi \\
0 & \Phi \\
0 & \Phi
\end{bmatrix}
\begin{bmatrix}
u^{(\mathrm{subject}=1)} \\
u^{(\mathrm{subject}=2)}
\end{bmatrix}$$

$$= X b + Z u$$


where each block $\Phi$ denotes the matrix of B-spline bases (with $N = 128$ rows and $S = 34$ columns):

$$\Phi = \begin{bmatrix}
\phi_1(t_1) & \phi_2(t_1) & \cdots & \phi_{34}(t_1) \\
\phi_1(t_2) & \phi_2(t_2) & \cdots & \phi_{34}(t_2) \\
\vdots & \vdots & & \vdots \\
\phi_1(t_{128}) & \phi_2(t_{128}) & \cdots & \phi_{34}(t_{128})
\end{bmatrix}$$

and

$$B = \begin{bmatrix}
\theta^{(\mathrm{mean})T} \\
\theta^{(\mathrm{Lorazepam})T} \\
\theta^{(\mathrm{level}=1)T} \\
\theta^{(\mathrm{level}=2)T} \\
\theta^{(\mathrm{level}=3)T}
\end{bmatrix}
= \begin{bmatrix}
\theta^{(\mathrm{mean})}_1 & \cdots & \theta^{(\mathrm{mean})}_{34} \\
\theta^{(\mathrm{Lorazepam})}_1 & \cdots & \theta^{(\mathrm{Lorazepam})}_{34} \\
\theta^{(\mathrm{level}=1)}_1 & \cdots & \theta^{(\mathrm{level}=1)}_{34} \\
\theta^{(\mathrm{level}=2)}_1 & \cdots & \theta^{(\mathrm{level}=2)}_{34} \\
\theta^{(\mathrm{level}=3)}_1 & \cdots & \theta^{(\mathrm{level}=3)}_{34}
\end{bmatrix}$$

$$U = \begin{bmatrix}
u^{(\mathrm{subject}=1)T} \\
u^{(\mathrm{subject}=2)T}
\end{bmatrix}
= \begin{bmatrix}
u^{(\mathrm{subject}=1)}_1 & \cdots & u^{(\mathrm{subject}=1)}_{34} \\
u^{(\mathrm{subject}=2)}_1 & \cdots & u^{(\mathrm{subject}=2)}_{34}
\end{bmatrix}$$

Consequently, in the matrix $X$, each column is in fact a block of columns because each element $\Phi$ is a matrix. The first block of columns describes the mean curve. The second block of columns corresponds to the Lorazepam effect. The blocks of columns 3, 4 and 5 correspond to the 3 levels of concentration of Lorazepam. The blocks of rows 1, 2, 3, 7, 8 and 9 correspond to curves under placebo and the other blocks of rows to curves under Lorazepam. The blocks of rows 1, 4, 7 and 10 correspond to curves under the first level of concentration of Lorazepam, the blocks of rows 2, 5, 8 and 11 correspond to curves under the second level of concentration of Lorazepam and the other blocks of rows to curves under the third level of concentration of Lorazepam. Matrix $Z$ allows for subject-specific patterns.
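The block structure described in this appendix is a Kronecker product of an indicator matrix with the B-spline basis matrix. The following plain-Python sketch reproduces it for the reduced example (two subjects, two treatments, three concentration levels). The helper `kron`, the variable names and the toy 3 x 2 basis `phi` are illustrative assumptions; the real basis has 128 rows and 34 columns and would be stored in sparse form.

```python
def kron(a, b):
    """Dense Kronecker product of two matrices given as lists of rows."""
    rows_b, cols_b = len(b), len(b[0])
    return [[a[i // rows_b][j // cols_b] * b[i % rows_b][j % cols_b]
             for j in range(len(a[0]) * cols_b)]
            for i in range(len(a) * rows_b)]


# One row per curve: [mean, Lorazepam, level 1, level 2, level 3].
fixed_indicator = [[1, trt, int(lvl == 1), int(lvl == 2), int(lvl == 3)]
                   for _subject in (1, 2)
                   for trt in (0, 1)          # 0 = placebo, 1 = Lorazepam
                   for lvl in (1, 2, 3)]

# One row per curve: [subject 1, subject 2].
random_indicator = [[int(subject == 1), int(subject == 2)]
                    for subject in (1, 2)
                    for _trt in (0, 1)
                    for _lvl in (1, 2, 3)]

# Toy basis matrix standing in for the B-spline basis (3 time points, 2 functions).
phi = [[1.0, 0.0],
       [0.5, 0.5],
       [0.0, 1.0]]

X = kron(fixed_indicator, phi)   # 12 curves x 3 time points, 5 blocks x 2 columns
Z = kron(random_indicator, phi)  # same rows, 2 subject blocks x 2 columns

print(len(X), len(X[0]), len(Z), len(Z[0]))
```

Each 1 in an indicator matrix is replaced by a copy of the basis matrix and each 0 by a zero block, which is why most entries of the two design matrices are zero.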

ACKNOWLEDGEMENTS

We thank Eli Lilly for making the data set available to us. We also thank Prof J.-M. Guérit and Prof P. Eilers for helpful discussions. Philippe Lambert thanks the IAP network No. P5/24 of the Belgian state (Federal Office for Scientific, Technical and Cultural Affairs). Moreover, Céline Bugli thanks Eli Lilly for financial support through a patronage research grant and the 'Fonds National pour la Recherche Scientifique' (FNRS), Belgium for financial support through a FRIA research grant.

REFERENCES

1. Schumaker L. Spline Functions: Basic Theory. Wiley: New York, 1981.
2. Eilers P, Marx B. Flexible smoothing with B-splines and penalties. Statistical Science 1996; 11:89–121 (with discussion).
3. Tarter ME, Lock MD. Model-Free Curve Estimation. Chapman & Hall: New York, 1993.
4. Silverman BW. Density Estimation for Statistics and Data Analysis. Chapman & Hall: London, 1986.
5. Härdle W. Smoothing Techniques With Implementation in S. Springer: New York, 1990.
6. Wand MP, Jones MC. Kernel Smoothing. Chapman & Hall: New York, 1995.
7. Fan J, Gijbels I. Local Polynomial Modelling and its Applications. Chapman & Hall: London, 1996.
8. Eubank RL. Spline Smoothing and Nonparametric Regression. Marcel Dekker: New York, 1988.
9. Wahba G. Spline Models for Observation Data. SIAM: Philadelphia, PA, 1990.
10. Green P, Silverman BW. Nonparametric Regression and Generalized Linear Models. Chapman & Hall: London, 1994.
11. Friedman J. Multivariate adaptive regression splines. Annals of Statistics 1991; 19(1):1–141.
12. Stone C, Hansen MH, Kooperberg C, Truong Y. Polynomial splines and their tensor products in extended linear modeling. Annals of Statistics 1997; 25(4):1371–1470.
13. de Boor C. A practical Guide to Splines. Springer: New York, 2001.
14. Dierckx P. Curve and Surface Fitting With Splines. Clarendon: Oxford, 1995.
15. Ramsay JO, Silverman BW. Functional Data Analysis. Springer, 1997.
16. Brumback B, Rice J. Smoothing spline models for the analysis of nested and crossed samples of curves. Journal of the American Statistical Association 1998; 93:961–976.
17. Rice J, Wu C. Nonparametric mixed effects models for unequally sampled noisy curves. Biometrics 2001; 57:253–259.
18. O'Sullivan F. A statistical perspective on ill-posed inverse problems. Statistical Science 1986; 1:505–527.
19. O'Sullivan F. Fast computation of fully automated log-density and log-hazard estimators. SIAM Journal on Scientific and Statistical Computing 1988; 9:363–379.
20. Ruppert D, Wand MP, Carroll R. Semiparametric Regression. Cambridge University Press: New York, 2003.
21. Hastie T, Tibshirani R. Generalized Additive Models for Medical Research. Chapman & Hall: London, 1991.
22. Eilers P. Curve and spline analysis of variance. Statistical Modeling: Proceedings of the 14th International Workshop on Statistical Modeling, 1999.
23. Henderson H, Searle S. The vec-permutation matrix, the vec operator and Kronecker product: A review. Linear and Multilinear Algebra 1981; 9:271–288.
24. Sally S, Roach RN. Introductory Clinical Pharmacology. Lippincott Williams and Wilkins: Baltimore, MD, 2003.
25. Danion JM, Peretti S, Grange D, Bilik M, Imbs JL, Singer L. Effects of chlorpromazine and Lorazepam on explicit memory, repetition priming and cognitive skill learning in healthy volunteers. Psychopharmacology 1992; 108(3):345–351.
26. Jasper HH. The 10–20 system of the international federation. Electroencephalography and Clinical Neurophysiology 1958; 10:371–375.
27. Näätänen R. Attention and Brain Function. Lawrence Erlbaum Associates: Hove, NJ, 1992.
28. Campbell RJ, Hinsie LE. Psychiatric Dictionary. Oxford University Press: New York, 1989.
29. Guérit JM. Les potentiels évoqués. Masson: Paris, 1993.
30. Bugli C, Lambert P, Boulanger B, Ledent E, Pereira A, Nardone P. Statistical analysis of electroencephalograms. Discussion Paper Number DP0403, Institut de Statistique, Université catholique de Louvain, B-1348 Louvain-la-Neuve, Belgium, 2004.
31. Searle S. Linear Models. Wiley: New York, 1971.
32. Durban M, Harezlak J, Wand MP, Carroll RJ. Simple fitting of subject-specific curves for longitudinal data. Statistics in Medicine 2005; 24:1153–1167.
33. Lagarias JC, Reeds JA, Wright MH, Wright PE. Convergence properties of the Nelder–Mead simplex method in low dimensions. SIAM Journal of Optimisation 1998; 9(1):112–147.
34. Bugli C, Lambert P. Functional ANOVA with random functional effects: an application to event-related potentials modelling for electroencephalograms analysis. Discussion Paper Number DP0428, Institut de Statistique, Université catholique de Louvain, B-1348 Louvain-la-Neuve, Belgium, 2004.
35. Polich J, Kok A. Cognitive and biological determinants of P300: an integrative review. Biological Psychology 1995; 41:103–146.
36. Eilers P, Marx B. Generalized linear additive smooth structures. Journal of Computational and Graphical Statistics 2002; 11(4):758–783.
37. Dominici F, McDermott A, Hastie T. Improved semiparametric time series models of air pollution and mortality. Journal of the American Statistical Association 2004; 99:938–948.
38. Wand MP. Smoothing and mixed models. Computational Statistics 2003; 18:223–249.
