Functional Principal Component Analysis of Financial Time Series

G. Damiana Costanzo
Dipartimento di Economia e Statistica, Università della Calabria
87036 Arcavacata di Rende (CS), Italy
[email protected]

Cnam - Paris, November 23rd 2005

Summary

1. Introduction
2. Functional data vs. Multidimensional data modeling
3. Functional PCA
4. The e.e.v. MIB30 dataset
5. The statistical analysis
6. Conclusions and perspectives


Page 1: Functional Principal Component Analysis of Financial Time ...cedric.cnam.fr/~saporta/Fpca_Costanzo.pdf · Functional Principal Component Analysis of Financial Time Series G. Damiana



Opening

The problem (methodological perspective):

• Dimension reduction of a functional data set with homogeneous piecewise components

The datasets:

• Daily quantities (prices and e.e.v.) of the shares constituting the MIB30 basket in the period January 3rd, 2000 - December 30th, 2002 (courtesy of the Research & Development DBMS, Borsa Italiana).

The statistical method:

• Functional principal component analysis


Why Functional Data

Functional data analysis (FDA) is a generalization of classical multivariate analysis (MVA) to the case where the data are functions, curves or trajectories.

Such data arise quite naturally in different fields, for example in phenomena where measurements come from an automated on-line collection process (on-line sensing and monitoring equipment): in economic analysis, statistical quality control of manufacturing processes, shape analysis and the natural sciences: seismology, meteorology, physiology and medicine (see the recent paper on gait data by Preda & Saporta, 2005).


MD vs. FD modeling

Given a set of n units $\omega_1, \ldots, \omega_n$:

                Multidimensional Data              Functional Data
Data set        $X$: set of points in $R^p$        $X_T$: set of functions on $T$
                $x_1, x_2, \ldots, x_n$            $x_1(t), x_2(t), \ldots, x_n(t)$
# variables     $p < \infty$                       $p = \infty$
Vector space    Euclidean space $R^p$              Hilbert space $H$

See e.g. Ramsay (1982), Saporta (1985).


Rationale

Observed data functions must be thought of as single entities rather than as a sequence of individual observations: the term functional refers to the intrinsic structure of the data rather than to their explicit form. In fact, from a practical point of view, functional data are usually observed and recorded discretely.

Let $\{\omega_1, \ldots, \omega_n\}$ be a set of n units and let $y_i = (y_i(t_1), \ldots, y_i(t_p))$ be a sample of measurements of a variable $Y$ taken at p times $t_1, \ldots, t_p \in T = [a, b]$ on the i-th unit $\omega_i$ (i = 1, ..., n). Such data $y_i$ (i = 1, ..., n) are regarded as functional, so they are called raw functional data.


What is new then?

Owing to the functional nature of the data, it is assumed that there is a true function underlying the (discretely) observed data.

⇒

The first step is to convert the raw functional data into a suitable functional form: a smooth function $x_i(t)$, referred to as the true functional form, is assumed to lie behind $y_i$. This implies, in principle, that we can evaluate $x$ at any point $t \in T$ and, in addition, that we can evaluate any derivative $x^{(m)}(t)$ that exists at $t$, up to some order $m$. Finally, the set $X_T = \{x_1(t), \ldots, x_n(t)\}_{t \in T}$ is called the functional dataset.


A Remark

Though functional data analysis often deals with temporal data, its scope and objectives are quite different from those of time series analysis. The latter focuses mainly on modelling the data or on predicting future observations, whereas the techniques of FDA are essentially exploratory in nature: the emphasis is on trajectories and shapes.

Moreover, by adopting a functional approach:

a) unequally spaced observations, possibly with missing values, can be considered;

b) in some cases a full description of the data involves the study of certain derivatives (e.g. velocity and acceleration).
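Both points can be illustrated with a minimal sketch. It assumes a hypothetical unit observed at irregular times with two missing values, and it uses a quadratic polynomial fit from NumPy as a simple stand-in for the spline bases used later in the talk; the names `t`, `y` and `x_hat` are illustrative only.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Hypothetical raw data: one unit observed at unequally spaced times,
# with two missing values encoded as NaN.
t = np.array([0.0, 0.4, 1.1, 1.5, 2.3, 3.0, 3.2, 4.1, 4.8, 5.0])
y = t**2 - 3.0 * t + 1.0          # noiseless quadratic, so the fit is exact
y[[2, 7]] = np.nan                # missing observations

# Keep only the observed points; no regular grid is needed.
observed = ~np.isnan(y)
x_hat = Polynomial.fit(t[observed], y[observed], deg=2)

# The fitted function can be evaluated anywhere in T = [0, 5],
# including at the times where the raw value was missing.
filled = x_hat(t[~observed])

# Derivatives of the fitted function: velocity and acceleration.
velocity = x_hat.deriv(1)
acceleration = x_hat.deriv(2)
```

Because the fitted object is a function rather than a vector, the irregular spacing and the gaps play no special role once the fit is done.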


Estimation Strategy 1/3

Defining the FDA of a set of data that are functional, either in fact or in principle, usually involves the following tasks:

1. choice of the function space in which the analysis has to take place;

2. specification of the analysis in functional analytic terms;

3. determination of how a finite-dimensional observation vector has to be mapped into the function space;

4. description of what the FDA of the functional representations of the finite observations means in terms of analysing the observations themselves.


Estimation Strategy 2/3

We distinguish two cases, depending on whether the data $y_i$ are assumed to be errorless or not. In the first case the function $x_i(t)$ should satisfy the constraints:

$$x_i(t_j) = y_i(t_j), \qquad j = 1, \ldots, p. \tag{1}$$

When observational errors are assumed to be present in the raw data, the conversion from $y_i$ to the function $x_i(t)$ may involve a smoothing procedure, and in modelling terms we write:

$$y_i(t_j) = x_i(t_j) + \epsilon_j, \qquad j = 1, \ldots, p, \tag{2}$$

where the error term $\epsilon_j$ contributes roughness to the raw data. The standard assumption requires that the $\epsilon_j$'s are i.i.d., with zero mean and common finite variance.
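The difference between constraints (1) and model (2) can be made concrete with a small sketch on simulated data. The test curve, the noise level, the piecewise-linear interpolant and the degree-5 polynomial smoother are all assumptions chosen for brevity, not the method used in the talk.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

# Hypothetical raw functional datum: p = 50 noisy measurements of a smooth curve.
t = np.linspace(0.0, 1.0, 50)
x_true = np.sin(2.0 * np.pi * t)
y = x_true + rng.normal(scale=0.2, size=t.size)   # model (2): y = x + eps

# Case (1), errorless data: interpolate, so the curve passes through every y_j.
def x_interp(s):
    return np.interp(s, t, y)      # piecewise-linear interpolant

# Case (2), noisy data: smooth, here by least squares on a small basis.
x_smooth = Polynomial.fit(t, y, deg=5)

resid_interp = y - x_interp(t)     # zero by construction
resid_smooth = y - x_smooth(t)     # absorbs the error term eps_j

# Mean squared distance of each estimate from the underlying curve.
err_interp = np.mean((x_interp(t) - x_true) ** 2)
err_smooth = np.mean((x_smooth(t) - x_true) ** 2)
```

Under (1) the noise is reproduced exactly by the interpolant, while under (2) the smoother leaves it in the residuals, which is why the smoothed curve is expected to sit closer to the underlying function in this simulation.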


Estimation Strategy 3/3

A number of strategies, based on different approaches, can be considered to convert raw functional data into the true functional form; see e.g. Simonoff (1996). Here we considered the roughness penalty or regularization approach based on spline smoothing.

This method estimates $x$ from observations of the form (2) by making explicit two possible aims in curve estimation:

a) we wish to ensure that the estimated curve gives a good fit to the data, for example in terms of the residual sum of squares $\sum_j [y_j - x(t_j)]^2$;

b) we do not want the fit to be too good if this results in a curve $x$ that is excessively irregular (for example, by smoothing we can gain information about the derivatives of the true functions).
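A discrete analogue of this trade-off is sketched below. It is not the spline smoother of the talk: as an assumption, roughness is measured by squared second differences of the fitted values, and the parameter `lam` (a hypothetical name) weights that penalty against the residual sum of squares.

```python
import numpy as np

def penalized_smooth(y, lam):
    """Minimise sum_j (y_j - x_j)^2 + lam * sum_j (second difference of x)_j^2."""
    p = y.size
    D = np.diff(np.eye(p), n=2, axis=0)        # (p-2) x p second-difference matrix
    # Setting the gradient to zero gives the linear system (I + lam * D'D) x = y.
    return np.linalg.solve(np.eye(p) + lam * D.T @ D, y)

def roughness(x):
    """Sum of squared second differences, the discrete roughness measure used here."""
    return np.sum(np.diff(x, n=2) ** 2)

rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 100)
y = np.sin(2.0 * np.pi * t) + rng.normal(scale=0.3, size=t.size)

x_rough = penalized_smooth(y, lam=0.0)      # aim (a) only: reproduces the data
x_smooth = penalized_smooth(y, lam=50.0)    # trades some fit for regularity, aim (b)

rss_rough = np.sum((y - x_rough) ** 2)
rss_smooth = np.sum((y - x_smooth) ** 2)
```

With `lam = 0` the fit is perfect and as irregular as the data; increasing `lam` raises the residual sum of squares and lowers the roughness, which is the compromise described in a) and b).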


Functional PCA 1/2

The objective in principal component analysis of functional data is the orthogonal decomposition of the variance (kernel) function:

$$v(t, u) := \frac{1}{n} \sum_{i=1}^{n} \{x_i(t) - \bar{x}(t)\}\{x_i(u) - \bar{x}(u)\}$$

where $\bar{x}(t)$ is the mean function (this kernel is the counterpart of the covariance matrix of a multidimensional dataset), in order to isolate the dominant components of functional variation; see e.g. Ramsay & Silverman (1997, 2002), James et al. (2000).

In the functional space $H$, the role of the covariance matrix is played by the covariance operator $V$ defined by:

$$V\xi := \int v(\cdot, t)\,\xi(t)\,dt \quad \text{for any function } \xi \in H.$$

In analogy with the multivariate case, the functional PCA problem leads to the eigenequation:

$$V\xi = \lambda\xi$$

where now $\xi$ is an eigenfunction, rather than an eigenvector, and $\lambda$ is the eigenvalue.
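On a regular grid the eigenequation reduces to an ordinary symmetric eigenproblem. The sketch below assumes simulated curves, a rectangle-rule quadrature with step `h`, and illustrative names (`X`, `v`, `xi`, `lam`); it is one common discretisation, not necessarily the one used for the results in this talk.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical functional dataset: n curves evaluated on a regular grid of T = [0, 1].
n, m = 30, 200
t = np.linspace(0.0, 1.0, m)
h = t[1] - t[0]                                   # quadrature weight (grid step)
scores_true = rng.normal(size=(n, 2)) * np.array([3.0, 1.0])
X = scores_true[:, [0]] * np.sin(np.pi * t) + scores_true[:, [1]] * np.cos(np.pi * t)

# Variance (kernel) function v(t, u) evaluated on the grid.
Xc = X - X.mean(axis=0)
v = Xc.T @ Xc / n                                 # m x m matrix of v(t_k, t_l)

# Discretised operator: (V xi)(t_k) ~ h * sum_l v(t_k, t_l) xi(t_l),
# so the eigenequation V xi = lambda xi becomes a symmetric matrix problem.
evals, evecs = np.linalg.eigh(h * v)
order = np.argsort(evals)[::-1]                   # largest eigenvalue first
lam = evals[order]
xi = evecs[:, order] / np.sqrt(h)                 # rescaled so that h * sum(xi^2) = 1
```

Since the simulated curves are built from two shapes only, all eigenvalues after the second are numerically zero, which mirrors the situation where a few components carry almost all of the functional variability.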


Functional PCA 2/2

Functional PCA is characterized by the decomposition of the variance function:

$$v(t, u) = \sum_j \lambda_j\,\xi_j(t)\,\xi_j(u)$$

where the eigenvalues:

$$\lambda_j := \int_T \int_T \xi_j(t)\,v(t, u)\,\xi_j(u)\,dt\,du$$

are positive and non-increasing, while the eigenfunctions must satisfy the constraints:

$$\int_T \xi_j^2(t)\,dt = 1 \quad \text{and} \quad \int_T \xi_j(t)\,\xi_i(t)\,dt = 0 \quad (i < j).$$

The $\xi_j$'s are usually called principal component weight functions.

Finally, the principal component scores (of $\xi(t)$) of the units in the dataset are the values $w_i$ given by:

$$w_i := \int_T \xi(t)\,x_i(t)\,dt.$$
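The scores and the share of variability carried by each component can be sketched numerically as follows. The simulated curves, the Riemann-sum approximation of the integrals and the centring of the curves before scoring are assumptions of this sketch; with uncentred curves each score would simply be shifted by a constant.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical curves on a regular grid (rows = units, columns = grid points).
n, m = 30, 200
t = np.linspace(0.0, 1.0, m)
h = t[1] - t[0]
X = rng.normal(size=(n, 3)) @ np.vstack([np.ones(m), t, t**2])
Xc = X - X.mean(axis=0)                        # centring is an assumption of this sketch

# Weight functions of the discretised covariance operator via an SVD of the data.
_, s, Vt = np.linalg.svd(Xc, full_matrices=False)
xi = Vt.T / np.sqrt(h)                         # columns: weight functions with h * sum(xi^2) = 1
lam = h * s**2 / n                             # eigenvalues, sorted from largest to smallest

# Scores: integral of xi_j(t) x_i(t) dt, approximated by a Riemann sum.
W = h * Xc @ xi                                # one row per unit, one column per component

explained = lam / lam.sum()                    # proportion of functional variability
```

In this discretisation the mean squared score on each component equals the corresponding eigenvalue, so `explained` plays the role of the "percentage of variability" reported with each PCA function.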


The MIB30 dataset 1/2

The data considered here consist of the total exchanged equivalent value (e.e.v.) of the 30 shares composing the MIB30 index in the period January 3rd, 2000 - December 30th, 2002; see e.g. Costanzo (2003). The data matrix is 30 × 758 (note that p ≫ n).

An important characteristic of this basket is that it is "open", since its composition is normally updated twice a year, in the months of March and September (ordinary revisions). However, in response to extraordinary events, or for technical reasons, ordinary revisions may be brought forward or postponed with respect to the scheduled date; see www.borsaitalia.it for details.

In our dataset there are 21 companies which remain in the basket during the three years and 23 companies sharing the other 9 places in the basket (since they remain in the basket only for one or more short periods): the trajectories of these 9 places have been denoted by T1, ..., T9. Such mixed trajectories will be called here homogeneous piecewise components of the functional data set.


The MIB30 dataset 2/2

Example of homogeneous piecewise components T1, T2, T3.

Date        T1                 T2                          T3
03/01/2000  AEM                Banca Commerciale Italiana  Banca di Roma
04/04/2000  AEM                Banca Commerciale Italiana  Banca di Roma
18/09/2000  AEM                Banca Commerciale Italiana  Banca di Roma
02/01/2001  AEM                Banca Commerciale Italiana  Banca di Roma
19/03/2001  AEM                Italgas                     Banca di Roma
02/05/2001  AEM                Italgas                     Banca di Roma
24/08/2001  AEM                Italgas                     Banca di Roma
24/09/2001  AEM                Italgas                     Banca di Roma
18/03/2002  Snam Rete Gas      Italgas                     Banca di Roma
01/07/2002  AEM                Italgas                     Capitalia
15/07/2002  AEM                Italgas                     Capitalia
23/09/2002  Banca Antonveneta  Italgas                     Capitalia
04/12/2002  Banca Antonveneta  Italgas                     Capitalia


The statistical analysis 1/5

(see Ingrassia and Costanzo, 2004)

Examples of two trajectories: we set up a B-spline basis with 150 knots (approximately one knot for each week) and order 6 to make the functional data.
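A basis of this kind can be sketched with NumPy alone. The block below is an illustration under stated assumptions, not the software used for the talk: the 150 knots are taken to be equally spaced over the 758 trading days with the end knots repeated, the basis is evaluated with the Cox-de Boor recursion, the coefficients are obtained by plain least squares (no roughness penalty), and the trajectory `y` is synthetic. The function name `bspline_basis` is hypothetical.

```python
import numpy as np

def bspline_basis(x, knots, order):
    """Evaluate every B-spline basis function of the given order at the points x
    (Cox-de Boor recursion). Returns an array of shape (len(x), len(knots) - order)."""
    x = np.asarray(x, dtype=float)[:, None]
    M = len(knots)
    # Order 1: indicators of the half-open knot intervals [t_j, t_{j+1}).
    B = ((knots[:-1] <= x) & (x < knots[1:])).astype(float)
    # Close the last non-empty interval on the right so that x = max(knots) is covered.
    last = np.nonzero(knots[:-1] < knots[1:])[0][-1]
    B[x[:, 0] == knots[-1], last] = 1.0
    for k in range(2, order + 1):
        left_den = knots[k - 1:M - 1] - knots[:M - k]
        right_den = knots[k:] - knots[1:M - k + 1]
        with np.errstate(divide="ignore", invalid="ignore"):
            # Terms with a zero denominator (repeated knots) are defined as zero.
            left = np.where(left_den > 0, (x - knots[:M - k]) / left_den, 0.0)
            right = np.where(right_den > 0, (knots[k:] - x) / right_den, 0.0)
        B = left * B[:, :-1] + right * B[:, 1:]
    return B

order = 6                                        # order 6 = piecewise polynomials of degree 5
days = np.arange(1, 759, dtype=float)            # 758 trading days
breaks = np.linspace(days[0], days[-1], 150)     # assumed: 150 equally spaced knots
knots = np.concatenate([np.repeat(breaks[0], order - 1), breaks,
                        np.repeat(breaks[-1], order - 1)])
Phi = bspline_basis(days, knots, order)          # 758 x 154 design matrix

# Hypothetical raw trajectory (synthetic, not MIB30 data).
rng = np.random.default_rng(6)
y = 1e8 * (1.5 + np.sin(days / 60.0)) + rng.normal(scale=1e7, size=days.size)

coef, *_ = np.linalg.lstsq(Phi, y, rcond=None)   # least-squares basis coefficients
x_hat = Phi @ coef                               # fitted curve at the observation days
```

With 758 days and 150 knots there are about five observations between consecutive knots, which matches the "one knot for each week" remark for a five-day trading week.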

[Figure: two panels, Enel and Eni; x-axis: Day, y-axis: e.e.v.]


The statistical analysis 2/5

The entire functional dataset:

[Figure: the entire functional dataset; x-axis: Day, y-axis: e.e.v.]


The statistical analysis 3/5

Summary statistics

[Figure: two panels, "Titles Mean" (y-axis: Mean e.e.v.) and "Titles Standard Deviation" (y-axis: Std. Dev. e.e.v.); x-axis: Day]


The statistical analysis 4/5

PCA results: the first two components explain respectively 88.9% and 7.1% of the functional variability.

[Figure: two panels, "PCA function 1 (Percentage of variability 88.9)" and "PCA function 2 (Percentage of variability 7.1)"; x-axis: Day; curves drawn with + and - symbols]


The statistical analysis 5/5

PC scores on the first two harmonics.

[Figure: scatter plot of the companies; x-axis: Scores on Harmonic 1, y-axis: Scores on Harmonic 2; labelled points: Alleanza, Autostrade, B.Fideraum, MontePaschi, B.N.L, Enel, Eni, Fiat, Finmecc., Generali, Mediaset, MedioB., Mediolanum, Olivetti, Pirelli, Ras, SanPaolo, SeatP.Gialle, Telecom, Tim, Unicred.It., T1, T2, T3, T4, T5, T6, T7, T8, T9]

1. companies with large positive (negative) values on the first harmonic present a larger (smaller) value than the mean during the entire period considered;

2. companies with large positive (negative) values on the second harmonic show a large decrement (increment) after September 11th, 2001 (Day=431).


FPCA on standardized data 1/2

An insightful understanding comes from the PC analysis of the daily standardized raw functional data:

$$z_{ij} = \frac{x_{ij} - \bar{x}_i}{s_i} \qquad (i = 1, \ldots, 758,\ j = 1, \ldots, 30)$$

[Figure: two panels, "PCA function 1 (Percentage of variability 89.4)" and "PCA function 2 (Percentage of variability 6.9)"; x-axis: Day; curves drawn with + and - symbols]

The second PC highlights the shock of September 11th, 2001 (Day=431), so it can be considered as the shock component.
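The daily standardization used on this slide can be sketched as follows. The data are a random stand-in with the same layout (758 days by 30 companies), and the use of the sample standard deviation (`ddof=1`) is an assumption, since the slides do not say which divisor defines $s_i$.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical stand-in for the raw data: 758 days (rows) x 30 companies (columns).
X = rng.lognormal(mean=18.0, sigma=1.0, size=(758, 30))

# For each day i, centre and scale across the 30 companies.
day_mean = X.mean(axis=1, keepdims=True)           # mean over companies on day i
day_std = X.std(axis=1, ddof=1, keepdims=True)     # s_i (ddof=1 is an assumption)
Z = (X - day_mean) / day_std                       # standardized values z_ij
```

After this step every day has mean 0 and unit spread across companies, so the analysis compares the relative position of each company within the basket rather than absolute traded values.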


What about the 3rd and 4th Functional Curves? Just have a look:

[Figure: two panels, "PCA function 3 (Percentage of variability 2.5)" and "PCA function 4 (Percentage of variability 0.9)"; x-axis: Day; curves drawn with + and - symbols]

They account for small proportions of the functional variability, but they show other modes of variation in the dataset of curves.


FPCA on standardized data 2/2

PC scores on the first two harmonics.

[Figure: scatter plot of the companies; x-axis: Scores on Harmonic 1, y-axis: Scores on Harmonic 2; labelled points: Autostrade, Enel, Eni, Generali, Mediaset, Olivetti, SanPaolo, SeatP.Gialle, Telecom, Tim, Unicred.It., T3, T4, T9]


Back to raw data: Analysis of the 1st PC

Consider the companies with the largest minimum standardized values over the three years:

$\min_{i=1,\ldots,758} z_{ij}$   Company             Score
 0.3294                          Eni                 w1 > 40
 0.0579                          Telecom             w1 > 40
 0.0566                          Tim                 w1 > 40
-0.2197                          Enel                10 < w1 < 40
-0.2770                          Generali            10 < w1 < 40
-0.5360                          Olivetti            10 < w1 < 40
-0.5554                          Unicredito          10 < w1 < 40
-0.6404                          T4                  near 0
-0.6552                          Mediaset            near 0
-0.7511                          Seat Pagine Gialle  near 0

The correlation $r(w_{1i}, z_{ij}) = 0.96$


Analysis of the 2nd PC

Let $\bar{x}_i^B$ be the mean value of the e.e.v. of the i-th company over the days 1, ..., 431 (i.e. before September 11th, 2001) and $\bar{x}_i^A$ the corresponding mean value after September 11th, 2001. Let us consider the percentage variation:

$$\delta_i = \frac{\bar{x}_i^A - \bar{x}_i^B}{\bar{x}_i^B} \cdot 100\%$$

$\delta_i$   Company             Score
-80.20%      Seat Pagine Gialle  w2 > 10
-58.50%      Olivetti            w2 > 10
-47.08%      Enel                0 < w2 < 10
 63.21%      Unicredito          −20 < w2 < −10
 83.10%      Autostrade          −10 < w2 < 0
133.79%      T9                  w2 < −20

The companies with large positive (negative) scores on the 2nd PC present the largest decrements (increments) after September 11th, 2001.

The correlation $r(w_{2i}, z_{ij}) = 0.84$
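The percentage variation around Day 431 is straightforward to compute once the data matrix is available. The sketch below uses a simulated 30 × 758 matrix in which two companies are given an artificial drop and rise after the split; the matrix, the multipliers and the variable names are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical e.e.v. matrix: 30 companies (rows) x 758 trading days (columns).
eev = rng.lognormal(mean=18.0, sigma=0.5, size=(30, 758))
eev[0, 431:] *= 0.2          # company 0: artificial sharp drop after the event
eev[1, 431:] *= 2.0          # company 1: artificial sharp rise after the event

split = 431                  # days 1..431 are "before" (Day = 431 in the slides)
mean_before = eev[:, :split].mean(axis=1)
mean_after = eev[:, split:].mean(axis=1)

delta = 100.0 * (mean_after - mean_before) / mean_before   # percentage variation
ranking = np.argsort(delta)  # from the largest decrement to the largest increment
```

Sorting the companies by `delta` gives the ordering used in the table, which can then be set against the scores on the second component.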


Conclusions and perspectives for future research

1. Functional PCA looks like an interesting tool for gaining insight into functional datasets.

2. Does it open methodological perspectives for the construction of new financial indices?

Some existing stock market indices have been criticized (e.g. Elton and Gruber, 1973, 1995): the famous U.S. Dow Jones presents some statistical flaws.

In Italy the MIB30 basket is summarized by the MIB30 index ⇒ analysis of the MIB30 index within the FDA framework: "how" and how much is it statistically representative of the basket?


[Figure: MIB30 Index over time; x-axis: Year (2000.0 to 2003.0), y-axis: MIB30 Index]

In fact, the MIB30 is calculated according to the formula:

$$\mathrm{MIB30} = 10000 \sum_{i=1}^{30} \frac{p_{it}}{p_{i0}}\, w_{iT}\, r_T \tag{3}$$

where the weight of the i-th share in the basket (i.e. the weight of each company in the index) is:

$$w_{iT} = \frac{p_{i0}\, q_{i0}}{\sum_{i=1}^{30} p_{i0}\, q_{i0}}$$
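Formula (3) can be turned into a few lines of code. The slides do not define the symbols, so this sketch assumes that $p_{it}$ is the current price, $p_{i0}$ and $q_{i0}$ are the base price and base quantity of share i, and $r_T$ is a given scalar factor; the three-share basket and the function name `mib30_index` are hypothetical.

```python
import numpy as np

def mib30_index(p_t, p_0, q_0, r_T=1.0, base=10000.0):
    """Index value from formula (3); r_T is taken as a given scalar factor."""
    p_t, p_0, q_0 = (np.asarray(a, dtype=float) for a in (p_t, p_0, q_0))
    w = p_0 * q_0 / np.sum(p_0 * q_0)          # weights w_iT, summing to 1
    return base * np.sum(p_t / p_0 * w) * r_T

# Hypothetical three-share basket (the real basket has 30 shares).
p_0 = np.array([10.0, 20.0, 5.0])      # assumed meaning: base prices p_i0
q_0 = np.array([100.0, 50.0, 400.0])   # assumed meaning: base quantities q_i0
p_t = np.array([11.0, 19.0, 5.5])      # assumed meaning: current prices p_it

index_at_base = mib30_index(p_0, p_0, q_0)   # 10000 when prices equal base prices
index_now = mib30_index(p_t, p_0, q_0)       # 10625 for this toy basket
```

With these weights the sum simplifies algebraically to $\sum_i p_{it} q_{i0} / \sum_i p_{i0} q_{i0}$, so under the stated assumptions the index is a fixed-quantity weighted price ratio scaled by 10000 and by $r_T$.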


REFERENCES

- Costanzo, G. D. (2003), A graphical analysis of the dynamics of the MIB30 index in the period 2000-2002 by a functional data approach. SIS 2003, Napoli.

- Elton, E. J. and Gruber, M. J. (1973), Estimating the dependence structures of share prices. Implications for Portfolio. Journal of Finance, 1203-1232.

- Elton, E. J. and Gruber, M. J. (1995), Modern Portfolio Theory and Investment Analysis. John Wiley and Sons, New York.

- Ingrassia, S. and Costanzo, G. D. (2004), Functional principal component analysis of financial time series, New Developments in Classification and Data Analysis, Springer-Verlag, Berlin.

- Preda, C. and Saporta, G. (2005), PLS discriminant analysis for functional data. XI ASMDA Symposium, Brest, May 2005.

- Ramsay, J. O. (1982), When the data are functions, Psychometrika, 47.

- Ramsay, J. O. and Silverman, B. W. (1997), Functional Data Analysis, Springer-Verlag, New York.

- Ramsay, J. O. and Silverman, B. W. (2002), Applied Functional Data Analysis, Springer-Verlag, New York.

- Saporta, G. (1985), Data Analysis For Numerical and Categorical Individual Time Series, Applied Stochastic Models and Data Analysis, 1.

- Simonoff, J. S. (1996), Smoothing Methods in Statistics, Springer-Verlag, New York.