Dynamical analysis of clustering on financial market data

Nicolò Musmeci, PhD student, Department of Mathematics, King's College London [email protected]
Supervisor: Tiziana Di Matteo

Econophysics and Networks Across Scales, 27-31 May 2013, Leiden


Overview and motivations

Stocks time series → PMFG → DBHT [1] → hierarchical clusters, partition of stocks

PMFG: Planar Maximally Filtered Graph, a correlation-based network.

DBHT: based on the topological properties of the PMFG; it is deterministic, requires no a priori parameters, and does not need any expert supervision.

Cluster = subset of stocks with a significant cross-correlation

Questions:
– How do DBHT clusters evolve with time?
– Measures of cluster persistence?
– What is the relation between clusters and industrial sector classification?
– How does this relation evolve with time?

Motivations:
– Description of market cycles in terms of clusters
– Portfolio diversification: do clusters perform better than industrial sectors?

[1] Won-Min Song, T. Di Matteo, T. Aste, "Hierarchical information clustering by means of topologically embedded graphs", PLoS One 7(3) (2012) e31929.

Dataset and analysis procedure

N = 342 stocks in the US equity market, daily prices 1997-2012 (T = 4026 trading days)

● Data from Bloomberg

Moving time window of 3 years (750 trading days) → correlation matrix of log returns (with exponential smoothing) → PMFG → DBHT clusters

[Figure: a 3-year time window sliding along the time axis T (1997-2012); time window 1 yields PMFG 1, and so on]

n = 100 time windows → 100 PMFGs → 100 clustering partitions
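
The first two steps of the procedure above (moving window, exponentially smoothed correlation of log returns) can be sketched as follows. This is a minimal illustration, not the authors' code: the exponential weighting scheme, the characteristic time `theta` and the evenly spaced window starts are assumptions, since the slides do not specify them. The PMFG and DBHT steps are not covered.

```python
import numpy as np

def log_returns(prices):
    """prices: array of shape (T, N). Returns log returns of shape (T-1, N)."""
    prices = np.asarray(prices, dtype=float)
    return np.diff(np.log(prices), axis=0)

def exp_weights(window, theta):
    """Weights growing exponentially towards the most recent day, summing to 1.
    theta (in days) is a hypothetical smoothing parameter."""
    u = np.arange(window)
    w = np.exp((u - (window - 1)) / theta)
    return w / w.sum()

def weighted_corr(returns, weights):
    """Weighted Pearson correlation matrix of the columns of `returns`."""
    mean = weights @ returns
    centred = returns - mean
    cov = (centred * weights[:, None]).T @ centred
    std = np.sqrt(np.diag(cov))
    return cov / np.outer(std, std)

def rolling_correlations(prices, window=750, n_windows=100, theta=250.0):
    """One correlation matrix per time window; window starts are evenly
    spaced over the available returns (an assumption)."""
    r = log_returns(prices)
    starts = np.linspace(0, len(r) - window, n_windows).round().astype(int)
    w = exp_weights(window, theta)
    return [weighted_corr(r[s:s + window], w) for s in starts]
```

With the defaults, a (4026, 342) price array would give 100 matrices of shape (342, 342), one per time window, each of which would then be filtered into a PMFG.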

Industrial sectors and clusters

What is the relation between clusters and industrial classification?

Cluster | Size | % Finance | % Healthcare | % Technology | % Basic Materials | % Industrial Goods | % Utilities | % Services | % Consumer Goods | % Conglomerates | Total

1 142 26 % 5.6 % 6.3 % 2.1 % 19 % 0.7 % 23.9 % 13.4 % 2.8 % 100 %

2 11 90.9 % 9.1 % 100 %

3 22 100 % 100 %

4 23 8.7 % 78.2 % 4.34 % 4.34 % 4.34 % 100 %

5 27 7.4 % 77.7 % 11.1 % 3.7 % 100 %

6 25 32 % 28 % 8 % 24 % 8 % 100 %

7 34 38.2 % 8.8 % 52.9 % 100 %

8 25 92 % 8 % 100 %

9 14 7.1 % 7.1 % 78.6 % 7.1 % 100 %

10 19 84.2 % 10.5 % 5.2 % 100 %

E.g. time window number 1 (02/01/1997 – 27/12/1999)

There is a strong, although not complete, similarity between clusters and sectors

Industrial sectors and clusters

What is the relation between clusters and industrial classification? How does it evolve with time?

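Overall similarity: adjusted Rand index (ARI) [2]

It is a measure of the similarity between two clusterings X and Y of the same data, corrected for chance. It equals +1 when the two partitions are identical, is close to 0 when their agreement is no better than random labelling, and is negative when the agreement is below chance.

[2] L. Hubert, P. Arabie, "Comparing partitions", Journal of Classification 2(1) (1985) 193–218.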
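
As an illustration, the ARI of [2] can be computed from the contingency table of the two partitions. The sketch below is a plain-Python version; the function name and the convention of returning 1.0 in degenerate cases are choices made here, not taken from the slides. In this setting, `labels_x` would be the DBHT cluster of each stock and `labels_y` its industrial sector.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_x, labels_y):
    """Adjusted Rand index between two labelings of the same items."""
    if len(labels_x) != len(labels_y):
        raise ValueError("both labelings must cover the same items")
    n = len(labels_x)
    if n < 2:
        return 1.0  # convention chosen here for trivial inputs

    # Pairs of items placed together in both partitions, in X, and in Y.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_x, labels_y)).values())
    together_x = sum(comb(c, 2) for c in Counter(labels_x).values())
    together_y = sum(comb(c, 2) for c in Counter(labels_y).values())

    expected = together_x * together_y / comb(n, 2)  # chance agreement
    maximum = (together_x + together_y) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both partitions are a single cluster
    return (together_both - expected) / (maximum - expected)
```

Relabelling the clusters does not change the result: `adjusted_rand_index([0, 0, 1, 1], ["b", "b", "a", "a"])` is 1.0.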


Clusters dynamics

How do DBHT clusters evolve with time? How do we measure cluster persistence?

Pattern persistence

Adjacency correlation function [3], ∀ stock i

a(i, j, t): entry (i, j) of the adjacency matrix at time t

[Figure panels: Clusters; Sectors]

[3] J. Tang, S. Scellato, M. Musolesi, C. Mascolo, V. Latora, "Small-world behavior in time-varying graphs", Phys. Rev. E 81 (2010) 055101.
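
A minimal sketch of such a function, assuming it takes the form of the topological overlap used in [3]: for each stock i, the number of links present at both t and t+1, divided by the geometric mean of its degrees at the two times. The `members` argument, which restricts the sums to one cluster or sector, is a hypothetical addition for the intra-cluster and intra-sector variants.

```python
import numpy as np

def adjacency_correlation(adj_t, adj_next, members=None):
    """Per-stock correlation between the adjacency matrices at t and t+1.

    C_i = sum_j a(i,j,t) a(i,j,t+1) / sqrt(sum_j a(i,j,t) * sum_j a(i,j,t+1))
    Stocks with no links at either time get the value 0.
    """
    a0 = np.asarray(adj_t, dtype=float)
    a1 = np.asarray(adj_next, dtype=float)
    if members is not None:
        # Keep only the links among the given stocks (hypothetical restriction).
        idx = np.asarray(members)
        a0 = a0[np.ix_(idx, idx)]
        a1 = a1[np.ix_(idx, idx)]
    common = (a0 * a1).sum(axis=1)
    norm = np.sqrt(a0.sum(axis=1) * a1.sum(axis=1))
    out = np.zeros_like(common)
    np.divide(common, norm, out=out, where=norm > 0)
    return out
```

Averaging the returned values over stocks gives a single persistence measure per pair of consecutive time windows.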

Clusters dynamics

Pattern persistence

Intra-cluster patterns are more robust than intra-sector and global ones

Clusters dynamics

Cluster persistence

[Figure: clusters A, B, C at time window 1 compared with clusters α, β, γ at time window 10. Among all N stocks, cluster A (time window 1) contains n_A stocks, cluster α (time window 10) contains n_α stocks, and n_{A,α} stocks belong to both.]

Hypergeometric test [4]

Can the stocks in common between α and A be explained by chance alone?

If the hypergeometric hypothesis is rejected, the two clusters show a reciprocal overexpression: they are the same cluster.

Level of significance: 1%

[4] M. Tumminello, S. Miccichè, F. Lillo, J. Varho, J. Piilo, R. N. Mantegna, "Community characterization of heterogeneous complex systems", J. Stat. Mech. (2011) P01019.
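
The test can be sketched in plain Python as below. Under the null hypothesis, the n_α stocks of cluster α are drawn at random from the N stocks, and the p-value is the probability of finding at least n_{A,α} of them in cluster A. The function names are illustrative, and no multiple-testing correction is applied here; only the 1% level is taken from the slides.

```python
from math import comb

def hypergeom_pvalue(n_total, n_a, n_alpha, n_common):
    """P(X >= n_common) when n_alpha items are drawn without replacement
    from n_total items, n_a of which belong to cluster A."""
    upper = min(n_a, n_alpha)
    tail = sum(
        comb(n_a, k) * comb(n_total - n_a, n_alpha - k)
        for k in range(n_common, upper + 1)
    )
    return tail / comb(n_total, n_alpha)

def same_cluster(cluster_a, cluster_alpha, n_total, level=0.01):
    """Return (overexpressed, p_value) for two clusters given as collections
    of stock identifiers."""
    a, alpha = set(cluster_a), set(cluster_alpha)
    p = hypergeom_pvalue(n_total, len(a), len(alpha), len(a & alpha))
    return p < level, p
```

For example, with N = 342, two clusters of 22 stocks that share all their members reject the hypothesis, while two disjoint clusters give a p-value of 1.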

Clusters dynamics

Cluster persistence

Clusters at time window 1: analyses of persistence

[Figure panels: Cluster number 2 (Financial); Cluster number 3 (Utilities)]

Whenever the test shows overexpression, we plot the fraction of common stocks. Where the hypergeometric hypothesis is not rejected, the cluster disappears.

Level of significance: 1%

Clusters dynamics

Cluster persistence

[Figure panels: Cluster number 4 (Technology); Cluster number 8 (Basic Materials)]

High persistence. The persistence of each cluster depends strongly on the industrial sector that the cluster overexpresses.

Level of significance: 1%

Conclusions

We have performed a set of dynamical analyses on the DBHT clusters over a time period of 15 years.

Pattern persistence varies with time. Intra-cluster patterns are more robust than intra-sector patterns.

In general, clusters show high persistence. The persistence of each cluster depends strongly on the industrial sector that the cluster overexpresses.

The similarity between clusters and industrial sectors varies with time; in particular, it drops during the 2007-08 crisis.

Next steps...

DBHT clusters as a stock selection criterion to improve portfolio diversification

● Investment simulations: 10% decrease in portfolio volatility compared to industrial sector diversification


Thanks for your attention

[email protected]