Break-points detection with atheoretical regression trees


Page 1: Break-points detection with atheoretical regression trees

Break-points detection with atheoretical regression trees

Marco Reale, University of Canterbury

Universidade Federal do Parana, 27th November 2006

Page 2: Break-points detection with atheoretical regression trees

Acknowledgements

The results presented are the outcome of joint work with:

Carmela Cappelli

and

William Rea

Page 3: Break-points detection with atheoretical regression trees

Structural Breaks

• A structural break is a statement about parameters in the context of a specific model.

• A structural break has occurred if at least one of the model parameters has changed value at some point (break-point).

• We consider time series data.

Page 4: Break-points detection with atheoretical regression trees

Relevance

Detecting structural breaks is important for:

• forecasting (using the latest update of the DGP);

• analysis.

On the latter point, a recently debated issue is fractional integration versus structural breaks.

Page 5: Break-points detection with atheoretical regression trees

Milestones: Chow 1960

• It tests an a priori candidate break-point.

• It splits the sample period into two subperiods and tests the equality of the two parameter sets with an F statistic.

• It cannot be used when the break date is unknown: an arbitrarily chosen date introduces misinformation or bias.
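As a minimal sketch, the Chow statistic can be written down for the simplest model y_t = mu + e_t (a mean-only regression with k = 1 parameter per regime); the `chow_f` helper name and the simulated series below are illustrative, not from the talk.

```python
import numpy as np

def chow_f(y, tau):
    """Chow F statistic for a candidate break at index tau, in the
    simplest model y_t = mu + e_t (k = 1 parameter per regime)."""
    y = np.asarray(y, dtype=float)
    n, k = len(y), 1
    rss = lambda s: np.sum((s - s.mean()) ** 2)   # residual sum of squares
    rss_pooled = rss(y)                           # no-break fit
    rss_split = rss(y[:tau]) + rss(y[tau:])       # two-regime fit
    return ((rss_pooled - rss_split) / k) / (rss_split / (n - 2 * k))

rng = np.random.default_rng(0)
y = np.r_[rng.normal(0, 1, 50), rng.normal(5, 1, 50)]   # true break at t = 50
print(chow_f(y, 50) > chow_f(y, 25))   # the true break date maximises F
```

Scanning this statistic over all candidate dates is exactly the Quandt idea discussed next.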

Page 6: Break-points detection with atheoretical regression trees

Milestones: Quandt 1960

• We can compute the Chow statistic for every possible break-point.

• If the candidate break-point is known a priori, a chi-square statistic can be used; for an unknown date, the distribution of the maximal statistic was not available at the time.

Page 7: Break-points detection with atheoretical regression trees

Milestones: CUSUM 1974

• Proposed by Brown, Durbin and Evans.

• It checks the cumulative sum of the residuals.

• It tests the null of no break-points against the alternative of one or more break-points.
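A minimal OLS-based variant of the CUSUM idea can be sketched as follows; the original Brown-Durbin-Evans test uses recursive residuals and formal critical bounds, which this sketch omits, and the simulated series are illustrative.

```python
import numpy as np

def cusum_path(y):
    """OLS-based CUSUM sketch: cumulative sums of the residuals from the
    no-break model y_t = mu + e_t, scaled by sigma * sqrt(n)."""
    y = np.asarray(y, dtype=float)
    e = y - y.mean()                  # residuals under the null
    sigma = e.std(ddof=1)
    return np.cumsum(e) / (sigma * np.sqrt(len(y)))

rng = np.random.default_rng(1)
stable = rng.normal(0, 1, 200)                             # no break
broken = np.r_[rng.normal(0, 1, 100), rng.normal(3, 1, 100)]  # mean shift
# The path stays near zero under the null and drifts far from it
# when the mean changes part-way through the sample:
print(np.abs(cusum_path(stable)).max(), np.abs(cusum_path(broken)).max())
```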

Page 8: Break-points detection with atheoretical regression trees

Milestones: Andrews 1993

It exploits the Quandt statistics for a priori unknown break-points.

Page 9: Break-points detection with atheoretical regression trees

Bai and Perron 1998, 2003

• It finds multiple breaks at unknown times.

• It applies Fisher's algorithm (1958) to find optimal exhaustive partitions.

• It requires a prior indication of the number of breaks.

• It is applied recursively after a positive indication provided by CUSUM.

• It uses the AIC to decide the number of breaks.

Page 10: Break-points detection with atheoretical regression trees

Fisher’s algorithm
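As a minimal sketch (assuming a piecewise-constant mean model, with within-group deviance as the criterion), Fisher's exhaustive partitioning of n ordered observations into G contiguous groups can be written as a dynamic program; the `fisher_partition` name and the toy series are illustrative, and no attempt is made at the speed-ups Bai and Perron use.

```python
import numpy as np

def fisher_partition(y, G):
    """Fisher (1958) sketch: dynamic programming over all contiguous
    partitions of y into G groups, minimising total within-group
    deviance (sum of squared deviations from the group means)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    # deviance of every contiguous segment y[i:j]
    dev = np.full((n, n + 1), np.inf)
    for i in range(n):
        for j in range(i + 1, n + 1):
            seg = y[i:j]
            dev[i, j] = np.sum((seg - seg.mean()) ** 2)
    # cost[g, j] = best deviance splitting y[:j] into g groups
    cost = np.full((G + 1, n + 1), np.inf)
    cut = np.zeros((G + 1, n + 1), dtype=int)
    cost[0, 0] = 0.0
    for g in range(1, G + 1):
        for j in range(g, n + 1):
            for i in range(g - 1, j):
                c = cost[g - 1, i] + dev[i, j]
                if c < cost[g, j]:
                    cost[g, j], cut[g, j] = c, i
    # back-track the break-points from the last group to the first
    breaks, j = [], n
    for g in range(G, 0, -1):
        j = cut[g, j]
        breaks.append(j)
    return sorted(breaks)[1:]   # drop the leading 0

print(fisher_partition([0, 0, 0, 5, 5, 5, 9, 9, 9], 3))   # → [3, 6]
```

Because the partition is exhaustive, this is a global optimum for the given G, at a cost that grows quickly with n.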

Page 11: Break-points detection with atheoretical regression trees

Examples with G=2,3 and m=1

Page 12: Break-points detection with atheoretical regression trees

Example with G=3 and m=2

Page 13: Break-points detection with atheoretical regression trees

Bai, Perron and Fisher

• Eventually, Fisher's algorithm selects the partition with the minimum deviance.

• It is a global optimizer, but it was computationally feasible only for very small n and G (even with today's computers).

• Using later results in dynamic programming, Bai and Perron can run Fisher's algorithm reasonably fast for n = 1000 and any G and m.

• Fisher’s algorithm is related to regression trees.

Page 14: Break-points detection with atheoretical regression trees

Trees (1)

• Trees are particular kinds of directed acyclic graphs.

• In particular we consider binary trees.

• Splits are chosen to reduce heterogeneity.

Page 15: Break-points detection with atheoretical regression trees

Trees (2)

Node 1 is called root.

Node 5 is called leaf.

The other nodes are called branches.

Page 16: Break-points detection with atheoretical regression trees

Regression Trees (1)

• Regression trees are sequences of hierarchical dichotomous partitions of the explanatory variables, chosen so that the response y is as homogeneous as possible within the resulting groups.

• y is a control or response variable.
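A single growing step can be sketched as follows: among all cut points of one covariate, pick the one that minimises the summed within-group deviance of y. The `best_split` helper and the toy data are illustrative, not from the talk.

```python
import numpy as np

def best_split(x, y):
    """One step of regression-tree growing: find the cut point on a
    single covariate x that minimises the summed within-group deviance
    of the response y."""
    order = np.argsort(x)
    xs, ys = np.asarray(x)[order], np.asarray(y, dtype=float)[order]
    dev = lambda s: np.sum((s - s.mean()) ** 2)
    best = min(range(1, len(ys)),
               key=lambda i: dev(ys[:i]) + dev(ys[i:]))
    return xs[best]          # first x value of the right-hand group

x = [1, 2, 3, 4, 5, 6]
y = [0.1, -0.2, 0.0, 4.1, 3.9, 4.0]
print(best_split(x, y))      # → 4: the split separates the two mean levels
```

Growing a full tree repeats this step recursively within each group.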

Page 17: Break-points detection with atheoretical regression trees

Regression trees (2)

Page 18: Break-points detection with atheoretical regression trees

Regression trees optimality

Regression trees do not necessarily provide optimal partitions, because each split is chosen greedily, one at a time.

Page 19: Break-points detection with atheoretical regression trees

Atheoretical Regression Trees

• Any artificial strictly ascending or descending sequence used as the covariate, e.g. {1, 2, 3, 4, ...}, generates all the optimal dichotomous partitions.

• The sequence also works as a counter.

• Since it is not a theory-based covariate, the method is called Atheoretical Regression Trees ... yes, it's ART.

• ART is not a global optimizer.
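A minimal sketch of the ART idea, assuming a piecewise-constant mean and an ad-hoc deviance-reduction threshold (`tol`) in place of proper pruning: grow a binary regression tree of y against the artificial covariate t = 0, 1, 2, ... and read the split points as candidate break dates. Names and data are illustrative.

```python
import numpy as np

def art_breaks(y, min_seg=2, tol=1.0):
    """ART sketch: recursively split the series on the time index,
    keeping a split only if it reduces the deviance by at least tol."""
    y = np.asarray(y, dtype=float)
    dev = lambda s: np.sum((s - s.mean()) ** 2)

    def grow(lo, hi):
        if hi - lo < 2 * min_seg:
            return []
        # best dichotomous cut of the segment y[lo:hi]
        t = min(range(lo + min_seg, hi - min_seg + 1),
                key=lambda t: dev(y[lo:t]) + dev(y[t:hi]))
        if dev(y[lo:hi]) - (dev(y[lo:t]) + dev(y[t:hi])) < tol:
            return []        # split not worth keeping
        return grow(lo, t) + [t] + grow(t, hi)

    return grow(0, len(y))

y = np.r_[np.zeros(20), 5 * np.ones(20), 2 * np.ones(20)]
print(art_breaks(y))   # → [20, 40]
```

Because each cut is greedy, the resulting partition need not be globally optimal, which is exactly the point made above.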

Page 20: Break-points detection with atheoretical regression trees

Pruning the tree

Trees tend to oversplit, so the overgrown tree needs a pruning procedure:

• cross-validation, the usual procedure for regression trees, is not ideal in general for time series;

• AIC (Akaike, 1973) tends to oversplit;

• BIC (Schwarz, 1978) works very well.

All the information criteria are robust to non-normality, especially BIC.
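Information-criterion pruning can be sketched for a piecewise-constant fit as follows; the parameter count below (segment means plus break locations) is one common convention, not necessarily the exact penalty used in the talk, and the `bic` helper and data are illustrative.

```python
import numpy as np

def bic(y, breaks):
    """BIC for a piecewise-constant fit with the given break indices:
    n * log(RSS / n) + k * log(n), where k counts the segment means
    plus the break locations."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    bounds = [0] + list(breaks) + [n]
    rss = sum(np.sum((y[a:b] - y[a:b].mean()) ** 2)
              for a, b in zip(bounds, bounds[1:]))
    k = 2 * len(breaks) + 1          # segment means + break locations
    return n * np.log(rss / n) + k * np.log(n)

rng = np.random.default_rng(2)
y = np.r_[rng.normal(0, 1, 50), rng.normal(4, 1, 50)]   # one true break
# The true one-break model beats both the no-break model
# and an oversplit three-break model:
print(bic(y, [50]) < bic(y, []) and bic(y, [50]) < bic(y, [25, 50, 75]))
```

Pruning then amounts to keeping the set of splits that minimises this criterion.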

Page 21: Break-points detection with atheoretical regression trees

Single break simulations

Page 22: Break-points detection with atheoretical regression trees

Noisy square simulations

Page 23: Break-points detection with atheoretical regression trees

CUSUM on noisy square

Page 24: Break-points detection with atheoretical regression trees

ART on noisy square

Page 25: Break-points detection with atheoretical regression trees

Some comments

• The simulations show an excellent performance.

• However, ART performs better with long regimes.

• With short regimes it tends to find spurious breaks, but the performance can be appreciably improved with an enhanced tree-pruning technique (ETP).

Page 26: Break-points detection with atheoretical regression trees

Bai and Perron on noisy square

Page 27: Break-points detection with atheoretical regression trees

Some comments

• BP tends to find breaks whenever the CUSUM test rejects the null.

• It is unlikely to find spurious breaks,

but

• it tends to underestimate the number of breaks.

Page 28: Break-points detection with atheoretical regression trees

Application to Michigan-Huron

• The Michigan-Huron lakes play a very important role in the U.S. economy and hence they are regularly monitored.

• In particular, we consider the annual mean water level time series from 1860 to 2000.

Page 29: Break-points detection with atheoretical regression trees

Michigan-Huron (2)

Page 30: Break-points detection with atheoretical regression trees

Michigan-Huron (3)

Page 31: Break-points detection with atheoretical regression trees

Michigan-Huron (4)

Page 32: Break-points detection with atheoretical regression trees

Campito Mountain

• We applied ART to the Campito Mountain bristlecone pine data, an unbroken record of tree-ring widths covering the period 3435 BC to 1969 AD.

• A series of this length can be analyzed by ART in a few seconds; the Bai-Perron procedure took more than 200 hours of CPU time to complete.

• Tree-ring data are used as proxies for past climatic conditions.

Page 33: Break-points detection with atheoretical regression trees

Campito Mountain (2)

Page 34: Break-points detection with atheoretical regression trees

Campito Mountain (3)

Page 35: Break-points detection with atheoretical regression trees

The four most recent periods…

…are:

• 1863-1969: Industrialization and global warming.

• 1333-1862: The Little Ice Age.

• 1018-1332: The Medieval Climate Optimum.

• 862-1017: Extreme drought in the Sierra Nevada.

Page 36: Break-points detection with atheoretical regression trees
Page 37: Break-points detection with atheoretical regression trees

Niceties of ART

• Speed: ART is O(n log n) while BP is O(n²).

• Simplicity: it can be easily implemented, or run with any package implementing regression trees.

• Feasibility: it can be used with almost no limitation on either the number of observations or the number of segments.

• Visualization: it results in a hierarchical tree diagram that allows a priori knowledge to be incorporated.

Page 38: Break-points detection with atheoretical regression trees

…and

... and of course you can say you're doing

ART

Page 39: Break-points detection with atheoretical regression trees

Dedicated to Paulo