Modeling Dependency with Copula: Implications to Engineers and Planners

Haizhong Wang, Ph.D., Assistant Professor, School of Civil and Construction Engineering, Oregon State University, Corvallis, OR. Seminar at Portland State University, Portland, OR, September 28, 2012.


Haizhong Wang, Oregon State University

Transcript of Modeling Dependency with Copula: Implications to Engineers and Planners

Page 1: Modeling Dependency with Copula: Implications to Engineers and Planners

Modeling Dependency with Copula: Implications to Engineers and Planners

Haizhong Wang, Ph.D., Assistant Professor

School of Civil and Construction Engineering

Oregon State University, Corvallis, OR

[email protected]

Seminar at Portland State University

Portland, OR

September 28, 2012

Haizhong Wang (OSU) Copula Modeling September 28, 2012 1 / 31

Page 2: Modeling Dependency with Copula: Implications to Engineers and Planners

Philosophy

Modeling Philosophy - By George Box

Essentially, all models are wrong, but some are useful.

In Empirical Model-Building and Response Surfaces (1987), George E. P. Box and Norman R. Draper, p. 242, ISBN: 0471810339.

In Box’s paper, Robustness in the Strategy of Scientific Model Building, in Robustness in Statistics: Proceedings of a Workshop (May 1979), edited by Launer and G. N. Wilkinson.


Page 3: Modeling Dependency with Copula: Implications to Engineers and Planners

Philosophy

Deterministic vs. Stochastic


Page 4: Modeling Dependency with Copula: Implications to Engineers and Planners

Philosophy

Monte Carlo Simulation

Source: “A Practical Guide to Monte Carlo Simulation”, by Jon Wittwer


Page 5: Modeling Dependency with Copula: Implications to Engineers and Planners

Philosophy

Dependence measure

How do we measure dependence between random variables?

Correlation coefficient: a measure of linear dependence between random variables

Concordance: “large” values of one random variable tend to be associated with “large” values of the other, and “small” values of one with “small” values of the other.

Discordance: the reverse; “large” values of one random variable tend to be associated with “small” values of the other.
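
The concordance idea above can be turned into a number by counting pairs of observations. The following is a minimal sketch of Kendall's tau (the tau-a variant, with ties counted as neither concordant nor discordant); the function name and the two short flow series are hypothetical and serve only as illustration.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:      # both variables move in the same direction
            concordant += 1
        elif sign < 0:    # the variables move in opposite directions
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical paired observations (not taken from the talk's data).
flow_a = [12, 15, 20, 22, 30]
flow_b = [3, 5, 4, 9, 11]
print(kendall_tau(flow_a, flow_b))  # 9 concordant, 1 discordant pair -> 0.8
```

A value near +1 indicates mostly concordant pairs, a value near -1 mostly discordant pairs.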


Page 6: Modeling Dependency with Copula: Implications to Engineers and Planners

Philosophy

Measure of Dependence

Concordance

Kendall’s tau

Spearman’s rho

Linear correlation: can be misleading for nonelliptical distributions
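
To illustrate how a rank-based measure differs from linear correlation, the sketch below compares Pearson's coefficient with Spearman's rho on a perfectly monotone but nonlinear relationship. The helper names and the data (y = exp(x)) are assumptions chosen for illustration, and the ranking helper assumes there are no ties.

```python
import math


def pearson(xs, ys):
    """Sample linear (Pearson) correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Ranks 1..n of the values (assumes no ties, for simplicity)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result


def spearman_rho(xs, ys):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


x = [0.5 * i for i in range(1, 11)]
y = [math.exp(v) for v in x]  # strictly increasing, but not linear in x

print(pearson(x, y))       # noticeably below 1
print(spearman_rho(x, y))  # 1 up to rounding: the ranks agree perfectly
```

The dependence here is as strong as it can be, yet the linear coefficient understates it; the rank-based measure does not.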


Page 7: Modeling Dependency with Copula: Implications to Engineers and Planners

Philosophy

Dependence of random variables

Spatial dependencies: the dependence between a number of variables at the same time

Temporal dependencies: the inter-temporal dependence structure of a process

The fact: covariance captures only linear dependence, and it describes the full dependence structure only for special classes of distributions such as the normal distribution

The question: is it possible to capture the whole dependence structure without any disturbing effects from the marginal distributions?


Page 8: Modeling Dependency with Copula: Implications to Engineers and Planners

Philosophy

Copula

What is a Copula?

A Latin noun that means “a link, tie, bond”

Copulas are used to describe dependence between random variables

An Introduction to Copulas, Second Edition, Roger B. Nelsen, 2005.


Page 9: Modeling Dependency with Copula: Implications to Engineers and Planners

Philosophy

Statistics about Copula

Google 2003: Copula → 10,000 results

Google 2005: Copula → 650,000 results

Google 2012: Copula → 1,950,000 results

Copulas: Tales and Facts, Thomas Mikosch, 2005. Citation: 140


Page 10: Modeling Dependency with Copula: Implications to Engineers and Planners

Philosophy

The first appearance

1940/1941: Hoeffding studied nonparametric measures of association such as Spearman’s rho in multivariate distributions

1959: The word copula appears for the first time (Sklar, 1959)

1999: Introduced to financial applications (Embrechts et al., 1999)

2008: Widely adopted in insurance, finance, energy, hydrology, survival analysis, etc.

Source: Daniel Berg, Using Copulas: An Introduction for Practitioners


Page 11: Modeling Dependency with Copula: Implications to Engineers and Planners

Philosophy

Foundation

Definition and Sklar's Theorem (1959)

Sklar's theorem describes how copulas “join together one-dimensional distribution functions to form multivariate distribution functions”

Let $H$ be a joint distribution function with margins $F_1, \dots, F_d$. Then there exists a copula $C : [0, 1]^d \to [0, 1]$ such that

$H(x_1, \dots, x_d) = C(F_1(x_1), \dots, F_d(x_d))$

Theoretically, $C$ captures all aspects of dependence and the $F_i$ capture all aspects of the marginal distributions
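
As a small illustration of the decomposition in Sklar's theorem, the sketch below builds a bivariate joint cdf from a copula and two margins. The choice of a Clayton copula, the exponential margins, and all parameter values are assumptions made for the example and are not taken from the talk.

```python
import math


def clayton_copula(u, v, theta=2.0):
    """Clayton copula C(u, v) = (u^-theta + v^-theta - 1)^(-1/theta), theta > 0."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def exponential_cdf(x, rate):
    """Marginal cdf F(x) = 1 - exp(-rate * x) for x > 0, else 0."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0


def joint_cdf(x1, x2):
    """H(x1, x2) = C(F1(x1), F2(x2)), with assumed rates 1.0 and 0.5."""
    return clayton_copula(exponential_cdf(x1, 1.0), exponential_cdf(x2, 0.5))


# Letting x2 grow large recovers the first margin, H(x1, inf) = F1(x1).
print(joint_cdf(1.0, 1e6), exponential_cdf(1.0, 1.0))
```

Swapping in a different copula changes only the dependence, and swapping a margin changes only that variable's distribution, which is the separation the theorem expresses.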


Page 12: Modeling Dependency with Copula: Implications to Engineers and Planners

Philosophy

Applications

Civil engineering: reliability analysis of highway bridges

Climate and weather related research

Analysis of extrema in financial assets and returns

Failure of paired organs in health science

Human mortality in insurance (actuarial science)

Mortalities of spouses

Mortalities of parents and children, and of twins (identical or nonidentical)


Page 13: Modeling Dependency with Copula: Implications to Engineers and Planners

Implications to Planners

Applications - Choice Modeling

Chandra R. Bhat and Naveen Eluru (2009), A Copula-Based Approach to Accommodate Residential Self-Selection Effects in Travel Behavior Modeling, Transportation Research Part B, Vol. 43, No. 7, pp. 749-765.

Erisa Spissu, Abdul R. Pinjari, Ram M. Pendyala, Chandra R. Bhat (2009), A Copula-Based Joint Multinomial Discrete-Continuous Model of Vehicle Type Choice and Miles of Travel, Transportation, Vol. 36, No. 4, pp. 403-422.

Naveen Eluru, Rajesh Paleti, Ram M. Pendyala, Chandra R. Bhat (2010), Modeling Injury Severity of Multiple Occupants of Vehicles: Copula-Based Multivariate Approach, Transportation Research Record, Vol. 2165, pp. 1-11.


Page 14: Modeling Dependency with Copula: Implications to Engineers and Planners

Implications to Planners

Applications - Behaviour Modeling

Ipek N. Sener, Chandra R. Bhat (2011), A Copula-Based Sample Selection Model of Telecommuting Choice and Frequency, Environment and Planning A, Vol. 43, No. 1, pp. 126-145.

Jeffrey J. LaMondia, Chandra R. Bhat (2012), A Conceptual and Methodological Framework of Leisure Activity Loyalty Accommodating the Travel Context, Transportation, Vol. 39, No. 2, pp. 321-349.

A. Portoghese, E. Spissu, C. R. Bhat, N. Eluru, and I. Meloni (2010), A Copula-Based Joint Model of Commute Mode Choice and Number of Non-Work Stops during the Commute, Technical Report.


Page 15: Modeling Dependency with Copula: Implications to Engineers and Planners

Implications to Planners

Applications - Hydrology

G. Salvadori and C. De Michele, On the Use of Copulas in Hydrology: Theory and Practice, Journal of Hydrologic Engineering, Vol. 12, No. 4, July 1, 2007.

Amir AghaKouchak, Andras Bardossy and Emad Habib, Copula-Based Uncertainty Modeling: Application to Multisensor Precipitation Estimates, Hydrological Processes, 24, pp. 2111-2124 (2010).

Pranesh Kumar, Copula Functions: Characterizing Uncertainty in Probabilistic Systems, Applied Mathematical Sciences, Vol. 5, 2011, No. 30, pp. 1459-1472.


Page 16: Modeling Dependency with Copula: Implications to Engineers and Planners

Implications to Planners

Copula Model and Simulation

Creating a copula model for the distribution of $(X_1, \dots, X_d)$ generally takes two steps

Model

Set a model for each marginal distribution $F_i$

Set a model for the copula $C$

$C$ is the cdf of a random vector $(U_1, \dots, U_d)$ with uniform margins

Simulation

Draw a sample $(U_1, \dots, U_d) \sim C$

Set $(X_1, \dots, X_d) = (F_1^{-1}(U_1), \dots, F_d^{-1}(U_d))$
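
The two simulation steps can be sketched in a few lines. This is a minimal illustration, not the procedure used in the talk: it assumes a bivariate Gaussian copula with correlation 0.7 and exponential margins with rates 1.0 and 2.0, and all function names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def sample_gaussian_copula(rho, n, rng):
    """Step 1: draw n pairs (U1, U2) from a bivariate Gaussian copula."""
    std_normal = NormalDist()
    samples = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        # Mix two independent normals so that corr(z1, z2) = rho.
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # The normal cdf maps each coordinate to a uniform margin.
        samples.append((std_normal.cdf(z1), std_normal.cdf(z2)))
    return samples


def exponential_inverse_cdf(u, rate):
    """Inverse cdf of the exponential distribution: F^{-1}(u) = -ln(1 - u) / rate."""
    return -math.log1p(-u) / rate


rng = random.Random(42)
uniforms = sample_gaussian_copula(rho=0.7, n=2000, rng=rng)

# Step 2: push each uniform through the inverse cdf of its assumed margin.
flows = [
    (exponential_inverse_cdf(u1, 1.0), exponential_inverse_cdf(u2, 2.0))
    for u1, u2 in uniforms
]
print(flows[:3])
```

The simulated pairs keep the dependence imposed by the copula while each coordinate follows its own marginal distribution.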


Page 17: Modeling Dependency with Copula: Implications to Engineers and Planners

Implications to Planners

Gaussian Copula

Figure: Bivariate Gaussian copula for varying values of the parameter ρ
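
Since the figure itself is not reproduced in this transcript, the following sketch evaluates the standard bivariate Gaussian copula density, c(u, v) = exp(-(ρ²(x² + y²) - 2ρxy) / (2(1 - ρ²))) / sqrt(1 - ρ²) with x = Φ⁻¹(u) and y = Φ⁻¹(v), so the effect of ρ can be checked numerically. The function name and the evaluation points are assumptions for illustration.

```python
import math
from statistics import NormalDist


def gaussian_copula_density(u, v, rho):
    """Density of the bivariate Gaussian copula at (u, v), with |rho| < 1."""
    std_normal = NormalDist()
    x, y = std_normal.inv_cdf(u), std_normal.inv_cdf(v)
    exponent = -(rho ** 2 * (x * x + y * y) - 2.0 * rho * x * y) / (
        2.0 * (1.0 - rho ** 2)
    )
    return math.exp(exponent) / math.sqrt(1.0 - rho ** 2)


for rho in (0.0, 0.3, 0.6, 0.9):
    # Compare a concordant corner (0.9, 0.9) with a discordant one (0.9, 0.1).
    print(rho,
          gaussian_copula_density(0.9, 0.9, rho),
          gaussian_copula_density(0.9, 0.1, rho))
```

With ρ = 0 the density is 1 everywhere (independence); as ρ grows, mass concentrates along the diagonal u = v.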


Page 18: Modeling Dependency with Copula: Implications to Engineers and Planners

Implications to Planners

Student t Copula

Figure: Bivariate Student t copula for varying values of the parameters ρ and ν
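
For reference, the standard definition of the bivariate Student t copula is written out below, since the slide shows only the figure. Here $t_\nu$ denotes the univariate Student t cdf with $\nu$ degrees of freedom and $t_{\rho,\nu}$ the bivariate t cdf with correlation $\rho$; this notation is an assumption and does not appear on the slide. The second line gives the coefficient of tail dependence, which is positive for finite $\nu$ and is the usual reason for preferring a t copula over a Gaussian one when joint extremes matter.

```latex
C^{t}_{\rho,\nu}(u, v) = t_{\rho,\nu}\!\left(t_\nu^{-1}(u),\, t_\nu^{-1}(v)\right)

\lambda_U = \lambda_L = 2\, t_{\nu+1}\!\left(-\sqrt{\frac{(\nu+1)(1-\rho)}{1+\rho}}\right)
```

As $\nu \to \infty$ the t copula approaches the Gaussian copula, whose tail dependence coefficient is zero for $|\rho| < 1$.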


Page 19: Modeling Dependency with Copula: Implications to Engineers and Planners

Implications to Planners

Example - Entrance Ramp Flow Dependency


Page 20: Modeling Dependency with Copula: Implications to Engineers and Planners

Implications to Planners

Example - Entrance Ramp Flow Dependency


Page 21: Modeling Dependency with Copula: Implications to Engineers and Planners

Implications to Planners

Example - Morning Peak Hour

4005-      101      102      103      104      105      005      006      008      009      010
101     1.0000   0.0503  -0.1529   0.0680   0.0209  -0.0284   0.0767   0.2414  -0.0061  -0.4539
102     0.0503   1.0000  -0.0108   0.1021   0.1678   0.0901   0.1178  -0.0376   0.1488  -0.0591
103    -0.1529  -0.0108   1.0000  -0.0548   0.2614   0.0699   0.0104  -0.2420   0.0231   0.1339
104     0.0680   0.1021  -0.0548   1.0000   0.2458   0.1749   0.1910   0.1267   0.0948  -0.0287
105     0.0209   0.1678   0.2614   0.2458   1.0000   0.3052  -0.0164   0.0759   0.5157   0.0111
005    -0.0284   0.0901   0.0699   0.1749   0.3052   1.0000   0.0297   0.1201   0.1346   0.1041
006     0.0767   0.1178   0.0104   0.1910  -0.0164   0.0297   1.0000  -0.0344  -0.1562   0.0034
008     0.2414  -0.0376  -0.2420   0.1267   0.0759   0.1201  -0.0344   1.0000  -0.0087   0.0309
009    -0.0061   0.1488   0.0231   0.0948   0.5157   0.1346  -0.1562  -0.0087   1.0000  -0.0006
010    -0.4539  -0.0591   0.1339  -0.0287   0.0111   0.1041   0.0034   0.0309  -0.0006   1.0000


Page 22: Modeling Dependency with Copula: Implications to Engineers and Planners

Implications to Planners

Example - Afternoon Peak Hour

4005-      101      102      103      104      105      005      006      008      009      010
101     1.0000   0.4030   0.2727   0.0914   0.0827   0.3679   0.3815  -0.0474   0.0044   0.0204
102     0.4030   1.0000   0.4563   0.1567   0.0937   0.5894   0.5233   0.0439  -0.1558   0.4893
103     0.2727   0.4563   1.0000   0.2309  -0.0606   0.4834   0.3606  -0.0407  -0.2626   0.3998
104     0.0914   0.1567   0.2309   1.0000   0.0008   0.2485   0.1592  -0.0808  -0.0927   0.1064
105     0.0827   0.0937  -0.0606   0.0008   1.0000   0.0024   0.1417  -0.1339   0.2076  -0.0106
005     0.3679   0.5894   0.4834   0.2485   0.0024   1.0000   0.5952   0.0249  -0.2089   0.5994
006     0.3815   0.5233   0.3606   0.1592   0.1417   0.5952   1.0000  -0.0083   0.0249   0.5276
008    -0.0474   0.0439  -0.0407  -0.0808  -0.1339   0.0249  -0.0083   1.0000  -0.0616   0.1039
009     0.0044  -0.1558  -0.2626  -0.0927   0.2076  -0.2089   0.0249  -0.0616   1.0000  -0.0787
010     0.0204   0.4893   0.3998   0.1064  -0.0106   0.5994   0.5276   0.1039  -0.0787   1.0000


Page 23: Modeling Dependency with Copula: Implications to Engineers and Planners

Implications to Planners

Example - Entrance Ramp Flow Dependency

(a) 4005101/102 (b) 4005102/103

(c) 4005103/104 (d) 4005104/105

Figure: Joint probability density contours from a Gaussian copula for the morning peak hour (01/02/2003) dependency among entrance ramps southbound on GA400

Page 24: Modeling Dependency with Copula: Implications to Engineers and Planners

Implications to Planners

Example - Entrance Ramp Flow Dependency

Figure (panels a to d): Morning peak (01/02/2003) dependency surfaces from nonparametric bivariate copulas for entrance ramps 4005101 to 4005105 southbound on GA400

Page 25: Modeling Dependency with Copula: Implications to Engineers and Planners

Implications to Planners

Example - Day-to-Day Analysis

(a) 4005101/102 (b) 4005102/103

(c) 4005103/104 (d) 4005104/105

Figure: Joint probability density contours from a bivariate Student t copula for the morning peak hour (01/07/2003) dependency among entrance ramps southbound on GA400

Page 26: Modeling Dependency with Copula: Implications to Engineers and Planners

Implications to Planners

Student-t Copula

(a) 4005101/102 (b) 4005102/103

(c) 4005103/104 (d) 4005104/105

Figure: Joint probability density contours from a bivariate Student t copula for the morning peak hour (01/07/2003) dependency among entrance ramps southbound on GA400

Page 27: Modeling Dependency with Copula: Implications to Engineers and Planners

Implications to Planners

Result Analysis - Student t copula

(a) 4005005/006 (b) 4005006/008

(c) 4005008/009 (d) 4005009/010

Figure: Joint probability density contours from a bivariate Student t copula for the afternoon peak hour (01/09/2003) dependency among entrance ramps northbound on GA400

Page 28: Modeling Dependency with Copula: Implications to Engineers and Planners

Implications to Planners

Simulation

Figure (panels a and b): Dependency structure between ramp flows in 2 and 3 dimensions


Page 29: Modeling Dependency with Copula: Implications to Engineers and Planners

Implications to Planners

Computing/Fitting with Copula

MATLAB: built-in functions

R: copula package


Page 30: Modeling Dependency with Copula: Implications to Engineers and Planners

Implications to Planners

Summary - Attractive Features (Daniel Berg (2008))

The copula contains all the information about the dependence between random variables

Copulas provide an alternative and often more useful representation of multivariate distribution functions compared to traditional approaches such as multivariate normality

Most traditional representations of dependence are based on the linear correlation coefficient, which is restricted to multivariate elliptical distributions. Copula representations of dependence are free of such limitations.

Copulas enable us to model marginal distributions and the dependence structure separately

Copulas provide greater modeling flexibility: given a copula, we can obtain many multivariate distributions by selecting different margins

A copula is invariant under strictly increasing transformations

Most traditional measures of dependence are measures of pairwise dependence. Copulas measure the dependence between all d random variables
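
The invariance property in this list can be checked empirically. The sketch below maps a sample to pseudo-observations (ranks divided by n + 1, a common empirical stand-in for the marginal cdf) and shows that they do not change when each variable is passed through a strictly increasing function. The sample values and the helper name are hypothetical, and the helper assumes no ties.

```python
import math


def pseudo_observations(values):
    """Map a sample to ranks / (n + 1); assumes there are no ties."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    result = [0.0] * n
    for rank, i in enumerate(order, start=1):
        result[i] = rank / (n + 1)
    return result


# Hypothetical paired sample.
x = [3.1, 0.4, 2.2, 5.0, 1.7]
y = [10.0, 2.0, 4.0, 9.0, 8.0]

u_original = list(zip(pseudo_observations(x), pseudo_observations(y)))

# exp and log are strictly increasing, so the ranks are unchanged.
u_transformed = list(zip(
    pseudo_observations([math.exp(v) for v in x]),
    pseudo_observations([math.log(v) for v in y]),
))

print(u_original == u_transformed)  # True
```

Because the empirical copula depends on the data only through these ranks, changing the units or scale of a variable leaves the fitted dependence structure untouched.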

Page 31: Modeling Dependency with Copula: Implications to Engineers and Planners

Q & A

Questions and Comments?

Thanks!

Jia Li at University of California Davis
