Hierarchical Models for Loss Reserving CAS Annual Meeting Jim Guszcza, FCAS, MAAA Washington, DC November, 2010 Deloitte Consulting


Hierarchical Models for Loss Reserving

CAS Annual Meeting
Jim Guszcza, FCAS, MAAA
Washington, DC
November, 2010

Deloitte Consulting


Agenda

• Motivation: Five Essential Features of Loss Reserving

• Background: Hierarchical Modeling Theory

• Case Study: Nonlinear Empirical Bayes Hierarchical Model

• Next Steps: A Few Words on the Hierarchical Bayesian Approach

Copyright © 2008 Deloitte Development LLC. All rights reserved.


Background

Models vs Methods
Need for Variability Estimates


Loss Reserving and its Discontents

• Much loss reserving practice is “pre-theoretical” in nature.
  • Techniques like chain ladder, BF, and Cape Cod aren’t performed in a statistical modeling framework.

• Traditional methods aren’t necessarily optimal from a statistical POV.
  • Potential of over-fitting small datasets.
  • Difficult to assess goodness-of-fit, compare nested models, etc.
  • Often no concept of out-of-sample validation or diagnostic plots.

• Related point: traditional methods produce point estimates only.
  • Reserve variability estimates in practice are often ad hoc.

• Stochastic reserving: build statistical models of loss development.
  • Attempt to place loss reserving practice on a sound scientific footing.
  • Field is developing rapidly.

• Today: explore non-linear hierarchical models (aka “nonlinear mixed effects models”) as natural, parsimonious models of the loss development process.
  • Initially motivated by Dave Clark’s paper [2003] as well as nonlinear mixed effects model [NLME] theory.


What Do You See?

• Here is an actual Schedule P cumulative loss triangle.

• We would like to do stochastic reserving the “right” way.

• What considerations come to mind?

Cumulative Losses in 1000's

AY    premium     12     24     36     48     60     72     84     96    108    120
1988    2,609    404    986  1,342  1,582  1,736  1,833  1,907  1,967  2,006  2,036
1989    2,694    387    964  1,336  1,580  1,726  1,823  1,903  1,949  1,987
1990    2,594    421  1,037  1,401  1,604  1,729  1,821  1,878  1,919
1991    2,609    338    753  1,029  1,195  1,326  1,395  1,446
1992    2,077    257    569    754    892    958  1,007
1993    1,703    193    423    589    661    713
1994    1,438    142    361    463    533
1995    1,093    160    312    408
1996    1,012    131    352
1997      976    122


Five Essential Features of Loss Reserving

• Repeated measures
  • The dataset is inherently longitudinal in nature.


• A “bundle” of time series
  • A loss triangle is a collection of time series that are “related” to one another…
  • … but there is no guarantee that the same development pattern is appropriate to each one.

• Unbalanced data
  • We are doing forecasting, so the time series are necessarily of different lengths.

• Non-linear
  • Each year’s loss development pattern is inherently non-linear.
  • Ultimate loss (ratio) is an asymptote.

• Incomplete information
  • Few loss triangles contain all of the information needed to make forecasts.
  • Most reserving exercises must incorporate judgment and/or background information.
  • In this sense, loss reserving is inherently Bayesian.
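To make the "repeated measures" and "unbalanced" points concrete, the sketch below flattens the Schedule P triangle into long-format records, one row per accident year and development age. The names `TRIANGLE` and `to_long_format` are illustrative choices, not part of the original presentation, and the development ages assume the 12, 24, ..., 120 month labels shown with the triangle.

```python
# Cumulative losses (in 1000's) from the Schedule P triangle,
# keyed by accident year; each list runs over development ages 12, 24, ...
TRIANGLE = {
    1988: [404, 986, 1342, 1582, 1736, 1833, 1907, 1967, 2006, 2036],
    1989: [387, 964, 1336, 1580, 1726, 1823, 1903, 1949, 1987],
    1990: [421, 1037, 1401, 1604, 1729, 1821, 1878, 1919],
    1991: [338, 753, 1029, 1195, 1326, 1395, 1446],
    1992: [257, 569, 754, 892, 958, 1007],
    1993: [193, 423, 589, 661, 713],
    1994: [142, 361, 463, 533],
    1995: [160, 312, 408],
    1996: [131, 352],
    1997: [122],
}


def to_long_format(triangle):
    """Flatten the triangle into (accident_year, dev_months, cum_loss) records."""
    records = []
    for ay, losses in sorted(triangle.items()):
        for k, cum_loss in enumerate(losses):
            records.append((ay, 12 * (k + 1), cum_loss))
    return records


records = to_long_format(TRIANGLE)
# Each accident year is its own short time series, and the lengths differ.
series_lengths = {ay: len(losses) for ay, losses in TRIANGLE.items()}
print(len(records))
print(series_lengths)
```

In this layout accident year plays the role of the grouping variable and development age plays the role of time, which is the shape that longitudinal (mixed effects) software generally expects.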


Towards a More Realistic Stochastic Reserving Framework

• How many stochastic loss reserving techniques reflect all of these considerations?
  1. Repeated measures (Isn’t loss reserving a type of longitudinal data analysis?)
  2. Multiple time series
  3. Unbalanced
  4. Non-linear (Are GLMs really appropriate?)
  5. Incomplete information (“Bayes or Bust”!)

• 1-3: We need to build hierarchical models.
• 4: Our hierarchical models should be non-linear (growth curves).
• 5: Our non-linear hierarchical models should be Bayesian.

(This presentation will cover 1-4… Wayne will cover 5.)


Components of Our Approach

• Growth curves to model the loss development process (Clark 2003)
  • Parsimony; obviates need for tail factors

• Loss reserving treated as longitudinal data analysis (Guszcza 2008)
  • Parsimony; similar approach to non-linear mixed effects models used in biological and social sciences

• Further using the hierarchical modeling framework to simultaneously model multiple loss triangles (Zhang-Dukic-Guszcza 2010)
  • “Borrow strength” from other loss reserving triangles
  • Similar in spirit to credibility theory

• Building a fully Bayesian model by assigning prior probability distributions to all hyperparameters (Zhang-Dukic-Guszcza 2010)
  • Provides formal mechanism for incorporating background knowledge and expert opinion with data-driven indications.
  • Results in full predictive distribution of all quantities of interest
  • Conceptual advantages: Bayesian paradigm treats data as fixed and parameters as randomly varying


Hierarchical Modeling Theory

Hierarchical Data Structures
Hierarchical Models
Motivating Samples


What is Hierarchical Modeling?

• Hierarchical modeling is used when one’s data is grouped in some important way.

  • Claim experience by state or territory
  • Workers Comp claim experience by class code
  • Income by profession
  • Claim severity by injury type
  • Churn rate by agency
  • Multiple years of loss experience by policyholder
  • Multiple observations of a cohort of claims over time

• Often grouped data is modeled either by:
  • Building separate models by group
  • Pooling the data and introducing dummy variables to reflect the groups

• Hierarchical modeling offers a “middle way”.
  • Parameters reflecting group membership enter one’s model through appropriately specified probability sub-models.


What’s in a Name?

• Hierarchical models go by many different names
  • Mixed effects models
  • Random effects models
  • Multilevel models
  • Longitudinal models
  • Panel data models

• We prefer the “hierarchical model” terminology because it evokes the way models-within-models are used to reflect levels-within-levels of one’s data.

• An important special case of hierarchical models involves multiple observations through time of each unit.
  • Here group membership is the repeated observations belonging to each individual.
  • Time is the covariate.


Common Hierarchical Models

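• Notation:
  • Data points $(X_i, Y_i)$, $i = 1, \dots, N$
  • $j[i]$: data point $i$ belongs to group $j$.

• Classical Linear Model: $Y_i = \alpha + \beta X_i + \varepsilon_i$
  • Equivalently: $Y_i \sim N(\alpha + \beta X_i,\ \sigma^2)$
  • Same $\alpha$ and $\beta$ for every data point

• Random Intercept Model: $Y_i = \alpha_{j[i]} + \beta X_i + \varepsilon_i$
  • Where $\alpha_j \sim N(\mu_\alpha,\ \sigma_\alpha^2)$ and $\varepsilon_i \sim N(0,\ \sigma^2)$
  • Same $\beta$ for every data point, but $\alpha$ varies by group

• Random Intercept and Slope Model: $Y_i = \alpha_{j[i]} + \beta_{j[i]} X_i + \varepsilon_i$
  • Where $(\alpha_j, \beta_j) \sim N(M, \Sigma)$ and $\varepsilon_i \sim N(0,\ \sigma^2)$
  • Both $\alpha$ and $\beta$ vary by group

$$Y_i \sim N\left(\alpha_{j[i]} + \beta_{j[i]} X_i,\ \sigma^2\right) \quad \text{where} \quad \begin{pmatrix} \alpha_j \\ \beta_j \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_\alpha \\ \mu_\beta \end{pmatrix},\ \Sigma \right), \quad \Sigma = \begin{pmatrix} \sigma_\alpha^2 & \sigma_{\alpha\beta} \\ \sigma_{\alpha\beta} & \sigma_\beta^2 \end{pmatrix}$$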


Example: PIF Growth by Region

• Simple example: Change in PIF by region from 2007-10
  • 32 data points
  • 4 years
  • 8 regions

• But we could as easily have 80 or 800 regions
  • Our model would not change

• We view the dataset as a bundle of very short time series

[Figure: PIF Growth by Region. One panel per region (region1 to region8); x-axis: year, 2007 to 2010; y-axis: PIF, roughly 2000 to 2600.]


Classical Linear Model

• Option 1: the classical linear model

• Sweep region under the carpet
  • $Y_i = \alpha + \beta X_i + \varepsilon_i$
  • Or: $Y_i \sim N(\alpha + \beta X_i,\ \sigma^2)$
  • Same $\alpha$ and $\beta$ for every data point

• This obviously doesn’t cut it

[Figure: PIF Growth by Region, with the classical linear model fit. One panel per region (region1 to region8); x-axis: year, 2007 to 2010; y-axis: PIF, roughly 2000 to 2600.]


Randomly Varying Intercepts

• Option 2: random intercept model

$$Y_i \sim N\left(\alpha_{j[i]} + \beta X_i,\ \sigma^2\right), \qquad \alpha_j \sim N\left(\mu_\alpha,\ \sigma_\alpha^2\right)$$

• $Y_i = \alpha_{j[i]} + \beta X_i + \varepsilon_i$

• This model has 9 parameters: $\{\alpha_1, \alpha_2, \dots, \alpha_8, \beta\}$

• And it contains 4 hyperparameters: $\{\mu_\alpha, \beta, \sigma, \sigma_\alpha\}$.

• This is a major improvement

[Figure: PIF Growth by Region, with the random intercept model fit. One panel per region (region1 to region8); x-axis: year, 2007 to 2010; y-axis: PIF, roughly 2000 to 2600.]


Randomly Varying Intercepts and Slopes

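• Option 3: random slope and intercept model

$$Y_i \sim N\left(\alpha_{j[i]} + \beta_{j[i]} X_i,\ \sigma^2\right) \quad \text{where} \quad \begin{pmatrix} \alpha_j \\ \beta_j \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_\alpha \\ \mu_\beta \end{pmatrix},\ \Sigma \right), \quad \Sigma = \begin{pmatrix} \sigma_\alpha^2 & \sigma_{\alpha\beta} \\ \sigma_{\alpha\beta} & \sigma_\beta^2 \end{pmatrix}$$

• $Y_i = \alpha_{j[i]} + \beta_{j[i]} X_i + \varepsilon_i$

• This model has 16 parameters: $\{\alpha_1, \alpha_2, \dots, \alpha_8, \beta_1, \beta_2, \dots, \beta_8\}$

• And it contains 6 hyperparameters: $\{\mu_\alpha, \mu_\beta, \sigma, \sigma_\alpha, \sigma_\beta, \sigma_{\alpha\beta}\}$

[Figure: PIF Growth by Region, with the random slope and intercept model fit. One panel per region (region1 to region8); x-axis: year, 2007 to 2010; y-axis: PIF, roughly 2000 to 2600.]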


Compromise Between Complete Pooling & No Pooling

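Complete Pooling
• Ignore group structure altogether
• $PIF = \alpha + \beta t + \varepsilon$

No Pooling
• Estimating one model for each group
• $\{PIF = \alpha_k + \beta_k t + \varepsilon_k\}$, $k = 1, 2, \dots, 8$

Compromise: Hierarchical Model
• Estimates parameters using a compromise between complete pooling and no pooling.

$$Y_i \sim N\left(\alpha_{j[i]} + \beta_{j[i]} X_i,\ \sigma^2\right) \quad \text{where} \quad \begin{pmatrix} \alpha_j \\ \beta_j \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_\alpha \\ \mu_\beta \end{pmatrix},\ \Sigma \right), \quad \Sigma = \begin{pmatrix} \sigma_\alpha^2 & \sigma_{\alpha\beta} \\ \sigma_{\alpha\beta} & \sigma_\beta^2 \end{pmatrix}$$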
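The two extremes that the hierarchical model sits between can be sketched with plain least squares. The PIF figures below are hypothetical, noise-free values invented for illustration (the deck shows the data only as a plot), and `ols` is a small helper written for this sketch rather than a library routine.

```python
def ols(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b


# Hypothetical PIF data: 8 regions x 4 years (t = 0..3 stands for 2007..2010).
years = [0, 1, 2, 3]
true_lines = {k: (2000.0 + 50.0 * k, 10.0 * k) for k in range(1, 9)}  # (alpha_k, beta_k)
data = {k: [a + b * t for t in years] for k, (a, b) in true_lines.items()}

# No pooling: one regression per region (16 parameters in total).
no_pooling = {k: ols(years, ys) for k, ys in data.items()}

# Complete pooling: one regression on all 32 points, ignoring region.
all_t = [t for _ in data for t in years]
all_y = [y for ys in data.values() for y in ys]
complete_pooling = ols(all_t, all_y)

print(no_pooling[3])
print(complete_pooling)
```

Because every region shares the same four time points here, the completely pooled line is simply the average of the regional lines. A hierarchical fit would place each region's estimate between its own line and that average.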


A Credible Approach

• Recall the random intercept model:

$$Y_i \sim N\left(\alpha_{j[i]} + \beta X_i,\ \sigma^2\right), \qquad \alpha_j \sim N\left(\mu_\alpha,\ \sigma_\alpha^2\right)$$

• This model can contain a large number of parameters: $\{\alpha_1, \alpha_2, \dots, \alpha_J, \beta\}$.
• And it contains 4 hyperparameters: $\{\mu_\alpha, \beta, \sigma, \sigma_\alpha\}$.

• Here is how the hyperparameters relate to the parameters:

$$\hat{\alpha}_j = Z_j \cdot (\bar{y}_j - \beta \bar{x}_j) + (1 - Z_j) \cdot \hat{\mu}_\alpha \quad \text{where} \quad Z_j = \frac{n_j}{n_j + \sigma^2 / \sigma_\alpha^2}$$

• Does this formula look familiar?
• Credibility theory is a special case of hierarchical models!


The Middle Way

• This makes precise the sense in which the random intercept model is a compromise between the pooled-data model (option 1) and the separate models for each region (option 2).

$$\hat{\alpha}_j = Z_j \cdot (\bar{y}_j - \beta \bar{t}_j) + (1 - Z_j) \cdot \hat{\mu}_\alpha \quad \text{where} \quad Z_j = \frac{n_j}{n_j + \sigma^2 / \sigma_\alpha^2}$$

• As $\sigma_\alpha \to 0$, the random intercept model → complete pooling
• As $\sigma_\alpha \to \infty$, the random intercept model → separate models by group

• In principle it’s always appropriate to use hierarchical models
  • Rather than a judgment call, the data tells us the degree to which the groups should be fit using separate models or a single common model
  • It is no longer an all-or-nothing decision
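The credibility-style weighting and its two limits can be checked numerically. This is a minimal sketch: the function names and every input value are hypothetical, and the hyperparameters are treated as known rather than estimated from data.

```python
def credibility_weight(n_j, sigma, sigma_alpha):
    """Z_j = n_j / (n_j + sigma^2 / sigma_alpha^2)."""
    if sigma_alpha == 0.0:
        return 0.0  # limiting case: complete pooling
    return n_j / (n_j + sigma ** 2 / sigma_alpha ** 2)


def partial_pooling_intercept(y_bar_j, x_bar_j, beta, mu_alpha, n_j, sigma, sigma_alpha):
    """alpha_hat_j = Z_j * (y_bar_j - beta * x_bar_j) + (1 - Z_j) * mu_alpha."""
    z = credibility_weight(n_j, sigma, sigma_alpha)
    return z * (y_bar_j - beta * x_bar_j) + (1.0 - z) * mu_alpha


# Hypothetical region: 4 observations, mean response 2400 at mean time 1.5.
args = dict(y_bar_j=2400.0, x_bar_j=1.5, beta=40.0, mu_alpha=2200.0, n_j=4, sigma=100.0)

for sigma_alpha in (0.0, 50.0, 1e9):
    print(sigma_alpha, partial_pooling_intercept(sigma_alpha=sigma_alpha, **args))
```

With `sigma_alpha = 0` the estimate collapses to the overall mean `mu_alpha` (complete pooling), and with a very large `sigma_alpha` it approaches the region's own intercept (no pooling). Intermediate values give a weighted average, exactly as in a credibility formula.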


Hierarchical Growth Curve Loss Reserving Model


Hierarchical Modeling for Loss Reserving

• Here is our Schedule P loss triangle:

Cumulative Losses in 1000's

AY            premium     12     24     36     48     60     72     84     96    108    120  CL Ult  CL LR  CL res
1988            2,609    404    986  1,342  1,582  1,736  1,833  1,907  1,967  2,006  2,036   2,036   0.78       0
1989            2,694    387    964  1,336  1,580  1,726  1,823  1,903  1,949  1,987          2,017   0.75      29
1990            2,594    421  1,037  1,401  1,604  1,729  1,821  1,878  1,919                 1,986   0.77      67
1991            2,609    338    753  1,029  1,195  1,326  1,395  1,446                        1,535   0.59      89
1992            2,077    257    569    754    892    958  1,007                               1,110   0.53     103
1993            1,703    193    423    589    661    713                                        828   0.49     115
1994            1,438    142    361    463    533                                               675   0.47     142
1995            1,093    160    312    408                                                      601   0.55     193
1996            1,012    131    352                                                             702   0.69     350
1997              976    122                                                                    576   0.59     454

chain link             2.365  1.354  1.164  1.090  1.054  1.038  1.026  1.020  1.015  1.000  12,067           1,543
chain ldf              4.720  1.996  1.473  1.266  1.162  1.102  1.062  1.035  1.015  1.000
growth curve           21.2%  50.1%  67.9%  79.0%  86.1%  90.7%  94.2%  96.6%  98.5% 100.0%

• Let’s model this as a longitudinal dataset.
• Grouping dimension: Accident Year (AY)

• We can build a parsimonious non-linear model that uses random effects to allow the model parameters to vary by accident year.
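For reference, the chain ladder columns in the table can be approximately reproduced from the triangle. The sketch below assumes volume-weighted link ratios and no tail factor beyond 120 months; the deck does not state how its factors were computed, and because the triangle is shown in rounded thousands the results can differ from the slide in the last digit.

```python
# Cumulative losses (in 1000's) by accident year, development ages 12, 24, ...
TRIANGLE = {
    1988: [404, 986, 1342, 1582, 1736, 1833, 1907, 1967, 2006, 2036],
    1989: [387, 964, 1336, 1580, 1726, 1823, 1903, 1949, 1987],
    1990: [421, 1037, 1401, 1604, 1729, 1821, 1878, 1919],
    1991: [338, 753, 1029, 1195, 1326, 1395, 1446],
    1992: [257, 569, 754, 892, 958, 1007],
    1993: [193, 423, 589, 661, 713],
    1994: [142, 361, 463, 533],
    1995: [160, 312, 408],
    1996: [131, 352],
    1997: [122],
}


def link_ratios(triangle):
    """Volume-weighted age-to-age factors from a cumulative triangle."""
    n_ages = max(len(row) for row in triangle.values())
    ratios = []
    for k in range(n_ages - 1):
        # Only accident years observed at both age k and age k + 1 contribute.
        rows = [row for row in triangle.values() if len(row) > k + 1]
        ratios.append(sum(r[k + 1] for r in rows) / sum(r[k] for r in rows))
    return ratios


def ldfs_to_ultimate(ratios):
    """Cumulative products of the link ratios, assuming no tail after the last age."""
    ldfs = [1.0] * (len(ratios) + 1)
    for k in range(len(ratios) - 1, -1, -1):
        ldfs[k] = ratios[k] * ldfs[k + 1]
    return ldfs


links = link_ratios(TRIANGLE)
ldfs = ldfs_to_ultimate(links)
ultimates = {ay: row[-1] * ldfs[len(row) - 1] for ay, row in TRIANGLE.items()}
growth = [1.0 / f for f in ldfs]  # the "growth curve" row is 1 / LDF

print([round(f, 3) for f in links])
print({ay: round(u) for ay, u in ultimates.items()})
```

The `growth` list is the empirical counterpart of the parametric growth curve: one free percentage per development age, instead of a two-parameter curve.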


Growth Curves

• We want to build a model that reflects the non-linear nature of loss development.
  • GLM shows up a lot in the stochastic loss reserving literature.
  • But… are GLMs natural models for loss triangles?

• Growth curves
  • 2-parameter curves
  • θ = scale
  • ω = shape
  • See Clark [2003]

$$G(x \mid \omega, \theta) = 1 - \exp\left(-(x/\theta)^{\omega}\right)$$

• Heuristic idea:
  • We fit these curves to the LDFs
  • Add random effects
  • Allows ultimate loss (ratio) and/or θ and/or ω to vary randomly by year.

[Figure: Weibull Growth Curve vs Chain Ladder LDFs. x-axis: Development Months (20 to 100); y-axis: Cumulative Percent of Ultimate (0.0 to 1.0); series: Weibull Growth Curve, Chain Ladder 1/LDF.]


Baseline Model: Heuristics

• Basic intuition is familiar:

$$CL_{AY,t} \cdot LDF_t = \text{Ult loss}_{AY}$$

$$CL_{AY,t} = \text{Ult loss}_{AY} \cdot (1 / LDF_t)$$

$$CL_{AY,t} = \text{Ult loss}_{AY} \cdot G_{\omega,\theta}(t) + \text{error}$$

[Figure: Weibull and Loglogistic Growth Curves. x-axis: Age in Months (12 to 180); y-axis: Cumulative Percent of Ultimate (0.2 to 1.0); series: Weibull, Loglogistic.]

$$\text{CumLoss}_{AY,dev} = \text{LR}_{AY} \cdot \text{prem}_{AY} \cdot \left[1 - \exp\left(-(dev/\theta)^{\omega}\right)\right] + \varepsilon_{AY,dev}$$

$$\text{LR}_{AY} \sim N\left(\mu_{LR},\ \sigma_{LR}^2\right)$$

$$\varepsilon_{AY,dev} = \phi\,\varepsilon_{AY,dev-1} + a_{AY,dev}$$

• The “growth curve” part comes in by using G(t) instead of LDFs.
  • Think of LDFs as a rough piecewise linear approximation to G(t)

• The “hierarchical” part comes in because we can let $\text{LR}_{AY}$, θ, and/or ω vary by AY (using sub-models).
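The mean of the baseline model is simple to evaluate directly. The sketch below uses ω = 0.96 and θ = 26.55, the fitted values reported in the Model Results table later in the deck; the function names are illustrative, and the premium and loss ratio in the example call are the rounded 1997 figures from that table.

```python
import math


def weibull_growth(dev, omega, theta):
    """G(dev | omega, theta) = 1 - exp(-(dev / theta) ** omega)."""
    return 1.0 - math.exp(-((dev / theta) ** omega))


def expected_cum_loss(prem, loss_ratio, dev, omega, theta):
    """Mean of the baseline model: prem * LR * G(dev)."""
    return prem * loss_ratio * weibull_growth(dev, omega, theta)


OMEGA, THETA = 0.96, 26.55  # fitted values from the Model Results table

# Fraction of ultimate emerged at a few development ages (in months).
for dev in (6, 18, 114):
    print(dev, round(weibull_growth(dev, OMEGA, THETA), 3))

# Expected cumulative loss for accident year 1997 at 6 months of development.
print(round(expected_cum_loss(975.9, 0.61, 6, OMEGA, THETA), 1))
```

Because the parameters above are rounded to two decimals, the growth percentages agree with the "growth" column of the Model Results table only to within rounding.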


Other “Random Effects”

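• Our model so far:

$$\text{CumLoss}_{AY,dev} = \text{LR}_{AY} \cdot \text{prem}_{AY} \cdot \left[1 - \exp\left(-(dev/\theta)^{\omega}\right)\right] + \varepsilon_{AY,dev}$$

$$\text{LR}_{AY} \sim N\left(\mu_{LR},\ \sigma_{LR}^2\right)$$

$$\varepsilon_{AY,dev} = \phi\,\varepsilon_{AY,dev-1} + a_{AY,dev}$$

• What if we want to include other random effects in the model?
• It’s easily done:

$$\text{CumLoss}_{AY,dev} = \text{ULT}_{AY} \cdot \left[1 - \exp\left(-(dev/\theta_{AY})^{\omega}\right)\right] + \varepsilon_{AY,dev}$$

$$\begin{pmatrix} \text{ULT}_{AY} \\ \theta_{AY} \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_{ULT} \\ \mu_\theta \end{pmatrix},\ \Sigma \right), \quad \Sigma = \begin{pmatrix} \sigma_{ULT}^2 & \sigma_{ULT,\theta} \\ \sigma_{ULT,\theta} & \sigma_\theta^2 \end{pmatrix}$$

$$\varepsilon_{AY,dev} = \phi\,\varepsilon_{AY,dev-1} + a_{AY,dev}$$

• Here we add a “random scale” effect to let θ vary by AY.
  • This is analogous to letting slope vary in a linear model
  • It is unlikely that a “random warp” (ω) effect will be significant.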
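To see how the pieces of the random-LR model combine, the sketch below simulates one accident year from it: a loss ratio drawn from its normal sub-model, the Weibull growth curve, and AR(1) errors. All hyperparameter values in the example call except ω and θ are hypothetical, and starting the AR(1) recursion at zero is a simplifying assumption of this sketch.

```python
import math
import random


def weibull_growth(dev, omega, theta):
    """G(dev | omega, theta) = 1 - exp(-(dev / theta) ** omega)."""
    return 1.0 - math.exp(-((dev / theta) ** omega))


def simulate_accident_year(prem, devs, mu_lr, sigma_lr, omega, theta, phi, sigma_a, rng):
    """Draw one accident year's cumulative losses from the random-LR model.

    LR_AY ~ N(mu_lr, sigma_lr^2); errors follow eps_t = phi * eps_{t-1} + a_t
    with a_t ~ N(0, sigma_a^2), and the recursion is started at zero.
    """
    loss_ratio = rng.gauss(mu_lr, sigma_lr)
    eps = 0.0
    path = []
    for dev in devs:
        eps = phi * eps + rng.gauss(0.0, sigma_a)
        path.append(prem * loss_ratio * weibull_growth(dev, omega, theta) + eps)
    return loss_ratio, path


rng = random.Random(2010)
devs = list(range(6, 120, 12))  # 6, 18, ..., 114 months
lr, path = simulate_accident_year(
    prem=2609.0, devs=devs,
    mu_lr=0.63, sigma_lr=0.12,   # hypothetical LR sub-model
    omega=0.96, theta=26.55,     # fitted values reported in the deck
    phi=0.5, sigma_a=20.0,       # hypothetical AR(1) error parameters
    rng=rng,
)
print(round(lr, 3), [round(x) for x in path])
```

Extending this to the random-scale model would mean drawing the pair (ULT, θ) jointly from a bivariate normal for each accident year instead of drawing the loss ratio alone.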


Baseline Model Performance

$$\text{CumLoss}_{AY,dev} = \text{LR}_{AY} \cdot \text{prem}_{AY} \cdot \left[1 - \exp\left(-(dev/\theta)^{\omega}\right)\right] + \varepsilon_{AY,dev}$$

$$\text{LR}_{AY} \sim N\left(\mu_{LR},\ \sigma_{LR}^2\right)$$

$$\varepsilon_{AY,dev} = \phi\,\varepsilon_{AY,dev-1} + a_{AY,dev}$$

• Random LR effects allow a “custom fit” growth curve for each AY
  • Yet the model is very parsimonious
  • The model contains only 6 hyperparameters, but fits the loss triangle very well
  • Parsimony is achieved because the model is well suited to the data (not ad hoc)

[Figure: Weibull Growth Curve Model -- AR(1) Errors; Randomly Varying Ultimate Loss Ratio by AY. One panel per accident year (1988 to 1997); x-axis: Development Time (6 to 102); y-axis: Cumulative Loss (500 to 2500).]


Baseline Model Performance

$$\text{CumLoss}_{AY,dev} = \text{LR}_{AY} \cdot \text{prem}_{AY} \cdot \left[1 - \exp\left(-(dev/\theta)^{\omega}\right)\right] + \varepsilon_{AY,dev}$$

$$\text{LR}_{AY} \sim N\left(\mu_{LR},\ \sigma_{LR}^2\right)$$

$$\varepsilon_{AY,dev} = \phi\,\varepsilon_{AY,dev-1} + a_{AY,dev}$$

• We have estimated the parameters: $\{\mu_{LR}, \omega, \theta, \sigma_{LR}, \phi, \sigma_a\}$
  • Random effects are added to the ultimate LR parameter: analogous to random intercepts
  • Should we also include random scale (θ) effects? Analogous to random slopes

[Figure: Weibull Growth Curve Model -- AR(1) Errors; Randomly Varying Ultimate Loss Ratio by AY. One panel per accident year (1988 to 1997); x-axis: Development Time (6 to 102); y-axis: Cumulative Loss (500 to 2500).]


Residual Diagnostics

• Advantage of stochastic reserving: enables us to graphically analyze residuals.
• Residual diagnostics suggest a reasonable fit.

[Figure: Residual diagnostics for the Weibull Growth Curve Model -- AR(1) Errors; Randomly Varying Ultimate Loss Ratio by AY. Panels: Residual Histogram; Normal QQ Plot; Actual vs Predicted; Residuals vs Predicted; Residuals vs Development Time; Residuals vs Accident Year.]


Model Results

Chain Ladder Analysis

AY            premium      6     18     30     42     54     66     78     90    102    114  CL Ult  CL res
1988            2,609    404    986  1,342  1,582  1,736  1,833  1,907  1,967  2,006  2,036   2,036       0
1989            2,694    387    964  1,336  1,580  1,726  1,823  1,903  1,949  1,987          2,017      29
1990            2,594    421  1,037  1,401  1,604  1,729  1,821  1,878  1,919                 1,986      67
1991            2,609    338    753  1,029  1,195  1,326  1,395  1,446                        1,535      89
1992            2,077    257    569    754    892    958  1,007                               1,110     103
1993            1,703    193    423    589    661    713                                        828     115
1994            1,438    142    361    463    533                                               675     142
1995            1,093    160    312    408                                                      601     193
1996            1,012    131    352                                                             702     350
1997              976    122                                                                    576     454

chain link             2.365  1.354  1.164  1.090  1.054  1.038  1.026  1.020  1.015  1.000  12,067   1,543
chain ldf              4.720  1.996  1.473  1.266  1.162  1.102  1.062  1.035  1.015  1.000
growth curve           21.2%  50.1%  67.9%  79.0%  86.1%  90.7%  94.2%  96.6%  98.5% 100.0%

Parameters and Estimated Reserves - Weibull Model

AY      prem  dev    LR  omega  theta  growth  reported  eval120  eval240     ULT  reserves
1988    2609  114  0.78   0.96  26.55   98.3%     2,036    2,017    2,045   2,045         9
1989    2694  102  0.75   0.96  26.55   97.4%     1,987    2,004    2,031   2,032        44
1990    2594   90  0.78   0.96  26.55   96.1%     1,919    2,001    2,028   2,029       110
1991    2609   78  0.58   0.96  26.55   94.1%     1,446    1,497    1,518   1,518        72
1992    2077   66  0.53   0.96  26.55   91.0%     1,007    1,088    1,103   1,103        95
1993    1703   54  0.49   0.96  26.55   86.2%       713      818      829     829       117
1994    1438   42  0.48   0.96  26.55   78.9%       533      688      697     697       164
1995    1093   30  0.53   0.96  26.55   67.5%       408      567      575     575       167
1996    1012   18  0.73   0.96  26.55   49.7%       352      725      735     735       384
1997   975.9    6  0.61   0.96  26.55   21.2%       122      587      595     596       474
total                                                     11,991           12,159     1,636

• The overall Weibull reserve estimate is higher than that of the chain ladder because of “tail” development beyond 120 months.
• The LR, omega, and theta columns hold the 12 hierarchical model parameters (ten accident-year LRs plus ω and θ).
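The reserve totals can be checked with a few lines of arithmetic. The sketch below copies the reported and ULT columns from the Weibull table and the chain ladder reserve total of 1,543 from the chain ladder table; the variable names are illustrative. Individual accident-year reserves computed this way can differ from the table by 1 because the table was built from unrounded figures.

```python
# (accident year, reported, Weibull ULT) copied from the Weibull model table
WEIBULL = [
    (1988, 2036, 2045), (1989, 1987, 2032), (1990, 1919, 2029), (1991, 1446, 1518),
    (1992, 1007, 1103), (1993, 713, 829), (1994, 533, 697), (1995, 408, 575),
    (1996, 352, 735), (1997, 122, 596),
]
CHAIN_LADDER_TOTAL_RESERVE = 1543  # total "CL res" from the chain ladder table

# Reserve = estimated ultimate minus losses reported to date.
reserves = {ay: ult - reported for ay, reported, ult in WEIBULL}
weibull_total = sum(reserves.values())
tail_difference = weibull_total - CHAIN_LADDER_TOTAL_RESERVE

print(weibull_total, tail_difference)
```

The difference between the two totals is the extra development the Weibull curve projects, which the chain ladder with a 1.000 tail factor does not.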


Making Predictions is Difficult (Especially About the Future)

• Most loss reserve variability is due to “model risk”
  • In this context, the most serious model risk is the choice of growth curves
  • Both the Weibull and Loglogistic models fit the available data very well
  • But they extrapolate very differently

[Figure: Comparison of Weibull vs Loglogistic Model Extrapolations. Panels: Weibull Random LR Model; Loglogistic Random LR Model; one curve per accident year (1988 to 1997); x-axis: development months (6 to 222); y-axis: cumulative loss (500 to 2500).]
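The extrapolation issue can be illustrated by evaluating both growth curves beyond the observed development ages. The Weibull parameters below are the fitted values from the deck; the loglogistic parameters are hypothetical, chosen only to show a heavier tail and not fitted to the triangle. The loglogistic form used here is the one given in Clark [2003].

```python
import math


def weibull_growth(dev, omega, theta):
    """G(dev) = 1 - exp(-(dev / theta) ** omega)."""
    return 1.0 - math.exp(-((dev / theta) ** omega))


def loglogistic_growth(dev, omega, theta):
    """G(dev) = dev^omega / (dev^omega + theta^omega)."""
    return dev ** omega / (dev ** omega + theta ** omega)


WEIBULL_PARAMS = (0.96, 26.55)    # omega, theta from the Weibull table in the deck
LOGLOGISTIC_PARAMS = (1.4, 25.0)  # hypothetical values, for illustration only

for dev in (12, 60, 120, 240):
    w = weibull_growth(dev, *WEIBULL_PARAMS)
    ll = loglogistic_growth(dev, *LOGLOGISTIC_PARAMS)
    print(dev, round(w, 3), round(ll, 3))

# Share of ultimate loss still to emerge after 120 months under each curve.
weibull_tail = 1.0 - weibull_growth(120, *WEIBULL_PARAMS)
loglogistic_tail = 1.0 - loglogistic_growth(120, *LOGLOGISTIC_PARAMS)
print(round(weibull_tail, 3), round(loglogistic_tail, 3))
```

The Weibull curve approaches 100% exponentially fast, while the loglogistic approaches it only polynomially, so the two can look similar over the observed ages and still imply quite different tails.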


Bayesian Motivation

“Given any value (estimate of future payments) and our current state of knowledge, what is the probability that the final payments will be no larger than the given value?”

-- Casualty Actuarial Society’s Working Party on Quantifying Variability in Reserve Estimates, 2004

• This can be read as a request for a Bayesian analysis
  • Bayesians (unlike frequentists) are willing to make probability statements about unknown parameters
  • Ultimate losses are “single cases”: difficult to conceive as random draws from a “sampling distribution in the sky”.
  • Frequentist probability involves repeated trials of setups involving physical randomization.
  • In contrast, it is meaningful to apply Bayesian probabilities to “single case events”
  • The Bayesian analysis yields an entire posterior probability distribution, not merely moment estimates

• Bayesian statistics is the ideal framework for loss reserving
  • Remaining task: put prior probability distributions on model hyperparameters
  • Hierarchical Bayes framework also allows us to bring in collateral information in the form of other loss triangles
  • Other companies, other regions, …


MCMC from A to B

• Before 1990, Bayesian statistics was a lot more talk than action.
  • Unless you use conjugate priors, calculating posterior probability distributions is cumbersome at best, intractable at worst.

• Markov Chain Monte Carlo [MCMC]: a simulation technique used to solve high-dimensional integration problems.
  • Developed by physicists at Los Alamos
  • Introduced as a Bayesian computational technique by Gelfand and Smith in 1990
  • Gelfand and Smith sparked a renaissance in the practice of Bayesian statistics
  • As a result of the Gelfand and Smith MCMC approach, Bayesian statistics is now common in fields like:
    – Marketing science (hierarchical approach: customers are exchangeable units)
    – Epidemiology, disease mapping
    – Genomics

• Rather than explicitly calculating high-dimensional posterior densities, we estimate them by drawing repeated samples from them.
  • Construct a Markov chain whose equilibrium distribution is the posterior that we’re trying to estimate.
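The idea of sampling from a posterior instead of computing it can be shown on a deliberately small problem. The sketch below uses random-walk Metropolis, which is a different MCMC variant from the Gibbs sampler popularized by Gelfand and Smith, on a one-parameter model: a normal mean with known standard deviation and a normal prior. The data are the LR column of the Weibull table, treated here as if they were direct observations; the values of `SIGMA`, `PRIOR_MEAN`, and `PRIOR_SD` are hypothetical. This toy posterior is conjugate, so the exact answer is available for comparison.

```python
import math
import random


def log_posterior(mu, data, sigma, prior_mean, prior_sd):
    """Log posterior (up to a constant) for a normal mean with known sigma."""
    log_lik = -sum((y - mu) ** 2 for y in data) / (2.0 * sigma ** 2)
    log_prior = -((mu - prior_mean) ** 2) / (2.0 * prior_sd ** 2)
    return log_lik + log_prior


def metropolis(data, sigma, prior_mean, prior_sd, n_iter, step, seed):
    """Random-walk Metropolis: a Markov chain whose equilibrium distribution is the posterior."""
    rng = random.Random(seed)
    mu = prior_mean
    current = log_posterior(mu, data, sigma, prior_mean, prior_sd)
    draws = []
    for _ in range(n_iter):
        proposal = mu + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, data, sigma, prior_mean, prior_sd)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            mu, current = proposal, candidate
        draws.append(mu)
    return draws


DATA = [0.78, 0.75, 0.78, 0.58, 0.53, 0.49, 0.48, 0.53, 0.73, 0.61]
SIGMA, PRIOR_MEAN, PRIOR_SD = 0.12, 0.6, 0.5  # hypothetical values

draws = metropolis(DATA, SIGMA, PRIOR_MEAN, PRIOR_SD, n_iter=20000, step=0.05, seed=1)
kept = draws[2000:]  # discard burn-in
posterior_mean = sum(kept) / len(kept)

# Exact conjugate answer, available only because this toy problem is tractable.
precision = len(DATA) / SIGMA ** 2 + 1.0 / PRIOR_SD ** 2
exact_mean = (sum(DATA) / SIGMA ** 2 + PRIOR_MEAN / PRIOR_SD ** 2) / precision
print(round(posterior_mean, 3), round(exact_mean, 3))
```

The same accept/reject loop applies when no closed form exists, which is the situation for the nonlinear hierarchical reserving model; in practice one would use dedicated MCMC software rather than a hand-written sampler.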


References

Clark, David R. (2003). “LDF Curve Fitting and Stochastic Loss Reserving: A Maximum Likelihood Approach,” CAS Forum.

Frees, Edward (2006). Longitudinal and Panel Data: Analysis and Applications in the Social Sciences. New York: Cambridge University Press.

Gelman, Andrew and Jennifer Hill (2007). Data Analysis Using Regression and Multilevel / Hierarchical Models. New York: Cambridge University Press.

Guszcza, James (2008). “Hierarchical Growth Curve Models for Loss Reserving,” CAS Forum.

Pinheiro, Jose and Douglas Bates (2000). Mixed-Effects Models in S and S-Plus. New York: Springer-Verlag.

Guszcza, James and Thomas Herzog (2010). “Enhanced Credibility: Actuarial Science and the Renaissance of Bayesian Thinking,” Contingencies.

Zhang, Yanwei, Vanja Dukic, and James Guszcza (2010). “A Bayesian Nonlinear Model for Forecasting Insurance Payments,” working paper.
