1 Loss reserving with GLMs: a case study Greg Taylor Taylor Fry Consulting Actuaries Melbourne...

1

Loss reserving with GLMs: a case study

Greg Taylor

Taylor Fry Consulting Actuaries

Melbourne University

University of New South Wales

Gráinne McGuire

Taylor Fry Consulting Actuaries

Casualty Actuarial Society, Spring Meeting

Colorado Springs CO, May 16-19 2004

2

Purpose

• Examine loss reserving in relation to a particular data set

• How credible are chain ladder reserves?

• Are there any identifiable inconsistencies between the data and the assumptions underlying the chain ladder model?

• If so, do they really matter? Or are we just making an academic mountain out of a molehill?

• Can the chain ladder model be conveniently adjusted to eliminate any such inconsistencies?

• If not, what shall we do?

• Lessons learnt from this specific data set intended to be of wider applicability

3

The data set

• Auto Bodily Injury insurance

• Compulsory

• No coverage of property damage

• Claims data relates to Scheme of insurance for one state of Australia

• Pooled data for the entire state

• Scheme of insurance is state regulated but privately underwritten

• Access to common law

• But some restriction on payment of plaintiff costs in the case of smaller claims

• Premium rates partially regulated

4

The data set (continued)

• Centralised data base for Scheme

• Current at 30 September 2003

• About 60,000 claims

• Individual claim records

• Claim header file

• Date of injury, date of notification, injury type, injury severity, etc

• Transaction file

• Paid losses (corrected for wage inflation)

• Case estimate file

5

Starting point for analysis

• Chain ladder

• This paper is not a vendetta against the chain ladder

• However, it is taken as the point of departure because of its

• Simplicity

• Wide usage

6

Chain ladder

• First, basic test of chain ladder validity

• Fundamental premise of chain ladder is constancy of expected age-to-age factors from one accident period to another

[Figure: Payments in respect of settled claims: age-to-age factors for various averaging periods. Vertical axis: age-to-age factors (ticks at 1.00, 10.00, 100.00). Horizontal axis: development quarters (1:0 to 10:9). Series: Last 1 year, Last 2 years, Last 3 years, Last 4 years, All years.]

• This data set fails the test comprehensively
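The constancy test can be tried on any cumulative triangle. The following is a minimal sketch, not the authors' calculation: it computes volume-weighted age-to-age factors for different averaging periods. The function name `age_to_age_factors`, the choice of volume weighting, and the small triangle are illustrative assumptions; the figures are invented and are not the Scheme data.

```python
def age_to_age_factors(triangle, last_n=None):
    """Volume-weighted age-to-age factors from a cumulative triangle.

    triangle: list of rows, one per accident period (oldest first); each row
    holds the cumulative payments observed so far, so later rows are shorter.
    last_n: use only the latest `last_n` accident periods that have both
    development periods observed (the last `last_n` diagonals for that
    factor); None uses all of them.
    """
    n_dev = len(triangle[0])
    factors = []
    for j in range(n_dev - 1):
        rows = [row for row in triangle if len(row) > j + 1]
        if last_n is not None:
            rows = rows[-last_n:]
        numerator = sum(row[j + 1] for row in rows)
        denominator = sum(row[j] for row in rows)
        factors.append(numerator / denominator)
    return factors


# Hypothetical cumulative triangle with a downward trend in development.
triangle = [
    [100.0, 300.0, 450.0, 500.0],
    [100.0, 280.0, 400.0],
    [100.0, 250.0],
    [100.0],
]

factors_all = age_to_age_factors(triangle)
factors_last1 = age_to_age_factors(triangle, last_n=1)
print(factors_all)
print(factors_last1)
```

If the chain ladder premise held, the two sets of factors would differ only by noise. In this invented triangle the first factor falls steadily by accident period, so the choice of averaging period changes the result.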

7

Chain ladder – does the instability matter?

• Range of variation is 19%

• Omitting just the last quarter’s experience increases loss reserve by 10-15%

Loss reserve at 30 Sept 2003 (excl. Sept 2003 accident qr)

Averaging period                                              $B
All experience quarters                                       1.61
Last 8 experience quarters                                    1.68
All experience quarters except Sept 2003 (last diagonal)      1.78
Last 8 experience quarters except Sept 2003 (last diagonal)   1.92

8

Chain ladder – does the instability matter?

• Actually, the situation is much worse than this

• Effect of September 2003 quarter (last diagonal) on loss reserve

• Due to low age-to-age factors in the quarter

• In turn due to low paid losses in the quarter

• Suggests

• Not only omitting September 2003 quarter age-to-age factors from averaging

• But also recognising that loss reserve is increased by low paid loss experience

• Estimate loss reserve at 30 June 2003

• Deduct paid losses during September 2003 quarter

9

Chain ladder – does the instability matter?

• Now 46% difference between highest estimate and lowest in previous table

• More than an academic molehill

Loss reserve at 30 Sept 2003 (excl. Sept 2003 accident qr)

Averaging period                                              Uncorrected $B   Corrected $B
All experience quarters except Sept 2003 (last diagonal)      1.78             1.94
Last 8 experience quarters except Sept 2003 (last diagonal)   1.92             2.35
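The percentages quoted on this slide and the previous table slide follow directly from the tabulated reserves. The short check below reproduces them; the dictionary keys and the helper `pct_increase` are illustrative names, and the only inputs are the $B figures shown in the two tables.

```python
# Loss reserves in $B, copied from the two chain ladder tables.
uncorrected = {
    "all": 1.61,
    "last 8": 1.68,
    "all, excl. Sept 2003": 1.78,
    "last 8, excl. Sept 2003": 1.92,
}
corrected = {
    "all, excl. Sept 2003": 1.94,
    "last 8, excl. Sept 2003": 2.35,
}


def pct_increase(new, old):
    """Percentage by which `new` exceeds `old`."""
    return 100.0 * (new / old - 1.0)


lowest = min(uncorrected.values())

# Range of variation across the four uncorrected estimates.
range_uncorrected = pct_increase(max(uncorrected.values()), lowest)

# Effect of omitting the last diagonal, for each averaging period.
omit_all = pct_increase(uncorrected["all, excl. Sept 2003"], uncorrected["all"])
omit_last8 = pct_increase(uncorrected["last 8, excl. Sept 2003"],
                          uncorrected["last 8"])

# Highest corrected estimate against the lowest estimate in the first table.
range_with_correction = pct_increase(max(corrected.values()), lowest)

print(round(range_uncorrected), round(omit_all), round(omit_last8),
      round(range_with_correction))
```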

10

Review basic facts and questions

• We have a model formulated on the assumption of certain stable parameters (expected age-to-age factors)

• This assumption seems clearly violated

• Data contain clear trends over time

• Various attempts at correction for this

• Including different averaging periods

• Different corrections give widely differing loss reserves

• How might one choose the “appropriate” correction?

• Omit just last quarter? Last two? …

• Including averaging period

• Average last 4 quarters? Last 6? Last 8? …

11

Some responses to the questions

• DO NOT choose an averaging period

• It is a statistical fundamental that one does not average in the presence of trends

• Rather model the trend

• This requires an understanding of the mechanics of the process generating the trend

• DO NOT try to use this understanding to assist in the choice of an averaging period

• Rather use it to model the finer structure of the data

• Otherwise the choice of factors is little more than numerology

• These comments apply to not only the chain ladder

• But also any “model” that ignores the fine structure of the data in favour of averaging of some broad descriptive statistics

12

Effect on loss data of changes in underlying process

• Consider a 21x21 paid loss triangle from a fairly typical Auto Bodily Injury portfolio

• Years numbered 0,1,2,…

• Experience of all accident years identical

• Stable age-to-age factors

• Now assume that rates of claim closure (by numbers) increase by 50% in experience years (diagonals) 11-15

• Examine the ratios of “new:old” paid losses

• No change = 100%

13

Effect on loss data of changes in underlying process (cont’d)

• Now add superimposed inflation of 5% p.a. to experience years 14-20
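The effect of superimposed inflation on a stable triangle can be illustrated with a small simulation. This is a sketch under stated assumptions, not the authors' example: the geometric payment pattern is invented, the closure-rate change from the previous slide is left out, and experience year 14 is assumed to carry one year of inflation relative to year 13 (year 20 therefore carries seven).

```python
N = 21            # triangle dimension, years numbered 0..20
RATE = 0.05       # superimposed inflation per annum
FIRST_YEAR = 14   # first experience year (diagonal) affected


def inflation_index(k):
    """Cumulative superimposed inflation factor for experience year k."""
    return (1.0 + RATE) ** max(0, k - FIRST_YEAR + 1)


# Hypothetical incremental payment pattern shared by every accident year.
pattern = [100.0 * 0.8 ** j for j in range(N)]

# "Old" triangle: identical accident years, hence stable age-to-age factors.
old = [[pattern[j] for j in range(N - i)] for i in range(N)]

# "New" triangle: each cell inflated according to its diagonal k = i + j.
new = [[old[i][j] * inflation_index(i + j) for j in range(N - i)]
       for i in range(N)]

# Ratios of new:old paid losses (1.0 means no change).
ratio = [[new[i][j] / old[i][j] for j in range(N - i)] for i in range(N)]


def age_to_age(increments, j):
    """Cumulative age-to-age factor from development year j to j + 1."""
    return sum(increments[: j + 2]) / sum(increments[: j + 1])


stable_factor = age_to_age(old[0], 13)
inflated_factor = age_to_age(new[0], 13)
print(ratio[0][14], stable_factor, inflated_factor)
```

The ratio depends only on the diagonal, but because each accident year meets the affected diagonals at a different development year, the age-to-age factors stop being constant across accident years.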

14

Effect on loss data of changes in underlying process (cont’d)

• Now add a legislative change that reduces claim costs in accident years 13-20

• 50% reduction for the earliest claims settled

• 0% for the last 30% of claims settled

15

Effect on loss data of changes in underlying process (cont’d)

• The ratio of modified experience to the norm (stable age-to-age factors) is now complex

• Age-to-age factors now change in a complex manner

• Trends across diagonals

• Further trends across rows

• Contention is that these trends will be identifiable only by means of some form of structured and rigorous multivariate data analysis

16

How might the loss data be modelled?

Let

$i$ = accident quarter

$j$ = development quarter (= 0, 1, 2, …)

$F_{ij}$ = incremental count of claims closed

$CF_{ij}$ = incremental paid losses in respect of these closures

$S_{ij} = CF_{ij} / F_{ij}$ = average size of these closures

17

How might the loss data be modelled? (cont’d)

• Modelling the loss data might consist of:

• Fitting some structured model to the average claim sizes $S_{ij}$

• Testing the validity of that model

• The use of average claim sizes will make automatic correction for any changes in the rates of claim closure

18

Modelling the loss data

• Very simple model

$S_{ij} \sim \log N(\beta_j, \sigma)$

Log normal claim sizes depending on development quarter

• Fit model to data using EMBLEM software
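For readers without EMBLEM, the very simple model can be fitted by hand. The sketch below is a stand-in, not the authors' fit: with one parameter per development quarter and a common spread, the least squares (and maximum likelihood) estimate of each location parameter is the mean of the log average sizes in that development quarter. It assumes that the second parameter of the log normal is the standard deviation of the log, that all cells carry equal weight regardless of closure counts, and that the cell values shown are invented.

```python
import math
from collections import defaultdict


def fit_lognormal_by_development(cells):
    """Fit log S_ij = beta_j + error with a common error spread.

    cells: iterable of (i, j, average_size) with average_size > 0.
    Returns (beta, sigma): beta maps development quarter j to its estimate,
    sigma is the residual standard deviation on the log scale.
    """
    logs = defaultdict(list)
    for _i, j, size in cells:
        logs[j].append(math.log(size))
    beta = {j: sum(values) / len(values) for j, values in logs.items()}
    residuals = [x - beta[j] for j, values in logs.items() for x in values]
    dof = len(residuals) - len(beta)
    if dof > 0:
        sigma = math.sqrt(sum(r * r for r in residuals) / dof)
    else:
        sigma = float("nan")
    return beta, sigma


# Hypothetical average claim sizes by (accident quarter, development quarter).
cells = [
    (0, 0, 3000.0), (1, 0, 3300.0), (2, 0, 2900.0),
    (0, 1, 8000.0), (1, 1, 9000.0),
    (0, 2, 20000.0),
]

beta, sigma = fit_lognormal_by_development(cells)
print(beta, sigma)
```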

19

Dependency of average claim size on development quarter

[Figure: linear predictor (vertical axis, 7 to 14) against development quarter (horizontal axis, 0 to 32).]

20

Add superimposed inflation

• Define

k = i+j = calendar quarter of closure

• Extend model

$S_{ij} \sim \log N(\beta^d_j + \beta^f_k, \sigma)$

Log normal claim sizes depending on development quarter and closure quarter (superimposed inflation)
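The extended model is an additive two-factor model on the log scale, with one effect per development quarter and one per closure quarter. The sketch below fits it by ordinary least squares with NumPy rather than EMBLEM. It is illustrative only: the function name `fit_two_factor`, the identifiability constraint that sets the effect of closure quarter 0 to zero, the equal weighting of cells, and the noise-free synthetic data are all assumptions.

```python
import numpy as np


def fit_two_factor(i_idx, j_idx, log_size):
    """Least squares fit of log S_ij = beta_d[j] + beta_f[k], k = i + j.

    The closure-quarter effect is constrained so that beta_f[0] = 0.
    Returns (beta_d, beta_f) as NumPy arrays.
    """
    i_idx = np.asarray(i_idx)
    j_idx = np.asarray(j_idx)
    y = np.asarray(log_size, dtype=float)
    k_idx = i_idx + j_idx
    n_dev = int(j_idx.max()) + 1
    n_cal = int(k_idx.max()) + 1

    # Indicator columns: all development quarters, closure quarters 1 onwards.
    design = np.zeros((y.size, n_dev + n_cal - 1))
    rows = np.arange(y.size)
    design[rows, j_idx] = 1.0
    later = k_idx > 0
    design[rows[later], n_dev + k_idx[later] - 1] = 1.0

    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    beta_d = coef[:n_dev]
    beta_f = np.concatenate(([0.0], coef[n_dev:]))
    return beta_d, beta_f


# Synthetic 5x5 triangle with known effects and no noise (hypothetical values).
true_d = np.array([8.0, 8.5, 9.0, 9.3, 9.5])
true_f = 0.02 * np.arange(5)
cells = [(i, j) for i in range(5) for j in range(5 - i)]
i_idx = [i for i, _ in cells]
j_idx = [j for _, j in cells]
log_size = [true_d[j] + true_f[i + j] for i, j in cells]

beta_d, beta_f = fit_two_factor(i_idx, j_idx, log_size)
print(beta_d, beta_f)
```

With noise-free data the fit recovers the effects used to generate it, which is a convenient check that the design matrix is identifiable. An upward drift in the fitted closure-quarter effects is what the slides read as positive superimposed inflation.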

21

Dependency of average claim size on closure quarter

• Some upward trend with closure quarter

• Positive superimposed inflation

[Figure: linear predictor (vertical axis, 8.0 to 8.9) against finalisation quarter (horizontal axis, Mar-97 to Sep-03).]

22

Modelling individual claim data

• We could continue this mode of analysis

• But why model triangulated data?

• We have individual claim data

• More natural to model individual claim sizes

23

Notation for analysis of individual claim sizes

• Time variables $i$, $j$, $k$ as before

• $Y_r$ = size of $r$-th closed claim

• $i_r$, $j_r$, $k_r$ are values of $i$, $j$, $k$ for $r$-th closed claim

• Also define

$t_r$ = operational time for $r$-th claim = proportion of claims from accident quarter $i_r$ closed before $r$-th claim

• Model

$\log Y_r = \mathrm{fn}(i_r, j_r, k_r, t_r)$ + stochastic error

24

Dependency on operational time

• Model

$\log Y_r = \mathrm{fn}(i_r, j_r, k_r, t_r)$ + stochastic error

• Specifically

$\log Y_r \sim N(\mathrm{fn}(t_r), \sigma)$

• Divide range of $t_r$ (0-100%) into 2% bands
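Operational time and its 2% bands can be computed directly from the claim closure records. The sketch below is a hedged illustration: the record layout, the function name `operational_times`, and the `ultimate_counts` input are assumptions. In particular, the denominator is taken here to be a supplied number of claims per accident quarter, which in practice would have to be estimated for accident quarters that are not fully closed.

```python
from collections import defaultdict


def operational_times(closures, ultimate_counts, n_bands=50):
    """Operational time and band for each closed claim.

    closures: list of (claim_id, accident_quarter, closure_date); the
    closure_date only needs to be sortable.
    ultimate_counts: {accident_quarter: number of claims} (assumed given).
    Returns {claim_id: (t, band)}, where t is the proportion of claims from
    the same accident quarter closed before this claim and band indexes
    equal-width bands (50 bands gives 2% bands, numbered 0 to 49).
    """
    closed_so_far = defaultdict(int)
    result = {}
    # Ties in closure date are broken by input order (sorted is stable).
    for claim_id, acc_q, _closed in sorted(closures, key=lambda c: c[2]):
        before = closed_so_far[acc_q]
        total = ultimate_counts[acc_q]
        t = before / total
        # Integer arithmetic avoids floating point trouble at band edges.
        band = min(before * n_bands // total, n_bands - 1)
        result[claim_id] = (t, band)
        closed_so_far[acc_q] += 1
    return result


# Hypothetical closure records: (claim id, accident quarter, closure date).
closures = [("a", 0, 3), ("b", 0, 1), ("c", 1, 2), ("d", 0, 2)]
ultimate_counts = {0: 4, 1: 2}

optime = operational_times(closures, ultimate_counts)
print(optime)
```

Each band can then enter the model as a level of a categorical variable, so that the shape of the dependency on operational time is estimated rather than imposed.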

25

Dependency on operational time

• Dependency close to linear over much of the range of operational time

[Figure: linear predictor (vertical axis, 7.5 to 11.5) against operational time (optime).]

26

Dependency on calendar quarter of closure (superimposed inflation)

• Some upward trend with closure quarter

• Positive superimposed inflation

[Figure: linear predictor (vertical axis, 7.90 to 8.40) against finalisation quarter (horizontal axis, Jun-97 to Sep-03).]

27

Log normal assumption?

• Examine residuals of log normal model

[Figure, three panels: (1) Pearson residuals, residual axis -8 to 5, count axis 0 to 10,000; (2) Pearson residuals against fitted value (6.0 to 11.5); (3) largest 1,000 Pearson residuals against fitted value (7.0 to 11.5).]

• Considerable left skewness

28

Alternative error distribution

• Choose shorter tailed distribution from the family underlying GLMs

• Exponential dispersion family

• We choose EDF(2.3)

$V[Y_r] = \phi \{E[Y_r]\}^{2.3}$

• Longer tailed than gamma

• Shorter than log normal

[Figure, two panels: (1) Studentized standardized deviance residuals, residual axis -8 to 8, count axis 0 to 16,000; (2) largest 100 Studentized standardized deviance residuals (-8 to 8) against fitted value (0 to 200,000).]
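The variance function on this slide is that of a Tweedie-type exponential dispersion family member with variance power 2.3. The sketch below writes out that variance function and the standard unit deviance for a variance power other than 1 or 2, which is the quantity behind deviance residuals. The function names are illustrative, and nothing here reproduces the authors' fitted model.

```python
POWER = 2.3  # variance power quoted on the slide


def variance(mu, phi, p=POWER):
    """Variance function V[Y] = phi * mu**p."""
    return phi * mu ** p


def unit_deviance(y, mu, p=POWER):
    """Tweedie unit deviance for y > 0, mu > 0 and p not equal to 1 or 2."""
    return 2.0 * (
        y ** (2.0 - p) / ((1.0 - p) * (2.0 - p))
        - y * mu ** (1.0 - p) / (1.0 - p)
        + mu ** (2.0 - p) / (2.0 - p)
    )


# Variance grows faster with the mean than under a gamma model (power 2).
mean = 50000.0
relative_to_gamma = variance(mean, 1.0) / variance(mean, 1.0, p=2.0)

# The deviance is zero when the observation equals the fitted mean and
# positive otherwise.
at_fit = unit_deviance(5.0, 5.0)
above_fit = unit_deviance(2.0, 1.0)
below_fit = unit_deviance(0.5, 1.0)
print(relative_to_gamma, at_fit, above_fit, below_fit)
```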

29

Refining the model of the data

• …and so on

• We continue to refine the model of claim size

• Paper contains detail

• Final model includes following effects

• Operational time (smoothed)

• Seasonal

• Superimposed inflation (smoothed)

• Different rates at different operational times

• Different rates over different intervals of calendar time

• Accident quarter (legislative) effect

• Diminishes with increasing operational time

• Peters out at operational time 35%

30

Final estimate of liability

Loss reserve at 30 Sept 2003 (excl. Sept 2003 accident qr)

Averaging period                                              Uncorrected $B   Corrected $B
All experience quarters                                       1.61
Last 8 experience quarters                                    1.68
All experience quarters except Sept 2003 (last diagonal)      1.78             1.94
Last 8 experience quarters except Sept 2003 (last diagonal)   1.92             2.35*

GLM: 2.23*

* Quite different distributions over accident years

31

Conclusions

• GLM has successfully modelled a loss experience with considerable complexity

• Simpler model structures, e.g. chain ladder, would have little hope of doing so

• Indeed, it is not even clear how one would approach the problem with these simpler structures

• The GLM achieves much greater parsimony

• Chain ladder number of parameters = 73 with no recognition of any trends

• GLM number of parameters = 13 with full recognition of trends

• GLM is fully stochastic

• Provides a set of diagnostics for comparing candidate models and validating a selection

• Understanding of the data set

• Assists not only reserving but pricing and other decision making
