
Pricing insurance products in the presence of multi-level factors

Aart Valkhof

Master’s Thesis to obtain the degree in Actuarial Science and Mathematical Finance
University of Amsterdam
Faculty of Economics and Business
Amsterdam School of Economics

Author: Aart Valkhof
Student nr: 0182737
Email: [email protected]

Date: June 21, 2016
Version: 1.0.0
Supervisor: Dr. K. Antonio
Second reader: Dr. S.U. Can


Abstract

This thesis discusses the pricing of multi-level factors for non-life insurance products. We explore several pricing techniques: credibility, GLMs and mixed models. We ask whether the latter add value to the pricing. We investigate four different implementations of mixed models: the backfitting algorithm, the Laplace approximation, the penalized quasi-likelihood method and the Gauss-Hermite quadrature method. The backfitting algorithm is laborious to implement and has no statistical framework, but provides valuable credibility factors. The other three techniques are offered by standard software. They come with a powerful statistical framework, but lack credibility factors. We apply the techniques to a commercial general liability portfolio. The data that we use for categorizing the company activities contain a nested structure. We conclude that using this structure in the models improves the pricing of the multi-level factor.

Keywords: Insurance, Actuarial, Liability, Pricing, GLM, Credibility, Backfitting algorithm, GLMM, GLMC, Hierarchical models, Multi-level factor, MLF

Statement of Originality

This document is written by student Aart Valkhof who declares to take full responsibility for the contents of this document. I declare that the text and the work presented in this document is original and that no sources other than those mentioned in the text and its references have been used in creating it. The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.


Contents

1 Introduction
  1.1 Aim of thesis
  1.2 Outline of thesis

2 Commercial general liability
  2.1 Introduction
  2.2 Coverage
  2.3 Product characteristics
  2.4 Standard Industrial Classification

3 Data
  3.1 Descriptive statistics
  3.2 Data fields
    3.2.1 Claim year
    3.2.2 Business sector
    3.2.3 Standard Industrial Classification
    3.2.4 Revenue
    3.2.5 Limit

4 Theoretical framework
  4.1 Basic concepts of tariff analysis
  4.2 Data structure
  4.3 Fixed versus random effects
  4.4 Multi-level factors (MLF)
  4.5 Generalized linear models
  4.6 Actuarial models for claim frequencies
  4.7 Credibility theory
  4.8 GLMs with random effects
    4.8.1 Backfitting algorithm
    4.8.2 Generalized linear mixed models

5 Modeling of the MLF
  5.1 Actuarial modeling of claim frequencies
  5.2 Generalized linear model
    5.2.1 Complete pooling
    5.2.2 Semi-complete pooling
    5.2.3 No pooling
  5.3 Non-hierarchical models
    5.3.1 Backfitting algorithm
    5.3.2 Adaptive Gauss-Hermite quadrature
    5.3.3 A comparison of the methods
  5.4 Hierarchical models
    5.4.1 Backfitting algorithm
    5.4.2 Laplace approximation
    5.4.3 Penalized quasi likelihood
    5.4.4 A comparison of the methods
    5.4.5 Non-hierarchical versus hierarchical models

6 Conclusions

References

A Tables
  A.1 Revenue classes
  A.2 Hospitality SIC codes
  A.3 GLM results for basic model
  A.4 Factors non-hierarchical models
  A.5 Factors hierarchical models

B R-code
  B.1 Data samples
  B.2 Backfitting algorithm


Chapter 1

Introduction

The task of the pricing actuary is to develop a premium based on factors that represent the risk of the policy. The actuary takes into account two types of rating factors (Mano and Rasa (2006)).

1. Continuous factors, like driver’s age or the reinstatement value of a building;

2. Factors that are categorical without an inherent ordering, such as type of fuel, occupation and economic activity.

The latter can take on a certain number of values, referred to as the levels of a factor. Pricing is straightforward when the number of levels of the categorical factor is small, as is the case with type of fuel. When the number of levels is large and there is no logical way to order them, pricing becomes problematic. Ohlsson and Johansson (2004) denominate rating factors with a large number of levels as multi-level factors (MLFs). We want to include this type of information in the price list - besides the non-MLFs, also called ordinary rating factors - even though the factor consists of many levels that do not have a sufficient amount of data. The car model is an example of an MLF. It is an important rating factor in motor insurance. There are several thousand car model classes. Some represent popular cars with sufficient data available, some have sparse data. Another example is geographical zones: densely populated zones within cities have enough data, while zones in rural territories have less.

It is common to use generalized linear models (GLMs) to estimate non-MLF rating factors. McCullagh and Nelder (1989) started the rise of the GLM as the most important statistical technique for the non-life pricing actuary. Many textbooks give a general context of GLMs, see Kaas et al. (2008), Ohlsson and Johansson (2010) and Frees et al. (2014). GLMs are very suitable to estimate claim frequencies and claim severities in the presence of risk factors. First, they enable the actuary to use a distribution from the exponential family. Second, GLMs offer opportunities for multiplicative rather than additive models. In this way relative increments are obtained, which is desirable for pricing. Ohlsson and Johansson (2004) and Mano and Rasa (2006) illustrate the problems with GLM pricing of an MLF. In GLMs the effects are fixed, which makes a GLM only suitable for categorical covariates with a limited set of levels, like type of fuel and age class. An MLF will have levels where the data is too sparse to estimate a fixed regression parameter; estimating parameters for these sparse levels leads to strange and unreliable results. Another way to deal with an MLF is to cluster the levels into groups with enough data points. Often this clustering is done in an ad hoc way, or requires a lot of time and manual intervention to execute many hypothesis tests. The actuary ends up merging levels to get more reliable estimates at the price of a less detailed tariff.
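The multiplicative structure that a log link induces can be sketched as follows. The coefficients below are made-up illustrations, not estimates from this thesis: with a log link, the additive linear predictor turns into a product of relative factors.

```python
import math

# Hypothetical coefficients on the log scale (illustrative values only).
beta0 = math.log(0.03)                                    # baseline frequency
beta_sector = {"Retail": 0.0, "Construction": math.log(2.5)}
beta_revenue = {1: 0.0, 10: math.log(1.8)}

def expected_frequency(sector, revenue_class):
    """Expected claim frequency under a log-link (multiplicative) GLM."""
    eta = beta0 + beta_sector[sector] + beta_revenue[revenue_class]
    return math.exp(eta)

# Relative increments multiply: Construction in revenue class 10 is
# 2.5 x 1.8 = 4.5 times the baseline, whatever the baseline level is.
ratio = expected_frequency("Construction", 10) / expected_frequency("Retail", 1)
```

For an MLF such as the economic activity, the sector dictionary would need hundreds of levels, many of them backed by too little data to estimate a fixed coefficient reliably, which is exactly the problem described above.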

Credibility theory is a very old pricing technique. Credibility aims to price individual insurance contracts while taking the whole portfolio into account as well. It is a trade-off between an insured’s own loss experience and the experience of the whole portfolio. The expression credibility alludes to the weight given to the experience of the individual. If this individual experience is credible, the individual experience will determine the premium rate. Vice versa, if the individual experience is not credible, the collective experience will determine the premium. These are two extreme positions; in practice a compromise between them determines the premium. We distinguish two different types of credibility: limited fluctuations credibility and greatest accuracy credibility. Limited fluctuations credibility descends from the beginning of the twentieth century. Although this theory provides simple solutions, it suffers from a lack of theoretical justification (Denuit et al. (2007)). In this thesis we focus on the second type, introduced by Bühlmann (1967) in the 1960s. See Bühlmann and Gisler (2005) for an extensive exposition on credibility. Credibility is a pricing technique that can solve the rating of MLFs, but it does not have the advanced statistical framework that GLMs have. Ohlsson and Johansson (2004) propose a combination of generalized linear models and credibility to treat the situation where we have MLFs besides non-MLFs.
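The compromise between the two extremes can be made concrete with the classical Bühlmann credibility weight Z = n/(n + k), where k is the ratio of the expected process variance to the variance of the hypothetical means. The sketch below uses purely illustrative inputs, not portfolio estimates.

```python
def credibility_premium(individual_mean, collective_mean, n, s2, a):
    """Buhlmann premium: blend individual and collective experience.

    n  - number of observation years for the individual risk
    s2 - expected process variance (within-risk variability)
    a  - variance of the hypothetical means (between-risk variability)
    """
    z = n / (n + s2 / a)          # credibility weight in [0, 1)
    return z * individual_mean + (1.0 - z) * collective_mean

# With 5 years of experience and s2/a = 5 the weight is 0.5,
# so the premium lands halfway between the two experiences.
premium = credibility_premium(0.08, 0.04, n=5, s2=1.0, a=0.2)
```

With no individual experience (n = 0) the weight is zero and the collective experience determines the premium; as n grows, the weight shifts toward the individual experience.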

The idea of combining GLM and credibility (GLMC) was introduced in Nelder and Verrall (1997). They showed how credibility theory can be a building block for a hierarchical generalized linear model (HGLM). Ohlsson (2008) presents the GLMC model assumptions and estimators for MLFs. The article also provides several examples, such as the MLFs car brand and car model for a motor insurance product. This is an example of a nested data structure, where car model is hierarchically ordered under car brand. Mano and Rasa (2006) work out an application of a GLMC model with spatial data as an MLF. Frees et al. (2014), chapter 16, discuss the pricing of MLFs with generalized linear mixed models (GLMMs). The GLMM is an extension of the GLM that can treat the situation where there are both fixed and random effects. The theory of GLMMs is covered in Breslow and Clayton (1993). The non-MLFs are seen as fixed effects and are modeled by the GLM part; these explanatory variables are fixed, but unknown. The MLFs are seen as random effects: random variables that capture the effects that are not observable in the other explanatory variables. By using GLMC and GLMM the advantages of GLM and credibility can contribute to the solution of our problem with the pricing of MLFs. The mathematical calculations behind GLMMs are complex. Frees et al. (2014), chapter 16, treat three methods to estimate the regression parameters and the random effects of a GLMM: approximating the integrand with the Laplace method, approximating the data (penalized quasi-likelihood), and approximating the integral through numerical integration (Gauss-Hermite quadrature). Including the approach of Ohlsson, this gives us four methods to determine the estimators of an MLF. For these methods we distinguish between non-hierarchical and hierarchical models to investigate the added value of the nested structure of the data.

1.1 Aim of thesis

The aim of this master’s thesis is to compare the estimates obtained by GLMC to the estimates obtained by the three mentioned estimation methods of the GLMM. We do this by making use of a portfolio of a commercial-line insurance product, namely commercial general liability (CGL). The data of the portfolio is made available by a Dutch insurer. Currently the insurer constructs the tariff of the product with a GLM containing only non-MLFs. With the help of expert judgment this tariff is transformed into a tariff for the MLF levels. This transformation costs a lot of time and the expert judgment is not without subjectivity. We will discuss the current model and investigate whether mixed models lead to improvements. We will construct a GLM with the MLF as a fixed effect to demonstrate the problem of pricing the MLF. Subsequently we will propose GLMC and GLMM models as an improved alternative. Finally, we ask whether the nested structure of the data could contribute to an improved tariff.


1.2 Outline of thesis

In chapter 2 we explain the product of commercial general liability (CGL). CGL is a product that is not commonly used for research questions because of the shortage of data; motor products are more likely to appear in the actuarial literature. However, the lack of data is exactly what makes a CGL portfolio interesting for the pricing of an MLF. In this chapter the MLF that we will focus on is also introduced: the economic activity of an enterprise, indicated by a code. The data of the portfolio is described in chapter 3. Chapter 4 contains the theoretical framework: we define some elementary terms, elaborate on different basic concepts, and discuss the main theory of GLM, credibility and GLMM. We present the results of the analysis in chapter 5 and compare the outcomes of the different estimation techniques. Chapter 6 concludes.


Chapter 2

Commercial general liability

This chapter describes the commercial general liability (CGL) insurance product. For the literature related to this chapter we refer to Kealy (2015). After the introduction in section 2.1 we enumerate the main coverages in section 2.2. Every coverage is illustrated with an example. We continue in section 2.3 with exploring some special characteristics of the product, like the occurrence limit and long settlement periods. Finally, section 2.4 explains the role of the economic activity as an important MLF in the price list of this product.

2.1 Introduction

General liability is an insurance product that protects the insured against major financial losses if an entity is held legally responsible in the event of bodily injury or damage. For instance, if a customer enters a shop and slips due to a wet floor, the shop owner is held liable for the customer’s damage. General liability can involve private and commercial entities. In the case of a private entity we talk about private liability; in the case of a commercial entity we talk about commercial liability. This thesis is about commercial general liability: where this thesis states liability, it means commercial general liability.

2.2 Coverage

The product consists of three main coverages, as expressed in the list below. Every coverage is clarified with an example.

1. Third-party liability. The enterprise is protected against general liability claims from third parties. Section 2.1 gives an example.

2. Employer’s liability. The enterprise is protected against financial losses in case of a claim from an employee. For example, a construction worker falls from a scaffold, a tragic event especially because the safety net was not in place. The employer of the injured worker is held liable for the cost of the injuries.

3. Product liability. The enterprise is protected against claims that are caused by a delivered product or service. For example, a plastic packaging supplier sells one-liter containers to a mouthwash manufacturer. The containers have special caps with child safety locks. The customers of the mouthwash manufacturer complain because the containers have bad child safety locks. The mouthwash manufacturer is forced to recall all containers and to compensate its customers. It holds the packaging supplier responsible for the incorrect product and claims the cost of the recall activities and of the compensation.


In practice there are far more coverages, but this goes beyond the scope of this thesis.

2.3 Product characteristics

For property and motor hull insurance products the maximum loss is known beforehand: the reinstatement value of a building or the catalog value of a vehicle. With general liability there is no intrinsic maximum loss. The loss depends on the claim events and the maximum size of the claim is not known a priori. In order to limit the risk of the insurer and to cap the loss, every policy is subject to a predefined occurrence limit and to an annual aggregate limit. The occurrence limit is the maximum reimbursement that the insurance policy will pay for a single event. The annual aggregate limit is the most that an insurance policy will pay regardless of the number of events in a year. It is common to define a limit with round numbers like 0.5, 1.0, 1.25 or 2.5 million. In general the annual aggregate limit is twice the occurrence limit. If the claim is higher than the limit, the insurer is not accountable for the surplus.
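The interplay of the two limits can be sketched as follows. The claim amounts are illustrative and claims are assumed to be settled in the order they occur.

```python
def insurer_payments(claims, occurrence_limit, aggregate_limit):
    """Apply the per-occurrence cap and the annual aggregate cap in order.

    The surplus above either limit stays with the policyholder.
    """
    payments, paid = [], 0.0
    for claim in claims:
        payment = min(claim, occurrence_limit)                     # per-event cap
        payment = max(min(payment, aggregate_limit - paid), 0.0)   # annual cap
        payments.append(payment)
        paid += payment
    return payments

# Occurrence limit 2.5 mio., aggregate limit 5.0 mio. (values that occur in
# the data of chapter 3): the third claim is only partly reimbursed because
# the annual aggregate limit is reached.
paid = insurer_payments([3.0e6, 2.0e6, 1.0e6], 2.5e6, 5.0e6)
```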

General liability is characterized by long settlement periods and long reporting delays. Even years after the policy year has ended, a claim can be reported. In contrast to property insurance, where a claim is typically observed together with the occurrence of the insured event, the insurer can face the claim long after the event actually happened. A well-known example is asbestosis claims, see Ratner (1983). It can take decades before the bodily injury of a worker who was exposed to asbestos manifests itself. Settlements in court can drag on for years and the bodily injury can worsen over the years. Such claims go together with a large claim size and it can take years before the exact claim size is known. This makes it difficult to determine the ultimate claim cost and number of claims. Kaas et al. (2008), chapter 10, explain the modeling of the ultimate claim size and claim cost by using reserving techniques, but this goes beyond the scope of this thesis.

Table 2.1 demonstrates the claim handling of CGL by means of an example. It shows the information of the policyholder and the claim information, like claim cause and financial settlement. Although the claim cost is not that large, it takes more than a year to settle the claim.

Information of policyholder

Economic activity (SIC code): Wholesale of agricultural machinery, equipment and tractors (4661)
Business sector: Wholesale
Revenue: 7.786.000
Per occurrence limit: 2.500.000
Annual aggregate limit: 5.000.000

Claim information

Cause: Improper adjustment of cattle concentrate feed making machine which disrupts milk production. The concentrate is destroyed.
Claim date: 15-04-2013

Date         Remark           Payment     Reserve     Incurred
15-05-2013   Claim reported               5.000,00    5.000,00
09-12-2013   Payment          5.806,18    806,18      5.806,18
23-04-2014   Payment          6.125,36    6.125,36    11.931,54
29-04-2014   Closed                                   11.931,54

Table 2.1: Example of a CGL claim. The amounts are in euro.


2.4 Standard Industrial Classification

The economic activity of an enterprise is an important risk driver in several non-life products for commercial lines. The Standard Industrial Classification of Economic Activities (SIC) - or Standaard Bedrijfsindeling (SBI) in Dutch - classifies enterprises by the type of economic activity in which they are engaged. The list of classifications consists of approximately 1200 levels and is maintained by Statistics Netherlands. For more information on SIC codes and a complete list of the SIC codes we refer to Statistics Netherlands (2015).

The importance of the economic activity for the CGL product is illustrated with the following examples.

• Illustration 1: shops selling kitchens carry more third-party risk than shops selling books. Workers install the kitchens at the customer’s (third-party) house, where the customer’s property can be damaged. For example, the plumbing system may start leaking because of the workers’ activity, damaging the customer’s floor. The kitchen shop is responsible for the damage.

• Illustration 2: an enterprise active in the construction business has more employer’s liability risk than an enterprise with an administrative activity. The probability of an accident is higher at a construction site than at an office.

• Illustration 3: shops selling medical goods have more product liability risk than shops selling vegetables. If something is wrong with a medicine, the probability of bodily injury is high.

The SIC code has a hierarchical structure. At the top of the structure there is the business sector, like Retail, Wholesale, Construction, Industry, etc. Below this top there are four nested layers. The top layer is indicated by two digits, the deepest layer by five digits. Table 2.2 shows a part of the SIC code listing (Statistics Netherlands (2015)). The example displays a part of the business sector Retail. SIC codes that are indicated with an arrow have a deeper layer. SIC codes that are indicated with a diamond are the deepest or end layer. The example shows that the deepest layer does not have to be a five-digit SIC code.

SIC            Description
▶ 47           Retail trade
  ▶ 472        Specialized shops selling food and beverages
    ◆ 4721     Shops selling potatoes, fruit and vegetables
    ▶ 4722     Shops selling meat and meat products, game and poultry
      ◆ 47221  Shops selling meat and meat products
      ◆ 47222  Shops selling game and poultry
    ◆ 4723     Shops selling fish
    ▶ 4724     Shops selling bread, pastry, chocolate and sugar confectionery
      ◆ 47241  Shops selling bread and pastry
      ◆ 47242  Shops selling chocolate and sugar confectionery

Table 2.2: Structure example of the SIC coding in the retail sector.
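Because the layers nest by digit prefix, the higher-level codes of any SIC code can be recovered by repeatedly dropping the last digit. The function below is our own illustration of this nesting (the link from the two-digit top layer to the business sector is a separate mapping defined by Statistics Netherlands).

```python
def ancestors(sic_code):
    """Higher-level SIC codes of a code, from direct parent up to the
    two-digit top layer, assuming levels nest by truncating the last digit."""
    code = str(sic_code)
    chain = []
    while len(code) > 2:
        code = code[:-1]
        chain.append(code)
    return chain

# The nesting of table 2.2: 47221 sits under 4722, 472 and 47.
chain = ancestors("47221")
```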

In 2008 Statistics Netherlands reformed the structure of the SIC codes. The reform mainly involved the transformation from a national coding to a more international coding. Nowadays the SIC codes are almost one hundred percent in line with the international standard of the European Union. One of the reforms eliminated the codes with six digits. See Eurostat (2015) for information on the European coding.


Chapter 3

Data

The data used in this thesis is from a general liability insurance product of a commercial portfolio from a Dutch insurance company. The data covers the years 2012-2015, but for 2015 the data only covers the first three quarters of the year. The targeted customers are independent contractors and small and medium-sized enterprises (SMEs)¹. Besides the main risk drivers of the premium - revenue, business sector, occurrence limit and economic activity - the year of occurrence of the claim is available. The available claim statistics are the claim amount and the number of claims. We will analyze these statistics per claim year. The data set does not distinguish between the three main coverages as described in section 2.2.

3.1 Descriptive statistics

In this section we describe the explanatory variables. Table 3.1 summarizes all covariates of the data set. We then discuss every variable briefly in section 3.2.

Continuous covariates

Field name     Description                   Min/Max       Mean     Std. dev.
Gross Revenue  Yearly revenue of enterprise  1/64,000,000  466,771  1.340×10⁶

Categorical covariates

Field name       Description                        Levels                              Mode
Claim year       Year in which claim occurs         2012, ..., 2015                     2014
Business sector  Clustering of economic activities  Wholesale, Retail, Construction,    Hospitality
                                                    Garage, Hospitality, Manufacturing
SIC              Code for economic activity         349 levels                          56101
Limit            The maximum loss per event         1.25/2.50 mio.                      2.50 mio.

Table 3.1: Description of covariates in the data set.

¹ According to the definition of the European Union (2003), SMEs have an annual revenue of less than 50 million euro.


3.2 Data fields

In this section we discuss in detail the covariates, or risk factors, present in the data set. In the tables we use the exposure as the sum of the durations per policy in years. N is the total number of reported claims. The frequency is N divided by the exposure.

3.2.1 Claim year

The claim year is the year in which the claim occurred. This data set contains four consecutive years of data: 2012, 2013, 2014 and 2015. Table 3.2 shows the frequency per claim year. The frequency drops in 2015. This is not because 2015 is a year with fewer claims, but a consequence of the delay in claim reporting that comes with a liability product, as discussed in section 2.3. We conclude that the claim statistics for the year 2015 are incomplete. Therefore the claim year 2015 is excluded from all research queries for this thesis.

Claim year  Exposure  N    Frequency
2012        2,685     105  0.039
2013        3,668     150  0.041
2014        6,898     332  0.048
2015        6,377     230  0.036

Table 3.2: Claim statistics per claim year.
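The frequencies in table 3.2 follow directly from the definitions above, as a quick check shows (exposure and claim counts copied from the table):

```python
# (exposure, number of reported claims) per claim year, from table 3.2
rows = {2012: (2685, 105), 2013: (3668, 150), 2014: (6898, 332), 2015: (6377, 230)}

# frequency = N / exposure
frequency = {year: n / exposure for year, (exposure, n) in rows.items()}
```

Rounded to three decimals this reproduces the Frequency column, including the drop for 2015.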

3.2.2 Business sector

The economic activities are clustered into business sectors. The data set contains six business sectors. More sectors exist but are not present, for two reasons. First, several sectors are unacceptable risks because of their high risk. Second, sectors with negligible exposure are excluded from the data. The business sector has a one-to-one relation with the economic activity; this relation is defined by Statistics Netherlands. Table 3.3 states the exposure and claim statistics per cluster. We see that the business sector Construction has the highest frequency. This corresponds with the risk profile of this sector: due to the heavy labor on construction sites, more accidents happen compared to other sectors.

Business sector  Exposure  N    Frequency
Construction     3,541     303  0.086
Garage           192       5    0.026
Hospitality      4,191     122  0.029
Manufacturing    722       36   0.050
Retail           3,335     73   0.022
Wholesale        1,271     48   0.038

Total            13,252    587  0.044

Table 3.3: Claim statistics per business sector over the claim years 2012 till 2014.

3.2.3 Standard Industrial Classification

The economic activity of an enterprise is expressed in a Standard Industrial Classification. This classification is described in detail in section 2.4. Although there are approximately 1200 SIC codes, the data only contains 349 codes. This is because the insurer rejects enterprises within some specific economic activities. For instance, Roofing (SIC code 4391) has a high liability risk and is rejected. Our data still contains SIC codes with six digits. This is a legacy from the old coding as described in section 2.4.

Figure 3.1: Mosaic graph per SIC clustered by business sector over the claim years 2012 till 2014. The color indicates the frequency as displayed in the bar. The size of the rectangles indicates the exposure.

Figure 3.1 plots the exposure and claim frequency corresponding to the SIC codes present in our portfolio. Every rectangle represents a SIC code (349 codes). The size of the rectangle is proportional to the exposure. Some rectangles are so small that only a part of (or no) SIC code can be displayed. The color of the SIC indicates the observed claim frequency over the total observation period. Close to white means a claim frequency of zero; dark red means a frequency of one. The SIC codes are clustered by business sector. The cluster Hospitality, which based on exposure is the biggest sector, consists of only eleven SIC codes. Manufacturing, which has a much smaller exposure, consists of 116 SIC codes. Manufacturing has a few SIC codes with a very high frequency, but most of its SIC codes have a very low or zero frequency. These high and low frequencies do not tell the full story. For instance, SIC code 3230 (Manufacture of sports goods) from the sector Manufacturing shows dark red and has a frequency of 0.88. It has three claims on an exposure of 3.4. Is this due to the fact that this economic activity involves many risks? Or is this just one bad policy?

3.2.4 Revenue

Revenue is the annual revenue of an enterprise and is expressed in euro. It is a continuousvariable, but in this thesis we transform this information into a categorical variable.The revenue is classified into 10 percentiles: 0% to 10%, 10% to 20%, 20% to 30%,etc. The classes are sequentially numbered 1, 2, 3 to 10. This is done per businesssector because the magnitude of the revenue differs per sector. For example, the sector

Page 15: Valkhof, Aart 0182737 MSc ACT

10 Aart Valkhof — Pricing multi-level factors

Wholesale Retail Construction Garage Hospitality Manufacturing

Rev

enue

0

500000

1000000

1500000

2000000

2500000

3000000

Figure 3.2: Revenue deciles per business sector over the claim years 2012 till 2014. Thetop decile is omitted due to lack of space.

Wholesale has higher revenue ranges compared to the other sectors. This is shown by figure 3.2. The deciles of Wholesale, piled on top of one another, reach a higher revenue than those of the other business sectors. Table A.1 shows the exact borders of the percentiles per sector. The advantage of this classification is that the number of observations in every interval is the same. The disadvantage is that this classification may assign almost identical observations to consecutive classes and observations with widely different values to the same class, see Fischer and Wang (2011). Although the targeted enterprises are independent contractors and SMEs, it is possible that a non-SME company is in the portfolio due to product replacement, portfolio take-overs or simple mistakes. The revenue of an enterprise is a risk driver for a general liability product. Table 3.4 shows a clear link between revenue and frequency: the higher the revenue, the higher the frequency.

Revenue class   Exposure     N   Frequency

 1                 1,454    29   0.020
 2                 1,800    36   0.020
 3                   855    21   0.025
 4                 1,339    31   0.023
 5                 1,468    48   0.033
 6                 1,162    40   0.034
 7                 1,234    54   0.044
 8                 1,308    59   0.045
 9                 1,324    91   0.069
10                 1,307   178   0.136

Table 3.4: Claim statistics per revenue class over the claim years 2012 till 2014.
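The per-sector decile classification described above can be sketched as follows. This is an illustrative Python/pandas sketch with made-up revenues (the thesis itself works in R); `qcut` assigns decile classes 1 to 10 separately per business sector, so every class holds the same number of observations.

```python
# Illustrative sketch with made-up revenues (the thesis uses R): pandas'
# qcut assigns decile classes 1..10 separately per business sector, so
# every class holds (roughly) the same number of observations.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "sector": np.repeat(["Wholesale", "Retail"], 500),
    "revenue": np.concatenate([rng.uniform(1e5, 3e6, 500),    # higher ranges
                               rng.uniform(5e4, 1e6, 500)]),  # lower ranges
})
df["revenue_class"] = (
    df.groupby("sector")["revenue"]
      .transform(lambda s: pd.qcut(s, 10, labels=False) + 1)
)
# each (sector, class) cell contains exactly 50 of the 500 observations
print(df.groupby(["sector", "revenue_class"]).size().min())  # 50
```

Because the quantile borders are computed within each sector, the resulting class borders differ per sector, exactly as in table A.1.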


3.2.5 Limit

In subsection 2.3 the role of the occurrence limit is explained. There are only two values of occurrence limits in the data: 1.25 and 2.5 million euro. The policyholder only has these two options to choose from. The 2.5 million option has the highest exposure and claim frequency, namely 9,354 versus 3,897 for exposure and 0.0474 versus 0.0369 for frequency. The annual aggregate limits are twice as large as the occurrence limit, so 2.5 or 5.0 million euro correspondingly.


Chapter 4

Theoretical framework

After the description of the data in the previous chapter, we dive into the theoretical framework. We start this chapter with the introduction (4.1) of fundamental definitions of non-life insurance products. Then we explain the notation of the data structure in section 4.2. In section 4.3 we explain the concepts of fixed and random effects. We explain several data structures for random effects and introduce our multi-level factor (MLF). Section 4.5 summarizes the main principles of the pricing technique GLM. We review another pricing technique, credibility theory, in section 4.7. The theory of GLM and credibility come together in subsection 4.8.1, where Ohlsson's backfitting algorithm is described. We end with the theory of mixed models in subsection 4.8.2.

4.1 Basic concepts of tariff analysis

We highlight the main concepts for the pricing of non-life insurance products. Detailed information can be found in Denuit et al. (2007) and Ohlsson and Johansson (2010). The exposure of a policy is the duration of the policy. This is the amount of time it is in force. The exposure is calculated for every policy per year or per period of insurance. This means that the exposure has a minimum value of zero and a maximum of one. The exposure of a group of policies is obtained by adding the durations of the individual policies. A claim represents an event reported by the policyholder, for which he demands economic compensation. We assume that the claim is actually justified. The claim frequency is the average number of claims per year on one policy. It is the number of claims reported by the policyholder divided by the exposure. The claim cost is the amount paid by the insurer to the insured in case of a claim. The claim severity is the average cost per claim. It is the total claim cost divided by the number of claims. The earned premium is the annual premium times the exposure. It is the amount of premium income paid by the insured for the period that the policy is in force. The pure premium or risk premium is the claim frequency multiplied by the average cost per claim. The actual premium is the premium for one year according to the tariff in force. This premium includes loadings for expenses and capital cost and is not directly comparable to the pure premium. The policyholder can be seen as a single risk with a risk profile expressed by the values for the levels of the rating factors. If several policyholders have identical values for the levels of the rating factors - an identical risk profile -, then they form a risk class. The non-life pricing actuary has the task to compute the pure premium for every risk class based on a statistical model incorporating all the available information about the policyholders in such a class.
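The relations between these quantities can be sketched numerically. A minimal Python sketch with made-up figures for one hypothetical risk class (the thesis itself works in R):

```python
# A minimal numeric sketch of the concepts above, with made-up figures
# for one hypothetical risk class (the thesis itself works in R).
exposure = 250.0               # total policy years in the risk class
n_claims = 12                  # claims reported
total_claim_cost = 30_000.0    # euro paid out for those claims

claim_frequency = n_claims / exposure             # claims per policy year
claim_severity = total_claim_cost / n_claims      # average cost per claim
pure_premium = claim_frequency * claim_severity   # expected cost per policy year

print(claim_frequency)          # 0.048
print(claim_severity)           # 2500.0
print(round(pure_premium, 6))   # 120.0
```

Note that the pure premium equals the total claim cost divided by the total exposure, since frequency times severity cancels the number of claims.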

Modeling claim costs in liability insurance is much more difficult than modeling claim frequencies. Denuit et al. (2007) explain this with three reasons. Firstly, claim costs are often a mix of small and large claims. Large liability claims need several years to be settled. Only estimates of the final cost appear in the data until the claim is closed. Secondly, the statistics available to fit a model for claim severities are much more limited


than for claim frequencies, since only four percent of the policies produce claims. Finally, the cost of an accident is for the most part beyond the control of a policyholder, since the payments of the insurance company are determined by third-party characteristics. The information contained in the available observed covariates is usually much less relevant for claim sizes than for claim counts. Our goal in this thesis is therefore to compare different methods by analyzing the claim frequencies reported on this product.

4.2 Data structure

Our data set has the structure of panel data, involving information on policyholders over time. We denote our response of interest - the number of claims - N_{it}, and x_{it} is a vector of p explanatory variables. The subscripts indicate the policyholder i and the claim year t. Panel data is very suitable for a posteriori ratemaking. Such a tariff predicts the next year's loss for a particular policyholder, using the dependence between the current year's loss and losses reported by this policyholder in previous years. We aggregate our data to the level of risk classes. We do this to save computational time in our analysis while we do not get different results; see Denuit et al. (2007), page 66, for a theoretical justification. This is only applicable in our situation with Poisson likelihood. The number of claims and exposure are aggregated as follows:

N_{rt} = \sum_{i=1}^{n} N_{irt}, \qquad w_{rt} = \sum_{i=1}^{n} w_{irt}, (4.1)

where r is the risk class, w is the exposure and n is the number of policyholders. Section B.1 in the appendix shows data samples of the panel data per policyholder and per risk class.

4.3 Fixed versus random effects

In this chapter we will use the terms fixed effects and random effects. We use linear models to explain the differences between these two terms, although linear models are not a subject of this thesis. The given models are from Frees et al. (2014), chapter 8. A basic linear model with no clustering of data is model (4.2)

E[N_{rt}] = \alpha + x'_{rt}\beta, (4.2)
Var(N_{rt}) = \sigma^2,

where N_{rt} is the observed response variable, x_{rt} = (x_{rt,1}, \ldots, x_{rt,p})' is a vector of p explanatory variables and \beta = (\beta_1, \ldots, \beta_p)' is a vector of p corresponding parameters to be estimated by the model. r denotes a risk class and t is the time period. This model produces identical estimates for all risk classes r given x_{rt}, because it ignores the panel structure. An example of a linear fixed effects model is (4.3). It is the same model as (4.2), but with a risk class specific intercept

E[N_{rt}] = \alpha_r + x'_{rt}\beta, (4.3)
Var(N_{rt}) = \sigma^2.

Each risk class r has its own fixed - but unknown - intercept \alpha_r. There is no pooling of information, because the data is clustered per risk class. The intercept captures all of the effects that are not observable in the other explanatory variables. We assume independence among all observations. The model is called a fixed effects model because


the quantities \alpha_r are fixed parameters to be estimated. Another approach is the linear mixed model. This model allows for random intercepts, with model equation

N_{rt} = \alpha_r + x'_{rt}\beta + \epsilon_{rt}, (4.4)

where \epsilon_{rt} is an identically and independently distributed error term with E[\epsilon_{rt}] = 0 and Var(\epsilon_{rt}) = \sigma^2_\epsilon. The intercept \alpha_r is now a random variable with variance \sigma^2_\alpha that represents the variation between risk classes. These random intercepts capture the heterogeneity among the risk classes and structure the dependence between observations on the same risk class. These random effects represent characteristics that are unobserved by the actuary. The variance \sigma^2_\epsilon represents the variability within risk class r. The random effects are mixed with the regression parameters \beta, which are considered fixed effects. Hence the term mixed model. Two extremes exist. When \sigma^2_\alpha \to 0, there is complete pooling. When \sigma^2_\alpha \to \infty, we speak of no pooling. A mixed model is a compromise between these two extremes, balancing between the complete pooling and no pooling models. This is known as partial pooling. In this balancing between two extreme positions actuaries will recognize the resemblance with credibility theory, which we discuss in section 4.7.


Figure 4.1: Single random effects per level. The pink nodes are SIC codes.

4.4 Multi-level factors (MLF)

Frees et al. (2014), chapter 8, enumerate four kinds of structures for the random effects: 1) a single random effect per level; 2) multiple random effects per level; 3) nested random effects; and 4) crossed random effects. The random effects in this thesis have the first or the third structure, and we restrict ourselves to these two. The single random effect per level is the most common structure. The random effect corresponds to a certain level of a single grouping factor. For example, Ohlsson and Johansson (2004) use bus companies as an example of an MLF that has this particular structure. Their example consists of data on 624 bus companies and two ordinary rating factors, namely age and zone. These are the bus age with five classes and a standard subdivision of Swedish parishes into seven zones. The MLF is the bus company itself and is added to the model as a random effect. All levels, meaning all bus companies, are listed side by side. In the case of nested random effects, the levels of one factor occur only within certain levels of another factor. For example, the MLFs car brand and car model in Ohlsson (2008) are nested random effects. Car models (Volvo V70, Volvo S60, Volvo XC90) are hierarchically ordered under car brands (Renault, Volvo, etc.). Cars of the same brand have risk characteristics in common, even if they represent different models. This gives an advantage when new models are introduced by a brand. For example, Swedish cars are well-known for their safety. A new model from the brand


Volvo will have this risk characteristic, even when there is no data available for the new model.

The MLFs of our CGL portfolio are business sector and SIC code. Figure 4.1 shows the SIC code as a single random effect per level. All SIC codes are listed on the same hierarchical level. If a model includes a random effect according to this structure, then we call this model non-hierarchical. Figure 4.2 shows the SIC code hierarchically ordered under business sector as nested random effects. The business sectors are listed next to each other at the same hierarchical level. Every SIC code is listed under one and only one business sector. A model with nested random effects is called hierarchical or multilevel.

Figure 4.2: Nested random effects. The green nodes are business sectors. The pink nodes are SIC codes.

4.5 Generalized linear models

Linear regression models like the examples in section 4.3 assume normally distributed random errors and a mean structure that is linear in the regression parameters. This conflicts with the responses of interest in non-life pricing models. Firstly, the number of claims follows a discrete probability distribution on the non-negative integers. Secondly, the mean number of claims is not linear in the covariates. We typically want to impose a multiplicative rather than the additive structure which linear regression brings us. In actuarial science the generalized linear model (GLM) is the main regression technique to find the relation between the response and the explanatory variables. It provides solutions to the two drawbacks of the linear models. The technique is thoroughly discussed in Kaas et al. (2008), where the following three components are described.

1. A stochastic component: a set of independent random variables Y_i, i = 1, \ldots, n with a density in the exponential dispersion family of the form

f_{Y_i}(y; \theta_i, \phi) = \exp\left( \frac{y\theta_i - b(\theta_i)}{\phi/w_i} + c(y; \phi/w_i) \right), (4.5)

where b(.) and c(.) are real functions, \theta_i is the natural parameter and \phi is the scale parameter. Here i represents the policyholder and w_i is the weight of policyholder i in the policy year under consideration.

2. A systematic component that attributes a linear predictor \eta_i = x'_i\beta = \sum_j x_{ij}\beta_j to every observation. The \beta_j are fixed regression coefficients and the x_{ij} are covariates.

3. A link function g(.) that links the expected value \mu_i of the response variable to the linear predictor such that \eta_i = g(\mu_i).


GLM gives us the advantage of using a distribution other than the normal for the random deviations from the mean. Another advantage is that the mean of the random variable need not be a linear function of the explanatory variables, but may be expressed on a logarithmic scale. In this case we get a multiplicative model instead of an additive model. The outcome of the GLM is the so-called a priori tariff (Denuit et al. (2007)), meaning that the actuary only uses covariates that are known in advance. A disadvantage is that GLM cannot directly include random effects to take dependencies or hierarchically structured data into account, or to create an a posteriori tariff. This makes GLM an example of a fixed effects model. Another disadvantage is that GLM assumes homogeneity of the underlying portfolio. We discuss this further in the next section.
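To make the fitting mechanism concrete, here is a hedged numpy sketch of the iteratively reweighted least squares (IRLS) scheme that underlies a log-link Poisson GLM with an exposure offset. The data are simulated here, not the insurer's portfolio, and the code is an illustration rather than the thesis's implementation.

```python
# Hedged sketch (not the thesis's implementation): iteratively reweighted
# least squares (IRLS) for a Poisson GLM with log link and an exposure
# offset, the fitting scheme behind R's glm(..., family = poisson).
# The data are simulated, not the insurer's portfolio.
import numpy as np

rng = np.random.default_rng(0)
n_obs = 5000
x = rng.binomial(1, 0.5, n_obs).astype(float)   # one binary rating factor
X = np.column_stack([np.ones(n_obs), x])        # design matrix with intercept
offset = np.log(rng.uniform(0.5, 1.0, n_obs))   # log of exposure w
beta_true = np.array([-3.0, 0.7])
y = rng.poisson(np.exp(X @ beta_true + offset))

beta = np.zeros(2)
for _ in range(50):
    eta = X @ beta + offset
    mu = np.exp(eta)
    z = (eta - offset) + (y - mu) / mu          # working response
    XtW = X.T * mu                              # working weights W = mu
    beta_new = np.linalg.solve(XtW @ X, XtW @ z)
    if np.max(np.abs(beta_new - beta)) < 1e-10:
        beta = beta_new
        break
    beta = beta_new

print(beta)  # should lie close to beta_true = [-3.0, 0.7]
```

Exponentiating the fitted coefficients yields the multiplicative tariff factors mentioned above.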

4.6 Actuarial models for claim frequencies

Before using the GLM we should determine which distribution is valid for our response variable N. In this section we do not use the indices i and t for readability reasons. The Poisson distribution is often used to model count data in general and the number of claims in particular. Of course the number of claims is a discrete variable, so we are restricted to discrete distributions. Another candidate is the negative binomial distribution. Here the mean is no longer equal to the variance - as in the case of Poisson - but the variance exceeds the mean. We call this over-dispersion. The Poisson distribution with exposure has the following probability mass function:

P(N = k) = \frac{(\lambda w)^k}{k!} e^{-\lambda w}, (4.6)

where w is the exposure measure and k the number of claims, k = 0, 1, 2, \ldots. Denuit et al. (2007) state that the use of the Poisson distribution is obvious, but only when the underlying population is homogeneous. Unfortunately, in practice this is not always the case. Differences in behavior among individual policyholders that cannot be observed by the actuary lead to unobserved heterogeneity. Over-dispersion is a well-known consequence of unobserved heterogeneity in count data analysis. This means that the variance of the number of claims is larger than the mean. A way to manage this unobserved heterogeneity is to impose a random variable (called \Theta) on the mean parameter of the Poisson distribution. Denuit et al. (2007) call this a mixed Poisson distribution with parameters \lambda for the mean frequency and \Theta for the positive random effect. In a mixed Poisson model the annual expected claim frequency itself becomes a random variable. The obtained distribution is defined as

P(N = k \mid \Theta) = \frac{(\lambda w \Theta)^k}{k!} e^{-\lambda w \Theta}. (4.7)

We obtain unconditional probabilities by integrating 4.7 over the random variable Θ.

P(N = k) = E[P(N = k \mid \Theta)] = \int_0^\infty e^{-\lambda w \theta} \frac{(\lambda w \theta)^k}{k!} \, dF_\Theta(\theta), (4.8)

where F_\Theta(\theta) is the distribution function of the random variable \Theta. The mixed Poisson model 4.8 is an accident-proneness model: it assumes that a policyholder's mean claim frequency does not change over time, but allows some insured persons to have higher mean claim frequencies than others. We say that N is mixed Poisson distributed with parameter \lambda and risk level \Theta, denoted as N \sim MPoi(\lambda, \Theta), when it has probability mass function 4.8. The condition E[\Theta] = 1 ensures that

E[N] = E[E[N \mid \Theta]] = \int_0^\infty \sum_{k=0}^\infty k \, e^{-\lambda w \theta} \frac{(\lambda w \theta)^k}{k!} \, dF_\Theta(\theta) = \lambda w. (4.9)


A well-known candidate for the distribution of \Theta is a Gamma(a, a)-distribution for some a > 0. The expectation of \Theta then meets the condition of being equal to one. The density function of this distribution is

f_\Theta(\theta) = \frac{1}{\Gamma(a)} a^a \theta^{a-1} e^{-a\theta}, \quad \theta > 0. (4.10)

The unconditional probability mass function then becomes

P(N = k) = E_\Theta[P(N = k \mid \Theta)] = \int_0^\infty e^{-\lambda w \theta} \frac{(\lambda w \theta)^k}{k!} \, dF_\Theta(\theta).

Replace dF_\Theta(\theta) by f_\Theta(\theta)\,d\theta = \frac{1}{\Gamma(a)} a^a \theta^{a-1} e^{-a\theta}\,d\theta and we obtain

P(N = k) = \int_0^\infty e^{-\lambda w \theta} \frac{(\lambda w \theta)^k}{k!} \cdot \frac{1}{\Gamma(a)} a^a \theta^{a-1} e^{-a\theta} \, d\theta
         = \frac{a^a (\lambda w)^k}{\Gamma(a)\, k!} \int_0^\infty e^{-(\lambda w + a)\theta} \theta^{k+a-1} \, d\theta
         = \frac{\Gamma(a + k)}{\Gamma(a)\Gamma(k + 1)} \left( \frac{a}{a + \lambda w} \right)^a \left( \frac{\lambda w}{a + \lambda w} \right)^k.

In the last line we recognize the probability mass function of the negative binomial distribution with expectation w\lambda and variance w\lambda + (w\lambda)^2 / a.
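The derivation can be checked numerically. The following Python sketch (with arbitrarily chosen values of \lambda w and a) simulates the Poisson-Gamma mixture and compares it against the closed-form negative binomial pmf obtained above.

```python
# Numerical check of the derivation (illustrative; lambda*w and a are
# chosen arbitrarily): simulating N | Theta ~ Poisson(lambda*w*Theta)
# with Theta ~ Gamma(a, a) and comparing against the closed-form
# negative binomial pmf obtained above.
import math
import numpy as np

rng = np.random.default_rng(1)
lam_w, a = 0.8, 2.0
theta = rng.gamma(shape=a, scale=1.0 / a, size=1_000_000)  # E[Theta] = 1
n = rng.poisson(lam_w * theta)

def mixed_poisson_pmf(k, lam_w, a):
    """Closed-form pmf from the last line of the derivation."""
    log_p = (math.lgamma(a + k) - math.lgamma(a) - math.lgamma(k + 1)
             + a * math.log(a / (a + lam_w))
             + k * math.log(lam_w / (a + lam_w)))
    return math.exp(log_p)

for k in range(4):
    print(k, round(float(np.mean(n == k)), 3),
          round(mixed_poisson_pmf(k, lam_w, a), 3))
# the sample variance exceeds the sample mean (over-dispersion)
print(float(n.mean()) < float(n.var()))  # True
```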

The usual way to find estimates for the parameters of these distributions is the method of maximum likelihood. This method defines the likelihood as the product of the probabilities of observing all outcomes in the data, and treats this as a function of the distribution parameters. The maximum likelihood estimator (MLE) of the parameters is the value for which the likelihood is maximal. GLM also uses this method to find an estimate for \beta.

4.7 Credibility theory

Credibility theory is one of the oldest actuarial techniques. We focus on greatest accuracy credibility, introduced by Buhlmann (1967) in the 1960s. Many textbooks explore this subject, like Kaas et al. (2008), chapter 8, and Buhlmann and Gisler (2005). Credibility is useful if the actuary has to set a premium for a group of policies for which there is limited claim experience on a smaller group, but a lot more experience on a larger group of policies that are more or less related. Credibility can set up an experience rating system to determine the pure premium, taking into account not only the individual experience of the group, but also the collective experience. There are two extreme positions. One is to charge every policy the same premium, calculated over all policies that are present in the data. The other extreme is to charge group or policy r its own premium based on its own claim average. Credibility provides the following formula to obtain a weighted average of the two extreme positions:

z_r \bar{N}_r + (1 - z_r)\bar{N}, \quad 0 \le z_r \le 1, (4.11)

where r is a group of policyholders or risk class, \bar{N}_r is the average number of claims of risk class r and \bar{N} is the average number of claims of the portfolio. z_r is the well-known credibility factor or credibility weight.
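Formula (4.11) can be sketched directly. A minimal Python illustration with made-up claim averages:

```python
# Minimal sketch of formula (4.11) with made-up claim averages: the
# credibility premium is a weighted average of the risk class's own
# experience and the portfolio experience.
def credibility_premium(z_r, own_mean, portfolio_mean):
    assert 0.0 <= z_r <= 1.0
    return z_r * own_mean + (1.0 - z_r) * portfolio_mean

# a small class with little experience leans on the portfolio average ...
print(round(credibility_premium(0.2, own_mean=0.10, portfolio_mean=0.04), 3))  # 0.052
# ... a large class is priced mostly on its own claim average
print(round(credibility_premium(0.9, own_mean=0.10, portfolio_mean=0.04), 3))  # 0.094
```

The two extreme positions correspond to z_r = 0 (portfolio premium) and z_r = 1 (own premium).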

We now switch to a chronological overview of the different credibility models to envision this theory. We start with the Buhlmann model:

N_{rt} = m + \Xi_r + \Xi_{rt}. (4.12)


We interpret m as the overall mean. This is the expected number of claims for an arbitrary risk class. \Xi_r and \Xi_{rt} are independent random variables with expectation zero. The variance of \Xi_r describes the variation between risk classes. \Xi_r denotes a random deviation from the mean m, specific for risk class r. The components \Xi_{rt} denote the deviation for year t from the long-term average of risk class r. They describe the within-variation of a risk class. After the introduction of the Buhlmann model, enhancements have been added. Buhlmann and Straub (1970) created a model where the weight of a risk is included. The Buhlmann-Straub model has the same decomposition as 4.12, but the variance of \Xi_{rt} is s^2 / w_{rt}. Here w_{rt} is the weight attached to observation N_{rt}. Buhlmann and Jewell (1987) introduced Jewell's hierarchical model. This is an improvement of the Buhlmann-Straub model that is compatible with the modeling of hierarchically structured data. Antonio et al. (2010) implement a hierarchical model for data of insurance companies, fleets of vehicles and vehicles. The number of claims for risk class r in sector p in year t can be decomposed as follows:

N_{prt} = m + \Xi_p + \Xi_{pr} + \Xi_{prt}. (4.13)

Again m is the overall mean. \Xi_p is the deviation of sector p from the mean m. In our example p could be the insurance company. Splitting up insurance company p into fleets q and each fleet q into vehicles v, each with its own deviation \Xi_p + \Xi_{pq} + \Xi_{pqv}, leads to a hierarchical chain of models. One can also use this model with fixed effects. For example, denote p as the region and j as the gender of the driver. By adding the term \Xi'_j one can describe the risk characteristics of group j. This model is a cross-classification model.

While GLMs make use of a distribution that is specified beforehand, credibility theory does not use a distribution. Due to this distribution-free property the estimation of the parameters is difficult: it relies on moment estimation, a cumbersome method compared to maximum likelihood. Frees et al. (1999) contributed to the renaissance of credibility theory by showing that the classical credibility model can be reformulated as a linear mixed model (LMM). For the modeling of claim counts the GLM is a better framework than the LM. Therefore we want to switch to GLMs with fixed and random effects in a GLMM framework.

4.8 GLMs with random effects

In this section we extend the GLM with random effects. In subsection 4.8.1 we discuss the combination of GLM and credibility. In subsection 4.8.2 we elaborate on GLMMs.

4.8.1 Backfitting algorithm

Ohlsson (2008) describes the ideas underlying the combination of GLM with credibility theory (GLMC). It is a distribution-free and simple approach, similar to credibility theory. Besides models for random variables with a single random effect per level, Ohlsson also describes models with nested random effects. Ohlsson poses that GLMC is especially suited for the estimation of multi-level factors (MLFs). He incorporates the two structures of random effects which are discussed in section 4.4. The first structure is the single random effect per level, denoted as U_j. In this subsection we use the notation of Ohlsson (2008). The multiplicative model then looks like this:

E[Y_{ijt} \mid U_j] = \mu \gamma_i U_j, (4.14)

where Y_{ijt} is the observed response variable, i is an a priori tariff cell and j is a group of risks, like the MLF level. Repeated observations are indexed by t. \mu is the base premium and \gamma_i is the product of the price relativities for tariff cell i, so \gamma_i = \gamma_{i1}\gamma_{i2} \cdots \gamma_{iR}, with R denoting the number of ordinary rating factors. \mu and the \gamma_{ir} can be estimated by standard GLM methods by initially disregarding U_j. U_j is the random effect, with E[U_j] = 1. U_j is estimated by the following algorithm, named the backfitting algorithm.


Step 0 Initially, let Uj = 1 for all j;

Step 1 Estimate the parameters for the ordinary rating factors by a Poisson GLM with log link, using log(U_j) as an offset variable. This yields \mu and \gamma_i;

Step 2 Compute σ2 and τ2 using the formulas (2.9) and (2.10) of Ohlsson (2008) andthe outcome of Step 1;

Step 3 Use equation 4.15 to compute Uj using the estimates from Step 1 and 2;

Step 4 Return to Step 1 with the new Uj from Step 3 until convergence.

U_j is estimated as

U_j = z_j \frac{\tilde{\bar{Y}}_{.j.}}{\mu} + (1 - z_j), (4.15)

where \tilde{\bar{Y}}_{.j.} is the average response over group j. The tilde means that Y is transformed by \gamma_i; the bar means that it is an average. z_j is the well-known credibility factor. If z_j is zero, then U_j equals 1. If z_j is 1, then U_j equals the transformed average of group j divided by \mu. z_j depends on the total exposure of j, \sigma^2 and \tau^2, as specified in formula (2.5) of Ohlsson (2008). For groups with small variation between the observations within the group in comparison with the variation between the groups, z_j tends to 1.
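The iteration above can be sketched as follows. This is a deliberately simplified Python sketch: it uses no ordinary rating factors (all \gamma_i = 1) and replaces the variance estimation of Step 2 by a fixed, hypothetical credibility constant K, since Ohlsson's formulas (2.9) and (2.10) are not reproduced here. All data are made up.

```python
# A deliberately simplified sketch of the backfitting loop above: no
# ordinary rating factors (all gamma_i = 1) and a fixed, hypothetical
# credibility constant K in place of the sigma^2/tau^2 estimation of
# Step 2 (Ohlsson's formulas (2.9) and (2.10) are not reproduced here).
import numpy as np

groups = np.array([0, 0, 0, 1, 1, 2])             # MLF level j per observation
w = np.array([10.0, 12.0, 8.0, 50.0, 60.0, 5.0])  # exposures (made up)
n = np.array([1, 0, 1, 9, 12, 0])                 # claim counts (made up)

K = 20.0                     # hypothetical credibility constant
U = np.ones(3)               # Step 0: U_j = 1 for all j
for _ in range(200):
    # Step 1 (degenerate GLM): base frequency mu, treating U_j as an offset
    mu = n.sum() / (w * U[groups]).sum()
    # Step 3: credibility update of U_j, cf. equation (4.15)
    U_new = np.empty_like(U)
    for j in range(3):
        mask = groups == j
        w_j = w[mask].sum()
        z_j = w_j / (w_j + K)            # credibility weight
        y_bar = n[mask].sum() / w_j      # observed group frequency
        U_new[j] = z_j * y_bar / mu + (1.0 - z_j)
    if np.max(np.abs(U_new - U)) < 1e-12:
        U = U_new
        break                            # Step 4: stop at convergence
    U = U_new

print(np.round(U, 3))  # the high-frequency group gets U_j > 1
```

The group with frequency above the base level receives a factor above 1, the group with no claims is shrunk towards 1 according to its (small) credibility weight.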

Ohlsson (2008) also considers the use of nested random effects. This is the multiplicative model:

E[Y_{ijkt} \mid U_j, U_{jk}] = \mu \gamma_i U_j U_{jk}. (4.16)

Here we have two random effects, namely U_j for sector j and U_{jk} for group k within sector j. The other indices correspond to those in the model in expression 4.14. The assumptions are that E[U_j] = 1 and E[U_{jk} \mid U_j] = 1. We illustrate the hierarchical model with the example with car brand and car model from Ohlsson (2008). In the example the key ratio Y_{ijkt} is the claim frequency, i is a policyholder, j is a car brand, k is a car model and t is the time period. The ordinary rating factors \gamma_i = \gamma_{i1}\gamma_{i2} \cdots \gamma_{iR} are well-known factors, like vehicle class, vehicle age and geographic zone.

To find the estimates of the hierarchical model the same iterative backfitting algorithm is used. The random effects U_j and U_{jk} are both initially set to 1 for all j and k. \mu and the \gamma_{ir} can be estimated by standard GLM methods by incorporating log(U_j) and log(U_{jk}) as offset variables. The formulas for the estimates differ from the non-hierarchical case, but are provided by Ohlsson (2008).

4.8.2 Generalized linear mixed models

Generalized linear mixed models (GLMMs) extend GLMs by including random effects in the linear predictor. The random effects reflect the idea that there is a natural heterogeneity across risk classes and that observations on the same subject share common characteristics. In section 4.3 the advantages of linear mixed models are discussed. These advantages also apply to GLMMs. The idea of combining credibility and GLMs was introduced in the actuarial literature by Nelder and Verrall (1997). Frees et al. (2014), chapter 16, and Antonio and Beirlant (2007) elaborate on this topic and explain several examples.

GLMMs extend GLMs by adding a random effect z'_{ij}u_i to the linear predictor x'_{ij}\beta. z_{ij} is a vector of known covariates of the random effects and u_i is a parameter vector of random effects for subject i. Conditional on u_i, the GLMM assumptions for the jth response on subject i, response variable Y_{ij}, are

Y_{ij} \mid u_i \sim f_{Y_{ij} \mid u_i}(y_{ij} \mid u_i)

f_{Y_{ij} \mid u_i}(y_{ij} \mid u_i) = \exp\left( \frac{y_{ij}\theta_{ij} - b(\theta_{ij})}{\phi} - c(y_{ij}, \phi) \right)

u_i \sim f_U(u_i). (4.17)


As with GLM there is a link function g(.) that relates the mean \mu_{ij} to the fixed (\beta) and random effect (u_i) parameter vectors:

g(\mu_{ij}) = x'_{ij}\beta + z'_{ij}u_i, (4.18)

where u_i is the vector of random effects for cluster i, \beta is the vector of fixed effects parameters, and x_{ij} and z_{ij} are p- and q-dimensional vectors of known covariates corresponding with the fixed and random effects, respectively.

When the response variable Y_{ij} follows a Poisson distribution, as in our case with the number of claims, we use the logarithm as the link function. So,

\log(\mu_{ij}) = x'_{ij}\beta + z'_{ij}u_i,
\mu_{ij} = e^{x'_{ij}\beta + z'_{ij}u_i}. (4.19)

The likelihood of the GLMM with specification 4.17 is

L(\beta, D \mid y_{ij}) = \int f_{Y_{ij} \mid u_i}(y_{ij} \mid u_i) f_U(u_i) \, du_i, (4.20)

where the integral goes over the random effects vector u_i (with covariance matrix D). Frees et al. (2014), chapter 16, state that due to the integral in 4.20 there are no explicit expressions for the estimators and predictors. Approximations to the likelihood or numerical integration techniques are required to maximize 4.20 with respect to the unknown parameters. Three approaches are distinguished to approximate the estimates of \beta and D and the predictions of the random effects u_i for cluster i.

1. The Laplace approximation
The Laplace method approximates integrals of the form

\int e^{h(u)} \, du (4.21)

for some function h of a q-dimensional vector u; see Tierney and Kadane (1986). A second-order Taylor expansion of h around its maximum \hat{u} gives

h(u) \approx h(\hat{u}) + \frac{1}{2}(u - \hat{u})' h''(\hat{u})(u - \hat{u}). (4.22)

This expansion is used to approximate 4.20.

2. The penalized quasi-likelihood (PQL)
This method is also called pseudo-likelihood (PL). It is based on an algorithm with a working variate. The algorithm starts with initial estimates of \beta, u and the variance components. Using linear mixed model techniques, the working variates and variances are updated. These steps are repeated until convergence of the estimates. Breslow and Clayton (1993) give justifications of the approach.

3. The adaptive Gauss-Hermite quadrature (GHQ)
The non-adaptive Gauss-Hermite quadrature approximates integrals of the kind stated in (4.23) with a weighted sum:

\int_{-\infty}^{\infty} h(z) e^{-z^2} \, dz \approx \sum_{l=1}^{Q} w_l h(z_l). (4.23)

Q is the order of the approximation, the z_l are the zeros of the Qth order Hermite polynomial, and the w_l are the corresponding weights. With the adaptive Gauss-Hermite quadrature rule, the nodes are rescaled and shifted such that the integrand is sampled in a suitable range. The integral in 4.20 is approximated with the


adaptive Gauss-Hermite quadrature rule for numerical integration. This numerical integration technique still enables, for instance, a likelihood ratio test. Moreover, the estimation process is just singly iterative. On the other hand, at present the procedure can only deal with a small number of random effects, which limits its general applicability. When Q = 1, z_1 = 0 and w_1 = 1, this method corresponds with the Laplace approximation method. Liu and Pierce (1994) give more details on GHQ.

Frees et al. (2014), chapter 16, discuss some pros and cons of the three methods. Laplace and PQL rely on quite a few approximations, so their accuracy is low. The advantage of PQL is that it can handle a large number of random effects, as well as crossed and nested random effects. The approximation through numerical integration is more accurate, but this method is limited to GLMMs with a small number of nested random effects. Gauss-Hermite quadrature is explicitly designed for normally distributed random effects, which gives less flexibility.
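Two of the approximation ideas above can be sketched in one dimension. This is an illustrative Python sketch, not a GLMM fit: the Laplace approximation is tested on a Gaussian integrand (where it is exact), and the Gauss-Hermite rule uses NumPy's nodes and weights.

```python
# Sketches of two of the approximation ideas above in one dimension
# (illustrative only). Laplace: int e^{h(u)} du ~ e^{h(u_hat)} *
# sqrt(2*pi / -h''(u_hat)) at the mode u_hat. Gauss-Hermite:
# int h(z) e^{-z^2} dz ~ sum_l w_l h(z_l) with NumPy's nodes and weights.
import math
import numpy as np

# Laplace on h(u) = -u^2/2 (mode u_hat = 0, h(0) = 0, h''(0) = -1):
# the approximation reproduces the exact Gaussian integral sqrt(2*pi).
laplace = math.exp(0.0) * math.sqrt(2.0 * math.pi / 1.0)
print(laplace, math.sqrt(2.0 * math.pi))    # both ~2.5066

# Gauss-Hermite with Q nodes on h(z) = z^2: the exact integral is
# sqrt(pi)/2, and the rule is exact for polynomials of degree < 2Q.
Q = 10
nodes, weights = np.polynomial.hermite.hermgauss(Q)
gh = float(np.sum(weights * nodes**2))
print(gh, math.sqrt(math.pi) / 2.0)         # both ~0.8862
```

For a non-Gaussian integrand the Laplace result is only approximate, which is the source of the accuracy concerns mentioned above; increasing Q improves the quadrature at the cost of more evaluations per random effect.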


Chapter 5

Modeling of the MLF

Before we start the analysis, we describe the current practice for the CGL portfolio at the insurer concerned. The insurer uses GLM to model the claim frequency. Besides an overall intercept, their model includes the covariates revenue class and business sector. Although this model is technically correct, it is not accurate enough. The insurer projects the tariff per business sector to the appropriate level of SIC codes. This projection is done by risk experts. This is a time-consuming and subjective process. The tariff depends heavily on human preferences. In this chapter we relate to the current practice and explore alternatives.

We analyze the data described in chapter 3. The variable of interest is N_{rt}, which is the number of claims observed per risk class r and time period t. The number of policy years w_{rt} is the measure of exposure. We perform the data preparation and analysis in the statistical software package R. Section 5.1 presents the modeling of the claim frequency. Several well-known distributions for claim counts have been fitted to the data. The goal is to choose a distribution that we can use in the GLM framework presented in section 5.2. In that section we search for the best GLM models using only fixed effects. Here, we approach all available covariates as fixed effects, including the MLF. Playing with the MLF in this manner will show the difficulties of using the MLF. Section 5.3 introduces the MLF as a random effect: we incorporate the SIC code in a GLMM with a non-hierarchical structure. Finally, in section 5.4 we incorporate the business sector and SIC code in a GLMM with a hierarchical structure in order to show the added value of such a nested structure of our random effects.

5.1 Actuarial modeling of claim frequencies

We start with fitting the negative binomial, Poisson and over-dispersed Poisson (quasi-Poisson) distributions. We do not take any regression parameters into account, but we do incorporate the exposure. To get a first impression of our data we made table 5.1, which compares the fit of several distributions to the data. Hereby we follow the procedure from Kaas et al. (2008), page 65, to fit the negative binomial distribution to the data. First we calculate the estimates for the parameters of the fitted distributions. Then we tabulate the empirical distribution of the data as well as the fitted distributions. For the Poisson distribution λ = 0.0443 is used. We estimate λ as follows:

λ = Σrt Nrt / Σrt wrt, (5.1)

where r and t indicate the risk classes and the repeated observations. Table 5.1 shows that the data has a right-skewed tail. The negative binomial distribution fits this tail better than the Poisson. After this first impression we fit the data with the help of the glm function of the R library stats and the glm.nb function of the MASS library. We used the following model


Pricing multi-level factors — Aart Valkhof 23

Number of claims

Distribution        Parameter             0      1    2   3   4   5  6  7  8  9  10  11  12  13
Empirical                                 5,684  294  56  27  9   4  0  2  1  1  0   0   0   1
Negative binomial   r=0.0967; p=0.5004    5,685  275  75  26  10  4  2  1  0  0  0   0   0   0
Poisson             λ=0.0443              5,816  258  6   0   0   0  0  0  0  0  0   0   0   0

Table 5.1: Comparing different distributions for the number of claims.

and commands

λrt = exp (β0 + log(wrt)), (5.2)

glm(N∼ 1, offset = log(exposure), family = poisson(link = log)) (5.3)

glm.nb(N∼ 1 + offset(log(exposure)), link = log) (5.4)

glm(N∼ 1, offset = log(exposure), family = quasipoisson). (5.5)

Figure 5.1 and table 5.2 show the resulting estimates: 95% confidence intervals, estimates and the Akaike information criterion (AIC) are given. The AIC is derived as stated in Kaas et al. (2008), page 248, and is calculated with the following equation:

AIC = −2ℓ + 2k, (5.6)

where k is the number of parameters and ℓ is the logarithm of the likelihood. The routines glm and glm.nb include the AIC in their output. The quasi-Poisson fit does not include an AIC because there is no log-likelihood for this distribution.

[Figure omitted: point estimates with error bars for the negative binomial, Poisson and quasi-Poisson fits; y-axis range −3.25 to −3.00.]
Figure 5.1: Estimate (+/− 1.96 · s.e.).

Distribution    Estimate   s.e.    AIC
Neg. binomial   −3.123     0.052   2,966.7
Poisson         −3.117     0.041   3,118.1
Quasipoisson    −3.117     0.046   n/a

Table 5.2: Comparing distributions.

The confidence intervals show small differences, though the confidence interval of the Poisson distribution is the smallest. The negative binomial distribution returns the lowest AIC. Based on the fits obtained above, where no covariates are taken into account, we should opt for a negative binomial distribution to fit the data. Unfortunately the R library that we will use for the GLMM modeling does not support this distribution. The library lme4
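The two identities used above can be checked in a few lines of base R. The sketch below uses simulated data (not the thesis portfolio): for an intercept-only Poisson GLM with a log(exposure) offset, the fitted rate exp(β0) coincides with the moment estimator (5.1), and the AIC reported by glm satisfies (5.6).

```r
# Sketch on simulated data (not the thesis portfolio). For an intercept-only
# Poisson GLM with log(exposure) offset, the fitted rate exp(beta0) equals the
# moment estimator (5.1), and glm's AIC equals -2*loglik + 2k as in (5.6).
set.seed(1)
n <- 2000
w <- runif(n, 0.1, 3)              # exposure in policy years
N <- rpois(n, lambda = 0.05 * w)   # simulated claim counts

fit <- glm(N ~ 1, offset = log(w), family = poisson(link = log))

lambda_hat <- exp(coef(fit)[["(Intercept)"]])
lambda_mom <- sum(N) / sum(w)                       # equation (5.1)
aic_manual <- -2 * as.numeric(logLik(fit)) + 2 * 1  # equation (5.6), k = 1

all.equal(lambda_hat, lambda_mom, tolerance = 1e-6)  # TRUE
all.equal(AIC(fit), aic_manual)                      # TRUE
```

The first identity holds because the Poisson score equation for the intercept-only model with offset is Σ(Nrt − wrt·exp(β0)) = 0, which is solved exactly by exp(β0) = ΣN/Σw.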

5.2 Generalized linear model

In this section we discuss the GLM analyses and outcomes. We fit the GLMs using the glm function in R. For all models we choose a Poisson distribution in combination with a log link function. We use the logarithm of exposure as an offset. First we propose a basic model: a model with the ordinary rating factors, without the use of the MLF. We refer to this model as complete pooling because the model ignores the clustering of the data in economic activities. In a next step we extend the basic model


by the MLF as if it were a fixed effect. We first add business sector to the basic model and after that we replace business sector with SIC code. The latter is called the no pooling model, because there is no pooling of the data anymore: every SIC code has its own estimate. The basic model extended with the business sector is called the semi-complete pooling model; its pooling of the data is in between complete pooling and no pooling.

5.2.1 Complete pooling

The starting point of our analyses is the basic GLM model, where we ignore the clustering of the data by the economic activity. We refer to this model as a complete pooling model. The covariates under consideration are the ordinary rating factors, which we approach as fixed effects, namely revenue class, occurrence limit and claim year. We use β0 for an overall intercept. β1 is the parameter for the revenue class, the classification of the revenue of the policyholder as described in subsection 3.2.4. This covariate contains numeric values and is estimated by a single parameter β1, inspired by the treatment of the bonus-malus factor in Kaas et al. (2008), chapter 9. In order to retrieve the appropriate frequency we multiply β1 by the revenue class. For example, if a policyholder has revenue classification 5, then the frequency, apart from the other parameters, will be exp(β1 · 5). The vector β2 consists of two parameters for the levels of the explanatory variable occurrence limit. Vector β3 contains three parameters for the three levels of the explanatory variable claim year. We investigate three GLM models.

Nrt ∼ POI(wrt · λrt),
λrt = exp (β0 + β1 · revenue classrt), (5.7)

λrt = exp (β0 + β1 · revenue classrt + β2 · limitrt), (5.8)

λrt = exp (β0 + β1 · revenue classrt + β2 · limitrt + β3 · claim yearrt). (5.9)

Here r is a risk class and t are the repeated observations. The covariates occurrence limit and claim year need to be treated as categorical variables, which is achieved in R

by the factor instruction. We define:

limit = factor(occurrence limit) claimyear = factor(claim year). (5.10)

We create model (5.9) in R with the following instruction:

glm(N ∼ revenue cat + limit + claimyear, offset = log(exposure),
    family = poisson(link = log), data = ds glm) (5.11)

We commence our quest for the relevant rating factors by comparing the models (5.7) and (5.8). Model (5.7) is the most restrictive: it presumes that the elements of the vectors β2 and β3 are equal to zero. In a hypothesis test, the null hypothesis is H0 : β2 = 0 and the alternative hypothesis is H1 : β2 ≠ 0. The hypothesis testing is done by the function anova with the following call:

anova(glm.fit1, glm.fit2, test = ”Chisq”) (5.12)

Here glm.fit1 stands for (5.7) and glm.fit2 for (5.8). The function returns a p-value of 0.01864, which tells us that the null hypothesis is rejected. The same test for model (5.8) against model (5.9) gives a p-value of 0.2634; that null hypothesis is not rejected. In table A.3 of the appendix the estimates, standard errors and AIC of all three models are listed.
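As an illustration of this testing procedure, the sketch below builds two nested Poisson GLMs on simulated data and compares them with anova. The variable names revenue_cat and limit mimic the thesis data set, but all values here are made up and limit has no true effect.

```r
# Simulated example of the nested-model comparison via a likelihood-ratio
# (Chisq) test. All values are made up; 'limit' has no true effect here.
set.seed(2)
n           <- 3000
exposure    <- runif(n, 0.2, 2)
revenue_cat <- sample(1:7, n, replace = TRUE)
limit       <- factor(sample(c("A", "B", "C"), n, replace = TRUE))
N           <- rpois(n, exposure * exp(-3.5 + 0.2 * revenue_cat))

glm.fit1 <- glm(N ~ revenue_cat, offset = log(exposure),
                family = poisson(link = log))
glm.fit2 <- glm(N ~ revenue_cat + limit, offset = log(exposure),
                family = poisson(link = log))

tab <- anova(glm.fit1, glm.fit2, test = "Chisq")
tab[2, "Df"]        # 2: the three-level factor 'limit' adds two parameters
tab[2, "Pr(>Chi)"]  # the p-value used to accept or reject H0
```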

Based on the hypothesis testing, the relevant covariates are revenue class and occurrence limit. Later on in this chapter we will find that one of the mixed models does not converge if the covariate occurrence limit is included. Because our main goal is


to compare the different mixed model methods, we choose revenue class as our only fixed effect. We call this model the basic model (5.13). In this model every policyholder within a revenue class has the same rate, independent of the economic activity. For practical utilization by an insurer the rate needs more differentiation. We will add it in the next subsection by investigating the covariates business sector and SIC code as fixed effects.

λrt = exp (β0 + β1 · revenue classrt). (5.13)

Model names

                       Complete (5.7)             Semi-complete (5.14)
Covariate              Estimate (s.e.)   p-value  Estimate (s.e.)   p-value
Intercept              −4.541 (0.121)    0.000    −4.944 (0.145)    0.000
Revenue cat             0.225 (0.016)    0.000     0.224 (0.016)    0.000
Business sector
  Hospitality                                     ref. group
  Construction                                     1.063 (0.107)    0.000
  Garage                                          −0.109 (0.456)    0.812
  Manufacturing                                    0.517 (0.190)    0.007
  Retail                                          −0.302 (0.148)    0.042
  Wholesale                                        0.251 (0.170)    0.141
Observations            6,079                      6,079
Log Likelihood         −1,443.877                 −1,354.026
Akaike Inf. Crit.       2,891.753                  2,722.053

Table 5.3: Estimates, s.e. and statistics for the (semi-)complete pooling models.

5.2.2 Semi-complete pooling

Starting from the model in (5.13) we analyze the candidate random effects as if they were fixed effects. In this way we create a model that we can use for comparison. We execute our strategy in two steps. First, we extend our basic model with the business sector. Second, we take the SIC code into account. We call the model with the extension of business sector the semi-complete pooling model (5.14), because the pooling of the data is in between complete pooling and no pooling. The business sector is taken as a fixed effect besides the other fixed effect revenue class. We use the R function factor to establish that business sector is a categorical variable.

λrt = exp (β0 + β1 · revenue classrt + β4 · Sectorrt). (5.14)

Here, the vector β4 is a vector with six elements that correspond with the six regression parameters for the covariate business sector. Table 5.3 states the outcomes of the complete and semi-complete pooling models. The estimates for the covariate revenue class are almost equal, but the overall intercept of the semi-complete pooling model is much higher. This is compensated by the estimates for the covariate business sector. We performed a hypothesis test with H0 : β4 = 0 and H1 : β4 ≠ 0. With a p-value of 0.000, H0 is rejected, so we include business sector in our model.

With a simple example we demonstrate the risk of the complete pooling model. According to the complete model the annual frequency for a policyholder in the Construction sector with a revenue class of 4 is exp(−4.541 + 4 · 0.225) = 0.0262. Note the calculation for the revenue class factor. We multiply β1 by four because revenue class


is modeled as a numeric value. The semi-complete model gives the same policyholder a frequency of exp(−4.944 + 4 · 0.224 + 1.063) = 0.05054. The policyholder will opt for the first premium, and the insurer will end up with all the bad risks for a low premium.

5.2.3 No pooling

We construct a model with no pooling by replacing the business sector with the categorical variable SIC code:

λrt = exp (β0 + β1 · revenue classrt + β5 · SICrt). (5.15)

Vector β5 consists of 349 elements that correspond with the levels of the covariate SIC code. The outcomes of model (5.15) are questionable. The covariate SIC code has many levels with very low exposure, which leads to unrealistic outcomes. For example, SIC code 1072 has an estimate of −15.925 and a standard error of 25,624.20; the p-value is 0.9995. Such outcomes are the rule rather than the exception. This issue is discussed in section 4.5. The data is too sparse to accurately estimate a parameter for each level. We add this model to our research to demonstrate the problem of pricing an MLF.
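This failure mode is easy to reproduce. The toy example below (all numbers made up, a hypothetical two-level factor instead of the 349 SIC codes) gives one level some exposure but zero claims; its maximum likelihood frequency is 0, so the log-scale estimate diverges and the standard error explodes.

```r
# Toy version of the no pooling problem (all numbers made up): level "B" has
# five policy years of exposure and zero claims, so its maximum likelihood
# frequency is 0 and the estimate on the log scale diverges.
set.seed(3)
sic      <- factor(rep(c("A", "B"), times = c(400, 5)))
exposure <- rep(1, 405)
N        <- c(rpois(400, lambda = 0.05), integer(5))

fit <- glm(N ~ sic, offset = log(exposure), family = poisson(link = log))
est <- summary(fit)$coefficients["sicB", ]
est[["Estimate"]]    # a large negative number
est[["Std. Error"]]  # enormous, so the Wald p-value is close to 1
```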

We conclude that the GLM techniques can bring us a tariff per business sector. As we described in the introduction of this chapter, this tariff is not accurate enough: the insurer will be outcompeted in a competitive market. We will now investigate whether the GLMM techniques can improve the tariff.

5.3 Non-hierarchical models

We start the investigation of GLMMs with a discussion of models that do not make use of the nested structure of our MLF. We therefore only analyze the SIC code covariate and neglect the covariate business sector. This section compares three models. Our first model is the no pooling model with only fixed effects; we do not discuss it separately here, as we discussed it before in subsection 5.2.3. Then we investigate a random effects model using the SIC code as a random effect and calibrate it with the backfitting algorithm of Ohlsson, followed by the same random effects model calibrated with the adaptive Gauss-Hermite quadrature method. The last calibration method can only handle non-hierarchical, univariate random effects, hence the choice for a non-nested model.

5.3.1 Backfitting algorithm

Ohlsson (2008) proposed a strategy to combine GLMs with credibility theory, known as the backfitting algorithm; see subsection 4.8.1 of this thesis. We implemented the algorithm in R. The starting point of our analysis is the basic model proposed in (5.13). It contains a base level β0 plus a parameter β1 for the ordinary risk factor revenue class. We add to this model the covariate SIC code as a random effect, denoted as vj, where j stands for the level of the MLF (j = 1, 2, ..., 349). The non-hierarchical model for the backfitting algorithm looks like this:

λit | vj = exp (β0 + β1 · revenue classit + β5 · vj),
E[vj] = 0, (5.16)

for policyholder i, observation t and SIC code j.

In contrast to the GLMM, there are no standard R libraries that support the backfitting algorithm, so we had to implement the complete algorithm ourselves. An implementation of the backfitting algorithm is given in appendix B.2. The core of the algorithm is given by

glm(N ∼ revenue cat + offset(log(v hat j)), offset = log(exposure),
    family = poisson(link = log), ds glm). (5.17)


A remarkable difference compared to glm call (5.11) is the offset(log(v hat j)) specification included in the formula parameter. Due to this offset the estimates for β0 and β1 are determined iteratively; vj itself is determined in step 3 of the algorithm. The output of the algorithm is β0 = −4.586 (s.e. 0.121) and β1 = 0.227 (s.e. 0.016). Besides the estimates of the parameters and the standard errors of the fixed effects, the algorithm produces credibility factors. Table A.6 displays the credibility factors zj and the estimates of the algorithm. We see that SIC codes with high exposure have a high zj, which means that the estimate for the SIC code is mainly based on its own experience. For SIC codes with a low zj the estimate is based on the experience of the whole portfolio. More information on the credibility factors and on the algorithm can be found in subsection 4.8.1.
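The credibility mechanism can be illustrated with a stylized sketch. This is not the algorithm of appendix B.2: the shrinkage constant k below is chosen arbitrarily, whereas in the backfitting algorithm it follows from estimated variance components (subsection 4.8.1). The sketch only shows how a factor of the form zj = wj/(wj + k) blends a SIC code's own claim frequency with the portfolio frequency.

```r
# Stylized credibility blend, NOT the full backfitting algorithm: k is an
# arbitrary illustration constant; in the algorithm it follows from estimated
# variance components (subsection 4.8.1).
cred_blend <- function(claims, expo, lambda_portfolio, k) {
  z <- expo / (expo + k)                           # credibility factor in [0, 1)
  z * (claims / expo) + (1 - z) * lambda_portfolio # blended frequency
}

k    <- 50
high <- cred_blend(claims = 40, expo = 800, lambda_portfolio = 0.044, k = k)
low  <- cred_blend(claims = 0,  expo = 5,   lambda_portfolio = 0.044, k = k)
high  # close to the own frequency 40/800 = 0.05
low   # close to the portfolio frequency 0.044
```

A SIC code with 800 policy years keeps almost all of its own experience, while a code with 5 policy years and no claims is pulled towards the portfolio mean instead of receiving an unrealistic zero rate.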

5.3.2 Adaptive Gauss-Hermite quadrature

The function glmer within the lme4 package also offers functionality to approximate the integral (4.20) using the adaptive Gauss-Hermite quadrature method (GHQ). The parameter nAGQ = 15 in R command (5.19) defines the accuracy of this approach. nAGQ stands for the number of points per axis for evaluating the adaptive Gauss-Hermite approximation to the log-likelihood and corresponds with the Q in (4.23). This parameter is one by default, which corresponds to the Laplace approximation. In our experiments a value higher than 15 does not lead to different outcomes. The GH quadrature in package lme4 does not allow us to model more than one random effect, as stated in lme4 (2015). For that reason we restrict this method to non-nested models only. The model specification is as follows:

Nrt | vj ∼ POI(wrt · λrt | vj),
λrt | vj = exp (β0 + β1 · revenue classrt + β5 · vj),
vj ∼ N(0, σ²v), (5.18)

where r and t denote the risk class and the repeated observation, and j represents the level of the MLF SIC code. The corresponding R instruction is:

glmer(N ∼ revenue cat + (1|SIC), offset = log(exposure),
    family = poisson(link = log), ds glm, nAGQ = 15) (5.19)

From the R output we conclude β0 = −5.037 (s.e. 0.168) and β1 = 0.226 (s.e. 0.016); σ²v = 0.976 is the estimate of the variance of the random effect. More results can be found in tables A.5 and A.6 in the appendix.

5.3.3 A comparison of the methods

In this subsection we compare the different methods for non-hierarchical models. Because there are 349 SIC codes it is hard to plot all of them in a single graph. Therefore we make a selection of SIC codes with the purpose of giving a balanced representation between detail and overview. In figure 5.2 the estimates and confidence intervals for the SIC codes of the Hospitality sector are shown. This involves eleven SIC codes (j = 1, ..., 11) and three models, so there are 33 estimates. The x-axis represents the log of the sum of the exposure of observation t over SIC code j (i.e., log(Σt wjt)). The point estimates are the sum of the overall intercept and the estimates for the random effect, so β0 + β5 · vj. The red dashed line is y = −4.9441, the sum of the overall intercept of the semi-complete pooling model (5.14) and the estimate for the business sector Hospitality, so β0 + β4,Hospitality. We obtained the confidence intervals by adding/subtracting 1.96 · sqrt(σ²j) to/from the point estimates, following the procedure from Frees et al. (2014), chapter 16. The SIC codes with high exposure have resembling estimates in all three models and


[Figure omitted: scatter plot of estimates against log(exposure), with +/− 1.96 · s.e. error bars, for SIC codes 55101, 55102, 55201, 55202, 5530, 5621, 5629, 56101, 56102, 56103 and 5630.]

Figure 5.2: Estimates and 95% confidence intervals for the eleven SIC codes of the Hospitality sector. Every SIC code has a group of three dots for the following estimates: • No pooling; • backfitting algorithm; • GH quadrature.

their corresponding standard errors are small. For SIC codes with smaller exposure the no pooling model deviates from the other two models. This shows the weakness of the no pooling model. For example, SIC codes 5629 and 56103 have exposures of 65.82 and 80.67 and no claims reported. The estimate of the no pooling model is extremely small and the confidence interval is effectively infinite. These estimates and confidence intervals are not realistic. The backfitting algorithm and the GH quadrature method treat these SIC codes better: even though the data is sparse, these models return acceptable outcomes. The two models have added value in comparison to the currently used semi-complete model. Since the estimates per SIC code differ from the semi-complete overall intercept, we conclude that differentiation on SIC code is justified.
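For completeness, the construction of such Wald-type intervals from a fitted glm object can be sketched as follows (simulated data; like the y-axis of figure 5.2, the interval lives on the log-frequency scale):

```r
# Wald-type 95% interval: estimate +/- 1.96 * standard error, on the scale of
# the linear predictor (log frequency). Simulated data, not the portfolio.
set.seed(4)
w   <- runif(500, 0.5, 2)
N   <- rpois(500, lambda = 0.05 * w)
fit <- glm(N ~ 1, offset = log(w), family = poisson(link = log))

co    <- summary(fit)$coefficients
lower <- co[1, "Estimate"] - 1.96 * co[1, "Std. Error"]
upper <- co[1, "Estimate"] + 1.96 * co[1, "Std. Error"]
c(lower = lower, estimate = co[1, "Estimate"], upper = upper)
```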

The regression parameters of all three models are available in the appendix. Table A.4 shows the fixed effects of the three non-hierarchical models. The overall intercept of the backfitting algorithm is the highest, followed by the no pooling and Gauss-Hermite quadrature models. The regression parameters for revenue class are almost equal to each other. Table A.6 shows the estimates of the MLF SIC code. Besides the exposure wj we show the credibility factors zj, calculated by the backfitting algorithm. Sorting on the credibility factor, we made a selection of the top, middle and bottom SIC codes.

Figure 5.3 shows the frequencies for a random selection of 500 policyholders. For policyholders with high exposure the difference between the three premiums is small. For lower exposure the GH quadrature deviates from the other two methods. It is hard to see any blue dots because the red dots overwrite them; the no pooling premiums are close to the backfitting premiums. When a risk class has few observations, the no pooling model returns unrealistically small premiums and the backfitting or GH quadrature premium is preferred.

5.4 Hierarchical models

We implement three hierarchical models, each with a different estimation method. First we present the solutions of the backfitting algorithm of Ohlsson. Second, the outcomes obtained by the Laplace approximation are analyzed. We conclude with the penalized quasi-likelihood (PQL) method. We use the nested structure explained in section 4.4, where the SIC code is hierarchically ordered under the business sector.


[Figure omitted: six panels of premium rate against log(exposure), one per sector: Hospitality, Construction, Garage, Manufacturing, Retail, Wholesale.]

Figure 5.3: Premium rates for 500 random policyholders grouped per sector. ◦ No pooling; ◦ backfitting algorithm; ◦ GH quadrature.

5.4.1 Backfitting algorithm

The outcomes of the first hierarchical model are calculated with the algorithm that Ohlsson describes in section 3 of Ohlsson (2008). The MLF under consideration is the SIC code. The covariates SIC code and business sector are added to the model as random effects. The business sector is denoted as uj for j = 1, ..., 6 and the SIC code within the sector as vjk for k = 1, ..., 349. Our hierarchical model for the backfitting algorithm looks like this:

λit | uj, vjk = exp (β0 + β1 · revenue classit + β4 · uj + β5 · vjk),
E[uj] = 0,
E[vjk] = 0, (5.20)

where i and t denote the policyholder and the repeated observation. In step 1 of the algorithm this glm call is invoked:

glm(N ∼ revenue cat + offset(log(u hat j)) + offset(log(v hat jk)),
    offset = log(exposure), family = poisson(link = log), ds glm) (5.21)

This is the core of the algorithm, where the fixed effects are approximated by offsetting the two random effects. In steps 2 and 3 the variance parameters and the random effects uj and vjk are approximated. After step 3 the algorithm starts again at step 1, until convergence; in our models convergence is achieved within 5 iterations. The estimated regression parameters of the fixed effects are β0 = −4.531 (s.e. 0.121) and β1 = 0.225 (s.e. 0.016). Tables A.8, A.9 and A.10 show the estimates for the random effects business sector and SIC code; the exposure and the credibility factor zj are also given.


5.4.2 Laplace approximation

We use the Laplace approximation to estimate the second hierarchical model. R offers several libraries for this approximation, like lme4, lme4a and glmmML. We do the modeling with the glmer function from the library lme4, which uses the Laplace approximation by default.

The model under consideration is this hierarchical model

Nrt | uj, vjk ∼ POI(wrt · λrt | uj, vjk),
λrt | uj, vjk = exp (β0 + β1 · revenue classrt + β4 · uj + β5 · vjk),
uj ∼ N(0, σ²u),
vjk ∼ N(0, σ²v), (5.22)

for risk class r and repeated observation t. Here uj is a random effect for the covariate business sector and vjk is a random effect for the SIC code within the sector. Besides the random effects we still have the fixed effects: the overall intercept and revenue class.

This is the model in R parlance with the Laplace approximation method.

glmer(N ∼ (1|Sector/SIC) + revenue cat, offset = log(exposure),
    family = poisson(link = log), ds glm) (5.23)

The R code returns the following output: β0 = −4.839 (s.e. 0.257) and β1 = 0.224 (s.e. 0.016) are the estimates of the fixed effects parameters, and σ²u = 0.4889 and σ²v = 0.251 are the estimates of the variances of the random effects. More results can be found in tables A.9 and A.10 in the appendix.

5.4.3 Penalized quasi likelihood

We calculate the outcomes of the third hierarchical model with the help of the penalized quasi-likelihood (PQL) method. The function glmmPQL from the library MASS is available to fit mixed models with PQL. The R code is listed below; the model specification is the same as for the Laplace approximation method (5.22).

glmmPQL(N ∼ revenue cat + offset(log(exposure)), random = ∼1|Sector/SIC,
    family = poisson(link = log), ds glm) (5.24)

These are the estimates from the R output: β0 = −4.791 (s.e. 0.246) and β1 = 0.225 (s.e. 0.013) for the fixed effects parameters, and σ²u = 0.492 and σ²v = 0.906 for the variances of the random effects. More results can be found in tables A.9 and A.10 in the appendix.

5.4.4 A comparison of the methods

Like we did for the non-hierarchical models in subsection 5.3.3, we compare the methods for the hierarchical models, using the same kind of graphs. In figure 5.4 we show the estimates and confidence intervals of the eleven Hospitality SIC codes. The point estimates are the sum of the overall intercept and the estimates for the random effects, so β0 + β4 · uj + β5 · vjk. The red dashed line is y = −4.9441, the sum of the overall intercept of the semi-complete pooling model (5.14) and the estimate for the business sector Hospitality, so β0 + β4,Hospitality. The estimates of the backfitting algorithm are not accompanied by confidence intervals: the theory behind this algorithm makes use of credibility theory, which is distribution-free, so there is no comprehensive statistical framework available like we have with a GLMM. For SIC codes with high exposure the estimates of the three methods are nearly equal. The confidence intervals of the PQL method


[Figure omitted: estimates against log(exposure) with +/− 1.96 · s.e. error bars for SIC codes 55101, 55102, 55201, 55202, 5530, 5621, 5629, 56101, 56102, 56103 and 5630.]

Figure 5.4: Estimates and 95% confidence intervals for the SIC codes of the Hospitality sector. • Backfitting algorithm; • Laplace approximation; • PQL.

are wider. For SIC codes with low exposure there is more difference between the estimates: the backfitting algorithm has the lowest estimates, followed by the Laplace approximation and the PQL method. For SIC codes with intermediate exposure the PQL method gives the lowest estimates, and the backfitting and Laplace methods are comparable. All three mixed models differ from the semi-complete model. The mixed models add differentiation for SIC codes, which is lacking in the semi-complete model. This refinement is just what the insurance company is looking for. With mixed models one can make this differentiation with the help of statistical models.

Table A.7 in the appendix compares the fixed effects of the hierarchical models. The intercept of the backfitting algorithm is the highest, followed by the PQL and Laplace methods. The estimates for revenue class do not differ between the three methods. Table A.8 shows the regression parameters for business sector. Here we see that the estimates of the backfitting algorithm are lower. For all methods the Construction sector is the highest risk and Retail is the lowest risk. This is in line with table 3.3. Table A.10 shows a selection of estimates for the MLF SIC code. The table is sorted by the credibility factor zjk. This factor, discussed in subsection 4.8.1, is an outcome of the backfitting algorithm; the GLMMs do not output a credibility factor. The advantage of this factor is that it is easy to interpret and to explain to non-actuarial people.

Figure 5.5 shows the premium rates for 500 random policyholders, with one graph per sector. For policyholders with high exposure the different rates are very similar. For policyholders with lower exposure the backfitting algorithm rate is too low.

5.4.5 Non-hierarchical versus hierarchical models

Finally we compare a non-hierarchical model with a hierarchical model to judge whether the use of the nested structure brings any advantage. In figure 5.6 we display the estimates of the backfitting algorithm, both non-hierarchical and hierarchical, for the eleven SIC codes concerning the Hospitality sector. The figure is set up like figures 5.2 and 5.4. In case of high exposure the non-hierarchical and the hierarchical model do not differ by much. But in the case of less exposure we see that the estimates of the hierarchical model are closer to the average sector estimate, which is the red dashed line.


[Figure omitted: six panels of premium rate against log(exposure), one per sector: Hospitality, Construction, Garage, Manufacturing, Retail, Wholesale.]

Figure 5.5: Premium rates for 500 random policyholders grouped per sector. ◦ Backfitting algorithm; ◦ Laplace approximation; ◦ PQL.

[Figure omitted: estimates against log(exposure) with +/− 1.96 · s.e. error bars, y-axis range −6.0 to −3.5, for SIC codes 55101, 55102, 55201, 55202, 5530, 5621, 5629, 56101, 56102, 56103 and 5630.]

Figure 5.6: Estimates and 95% confidence intervals for the SIC codes of the Hospitality sector. • Non-hierarchical; • Hierarchical.


Chapter 6

Conclusions

This thesis has investigated the pricing of insurance products in the presence of MLFs. Several pricing techniques have been explored. We used a commercial general liability portfolio to apply the considered pricing techniques. The MLF in question was the economic activity. First this MLF was priced by a GLM as if it were a fixed effect. Then the MLF was incorporated as a random effect by using different GLMMs. We investigated several GLMM techniques: the backfitting algorithm by Ohlsson, the Laplace approximation, the Gauss-Hermite quadrature method and the penalized quasi-likelihood method. Because our MLF has a nested structure, we compared non-hierarchical and hierarchical models. Our main goal was to determine whether the pricing of an MLF by means of mixed model techniques is an enrichment of the current pricing by the insurance company. If so, we want to know which technique suits best. Thirdly, we raise the question whether the nested structure of the data justifies the use of a hierarchical model.

Applying mixed models to the pricing of the MLF improves the CGL tariff. The current pricing by means of a GLM with the covariate business sector lacks differentiation. Adding the MLF as a fixed effect to this GLM will not provide good estimates for every single level of the MLF. When a mixed model is applied, all levels, even the levels with sparse data, have proper estimates. We have seen that for all GLMMs the rates are comparable when there is enough data; when the data is sparse, the estimates can differ. An advantage of the backfitting algorithm is that the credibility factor gives an indication of the credibility of our MLF. This is useful for the insurer: in case of a low credibility, additional expert judgment can be deployed. A disadvantage of the backfitting algorithm is that inference testing and statistics are not available. Another drawback of the backfitting algorithm is the lack of standard software; implementing the algorithm is an involved and time-consuming task. The glmer and glmmPQL functions for GLMMs do not provide the credibility factors, but come with a better statistical framework. Currently GLMMs have several drawbacks. Although progress has been made in the past years, the software is not as advanced as the GLM software. For instance, the GHQ functionality of the glmer function in the lme4 library only supports one random effect. Another example is the experimental status of the glmer.nb function. Despite these drawbacks, pricing actuaries in the non-life domain should be familiar with GLMM techniques to obtain better pricing. Mixed model techniques improve the pricing of MLFs.

Exploiting the nested structure of the data improves the tariff of a CGL portfolio. Again, the tariffs of the non-hierarchical model and the hierarchical model do not differ by much when there is plenty of data, but when the data is too sparse the hierarchical model gives better estimates.


Bibliography

Antonio K., J. Beirlant (2007). Actuarial statistics with generalized linear mixed models. Insurance: Mathematics and Economics (40), pp 58-76, doi:10.2143/AST.40.1.2049223.

Antonio K., E.W. Frees, E.A. Valdez (2010). A multilevel analysis of intercompany claim counts. ASTIN Bulletin (40), pp 151-177.

Antonio K., Y. Zhang (2012). Mixed models for predictive modelling in actuarial science. Chapters 8 and 16 in Predictive Modeling Applications in Actuarial Science. Volume 1: Predictive Modeling Techniques, Frees et al. (2014).

Breslow N.E., D.G. Clayton (1993). Approximate inference in generalized linear mixed models. Journal of the American Statistical Association (88), pp 9-25.

Buhlmann H. (1967). Experience rating and credibility. ASTIN Bulletin (4), pp 199-207.

Buhlmann H., E. Straub (1970). Glaubwrdigkeit fr Schadenstze. Bulletin of Swiss Asso-ciaton of Actuaries, pp 111-133.

Buhlmann H., W.S. Jewell (1987). Hierarchical credibility revisited. Bulletin of SwissAssociaton of Actuaries, pp 35-54.

Buhlmann, H., A. Gisler (2005). A course in credibility theory and its applications.Springer, Berlin.

Denuit M., X. Marchal, S. Pitrebois, J.F. Walhin (2007). Actuarial Modelling of ClaimCounts - Risk Classification, Credibility and Bonus-Malus Systems. John Wiley &Sons Ltd, West Susses, England.

European Union (2003). Commission recommendation of 6 May 2003 concern-ing the definition of micro, small and medium-sized enterprises. URLhttp://ec.europa.eu/growth/smes/business-friendly-environment/

sme-definition/index_en.htm.

Kealy, D.M. (2015). Understanding the Commercial General Liability Insurance Pol-icy. Self-published. URL https://books.google.nl/books?id=UAUbCgAAQBAJ&

printsec=frontcover&hl=nl&source=gbs_ge_summary_r&cad=0#v=onepage&q&

f=false.

Fischer M.M., J. Wang (2011). Spatial data analysis: models, methods and techniques.Springer Science & Business Media.

Frees E.W., V.R. Young, Y. Luo (1999). A longitudinal data analysis interpretation ofcredibility models. Insurance: Mathematics and Economics (24), pp 229-247.

Frees E.W., R.A. Derring, G. Meyers (2014). Predictive Modeling Applications in Ac-tuarial Science. Volume 1: Predictive Modeling Techniques. Cambridge UniversityPress.


Information from Statistics Netherlands about SIC (2015). URL http://www.cbs.nl/nl-NL/menu/methoden/classificaties/overzicht/sbi/default.htm.

Kaas R., M.J. Goovaerts, J. Dhaene, M. Denuit (2008). Modern Actuarial Risk Theory - Using R, 2nd edition. Springer, Heidelberg.

Liu Q., D.A. Pierce (1994). A note on Gauss-Hermite quadrature. Biometrika (81), pp 624-629.

Mano C., E. Rasa (2006). Use of classification analysis for grouping multi-level rating factors. International Congress of Actuaries (28), pp 1-30.

McCullagh P., J.A. Nelder (1989). Generalized linear models. In: Monographs on Statistics and Applied Probability. Chapman and Hall, New York.

Nelder J.A., R.J. Verrall (1997). Credibility theory and generalized linear models. ASTIN Bulletin (27), pp 71-82.

Information from Eurostat about NACE (2015). URL http://ec.europa.eu/eurostat/statistics-explained/index.php/Glossary:Statistical_classification_of_economic_activities_in_the_European_Community_%28NACE%29.

Ohlsson E., B. Johansson (2004). Combining Credibility and GLM for Rating of Multi-Level Factors. CAS 2004 Discussion Paper Program, Colorado Springs.

Ohlsson E. (2008). Combining generalized linear models and credibility models in practice. Scandinavian Actuarial Journal (4), pp 301-314, doi:10.1080/03461230701878612.

Ohlsson E., B. Johansson (2010). Non-Life Insurance Pricing with Generalized Linear Models. Springer, Heidelberg.

Ratner P.E. (1983). Insurance Coverage of Asbestosis Claims - Running for Cover or Coverage. Emory Law Journal (32), p 901.

Tierney L., J. Kadane (1986). Accurate approximations for posterior moments and marginal densities. Journal of the American Statistical Association (81), pp 82-86.

R Development Core Team (2012). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/. ISBN 3-900051-07-0.

Bates D., M. Maechler, B. Bolker, S. Walker (2015). lme4: Linear Mixed-Effects Models using 'Eigen' and S4. URL https://cran.r-project.org/web/packages/lme4/lme4.pdf.


Appendix A

Tables

A.1 Revenue classes

This table shows the boundaries between the deciles of the revenue class per business sector. For example, the first decile of Construction runs from 0 to 37,145 euro, the second decile from 37,145 to 50,000 euro, and so on.

Business sector    1         2          3          4          5

Construction       37,145    50,000     59,171     70,000     100,000
Retail             50,000    87,500     120,000    180,000    250,000
Garage             31,399    60,000     100,000    135,000    188,788
Wholesale          70,000    120,000    200,000    300,000    450,000
Hospitality        50,000    100,000    120,000    150,000    200,000
Manufacturing      45,000    75,000     100,000    165,000    244,794

Business sector    6          7           8           9           10

Construction       120,000    175,000     300,000     600,000     13,677,718
Retail             325,000    448,613     608,557     1,000,000   20,000,000
Garage             350,000    500,000     700,000     1,500,000   47,475,000
Wholesale          650,000    1,000,000   1,500,000   3,000,000   64,000,000
Hospitality        250,000    308,571     450,000     678,000     21,031,000
Manufacturing      340,000    500,000     800,000     1,586,086   17,416,328

Table A.1: Decile boundaries for revenue class per business sector. Amounts in euro.
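The boundaries in Table A.1 are per-sector deciles of the revenue distribution. A base-R sketch follows; the ds, revenue and business_sector names are assumptions standing in for the thesis data set.

```r
# Sketch: decile boundaries of a revenue vector, as in Table A.1 (base R).
decile_bounds <- function(revenue) {
  quantile(revenue, probs = seq(0.1, 1.0, by = 0.1))
}

decile_bounds(seq(10, 100, by = 10))  # small artificial example

# Per business sector this would be something like:
# t(sapply(split(ds$revenue, ds$business_sector), decile_bounds))
```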


A.2 Hospitality SIC codes

This section sums up the statistics for the SIC codes of the Hospitality business sector.

SIC      Exposure    N     Frequency

56101    1,556.82    67    0.0430
56102    1,055.46    20    0.0189
5630       918.80    20    0.0218
5621       178.88     4    0.0224
55101      118.08     2    0.0169
55102      112.06     1    0.0089
56103       80.67     0    0.0000
5629        65.82     0    0.0000
5530        61.90     4    0.0646
55201       37.29     2    0.0536
55202        5.06     2    0.3950

Total    4,190.84   122    0.0291

Table A.2: Claim statistics per SIC code for the Hospitality sector over the claim years 2012 to 2014.
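The Frequency column in Table A.2 is the number of claims divided by the exposure, aggregated per SIC code; for example, the overall frequency is 122 / 4,190.84 ≈ 0.0291. A minimal base-R sketch with made-up data (the SIC, exposure and N column names mirror the thesis data set):

```r
# Sketch: claim frequency per SIC code (base R); ds_demo is artificial data.
freq_per_sic <- function(ds) {
  claims   <- tapply(ds$N,        ds$SIC, sum)
  exposure <- tapply(ds$exposure, ds$SIC, sum)
  claims / exposure
}

ds_demo <- data.frame(SIC      = c("56101", "56101", "5630"),
                      exposure = c(0.5, 0.5, 2.0),
                      N        = c(1, 0, 1))
freq_per_sic(ds_demo)  # 56101: 1.0, 5630: 0.5
```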


A.3 GLM results for basic model

Complete pooling model

                      (5.7)                       (5.8)                       (5.9)
Covariate             Estimate (s.e.)   p-value   Estimate (s.e.)   p-value   Estimate (s.e.)   p-value

Intercept             -4.541 (0.121)    0.000     -4.694 (0.139)    0.000     -4.776 (0.161)    0.000
Revenue cat            0.225 (0.016)    0.000      0.224 (0.016)    0.000      0.224 (0.016)    0.000
Occurrence limit
  1250000                                         ref. group                  ref. group
  2500000                                          0.222 (0.096)    0.021      0.196 (0.097)    0.045
Claim year
  2012                                                                        ref. group
  2013                                                                         0.050 (0.127)    0.693
  2014                                                                         0.163 (0.113)    0.150

Observations           6,079                       6,079                       6,079
Log Likelihood        -1,443.877                  -1,441.109                  -1,439.775
Akaike Inf. Crit.      2,891.753                   2,888.218                   2,889.550

Table A.3: Estimates, s.e. (in parentheses) and statistics for the complete pooling models.
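Models (5.7) to (5.9) are Poisson GLMs with a log-exposure offset that add the occurrence limit and the claim year step by step. The sketch below fits the same sequence on simulated data; the coefficients and column names are illustrative assumptions, not the thesis results.

```r
# Sketch on simulated data: the complete pooling models (5.7)-(5.9) as
# Poisson GLMs with a log-exposure offset (base R only).
set.seed(1)
n  <- 2000
ds <- data.frame(
  revenue_cat = sample(1:10, n, replace = TRUE),
  limit       = factor(sample(c("1250000", "2500000"), n, replace = TRUE)),
  claim_year  = factor(sample(2012:2014, n, replace = TRUE)),
  exposure    = runif(n, 0.1, 1)
)
ds$N <- rpois(n, ds$exposure * exp(-3 + 0.22 * ds$revenue_cat))

m1 <- glm(N ~ revenue_cat, offset = log(exposure),
          family = poisson(link = "log"), data = ds)  # (5.7)
m2 <- update(m1, . ~ . + limit)                       # (5.8) adds occurrence limit
m3 <- update(m2, . ~ . + claim_year)                  # (5.9) adds claim year

AIC(m1, m2, m3)  # compare the models, as in the last row of Table A.3
```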


A.4 Factors non-hierarchical models

These are the tables for comparison of the parameter estimates of the non-hierarchical models.

Non-hierarchical models

Covariate No pooling Backfitting GH quadrature

Intercept      -4.711   -4.586   -5.037
revenue cat     0.225    0.227    0.226

Table A.4: Comparison of the fixed effects of non-hierarchical models.

Non-hierarchical models

SIC wj zj N No pooling Backfitting GH quadrature

56101    1,556.82   0.977   67     0.000    -0.139    0.311
5630       918.80   0.953   20    -0.470    -0.573   -0.146
56102    1,055.46   0.953   20    -0.470    -0.571   -0.145
55101      118.08   0.813    2    -1.218    -0.928   -0.621
5621       178.88   0.749    4    -0.159    -0.215    0.126
55102      112.06   0.742    1    -1.504    -0.915   -0.661
56103       80.67   0.585    0   -17.785    -0.879   -0.735
5530        61.90   0.580    4     0.618     0.303    0.724
5629        65.82   0.533    0   -18.179    -0.761   -0.648
55201       37.29   0.392    2     0.681     0.249    0.612
55202        5.06   0.146    2     2.015     0.589    1.255

Table A.5: Comparison of MLF SIC code for Hospitality sector.


Non-hierarchical models

SIC wj zj N No pooling Backfitting GH quadrature

56101    1,556.82   0.977   67     0.000    -0.139    0.311
4120       980.77   0.960   53     0.355     0.205    0.658
5630       918.80   0.953   20    -0.470    -0.573   -0.146
56102    1,055.46   0.953   20    -0.470    -0.571   -0.145
43221      317.51   0.903   48     1.200     0.991    1.483
432101     312.25   0.897   13    -0.047    -0.168    0.249
4332       395.53   0.887   21     0.533     0.356    0.809
4334       370.84   0.882   36     1.127     0.910    1.403
4711       214.98   0.865    4    -0.919    -0.835   -0.484
4333       244.04   0.852   26     1.059     0.825    1.321
...
47783       25.84   0.348    0   -17.280    -0.428   -0.392
47521       20.95   0.347    1     0.188     0.016    0.229
47592       22.13   0.341    0   -17.567    -0.417   -0.383
476202      12.66   0.331    0   -17.903    -0.402   -0.371
46421       22.72   0.328    0   -16.942    -0.397   -0.367
46737       18.94   0.317    5     1.931     0.947    1.780
...
18121        4.56   0.139    0   -16.995    -0.149   -0.150
466999       7.47   0.138    1     1.384     0.293    0.628
331211       6.17   0.137    1     1.392     0.294    0.630
1813         8.11   0.136    0   -17.229    -0.146   -0.148
47542        4.67   0.135    0   -17.417    -0.144   -0.146
108501       4.26   0.134    0   -18.241    -0.144   -0.145
...
4664         0.17   0.002    0   -15.702    -0.002   -0.003
331236       0.08   0.002    0   -15.664    -0.002   -0.002
131006       0.08   0.002    0   -15.474    -0.002   -0.002
47192        0.17   0.002    0   -15.477    -0.002   -0.002
331232       0.08   0.002    0   -15.441    -0.002   -0.002
310121       0.17   0.002    0   -14.589    -0.002   -0.002
46232        0.17   0.001    0   -15.252    -0.001   -0.002
46771        0.08   0.001    0   -15.148    -0.001   -0.001
45192        0.08   0.001    0   -14.800    -0.001   -0.001
46733        0.08   0.001    0   -14.350    -0.001   -0.001

Table A.6: Comparison of MLF SIC code of non-hierarchical models. The table is ordered descending by zj. The SIC codes that are displayed are the top 10, 100th to 105th, 200th to 205th and the last 10.


A.5 Factors hierarchical models

These are the tables for comparison of the parameter estimates of the hierarchical models.

Hierarchical models

Covariate Backfitting Laplace PQL

Intercept      -4.531   -4.839   -4.791
revenue cat     0.225    0.224    0.225

Table A.7: Comparison of the fixed effects for hierarchical models.

Hierarchical models

Business sector Backfitting Laplace PQL

Construction      0.788    0.967    0.847
Garage           -0.365   -0.132   -0.223
Hospitality      -0.387   -0.139   -0.162
Manufacturing     0.117    0.329    0.185
Retail           -0.637   -0.412   -0.553
Wholesale        -0.150    0.070   -0.091

Table A.8: Comparison of the business sectors for hierarchical models.

Hierarchical models

SIC wjk zjk N Backfitting Laplace PQL

56101    1,556.82   0.956   67    0.196    0.262    0.239
5630       918.80   0.912   20   -0.239   -0.182   -0.220
56102    1,055.46   0.912   20   -0.239   -0.182   -0.219
55101      118.08   0.686    2   -0.575   -0.520   -0.717
5621       178.88   0.602    4    0.028    0.073    0.068
55102      112.06   0.593    1   -0.564   -0.515   -0.770
56103       80.67   0.417    0   -0.539   -0.500   -0.876
5530        61.90   0.411    4    0.421    0.554    0.703
5629        65.82   0.366    0   -0.456   -0.433   -0.780
55201       37.29   0.246    2    0.301    0.409    0.625
55202        5.06   0.079    2    0.501    0.764    1.398

Table A.9: Comparison of MLF SIC code for the Hospitality sector. The table is ordered descending by zjk.


Hierarchical models

SIC wjk zjk N Backfitting Laplace PQL

56101    1,556.82   0.956   67    0.196    0.262    0.239
4120       980.77   0.923   53   -0.552   -0.464   -0.407
5630       918.80   0.912   20   -0.239   -0.182   -0.220
56102    1,055.46   0.912   20   -0.239   -0.182   -0.219
43221      317.51   0.824   48    0.193    0.349    0.425
432101     312.25   0.815   13   -0.734   -0.768   -0.767
4332       395.53   0.799   21   -0.333   -0.277   -0.226
4334       370.84   0.790   36    0.126    0.275    0.351
4711       214.98   0.764    4   -0.334   -0.253   -0.239
4333       244.04   0.744   26    0.067    0.207    0.283
...
47783       25.84   0.213    0   -0.239   -0.195   -0.372
47521       20.95   0.211    1    0.174    0.201    0.409
47592       22.13   0.207    0   -0.232   -0.190   -0.363
476202      12.66   0.200    0   -0.223   -0.182   -0.350
46421       22.72   0.198    0   -0.221   -0.269   -0.481
46737       18.94   0.190    5    0.731    1.264    1.761
...
18121        4.56   0.075    0   -0.078   -0.131   -0.261
466999       7.47   0.075    1    0.194    0.329    0.699
331211       6.17   0.074    1    0.137    0.291    0.602
1813         8.11   0.074    0   -0.077   -0.129   -0.257
47542        4.67   0.073    0   -0.075   -0.065   -0.136
331215       7.00   0.073    0   -0.075   -0.084   -0.181
...
4664         0.17   0.001    0   -0.001   -0.002   -0.004
331236       0.08   0.001    0   -0.001   -0.002   -0.005
47192        0.17   0.001    0   -0.001   -0.001   -0.002
131006       0.08   0.001    0   -0.001   -0.002   -0.004
331232       0.08   0.001    0   -0.001   -0.001   -0.003
310121       0.17   0.001    0   -0.001   -0.001   -0.003
46232        0.17   0.001    0   -0.001   -0.001   -0.002
46771        0.08   0.001    0   -0.001   -0.001   -0.002
45192        0.08   0.000    0   -0.000   -0.001   -0.001
46733        0.08   0.000    0   -0.000   -0.000   -0.001

Table A.10: Comparison of MLF SIC code of hierarchical models. The table is ordered descending by zjk. The SIC codes that are displayed are the top 10, 100th to 105th, 200th to 205th and the last 10.


Appendix B

R-code

B.1 Data samples

This section shows two samples of the data. One sample is at the level of the policyholder, the other at the level of the risk class.

Polisnr Claim year Revenue cat Revenue Business sector SIC Limit Exposure N

...
194579    2012    4   132,000      Manufacturing   1052    1250000   0.4153   0
189867    2014    4   150,000      Hospitality     56102   2500000   0.4959   0
219825    2013   10   1,000,000    Hospitality     56101   2500000   0.4137   0
2246286   2013    4   121,862      Hospitality     5630    1250000   0.2521   0
2272528   2014    6   201,270      Hospitality     56101   2500000   0.9151   0
263617    2013    2   100,000      Hospitality     5630    2500000   1.0000   0
264314    2013    3   110,000      Hospitality     56101   2500000   1.0000   0
280043    2014    6   250,000      Hospitality     5630    2500000   0.6712   0
303192    2014    3   116,560      Hospitality     56101   1250000   0.0849   0
154134    2012    4   60,000       Construction    4332    1250000   0.5847   0
160622    2012    9   600,000      Construction    4334    2500000   0.5847   1
177847    2012    7   150,000      Construction    43222   2500000   0.0847   0
203822    2014    1   14,125       Construction    4334    2500000   1.0000   1
2258151   2013    5   72,000       Construction    4120    2500000   0.0849   0
2271558   2014    6   100,000      Construction    42112   2500000   0.5890   0
2271721   2014    6   100,000      Construction    4312    2500000   0.1616   0
2291677   2014    9   569,342      Construction    4339    2500000   0.4192   1
260880    2013    7   150,000      Construction    4120    1250000   0.5863   0
271279    2013   10   1,050,000    Construction    4120    2500000   0.0849   0
281308    2014    6   100,000      Construction    4120    2500000   1.0000   0
187438    2014    6   310,000      Retail          47712   2500000   0.5863   0
201704    2012    6   296,740      Retail          47783   1250000   0.0847   0
348831    2014    1   50,000       Retail          47789   2500000   0.0849   0
398472    2014    7   354,000      Retail          47761   2500000   0.1671   0
2741687   2014   10   26,274,000   Wholesale       46311   2500000   0.1671   0
...

Table B.1: Small sample of the data set on the level of policyholder.


Claim year Revenue cat Business sector SIC Limit Exposure N

...
2014    1   Retail          4777     1250000   0.75   0
2012    7   Manufacturing   16231    1250000   0.42   0
2013    2   Construction    43993    1250000   2.07   0
2014   10   Retail          47521    1250000   1.00   0
2012    9   Retail          47526    1250000   0.92   0
2014    2   Retail          47643    1250000   1.00   0
2013    1   Retail          47791    1250000   1.00   0
2013    1   Retail          47899    1250000   0.08   0
2013    5   Manufacturing   310902   1250000   0.83   0
2013    1   Wholesale       4636     2500000   0.29   0
2014    9   Wholesale       4652     2500000   1.16   0
2014    6   Retail          4711     2500000   4.93   0
2014    9   Manufacturing   23611    2500000   0.44   0
2014   10   Construction    43993    2500000   2.08   0
2013    3   Garage          45112    2500000   1.59   0
2012   10   Wholesale       46382    2500000   1.58   1
2014    1   Wholesale       46383    2500000   1.42   0
2014    6   Wholesale       46692    2500000   1.00   0
2013   10   Wholesale       46737    2500000   1.00   3
2014    1   Wholesale       46901    2500000   0.08   0
2014   10   Retail          47292    2500000   2.76   0
2014    8   Retail          47293    2500000   0.08   0
2013    9   Retail          47712    2500000   0.17   0
2013    9   Retail          47761    2500000   1.17   0
2014    5   Hospitality     55201    2500000   3.00   0
...

Table B.2: Small sample of the data set on the level of risk class.
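The risk-class records in Table B.2 arise from the policyholder-level records in Table B.1 by summing exposure and claim counts over all policies that share the same rating characteristics. A base-R sketch follows; the column names are assumptions mirroring the tables.

```r
# Sketch: collapse policy-level records to risk-class level (base R).
aggregate_to_risk_class <- function(ds) {
  aggregate(cbind(exposure, N) ~ claim_year + revenue_cat +
              business_sector + SIC + limit, data = ds, FUN = sum)
}

# Two policies in the same risk class become one risk-class record:
ds_demo <- data.frame(claim_year = 2013, revenue_cat = 2,
                      business_sector = "Hospitality", SIC = "5630",
                      limit = 2500000, exposure = c(1.0, 0.5), N = c(0, 1))
aggregate_to_risk_class(ds_demo)  # one row with exposure 1.5 and N 1
```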


B.2 Backfitting algorithm

In this section we discuss the R implementation of the backfitting algorithm for the hierarchical model. The code is based on the hierarchical models in section 3 of Ohlsson (2008). The four steps of the backfitting algorithm are marked in comments. Numbers in parentheses, for example (3.3), refer to the formulas.

# ds_ohl: the data set to be used

# i: level in ordinary rating factors
# j: business sector; the sectors
# k: SIC; the groups
# t: every single observation

# create data sets for the levels sector and group
ds_j  <- unique(ds_ohl[c("business_sector")])
ds_jk <- unique(ds_ohl[c("business_sector", "SIC")])

# general parameters
p <- 1  # Poisson, so variance function mu^p
R <- 1  # number of ordinary rating factors: revenue_cat
J <- length(unique(ds_ohl$business_sector))  # number of sectors
K <- length(unique(ds_ohl$SIC))              # number of groups

# step 0. Initially let U_hat = 1 for all j and k
U_hat_j <- with(ds_j, rep(1, nrow(ds_j)))
ds_j <- cbind(ds_j, U_hat_j)
ds_ohl$U_hat_j <- rep(1, nrow(ds_ohl))
U_hat_jk <- rep(1, nrow(ds_jk))
ds_jk <- cbind(ds_jk, U_hat_jk)
ds_ohl$U_hat_jk <- rep(1, nrow(ds_ohl))

# start the iteration
for (iter in 1:20) {
  # step 1.
  glm.fit <- glm(N ~ revenue_cat + offset(log(U_hat_j)) + offset(log(U_hat_jk)),
                 offset = log(exposure), family = poisson(link = log),
                 data = ds_ohl)

  # t stands for tilde
  mu_hat <- exp(glm.fit$coefficients["(Intercept)"])
  # retrieve the gammas from the model; take all offsets out
  ds_ohl$gamma_i <- fitted(glm.fit) /
    (ds_ohl$exposure * ds_ohl$U_hat_j * ds_ohl$U_hat_jk * mu_hat)
  ds_ohl$Y_ijkt   <- ds_ohl$N / ds_ohl$exposure
  ds_ohl$Y_t_ijkt <- ds_ohl$Y_ijkt / ds_ohl$gamma_i            # (3.3)
  ds_ohl$w_t_ijkt <- ds_ohl$exposure * ds_ohl$gamma_i^(2 - p)  # (3.3)

  # exposure is needed for comparison of models
  exposure.jk. <- tapply(ds_ohl$exposure, ds_ohl$SIC, FUN = sum)
  ds_jk$exposure.jk. <- exposure.jk.[match(ds_jk$SIC, names(exposure.jk.))]

  # weight tilde calculations
  w_t.jk. <- tapply(ds_ohl$w_t_ijkt, ds_ohl$SIC, FUN = sum)
  ds_ohl$w_t.jk. <- w_t.jk.[match(ds_ohl$SIC, names(w_t.jk.))]
  ds_jk$w_t.jk.  <- w_t.jk.[match(ds_jk$SIC, names(w_t.jk.))]
  w_t.jk.2 <- tapply(ds_ohl$w_t_ijkt, ds_ohl$SIC, FUN = sum)^2
  ds_jk$w_t.jk.2 <- w_t.jk.2[match(ds_jk$SIC, names(w_t.jk.2))]
  w_t.j.. <- tapply(ds_ohl$w_t_ijkt, ds_ohl$business_sector, FUN = sum)
  ds_ohl$w_t.j.. <- w_t.j..[match(ds_ohl$business_sector, names(w_t.j..))]
  ds_jk$w_t.j..  <- w_t.j..[match(ds_jk$business_sector, names(w_t.j..))]
  ds_j$w_t.j..   <- w_t.j..[match(ds_j$business_sector, names(w_t.j..))]
  w_t.... <- sum(ds_ohl$w_t_ijkt)
  ds_ohl$w_t.... <- w_t....

  # Y tilde bar calculations
  Y_t_bar.jk. <- tapply(ds_ohl$w_t_ijkt * ds_ohl$Y_t_ijkt, ds_ohl$SIC, FUN = sum) /
    tapply(ds_ohl$w_t_ijkt, ds_ohl$SIC, FUN = sum)             # (3.3)
  ds_ohl$Y_t_bar.jk. <- Y_t_bar.jk.[match(ds_ohl$SIC, names(Y_t_bar.jk.))]
  ds_jk$Y_t_bar.jk.  <- Y_t_bar.jk.[match(ds_jk$SIC, names(Y_t_bar.jk.))]
  Y_t_bar.j.. <- tapply(ds_jk$w_t.jk. * ds_jk$Y_t_bar.jk., ds_jk$business_sector, FUN = sum) /
    tapply(ds_jk$w_t.jk., ds_jk$business_sector, FUN = sum)    # (3.5)
  ds_ohl$Y_t_bar.j.. <- Y_t_bar.j..[match(ds_ohl$business_sector, names(Y_t_bar.j..))]
  ds_jk$Y_t_bar.j..  <- Y_t_bar.j..[match(ds_jk$business_sector, names(Y_t_bar.j..))]
  ds_j$Y_t_bar.j..   <- Y_t_bar.j..[match(ds_j$business_sector, names(Y_t_bar.j..))]

  # number of groups per sector
  K_j_tmp <- with(ds_ohl, tapply(SIC, business_sector,
                                 FUN = function(x) length(unique(x))))
  K_j <- K_j_tmp[!is.na(K_j_tmp)]  # number of groups per sector
  ds_ohl$K_j <- K_j[match(ds_ohl$business_sector, names(K_j))]
  ds_j$K_j   <- K_j[match(ds_j$business_sector, names(K_j))]

  # T: number of observations i and t for group j,k
  T_jk_tmp <- table(ds_ohl$SIC)
  T_jk <- subset(T_jk_tmp, T_jk_tmp > 0)  # remove SIC codes with zero observations
  ds_ohl$T_jk <- T_jk[match(ds_ohl$SIC, names(T_jk))]

  # step 2. estimators of the variance parameters
  hier.sigma_hat_2 <- 1 / sum(T_jk - 1) *
    sum(ds_ohl$w_t_ijkt * (ds_ohl$Y_t_ijkt - ds_ohl$Y_t_bar.jk.)^2)  # (3.7)
  hier.vu_hat_2 <- (sum(ds_jk$w_t.jk. * (ds_jk$Y_t_bar.jk. - ds_jk$Y_t_bar.j..)^2) -
                      hier.sigma_hat_2 * sum(K_j - 1)) /
    (w_t.... - sum(ds_jk$w_t.jk.^2 / ds_jk$w_t.j..))                 # (3.8)

  # z calculations
  z_jk <- ds_jk$w_t.jk. / (ds_jk$w_t.jk. + hier.sigma_hat_2 / hier.vu_hat_2)  # (3.5)
  ds_jk$z_jk  <- z_jk[match(ds_jk$SIC, names(z_jk))]
  ds_ohl$z_jk <- z_jk[match(ds_ohl$SIC, names(z_jk))]
  z_j. <- tapply(ds_jk$z_jk, ds_jk$business_sector, FUN = sum)
  ds_j$z_j. <- z_j.[match(ds_j$business_sector, names(z_j.))]
  z.. <- sum(ds_j$z_j.)

  # Y tilde bar z calculations
  Y_t_bar.j..z <- tapply(ds_jk$z_jk * ds_jk$Y_t_bar.jk., ds_jk$business_sector, FUN = sum) /
    tapply(ds_jk$z_jk, ds_jk$business_sector, FUN = sum)
  ds_j$Y_t_bar.j..z <- Y_t_bar.j..z[match(ds_j$business_sector, names(Y_t_bar.j..z))]
  Y_t_bar....z <- sum(ds_j$z_j. * ds_j$Y_t_bar.j..z) / sum(ds_j$z_j.)

  # step 2. estimator of the variance parameter
  hier.tau_hat_2 <- (sum(ds_j$z_j. * (ds_j$Y_t_bar.j..z - Y_t_bar....z)^2) -
                       hier.vu_hat_2 * (J - 1)) / (z.. - sum(ds_j$z_j.^2 / z..))  # (3.9)

  # q calculations
  q_j <- ds_j$z_j. / (ds_j$z_j. + hier.vu_hat_2 / hier.tau_hat_2)  # (3.5)
  ds_j$q_j   <- q_j[match(ds_j$business_sector, names(q_j))]
  ds_ohl$q_j <- q_j[match(ds_ohl$business_sector, names(q_j))]

  # V calculations
  V_hat_j <- ds_j$q_j * ds_j$Y_t_bar.j..z + (1 - ds_j$q_j) * mu_hat  # (3.4)
  ds_j$V_hat_j  <- V_hat_j[match(ds_j$business_sector, names(V_hat_j))]
  ds_jk$V_hat_j <- V_hat_j[match(ds_jk$business_sector, names(V_hat_j))]
  V_hat_jk <- ds_jk$z_jk * ds_jk$Y_t_bar.jk. + (1 - ds_jk$z_jk) * ds_jk$V_hat_j  # (3.6)

  # step 3. U_hat calculations
  U_hat_j <- ds_j$q_j * ds_j$Y_t_bar.j..z / mu_hat + (1 - ds_j$q_j)
  ds_j$U_hat_j   <- U_hat_j[match(ds_j$business_sector, names(U_hat_j))]
  ds_jk$U_hat_j  <- U_hat_j[match(ds_jk$business_sector, names(U_hat_j))]
  ds_ohl$U_hat_j <- U_hat_j[match(ds_ohl$business_sector, names(U_hat_j))]
  U_hat_jk <- ds_jk$z_jk * ds_jk$Y_t_bar.jk. / ds_jk$V_hat_j + (1 - ds_jk$z_jk)
  ds_jk$U_hat_jk  <- U_hat_jk[match(ds_jk$SIC, names(U_hat_jk))]
  ds_ohl$U_hat_jk <- U_hat_jk[match(ds_ohl$SIC, names(U_hat_jk))]

  # step 4. Return to step 1 until convergence
}  # end of for loop