
Kernel Alternatives to Approximate Operational Severity Distribution: An Empirical Application

Filippo di Pietro

University of Seville

[email protected]

María Dolores Oliver Alfonso

University of Seville

[email protected]

Ana I. Irimia Diéguez

University of Seville

[email protected]

The severity loss distribution is the main topic in operational risk estimation. Numerous modelling methods have been suggested, although very few work for both high-frequency small losses and low-frequency big losses. In this paper several methods of estimation are explored. The good performance of the double transformation kernel estimation in the context of operational risk severity is worthy of a special mention. This method, based on the work of Bolancé and Guillén (2009), was initially proposed in the context of insurance claim costs, and it represents an advance in operational risk research.

Keywords: Operational risk, Basel II, Loss Distribution Approach, Non-parametric estimation, Kernel smoothing estimation.

JEL Classification: G10, G20, G21, G32, C11, C14, C16


1. Introduction

The revised Basel Capital Accord requires banks to meet a capital requirement

for operational risk as part of an overall risk-based capital framework. As

regards the definition aspects, the Risk Management Group (RMG) of the Basel

Committee and industry representatives have agreed on a standardised

definition of operational risk, i.e. “the risk of loss resulting from inadequate or

failed internal processes, people and systems or from external events” (BIS,

2001b).

The proposed discipline establishes various schemes for the calculation of the operational risk charge, of increasing sophistication and risk sensitivity.

The most sophisticated approach is the Advanced Measurement Approaches

(AMA), based on the adoption of internal models of banks. The framework gives

banks a great deal of flexibility in the choice of the characteristics of their

internal models, provided they comply with a set of eligible qualitative and

quantitative criteria and can demonstrate that their internal measurement

systems are able to produce reasonable estimates of unexpected losses. Use

of the AMA is subject to supervisory approval.

The Basel Committee on Banking Supervision (the Committee) recognises that

the selection of approach for operational risk management chosen by an

individual bank will depend on a range of factors, including its size and

sophistication and the nature and complexity of its activities.

As regards the measurement issue, a growing number of articles, research

papers and books have addressed the topic from a theoretical point of view. In

practice, this objective is complicated by the relatively short period over which

operational risk data have been gathered by banks; obviously, the greatest

difficulty is in collecting information on infrequent, but large losses, which, on

the other hand, contribute the most to the capital requirement. The need to

evaluate the exposure to potentially severe tail events is one of the reasons why

the new Capital framework requires banks to supplement internal data with

further sources (e.g. external data and scenario analysis), in order to compute their

operational risk capital requirement.


Unlike the modelling of market and credit risk, the measurement of operational

risk is faced with the challenge of limited data availability. Furthermore, due to

the sensitive nature of operational loss data, institutions are unlikely to freely

share their loss data. Recently, the measurement of operational risk has moved

towards the data-driven Loss Distribution Approach (LDA) and therefore, many

financial institutions have begun collecting operational loss data in order to take

advantage of this approach.

Within the AMA approach, LDA is a tool used by banks to identify and assess

operational risk. LDA is a quantitative approach based on actuarial models

which can provide the risk profile of a financial entity. LDA has emerged as the

best model for statistical treatment of operational risk.

The selection of the severity loss distribution is probably the most important phase, and the most complicated one, in the estimation of the capital requirement for operational risk. An incorrect choice of distribution leads to a severe distortion of the model and to an underestimate or overestimate of the regulatory capital with respect to operational risk. This could have a big impact on the Basel II economic capital requirement. Recent literature on

operational risk has focused attention on the use of a parametric specification of

the loss distribution. It is the simplest method to follow, since it attempts to fit analytical distributions with certain properties. The aim of this approach is to find a distribution that fits the severity distribution of the losses in the available sample. Another commonly applied technique in operational risk is Extreme Value Theory (EVT), which is a good methodology in cases where the main focus is the tail of the distribution.

The main problem of parametric methodologies is that it is frequently impossible to represent the operational severity distribution with a single parametric distribution that gives a good fit in both the body and the tail of the distribution. According to the most recent literature on operational risk (Gustafsson, 2008), one good candidate for an operational risk severity distribution that can solve the problem of fitting the whole distribution is the modified Champernowne distribution.


An alternative way is shown by EVT, which attempts to solve this problem by fitting the tail alone; however, no objective method exists to decide the threshold where the tail starts, and this greatly affects the estimation. Embrechts et al. (2003, 2004) explain that, whereas EVT is the natural set of statistical techniques for estimating high quantiles of a loss distribution, this can be carried out with sufficient accuracy only when the data satisfy specific conditions. Another problem is that EVT tends to overestimate the capital requirement with respect to operational risk; this might be particularly relevant in this case, where the database comes from a medium-sized Spanish savings bank. Despite these problems, EVT is widely used in risk management and especially in operational risk, as in McNeil (1996), Moscadelli (2004) and Cruz (2002).

For these reasons, we take as alternative techniques those that permit the quantification of operational loss severity by fitting the whole distribution and that do not require the specification of a parametric distribution. To this end, the kernel estimation is taken as the starting point, improved with the parametric transformation approach given in Wand, Marron and Ruppert (1991) and recently considered in Bolancé, Guillén and Nielsen (2008) and Buch-Larsen, Nielsen, Bolancé and Guillén (2005). The analysis is completed with the latest methodology developed by Bolancé and Guillén (2009), based on a double transformation, which in our opinion can notably improve the operational loss severity estimation. In order to explore all possible methods, a data sample of operational risk losses from a medium-sized Spanish savings bank is utilized.

In this article, we attempt to find which techniques (parametric and non-parametric) yield a more appropriate measure of operational risk severity, with a particular focus on the savings bank case, demonstrating that non-parametric estimation with the latest improvements (especially the double transformation kernel estimation) is a good alternative method to fit the loss severity distribution, and that it performs much better than parametric distributions. These methodologies permit the whole distribution to be fitted, making it unnecessary to estimate a threshold. Another good property of non-parametric techniques with respect to EVT is that they do not seem to overestimate the capital requirement. The double transformation kernel


estimation was applied in the insurance claims field; we think that this methodology can represent an advance in operational loss severity estimation too.

The remainder of the paper is organized in the following way: in the second section, a full description of the parametric and non-parametric methodologies is reported; in the third section, the characteristics of the data and an exploratory analysis are presented; in the fourth section, the results are analyzed and compared; and the last is the conclusion section.

2. Theoretical Framework

In operational risk and specifically in the dataset considered, most operational

losses are small and extreme losses are rarely observed, although the latter

can have a substantial influence on the capital charge estimation.

In previous studies, e.g. de Fontnouvelle, Rosengren and Jordan (2005), the authors have tried to find a parametric distribution that could fit the whole distribution well; however, it is a major problem to find a distribution that provides a good fit for both the body and the tail of the loss distribution. This method suffers from another problem, since no emphasis is given to the importance of a good fit in the tail, where the problem of operational risk is concentrated.

One way to solve this inconvenience is to analyze small and large losses separately with Extreme Value Theory (EVT). This approach involves some drawbacks: choosing the appropriate parametric model; finding the best way to estimate the parameters; and, most importantly, identifying a criterion to determine the threshold, not to mention that in this case any loss threshold applied would lose important information and therefore reduce the robustness of the results obtained.

A non-parametric approximation, especially the kernel estimation, can be a good alternative. The classic kernel estimation of the distribution function is a simple smoothing of the empirical distribution, and for this reason the lack of sample information implies a loss of precision in the estimation of the heavy losses. The transformed kernel estimation is an alternative that improves the results obtained with the classic kernel estimation. This approach transforms the


variable by applying a concave function that symmetrizes the original data, and then applies the classic kernel estimation to the transformed variable.

The methodology used in Bolancé and Guillén (2009) is proposed as an alternative method. They propose a new transformation kernel estimation, based on a double transformation, which can improve the estimation of risk measures. In this paper,

this methodology is tested in the field of operational risk in order to compare the

results with those obtained through more traditional methodologies. To this end,

in addition to the non-parametric methodology to fit the data, parametric

distributions such as lognormal, Weibull, and a special case of the

Champernowne distribution are used.

2.1. The parametric loss models.

This section explores three parametric alternatives which capture the severity

losses for operational risk.

The most common distribution in modelling operational risk is the lognormal distribution.1 A random variable X is said to have a lognormal distribution if its density and distribution function are, respectively:2

f(x) = \frac{1}{x \sigma \sqrt{2\pi}} \exp\left( -\frac{(\log x - u)^{2}}{2\sigma^{2}} \right)   (1)

F(x) = \Phi\left( \frac{\log x - u}{\sigma} \right), \quad x > 0   (2)

where \Phi(x) is the standard normal distribution function.

The parameters u (-\infty < u < \infty) and \sigma (\sigma > 0) are the centre and scale, respectively, and can be estimated with the maximum likelihood method:

\hat{u} = \frac{1}{n} \sum_{j=1}^{n} \log x_j , \qquad \hat{\sigma}^{2} = \frac{1}{n} \sum_{j=1}^{n} \left( \log x_j - \hat{u} \right)^{2}   (3)

The inverse distribution is F^{-1}(p) = e^{\sigma \Phi^{-1}(p) + u}; therefore the lognormal random variable can be simulated.

1 The lognormal distribution was proposed by the Basel Committee for modelling operational risk.
2 See Chernobai, A., Rachev, S. and Fabozzi, F., Operational Risk: A Guide to Basel Requirements, Models, and Analysis (2006).
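As an illustration (not part of the original study), a minimal Python sketch of the maximum likelihood estimates in (3) and of the inverse-transform simulation of lognormal losses; the function names are ours and the parameter values are only indicative.

```python
import numpy as np
from scipy.stats import norm

def fit_lognormal(losses):
    """Maximum likelihood estimates of the centre u and the scale sigma, as in (3)."""
    log_x = np.log(losses)
    u_hat = log_x.mean()
    sigma_hat = np.sqrt(np.mean((log_x - u_hat) ** 2))
    return u_hat, sigma_hat

def simulate_lognormal(u, sigma, size, seed=None):
    """Simulate lognormal losses via the inverse cdf F^{-1}(p) = exp(sigma * Phi^{-1}(p) + u)."""
    rng = np.random.default_rng(seed)
    p = rng.uniform(size=size)
    return np.exp(sigma * norm.ppf(p) + u)

# illustrative values close to those later reported in Table 2
losses = simulate_lognormal(u=3.92, sigma=1.44, size=10_000, seed=0)
print(fit_lognormal(losses))
```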


The Weibull distribution is a generalization of the exponential distribution,

whereby two parameters instead of one allow more flexibility and heavier tails.

The density and distribution function are:

f(x) = \alpha \beta x^{\alpha - 1} e^{-\beta x^{\alpha}}   (4)

F(x) = 1 - e^{-\beta x^{\alpha}}   (5)

where x > 0, with \beta (\beta > 0) as the scale parameter and \alpha (\alpha > 0) as the shape parameter.

A Weibull random variable can be generated through the exponential distribution: an exponential random variable Y with parameter \beta is generated and then transformed as X = Y^{1/\alpha}.
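A minimal sketch of the generation scheme just described, under the parametrization of (4)-(5); the parameter values are purely illustrative and the function name is ours.

```python
import numpy as np

def simulate_weibull(alpha, beta, size, seed=None):
    """Generate Weibull variates with cdf F(x) = 1 - exp(-beta * x**alpha), as in (5):
    draw Y ~ Exponential with rate beta (mean 1/beta) and return X = Y**(1/alpha)."""
    rng = np.random.default_rng(seed)
    y = rng.exponential(scale=1.0 / beta, size=size)
    return y ** (1.0 / alpha)

# hypothetical parameters, for illustration only
print(simulate_weibull(alpha=0.6, beta=0.01, size=5, seed=1))
```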

The Champernowne distribution has been used in the field of operational risk either as a suitable parametric distribution to fit the severity of operational risk,3 or as a transformation in kernel estimation.4 The original Champernowne distribution has density

f(x) = \frac{c}{x \left( \frac{1}{2}\left(\frac{x}{M}\right)^{-\alpha} + \lambda + \frac{1}{2}\left(\frac{x}{M}\right)^{\alpha} \right)}, \quad x \ge 0,   (6)

where c is a normalizing constant and \alpha, \lambda and M are parameters. When \lambda equals 1 and c equals \alpha/2, the density of the original distribution is simply called the Champernowne distribution:

f(x) = \frac{\alpha M^{\alpha} x^{\alpha - 1}}{\left( x^{\alpha} + M^{\alpha} \right)^{2}},   (7)

The Champernowne distribution converges to a Pareto distribution in the tail, while resembling a lognormal distribution near 0 when \alpha > 1. Its density is either 0 or infinity at 0 (unless \alpha = 1).

3 See Gustafsson, J. and Thuring, F. (2008), “A Suitable Parametric Model for Operational Risk Applications”.
4 See Bolancé, C. and Guillén, M. (2009), “Estimación Núcleo Transformada para aproximar el Riesgo de Pérdida”.


In this work the modified Champernowne distribution function5 is used, with a

new parameter c that ensures the possibility of a positive finite value of the

density at 0 for all α.

The modified Champernowne cdf is defined for x \ge 0 and has the form

T_{\alpha,M,c}(x) = \frac{(x + c)^{\alpha} - c^{\alpha}}{(x + c)^{\alpha} + (M + c)^{\alpha} - 2c^{\alpha}}, \quad \forall x \in \mathbb{R}_{+}   (8)
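The modified Champernowne cdf in (8) is straightforward to implement; a minimal sketch follows (the function name is ours, and the parameter values are those reported later in Table 2).

```python
import numpy as np

def champernowne_cdf(x, alpha, M, c):
    """Modified Champernowne cdf T_{alpha,M,c}(x) of equation (8), defined for x >= 0."""
    x = np.asarray(x, dtype=float)
    num = (x + c) ** alpha - c ** alpha
    den = (x + c) ** alpha + (M + c) ** alpha - 2.0 * c ** alpha
    return num / den

# T(M) = 0.5 by construction, so M acts as the median of the fitted distribution
print(champernowne_cdf([10.0, 50.0, 1000.0], alpha=1.27, M=50.0, c=0.0))
```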

2.2. Extreme Value Theory

As seen in conventional inference, the influence of the small and medium-sized losses on the estimated curve parameters prevents the attainment of models that fit the tail data accurately. An obvious solution to this problem is to disregard the

body of the distribution, and to focus the analysis only on the large losses.

When only the tail area is considered, several distributions could be adopted

such as lognormal and Pareto, which are often used in insurance to model large

claims. However, in this section the attention is focused on the extreme value distributions stemming from Extreme Value Theory (EVT),6 and especially on the Peaks Over Threshold (POT) approach.7

As witnessed by Chapelle et al. (2008), this approach enables us to

simultaneously determine the cut-off threshold and to calibrate a distribution for

extreme losses above this threshold. The severity component of the POT

method is based on a distribution (Generalized Pareto Distribution - GPD),

whose cumulative function is usually expressed as the following two-parameter

distribution8:

GPD_{\xi,\sigma}(x) = \begin{cases} 1 - \left( 1 + \xi \dfrac{x}{\sigma} \right)^{-1/\xi} & \text{if } \xi \ne 0 \\ 1 - \exp\left( -\dfrac{x}{\sigma} \right) & \text{if } \xi = 0 \end{cases}   (9)

5 See Buch-Larsen, T., Guillén, M., Nielsen, J.P. and Bolancé, C. (2005), “Kernel density estimation for heavy-tailed distributions using the Champernowne transformation”.
6 For a comprehensive source on the application of EVT in finance and insurance see Embrechts et al. (1997) and Reiss and Thomas (2001).
7 In the peaks over threshold (POT) model, the focus of the statistical analysis is put on the observations that lie above a certain high threshold.
8 See Moscadelli, M. (2004), “The Modelling of Operational Risk: Experience with the Analysis of the Data Collected by the Basel Committee”.


where x \ge 0 if \xi \ge 0, and 0 \le x \le -\sigma/\xi if \xi < 0; \xi and \sigma represent the shape and the scale parameter, respectively.

In this work we use the extended version of the GPD, which includes a location parameter u:

GPD_{\xi,\sigma,u}(x) = \begin{cases} 1 - \left( 1 + \xi \dfrac{x - u}{\sigma} \right)^{-1/\xi} & \text{if } \xi \ne 0 \\ 1 - \exp\left( -\dfrac{x - u}{\sigma} \right) & \text{if } \xi = 0 \end{cases}   (10)

The inverse of the GPD distribution has a simple form:

F^{-1}(p) = u - \sigma \log(1 - p) \quad \text{for } \xi = 0, \qquad F^{-1}(p) = u + \frac{\sigma}{\xi}\left( (1 - p)^{-\xi} - 1 \right) \quad \text{for } \xi \ne 0   (11)
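Because the inverse (11) is available in closed form, GPD losses can be simulated directly by the inverse-transform method; a minimal sketch (function names are ours):

```python
import numpy as np

def gpd_inverse(p, xi, sigma, u=0.0):
    """Inverse of the GPD cdf, as in (11)."""
    p = np.asarray(p, dtype=float)
    if xi == 0.0:
        return u - sigma * np.log(1.0 - p)
    return u + sigma * ((1.0 - p) ** (-xi) - 1.0) / xi

def simulate_gpd(xi, sigma, u, size, seed=None):
    """Simulate GPD losses by applying the inverse cdf to uniform draws."""
    rng = np.random.default_rng(seed)
    return gpd_inverse(rng.uniform(size=size), xi, sigma, u)

# illustrative call with the parameter values reported later in Table 3
print(simulate_gpd(xi=0.81, sigma=159.67, u=197.90, size=5, seed=0))
```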

Although several authors (see Dupuis, 2004; Matthys and Beirlant, 2003) have suggested methods to identify the threshold, there is no single widely accepted method to select the appropriate cut-off.

2.3. The Kernel Estimation of severity loss distribution

The kernel estimation of the distribution function equals the integral of the

kernel estimation of the density function:

\hat{f}_X(x) = \frac{1}{nh} \sum_{i=1}^{n} k\left( \frac{x - X_i}{h} \right),   (12)

where k(.) is the kernel, which normally coincides with a symmetric density function centred at zero, with bounded or unbounded support. Two examples are the Epanechnikov kernel:

k(t) = \begin{cases} \frac{3}{4}\left( 1 - t^{2} \right) & \text{if } |t| \le 1 \\ 0 & \text{if } |t| > 1 \end{cases}   (13)

And the Gaussian kernel:

k(t) = \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{t^{2}}{2} \right), \quad -\infty < t < +\infty   (14)


The parameter h is the bandwidth used to control the degree of smoothing of the estimation, i.e. the higher the value of this parameter, the greater the smoothing of the estimation. By integrating the kernel estimation of the density function and using the change of variable t = (u - X_i)/h, the kernel estimation of the distribution function is obtained:

\hat{F}_X(x) = \int_{-\infty}^{x} \frac{1}{nh} \sum_{i=1}^{n} k\left( \frac{u - X_i}{h} \right) du = \sum_{i=1}^{n} \frac{1}{n} \int_{-\infty}^{\frac{x - X_i}{h}} k(t)\, dt = \frac{1}{n} \sum_{i=1}^{n} K\left( \frac{x - X_i}{h} \right)   (15)

In the above expression K(.) is the distribution function associated with the density function that acts as a kernel. The properties of the kernel estimation of the distribution function were analyzed by Reiss (1981) and Azzalini (1981).

Both authors suggest that, when n \to \infty, the mean square error of \hat{F}_X(x) is approximated by:

\frac{F_X(x)\left( 1 - F_X(x) \right)}{n} - \frac{h f_X(x)}{n} \int K(t)\left( 1 - K(t) \right) dt + \left( \frac{1}{2} h^{2} f_X'(x) \int t^{2} k(t)\, dt \right)^{2}   (16)

The first two parts of the above expression correspond to the asymptotic

variance and the third term is the squared asymptotic bias.
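A minimal sketch of the classic kernel estimator (15) with the Epanechnikov kernel; the names and the test data below are ours and purely illustrative.

```python
import numpy as np

def epanechnikov_K(t):
    """Integrated Epanechnikov kernel K(t) = int_{-1}^{t} k(s) ds, with K(t)=0 for t<-1 and K(t)=1 for t>1."""
    t = np.clip(t, -1.0, 1.0)
    return 0.5 + 0.75 * t - 0.25 * t ** 3

def kernel_cdf(x, sample, h):
    """Classic kernel estimator of the distribution function, equation (15)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    t = (x[:, None] - sample[None, :]) / h   # one row per evaluation point, one column per observation
    return epanechnikov_K(t).mean(axis=1)

# hypothetical example on simulated heavy-tailed losses
rng = np.random.default_rng(0)
sample = np.exp(rng.normal(3.9, 1.4, size=1000))
print(kernel_cdf([50.0, 500.0, 5000.0], sample, h=20.0))
```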

2.4. Transformation Kernel Estimation

The classic kernel estimation of the distribution function is a simple smoothing of the empirical distribution, and for this reason a loss of precision recurs in the estimation of the heavy losses due to the lack of sample information. The transformation kernel estimation is an alternative that improves the results obtained with the classic kernel estimation. This approach transforms the variable by applying a concave function that symmetrizes the original data, and then applies the classic kernel estimation to the transformed variable. Let T(.) be a concave transformation, where y = T(x) and Y_i = T(X_i), i = 1, ..., n are the observed transformed losses. Therefore, the kernel estimation of the transformed variable is:

\hat{F}_Y(y) = \frac{1}{n} \sum_{i=1}^{n} K\left( \frac{y - Y_i}{h} \right) = \frac{1}{n} \sum_{i=1}^{n} K\left( \frac{T(x) - T(X_i)}{h} \right) = \hat{F}_Y(T(x))   (17)

This shows that the transformed kernel estimation of F_X(x) equals:

\hat{F}_X(x) = \hat{F}_Y(T(x))   (18)
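A consequence of (18) worth noting (our remark, assuming T is strictly increasing) is that quantiles of the original variable can be recovered by back-transformation,

\hat{F}_X^{-1}(p) = T^{-1}\left( \hat{F}_Y^{-1}(p) \right),

which is one way of obtaining quantile estimates such as those compared in Section 3 from the transformed-scale estimator.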


In order to obtain the transformed kernel estimation, it is necessary to determine which transformation and which kernel function to use, and to calculate the bandwidth. In the kernel density estimation literature, several methods are proposed for the calculation of the bandwidth;9 however, very few alternatives have been analyzed in the context of kernel estimation of the distribution function. Bolancé and Guillén (2009) propose an adaptation of the method based on the normal distribution described by Silverman (1986).

This method implies the minimization of the mean integrated squared error

(MISE):

MISE\left( \hat{F}_Y \right) = E\left( \int \left( \hat{F}_Y(y) - F_Y(y) \right)^{2} dy \right).   (19)

The asymptotic value of the MISE is known as the A-MISE (Asymptotic Mean Integrated Squared Error). By integrating the asymptotic mean square error and taking the estimate of the distribution function F_Y into account, the A-MISE becomes:

n^{-1} \int F_Y(y)\left( 1 - F_Y(y) \right) dy - h\, n^{-1} \int K(t)\left( 1 - K(t) \right) dt + \frac{1}{4} h^{4} k^{2} \int \left( f_Y'(y) \right)^{2} dy,   (20)

where k = \int t^{2} k(t)\, dt and \int \left( F_Y''(y) \right)^{2} dy = \int \left( f_Y'(y) \right)^{2} dy.

Minimizing (20) with respect to h yields the asymptotically optimal smoothing parameter:

h_{0} = \left( \frac{\int K(t)\left( 1 - K(t) \right) dt}{k^{2} \int \left( f_Y'(y) \right)^{2} dy} \right)^{1/3} n^{-1/3}   (21)

Silverman (1986) proposed approximating h_0 by replacing the terms that depend on the theoretical density function with the values they would take if f were assumed to be the density of a normal distribution with mean u and standard deviation \sigma. By using the Epanechnikov kernel:

h_{0} = \left( \frac{900\sqrt{\pi}}{35} \right)^{1/3} \hat{\sigma}\, n^{-1/3}   (22)
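A one-line helper implementing the normal-reference rule as reconstructed in (22); the constant (900√π/35)^{1/3} follows from (20)-(21) with the Epanechnikov kernel under the normal assumption, and the function name is ours.

```python
import numpy as np

def rule_of_thumb_bandwidth(sample):
    """Normal-reference bandwidth (22) for kernel cdf estimation with the Epanechnikov kernel."""
    n = len(sample)
    sigma_hat = np.std(sample)
    return (900.0 * np.sqrt(np.pi) / 35.0) ** (1.0 / 3.0) * sigma_hat * n ** (-1.0 / 3.0)
```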

A change of variable is the basis for the transformed kernel estimate and is known to enhance the non-parametric estimation of densities with strong


asymmetry. In their work, Bolancé and Guillén (2009) review various types of transformations, both parametric and non-parametric. In this article, we consider the transformation used by Buch-Larsen et al. (2005), who transformed the data by using the distribution function of a generalization of the Champernowne (1952) distribution.

9 Wand, M.P. and Jones, M.C. (1995), Kernel Smoothing, Chapman & Hall.

2.4.1. The double transformation kernel estimation

In this work, the methodology used in Bolancé and Guillén (2009) is proposed as an alternative method. This estimation was initially applied in Bolancé, Guillén and Nielsen (2008) in the density function context, and later in Bolancé and Guillén (2009) for the estimation of a distribution function. In this article, this methodology is applied to the estimation of the operational loss severity distribution. Bolancé and Guillén (2009) propose a new transformation kernel estimation, based on a double transformation, which can improve the estimation of risk measures.

The A-MISE expression given in (20) shows that, to obtain the asymptotically optimal smoothing parameter, it is sufficient to minimize:

\frac{1}{4} h^{4} k^{2} \int \left( f'(y) \right)^{2} dy - h\, n^{-1} \int K(t)\left( 1 - K(t) \right) dt,   (23)

This is minimized when the functional \int \left( f'(y) \right)^{2} dy is minimal. The proposed method is therefore based on transforming the variable so that the distribution of the transformed variable minimizes this functional.

Terrell (1990) proves that the density of the Beta(3,3) distribution, defined on the domain (-1, 1), minimizes \int \left( f'(y) \right)^{2} dy among all densities with a given variance. The density and distribution functions of the Beta(3,3) are:

h(x) = \frac{15}{16}\left( 1 - x^{2} \right)^{2}, \quad -1 \le x \le 1,

H(x) = \frac{3}{16} x^{5} - \frac{5}{8} x^{3} + \frac{15}{16} x + \frac{1}{2}.   (24)

Double transformation kernel estimation involves carrying out a first transformation of the data, by using the distribution function of the generalized Champernowne distribution with three parameters, so that the transformed variable has a distribution


close to a Uniform(0,1). Subsequently, the data are transformed again by using the inverse of the Beta(3,3) distribution function, Z_i = H^{-1}(Y_i), i = 1, ..., n.

The result of this double transformation will have a distribution close to that of

Beta (3, 3). The resulting transformed kernel estimation is:

\hat{F}_Y(y) = \frac{1}{n} \sum_{i=1}^{n} K\left( \frac{y - Y_i}{h} \right) = \frac{1}{n} \sum_{i=1}^{n} K\left( \frac{H^{-1}\left( T_{\hat{\alpha},\hat{M},\hat{c}}(x) \right) - H^{-1}\left( T_{\hat{\alpha},\hat{M},\hat{c}}(X_i) \right)}{h} \right)   (25)

The smoothing parameter h is estimated by following the methodology explained above, with the knowledge that \int \left( f'(y) \right)^{2} dy = 15/7 for the Beta(3,3); hence, by using the Epanechnikov kernel, we obtain:

h_{0} = \sqrt[3]{3}\, n^{-1/3} = \left( \frac{3}{n} \right)^{1/3}   (26)
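A self-contained sketch of the double transformation kernel estimator (25)-(26). The Beta(3,3) of (24) lives on (-1, 1), so its inverse cdf is obtained here by rescaling the standard Beta(3,3) on (0, 1); the transformation parameters are supplied by the user (in practice they are estimated from the data), and all names and example values are ours.

```python
import numpy as np
from scipy.stats import beta

def beta33_ppf(u):
    """Inverse cdf H^{-1} of the Beta(3,3) density (24), defined on (-1, 1)."""
    return 2.0 * beta.ppf(u, 3, 3) - 1.0

def epanechnikov_K(t):
    """Integrated Epanechnikov kernel, clipped so that K(t)=0 for t<-1 and K(t)=1 for t>1."""
    t = np.clip(t, -1.0, 1.0)
    return 0.5 + 0.75 * t - 0.25 * t ** 3

def champernowne_cdf(x, alpha, M, c):
    """Modified Champernowne cdf (8), used as the first transformation."""
    return ((x + c) ** alpha - c ** alpha) / ((x + c) ** alpha + (M + c) ** alpha - 2.0 * c ** alpha)

def double_transform_kernel_cdf(x, sample, alpha, M, c):
    """Double transformation kernel estimator (25) with the bandwidth (26), h = (3/n)**(1/3)."""
    n = len(sample)
    h = (3.0 / n) ** (1.0 / 3.0)
    Z = beta33_ppf(champernowne_cdf(np.asarray(sample, dtype=float), alpha, M, c))   # doubly transformed data
    z = beta33_ppf(champernowne_cdf(np.atleast_1d(np.asarray(x, dtype=float)), alpha, M, c))
    return epanechnikov_K((z[:, None] - Z[None, :]) / h).mean(axis=1)

# hypothetical illustration on simulated heavy-tailed losses
rng = np.random.default_rng(0)
losses = np.exp(rng.normal(3.9, 1.4, size=2000))
print(double_transform_kernel_cdf([500.0, 2500.0], losses, alpha=1.3, M=float(np.median(losses)), c=1.0))
```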

3. Empirical Analysis

The selection of the severity loss distribution is probably the most important phase, and the most complicated one, in the estimation of the capital requirement for operational risk. An incorrect choice of distribution leads to a severe distortion of the model and to an underestimate or overestimate of the regulatory capital with respect to operational risk.

In this article, we attempt to find which techniques (parametric and non-parametric) yield a more appropriate measure of operational risk severity, with a particular focus on the savings bank case. To this end, all the methodologies presented in the previous sections are applied to the dataset presented below.

The data available in this work are provided by a medium-sized Spanish savings bank which has compiled internal data in order to make step-by-step advances in measuring and modelling operational risk. The threshold for collecting the loss data in the savings bank is set to 0 euros, unlike other studies where the threshold is set at 10,000 euros; in fact, 95% of the losses are lower than 542.32 euros. The data provided span from the year 2000 to 2006. The years 2000, 2001, 2002 and 2003 contain 2, 5, 2 and 50


operational loss events, respectively, because in those years the bank did not have a collection system for operational losses; such a system was introduced in 2004, and hence the years 2004, 2005 and 2006 contain 6,397, 4,959 and 6,580 operational loss events, respectively.

In order to prevent distortion of the estimation of the distribution models, data

from 2000 to 2003 are disregarded. The core business of any Spanish Savings

Bank is retail banking, and for this reason all the data come from this business

line. Our data set is relatively small, but it is sufficient for an operational risk analysis at the level of the whole bank.

Figure 1: Monthly amount of losses (2004, 2005 and 2006)


Figure 2: Monthly number of losses (2004, 2005 and 2006)

Exploratory Data Analysis (EDA) is quantitative detective work, according to

Tukey (1977). In the EDA, data should first be explored without any

assumptions being made about probabilistic models, error distributions, number

of groups, relationships between the variables, and so on, in order to discover

what the data can tell us about the phenomena under investigation.

EDA is also a collection of techniques for the discovery of information about the

data, and the methods for the visualization of the information, in order to expose

any underlying process.


With regard to this work, EDA techniques are used in order to describe the main characteristics of the data, such as central tendency and dispersion (i.e. variance and standard deviation), and to determine how the data might be distributed, by focusing on the evaluation of skewness and kurtosis.

The data are adjusted according to the Consumer Price Index (CPI), with 2006 taken as the baseline year, to prevent distortion of the results.

In Table 1, the main features of central tendency, asymmetry and tail-heaviness of the data set are given.

Table 1: Descriptive statistics

Statistic               Operational risk
N                       17936
Mean                    254.48
Median                  51.35
Standard deviation      3602.71
Skewness coefficient    73.48
Kurtosis coefficient    6924.22

The mean is much higher than the median; this is a feature present in skewed distributions, and it is confirmed in our case by the skewness coefficient. The kurtosis coefficient also shows leptokurtic tails.

The Q-Q plot and the P-P plot show that the data are far from being normally

distributed.

Figure 3: Q-Q plot and P-P plot against the normal distribution


To advance in the analysis, the parameters of the whole distribution are estimated with each parametric methodology.

Table 2: Parameter estimation for the whole distribution

Distribution    Parameters
Weibull         α = 0.59,  β = 104.42
Lognormal       u = 3.92,  σ = 1.44
Champernowne    α = 1.27,  c = 0,  M = 50

Having estimated the parameters of the Weibull, lognormal and Champernowne distributions, the attention is now focused on the estimation of the GPD parameters.

Table 3: Parameter estimation of the Generalized Pareto Distribution

Distribution   Threshold   Obs.   ξ      u        σ
GPD            166.53      3049   0.81   197.90   159.67

To estimate the threshold, the method described in Reiss and Thomas (2007) is

applied.

3.1. Results analysis

In order to analyze the different methods of fitting the severity loss distribution, we compare the empirical cumulative distribution function with each method presented in this work, with particular attention paid to the tail. We divide the cdf into two parts: one for the body, from the 0 to the 0.99 percentile, and the other for the tail, from 0.99 to 1.

Page 18: Kernel Alternatives to Approximate Operational Severity ... ANNUAL MEETINGS/2011-Braga/papers/0211.pdfgood alternative. The classic kernel estimation of the distribution function is

18

Figure 3: The cdf for the lognormal distribution compared with that of the empirical function

The two plots above show that the lognormal distribution fits the body (left)

reasonably well but is not a good selection for the higher quantiles (right).

Figure 4: The cdf for the Weibull distribution compared with that of the empirical function

The Weibull distribution has a worse fit than the lognormal distribution in the body (left) and fits as badly as the lognormal in the tail (right).


Figure 5: The cdf for the Champernowne distribution compared with that of the empirical function

The Champernowne distribution provides a better performance in the tail of the distribution with respect to the distributions previously plotted, but its fit is still not suitable.

Figure 6: The cdf for the Generalized Pareto Distribution compared with that of the empirical function

In the case of the GPD, only the tail is considered in analyzing the fit that this distribution provides. As can be noted from the figure, the GPD presents a better fit in the tail than the parametric distributions that do not focus the attention on the tail.


Figure 7: The cdf for the transformed kernel estimation compared with that of the empirical function

The kernel cdf estimation using the Champernowne distribution provides a good

fit in the body of the distribution but has a serious problem in the tail due to the

lack of information in this part of the distribution.

Figure 8: The cdf for the double transformed kernel estimation compared with that of the empirical function

As shown in Figure 8, the double transformation in the kernel estimation significantly improves the fit of the cumulative distribution in both the body and the tail.

For a numerical confirmation, the quantiles for α = 0.95, α = 0.99 and α = 0.999 are estimated for each distribution and compared with the empirical quantile estimates; the results are shown in Table 4.

Page 21: Kernel Alternatives to Approximate Operational Severity ... ANNUAL MEETINGS/2011-Braga/papers/0211.pdfgood alternative. The classic kernel estimation of the distribution function is

21

Table 4: Quantile estimation of the whole distributions

CDF            α=0.95   α=0.99   α=0.999
Empirical      541      2475     23205
Lognormal      533      1427     3718
Weibull        529      1098     2246
Champernowne   316      1126     6150
KTCH           557      10036    +∞
KDTB           560      2535     27233
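For reference, a minimal sketch of how a quantile can be extracted from any fitted cdf by numerical inversion (bisection), together with the empirical quantiles; the data here are simulated and purely illustrative.

```python
import numpy as np

def cdf_quantile(cdf, p, lo=0.0, hi=1e7, tol=1e-6):
    """Invert a monotone cdf numerically by bisection to obtain the p-quantile."""
    while hi - lo > tol * max(1.0, abs(hi)):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# empirical quantiles of a simulated loss sample, for comparison
rng = np.random.default_rng(0)
losses = np.exp(rng.normal(3.9, 1.4, size=10_000))
print(np.quantile(losses, [0.95, 0.99, 0.999]))
print(cdf_quantile(lambda x: np.mean(losses <= x), 0.99))
```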

There is a large gap between the parametric distributions and the empirical

distribution in the higher quantiles. The kernel estimation with only one

transformation fails to estimate the quantile at 99.9% due to the lack of information in the tail, and its quantile value estimated at 99% presents the widest gap with

that of the empirical distribution. By focusing attention on the kernel estimation

with double transformation, it can be seen that the quantile estimations confirm

the visual results of the cdf plots. The KDTB is the estimator with the least

difference with respect to the empirical distribution.

Since the quantile values estimated with the non-parametric methodologies are higher than those of the parametric ones, we compare these results with those of the Generalized Pareto Distribution presented in Section 3. This is a well-known extreme value model, widely used to fit the operational loss severity distribution.

Table 5: Quantile estimation of the GPD and comparison with the Empirical, KTCH and KDTB estimates

CDF         α=0.95   α=0.99   α=0.999
GPD         2705     10662    36178
Empirical   541      2475     23205
KTCH        557      10036    +∞
KDTB        560      2535     27233

Looking at Table 5, it is evident that the quantile values estimated with the GPD are higher at all levels than those of the non-parametric models. The gap between the GPD quantile estimates and the empirical quantile estimates increases when more extreme quantiles are considered.

4. Conclusions


In recent literature on operational risk, the severity loss distribution is the main topic. Numerous modelling methods have been suggested, although very few work for both high-frequency small losses and low-frequency big losses. Hence, common sense suggests a mixture of two distributions. For small losses, distributions such as the lognormal and the Weibull are frequently used, in combination with Extreme Value Theory for the large losses. In these practices, questions arise, such as at which point the data should be split in order to obtain the optimal threshold, and which model should be used.

An alternative in the form of the Champernowne distribution is explored, whose characteristics promise to provide a good alternative for fitting the whole distribution; however, the results are disappointing.

Attention is then focused on an alternative one-method-fits-all approach, and we analyze the transformation kernel estimation method and the double transformation kernel estimation. The good performance of the latter in the context of operational risk severity is worthy of a special mention. This method, based on the work of Bolancé and Guillén (2009), was initially proposed in the context of insurance. In our opinion, these methodologies, especially the double transformation kernel estimation, can provide an excellent alternative for estimating an operational risk loss severity distribution. This methodology takes the tail behaviour into account and also includes the high-frequency small losses that form the body of the distribution.

However, the outcomes of this study cannot yet be taken further, for example to estimate economic capital. We can conclude that the double transformation kernel estimation delivers a good performance over the whole severity distribution with respect to the classic parametric models, particularly in cases where not enough data are available to permit splitting the sample into two parts. This is the case for savings banks, whose main business is retail banking, characterized by high-frequency small-loss data. These methodologies permit the whole distribution to be fitted, making it unnecessary to estimate a threshold. Another good property of non-parametric techniques with respect to EVT is that they do not seem to overestimate the capital requirement. The double transformation kernel estimation has been applied in the insurance claims field; we think that this methodology can represent an advance in operational loss severity estimation too. In this sense,


the next step is to combine the double transformation kernel estimation with a frequency loss distribution in order to estimate an operational VaR, which the Committee specifies as the 99.9th percentile of the aggregate loss distribution.

Bibliography

Azzalini, A. (1981) “A note on the estimation of a distribution function and

quantiles by a kernel method”, Biometrika 68, 1, 326-328.

Basel Committee on Banking Supervision:

• (1998): Operational Risk Management, Basel.

• (2001a): Operational Risk - Consultative Document, Supporting Document to the New Basel Capital Accord.

• (2001b): Working Paper on the Regulatory Treatment of Operational Risk.

• (2002): Operational Risk Data Collection Exercise - 2002, Basel.

• (2003a): Third Consultative Paper, The New Basel Capital Accord.

• (2003b): Sound Practices for the Management and Supervision of Operational Risk, Basel.

• (2006): Basel II: International Convergence of Capital Measurement and Capital Standards: A Revised Framework - Comprehensive Version, Basel.


Bolancé, C., Guillén, M. and Nielsen, J.P. (2003) “Kernel density estimation of

actuarial loss functions” Insurance: Mathematics and Economics, 32, 1, 19-36

Bolancé, C., Guillén M. and Nielsen, J.P. (2009): “Transformation Kernel

Estimation of Insurance Claim Cost Distributions” in Corazza, M. and Pizzi, C

(Eds.) Mathematical and Statistical Methods for Actuarial Sciences and

Finance, Springer, 223-231

Bolancé C., Guillén M., and Nielsen J.P. (2008) “Inverse Beta transformation

in kernel density estimation”, Statistics & Probability Letters, 78, 1757-1764.

Bolancé, C., Guillén, M. (2009) “Estimación Núcleo Transformada para

aproximar el Riesgo de Pérdida”, Investigaciones en Seguros y Gestión de

Riesgos: Riesgo 2009, Cuadernos de la Fundación.

Buch-Larsen, T., Guillén, M., Nielsen, J.P. and Bolance, C. (2005) “Kernel

density estimation for heavy-tailed distributions using the Champernowne

transformation”, Statistics, 39, 6, 503-518.

Champernowne, D.G. (1952). “The Graduation of Income Distributions”

Econometrica, 20, 4, 591-615.

Chapelle, A., Crama, Y., Hübner, G., and Peter , J., “Practical methods for

measuring and managing operational risk in the financial sector: A clinical

study” , Journal of Banking & Finance Volume 32, Issue 6, June 2008, Pages

1049-1061

Chernobai A., Menn C., Trück S., Rachev T. S., (2004), “A Note on the

Estimation of the Frequency and Severity Distribution Losses” . Applied

Probability Trust.

Chernobai A., Rachev S., Fabozzi F., (2007), Operational Risk: A guide to

Basel 2 Capital Requirements, Models, and Analysis. John Wiley and Son, Inc.

New Jersey.


Cruz, M. G. (2002) Modeling, Measuring and Hedging Operational Risk, John

Wiley & Soon, New York, Chichester.

Da Costa, L (2004), Operational Risk with Excel and VBA. John Wiley and Son,

Inc. New Jersey.

Danielsson, J. and de Vries, C.G., (1997), “Tail index and quantile estimation

with very high frequency data”, Journal of Empirical Finance, 4, 241-57.

de Fontnouvelle, P., Deleus-Rueff, V., Jordan, J. and Rosengren, E., (2003),

“Using loss data to quantify operational risk”, Federal Reserve Bank of Boston,

Working Paper.

de Fontnouvelle, P., Jordan, J. and Rosengren, E., (2005),”Implications of

Alternative Operational Risk Modeling Techniques”, Technical report, Federal

Reserve Bank of Boston and Fitch Risk.

Dupuis, D.,(2004), Exceedances over High Thresholds: A Guide to Threshold

Selection. Springer U.S.

Dutta, K., Perry, J., (2007), “An Empirical Analysis of Loss Distribution Models

for Estimating Operational Risk Capital”, Federal Reserve Bank of Boston

working paper.

Ebnother, S., Vanini, P., McNeil, A.J. and Antolinez-Fehr, P., (2001), “Modelling

Operational Risk”, Zurich Cantonal Bank and Swiss Federal Institute of

Technology Zurich (Department of Mathematics), Working Paper.

Embrechts, P., Klüppelberg, C., and Mikosch, T., (1997) “Modelling Extremal

Events for Insurance and Finance”, Berlin, Germany: Springer-Verlag.


Embrechts, P. (2000). Extremes and Integrated Risk Management. London:

Risk Books, Risk Waters Group.

Guillen, M., Gustafsson, J., Nielsen, J.P. and Pritchard, P. (2007). “Using

external data in operational risk”. Geneva Papers, 32, 178-189.

Gustafsson, J., Nielsen, J.P., Pritchard, P. and Roberts, D. (2006). “Quantifying

Operational Risk Guided by Kernel Smoothing and Continuous credibility: A

Practioner’s view”. Journal of Operational Risk, 1, 1, 43-56.

Gustafsson, J., Hagmann, M., Nielsen, J.P. and Scaillet, O. (2008). “Local

Transformation Kernel Density Estimation of Loss Distributions”. Journal of

Business and Economic Statistics

Gustafsson, J., Thuring F., (2008). “A Suitable Parametric Model for

Operational Risk Applications”. Working Paper

Embrechts, P. (2004), ‘‘Extremes in Economics and Economics of Extremes’’, in

B. Finkenstädt and H. Rootzén, eds, Extreme Values in Finance,

Telecommunications and the Environment, Chapman & Hall, London, pp. 169–

183.

Embrechts, P., Furrer, H., and Kaufmann, R. (2003), ‘‘Quantifying Regulatory

Capital For Operational Risk,’’ Derivatives Use, Trading, and Regulation 9 (3),

pp. 217–233.

Frachot, A and Roncalli, T., (2002), “Mixing Internal and External Data for

Managing Operational Risk,” Groupe de Recherche Opérationnelle, Crédit

Lyonnais.

Klugman, S.A., Panjer, H.A. and Willmot, G.E. (1998). Loss Models: From Data

to Decisions. New York: John Wiley & Sons, Inc.


Jobst, A.A. (2007): “Operational Risk - The Sting is Still in the Tail But the

Poison Depends on the Dose”, Journal of Operational Risk, 2, 2.

Matthys G., and Beirlant J., (2003) “ Estimating the Extreme Value Index and

High Quantiles with Exponential Regression Models”. Statistica Sinica

13(2003), 853-880

McNeil, A.J., Frey, R. and Embrechts, P. (2005). Quantitative Risk

Management. Princeton Series in Finance.

McNeil, A. J. (1996), “Estimating the Tails of Loss Severity Distributions Using

Extreme Value Theory”, Technical report, ETH Zurich.

Moscadelli, M., (2004), “The Modelling of Operational Risk: Experience with the

Analysis of the Data Collected the Basel Committee”, Technical report, Bank of

Italy.

Müller, H., (2002), “Quantifying Operational Risk in a Financial Institution”,
Master's thesis, Institut für Statistik und Wirtschaftstheorie, Universität Karlsruhe,

In Chernobai A., Rachev S., Fabozzi F., (2007), Operational Risk: A guide to

Basel 2 Capital Requirements, Models, and Analysis. John Wiley and Son, Inc.

New Jersey.

Reiss, R., Thomas, M., (2007), Statistical Analysis of Extreme Values with

Applications to Insurance, Finance, Hydrology and Other Fields. Springer

Reiss, R., (1981), “Nonparametric estimation of smooth distribution function”,

Scandinavian Journal of Statistics, 8, 116-119.

Reynolds, D., and Syer, D., (2003), “A General Simulation Framework for

Operational Loss Distributions”, In Alexander, C., et al., Operational Risk:

Regulation, Analysis and Management, Prentice Hall, Upper Saddle River, NJ, pp. 193-214.


Rosenberg, J., and Schuermann, T., (2004), “A General Approach to Integrated Risk Management with Skewed, Fat-Tailed Risks”, Technical Report, Federal

Reserve Bank of New York.

Tukey, J., W., (1977), “Exploratory Data Analysis”, Reading, MA: Addison-Wesley.

Wand, M.P. and Jones, M.C. (1995). Kernel Smoothing. Chapman & Hall

Wand, M.P., Marron, J.S. and Ruppert, D. (1991). “Transformation in Density

Estimation” Journal of the American Statistical Association, 86, 414, 343-361.