Nonparametric Inference for VaR, CTE and Expectile with ... · Nonparametric Inference for VaR, CTE...

Nonparametric Inference for VaR, CTE

and Expectile with High-Order Precision

Zhiyi Shen†, Yukun Liu‡ and Chengguo Weng†∗

†Department of Statistics and Actuarial Science, University of Waterloo

‡School of Statistics, East China Normal University

Abstract

Value-at-Risk (VaR) and Conditional Tail Expectation (CTE) are two most frequently

applied risk measures in quantitative risk management. Recently, expectile has also at-

tracted much attention as a risk measure due to its elicitability property. In this paper,

empirical likelihood based estimation with high-order precision is established for these three

risk measures, and the superiority of our estimation is justified both in theory and via sim-

ulation study. Extensive simulation studies confirm that our method significantly improves

the coverage probabilities for interval estimation, compared to three competing methods

available in the literature.

Keywords: Empirical Likelihood, Risk Measure Estimation, Value-at-Risk, Conditional Tail

Expectation, Expectile.

∗Corresponding author. Postal address: M3-200 University Avenue West, Waterloo, Ontario, Canada, N2L

3G1.

Emails: Shen ([email protected]), Liu ([email protected]), and Weng ([email protected])

1

1 Introduction

Risk measures play a central role in modern quantitative risk management (QRM). They

are indispensable concepts that the insurance and finance industries rely on for internal

risk assessment and external solvency regulation. Value-at-Risk (VaR) and Conditional

Tail Expectation (CTE) are two most frequently applied risk measures in QRM practice

and regulatory frameworks such as Solvency II and BASEL Accords. Expectile is another

risk measure which attracts much attention from both practitioners and academicians due

to its elicitability property.

LetX be a random variable representing the loss of a given insurance or finance portfolio.

For p ∈ (0, 1), the 100(1−p)% VaR of the portfolio is defined as the (1−p)th upper quantile

of the cumulative distribution of X, i.e.,

VaR1−p(X) = infx ∈ R : Pr(X > x) ≤ p.

In the real-world application, p is usually a small number such as 5% or even 1%. VaR only

quantifies a threshold for the loss to exceed with a small probability. The CTE is defined

as the expected loss given that the loss exceeds the VaR, and thus, CTE can better reflect

the magnitude of the loss on the tail as a risk measure. Formally, the 100(1− p)% CTE of

the loss variable X is defined as

CTE1−p(X) = E [X|X > VaR1−p] .

A risk measure closely-related to CTE is the expected shortfall (ES), which at the 100(1−p)% confidence level is defined as

ES1−p(X) =1

p

∫ 1

1−pVaRt(X) dt.

It is well known that CTE1−p(X) coincides with ES1−p(X) for continuous loss variable X.

The expectile was introduced by Newey and Powell (1987) for asymmetric regression

in a statistics context, and it has been exploited in the field of QRM as a risk measure

recently due to its nice elicitability property (e.g., Ziegel, 2014; Bellini et al., 2014). For

a loss random variable X with E[X2]<∞, the expectile of X at 100(1− p)% confidence

level is defined as the following unique minimizer:

e1−p(X) := argminx∈R

(1− p)E[((X − x)+)2] + pE[((x−X)+)2]

(1)

or equivalently the unique solution to the equation

e1−p(X) = E[X] +1− 2p

pE[(X − e1−p(X))+] (2)

where the notation Y+ := maxY, 0.

2

For the practical use of risk measures, the underlying loss distribution is unknown and

often needs to be estimated from historical data, which naturally raises the problem of

statistical inference for risk measures. Traditional inference methods for VaR and CTE are

usually parametric, assuming that the loss follows a specific parametric distribution, such

as lognormal, finite mixture, Weibull and Pareto distributions. Composite models are also

proposed to mimic the heavy tail feature of loss distributions in more recent parametric

estimation literature; we refer to Cooray and Ananda (2005), Scollnik and Sun (2012),

Nadarajah and Bakar (2014) and Abu Bakar et al. (2015) for detailed discussion. Yet,

parametric methods are exposed to the model mis-specification risk, and it demands a

certain delicate model selection procedure in order to reduce such risk in practice. As

alternative resolutions, nonparametric statistical inference techniques can be used to reduce

the model mis-specification risk in the estimation of risk measures and no model selection

procedure is needed throughout the inference procedure. Examples for the estimation of

VaR and CTE include the smoothing kernel method (Chen and Tang, 2005; Chen, 2008),

the influence function method (Yamai and Yoshiba, 2002; Manistre and Hancock, 2005), the

bootstrap method (Dowd, 2005; Hardy, 2006), and the empirical likelihood (EL) method

(Baysal and Staum, 2008). Nonparametric inference for risk measures other than VaR,

CTE and expectile has also attracted intensive attentions, see for example, Jones and

Zitikis (2003), Peng et al. (2012), and Ahn and Shyamalkumar (2014), among others.

Despite the extensive studies on nonparametric estimation of the relevant risk measures,

the literature on analyzing high-order precision of estimation for risk measures is relatively

thin, and this paper aims to contribute in this direction via studying the empirical likelihood

(EL) based nonparametric inference for VaR, CTE, and expectile with high-order precision.

The EL method was first proposed by the seminal works of Owen (1988, 1990) for interval

estimation of a population mean, and extended to infer parameters defined through general

estimating equations by Qin and Lawless (1994). Since then, it has become a popular non-

parametric statistical tool due to its numerous nice properties. For example, the EL ratio

has a limiting chi-square distribution, known as Wilks’ Theorem; an EL based confidence

region has a data-driven shape and is transformation respecting (Hall and La Scala, 1990).

We refer to Owen (2001) for a more thorough review on EL methods.

The EL estimation is derived from optimizing an empirical likelihood subject to certain

estimating equations of parameters of interest. There is no estimating equation solely for

the risk measure CTE in general and thus EL estimation for CTE is not possible if not

relating it to some other relevant parameter(s) for certain joint estimating equations. One

natural choice is to utilize the joint estimating equations for the pair (VaR,CTE) as shown

in (5) in the sequel, which result from the definitions of the pair. These joint estimating

equations have been proposed by Baysal and Staum (2008) for the estimation of the EL

confidence region of (VaR,CTE). The EL methods proposed by Baysal and Staum (2008),

however, is hindered by the non-smoothness of the two equations in (5) in three aspects.

3

First, the EL confidence region for (VaR,CTE) is also non-smooth which is passed on from

the non-smoothness of the estimating equations; see Figure 1 of Baysal and Staum (2008).

Second, the one- and two-sided coverage accuracy of EL confidence regions can never be

better than O(n−1/2) (Chen and Hall, 1993), where n denotes the sample size used in

estimation. Last but not least, the non-smoothness of the estimating equations makes it

challenging, if not impossible, to profile an EL estimation for CTE from the EL. The same

comments are also applied to the EL estimation for expectile; see equation (21).

For the estimation of VaR and CTE, the application of the EL method is also hindered

by non-existence of solutions to the empirical estimating equations. The EL estimation for

a parameter θ is based on the following empirical log-likelihood ratio function:

EL(θ) = −2 sup

n∑i=1

log(npi) : pi ≥ 0,

n∑i=1

pi = 1,

n∑i=1

pig(Xi; θ) = 0

, (3)

where X1, X2, · · · , Xn is a sample of the loss variable X, and g(x, θ) is a known function

such that θ is the unique solution to the estimating equation E [g(X; θ)] = 0. For each given

θ, EL(θ) is well defined only if the convex hull of g(Xi, θ), i = 1, . . . , n contains the origin.

When n is not large, this convex hull often fails to contain the origin (see Chen et al., 2008),

and this issue becomes even more severe when g contains an indication function as specified

in (4) and (5) for the estimation of VaR and CTE, where the parameter 1 − p is usually

as large as 0.99 so that the effective sample size used for the estimation is only about a

hundredth of the sample size n. In other words, for 1−p = 0.99, even when the sample size

is as large as 1000, there is only about 10 effective data points for the estimation of the risk

measures and it is easy to lead to the non-existence of EL(θ) in this case. The convention of

solving this problem is to assign infinity blindly to the EL ratio statistic EL(θ), although it

provides no information on the relative plausibility of the parameter values and may result

in discontinuous confidence regions (Chen et al., 2008). Such a convention can severely

jeopardize the accuracy of the estimation in terms of coverage probability as confirmed by

our simulation studies in Section 3.

To develop EL estimation for VaR, CTE and expectile with high-order precision, we

smooth and adjust the estimating equations used in the estimation by borrowing tech-

niques from statistics literature. On the one hand, we adopt kernel method to smooth the

estimating equations as suggested by Chen and Hall (1993) and this leads to the smoothed

EL (SEL) estimation for the three risk measures. We show that the SEL confidence regions

improve the two-sided coverage accuracy to O(n−1

), and that the SEL also admits Bartlett

correction which further improves the confidence regions to even better O(n−2

)coverage

accuracy. On the other hand, we further exploit Chen et al. (2008)’s method to adjust the

smoothed empirical estimation to completely circumvent the non-definition problem.

In addition to the theoretical justification, in this paper we also conduct extensive

simulation studies to illustrate the superiority of the smoothed adjusted empirical likelihood

4

(SAEL) method to three competing methods which are available in the literature. They

include the raw EL method (Baysal and Statum, 2008), the influence function method

(Manistre and Hancock, 2005) and the bootstrap method (Dowd, 2005; Hardy, 2006). Our

simulation studies constantly confirm that the SAEL significantly outperforms all the three

competing methods in terms of coverage probabilities.

The remainder of the paper proceeds as follows. In section 2, we first define the SEL

for VaR and CTE, and show that it is Bartlett correctable. The SAEL is then proposed

and proven to achieve high-order precision. In the end of the section, we extend the SAEL

method for the estimation of expectile. Section 3 presents simulation studies in two different

frameworks. Section 4 is the application of the SAEL method to the Danish Fire Loss Data.

Section 5 concludes the paper, and all the proofs of theorems are relegated to the Appendix.

2 EL Estimation for VaR, CTE and Expectile

2.1 Smoothed EL for VaR and CTE

Let X1, X2, · · · , Xn be n independent and identically distributed observations from the

distribution function F (x) of a continuous loss variable X. Generally if the parameter of

interest θ is the unique solution to an estimating equation E [g(X; θ)] = 0 for some known

function g(x; θ) with dimension no less than that of θ, then the EL estimation for θ is

obtained based on the empirical log-likelihood ratio function EL(θ) defined in equation (3).

When the parameter of interest is VaR1−p(X), we may directly choose g(X;µ) = ϕ(X;µ)

with

ϕ(X,µ) = I(X − µ)− p. (4)

where I(x) = 1 if x > 0 and 0 otherwise. If both the VaR1−p(X) and CTE1−p(X) are of

interest, then g may be chosen to be

ψ(X;µ, η) =

(I(X − µ)− p

(1/p)XI(X − µ)− η

). (5)

It is necessary to point out that these estimating functions are non-smooth, which leads to at

least two undesirable consequences. Firstly, EL confidence intervals (regions) based on such

estimation equations have relatively large O(n−1/2) coverage error (Chen and Hall, 1993),

as opposed to the common O(n−1

)error. Secondly, when η is fixed, the maximization of

EL(µ, η) with respect to µ often has no solution, which means that we cannot define the

profile EL of η (Baysal and Staum, 2008).

In order to circumvent these issues, Chen and Hall (1993) proposed a smoothed EL when

only the quantile or µ is of interest. In this paper, we extend such a smoothing method to

the pair (VaR1−p(X), CTE1−p(X)) or equivalently (µ, η), which in turn makes it possible

5

to estimate CTE1−p(X) via profiling empirical likelihood. Let K be a bounded, smooth

and compactly supported density function which satisfies

∫ujK(u)du =

1 , j = 0

0 , j = 1

κ , j = 2

(6)

with κ being a non-zero constant. Define G(x) =∫ x−∞K(y)dy, and Gh(x) = G(x/h)

with h > 0 a small constant to be determined and known as bandwidth in nonparametric

statistics literature. The key of Chen and Hall (1993)’s smoothed EL is to replace the

non-smooth estimating functions by their smooth counterparts:

ϕh(X;µ) = Gh(X − µ)− p

and

ψh(X;µ, η) =

(Gh(X − µ)− p

(1/p)XGh(X − µ)− η

). (7)

The smoothed EL function (SEL) for VaR1−p(X) and CTE1−p(X) are respectively given

by

SEL1(µ) = −2 sup

n∑i=1

log(npi) : pi ≥ 0,n∑i=1

pi = 1,n∑i=1

piϕh(Xi;µ) = 0

,

and

SEL2(µ, η) = −2 sup

n∑i=1

log(npi) : pi ≥ 0,n∑i=1

pi = 1,n∑i=1

piψh(Xi;µ, η) = 0

.

If only CTE1−p(X) is of interest, the smoothness of the estimating function ψh(x, µ, η)

makes it easy to define the profile SEL of CTE1−p:

SEL3(η) = minµ

SEL2(µ, η).

The following theorem extends Chen and Hall (1993)’s result from quantile to the bivariate

parameter (µ, η) and the CTE parameter η, respectively.

Theorem 1. Let µ0 and η0 be the true values of VaR1−p(X) and CTE1−p(X). If

(1) the first and second derivatives of the underlying distribution function F (x) exist in a

neighbourhood of µ0 and are continuous at µ0;

(2) the first derivative F ′(µ0) > 0;

(3) the bandwidth h satisfies nh4 → 0 and nh/ log n→∞, as n→∞;

(4) E[X2]<∞,

6

then

PrSEL1(µ0) ≤ x =χ2

1 ≤ x

+O(n−1

), (8)

where χ2d denotes a random variable having the χ2

d distribution for positive integer d. In

addition, if E(X4) <∞, then

PrSEL2(µ0, η0) ≤ x = Prχ2

2 ≤ x

+O(n−1

), (9)

Pr SEL3(η0) ≤ x = Prχ2

1 ≤ x

+O(n−1

). (10)

Proof. See the Appendix.

Although the coverage accuracy of an EL confidence interval for quantile is onlyO(n−1/2)

(Chen and Hall, 1993), Theorem 1 indicates that the smoothing technique makes it recover

the usual coverage accuracy O(n−1

). With the smoothed estimating equations, this cover-

age accuracy can be further enhanced to O(n−2

)with the Bartlett correction technique as

we can shortly see in the next subsection.

2.2 Bartlett-Corrected SEL for VaR and CTE

We begin by defining the so-called Bartlett correction factors accompanying the SELs for

VaR1−p(X) and the pair (VaR1−p(X), CTE1−p(X)). For VaR1−p(X), we define

b1 =1

2

α4

α22

− 1

3

α23

α32

(11)

where αk = E(ϕh(X;µ0))k. For the pair (VaR1−p(X), CTE1−p(X)), we consider the

eigendecomposition Var(ψh(X;µ0, η0)) = ΓΛΓᵀ for the variance matrix of ψh(X;µ0, η0),

and define Y ≡ (Y1, Y2)ᵀ := Γᵀψh(X;µ0, η0). Put βi1i2···ik = E(Yi1Yi2 · · ·Yik) and

b2 =1

2

2∑i=1

2∑j=1

βiijjβiiβjj

− 1

3

2∑i=1

2∑j=1

2∑k=1

β2ijk

βiiβjjβkk. (12)

We summarize the Bartlett correctability of the SEL in the next theorem.

Theorem 2. Assume that the same conditions (1)-(4) in Theorem 1 are satisfied. If n3h4

is bounded from above as n→∞, then we have

Pr

SEL1(µ0)

1 + b1/n≤ x

= Pr

χ2

1 ≤ x

+O(n−2

). (13)

Furthermore, if E(X18) <∞, then

Pr

SEL2(µ0, η0)

1 + b2/n≤ x

= Pr

χ2

2 ≤ x

+O(n−2

). (14)

7


Remark 1. The Bartlett correction result of SEL2(µ0, η0), i.e. equation (14), is new, al-

though that of SEL1(µ0), i.e. equation (13), has been disclosed by Chen and Hall (1993).

Remark 2. The Bartlett correction factors b1 and b2 are generally unknown in applica-

tions, and may be replaced by their estimates. If the estimators are√n-consistent, such a

replacement does not affect high-order results in Theorem 2. This conclusion is achieved by

re-studying the proof of this theorem, and hence omitted.

For the estimation of the Bartlett correction factors b1 and b2, we recommend using

the less-biased estimator proposed by Liu and Chen (2010). This estimation strategy is

also adopted in our simulation study in a later section. Let b1 and b2 be the less-biased

estimators of b1 and b2. According to Theorem 2, we can construct 100(1 − α)%-level

confidence intervals for VaR1−p(X) and (VaR1−p(X), CTE1−p)ᵀ(X)) respectively with

second-order precision as followsµ :

SEL1(µ)

1 + b1/n≤ χ2

1(1− α)

,

(µ, η)ᵀ :SEL2(µ, η)

1 + b2/n≤ χ2

2(1− α)

,

where χ2d(1− α) denotes the 100(1− α)% quantile of the χ2

d distribution.

Theoretically, we can show that the profile SEL for CTE SEL3(η) is also Bartlett cor-

rectable. However, the accompanying Bartlett correction factor is too complicated to be

estimated (Chen and Cui, 2006). This makes it infeasible to construct high-order precise

confidence intervals for CTE through Bartlett corrected profile SEL. So, we compromise to

use Theorem 1 instead and construct a 100(1−α)% level confidence interval for CTE1−p(X)

as follows η : SEL3(η) ≤ χ2

1(1− α).

The coverage error of this confidence interval is O(n−1

)according to Theorem 1.

2.3 Smoothed AEL for VaR and CTE

An undesirable property of the aforementioned EL and SEL, which are all defined through

(3), is that they may have no definition. Given θ, if the origin lies outside the convex hull

of g(X1; θ), · · · , g(Xn; θ), then there exists no (p1, · · · , pn) to satisfy pi ≥ 0,∑n

i=1 pi = 1

and∑n

i=1 pig(Xi, θ) = 0 simultaneously. In this situation, the EL in (3) is not defined and

does not work any more. This issue becomes even more serious for the estimation of VaR

and CTE as we have previously commented in the first section.

8

To circumvent this non-definition problem, Chen et al. (2008) proposed for each θ,

adding an artificial pseudo-observation

g(θ) = −an

n∑i=1

g(Xi, θ)

to g(X1; θ), · · · , g(Xn; θ), where a > 0 is called the adjustment level. The resulting

EL defined based on the expanded data-set Ω(θ) = g(X1; θ), · · · , g(Xn; θ), g(θ) is called

adjusted empirical likelihood (AEL), i.e.,

AEL(θ; a) = −2 sup

n+1∑i=1

log(npi) : pi ≥ 0,

n+1∑i=1

pi = 1, (15)

n∑i=1

pig(Xi; θ) + pn+1g(θ) = 0

. (16)

Clearly, given any θ, Ω(θ) always contains the origin as an interior point, therefore the AEL

is always well-defined. It was disclosed by Chen et al. (2008) that the AEL inherits all the

first-order properties of the EL if a = o(n2/3). Furthermore, Liu and Chen (2010) found

that the AEL can have the same high-order precision as the Bartlett-corrected EL when

the adjustment level a is chosen to be half of the Bartlett correction factor.

Enlightened by Chen et al. (2008), and Liu and Chen (2010), we introduce pseudo

observation to circumvent the non-definition problem suffered by the SEL method discussed

in the previous subsection and abbreviate this as SAEL method. In relative to SEL1(µ; a)

and SEL2(µ, η; a) defined previously, we denote their adjusted counterparts by SAEL1(µ; a)

and SAEL2(µ, η; a), respectively. That is, SAEL1(µ; a) (resp., SAEL2(µ, η; a)) is defined in

line with equations (15) and (16) with the function g(Xi; θ) replaced by ϕh(Xi;µ) (resp.,

ψh(Xi;µ, η)). In the following theorem, we show that conclusions from Liu and Chen (2010)

are still valid in terms of estimating VaR and CTE.

Theorem 3. Assume that conditions (1)-(4) in Theorem 1 are satisfied. If n3h4 is bounded

and a = b1/2 + op (1), then we have

Pr SAEL1(µ0; a) ≤ x = Prχ2

1 ≤ x

+O(n−2

), (17)

where b1 is given in (11). Furthermore, if E[X4]<∞ and a = b2/2 + op (1), then

Pr SAEL2(µ0, η0; a) ≤ x = Prχ2

2 ≤ x

+O(n−1

), (18)

where b2 is given in (12). Finally, under the condition that E[X18

]<∞,

Pr SAEL2(µ0, η0; a) ≤ x = Prχ2

2 ≤ x

+O(n−2

)(19)

with a = b2/2 + op (1).

9


Remark 3. The condition E[X18] < ∞ is imposed for technical reason. It is a strong

condition for applications involving a heavy-tailed loss distribution. If this condition is

violated and we only have E[X4] <∞, then SAEL2(µ0, η0; a) still has an accuracy of O(n−1)

as indicated by equation (18). Furthermore, if we only have E[X2] <∞ and take a = O(1),

the SAEL2(µ0, η0; a) still attains a square-root-n convergence rate as a direct consequence of

the central limit theorem.

When only CTE1−p(X) is of interest, we can make inference through the profile SAEL

of CTE1−p(X):

SAEL3(η; a) = minµ

SAEL2(µ, η; a). (20)

Theorem 4. Assume that conditions (1)-(4) in Theorem 1 are satisfied. If n3h4 is bounded,

E[X4]<∞, and a = O(1), we have

PrSAEL3(η0; a) ≤ x = Prχ2

1 ≤ x

+O(n−1

).


Remark 4. Theoretically, we can prove that SAEL3(η0; a) also achieves the same high-

order precision as the Bartlett corrected EL if we set a = b3/2 where b3 is the Bartlett

correction factor accompanying SEL3(η0). However, the far complicated form of b3 hinders

the application of SAEL3(η0; b3/2). Thus, we do not consider Bartlett correction for the

estimation of CTE1−p(X) in this paper.

By virtue of the above Theorem 3, second-order precise confidence intervals for VaR1−p(X)

and (VaR1−p(X), CTE1−p(X)) with a confident level 100(1 − α)% can be constructed as

follows µ : SAEL1

(µ; b1/2

)≤ χ2

1(1− α),

(µ, η) : SAEL2

(µ, η; b2/2

)≤ χ2

2(1− α),

for which both have a coverage error O(n−2

). Further, in view of Theorem 4, an SAEL

based confidence interval for CTE1−p(X) is given byη : SAEL3(η; a) ≤ χ2

1(1− α)

which has a coverage error of O(n−1

).

10

2.4 EL Estimation for Expectile

As we have introduced in the first section, the expectile has become one of interesting risk

measures in the field of QRM due to its elicitability property (e.g., Ziegel, 2014; Bellini, et

al. 2014). For a loss random variable X with E[X2]<∞, the expectile e1−p(X) is defined

as the unique minimizer in equation (1) or equivalently the unique solution to equation (2).

Denote ξ = e1−p(X), and by virtue of equation (2), we can choose g(X; ξ) = φ(X; ξ)

for the EL estimation of the expectile, where

φ(X; ξ) = X + β(X − ξ)I(X − ξ)− ξ. (21)

The smoothing counterpart is given by

φh(X; ξ) = X + β(X − ξ)Gh(X − ξ)− ξ, (22)

Then we can define the smoothed adjusted EL, SAEL4(ξ; a), for e1−p(X) based on the

smoothed function φh(X; ξ) following the same procedure as we conducted for VaR1−p(X) in

the last section. We establish a high-order asymptotic result for the SAEL based estimation

of the expectile in the theorem below.

Theorem 5. Assume that conditions (1)-(3) in Theorem 1 are satisfied, and that E[X4]<

∞. If we choose a = b4/2 + op (1) and let ξ0 denote the true value of e1−p(X), then

PrSAEL4(ξ0; a) ≤ x = Prχ2

1 ≤ x

+O(n−1

),

where b4 is the Bartlett correction factor for e1−p(X) and its expression is similar to (11),

except that ϕh(X;µ0) is replaced by φh(X; ξ0).

Furthermore, if E[X18

]<∞, then

PrSAEL4(ξ0; a) ≤ x = Prχ2

1 ≤ x

+O(n−2

).


We summarize the convergence rates of the proposed SAEL method for the three in-

volved risk measures in Table 1. We have some interesting observations from the table.

Firstly, when the parameter of interest is the VaR, the SAEL method achieves a second-

order accuracy under all moment conditions. Secondly, when only the second moment of

the loss random variable exists, the SAEL method exhibits a square-root-n convergence

rate for CTE and Expectile, and this rate is same as that of raw EL, influence function,

and Bootstrap methods. Thirdly, if the fourth moment of the loss variable exists, the SAEL

achieves at least first-order convergence rate for all risk measures and outperforms the three

competitors (i.e., raw EL, influence function, and Bootstrap methods). It is notable that

11

Table 1: Summary of the convergence rates of SAEL method for VaR, CTE, and Expectiles

under different moment conditions. Assume all the other conditions in Theorems 3–5 are

satisfied.

ParameterMoment conditions

Unconditional E[X2]<∞ E

[X4]<∞ E

[X18

]<∞

VaR O(n−2

)O(n−2

)O(n−2

)O(n−2

)(VaR,CTE) −− O

(n−1/2

)O(n−1

)O(n−2

)CTE −− O

(n−1/2

)O(n−1

)O(n−1

)Expectiles −− O

(n−1/2

)O(n−1

)O(n−2

)

only loss data sets are available in practice, and one does not exactly know whether the un-

derlying probability distribution has a finite certain high-order moments or not. The SAEL

method automatically gains the extra estimation accuracy when the high-order moments

exist for the loss distribution.

2.5 Numerical Implementation

The implementation of the SAEL method can be decoupled into two stages1:

(i) Data transformation Given a sequence of loss observations Xi, i = 1, 2, . . . , n,we transform the data into a set

Y θi , i = 1, 2, . . . , n+ 1

with Y θ

i = gh (Xi; θ) for

i = 1, 2, . . . , n, and Yn+1 = −a/n∑n

i=1 gh (Xi; θ), where gh(·; ·) is a certain estimating

function, and its specific form depends on the risk measure of interest; for example, if

the Expectile is the parameter one intends to infer, gh(·; ·) is chosen as φh(·; ·) in Eq.

(22).

(ii) Raw EL inference Given the new data setY θi , i = 1, 2, . . . , n+ 1

, the evaluation

of the SAEL ratios (SAELj(·; a), j = 1, 2, 4) at a certain parameter θ is equivalent to

computing the raw EL ratio for a null mean parameter. This can be achieved by

employing the R package “emplik” (Zhou and Yang (2016)) for the raw EL method.

The SAEL3(·; a) can be obtained by first evaluating SAEL2(·; a) and then solving the

optimization problem in Eq. (20).

As shown in the above, the implementation of the SAEL method is the same as the raw

EL method applied to a certain transformation of the raw data, and thus, the existing R

package “emplik” can be directly applied.

1The authors are grateful to an anonymous referee for suggesting us to discuss this practical issue.

12

3 Simulation Studies

This section provides extensive results for the finite-sample behavior of the proposed SAEL

method in terms of coverage accuracy. The simulation studies focus on the estimation of

VaR and CTE. We use Bartlett corrected SAEL method for the estimation of VaR and the

pair (VaR,CTE) by virtue of Theorem 3. For the estimation of CTE, the SAEL method

is applied without Bartlett correction because of the overwhelming intricacy in estimating

the Bartlett correction factor; see Remark 4.

Three competing methods, including the raw EL method (Baysal and Statum, 2008),

the influence function method (Manistre and Hancock, 2005), and the bootstrap method

(Dowd, 2005), are considered for comparison. The confidence level (1− p) associated with

the risk measures is set to be 0.99 and 0.95, respectively, since they are commonly used in

insurance and financial risk management practice. We choose the kernel function involved

in the SAEL method to be

K(u) =

3√

5

20

(1− u2

5

), if |u| ≤

√5

0, otherwise

which is also employed by Chen and Hall (1993). We set the bandwidth h = n−1/2 as the

impact of a different bandwidth on the coverage probability is not significant (Chen and

Hall, 1993).

Remark 5. As previously discussed, the raw EL method is not always well-defined. When

the ill-deffiniton occurs, we follow the convention in literature (see Chen et al. (2008))

and set the value of the log likelihood ratio at infinity. Accordingly, the confidence interval

(region) of raw EL method is not well-posed in general and the length of confidence inter-

val is intractable. The coverage probabilities of all competing methods in sequel numerical

experiments are calculated as the number of statistics (e.g., the SAEL ratios) smaller than

the critical value divided by the total number of replications.

We focus on two-sided confidence intervals (regions) for VaR and CTE with nominal level

95%, and the simulated coverage probabilities are obtained based on 10,000 replicaitons.

For the estimation of CTE, the raw EL is excluded from comparison because the raw EL

does not apply for the estimation of CTE. We choose the number of bootstrap replicates

for the bootstrap method to be B = 2000 as the improvement in coverage accuracy by

using larger bootstrap size is marginal (Baysal and Statum, 2008). Two examples from

Manistre and Hancock (2005), the Pareto distribution and an “in-the-money” European

put option, are considered to generate random samples of loss data for our simulation study.

In each combination of parameter and example, a number of sample sizes are considered to

demonstrate how the coverage accuracy may vary along the sample size.

13

3.1 The Pareto Distribution

As a commonly-used distribution to fit heavy-tailed loss data in insurance, the Pareto

distribution is defined by

F (x) = 1−(

λ

λ+ x

)ν, x > 0,

where ν and λ are the shape and scale parameters, respectively. The theoretical VaR and

CTE at the confidence level of (1− p) for the Pareto loss are respectively given by

VaR1−p = λ(p−1/ν − 1),

CTE1−p = λ

(ν

ν − 1× p−1/ν − 1

).

Depending on the value of the parameter ν, the high-order moments of the Pareto

distribution may not exit and the condition E(X18) < ∞ in Theorems 2 and 3 may be

violated. In such a situation, the SAEL-based confidence interval for VaR and CTE cannot

ensure the theoretical higher order asymptotic coverage precision, but the likelihood ratio

functions still have the asymptotic Chi-square distributions as long as its second moment

exists. In this sense, the SAEL method is fairly robust. In our simulation study, we follow

the numerical setting in Manistre and Hancock (2005) and choose ν = 2.5 and λ = 25 so

that the Pareto distribution is quite heavy-tailed, and the condition E(X18) <∞ is indeed

violated. We choose such a numerical setting because we are interested in the finite sample

performance of SAEL method and making a comparison between the SAEL method and the

other three competing methods in a situation which is unfavourable to the SAEL method.

1. Coverage comparison for VaR0.99. When VaR0.99 is taken as the only parameter

of interest, we choose the sample size from 250 to 1250 in multiple of 250. The coverage

probabilities are reported in the left panel of Table 2. The portion of non-definition of the

raw EL method occurring among the 10000 replications of simulation is reported in the

“ND (%)” rows. We have several interesting observations from Table 2. Firstly, the SAEL

method has the closest-to-nominal coverage probability (i.e., 95%) among the four methods

under consideration. This means that the coverage accuracy of the SAEL confidence inter-

vals are uniformly the best. Secondly, the SAEL method has almost the exact 95% coverage

probability when the sample size n reaches 500 or larger, even though the effective sample

size is only 500× 1% = 5 in the estimation for a sample size of 500. Thirdly, although the

coverage probabilities are pretty close to each other between the raw EL and the SAEL,

the SAEL method is better and this is probably due to the non-definition issue associated

with the raw EL. The portion of non-definition for the raw EL ratios is significantly large

(at a level of 8.05%) when the sample size is relatively small (at n = 250). Fourthly, the

raw EL performs significantly better than the bootstrap method and the influence function

method. The coverage probabilities of the influence function method are quite far away

14

Table 2: Coverage probability (%) of confidence intervals (resp., regions) for VaR1−p (resp.,

(VaR1−p,CTE1−p)) of the Pareto distribution.

Parameter VaR0.99 (VaR0.99,CTE0.99)

Sample Size n 250 500 750 1000 1250 500 1000 1500 2000 2500

raw EL 90.39 92.93 96.12 94.54 93.32 63.33 77.86 82.79 84.97 85.89

SAEL 91.80 94.66 96.11 94.55 94.62 75.47 81.95 85.25 87.29 87.44

Influence Function 69.21 81.06 82.46 86.32 86.01 55.73 66.60 71.23 74.61 77.07

Bootstrap 87.35 91.34 91.77 92.01 91.68 64.98 73.39 76.83 79.30 80.73

ND (%) 8.05 0.56 0.03 0.00 0.00 27.13 6.13 1.45 0.36 0.08


Sample Size n 250 500 750 1000 1250 500 1000 1500 2000 2500

raw EL 94.38 94.83 94.47 94.79 95.57 86.64 88.94 90.75 91.35 91.38

SAEL 94.45 94.88 94.53 94.86 95.44 87.91 89.88 91.23 91.69 91.74


Bootstrap 92.15 92.73 93.27 93.33 93.84 81.72 84.89 87.40 88.35 88.72

ND (%) 0.00 0.01 0.00 0.00 0.00 0.10 0.01 0.00 0.00 0.00

from the nominal level even when the sample size is more than 1000. The inferiority of

these two methods may result from the fact that the effect sample size is rather small in

the estimation of the risk measure and the asymptotic variance of the point estimator is

large and cannot be estimated well enough.

2. Coverage comparison for (VaR0.99,CTE0.99) When both the two risk measures are

of interest, we choose larger sample sizes, from 500 to 2500 with an increment of 250, in our

simulation study due to the well known fact that CTE is more difficult to estimate than

VaR because it involves the whole tail of the loss distribution. The simulation results are

presented in the last panel of Table 2. We observe that all the reported coverage probabilities

deviate from the nominal level of 95% by much more than those on the left panel in Table 2

which is for the estimation of VaR0.99 solely. Nevertheless, the SAEL method still produces

the most accurate confidence regions among the four competing methods. In particular,

when the sample size is 1000, the SAEL method has a coverage probability of 81.95%,

which is even much larger than those of the influence method and the bootstrap method

with a favourably large sample size of 2500. For the raw EL method, the non-definition

problem is impressively severe. In particular, the portion of non-definition is as large as

27.13% for sample size n = 500 and more than 6% for a sample size even as large as 1000.

In contrast, the SAEL is free of the non-definition issue and produces much better coverage

probabilities across all the sample sizes considered.

15

Table 3: Coverage probability (%) of confidence intervals for CTE1−p of the Pareto distri-

bution.

Parameter CTE0.99

Sample Size n 500 1000 1500 2000 2500

SAEL 65.34 78.74 83.40 86.44 87.88

Influence Function 71.69 78.68 80.99 83.71 84.63

Bootstrap 65.20 73.97 77.70 80.85 82.21

Parameter CTE0.95

Sample Size n 500 1000 1500 2000 2500

SAEL 87.94 90.27 91.02 91.76 92.18

Influence Function 85.35 87.88 89.28 89.86 90.37

Bootstrap 83.03 86.51 88.18 89.00 89.52

3. Coverage comparison for CTE0.99 For the estimation of CTE0.99, the raw EL does

not apply because there is neither sole estimating equation for the parameter nor can it

be profiled from the EL jointly for VaR and CTE. So, the comparison is among the three

candidates only: the SAEL method, the influence function method, and the bootstrap

method. In our simulation study, the sample size is chosen from 500 to 2500 with an

increment of 500 and the simulation results are reported in Table 3. From Table 3, we can

see that the superiority of the SAEL method over the other two is not as impressive as it

does for the estimation of VaR as shown in Table 2, which can be explained by the fact that

the theoretical coverage precision of SAEL confidence interval for CTE is only of O(n−1

)as opposed to O

(n−2

)for the estimation of VaR; see Theorems 2 and 4. Under a sample

size of 500, the SAEL method does not perform as well as the influence function method

in terms of the coverage probabilities. This probably due to the fact that the influence

function method relies on a certain normal approximation for the underlying distribution,

and the distribution of the associated point estimate is relatively closer to its asymptotic

benchmark (i.e., normal distribution) under a small sample size. The phenomenon that

normal-approximation-based methods perform better than EL-based methods under a small

sample size is not solely for the estimation of VaR and CTE as we observed here, but also

for that of the mean of a distribution. For example, the Hotelling’s T 2 method, which is

a normal-approximation-based method, has been observed to beat the EL-based methods

under a small size, see e.g., Table 4 of Liu and Chen (2010). When the sample size steps up to

1000, the SAEL method surpasses the influence function method with a coverage probability

of 78.74%. The advantage of the SAEL method becomes significant when sample size is

1500 or larger. In particular, for a sample size of n = 2500, the SAEL has a coverage

probability of 87.88% which is much better than that of 84.63% for the influence function

method and 82.21% for the bootstrap method.

4. Confidence intervals We also compare the confidence intervals (CIs) produced by

the SAEL method, the influence function method and the bootstrap method. We exclude

16

the raw EL method from the comparison, because as we have pointed out in Remark 5

that the confidence interval of the raw EL method is intractable due to the thorny issue of

non-definition. For the SAEL method and the influence function method, computing the

confidence intervals involves a root-searching procedure and is rather time-consuming. So,

we only conduct 100 replications of simulation and compute the point estimate and the

lower/upper limit of the CIs for each. The calculation procedure of the point estimate will

be described in detail in Section 4. Two different sample sizes, 250 and 1250 (resp., 500

and 2500) are considered for VaR1−p (resp., CTE1−p). The average results from the 100

replications are reported in Table 4. As one may expect, the lengths of the CIs produced by

all methods are considerable, in particular for CTE0.99, as the Pareto distribution considered

here has an extraordinarily heavy tail. It is also notable that the influence function method

gives relatively shorter CIs compared with the other two methods but at the cost of less

accurate coverage probabilities as shown in Table 2. Furthermore, the lengths of CIs shrink

as the sample size increases for all the three methods, which confirms the convergence of

the associated statistics to their limiting distributions. However, under a relatively small

size, e.g., 250, the lower limit of the bootstrap CI is negative (see the numbers in Table

4 with bold-face) which shows the bootstrap method has abysmal performance at small

sample size in the sense of quite lengthy CIs. This thorny issue might be sparked by the

poor estimation for the standard deviation of the bootstrap statistic under a small sample

size.

5. Simulation results for p = 0.05 We also report the simulation results for p = 0.05

in Tables 2, 3, and 4. As one may expect, the coverage probabilities of all the three methods

are closer to the nominal level (i.e., 95%) than those for p = 0.01 since the effective sample

size is enlarged by shifting p from 0.01 to 0.05. Moreover, Tables 2 and 3 show that the

SAEL method still performs the best in terms of coverage accuracy, but its improvement

on the raw EL method is not as significant as it was under the case of p = 0.01. We also

observe that the problem of non-definition for the raw EL is soothed to a large extent under

a relative large value of p, e.g., 0.05 considered here, because the effective sample size is

sizable. Furthermore, Table 4 shows that the bootstrap method produces more reliable CI

for VaR0.95 (resp., CTE0.95) than it does for VaR0.99 (resp., CTE0.99). Finally, it is palpable

from Table 4 that all the three methods give less volatile point estimates for VaR and CTE

at a 0.95-level as revealed by shorter CIs.

3.2 An “in-the-money” European put option

As a second example, we construct confidence intervals/regions for VaR0.99 and CTE0.99 of

an “in-the-money” European put option which has been studied by Manistre and Hancock

(2005). Assume that the option matures in T = 10 years with a strike price of X = 110

and that the annual continuously compounded interest rate is r = 6%. Suppose that the

17

Table 4: 95%-level confidence intervals for VaR1−p and CTE1−p of the Pareto distributions

under 100 replications.

Parameter VaR0.99 VaR0.95 CTE0.99 CTE0.95

Sample Size n 250 1250 250 1250 500 2500 500 2500

Lower Limit 73.69 106.29 43.65 50.57 170.12 182.34 85.64 97.53

SAEL Point Estimate 136.01 135.46 58.19 57.25 230.87 234.34 112.51 113.90

Upper Limit 205.19 185.75 83.55 66.82 369.34 334.77 169.02 141.02

Length 131.51 79.46 39.89 16.25 199.22 152.43 83.38 43.49

Lower Limit 91.06 102.93 42.75 50.39 123.45 161.93 74.81 93.07

Influence Point Estimate 135.98 135.45 58.18 57.25 230.14 233.50 112.45 113.43

Function Upper Limit 180.89 167.98 73.61 64.11 357.45 305.06 152.04 133.80

Length 89.83 65.04 30.86 13.72 234.01 143.12 77.22 40.73

Lower Limit -18.85 92.74 36.18 49.15 106.01 162.54 72.98 93.13

Bootstrap Point Estimate 135.98 135.45 58.18 57.25 230.14 233.50 112.45 113.43

Upper Limit 290.80 178.16 80.18 65.35 354.27 304.45 151.92 133.73

Length 309.65 85.42 43.99 16.20 248.26 141.92 78.94 40.60

log return of stock price follows a normal distribution with standard deviation σ√T and

mean ζT so that the discounted payoff of the put option is given by

P = e−rT max(

0, X − S · eζT+σ√T ·Z), (23)

where Z denotes a standard normal random variable. As a remark, this kind of long-term

put option may correspond to an insurer’s exposure to a Guaranteed Minimum Maturity

Benefit and is of particular interest to actuaries in the risk management of variable annuities,

see e.g., Hardy (2000).

Straightforward calculation yields the following explicit expressions of VaR1−p and

CTE1−p for the option’s discounted payoff:

VaR1−p = e−rT ·max(

0, X − S · eζT+σ√T ·zp), (24)

CTE1−p =e−rT

p·[XΦ(d1)− S · Φ

(d1 − σ

√T)· e(ζ+σ2/2)T

], (25)

where zp = Φ−1(1 − p), and Φ−1(·) is the inverse of the standard normal distribution

function Φ(·), and d1 = min(zp,

ln(X)−ln(S)−ζTσ√T

).

In our simulation study, we set the current stock price S = 100, parameter σ = 15%

and vary ζ from 4% to 8% with an increment of 2%. For each set of parameter values, we

can use the expressions in equations (24) and (25) to compute the true values of VaR1−p

and CTE1−p.

1. Coverage comparison for VaR0.99 and (VaR0.99,CTE0.99)ᵀ. Our simulation re-

sults for the parameter VaR0.99 and the pair (VaR0.99,CTE0.99)ᵀ are reported in Table 5.

Clearly, the SAEL is still the winner among the four rivals in terms of coverage accuracy.

It produces perfect coverage probabilities for VaR0.99 when the sample size is at 250 or

18

larger, and for (VaR0.99,CTE0.99)ᵀ when the sample size is at 2000 or larger. The influence

function method always gives the farthest-to-nominal coverage probabilities, indicating a

less competitive estimation approach. As the parameter ζ increases from 4% to 8%, the

performance of influence function method deteriorates, while other methods have very sta-

ble performance. When the sample size is small, the non-definition problem of the raw

EL is severe and causes large coverage error. For example, when ζ = 0.04 and n = 500,

the simulated probability of non-definition of raw EL ratio can be as large as 17.93%, and

meanwhile, the coverage accuracy is only 74.87% which is nearly 12% less than the SAEL

method.

2. Coverage comparison for CTE0.99 Table 6 contains the simulation results for the

estimation of CTE0.99, which show that the SAEL method still outperforms all the other

three competitors and the coverage probabilities are quite stable in response of the change

in parameter ζ. Interestingly, the influence function method produces more accurate con-

fidence intervals than the bootstrap method in this case, although its performance is not

competitive in those aforementioned simulation examples.

3. Confidence intervals We also investigate the CIs produced by the three methods

and the results are reported in Table 7. For the brevity of presentation, we only consider the

case with ζ = 0.06. In terms of the length of CIs, it is palpable from Table 7 that, in most

cases, the influence function method produces the shortest CIs at the price of least accurate

coverage probability (see Tables 5 and 6). The bootstrap method gives the lengthiest CIs for

a small sample size of 250 and its coverage accuracy is less competitive compared with the

SAEL method as disclosed by the previous discussion. The SAEL-based CIs have medium

length but highest coverage precision in most cases considered in the present example.

4. Results under p = 0.05 Similar to the simulation results in the previous subsection,

the nuisance of non-definition problem of the raw EL method is eased to a large extent for

p = 0.05 as disclosed by the “ND (%)” rows of the bottom panel of Table 5. Despite this,

the merit of the SAEL method is still extant, for example, even under a small sample size

of 500, the SAEL confidence regions reach almost nominal level coverage accuracy for the

estimation of (VaR0.95,CTE0.95), see the “(VaR0.95,CTE0.95)” panel of Table 5, which is

probably due to its theoretical high-order accuracy. For the interval estimation of VaR0.95,

the raw EL method and the SAEL method produce close-to-nominal level coverage probabil-

ities for most sample sizes we consider in the numerical setting. Such coverage probabilities

do not necessarily exhibit a monotone trend and they show certain perturbation around the

nominal level due to the random error of simulation. It is also notable that the EL-based

methods constantly beat the other two competing methods in most cases at different drift

rate ζ and sample size, since the hiking of coverage probabilities is rather creeping for the

later two methods. The only case where the SAEL method is less competitive is for the

estimation of CTE0.95 when ζ = 0.08. It is notable that, under a relatively large drift

rate, an overwhelming portion of the simulated stock paths are out-of-the-money (OTM)

19

triggering zero payoffs (see Eq. (23)). In view of this, the distribution function of the

discounted payoff random variable P has a point mass at zero and might not be smooth

around VaR1−p for some large p which voilates Condition (1) of Theorem 1. In the present

context, we observe that the distribution function is still smooth around VaR0.95 = 4.3919,

however, under a limited sample size such as 500, the portion of OTM paths can be larger

than 95% which yields a point estimate of VaR0.95 equal zero. This clearly impairs the

merit of the SAEL method built on the theoretical results conveyed in Theorems 3–5. A

direct application of EL-based methods to zero-inflated data might not be efficient and

tailor-designed approaches should be developed for better efficiency.

In summary, the advantages of SAEL method over the other three competing methods

are also remarkable in estimating the risk measures for the in-the-money European put

option. All the three competing methods have a better performance in terms of coverage

probabilities for the estimation of the risk measures for put options compared to their

performance for the Pareto distribution which we studied in the last subsection. The

possible reason is that the underlying loss distribution is less skewed in the put option

example than in the Pareto distribution example.

4 Real-data analysis

We apply the SAEL method to to analyze the Danish Fire Loss Data collected by Copen-

hagen Reinsurance. A detailed description for the data can be found in McNeil (1997).

The data consists of 2167 fire losses over the period 1980 to 1990 with losses values being

adjusted due to inflation. The unit of the losses is millions of Danish Krone. The his-

togram of this data-set is illustrated in Figure 1, which clearly shows a severe skewness and

a typical heavy-tailed feature of the loss distribution. It is noteworthy that there are three

extreme losses (i.e. 144.66, 152.41, and 263.30 millions) which compose 0.138% of the data,

whereas the remaining observed losses are all less than 70 million. Such catastrophic losses

can be seen as outliers and thus, as a convention (see e.g. Tan and Weng (2014)), they are

excluded in the analysis as well as in the histogram for the ease of presentation.

As we have previously mentioned in Section 2.4, when the condition E[X18

]< ∞ or

even E[X4]<∞ in Theorems 2 and 3 is violated, the SAEL-based confidence interval for

VaR and CTE cannot ensure the theoretical high-order asymptotic coverage precision, but

the standard square-root-n convergence still holds; see the discussion at the end of Section

2.4. The numerical study in section 3.1 has confirmed the superiority of the SAEL method

to three competing methods in estimating risk measures for a Pareto loss distribution which

violates such high-order moment conditions. So, in this section we apply our SAEL method

to the Danish Fire Loss Data even though the data shows a clear heavy-tailed future. We

set the parameter (1− p) to be 0.95 and 0.99 respectively.

20

Table 5: Coverage probability (%) for confidence intervals of VaR1−p and (VaR1−p,CTE1−p)ᵀ

of the European put option.


Sample Size n 250 500 750 1000 1250 500 1000 1500 2000 2500

raw EL 90.76 92.65 96.09 94.43 93.55 74.87 88.83 92.10 93.15 93.39

ζ = 4% SAEL 91.82 94.56 96.05 94.50 95.05 86.67 92.75 94.00 94.47 94.45


Bootstrap 88.93 91.24 91.88 92.61 92.42 83.00 88.11 90.40 91.17 91.47

ND (%) raw EL 8.10 0.72 0.07 0.00 0.00 17.93 1.68 0.23 0.03 0.02

raw EL 91.03 92.68 95.91 94.26 93.70 74.46 88.86 91.65 92.62 93.52

ζ = 6% SAEL 92.32 94.65 95.84 94.46 95.14 85.97 92.41 93.86 94.53 94.68


Bootstrap 89.05 91.29 91.65 92.36 92.46 83.27 88.17 90.04 90.83 91.97

ND (%) raw EL 7.59 0.55 0.03 0.01 0.00 17.82 1.97 0.31 0.05 0.03

raw EL 90.42 93.06 96.02 94.46 93.29 75.99 89.68 92.28 92.78 93.60

ζ = 8% SAEL 91.75 94.70 95.94 94.58 94.62 85.62 92.53 93.89 94.41 94.79


Bootstrap 88.32 91.45 91.08 92.49 92.26 83.81 88.71 90.54 90.95 92.17

ND (%) raw EL 8.14 0.65 0.03 0.00 0.00 17.00 1.81 0.15 0.05 0.01


Sample Size n 250 500 750 1000 1250 500 1000 1500 2000 2500

raw EL 94.51 94.43 94.60 94.59 95.65 93.92 94.23 94.62 95.15 94.98

ζ = 4% SAEL 94.78 94.55 94.80 94.73 95.41 95.09 94.77 94.90 95.32 95.10


Bootstrap 92.72 93.49 93.86 94.12 93.74 92.69 92.85 93.48 94.01 93.98

ND (%) raw EL 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

raw EL 94.04 94.94 95.00 94.08 95.66 93.97 94.52 94.82 94.74 95.27

ζ = 6% SAEL 94.23 94.91 95.07 94.82 95.54 94.87 94.97 95.01 94.96 95.42


Bootstrap 92.35 93.56 93.80 93.52 93.95 92.28 93.27 93.63 93.87 94.29

ND (%) raw EL 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.01 0.01 0.00

raw EL 94.06 94.65 94.85 94.93 95.39 93.63 94.92 94.63 94.56 95.17

ζ = 8% SAEL 94.19 94.75 94.94 94.94 95.24 94.22 95.16 94.75 94.72 95.27


Bootstrap 86.14 88.48 90.54 91.31 92.50 88.18 91.64 92.61 93.44 93.93

ND (%) raw EL 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00

21

Table 6: Coverage probability (%) for confidence interval of CTE1−p of the European put

option.

Sample Size n 500 1000 1500 2000 2500

SAEL 82.26 91.46 93.36 93.58 93.58

ζ = 4% Influence Function 83.04 89.50 91.94 91.92 92.64

Bootstrap 82.74 88.28 91.24 91.04 91.74

SAEL 79.14 91.34 93.82 93.34 94.08

CTE0.99 ζ = 6% Influence Function 84.24 89.52 92.12 92.04 92.58

Bootstrap 82.80 88.28 90.94 91.18 92.20

SAEL 77.18 89.32 92.12 94.18 94.08


Bootstrap 83.98 89.00 90.16 92.18 91.70

SAEL 94.68 94.83 94.70 94.90 95.09


Bootstrap 92.64 93.65 93.99 94.47 94.34

SAEL 92.95 94.40 94.90 94.61 94.75

CTE0.95 ζ = 6% Influence Function 92.92 93.99 94.25 94.21 94.57

Bootstrap 92.26 93.52 94.01 94.05 94.40

SAEL 86.38 90.31 91.70 92.86 93.50


Bootstrap 91.20 93.59 94.06 93.86 94.36

Table 7: 95%-level confidence intervals for VaR1−p and CTE1−p of the put option under 100

replications. The drift parameter ζ is set as 0.06.

Parameter VaR0.99 VaR0.95 CTE0.99 CTE0.95

Sample Size n 250 1250 250 1250 500 2500 500 2500

Lower Limit 18.81 23.76 8.84 12.08 26.63 29.70 18.51 20.58

SAEL Point Estimate 26.72 27.07 14.02 14.62 30.28 31.84 21.76 22.19

Upper Limit 29.98 30.29 20.25 17.19 33.91 34.39 25.55 23.93

Length 11.17 6.54 11.41 5.12 7.28 4.69 7.04 3.35

Lower Limit 21.32 24.01 8.65 12.12 26.04 29.54 18.19 20.53

Influence Point Estimate 26.72 27.07 14.02 14.62 30.28 31.84 21.76 22.19

Function Upper Limit 32.12 30.13 19.39 17.13 34.45 34.14 25.34 23.86

Length 10.79 6.13 10.74 5.01 7.91 4.60 7.14 3.33

Lower Limit 19.19 23.72 7.86 12.07 26.10 29.54 18.17 20.52

Bootstrap Point Estimate 26.72 27.07 14.02 14.62 30.28 31.84 21.76 22.19

Upper Limit 34.25 30.42 20.18 17.18 34.45 34.14 25.35 23.86

Length 15.06 6.70 12.32 5.11 8.35 4.60 7.18 3.34

22

Figure 1: Histogram for Danish fire losses with values less than 70 (millions of Danish Krone).

0 10 20 30 40 50 60

0200

400

600

800

Loss value

Frequency

The point estimates of the influence function method and the bootstrap method for

VaR1−p and CTE1−p are exactly the same and given by the following moment estimates:

VaR1−p := X(dn(1−p)e), and CTE1−p :=1

n− dn(1− p)e+ 1

n∑i=dn(1−p)e

X(i),

whereX(i)

ni=1

are the order statistics of the loss data and⌈·⌉

denotes the ceiling function.

For the SAEL method, the associated point estimate is a root of the equation (Qin and

Lawless (1994)): n−1∑n

i=1 ψh (Xi;µ, η) = 02, with ψh(·;µ, η) given in equation (7) and 02

denoting a two-dimensional null vector. To find such a root, one may employ the standard

scant method and use(

VaR1−p, CTE1−p

)ᵀas an initial guess. We use different bandwidths

that satisfy Condition (3) of Theorem 1 to compute such SAEL-based point estimates and

compare them with the point estimates of the other two counterparts in Table 8. Since

the sample size (n = 2164) is considerably large, the effect of bandwidth h on the point

estimates is not significant as revealed by Table 8. As the bandwidth h becomes smaller, the

point estimate of SAEL is closer to that of the influence function/bootstrap method because

the smoothed estimating equation (ψh(·;µ, η)) reduces to the primal nonsmooth equation

(ψ(·;µ, η)) as h tends to 0, and the moment estimate is a root of such an estimating equation

(Qin and Lawless (1994)). In the following analysis, we fix the bandwidth h = n−1/2 as

commonly used in the literature, see e.g., Chen and Hall (1993).

23

Table 8: Point estimates for (VaR1−p,CTE1−p) of Danish fire data-set.

1− p Influence function/BootstrapSAEL

h = n−1/2 h = n−3/4 h = n−1

0.99 (24.9703, 36.5808) (24.9612, 36.7740) (24.9689, 36.7740) (24.9701, 36.7740)

0.95 (9.3988, 19.2049) (9.4187, 19.2774) (9.4018, 19.2774) (9.3993, 19.2774)

Table 9 reports the confidence intervals of VaR1−p and CTE1−p of the Danish fire

loss data. We have two major comments on the results in Table 9. Firstly, the SAEL

confidence intervals are skewed to the right, whereas the confidence intervals produced by

the influence function method and the bootstrap method are always symmetric. In other

words, the SAEL method prefers to produce conservative estimation of the potential loss,

which is more appealing in the practice from the perspective of regulators. Secondly, the

length of confidence intervals of CTE0.99 can be extremely large (approximately 60 million

Danish dollars), which means that the corresponding point estimates are very volatile under

this case although the sample size is already appreciable. This impressive phenomenon

is probably due to the un-boundedness of the estimating equation ψh(Xi;µ0, η0), which

brings large asymptotic standard deviation of the point estimate and the estimating result

is sensitive to large values in sample. Such an issue is often referred as the robustness in

the statistical literature, and has also raised concern in risk management research (Kou,

Peng and Heyde (2013); Kratschmer, Schied and Zahle (2014)). According to BCBS (2012),

CTE is promoted to replace VaR, however, such a volatile estimate is unacceptable to be

used to set capital requirement.

We further construct the confidence regions for the pair (VaR0.95,CTE0.95)ᵀ based on

the influence function method and our SAEL method. Since the bootstrap method and raw

EL method cannot produce continuous confidence region, we do not take these two methods

into comparison. The produced confidence regions are shown in Figure 2. It is known that

the EL based methods leave the observed data to determine the shape of the confidence

region, and the confidence region tends to be concentrated in places where the density of

the parameter estimator is greatest (Hall and La Scala, 1990). Clearly, the solid contour

curve in Figure 2 is skewed to upper right corner, which confirms the nice property of SAEL

method and shows the heavy tail feature of the underlying loss distribution. On the other

hand, it is well known that the influence function method always produces elliptical-shape

region and it is confirmed by the dash-dot contour curve in Figure 2.

5 Conclusions

In this paper, we propose the SAEL method to construct nonparametric confidence inter-

vals (regions) for three popular risk measures: VaR, CTE and Expectile. We show that the

24

Table 9: Confidence intervals of VaR1−p and CTE1−p at the confidence level of 95% based

on the Danish fire data-set. The parameter (1−p) is chosen to be 0.95 and 0.99, respectively.

Parameter Method1− p = 0.95 1− p = 0.99

Lower Limit Upper Limit Lower limit Upper Limit

SAEL 8.220 11.651 20.981 32.402

VaR1−p Influence Function 8.620 11.402 21.552 30.875

Bootstrap 8.265 11.776 21.093 30.926

SAEL 19.383 33.142 40.314 99.393

CTE1−p Influence Function 17.720 30.443 31.311 85.856

Bootstrap 17.718 30.538 30.690 83.021

Figure 2: 95%-level confidence regions for (VaR0.95,CTE0.95)ᵀ of Danish fire loss data.

0 5 10 15 20

VaR

15

16

17

18

19

20

21

22

23

24

CT

E

Influence Function

SAEL

25

SAEL ratios have chi-square asymptotic distributions, according to which the confidence

intervals for the three risk measures can be constructed. We further disclose that the cov-

erage error of the SAEL confidence intervals for VaR and Expectile can be reduced from

O(n−1

)to O

(n−2

)by Bartlett correction, while the confidence interval for CTE can only

have O(n−1

)coverage precision in general. This gives theoretical support to the advan-

tages of our method over the existing nonparametric methods such as the raw EL method,

the influence function method and the bootstrap method. To the best of our knowledge,

we are the first to clearly analyze the high-order asymptotic properties of nonparametric

inference on these risk measures, which is nontrivial due to the nonsmoothness of estimat-

ing equations. Via extensive simulation studies, we also find that the SAEL method has

overwhelmingly better finite sample performance in terms of empirical coverage probability

than those competing methods.

A Appendix

This section collects the proofs of Theorems 1-5. Their proofs rely on Lemma 1 below,

which quantifies the closeness of the moments of ψh(X1;µ0, η0) and those of ψ(X1;µ0, η0).

ψh(X1;µ0, η0) and ψ(X1;µ0, η0) are respectively defined in (5) and (7).

Lemma 1. If E|X| <∞ and the first-order derivative of the distribution function of X ex-

ists in a neighborhood of µ0 and is positive at µ0, then Eψsh(X1;µ0, η0) = Eψs(X1;µ0, η0)+O(h2) for s = 1, 2, where ψs(X1;µ0, η0) denotes the sth component of ψ(X1;µ0, η0).

Proof. Since the case of s = 1 has been shown by Chen and Hall (1993), we only need prove

the case of s = 2, i.e., E[XGh(X − µ0)] = E[XI(X − µ0)] +O(h2). It can be verified that

E[XGh(X − µ0)] =

∫ +∞

−∞

∫ +∞

−∞xI(x− µ0 − uh)dFX(x)

K(u)du

=

∫ +∞

−∞v(uh+ µ0)K(u)du,

where v(y) =∫ +∞−∞ xI(x− y)dFX(x). For the integrand in the above display, we use Taylor

expansion to obtain

v(uh+ µ0)− v(µ0) = v′(µ0)(uh) +1

2v′′(τ)(uh)2,

where τ depends on uh and lies between µ0 and µ0 + uh. Since E[XI(X −µ0)] = v(µ0), we

arrive at

E[XGh(X − µ0)] = E[XI(X − µ0)] +h2

2

∫ +∞

−∞v′′(τ)u2K(u)du,

26

where the property in equation (6) of the kernel function is applied. It remains to quantify

the remainder term in above equation. Because the support of K is finite and compact and

v′′(τ) = −f(τ)− τf ′(τ), it follows that∫ +∞−∞ v′′(τ)u2K(u)du is bounded and the remainder

term is O(h2). This completes the proof.

A.1 Proof of Theorem 1

Proof. The result in equation (8) is from Theorem 3.2 of Chen and Hall (1993). We only

need to prove (9) and (10). Let Vh be the variance matrix of ψh(Xi;µ0, η0). It can be shown

that Vh = V +O(h), where V = (V ij)1≤i,j≤2 and

V 11 = p(1− p) +O(h), V 12 = η0(1− p) +O(h)

V 22 = (1/p2)EX2I(X ≥ µ0) − η20 +O(h).

Because EX2I(X ≥ µ0) − pη20 > 0, V is non-singular and so is Vh as h → 0 with

probability tending to 1.

Let Zi = ψh(Xi;µ0, η0). By Lagrange’s multiplier method (See Qin and Lawless, 1994),

it can be shown that if SEL2(µ0, η0) is well defined, then

SEL2(µ0, η0) = 2

n∑i=1

log(1 + λᵀhZi),

where λh is the solution ton∑i=1

Zi1 + λᵀhZi

= 0. (26)

With similar argument to the proof of Theorem 3.1 in Chen and Hall (1993), we can

conclude that λh = Op(n−1/2 +h2) in view of the condition E[X2] <∞. By approximating

equation (26), we have

λh = V −1h

1

n

n∑i=1

Zi +Op(n−1/2 + h2)2

and accordingly the likelihood ratio can be approximated as

SEL2(µ0, η0) =

(n−1/2

n∑i=1

Zi

)ᵀ

V −1h n−1/2

n∑i=1

Zi +Opn(n−1/2 + h2)3.

It can be shown that

E(√nZi) = O(

√nh4) = o(1)

under the assumption nh4 = o(1), and accordingly, the application of central limit theorem

leads to n−1/2∑n

i=1 Zid−→ N(0, V ), which implies SEL2(µ0, η0)

d−→ χ22. This proves the

result in equation (9).

The result in equation (10) can be proved along the same lines of the proof of Corollary

5 of Qin and Lawless (1994), and thus omitted.

27


Proof. We shall only prove equation (14), because equation (13) is the result of Theorem

4.1 of Chen and Hall (1993).

For i = 1, 2, . . . , n, denote Yi ≡ (Y 1i , Y

2i )ᵀ := V

−1/2h ψh(Xi;µ0, η0), where Vh is the

variance matrix of ψh(Xi;µ0, η0). Let Br = (1/n)∑n

i=1 Yri , βr = E(Y r

1 ) and for k > 2

βj1j2···jk = E(Y j11 Y j2

1 · · ·Yjk

1 ), Bj1j2···jk =1

n

n∑i=1

Y j1i Y j2

i · · ·Yjki − β

j1j2···jk ,

where j1, j2, . . . , jk ∈ 1, 2. It can be verified that βr = O(h2) and Br = Op(n−1/2 + h2).

It follows from

‖ψh(x;µ0, η0)‖ ≤√

1 + (x2/p) + η0 ≤ (|x|/p) +√

1 + η0,

that if E|X|s < ∞, then both E‖ψh(Xi;µ0, η0)‖s and E‖Yi‖s are finite. Accordingly, for

k > 1, if E|X|2k <∞, then Bj1j2···jk = Op(n− 1

2 ) and βrs = δrs with δrs being 1 if r = s and

0 otherwise.

In the proof of Theorem 1, it has been shown that the solution to (26) satisfies λh =

Op(n−1/2 + h2). Further, by DiCiccio et al. (1991), λh can be expanded as

λh = λh1 + λh2 + λh3 +Op((n−1/2 + h2)4)

where λrh1 = Br, λrh2 = −BrsBs + βrstBsBt and

λrh3 = BrsBtuBu +BrstBsBt + 2βrstβtuvBsBuBv

−3βrstBtuBsBu − βrstuBsBtBu.

Here we have used the summation convention according to which, if an index occurs more

than once in an expression, summation over the index is understood. For k = 1, 2, 3,

λhk = Op(n−1/2 + h2)k.Substituting λh into the expression for SEL(µ0, η0), we have

SEL2(µ0, η0) = nRᵀR+Op(n(n−1/2 + h2)5)

where the sign root R = R1 +R2 +R3 with Rr1 = Br, Rr2 = 13β

rstBsBt − 12B

rsBs, and

Rr3 =3

8BrsBstBt − 5

12βrstBtuBsBu − 5

12βstuBrsBtBu

+4

9βrstβtuvBsBuBv +

1

3BrstBsBt − 1

4βrstuBsBtBu.

Clearly R is a polynomial of U = (B1, B2, ..., B111, ..., B222). The assumption E[|X|18

]<∞

guarantees that U has finite first 6 cumlants. After lengthy algebra, it can be found that the

third-order cumulants of√nR is of order O(n−1/2h2) and that the fourth-order cumulants of

28

√nR is of order O(n−1h2). Under condition n3h4 = o(1), this guarantees that SEL2(µ0, η0)

is Bartlett correctable (Chen and Wood, 1996). The Bartlett correction factor is the leading

term of n E(nRᵀR)− 2 /2, which is equal to

b =1

2βiijj − 1

3βijkβijk.

By Lemma 1, b = b2 +O(n−1

). This proves equation (14).


Proof. We carry those notations used in the proof of Theorem 2, and we further define

Yn+1 = − an

∑ni=1 Yi. Then, applying the Lagrangian multiplier method, we can easily verify

that

SAEL2(µ0, η0; a) = 2n+1∑i=1

log(1 + λᵀhaYi)

with λha being the solution to

n+1∑i=1

Yi1 + λᵀhaYi

= 0. (27)

The rest proof is the largely same as that of Theorem 1 of Liu and Chen (2010), except

that Lemma 1 will be extensively used here. To save lines, we only present a sketch in the

rest of the proof.

By comparing equations (26) and (27), we shall obtain

λha = (1− a/n)λh +Op((n−1/2 + h2)4).

Substituting this approximation to SAEL2(µ0, η0; a) and matching the approximation of

SEL2(µ0, η0), we find that

SAEL2(µ0, η0; a) = nRᵀaRa +Op(n(n−1/2 + h2)5)

where Ra = R− (a/n)R1. Under the assumption n3h4 = o(1), the additional term (a/n)R1

is of order Op(n−3/2), and it is so small that the leading terms of the third and fourth

cumulants of Ra are of the same order as those of R. This implies that SAEL2(µ0, η0; a) is

also Bartlett correctable. By taking expectation, we have

E(nRᵀaRa) = E(nRᵀR)− 2aE(Rᵀ

1R1) +O(n−2

)= 2 +

2b2n− 4a

n+O

(n−2

).

This implies that the Bartlett correction factor accompanying SAEL2(µ0, η0; a) is b2 − 2a.

When a = b2/2, the smoothed AEL SAEL2(µ0, η0; a) calibrated by χ22 has the same precision

as the Bartlett-corrected smoothed EL SEL2(µ0, η0). If a = b2/2 + Op(n−1/2

), we achieve

the same conclusion by re-studying the above proof. This proves equation (19).

29

A.4 Proof of Theorems 4 and 5

Proof. The proofs of Theorems 4 and 5 are very similar to those of Theorems 1, 2 and 3.

Hence we only outline the proof of Theorem 4. Similar to the proof of Theorem 3, it can

be shown that

SAEL3(η0; a) = SAEL3(η0)− 2aR1∗R1∗ +Op(n(n−1/2 + h2)5)

= SAEL3(η0) +Op(n(n−1/2 + h2)2).

Then along the proof of Corollary 5 of Qin and Lawless (1994), we shall conclude that

Pr(SAEL3(η0; a) ≤ x) = P (SEL3(η0) ≤ x) +O(n−1

)= P (χ2

1 ≤ x) +O(n−1

).

This proves Theorem 4.

References

Abu Bakar, S.A., Hamzaha, N.A., Maghsoudia, M. and Nadarajah, S. (2015). Modeling

loss data using composite models. Insurance: Mathematics and Economics, 61, 146–

154.

Ahn, J. and Shyamalkumar, N. (2014). Asymptotic theory for the empirical Haezendonck-

Goovaerts risk measure. Insurance: Mathematics and Economics, 55, 78-90.

Baysal, R.E. and Staum, J. (2008). Empirical likelihood for value-at-risk and expected

shortfall. Journal of Risk, 11, 3–22.

BCBS (2012). Consultative Document May 2012. Fundamental review of the trading book.

Basel Committee on Banking Supervision. Basel: Bank for International Settlements.

Bellini, F., Klar, B. Muller, A. and Gianin, R. (2014). Generalized quantiles as risk

measures. Insurance: Mathematics and Economics, 54, 41–48.

Chen, S. X. and Cui, H. J. (2006). On Bartlett correction of empirical likelihood in

presence of nuisance parameters. Biometrika, 93, 215–220.

Chen, S. X. and Hall, P. (1993). Smoothed empirical likelihood confidence intervals for

quantiles. The Annals of Statistics, 21, 1166–1181.

Chen, S. X. and Tang, C. Y. (2005). Nonparametric inference of Value-at-risk for depen-

dent financial returns. Journal of Financial Econometrics, 3, 227–255.

Chen, S. X. (2008). Nonparametric estimation of expected shortfall. Journal of Financial

Econometrics, 6, 87–107.

Chen, J., Variyath, A. M., and Abraham, B. (2008). Adjusted empirical likelihood and

its properties. Journal of Computational and Graphical Statistics, 17, 426–443.

30

Cooray, K. and Ananda, M.M. (2005). Modeling actuarial data with a composite lognormal-

pareto model. Scandinavian Actuarial Journal, 2005 (5), 321–334.

DiCiccio, T. J., Hall, P., and Romano, J. P. (1991). Empirical likelihood is Bartlett-

correctable. The Annals of Statistics, 19, 1053–1061.

Dowd, K. (2005). Estimating risk measures. Financial Engineering News, 43, 13.

Hardy, M. (2000). Hedging and reserving for single-premium segregated fund contracts.

North American Actuarial Journal, 4, 63–74.

Hall, P. and La Scala, B. (1990). Methodology and algorithms of empirical likelihood.

International Statistical Review, 58, 109–127.

Jones, B.L. and Zitikis, R. (2003). Empirical estimation of risk measures and related

quantities. North American Actuarial Journal, 7, 44–54.

Kou, S., Peng, X. and Heyde, C.C. (2013). External risk measures and Basel accords.

Mathematics of Operations Research, 38(3), 393-417.

Kratschmer, V., Schied, A. and Zahle, H. (2014). Comparative and qualitative robustness

for law-invariant risk measures. Finance and Stochastics, 18, 271–295.

Liu, Y. and Chen, J. (2010). Adjusted empirical likelihood with high-order precision. The

Annals of Statistics, 38, 1341–1362.

Manistre, B. J. and Hancock, G. H. (2005). Variance of CTE estimator. North American

Actuarial Journal, 9, 129–154.

McNeil, A. (1997). Estimating the Tails of Loss Severity Distributions Using Extreme

Value Theory. ASTIN bulletin, 27, 117–137.

Nadarajah, S., and Bakar, S. (2014). New composite models for the danish fire insurance

data. Scandinavian Actuarial Journal, 2014 (2), 180–187.

Newey, W., and Powell, J. (1987). Asymmetric least squares estimation and testing.

Econometrica, 55, 819–847.

Owen, A. B. (1988). Empirical likelihood ratio confidence intervals for a single functional.

Biometrika, 75, 237–249.

Owen, A. B. (1990). Empirical likelihood ratio confidence regions. The Annals of Statis-

tics, 18, 90–120.

Owen, A. B. (2001). Empirical Likelihood. Chapman and Hall/CRC, New York.

Peng, L., Qi, Y., Wang, R. and Yang, J. (2012). Jackknife Empirical Likelihood Method for

Some Risk Measures and Related Quantities. Insurance: Mathematics and Economics,

51, 142–150.

Qin, J. and Lawless, J. (1994). Empirical likelihood and general equations. The Annals

of Statistics, 22, 300–325.

31

Scollnik, D.P. and Sun, C. (2012). Modeling with weibull-pareto models. North American

Actuarial Journal, 16 (2), 260–272.

Tan, K. S. and Weng, C. (2014). Empirical approach for optimal reinsurance design.

North American Actuarial Journal, 18 (2), 315–342.

Yamai, Y. and Yoshiba, T. (2002). Comparative analyses of expected shortfall and value-

at-risk: Their estimation error, decomposition, and optimization. Monetary and Eco-

nomic Studies, 21, 87–121.

Zhou, M. and Yang, Y. (2016). emplik: Empirical likelihood ratio for censored/truncated

data. R package version 1.0-3. URL: https://CRAN.R-project.org/package=emplik.

Ziegel, J.F. (2014). Coherence and elicitability. Mathematical Finance, 26, 901-918.

32

https://CRAN.R-project.org/package=emplik

Nonparametric Inference for VaR, CTE and Expectile with ... · Nonparametric Inference for VaR, CTE...

Documents

Transcript of Nonparametric Inference for VaR, CTE and Expectile with ... · Nonparametric Inference for VaR, CTE...