APPLICATIONS OF ROBUST STATISTICS IN FINANCE

APPLICATIONS OF ROBUST STATISTICS IN FINANCE

R. Douglas Martin*

Current and Future Challenges in Robust Statistics Banff Center, Canada

Wed. November 18, 2015

11/18/2015 1

11/18/2015 2

THREE TOPICS: TWO SHORT, ONE LONG

1. Robust detection of fundamental factor model

exposures outliers (skipped)

2. Robust covariances for mean-variance portfolio

optimization: Classic multivariate outliers model

versus independent outliers in assets (IOA) model

(short)

3. Non-parametric versus parametric expected shortfall:

the influence function approach

11/18/2015 3

1. ROBUST DETECTION OF FUNDAMENTAL

FACTOR MODEL EXPOSURES OUTLIERS*

1. The fundamental factor model context

2. The data for this example

3. The MCD estimator

4. The results

* Joint work with Chris Green, Washington State Investment Board and

UW Statistics PhD Candidate

V2 5/11/2015

1. Fundamental Factor Model

1 t t t t r B f ε

Exposures matrix (N x K)

4

Factor returns (K x 1)

Copyright R. Douglas Martin

1 1 1( , , , , )t q q K B b b b b

Industry exposures, each entry a 1 or 0 Continuous risk factor exposures

Work-horse of commercial portolio optimization and risk management

software (MSCI Barra, Axioma, Northfield).

For example: firm size, E/P, B/M, momentum, leverage, etc.

V2 5/11/2015

2. The Data

5 Copyright R. Douglas Martin

Number of companies:

Fundamental data (exposures)

- Large-cap stocks (from CRSP database)

- Size/ME (log. mkt. cap.), B/M, E/P, momentum (Compustat)

- 325 months of data, Dec. 1985-Dec. 2012

V2 5/11/2015

2. The MCD Estimator


• Cerioli’s IRMCD method

• Test for outliers using Cerioli approach and Bonferroni-corrected significance level of 0.025/325 ≈ 0.00008.

• Use a conservative version of MCD that uses approx. 95% of the data to estimate the robust dispersion matrix

• R package: CerioliOutlierDetection

http://cran.r-project.org/web/packages/CerioliOutlierDetection/index.html

• Working Paper

Green and Martin, “Diagnosing the Presence of Multivariate Outliers in Financial Data using Calibrated Robust Mahalanobis Distances” (2015). Available from http://students.washington.edu/cggreen/uwstat/papers/mvoutliersfinance.pdf




http://students.washington.edu/cggreen/uwstat/papers/mvoutliersfinance.pdf

V2 5/11/2015

3. The Results


11/18/2015 8

11/18/2015 9

2. ROBUST COVARIANCES FOR MEAN-

VARIANCE OPTIMIZATION

Classic Multivariate Outliers Model vs.

Independent Outliers in Assets (IOA) Model

10

(1 ) ( , ) t F N H r μ Σ

Model for common factor outliers/market crashes

The Two Outliers Models

tT NR rtable of returns with rows

, , , 1, 2, , , 1, 2, ,, t i i i M t t i T i Nr r t

Single Factor Market Model for a Portfolio of Assets:

A market return outlier at time t causes an outlier in all assets (to various degrees), i.e., an outlying row.

11

Model for independent outliers across assets (IOA) Alqallaf, Van Aelst, Yohai and Zamar (2009)

(AVYZ, 2009) 1 (0) i iB Let if asset is (is not) an outlier.

1 2 , , , ( )N i iB B B P B Assume are independent with

Probability of an outlier-free row : For and the probability of a clean row is .36

tr 1(1 )

i

N

i

, , , 1, 2, , , 1, 2, ,, t i i i M t t i T i Nr r t

Single Factor Market Model for a Portfolio of Assets:

Asset specific (risk) time t outliers tend to be independent

.05 20N

12

Empirical Study of IOA Model Validity

Four market-cap groups of 20 stocks, weekly returns

1997 – 2010 in three regimes:

– 1997-01-07 to 2002-12-31

– 2003-01-07 to 2008-01-01

– 2008-01-08 to 2010-12-28

1. Estimate outlier probability for each asset, and hence the probability that a row is free of outliers under the IOA model.

2. Directly estimate the probability that a row is outlier free.

3. Compare results from 1 and 2 across market-caps and regimes.

i

1

1n

i i

1 row

13

-0.3

-0.1

0.1

0.3

PLX

S

-0.2

0.0

0.1

0.2

BW

INB

-0.2

0.0

0.1

0.2

HG

IC

2000 2005 2010

Index

-0.1

0.0

0.1

0.2

WT

S

SMALL-CAPS

4 of the 20 Small-Caps for Entire History

Outlier Detection Rule for Counting

optimal 90% efficient bias robust location estimate*

associated robust scale estimate*

Outliers: returns outside of

Probability of normal return being an outlier: 0.5%

* Use lmRob with intercept only in R package robust

14

ˆ ˆ ˆ ˆ( 2.83, 2.83)s s

s

15

1 3 5 7 9 11 14 17 20

% OUTLIERS IN EACH ASSET

ASSETS

PE

RC

EN

T

02

46

8

2008 2009 2010 2011

02

46

810

12

Index

CO

UN

T

# OF ASSETS WITH AN OUTLIER

Small-Caps Outliers in Third Regime

16

Large-Caps Outliers in Third Regime

2008 2009 2010 20110

510

15

Index

CO

UN

T

# OF ASSETS WITH AN OUTLIER

1 3 5 7 9 11 14 17 20

% OUTLIERS IN EACH ASSET

ASSETS

PE

RC

EN

T

01

23

45

6

17

Evaluation of IOA Model for Weekly Returns

1997-01-07 to 2002-12-31 MICRO SMALL MID LARGE

% Clean Rows IOA Model 32 39 46 59

% Clean Rows Direct Count 37 48 58 69







Main Result: The differences % Clean Rows Direct Count - % Clean Rows IOA

Model tend to be larger during the first and third time periods, which is apparently

due to the fact that market crashes cause highly correlated outliers in most

assets, for which the IOA model does not hold. However the IOA model seems to

describe much of the behavior, particularly so for the middle time period.

18

1.0

1.5

2.0

2.5

gmv.lo

gmv.lo.mcd

gmv.lo.qc

mkt

Cum

ula

tive R

etu

rn

Weekly Returns, Window = 60, Rebalance = Weekly

LONG-ONLY GMV,GMV.MCD, GMV.PW, MKT

-0.1

50.0

0

Weekly

Retu

rn

1998-03-03 2001-07-03 2005-01-04 2008-07-01

Date

-0.5

-0.1

Dra

wdow

n

19

1.0

1.5

2.0 gmv.lo

gmv.lo.mcd

gmv.lo.qc

mkt

Cum

ula

tive R

etu

rn

Weekly Returns, Window = 60, Rebalance = Monthly

LONG-ONLY GMV,GMV.MCD, GMV.PW, MKT

-0.1

50.0

0

Weekly

Retu

rn

1998-03-03 2001-07-03 2005-01-04 2008-07-01

Date

-0.5

-0.1

Dra

wdow

n

20

“Statistics is a science in my opinion, and it is no more a branch of mathematics than are physics, chemistry and economics; for if its methods fail the test of experience – not the test of logic – they will be discarded”

- J. W. Tukey

References

Alqallaf, Konis, Martin and Zamar (2002). “Scalable robust covariance and correlation

estimates for data mining”, Proceedings of the eighth ACM SIGKDD International

Conference on Knowledge Discovery and Data Mining, pp. 14-23. ACM.

Alqallaf, Van Aelst, Yohai and Zamar (2009). “Propagation of Outliers in Multivariate

Data”, Annals of Statistics, 37(1). p.311-331.

Boudt, Peterson & Croux (2008). “Estimation and Decomposition of Downside Risk for

Portfolios with Non-Normal Returns”, Journal of Risk, 11, No. 2, pp. 79-103.

Maronna, Martin, and Yohai (2006). Robust Statistics : Theory and Methods, Wiley.

Martin, R. D., Clark, A and Green, C. G. (2010). “Robust Portfolio Construction”, in

Handbook of Portfolio Construction: Contemporary Applications of Markowitz

Techniques, J. B. Guerard, Jr., ed., Springer.

Martin, R. D. (2012). “Robust Statistics in Portfolio Construction”, Tutorial

Presentation, R-Finance 2012, Chicago.

Scherer and Martin (2005). Modern Portfolio Optimization, Chapter 6.6 – 6.9,

Springer

21

11/18/2015 22

3. NON-PARAMETRIC VERSUS PARAMETRIC

EXPECTED SHORTFALL:

The Influence Function Approach*

*Joint work with Shengyu Zhang, PhD, First Vice President, Homestreet Bank Seattle, WA

1. VALUE-AT-RISK

Valued-at-Risk (VaR)

– A capital weighted 1% or 5% tail probability quantile

23 11/18/2015 23

VaR is the de facto risk management standard

– Originated in the 1990’s at JP Morgan

– RiskMetrics™ Technical Document (1994)

– Value-at-Risk, Jorion 3 editions: (1996), (2000), (2006)

VaR is simply not informative enough

– Doesn’t tell you anything about the size of losses beyond VaR

1( ; ) ( )q F F ( ; ) ( ; )VaR F W q F

What it is

– The average of the losses beyond VaR

– Much more informative than VaR

– ES a coherent risk measure, VaR is not

24 11/18/2015 24

Alternative names for ES

– Conditional Value-at-Visk (CVaR)

– Expected Tail Loss (ETL)

Basel III move to ES instead of VaR

– “Fundamental review of the trading book - second consultative document” (2013). Basel

Committee on Banking Supervision. www.bis.org/publ/bcbs265.htm.

– “Fundamental Review of the Trading Book: Outstanding Issues” (2014). Basel Committee

on Banking Supervision. www.bis.org/bcbs/publ/d305.htm.

2. EXPECTED SHORTFALL (ES)

11/18/2015 25

Apr 1994 Jan 1996 Jan 1998 Jan 2000 Jan 2002 Jan 2004

-0.1

0-0

.05

0.0

0

ES vs VaR: Event Driven Hedge Funds Index

11/18/2015 26

ES vs VaR: Event Driven Hedge Funds Index

27

1( ; ) ( )q F F

( ; )

1( ; ) ( )

r q F

ES F r dF r

Taking loss as negative for math but positive for some plots

11/18/2015 27

( ; ) ( | ( ; ))ES F E r r q F

Expected Shortfall Formula

28

Non-Parametric Estimators

( )

1( ; )

1 1( ) ( )

n

n

n i

ir q Fn

ES r dF r r

Represent the asymptotic value of any estimator as a functional

on a space of distribution functions:

11/18/2015 28

( )F

where is the empirical distribution function (edf). nF

ˆ ( )nF

Expected shortfall estimator:

29

Parametric Estimators

First get a parametric representation of the risk measure:

( )( , )

zES

Normal distribution ES

t-distribution ES ,

( , , )g

ES s s

, 2 ,

2

2g f q

11/18/2015 29

30

Then substitute estimators for the unknown parameters

ES Parametric Estimators

Normal Distribution t-Distribution

( )ˆ ˆ

z

ˆ,ˆ ˆg

s

MLE parameter estimators MLE risk estimator

11/18/2015 30

11/18/2015 31

OXM Returns

0 100 200 300 400 500

-0.1

5-0

.10

-0.0

50.

00.

050.

10

Normal vs Fat-Tailed Parametric ES/ETL Estimator

11/18/2015 32

-0.2 -0.1 0.0 0.1 0.2

05

10

15

20

25

30

OXM DAILY RETURNS

1% STABLE ETL vs. NORMAL VAR AND ETL: $1M OVERNIGHT

Normal VaR = $47K

Normal ETL = $51K

Stable ETL = $147K

STABLE DENSITYNORMAL DENSITY

Normal vs Fat-Tailed Parametric ES/ETL Estimator

11/18/2015 33

Non-Parametric ES IF Formulas

Straightforward but tedious calculations give complex formulas.

But the plots tell the story.

( ; )

1( ; ) ( )

p

p p

r q F

ETL F r dF r

(1 ) , 0 1p rF p F p p

0

( ; ) ( ) pETL

p

dIF r F ETL F

dp

3. ES INFLUENCE FUNCTIONS

11/18/2015 34

IF of VaR

r

-3 -2 -1 0 1 2 3

01

02

03

04

05

06

07

0

IF of ES

r

-3 -2 -1 0 1 2 3

01

02

03

04

05

06

07

0

p=0.1p=0.05p=0.01

Non-Parametric VaR and ES Influence Functions

Straightforward formula calculations available in paper

11/18/2015 35

General IF Formula

Parametric Risk Estimators

ˆˆ ( ; ) ( ) ( )IF r F r θ

θ IF

MLE IF Formula 1ˆ( ; , ) ( ) ( ; )IF r F r θ

θ I ψ θ

Risk Measure 1 2( ) ( , , , )K θ

Risk Estimator 1 2ˆ ˆ ˆˆˆ ( ) ( , , , )K θ

11/18/2015 36

Normal Distribution

t-Distribution

known unknown

2 2( ) ( )( )

2

z rr

,1( ) ( )s

gr r

B C

,

1

, ,

,

( )

1 ( )

( )

s

s

gs

r

r

g r

I

ES IF Formulas for Normal and t-Distributions

2 2

( 1)( )( )

( )

v rr

vs r

2

3 2

1 ( 1)( )( ; , )

( )s

v rr s

s vs s r

11/18/2015 37

2 2 2

2

2 2 2

, , 2

2 2 2

2

2

1 1 1 1 1 1 1 10

4 2 2 1 2( 3) 3 1)

1 2 0 1 0

3

1 1 1 20

3 1)

v s

l l l

v v v s

l l lI E

v s

l l l

s v s s

v v

v v s v v

s v

s v v s

2 3

v

v

Information Matrix for t-Distribution

11/18/2015 38

2 2

2 2 2 2

2 2

2

3 2

1 1 1 1 ( ) ( 1)( )log 1

2 2 2 2 2 2 2 ( )

( 1)( )

( )

1 ( 1)( )

( )

s

l

l

l

s

v v r v r

v vs v s v r

v r

vs r

v r

s vs s r

Score Vector for t-Distribution

11/18/2015 39

IF’s of Normal Distribution VaR and ES MLE’s

Positive returns contribute to risk! A serious short-coming!

These IF are quadratically unbounded. Furthermore:

11/18/2015 40

IF’s of t-Distribution Risk MLE’s for Known d.o.f. = 5

These influence function are all bounded. But still:


11/18/2015 41

IF’s of t-Distribution Risk MLE’s for Unknown d.o.f. = 5

IF of VaR

r

-5 -3 -1 0 1 2 3 4 5

-10

01

02

03

04

05

06

07

0

IF of ES

r

-5 -3 -1 0 1 2 3 4 5

-10

01

02

03

04

05

06

07

0

p=0.1

p=0.05

p=0.01

These influence function are unbounded, but only logarithmically.


11/18/2015 42

IF of ES

r

-10 -8 -6 -4 -2 0 2 4 6 8 10

02

04

06

08

01

00

12

0

Par-Normal

Par-t known d.o.f.

Par-t unknown d.o.f.

IF’s of ES Parametric Estimators (MLE’s)

2.5% tail probability (Basel III recommendation)

t-distribution d.o.f. = 5

log | | for large | |r r

quadratic

bounded

11/18/2015 43

2

ˆ ˆ( ) ( ; , ) ( )V IF r F dF r

Applicable to both non-parametric and parametric MLE estimators.

Complicated formulas are available in paper.

1/2

ˆˆ( )

ˆ. .( )n

VS E

n

4. ES ESTIMATOR ASYMPTOTIC VARIANCE

Influence function based asymptotic variance:

11/18/2015 44

0.0 0.1 0.2 0.3 0.4 0.5

01

02

03

04

05

0

Tail Probability

S.E

.

d.o.f.=3

d.o.f.=5

d.o.f.=7

d.o.f.=inf (normal)

Non-Parametric ES S.E.’s for t-Distributions

11/18/2015 45

0.0 0.1 0.2 0.3 0.4 0.5

05

10

15

20

25

30

Tail Probability

S.E

.

d.o.f.=3

d.o.f.=5

d.o.f.=7

d.o.f.=inf (normal)

Parametric ES S.E.’s for t-Distributions

11/18/2015 46

0.0 0.1 0.2 0.3 0.4 0.5

01

02

03

04

05

0

Tail Probability

S.E

.

Nonparametric

Parametric unknown d.o.f.

Parametric known d.o.f.

Non-Parametric vs Parametric ES S.E.’s

t-distribution with d.o.f. = 3

11/18/2015 47

Ratio of S.E. of ES Parametric Estimators to

S.E. of ES Non-Parametric Estimators

11/18/2015 48

Main Expected Shortfall Messages

Use the influence function to analyze risk and

performance estimators

- Understand data behavior of the estimators

- Get asymptotic standard errors

- Can apply to other risk and performance measures

Parametric estimators are more accurate than non-

parametric estimators, but:

- Defect is that positive returns contribute to downside risk

- Variability price of estimating t-distribution d.o.f. is high

11/18/2015 49

5. A WAY TO FIX THE PROBLEM

The problem with parametric ES that positive returns

contribute to risk is due to the symmetric nature of the

standard deviation and scale estimators

A potential solution: Use a semi-scale estimator

Initial investigation for normally distributed returns is to

use a semi-standard deviation estimator instead of the

standard deviation (volatility) estimator

1/2

1/22 2

1( ) ( ( ) ) ( ) ( )

i

i

r r

F F r dF r SSD r rn

11/18/2015 50

IF of Semi-Scale Parametric CVaR

r

IF(r

)

-3 -2 -1 0 1 2 3

0

10

20

CVaR (.01)CVaR (.05)CVaR (.1)

CVaR = ES

11/18/2015 51

Standard Error of CVaR under Normal Distribution

p (tail probability)

SE

of

Va

R

0.0 0.1 0.2 0.3 0.4 0.5

12

34

Par. CVaR with Std. Dev.Par. CVaR with Semi-SDNonparametric CVaR

CVaR = ES

11/18/2015 52

6. INEFFICIENCY OF MODIFIED ES

Modified VaR (mVaR) was proposed by Zangari (1996)* to

improve normal VaR by adding skewness and kurtosis corrections

2 3 2 31 1 1[ 1] [2 5 ] [ 3 ]

6 36 24g z z S z z S z z K

( )mVaR g

(Joint work with Rohit Arora, http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2692543)

*Zangari, P. (1996). “A VaR methodology for portfolios that include options”,

RiskMetrics Monitor 1, First Quarter, pp. 4–12.

Reduces to usual normal VaR formula under normality

Widely publicized and used in a commercial portfolio product

called AlternativeSoft, but not a very good idea ….

where

http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2692543

http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2692543

11/18/2015 53

Modified Expected Shortfall (mES)

Modified ES (mES) was proposed and studied by Boudt et al.

(2008)* as a way to improve normal distribution ES when returns

are non-normal.

*Boudt, K., Peterson, B., and Croux, C. (2008). “Estimation and Decomposition of Downside

Risk for Portfolios with Non-normal Returns” , Journal of risk 11, pp. 79–103.

Reduces to usual normal mES formula under normality

2

1 2 31mETL g c SK c SK c K

where are polynomials in . 1 2 3, , c c c g

This turns out to perform even worse than mVaR

54

Asymptotic Inefficiency of mES

The parametric ES estimators for normal and t-distributions are

maximum-likelihood estimators (MLE’s), hence they attain the

information lower bound on asymptotic variance

Martin and Rohit (2015)* develop the large sample variance of

mES, and calculate the asymptotic efficiency of mES estimator,

defined as the ratio of the asymptotic standard deviation of the ES

MLE to that of the mES estimator for normal and t-distributions.

The results are on the next slide.

11/18/2015 54

*Martin, R. D. and Rohit, Arora (2015). “Inefficiency of Modified VaR and ES”, submitted

to Journal of Risk . Will be posted to SSRN.

11/18/2015 55

Asymptotic Inefficiency of mES

11/18/2015 56

OPEN QUESTIONS & FUTURE RESEARCH

Will robust semi-scale bound the undesirable influence

of positive returns on parametric risk estimators?

Need results for skewed t-distributions

Need correct asymptotic variances and standard

errors in the presence of serial correlation

- Recall Lo (2002) results for Sharpe ratio

- Incipient work on this with Xin Chen

How do you back-test ES estimators?

APPLICATIONS OF ROBUST STATISTICS IN FINANCE

Documents

Transcript of APPLICATIONS OF ROBUST STATISTICS IN FINANCE