Conquering Big Data in Volatility Inference and Risk Management

Jian (Frank) Zou

Worcester Polytechnic Institute

Jian (Frank) Zou (WPI) Volatility Inference & Risk Management May 18, 2016 1 / 29


Introduction

Volatility Modeling and Estimation

- Volatility is the conditional variance of the asset price.
- Volatility modeling is concerned with studying the evolution of the volatility over time.
- Volatility plays a critical role in finance.

Examples:
- Portfolio allocation;
- Derivative pricing and hedging;
- Risk management using measures like VaR.

Low-Frequency Model Features

- Black-Scholes is mathematically attractive
- GARCH and SV work well for low-frequency data
- Stationary returns
- Do not fit high-frequency data

High-Frequency Financial Data

High-frequency financial data possess unique features absent in data measured at lower frequencies:

- Microstructure noise
- Nonstationary with jumps
- Irregularly spaced and random numbers of observations
- Nonsynchronous trading

[Figure: Top 100 by volume (stock ticker labels; volume axis from 0e+00 to 4e+10)]

[Figure: log return versus time (time axis from 0 to 20000; log return axis from -0.04 to 0.06)]

Problems and Challenges

There are major difficulties facing portfolio allocation and volatility matrix estimation with high-frequency financial data:

- Both the number of observations (n) and the number of assets (p) are large;
- Existing estimators (similar to MLE for covariance estimation) perform poorly;
- Existing dimension reduction methods fail due to the non-synchronous data structure.

Computation is very challenging due to large data sets and the vast number of iterations in simulations and optimizations.

Portfolio Allocation and Risk Management

- Portfolio allocation is one of the most fundamental problems in finance.
- The process of determining the optimal mix of assets to hold in the portfolio is a critical issue in risk management.
- Dividing an investment portfolio among different assets based on the volatilities of the asset returns
- Ideal scenario: portfolio with maximum return and minimum risk

Modern Portfolio Theory

Markowitz (1952) was the original milestone paper for modern portfolio theory, formulating mean-variance analysis as an unconstrained quadratic optimization problem. It was later expanded in the book Markowitz (1959).

- Tradeoff between risk and expected return
- Aim to select a collection of investment assets that has lower risk than any individual asset
- Provide ways to find the best possible diversification strategy
- Sharpe (1966) introduced the Sharpe ratio for the performance of mutual funds, which is a direct measure of reward-to-risk.
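The reward-to-risk idea can be illustrated with a minimal Python sketch. It assumes the common textbook form of the Sharpe ratio (mean excess return divided by the sample standard deviation of excess returns); the function name `sharpe_ratio`, the `risk_free` parameter, and the two return series are hypothetical and are not taken from the slides.

```python
import statistics


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by the sample standard deviation of excess returns."""
    excess = [r - risk_free for r in returns]
    sd = statistics.stdev(excess)  # sample standard deviation (n - 1 denominator)
    if sd == 0.0:
        raise ValueError("returns have zero variance; the ratio is undefined")
    return statistics.mean(excess) / sd


# Hypothetical per-period returns for two portfolios (illustrative numbers only).
# Both series have the same mean return, but the second is far more variable.
steady = [0.010, 0.012, 0.008, 0.011, 0.009]
volatile = [0.050, -0.030, 0.040, -0.020, 0.010]

print("steady:  ", sharpe_ratio(steady))
print("volatile:", sharpe_ratio(volatile))
```

With equal mean returns, the portfolio with the smaller dispersion gets the larger ratio, which is the sense in which the ratio measures reward per unit of risk.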

Modern Portfolio Theory - Cont’d

Limitations:
- very sensitive to errors in the estimates of the expected return and the conditional covariance of daily returns (which is often called the volatility matrix)
- works well only if the portfolio size is small
- unstable performance when the portfolio size is large

Methodology

Methodology

The proposed methodology consists of three steps:

1. Estimate the integrated volatility matrix for each day by average realized volatility matrix (ARVM) estimators.
2. Regularize the inverse ARVM estimator using the smoothly clipped absolute deviation (SCAD) penalty to obtain the ARVM-SCAD volatility estimator.
3. Make portfolio allocation based on the ARVM-SCAD volatility estimator.

High Performance Computing

We exploit a variety of HPC techniques, including

- parallel R
- Intel Math Kernel Library (MKL)
- automatic offloading to Intel Xeon Phi SE10P Co-processor

to speed up the simulation and optimization procedures in our statistical investigations.

Price Model

Suppose that there are $p$ assets and their log price process $X(t) = \{X_1(t), \cdots, X_p(t)\}^T$ obeys an Itô process governed by

$$dX(t) = \mu_t \, dt + \sigma_t^T \, dB_t, \quad t \in [0, L]. \tag{1}$$

Our goal is to estimate the integrated volatility matrix for the $\ell$-th day, which is defined as

$$\Sigma_x(\ell) = \int_{\ell-1}^{\ell} \sigma_s \sigma_s^T \, ds, \quad \ell = 1, \cdots, L. \tag{2}$$

Portfolio Allocation Problem

- For the portfolio with allocation vector $w$ and a holding period $T$, the variance (risk) of the portfolio return is given by $R(w, \Sigma) = w^T \Sigma w$.
- However, it is well known that the estimation error in the mean vector $\mu_t$ could severely affect the portfolio weights and produce suboptimal portfolios.
- This motivates us to adopt another popular portfolio strategy: the global minimum variance portfolio, which is the minimum risk portfolio with weights that sum to one. These weights are usually estimated proportional to the inverse covariance matrix, i.e., $w \propto f(\Sigma^{-1})$.
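To make the risk formula and the dependence on the inverse covariance matrix concrete, here is a minimal Python sketch. It assumes the standard closed form for the unconstrained global minimum variance weights, w = Σ⁻¹1 / (1ᵀΣ⁻¹1), which is one particular choice of f; the helper names (`solve_linear`, `gmv_weights`, `portfolio_variance`) and the 3-asset matrix are hypothetical illustrations, not the authors' code or data.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-14:
            raise ValueError("matrix is singular to working precision")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def gmv_weights(sigma):
    """Global minimum variance weights: Sigma^{-1} 1 / (1^T Sigma^{-1} 1)."""
    z = solve_linear(sigma, [1.0] * len(sigma))
    total = sum(z)
    return [zi / total for zi in z]


def portfolio_variance(w, sigma):
    """Risk R(w, Sigma) = w^T Sigma w."""
    n = len(w)
    return sum(w[i] * sigma[i][j] * w[j] for i in range(n) for j in range(n))


# Hypothetical 3-asset volatility matrix (illustrative numbers only).
sigma = [
    [0.040, 0.006, 0.000],
    [0.006, 0.090, 0.012],
    [0.000, 0.012, 0.160],
]
w = gmv_weights(sigma)
equal = [1.0 / 3.0] * 3
print("GMV weights:", w)
print("GMV risk:", portfolio_variance(w, sigma))
print("equal-weight risk:", portfolio_variance(equal, sigma))
```

Solving the linear system Σz = 1 avoids forming Σ⁻¹ explicitly. The sketch also shows why errors in the estimated volatility matrix pass directly into the weights: the weights are a function of the estimated matrix alone.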

Global Minimum Variance Portfolio

Following Jagannathan and Ma (2003) and Fan, Zhang and Yu (2012), we consider the following risk optimization with two different constraints:

$$\min_w \; w^T \Sigma w, \quad \text{s.t. } \|w\|_1 \le c \text{ and } w^T \mathbf{1} = 1, \tag{3}$$

where $c$ is the gross exposure parameter, which specifies the total exposure allowed in the portfolio. Here we consider two cases:

- $c = 1$ corresponds to the no short sale restriction.
- $c = \infty$ is the global minimum risk portfolio without any short sale constraint.

Other cases with varying $c$ can be easily generalized in our methodology.
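The role of the gross exposure parameter can be checked with a short Python sketch. When the weights sum to one, the total short position equals (‖w‖₁ − 1)/2, so c = 1 leaves no room for short positions. The function names and the two weight vectors below are hypothetical illustrations.

```python
def gross_exposure(w):
    """L1 norm of the weight vector, the quantity bounded by c in (3)."""
    return sum(abs(x) for x in w)


def short_fraction(w):
    """Total short position; equals (||w||_1 - 1) / 2 when the weights sum to one."""
    return sum(-x for x in w if x < 0)


def is_feasible(w, c, tol=1e-12):
    """Check both constraints in (3): w^T 1 = 1 and ||w||_1 <= c."""
    return abs(sum(w) - 1.0) <= tol and gross_exposure(w) <= c + tol


# Hypothetical weight vectors (illustrative numbers only).
long_only = [0.5, 0.3, 0.2]
long_short = [0.8, 0.5, -0.3]

print(gross_exposure(long_only), is_feasible(long_only, 1.0))
print(gross_exposure(long_short), is_feasible(long_short, 1.0))
print(is_feasible(long_short, float("inf")))
```

The long-short vector violates the constraint for c = 1 but is admissible for c = ∞, matching the two cases considered above.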

ARVM Estimator

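Let $\tau = \{\tau_r, r = 1, \cdots, m\}$ be the pre-determined sampling frequency. For asset $i$, define previous-tick times

$$\tau_{i,r} = \max\{t_{i\ell} \le \tau_r, \; \ell = 1, \cdots, n_i\}, \quad r = 1, \cdots, m.$$

Based on $\tau$ we define the realized co-volatility between assets $i_1$ and $i_2$ by

$$\hat{\Sigma}_y(1, \tau)[i_1, i_2] = \sum_{r=1}^{m} \left[ Y_{i_1}(\tau_{i_1,r}) - Y_{i_1}(\tau_{i_1,r-1}) \right] \left[ Y_{i_2}(\tau_{i_2,r}) - Y_{i_2}(\tau_{i_2,r-1}) \right], \tag{4}$$

and the realized volatility matrix by

$$\hat{\Sigma}(1, \tau) = \left( \hat{\Sigma}_y(1, \tau)[i_1, i_2] \right). \tag{5}$$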
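The previous-tick construction in (4) and (5) can be sketched in a few lines of Python. This is a minimal illustration for a single sampling grid, not the averaged ARVM estimator itself. It assumes that Y denotes the observed log prices, that each asset has an observation at or before the first grid point, and that the grid passed in includes the starting point τ₀; the function names and the tick data are hypothetical.

```python
from bisect import bisect_right


def previous_tick_prices(times, prices, grid):
    """For each grid time tau_r, return the last observed price at or before tau_r."""
    out = []
    for tau in grid:
        idx = bisect_right(times, tau) - 1  # index of max{t_il <= tau_r}
        if idx < 0:
            raise ValueError("no observation at or before grid time %r" % tau)
        out.append(prices[idx])
    return out


def realized_covolatility(times1, y1, times2, y2, grid):
    """Sum over the grid of products of previous-tick log-price increments, as in (4)."""
    p1 = previous_tick_prices(times1, y1, grid)
    p2 = previous_tick_prices(times2, y2, grid)
    return sum(
        (p1[r] - p1[r - 1]) * (p2[r] - p2[r - 1]) for r in range(1, len(grid))
    )


def realized_volatility_matrix(assets, grid):
    """Matrix of pairwise realized co-volatilities, as in (5).

    `assets` is a list of (times, log_prices) pairs, one per asset.
    """
    p = len(assets)
    return [
        [realized_covolatility(*assets[i], *assets[j], grid) for j in range(p)]
        for i in range(p)
    ]


# Hypothetical non-synchronous tick data (illustrative numbers only).
asset1 = ([0.0, 0.1, 0.35, 0.6, 0.9], [4.600, 4.602, 4.601, 4.605, 4.603])
asset2 = ([0.0, 0.2, 0.5, 0.8], [3.900, 3.903, 3.901, 3.904])
grid = [0.0, 0.25, 0.5, 0.75, 1.0]  # tau_0, ..., tau_m with m = 4

for row in realized_volatility_matrix([asset1, asset2], grid):
    print(row)
```

The two assets trade at different times, yet both are aligned to the same grid by carrying the last observed price forward, which is how the previous-tick scheme handles non-synchronous trading.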

ARVM-SCAD Estimator

With the estimated volatility matrix (ARVM) $\hat{\Sigma}$, we define the ARVM-SCAD estimator as follows:

- Consider penalized estimation of the covariance matrix $\Sigma$ and its inverse, the precision matrix $\Omega = \Sigma^{-1}$. Denote their $(i, j)$-elements by $\sigma_{ij}$ and $\omega_{ij}$, respectively.
- Apply the SCAD penalty $p_\lambda(\cdot)$ to obtain a penalized estimator by solving the following optimization problem:

$$\min_{\Omega} \; -\log|\Omega| + \mathrm{tr}(\hat{\Sigma}\Omega) + \sum_{i \ne j} p_\lambda(\omega_{ij}). \tag{6}$$

- Note that (6) is not a convex program. We use the local linear approximation algorithm. At the end of the $t$-th step, denote the solution by $\Omega^{(t)} = (\omega^{(t)}_{ij})$. By the local linear approximation, at the next step we solve the following optimization problem

$$\min_{\Omega} \; -\log|\Omega| + \mathrm{tr}(\hat{\Sigma}\Omega) + \sum_{i \ne j} p'_\lambda(|\omega^{(t)}_{ij}|) \, |\omega_{ij}| \tag{7}$$

and denote its solution by $\Omega^{(t+1)} = (\omega^{(t+1)}_{ij})$. We repeat this step until convergence.
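The weights that appear in (7) can be computed with a short Python sketch. It assumes the usual form of the SCAD derivative with shape parameter a = 3.7, a common default that the slides do not state. The sketch covers only the re-weighting part of one local linear approximation step, not the solver for the weighted problem itself; the function names and the example matrix are hypothetical.

```python
def scad_derivative(theta, lam, a=3.7):
    """Derivative p'_lambda(theta) of the SCAD penalty for theta >= 0.

    Equals lam for theta <= lam, decays linearly to zero on (lam, a * lam],
    and is zero beyond a * lam.
    """
    if theta < 0:
        raise ValueError("theta must be non-negative; pass |omega_ij|")
    if theta <= lam:
        return lam
    return max(a * lam - theta, 0.0) / (a - 1.0)


def lla_weights(omega, lam, a=3.7):
    """Per-entry L1 weights p'_lambda(|omega_ij^(t)|) used in the next step (7).

    Diagonal entries get weight 0 because the sum in (7) runs over i != j.
    """
    p = len(omega)
    return [
        [0.0 if i == j else scad_derivative(abs(omega[i][j]), lam, a) for j in range(p)]
        for i in range(p)
    ]


# Hypothetical current iterate Omega^(t) (illustrative numbers only).
omega_t = [
    [2.00, 0.05, 0.90],
    [0.05, 1.50, 0.30],
    [0.90, 0.30, 1.80],
]
for row in lla_weights(omega_t, lam=0.2):
    print(row)
```

Small off-diagonal entries keep the full penalty weight λ and are pushed toward zero, while entries larger than aλ receive zero weight and are left unpenalized. Each step (7) is therefore a weighted L1-penalized problem, which is convex even though (6) is not.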

Asymptotic Theory

Theorem 1

Under some regularity conditions, if $\max\{|p'_{\lambda_n}(\theta_{j0})| : \theta_{j0} \ne 0\} \to 0$, then there exists a local maximizer $\hat{\theta}$ of $Q(\theta)$ such that $\|\hat{\theta} - \theta_0\| = O_P(e_n + b_n)$, where $a_n = \max\{|p'_{\lambda_n}(\theta_{j0})| : \theta_{j0} \ne 0\}$, $b_n = d a_n$, and we suppose $b_n \to 0$ as $\lambda_n \to 0$. Here $e_n \sim n^{-1/6}$ for the case with microstructure noise and $e_n \sim n^{-1/3}$ for the noiseless case.

Theorem 2

Under some regularity conditions, if $n^{-1}/(e_n \lambda_n) \to 0$ as $n \to \infty$, and

$$\liminf_{n \to \infty} \; \liminf_{\theta \to 0^+} \; p'_{\lambda_n}(\theta)/\lambda_n > 0,$$

then our estimator in Theorem 1 satisfies

$$P\left(\hat{\theta}_2 = 0\right) \to 1, \quad \text{as } n \to \infty,$$

where $e_n$ follows the same rate as in Theorem 1.


Numerical Studies

Simulation Model

Assume the true log price $X(t)$ of $p$ assets follows the diffusion model

$$dX(t) = \sigma_t^T \, dW_t, \quad t \in [0, 1],$$

where we take $\sigma_t$ as a Cholesky decomposition of $\gamma(t) = \sigma_t \sigma_t^T = (\gamma_{ij}(t))_{1 \le i, j \le p}$.

The diagonal elements of $\gamma(t)$ are generated from four common stochastic volatility models with leverage effect:

- Geometric Ornstein-Uhlenbeck process
- Sum of two CIR processes
- The volatility process in Nelson's GARCH diffusion limit model
- Two-factor log-linear stochastic volatility process
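The mechanics of simulating from a Cholesky factor can be sketched in Python. This is a deliberately simplified illustration: it holds γ constant over [0, 1] instead of drawing its diagonal from the stochastic volatility models listed above, and it uses a plain Euler scheme. The function names, the 2-asset matrix, and the step count are hypothetical.

```python
import math
import random


def cholesky_lower(gamma):
    """Lower-triangular factor low with low * low^T = gamma (gamma symmetric positive definite)."""
    p = len(gamma)
    low = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                d = gamma[i][i] - s
                if d <= 0.0:
                    raise ValueError("gamma is not positive definite")
                low[i][j] = math.sqrt(d)
            else:
                low[i][j] = (gamma[i][j] - s) / low[j][j]
    return low


def simulate_log_prices(gamma, n_steps, seed=0):
    """Euler scheme on [0, 1] for a driftless diffusion with constant gamma."""
    rng = random.Random(seed)
    p = len(gamma)
    low = cholesky_lower(gamma)
    dt = 1.0 / n_steps
    x = [0.0] * p
    path = [list(x)]
    for _ in range(n_steps):
        dw = [rng.gauss(0.0, math.sqrt(dt)) for _ in range(p)]
        # The increment low * dW has covariance gamma * dt.
        x = [x[i] + sum(low[i][k] * dw[k] for k in range(i + 1)) for i in range(p)]
        path.append(list(x))
    return path


# Hypothetical constant spot covariance for two assets (illustrative numbers only).
gamma = [[0.04, 0.012], [0.012, 0.09]]
path = simulate_log_prices(gamma, n_steps=5000, seed=1)
rv = sum((path[r][0] - path[r - 1][0]) ** 2 for r in range(1, len(path)))
print("realized variance of asset 1:", rv, "target:", gamma[0][0])
```

With a constant γ, the sum of squared increments over [0, 1] should land near the corresponding diagonal entry of γ, which gives a quick sanity check on the simulated path before noise and non-synchronous sampling are layered on.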

Parallel R

While most features in R are implemented as single-threaded processes, efforts have been made over the past decade to enable parallelism in R. Parallel package development coincides with technology advances in parallel system development. Packages for computing clusters:

- Rmpi
- rparallel
- Snow

Intel MKL

R enables linking to other shared mathematics libraries to speed up many basic computation tasks. One option for linear algebra computation is to use the Intel Math Kernel Library (MKL). MKL includes a wealth of routines (e.g., BLAS and LAPACK) that accelerate application performance and reduce development time, such as highly vectorized and threaded linear algebra, fast Fourier transforms (FFT), and vector math and statistics functions. Furthermore, MKL has been optimized to utilize the multiple processing cores, wider vector units and more varied architectures available in a high-end system. Unlike parallel packages, MKL provides parallelism transparently and speeds up programs that use supported math routines without any code changes. It has been reported that compiling R with MKL can provide a threefold improvement out of the box.

Offloading to Phi Coprocessor

The basis of the Xeon Phi is a light-weight x86 core with in-order instruction processing, coupled with heavy-weight 512-bit SIMD registers and instructions. With these two features the Xeon Phi die can support 60+ cores, and can execute 8-wide double precision (DP) vector instructions. The core count and vector lengths are basic extensions of an x86 processor, and allow the same programming paradigms (serial, threaded and vector) used on other Xeon (E5) processors. Unlike the GPGPU accelerator model, the same program code can be used efficiently on the host and the coprocessor. Also, the same Intel compilers, tools, libraries, etc. that you use on Intel and AMD systems are available for the Xeon Phi. R with MKL can utilize both the CPU and the Xeon Phi coprocessor. In this model, R is compiled and built with MKL. Offloading to the Xeon Phi can be enabled by setting environment variables, as opposed to making modifications to existing R programs.

# enable MKL MIC automatic offloading (1 = on)
export MKL_MIC_ENABLE=1

# work division between host and MIC, each from 0.0 to 1.0
export MKL_HOST_WORKDIVISION=0.3
export MKL_MIC_WORKDIVISION=0.7

# make the offload report detailed enough to be visible
export OFFLOAD_REPORT=2

# set the number of threads on the host
export OMP_NUM_THREADS=16
export MKL_NUM_THREADS=16

# set the number of threads on the MIC
export MIC_OMP_NUM_THREADS=240
export MIC_MKL_NUM_THREADS=240

Figure: Configuring environment variables to enable automatic offloading to the Intel Xeon Phi coprocessor. In this sample script, 70% of the computation is offloaded to the Phi, while only 30% is done on the host.

Simulation Results

[Plot omitted: two panels under high noise, L1 norm and L2 norm versus sparsity level (1 to 6); legend: ARVM, TSRV, SCAD]

Figure: Risk profile with high noise level

Dow 30 Portfolio

We applied our methodology to a portfolio consisting of the 30 Dow Jones Industrial Average (DJIA) constituent stocks. The purpose of our empirical study is twofold: to demonstrate the applicability of our approach to a real high-frequency financial data set, and to provide some insights into regularization in portfolio allocation using high-frequency data.

                 Mean    Median  SD (%)
ARVM             0.084   0.094   4.659
ARVM (no short)  0.207   0.153   4.019
SCAD             0.101   0.133   4.603
SCAD (no short)  0.212   0.165   4.011

Table: Portfolio performance based on the Sharpe ratio

Summary

Summary

- Large portfolio allocation is very challenging due to the complexity of the problem.
- Volatility matrix modeling and estimation using high-frequency data pose additional difficulties.
- We proposed a new methodology to perform portfolio allocation that is based on the regularized version of the estimated integrated volatility matrix.
- Theoretical and numerical studies indicate that the methodology works effectively.
- We exploit a variety of HPC techniques, including parallel R, Intel Math Kernel Library, and automatic offloading to the Intel Xeon Phi coprocessor in particular, to speed up the simulation and optimization procedures in our statistical investigations.

Thank you!
