Journal of Econometrics Volume 173 Issue 2 2013 [Doi 10.1016%2Fj.jeconom.2012.12.001] Okhrin, Ostap;...

download Journal of Econometrics Volume 173 Issue 2 2013 [Doi 10.1016%2Fj.jeconom.2012.12.001] Okhrin, Ostap; Okhrin, Yarema; Schmid, Wolfgang -- On the Structure and Estimation of Hierarchical

of 16

description

On the structure and estimation of hierarchical Archimedean copula

Transcript of Journal of Econometrics Volume 173 Issue 2 2013 [Doi 10.1016%2Fj.jeconom.2012.12.001] Okhrin, Ostap;...

  • 5/23/2018 Journal of Econometrics Volume 173 Issue 2 2013 [Doi 10.1016%2Fj.jeconom.2012.12.001] Okhrin, Ostap; Okhrin, Yarema; Schmid, Wolfgang

    http:///reader/full/journal-of-econometrics-volume-173-issue-2-2013-doi-1010162jeconom201212001 1/16

    Journal of Econometrics 173 (2013) 189204

    Contents lists available atSciVerse ScienceDirect

    Journal of Econometrics

    journal homepage:www.elsevier.com/locate/jeconom

    On the structure and estimation of hierarchical Archimedean copulas

    Ostap Okhrin a, Yarema Okhrin b,, Wolfgang Schmid ca Ladislaus von Bortkiewicz Chair of Statistics, C.A.S.E - Center for Applied Statistics and Economics, Humboldt-Universitt zu Berlin, D-10099 Berlin, Germanyb Department of Statistics, University of Augsburg, D-86135 Augsburg, Germanyc Department of Statistics, European University Viadrina, D-15230 Frankfurt (Oder), Germany

    a r t i c l e i n f o

    Article history:Received 2 November 2007

    Received in revised form

    28 November 2012

    Accepted 6 December 2012

    Available online 20 December 2012

    Keywords:

    Hierarchical Archimedean copula

    Multivariate distributionDensity estimation

    Asymptotic theory

    a b s t r a c t

    In this paper we provide a method for estimating multivariate distributions defined through hierarchicaArchimedean copulas. In general, the true structure of the hierarchy is unknown, but we develop acomputationally efficient technique to determine it from the data. For this purpose we introduce ahierarchical estimation procedure for the parameters and provide an asymptotic analysis. We consideboth parametric and nonparametric estimation of the marginal distributions. A simulation study and anempiricalapplication showthe effectivenessof the groupingprocedurein the sense of structureselection

    2012 Elsevier B.V. All rights reserved

    1. Introduction

    The modelling and estimation of multivariate distributions isone of the most critical issues in financial and economic appli-cations. The distributions are usually restricted to the class ofmultivariate elliptical distributions. The main drawback of thisprocedure is the small number of alternative distributions and thelarge number of parameters for a given family. The seminal resultofSklar(1959)provides a partial solution to this problem. It al-lows separation of the marginal distributions from the dependencestructures between the randomvariables. Since thetheory of mod-elling and estimation of univariate distributions is well established,compared to the multivariate case, the initial problem reduces tomodelling the dependence by copulas.

    Hence the theoretical properties and application of copulashave recently attracted much attention in the academic literature.The semiparametric estimation of copula functions and parame-ters is discussed by Chen and Fan (2006), Chen et al. (2006), Genestet al. (1995),Joe (2005), Wang and Wells (2000), etc. Nonparamet-ric estimation is discussed byFermanian and Scaillet(2003),Chenand Huang(2007), etc. Goodness-of-fit tests are developed in thepapers of Chen and Fan (2005), Chen et al. (2004), Fermanian(2005), Genest et al.(2006),and Genestand Rmillard(2008).Ade-tailedreviewand discussion of Archimedean copulas is given inJoe

    Corresponding author. Tel.: +49 17622041901.E-mail addresses:[email protected],[email protected]

    (Y. Okhrin).

    (1997) andNelsen(2006).Genest and Rivest(1993) derive impor-

    tant properties and provide an estimation technique. An in-depthdiscussion of simulation methodologies for Archimedean copulasis provided in Whelan (2004) andMcNeil (2008). Copulas havefound an increasing number of applications, especially in financeAn application to risk management is discussed by Embrechts et al(2002)andJunker and May (2005). Portfolioselection problems areconsidered byHennessy and Lapan (2002) andPatton (2004). The-oretical foundations of using copulas for GARCH models and applications are provided byChen and Fan(2005).Lee and Long(2009andGiacomini et al.(2009)consider time varying copulas.

    In principle, in all the mentioned papers, the authors considereither elliptical or Archimedean copulas. However, the number ofparameters by elliptical copulas increases quadratically and thepostulated dependence is symmetric. The simple Archimedean

    copulas put too much structure on the dependence between thevariables. One possibility to generalize the simple bivariate copulasto multivariate framework offer the vine-copulas, suggestedby Bedford and Cooke (2002) and popularized by Aas et al(2009). The vine copulas allow cascade-type decomposition of amultivariate distribution into a product of conditional bivariatecopulas, graphically shown as a sequence of trees. The model isextremely flexible and special types of vines, likeC-,D- orR-vineswere suggested to reduce the number of potential alternativemodels. Acar et al. (2012) discuss generalization of vines usingnonparametric smoothing to overcome the potentially misleadingresults due to simplifying assumptions. The key advantage of thevines is that at each node an arbitrary bivariate copula can beused to model the conditional dependence. This implies that, on

    0304-4076/$ see front matter 2012 Elsevier B.V. All rights reserved.doi:10.1016/j.jeconom.2012.12.001

    http://dx.doi.org/10.1016/j.jeconom.2012.12.001http://www.elsevier.com/locate/jeconomhttp://www.elsevier.com/locate/jeconommailto:[email protected]:[email protected]:[email protected]://dx.doi.org/10.1016/j.jeconom.2012.12.001http://dx.doi.org/10.1016/j.jeconom.2012.12.001mailto:[email protected]:[email protected]://www.elsevier.com/locate/jeconomhttp://www.elsevier.com/locate/jeconomhttp://dx.doi.org/10.1016/j.jeconom.2012.12.001
  • 5/23/2018 Journal of Econometrics Volume 173 Issue 2 2013 [Doi 10.1016%2Fj.jeconom.2012.12.001] Okhrin, Ostap; Okhrin, Yarema; Schmid, Wolfgang

    http:///reader/full/journal-of-econometrics-volume-173-issue-2-2013-doi-1010162jeconom201212001 2/16

    190 O. Okhrin et al. / Journal of Econometrics 173 (2013) 189204

    the one hand, that vines can be fitted in very high dimensions;while, on the other hand, the number of potential candidatemodels increases dramatically. Another class of multivariatemodel copulas constitute the skew-elliptical copulas popularizedbySmith et al.(2012).

    In this paper we advocate the Hierarchical Archimedean cop-ulas (HAC), which allow a more flexible and intuitive depen-

    dence structure compared with the simple Archimedean copulas,and need a smaller number of parameters compared to ellipticalcopulas.Joe (1997) introduced this class of copulas, while Whe-lan(2004) andMcNeil(2008) provide simulation techniques forHAC.Savu and Trede (2008)andSavu and Trede(2010)discussgoodness-of-fit tests and estimations of HACs with a known struc-ture.Okhrin et al.(forthcoming) provide detailed statistical prop-erties, like tail dependence, distribution of the copula function, etc.

    Despite their advantages, little research has been done ondetermining the best grouping of the variables in hierarchicalcopulas, nor on their estimation and simulation. In a few empiricalpapers, such asSavu and Trede(2008) andSavu and Trede(2010),the structure of the copula was motivated economically. Thisapproaches fails in higher dimensions or for very heterogeneousdata sets. This paper attempts to fill this gap and provides amethod for estimating the structure of an HAC. This can be seenas a part of copula selection problem. However, we concentratenot on the empirical comparison of alternative copulas, but onthe estimation. We propose several approaches to determine theoptimal aggregation for the next step at each level of the hierarchy.The first approach is based on a dimension free goodness-of-fittest. The second approach is based on binary trees and aggregatedbinary trees.

    The subset of variables with the strongest fit are joined withthe chosen copula function. We estimate the parameters bya multi-stage maximum-likelihood procedure and provide therelevant asymptotic theory. In the simulation study, this methodof grouping dominates the goodness-of-fit approach.

    This paper is structured as follows. Section2 contains a shortreview of copulas. The main results of the present paper concernthe determination of the structure and the estimation of copulas.These are presented in Section3.Section4 provides a simulationstudy and empirical applications of the methods developed.Section5concludes and all proofs are given in the Appendix.

    2. Theoretical background: Archimedean and hierarchical

    Archimedean copulas

    For an arbitrary k-dimensional continuous distribution functionF, the associated copula is unique and defined as the continuousfunctionC: [0, 1]k [0, 1] such thatF(x1, . . . ,xk)= C(F1(x1),. . . , Fk(xk)),x1, . . . ,xk R,where F1(x1) , . . . , Fk(xk)are the re-spective continuous marginal distributions.To formalize the definition of the new class of copulas, let usintroduce two classes of functions. The first is the family of strictlydecreasing differentiable functions

    L = {: [0, ) [0, 1] | (0)=1, ()=0,(1)j(j) 0, j=1, . . . , }.

    This family is also known as the class of Laplace transforms ofstrictly positive random variables. The second class L comprisesfunctions with similar properties

    L= {: [0, ) [0, )|(0)=0, ()= ,(1)j1(j) 0, j=1, . . . , }.

    The functionC

    : [0, 1

    ]k

    [0, 1

    ]defined as

    C(u1, . . . , uk)= (1(u1) + + 1(uk))

    Fig. 1. Examples of HACk=4.

    is called a k-dimensionalArchimedean copulaif L. The func-tion is referred to as the generator of the copula. McNeil andNelehov(2009) give less restrictive conditions on the generator,whichisrequiredtobek-monotone, i.e., k2 timesdifferentiablewith(1)i(i)(x) 0,i= 0, . . . , k2 for anyx [0, )andwith (1)k2(k2)(x) being nondecreasing and convex on [0, ).However, here we restrict ourselves to the completely monotonecase. In general the generator depends on a vector of parame-ters .

    A simple multivariate Archimedean copula implies that thevariables are exchangeable. This means that the distribution of

    (u1, . . . , uk)is the same as that of(uj1 , . . . , ujk )for allj =jv . Thisis rarely a feasible assumption in practical applications. A muchmore flexible method is provided by hierarchical Archimedeancopulas (HACs). In the special case of fully nested copulas, thecopula function is given by

    C(u1, . . . , uk)= k1(1k1

    k2[. . . ( 12 1[11 (u1)+ 11 (u2)] + 12 (u3))+ + 1k2(uk1)]

    + 1k1(uk)). (1)This is illustrated together with a partially nested copula inFig. 1.

    To guarantee that C is a multivariate distribution function weassume that 1ki kj L, i < j, seeSchweizer and Sklar(1983). Unlike the usual AC, HACs define the whole dependence

    structure in a recursive way. At the lowest level of a fully nestedcopula, the dependence between the first two variables is mod-elled by a copula function with generator1, i.e.,z1=C(u1, u2)=1[11 (u1) + 11 (u2)]. At the second level, another copula func-tion is used to model the dependence between z1andu3, etc. Notethat generatorsi can come from the same family and differ onlythrough the parameter or, to introduce more flexibility, can comefrom different generator families. Fully nested HACs were alsodiscussed byJoe (1997),Whelan(2004),Savu and Trede(2010),andEmbrechts et al.(2002).

    In contrary to other copula classes, like elliptical or vines, thedensity of theHAC cannotbe given explicitly andshould be derivedfrom thecopula function by taking derivatives.It can be potentiallya tedious task, but should not be seen as a drawback. Note that

    the generator functions are usually relatively simple functionsallowing us to compute analytic (see Hofert and Pham(0000)) ornumeric derivatives even in high dimensions. Furthermore, due tonumericalintegrationit is commonlymore difficultto compute thedistribution function using the density function than vice versa.

    HACs have a list of interesting properties. First, since the struc-ture is hierarchical, at each node of the tree we have a marginaldistribution which is also given by an HAC. Taking into account thestructureof the copulafunctionand theproperties of thegeneratorfunctions, we seethat this also holds forany subsetof thevariables.This implies that the HAC family is closed under taking multivari-ate margins. For example, by setting uk= 1 in(1) and using thefact that1k1(1)=0 we obtainC(u1, . . . , uk

    1, 1)

    =k

    2

    [. . . ( 1

    2 1

    [1

    1 (u1)

    +1

    1 (u2)

    ]+ 12 (u3)) + + 1k2(uk1)].

  • 5/23/2018 Journal of Econometrics Volume 173 Issue 2 2013 [Doi 10.1016%2Fj.jeconom.2012.12.001] Okhrin, Ostap; Okhrin, Yarema; Schmid, Wolfgang

    http:///reader/full/journal-of-econometrics-volume-173-issue-2-2013-doi-1010162jeconom201212001 3/16

    O. Okhrin et al. / Journal of Econometrics 173 (2013) 189204 191

    Thus the marginal distribution of the firstk 1 variables is also anHAC.

    Second, if we consider copulas based on a single generatorfunction, for most cases the copula parameter should increasefrom the lowest to the highest level. For a fully nested copula thiscorresponds to 1 > 2 > > k1. This follows from thecondition that 1ki kj L, i < j. In the case of an HACwith different generator functions, the bounds on the parametersshould be determined individually (seeHofert(2012)). Third, theflexibility of the structure and the fact that there are at mostk 1 parameters make it a very flexible, but at the same timeparsimonious, distributionmodel. From this point of view, the HACapproach can be seen as a flexible alternative to Gaussian modelsand to clustering techniques applied to the correlation matrix,seeMantegna(1999) andTola et al. (2007). Correlation selectiontechniques for Gaussian copulas were also elaborated byPitt et al.(2006) within a Bayesian framework for multivariate regression.

    The recursive structure of an HAC may be particularly usefulin modelling the time-dependence of a univariate time series.In contrary to the cross-sectional dependence, the time orderingimposes a natural structure on the hierarchy and allows forinteresting alternatives to autoregressive models. The advantagesof vines for modelling longitudinal data are discussed in Smithet al.(2010). These problems go, however, far beyond the issuesaddressed in this paper.

    Despite its importance and flexibility, no studies are availableon the estimation of the structure of HACs. Theoretically, for givenkand we can enumerate all possible structures and choose thestructure with the best fit. The number of different structures isequal to the number of phylogenetic trees with k nodes. For k=10this exceeds 2.8108 and simple enumeration is unrealistic. Ouraim is to develop a technique for determining the structure of anHAC leading to the best fit of the copula-based distribution to agiven data set. We follow in this paper frequentist approach, whilealternative procedures for determination of the structure can bedeveloped in Bayesian framework, similarly as it is suggested fordrawable vines bySmith et al.(2010)andMin and Czado(2010).

    3. Determination of the structure

    Let M= {j}j=1,...,Mdenote the finite set of known generatorfunctions under consideration. We assume that each function jdepends on a vector of parameters j. Let X1, . . . ,Xk denote thevector of iid random variables whose dependence structure wewant to determine and let X= {xji} be the respective samplesfor i=1, . . . , n, j=1, . . . , k. To simplify the presentation weintroduce some further notation. Let s denote the structure of theHAC. It is a sequence of reordered indices {1, . . . , k} grouped usingparentheses to mark the variables joined at a single node. Forexample, the structures of the copulas inFig. 1are ((12)3)4 and(12)(34) respectively. The quantity sjis the structureof subcopulaswithsk= s and s1 is the identity function. We assume that thevariablesXj for j= 1, . . . , kfollow arbitrary continuous marginaldistributions, which can be estimated either parametrically ornonparametrically. Further let thek-dimensional HAC be denotedby C(, ; s)(u1, . . . , uk), where denotes the set of generatingfunctions for all levels and the set of copula parameters. Forexample, the fully nested HAC in(1)can be expressed as

    C(, ; s=sk)(u1, . . . , uk)= C({1, . . . , k1}, {1, . . . , k1}; {(sk1)k})(u1, . . . , uk)= k1,k1 (1k1,k1 C({1, . . . , k2}, {1, . . . , k2};

    ((sk

    2)(k

    1)))(u1, . . . , uk

    1)

    +1k

    1,k

    1

    (uk)),

    wheres= {( . . . (12)3) . . . k}.

    Instead of enumerating and testing all possible structures weselect at each level j of the hierarchy the best subset of variablesIj to be grouped. For this purpose we estimate for some subsetsof the variables the copula parameters and choose the subset with

    the best fit. We denote this subset byIj and the optimal generator

    function and its parameters at leveljbyjand jrespectively. Theestimation of theparameters is discussed in Section 4. The selected

    subset is then used to define the pseudo-variables C(j,j; (Ij

    ))which are treated as if variables in the usual sense at furtherlevelsThe properties of the pseudo-variables were discussed in the case

    of a bivariate Archimedean copula byGenest and Rivest(1993).

    The estimation of the copula for a 4-dimensional data set is

    illustrated in the case of a single generator function in Diagram 1

    At the first step we estimate the copula parameters for all subsetsof the variables and choose the best set to be grouped. Assume it

    is the couple(X1,X3). At the next level of the hierarchy we define

    the sub-copula Z13=13 (113 (F1(X1)) + 113

    (F3(X3))) and keep

    it fixed at the subsequent steps. At the next level we estimate

    the parameters of all subsets of X2, X4, Z(13) and assume thatthe best fit is exhibited by Z(13) and X4. Therefore, we define the

    next-level sub-copula asZ((13)4)=

    C(2

    , 2;

    Z(13)

    ,F

    4(X

    4))and it is

    joined at the highest level of the hierarchy withX2. The estimated

    structure is then s=(((13)4)2). If we restrict ourselves only tobinary trees, then we consider only the bivariate subsets listedabove the horizontal lines.

    In the general case the algorithm can be sketched as follows:

    Step1. Estimate the parameters jusing the ML technique for each

    set {Xi}iIkj , j=1, . . . , 2k k 1 and each M. In thenonparametric setup we first estimate the marginal distri-butions nonparametrically. Then the copula parametersare

    determined by solving(4)withp=1. For parametric mar-gins the estimatorjis obtained by solving(3)below.

    Step2. Determine the best subset of variablesI1

    to be grouped bythe given copula. Next,introduce a pseudo-variable defined

    byZ1=C(I1) and consider thesetZ1 {Xj}j{1,...,k}\I1of sizek2=k dim(I1) + 1.

    Step3. With the new set, proceed similarly as in Step 1 and con

    sider all subsets of Z1 {Xj}j{1,...,k}\I1 of the variables atlevel two. Next, for each subset estimate the parameters of

    the copula by solving(3)or(4)below withp=2.Step4. Determine the set of variables with the best copula fit at

    the second level. Let the obtained grouping be denoted byI2. It may contain the original variables as well as pseudo-

    variables obtained from grouping at lower levels. Then, in-

    troduce a new variable defined by Z2=C(I2)and considerthe setZ2 {Xj}j{1,...,k2}\I2 of sizek3=k2 dim(I

    2

    ) + 1.Step5. . . .continue untilkj=1.

  • 5/23/2018 Journal of Econometrics Volume 173 Issue 2 2013 [Doi 10.1016%2Fj.jeconom.2012.12.001] Okhrin, Ostap; Okhrin, Yarema; Schmid, Wolfgang

    http:///reader/full/journal-of-econometrics-volume-173-issue-2-2013-doi-1010162jeconom201212001 4/16

    192 O. Okhrin et al. / Journal of Econometrics 173 (2013) 189204

    Note that the pseudo-variables are only seen as functions ofthe original variables X1, . . . ,Xk and are not used explicitly inthe estimation procedure. This implies, that while computingthe copula density for the ML estimation, we have to take thederivatives with respect to the original variables X1, . . . ,Xk andnot with respect to the pseudo-variablesZ. Proceeding in the latterway produces incorrect estimates, as is shown in the simulationstudy. The correct procedure we call hereafter the recursivemaximum likelihood estimation.

    Two further importantissues arisefrom the proposed approach.First, the usual maximum likelihood estimation is hindered by thefact that the stochastic properties of estimators at higher levels de-pend on the fact that the parameters at lower levels are estimatedtoo. A detailed analysis of this issue is provided in Section4.Thesecond problem is the choice of the subset of variables which canbe modelled by a copula in the best way. In Section 4.2 we considerfollowing alternatives: goodness-of-fit, grouping based on binarycopulas, and parameter proximity. Although the overall procedureseems to be computationally intensive, the hierarchical structureof the copula leads to substantial numerical simplifications.

    The suggested algorithms resembles the procedure suggestedfor vines in(Dissmann et al. (0000)). In the case of vines the de-scribed methodology is formally based on analysing the condi-tional distributions and the set of potentialcandidates is enormous.In the case of HAC, however, we concentrate on the unconditionaldistributions, since the conditional structure is imposed by theform of the HAC. Also the strategy for selecting the optimal sub-set for grouping is theoretically unambiguous, while vines allow,due to their flexibility, a variety of approaches (see Dissmann et al.(0000) andAas et al.(2009)).

    4. Estimation issues

    In this section we discuss the technical aspects of the estima-tion procedure for the parameters of an HAC. LetH(x1, . . . ,xk) de-note the true distribution function on a measurablek-dimensionalEuclidean space and letgdenote its measurable RadonNikodmdensity. SinceHis unknown we specify a parametric family of dis-tribution functions F(x1, . . . ,xk, ) with densitiesf(x1, . . . ,xk, ).It is assumed that the densities are measurable in (x1, . . . ,xk)forevery from a compact subspace of an Euclideanspace andcontin-uous in for every(x1, . . . ,xk). The quasi-log-likelihood functionof the sample is defined by

    L(, X)= 1n

    ni=1

    logf(x1i, . . . ,xki, ).

    Then the quasi-ML estimator is obtained by maximizing L(, X)wrt to over a compact subspace of an Euclidean space. Thisestimator always exists under suitable regularity conditions, but

    is not necessarily unique.The linkage between H and F is crucial for the asymptotic

    properties of the estimator. IfF is correctly specified, i.e., if thereexists a vector 0 such thatF(x1, . . . ,xk, 0)= H(x1, . . . ,xk)forall (x1, . . . ,xk), then the estimator is consistent for0. Moreover,it is asymptotically normal and achieves asymptotically theCramerRao lower bound. However, if the model is misspecified,i.e.,Hdoes not belong to the family F, then the interpretation ofthe estimator and its properties is not straightforward. The vector is an estimator for which minimizes the KullbackLeiblerdivergence between the true and misspecified models defined by

    K(h,f, )=Eh{log[h(x1, . . . ,xk)/f(x1, . . . ,xk, )]},where the expectation is taken with respect to the true model. The

    KullbackLeibler divergence measures the information content ofthe misspecified new model relative to the true density function.

    AssumethatE[log h(x1, . . . ,xk)] exists and | logf(x1, . . . ,xk, )| m(x1, . . . ,xk) for all with m integrable with respect to H. IfKhas a unique minimum at, then converges almost surely to .Thus even if the model is misspecified the ML estimation still pro-vides a consistent estimator of the parameters, but the limit is notthe true modelH.

    This discussion is particularly relevant while estimating the

    HAC. The first form of misspecification arises if the family of thedistribution is specified incorrectly, for example HAC vs. Gaussianmodels. This is a common form of misspecification. The secondform is specific to HAC. It arises if we specify the HAC familycorrectly, but the structure is misspecified. Next we discuss theestimation of HAC paying particular attention to misspecificationissues.

    4.1. Estimation of HACs

    The properties and goodness of the fit of estimators of the

    copula parametersdepend on the estimatorsFj for j= 1, . . . , kof the marginal distributions.We distinguish between a parametricand a nonparametric specification of the margins. On the one

    hand, since we are interested in the estimator of the dependencestructure, the estimator of should be independent of anyparametric models for the margins. On the other hand, in practicalapplications usually parametric models for individual variables arepreferred, seeJoe (1997). To simplify the exposition we assumethat the margins are specified correctly. This can be justified bythe fact that we analyse the margins separately and determine themodel with the best fit independently of the copula.

    Letj denote the parameters of the j-th marginal distribution.The ML estimator is given by

    = arg min

    ni=1

    log

    c(, ; s)(F1(x1i,1) , . . . ,

    Fk(xki,k))k

    j=1fj(xji,j)

    , (2)

    wherec(, ; )is the corresponding copula density.If the margins are estimated nonparametrically, Genest et al.

    (1995) derive the asymptotic moments of the ML estimators andshow their asymptotic normality and consistency. In the caseof parametric specification of the marginal distribution, the MLestimation can be performed simultaneously for the parameters ofthemargins andof thecopula function or in a two-stageprocedure,seeJoe(1997).

    The ML estimation can only be applied to HACs with knownstructures. In this paper the structure is determined using a

    multistage ML procedure which can be sketched as follows. At thefirst stage we estimate the marginaldistributions. At thenext stagewe estimate the parameter of the copula at the first level by takingthe marginal distributions as known. At further stages the nextlevel copula parameter is estimated assuming that the margins aswell as the copula parameters at lower levels are known. In thenext sections we provide a detailed description of this procedureand the relevant asymptotic theory.

    Parametric margins

    Assume thatkvariables are joined withinp hierarchical levels,which means that ifs = ((123)(45)), then s = s3= ((s1)(s2))where s1, s2, s3correspond to the levels 1, 2 and 3 respectively. Let

    =(1, . . . ,k)denote the vector of parameters of the marginaldistributions and =(1, . . . , p)the parameters of the copulas,

  • 5/23/2018 Journal of Econometrics Volume 173 Issue 2 2013 [Doi 10.1016%2Fj.jeconom.2012.12.001] Okhrin, Ostap; Okhrin, Yarema; Schmid, Wolfgang

    http:///reader/full/journal-of-econometrics-volume-173-issue-2-2013-doi-1010162jeconom201212001 5/16

    O. Okhrin et al. / Journal of Econometrics 173 (2013) 189204 193

    starting with the lowest up to the highest level. The ML for theestimator of=(, )with parametric margins is given by

    L(, X)=n

    i=1log

    c(, ; s)(F1(x1i,1) , . . . ,

    Fk(xki,k))

    k

    j=1 fj(xji,j)=

    ni=1

    log c(, ; s)(F1(x1i,1) , . . . ,

    Fk(xki,k)) +k

    j=1logfj(xji,j)

    .

    Recall that the are not free parameters and, therefore, the MLprocedure requires constrained optimization

    =arg max

    L(, X) s.t. 1j i L, 0i < jp.

    The structure of the log-likelihood function allows to separatethe parameters of margins and of the copula in an additive way.

    The optimization problem can be solved using the maximizationby parts technique of Song et al. (2005). The authors suggesta recursive procedure. In the first step of the recursion weestimate the parameters of margins; in the second step the copulaparameters are estimated, by assuming the marginal parametersbeing constant. The procedure can formally be generalized andapplied to estimate the HACs. However, we concentrate on asingle loop of the recursion, since the major aim of the paper is acomputationally efficient estimator of the copula parameters andof the structure. Because of the constraints on the parameters,the estimation and, particularly, the asymptotic theory for theseestimators is very demanding. Multistage estimation eliminatesthese problems. We obtain the estimators by solving

    L11

    , . . . , Lkk

    , Lk+1

    1

    , . . . , Lk+p

    p

    =0, (3)where

    Lj=n

    i=1lj(Xi), forj=1, . . . , k +p,

    lj(Xi)=log fj(xji,j), forj=1, . . . , k, i=1, . . . , n,

    lj+k(Xi)= log

    c({, }=1,...,j; sj)

    {Fm(xmi,m)}msj msj

    fm(xmi,m)

    forj=1, . . . ,p, i=1, . . . , n.The firstk components in(3)correspond to the usual ML estima-tion of the parameters of the marginal distributions. The next pcomponents reflect the recursive estimation at each level of hier-archy, with each likelihood function depending on the copula pa-rameters at lower levels. This implies that the constraints on theparameters are fulfilled by construction.

    Let the solution of this system be denoted by. FollowingJoe(2005) we collect the derivatives of the likelihoods j on the LHSof (3):

    ni=1

    g(Xi, )=n

    i=1(g1(Xi, ) , . . . , g

    k+p(Xi, ))

    =n

    i=1 l1(Xi)

    1, . . . , lk(Xi)

    k, lk+1(Xi)

    1, . . . , lk+p(Xi)

    p .

    For r= 1, . . . , k+p denote byIrj= E gr(,)j forj=1, . . . , k andIr,k+j= E gr(,)j forj=1, . . . ,p. Note that I

    rj= 0 ifp = jk

    orj > r. The same holds ifj sr forjk andr > k. Furthermorelet Jrj = E[gr(, )gj(, )] E[gr(, )] E[gj(, )]. If both themargins and the copula are correctly specified, then Irr = Irrare the Fisher information matrices of the individual ML problemsLr

    r =0 forr

    =1, . . . , kand Lr

    r =0 forr

    = k

    +1, . . . , k

    +p

    Furthermore,Jrj= Jrj=E[gr(, )gj(, )].Theorem 1is a generalization of Proposition A.1 in Joe(2005

    which gives the asymptotic behaviour of the multi-stage estimatoboth in thecase of a correctlyspecified andof a misspecifiedcopula

    function.

    Theorem 1. (a) Suppose the model is misspecified, and let each

    marginal density fr for r= 1, . . . , k and the copula density at eachlevel of the hierarchy satisfy the regularity conditions A1A8 (see

    Appendix A). Then

    n() aN(0, B1(B1)), with =Var[g(X, )] = {Jrj}r,j=1,...,k+p and B = {Irj}r,j=1,...,k+p and minimizes the KullbackLeibler divergence between the HAC and the

    true model.

    (b) Now suppose the model is correctly specified and let eachmarginal density fr for r= 1, . . . , k and the copula density at eachlevel of the hierarchy satisfy the regularity conditions A1A4, A7, A9

    Then

    n( 0) aN(0, B1(B1)), where = {rj}r,j=1,...,k+p

    andB= {Brj}r,j=1,...,k+pwithrj= Jrj, for r,j=1, . . . , k,rj=0, for r,j=k + 1, . . . , k +p, r =j,rr= Irr, for r= k + 1, . . . , k +p,rj=0, for r=1, . . . , k,j=k + 1, . . . , k +p,Brr= Irr, for r= 1, . . . , k,Brj=0, for r,j=k + 1, . . . , k +p, r = j,Brj=0, for r,j=1, . . . , k, r = j,Brj=0, for r=1, . . . , k,j=k + 1, . . . , k +p,Brr= Irr, for r= k + 1, . . . , k +p,

    Brj=

    Irj, for r=k + 1, . . . , k +p, k + 1j < r

    if sjsr0, for r= k + 1, . . . , k +p, k + 1j < r

    if sj sr,,

    Brj=

    Irj, for r=k + 1, . . . , k +p,j=1, . . . , k

    if jsr0, for r= k + 1, . . . , k +p,j=1, . . . , k

    if j sr..

    Note that theregularity assumptions of the theorem arefulfilled byvirtually any common parametric HAC. The correct specification o

    the copula imposes interesting constraints on the structure of the

    covariance matrix and, therefore, on the asymptotic dependencebetween the marginal parameters and copula parameters a

    different levels. In particular, as follows from the form of the

    components of, we obtain that the gradients are asymptoticallyindependent. It holds that = {Jrj} with Jrj = 0 in general, but

    =

    J11 . . . J1k 0 . . . 0...

    . . ....

    .... . .

    ...Jk1 . . . Jkk 0 . . . 0

    0 . . . 0 Ik+1,k+1 . . . 0...

    .. .

    .

    .....

    .. .

    .

    ..0 . . . 0 0 . . . Ik+p,k+p

    .

  • 5/23/2018 Journal of Econometrics Volume 173 Issue 2 2013 [Doi 10.1016%2Fj.jeconom.2012.12.001] Okhrin, Ostap; Okhrin, Yarema; Schmid, Wolfgang

    http:///reader/full/journal-of-econometrics-volume-173-issue-2-2013-doi-1010162jeconom201212001 6/16

    194 O. Okhrin et al. / Journal of Econometrics 173 (2013) 189204

    Similarly

    B=

    J11 . . . 0 0 . . . 0...

    . . ....

    .... . .

    ...0 . . . Jkk 0 . . . 0

    Ik+1,k+1 . . . 0

    =0

    . . ....

    =0 Ik+p,k+p

    ,

    with B having the same structure with the elements Irj. In theapplications, the numerical discrepancy between misspecified andcorrectly specified models arises from two factors: the first factoris the setting of some components ofto zero; the second factoris the reliance on the information inequality for the individual MLproblems. This problem will be assessed in the simulation studybelow.

    Depending on the structure of the hierarchical copula, thederivatives needed to compute the asymptotic covariance matrixmay be difficult to obtain.Joe (1997) suggests a jackknife approachto obtain an approximation. In this case only the univariate andmultivariate likelihoods have to be programmed. This approach

    can also be extended to multi-stage estimation following thetechnique inJoe(1997).

    Nonparametric margins

    In order to estimate the marginal distributions in a nonpara-metric way we use the modification of the empirical distributionfamily

    Fj(x)= 1n + 1

    ni=1

    I(xjix), j=1, . . . , k,

    where I denotes the indicator function. Since the results estab-lished in this section do not depend on the explicit form ofthe nonparametric kernel, but on its convergence properties, anyother suitable estimator can be used as well. Let us denote F =(F1, . . . , Fk)andF=(F1, . . . ,Fk). The canonical ML estimatorofsolves the system by maximizing the pseudo log-likelihood withnonparametric margins

    L1

    1, . . . ,

    Lp

    p

    =0, (4)

    where

    Lj=n

    i=1lj(Xi)forj=1, . . . ,p,

    lj(Xi)=log

    c({, }=1,...,j; sj)

    {Fm(xmi)}msj

    msjfm(xmi)

    forj=1, . . . ,p,

    wherefj(x)is the kernel density estimator withj=1, . . . , k. As inthe parametric case, here also we show that the semiparametric

    estimator is asymptotically normal and find its asymptoticmoments. The asymptotic theory for nonparametric estimators isusually based on the rank theory, especially the rank statisticsconsidered byHjek and Sidk(1967)andBhuchongkul(1964).Here we closely follow the approach of Genest et al. (1995).The next theorem provides intermediate results used further inTheorem 3.

    Theorem 2. Let Cn(u)be a empirical copula

    Cn(u1, . . . , uk)= 1

    n

    ni=1

    kj=1

    I{Fj(xji)uj}

    andJ(u)be a continuous function on(0, 1)k Rp, wherep 1issuch that

    = E[J(F1(X1) , . . . , Fk(Xk))]= (E[J1(F1(X1) , . . . , Fk(Xk))], . . . , E[Jp(F1(X1) , . . . , Fk(Xk))])=

    J(F) dC(F).

    Consider the statistic

    Rn= 1n

    ni=1

    J(F1(x1i) , . . . ,Fk(xki))

    =

    1

    n

    ni=1

    J1(F1(x1i) , . . . ,Fk(xki) ) , . . . ,

    1

    n

    ni=1

    Jp(F1(x1i) , . . . ,Fk(xki))

    =

    . . .

    J(F) dCn(F).

    If Jp (u) Mkr=1h(ur)ar, for p= 1, . . . ,p, with ar= (1+)/pr, for all r= 1, . . . , k, h(u)= u(1u) , > 0, p, q >0,

    kr=1

    1pr

    =1, thenRn a.s.If br= (0.5+ )/pr, for all r= 1, . . . , k, andJ admits con-

    tinuous partial derivatives on(0, 1)k such that Jp (u)/ur Mkj=1,j=rh(uj)

    bj

    h(ur)

    br1 for p=1, . . . ,p, then n 12 (Rn )N(0,), where

    = Cov

    J(F) +k

    r=1

    I{Fr(xr)ur} ur

    J(u)

    urdC(u)

    .

    Theorem 3. Suppose the model is misspecified and suppose ljj

    and

    lj

    r

    j

    satisfy the conditions of theJ-function from Theorem 2. Then the

    estimatoris consistent andn

    12 ( ) dN(0, B1B1),

    where the components= {rj}r,j=1,...,pand B= {Brj}r,j=1,...,pcanbe determined from

    j = Cov

    lj(F1(x1) , . . . ,Fk(xk))j

    +k

    r=1 I{Fr(xr)ur} ur2lj(u)

    jurdC(u)

    for j=1, . . . ,p,rj=0 for r,j=1, . . . ,p,

    Brj= E

    2lj(x)

    r

    j

    for r,j=1, . . . ,p.

    The second part of the expression for the j arises due to theestimation of the marginal distributions. The correctly specifiedcase is obtained by taking the moments with respect to the truemeasures.

    Discussion of the estimation issues

    In general the classical single-step ML estimators are usuallyobtained as thesolutionof a systemof highlynon-linear equations.

  • 5/23/2018 Journal of Econometrics Volume 173 Issue 2 2013 [Doi 10.1016%2Fj.jeconom.2012.12.001] Okhrin, Ostap; Okhrin, Yarema; Schmid, Wolfgang

    http:///reader/full/journal-of-econometrics-volume-173-issue-2-2013-doi-1010162jeconom201212001 7/16

    O. Okhrin et al. / Journal of Econometrics 173 (2013) 189204 195

    This requires sophisticated numerical techniques to avoid localoptima. This is of special importance in problems with manyparameters. In the case of HAC we have an additional restrictionwhich reflects themonotonicity of theparametersfrom thebottomto the top of the hierarchy. This complicates even further theestimation and particularly the asymptotic theory.

    The multistage procedure considered here lacks these prob-

    lems, because the systems in(3) and (4) are solved equation byequation. In each equationL/j (or a system of equations if wehave a generator function with several parameters) the parame-

    ters estimated from previous equations (1, . . . , j1) are taken asknown quantities. This leads to a dramatic reduction in the com-putations and provides a more stable solution. Furthermore, themonotonicity of the parameters is automatically fulfilled and neednot be enforced in the optimization, nor taken into account in theasymptotic theory.

    After the grouping, the parameters of the copula can be ingeneral reestimated. Since the structure of the copula is fixed wecan use the single-step MLE to achieve efficiency. Nevertheless, itis still computationally demanding in higher dimensions due toseveral reasons. First, the copula density and the derivatives for

    the ML function have to be determined numerically. Second, themaximization is performed with respect to many parameters andsubject to the above mentioned monotonicity constrains. Third,because of the constrains the asymptotic theory is difficult toestablish. This diminishes the advantages of the reestimation.

    The estimation of the copula parameter can be performed us-ing Kendallsand the method of moments as proposed by Gen-est and Rivest (1993). This approach has several disadvantages.By estimating the from Kendalls , we cannot guarantee that1 > > k1 (or kikj L ( i < j) for copulas basedon different generator functions).Furthermore, the functional rela-tionship between and in thebivariate case is strongly nonlinearfor most of the popular generator functions. Thus minor changesinmay lead to strong changes inand vice versa. The method,

    however, works well with different generator functions and withmultiparameter copulas.

    4.1.1. Extension to SCOMDY models

    Let us assume that the processXtis formed by the innovationsof the normalized semiparametric copula-based multivariatedynamic models (SCOMDY) process Yt, i.e.,

    Yt= t(0) + 1/2t (0,1)Xt,where

    t(0)=E(Yt|Ft1),t(0,1)= diag{2jt(0,1)}j=1,...,k

    = diag{E[(Yjt jt(0))2

    |Ft1]}j=1,...,k.The SCOMDY parametrization is very flexible and allows for avariety of model specifications such as VAR and GARCH, seeChenand Fan(2006). The likelihood function ofYtis given by

    L({0,1, },Y)= 1

    2n

    ni=1

    log |2t(0,1)|

    + 1n

    ni=1

    logf(x1i, . . . ,xki, ).

    We maximize the likelihood function with respect to theparameters0, 1, the joint densityf, and the parameters . Sincethe aim of this paper is the estimation of HAC we assume for

    simplicity that t(0) and 1/2

    t (0,1) are correctly specified.To estimate the structural part we follow the approach ofChen

    and Fan(2006). The estimators of0 and 1 can be consistentlyestimated using a two-stage procedure as follows.

    0=arg min0

    1

    2n

    ni=1

    (Yi i(0))(Yii(0))

    ,

    1

    =arg min

    1 1

    2n

    n

    i=1k

    j=1 (Yji ji(0))2

    2

    ji (0,1)

    + log 2ji (0,1)

    .

    At the secondstage we estimate thedistribution of the innovationsXt given by an HAC. In general the estimation of the copulaparameters is asymptotically independent of the estimation of thestructural parameters0 and 1. This follows fromChen and Fan(2006, Proposition 3.2). For this reason we proceed further onlywith the estimation of the distribution ofXtgiven by an HAC.

    4.2. Criteria for grouping

    At each level of the hierarchy we decide which variables should

    be gathered into a copula at the next level. Next we discuss severapotential alternatives for grouping.

    4.2.1. Grouping based on goodness-of-fit tests

    The classical goodness-of-fit tests (GOF), for example, KolmogorovSmirnov or 2-type tests, cannot be applied directlyto copula functions, because copulas depend on the estimatedmarginal distributions. Furthermore, in most cases the distributionof the test statistic is not a standard one and depends on the di-mension. A dimension-free test is proposed in Chen et al.(2004)As argued byChen et al.(2004), the power and size of the test arecomparable with other more sophisticated tests. The test does notdepend explicitly on the type of the non-parametric estimator o

    the marginalsFi, but uses the order ofFj(xji)

    Fj(xji)as a function

    ofn. Thus we apply the test ofChen et al.(2004)as one of the cri-terion for grouping the variables at each level of the hierarchicacopula. The set of variables with the smallest test statistic is joinedby a copula at the next level.

    Note that the GOF method appears to lead frequently to mis-leading results due to the low power of the test against alternativestructures that are close to the true structure. Thus this methodof grouping has several drawbacks, however, we still apply in thenumerical studies for comparison purposes.

    4.2.2. Grouping based on binary structures

    A computationally simple approach is to determine a binarycopula. First we consider a single generator function. Let us denoteas

    {Iki

    }i=1,...,2

    k

    k1

    the subsets of the initial set of sizek, excludingthe empty set and single element sets. At each level of a binaryhierarchywe join thetwo variableswith thestrongest dependenceThis implies in most cases the couple with the highest value of thecopula parameter. More formally, the pairIj to be grouped at leve

    jis determined from

    Ij = arg maxIkj+1,i,|Ikj+1,i|=2

    (Ikj+1,i).

    By |Iki| we denotethe cardinality of theset Iki. This approach alwayleads to a feasible copula function with k 1 parameters. Notethat the procedure is not a specification test at each level of thehierarchy, but merely a deterministic procedure, which selects thecouple with the strongest fit. This procedure also guarantees thatthe copula parameters are monotone as argued above.

    If the true copula is not binary, the procedure leads to apotentially misspecified model. Despite a difference in structure

  • 5/23/2018 Journal of Econometrics Volume 173 Issue 2 2013 [Doi 10.1016%2Fj.jeconom.2012.12.001] Okhrin, Ostap; Okhrin, Yarema; Schmid, Wolfgang

    http:///reader/full/journal-of-econometrics-volume-173-issue-2-2013-doi-1010162jeconom201212001 8/16

    196 O. Okhrin et al. / Journal of Econometrics 173 (2013) 189204

    the difference in the distribution functions is in general minor.To illustrate this point, we consider the following binary HACC1(,(1, 2); ((12)3))(x1,x2,x3). If theparameters areclose thenthe dependence structure imposed by C1 is very close to thedependence structure imposed by C2(,1; (123))(x1,x2,x3). Thisproperty is referred to as associativity of Archimedean copulas, seeTheorem 4.1.5 ofNelsen(2006).

    This fact additionally provides us with tools for aggregation

    of binary copulas to general HACs. To join three variables into asingle copula the corresponding parameters of the binary copulasshould be close. We can measure the distance in two ways. Oneapproach uses the asymptotic test with H0: j= j+1, wherejandj+1are copula parameters at two subsequent levels. The testcan be implemented using the asymptotic results from Section4.Alternatively, to make the procedure similar to the tools of thecluster analysis, we may use the threshold approach and join thevariables if, for example,(1 2)/1 < .

    The aggregation of binary trees can be augmented by consider-ing the parameters of subcopulas of dimension higher than two. Togroup variables, we compare the parameters of-variate copulaswith the parameters of all possible 1-variate subcopulas. Weformalize this idea using the minimax principle. Let

    = minIki , |Iki|3 maxJIki,|J|=k1 | (Iki) (J)|and the minimum is achieved at the subset I. For each subset ofvariables we find the largest distance between its and thes ofits own subsets with a dimension ofk1. Then the subset with thesmallest maximal distance is taken as I. This procedure, however,does not allow of selecting a bivariate copula as the best grouping.To improve this we determinethe subsetof variablesto be groupedusing the criterion

    I1 =

    I, argmaxIki , |Iki|=2

    (Iki), > .

    If is larger than some predetermined constant , then the pa-rameters of the copulas with higher dimension are toofar from the

    parameters of the lower dimensional copulas. In this case we takethe couple with the largest copula parameter as the best grouping.If we choose small, then we prefer binary copulas, with only cou-ples of variables joined at each level. For larger values of, joiningseveral variablesat once becomes more probable.The actualchoiceof depends on the parameter domain of. We can either fix thepercentage deviation of the subset parameters or use confidenceintervals based on the asymptotic theory above.

    If the multivariate copula is constructed using different genera-torfunctions, we cannotdirectlyapplythe above methodof group-ing and of aggregating. However, at each level we can choose thecouple of variables to be joined for each generator function. The KLdivergenceis used to select the couple with the betterfit. Note thatif thegeneratorfunctionis differentat twosubsequent levels of thebinary copula, the corresponding variables cannot be joined.

    5. Simulation study

    The aim of this simulation study is the comparison of thediscussed grouping methods for an example of simulated data. Weassess the goodness-of-fit of the estimated models and the abilityto capture the dependence correctly. Furthermore, we assess theimpact of misspecification by analysing the fit of HAC to Gaussiandata.

    5.1. Comparison of the grouping techniques

    To compare the grouping methods we consider two differ-ent true structures s=(123)(45) and s=(12(34))5 with theGumbel generator function given by 1

    =(

    log(u)) and

    =exp{u1/}. The parameters for the first structure are set equal to123= 4, 45= 3, and(123)(45)= 2, and for the second structure,

    to34= 4, 12(34)= 3, and(12(34))5= 2. Without loss of general-ity the marginal distributions are taken to be uniform on [0, 1] andare not estimated. We simulate 1000 samples of size 500.

    For the simulation we use the method based on Laplace trans-forms suggested inMcNeil(2008) and further assessed inHofert(2011), Hofert (2012). To implement the ML maximization, weneed the corresponding density functions. These are determinedexplicitly by taking analytic derivatives of the copula using R soft-

    ware. The computation is fast even in bigdimensions.For each sim-ulateddata set of thestudyand each structure we compared the fit,the structureand the ability to capture the dependence forthe sug-gested grouping procedures. We consider the simple Archimedeancopula (sAC) and groupings based on the Chen et al. (2004)teststatistics (Chen), on the, binary copulas (binary), aggregated bi-nary copulas (binary aggr.). The binary aggregated grouping withrecursive estimation is denoted by RML. As benchmark models we

    consider the 5-dimensional Gaussian copula (Gauss) withesti-mated from the data and thetcopula.

    The results of this part of the simulation study are summarizedin Table 1 for the first structure and in Table 2 for the second struc-ture. The second column contains the three most frequent esti-mated structures with average parameter values. The frequency of

    the true structure for binary groupings is relative low. However,since the parameters are close, thevalues of thecopula function forother structures will be numerically close to the value for the truestructure. This is additionally illustrated in the next two columns,which contain the average KullbackLeibler divergence and the av-erage test statistic ofChen et al.(2004). The frequency of the truestructure is much higher for the grouping based on Kendalls andfor the recursive reestimation. Note, however, that the estimatorsareclearly biasedin theformer case. We observe that forthe KL lossthegroupingbasedon s with recursive reestimation clearly dom-inates the alternatives, followed by both grouping methods basedon , the binary and the aggregated binary trees. The results forthe Chen statistics obviously put more weights on the correspond-ing grouping method, but still show clear outperformance of HACcompared to the Gaussian andt-models.

    To assess the ability of the estimated model to capture thedependence structure of the data, we evaluate the discrepancybetween the pairwise true and the estimated Kendalls and thelower and upper tail indices. More precisely, we calculate theFrobenius norm of the difference between the Kendalls matrixfor the true model and the Kendalls matrix for the model underconsideration, say Pmodel . The smallest error exhibits again thegrouping based ons with recursive reestimation. The groupingbased on s is the second best option, followed by groupingsbased ons. Note that the Kendall s are well estimated by theGaussian and t-copulas. The Frobenius norms of the differencesin the true and estimated upper and lower tail indices are givenin the columns U and L respectively. However, frequentlyrecommended t-copula clearly fails to capture the zero lower-

    tail dependence and heavily biases the upper part. This is animportant drawback, while the aim of copulas is the modellingthe dependence and tail dependence. The last column of thetables contains the average computing times needed for a singlereplications. The groupings based on and on binary trees arecomputationally very efficient.

    Next we address the issue of efficiency loss due to multistageML estimation.Table 3provides the true parameters, the averageparameters from the multistage and the average parameters fromsingle step ML estimation. Since the variance of theasymptotic dis-tribution for the full ML is unknown due to the imposed inequalityconstraints on the parameters, we provide the corresponding em-pirical standard deviations. The efficiency loss at all levels of thehierarchy is generally small. A stronger jump in the standard devi-ation is observed if more than two variables are joint together at a

    single level, as we observe for the first structure. If, however, onlytwo variables are joined, the loss is minor.

  • 5/23/2018 Journal of Econometrics Volume 173 Issue 2 2013 [Doi 10.1016%2Fj.jeconom.2012.12.001] Okhrin, Ostap; Okhrin, Yarema; Schmid, Wolfgang

    http:///reader/full/journal-of-econometrics-volume-173-issue-2-2013-doi-1010162jeconom201212001 9/16

    O. Okhrin et al. / Journal of Econometrics 173 (2013) 189204 197

    Tab

    le

    1

    Modelfitforthetruestructure(12(34))5.

    ThecolumnscontaintheaveragevaluesofKullbackLeibler

    divergencebetweentheestimatedandthetruem

    odel(KL);thevalueoftheteststatisticsoftheGo

    FtestofChenetal.(2004)

    (Chen);theFrobeniusnormsofthedifferencesbetw

    eenestimatedandtrueKendallmatricesandupp

    er/lowertailindices;theestimationtimeins.Th

    evaluesinparenthesisarethecorrespondingstandarddeviations.

    M

    ethod

    Copulastructure(s)

    %

    KL

    CF

    Av.errorin

    U

    L

    time(ins)

    G

    auss

    0.2

    896(0.0

    418)

    1021.0

    584(86.1

    239)

    0.6

    417(0.0

    298)

    3.0

    882(0.0

    000)

    0.0

    000(0.0

    000)

    0.0

    003(0.0

    004)

    t

    0.1

    992(0.0

    281)

    1068.3

    437(85.0

    886)

    0.6

    442(0.0

    279)

    1.4

    610(0.1

    430)

    1.8

    445(0.1

    469)

    37.2

    612(22.0

    799)

    sAC

    (12345)2.3

    7

    100.0

    0.4

    963(0.0

    664)

    55.0

    961(6.6

    516)

    0.4

    338(0.0

    147)

    0.3

    938(0.0

    127)

    0.0

    000(0.0

    000)

    1.1

    138(0.8

    659)

    >0

    (5(12(34)4.0

    3)3.3

    3)2.1

    8

    98.8

    0.0

    318(0.0

    219)

    126.1

    234(24.9

    078)

    0.1

    627(0.0

    582)

    0.1

    488(0.0

    539)

    0.0

    000(0.0

    000)

    0.5

    360(0.4

    191)

    (5(1234)3.5

    8)2.1

    7

    1.1

    (5(1(2(34)3.9

    7)3.7

    4)3.5

    0)2.2

    4

    0.1

    b

    inary

    (5(2(1(34)4.0

    2)3.3

    4)3.3

    3)2.1

    8

    51.3

    0.0

    312(0.0

    207)

    125.9

    834(24.8

    084)

    0.1

    624(0.0

    581)

    0.1

    487(0.0

    538)

    0.0

    000(0.0

    000)

    0.5

    235(0.3

    619)

    (5(1(2(34)4.0

    2)3.3

    5)3.3

    4)2.1

    7

    47.6

    (5((34)3.9

    3(1.2

    0)3.2

    1)3.2

    1)2.2

    2

    1.1

    Chen

    (15(234)3.2

    0)2.1

    0

    12.1

    0.4

    512(0.1

    377)

    42.4

    741(16.4

    991)

    0.4

    939(0.0

    891)

    0.4

    503(0.0

    82)

    0.0

    000(0.0

    000)

    382.2

    255(116.2

    116)

    (13(245)2.1

    8)2.0

    9

    11.0

    (34(125)2.1

    8)2.0

    8

    10.9

    (5(3(124)2.9

    9)2.6

    3)1.7

    3

    46.4

    0.1

    464(0.0

    288)

    49.3

    676(7.9

    451)

    0.2

    983(0.0

    504)

    0.2

    828(0.0

    512)

    0.0

    000(0.0

    000)

    6.0

    718(3.0

    262)

    (5(4(123)3.0

    1)2.6

    5)1.7

    3

    44.6

    (5(1234)3.0

    6)1.7

    6

    9.0

    binary

    (5((34)4.0

    0(12)3.0

    7)3.0

    7)1.7

    8

    38.9

    0.0

    312(0.0

    163)

    65.9

    947(12.4

    950)

    0.2

    196(0.0

    562)

    0.2

    152(0.0

    562)

    0.0

    000(0.0

    000)

    0.7

    069(0.3

    823)

    (5(2(1(34)4.0

    3)3.0

    6)2.5

    9)1.7

    5

    32.3

    (5(1(2(34)4.0

    3)3.0

    6)2.5

    9)1.7

    5

    28.8

    binaryaggr.

    (5(12(34)4.0

    0)3.0

    7)1.7

    8

    38.9

    0.0

    312(0.0

    163)

    66.0

    036(12.5

    048)

    0.2

    196(0.0

    562)

    0.2

    152(0.0

    562)

    0.0

    000(0.0

    000)

    0.6

    652(0.3

    404)

    (5(2(1(34)4.0

    3)3.0

    6)2.5

    9)1.7

    5

    32.3

    (5(1(2(34)4.0

    3)3.0

    6)2.5

    9)1.7

    5

    28.8

    RML

    (5(12(34)4.0

    1)3.0

    2)2.0

    0

    81.1

    0.0

    025(0.0

    032)

    84.1

    208(7.7

    031)

    0.0

    509(0.0

    253)

    0.0

    472(0.0

    244)

    0.0

    000(0.0

    000)

    3.2

    885(2.1

    005)

    (5((12)3.1

    2(34)4.0

    3)2.8

    7)1.9

    9

    17.5

    (5(1(2(34)4.1

    7)3.1

    9)2.9

    7)2.0

    2

    0.9

  • 5/23/2018 Journal of Econometrics Volume 173 Issue 2 2013 [Doi 10.1016%2Fj.jeconom.2012.12.001] Okhrin, Ostap; Okhrin, Yarema; Schmid, Wolfgang

    http:///reader/full/journal-of-econometrics-volume-173-issue-2-2013-doi-1010162jeconom201212001 10/16

    198 O. Okhrin et al. / Journal of Econometrics 173 (2013) 189204

    Tab

    le

    2

    Modelfitforthetruestructure(123)(45).ThecolumnscontaintheaveragevaluesofKullbackLeibler

    divergencebetweentheestimatedandthetruem

    odel(KL);thevalueoftheteststatisticsoftheGo

    FtestofChenetal.(2004)

    (Chen);theFrobeniusnormsofthedifferencesbetw

    eenestimatedandtrueKendallmatricesandupp

    er/lowertailindices;theestimationtimeins.Th

    evaluesinparenthesisarethecorrespondingstandarddeviations.

    M

    ethod

    Copulastructure(s)

    %

    KL

    CF

    Av.errorin

    U

    L

    time(ins)

    G

    auss

    0.2

    847(0.0

    403)

    1145

    .3618(96.1

    629)

    0.0

    789(0.0

    299)

    3.0

    261(0.0

    000)

    0.0

    000(0.0

    000)

    0.0

    003(0.0

    005)

    t

    0.1

    989(0.0

    283)

    1214

    .3247(93.5

    416)

    0.0

    682(0.0

    268)

    1.3

    398(0.1

    500)

    1.7

    787(0.1

    334)

    40.0

    308(24.2

    342)

    sAC

    (12345)2.3

    2

    100.0

    0.8

    038(0.0

    711)

    62.2205(7.0

    608)

    0.5

    249(0.0

    097)

    0.4

    710(0.0

    078)

    0.0

    000(0.0

    000)

    1.2

    514(1.0

    350)

    >0

    ((45)3.0

    1(123)4.1

    4)2.2

    3

    100.0

    0.0

    183(0.0

    124)

    109.0668(22.0

    437)

    0.1

    825(0.0

    710)

    0.1

    740(0.0

    675)

    0.0

    000(0.0

    000)

    0.5

    911(0.3

    154)

    b

    inary

    ((45)3.0

    1(3(12)4.1

    3)4.1

    3)2.2

    3

    35.1

    0.0

    182(0.0

    124)

    109.053(22.0

    418)

    0.1

    825(0.0

    710)

    0.1

    740(0.0

    675)

    0.0

    000(0.0

    000)

    0.5

    905(0.3

    496)

    ((45)3.0

    2(1(23)4.1

    4)4.1

    4)2.2

    3

    32.6

    ((45)3.0

    1(2(13)4.1

    5)4.1

    5)2.2

    4

    32.3

    Chen

    (45(123)4.0

    1)2.0

    2

    18.3

    0.7

    261(0.2

    880)

    43.7516(12.3

    469)

    0.5

    460(0.1

    505)

    0.4

    929(0.1

    337)

    0.0

    000(0.0

    000)

    419.2

    090(122.4

    066)

    (12(345)2.1

    8)2.0

    1

    14.8

    (23(145)2.1

    9)2.0

    3

    13.3

    (2(1345)2.2

    3)1.7

    8

    29.1

    0.8

    196(0.3

    398)

    50.233(23.7

    257)

    0.6

    528(0.1

    860)

    0.6

    011(0.1

    689)

    0.0

    000(0.0

    000)

    7.4

    331(3.8

    592)

    (1(2345)2.2

    4)1.7

    9

    28.2

    (3(1245)2.2

    4)1.7

    8

    27.9

    binary

    ((45)3.0

    1(3(12)4.1

    1)3.9

    1)2.2

    7

    34.7

    0.0

    215(0.0

    073)

    102.6946(9.1

    997)

    0.2

    117(0.0

    550)

    0.2

    024(0.0

    520)

    0.0

    000(0.0

    000)

    0.8

    801(0.4

    933)

    ((45)3.0

    1(1(23)4.1

    1)3.9

    1)2.2

    8

    33.4

    ((45)3.0

    1(2(13)4.1

    1)3.9

    0)2.2

    8

    31.9

    binaryaggr.

    ((45)3.0

    0(123)4.0

    2)2.2

    8

    51.4

    0.0

    214(0.0

    071)

    103.2285(9.3

    303)

    0.2

    116(0.0

    551)

    0.2

    023(0.0

    520)

    0.0

    000(0.0

    000)

    0.8

    475(0.4

    614)

    ((45)3.0

    2(1(23)4.1

    5)3.8

    4)2.2

    8

    16.5

    ((45)3.0

    2(2(13)4.1

    6)3.8

    5)2.2

    7

    16.3

    RML

    ((123)4.0

    4(45)3.0

    1)1.9

    7

    82.2

    0.0

    029(0.0

    030)

    84.3295(7.2

    043)

    0.0

    542(0.0

    310)

    0.0

    515(0.0

    312)

    0.0

    000(0.0

    000)

    0.5

    376(0.0

    826)

    ((1(23)4.2

    )3.9

    4.(

    4.5

    )3.0

    3)1.9

    9

    6.4

    ((3(12)4.2

    3)3.9

    6.(

    4.5

    )2.9

    9)1.9

    7

    5.8

  • 5/23/2018 Journal of Econometrics Volume 173 Issue 2 2013 [Doi 10.1016%2Fj.jeconom.2012.12.001] Okhrin, Ostap; Okhrin, Yarema; Schmid, Wolfgang

    http:///reader/full/journal-of-econometrics-volume-173-issue-2-2013-doi-1010162jeconom201212001 11/16

    O. Okhrin et al. / Journal of Econometrics 173 (2013) 189204 199

    Table 3

    The average parameters and computational times for multistage ML and full ML estimation based on 1000 simulated samples of size 500. Standard errors are provided in

    brackets.

    3, (3=4.0) 2, (2=3.0) 1, (1=2.0) Time (in s)Structure((123)4(45)3)2

    Multistage 4.028 (0.103) 3.010 (0.112) 1.967 (0.058) 0.496 (0.032)

    Full 4.002 (0.100) 3.010 (0.111) 2.002 (0.058) 0.949 (0.060)

    Structure((12(34)4)35)2Multistage 3.983 (0.148) 2.995 (0.078) 2.003 (0.061) 1.995 (0.372)

    Full 3.980 (0.141) 3.004 (0.070) 2.005 (0.061) 2.740 (0.326)

    Fig. 2. Illustration of the misspecification problem: kernel density estimators

    of the distribution of the KL divergence between the estimated and empiricaldistributions. Simulation from the HAC with Gumbel generators (upper pictures),

    simulation from the Gauss copula (lower pictures) with 200 runs. In the legend: (g)

    and (c) refer to the fitted HAC with Gumbel and Clayton copulas respectively.

    5.2. Misspecification

    In this section we assess the impact of model misspecificationon the estimation results. We simulate the data from two models.The first is a three dimensional HAC with Gumbel generator andparameters 1.5and 2. The marginal distributions are standard nor-mal. The second model is the normal distribution, with correlationmatrix identical to the one imposed by the HAC in the first model.To each of the simulated datasets we fit one of four models: HACwith Gumbel generators, HAC with Clayton generators, the multi-variate normal distribution and theC-vine with mixed generators.For the copula-based models we distinguish between estimationwith nonparametric, true, and parametrically estimated margins.In Fig. 2 we plot the nonparametric density estimators of the KL di-vergence over the replications. The upper figures refer to the datasimulated from the HAC model, the bottom figures show the re-

    sults forthe normal data.The modelwith thebest fitshould exhibita narrow density of the KL criteria centred around zero.

    If the data is generated from the HAC-based model, the modewith the true known margins and correct generator function out-performs the alternatives, followed by the models with the para-metric and nonparametric margins. The fitted normal distributionshows a stronger variation and a bias of the KL criteria if the sam-ple size increases. The fit of the HAC with the misspecified Claytongenerator is extremely poor. The C-vine shows results close to theresults from HAC. We conclude that the misspecification caused bythe fit of normal distribution to HAC-data is stronger that the mis-specification caused by the fit of an HAC model to normal data forsmaller samples. Having more parameters and different generator

    functions, the vines show slightly better performance compared toHAC for Gaussian samples.

    6. Empirical example

    In this subsection we apply the proposed estimation techniqueto real data. The purpose of the study is twofold. First, we assessthe potential of the HAC model for modelling the returns of thegiven assets. Second, we evaluate the economic profit of applyingthe HAC by calculating the VaR for the profit and loss function ofa portfolio. We consider daily log-returns of Apple (AAPL), Ford(F), Google (GOOG), Microsoft (MSFT) and Toyota Motors (TM)The sample period covers 1008 observations from 03.01.2007 to31.12.2010. To overcome the time dependence of the data we

    fit all five time series to univariate AR(1)GARCH(1,1) processewith generalized error distributed (GED) residuals (see Nelson(1991)). The estimation results are given inTable 4.We concludethat the fitted process successfully eliminates the autocorrelation(BoxLjung test) and the hypothesis of GED residuals cannobe rejected by the KolmogorovSmirnov test. The subsequenmodelling uses these residuals.

    The pairwise residuals and the residuals mapped on the unitsquare by the empirical distribution functions show clear asym-metric behaviour, for example, GOOG vs. AAPL, APPL vs. F, MSFTvs. F, etc. We skip the corresponding pictures for brevity. This factadvocates the attempt to fit the copula-based distribution withGED margins to the five dimensional time series of the residualsTable 5contains details on the estimated binary HAC with Gum-

    bel and Clayton generators. Note that the estimated structure areclose for both generator functions. For both models first GOOG andAAPL are linked, with MSFT added at the next level. The automo-tive companies are added either jointly for the Clayton generatoror linked sequentially with the Gumbel generator.

    In the subsequent analysis we compare the fit of the HACmodel with Clayton or Gumbel generators to the fits of severabenchmarks. For simplicity we do not consider mixed generatorsFor possible constellations of different generators within a singleHAC refer to Hofert (2011). The first two benchmarks are theGaussian andt-copula. The simple Archimedean copulas with thesame generators build the second class of alternatives.

    To assess the dynamics of the estimated parameters and theestimated structure we perform a moving window estimation

    At each step 100 observations are used for estimation purposesThe respective time series of the BIC values, Kendalls and o

  • 5/23/2018 Journal of Econometrics Volume 173 Issue 2 2013 [Doi 10.1016%2Fj.jeconom.2012.12.001] Okhrin, Ostap; Okhrin, Yarema; Schmid, Wolfgang

    http:///reader/full/journal-of-econometrics-volume-173-issue-2-2013-doi-1010162jeconom201212001 12/16

    200 O. Okhrin et al. / Journal of Econometrics 173 (2013) 189204

    Table 4

    Estimationresultsof fitting univariateAR(1)GARCH(1,1)processes to the datawith thevolatilityequation 2t= 0 +1(rt1 1rt2)2 +12t1and GED residuals.Thecolumns BL and KS contain the p-values of the BoxLjung for the autocorrelation of the squared residuals and the KolmogorovSmirnov test for GED. Second lines contain

    the standard deviations of the parameters.

    1 0 1 1 shape BL KS

    GOOG 8.585e04 0.03429 3.323e06 0.04525 0.94720 1.08482 0.3344 0.10384.271e04 0.02121 1.809e06 0.01152 0.01250 0.05705

    AAPL 2.454e

    03

    0.00964 6.075e

    06 0.08711 0.90523 1.37463 0.1506 0.2886

    5.282e04 0.03262 4.088e06 0.02115 0.02279 0.08272MSFT 7.803e05 0.05871 5.153e06 0.08253 0.90647 1.21966 0.6493 0.8201

    4.676e04 0.02550 2.793e06 0.02396 0.02570 0.07182TM 5.223e04 0.08890 4.697e06 0.08965 0.89870 1.52981 0.9418 0.7060

    4.693e04 0.03221 2.404e06 0.01932 0.02068 0.09729F 7.963e04 0.02220 1.483e05 0.12247 0.87397 1.35513 0.3228 0.1423

    7.356e04 0.03145 8.834e06 0.02945 0.02837 0.07890

    Table 5

    Estimation results for the fit of the HAC with Gumbel and Clayton generators to the residuals. The standard errors of the parameters are given in the parenthesis.

    Generator estimated structure and parameters

    Gumbel ((((GOOG.AAPL)1.692(0.078).MSFT)1.490(0.046).TM)1.344(0.028).F)1.280(0.025)Clayton (((GOOG.AAPL)1.072(0.120).MSFT)0.762(0.089).(TM.F)0.554(0.058))0.548(0.051)

    Fig. 3. BIC, the parameters transformed to Kendallsand the structure from moving window estimation with window length 100 using Clayton (left) and Gumbel (right)

    generators respectively.

    the structure are shown in Fig. 3. The BIC reveals clear periodswhere one of the models dominates the alternatives; particularlyimportant are the periods where the HAC-based distributionwith Gumbel generator outperforms the Gaussian, tand Clayton

    benchmarks. The percentage of the days, when one of the modelsis dominating, is given in the legend. The Kendalls and the

    correspondingly the parameters exhibit clear co-movement withperiodic divergences. There is no visual relationship betweenthe performance and the dynamics of the parameters. In theperiods with close parameters, the copula functions with different

    structures are numerically indistinguishable despite possessingdifferent trees. For this reason we observe frequent jumps in the

  • 5/23/2018 Journal of Econometrics Volume 173 Issue 2 2013 [Doi 10.1016%2Fj.jeconom.2012.12.001] Okhrin, Ostap; Okhrin, Yarema; Schmid, Wolfgang

    http:///reader/full/journal-of-econometrics-volume-173-issue-2-2013-doi-1010162jeconom201212001 13/16

    O. Okhrin et al. / Journal of Econometrics 173 (2013) 189204 201

    structure only in the periods with similar parameters, leading totheconclusion that a few different structure aresufficient to modelthe data over the whole estimation period. This also illustratesthe robustness of the estimation procedure: distant parametersnecessarily lead to a stable copula structure.

    To compare the performance of HAC we rely on the economicsignificance of the alternative distributions. For the analysis wechose the Gaussian distribution, simple Archimedean copulas anddifferent vines. The vine-based benchmarks consist of six alter-native models with three models arising from C-vines and threemodels fromD-vines. Each triple consists of a Clayton based vine,Gumbel based vine and the mixed vine, which allows for a mix-ture of Clayton, Gumbel, Gaussian andtcopulas. Particularly weconsider the Value-at-Risk (VaR) of a portfolio of five assets withweightsw, assuming individual AR(1)GARCH(1,1) data generat-ing processes with GED residuals for each asset and the three al-ternative joint distributions for the residuals. The profit and loss

    function of theportfolio is defined as Lt+1=5

    i=1wiPit(eRi,t+1 1),

    wherePit andRitare the price and the return of the asseti at thetime point trespectively. We drop the time index and denote by FLthe distribution function ofLt+1. The VaR of the portfolio at level

    isgiven by VaR()=F1

    L (). Since FLdepends on themarginaldis-tributions of the returns as well as on the joint distribution of theresiduals, it cannot be derived explicitly. The distribution functionFLisestimated by simulating thepathsof theassetreturns from theabove marginal processes and alternative joint distributions of theresiduals. TheVaR()is computed as the corresponding empiricalquantile.

    Backtesting allows us to assess the economic significance ofthe chosen copula model. We estimate the realized as a relativefraction of the exceedances in the time series, i.e.,

    w= 1T

    Tt=1

    I{Lt

  • 5/23/2018 Journal of Econometrics Volume 173 Issue 2 2013 [Doi 10.1016%2Fj.jeconom.2012.12.001] Okhrin, Ostap; Okhrin, Yarema; Schmid, Wolfgang

    http:///reader/full/journal-of-econometrics-volume-173-issue-2-2013-doi-1010162jeconom201212001 14/16

    202 O. Okhrin et al. / Journal of Econometrics 173 (2013) 189204

    Appendix A

    Let f() denote either the marginal density or the copuladensity for arbitrary subcopula. The elements of the vector ofparameters vary depending on the specification off.

    1. LetXi be a stochastic process on a complete probability spacewith an absolutely continuous distribution.

    2. f : Rdim(X) R+ are measurable and continuous foreach .

    3. fis continuously twice-differentiable on(a.s.).

    4. E[logf(X, )] exists and is finite for each . AdditionallyE[logf(X, )] is continuous on(a.s.).

    5. E[logf(X, )] exists and is finite for each .6. E[2 logf(X, )] exists, is finite, positive definite and continu-

    ous on.

    7. Lj have identifiably unique maximizers on the interior of asubspace offor j=1, . . . ,p + k.

    8. n1/2f(Xn,) obeys, uniformly in n , the CLT, with a positivedefinite covariance matrix.

    9. For all we have f(X, )dH(X)=0 and 2f(X, )dH(X)=0.Appendix B

    Proof of Theorem 1. The asymptotic results follow directly fromTheorem 6.2 ofWhite(1982)for misspecified and from Theorem6.4 of White (1982) for correctly specified models. Here wederive only the expressions for the components of the asymptoticcovariance matrix of the correctly specified ML estimators. Wehave B = Eg(Xi, )/. The last derivative can be writtenexplicitly in terms of the components of

    g(Xi, )

    =

    2 l1(Xi)

    11 . . .

    2 l1(Xi)

    1k

    2l1(Xi)

    11 . . .

    2l1(Xi)

    1p...

    . . ....

    .

    .

    .. . .

    .

    .

    .

    2 lk(Xi)

    k1

    . . . 2 lk(Xi)

    kk

    2lk(Xi)

    k1

    . . . 2lk(Xi)

    2

    p

    2 lk+1(Xi)1

    1

    . . . 2 lk+1(Xi)

    1k

    2 lk+1(Xi)1

    1

    . . . 2lk+1(Xi)

    1

    p

    .

    .

    .. . .

    .

    .

    .

    .

    .

    .. . .

    .

    .

    .

    2 lk+p(Xi)p

    1

    . . . 2 lk+p(Xi)

    pk

    2lk+p(Xi )p

    1

    . . . 2 lk+p(Xi)

    p

    p

    .

    The elements on the main diagonal are parts of the informationmatrices from the individual ML procedures. We denote them by

    E 2Lr

    rr = Irr, forr= 1, . . . , kandE2Lk

    +j

    jj = Ijj, forj= 1,. . . ,p. Since the likelihood function Lr forr= 1, . . . , k is inde-pendent ofjforj=1, . . . , k, r =j and offor =1, . . . ,p, thisimplies that

    2Lr

    r

    j

    = 2Lr

    jr=

    2Lr

    r

    =0,

    r,j=1, . . . , k; r =j, =1, . . . ,p.The elements in the lastp rows below the main diagonal dependon the structure of the HAC. If the element j has been grouped at

    the level r, i.e.,j sr, thenE 2Lrrj = Irj for j= 1, . . . , k andr=k + 1, . . . ,p.

    The matrixis equal to Var(g)=Egg. FromJoe(1997)it fol-lows that E Lrr

    Lk+11

    = 0 forr= 1, . . . , k. We have to show that

    also E Lrr

    Lk+j

    j

    = E Lk+jj

    Lk+

    = 0 forj, = 1, . . . ,pand j = .Letr sj, then it holds that

    E lr

    r

    lk+jj

    = lr(yr)

    r

    lk+j({y}sj )

    jc(j, j

    ;sj)

    ({F(y,j)}sj )sj

    f(y,j)dy

    =

    lr(yr)

    r

    jlog

    c(j, j; sj)({F(y,j)}sj )

    c(j, j; sj)({F(y,j)}sj )

    sj

    f(y,j)dy

    =

    lr(yr)

    r

    j

    c(j, j; sj)({F(y,j)}sj )sj

    f(y,j)dy

    = yr lr(yr)

    r yr

    jc(j, j

    ;sj)(

    {F(y,j)

    }

    sj )

    sj

    f(y,j)

    sj,=rdy

    dyr(where

    yr denotes integration over all variables except foryr)

    =

    yr

    lr(yr)

    r

    j

    yr

    c(j, j; sj)({F(y,j)}sj )

    sjf(y,j)

    sj,=rdy

    dyr.

    The internal integral is equal to the marginal density of the r-thvariable, and thus it is independent of j. This implies that thederivative and the whole expression are zero. For the second term

    ELk+j

    j

    Lk+

    =0 we proceed in a similar way. Letrsr sj. Oth-erwise the expression is obviously zero, since Eg(Xr, )=0. Nowwe have

    E lk+rr

    lk+jj

    =

    lk+r({y}sr)r

    lk+j({y}sj )j

    c(j, j; sj)

    ({F(y,j)}sj ) sj

    f(y,j)dy

    =

    lk+r({y}sr)i

    jc(j, j; sj)

    ({F(y,j)}sj )sj

    f(y,j)dy

    =

    y,sr

    lk+r({y}sr)r

    j

    y,sj/sr

    c(j, j; sj)

    ({F(y,j)}sj ) sj

    f(y,j) sj/sr

    dysr

    dy.

  • 5/23/2018 Journal of Econometrics Volume 173 Issue 2 2013 [Doi 10.1016%2Fj.jeconom.2012.12.001] Okhrin, Ostap; Okhrin, Yarema; Schmid, Wolfgang

    http:///reader/full/journal-of-econometrics-volume-173-issue-2-2013-doi-1010162jeconom201212001 15/16

    O. Okhrin et al. / Journal of Econometrics 173 (2013) 189204 203

    The inner integral is equal to the joint distribution of the vari-ables in s rand is, therefore, independent ofj. This implies thatthe derivative with respect to j and the whole expression is zero.This completes the proof of the theorem.

    Proof of Theorem 2. By the Lemma of GlivenkoCantelli, Cn(F)converges almost surely to C(F). Rncan be written in the followingway

    Rn= E(Vn)=E(V1,n, . . . , Vp,n)= E(J(v1n, . . . , vkn))= (E(J1(v1n, . . . , vkn) ) , . . . , E(Jp(v1n, . . . , vkn))),

    wherev1n, . . . , vknCn(F).

    Observe that (v1n, . . . , vkn) D(v1, . . . , vk), where (v1, . . . , vk)

    are random variables distributed according toC(F). Because of thecontinuity ofJ,Vnconverges toV=J(v1, . . . , vk). To complete theproof we have to show that all components ofVnare uniformly in-tegrable, i.e., that there exists an > 0 such that E(|Vp,n|1+ )for

    p= 1, . . . ,

    pare bounded. To prove this, we use a multivariate

    extension of Hlders inequality. For positive functions f1(x) , . . . ,

    fk(x):(a, b) R, ba

    kj=1

    fj(x) dxk

    j=1

    ba

    fj(x)pj dx

    1/pj.

    From now on, we fix such p that 1 p p. To show thatE(|Vp,n|1+ )is bounded we use its integral representation, the in-equality given inTheorem 2for J, and the extension of Hldersinequality.

    E(|Vp,n|1+ )=

    |Jp (F)|1+ dCn(F)

    M

    k

    j=1h(Fj)

    aj(1+) dCn(F)

    Mk

    j=1

    1

    n

    ni=1

    h

    i

    n + 1

    (1+)(1+)1/pj

    = Mn

    ni=1

    h

    i

    n + 1

    (1+)(1+) M

    10

    1

    [u(1 u)](1)(1+) du.

    The last integral is finite for < , which completes the proof ofthe convergency ofRnto . To prove the existence of the variancewe split the elements of our statisticinto the following summands:

    n

    1

    2 (Rp,n p )=Hp,1n+ Hp,2n+ Gp,n,where

    Hp,1n=n 12

    Jp (F) d{Cn(F) C(F)}

    Hp,2n=n12

    [Jp (F) Jp (F)]d{Cn(F) C(F)}

    Gp,n=

    Jp (F) Jp (F) dC(F)=

    kr=1

    Gp,rn

    =k

    r=1n

    12

    (Fr Fr)

    Jp (F1, . . . , Fr1, r, Fr+1, . . . , Fk)r

    r=Fr+(FrFr)

    dC(F),

    whereGp,n is transformed by the mean value theorem and [0, 1]. EachGrncould be rewritten as

    Gp,rn=n 12

    (Fr Fr)

    Jp (F1, . . . , Fr1, r, Fr+1, . . . , Fk)r r=Fr+(FrFr)

    Jp (F)Fr

    dC(F) + n 12

    (Fr Fr) Jp

    (F)

    FrdC(F).

    While the first derivative of the functionJis uniformly continuous

    and bounded by the assumptions, and since |r Fr| |Fr Fr|by the Lemma of GlivenkoCantelli

    Jp (F1,...,Fr1,r,Fr+1,...,Fk )r

    r=Fr+(FrFr)

    converges in probability

    to Jp (F)

    Fr. Using the definition of the empirical distribution func-

    tion which for simplicity is multiplied by n+1n

    , the last summandhas following representation:

    n12 (Fr Fr) Jp (F)Fr dC(F)= n 12

    ni=1

    (I{xrixr} Fr)

    Jp (F)

    FrdC(F)

    = n 12n

    i=1

    [I{xrixr} Fr]

    Jp (F)

    FrdC(F)

    = n 12

    [I{yr xr} Fr]Jp

    (F)

    FrdC(F).

    Note that Jp (F)

    Frhas no influence on the variance but con

    tributes to the covariance between the elements of the vector

    Applying Hlders inequality and the assumption, that Jp (F)

    Fr

    Mkj=1,j=rh(uj)bj h(ur)br1 provides theexistenceof thesecondmoment ofGp,n:

    Var(Gp,n)=Var

    kr=1

    I{yr xr}Jp

    (F)

    FrdC(F)

    .

    In the main difference, Hp,1ncan be rewritten in the followingform

    Hp,1n=n12

    ni=1

    Hp,1in,

    where Hp,1in = Jp (F1(x1i) , . . . , Fk(xki))p are iid with zeromeans. Using thatJp (F)

    M

    k

    r

    =1h(ur)

    ar, the application of the

    multivariate extension of the Hlders inequality for the same setofpr,r= 1, . . . , kas in the first part of the theorem, shows thatHp,1inhas finite absolute moment of order 2+0, for some 0 > 0By the GlivenkoCantelli LemmaHp,2nconverges almost surely tozero and from Bhuchongkul (1964) we get the resulting covariance

    Cov[n 12 (Rn)]

    = Cov

    J(F1(y1) , . . . , Fk(yk))

    +k

    r=1

    I{yr xr} Fr(yr)

    J(F1(y1) , . . . , Fk(yk))

    Fr(yr)dC(F1(y1) , . . . , Fk(yk))

  • 5/23/2018 Journal of Econometrics Volume 173 Issue 2 2013 [Doi 10.1016%2Fj.jeconom.2012.12.001] Okhrin, Ostap; Okhrin, Yarema; Schmid, Wolfgang

    http:///reader/full/journal-of-econometrics-volume-173-issue-2-2013-doi-1010162jeconom201212001 16/16

    204 O. Okhrin et al. / Journal of Econometrics 173 (2013) 189204

    Proof of Theorem 3. In this proof we usesimilar methods as in theproof of Theorem2.1 ofRuymgaartet al. (1972) and Proposition 2.1

    ofGenest et al.(1995). Expanding Lj(,X)

    for j = 1, . . . ,p in a

    Taylor series we obtain

    Lj(, X)

    =

    =0an Bn( ),

    where

    ajn= 1n

    ni=1

    lj(F1(x1i) , . . . ,Fk(xki))j

    , forj=1, . . . ,p

    an=(a1n, . . . , apn)Bn = {Bjrn}r,j=1,...,p

    =

    1n

    ni=1

    2lj(F1(x1i) , . . . ,Fk(xki))j

    r

    r,j=1,...,p

    ,

    where Bn is a block matrix. From the theory of rank statistics we

    conclude thatn12 ( )n 12 B1n an. TakingJinTheorem 2equal

    to lj

    jand

    2 lj

    jrwherepwill be the number of parameters to beestimated on the current level, we get

    Brjna.s.rj= E

    2lj(F1(x1) , . . . ,Fk(xk))

    r

    j

    andn12 ajn

    a N(0,j), where

    j= Cov

    lj(F1(x1) , . . . ,Fk(xk))j

    +k

    r=1

    I{Fr(xr)ur} ur

    2lj(u)jur

    dC(u)

    .

    We assemble the variances jin the block matrix

    =

    1 0 . . . 00 2 . . . 0...

    .... . .

    ...0 0 . . . p

    ,as well as B = {jr}j,r=1,...,p. Putting them together we can con-clude that

    n12 ( ) aN(0, B1B1).

    References

    Aas, K., Czado, C., Frignessi, A., Bakken, H., 2009. Pair-copula constructions ofmultiple dependence. Insurance: Mathematics and Economics 44, 182198.Acar,E., Genest,C., Nelehov,J., 2012.Beyondsimplifiedpair-copula constructions.

    Journal of Multivariate Analysis 110, 7490.Bedford, T., Cooke, R.M.,2002. Vinesa new graphical model for dependentrandom

    variables. Annals of Statistics 30 (4), 10311068.Bhuchongkul, S., 1964. A class of nonparametric tests for independence in bivariate

    populations. The Annals of Mathematical Statistics 35, 138149.Chen, X., Fan, Y., 2006. Estimation and model selection of semiparametric copula-

    based multivariate dynamic models under copula misspecification. Journal ofEconometrics 135, 125154.

    Chen, X., Fan, Y., 2005. Pseudo-likelihood ratio tests for model selection insemiparametric multivariate copula models. The Canadian Journal of Statistics33 (2), 389414.

    Chen, X., Fan, Y., Patton, A., 2004. Simple tests for models of dependence betweenmultiple financial time series, with applications to U.S. Equity returns andexchange rates, Discussion paper 483, Financial Markets Group, London Schoolof Economics.

    Chen, X., Fan, Y., Tsyrennikov, V., 2006. Efficient estimation of semiparametric

    multivariate copula models. Journal of the American Statistical Association 101(475), 12281240.

    Chen, S.X., Huang, T., 2007. Nonparametric estimation of copula functions fordependence modeling. The Canadian Journal of Statistics 35, 265282.

    Dissmann, J., Brechmann, E., Czado, C., Kurowicka, D., Selecting and estimatingregular vine copulae and application to financial returs, arXiv:1202.2002.

    Embrechts,P., McNeil, A.J.,Straumann, D., 2002. Correlationand dependencein riskmanagement: properties and pitfalls. In: Risk Management: Value at Risk andBeyond. Cambridge University Press, pp. 176223.

    Fermanian, J.-D., 2005. Goodness-of-fit tests for copulas. Journal of MultivariateAnalysis 95 (1), 119152.

    Fermanian, J.-D., Scaillet, O., 2003. Nonparametric estimation of copulas for time

    series. Journal of Risk 5, 2554.Genest, C., Ghoudi, K., Rivest, L.-P., 1995. A semi-parametric estimation procedure

    of dependence parameters in multivariate families of distributions. Biometrika82, 543552.

    Genest, C., Quessy, J.-F., Rmillard, B., 2006. Goodness-of-fit procedures for copulamodels based on the probability integral transformation. Scandinavian Journalof Statistics 33, 337366.

    Genest, C., Rivest, L.-P., 1993. Statistical inference procedures for bivariateArchimedean copulas. Journal of the American Statistical Association 88,10341043.

    Genest, C., Rmillard, B., 2008. Validity of the parametric bootstrap for goodness-of-fit testing in semiparametric models. Annales de lInstitut Henri Poincar.Probabilits et Statistiques 44, 10961127.

    Giacomini, E., Hrdle, W.K., Spokoiny, V., 2009. Inhomogeneous dependencemodeling with time-varying copulae. Journal of Business and EconomicStatistics 27 (2), 224234.

    Hennessy, D.A., Lapan, H.E., 2002. The use of Archimedean copulas to modelportfolio allocations. Mathematical Finance 12, 143154.

    Hjek, J.,Sidk, Z., 1967. Theory of Rank Test. Academic Press, New York.

    Hofert, J.M., 2012. A stochastic representation and sampling algorithm for nestedArchimedean copulas. Journal of Statistical Computation and Simulation82 (9),12391255.

    Hofert, J.M., 2011. Efficiently sampling Archimedean copulas. ComputationalStatistics & Data Analysis 55, 5770.

    Hofert, M., Pham, D., Densities of nested Archimedean copulas,arXiv:1204.2410v2.Joe, H., 2005. Asymptotic efficiency of the two-stage estimation method for copula-

    based models. Journal of Multivariate Analisys 94, 401419.Joe, H., 1997. Multivariate Models and Dependence Concepts. Chapman & Hall,

    London.Junker, M., May, A., 2005. Measurement of aggregate risk with copulas. E conomet-

    rics Journal 8, 428454.Lee, T.-H., Long, X., 2009. Copula-based multivariate GARCH model with uncorre-

    lated dependent errors. Journal of Econometrics 150, 207218.Mantegna,R.N., 1999.Hierarchicalstructurein financialmarkets.EuropeanPhysical

    Journal B 11, 193 197.McNeil, A.J., 2008. Sampling nested Archimedean copulas. Journal Statistical

    Computation and Simulation 78, 567581.McNeil, A.J., Nelehov, J., 2009. Multivariate Archimedean copulas, d-monotone

    functions and l 1 norm symmetric distributions. Annals of Statistics 37 (5b),30593097.Min, A., Czado, C., 2010. Bayesian inference for multivariate copulas using pair-

    copula constructions. Journal of Financial Econometrics 8, 511546.Nelsen, R.B., 2006. An Introduction to Copulas. Springer, Berlin.Nelson, D., 1991. Conditional heteroscedasticity in asset returns: a new approach.

    Econometrica 59, 347370.Okhrin, O., Okhrin, Y., Schmid, W., Properties of the hierarchical Archimedean

    copulas, Statistics & Risk Modelling, forthcoming,(http://dx.doi.org/10.1524/strm.2013.1071).

    Patton, A.J., 2004. On the out-of-sample importance of skewness and asymmetricdependence for asset allocation. Journal of Financial Econometrics 2, 130168.

    Pitt, M., Chan, D., Kohn, R., 2006. Efficient Bayesian inference for Gaussian copularegression models. Biometrika 93, 537554.

    Ruymgaart, F., Shorak, G., Zwet, W., 1972. Asymptotic normality of nonparametrictests for independence. The Annals of Mathematical Statistics 43, 11221135.

    Savu, C., Trede, M., 2008. Goodness-of-fit tests for parametric families ofArchimedean copulas. Quantitative Finance 8, 109116.

    Savu, C., Trede, M., 2010. Hierarchical Archimedean copulas. Quantitative Finance10, 295304.

    Schweizer, B., Sklar, A., 1983. Probabilistic Metric Spaces. North Holland,Amsterdam.

    Sklar, A., 1959. Fonctions d repartition n dimension et leurs marges. Publ. Inst.Stat. Univ. Paris 8, 229231.

    Smith, M., Gan, Q., Kohn, R., 2012. Mod