Pitfalls and Possibilities in Predictive Regression
Peter C. B. Phillips
SoFiE Conference June 2013
Peter C. B. Phillips () Prediction SoFiE Conference June 2013 1 / 90
Overview
Pitfalls in Predictive Regressions
Bias + Nonstandard Inference
  spurious predictability
  difficulties with multiple predictors + strong endogeneity
Confidence Intervals can have Zero Coverage Probability
  failure of uniformity (in AR and predictive regression)
  prediction tests reject with probability one under null
Quantile Predictive Regression
  crossing problems
  invalidity of QR for nonstationary data
  nonstandard inference for LUR case
Model Validity under Predictability
  Meaningful alternatives
  Equation balancing in time series characteristics
Overview
Possibilities
IVX regression
  use reconstructed regressor as own instrument (Magdalinos-Phillips, 2009)
  easy to use in empirical work (Pitarakis & Gonzalo, 2012)
  has broad asymptotic validity: correct CI
Quantile Predictive Regression & IVX versions
  resolve quantile crossings & validity
  provides predictions at all quantiles
  superior for tail event prediction
Nonparametric regression
  NP regression addresses balance & validity issues
  asymptotics are uniform across SM, LM, nonstationary cases
Prediction Model
Predictive regressions: widely used linear predictive regression
Stock returns vs D/P ratio, future ER vs forward premium, consumption growth vs lagged income, etc.
Designed in simple form
$$ y_{t+1} = \beta x_t + u_{0,t+1}, \qquad x_{t+1} = \rho x_t + u_{x,t+1} $$
Simple mds innovation structure often used: $u_t = (u_{0t}, u_{xt})'$
$$ E_{\mathcal{F}_{t-1}} u_t = 0, \qquad E_{\mathcal{F}_{t-1}}\left[u_t u_t'\right] = \begin{bmatrix} \sigma_{00} & \sigma_{0x} \\ \sigma_{x0} & \sigma_{xx} \end{bmatrix} =: \Sigma $$
General WD CHE dependence structure possible/desirable
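A minimal simulation sketch of the two-equation system above may help fix the timing convention. It assumes i.i.d. Gaussian innovations, which is only one simple special case of the mds structure; the function name `simulate_predictive_system` and the start value $x_0 = 0$ are illustrative choices, not part of the slides.

```python
import numpy as np

def simulate_predictive_system(n, beta, rho, sigma, seed=0):
    """Simulate y_{t+1} = beta*x_t + u0_{t+1} and x_{t+1} = rho*x_t + ux_{t+1}.

    Assumption: innovations are i.i.d. N(0, sigma), a special case of an mds.
    Returns arrays y and x of length n+1 (index 0 is the start value) and
    the (n, 2) innovation array u with columns (u0, ux).
    """
    rng = np.random.default_rng(seed)
    u = rng.multivariate_normal(np.zeros(2), sigma, size=n)
    x = np.zeros(n + 1)
    y = np.zeros(n + 1)
    for t in range(n):
        x[t + 1] = rho * x[t] + u[t, 1]   # regressor equation
        y[t + 1] = beta * x[t] + u[t, 0]  # predictive equation uses lagged x
    return y, x, u

# Strong negative error correlation, as is often assumed for returns vs D/P
sigma = np.array([[1.0, -0.8], [-0.8, 1.0]])
y, x, u = simulate_predictive_system(n=200, beta=0.5, rho=0.9, sigma=sigma)
```

The off-diagonal entry of `sigma` plays the role of $\sigma_{0x}$; setting it to zero removes the endogeneity discussed in the following slides.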
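Pitfalls: (a) Bias
Finite Sample Bias with Stationary Regressors
Centered OLS decomposition:
$$
\begin{aligned}
\hat\beta - \beta &= \frac{\sum_{t=1}^n x_{t-1} u_{0t}}{\sum_{t=1}^n x_{t-1}^2} \\
&= \frac{\sum_{t=1}^n x_{t-1} u_{0.xt}}{\sum_{t=1}^n x_{t-1}^2} + \left(\frac{\sigma_{0x}}{\sigma_{xx}}\right) \frac{\sum_{t=1}^n x_{t-1} u_{xt}}{\sum_{t=1}^n x_{t-1}^2} \\
&= \frac{\sum_{t=1}^n x_{t-1} u_{0.xt}}{\sum_{t=1}^n x_{t-1}^2} + \left(\frac{\sigma_{0x}}{\sigma_{xx}}\right) (\hat\rho - \rho),
\end{aligned}
$$
where $u_{0.xt} = u_{0t} - \frac{\sigma_{0x}}{\sigma_{xx}} u_{xt}$.
Under normality, stationarity ($|\rho| < 1$), intercept: Stambaugh (1999)
$$ E\left(\hat\beta - \beta\right) = -\left(\frac{\sigma_{0x}}{\sigma_{xx}}\right)\left(\frac{1 + 3\rho}{n}\right) + O\left(\frac{1}{n^2}\right), $$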
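The first-order bias expression can be compared with a small Monte Carlo experiment. The sketch below is illustrative only: it assumes Gaussian errors with unit variances, $\beta = 0$, a stationary start for $x_0$, and a fitted intercept (handled by demeaning); `stambaugh_bias` and `mc_bias` are hypothetical helper names.

```python
import numpy as np

def stambaugh_bias(sigma_0x, sigma_xx, rho, n):
    """First-order bias -(sigma_0x/sigma_xx) * (1 + 3*rho) / n of the OLS slope."""
    return -(sigma_0x / sigma_xx) * (1.0 + 3.0 * rho) / n

def mc_bias(n, rho, sigma_0x, reps=4000, seed=0):
    """Monte Carlo mean of (beta_hat - beta) with beta = 0 and unit variances."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, sigma_0x], [sigma_0x, 1.0]])
    u = rng.multivariate_normal(np.zeros(2), cov, size=(reps, n))
    x = np.zeros((reps, n + 1))
    # Assumption: x_0 is drawn from the stationary distribution N(0, 1/(1-rho^2))
    x[:, 0] = rng.standard_normal(reps) / np.sqrt(1.0 - rho**2)
    for t in range(n):
        x[:, t + 1] = rho * x[:, t] + u[:, t, 1]
    xlag = x[:, :-1]
    y = u[:, :, 0]  # beta = 0, so y_{t+1} = u0_{t+1}
    xd = xlag - xlag.mean(axis=1, keepdims=True)  # demeaning fits the intercept
    beta_hat = (xd * y).sum(axis=1) / (xd**2).sum(axis=1)
    return beta_hat.mean()

approx = stambaugh_bias(sigma_0x=-0.9, sigma_xx=1.0, rho=0.9, n=50)
simulated = mc_bias(n=50, rho=0.9, sigma_0x=-0.9)
print(approx, simulated)
```

With negative $\sigma_{0x}$ the bias is positive, which is the direction that produces spurious evidence of predictability in small samples.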
Pitfalls: (b) Nonstandard Inference
Inference with Persistent Regressors
Consensus that potential explanatory regressors are highly persistent in this model
Local to unity specification $\rho_n = 1 + \frac{c}{n}$ has received much attention: Campbell and Yogo (2006) and Jansson and Moreira (2006)
Nonstandard (LUR) limit theory for AR coeff. - Phillips (1987, 1988):
$$ n(\hat\rho - \rho) = \frac{\frac{1}{n}\sum_{t=1}^n x_{t-1} u_{0t}}{\frac{1}{n^2}\sum_{t=1}^n x_{t-1}^2} \Longrightarrow \frac{\int J_x^c(r)\, dB_0(r) + \lambda}{\int J_x^c(r)^2\, dr} $$
The "finite sample bias" is still present in the limit and not obviously correctable since c is not consistently estimable (but see below)
Not mixed normal and not pivotal
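To see that the limit is neither centered at zero nor symmetric, one can simulate the normalized AR estimator under the local-to-unity specification. This is a sketch under simplifying assumptions not stated on the slide: a pure AR(1) with i.i.d. N(0,1) errors (so the $\lambda$ term vanishes), no intercept, and $x_0 = 0$; `lur_ar_draws` is a hypothetical helper name.

```python
import numpy as np

def lur_ar_draws(n, c, reps=2000, seed=0):
    """Draws of n*(rho_hat - rho_n) for x_t = rho_n*x_{t-1} + u_t, rho_n = 1 + c/n."""
    rng = np.random.default_rng(seed)
    rho_n = 1.0 + c / n
    u = rng.standard_normal((reps, n))
    x = np.zeros((reps, n + 1))
    for t in range(n):
        x[:, t + 1] = rho_n * x[:, t] + u[:, t]
    xlag = x[:, :-1]
    # OLS without intercept: rho_hat - rho_n = sum(x_{t-1} u_t) / sum(x_{t-1}^2)
    return n * (xlag * u).sum(axis=1) / (xlag**2).sum(axis=1)

draws = lur_ar_draws(n=200, c=0.0)
print(draws.mean(), np.median(draws))
```

In the unit root case (c = 0) both the mean and the median of the draws are negative, which is the bias that survives in the limit.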
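Pitfalls: (b) Nonstandard Inference
Inference with Persistent Regressors
Nonnormality comes from endogeneity & LUR limit theory
If $u_t \sim \text{mds}(0, \Sigma)$
$$ \zeta_{nt} = \begin{bmatrix} \frac{1}{n} x_{t-1} u_{0t} \\ \frac{1}{\sqrt{n}} u_{xt} \end{bmatrix} $$
$$ \sum E_{\mathcal{F}_{nt-1}} \zeta_{nt}\zeta_{nt}' = \begin{bmatrix} \left(\frac{1}{n^2}\sum x_{t-1}^2\right)\sigma_{00} & \left(\frac{1}{n\sqrt{n}}\sum x_{t-1}\right)\sigma_{0x} \\ \left(\frac{1}{n\sqrt{n}}\sum x_{t-1}\right)\sigma_{x0} & \sigma_{xx} \end{bmatrix} $$
not diagonal unless $\sigma_{0x} = 0$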
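Pitfalls: (b) Nonstandard Inference
Inference with Persistent Regressors
From the decomposition
$$ n\left(\hat\beta - \beta\right) = \frac{\frac{1}{n}\sum_{t=1}^n x_{t-1} u_{0.xt}}{\frac{1}{n^2}\sum_{t=1}^n x_{t-1}^2} + \frac{\sigma_{x0}}{\sigma_{xx}} \frac{\frac{1}{n}\sum_{t=1}^n x_{t-1} u_{xt}}{\frac{1}{n^2}\sum_{t=1}^n x_{t-1}^2} $$
The second term is a standard LUR distribution
$$ \frac{\frac{1}{n}\sum_{t=1}^n x_{t-1} u_{xt}}{\frac{1}{n^2}\sum_{t=1}^n x_{t-1}^2} \Longrightarrow \frac{\int J_x^c(r)\, dB_x(r)}{\int J_x^c(r)^2\, dr}, $$
Giving limit theory for $n(\hat\beta - \beta)$
$$ MN\left(0, \sigma_{00.x}\left(\int J_x^c(r)^2\, dr\right)^{-1}\right) + \left(\frac{\sigma_{x0}}{\sigma_{xx}}\right) \frac{\int J_x^c(r)\, dB_x(r)}{\int J_x^c(r)^2\, dr} $$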
Pitfalls: (b) Nonstandard Inference
Inference with Persistent Regressors
Nonstandard limit theory from endogeneity and persistent regressor
Endogeneity/bias correction methods such as Phillips and Hansen (1990, FM-OLS) are designed for UR case (c = 0) and give
$$ n\left(\hat\beta_{FM} - \beta\right) \Longrightarrow MN\left(-c\,\frac{\sigma_{x0}}{\sigma_{xx}},\ \sigma_{00.x}\left(\int J_x^c(r)^2\, dr\right)^{-1}\right) $$
Without precise information on c, the bias is not correctable
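Pitfalls: (b) Nonstandard Inference
Inference with Persistent Regressors
t test
$$ \frac{\hat\beta - \beta}{\hat\sigma_\beta} \Longrightarrow \left(\frac{\sigma_{00.x}}{\sigma_{00}}\right)^{1/2} Z + \frac{\sigma_{x0}}{(\sigma_{xx}\sigma_{00})^{1/2}} \left(\frac{\int J_x^c(r)\, dB_x(r)}{\left(\sigma_{xx}\int J_x^c(r)^2\, dr\right)^{1/2}}\right) = \left(1 - \delta^2\right)^{1/2} Z + \delta\,\eta_{LUR}(c) $$
where
$$ \delta = \frac{\sigma_{x0}}{(\sigma_{xx}\sigma_{00})^{1/2}} $$
Unless δ = 0, standard chi-square inference is not valid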
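The effect of δ on the t ratio can be illustrated by simulation under the null β = 0. The sketch below assumes Gaussian errors with unit variances (so that $\sigma_{x0} = \delta$), no intercept, and $x_0 = 0$; `tstat_draws` is a hypothetical helper and the settings are illustrative.

```python
import numpy as np

def tstat_draws(n, c, delta, reps=2000, seed=0):
    """Draws of the OLS t ratio for beta = 0 with rho_n = 1 + c/n and corr(u0, ux) = delta."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, delta], [delta, 1.0]])  # unit variances: sigma_x0 = delta
    u = rng.multivariate_normal(np.zeros(2), cov, size=(reps, n))
    rho_n = 1.0 + c / n
    x = np.zeros((reps, n + 1))
    for t in range(n):
        x[:, t + 1] = rho_n * x[:, t] + u[:, t, 1]
    xlag, y = x[:, :-1], u[:, :, 0]               # beta = 0 under the null
    sxx = (xlag**2).sum(axis=1)
    beta_hat = (xlag * y).sum(axis=1) / sxx
    resid = y - beta_hat[:, None] * xlag
    s2 = (resid**2).sum(axis=1) / (n - 1)         # residual variance estimate
    return beta_hat / np.sqrt(s2 / sxx)

t_exog = tstat_draws(n=200, c=0.0, delta=0.0)     # no endogeneity
t_endog = tstat_draws(n=200, c=0.0, delta=-0.95)  # strong endogeneity
print(t_exog.mean(), t_endog.mean())
```

With δ = 0 the t ratio is centered near zero; with δ close to −1 its distribution shifts to the right, reflecting the $\delta\,\eta_{LUR}(c)$ component.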
Pitfalls: (b) Nonstandard Inference
Bonferroni Bounds Method
Test has mixture limit theory
$$ t_\beta = \frac{\hat\beta - \beta}{\hat\sigma_\beta} \Longrightarrow \delta\,\eta_{LUR}(c) + \left(1 - \delta^2\right)^{1/2} Z. \tag{1} $$
Cavanagh, Elliott and Stock (1995):
  pretest for δ = 0 and t-test is valid
  if $\delta \not\sim 0$, use a Bonferroni method and most conservative c
  construct a $100(1 - \alpha_1)\%$ CI for c using Stock (1991) confidence belts
  For each c in CI, construct a $100(1 - \alpha_2)\%$ CI for β, denoted as $CI_{\beta|c}(\alpha_2)$, using (1).
  A CI free of c is obtained as
$$ CI_\beta(\alpha) = \bigcup_{c \in CI_c(\alpha_1)} CI_{\beta|c}(\alpha_2). $$
Pitfalls: constructing a CI for c
Stock's (1991) suggestion
use LUR limit theory for AR $t_\rho$ (Phillips, 1987)
$$ t_\rho = \frac{\hat\rho - 1}{\hat\sigma_\rho} \Longrightarrow \frac{\int J_c\, dW}{\left(\int J_c(r)^2\, dr\right)^{1/2}} + c\left(\int J_c(r)^2\, dr\right)^{1/2} =: \tau_c, \tag{2} $$
invert statistic $t_\rho$ using confidence bands constructed from the limit distribution theory of $\tau_c$
mechanism: use confidence belts (Kendall, 1946) from intensive simulations of (2)
Pitfalls: constructing a CI for c
Stock's confidence belts for c
[Figure: confidence belts (levels 2.5%, 50% and 97.5%) with asymptotic functional approximations, Phillips (2013)]
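Pitfalls: constructing a CI for c
Bonferroni Method - specifics
using $\hat\rho$ find $CI_c(\alpha_1) = [c_l(\alpha_1), c_u(\alpha_1)]$ by inversion - Stock (1991)
using the cv $d_{t_\beta,c}$ of $t_\beta \sim \delta\,\eta_{LUR}(c) + (1 - \delta^2)^{1/2} Z$, find
$$ CI_\beta(\alpha_1, \alpha_2) = \left[d^\beta_l(\alpha_1, \alpha_2),\ d^\beta_u(\alpha_1, \alpha_2)\right] = \left[\min_{c_l \le c \le c_u} d_{t_\beta,c,\frac{1}{2}\alpha_2},\ \max_{c_l \le c \le c_u} d_{t_\beta,c,1-\frac{1}{2}\alpha_2}\right] $$
Then
$$ \Pr\left(t_\beta \notin \left[d^\beta_l(\alpha_1, \alpha_2),\ d^\beta_u(\alpha_1, \alpha_2)\right]\right) \to \Pr\left(\delta\,\eta_{LUR}(c) + (1 - \delta^2)^{1/2} Z \notin \left[d^\beta_l(\alpha_1, \alpha_2),\ d^\beta_u(\alpha_1, \alpha_2)\right]\right) \le \alpha_1 + \alpha_2. \quad \text{(by Bonferroni)} $$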
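The min/max step over the confidence interval for c can be sketched generically. In the code below `bonferroni_bounds` is a hypothetical helper, and `toy_cv` is a deliberately artificial stand-in for the critical value function $d_{t_\beta,c,q}$: in practice those critical values come from simulating the mixture limit distribution for each c and δ, which is not reproduced here.

```python
from statistics import NormalDist

def bonferroni_bounds(c_lo, c_hi, cv, alpha2, grid_size=201):
    """Lower/upper critical values: min and max of cv(c, q) over a grid on [c_lo, c_hi]."""
    step = (c_hi - c_lo) / (grid_size - 1)
    grid = [c_lo + i * step for i in range(grid_size)]
    d_lower = min(cv(c, alpha2 / 2) for c in grid)      # most conservative lower cv
    d_upper = max(cv(c, 1 - alpha2 / 2) for c in grid)  # most conservative upper cv
    return d_lower, d_upper

def toy_cv(c, q):
    # HYPOTHETICAL stand-in: a normal quantile with a c-dependent shift,
    # used only to make the sketch runnable.
    return NormalDist().inv_cdf(q) + 0.1 * c

d_l, d_u = bonferroni_bounds(c_lo=-10.0, c_hi=0.0, cv=toy_cv, alpha2=0.10)
print(d_l, d_u)
```

Because the bounds take the extreme critical value over the whole interval for c, the resulting test is conservative by construction, with size at most $\alpha_1 + \alpha_2$.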
Pitfalls: constructing a CI for c
Bonferroni Method - numerically intensive like grid bootstrap computations
Golly, I just ran 5 billion regressions!!
Construct the cv $d_{t_\beta,c}$ of $t_\beta \sim \delta\,\eta_{LUR}(c) + (1 - \delta^2)^{1/2} Z$ across a grid
20,000 values of c × 5 values of δ × 50,000 replications
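The "5 billion" figure follows directly from the grid dimensions quoted above, as this one-line check shows (variable names are illustrative).

```python
n_c, n_delta, n_reps = 20_000, 5, 50_000  # grid sizes quoted on the slide
total = n_c * n_delta * n_reps            # one regression per (c, delta, replication)
print(f"{total:,}")                       # 5,000,000,000
```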
Pitfalls: constructing a CI for c
Bonferroni Method - Campbell & Yogo (2006, CY)
Campbell and Yogo (2006): the same idea with DF-GLS
Often employed in applied work and in other methodological work
Undesirable Properties:
  Conservative by construction: numerical size often much less than nominal size
  Results in biased test, since there are local alternatives under which the test rejects less often than nominal size.
  No extensions to multivariate systems/multiple regressors.
  ... plus ...
A Major New Problem!!
Pitfalls: constructing a CI for c
Robustness Failure
Unnoticed Problem:
  Stock (1991) confidence intervals have ZERO coverage probability when $|\rho| < 1$ (and more generally when $n^{1/3}(1 - \rho) \to \infty$)
  UR test statistic $\tau_c$ fails tightness
  $CI_c(\alpha_1) = [c_l(\alpha_1), c_u(\alpha_1)]$ diverges
  probability mass escapes as $c \to -\infty$ and stationary region $|\rho| < 1$ is approached
Consequences - serious failure of robustness to ρ
  CY confidence intervals have zero coverage probability in stationary case
  CY Q test indicates predictability with probability unity under the null when $|\rho| < 1$
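Asymptotic analysis - local to unity case
Form of the confidence belts
Theorem
As $c \to -\infty$ the asymptotic form of the distribution of $\tau_c$ is
$$ \tau_c \sim N\left(-\frac{|c|^{1/2}}{2^{1/2}} - \frac{1}{2^{3/2}|c|^{1/2}},\ \frac{1}{4}\right) + O_p\left(\frac{1}{|c|^{1/2}}\right), \tag{3} $$
inducing the (2.5%, 97.5%) confidence belts
$$ -\frac{|c|^{1/2}}{2^{1/2}} - \frac{1}{2^{3/2}|c|^{1/2}} \pm \frac{1.96}{2} $$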
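The asymptotic belts are simple closed-form functions of c and can be evaluated directly. The helper below, with the hypothetical name `tau_c_belts`, just codes the normal approximation in (3): the center is the mean and the half-width is 1.96 times the standard deviation 1/2.

```python
import math

def tau_c_belts(c, z=1.96):
    """Asymptotic (lower, center, upper) belts for tau_c, valid for large negative c."""
    if c >= 0:
        raise ValueError("the approximation is stated for c -> -infinity; need c < 0")
    a = math.sqrt(abs(c))
    center = -a / math.sqrt(2.0) - 1.0 / (2.0**1.5 * a)
    half_width = z / 2.0  # standard deviation is (1/4)^(1/2) = 1/2
    return center - half_width, center, center + half_width

for c in (-8.0, -50.0, -200.0):
    print(c, tau_c_belts(c))
```

The belts keep a fixed width while their center drifts to minus infinity at rate $|c|^{1/2}$, which is the source of the tightness failure discussed next.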
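Asymptotic analysis - local to unity case
Expansion of the limit distribution
As $c \to -\infty$, we expand $\tau_c$ as
$$
\begin{aligned}
\tau_c &= \frac{\int J_c\, dW}{\left(\int J_c(r)^2\, dr\right)^{1/2}} + c\left(\int J_c(r)^2\, dr\right)^{1/2} \\
&= \frac{(-2c)^{1/2}\int J_c\, dW}{\left\{(-2c)\int J_c(r)^2\, dr\right\}^{1/2}} - \frac{|c|^{1/2}}{2^{1/2}}\left\{(-2c)\int J_c(r)^2\, dr\right\}^{1/2} \\
&\sim \frac{\zeta}{\left\{1 + \frac{2\zeta}{(-2c)^{1/2}}\right\}^{1/2}} - \frac{|c|^{1/2}}{2^{1/2}}\left\{1 + \frac{2\zeta}{(-2c)^{1/2}}\right\}^{1/2} \\
&= -\frac{|c|^{1/2}}{2^{1/2}} + \frac{1}{2}\zeta + O_p\left(\frac{1}{|c|^{1/2}}\right), \qquad \zeta \equiv N(0, 1)
\end{aligned}
$$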
Asymptotic analysis - local to unity case
Form of the confidence belts
Asymptotic confidence belts: $-\frac{|c|^{1/2}}{2^{1/2}} - \frac{1}{2^{3/2}|c|^{1/2}} \pm \frac{1.96}{2}$
[Figure: confidence belts with asymptotic functional approximations]
Asymptotic analysis - local to unity case
Form of the confidence belts
Family $\tau_c = N\left(-\frac{|c|^{1/2}}{2^{1/2}} - \frac{1}{2^{3/2}|c|^{1/2}},\ \frac{1}{4}\right)$ is not tight as $c \to -\infty$.
Probability mass escapes as $c \to -\infty$, i.e. as stationary region $|\rho| < 1$ is approached
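Asymptotic analysis - stationary case
Stock CIs have zero coverage probability in stationary/near stationary case
Theorem
$100(1 - \alpha)\%$ CI for $|\rho| < 1$ is
$$ [\rho_L, \rho_U] = \left[1 - 2A_\rho + 2\,\frac{2\zeta \pm z_{\alpha/2}}{\sqrt{n}}\, A_\rho^{1/2}\right], \qquad A_\rho = \frac{1 - \rho}{1 + \rho} \tag{4} $$
where $\zeta = N(0, 1)$. Coverage probability is
$$ P\left\{\rho \in [\rho_L, \rho_U]\right\} = P_\zeta\left\{\zeta \in \left[\frac{\sqrt{n}}{4}(1 - \rho)A_\rho^{1/2} \pm \frac{z_{\alpha/2}}{2}\right]\right\} $$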
Asymptotic analysis - stationary case
Stock CIs have zero coverage probability in stationary/near stationary case
$$ [\rho_L, \rho_U] \to_p \frac{3\rho - 1}{1 + \rho} = \bar\rho $$
[Figure: rho_bar plotted against rho]
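Asymptotic analysis - stationary case
Stock CIs have zero coverage probability in stationary/near stationary case
Coverage probability
$$ P\left\{\rho \in [\rho_L, \rho_U]\right\} = P_\zeta\left\{\zeta \in \underbrace{\frac{\sqrt{n}}{4}(1 - \rho)A_\rho^{1/2}}_{\to \infty} \pm \frac{z_{\alpha/2}}{2}\right\} \to 0 $$
i.e. if $\sqrt{n}(1 - \rho)A_\rho^{1/2} \to \infty$.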
Asymptotic analysis - stationary case
Stock CIs have zero coverage probability in stationary/near stationary case
Zero coverage probability whenever $\sqrt{n}(1 - \rho)^{3/2} \to \infty$
That is: for all fixed $|\rho| < 1$ and for all mildly integrated ρ with
$$ \rho = 1 + \frac{c}{n^\delta}, \qquad \delta < \frac{1}{3}, \qquad c < 0 $$
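The coverage probability expression for the stationary case can be evaluated numerically to see how quickly it collapses as n grows. The function below is a direct transcription of that expression, $P_\zeta\{\zeta \in [\frac{\sqrt{n}}{4}(1-\rho)A_\rho^{1/2} \pm \frac{z_{\alpha/2}}{2}]\}$; the name `stock_ci_coverage` and the chosen values of n and ρ are illustrative.

```python
import math
from statistics import NormalDist

def stock_ci_coverage(n, rho, alpha=0.05):
    """Asymptotic coverage of the inverted interval for fixed |rho| < 1."""
    nd = NormalDist()
    z = nd.inv_cdf(1 - alpha / 2)
    a_rho = (1 - rho) / (1 + rho)
    center = math.sqrt(n) / 4 * (1 - rho) * math.sqrt(a_rho)  # drifts to infinity with n
    return nd.cdf(center + z / 2) - nd.cdf(center - z / 2)

for n in (100, 1_000, 10_000):
    print(n, stock_ci_coverage(n, rho=0.5))
```

For ρ = 0.5 the nominal 95% interval has coverage a little above one half at n = 100 and essentially zero by n = 10,000, since the interval's center moves away from ζ's support at rate $\sqrt{n}$.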
Implications for Campbell Yogo Predictive Tests
Complete failure of robustness to AR coefficient
[Figure: coverage probabilities of Campbell-Yogo and stationary confidence intervals for the predictive regression coefficient β.]
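Implications for Campbell Yogo Predictive Tests
Prediction Failure
CY Q test employs t ratio $t_\beta(\rho)$ based on regression coefficient $\beta(\rho)$ conditional on ρ:
$$ \beta(\rho) = \frac{\sum_{t=1}^n x_{t-1}\left[y_t - \frac{\sigma_{x0}}{\sigma_{xx}}(x_t - \rho x_{t-1})\right]}{\sum_{t=1}^n x_{t-1}^2} $$
Confidence interval from Stock inversion giving $[\rho_L, \rho_U]$ and
$$ [\beta_L(\rho_U),\ \beta_U(\rho_L)] = \left[\beta(\rho_U) - z_{\alpha_2/2}\,\sigma_{\beta(\rho)},\ \beta(\rho_L) + z_{\alpha_2/2}\,\sigma_{\beta(\rho)}\right] $$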
Implications for Campbell Yogo Predictive Tests
Prediction Failure
CY confidence interval is inconsistent for all $|\rho| < 1$
$$ [\beta_L(\rho_U),\ \beta_U(\rho_L)] \to \beta + \frac{\sigma_{x0}}{\sigma_{xx}}\,\frac{(\rho - 1)^2}{\rho + 1} \ne \beta $$
for all $\sigma_{x0} \ne 0$
Implications for Campbell Yogo Predictive Tests
Prediction Failure
Under $H_0: \beta = 0$, Size → Unity for $|\rho| < 1$
$$ \beta(\rho) \to_p \frac{\sigma_{x0}}{\sigma_{xx}}\,\frac{(\rho - 1)^2}{\rho + 1} \ne 0 $$
So rejection probability → 1
Always find evidence of predictability with this test!!
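The factor $(\rho - 1)^2/(\rho + 1)$ in the inconsistency result is exactly the gap between the true ρ and the point $(3\rho - 1)/(1 + \rho)$ to which the Stock interval collapses in the stationary case. The short check below verifies that algebra numerically; the function names are hypothetical.

```python
def rho_bar(rho):
    """Probability limit of the inverted interval for rho when |rho| < 1."""
    return (3 * rho - 1) / (1 + rho)

def centering_gap(rho):
    """The factor (rho - 1)^2 / (rho + 1) multiplying sigma_x0 / sigma_xx."""
    return (rho - 1) ** 2 / (rho + 1)

for rho in (0.0, 0.5, 0.9, 0.99):
    # rho - rho_bar(rho) and centering_gap(rho) agree term by term
    print(rho, rho - rho_bar(rho), centering_gap(rho))
```

The gap shrinks to zero only as ρ approaches one, so the miscentering matters most for clearly stationary regressors.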
Possibilities: (a) & (b) Correct centering
Correct centering of AR test statistic - then use Bonferroni construction
Hansen, 1999: bootstrap approach; Mikusheva, 2007, 2012.
Idea: do the inversion based on $t_\rho = \frac{\hat\rho - \rho}{\hat\sigma_\rho}$
$$ t_\rho \Longrightarrow \frac{\int J_c\, dW}{\left(\int J_c(r)^2\, dr\right)^{1/2}} \underset{c \to -\infty}{\Rightarrow} N(0, 1) \quad \text{(Phillips, 1987)} $$
$N(0, 1)$ for $n(1 - \rho) \to \infty$ (Giraitis & Phillips, 2006)
CI construction for ρ is then uniform in $\rho \in (0, 1]$ (Mikusheva, 2007).
But: numerically intensive, applies only for AR(1), no extension to multivariate regressors
Possibilities: (a) & (b) IVX Inference
IVX approach - Magdalinos and Phillips (MP, 2009)
Idea: self generation - create less persistent IV using own regressor $x_t$
$$ \tilde z_t = \sum_{j=1}^{t} \rho_{nz}^{t-j}\,\Delta x_j, \qquad \rho_{nz} = 1 + \frac{c_z}{n^\varphi}, \quad \varphi \in (0, 1), \quad c_z < 0, $$
IVX ⇒ filter the regressor $x_t$ to produce a valid instrument for $x_t$
  reduce persistence when $x_t$ is LUR
  applicable when $x_t$ is UR, LUR, MI, ME
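The IVX filter is easy to compute because the weighted sum of differences satisfies a first-order recursion. The sketch below builds the instrument both ways to show they agree; the helper name `ivx_instrument` and the default tuning values `c_z = -1.0`, `phi = 0.95` are illustrative assumptions, not values taken from the slides.

```python
import numpy as np

def ivx_instrument(x, c_z=-1.0, phi=0.95):
    """Build z_t = sum_{j<=t} rho_nz^(t-j) * dx_j from x_0, ..., x_n via its recursion.

    Returns the instrument array z (with z[0] = 0) and rho_nz = 1 + c_z / n^phi.
    """
    x = np.asarray(x, dtype=float)
    n = len(x) - 1
    rho_nz = 1.0 + c_z / n**phi
    dx = np.diff(x)                         # dx[t-1] = x_t - x_{t-1}
    z = np.zeros(n + 1)
    for t in range(1, n + 1):
        z[t] = rho_nz * z[t - 1] + dx[t - 1]  # equivalent recursive form
    return z, rho_nz

rng = np.random.default_rng(1)
x = np.concatenate([[0.0], rng.standard_normal(100).cumsum()])  # a unit root regressor
z, rho_nz = ivx_instrument(x)
dx = np.diff(x)
direct = sum(rho_nz ** (100 - j) * dx[j - 1] for j in range(1, 101))  # explicit sum at t = n
print(z[100], direct)
```

Because $\rho_{nz}$ is further from unity than a local-to-unity root, the filtered series is less persistent than $x_t$ itself.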
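Possibilities: (a) & (b) IVX Inference
IVX approach - mechanics
IVX is based on the regressor $x_t$
$$ \tilde z_t = \sum_{j=1}^{t} \rho_{nz}^{t-j}\,\Delta x_j, \qquad \rho_{nz} = 1 + \frac{c_z}{n^\varphi}, \quad \varphi \in (0, 1), \quad c_z < 0 $$
Since $\Delta x_j = \frac{c}{n} x_{j-1} + u_{xj}$, we have the decomposition
$$ \tilde z_t = \sum_{j=1}^{t} \rho_{nz}^{t-j}\left(\frac{c}{n} x_{j-1} + u_{xj}\right) = \sum_{j=1}^{t} \rho_{nz}^{t-j} u_{xj} + \frac{c}{n}\sum_{j=1}^{t} \rho_{nz}^{t-j} x_{j-1} = z_t + \frac{c}{n}\psi_{nt} $$
$z_t = \rho_{nz} z_{t-1} + u_{xt}$ plays the role of a mildly integrated instrument.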
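Possibilities: (a) & (b) IVX Inference
IVX estimation & limit theory
IVX estimator:
$$ \hat\beta_{IVX} = \frac{\sum_{t=1}^n \tilde z_{t-1} y_t}{\sum_{t=1}^n \tilde z_{t-1} x_{t-1}} = \beta + \frac{\sum_{t=1}^n \tilde z_{t-1} u_{0t}}{\sum_{t=1}^n \tilde z_{t-1} x_{t-1}} $$
Nice limit theory under the rate restriction $\varphi \in (0, 1)$
$$ n^{\frac{1+\varphi}{2}}\left(\hat\beta_{IVX} - \beta\right) \Longrightarrow \Psi $$
where Ψ is a correctly centered mixed normal random variable
The t-ratio:
$$ t_{\beta_{IVX}} = \frac{\hat\beta_{IVX} - \beta}{\hat\sigma_{IVX}} \Longrightarrow Z $$
which is a standard normal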
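A bare-bones version of the IVX point estimator follows from the ratio formula. The sketch assumes no intercept and a scalar regressor, and reuses illustrative tuning values `c_z = -1.0`, `phi = 0.95`; `ivx_estimate` is a hypothetical name and the standard error needed for the t-ratio is not computed here.

```python
import numpy as np

def ivx_estimate(y, x, c_z=-1.0, phi=0.95):
    """IVX slope: sum(z_{t-1} * y_t) / sum(z_{t-1} * x_{t-1}); arrays hold t = 0..n."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x) - 1
    rho_nz = 1.0 + c_z / n**phi
    dx = np.diff(x)
    z = np.zeros(n + 1)
    for t in range(1, n + 1):
        z[t] = rho_nz * z[t - 1] + dx[t - 1]   # self-generated instrument
    z_lag, x_lag, y_now = z[:-1], x[:-1], y[1:]
    return (z_lag * y_now).sum() / (z_lag * x_lag).sum()

# Noise-free check: if y_t = 0.3 * x_{t-1} exactly, the estimator returns 0.3
rng = np.random.default_rng(2)
x = np.concatenate([[0.0], rng.standard_normal(300).cumsum()])
y = np.concatenate([[0.0], 0.3 * x[:-1]])
beta_ivx = ivx_estimate(y, x)
print(beta_ivx)
```

The noise-free check only confirms the algebra of the ratio; the inferential gain of IVX comes from the mixed normal limit theory stated above, not from this identity.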
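Possibilities: (a) & (b) IVX Inference
IVX approach - key idea
asymptotic independence between MG parts of numerator/denominator.
Compared to earlier discussion
$$ \zeta_{nt} = \begin{bmatrix} \frac{1}{n^{\frac{1+\varphi}{2}}}\,\tilde z_{t-1} u_{0t} \\ \frac{1}{\sqrt{n}}\, u_{xt} \end{bmatrix}, $$
$$ \sum E_{\mathcal{F}_{nt-1}} \zeta_{nt}\zeta_{nt}' = \begin{bmatrix} \left(\frac{1}{n^{1+\varphi}}\sum \tilde z_{t-1}^2\right)\sigma_{00} & \left(\frac{1}{n^{1+\frac{\varphi}{2}}}\sum \tilde z_{t-1}\right)\sigma_{0x} \\ \left(\frac{1}{n^{1+\frac{\varphi}{2}}}\sum \tilde z_{t-1}\right)\sigma_{x0} & \sigma_{xx} \end{bmatrix} \Longrightarrow \text{diagonal} $$
achieved by reducing order of magnitude of the instrument $\tilde z_t$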
Possibilities: (a) & (b) IVX Inference
IVX approach - ongoing research
Ongoing work:
  allowing multivariate $x_t$ with mixed orders of persistence
  allowing intercepts, deterministic trends, and structural breaks
Desirable properties:
  pivotal mixed normal limit theory under general persistent regressors
  accommodates multivariate systems + many regressors
  applicable when regressors have differing levels of persistence
  good size & power properties of test
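Pitfalls: (c) Invalidity of Quantile Predictive Regression
QR Model Invalidity under the Alternative
QR predictive model - conditional quantile formulation:
$$ Q_{y_t}(\tau \mid \mathcal{F}_{t-1}) = \beta(\tau)\, x_{t-1} + \sigma_{00}^{1/2}\,\Phi^{-1}_{u_0}(\tau) \tag{5} $$
$$ x_t = x_{t-1} + u_{xt} $$
Φ = cdf of $N(0, 1)$, $Q_{y_t}(\tau \mid \mathcal{F}_{t-1})$ = quantile function, $u_t = (u_{0t}, u_{xt})' \sim$ mds
Quantile crossings occur for ρ > τ (here ρ is a second quantile level, not the AR coefficient) when natural order is reversed:
$$ \left\{\beta(\rho) - \beta(\tau)\right\} x_{t-1} + \sigma_{00}^{1/2}\left\{\Phi^{-1}_{u_0}(\rho) - \Phi^{-1}_{u_0}(\tau)\right\} < 0 $$
that is whenever
$$ x_{t-1} < \frac{\sigma_{00}^{1/2}\left\{\Phi^{-1}_{u_0}(\tau) - \Phi^{-1}_{u_0}(\rho)\right\}}{\beta(\rho) - \beta(\tau)}, \quad \text{if } \beta(\rho) - \beta(\tau) > 0 \tag{6} $$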
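The crossing threshold in (6) can be computed directly for Gaussian quantiles. In the sketch below the quantile levels, slopes, and function names are illustrative assumptions; `tau_lo` and `tau_hi` stand for the two quantile levels written τ and ρ in (6).

```python
from statistics import NormalDist

def cond_quantile(x_lag, beta_tau, tau, sigma00=1.0):
    """Conditional quantile (5): beta(tau) * x_{t-1} + sigma00^(1/2) * Phi^{-1}(tau)."""
    return beta_tau * x_lag + sigma00**0.5 * NormalDist().inv_cdf(tau)

def crossing_threshold(beta_lo, tau_lo, beta_hi, tau_hi, sigma00=1.0):
    """Threshold (6): for x_{t-1} below it, Q(tau_hi) < Q(tau_lo); needs beta_hi > beta_lo."""
    nd = NormalDist()
    return sigma00**0.5 * (nd.inv_cdf(tau_lo) - nd.inv_cdf(tau_hi)) / (beta_hi - beta_lo)

thr = crossing_threshold(beta_lo=0.1, tau_lo=0.25, beta_hi=0.3, tau_hi=0.75)
print(thr)
# Below the threshold the 75% quantile lies under the 25% quantile (a crossing)
print(cond_quantile(-10.0, 0.3, 0.75), cond_quantile(-10.0, 0.1, 0.25))
```

Since a unit root regressor wanders without bound, any fixed threshold of this kind is crossed with substantial probability, which is what the next result quantifies.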
Pitfalls: (c) Invalidity of Quantile Predictive Regression
QR Model Invalidity under the Alternative
Theorem
Quantile crossing frequency given by
$$ n^{-1}\sum_{t=1}^{n} 1\left\{\frac{x_{t-1}}{\sqrt{n}} < \frac{1}{\sqrt{n}}\,\frac{\sigma_{00}^{1/2}\left\{\Phi^{-1}_{u_0}(\tau) - \Phi^{-1}_{u_0}(\rho)\right\}}{\beta(\rho) - \beta(\tau)}\right\} \Rightarrow \int_0^1 1\left\{B_x(r) < 0\right\} dr, $$
distribution is the arc sine law with density $\frac{1}{\pi x^{1/2}(1 - x)^{1/2}}$ over $x \in [0, 1]$
Most likely many or few crossings for a given trajectory $\{x_t\}_{t=1}^n$.
Quantile prediction requires constant slope coefficient for validity!
But ...
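The U shape of the arc sine law, which is why a given trajectory most likely shows either many or few crossings, can be checked by simulating the fraction of time a random walk spends below zero. The sketch uses Gaussian increments and hypothetical helper names; the closed-form cdf $\frac{2}{\pi}\arcsin(\sqrt{x})$ is the standard one for the density quoted above.

```python
import math
import numpy as np

def arcsine_cdf(x):
    """cdf of the arc sine law on [0, 1]: (2/pi) * arcsin(sqrt(x))."""
    return 2.0 / math.pi * math.asin(math.sqrt(x))

def negative_time_fraction(n=500, reps=2000, seed=0):
    """Fraction of periods each simulated random walk spends below zero."""
    rng = np.random.default_rng(seed)
    walks = rng.standard_normal((reps, n)).cumsum(axis=1)
    return (walks < 0).mean(axis=1)

frac = negative_time_fraction()
tails = np.mean((frac < 0.2) | (frac > 0.8))   # theory: 2 * arcsine_cdf(0.2)
middle = np.mean((frac > 0.4) & (frac < 0.6))  # theory: arcsine_cdf(0.6) - arcsine_cdf(0.4)
print(tails, 2 * arcsine_cdf(0.2))
print(middle, arcsine_cdf(0.6) - arcsine_cdf(0.4))
```

Most of the probability mass sits near 0 and 1, so an occupation fraction near one half is the least likely outcome.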
Possibilities: (c) Quantile Predictive Regression
QR Model Validity by Marginalizing the Alternative
Local to constant slope $\beta(\tau) = \beta + \frac{b(\tau)}{k_n}$ with $k_n \to \infty$
With localizing coefficient function $b(\tau) \in [b_L, b_U]$
Local QR model
$$ Q_{y_t}(\tau \mid \mathcal{F}_{t-1}) = \left(\beta + \frac{b(\tau)}{k_n}\right) x_{t-1} + \sigma_{00}^{1/2}\,\Phi^{-1}_{u_0}(\tau) $$
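Possibilities: (c) Quantile Predictive Regression
QR Model Validity by Marginalizing the Alternative
Condition for no quantile crossing
$$ \frac{\sqrt{n}\left\{b(\rho) - b(\tau)\right\}}{k_n}\,\frac{x_{t-1}}{\sqrt{n}} + \sigma_{00}^{1/2}\left\{\Phi^{-1}_{u_0}(\rho) - \Phi^{-1}_{u_0}(\tau)\right\} > 0 $$
holds if $\frac{\sqrt{n}}{k_n} \to 0$ as $n \to \infty$.
Probability of quantile crossing tends to zero as $n \to \infty$.
  $k_n \to \infty$ fast enough so quantiles do not cross $\left(\frac{\sqrt{n}}{k_n} \to 0\right)$
  $k_n \to \infty$ not too fast $\left(\frac{k_n}{n} \to 0\right)$ so marginal deviations can be consistently estimated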
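Possibilities: (c) Quantile Predictive Regression
FM-QR Estimation in UR case (Xiao, 2009)
FM-QR limit theory for $x_t \equiv$ UR
$$ n\left(\hat\beta^+(\tau) - \beta(\tau)\right) \Rightarrow MN(0, V), \qquad V = \frac{\sigma^2_{\psi.x}}{f\left(F^{-1}(\tau)\right)^2}\left(\int_0^1 B_x^2\right)^{-1} $$
Allowance for marginal alternatives $\beta(\tau) = \beta + \frac{b(\tau)}{k_n}$ with $k_n \to \infty$
$$ \frac{n}{k_n}\left(\hat b^+(\tau) - b(\tau)\right) \Rightarrow MN(0, V) $$
Extensions to IVX-QR for $x_t \equiv$ LUR, UR, MI, ME (Lee, 2013)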
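Pitfalls: (d) Misbalancing in General
Model Invalidity under the Alternative
Predictive model formulation:
$$ y_{t+1} = \beta x_t + u_{0,t+1}; \qquad x_{t+1} = \rho x_t + u_{x,t+1} $$
Equation unbalanced under $\beta_A \ne 0$ as $\beta_A x_{t-1} = O_p(\sqrt{n})$ conflicts with $y_t = I(0)$
But if we reject the null and $\beta_A \ne 0$ and $y_t$ is $I(0)$ we have
$$ \hat\beta = \frac{1}{n}\,\frac{\frac{1}{n}\sum_{t=1}^n x_{t-1} y_t}{\frac{1}{n^2}\sum_{t=1}^n x_{t-1}^2} = O_p\left(\frac{1}{n}\right) \to_p 0 $$
so $\hat\beta \to_p 0$ conflicts with the alternative hypothesis $\beta = \beta_A$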
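Possibilities: (d) Achieving Balance under Alternative
Resolving Balance by localization of coefficient
Local to zero predictive regression:
$$ y_{t+1} = \beta_n x_t + u_{0,t+1}, \qquad x_{t+1} = \rho_n x_t + u_{x,t+1} $$
Suppose $\beta_n = \frac{b}{n^\gamma}$ with $b \ne 0$, marginal deviation from null
Then
$$ y_t \sim \begin{cases} u_{0t} + b\,n^{\frac{1}{2}-\gamma} J_x\left(\frac{t}{n}\right) \ne I(0) & \gamma < 0.5 \\ u_{0t} + b\,J_x\left(\frac{t}{n}\right) = O_p(1) & \gamma = 0.5 \\ u_{0t} = O_p(1) & \gamma > 0.5 \end{cases} $$
Test is consistent for $\gamma \in [0.5, 1)$ and inconsistent for $\gamma \ge 1$ since
$$ n\hat\beta_n = n\left(\hat\beta_n - \beta_n\right) + n\beta_n = O_p(1) + O_p\left(n^{1-\gamma}\right) $$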
Possibilities: (d) Achieving Balance under Alternative
Resolving Balance by localization of coefficient
Marginal deviations from null are detectable!!
Possibilities: (d) Achieving Balance under Alternative
Resolving Balance by Nonlinearity
Nonlinearity & local linearity
Linear specifications can be satisfactory but require coefficient localization or error localization for balance
localization is a form of nonlinearity where coefficients change with n
Direct nonlinear specifications also offer reasonable solutions
And may also benefit from coefficient localization
Motivation for nonlinear specifications
Resolving Balance by Nonlinearity
Model
$$ y_{t+1} = g(x_t) + u_{0,t+1}, \qquad x_{t+1} = \left(1 + \frac{c}{n}\right) x_t + u_{x,t+1}, $$
Park and Phillips (1998): $y_t$ can be $I(0)$ for some g such as $g \in L_1$
  attenuates signal
  transforms $I(1)$ process to a process with memory $d \sim \frac{1}{4}$
Motivation for nonlinear specifications
Resolving Balance by Nonlinearity
Integrable Transform $g \in L_1$ of a random walk
[Figure, two panels: RW vs g(RW) (red); g(RW) + white noise (green)]
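A figure of this kind can be regenerated with a few lines of simulation. The sketch below assumes the particular integrable function $g(x) = e^{-x^2}$, which is only one example of $g \in L_1$ and is not taken from the slides; `integrable_transform_path` is a hypothetical name.

```python
import numpy as np

def integrable_transform_path(n=500, seed=0):
    """Return a random walk, its integrable transform g(RW), and g(RW) plus white noise."""
    rng = np.random.default_rng(seed)
    rw = rng.standard_normal(n).cumsum()
    g = np.exp(-rw**2)                 # assumed example of an integrable g
    y = g + rng.standard_normal(n)     # signal plus stationary noise
    return rw, g, y

rw, g, y = integrable_transform_path()
print(np.abs(rw).max(), g.max(), (g > 0.01).mean())
```

The transform is bounded and is non-negligible only when the random walk is near the origin, which is how an integrable g attenuates the signal from a nonstationary regressor.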
Motivation for nonlinear specifications
Resolving Balance by Nonlinearity
Model
$$ y_{t+1} = g(x_t) + u_{0,t+1}, \qquad x_{t+1} = \left(1 + \frac{c}{n}\right) x_t + u_{x,t+1}, $$
Wang and Phillips (2009a,b; WP): kernel estimation gives
$$ \left(\sqrt{n}\,h\right)^{1/2} \hat g = \left(\sqrt{n}\,h\right)^{1/2}\left[\hat g - g\right] + \left(\sqrt{n}\,h\right)^{1/2} g = O_p\left(\left(\sqrt{n}\,h\right)^{1/2}\right) $$
divergent, thereby achieving (2)
Limit theory is pivotal and mixed normal, so (1) and (2) are covered; and effects are small (3) without coefficient localization for $g \in L_1$.
Motivation for nonlinear specifications
Resolving Balance by Nonlinearity with Localization
For more general g (e.g. asymptotically homogeneous) local to zero predictive power may be achieved with
$$ y_{t+1} = \beta_n g(x_t) + u_{0,t+1}, \qquad x_{t+1} = x_t + u_{x,t+1} $$
with some function $g(x_t) = g(x_t, \pi)$ known up to π
Consistent estimation possible for $\beta_n$ local to zero, e.g.
$$ |\beta_n| \gtrsim \frac{b}{n^{1/4}}, \quad \text{or} \quad \frac{b}{n^{1/2}\,\kappa\left(n^{1/2}, \pi_o\right)} $$
Asymptotic theory in Shi and Phillips (2012) for weakly identified π
Solutions to (a), (b) and (c) can be achieved in this framework.
Pitfalls + Possibilities in Predictive Regression
Sum up
Pitfalls
bias in estimation, testing, CIs & nonstandard inference
uncertain degrees of persistence - can be fatal as in Stock & CY
need to cope with multiple regressors and marginal predictability
omitted predictors misspecification
issues of model validity under alternative
Possibilities
IVX & Nonparametrics
Application to long horizon prediction (Phillips & Lee, 2013)
Quantile Regression balancing & IVX (Phillips, 2013; Lee, 2013)
Thank You!
Robustness Issues
Triangular system with UR or LUR regressors:
yt = Axt + u0t
xt UR, or Local to Unity (LUR)
Invalid asymptotic inference when xt LUR
Estimators for $A$ are not asymptotically $MN(0, \cdot)$
Wald test is not asymptotically $\chi^2$
FM and other correction methods do not work
Inference relies upon knowledge of type of regressor persistence
Feasible IVX Inference
Triangular system with unknown degree of persistence:
$y_t = A x_t + u_{0t}$
$x_t = R_n x_{t-1} + u_{xt}$
$R_n = I_K + \frac{C}{n^{\alpha}}$, $C \le 0$, $\alpha > 0$
$x_t$ UR if $C = 0$ or $\alpha > 1$
$x_t$ LUR if $C < 0$ & $\alpha = 1$
$x_t$ MI if $C < 0$ & $\alpha \in (0, 1)$
Feasible IVX Inference (instrument construction)
Choose
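$R_{nz} = I_K + C_z / n^{\varphi}$ for $\varphi \in (2/3, 1)$, $C_z < 0$
Instrument Construction: $\tilde z_t = \sum_{j=1}^{t} R_{nz}^{t-j} \Delta x_j$
Since $\Delta x_j = u_{xj} + \frac{C}{n^{\alpha}} x_{j-1}$
$\tilde z_t = \sum_{j=1}^{t} R_{nz}^{t-j} u_{xj} + \frac{C}{n^{\alpha}} \sum_{j=1}^{t} R_{nz}^{t-j} x_{j-1} = z_t + \text{error}$
$z_t$: $R_{nz}$-mildly integrated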
Two Cases: (i) $\varphi < \min(\alpha, 1)$, (ii) $1/3 < \alpha \le \varphi$
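The instrument construction above can be sketched for a scalar regressor as follows. The function name `ivx_instrument`, the defaults `c_z = -1` and `phi = 0.9`, and the initialization $x_0 = 0$ are illustrative assumptions; the slides only require $C_z < 0$ and $\varphi \in (2/3, 1)$.

```python
import numpy as np

def ivx_instrument(x, c_z=-1.0, phi=0.9):
    """Scalar IVX instrument: z_t = sum_{j=1}^t R_nz^(t-j) * dx_j, R_nz = 1 + c_z / n^phi."""
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    r_nz = 1.0 + c_z / n ** phi
    dx = np.diff(x, prepend=0.0)     # dx_1 = x_1 - x_0, with x_0 = 0 assumed
    z = np.empty(n)
    acc = 0.0
    for t in range(n):
        acc = r_nz * acc + dx[t]     # equivalent recursion z_t = R_nz * z_{t-1} + dx_t
        z[t] = acc
    return z

# Illustrative use on a simulated random walk.
x = np.cumsum(np.random.default_rng(0).standard_normal(500))
z = ivx_instrument(x)
print(z[:5])
```

Only the observed regressor enters the filter, which is why no knowledge of $C$ or $\alpha$ is needed to build the instrument.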
Feasible IVX Inference
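IVX estimation
$A_n = (Y'Z - n\Delta_{0x})(X'Z)^{-1}$
Theorem
Case (i): When $2/3 < \varphi < \min(\alpha, 1)$,
$n^{\frac{1+\varphi}{2}} \mathrm{vec}(A_n - A) \Rightarrow MN\left(0, (\Psi_{xx}^{-1})' C_z V_{zz} C_z \Psi_{xx}^{-1} \otimes \Omega_{00}\right)$
as $n \to \infty$, where
$\Psi_{xx} = \begin{cases} \Omega_{xx} + \int_0^1 B_x \, dB_x' & x_t \text{ UR} \\ \Omega_{xx} + \int_0^1 J_C \, dJ_C' & x_t \text{ LUR} \\ \Omega_{xx} + V_{xx} C & x_t \text{ MI} \end{cases}$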
Feasible IVX Inference
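Let $V_{xz} = \int_0^{\infty} e^{rC} V_{xx} e^{rC_z} \, dr$
Theorem
Case (ii): For $\alpha \in (1/3, \varphi]$ and $\varphi \in (2/3, 1)$:
$n^{\frac{1+\alpha}{2}} \mathrm{vec}(A_n - A) \Rightarrow \begin{cases} N\left(0, V_{xx}^{-1} \otimes \Omega_{00}\right) & \text{if } \alpha < \varphi \\ N\left(0, V_{xz}^{-1} C^{-1} V_{xx} C^{-1} (V_{xz}')^{-1} \otimes \Omega_{00}\right) & \text{if } \alpha = \varphi \end{cases}$
Asymptotic mixed normality applies for all
$1/3 < \alpha \le \varphi \in (2/3, 1)$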
Feasible IV Inference (testing)
Testing for linear restrictions:
H0 : Hvec (A) = h, rank (H) = r
Wald test based on IV procedure:
$W_n = (H \, \mathrm{vec} A_n - h)' \left[ H \left\{ (X'P_Z X)^{-1} \otimes \hat\Omega_{00} \right\} H' \right]^{-1} (H \, \mathrm{vec} A_n - h)$
Theorem
If $\alpha > 1/3$ and $\varphi$ is chosen in $(2/3, 1)$, then under $H_0$
$W_n \Rightarrow \chi^2(r)$
irrespective of whether $x_t$ is UR, LUR or MI
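As a hedged illustration of the Wald test, the sketch below treats a scalar predictive form $y_t = \beta x_{t-1} + u_{0t}$ with mds errors and no intercept, so the instrument is predetermined and the correction term in the IVX estimator is left out. That simplification, the function name `ivx_wald`, and the tuning values are assumptions made for the example, not the general procedure of the slides.

```python
import numpy as np

def ivx_wald(y, x, beta0=0.0, c_z=-1.0, phi=0.9):
    """Scalar IVX slope estimate and Wald statistic for H0: beta = beta0."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    r_nz = 1.0 + c_z / n ** phi
    dx = np.diff(x, prepend=0.0)           # x_0 = 0 assumed
    z = np.empty(n)
    acc = 0.0
    for t in range(n):
        acc = r_nz * acc + dx[t]           # z_t = R_nz * z_{t-1} + dx_t
        z[t] = acc
    y1, x0, z0 = y[1:], x[:-1], z[:-1]     # pair y_t with x_{t-1} and z_{t-1}
    beta = (y1 @ z0) / (x0 @ z0)           # IV estimator using z as instrument for x
    resid = y1 - beta * x0
    omega00 = resid @ resid / resid.shape[0]
    xpzx = (x0 @ z0) ** 2 / (z0 @ z0)      # scalar version of X' P_Z X
    wald = (beta - beta0) ** 2 * xpzx / omega00
    return beta, wald

# Illustrative data: unit-root regressor, true slope 0.5 (hypothetical values).
rng = np.random.default_rng(0)
n = 2000
x = np.cumsum(rng.standard_normal(n))
y = np.empty(n)
y[0] = 0.0
y[1:] = 0.5 * x[:-1] + rng.standard_normal(n - 1)
beta_hat, wald = ivx_wald(y, x, beta0=0.5)
print(beta_hat, wald)   # wald would be compared with a chi-square(1) critical value
```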
Feasible IVX Inference —Remarks
Standard $\chi^2$ inference applies without a priori knowledge of the order of persistence of $x_t$: a full solution to Elliott (1998)
Price paid for robustness: reduction in the rate of convergence (RoC) of $A_n$ to $O\left(n^{\frac{1+\varphi}{2}}\right)$ from $O(n)$.
Exception - no cost when $x_t$ is mildly integrated with $\alpha \le \varphi$
IVX method is feasible: instruments constructed from the regressors without extra information.
Hence the requirement $\alpha > 1/3$: intractable simultaneity
Mildly integrated processes provide good instruments because
$z_t$ stationary: $O_p(1)$; $z_t$ mild: $O_p(n^{\varphi/2})$; $z_t$ LUR: $O_p(n^{1/2})$
reduced RoC $\Longrightarrow$ eliminates endogeneity
increased RoC $\Longrightarrow$ tractable asymptotic bias
Nonparametric predictive regression
Nonparametric estimation
Model with single lagged regressor
$y_t = g(x_{t-\ell}) + u_{0t}$,
$x_t = \left(1 + \frac{c}{n}\right) x_{t-1} + u_{xt}$
unknown $g$, mds $u_{0t}$, and SM, LM, AP $u_{xt}$
Nonparametric NW estimation
$\hat g(x) = \frac{\sum_{t=\ell+1}^{n} K_h(x_{t-\ell} - x) \, y_t}{\sum_{t=\ell+1}^{n} K_h(x_{t-\ell} - x)}$
Limit theory (Wang & Phillips, 2009)
$\left( \sum_{t=1+\ell}^{n} K\left( \frac{x_{t-\ell} - x}{h_n} \right) \right)^{1/2} (\hat g(x) - g(x)) \to_d N\left(0, \sigma_0^2 \int_{-\infty}^{\infty} K(x)^2 \, dx\right)$
Includes SM, LM, AP $u_{xt}$ with $|d| < \frac{1}{2}$ (Kasparis, Andreou & Phillips, 2013)
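A minimal sketch of the Nadaraya-Watson estimator with a lagged regressor. The Gaussian kernel, the function name `nw_estimate`, and the bandwidth used in the example are assumptions made for illustration; the slides do not fix a kernel here.

```python
import numpy as np

def nw_estimate(x, y, grid, h, lag=1):
    """Nadaraya-Watson estimate of g at each grid point, pairing y_t with x_{t-lag}."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    grid = np.atleast_1d(np.asarray(grid, dtype=float))
    x_lag, y_lead = x[:-lag], y[lag:]
    u = (x_lag[None, :] - grid[:, None]) / h
    w = np.exp(-0.5 * u ** 2)              # Gaussian kernel; its constant cancels in the ratio
    return (w @ y_lead) / w.sum(axis=1)

# Illustrative use: random-walk regressor and a logistic regression function.
rng = np.random.default_rng(0)
n = 500
x = np.cumsum(rng.standard_normal(n))
y = np.empty(n)
y[0] = 0.0
y[1:] = 1.0 / (1.0 + np.exp(-x[:-1])) + rng.standard_normal(n - 1)
grid = np.quantile(x[:-1], [0.1, 0.5, 0.9])
h = x.std() * n ** -0.2                    # illustrative bandwidth choice
print(nw_estimate(x, y, grid, h))
```

Grid points are taken inside the observed range of the regressor so that every kernel sum in the denominator is strictly positive.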
Nonparametric predictive regression
Nonlinear model testing
Test constant regression function $H_0 : g(x) = \mu$ using a t-ratio
$\hat t(x, \hat\mu) := \left( \frac{\sum_{t=1+\ell}^{n} K\left( \frac{x_{t-\ell} - x}{h_n} \right)}{\hat\sigma_0^2 \int_{-\infty}^{\infty} K(\lambda)^2 \, d\lambda} \right)^{1/2} (\hat g(x) - \hat\mu) \to_d N(0, 1)$
with $\hat\mu = \sum_{t=1+\ell}^{n} y_t / n$ and $\hat\sigma_0^2 = n^{-1} \sum_{t=1+\ell}^{n} (y_t - \hat\mu)^2$ under the null
Test functionals over isolated points $X_s = \{x_1, ..., x_s\}$ for some fixed $s$
$F_{sum} := \sum_{x \in X_s} [\hat t(x, \hat\mu)]^2$ and $F_{max} := \max_{x \in X_s} [\hat t(x, \hat\mu)]^2$.
Limit theory under $H_0$ as $n \to \infty$
$F_{sum} \to_d \chi_s^2$ and $F_{max} \to_d Y \sim_d \max_{i \le s} (\chi_{1i}^2)$
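The t-ratio and the two test functionals can be sketched as below. The Gaussian kernel (for which $\int K^2 = 1/(2\sqrt{\pi})$), the function name, the grid and the bandwidth are assumptions made for the example.

```python
import numpy as np

def np_predictability_tests(x, y, grid, h, lag=1):
    """Pointwise t-ratios for H0: g(x) = mu, plus the F_sum and F_max functionals."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    grid = np.atleast_1d(np.asarray(grid, dtype=float))
    n = x.shape[0]
    x_lag, y_lead = x[:-lag], y[lag:]
    u = (x_lag[None, :] - grid[:, None]) / h
    k = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)     # Gaussian kernel K
    k_sum = k.sum(axis=1)
    g_hat = (k @ y_lead) / k_sum                         # NW estimate at each grid point
    mu_hat = y_lead.sum() / n                            # sum over t = 1 + lag, ..., n, divided by n
    sigma2_hat = ((y_lead - mu_hat) ** 2).sum() / n
    int_k2 = 1.0 / (2.0 * np.sqrt(np.pi))                # integral of K^2 for the Gaussian kernel
    t = np.sqrt(k_sum / (sigma2_hat * int_k2)) * (g_hat - mu_hat)
    return t, (t ** 2).sum(), (t ** 2).max()

# Illustrative data generated under the null g = 0.
rng = np.random.default_rng(0)
n = 500
x = np.cumsum(rng.standard_normal(n))
y = rng.standard_normal(n)
grid = np.quantile(x[:-1], [0.1, 0.3, 0.5, 0.7, 0.9])
t, f_sum, f_max = np_predictability_tests(x, y, grid, h=n ** -0.2)
print(f_sum, f_max)   # f_sum would be compared with a chi-square(5) critical value
```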
Nonparametric predictive regression
Simulations - Model design
Model
$y_t = g(x_{t-1}) + u_{0t}$, $x_t = \left(1 + \frac{c}{n}\right) x_{t-1} + u_{xt}$, $x_0 = 0$
Error mechanisms
$u_{xt} = \rho_x u_{xt-1} + \eta_t$, $\rho_x = 0, 0.3$ (SM)
$(I - L)^d u_{xt} = \eta_t$, $d = -0.25, 0.25$ (AP & LM)
$\begin{bmatrix} u_{0t} \\ \eta_t \end{bmatrix} \sim iid \, N\left(0, \begin{bmatrix} 1 & r \\ r & 1 \end{bmatrix}\right)$, $|r| < 1$.
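A sketch of how this design might be simulated. The function name `simulate_design`, the zero start-up values, and the truncation of the fractional filter at the sample start are assumptions of the example rather than details given on the slide.

```python
import numpy as np

def simulate_design(n, c=0.0, rho_x=0.0, d=0.0, r=0.0, g=lambda v: 0.0 * v, seed=0):
    """Return (x_0..x_n, y_1..y_n) from the simulation design with regression function g."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, r], [r, 1.0]])
    e = rng.multivariate_normal(np.zeros(2), cov, size=n)
    u0, eta = e[:, 0], e[:, 1]
    if d != 0.0:
        # (1 - L)^d u_x = eta, so u_x = sum_j psi_j eta_{t-j} with psi_j = psi_{j-1} (j - 1 + d) / j.
        psi = np.ones(n)
        for j in range(1, n):
            psi[j] = psi[j - 1] * (j - 1 + d) / j
        ux = np.convolve(eta, psi)[:n]     # filter truncated at the sample start (assumption)
    else:
        ux = np.empty(n)
        prev = 0.0
        for t in range(n):
            prev = rho_x * prev + eta[t]   # AR(1) short-memory errors
            ux[t] = prev
    x = np.empty(n + 1)
    x[0] = 0.0
    for t in range(1, n + 1):
        x[t] = (1.0 + c / n) * x[t - 1] + ux[t - 1]
    y = g(x[:-1]) + u0                     # y_t = g(x_{t-1}) + u_{0t}
    return x, y

x, y = simulate_design(500, c=-50.0, rho_x=0.3, r=0.5, g=lambda v: 0.015 * v)
print(x.shape, y.shape)
```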
Nonparametric predictive regression
Simulations - Model design
Functional forms
$g_0(x) = 0$ (null hypothesis)
$g_1(x) = 0.015x$ (linear)
$g_2(x) = \frac{1}{4} \mathrm{sign}(x) |x|^{1/4}$ (polynomial)
$g_3(x) = \frac{1}{5} \ln(|x| + 0.1)$ (logarithmic)
$g_4(x) = (1 + e^{-x})^{-1}$ (logistic)
$g_5(x) = (1 + |x|^{0.9})^{-1}$ (reciprocal)
$g_6(x) = e^{-5x^2}$ (integrable)
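For convenience, the functional forms can be collected in a small lookup table. The dictionary name `G_FUNCS` is hypothetical, and the coefficients follow a reading of the slide in which $g_2$ and $g_3$ carry the factors 1/4 and 1/5.

```python
import numpy as np

# Regression functions used in the simulation design, keyed by their slide labels.
G_FUNCS = {
    "g0": lambda x: np.zeros_like(x, dtype=float),         # null hypothesis
    "g1": lambda x: 0.015 * x,                             # linear
    "g2": lambda x: 0.25 * np.sign(x) * np.abs(x) ** 0.25, # polynomial
    "g3": lambda x: 0.2 * np.log(np.abs(x) + 0.1),         # logarithmic
    "g4": lambda x: 1.0 / (1.0 + np.exp(-x)),              # logistic
    "g5": lambda x: 1.0 / (1.0 + np.abs(x) ** 0.9),        # reciprocal
    "g6": lambda x: np.exp(-5.0 * x ** 2),                 # integrable
}

v = np.array([-2.0, 0.0, 1.0])
for name, g in G_FUNCS.items():
    print(name, g(v))
```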
Nonparametric predictive regression
Simulation Results - Size (Nominal 5%), Tests - NP, FM, JM
n = 500, $h = n^{-b}$, $x_t \sim$ local to unity process (c)
[Figure 1, two panels, horizontal axis R from -1 to 1, vertical axis from 0 to 1. Fig. 1(a): c = 0. Fig. 1(b): c = -50. Series: NPP b = 0.1, NPP b = 0.2, FMLS, J&M.]
Nonparametric predictive regression
Simulation Results - Size (Nominal 5%), Tests - NP, FM, JM
n = 500, $h = n^{-b}$, $\Delta x_t \sim$ ARFIMA(d)
[Figure 2, two panels, horizontal axis R from -1 to 1, vertical axis from 0 to 1. Fig. 2(a): d = -0.25. Fig. 2(b): d = 0.25. Series: NPP b = 0.1, NPP b = 0.2, FMLS, J&M.]
Nonparametric predictive regression
Simulation Results - Power, Tests - NP, FM, JM
n = 1000, $h = n^{-b}$, $g(x) = 0.015x$, $\rho_x = 0.3$
[Figure 3, two panels, horizontal axis R from -1 to 1, vertical axis from 0 to 1. Fig. 3(a): c = 0. Fig. 3(b): c = -50. Series: NPP b = 0.1, NPP b = 0.2, FMLS, J&M.]
Nonparametric predictive regression
Simulation Results - Power, Tests - NP, FM, JM
n = 1000, $h = n^{-b}$, c = 0, $\rho_x = 0.3$
[Figure 4, two panels, horizontal axis R from -1 to 1, vertical axis from 0 to 1. Fig. 4(a): $g(x) = (1 + |x|^{0.9})^{-1}$. Fig. 4(b): $g(x) = e^{-x^2}$. Series: NPP b = 0.1, NPP b = 0.2, FMLS, J&M.]
Nonparametric predictive regression
Empirics
Predictability Test Results for SP500 Returns 1926:1 - 2010:12 (n = 1009)
Two common predictors, various grids and bandwidths $h_n = \sigma_x n^{-b}$
Tests   Grid pts   Lag   b: Dividend Price ratio, Log(D/P)   b: Earnings Price ratio, Log(E/P)
Sum     10         1     -                                   0.1
                   4     0.3, 0.4                            0.1
Sum     35         1     -                                   0.1
                   4     0.2, 0.3                            0.1, 0.2
Sum     50         1     0.2, 0.3, 0.4                       0.1, 0.2
                   4     -                                   0.1, 0.2, 0.3, 0.4
Pitfalls + Possibilities in Predictive Regression
Pitfalls
bias & nonstandard inference
uncertain degrees of persistence - can be fatal
zero coverage probability in CIs for stationarity/MI regressors
need to cope with multiple regressors and marginal predictability
omitted predictors misspecification - need to allow for temporal dependence
model validity under alternative
Pitfalls + Possibilities in Predictive Regression
Possibilities - IVX
easy to use, no simulations/integration, standard chi-square inference
holds for local to unity, mildly integrated, mildly explosive, and stationary regressors
handles multiple regressors - many fundamentals, varying degrees of persistence
handles multiple dependent variables, e.g., size sorted, value sorted,momentum sorted returns can all be regressed simultaneously
Pitfalls + Possibilities in Predictive Regression
Possibilities - Nonparametrics
easy to use
copes with predictors of general functional form
assures model validity under alternative
particularly wide applicability regarding regressor persistence
good size and power properties against parametric methods
but challenged by multiple regressors
Motivation for dependent innovations
Scalar regressor misspecification with mds innovations
Many papers impose mds innovations in predictive regression
$y_{t+1} = \beta_n x_t + u_{0t+1}$, $E_{\mathcal{F}_t}(u_{0t+1}) = 0$ (7)
Bivariate regression with mds innovations is unconvincing when
two or more fundamentals $x_{1t}$ and $x_{2t}$ are known to have predictive power,
as many authors have found (e.g. Campbell & Yogo, 2006)
Omitted variable misspecification produces contradiction to (7)
$y_{t+1} = \beta_n x_{1t} + u_{0t+1}$, $E_{\mathcal{F}_t}(u_{0t+1}) \ne 0$ since $x_{2t} \in \mathcal{F}_t$.
Predictive Regression - IVX Limit Theory
The Model
Multivariate system of predictive regressions:
$y_{t+1} = A x_t + u_{0t+1}$
$x_{t+1} = R_n x_t + u_{xt+1}$
$R_n = I_p + \frac{C}{n^{\alpha}}$, for some $\alpha \in (0, 1]$
$A$: $m \times p$ coefficient matrix and $C = \mathrm{diag}(c_1, c_2, ..., c_p)$ are unknown.
Each $x_{it}$ can be in the general vicinity of unity
mildly integrated ($C < 0$, $\alpha \in (0, 1)$),
nearly integrated ($C < 0$, $\alpha = 1$),
integrated ($C = 0$),
locally explosive ($C > 0$, $\alpha = 1$),
mildly explosive ($C > 0$, $\alpha \in (0, 1)$).
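Predictive Regression - IVX Limit Theory
Innovation structure
Weakly dependent errors:
$u_t = \begin{bmatrix} u_{0t} \\ u_{xt} \end{bmatrix} = \sum_{j=0}^{\infty} F_j \varepsilon_{t-j}$, $\varepsilon_t \sim iid(0, \Sigma)$,
$\Sigma > 0$, $E\|\varepsilon_1\|^4 < \infty$, $F_0 = I_{m+p}$, $\sum_{j=0}^{\infty} j \|F_j\| < \infty$,
$F(z) = \sum_{j=0}^{\infty} F_j z^j$ and $F(1) = \sum_{j=0}^{\infty} F_j > 0$.
$\Omega = \sum_{h=-\infty}^{\infty} E(u_t u_{t-h}') = F(1) \Sigma F(1)'$, $\Omega = \begin{bmatrix} \Omega_{00} & \Omega_{0x} \\ \Omega_{x0} & \Omega_{xx} \end{bmatrix}$.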
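Predictive Regression - IVX Limit Theory
Preliminary tools
FCLT:
$\frac{1}{\sqrt{n}} \sum_{j=1}^{\lfloor ns \rfloor} u_j = \frac{1}{\sqrt{n}} \sum_{j=1}^{\lfloor ns \rfloor} \begin{bmatrix} u_{0j} \\ u_{xj} \end{bmatrix} = \begin{bmatrix} B_{0n}(s) \\ B_{xn}(s) \end{bmatrix} \Longrightarrow \begin{bmatrix} B_0(s) \\ B_x(s) \end{bmatrix} = BM \begin{bmatrix} \Omega_{00} & \Omega_{0x} \\ \Omega_{x0} & \Omega_{xx} \end{bmatrix}$.
Local to unity asymptotics:
$\frac{x_t}{\sqrt{n}} \Longrightarrow J_x^c(r)$ with $\lfloor nr \rfloor = t$,
where
$J_x^c(r) = \int_0^r e^{(r-s)C} \, dB_x(s)$,
encompassing the unit root case
$J_x^c(r) = B_x(r)$ if $C = 0$.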
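Predictive Regression - IVX Limit Theory
Preliminary tools
BN decomposition (Phillips and Solo, 1992):
$u_t = F(1) \varepsilon_t - \Delta \tilde\varepsilon_t = \begin{bmatrix} F_0(1) \\ F_x(1) \end{bmatrix} \varepsilon_t - \Delta \tilde\varepsilon_t$
$\tilde\varepsilon_t = \sum_{j=0}^{\infty} \tilde F_j \varepsilon_{t-j}$, $\tilde F_j = \sum_{s=j+1}^{\infty} F_s$
in partitioned form
$u_{0t} = F_0(1) \varepsilon_t - \Delta \tilde\varepsilon_{0t}$
$u_{xt} = F_x(1) \varepsilon_t - \Delta \tilde\varepsilon_{xt}$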
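The BN decomposition can be checked numerically in a scalar finite-order case. The MA(3) coefficients below are arbitrary illustrative values, and the sketch only verifies the algebraic identity $u_t = F(1)\varepsilon_t - \Delta\tilde\varepsilon_t$.

```python
import numpy as np

rng = np.random.default_rng(1)
F = np.array([1.0, 0.6, -0.3, 0.2])      # F_0, ..., F_q (hypothetical MA coefficients)
q = F.size - 1
n = 200
eps = rng.standard_normal(n + q)         # extra q draws supply the pre-sample lags

# Left side: u_t = sum_{j=0}^q F_j eps_{t-j}, for the n dates with all lags available.
u = np.convolve(eps, F)[q:n + q]

# Right side: F(1) eps_t - (eps~_t - eps~_{t-1}), with F~_j = sum_{s > j} F_s.
F_tilde = np.array([F[j + 1:].sum() for j in range(q)])
eps_tilde = np.convolve(eps, F_tilde)[:n + q]
rhs = F.sum() * eps[q:n + q] - (eps_tilde[q:n + q] - eps_tilde[q - 1:n + q - 1])

print(np.max(np.abs(u - rhs)))           # numerically zero if the identity holds
```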
IVX Estimation
IVX estimator - local to unity case
Consider first $\alpha = 1$ (local to unity $x_t$). We impose, for technical reasons, the rate condition:
$\frac{n^{1/2}}{n^{\varphi}} + \frac{n^{\varphi}}{k} + \frac{k}{n} \to 0$.
Reasonable since, e.g., $k = n^{0.6}$ corresponds to 4-5 years among 60-80 years for monthly data.
IVX Estimator:
$\hat B_{IVX} - B = \left( \sum_{t=1}^{n-k} u_{0t+k} (\tilde z_t^k)' \right) \left( \sum_{t=1}^{n-k} x_t^k (\tilde z_t^k)' \right)^{-1}$.
Limit Theory
Numerator
Numerator decomposition
$\frac{1}{k^{1/2} n^{1/2+\varphi}} \sum_{t=1}^{n-k} u_{0t+k} (\tilde z_t^k)'$
$= \frac{1}{k^{1/2} n^{1/2+\varphi}} \sum_{t=1}^{n-k} u_{0t+k} (z_t^k)' + \frac{1}{k^{1/2} n^{3/2+\varphi}} \sum_{t=1}^{n-k} u_{0t+k} (\psi_{nt}^k)' C$
$= \frac{1}{k^{1/2} n^{1/2+\varphi}} \sum_{t=1}^{n-k} u_{0t+k} (z_t^k)' + o_p(1)$
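Limit Theory
Numerator
Component limits
Lemma
1. $\frac{C_z}{k^{1/2} n^{\varphi}} z_t^k \Longrightarrow V_x(t) \equiv N(0, \Omega_{xx})$ for any $t$
2. $\frac{1}{n} \sum_{t=1}^{n-k} \left( \frac{C_z}{k^{1/2} n^{\varphi}} z_t^k \right) \left( \frac{C_z}{k^{1/2} n^{\varphi}} z_t^k \right)' \to_p \Omega_{xx} = F_x(1) \Sigma F_x(1)'$
3. $\frac{x_{t=\lfloor nr \rfloor}^k}{k n^{1/2}} \Rightarrow J_x^c(r)$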
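Limit Theory
Numerator
Use BN decomposition and
Apply MGCLT to the dominant component of the numerator
Lemma
For $\frac{n^{\varphi}}{k} + \frac{k}{n} \to 0$,
$\frac{1}{k^{1/2} n^{1/2+\varphi}} \sum_{t=1}^{n-k} u_{0t+k} (z_t^k)' \Longrightarrow N\left(0, C_z^{-1} \Omega_{xx} C_z^{-1} \otimes \Omega_{00}\right)$
Since
$\frac{1}{k^{1/2} n^{1/2+\varphi}} \sum_{t=1}^{n-k} u_{0t+k} (z_t^k)' \sim \frac{1}{\sqrt{n}} F_0(1) \sum_{t=1}^{n-k} \varepsilon_{t+k} \left( \frac{z_t^k}{k^{1/2} n^{\varphi}} \right)'$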
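Limit Theory
Denominator
Denominator decomposition
$\frac{1}{k^2 n^{1+\varphi}} \sum_{t=1}^{n-k} x_t^k (\tilde z_t^k)'$
$= \frac{1}{n} \sum_{t=1}^{n-k} \frac{x_t^k}{k n^{1/2}} \left( \frac{z_t^k}{k n^{\varphi}} \right)' + \frac{1}{n} \sum_{t=1}^{n-k} \frac{x_t^k}{k n^{1/2}} \left( \frac{C_z}{k n^{1/2+\varphi}} \psi_{nt}^k \right)' C_z^{-1} C$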
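Limit Theory
Denominator
Component limits
Lemma
1. $\frac{1}{k^2 n^{2+\varphi}} \sum_{t=1}^{n-k} x_t^k (\psi_{nt}^k)' C \Longrightarrow \left( \int_0^1 J_x^c(r) J_x^c(r)' \, dr \right) C_z^{-1} C$
2. $\frac{1}{k^2 n^{1+\varphi}} \sum_{t=1}^{n-k} x_t^k (z_t^k)' C_z \to_p \frac{1}{2} \Omega_{xx}$
Denominator limit
$\frac{1}{k^2 n^{1+\varphi}} \sum_{t=1}^{n-k} x_t^k (\tilde z_t^k)' \Longrightarrow \frac{1}{2} \Omega_{xx} C_z^{-1} + \left( \int_0^1 J_x^c(r) J_x^c(r)' \, dr \right) C_z^{-1} C$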
Limit Theory
IVX Estimator
Combining gives the limit theory for
$\hat B_{IVX} - B = \left( \sum_{t=1}^{n-k} u_{0t+k} (\tilde z_t^k)' \right) \left( \sum_{t=1}^{n-k} x_t^k (\tilde z_t^k)' \right)^{-1}$.
Theorem
$n^{1/2} k^{3/2} (\hat B_{IVX} - B) \Longrightarrow MN(0, \Sigma_B)$,
where
$\Sigma_B = (\Psi_{cxz}^{-1})' C_z^{-1} \Omega_{xx} C_z^{-1} (\Psi_{cxz}^{-1}) \otimes \Omega_{00}$,
$\Psi_{cxz} = \frac{1}{2} \Omega_{xx} C_z^{-1} + \left( \int_0^1 J_x^c(r) J_x^c(r)' \, dr \right) C_z^{-1} C$.
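Limit Theory
IVX Estimator
Mixed normality holds because
$\zeta_{nt+k} := \begin{bmatrix} \frac{1}{k^{1/2} n^{1/2+\varphi}} z_t^k \otimes \varepsilon_{t+k} \\ \frac{1}{\sqrt{n}} \varepsilon_{t+k} \end{bmatrix}$
has asymptotically orthogonal components:
$\sum E_{\mathcal{F}_{nt+k-1}} \zeta_{nt+k} \zeta_{nt+k}' = \begin{bmatrix} \frac{1}{n} \sum_{t=1}^{n-k} \left( \frac{1}{k^{1/2} n^{\varphi}} z_t^k \right) \left( \frac{1}{k^{1/2} n^{\varphi}} z_t^k \right)' \otimes \Sigma & \frac{1}{n} \sum_{t=1}^{n-k} \left( \frac{1}{k^{1/2} n^{\varphi}} z_t^k \right) \otimes \Sigma \\ \frac{1}{n} \sum_{t=1}^{n-k} \left( \frac{1}{k^{1/2} n^{\varphi}} z_t^k \right)' \otimes \Sigma & \Sigma \end{bmatrix}$
$\to_p \begin{bmatrix} C_z^{-1} \Omega_{xx} C_z^{-1} \otimes \Sigma & 0 \\ 0 & \Sigma \end{bmatrix}$.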
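Limit Theory of Tests
Self-normalized IVX Estimator for use in Testing
Lemma
1. $\frac{1}{k n^{1+2\varphi}} \sum_{t=1}^{n-k} (\tilde z_t^k)(\tilde z_t^k)' = \frac{1}{k n^{1+2\varphi}} \sum_{t=1}^{n-k} (z_t^k)(z_t^k)' + o_p(1)$.
2. $n k^3 (X'P_Z X)^{-1} \otimes \hat\Omega_{00} \Longrightarrow (\Psi_{cxz}^{-1}) C_z^{-1} (\Omega_{xx}) C_z^{-1} (\Psi_{cxz}^{-1})' \otimes \Omega_{00}$
where
$(X'P_Z X)^{-1} = \left[ \left( \sum_{t=1}^{n-k} x_t^k (\tilde z_t^k)' \right) \left( \sum_{t=1}^{n-k} (\tilde z_t^k)(\tilde z_t^k)' \right)^{-1} \left( \sum_{t=1}^{n-k} x_t^k (\tilde z_t^k)' \right)' \right]^{-1}$.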
Limit Theory
Self-normalized IVX Estimator for use in Testing
Theorem
$\mathrm{Vec}(\hat B_{IVX} - B)' \left[ (X'P_Z X)^{-1} \otimes \hat\Omega_{00} \right]^{-1} \mathrm{Vec}(\hat B_{IVX} - B) \Longrightarrow \chi^2(mp)$,
using a consistent estimator $\hat\Omega_{00}$ for $\Omega_{00}$.
Standard Wald statistic $W_{IVX}$ - easy to compute
Standard chi-squared test
Model Validity & Consistency under Alternative
Balance by localization of coefficient
Local to zero predictive regression:
$y_{t+k} = B_n x_t^k + u_{0t+k}$, $x_t^k = \sum_{j=1}^{k} x_{t+j-1}$
Suppose $B_n = \frac{B}{n^{\gamma} k}$ with $B \ne 0$. Now
$\frac{x_{t=\lfloor nr \rfloor}^k}{k n^{1/2}} \Rightarrow J_x^c(r)$
Then
$y_{t+k} \sim \begin{cases} u_{0t+k} + B n^{1/2-\gamma} J_x^c\left(\frac{t}{n}\right) \ne I(0) & \gamma < 0.5 \\ u_{0t+k} + B J_x^c\left(\frac{t}{n}\right) = O_p(1) & \gamma = 0.5 \\ u_{0t+k} = O_p(1) & \gamma > 0.5 \end{cases}$
Balance for $\gamma \ge 0.5$
Model Validity & Consistency under Alternative
Balance by localization of coefficient
Test consistency
$n^{1/2} k^{3/2} \hat B_{IVX} = n^{1/2} k^{3/2} (\hat B_{IVX} - B_n) + n^{1/2} k^{3/2} B_n = O_p(1) + O_p\left( \frac{k^{1/2}}{n^{\gamma - 1/2}} \right)$
Test is consistent whenever $\frac{n^{1/2} k^{3/2}}{n^{\gamma} k} = \frac{k^{1/2}}{n^{\gamma - 1/2}} \to \infty$
Convergence rate of $\hat B_{IVX}$ is fast enough to distinguish $B_n \ne 0$
Long-Horizon IVX
IVX estimator - mildly integrated case
Consider first $\alpha < 1$, $C < 0$ (mildly integrated $x_t$). We use the rate condition:
$\frac{n^{1/2}}{n^{\varphi}} + \frac{n^{\varphi}}{k} + \frac{k}{n^{\alpha}} + \frac{n^{\alpha}}{n} \to 0$.
Set horizon $k \ll n^{\alpha} \ll n$
IVX limit theory:
Theorem
$n^{1/2} k^{3/2} (\hat B_{IVX} - B) \Longrightarrow N\left(0, 4\Omega_{xx}^{-1} \otimes \Omega_{00}\right)$
$W_{IVX} \Longrightarrow \chi^2(mp)$
Same convergence rate as for $\alpha = 1$
Long-Horizon IVX
IVX estimator - mildly explosive case
Consider first $\alpha < 1$, $C > 0$ (mildly explosive $x_t$ with $R_n = I + n^{-\alpha} C$). We use the same rate condition:
$\frac{n^{1/2}}{n^{\varphi}} + \frac{n^{\varphi}}{k} + \frac{k}{n^{\alpha}} + \frac{n^{\alpha}}{n} \to 0$.
IVX limit theory
Theorem
$k n^{\alpha} (\hat B_{IVX} - B) R_n^n \Longrightarrow MN\left(0, \int_0^{\infty} e^{-pC} Y_C Y_C' e^{-pC} \, dp \otimes \Omega_{00}\right)$
$W_{IVX} \Longrightarrow \chi^2(mp)$
Faster convergence rate than $\alpha = 1$
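Localizing Coefficient Estimation
Univariate Model - illustration
Localizing coefficient $C$ and index $\alpha$ in
$x_{t+1} = R_n x_t + u_{xt+1}$
$R_n = 1 + \frac{c}{n^{\alpha}}$, for some $\alpha \in (0, 1]$
$\alpha$ is consistently estimable
Estimators: $\hat\alpha = -\frac{\log|\hat R_n - 1|}{\log n} \to_p \alpha$; $\tilde\alpha = \frac{\log\left( \frac{2}{n} \sum_{t=1}^{n} x_t^2 \right)}{\log n} \to_p \alpha$
Limit distribution
$n^{\frac{1-\alpha}{2}} \log n \left\{ \hat\alpha - \alpha + \frac{1}{\log n} \log\left| c \frac{\sigma_x^2}{\omega_x^2} \right| \right\} \Rightarrow \frac{\omega_x^2}{c \sigma_x^2} \xi_c$, $\alpha \in (0, 1)$
$(\log n) \{\hat\alpha - 1\} \Rightarrow -\log|c + \xi_c|$, $\alpha = 1$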
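Both estimators of $\alpha$ can be tried on a simulated mildly integrated series. In this sketch $\hat R_n$ is taken to be the OLS autoregressive coefficient, and the values $c = -1$, $\alpha = 0.5$, unit-variance iid errors and the function name `estimate_alpha` are assumptions of the example.

```python
import numpy as np

def estimate_alpha(x):
    """Two estimates of the persistence index alpha in R_n = 1 + c / n^alpha."""
    x = np.asarray(x, dtype=float)
    n = x.shape[0] - 1
    r_hat = (x[1:] @ x[:-1]) / (x[:-1] @ x[:-1])          # OLS autoregressive coefficient
    alpha_ols = -np.log(abs(r_hat - 1.0)) / np.log(n)     # based on |R_n - 1| = |c| / n^alpha
    alpha_var = np.log(2.0 * np.mean(x[1:] ** 2)) / np.log(n)  # based on the sample second moment
    return alpha_ols, alpha_var

rng = np.random.default_rng(2)
n, c, alpha = 20000, -1.0, 0.5
rho = 1.0 + c / n ** alpha
u = rng.standard_normal(n)
x = np.zeros(n + 1)
for t in range(1, n + 1):
    x[t] = rho * x[t - 1] + u[t - 1]

alpha_ols, alpha_var = estimate_alpha(x)
print(alpha_ols, alpha_var)
```

Because the centring terms vanish only at rate $1/\log n$, estimates from moderate samples can sit some distance from the true index when $|c|$ or the error variance differs from one.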
Pitfalls: constructing a CI for c
Stock's confidence belts for c
Confidence belts (levels 2.5%, 50% and 97.5%) with asymptotic functional approximations (Phillips, 2013)