
Part II: Estimation and Forecasting with ARIMA models

Firmin Doko Tchatoka

https://www.adelaide.edu.au/directory/firmin.dokotchatoka

The University of Adelaide

July 9, 2018

Firmin Doko Tchatoka (UoA) 2018 ES-Summer Institute-Cotonou July 9, 2018 1 / 41

Targeted objectives of this module

Goals

• Study a class of parametric univariate time series models: ARIMA = AR/I/MA

− AR models: definition, stationarity, autocorrelation & partial autocorrelation functions

− MA models: autocorrelation & partial autocorrelation functions

− General ARMA models

- Estimation

- Forecasting

• Forecasting SARIMA models

ARIMA & AR processes: Definition

ARIMA = Autoregressive Integrated Moving Average. To understand ARIMA processes, it is important to know the properties of their AR, I, and MA parts.

• A time series $y_t \sim AR(p)$ is a linear regression of $y_t$ on a constant and its first $p$ lags:

$$y_t = \phi_0 + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \dots + \phi_p y_{t-p} + \varepsilon_t \qquad (1)$$

− $\varepsilon_t$ is an i.i.d. error term with $E(\varepsilon_t) = 0$ and $E(\varepsilon_t^2) = \sigma^2$

− We cannot use (1) directly to forecast $y_{t+\tau}$ because the $\phi$'s and $\sigma^2$ are unknown and must be estimated

− The lag length $p$ is also unknown: we only have the data $y_1, y_2, \dots, y_T$

− AR(1): $y_t = \phi_0 + \phi_1 y_{t-1} + \varepsilon_t$; AR(2): $y_t = \phi_0 + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \varepsilon_t$
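To make recursion (1) concrete, here is a minimal Python sketch (standard library only) that simulates an AR(p) path with Gaussian shocks. The function name `simulate_ar`, the burn-in length, and the coefficient values are assumptions made for this illustration; they do not come from the slides.

```python
import random


def simulate_ar(phi0, phis, sigma, n, seed=0, burn_in=200):
    """Simulate n draws of y_t = phi0 + sum_j phis[j-1] * y_{t-j} + eps_t."""
    rng = random.Random(seed)
    p = len(phis)
    y = [0.0] * p  # arbitrary start values, discarded with the burn-in
    for _ in range(burn_in + n):
        eps = rng.gauss(0.0, sigma)
        lagged = sum(phis[j] * y[-1 - j] for j in range(p))
        y.append(phi0 + lagged + eps)
    return y[-n:]


# Illustrative AR(2); the coefficients are arbitrary but satisfy stationarity.
path = simulate_ar(phi0=1.0, phis=[0.5, 0.2], sigma=1.0, n=2000)
print(sum(path) / len(path))  # sample mean; the theoretical mean is 1 / (1 - 0.5 - 0.2)
```

In practice the coefficients are not known, which is why the later slides turn to estimation.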

MA models

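• A time series $y_t \sim MA(q)$ if

$$y_t = \theta_0 + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \dots + \theta_q \varepsilon_{t-q} \qquad (2)$$

− $\varepsilon_t$ is an error term that is i.i.d. across $t$ with $E(\varepsilon_t) = 0$ and $E(\varepsilon_t^2) = \sigma^2$

− We cannot use (2) directly to forecast $y_{t+\tau}$ because both the parameters $\theta_0, \theta_1, \theta_2, \dots, \theta_q, \sigma^2$ and the current and lagged error terms are unknown ⇒ they must be estimated

− The lag length $q$ is also unknown ⇒ it must be estimated

− $q = 1$ → MA(1): $y_t = \theta_0 + \varepsilon_t + \theta_1 \varepsilon_{t-1}$; $q = 2$ → MA(2): $y_t = \theta_0 + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2}$

− MA(q) is a linear regression of $y_t$ on a constant and the first $q$ lags of the error term: we cannot run OLS because the explanatory variables are unobserved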
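Because the shocks in (2) are unobserved, they have to be backed out from the data for given parameter values. The sketch below illustrates one common convention, setting pre-sample shocks to zero; the function names and numbers are hypothetical, and this is not a description of what Stata's arima does internally.

```python
def ma_from_shocks(theta0, thetas, eps):
    """Build y_t = theta0 + eps_t + sum_j thetas[j-1] * eps_{t-j}, pre-sample shocks = 0."""
    y = []
    for t in range(len(eps)):
        lagged = sum(th * eps[t - j]
                     for j, th in enumerate(thetas, start=1) if t - j >= 0)
        y.append(theta0 + eps[t] + lagged)
    return y


def recover_shocks(theta0, thetas, y):
    """Invert the MA recursion: eps_t = y_t - theta0 - sum_j thetas[j-1] * eps_{t-j}."""
    eps = []
    for t in range(len(y)):
        lagged = sum(th * eps[t - j]
                     for j, th in enumerate(thetas, start=1) if t - j >= 0)
        eps.append(y[t] - theta0 - lagged)
    return eps


true_eps = [0.5, -1.0, 0.25, 2.0]            # made-up shocks
y = ma_from_shocks(1.0, [0.4], true_eps)     # MA(1) with theta0 = 1, theta1 = 0.4
print(recover_shocks(1.0, [0.4], y))         # matches true_eps up to rounding
```

With the true parameters the shocks are recovered exactly; with estimated parameters the same recursion yields residuals, which is what makes likelihood-based estimation of MA terms feasible.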

Unit roots

• A time series $y_t$ has a unit root if $|\phi_1| = 1$ in the regression

$$y_t = \phi_0 + \phi_1 y_{t-1} + \varepsilon_t$$

- If $y_t$ contains exactly two unit roots, then its first difference $\Delta y_t = y_t - y_{t-1}$ contains one unit root

- A time series $y_t$ is integrated of order $d$ if it contains exactly $d$ unit roots. If so, we write $y_t \sim I(d)$

- A time series $y_t$ is weakly (or second-order) stationary if it does not contain a unit root, i.e., $y_t \sim I(0)$

- A unit root is usually referred to as a stochastic trend

- $y_t$ is a random walk if it contains a stochastic trend; with a drift and a deterministic trend it reads:

$$y_t = \underbrace{\beta_0}_{\text{drift}} + \underbrace{\beta_1 t}_{\text{deterministic trend}} + y_{t-1} + \varepsilon_t$$
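Why a unit root is called a stochastic trend can be seen by solving the random walk $y_t = \beta_0 + \beta_1 t + y_{t-1} + \varepsilon_t$ backwards. The short derivation below is a sketch that assumes a fixed starting value $y_0$, which the slides do not specify.

```latex
\begin{align*}
y_t &= \beta_0 + \beta_1 t + y_{t-1} + \varepsilon_t \\
    &= y_0 + \sum_{s=1}^{t} \left( \beta_0 + \beta_1 s + \varepsilon_s \right) \\
    &= y_0 + \beta_0 t + \beta_1 \frac{t(t+1)}{2} + \sum_{s=1}^{t} \varepsilon_s ,
\qquad
\operatorname{var}(y_t \mid y_0) = t \sigma^2 .
\end{align*}
```

The accumulated shocks $\sum_{s \le t} \varepsilon_s$ never die out and their variance grows linearly in $t$, so the process is not stationary; differencing once removes this sum and leaves $\Delta y_t = \beta_0 + \beta_1 t + \varepsilon_t$.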

Testing for unit roots

• The original Dickey-Fuller (DF) test for unit roots involves fitting the AR(1) model:

$$y_t = \alpha + \delta t + \phi y_{t-1} + \varepsilon_t \qquad (3)$$

− The null hypothesis of a unit root is $H_0: \phi = 1$

− Regression (3) is likely to be plagued by serial correlation

− To control for that, the augmented Dickey-Fuller (ADF) test instead fits a model of the form

$$\Delta y_t = \alpha + \delta t + \rho y_{t-1} + \zeta_1 \Delta y_{t-1} + \dots + \zeta_k \Delta y_{t-k} + u_t \qquad (4)$$

where $\Delta x_t = x_t - x_{t-1}$ for any variable $x_t$ and $\rho = \phi - 1$, so that the test is $H_0: \rho = 0$ vs. $H_1: \rho < 0$

− The alternative is one-sided, $H_1: \rho < 0$, because the case $\rho > 0$ (an explosive process) is unlikely
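The mechanics of the test can be sketched in a few lines of Python. The example below is a simplified illustration only: it fits the Dickey-Fuller regression with a constant, no trend, and no lagged differences ($k = 0$), on simulated data. The function name `df_regression` and the simulated series are assumptions for this sketch; in Stata the test itself is run with dfuller.

```python
import math
import random


def df_regression(y):
    """OLS of dy_t on a constant and y_{t-1}; returns (rho_hat, t-statistic of rho_hat)."""
    x = y[:-1]
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]
    n = len(dy)
    x_bar = sum(x) / n
    dy_bar = sum(dy) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (di - dy_bar) for xi, di in zip(x, dy))
    rho_hat = sxy / sxx
    alpha_hat = dy_bar - rho_hat * x_bar
    ssr = sum((di - alpha_hat - rho_hat * xi) ** 2 for xi, di in zip(x, dy))
    se_rho = math.sqrt(ssr / (n - 2) / sxx)
    return rho_hat, rho_hat / se_rho


# Simulated stationary AR(1) with phi = 0.5, so the true rho is -0.5.
rng = random.Random(42)
y = [0.0]
for _ in range(500):
    y.append(0.5 * y[-1] + rng.gauss(0.0, 1.0))

rho_hat, t_stat = df_regression(y)
print(rho_hat, t_stat)
```

The t-statistic is compared with Dickey-Fuller critical values rather than the usual normal ones; for the constant-only case the asymptotic 5% value is roughly -2.86, and values below it reject the unit root.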

Testing for unit roots

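- We must consider one of the four cases for $H_0$:

| Case | Process under $H_0$ | Restriction in (4) | dfuller option |
|------|---------------------|--------------------|----------------|
| 1 | Random walk without drift | $\alpha = 0$, $\delta = 0$ | noconstant |
| 2 | Random walk without drift | $\delta = 0$ | (default) |
| 3 | Random walk with drift | $\delta = 0$ | drift |
| 4 | Random walk with drift & trend | none | trend |

- The test requires an optimal choice of the lag length $k$ in (4) → command: varsoc varname → choose $k$ using SBIC/HQIC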

ADF test: Application to airline data

• ADF test on airline passengers data

- Plot of data indicates both a drift and a deterministic trend

ADF test: Application to airline data

- Result indicates no evidence of a unit root

ADF test: Application to German log of consumption

- Plot of data indicates both a drift and a deterministic trend

ADF test: Application to German log of consumption

- Result shows strong evidence for the presence of a unit root

Other unit root tests in STATA

• Phillips-Perron (PP) test:

- modifies the Dickey-Fuller test statistic to account for potential serial correlation and heteroskedasticity in the errors

- command: pperron varname, lag(#) trend → lag(#) is the lag length of the Newey-West HAC estimator

• GLS detrended ADF test:

- similar to the ADF test, but prior to fitting the model in (4) one first transforms the actual series via a generalized least-squares (GLS) regression

- More powerful than the ADF test

- command: dfgls varname, maxlag(#) (a linear trend is included by default; the notrend option removes it)

- maxlag(#) sets the value of $k$, the highest lag order for the first-differenced, detrended variable in the DF regression; by default, $k_{\max} = \lfloor 12\,\{(T+1)/100\}^{1/4} \rfloor$ → Schwert, G. W. (1989, JBES)
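The default maximum lag is easy to compute by hand. A minimal Python sketch of the rule quoted above follows; the function name `schwert_max_lag` is a hypothetical helper, not a Stata command.

```python
import math


def schwert_max_lag(T):
    """Schwert's rule: floor(12 * ((T + 1) / 100) ** (1/4))."""
    return math.floor(12 * ((T + 1) / 100) ** 0.25)


print(schwert_max_lag(144))  # 13, for a series of T = 144 observations
print(schwert_max_lag(100))  # 12
```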

Properties of a Random Walk

ARIMA(p,d,q)

• A time series $y_t \sim ARIMA(p,d,q)$ means that its $d$-th difference $\Delta^d y_t \sim ARMA(p,q)$:

− $y_t$ contains $d$ unit roots

− $\Delta^d y_t \sim I(0)$ (second-order stationary): $d + 1$ successive ADF tests must be conducted to estimate $d$

− Once $d$ is estimated, we can identify $p$ and $q$ from the transformed series $\tilde{y}_t = \Delta^d y_t$, which is weakly stationary [$\tilde{y}_t \sim I(0)$]

− Then we can estimate the other parameters of the ARIMA specification, and use these estimates to forecast the level series

− $p$ and $q$ are identified using the Partial Autocorrelation (PAC) and Autocorrelation (AC) functions, respectively.

Autocorrelation function (ACF)

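• Autocovariance function:

$$\gamma_k = \mathrm{cov}(y_t, y_{t-k}), \quad k = 0, 1, 2, \dots \qquad (5)$$

• Autocorrelation function:

$$\rho_k = \frac{\mathrm{cov}(y_t, y_{t-k})}{\gamma_0} = \frac{\gamma_k}{\gamma_0}, \quad k = 0, 1, 2, \dots \qquad (6)$$

Both $\gamma_k$ and $\rho_k$ are symmetric functions of $k$, i.e., $\gamma_{-k} = \gamma_k$ and $\rho_{-k} = \rho_k$. Note that $\rho_0 = 1$ and $-1 \le \rho_k \le 1$.

- Stationary AR(1): $y_t = \phi_0 + \phi_1 y_{t-1} + \varepsilon_t$

$$\mu := E[y_t] = \frac{\phi_0}{1 - \phi_1}, \quad \mathrm{var}(y_t) = \frac{\sigma^2}{1 - \phi_1^2}, \quad \gamma_k = \phi_1^k \, \frac{\sigma^2}{1 - \phi_1^2}, \quad \rho_k = \phi_1^k, \quad k = 0, 1, 2, \dots$$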
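The geometric decay $\rho_k = \phi_1^k$ can be checked on simulated data. The sketch below computes sample autocorrelations in plain Python; the estimator divides every autocovariance by the sample size (a common convention, assumed here), and the AR(1) coefficient 0.7 is an arbitrary illustrative choice.

```python
import random


def sample_acf(y, max_lag):
    """Sample autocorrelations rho_0, ..., rho_max_lag (autocovariances divided by n)."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    gamma0 = sum(d * d for d in dev) / n
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / n / gamma0
            for k in range(max_lag + 1)]


# Simulated AR(1) with phi1 = 0.7 (illustrative value).
rng = random.Random(1)
y = [0.0]
for _ in range(5000):
    y.append(0.7 * y[-1] + rng.gauss(0.0, 1.0))

acf = sample_acf(y, 3)
for k, r in enumerate(acf):
    print(k, round(r, 3), round(0.7 ** k, 3))  # lag, sample value, theoretical value
```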

Autocorrelation function (ACF)

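- MA(1): $y_t = \theta_0 + \varepsilon_t + \theta_1 \varepsilon_{t-1}$

$$\mu := E[y_t] = \theta_0, \quad \mathrm{var}(y_t) = \sigma^2 (1 + \theta_1^2), \quad \gamma_1 = \theta_1 \sigma^2, \quad \rho_1 = \frac{\theta_1}{1 + \theta_1^2}, \quad \rho_k = 0 \;\; \forall k > 1.$$

• Stationarity is a property of the AR part of the process ⇒ MA processes are always stationary, and $\rho_k = 0$ for all $k > q$ for an MA(q).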
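The cut-off of the ACF after lag $q$ follows from $\gamma_k = \sigma^2 \sum_{j=0}^{q-k} \psi_j \psi_{j+k}$ with $\psi_0 = 1$ and $\psi_j = \theta_j$. A minimal Python sketch of that formula follows; the name `ma_acf` and the list `psi` are assumptions of this example, and $\sigma^2$ cancels in the ratio.

```python
def ma_acf(thetas, max_lag):
    """Theoretical rho_0, ..., rho_max_lag of an MA(q) with MA coefficients thetas."""
    psi = [1.0] + list(thetas)  # weights on eps_t, eps_{t-1}, ..., eps_{t-q}
    q = len(thetas)
    gamma = []
    for k in range(max_lag + 1):
        if k > q:
            gamma.append(0.0)  # no overlapping shocks beyond lag q
        else:
            gamma.append(sum(psi[j] * psi[j + k] for j in range(q - k + 1)))
    return [g / gamma[0] for g in gamma]


print(ma_acf([0.5], 3))        # MA(1): rho_1 = 0.5 / (1 + 0.25), zero afterwards
print(ma_acf([0.5, 0.25], 3))  # MA(2): nonzero up to lag 2, zero at lag 3
```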

Partial Autocorrelation function (PACF)

• Consider the AR(k) regression:

$$y_t = \beta_0 + \beta_1 y_{t-1} + \dots + \beta_k y_{t-k} + u_t, \quad k = 1, 2, \dots \qquad (7)$$

• The $k$-th order PAC of $y_t$, for any $k = 1, 2, 3, \dots$, is the coefficient on the last lag:

$$PAC_k = \beta_k \qquad (8)$$

- For an AR(p) process, $\rho_k$ is not zero after lag $p$ but $PAC_k = 0$ for $k > p$ ⇒ $PAC_k$ is used to identify $p$

- For an MA(q) process, $PAC_k$ is not zero after lag $q$ but $\rho_k = 0$ for $k > q$ ⇒ $\rho_k$ is used to identify $q$

• Estimating ARMA(p,q) models requires identifying both $p$ and $q$ → the properties discussed above are key ingredients for doing this

• In STATA, the command corrgram tabulates the estimated ACs and PACs, with a character-based plot of each
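The cut-off properties of the PACF can be verified numerically. Instead of running the sequence of regressions (7), the sketch below swaps in the Durbin-Levinson recursion, which delivers the same population $PAC_k$ directly from the autocorrelations $\rho_1, \dots, \rho_K$. The function name is hypothetical and the inputs are theoretical ACFs, not estimates.

```python
def pacf_from_acf(rho):
    """Durbin-Levinson recursion. rho = [1, rho_1, ..., rho_K]; returns [PAC_1, ..., PAC_K]."""
    K = len(rho) - 1
    pac = []
    prev = []  # coefficients of the AR(k-1) projection
    for k in range(1, K + 1):
        if k == 1:
            phi_kk = rho[1]
            cur = [phi_kk]
        else:
            num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
            den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
            phi_kk = num / den
            cur = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pac.append(phi_kk)
        prev = cur
    return pac


ar1_rho = [0.6 ** k for k in range(4)]   # AR(1) with phi1 = 0.6: rho_k = 0.6^k
ma1_rho = [1.0, 0.4, 0.0, 0.0]           # MA(1) with theta1 = 0.5: rho_1 = 0.4
print(pacf_from_acf(ar1_rho))  # PAC_1 = 0.6, then (numerically) zero
print(pacf_from_acf(ma1_rho))  # does not cut off: values shrink but stay nonzero
```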

Corrgram in STATA

• Stata syntax: corrgram tabulates autocorrelations, partial autocorrelations, and portmanteau (Q) statistics

- Menu: Statistics → Time series → Graphs → Autocorrelations & partial autocorrelations

- Command: corrgram varname [if] [in] [, corrgram_options]

- We can use ac to produce a graph of the autocorrelations → Command: ac varname [if] [in] [, ac_options]

- We can use pac to produce a graph of the partial autocorrelations → Command: pac varname [if] [in] [, pac_options]

• Application to the international airline passengers dataset

Application to international airline passengers

Application to international airline passengers

(c) Autocorrelogram

(d) Partial Autocorrelogram

Application to international airline passengers

• From the PACF:

- The data probably have a trend component as well as a seasonal component

- First-differencing will mitigate the effects of the trend

- Seasonal differencing will help control for seasonality

• To accomplish this, we can use Stata's time-series operators: command → pac DS12.air, lags(20) srv

- Here we graph the partial autocorrelations after controlling for trend and seasonality

- The srv option includes the standardized residual variances in the graph

(e) Partial Autocorrelogram

Estimation & forecasting: steps

• Steps to estimate $y_t \sim ARIMA(p,d,q)$:

- Identify the order of integration $d$: this requires $d + 1$ unit root tests

- Filter the series: $\tilde{y}_t = \Delta^d y_t$, so that the transformed series is I(0)

- Plot the ACF and PACF of the filtered series $\tilde{y}_t$

1 use the PACF to identify $p$: last statistically significant lag of the autoregression

2 use the ACF to identify $q$: last statistically significant autocorrelation

3 where there is no clear choice, keep all potential candidates for $p$ and $q$

- Estimate all candidate models

- Use model selection criteria (AIC/SBIC/HQIC) to choose the 'best' model

- Proceed to forecasting

Estimation & forecasting: command in STATA

• Syntax: arima depvar [indepvars] [if] [in] [weight] [, options]

Options include:

- arima(#p,#d,#q): specify an ARIMA(p,d,q) model

- noconstant: suppress the constant term

• Menu: Statistics → Time series → ARIMA and ARMAX models

• Application: log U.S. Wholesale Price Index (WPI)

- ADF tests indicate that ln_wpi ∼ I(1)

- The first test is run with drift + trend, the second with only a drift

Application: correlogram

(g) ACF

(h) PACF

Application: correlogram

- Oscillations of both the ACFs and the PACFs may be due to seasonal variations

- The ACFs point towards pure AR(p) processes: the candidate pairs $(p,q)$ lie in $\{2,4\} \times \{0\}$, i.e., $(p,q) \in \{(2,0); (4,0)\}$

Application: estimation

• Estimation:

- Estimates are stored in a table with their standard errors

- Model selection statistics are also provided: AIC & BIC

- The AIC indicates that the ARIMA(4,1,0) model fits the data better

- whereas the BIC selects ARIMA(2,1,0)

- As is often the case, different model-selection criteria lead to conflicting conclusions

- Both criteria select a pure AR process: no MA component is selected!

• Use comparative forecasting performance: which model forecasts better?
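How the two criteria can disagree is easiest to see from their formulas, $AIC = -2\ln L + 2k$ and $BIC = -2\ln L + k \ln N$: the BIC penalizes each extra parameter more heavily once $\ln N > 2$. The Python sketch below uses made-up log-likelihoods, parameter counts, and sample size chosen only to reproduce that kind of conflict; they are not the WPI estimation results.

```python
import math


def aic(log_lik, k):
    """Akaike criterion: -2 lnL + 2k."""
    return -2.0 * log_lik + 2.0 * k


def bic(log_lik, k, n_obs):
    """Schwarz (Bayesian) criterion: -2 lnL + k ln(N)."""
    return -2.0 * log_lik + k * math.log(n_obs)


# Hypothetical fits: (log-likelihood, number of estimated parameters).
candidates = {
    "ARIMA(2,1,0)": (380.0, 4),
    "ARIMA(4,1,0)": (383.0, 6),
}
n_obs = 123  # hypothetical sample size

best_aic = min(candidates, key=lambda m: aic(*candidates[m]))
best_bic = min(candidates, key=lambda m: bic(*candidates[m], n_obs))
print(best_aic, best_bic)  # ARIMA(4,1,0) ARIMA(2,1,0)
```

With these numbers the larger model gains 3 log-likelihood points for 2 extra parameters, which is enough for the AIC but not for the BIC.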

Forecasts: 1 step-ahead

Forecasts: 1 step-ahead

- Static forecasts outperform dynamic forecasts: with dynamic forecasts, prior forecast errors accumulate over time

- Problem with static forecasts: we cannot obtain out-of-sample forecasts at $T + 2, T + 3, \dots$ → the static forecast $\hat{y}^{(S)}_{T+1}$ can be generated using $y_T$, but generating $\hat{y}^{(S)}_{T+2}$ requires observing $y_{T+1}$, which we do not ⇒ dynamic forecasting is more realistic
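The difference between the two forecast types is only in what is fed into the lag. A minimal Python sketch for an AR(1) with known coefficients follows; this is a deliberate simplification of the ARIMA models on these slides, and the function names and numbers are illustrative assumptions.

```python
def static_forecasts(phi0, phi1, y):
    """One-step-ahead forecasts of y[1:], each built from the OBSERVED previous value."""
    return [phi0 + phi1 * y[t - 1] for t in range(1, len(y))]


def dynamic_forecasts(phi0, phi1, last_obs, horizon):
    """Forecasts for T+1, ..., T+horizon, each built from the PREVIOUS FORECAST."""
    out = []
    prev = last_obs
    for _ in range(horizon):
        prev = phi0 + phi1 * prev
        out.append(prev)
    return out


y = [4.0, 3.0, 5.0]                          # made-up observations, y_T = 5.0
print(static_forecasts(1.0, 0.5, y))         # [3.0, 2.5]
print(dynamic_forecasts(1.0, 0.5, 4.0, 3))   # [3.0, 2.5, 2.25]
```

The dynamic path converges to the unconditional mean $\phi_0 / (1 - \phi_1) = 2$ as the horizon grows, which is why dynamic forecasts flatten out while static forecasts keep tracking the data.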

1-step ahead: ARIMA (2,1,0) vs. ARIMA (4,1,0)

1-step ahead: ARIMA (2,1,0) vs. ARIMA (4,1,0)

- Static forecasts: ARIMA(2,1,0) & ARIMA(4,1,0) perform similarly: both do well in mimicking the real data

- Dynamic forecasts: ARIMA(2,1,0) outperforms ARIMA(4,1,0). As this case is more realistic ⇒ ARIMA(2,1,0) should be retained, which is the same model choice as with the BIC

Seasonal Adjustment techniques

• Seasonality in a time series:

- a regular pattern of changes that repeats over $S$ time periods, where $S$ is the number of time periods until the pattern repeats again

- Monthly data in which high values tend to occur in some particular months & low values tend to occur in other particular months ⇒ $S = 12$. With quarterly data ⇒ $S = 4$

• Seasonal ARIMA model:

- seasonal AR and MA terms predict the series using data values and errors at times with lags that are multiples of $S$

- with monthly data (and $S = 12$), a seasonal first-order autoregressive model would use $y_{t-12}$ to predict $y_t$. A seasonal second-order autoregressive model would use $y_{t-12}$ and $y_{t-24}$ to predict $y_t$

- a seasonal first-order moving-average model, MA(1) (with $S = 12$), would use $\varepsilon_{t-12}$ as a predictor. A seasonal second-order model, MA(2), would use $\varepsilon_{t-12}$ and $\varepsilon_{t-24}$.

Seasonal Adjustment techniques

• Seasonality usually causes non-stationarity:

- average values at some particular times within the seasonal span may be different from the average values at other times

- Seasonal differencing renders the series stationary: with $S = 12$, $(1 - L^{12}) y_t = y_t - y_{t-12}$ is purged of seasonal variations.

Seasonal Adjustment techniques

• Non-seasonal differencing:

- If a trend is present in the data, we may also need non-seasonal differencing

- Often (not always) a non-seasonal first difference will "detrend" the data, i.e., we use $(1 - L) y_t = y_t - y_{t-1}$ in the presence of a trend

• Differencing for Trend and Seasonality

- When both trend and seasonality are present → apply both a non-seasonal first difference and a seasonal difference ⇒ examine the ACF and PACF of

$$(1 - L^{12})(1 - L) y_t = (y_t - y_{t-1}) - (y_{t-12} - y_{t-13})$$

- Removing the trend doesn't mean that we have removed the dependency: we may have removed the mean, $\mu_t$, part of which may include a periodic component
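The combined filter $(1 - L^{12})(1 - L)$ is what Stata's DS12. operator applies. A minimal Python sketch on a toy monthly series (a linear trend plus a fixed 12-month pattern, with made-up numbers) shows both the identity above and the fact that this filter removes a deterministic trend and a stable seasonal pattern entirely. The helper name `lag_difference` is hypothetical.

```python
def lag_difference(y, lag=1):
    """Apply (1 - L^lag): returns y_t - y_{t-lag} for t = lag, ..., len(y) - 1."""
    return [y[t] - y[t - lag] for t in range(lag, len(y))]


# Toy monthly series: linear trend plus a fixed 12-month pattern (made-up numbers).
pattern = [5, 3, 4, 6, 8, 11, 13, 12, 9, 7, 4, 6]
y = [2 * t + pattern[t % 12] for t in range(48)]

ds12 = lag_difference(lag_difference(y, 1), 12)  # (1 - L^12)(1 - L) y_t
direct = [(y[t] - y[t - 1]) - (y[t - 12] - y[t - 13]) for t in range(13, len(y))]
print(ds12 == direct, set(ds12))  # True {0}
```

Note that 13 observations are lost: one to the first difference and twelve to the seasonal difference.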

SARIMA Models

• SARIMA models incorporate both non-seasonal and seasonal factors, in one of two ways:

1. Multiplicative: shorthand notation is $ARIMA(p,d,q) \times (P,D,Q)_S$, where

− $p$ ≡ AR order, $d$ ≡ order of integration, $q$ ≡ MA order

− $P$ ≡ seasonal AR order, $D$ ≡ order of seasonal differencing, $Q$ ≡ seasonal MA order

− $S$ ≡ time span of the repeating seasonal pattern

− $y_t \sim ARIMA(p,d,q) \times (P,D,Q)_S \;\Leftrightarrow\; \Phi(L^S)\,\phi(L)\,(1 - L)^d (1 - L^S)^D y_t = \Theta(L^S)\,\theta(L)\,\varepsilon_t$

SARIMA Models

• Non-seasonal: $\phi(L) = 1 - \phi_1 L - \dots - \phi_p L^p$, $\theta(L) = 1 + \theta_1 L + \dots + \theta_q L^q$

• Seasonal: $\Phi(L^S) = 1 - \Phi_1 L^S - \dots - \Phi_P L^{SP}$, $\Theta(L^S) = 1 + \Theta_1 L^S + \dots + \Theta_Q L^{SQ}$

• Examples: $ARIMA(1,0,0) \times (1,0,0)_{12}$, $ARIMA(0,0,1) \times (0,0,1)_{12}$

2. Additive: $y_t \sim ARIMA(p,d,q) + (P,D,Q)_S$

Application of SARIMA to Airline data

Application of SARIMA to Airline data

• After first-differencing and seasonally differencing the data:

− There is no trend component left in the transformed data

- so we use the noconstant option with arima

− Stata command: arima lnair, arima(0,1,1) sarima(0,1,1,12) noconstant

Application of SARIMA to Airline data

• We can write the outcome of the regression as:

$$(1 - L^{12})(1 - L)\,\mathrm{lnair}_t = -0.402\,\varepsilon_{t-1} - 0.557\,\varepsilon_{t-12} + 0.224\,\varepsilon_{t-13} + \varepsilon_t$$

- The coefficient on $\varepsilon_{t-13}$ is the product of the coefficients on the $\varepsilon_{t-1}$ and $\varepsilon_{t-12}$ terms: $(-0.402) \times (-0.557) = 0.224$

- arima labels the dependent variable DS12.lnair to indicate that it has applied the difference operator $\Delta$ and the lag-12 seasonal difference operator $\Delta_{12}$ to lnair

- This model could also have been fit by typing the command:

arima DS12.lnair, ma(1) mma(1, 12) noconstant

- For simple multiplicative models, using the sarima() option is easier, though this second syntax allows us to incorporate more complicated seasonal terms
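The lag-13 coefficient arises mechanically from multiplying the two MA lag polynomials, $(1 - 0.402 L)(1 - 0.557 L^{12})$. A short Python sketch of that product follows; the function name is a hypothetical helper and the coefficients are the rounded estimates quoted above.

```python
def multiply_lag_polynomials(a, b):
    """Multiply two lag polynomials given as coefficient lists (index = power of L)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


nonseasonal = [1.0, -0.402]                # theta(L)    = 1 - 0.402 L
seasonal = [1.0] + [0.0] * 11 + [-0.557]   # Theta(L^12) = 1 - 0.557 L^12
combined = multiply_lag_polynomials(nonseasonal, seasonal)

for lag in (1, 12, 13):
    print(lag, round(combined[lag], 3))  # -0.402, -0.557, 0.224
```

This is the restriction a multiplicative model imposes: the lag-13 coefficient is not a free parameter, whereas an additive specification would leave it unrestricted.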

Forecasting with SARIMA: Airline data

• SARIMA models have a good forecast performance:

- Static: close to the observed data

- Dynamic: not as good as static but shows an overall acceptable performance.

X-12-ARIMA Seasonal Adjustment in STATA

• X-12-ARIMA was the U.S. Census Bureau's software package for seasonal adjustment:

- It can be used together with many statistical packages:

- Gretl or EViews, which provide a graphical user interface for X-12-ARIMA

- NumXL makes X-12-ARIMA functionality available in Microsoft Excel

• Many agencies presently use X-12-ARIMA for seasonal adjustment:

- Statistics Canada

- U.S. Bureau of Labor Statistics

- Brazilian Institute of Geography and Statistics

X-12-ARIMA Seasonal Adjustment in STATA

• Menu-driven X-12-ARIMA seasonal adjustment in Stata by:

- Qunyong Wang, Institute of Statistics and Econometrics, Nankai University

- Na Wu, Economics School, Tianjin University of Finance and Economics
