AR- MA- och ARMA-

Post on 19-Mar-2016

29 views 0 download

description

AR- MA- och ARMA-. How should these data be modelled?. Identification step: Look at the SAC and SPAC. Looks like an AR (1)-process. (Spikes are clearly decreasing in SAC and there is maybe only one sign. spike in SPAC). Then we should try to fit the model - PowerPoint PPT Presentation

Transcript of AR- MA- och ARMA-

20001990198019701960

14

12

10

8

6

4

2

0

Year

CPIC

hnge

Yearly changes in Consumer Price Index (CPI), USA, 1960-2001

How should these data be modelled?

Identification step: Look at the SAC and SPAC

1110987654321

1.00.80.60.40.20.0

-0.2-0.4-0.6-0.8-1.0

Lag

Auto

corr

elat

ion

Autocorrelation Function for CPIChnge(with 5% significance limits for the autocorrelations)

1110987654321

1.00.80.60.40.20.0

-0.2-0.4-0.6-0.8-1.0

Lag

Part

ial A

utoc

orre

latio

n

Partial Autocorrelation Function for CPIChnge(with 5% significance limits for the partial autocorrelations)

Looks like an AR(1)-process. (Spikes are clearly decreasing in SAC and there is maybe only one sign. spike in SPAC)

Then we should try to fit the model

The parameters to be estimated are and .

One possibility might be to uses Least-Squares estimation (like for ordinary regression analysis)

Not so wise, as both response and explanatory variable are randomly varying.

Maximum Likelihood better So-called Conditional Least-Squares method can be derived

ttt ayy 1

Use MINITAB’s ARIMA-procedure!!

AR(1)

We can always ask for forecasts

MTB > ARIMA 1 0 0 'CPIChnge';

SUBC> Constant;

SUBC> Forecast 2 ;

SUBC> GSeries;

SUBC> GACF;

SUBC> GPACF;

SUBC> Brief 2.

ARIMA Model: CPIChnge

Estimates at each iteration

Iteration SSE Parameters

0 316.054 0.100 4.048

1 245.915 0.250 3.358

2 191.627 0.400 2.669

3 153.195 0.550 1.980

4 130.623 0.700 1.292

5 123.976 0.820 0.739

6 123.786 0.833 0.645

7 123.779 0.836 0.626

8 123.778 0.837 0.622

9 123.778 0.837 0.621

Relative change in each estimate less than 0.0010

Final Estimates of Parameters

Type Coef SE Coef T P

AR 1 0.8369 0.0916 9.13 0.000

Constant 0.6211 0.2761 2.25 0.030

Mean 3.809 1.693

Number of observations: 42

Residuals: SS = 122.845 (backforecasts excluded)

MS = 3.071 DF = 40

4035302520151051

15

10

5

0

Time

CPIC

hnge

Time Series Plot for CPIChnge(with forecasts and their 95% confidence limits)

10987654321

1.00.80.60.40.20.0

-0.2-0.4-0.6-0.8-1.0

Lag

Auto

corr

elat

ion

ACF of Residuals for CPIChnge(with 5% significance limits for the autocorrelations)

10987654321

1.00.80.60.40.20.0

-0.2-0.4-0.6-0.8-1.0

Lag

Part

ial A

utoc

orre

latio

n

PACF of Residuals for CPIChnge(with 5% significance limits for the partial autocorrelations)

All spikes should be within red limits here, i.e. no correlation should be left in the residuals!

Modified Box-Pierce (Ljung-Box) Chi-Square statistic

Lag 12 24 36 48

Chi-Square 26.0 35.3 39.8 *

DF 10 22 34 *

P-Value 0.004 0.036 0.227 *

Forecasts from period 42

95% Limits

Period Forecast Lower Upper Actual

43 1.54176 -1.89376 4.97727

44 1.91148 -2.56850 6.39146

Ljung-Box statistic:

where

n is the sample size

d is the degree of non-seasonal differencing used to transform original series to be stationary. Non-seasonal means taking differences at lags nearby.

rl2(â) is the sample autocorrelation at lag l for the residuals

of the estimated model.

K is a number of lags covering multiples of seasonal cycles, e.g. 12, 24, 36,… for monthly data

K

ll arldndndnKQ

1

2* )ˆ(2

Under the assumption of no correlation left in the residuals the Ljung-Box statistic is chi-square distributed with K – nC degrees of freedom, where nC is the number of estimated parameters in model except for the constant

A low P-value for any K should be taken as evidence for correlated residuals, and thus the estimated model must be revised.

In this example:Modified Box-Pierce (Ljung-Box) Chi-Square statistic

Lag 12 24 36 48

Chi-Square 26.0 35.3 39.8 *

DF 10 22 34 *

P-Value 0.004 0.036 0.227 *

Here, data is not supposed to possess seasonal variation so interest is mostly paid to K = 12.

P – value for K =12 is lower than 0.05 Model needs revision!

K

A new look at the SAC and SPAC of original data:

1110987654321

1.00.80.60.40.20.0

-0.2-0.4-0.6-0.8-1.0

Lag

Auto

corr

elat

ion

Autocorrelation Function for CPIChnge(with 5% significance limits for the autocorrelations)

1110987654321

1.00.80.60.40.20.0

-0.2-0.4-0.6-0.8-1.0

Lag

Part

ial A

utoc

orre

latio

n

Partial Autocorrelation Function for CPIChnge(with 5% significance limits for the partial autocorrelations)

The second spike in SPAC might be considered crucial!

If an AR(p)-model is correct, the ACF should decrease exponentially (monotonically or oscillating)

and PACF should have exactly p significant spikes

Try an AR(2)

i.e.

tttt ayyy 1211

Type Coef SE Coef T P

AR 1 1.1684 0.1509 7.74 0.000

AR 2 -0.4120 0.1508 -2.73 0.009

Constant 1.0079 0.2531 3.98 0.000

Mean 4.137 1.039

Number of observations: 42

Residuals: SS = 103.852 (backforecasts excluded)

MS = 2.663 DF = 39

Modified Box-Pierce (Ljung-Box) Chi-Square statistic

Lag 12 24 36 48

Chi-Square 18.6 30.6 36.8 *

DF 9 21 33 *

P-Value 0.029 0.081 0.297 *

Forecasts from period 42

95% Limits

Period Forecast Lower Upper Actual

43 0.76866 -2.43037 3.96769

44 1.45276 -3.46705 6.37257

PREVIOUS MODEL:

Residuals: SS = 122.845 (backforecasts excluded)

MS = 3.071 DF = 40

Modified Box-Pierce (Ljung-Box) Chi-Square statistic

Lag 12 24 36 48

Chi-Square 26.0 35.3 39.8 *

DF 10 22 34 *

P-Value 0.004 0.036 0.227 *

Forecasts from period 42

95% Limits

Period Forecast Lower Upper Actual

43 1.54176 -1.89376 4.97727

44 1.91148 -2.56850 6.39146

10987654321

1.00.80.60.40.20.0

-0.2-0.4-0.6-0.8-1.0

Lag

Auto

corr

elat

ion

ACF of Residuals for CPIChnge(with 5% significance limits for the autocorrelations)

10987654321

1.00.80.60.40.20.0

-0.2-0.4-0.6-0.8-1.0

Lag

Part

ial A

utoc

orre

latio

n

PACF of Residuals for CPIChnge(with 5% significance limits for the partial autocorrelations)

Might still be problematic!

Could it be the case of an Moving Average (MA) model?

MA(1):

1 ttt aay

{at } are still assumed to be uncorrelated and identically distributed with mean zero and constant variance

MA(q):

qtqttt aaay 11

• always stationary

• mean =

• is in effect a moving average with weights

q ,,,1 ,21

for the (unobserved) values at, at – 1, … , at – q

Index

AR(1

)_0.

2

200180160140120100806040201

5

4

3

2

1

0

Time Series Plot of AR(1)_0.2

Index

AR(1

)_0.

8

200180160140120100806040201

14

13

12

11

10

9

8

7

6

5

Time Series Plot of AR(1)_0.8

Index

MA(

1)_0

.2

3002702402101801501209060301

3

2

1

0

-1

-2

-3

Time Series Plot of MA(1)_0.2

Index

MA(

1)_0

.8

3002702402101801501209060301

4

3

2

1

0

-1

-2

-3

-4

Time Series Plot of MA(1)_0.8

Index

MA(

1)_(

-0.5

)

3002702402101801501209060301

4

3

2

1

0

-1

-2

-3

Time Series Plot of MA(1)_ (-0.5)

Index

AR(1

)_(-

0.5)

200180160140120100806040201

5

4

3

2

1

0

-1

-2

-3

Time Series Plot of AR(1)_ (-0.5)

Try an MA(1):

Final Estimates of Parameters

Type Coef SE Coef T P

MA 1 -1.0459 0.0205 -51.08 0.000

Constant 4.5995 0.3438 13.38 0.000

Mean 4.5995 0.3438

Number of observations: 42

Residuals: SS = 115.337 (backforecasts excluded)

MS = 2.883 DF = 40

Modified Box-Pierce (Ljung-Box) Chi-Square statistic

Lag 12 24 36 48

Chi-Square 38.3 92.0 102.2 *

DF 10 22 34 *

P-Value 0.000 0.000 0.000 *

Forecasts from period 42

95% Limits

Period Forecast Lower Upper Actual

43 1.27305 -2.05583 4.60194

44 4.59948 -0.21761 9.41656

Not at all good!

Much wider!

4035302520151051

15

10

5

0

Time

CPIC

hnge

Time Series Plot for CPIChnge(with forecasts and their 95% confidence limits)

10987654321

1.00.80.60.40.20.0

-0.2-0.4-0.6-0.8-1.0

Lag

Auto

corr

elat

ion

ACF of Residuals for CPIChnge(with 5% significance limits for the autocorrelations)

10987654321

1.00.80.60.40.20.0

-0.2-0.4-0.6-0.8-1.0

Lag

Part

ial A

utoc

orre

latio

n

PACF of Residuals for CPIChnge(with 5% significance limits for the partial autocorrelations)

Still seems to be problems with residuals

Look again at ACF and PACF of original series:

1110987654321

1.00.80.60.40.20.0

-0.2-0.4-0.6-0.8-1.0

Lag

Auto

corr

elat

ion

Autocorrelation Function for CPIChnge(with 5% significance limits for the autocorrelations)

1110987654321

1.00.80.60.40.20.0

-0.2-0.4-0.6-0.8-1.0

Lag

Part

ial A

utoc

orre

latio

n

Partial Autocorrelation Function for CPIChnge(with 5% significance limits for the partial autocorrelations)

The pattern corresponds neither with pure AR(p), nor with pure MA(q)

Could it be a combination of these two?

Auto Regressive Moving Average (ARMA) model

ARMA(p,q):

qtqttptptt aaayyy 1111

• stationarity conditions harder to define

• mean value calculations more difficult

• identification patterns exist, but might be complex:

– exponentially decreasing patterns or

– sinusoidal decreasing patterns

in both ACF and PACF (no cutting of at a certain lag)

Index

ARM

A(1,

1)_(

0.2)

(0.2

)

3002702402101801501209060301

3

2

1

0

-1

-2

-3

Time Series Plot of ARMA(1,1)_ (0.2)(0.2)

Index

ARM

A(1,

1)_(

-0.2

)(-0

.2)

3002702402101801501209060301

3

2

1

0

-1

-2

-3

Time Series Plot of ARMA(1,1)_ (-0.2)(-0.2)

Index

ARM

A(2,

1)_(

0.1)

(0.1

)_(-

0.1)

3002702402101801501209060301

3

2

1

0

-1

-2

-3

-4

Time Series Plot of ARMA(2,1)_ (0.1)(0.1)_ (-0.1)

Always try to keep p and q small.

Try an ARMA(1,1):

Type Coef SE Coef T P

AR 1 0.6558 0.1330 4.93 0.000

MA 1 -0.9324 0.0878 -10.62 0.000

Constant 1.3778 0.4232 3.26 0.002

Mean 4.003 1.230

Number of observations: 42

Residuals: SS = 77.6457 (backforecasts excluded)

MS = 1.9909 DF = 39

Modified Box-Pierce (Ljung-Box) Chi-Square statistic

Lag 12 24 36 48

Chi-Square 8.4 21.5 28.3 *

DF 9 21 33 *

P-Value 0.492 0.429 0.699 *

Forecasts from period 42

95% Limits

Period Forecast Lower Upper Actual

43 -1.01290 -3.77902 1.75321

44 0.71356 -4.47782 5.90494

Much better!

4035302520151051

15

10

5

0

-5

Time

CPIC

hnge

Time Series Plot for CPIChnge(with forecasts and their 95% confidence limits)

10987654321

1.00.80.60.40.20.0

-0.2-0.4-0.6-0.8-1.0

Lag

Auto

corr

elat

ion

ACF of Residuals for CPIChnge(with 5% significance limits for the autocorrelations)

10987654321

1.00.80.60.40.20.0

-0.2-0.4-0.6-0.8-1.0

Lag

Part

ial A

utoc

orre

latio

n

PACF of Residuals for CPIChnge(with 5% significance limits for the partial autocorrelations)

Now OK!

Calculating forecasts

For AR(p) models quite simple:

1)1(211

)2(2)1(1

)2(2112

)1(1211

ˆˆˆˆˆˆˆˆ

ˆˆˆˆˆˆˆ

ˆˆˆˆˆˆ

ˆˆˆˆˆ

tpptptpt

tpptptpt

ptpttt

ptpttt

yyyy

yyyy

yyyy

yyyy

at + k is set to 0 for all values of k

For MA(q) ??

MA(1):

1ˆˆˆ ttt aay

If we e.g. would set at and at – 1 equal to 0

the forecast would constantly be

which is not desirable.

Note that

ˆ)ˆ1(ˆˆ

)1(0

1

1

2

1

211

ttt

ttt

t

ttt

ttt

yya

yyaa

aayaay

Similar investigations for ARMA-models.