AR- MA- and ARMA-


Page 1: AR- MA- and ARMA-

[Figure: Yearly changes in Consumer Price Index (CPI), USA, 1960-2001. CPIChnge (vertical axis, 0 to 14) plotted against Year (horizontal axis, 1960 to 2000).]

How should these data be modelled?

Page 2: AR- MA- and ARMA-

Identification step: Look at the SAC and SPAC

[Figure: Autocorrelation Function for CPIChnge (with 5% significance limits for the autocorrelations), lags 1 to 11]

[Figure: Partial Autocorrelation Function for CPIChnge (with 5% significance limits for the partial autocorrelations), lags 1 to 11]

Looks like an AR(1)-process. (The spikes are clearly decreasing in the SAC and there is maybe only one significant spike in the SPAC.)
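The identification step can be illustrated with a small sketch in plain Python. The function names `sample_acf` and `pacf_from_acf` are illustrative only (they are not MINITAB commands), and the partial autocorrelations are obtained here with the Durbin-Levinson recursion, which is one common way to compute them and not necessarily what MINITAB does internally.

```python
def sample_acf(y, max_lag):
    """Sample autocorrelations r_1, ..., r_max_lag of the series y."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(1, max_lag + 1)]


def pacf_from_acf(r):
    """Partial autocorrelations from autocorrelations r = [r_1, r_2, ...]
    using the Durbin-Levinson recursion."""
    pacf, prev = [], []          # prev holds phi_{k-1,1}, ..., phi_{k-1,k-1}
    for k in range(1, len(r) + 1):
        num = r[k - 1] - sum(prev[j] * r[k - 2 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * r[j] for j in range(k - 1))
        phi_kk = num / den
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf
```

For an exact AR(1) pattern such as r = [0.8, 0.64, 0.512], `pacf_from_acf` returns 0.8 at lag 1 and (numerically) zero afterwards, which is the "one spike in the SPAC" signature. A rough large-sample significance limit is ±2/√n, about ±0.31 for n = 42; the limits MINITAB draws may be computed slightly differently.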

Page 3: AR- MA- and ARMA-

Then we should try to fit the model

$$y_t = \delta + \phi_1 y_{t-1} + a_t$$

The parameters to be estimated are $\delta$ and $\phi_1$.

One possibility might be to use Least-Squares estimation (as in ordinary regression analysis).

Not so wise, as both the response and the explanatory variable are randomly varying.

Maximum Likelihood is better. A so-called Conditional Least-Squares method can be derived.

Use MINITAB’s ARIMA-procedure!
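As a minimal sketch of what conditional least squares amounts to for an AR(1) model with a constant: conditioning on the first observation, one minimises the sum of squared one-step errors over t = 2, ..., n. The function name `fit_ar1_cls` is hypothetical, and this is not a description of MINITAB's algorithm, which iterates and uses backforecasts, so its estimates will generally differ somewhat.

```python
def fit_ar1_cls(y):
    """Conditional least-squares estimates (delta, phi1, sse) for
    y_t = delta + phi1 * y_{t-1} + a_t, conditioning on y[0]."""
    x = y[:-1]                      # lagged values y_{t-1}
    z = y[1:]                       # current values y_t
    m = len(x)
    x_bar = sum(x) / m
    z_bar = sum(z) / m
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxz = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z))
    phi1 = sxz / sxx                # minimiser of the conditional sum of squares
    delta = z_bar - phi1 * x_bar
    sse = sum((zi - delta - phi1 * xi) ** 2 for xi, zi in zip(x, z))
    return delta, phi1, sse
```

On a noise-free series such as 0, 1, 1.5, 1.75, 1.875 (generated by y_t = 1 + 0.5 y_{t-1}) the function recovers delta = 1 and phi1 = 0.5 with zero residual sum of squares.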

Page 4: AR- MA- and ARMA-

AR(1)

We can always ask for forecasts

Page 5: AR- MA- and ARMA-

MTB > ARIMA 1 0 0 'CPIChnge';
SUBC> Constant;
SUBC> Forecast 2;
SUBC> GSeries;
SUBC> GACF;
SUBC> GPACF;
SUBC> Brief 2.

ARIMA Model: CPIChnge

Estimates at each iteration

Iteration      SSE   Parameters
0          316.054   0.100  4.048
1          245.915   0.250  3.358
2          191.627   0.400  2.669
3          153.195   0.550  1.980
4          130.623   0.700  1.292
5          123.976   0.820  0.739
6          123.786   0.833  0.645
7          123.779   0.836  0.626
8          123.778   0.837  0.622
9          123.778   0.837  0.621

Relative change in each estimate less than 0.0010

Page 6: AR- MA- and ARMA-

Final Estimates of Parameters

Type        Coef  SE Coef     T      P
AR 1      0.8369   0.0916  9.13  0.000
Constant  0.6211   0.2761  2.25  0.030
Mean       3.809    1.693

Number of observations: 42

Residuals: SS = 122.845 (backforecasts excluded)
           MS = 3.071  DF = 40
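The "Mean" row in the output is tied to the "Constant" row: for a stationary AR(p) model the process mean equals the constant divided by one minus the sum of the AR coefficients. A small check, with the helper name `ar_process_mean` chosen here purely for illustration:

```python
def ar_process_mean(constant, phis):
    """Mean of a stationary AR(p) process:
    constant / (1 - phi_1 - ... - phi_p)."""
    return constant / (1.0 - sum(phis))


# Printed AR(1) estimates from the output above.
mean_ar1 = ar_process_mean(0.6211, [0.8369])
```

With the printed (rounded) estimates this gives about 3.808, in line with the reported mean 3.809; the small gap comes from rounding in the printed coefficients.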

Page 7: AR- MA- and ARMA-

[Figure: Time Series Plot for CPIChnge (with forecasts and their 95% confidence limits), CPIChnge against Time]

[Figure: ACF of Residuals for CPIChnge (with 5% significance limits for the autocorrelations), lags 1 to 10]

[Figure: PACF of Residuals for CPIChnge (with 5% significance limits for the partial autocorrelations), lags 1 to 10]

All spikes should be within red limits here, i.e. no correlation should be left in the residuals!

Page 8: AR- MA- and ARMA-

Modified Box-Pierce (Ljung-Box) Chi-Square statistic

Lag            12     24     36  48
Chi-Square   26.0   35.3   39.8   *
DF             10     22     34   *
P-Value     0.004  0.036  0.227   *

Forecasts from period 42

                     95% Limits
Period  Forecast     Lower    Upper  Actual
43       1.54176  -1.89376  4.97727
44       1.91148  -2.56850  6.39146
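The 95% limits in the forecast table can be approximated by hand. The sketch below assumes the usual large-sample formula for an AR(1) model, in which the k-step forecast error variance is MS times (1 + φ₁² + ... + φ₁^(2(k-1))), together with the normal quantile 1.96. The function name is illustrative, and MINITAB's exact computation may differ in detail.

```python
import math


def ar1_forecast_limits(forecasts, phi1, ms, z=1.96):
    """Approximate limits for AR(1) forecasts 1, 2, ... steps ahead.
    forecasts[k-1] is the k-step forecast; ms is the residual mean square."""
    limits = []
    var_factor = 0.0
    for k, f in enumerate(forecasts, start=1):
        var_factor += phi1 ** (2 * (k - 1))      # adds psi_{k-1}^2 = phi1^(2(k-1))
        half_width = z * math.sqrt(ms * var_factor)
        limits.append((f - half_width, f + half_width))
    return limits


# Printed AR(1) results: forecasts for periods 43 and 44, AR 1 = 0.8369, MS = 3.071.
limits = ar1_forecast_limits([1.54176, 1.91148], 0.8369, 3.071)
```

With the printed values this reproduces the tabulated limits to within a few thousandths, and it shows why the interval widens from period 43 to period 44.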

Page 9: AR- MA- and ARMA-

Ljung-Box statistic:

$$Q^*(K) = (n-d)(n-d+2)\sum_{l=1}^{K}(n-d-l)^{-1}\,r_l^2(\hat a)$$

where

n is the sample size

d is the degree of non-seasonal differencing used to transform the original series to be stationary. Non-seasonal means taking differences at nearby lags.

$r_l(\hat a)$ is the sample autocorrelation at lag l for the residuals of the estimated model (it enters the sum squared).

K is a number of lags covering multiples of seasonal cycles, e.g. 12, 24, 36, … for monthly data
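The Ljung-Box statistic is easy to compute once the residual autocorrelations are available. A minimal sketch, where `residual_acf` is assumed to hold r_1, ..., r_K for the residuals and the function name is illustrative:

```python
def ljung_box(residual_acf, n, d=0):
    """Q*(K) = (n-d)(n-d+2) * sum_{l=1..K} r_l^2 / (n-d-l),
    with residual_acf = [r_1, ..., r_K]."""
    m = n - d                                   # effective sample size after differencing
    return m * (m + 2) * sum(r * r / (m - l)
                             for l, r in enumerate(residual_acf, start=1))
```

For instance, with n = 10, d = 0 and residual autocorrelations 0.5 and 0.1 at lags 1 and 2, the statistic is 120 × (0.25/9 + 0.01/8), roughly 3.48.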

Page 10: AR- MA- and ARMA-

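Under the assumption of no correlation left in the residuals, the Ljung-Box statistic is approximately chi-square distributed with K – nC degrees of freedom, where nC is the number of estimated parameters in the model except for the constant. (Note that the DF row printed by MINITAB in this example also subtracts the constant: for the AR(1) model with a constant, DF = 12 – 2 = 10 at K = 12.)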

A low P-value for any K should be taken as evidence for correlated residuals, and thus the estimated model must be revised.

In this example:

Modified Box-Pierce (Ljung-Box) Chi-Square statistic

Lag            12     24     36  48
Chi-Square   26.0   35.3   39.8   *
DF             10     22     34   *
P-Value     0.004  0.036  0.227   *

Here, the data are not supposed to possess seasonal variation, so interest is mostly paid to K = 12.

The P-value for K = 12 is lower than 0.05, so the model needs revision!
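The P-values in the table are upper-tail chi-square probabilities. As a self-contained sketch (the function name `chi2_sf` is chosen here for illustration), the tail probability for integer degrees of freedom can be computed with the standard finite-series expressions, using only the Python standard library:

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable
    with integer degrees of freedom df >= 1."""
    chi = math.sqrt(x)
    z = math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi)   # normal density at chi
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2-1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= x / (2 * r)
            total += term
        return math.sqrt(2.0 * math.pi) * z * total
    # Odd df: two-sided normal tail plus a finite correction series.
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(chi / math.sqrt(2.0)) + 2.0 * z * total
```

For the AR(1) fit, `chi2_sf(26.0, 10)` comes out at roughly 0.0037, matching the printed P-value 0.004 for K = 12.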

Page 11: AR- MA- and ARMA-

A new look at the SAC and SPAC of original data:

[Figure: Autocorrelation Function for CPIChnge (with 5% significance limits for the autocorrelations), lags 1 to 11]

[Figure: Partial Autocorrelation Function for CPIChnge (with 5% significance limits for the partial autocorrelations), lags 1 to 11]

The second spike in SPAC might be considered crucial!

If an AR(p)-model is correct, the ACF should decrease exponentially (monotonically or oscillating) and the PACF should have exactly p significant spikes.

Try an AR(2), i.e.

$$y_t = \delta + \phi_1 y_{t-1} + \phi_2 y_{t-2} + a_t$$

Page 12: AR- MA- and ARMA-

Type         Coef  SE Coef      T      P
AR 1       1.1684   0.1509   7.74  0.000
AR 2      -0.4120   0.1508  -2.73  0.009
Constant   1.0079   0.2531   3.98  0.000
Mean        4.137    1.039

Number of observations: 42

Residuals: SS = 103.852 (backforecasts excluded)
           MS = 2.663  DF = 39

Modified Box-Pierce (Ljung-Box) Chi-Square statistic

Lag            12     24     36  48
Chi-Square   18.6   30.6   36.8   *
DF              9     21     33   *
P-Value     0.029  0.081  0.297   *

Forecasts from period 42

                     95% Limits
Period  Forecast     Lower    Upper  Actual
43       0.76866  -2.43037  3.96769
44       1.45276  -3.46705  6.37257

PREVIOUS MODEL:

Residuals: SS = 122.845 (backforecasts excluded)
           MS = 3.071  DF = 40

Modified Box-Pierce (Ljung-Box) Chi-Square statistic

Lag            12     24     36  48
Chi-Square   26.0   35.3   39.8   *
DF             10     22     34   *
P-Value     0.004  0.036  0.227   *

Forecasts from period 42

                     95% Limits
Period  Forecast     Lower    Upper  Actual
43       1.54176  -1.89376  4.97727
44       1.91148  -2.56850  6.39146

Page 13: AR- MA- and ARMA-

[Figure: ACF of Residuals for CPIChnge (with 5% significance limits for the autocorrelations), lags 1 to 10]

[Figure: PACF of Residuals for CPIChnge (with 5% significance limits for the partial autocorrelations), lags 1 to 10]

Might still be problematic!

Page 14: AR- MA- and ARMA-

Could it be the case of a Moving Average (MA) model?

MA(1):

$$y_t = \mu + a_t - \theta_1 a_{t-1}$$

$\{a_t\}$ are still assumed to be uncorrelated and identically distributed with mean zero and constant variance.

Page 15: AR- MA- and ARMA-
Page 16: AR- MA- and ARMA-

MA(q):

$$y_t = \mu + a_t - \theta_1 a_{t-1} - \dots - \theta_q a_{t-q}$$

• always stationary

• mean = $\mu$

• is in effect a moving average with weights $1, -\theta_1, -\theta_2, \dots, -\theta_q$ for the (unobserved) values $a_t, a_{t-1}, \dots, a_{t-q}$
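Because an MA(q) process is a finite weighted sum of uncorrelated errors, its theoretical autocorrelations vanish beyond lag q. The sketch below assumes the sign convention y_t = μ + a_t − θ₁a_{t−1} − … − θ_q a_{t−q} (the one MINITAB uses for MA coefficients); the function name `ma_acf` is illustrative.

```python
def ma_acf(thetas, max_lag):
    """Theoretical autocorrelations rho_1, ..., rho_max_lag of an MA(q)
    process with coefficients thetas = [theta_1, ..., theta_q]
    (assumed convention: y_t = mu + a_t - theta_1*a_{t-1} - ...)."""
    w = [1.0] + [-th for th in thetas]      # weights on a_t, a_{t-1}, ..., a_{t-q}
    gamma0 = sum(wi * wi for wi in w)       # variance in units of the error variance
    acf = []
    for k in range(1, max_lag + 1):
        # Products of weights k positions apart; empty (zero) once k > q.
        gamma_k = sum(w[i] * w[i + k] for i in range(len(w) - k))
        acf.append(gamma_k / gamma0)
    return acf
```

For an MA(1) with θ₁ = 0.8 this gives ρ₁ = −0.8/1.64 ≈ −0.49 and zero at all higher lags, i.e. the ACF cuts off after lag q, in contrast to the gradually decaying ACF of an AR process.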

Page 17: AR- MA- and ARMA-

[Figure: Time Series Plot of AR(1)_0.2, plotted against Index (1 to 200)]

[Figure: Time Series Plot of AR(1)_0.8, plotted against Index (1 to 200)]

[Figure: Time Series Plot of MA(1)_0.2, plotted against Index (1 to 300)]

[Figure: Time Series Plot of MA(1)_0.8, plotted against Index (1 to 300)]

Page 18: AR- MA- and ARMA-

[Figure: Time Series Plot of MA(1)_ (-0.5), plotted against Index (1 to 300)]

[Figure: Time Series Plot of AR(1)_ (-0.5), plotted against Index (1 to 200)]
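Series of the kind shown in these plots can be simulated with a few lines of code. This is only a sketch under assumptions: how the plotted series were actually generated is not stated, the constants, error standard deviation and seeds below are hypothetical, and the MA sign convention y_t = μ + a_t − θ₁a_{t−1} is assumed.

```python
import random


def simulate_ar1(n, delta, phi1, sigma=1.0, seed=0):
    """Simulate y_t = delta + phi1*y_{t-1} + a_t, started at the process mean."""
    rng = random.Random(seed)
    y_prev = delta / (1.0 - phi1)           # stationary mean as the starting value
    out = []
    for _ in range(n):
        y_prev = delta + phi1 * y_prev + rng.gauss(0.0, sigma)
        out.append(y_prev)
    return out


def simulate_ma1(n, mu, theta1, sigma=1.0, seed=0):
    """Simulate y_t = mu + a_t - theta1*a_{t-1} (assumed sign convention)."""
    rng = random.Random(seed)
    a_prev = rng.gauss(0.0, sigma)
    out = []
    for _ in range(n):
        a = rng.gauss(0.0, sigma)
        out.append(mu + a - theta1 * a_prev)
        a_prev = a
    return out
```

Calling, say, `simulate_ar1(200, 2.0, 0.8)` and `simulate_ar1(200, 2.0, -0.5)` and plotting the results side by side shows the typical contrast: smooth wandering for a large positive coefficient, rapid zig-zagging for a negative one.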

Page 19: AR- MA- and ARMA-

Try an MA(1):

Page 20: AR- MA- and ARMA-

Final Estimates of Parameters

Type         Coef  SE Coef       T      P
MA 1      -1.0459   0.0205  -51.08  0.000
Constant   4.5995   0.3438   13.38  0.000
Mean       4.5995   0.3438

Number of observations: 42

Residuals: SS = 115.337 (backforecasts excluded)
           MS = 2.883  DF = 40

Modified Box-Pierce (Ljung-Box) Chi-Square statistic

Lag            12     24     36  48
Chi-Square   38.3   92.0  102.2   *
DF             10     22     34   *
P-Value     0.000  0.000  0.000   *

Forecasts from period 42

                     95% Limits
Period  Forecast     Lower    Upper  Actual
43       1.27305  -2.05583  4.60194
44       4.59948  -0.21761  9.41656

Not at all good!

Much wider!

Page 21: AR- MA- and ARMA-

[Figure: Time Series Plot for CPIChnge (with forecasts and their 95% confidence limits), CPIChnge against Time]

[Figure: ACF of Residuals for CPIChnge (with 5% significance limits for the autocorrelations), lags 1 to 10]

[Figure: PACF of Residuals for CPIChnge (with 5% significance limits for the partial autocorrelations), lags 1 to 10]

Page 22: AR- MA- and ARMA-

There still seem to be problems with the residuals

Look again at ACF and PACF of original series:

[Figure: Autocorrelation Function for CPIChnge (with 5% significance limits for the autocorrelations), lags 1 to 11]

[Figure: Partial Autocorrelation Function for CPIChnge (with 5% significance limits for the partial autocorrelations), lags 1 to 11]

The pattern corresponds neither to a pure AR(p) nor to a pure MA(q)

Could it be a combination of these two?

Autoregressive Moving Average (ARMA) model

Page 23: AR- MA- and ARMA-

ARMA(p,q):

$$y_t = \delta + \phi_1 y_{t-1} + \dots + \phi_p y_{t-p} + a_t - \theta_1 a_{t-1} - \dots - \theta_q a_{t-q}$$

• stationarity conditions harder to define

• mean value calculations more difficult

• identification patterns exist, but might be complex:

– exponentially decreasing patterns or

– sinusoidal decreasing patterns

in both ACF and PACF (no cutting off at a certain lag)

Page 24: AR- MA- and ARMA-

[Figure: Time Series Plot of ARMA(1,1)_ (0.2)(0.2), plotted against Index (1 to 300)]

[Figure: Time Series Plot of ARMA(1,1)_ (-0.2)(-0.2), plotted against Index (1 to 300)]

[Figure: Time Series Plot of ARMA(2,1)_ (0.1)(0.1)_ (-0.1), plotted against Index (1 to 300)]

Page 25: AR- MA- and ARMA-

Always try to keep p and q small.

Try an ARMA(1,1):

Page 26: AR- MA- and ARMA-

Type         Coef  SE Coef       T      P
AR 1       0.6558   0.1330    4.93  0.000
MA 1      -0.9324   0.0878  -10.62  0.000
Constant   1.3778   0.4232    3.26  0.002
Mean        4.003    1.230

Number of observations: 42

Residuals: SS = 77.6457 (backforecasts excluded)
           MS = 1.9909  DF = 39

Modified Box-Pierce (Ljung-Box) Chi-Square statistic

Lag            12     24     36  48
Chi-Square    8.4   21.5   28.3   *
DF              9     21     33   *
P-Value     0.492  0.429  0.699   *

Forecasts from period 42

                     95% Limits
Period  Forecast     Lower    Upper  Actual
43      -1.01290  -3.77902  1.75321
44       0.71356  -4.47782  5.90494

Much better!

Page 27: AR- MA- and ARMA-

[Figure: Time Series Plot for CPIChnge (with forecasts and their 95% confidence limits), CPIChnge against Time]

[Figure: ACF of Residuals for CPIChnge (with 5% significance limits for the autocorrelations), lags 1 to 10]

[Figure: PACF of Residuals for CPIChnge (with 5% significance limits for the partial autocorrelations), lags 1 to 10]

Now OK!

Page 28: AR- MA- and ARMA-

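Calculating forecasts

For AR(p) models quite simple:

$$
\begin{aligned}
\hat y_{t+1} &= \hat\delta + \hat\phi_1 y_t + \hat\phi_2 y_{t-1} + \dots + \hat\phi_p y_{t-(p-1)} \\
\hat y_{t+2} &= \hat\delta + \hat\phi_1 \hat y_{t+1} + \hat\phi_2 y_t + \dots + \hat\phi_p y_{t-(p-2)} \\
&\;\;\vdots \\
\hat y_{t+p} &= \hat\delta + \hat\phi_1 \hat y_{t+(p-1)} + \hat\phi_2 \hat y_{t+(p-2)} + \dots + \hat\phi_p y_t \\
\hat y_{t+p+1} &= \hat\delta + \hat\phi_1 \hat y_{t+p} + \hat\phi_2 \hat y_{t+(p-1)} + \dots + \hat\phi_p \hat y_{t+1}
\end{aligned}
$$

$a_{t+k}$ is set to 0 for all values of k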
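The AR(p) forecast recursion translates directly into code: each forecast is fed back in as if it were an observation, with future errors set to zero. A minimal sketch, with the function name `ar_forecast` chosen for illustration:

```python
def ar_forecast(delta, phis, history, steps):
    """Recursive forecasts for an AR(p) model with constant delta and
    coefficients phis = [phi_1, ..., phi_p]. history holds the observed
    series (at least p values), most recent value last."""
    p = len(phis)
    values = list(history[-p:])
    forecasts = []
    for _ in range(steps):
        # phi_1 multiplies the most recent value, phi_2 the one before, etc.
        nxt = delta + sum(phi * values[-(j + 1)] for j, phi in enumerate(phis))
        forecasts.append(nxt)
        values.append(nxt)       # forecasts stand in for unobserved future values
    return forecasts
```

As a check against the AR(1) output, feeding the period-43 forecast 1.54176 back in with constant 0.6211 and coefficient 0.8369 gives about 1.9114 for period 44, agreeing with MINITAB's 1.91148 up to rounding of the printed estimates.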

Page 29: AR- MA- and ARMA-

For MA(q)?

MA(1):

$$\hat y_t = \hat\mu + a_t - \hat\theta_1 a_{t-1}$$

If we were, for example, to set $a_t$ and $a_{t-1}$ equal to 0, the forecast would constantly be $\hat\mu$, which is not desirable.

Page 30: AR- MA- and ARMA-

Note that

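[Equations not recoverable from the transcript. The surviving fragments involve $y_t$, $y_{t-1}$, $a_t$, $a_{t-1}$, $a_{t-2}$ and the estimates $\hat\mu$ and $\hat\theta_1$.]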

Similar investigations for ARMA-models.
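One common way to obtain MA(1) forecasts that do not collapse to the mean is to estimate the past errors recursively from the data. The sketch below assumes the convention y_t = μ + a_t − θ₁a_{t−1}, so that a_t = y_t − μ + θ₁a_{t−1}, and an assumed starting value a₀ = 0. The function name is hypothetical and this is not claimed to be MINITAB's exact procedure (which uses backforecasts).

```python
def ma1_forecast(y, mu, theta1, steps, a0=0.0):
    """Forecasts for y_t = mu + a_t - theta1*a_{t-1} (steps >= 1).
    Errors are rebuilt recursively: a_t = y_t - mu + theta1*a_{t-1},
    starting from the assumed value a0."""
    a_prev = a0
    for value in y:
        a_prev = value - mu + theta1 * a_prev
    forecasts = [mu - theta1 * a_prev]      # one step ahead still uses the last error
    forecasts += [mu] * (steps - 1)         # further ahead only the mean remains
    return forecasts
```

For example, `ma1_forecast([5.0, 3.0], 4.0, 0.5, 2)` returns `[4.25, 4.0]`. The second value illustrates that an MA(1) forecast two or more steps ahead equals the mean, which is consistent with the MA(1) output where the period-44 forecast 4.59948 matches the estimated mean 4.5995.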