Decision 411: Class 4 - Fuqua School of Business

Non-seasonal averaging & smoothing models

• Simple moving average (SMA) model
• Simple exponential smoothing (SES) model
• Linear exponential smoothing (LES) model

Combining seasonal adjustment with non-seasonal smoothing

Winters’ seasonal smoothing model

Guidelines for future HW writeups

• Presentation should stand on its own (SG files are mainly just for audit trail)
• What’s the bottom line? (forecast, trend, key drivers?)
• Clearly define the variables (units, dates, transformations, etc.) used in the analysis
• Use bullet points for key observations & findings
• Use tables to present key numbers (forecasts & CI’s)
• Embed the most important chart(s), with annotations
• Show where the numbers came from
• Explain your model’s assumptions in layman’s terms

Averaging & smoothing models

Today’s topics

Later: ARIMA models

We’ll meet ARIMA later in the course, but briefly, an “ARIMA(p,d,q)” model is like a regression model in which the dependent variable is a d-order difference of the input variable, and the independent variables are p lagged values of the dependent variable (AR terms) and/or q lagged values of the forecast errors (MA terms), plus an optional constant term. Many of the averaging & smoothing models are special cases, e.g., an ARIMA(0,1,1) model is an SES model.

p = # AR terms (lags of dependent variable)

q = # MA terms (lags of errors)

d = order of differencing of input variable

Averaging & smoothing models

• The problem: sometimes nonseasonal (or seasonally adjusted) data appears to be “locally stationary” with a time-varying mean
• The mean (constant) model doesn’t track changes in the mean and has positively autocorrelated errors
• The random walk model may not perform well either in this situation: it “oversteers”, picks up too much “noise” in the data, and yields negatively correlated errors

[Figure: Residual autocorrelations for X, constant mean = 463.136 (autocorrelations vs. lag)]

[Figure: Plot of X, constant mean = 463.136]

Example: series “X”

• Mean (constant) model yields positively autocorrelated errors and doesn’t react to changes in the local mean. RMSE = 121.

Strong positive autocorrelation at lag 1

No reaction to local changes in data

[Figure: Residual autocorrelations for X, random walk (autocorrelations vs. lag)]

[Figure: Plot of X, random walk]

Example, continued

Random walk model for series X yields negatively autocorrelated errors and overreacts to changes. RMSE = 122, not any better!

Strong negative autocorrelation at lag 1

Over-reaction to local changes in data (always 1 period too late)

A solution:

Use a model that averages or “smooths” the recent data to filter out some of the noise and estimate the local mean, such as the Simple Moving Average (SMA) model:

$$\hat{Y}_t = \frac{Y_{t-1} + Y_{t-2} + \cdots + Y_{t-m}}{m}$$

…i.e., just average the last m observed values.

Properties of SMA model

Average age of the data in the forecast is (m+1)/2, which is midway between 1 period old and m periods old:

m=3 ⇒ avg. age = 2

m=5 ⇒ avg. age = 3

m=9 ⇒ avg. age = 5, etc.

…hence it lags behind turning points by (m+1)/2 periods
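The SMA forecast and its average age can be sketched in a few lines of Python. This is only an illustration: the function names and the sample data are made up for the example and are not from the course materials.

```python
def sma_forecast(y, m):
    """One-step-ahead SMA forecast: the mean of the last m observed values."""
    if m < 1 or m > len(y):
        raise ValueError("m must be between 1 and len(y)")
    return sum(y[-m:]) / m


def sma_average_age(m):
    """Average age of the data in an m-term SMA forecast, i.e. (m + 1) / 2."""
    ages = range(1, m + 1)  # the averaged values are 1, 2, ..., m periods old
    return sum(ages) / m


y = [450, 480, 510, 470, 440, 495]  # made-up data, for illustration only
print(sma_forecast(y, 3))           # mean of 470, 440, 495
print([sma_average_age(m) for m in (3, 5, 9)])
```

A larger m gives a smoother forecast but a larger average age, which is the lag behind turning points described above.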

Properties of SMA, continued

• Long-term forecasts = horizontal straight line ( = simple average of last few values)
• Confidence limits??? No theory!!
• Works well on highly irregular data: no data point receives more weight than others, so it’s relatively robust against “outliers”
• Can also be “tapered” for even greater robustness

[Figure: Residual autocorrelations for X, simple moving average of 3 terms (autocorrelations vs. lag)]

[Figure: Plot of X, simple moving average of 3 terms]

Example, continued

SMA with m=3 (average age=2) yields RMSE=104 (significantly better!) and less negative autocorrelation. 50% confidence limits are shown here, but don’t trust them: they are based on the assumption of the mean remaining fixed at the latest value.

Forecasts lag behind turning points by about 2 periods

No autocorrelation at lag 1

50% confidence limits (?)


[Figure: residual autocorrelations vs. lag, and plot of the series with forecasts]

Example, continued

SMA with m=5 (average age=3) yields RMSE=102 (very slightly better), “smoother” forecasts, slight positive autocorrelation in errors

Forecasts lag behind turning point by about 3 periods

Slight positive autocorrelation at lag 1

50% confidence limits shown


[Figure: residual autocorrelations vs. lag, and plot of the series with forecasts]


SMA with m=9 (average age=5) yields RMSE=104 (slightly worse), more positive autocorrelation in errors

Forecasts lag behind turning point by about 5 periods

More positive autocorrelation at lag 1


[Figure: Plot of the series with forecasts]

[Figure: Residual autocorrelations for X, simple moving average of 19 terms (autocorrelations vs. lag)]


SMA with m=19 (average age=10) yields RMSE=118 (significantly worse), very smooth forecasts, much more positive autocorrelation

Forecasts lag behind turning point by about 10 periods

Strong positive autocorrelation at lag 1

Smoothness vs. responsiveness

• Note that the more we smooth the data, the more clearly we see the “signal” stand out.
• But... greater clarity comes at the expense of getting the news later.
• If we want our forecasting model to respond quickly to changes, it will also pick up “false alarms” due to noise in the data.

Conclusions

• For a time series with a randomly varying local mean, the SMA model may outperform both the mean model and the random walk model
• It allows us to “strike a balance” between averaging over too much past data or too little past data.

However...

Shortcomings of SMA model

• It’s hard to optimize the number of terms (m), because it is a discrete parameter... you must use trial and error.
• Intuitively, you should not equally weight the last m observations when computing the average... it would be better to “discount” the older data in a gradual fashion.

These observations motivate....

Brown’s Simple Exponential Smoothing

Let:

α = “smoothing constant” (0 < α < 1)

$S_t$ = smoothed series at period t

Recursive smoothing formula:

$$S_t = \alpha Y_t + (1-\alpha) S_{t-1}$$

Forecast for next period = current smoothed value:

$$\hat{Y}_{t+1} = S_t$$
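A minimal Python sketch of the recursive smoothing formula follows. The function name is hypothetical, and starting the recursion at the first observation (so that the first forecast error is zero) is an assumption made for the example.

```python
def ses_forecasts(y, alpha, s0=None):
    """Return one-step-ahead SES forecasts.

    forecasts[i] is the forecast of y[i] made before observing it; the last
    element is the forecast for the period after the final observation.
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    s = y[0] if s0 is None else s0  # assumed start-up value for the smoothed series
    forecasts = [s]
    for obs in y:
        s = alpha * obs + (1 - alpha) * s  # S_t = alpha*Y_t + (1 - alpha)*S_{t-1}
        forecasts.append(s)                # forecast for next period = S_t
    return forecasts


print(ses_forecasts([10, 20], 0.5))
```

Only the latest smoothed value has to be stored, which is part of what makes the recursion convenient on a spreadsheet or across many series.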

Mathematically equivalent formulas for SES forecasts

Forecast = interpolation between previous forecast and previous observation:

$$\hat{Y}_{t+1} = \alpha Y_t + (1-\alpha)\hat{Y}_t$$

Forecast = previous forecast plus fraction α of previous error:

$$\hat{Y}_{t+1} = \hat{Y}_t + \alpha e_t, \qquad e_t = Y_t - \hat{Y}_t$$

Forecast = previous observation minus fraction 1−α of previous error:

$$\hat{Y}_{t+1} = Y_t - (1-\alpha)\,e_t$$
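A short derivation, added here as a sketch, shows why the three forms agree. It uses only the definition of the forecast error, $e_t = Y_t - \hat{Y}_t$:

```latex
\begin{align*}
\hat{Y}_{t+1} &= \alpha Y_t + (1-\alpha)\hat{Y}_t \\
              &= \hat{Y}_t + \alpha\,(Y_t - \hat{Y}_t) = \hat{Y}_t + \alpha e_t \\
              &= (Y_t - e_t) + \alpha e_t = Y_t - (1-\alpha)\,e_t
\end{align*}
```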

Mathematically equivalent formulas for SES forecasts, continued

Last but not least:

$$\hat{Y}_{t+1} = \alpha\left[Y_t + (1-\alpha)Y_{t-1} + (1-\alpha)^2 Y_{t-2} + (1-\alpha)^3 Y_{t-3} + \cdots\right]$$

forecast = exponentially weighted moving average of all past observations

…or in other words, a discounted moving average with a discount factor of 1−α per period

Properties of SES model

• SES uses a smoothing parameter (α) which is continuously variable, so it is easily optimized by least squares
• If α = 1, SES → random walk model
• If α = 0, SES → constant model
• Average age of data in SES forecast is 1/α

Examples:

α = 0.5 ⇒ avg. age = 2

α = 0.2 ⇒ avg. age = 5

α = 0.1 ⇒ avg. age = 10, etc.

Properties of SES, continued

• For a given average age, SES is somewhat superior to SMA because it places relatively more weight on the most recent observation
• Hence it is slightly more "responsive" to changes occurring in the recent past.
• Caveat: it is also more sensitive to recent “outliers” than the SMA model, which is not so good for messy data.

SMA (m=9) vs. SES (α=0.2)

[Figure: SMA weight and SES weight (0 to 0.25) plotted against lag (1 to 14)]

SMA weights are 1/9 on first 9 lags of Y, zero afterward

SES weights are larger than SMA weights at first few lags, then gradually decline to zero

Average age = 5 for both models

Average age is the center of mass (“balancing point”) of the weight distribution
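The weight comparison can be reproduced numerically. The sketch below is illustrative only: the function names are made up, and the SES weights are truncated at 500 lags, where the remaining weight is negligible for α = 0.2.

```python
def sma_weights(m, n_lags):
    """Weight on each lag 1..n_lags in an m-term SMA: 1/m on the first m lags."""
    return [1 / m if lag <= m else 0.0 for lag in range(1, n_lags + 1)]


def ses_weights(alpha, n_lags):
    """Weight on each lag 1..n_lags in SES: alpha * (1 - alpha)**(lag - 1)."""
    return [alpha * (1 - alpha) ** (lag - 1) for lag in range(1, n_lags + 1)]


def average_age(weights):
    """Center of mass of the weight distribution over lags 1, 2, 3, ..."""
    total = sum(weights)
    return sum(lag * w for lag, w in enumerate(weights, start=1)) / total


sma = sma_weights(9, 500)
ses = ses_weights(0.2, 500)
print(average_age(sma), average_age(ses))  # both close to 5
print(sma[:4], ses[:4])                    # SES is heavier on the first few lags
```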

Properties of SES, continued

• Long-term forecasts from the basic SES model are a horizontal straight line (no trend, as in random walk and SMA)
• SES = ARIMA(0,1,1), i.e., random walk model (without drift) plus MA = 1, which adds a multiple of lag-1 forecast error:

$$\hat{Y}_{t+1} = \underbrace{Y_t}_{\text{random walk}} - \underbrace{(1-\alpha)\,e_t}_{\text{lag-1 error}}$$

Properties of SES, continued

Exact k-step ahead forecast standard error can be computed using ARIMA theory:

$$SE_{fcst}(k) = SE_{fcst}(1)\,\sqrt{1 + (k-1)\,\alpha^2}$$

Note that it increases with k more slowly than for the random walk model, which is the special case α = 1:

$$SE_{fcst}(k) = SE_{fcst}(1)\,\sqrt{k}$$

Hence the SES model assumes the series is “more predictable” than a random walk
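To see how much more slowly the SES standard error grows, the two formulas can be evaluated side by side. This is a sketch with made-up inputs (a one-step standard error of 100 and α = 0.3); the function names are hypothetical.

```python
import math


def ses_forecast_se(se1, alpha, k):
    """k-step-ahead SES forecast standard error: se1 * sqrt(1 + (k - 1) * alpha^2)."""
    return se1 * math.sqrt(1 + (k - 1) * alpha ** 2)


def rw_forecast_se(se1, k):
    """k-step-ahead random walk forecast standard error: se1 * sqrt(k)."""
    return se1 * math.sqrt(k)


for k in (1, 2, 5, 10):
    print(k, round(ses_forecast_se(100, 0.3, k), 1), round(rw_forecast_se(100, k), 1))
```

Setting alpha to 1 in the first function reproduces the random walk values, matching the special case noted above.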

[Figure: Residual autocorrelations for X, simple exponential smoothing with alpha = 0.2961 (autocorrelations vs. lag)]

Example, continued

SES with optimal α=0.3 (average age=3.3) yields RMSE = 99 (best yet, by a small margin), no significant residual autocorrelations

Don’t worry about an isolated spike at an oddball lag like lag 9—probably just due to a pair of large errors separated by 9 periods


[Figure: Plot of X, simple exponential smoothing with alpha = 0.2961]

SES with constant trend

• A constant linear trend can be added to an SES model by fitting it as an ARIMA(0,1,1) model with constant
• Alas, the ARIMA implementation of SES models can’t be combined with seasonal adjustment in the Forecasting procedure in Statgraphics (although you could seasonally adjust and then fit an ARIMA model in two steps)

SES with constant trend, continued

• A constant exponential trend can be added to SES by using the inflation adjustment option in Statgraphics
• The average percentage growth per period can be estimated from the slope coefficient of a linear trend model or ARIMA(0,1,0)+c model fitted with a natural log transformation
• See video clip #10 for examples

What if the series has a time-varying trend, as well as a time-varying mean?

• Evidently what is needed is an estimate of the local trend as well as the local mean
• This is the motivating idea behind Brown’s Linear Exponential Smoothing (LES) model
• It’s also sometimes called “double exponential smoothing,” because it involves a double application of exponential smoothing

How LES works

• Apply SES once to get a singly-smoothed series $S'_t$ that lags behind the current value by 1/α − 1 periods.*
• Smooth the smoothed series (using same α) to get an even smoother series $S''_t$ that lags behind by 2(1/α − 1) periods
• To forecast the future, extrapolate a line through the two points $(t - (1/\alpha - 1),\ S'_t)$ and $(t - 2(1/\alpha - 1),\ S''_t)$

*Average age relative to next value is 1/α, so age relative to current value is 1/α − 1
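As a worked check (a sketch, not part of the original slides), the line through those two points can be written out. The points are $1/\alpha - 1$ periods apart horizontally, so:

```latex
\begin{align*}
\text{slope} &= \frac{S'_t - S''_t}{1/\alpha - 1}
              = \frac{\alpha}{1-\alpha}\,\bigl(S'_t - S''_t\bigr) \\
\text{height at period } t &= S'_t + \Bigl(\tfrac{1}{\alpha} - 1\Bigr)\cdot\text{slope}
              = 2S'_t - S''_t
\end{align*}
```

The slope is the local trend estimate and the height at period t is the local level estimate from which the forecast line is extended.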

[Figure: series X with singly-smoothed series S' and doubly-smoothed series S'']

LES forecasts from t = 90, α = 0.1*

1. Draw a horizontal line extending 9 periods back in time from the current value of the singly-smoothed series

2. Draw a horizontal line extending 18 periods back in time from the current value of the doubly-smoothed series

3. Extrapolate a line into the future through the left endpoints of the two horizontal lines

*1/α = 10, so 1/α − 1 = 9

How LES works

• There are two equivalent sets of mathematical formulas for implementing the logic of the LES model
• One set of formulas (I) explicitly computes the current estimates of level and trend in each period
• The other set of formulas (II) merely computes the next forecast from the observed data and forecast errors in the last two periods

LES formulas: I

1. Compute singly smoothed series at period t:

$$S'_t = \alpha Y_t + (1-\alpha)\,S'_{t-1}$$

2. Compute doubly smoothed series:

$$S''_t = \alpha S'_t + (1-\alpha)\,S''_{t-1}$$

3. Compute the estimated level at period t:

$$L_t = 2S'_t - S''_t$$

4. Compute the estimated trend at period t:

$$T_t = \frac{\alpha}{1-\alpha}\,\bigl(S'_t - S''_t\bigr)$$

5. Finally, the k-step ahead forecast is given by:

$$\hat{Y}_{t+k} = L_t + kT_t$$

Startup: $S'_1 = S''_1 = Y_1$

LES formulas: II

Mathematically equivalent formula (requires fewer columns on a spreadsheet):

$$\hat{Y}_{t+1} = 2Y_t - Y_{t-1} - 2(1-\alpha)\,e_t + (1-\alpha)^2 e_{t-1}$$

Very important start-up values:

$$\hat{Y}_2 = \hat{Y}_1 = Y_1, \quad \text{hence } e_1 = 0,\ e_2 = Y_2 - Y_1$$

(If you don’t use these start-up values, the early forecasts will gyrate wildly!)
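Both sets of LES formulas can be sketched in Python and checked against each other. The function names and the three-point sample series are made up for illustration; the start-up values follow the ones given in the formulas (smoothed series started at the first observation, first forecast error zero).

```python
def les_forecasts_level_trend(y, alpha):
    """Formulas I: one-step-ahead forecasts from explicit level and trend estimates."""
    s1 = s2 = y[0]        # start-up: S'_1 = S''_1 = Y_1
    forecasts = [y[0]]    # forecast of the first point, so the first error is zero
    for obs in y:
        s1 = alpha * obs + (1 - alpha) * s1      # singly smoothed series
        s2 = alpha * s1 + (1 - alpha) * s2       # doubly smoothed series
        level = 2 * s1 - s2
        trend = alpha / (1 - alpha) * (s1 - s2)
        forecasts.append(level + trend)          # k = 1 step ahead
    return forecasts


def les_forecasts_errors(y, alpha):
    """Formulas II: next forecast from the last two observations and errors."""
    forecasts = [y[0], y[0]]  # start-up: forecasts for periods 1 and 2 equal Y_1
    errors = [0.0]            # hence the first error is zero
    for t in range(1, len(y)):
        errors.append(y[t] - forecasts[t])
        forecasts.append(2 * y[t] - y[t - 1]
                         - 2 * (1 - alpha) * errors[t]
                         + (1 - alpha) ** 2 * errors[t - 1])
    return forecasts


y = [10, 12, 11]  # made-up data, for illustration only
print(les_forecasts_level_trend(y, 0.5))
print(les_forecasts_errors(y, 0.5))
```

In both functions, element i of the result is the forecast of y[i], and the final element is the forecast for the period after the last observation.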

Example, continued

LES model is optimized at α=0.16, yielding RMSE=102 (about the same as SES), but the forecast plot shows a decreasing trend due to the local downward trend at end of series. Confidence intervals also widen more rapidly due to the assumption that the trend may be varying.


[Figure: Plot of X, Brown's linear exp. smoothing with alpha = 0.1608]

[Figure: Residual autocorrelations for X, Brown's linear exp. smoothing with alpha = 0.1608 (autocorrelations vs. lag)]

LES vs. SES

• SES assumes only a time-varying level (i.e., a local mean), while LES assumes a time-varying level and trend.
• SES assumes that the series is more predictable than a random walk, while LES assumes it is less predictable.
• LES model is relatively unstable, hence it may be dangerous to extrapolate the local trend very far.
• There are fancier versions of LES that include a “trend-dampening” factor.

LES vs. SES, continued

• In both SES and LES, the smaller the value of α, the more smoothing (i.e., less response to the most recent observation)
• Remember that the “average age” is 1/α in SES model (amount of lag behind turning points).
• In LES model, forecast is based on what was happening between 1/α and 2/α periods ago.
• When fitted to the same series, LES usually has a smaller optimal α than SES.

LES vs. SES, continued

• SES is the most widely used non-seasonal forecasting model.
• It has a sounder underlying theory than the SMA model, and it is computationally convenient to use on hundreds or thousands of parallel time series (e.g., for SKU-level forecasting).
• Its assumption of no trend is often unrealistic, but it is surprisingly robust in practice for short-term forecasts, often better than LES even for series that have trends.
• You can add an exponential trend via the inflation adjustment option.
• You can add a linear trend to an SES model by fitting it as an ARIMA(0,1,1) model with constant, but you can’t combine ARIMA with seasonal adjustment in the Forecasting procedure.

Estimation issues

• Optimization of α is performed by nonlinear least squares (like Excel’s nonlinear solver).
• Nonlinear estimation requires a “search” process whose solution is inexact and may depend on the starting value.
• In Statgraphics, you may notice that the optimal α varies slightly when the model is revisited, because it restarts the estimation from the previous optimum.

Estimation issues, continued

• α is constrained to lie between 0.0001 and 0.9999 for SES and LES models.
• If the best SES model is actually a random walk model (α=1), then the estimation algorithm will converge to 0.9999. This will often happen if the series has a significant trend.
• Once α hits its upper bound (0.9999), the estimation may get “stuck” there. Try manually changing the initial value to (say) 0.5 before re-fitting the model if the data sample is changed.
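The idea of searching for the α that minimizes squared error can be illustrated with a crude grid search. This is a stand-in for a real nonlinear least-squares routine and is not how Statgraphics or Excel's Solver works internally; the function names, the grid resolution, and the trending sample series are assumptions made for the example.

```python
import math


def ses_rmse(y, alpha):
    """RMSE of one-step-ahead SES forecasts, starting the recursion at y[0]."""
    s = y[0]
    squared = 0.0
    for obs in y[1:]:
        err = obs - s                      # forecast error for this period
        squared += err * err
        s = alpha * obs + (1 - alpha) * s  # update the smoothed value
    return math.sqrt(squared / (len(y) - 1))


def best_alpha(y, lo=0.0001, hi=0.9999, steps=1000):
    """Grid search for alpha within the same bounds quoted above."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return min(grid, key=lambda a: ses_rmse(y, a))


trending = list(range(1, 21))  # a made-up series with a steady upward trend
print(best_alpha(trending))    # lands on the upper bound
```

On the steadily trending series the search ends at the upper bound, which illustrates the situation described in the second bullet above.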

Estimation issues, continued

• Because LES and SES use “recursive” formulas in which each forecast depends on prior errors, their estimation also depends on how they are initialized (i.e., on the “prior errors” that are assumed at the very beginning).
• The usual approach is to just assume that the first error is zero.
• A more sophisticated approach, available as an estimation option in Statgraphics, is to use “backforecasting”* to start up the model.

*We’ll discuss this in more detail later in the course.

Holt’s linear exponential smoothing

• Holt’s model improves on LES by introducing separate smoothing constants for level and trend (“alpha” and “beta”)
• In theory, this allows it to perform more stable trend estimation while adapting to sudden jumps in level

Holt’s model formulas

1. Updated level $L_t$ is an interpolation between the most recent data point and the previous forecast of the level:

$$L_t = \alpha Y_t + (1-\alpha)\,(L_{t-1} + T_{t-1})$$

Here $Y_t$ is the most recent data point and $L_{t-1} + T_{t-1}$ is the forecast of $L_t$ made at period t−1.

2. Updated trend $T_t$ is an interpolation between the change in the estimated level and the previous estimate of the trend:

$$T_t = \beta\,(L_t - L_{t-1}) + (1-\beta)\,T_{t-1}$$

Here $L_t - L_{t-1}$ is the just-observed change in the level and $T_{t-1}$ is the previous trend estimate.

3. k-step ahead forecast from period t (extrapolation of level and trend from period t):

$$\hat{Y}_{t+k} = L_t + kT_t$$
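The three Holt formulas translate directly into a short loop. The sketch below is illustrative only: the function name is made up, and the start-up values (level at the first observation, trend of zero) are an assumption, since the slides do not state how Holt's model is initialized.

```python
def holt_forecast(y, alpha, beta, k=1, level0=None, trend0=0.0):
    """k-step-ahead forecast from Holt's linear exponential smoothing."""
    level = y[0] if level0 is None else level0  # assumed start-up level
    trend = trend0                              # assumed start-up trend
    for obs in y[1:]:
        prev_level = level
        # 1. interpolate between the new data point and the forecast of the level
        level = alpha * obs + (1 - alpha) * (prev_level + trend)
        # 2. interpolate between the change in level and the previous trend
        trend = beta * (level - prev_level) + (1 - beta) * trend
    # 3. extrapolate level and trend k periods ahead
    return level + k * trend


y = [10, 12, 14]  # made-up data, for illustration only
print(holt_forecast(y, 0.5, 0.5, k=1), holt_forecast(y, 0.5, 0.5, k=2))
```

A very small beta, as in the fitted example for series X, means the trend estimate changes slowly even when the level estimate moves quickly.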

Example, continued

Holt’s model is optimized at α=0.306, β=0.007, yielding RMSE = 100 (essentially same as SES & LES), but the forecast plot shows a slightly increasing local trend at end of series, due to relatively heavy smoothing of trend!


[Figure: Residual autocorrelations for X, Holt's linear exp. smoothing with alpha = 0.3061 and beta = 0.0069 (autocorrelations vs. lag)]

[Figure: Plot of X, Holt's linear exp. smoothing with alpha = 0.3061 and beta = 0.0069]

Model comparisons

Models B-C-D-E hardly differ on error measures.

Model choice should also depend on “theoretical” considerations, such as the reasonableness of the trend assumptions

A cautionary word about trend extrapolation

• If you are forecasting more than one period ahead, it is especially important to estimate the trend correctly
• In general, trend assumptions and estimation should be based on everything you know about a time series, not just error statistics of one-period-ahead forecasts or t-stats of slope coefficients

A cautionary word about trend extrapolation

• Extrapolation of time-varying trends estimated by “double smoothing” can be dangerous
• Hence SES (perhaps with fixed trend) often works better in practice
• A trend dampening factor is often used in conjunction with LES or Holt’s:

$$\hat{Y}_{t+k} = L_t + (\phi + \phi^2 + \cdots + \phi^k)\,T_t, \qquad 0 < \phi < 1$$
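The effect of the dampening factor is easy to see numerically. This sketch uses made-up values for the level, trend, and φ, and the function name is hypothetical.

```python
def damped_forecast(level, trend, phi, k):
    """k-step-ahead forecast with trend dampening: level + (phi + ... + phi^k) * trend."""
    if not 0 < phi < 1:
        raise ValueError("phi must lie strictly between 0 and 1")
    damping = sum(phi ** j for j in range(1, k + 1))
    return level + damping * trend


# made-up level, trend and dampening factor, for illustration only
for k in (1, 2, 5, 50):
    print(k, round(damped_forecast(100.0, 10.0, 0.5, k), 3))
```

Because the geometric sum converges to φ/(1−φ), the forecasts flatten out instead of following the straight line implied by an undamped trend.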

Combining seasonal adjustment with a non-seasonal smoothing model

• Often a seasonally adjusted series looks like a good candidate for fitting with a smoothing or averaging model.
• Hence, you can forecast a seasonal series by a combination of seasonal adjustment and non-seasonal smoothing (or other non-seasonal model).
• This “hybrid” approach allows you to model the seasonal pattern explicitly, but it does not have a solid underlying statistical theory, so confidence limits may be dubious.
• There is also some danger of overfitting the seasonal pattern if you don’t have enough seasons of data.
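The hybrid approach can be sketched as three steps: seasonally adjust, smooth, then re-seasonalize. The example below makes several assumptions that are not from the slides: the seasonal indices are multiplicative and already known, the first observation falls in season 0, SES is the non-seasonal model, and all names and numbers are made up.

```python
def hybrid_ses_forecast(y, seasonal_index, alpha, horizon):
    """Seasonally adjust y, run SES on the adjusted series, then re-seasonalize.

    seasonal_index holds assumed multiplicative indices, one per season,
    with y[0] falling in season 0.
    """
    period = len(seasonal_index)
    # Step 1: divide out the seasonal pattern
    adjusted = [obs / seasonal_index[i % period] for i, obs in enumerate(y)]
    # Step 2: simple exponential smoothing of the adjusted series
    s = adjusted[0]
    for obs in adjusted[1:]:
        s = alpha * obs + (1 - alpha) * s
    # Step 3: the flat SES forecast is multiplied back by each future season's index
    n = len(y)
    return [s * seasonal_index[(n + h) % period] for h in range(horizon)]


y = [50, 150, 50, 150]  # made-up series with a two-season pattern
print(hybrid_ses_forecast(y, [0.5, 1.5], alpha=0.5, horizon=2))
```

Estimating the indices themselves is the part most exposed to the overfitting risk mentioned above, which is why they are taken as given here.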

Example of LES + seasonal adjustment on a spreadsheet

The single-equation form of the LES model is easily implemented on a spreadsheet, and Solver can be used to find the value of α that minimizes RMSE.

LES out-of-sample forecasts

The LES model, like any other one-step-ahead forecasting model, can extrapolate its forecasts into the future by “bootstrapping” itself, i.e., by substituting the one-step-ahead forecast for the next data point and then forecasting the next period from there, and so on.
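The bootstrapping procedure can be sketched with the single-equation form of LES. The function name and sample data are made up, and the start-up convention (first two forecasts equal to the first observation) is the one used for that equation. Substituting a forecast for a data point makes that period's error zero.

```python
def les_bootstrap(y, alpha, horizon):
    """Out-of-sample LES forecasts obtained by feeding forecasts back in as data."""
    values = list(y)
    forecasts = [y[0], y[0]]  # start-up forecasts for the first two periods
    errors = [0.0]            # first error is zero
    n = len(y)
    for t in range(1, n + horizon - 1):
        if t >= n:
            values.append(forecasts[t])  # substitute the forecast for the missing data point
        errors.append(values[t] - forecasts[t])  # zero whenever a forecast was substituted
        forecasts.append(2 * values[t] - values[t - 1]
                         - 2 * (1 - alpha) * errors[t]
                         + (1 - alpha) ** 2 * errors[t - 1])
    return forecasts[n:]


print(les_bootstrap([10, 12, 11], 0.5, 3))  # forecasts for the 3 periods after the data
```

After the first two steps the extrapolated forecasts change by a constant amount each period, i.e., they fall on a straight line.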

LES forecasts for seasonally adjusted data

[Figure: Seasonally adjusted series and LES forecast, Dec-83 to Dec-94 (vertical axis 0 to 500)]

Note that LES lags behind turning points, like all smoothing models…

…but it tracks the data pretty well during stretches where the trend is consistent…

…and its out-of-sample forecasts extrapolate the most recent trend

Re-seasonalized LES forecasts

[Figure: Original series and reseasonalized forecast, Dec-83 to Jun-95 (vertical axis 0 to 600)]

Not bad! (if we believe local trend estimate…)

Example: housing starts

Series displays strong seasonality as well as cyclicality

Original data (not seasonally adjusted)

[Figure: Time series plot for HousesNSA, 1/83 to 1/03 (vertical axis 39 to 139)]

New residential construction since 1983

Note the last observation…

Seasonally adjusted data

After seasonal adjustment, variations in level and trend are clearer

[Chart: Time Series Plot for SADJUSTED, 1/83 to 1/03; vertical axis 54 to 134]

In seasonally adjusted terms, the last observation is abnormally large!

How will different models react to it?

(This abnormality was not so apparent on the unadjusted graph!)

Nonseasonal forecasting model fitted to adjusted data: RW+drift

Depending on the kind of long-term trend assumptions we feel are appropriate, we could fit the seasonally adjusted series with a non-seasonal model such as a random walk with drift...

[Chart: Time Sequence Plot for SADJUSTED, random walk with drift = 0.139171; actual, forecast, and 50.0% limits, 1/83 to 1/08]

This model extrapolates the long-term trend from the most recent (higher) level
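A minimal sketch of how a random-walk-with-drift forecast is built, assuming the drift is estimated as the average period-to-period change over the sample. The function name and numbers are made up for illustration.

```python
def rw_drift_forecast(y, horizon):
    """k-step-ahead forecasts: last observed level plus k times the average change."""
    drift = (y[-1] - y[0]) / (len(y) - 1)   # mean of the period-to-period changes
    return [y[-1] + k * drift for k in range(1, horizon + 1)]

# The last value jumps up, and every forecast starts from that higher level
forecasts = rw_drift_forecast([100, 101, 99, 103, 112], horizon=3)
print(forecasts)
```

Because the forecasts are anchored entirely on the last observation, an abnormal final data point shifts the whole forecast path by the full amount of the jump.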

Nonseasonal forecasting model fitted to adjusted data: SES

…or a simple exponential smoothing model...

[Chart: Time Sequence Plot for SADJUSTED, simple exponential smoothing with alpha = 0.4682; 1/83 to 1/08]

This model extrapolates a flat trend from an exponentially-weighted average of recent levels
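For comparison, a minimal SES sketch. The level is initialized at the first data point, which is an assumption (other start-up schemes exist), and the function name and data are illustrative.

```python
def ses_forecast(y, alpha, horizon):
    """Flat forecasts at the exponentially weighted average of past levels."""
    level = y[0]                      # assumption: start the level at the first data point
    for obs in y[1:]:
        level = alpha * obs + (1 - alpha) * level   # interpolate toward the new observation
    return [level] * horizon          # same value at every horizon: a flat trend

forecasts = ses_forecast([10, 20], alpha=0.5, horizon=3)
print(forecasts)
```

With α below 1, only a fraction of a final jump is passed into the level, which is why SES reacts less sharply to one unusual observation than the random walk does.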

Nonseasonal forecasting model fitted to adjusted data: Brown’s LES

…or Brown’s linear exponential smoothing model...

[Chart: Time Sequence Plot for SADJUSTED, Brown’s linear exp. smoothing with alpha = 0.2352; 1/83 to 1/08]

This model tries to extrapolate the recent trend, which is jerked upward by the last observation

Nonseasonal forecasting model fitted to adjusted data: Holt’s LES

… or Holt’s linear exponential smoothing model...

[Chart: Time Sequence Plot for SADJUSTED, Holt’s linear exp. smoothing with alpha = 0.4765 and beta = 0.015; 1/83 to 1/08]

This model also tries to extrapolate the recent trend, but the trend estimate is more conservative due to small “beta” (heavy smoothing)
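The effect of a small beta can be seen in a short sketch of Holt's level and trend updates. The naive start-up (level = first data point, trend = 0), the function name, and the series are assumptions chosen to make the arithmetic easy to follow.

```python
def holt(y, alpha, beta):
    """Return the final (level, trend) from Holt's linear exponential smoothing."""
    level, trend = float(y[0]), 0.0   # assumption: naive start-up values
    for obs in y[1:]:
        prev_level = level
        level = alpha * obs + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level, trend

# Flat series with a jump in the last observation
data = [100] * 10 + [120]
level_small, trend_small = holt(data, alpha=0.5, beta=0.015)
level_big, trend_big = holt(data, alpha=0.5, beta=0.5)
print(trend_small, trend_big)
```

Both runs end at the same level, but the heavily smoothed trend (beta = 0.015) barely moves after the jump, while beta = 0.5 turns half of the level change into projected growth per period.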

Hybrid seasonal models in SG

You can fit hybrid models in the Forecasting procedure in Statgraphics by selecting “multiplicative seasonal adjustment” in conjunction with a RW or SES or LES model type.

The forecasts are automatically “reseasonalized” in the plots and model comparison statistics

Be on guard against overfitting: seasonal adjustment adds many parameters to the model, and estimation period statistics may not be fully adjusted to correct for additional parameters.
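Outside Statgraphics, the hybrid procedure on this slide can be sketched as three steps: divide by the seasonal indices, fit a nonseasonal model, then multiply the forecasts by the indices again. The sketch below uses SES for the middle step and takes the multiplicative seasonal indices as given (in practice they would come from a seasonal decomposition). The function name, the assumption that the first observation falls in season 0, and the data are all illustrative.

```python
def hybrid_ses_forecast(y, seasonal_indices, alpha, horizon):
    """Seasonally adjust, smooth with SES, and reseasonalize the forecasts."""
    s = len(seasonal_indices)
    # 1. Seasonal adjustment: divide each observation by its season's index
    #    (assumption: y[0] falls in season 0)
    adjusted = [obs / seasonal_indices[t % s] for t, obs in enumerate(y)]
    # 2. Nonseasonal model on the adjusted data (SES, level started at first value)
    level = adjusted[0]
    for obs in adjusted[1:]:
        level = alpha * obs + (1 - alpha) * level
    # 3. Reseasonalize: multiply the flat forecast by each future period's index
    n = len(y)
    return [level * seasonal_indices[(n + k) % s] for k in range(horizon)]

# Two-season toy example: the adjusted series is constant at 100
forecasts = hybrid_ses_forecast([50, 150, 50, 150], [0.5, 1.5], alpha=0.5, horizon=3)
print(forecasts)
```

Each seasonal index is an extra estimated parameter, which is the source of the overfitting risk when only a few seasons of data are available.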

Hybrid seasonal models

RW + seasonal adjustment

Here’s the result of fitting the RW-with-drift model with multiplicative seasonal adjustment

[Chart: Time Sequence Plot for HousesNSA, random walk with drift = 0.142988; 1/83 to 1/08, vertical axis 50 to 150]

Note sharply raised forecasts, driven by unusual seasonally adjusted value of last data point

SES + seasonal adjustment

Here’s the result of fitting the SES model with multiplicative seasonal adjustment

[Chart: Time Sequence Plot for HousesNSA, simple exponential smoothing with alpha = 0.4617; 1/83 to 1/08, vertical axis 50 to 150]

More conservative (though still raised) forecasts, tighter confidence limits

Brown’s LES + seasonal adjustment

Here’s the result of fitting the LES model with multiplicative seasonal adjustment

[Chart: Time Sequence Plot for HousesNSA, Brown’s linear exp. smoothing with alpha = 0.2365; 1/83 to 1/08, vertical axis 50 to 150]

Forecasts march steeply upward, confidence limits are rather wide

Holt’s LES + seasonal adjustment

Here’s the result of fitting Holt’s model with multiplicative seasonal adjustment

[Chart: Time Sequence Plot for HousesNSA, Holt’s linear exp. smoothing with alpha = 0.4667 and beta = 0.0144; 1/83 to 1/08, vertical axis 50 to 150]

Forecasts start from higher level but with flatter trend than LES, but confidence limits are rather optimistic

Linear trend + seasonal adjustment (?)

Just for fun, here’s a linear trend model with multiplicative seasonal adjustment

[Chart: Time Sequence Plot for HousesNSA, linear trend = 76.7875 + 0.0262053 t; 1/83 to 1/08, vertical axis 50 to 150]

Obviously not appropriate!

Model comparison report shows that SES and Holt’s do the best in estimation period, although RW model is slightly “luckier” in validation period (last 4 years of data were held out)

[Chart: Residual Plot for adjusted HousesNSA, simple exponential smoothing with alpha = 0.4594; residuals from -18 to 22, 1/83 to 1/03]

[Chart: Residual Autocorrelations for adjusted HousesNSA, simple exponential smoothing with alpha = 0.4594; lags 0 to 25, autocorrelations from -1 to 1]

Residual plots for SES model show stable variance, no significant autocorrelation… model appears “OK”
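The autocorrelation check behind this kind of plot can be sketched as follows. Sample autocorrelations of the residuals are compared with the rough ±2/√n band that is commonly used as an approximate 95% significance limit. The exact limits drawn by Statgraphics may differ, and the function names and data are hypothetical.

```python
import math

def autocorrelations(residuals, max_lag):
    """Sample autocorrelations of the residuals at lags 1..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [r - mean for r in residuals]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(1, max_lag + 1)]

def significant_lags(residuals, max_lag):
    """Lags whose autocorrelation falls outside the rough +/- 2/sqrt(n) band."""
    bound = 2 / math.sqrt(len(residuals))
    acf = autocorrelations(residuals, max_lag)
    return [k for k, r in enumerate(acf, start=1) if abs(r) > bound]

# Deliberately bad residuals: a strictly alternating pattern
bad = [1, -1] * 20
print(autocorrelations(bad, 2), significant_lags(bad, 2))
```

For a well-specified model the returned list of significant lags should be empty or nearly so; a strong spike at lag 1 or at the seasonal lag suggests the model has left structure in the errors.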

Even the (vertical) probability plot looks good.* This is a “pane option” behind the “residual plots”.

[Chart: Residual Plot for adjusted HousesNSA (normal probability plot), simple exponential smoothing with alpha = 0.4594; residuals from -18 to 22, proportion from 0.1 to 99.9]

*This result validates the use of normal distribution theory to compute the confidence intervals from the forecast standard errors.

What’s the best forecast?

The main issue here is what to infer from the recent jump in seasonally adjusted housing starts.

Our modeling results do not really answer this question for us; they merely show the consequences of different assumptions we may wish to make.

Ideally, “domain knowledge” should shed additional light on the appropriateness of the assumptions.

The SES model is clearly the most “conservative” choice, because its forecasts are less radically affected by one recent observation.

Winters’ Seasonal Smoothing

The logic of Holt’s model can be extended to recursively estimate time-varying seasonal indices as well as level and trend.

Let $L_t$, $T_t$, and $S_t$ denote the estimated level, trend, and seasonal index at period $t$.

Let $s$ denote the number of periods in a season.

Let α, β, and γ denote separate smoothing constants* for level, trend, and seasonality.

*numbers between 0 and 1: smaller values → more smoothing

Winters’ model formulas

1. Updated level $L_t$ is an interpolation between the seasonally adjusted value of the most recent data point and the previous forecast of the level:

$$L_t = \alpha \frac{Y_t}{S_{t-s}} + (1-\alpha)(L_{t-1} + T_{t-1})$$

Here $Y_t / S_{t-s}$ is the seasonally adjusted value of $Y_t$, and $L_{t-1} + T_{t-1}$ is the forecast of $L_t$ made at period $t-1$.

2. Updated trend $T_t$ is an interpolation between the change in the estimated level and the previous estimate of the trend:

$$T_t = \beta (L_t - L_{t-1}) + (1-\beta) T_{t-1}$$

Here $L_t - L_{t-1}$ is the just-observed change in the level, and $T_{t-1}$ is the previous trend estimate.

3. Updated seasonal index $S_t$ is an interpolation between the ratio of the data point to the estimated level and the previous estimate of the seasonal index:

$$S_t = \gamma \frac{Y_t}{L_t} + (1-\gamma) S_{t-s}$$

Here $Y_t / L_t$ is the “ratio to moving average” of the current data point, and $S_{t-s}$ is the last estimate of the seasonal index in the same season.

4. $k$-step ahead forecast from period $t$:

$$\hat{Y}_{t+k} = (L_t + k T_t) S_{t-s+k}$$

Here $L_t + k T_t$ is the extrapolation of level and trend from period $t$, and $S_{t-s+k}$ is the most recent estimate of the seasonal index for the $k$th period in the future.
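The four formulas translate almost line for line into code. The sketch below is a minimal illustration, not how Statgraphics implements the model. It assumes the naive start-up (level = first data point, trend = 0, all seasonal indices = 1.0), takes the smoothing constants as given rather than estimating them, and wraps the seasonal index around when the horizon exceeds one season. The function name and data are hypothetical.

```python
def winters_forecast(y, s, alpha, beta, gamma, horizon):
    """Multiplicative Winters smoothing: update level, trend, and seasonal indices."""
    # Assumption: naive start-up values
    level, trend = float(y[0]), 0.0
    seasonal = [1.0] * s          # seasonal[t % s] holds the latest index for that season
    for t in range(1, len(y)):
        prev_level = level
        old_index = seasonal[t % s]                                            # S_{t-s}
        level = alpha * y[t] / old_index + (1 - alpha) * (prev_level + trend)  # formula 1
        trend = beta * (level - prev_level) + (1 - beta) * trend               # formula 2
        seasonal[t % s] = gamma * y[t] / level + (1 - gamma) * old_index       # formula 3
    last = len(y) - 1
    # Formula 4: extrapolate level and trend, then apply the latest index for that season
    return [(level + k * trend) * seasonal[(last + k) % s]
            for k in range(1, horizon + 1)]

# Sanity check on a constant made-up series: forecasts should stay constant
forecasts = winters_forecast([100] * 8, s=4, alpha=0.5, beta=0.5, gamma=0.5, horizon=3)
print(forecasts)
```

Note that the level update divides by the index from one season ago, so each seasonal index is revised only once per season.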

Estimation issues

Estimation of Winters’ model is tricky, and not all software does it well: sometimes you get crazy results.

There are three separate smoothing constants to be jointly estimated by nonlinear least squares (α, β, γ).

Initialization is also tricky, especially for the seasonal indices.

Estimation issues

Some common initialization schemes:

Naïve approach: set initial level = 1st data point, trend = 0, seasonal indices = 1.0

More sophisticated: perform a seasonal decomposition to obtain initial seasonal indices & fit trend line to obtain initial trend

Even more sophisticated: use backforecasting

Calculation of confidence intervals is also complicated & not always done correctly.

Winters’ model fitted to housing starts

Results of fitting Winters’ model

[Chart: Time Sequence Plot for HousesNSA, Winters’ exp. smoothing with alpha = 0.4454, beta = 0.0146, gamma = 0.2843; 1/83 to 1/08, vertical axis 50 to 150]

In this case, the Winters forecasts & confidence intervals look similar to those of the Holt’s model with seasonal adjustment (alpha and beta are very similar, as should be expected)

Model comparison report shows that Winters’ fits a little less well than SES or Holt’s model, but is otherwise “OK”

Winters’ model in practice

The Winters model is popular in “automatic forecasting” software, because it has a little of everything (level, trend, seasonality).

Sometimes it works well, but difficulties in initialization & estimation can lead to strange results in other cases.

In principle it is similar to linear exponential smoothing and can produce similarly unstable long-term trend projections.

What really happened in last 5 years?

[Chart: forecasts from the RW+drift, SES, LES, Holt, and Winters models versus actual values, by date, 2002 to 2007; vertical axis 70 to 220]

All models overpredicted housing starts for the rest of 2002 and 2003, over-responding to the Feb. ’02 jump, but later values were in the middle range of predictions until the recent plunge

Class 4 recap

Averaging and smoothing models enable you to estimate time-varying levels and trends.

SMA, SES, and LES models can be combined with seasonal adjustment to forecast seasonal data (...but beware of changing seasonal patterns and possibility of overfitting)

Winters’ model estimates time-varying seasonal indices.

You need to exercise judgment in model selection in order to make appropriate assumptions about changing levels and trends & unusual events.
