Decision 411: Class 4 - Fuqua School of Business
Non-seasonal averaging & smoothing models
• Simple moving average (SMA) model
• Simple exponential smoothing (SES) model
• Linear exponential smoothing (LES) model
Combining seasonal adjustment with non-seasonal smoothing
Winters’ seasonal smoothing model
Guidelines for future HW writeups
• Presentation should stand on its own (SG files are mainly just for audit trail)
• What’s the bottom line? (forecast, trend, key drivers?)
• Clearly define the variables (units, dates, transformations, etc.) used in the analysis
• Use bullet points for key observations & findings
• Use tables to present key numbers (forecasts & CI’s)
• Embed the most important chart(s), with annotations
• Show where the numbers came from
• Explain your model’s assumptions in layman’s terms
Averaging & smoothing models
Today’s topics
Later: ARIMA models
We’ll meet ARIMA later in the course,
but briefly, an “ARIMA (p,d,q)” model is like a regression model in which the dependent variable is a
d-order difference of the input variable, and the independent variables are p
lagged values of the dependent variable (AR terms) and/or q lagged values of
the forecast errors (MA terms), plus an optional constant term. Many of the averaging & smoothing models are
special cases, e.g., an ARIMA(0,1,1) model is an SES model.
p = # AR terms (lags of dependent variable)
q = # MA terms (lags of errors)
d = order of differencing of input variable
Averaging & smoothing models
The problem: sometimes nonseasonal (or seasonally adjusted) data appears to be “locally stationary” with a time-varying mean
• The mean (constant) model doesn’t track changes in the mean and has positively autocorrelated errors
• The random walk model may not perform well either in this situation: it “oversteers”, picks up too much “noise” in the data, and yields negatively autocorrelated errors
Example: series “X”
[Figure: time series plot of X with constant-mean forecast (constant mean = 463.136); residual autocorrelations for X plotted against lag]
• Mean (constant) model yields positively autocorrelated errors and doesn’t react to changes in the local mean. RMSE = 121
Plot annotations: strong positive autocorrelation at lag 1; no reaction to local changes in data
Example, continued
[Figure: time series plot of X with random walk forecast; residual autocorrelations for X plotted against lag]
Random walk model for series X yields negatively autocorrelated errors and overreacts to changes. RMSE = 122 …not any better!
Plot annotations: strong negative autocorrelation at lag 1; over-reaction to local changes in data (always 1 period too late)
A solution:
Use a model that averages or “smooths” the recent data to filter out some of the noise and estimate the local mean, such as the Simple Moving Average (SMA) model:
$\hat{Y}_t = (Y_{t-1} + Y_{t-2} + \dots + Y_{t-m}) / m$
…i.e., just average the last m observed values.
Properties of SMA model
Average age of the data in the forecast is (m+1)/2
((m+1)/2 is midway between 1 period old and m periods old)
m=3 ⇒ avg. age = 2
m=5 ⇒ avg. age = 3
m=9 ⇒ avg. age = 5, etc.
…hence it lags behind turning points by (m+1)/2 periods
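The SMA forecast and its average age can be sketched in a few lines of Python. This is only an illustration: the function names and the sample numbers are made up here, and periods with fewer than m prior observations are simply given no forecast.

```python
def sma_forecasts(y, m):
    """One-step-ahead SMA forecasts.

    forecasts[t] is the mean of the m observations before index t.
    Entries with fewer than m prior observations are None. The final
    entry is the forecast for the next, not-yet-observed period.
    """
    if m < 1:
        raise ValueError("m must be at least 1")
    forecasts = []
    for t in range(len(y) + 1):
        if t < m:
            forecasts.append(None)
        else:
            forecasts.append(sum(y[t - m:t]) / m)
    return forecasts


def sma_average_age(m):
    """Average age of the data in an m-term SMA forecast: (m + 1) / 2."""
    return (m + 1) / 2


y = [10.0, 12.0, 11.0, 13.0, 15.0]  # made-up numbers, for illustration only
print(sma_forecasts(y, 3))
print(sma_average_age(3), sma_average_age(5), sma_average_age(9))
```

Larger m gives a smoother forecast series but a larger average age, which is the lag behind turning points described above.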
Properties of SMA, continued
• Long-term forecasts = horizontal straight line ( = simple average of last few values)
• Confidence limits??? No theory!!
• Works well on highly irregular data: no data point receives more weight than others, so it’s relatively robust against “outliers”
• Can also be “tapered” for even greater robustness
Example, continued
[Figure: time series plot of X with simple moving average of 3 terms; residual autocorrelations for X plotted against lag]
SMA with m=3 (average age=2) yields RMSE=104 (significantly better!) and less negative autocorrelation. 50% confidence limits are shown here, but don’t trust them: they are based on the assumption of the mean remaining fixed at the latest value.
Plot annotations: forecasts lag behind turning points by about 2 periods; no autocorrelation at lag 1; 50% confidence limits (?)
Example, continued
[Figure: time series plot of X with simple moving average of 5 terms; residual autocorrelations for X plotted against lag]
SMA with m=5 (average age=3) yields RMSE=102 (very slightly better), “smoother” forecasts, slight positive autocorrelation in errors
Plot annotations: forecasts lag behind turning point by about 3 periods; slight positive autocorrelation at lag 1; 50% confidence limits shown
Example, continued
[Figure: time series plot of X with simple moving average of 9 terms; residual autocorrelations for X plotted against lag]
SMA with m=9 (average age=5) yields RMSE=104 (slightly worse), more positive autocorrelation in errors
Plot annotations: forecasts lag behind turning point by about 5 periods; more positive autocorrelation at lag 1
Example, continued
[Figure: time series plot of X with simple moving average of 19 terms; residual autocorrelations for X plotted against lag]
SMA with m=19 (average age=10) yields RMSE=118 (significantly worse), very smooth forecasts, much more positive autocorrelation
Plot annotations: forecasts lag behind turning point by about 10 periods; strong positive autocorrelation at lag 1
Smoothness vs. responsiveness
• Note that the more we smooth the data, the more clearly we see the “signal” stand out.
• But...greater clarity comes at the expense of getting the news later.
• If we want our forecasting model to respond quickly to changes, it will also pick up “false alarms” due to noise in the data.
Conclusions
• For a time series with a randomly varying local mean, the SMA model may outperform both the mean model and the random walk model
• It allows us to “strike a balance” between averaging over too much past data or too little past data.
• However...
Shortcomings of SMA model
• It’s hard to optimize the number of terms (m), because it is a discrete parameter... you must use trial and error.
• Intuitively, you should not equally weight the last m observations when computing the average... it would be better to “discount” the older data in a gradual fashion.
These observations motivate....
Brown’s Simple Exponential Smoothing
Let:
α = “smoothing constant” (0 < α < 1)
$S_t$ = smoothed series at period t
Recursive smoothing formula:
$S_t = \alpha Y_t + (1 - \alpha) S_{t-1}$
Forecast for next period = current smoothed value:
$\hat{Y}_{t+1} = S_t$
Mathematically equivalent formulas for SES forecasts
• Forecast = interpolation between previous forecast and previous observation:
$\hat{Y}_{t+1} = \alpha Y_t + (1 - \alpha) \hat{Y}_t$
• Forecast = previous forecast plus fraction α of previous error:
$\hat{Y}_{t+1} = \hat{Y}_t + \alpha e_t$, where $e_t = Y_t - \hat{Y}_t$
• Forecast = previous observation minus fraction 1-α of previous error:
$\hat{Y}_{t+1} = Y_t - (1 - \alpha) e_t$
Mathematically equivalent formulas for SES forecasts, continued
Last but not least:
$\hat{Y}_{t+1} = \alpha [ Y_t + (1 - \alpha) Y_{t-1} + (1 - \alpha)^2 Y_{t-2} + (1 - \alpha)^3 Y_{t-3} + \dots ]$
Forecast = exponentially weighted moving average of all past observations
…or in other words, a discounted moving average with a discount factor of 1-α per period
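The equivalence of the SES forms can be checked numerically. The sketch below computes one-step-ahead forecasts with the error-correction form and with the interpolation form. The function names and sample data are made up for illustration, and the start-up rule (first forecast = first observation, so the first error is zero) is an assumption of this sketch.

```python
def ses_error_form(y, alpha):
    """SES via Yhat[t+1] = Yhat[t] + alpha * e[t].

    forecasts[t] is the forecast of y[t]; the last entry is the
    forecast for the next, not-yet-observed period.
    """
    forecast = y[0]  # start-up assumption: first error is zero
    forecasts = [forecast]
    for obs in y:
        error = obs - forecast
        forecast = forecast + alpha * error
        forecasts.append(forecast)
    return forecasts


def ses_interpolation_form(y, alpha):
    """SES via Yhat[t+1] = alpha * Y[t] + (1 - alpha) * Yhat[t]."""
    forecast = y[0]  # same start-up assumption
    forecasts = [forecast]
    for obs in y:
        forecast = alpha * obs + (1 - alpha) * forecast
        forecasts.append(forecast)
    return forecasts


y = [10.0, 20.0, 30.0]  # made-up numbers, for illustration only
print(ses_error_form(y, 0.5))
print(ses_interpolation_form(y, 0.5))
print(ses_error_form(y, 1.0))  # alpha = 1 reproduces the random walk forecasts
```

Both forms give the same forecasts up to floating-point rounding; which one to use is mostly a matter of convenience.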
Properties of SES model
• SES uses a smoothing parameter (α) which is continuously variable, so it is easily optimized by least squares
• If α = 1, SES → random walk model
• If α = 0, SES → constant model
• Average age of data in SES forecast is 1/α. Examples:
α = 0.5 ⇒ avg. age = 2
α = 0.2 ⇒ avg. age = 5
α = 0.1 ⇒ avg. age = 10, etc.
Properties of SES, continued
• For a given average age, SES is somewhat superior to SMA because it places relatively more weight on the most recent observation
• Hence it is slightly more "responsive" to changes occurring in the recent past.
• Caveat: it is also more sensitive to recent “outliers” than the SMA model, so it is not so good for messy data.
SMA (m=9) vs. SES (α=0.2)
[Figure: SMA weight and SES weight plotted against lag, lags 1 to 14]
• SMA weights are 1/9 on first 9 lags of Y, zero afterward
• SES weights are larger than SMA weights at first few lags, then gradually decline to zero
• Average age = 5 for both models
• Average age is the center of mass (“balancing point”) of the weight distribution
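The weight comparison can be reproduced with a short sketch. The function names are made up for illustration, and the SES weights are truncated at a finite number of lags (an approximation; the omitted tail is negligible for the values used here).

```python
def sma_weights(m, n_lags):
    """Weight on each of the first n_lags lags: 1/m on lags 1..m, zero afterward."""
    return [1.0 / m if lag <= m else 0.0 for lag in range(1, n_lags + 1)]


def ses_weights(alpha, n_lags):
    """Weight on lag j is alpha * (1 - alpha) ** (j - 1), truncated at n_lags."""
    return [alpha * (1 - alpha) ** (lag - 1) for lag in range(1, n_lags + 1)]


def average_age(weights):
    """Center of mass of the weight distribution, with lags counted from 1."""
    total = sum(weights)
    return sum(lag * w for lag, w in enumerate(weights, start=1)) / total


sma = sma_weights(9, 500)
ses = ses_weights(0.2, 500)
print(average_age(sma), average_age(ses))  # both close to 5
print(ses[0] > sma[0])                     # SES puts more weight on the most recent value
```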
Properties of SES, continued
Long-term forecasts from the basic SES model are a horizontal straight line (no trend, as in random walk and SMA)
SES = ARIMA(0,1,1), i.e., random walk model (without drift) plus one MA term, which adds a multiple of the lag-1 forecast error:
$\hat{Y}_{t+1} = Y_t - (1 - \alpha) e_t$
(the $Y_t$ term is the random walk forecast; the $-(1 - \alpha) e_t$ term is the multiple of the lag-1 error)
Properties of SES, continued
Exact k-step ahead forecast standard error can be computed using ARIMA theory:
$SE_{fcst}(k) = SE_{fcst}(1) \sqrt{1 + (k - 1) \alpha^2}$
Note that it increases with k more slowly than for the random walk model, which is the special case α = 1:
$SE_{fcst}(k) = SE_{fcst}(1) \sqrt{k}$
Hence the SES model assumes the series is “more predictable” than a random walk
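A small sketch makes the difference in growth rates concrete. The function names and the one-step standard error of 100 are illustrative assumptions, not values from the example series.

```python
import math


def ses_forecast_se(se_1, alpha, k):
    """k-step-ahead SES forecast standard error: se_1 * sqrt(1 + (k - 1) * alpha^2)."""
    return se_1 * math.sqrt(1 + (k - 1) * alpha ** 2)


def random_walk_forecast_se(se_1, k):
    """k-step-ahead random walk forecast standard error: se_1 * sqrt(k)."""
    return se_1 * math.sqrt(k)


se_1 = 100.0  # hypothetical one-step-ahead standard error
for k in (1, 2, 4, 8):
    print(k, round(ses_forecast_se(se_1, 0.3, k), 1), round(random_walk_forecast_se(se_1, k), 1))
```

With α = 1 the two functions agree, which matches the statement that the random walk is the special case α = 1.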
Example, continued
[Figure: time series plot of X with simple exponential smoothing, alpha = 0.2961; residual autocorrelations for X plotted against lag]
SES with optimal α=0.3 (average age=3.3) yields RMSE = 99 (best yet, by a small margin), no significant residual autocorrelations
Don’t worry about an isolated spike at an oddball lag like lag 9: it is probably just due to a pair of large errors separated by 9 periods
50% confidence limits shown
SES with constant trend
• A constant linear trend can be added to an SES model by fitting it as an ARIMA(0,1,1) model with constant
• Alas, the ARIMA implementation of SES models can’t be combined with seasonal adjustment in the Forecasting procedure in Statgraphics (although you could seasonally adjust and then fit an ARIMA model in two steps)
SES with constant trend, continued
• A constant exponential trend can be added to SES by using the inflation adjustment option in Statgraphics
• The average percentage growth per period can be estimated from the slope coefficient of a linear trend model or ARIMA(0,1,0)+c model fitted with a natural log transformation
• See video clip #10 for examples
What if the series has a time-varying trend, as well as a time-varying mean?
• Evidently what is needed is an estimate of the local trend as well as the local mean
• This is the motivating idea behind Brown’s Linear Exponential Smoothing (LES) model
• It’s also sometimes called “double exponential smoothing”, because it involves a double application of exponential smoothing
How LES works
• Apply SES once to get a singly-smoothed series $S'_t$ that lags behind the current value by 1/α − 1 periods.*
• Smooth the smoothed series (using same α) to get an even smoother series $S''_t$ that lags behind by 2(1/α − 1) periods
• To forecast the future, extrapolate a line between the two points (t − (1/α − 1), $S'_t$) and (t − 2(1/α − 1), $S''_t$)
*Average age relative to next value is 1/α, so age relative to current value is 1/α - 1
LES forecasts from t = 90, α=0.1*
[Figure: plot of X with the singly smoothed series S' and the doubly smoothed series S'']
1. Draw a horizontal line extending 9 periods back in time from the current value of the singly-smoothed series
2. Draw a horizontal line extending 18 periods back in time from the current value of the doubly-smoothed series
3. Extrapolate a line into the future through the left endpoints of the two horizontal lines
*1/α = 10, so 1/α - 1 = 9
How LES works
• There are two equivalent sets of mathematical formulas for implementing the logic of the LES model
• One set of formulas (I) explicitly computes the current estimates of level and trend in each period
• The other set of formulas (II) merely computes the next forecast from the observed data and forecast errors in the last two periods
LES formulas: I
1. Compute singly smoothed series at period t:
$S'_t = \alpha Y_t + (1 - \alpha) S'_{t-1}$
2. Compute doubly smoothed series:
$S''_t = \alpha S'_t + (1 - \alpha) S''_{t-1}$
3. Compute the estimated level at period t:
$L_t = 2 S'_t - S''_t$
4. Compute the estimated trend at period t:
$T_t = \frac{\alpha}{1 - \alpha} (S'_t - S''_t)$
5. Finally, the k-step ahead forecast is given by:
$\hat{Y}_{t+k} = L_t + k T_t$
Startup: $S'_1 = S''_1 = Y_1$
LES formulas: II
Mathematically equivalent formula (requires fewer columns on a spreadsheet):
$\hat{Y}_{t+1} = 2 Y_t - Y_{t-1} - 2 (1 - \alpha) e_t + (1 - \alpha)^2 e_{t-1}$
Very important start-up values:
$\hat{Y}_2 = \hat{Y}_1 = Y_1$, hence $e_1 = 0$, $e_2 = Y_2 - Y_1$
(If you don’t use these start-up values, the early forecasts will gyrate wildly!)
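The two formula sets can be compared directly. The sketch below implements both with the start-up values given above and shows that they produce the same one-step-ahead forecasts. Function names and the three-point sample series are made up for illustration, and lists are 0-indexed, so index 0 corresponds to period 1.

```python
def les_formulas_1(y, alpha):
    """Formula set I: level and trend from the singly and doubly smoothed series.

    forecasts[t] is the one-step-ahead forecast of y[t]; the last entry
    is the forecast for the next, not-yet-observed period.
    """
    s1 = s2 = y[0]           # startup: S'_1 = S''_1 = Y_1
    forecasts = [y[0]]       # forecast of the first observation is set to Y_1
    for t, obs in enumerate(y):
        if t > 0:
            s1 = alpha * obs + (1 - alpha) * s1
            s2 = alpha * s1 + (1 - alpha) * s2
        level = 2 * s1 - s2
        trend = alpha / (1 - alpha) * (s1 - s2)
        forecasts.append(level + trend)  # k = 1
    return forecasts


def les_formulas_2(y, alpha):
    """Formula set II: next forecast from the last two observations and errors."""
    forecasts = [y[0], y[0]]  # start-up: Yhat_1 = Yhat_2 = Y_1
    errors = [0.0]            # hence e_1 = 0
    for t in range(1, len(y)):
        errors.append(y[t] - forecasts[t])
        forecasts.append(2 * y[t] - y[t - 1]
                         - 2 * (1 - alpha) * errors[t]
                         + (1 - alpha) ** 2 * errors[t - 1])
    return forecasts


y = [0.0, 1.0, 3.0]  # made-up numbers, for illustration only
print(les_formulas_1(y, 0.5))
print(les_formulas_2(y, 0.5))
```

Formula set II needs only the data column, the forecast column, and the error column, which is why it is convenient on a spreadsheet.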
Example, continued
[Figure: time series plot of X with Brown's linear exp. smoothing, alpha = 0.1608; residual autocorrelations for X plotted against lag]
LES model is optimized at α=0.16, yielding RMSE=102 (about the same as SES) …but the forecast plot shows a decreasing trend due to the local downward trend at end of series. Confidence intervals also widen more rapidly due to the assumption that the trend may be varying.
50% confidence limits shown
LES vs. SES
• SES assumes only a time-varying level (i.e., a local mean), while LES assumes a time-varying level and trend.
• SES assumes that the series is more predictable than a random walk, while LES assumes it is less predictable.
• LES model is relatively unstable, hence it may be dangerous to extrapolate the local trend very far.
• There are fancier versions of LES that include a “trend-dampening” factor.
LES vs. SES, continued
• In both SES and LES, the smaller the value of α, the more smoothing (i.e., less response to the most recent observation)
• Remember that the “average age” is 1/α in SES model (amount of lag behind turning points).
• In LES model, forecast is based on what was happening between 1/α and 2/α periods ago.
• When fitted to the same series, LES usually has a smaller optimal α than SES.
LES vs. SES, continued
• SES is the most widely used non-seasonal forecasting model.
• It has a sounder underlying theory than the SMA model, and it is computationally convenient to use on hundreds or thousands of parallel time series (e.g., for SKU-level forecasting).
• Its assumption of no trend is often unrealistic, but it is surprisingly robust in practice for short-term forecasts, often better than LES even for series that have trends.
• You can add an exponential trend via the inflation adjustment option.
• You can add a linear trend to an SES model by fitting it as an ARIMA(0,1,1) model with constant, but you can’t combine ARIMA with seasonal adjustment in the Forecasting procedure.
Estimation issues
• Optimization of α is performed by nonlinear least squares (like Excel’s nonlinear solver).
• Nonlinear estimation requires a “search” process whose solution is inexact and may depend on the starting value.
• In Statgraphics, you may notice that the optimal α varies slightly when the model is revisited, because it restarts the estimation from the previous optimum.
Estimation issues, continued
• α is constrained to lie between 0.0001 and 0.9999 for SES and LES models.
• If the best SES model is actually a random walk model (α=1), then the estimation algorithm will converge to 0.9999. This will often happen if the series has a significant trend.
• Once α hits its upper bound (0.9999), the estimation may get “stuck” there. Try manually changing the initial value to (say) 0.5 before re-fitting the model if the data sample is changed.
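To see why a trending series pushes α to its upper bound, here is a rough sketch that minimizes the sum of squared one-step-ahead SES errors over the interval [0.0001, 0.9999]. It uses a plain grid search as a stand-in for a nonlinear least-squares routine; it is not the algorithm used by Statgraphics or Excel, and the function names and the sample series are made up for illustration.

```python
def ses_sse(y, alpha):
    """Sum of squared one-step-ahead SES errors, with the first error set to zero."""
    forecast = y[0]  # start-up assumption: first forecast equals first observation
    sse = 0.0
    for obs in y:
        error = obs - forecast
        sse += error * error
        forecast += alpha * error
    return sse


def best_alpha_grid(y, lower=0.0001, upper=0.9999, steps=1000):
    """Return the grid point in [lower, upper] with the smallest SSE."""
    best_alpha, best_sse = lower, ses_sse(y, lower)
    for i in range(1, steps + 1):
        alpha = lower + (upper - lower) * i / steps
        sse = ses_sse(y, alpha)
        if sse < best_sse:
            best_alpha, best_sse = alpha, sse
    return best_alpha


trending = [float(v) for v in range(1, 21)]  # made-up steadily rising series
print(best_alpha_grid(trending))             # ends up at the upper bound
```

A grid search avoids the starting-value dependence mentioned above, at the cost of many more function evaluations than a proper search routine would need.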
Estimation issues, continued
• Because LES and SES use “recursive” formulas in which each forecast depends on prior errors, their estimation also depends on how they are initialized (i.e., on the “prior errors” that are assumed at the very beginning).
• The usual approach is to just assume that the first error is zero.
• A more sophisticated approach, available as an estimation option in Statgraphics, is to use “backforecasting”* to start up the model.
*We’ll discuss this in more detail later in the course.
Holt’s linear exponential smoothing
• Holt’s model improves on LES by introducing separate smoothing constants for level and trend (“alpha” and “beta”)
• In theory, this allows it to perform more stable trend estimation while adapting to sudden jumps in level
Holt’s model formulas
1. Updated level $L_t$ is an interpolation between the most recent data point and the previous forecast of the level:
$L_t = \alpha Y_t + (1 - \alpha)(L_{t-1} + T_{t-1})$
($Y_t$ is the most recent data point; $L_{t-1} + T_{t-1}$ is the forecast of $L_t$ made at period t-1)
2. Updated trend $T_t$ is an interpolation between the change in the estimated level and the previous estimate of the trend:
$T_t = \beta (L_t - L_{t-1}) + (1 - \beta) T_{t-1}$
($L_t - L_{t-1}$ is the just-observed change in the level; $T_{t-1}$ is the previous trend estimate)
3. k-step ahead forecast from period t (extrapolation of level and trend from period t):
$\hat{Y}_{t+k} = L_t + k T_t$
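The three Holt formulas translate into a short update loop. The sketch below is illustrative only: the function name and sample data are made up, and the start-up values (initial level = first observation, initial trend = 0) are assumptions of this sketch, since the slides do not specify an initialization for Holt’s model.

```python
def holt_forecast(y, alpha, beta, k=1, initial_level=None, initial_trend=0.0):
    """k-step-ahead forecast from the end of y using Holt's level and trend updates."""
    level = y[0] if initial_level is None else initial_level  # assumed start-up value
    trend = initial_trend                                     # assumed start-up value
    for obs in y[1:]:
        previous_level = level
        # 1. interpolate between the new data point and the previous forecast of the level
        level = alpha * obs + (1 - alpha) * (previous_level + trend)
        # 2. interpolate between the just-observed change in level and the previous trend
        trend = beta * (level - previous_level) + (1 - beta) * trend
    # 3. extrapolate level and trend k periods ahead
    return level + k * trend


y = [1.0, 2.0, 3.0, 4.0]  # made-up numbers, for illustration only
print(holt_forecast(y, alpha=1.0, beta=1.0, k=2))
print(holt_forecast([10.0, 20.0], alpha=0.5, beta=0.0))  # beta = 0 and zero initial trend behaves like SES
```

A small β, as in the fitted example that follows, means the trend estimate changes very slowly from period to period.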
Example, continued
[Figure: time series plot of X with Holt's linear exp. smoothing, alpha = 0.3061 and beta = 0.0069; residual autocorrelations for X plotted against lag]
Holt’s model is optimized at α=0.306, β=0.007, yielding RMSE = 100 (essentially same as SES & LES) …but forecast plot shows a slightly increasing local trend at end of series, due to relatively heavy smoothing of trend!
50% confidence limits shown
Model comparisons
Models B-C-D-E hardly differ on error measures.
Model choice should also depend on “theoretical”
considerations, such as the
reasonableness of the trend
assumptions
A cautionary word about trend extrapolation
• If you are forecasting more than one period ahead, it is especially important to estimate the trend correctly
• In general, trend assumptions and estimation should be based on everything you know about a time series, not just error statistics of one-period-ahead forecasts or t-stats of slope coefficients
A cautionary word about trend extrapolation
• Extrapolation of time-varying trends estimated by “double smoothing” can be dangerous
• Hence SES (perhaps with fixed trend) often works better in practice
• A trend dampening factor is often used in conjunction with LES or Holt’s:
$\hat{Y}_{t+k} = L_t + (\phi + \phi^2 + \dots + \phi^k) T_t$, with $0 < \phi < 1$
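The effect of the dampening factor is easy to see numerically. In this sketch the function name and the level, trend, and φ values are made up for illustration.

```python
def damped_trend_forecast(level, trend, phi, k):
    """k-step-ahead forecast: level + (phi + phi^2 + ... + phi^k) * trend."""
    if not 0 < phi < 1:
        raise ValueError("phi must lie strictly between 0 and 1")
    damping_sum = sum(phi ** j for j in range(1, k + 1))
    return level + damping_sum * trend


level, trend = 100.0, 10.0  # hypothetical current level and trend estimates
for k in (1, 2, 5, 20):
    print(k, damped_trend_forecast(level, trend, phi=0.5, k=k))
```

As k grows, the forecast levels off near level + trend * φ / (1 - φ) instead of following the straight line level + k * trend.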
Combining seasonal adjustment with a non-seasonal smoothing model
• Often a seasonally adjusted series looks like a good candidate for fitting with a smoothing or averaging model.
• Hence, you can forecast a seasonal series by a combination of seasonal adjustment and non-seasonal smoothing (or other non-seasonal model).
• This “hybrid” approach allows you to model the seasonal pattern explicitly, but it does not have a solid underlying statistical theory, so confidence limits may be dubious.
• There is also some danger of overfitting the seasonal pattern if you don’t have enough seasons of data.
Example of LES + seasonal adjustment on a spreadsheet
The single-equation form of the LES model is easily implemented on a spreadsheet, and Solver can be used to find the value of α that minimizes RMSE.
LES out-of-sample forecasts
The LES model, like any other one-step-ahead forecasting model, can extrapolate its forecasts into the future by “bootstrapping” itself, i.e., by
substituting the one-step-ahead forecast for the next data point and then forecasting the next period from there, and so on.
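The bootstrapping idea can be sketched with the single-equation form of LES: once the data run out, each forecast is appended as if it were the next observation, which makes the corresponding error zero. The function name and sample series are made up for illustration, and lists are 0-indexed.

```python
def les_bootstrap(y, alpha, horizon):
    """Out-of-sample LES forecasts for the next `horizon` periods.

    Uses Yhat[t+1] = 2*Y[t] - Y[t-1] - 2*(1-alpha)*e[t] + (1-alpha)^2 * e[t-1],
    substituting each forecast for the missing data point beyond the sample.
    """
    values = list(y)           # observed data, later extended with forecasts
    forecasts = [y[0], y[0]]   # start-up: Yhat_1 = Yhat_2 = Y_1
    errors = [0.0]             # hence e_1 = 0
    n = len(y)
    for t in range(1, n + horizon - 1):
        if t >= n:
            values.append(forecasts[t])  # forecast stands in for the data point
        errors.append(values[t] - forecasts[t])
        forecasts.append(2 * values[t] - values[t - 1]
                         - 2 * (1 - alpha) * errors[t]
                         + (1 - alpha) ** 2 * errors[t - 1])
    return forecasts[n:]


y = [0.0, 1.0, 3.0]  # made-up numbers, for illustration only
print(les_bootstrap(y, alpha=0.5, horizon=3))
```

The bootstrapped forecasts fall on a straight line whose slope is the final trend estimate, so this procedure gives the same result as extrapolating the last level and trend directly.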
LES forecasts for seasonally adjusted data
[Figure: seasonally adjusted series and LES forecast, Dec-83 to Dec-94]
Note that LES lags behind turning points, like all smoothing models…
…but it tracks the data pretty well during stretches where the trend is consistent…
…and its out-of-sample forecasts extrapolate the most recent trend
Re-seasonalized LES forecasts
[Figure: original series and reseasonalized forecast, Dec-83 to Jun-95]
Not bad! (if we believe local trend estimate…)
Example: housing starts
Series displays strong seasonality as well as cyclicality
Original data (not seasonally adjusted)
[Figure: time series plot for HousesNSA, 1/83 to 1/03]
New residential construction since 1983
Note the last observation…
Seasonally adjusted data
After seasonal adjustment, variations in level and trend are clearer
[Chart: time series plot of SADJUSTED, 1/83 through 1/03]
In seasonally adjusted terms, the last observation is abnormally large!
How will different models react to it?
(This abnormality was not so apparent on the unadjusted graph!)
Nonseasonal forecasting model fitted to adjusted data: RW+drift
[Chart: time sequence plot for SADJUSTED, random walk with drift = 0.139171; actual, forecast, and 50% limits, 1/83 through 1/08]
Depending on the kind of long-term trend assumptions we feel are appropriate, we could fit the seasonally adjusted series with a non-seasonal model such as a random walk with drift...
This model extrapolates the long-term trend from the most recent (higher) level
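A minimal sketch of the random-walk-with-drift forecast, assuming the drift is estimated as the average period-to-period change over the whole sample. The values are made up, with a jump at the end to mimic the situation on the chart.

```python
def rw_drift_forecast(y, horizon):
    """Random walk with drift: start at the last value, add the average change per period."""
    drift = (y[-1] - y[0]) / (len(y) - 1)   # mean of the period-to-period changes
    return [y[-1] + k * drift for k in range(1, horizon + 1)]

adjusted = [90.0, 94.0, 91.0, 97.0, 110.0]   # made-up values with a jump at the end
print(rw_drift_forecast(adjusted, horizon=3))   # → [115.0, 120.0, 125.0]
```

Because the forecast starts from the last observed value, an unusual final observation shifts the entire forecast path by the full amount of the jump.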
Nonseasonal forecasting model fitted to adjusted data: SES
[Chart: time sequence plot for SADJUSTED, simple exponential smoothing with alpha = 0.4682; actual, forecast, and 50% limits, 1/83 through 1/08]
…or a simple exponential smoothing model...
This model extrapolates a flat trend from an exponentially weighted average of recent levels
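The SES forecast can be sketched like this, assuming the smoothed level is initialized at the first value (one common start-up convention, not necessarily the one Statgraphics uses). The data are made up.

```python
def ses_forecast(y, alpha, horizon):
    """Simple exponential smoothing: flat forecast at the final smoothed level."""
    level = y[0]                      # start-up assumption: initial level = first value
    for value in y[1:]:
        level = alpha * value + (1 - alpha) * level
    return [level] * horizon

adjusted = [90.0, 94.0, 91.0, 97.0, 110.0]   # made-up values with a jump at the end
print(ses_forecast(adjusted, alpha=0.5, horizon=3))   # → [102.125, 102.125, 102.125]
```

The forecast lands between the earlier levels and the final jump, which is why SES reacts less radically to one unusual observation than a random walk does.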
Nonseasonal forecasting model fitted to adjusted data: Brown’s LES
[Chart: time sequence plot for SADJUSTED, Brown’s linear exp. smoothing with alpha = 0.2352; actual, forecast, and 50% limits, 1/83 through 1/08]
…or Brown’s linear exponential smoothing model...
This model tries to extrapolate the recent trend, which is jerked upward by the last observation
Nonseasonal forecasting model fitted to adjusted data: Holt’s LES
[Chart: time sequence plot for SADJUSTED, Holt’s linear exp. smoothing with alpha = 0.4765 and beta = 0.015; actual, forecast, and 50% limits, 1/83 through 1/08]
… or Holt’s linear exponential smoothing model...
This model also tries to extrapolate the recent trend, but the trend estimate is more conservative due to small “beta” (heavy smoothing)
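To see what a small beta does, here is a rough sketch of Holt's recursion with separate smoothing constants for level and trend. The start-up values (level = first value, trend = first difference) are an assumption, and the data are made up; the comparison of slopes is what matters, not the specific numbers.

```python
def holt_forecast(y, alpha, beta, horizon):
    """Holt's linear exponential smoothing with separate level and trend constants."""
    level, trend = y[0], y[1] - y[0]   # start-up assumption, one of several in use
    for value in y[1:]:
        prev_level = level
        level = alpha * value + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + k * trend for k in range(1, horizon + 1)]

adjusted = [90.0, 94.0, 91.0, 97.0, 110.0]   # made-up values with a jump at the end
heavy = holt_forecast(adjusted, alpha=0.5, beta=0.015, horizon=2)
light = holt_forecast(adjusted, alpha=0.5, beta=0.9, horizon=2)
print("slope with beta = 0.015:", heavy[1] - heavy[0])
print("slope with beta = 0.9:  ", light[1] - light[0])
```

With a small beta the trend estimate barely moves in response to the final jump, while a large beta lets a single observation steepen the projected trend considerably.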
Hybrid seasonal models in SG
You can fit hybrid models in the Forecasting procedure in Statgraphics by selecting “multiplicative seasonal adjustment” in conjunction with a RW or SES or LES model type.
The forecasts are automatically “reseasonalized” in the plots and model comparison statistics
Be on guard against overfitting: seasonal adjustment adds many parameters to the model, and estimation period statistics may not be fully adjusted to correct for additional parameters.
Hybrid seasonal models
RW + seasonal adjustment
[Chart: time sequence plot for HousesNSA, random walk with drift = 0.142988; actual, forecast, and 50% limits, 1/83 through 1/08]
Here’s the result of fitting the RW-with-drift model with multiplicative seasonal adjustment
Note sharply raised forecasts, driven by unusual seasonally adjusted value of last data point
SES + seasonal adjustment
[Chart: time sequence plot for HousesNSA, simple exponential smoothing with alpha = 0.4617; actual, forecast, and 50% limits, 1/83 through 1/08]
Here’s the result of fitting the SES model with multiplicative seasonal adjustment
More conservative (though still raised) forecasts, tighter confidence limits
Brown’s LES + seasonal adjustment
[Chart: time sequence plot for HousesNSA, Brown’s linear exp. smoothing with alpha = 0.2365; actual, forecast, and 50% limits, 1/83 through 1/08]
Here’s the result of fitting the LES model with multiplicative seasonal adjustment
Forecasts march steeply upward, confidence limits are rather wide
Holt’s LES + seasonal adjustment
[Chart: time sequence plot for HousesNSA, Holt’s linear exp. smoothing with alpha = 0.4667 and beta = 0.0144; actual, forecast, and 50% limits, 1/83 through 1/08]
Here’s the result of fitting Holt’s model with multiplicative seasonal adjustment
Forecasts start from a higher level with a flatter trend than LES, but the confidence limits are rather optimistic
Linear trend + seasonal adjustment (?)
[Chart: time sequence plot for HousesNSA, linear trend = 76.7875 + 0.0262053 t; actual, forecast, and 50% limits, 1/83 through 1/08]
Just for fun, here’s a linear trend model with multiplicative seasonal adjustment
Obviously not appropriate!
Model comparison report shows that SES and Holt’s do the best in the estimation period, although the RW model is slightly “luckier” in the validation period (last 4 years of data were held out)
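The estimation-versus-validation comparison can be sketched as follows. This is a simplified illustration, not the Statgraphics report: the series is made up, only RW (without drift) and SES are compared, and alpha is fixed by hand rather than estimated from the estimation period.

```python
import math

def rmse(errors):
    """Root mean squared error."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def rw_forecasts(y):
    """One-step-ahead random walk forecasts (no drift): the previous value."""
    return [None] + list(y[:-1])

def ses_forecasts(y, alpha):
    """One-step-ahead SES forecasts; initial level = first value (an assumption)."""
    forecasts, level = [None], y[0]
    for value in y[1:]:
        forecasts.append(level)
        level = alpha * value + (1 - alpha) * level
    return forecasts

def split_rmse(y, forecasts, holdout):
    """RMSE in the estimation period and in the held-out validation period."""
    errors = [None if f is None else a - f for a, f in zip(y, forecasts)]
    cut = len(y) - holdout
    estimation = [e for e in errors[:cut] if e is not None]
    validation = [e for e in errors[cut:] if e is not None]
    return rmse(estimation), rmse(validation)

# Made-up series; the last 3 points are held out for validation.
y = [100.0, 104.0, 99.0, 103.0, 101.0, 106.0, 102.0, 105.0, 112.0, 109.0]
for name, f in [("RW", rw_forecasts(y)), ("SES", ses_forecasts(y, alpha=0.4))]:
    est, val = split_rmse(y, f, holdout=3)
    print(f"{name}: estimation RMSE = {est:.2f}, validation RMSE = {val:.2f}")
```

A short validation period contains few errors, so a model can easily look better there by luck, which is the point of the “luckier” remark.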
[Charts: residual plot and residual autocorrelations (lags 0 to 25) for adjusted HousesNSA, simple exponential smoothing with alpha = 0.4594]
Residual plots for SES model show stable variance, no significant autocorrelation… model appears “OK”
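A residual autocorrelation check of this kind can be sketched as below. The band of plus or minus 2/sqrt(n) is a common rough 95% significance limit; whether Statgraphics draws exactly this band is an assumption. The residuals are made up and deliberately alternate in sign so that the check has something to flag.

```python
import math

def autocorrelations(residuals, max_lag):
    """Sample autocorrelations of the residuals at lags 1..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [r - mean for r in residuals]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(1, max_lag + 1)]

def significant_lags(residuals, max_lag):
    """Lags whose autocorrelation falls outside the rough +/- 2/sqrt(n) band."""
    band = 2 / math.sqrt(len(residuals))
    acf = autocorrelations(residuals, max_lag)
    return [k for k, r in enumerate(acf, start=1) if abs(r) > band]

# Made-up residuals that alternate in sign: strongly negative lag-1 autocorrelation.
resid = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
print(autocorrelations(resid, 2))   # → [-0.875, 0.75]
print(significant_lags(resid, 2))   # → [1, 2]
```

For a model that appears “OK”, the list of significant lags should be empty or nearly so.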
Even the (vertical) probability plot looks good.* This is a “pane option” behind the “residual plots”.
[Chart: probability plot of residuals for adjusted HousesNSA, simple exponential smoothing with alpha = 0.4594]
*This result validates the use of normal distribution theory to compute the confidence intervals from the forecast standard errors.
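As a small worked example of normal-theory limits, the sketch below turns a point forecast and its standard error into symmetric limits at a chosen confidence level, such as the 50% limits shown on the charts. The forecast and standard error are hypothetical numbers.

```python
from statistics import NormalDist

def forecast_limits(forecast, std_error, confidence=0.50):
    """Symmetric normal-theory limits: forecast +/- z * standard error."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # about 0.674 for 50% limits
    return forecast - z * std_error, forecast + z * std_error

low, high = forecast_limits(100.0, 10.0, confidence=0.50)
print(round(low, 2), round(high, 2))   # → 93.26 106.74
```

If the residuals were clearly non-normal, limits computed this way would not have the stated coverage.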
What’s the best forecast?
The main issue here is what to infer from the recent jump in seasonally adjusted housing starts.
Our modeling results do not really answer this question for us: they merely show the consequences of different assumptions we may wish to make.
Ideally, “domain knowledge” should shed additional light on the appropriateness of the assumptions.
The SES model is clearly the most “conservative” choice, because its forecasts are less radically affected by one recent observation.
Winters’ seasonal smoothing
The logic of Holt’s model can be extended to recursively estimate time-varying seasonal indices as well as level and trend.
Let $L_t$, $T_t$, and $S_t$ denote the estimated level, trend, and seasonal index at period t.
Let s denote the number of periods in a season (that is, in one full seasonal cycle, such as 12 for monthly data).
Let α, β, and γ denote separate smoothing constants* for level, trend, and seasonality
*numbers between 0 and 1: smaller values → more smoothing
Winters’ model formulas
1. Updated level $L_t$ is an interpolation between the seasonally adjusted value of the most recent data point and the previous forecast of the level:
$$L_t = \alpha \frac{Y_t}{S_{t-s}} + (1-\alpha)(L_{t-1} + T_{t-1})$$
Here $Y_t / S_{t-s}$ is the seasonally adjusted value of $Y_t$, and $L_{t-1} + T_{t-1}$ is the forecast of $L_t$ made at period t-1.
Winters’ model formulas
2. Updated trend $T_t$ is an interpolation between the change in the estimated level and the previous estimate of the trend:
$$T_t = \beta (L_t - L_{t-1}) + (1-\beta) T_{t-1}$$
Here $L_t - L_{t-1}$ is the just-observed change in the level, and $T_{t-1}$ is the previous trend estimate.
Winters’ model formulas
3. Updated seasonal index $S_t$ is an interpolation between the ratio of the data point to the estimated level and the previous estimate of the seasonal index:
$$S_t = \gamma \frac{Y_t}{L_t} + (1-\gamma) S_{t-s}$$
Here $Y_t / L_t$ is the “ratio to moving average” of the current data point, and $S_{t-s}$ is the last estimate of the seasonal index in the same season.
Winters’ model formulas
4. k-step-ahead forecast from period t:
$$\hat{Y}_{t+k} = (L_t + k T_t)\, S_{t-s+k}$$
Here $L_t + k T_t$ is the extrapolation of level and trend from period t, and $S_{t-s+k}$ is the most recent estimate of the seasonal index for the kth period in the future.
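The level, trend, seasonal-index, and forecast steps can be put together in a short sketch. This is an illustration under assumptions, not the Statgraphics implementation: the start-up is the naive one (trend = 0, seasonal indices = 1.0 unless supplied, level = first data point divided by its index), the smoothing constants are passed in rather than estimated, and the data are made up.

```python
def winters(y, s, alpha, beta, gamma, horizon, init_seasonals=None):
    """Multiplicative Winters smoothing.

    Updates applied at each period t:
        L[t] = alpha * Y[t] / S[t-s] + (1 - alpha) * (L[t-1] + T[t-1])
        T[t] = beta * (L[t] - L[t-1]) + (1 - beta) * T[t-1]
        S[t] = gamma * Y[t] / L[t] + (1 - gamma) * S[t-s]
    Forecast: (L[t] + k * T[t]) * (latest index for the season of period t+k)
    """
    seasonals = list(init_seasonals) if init_seasonals else [1.0] * s
    level, trend = y[0] / seasonals[0], 0.0       # naive start-up (an assumption)
    for t, value in enumerate(y):
        prev_level = level
        old_index = seasonals[t % s]              # S[t-s]: same season, one cycle ago
        level = alpha * (value / old_index) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonals[t % s] = gamma * (value / level) + (1 - gamma) * old_index
    last = len(y) - 1
    return [(level + k * trend) * seasonals[(last + k) % s]
            for k in range(1, horizon + 1)]

# Made-up series: constant level 100 with a two-period seasonal pattern.
data = [50.0, 150.0, 50.0, 150.0, 50.0, 150.0]
print(winters(data, s=2, alpha=0.3, beta=0.1, gamma=0.2, horizon=4,
              init_seasonals=[0.5, 1.5]))
```

With the correct starting indices the forecasts simply repeat the seasonal pattern; with the naive indices of 1.0 the same data would take several cycles to settle, which hints at why initialization matters.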
Estimation issues
Estimation of Winters’ model is tricky, and not all software does it well: sometimes you get crazy results.
There are three separate smoothing constants to be jointly estimated by nonlinear least squares (α, β, γ).
Initialization is also tricky, especially for the seasonal indices.
Estimation issues
Some common initialization schemes:
Naïve approach: set initial level = 1st data point, trend = 0, seasonal indices = 1.0
More sophisticated: perform a seasonal decomposition to obtain initial seasonal indices & fit trend line to obtain initial trend
Even more sophisticated: use backforecasting
Calculation of confidence intervals is also complicated & not always done correctly.
Winters’ model fitted to housing starts
[Chart: time sequence plot for HousesNSA, Winters’ exp. smoothing with alpha = 0.4454, beta = 0.0146, gamma = 0.2843; actual, forecast, and 50% limits, 1/83 through 1/08]
Results of fitting Winters’ model
In this case, the Winters forecasts & confidence intervals look similar to those of Holt’s model with seasonal adjustment (alpha and beta are very similar, as should be expected)
Model comparison report shows that Winters’ fits a little less well than SES or Holt’s model, but is otherwise “OK”
Winters’ model in practice
The Winters model is popular in “automatic forecasting” software, because it has a little of everything (level, trend, seasonality).
Sometimes it works well, but difficulties in initialization & estimation can lead to strange results in other cases.
In principle it is similar to linear exponential smoothing and can produce similarly unstable long-term trend projections.
What really happened in last 5 years?
[Chart: RW+drift, SES, LES, Holt, and Winters forecasts versus actual values, 2002 through 2007]
All models overpredicted housing starts for the rest of 2002 and 2003, over-responding to the Feb. ’02 jump, but later values were in the middle range of predictions until the recent plunge
Class 4 recap
Averaging and smoothing models enable you to estimate time-varying levels and trends.
SMA, SES, and LES models can be combined with seasonal adjustment to forecast seasonal data (...but beware of changing seasonal patterns and possibility of overfitting)
Winters’ estimates time-varying seasonal indices.
You need to exercise judgment in model selection in order to make appropriate assumptions about changing levels and trends & unusual events.