Forecasting Sales for Dairy Products - Galit Shmueli


Indian School of Business

Forecasting Sales for Dairy Products


Contents

EXECUTIVE SUMMARY
  Data Analysis
  Forecast Horizon
  Forecasting Models
    Fresh milk - AmulTaaza (500 ml)
    Dahi/Yogurt - Saras (200 gm)
    Dahi/Yogurt - Yakult Probiotic (350 ml)
  Assumptions
  Conclusions and Recommendations
TECHNICAL ANALYSIS
  Data Preparation
  Class: Fresh Milk; Item Description: AmulTaaza 500 ml
  Class: Dahi and Yoghurt; Item Description: SarasDahi 200gm
  Class: Dahi & Yogurt; Item Description: Yakult Probiotic Drink
Appendix 1: Technical Details - Class: Fresh Milk; Item Description: AmulTaaza 500 ml
Appendix 2: Technical Details - Class: Dahi and Yoghurt; Item Description: SarasDahi 200gm
Appendix 3: Technical Details - Class: Dahi & Yogurt; Item Description: Yakult Probiotic Drink


EXECUTIVE SUMMARY

Retailers face a formidable challenge in holding optimum inventory levels for perishable goods. These goods have a short shelf life and no salvage value, and can therefore hurt the retailer's profitability significantly. Accurate forecasts for perishable items such as fresh milk and yoghurt are thus critical. These subclasses also drive footfall into retail stores, so it is important to maintain high service levels for these products.

We have chosen to forecast unit sales for two product classes: Fresh Milk and Dahi-Yoghurt. The rationale for choosing these two classes is that their sales are typically correlated, so analyzing them together can provide additional insights.

The sales data available from the retail store for “Yakult Probiotic drink 325 ml” and “SarasDahi 200gm” from the class “Dahi and Yoghurt”, and for “AmulTaaza 500 ml” from the class “Fresh Milk”, were aggregated at daily, weekly and monthly levels to visualize any trends or recognizable patterns.

Data Analysis


Daily sales reach up to 100 units for AmulTaaza, up to 80 units for SarasDahi and up to 120 units for Yakult Probiotic drink. No consistent increase or decrease in sales and no seasonal effects are observable over the 13 months of data. Very sharp increases in sales were observed on a few days, without any explainable reason.

The weekly sales pattern shows that sales rise over the weekend and fall during the weekdays, most probably because more people come to the store on weekends. Among the weekdays, sales also rise on Wednesdays.

Monthly sales for AmulTaaza show a decrease in variability from Aug 2011 to Aug 2012. For Yakult and SarasDahi, monthly sales jump up and down without any regular pattern.

Forecast Horizon:

In light of our business proposition for perishable goods, forecasting sales on a daily basis would be most useful.

Forecasting Models:

Fresh milk - AmulTaaza (500 ml)

We have chosen a 7-day moving average, MA (7), as our final forecasting model. After evaluating other methods, such as moving averages over other window lengths and naive forecasts, we decided on MA (7), with an MSE of 351.25 on the training data and 304.24 on the validation data.

Dahi/ Yogurt - Saras (200 gm)

We have chosen Holt Winters No Trend (Alpha = 0, Gamma = 0.03) as the final forecasting method. The MSE of 222.66 produced by this model was lower than the MSE of 270.44 produced by the multiple linear regression model.

Dahi/Yogurt - Yakult Probiotic (350 ml)

We tried Holt Winters No Trend (Alpha = 0.2, Gamma = 0.09) as well as a multiple regression model. While the validation MSE of the two methods is comparable, the training MSE is much better for linear regression than for Holt Winters (96.79 vs 140.65), so linear regression was taken as the best-fitting model.


Assumptions

We have assumed that the historical purchasing pattern is an indicator of the future purchasing pattern and that it will not change drastically.

The correlation identified among the products will continue to exist in the future.

The forecasting model is based on daily sales, which are assumed to represent demand for the products; it does not consider the possibility of stock-outs during the sales period.


Conclusions and Recommendations

We suggest that the chosen model for each SKU be used to forecast daily demand for the month of September 2012.

As seen from its forecasting model, sales of AmulTaaza do not exhibit any seasonality, although there are several random spikes. The implication for managers is that a sudden increase or decrease in sales should not be read as an increasing or decreasing trend. Average sales over the past week are the best predictor of the next day’s sales. Since sales go up and down drastically, managers should consider holding high stock only if stock-out costs are considerably higher than the cost of overstocking.

For Saras Dahi (200 gm), we propose that the model be used with care. There are occasional peaks which the model has not captured, and these should be borne in mind when making decisions about inventory levels and safety stock. Since only data up to May 2012 has been used, we strongly recommend using the latest data as soon as appropriate data is available for the later months. The data also suggest occasional supply disruptions, so the retailer may want to thoroughly check the data and the corresponding forecasts for such anomalies.

As shown by the model for Yakult Probiotic drink, daily sales exhibit seasonality. Managers should therefore account for seasonality within the week when planning stock. The model provides reasonably accurate forecasts (as seen from the low MSE values), so managers can have good confidence in the forecast.


TECHNICAL ANALYSIS

Data Preparation:

As part of data preparation, total quantity sold was aggregated by date for all classes and SKUs. Quantity sold was also checked per customer, to identify customers buying in bulk. The mean of sales was calculated for each item description, and any outliers more than two standard deviations away from the mean were replaced with the mean. Replacement was chosen over deletion of such records because this is time series data, where continuity is required for forecasting.
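The outlier rule described above can be sketched as follows. This is a minimal illustration, not the authors' spreadsheet procedure: the function name, the sample quantities and the use of the population standard deviation are all assumptions, since the report does not say which standard deviation was used.

```python
from statistics import mean, pstdev


def replace_outliers_with_mean(daily_sales, n_sd=2.0):
    """Replace values more than n_sd standard deviations from the mean with the mean.

    The series keeps its length, so every date still has a value.
    """
    mu = mean(daily_sales)
    sd = pstdev(daily_sales)  # assumption: population SD; the report does not specify
    cleaned = []
    for qty in daily_sales:
        if sd > 0 and abs(qty - mu) > n_sd * sd:
            cleaned.append(mu)   # keep the date, overwrite the quantity
        else:
            cleaned.append(qty)
    return cleaned


# Hypothetical daily quantities with one bulk-purchase spike.
sales = [30, 33, 27, 36, 30, 24, 150, 33, 30, 27]
cleaned = replace_outliers_with_mean(sales)
```

Note that the mean and standard deviation here are computed with the spike included, which is how the rule is worded in the text; a single large spike therefore also raises the threshold it is tested against.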

Class: Fresh Milk; Item Description: AmulTaaza 500 ml

As the data does not show any trend or seasonality, the following methods were considered appropriate to capture the randomness in the data:

Moving average smoothing

Neural network

Charts: As observable from the charts below, the neural network is not able to fit the data, hence the MA method was chosen for further optimization.

The best-fitting MA model was MA (7). Please see the technical details in Appendix 1.

Class: Dahi and Yoghurt; Item Description: SarasDahi 200gm

Data Preparation: We found that the data for many of the months contained runs of zero sales. The time series shows that, except for the period from Dec '11 to May '12, all months have several zero values, making those months inappropriate for analysis. We believe there was probably a supply disruption in those months. The months finally chosen are therefore December '11 to May '12, for which proper values are available.

As the data has no trend but considerable seasonality, the following forecasting models were used:

Multiple Linear Regression with days of the week as dummy variables

Holt Winters with no trend

Charts:

Based on the RMSE values and the plotted chart, we decided to proceed with the Holt Winters No Trend method (Alpha = 0, Gamma = 0.03) with season length = 7. Please see details in Appendix 2.

Class: Dahi & Yogurt; Item Description: Yakult Probiotic Drink

The models proposed for this data are the following:

Linear Regression Model

Holt Winters No Trend

Charts: As observable from the chart below, Linear Regression gives an accurate forecast.

The best-fitting model was obtained with Linear Regression. Please see the technical details in Appendix 3.


Appendix 1: Technical Details - Class: Fresh Milk; Item Description: AmulTaaza 500 ml

Data Availability: 1-Aug-2011 to 31-Aug-2012

RMSE/MSE Table:

                    MA 1 (naïve)   MA 2     MA 3     MA 4     MA 5     MA 6     MA 7     MA 8     MA 9
MSE on training     630.25         501.43   429.11   408.27   384.35   374.67   351.25   351.81   355.50
MSE on validation   525.19         420.46   360.77   324.15   319.66   315.38   304.24   328.11   310.84

As MA (7) gives the lowest error on both training and validation data, MA (7) is chosen to forecast daily sales of AmulTaaza.

Forecasted Values:

Output of MA (7) on AmulTaaza Daily Sales

Transaction Date Actual Predicted Residual

1-Jul-12 48 28.28571 19.71429

2-Jul-12 18 26.57143 -8.57143

3-Jul-12 15 26.57143 -11.5714

4-Jul-12 54 24.42857 29.57143

5-Jul-12 9 29.14286 -20.1429

6-Jul-12 36 29.14286 6.857143

7-Jul-12 51 30.42857 20.57143

8-Jul-12 48 33 15

9-Jul-12 33 33 0

10-Jul-12 30 35.14286 -5.14286

11-Jul-12 9 37.28571 -28.2857

12-Jul-12 21 30.85714 -9.85714

13-Jul-12 24 32.57143 -8.57143

14-Jul-12 33 30.85714 2.142857

15-Jul-12 66 28.28571 37.71429

16-Jul-12 33 30.85714 2.142857

17-Jul-12 30 30.85714 -0.85714

18-Jul-12 3 30.85714 -27.8571

19-Jul-12 3 30 -27

20-Jul-12 6 27.42857 -21.4286

21-Jul-12 24 24.85714 -0.85714

22-Jul-12 0 23.57143 -23.5714

23-Jul-12 27 14.14286 12.85714

24-Jul-12 3 13.28571 -10.2857

25-Jul-12 3 9.428571 -6.42857

26-Jul-12 0 9.428571 -9.42857

27-Jul-12 24 9 15

28-Jul-12 30 11.57143 18.42857

29-Jul-12 51 12.42857 38.57143

30-Jul-12 30 19.71429 10.28571

31-Jul-12 12 20.14286 -8.14286

1-Aug-12 66 21.42857 44.57143

2-Aug-12 45 30.42857 14.57143

3-Aug-12 9 36.85714 -27.8571

4-Aug-12 27 34.71429 -7.71429

5-Aug-12 45 34.28571 10.71429

6-Aug-12 18 33.42857 -15.4286

7-Aug-12 3 31.71429 -28.7143

8-Aug-12 30 30.42857 -0.42857

9-Aug-12 6 25.28571 -19.2857

10-Aug-12 51 19.71429 31.28571


11-Aug-12 51 25.71429 25.28571

12-Aug-12 30 29.14286 0.857143

13-Aug-12 12 27 -15

14-Aug-12 9 26.14286 -17.1429

15-Aug-12 39 27 12

16-Aug-12 18 28.28571 -10.2857

17-Aug-12 36 30 6

18-Aug-12 12 27.85714 -15.8571

19-Aug-12 9 22.28571 -13.2857

20-Aug-12 12 19.28571 -7.28571

21-Aug-12 12 19.28571 -7.28571

22-Aug-12 27 19.71429 7.285714

23-Aug-12 9 18 -9

24-Aug-12 42 16.71429 25.28571

25-Aug-12 18 17.57143 0.428571

26-Aug-12 33 18.42857 14.57143

27-Aug-12 12 21.85714 -9.85714

28-Aug-12 15 21.85714 -6.85714

29-Aug-12 18 22.28571 -4.28571

30-Aug-12 45 21 24

31-Aug-12 39 26.14286 12.85714
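The Predicted column above is consistent with a trailing seven-day mean used as a one-step-ahead forecast: for example, the forecast of 33 for 8-Jul-12 equals the average of the seven actual values from 1-Jul-12 to 7-Jul-12. A minimal sketch of that calculation follows; the function name is illustrative and the report's own spreadsheet implementation may differ in detail.

```python
def trailing_ma_forecasts(series, window=7):
    """One-step-ahead forecasts: each forecast is the mean of the previous `window` actuals."""
    forecasts = []
    for t in range(window, len(series)):
        forecasts.append(sum(series[t - window:t]) / window)
    return forecasts


# Actual column for 1-Jul-12 to 10-Jul-12, copied from the table above.
actual = [48, 18, 15, 54, 9, 36, 51, 48, 33, 30]

# Forecasts for 8-Jul-12, 9-Jul-12 and 10-Jul-12.
predicted = trailing_ma_forecasts(actual, window=7)
residuals = [a - p for a, p in zip(actual[7:], predicted)]

for day, p, r in zip(["8-Jul-12", "9-Jul-12", "10-Jul-12"], predicted, residuals):
    print(day, round(p, 5), round(r, 5))
```

Changing `window` gives the other MA orders compared in the RMSE/MSE table; the MSE of each order is then the mean of the squared residuals over the chosen period.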


Appendix 2: Technical Details - Class: Dahi and Yoghurt; Item Description: SarasDahi 200gm

Data available from 1-Aug-2011 to 31-Aug-2012

Trend Analysis

Forecasting method

The data we used, from December 2011 to May 2012, had no trend and a constant seasonality. The seasonality was weekly, with sales jumping over the weekend. Based on this we considered two methods for this data: 1) Multiple Linear Regression with days of the week as dummy variables and 2) Holt Winters with no trend. For Holt Winters, we fine-tuned the values of alpha and gamma to arrive at the best-fitting model. For linear regression, we performed a multi-step process to check whether there is correlation with other products such as AmulTaaza or Yakult; this is explained in the subsequent section.
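To make the roles of alpha and gamma concrete, here is a minimal sketch of seasonal exponential smoothing without a trend term. It assumes an additive seasonal form and a simple initialization from the first season; the report does not state which form or initialization its software used, so this should be read as an illustration of the idea rather than a reproduction of the fitted model.

```python
def seasonal_smoothing_no_trend(series, season_length=7, alpha=0.2, gamma=0.05):
    """Additive level + seasonal smoothing with no trend (illustrative form).

    alpha controls how fast the level adapts, gamma how fast the seasonal
    indices adapt. Returns the one-step-ahead forecasts for
    series[season_length:] and the forecast for the next, unseen period.
    """
    m = season_length
    if len(series) < m:
        raise ValueError("need at least one full season of data")

    # Assumed initialization: level = mean of first season, seasonal = deviation from it.
    level = sum(series[:m]) / m
    seasonal = [y - level for y in series[:m]]

    forecasts = []
    for t in range(m, len(series)):
        s_prev = seasonal[t % m]              # index from one season ago
        forecasts.append(level + s_prev)      # forecast made before seeing series[t]
        new_level = alpha * (series[t] - s_prev) + (1 - alpha) * level
        seasonal[t % m] = gamma * (series[t] - new_level) + (1 - gamma) * s_prev
        level = new_level

    next_forecast = level + seasonal[len(series) % m]
    return forecasts, next_forecast


# Hypothetical series with a perfectly repeating 3-period pattern.
demo_forecasts, demo_next = seasonal_smoothing_no_trend(
    [10, 20, 30] * 4, season_length=3, alpha=0.2, gamma=0.1
)
```

With alpha = 0, as in the chosen Saras model, the level in this form never moves from its initial value and only the seasonal indices are adjusted, slowly, through gamma.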

Charts

The chart below shows Actual vs Forecast for the validation data for both the linear regression and the Holt Winters methods. For both models, the best-tuned version is used for the comparison.


Multiple linear regression was initially applied to Saras sales using the previous day's sales of AmulTaaza, the previous day's sales of Yakult, time, and dummy variables for the weekday.
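The predictor set just described can be assembled as in the sketch below. All names and sample numbers are hypothetical, and the choice of Friday as the omitted baseline weekday is an assumption made for illustration; the report does not state the baseline used for the Saras model.

```python
from datetime import date, timedelta

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]
BASELINE = "Friday"  # assumption: omitted category for the dummy variables


def build_rows(start, saras, amul, yakult):
    """Return (date, features, target) rows; row t uses day t-1 sales of the other products."""
    rows = []
    for t in range(1, len(saras)):            # day 0 has no previous-day values
        day = start + timedelta(days=t)
        weekday = WEEKDAYS[day.weekday()]     # date.weekday(): Monday is 0
        features = {
            "time": t,
            "amul_prev": amul[t - 1],
            "yakult_prev": yakult[t - 1],
        }
        for name in WEEKDAYS:
            if name != BASELINE:
                features["is_" + name] = 1 if name == weekday else 0
        rows.append((day, features, saras[t]))
    return rows


# Hypothetical quantities for 1-Dec-2011 to 3-Dec-2011.
rows = build_rows(date(2011, 12, 1), saras=[12, 0, 24], amul=[30, 45, 27], yakult=[9, 21, 15])
```

Each row can then be passed to any least-squares routine; dropping a key such as "time" or "amul_prev" from the feature dictionaries corresponds to the model iterations discussed next.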

As shown in the regression output below, the p-values for the previous day's sales of AmulTaaza and for the time variable were well above the chosen cutoff of 0.05. On removing the previous day's sales of AmulTaaza as a predictor, the MSE increased slightly, from 297.65 to 298.97 (see the RMSE/MSE table in this appendix), so we added it back to the model. The next step was to remove the time variable. Its p-value of 0.41 reinforces the fact that the SarasDahi quantity sold does not display any trend. However, the statistically significant p-values for the weekday dummies show weekday-wise seasonality in the data. On removing the time variable from the model, the MSE fell from 297.65 to 270.44.

The output of the final regression model run is as follows:

Linear Regression Iteration 1

Linear Regression Iteration 2


Linear Regression Iteration 3


The output for the Holt Winters Method is shown below

Holt Winters Output and Errors

Transaction Date Actual Forecast Error LCI UCI

1-May-12 9 10.2128329 -1.21283291 -17.5798437 38.0055095

2-May-12 15 10.6351448 4.36485518 -17.1575317 38.4278214

3-May-12 3 10.4789002 -7.47890016 -17.3137764 38.2715767

4-May-12 0 12.6415702 -12.6415702 -15.1511063 40.4342468

5-May-12 24 25.9181642 -1.91816422 -1.87451235 53.7108408

6-May-12 12 24.5459421 -12.5459421 -3.24673442 52.3386187

7-May-12 0 10.0710076 -10.0710076 -17.721669 37.8636842

8-May-12 24 10.2128329 13.7871671 -17.5798437 38.0055095

9-May-12 30 10.6351448 19.3648552 -17.1575317 38.4278214

10-May-12 6 10.4789002 -4.47890016 -17.3137764 38.2715767

11-May-12 36 12.6415702 23.3584298 -15.1511063 40.4342468

12-May-12 18 25.9181642 -7.91816422 -1.87451235 53.7108408

13-May-12 18 24.5459421 -6.54594215 -3.24673442 52.3386187

14-May-12 0 10.0710076 -10.0710076 -17.721669 37.8636842

15-May-12 6 10.2128329 -4.21283291 -17.5798437 38.0055095

16-May-12 24 10.6351448 13.3648552 -17.1575317 38.4278214

17-May-12 6 10.4789002 -4.47890016 -17.3137764 38.2715767

18-May-12 0 12.6415702 -12.6415702 -15.1511063 40.4342468

19-May-12 21 25.9181642 -4.91816422 -1.87451235 53.7108408

20-May-12 78 24.5459421 53.4540579 -3.24673442 52.3386187

21-May-12 3 10.0710076 -7.0710076 -17.721669 37.8636842

22-May-12 0 10.2128329 -10.2128329 -17.5798437 38.0055095

23-May-12 0 10.6351448 -10.6351448 -17.1575317 38.4278214

24-May-12 0 10.4789002 -10.4789002 -17.3137764 38.2715767

25-May-12 27 12.6415702 14.3584298 -15.1511063 40.4342468

26-May-12 12 25.9181642 -13.9181642 -1.87451235 53.7108408

27-May-12 0 24.5459421 -24.5459421 -3.24673442 52.3386187

28-May-12 0 10.0710076 -10.0710076 -17.721669 37.8636842

29-May-12 0 10.2128329 -10.2128329 -17.5798437 38.0055095

30-May-12 0 10.6351448 -10.6351448 -17.1575317 38.4278214

31-May-12 0 10.4789002 -10.4789002 -17.3137764 38.2715767
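The error measures reported for this model can be recomputed from the Actual and Forecast columns of the table above. The sketch below does this for the May 2012 validation rows. One assumption is made explicit: MAPE is averaged only over days with non-zero actual sales, because the percentage error is undefined when the actual value is zero. Under that assumption the results should come out close to the validation figures reported for this model (MSE about 222.66, MAD about 11.66, MAPE about 75.98).

```python
# Actual column, 1-May-12 to 31-May-12, copied from the table above.
actual = [9, 15, 3, 0, 24, 12, 0, 24, 30, 6, 36, 18, 18, 0, 6, 24,
          6, 0, 21, 78, 3, 0, 0, 0, 27, 12, 0, 0, 0, 0, 0]

# The Forecast column repeats every seven days, starting on 1-May-12.
weekly_forecast = [10.2128329, 10.6351448, 10.4789002, 12.6415702,
                   25.9181642, 24.5459421, 10.0710076]
forecast = [weekly_forecast[i % 7] for i in range(len(actual))]

errors = [a - f for a, f in zip(actual, forecast)]

mse = sum(e * e for e in errors) / len(errors)    # mean squared error
mad = sum(abs(e) for e in errors) / len(errors)   # mean absolute deviation

# Assumption: days with zero actual sales are skipped in the MAPE average.
nonzero = [(a, e) for a, e in zip(actual, errors) if a != 0]
mape = 100 * sum(abs(e) / a for a, e in nonzero) / len(nonzero)

print(round(mse, 2), round(mad, 2), round(mape, 2))
```

Because 12 of the 31 validation days have zero sales, MAPE is based on only 19 days here, so MSE and MAD are the more representative measures for this SKU.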

[Chart: Time Plot of Actual Vs Forecast (Validation Data). Series: Actual, Forecast. Y-axis: Saras Sales (0 to 90). X-axis: Transaction Date.]

Error Measures (Training)

MAPE 58.859906

MAD 10.823147

MSE 201.07061

Error Measures (Validation)

MAPE 75.982246

MAD 11.659562

MSE 222.66142

RMSE/MSE tables:

The table below shows the MSE for each method we tried in order to arrive at the best-performing one.

Method                                                                              MSE Values
Linear Regression [Time, Dummy Variables (Weekday), Yakult(t-1), AmulTaaza(t-1)]    297.65
Linear Regression [Time, Dummy Variables (Weekday), Yakult(t-1)]                    298.97
Linear Regression [Dummy Variables (Weekday), Yakult(t-1), AmulTaaza(t-1)]          270.44
Holt Winters No Trend (Alpha = 0.2, Gamma = 0.05)                                   245.38
Holt Winters No Trend (Alpha = 0.00, Gamma = 0.03)                                  222.66

Appendix 3: Technical Details - Class: Dahi & Yogurt; Item Description: Yakult Probiotic Drink

Seasonality was captured using a linear regression model by creating a categorical variable for day of the week. This categorical variable was then used to create dummy variables, which were used as predictors in the model.


Linear Regression:

Regression Model

Regression Equation

Quantity = -957.2644
           + 0.02372543 * t
           - 2.74370599 * Day of the week_Monday
           + 10.47619915 * Day of the week_Saturday
           + 27.12547493 * Day of the week_Sunday
           - 3.09158349 * Day of the week_Thursday
           - 1.90358925 * Day of the week_Tuesday
           + 1.60529459 * Day of the week_Wednesday
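The regression equation can be evaluated with a few lines of code. In the sketch below the coefficients are copied from the equation; Friday has no coefficient there, so it is treated as the baseline day. The report does not define the time index t, so it is left as an input rather than guessed, and the function name is illustrative.

```python
INTERCEPT = -957.2644
TIME_COEF = 0.02372543

# Weekday dummy coefficients from the regression equation.
# Friday does not appear in the equation, so it acts as the baseline (effect 0).
WEEKDAY_COEF = {
    "Monday": -2.74370599,
    "Tuesday": -1.90358925,
    "Wednesday": 1.60529459,
    "Thursday": -3.09158349,
    "Saturday": 10.47619915,
    "Sunday": 27.12547493,
}


def yakult_regression_forecast(t, weekday):
    """Evaluate the fitted equation for time index t and a weekday name."""
    return INTERCEPT + TIME_COEF * t + WEEKDAY_COEF.get(weekday, 0.0)


def weekday_effect(weekday):
    """Expected difference in units versus a Friday at the same time index."""
    return WEEKDAY_COEF.get(weekday, 0.0)


for name in ["Friday", "Saturday", "Sunday"]:
    print(name, weekday_effect(name))
```

Read this way, the equation says that a Sunday is expected to sell about 27 units more than a Friday at the same point in time, and a Saturday about 10 units more, which matches the weekend peaks described in the data analysis.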

Forecasting

Date | Actual | Forecast Regression | Forecast Holt Winters | Regression Residual | Holt Winters Residual

8/1/2012 9 19.972351 19.57667224 -10.97235067 -10.57667224

8/2/2012 36 15.299274 17.10501376 20.70072628 18.89498624

8/3/2012 6 18.414658 16.5764155 -12.41465834 -10.5764155

8/4/2012 27 28.914659 31.22367656 -1.91465862 -4.223676558

8/5/2012 75 45.587736 57.30486095 29.41226447 17.69513905

8/6/2012 21 15.742356 32.24542733 5.25764426 -11.24542733

8/7/2012 9 16.606274 20.78549845 -7.60627361 -11.78549845

8/8/2012 6 20.138959 19.57667224 -14.13895858 -13.57667224

8/9/2012 15 15.465882 17.10501376 -0.46588163 -2.105013762

8/10/2012 24 18.581266 16.5764155 5.41873375 7.423584498

8/11/2012 21 29.081267 31.22367656 -8.08126653 -10.22367656

8/12/2012 45 45.754343 57.30486095 -0.75434344 -12.30486095


8/13/2012 15 15.908964 32.24542733 -0.90896365 -17.24542733

8/14/2012 12 16.772882 20.78549845 -4.77288152 -8.785498445

8/15/2012 48 20.305566 19.57667224 27.69443351 28.42332776

8/16/2012 6 15.63249 17.10501376 -9.63248954 -11.10501376

8/17/2012 21 18.747874 16.5764155 2.25212584 4.423584498

8/18/2012 12 29.247874 31.22367656 -17.24787444 -19.22367656

8/19/2012 72 45.920951 57.30486095 26.07904865 14.69513905

8/20/2012 24 16.075572 32.24542733 7.92442844 -8.245427327

8/21/2012 6 16.939489 20.78549845 -10.93948943 -14.78549845

8/22/2012 9 20.472174 19.57667224 -11.4721744 -10.57667224

8/23/2012 24 15.799097 17.10501376 8.20090255 6.894986238

8/24/2012 27 18.914482 16.5764155 8.08551793 10.4235845

8/25/2012 6 29.414482 31.22367656 -23.41448235 -25.22367656

8/26/2012 63 46.087559 57.30486095 16.91244074 5.69513905

8/27/2012 12 16.242179 32.24542733 -4.24217947 -20.24542733

8/28/2012 9 17.106097 20.78549845 -8.10609734 -11.78549845

8/29/2012 6 20.638782 19.57667224 -14.63878231 -13.57667224

8/30/2012 3 15.965705 17.10501376 -12.96570536 -14.10501376

8/31/2012 12 19.08109 16.5764155 -7.08108998 -4.576415502


Holt Winters No Trend: Since the data has no trend but a weekly seasonality, the Holt Winters No Trend model can be used for forecasting.

Model Performance Comparison

Sensitivity Report

Both models perform about equally well; however, linear regression scores slightly better on both the training and the validation data.

Residuals


ACF Plot for Linear Regression and Holt Winters model
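An ACF plot of this kind is built from the sample autocorrelations of the model residuals. The sketch below shows one common way to compute them; the function name and the residual values are hypothetical, and the report's software may use a slightly different normalization.

```python
def autocorrelation(residuals, max_lag=14):
    """Sample autocorrelation of a residual series for lags 1..max_lag."""
    n = len(residuals)
    mu = sum(residuals) / n
    dev = [r - mu for r in residuals]
    denom = sum(d * d for d in dev)           # lag-0 sum of squares
    acf = []
    for k in range(1, max_lag + 1):
        num = sum(dev[t] * dev[t - k] for t in range(k, n))
        acf.append(num / denom)
    return acf


# Hypothetical residuals with a leftover weekly pattern (six identical weeks).
demo_residuals = [5, -1, -1, -1, -1, -1, 0] * 6
demo_acf = autocorrelation(demo_residuals, max_lag=14)

# Index 6 corresponds to lag 7; a large value there signals weekly
# seasonality that the model has not captured.
print(round(demo_acf[6], 3))
```

For a model that has captured the weekly pattern, the lag-7 autocorrelation of its residuals should be small, typically within roughly 2 divided by the square root of the number of observations.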