Submitted to Management Science manuscript

High Definition Price elasticity of demand for current Consumer Packaged Goods and Retail challenges: a new hyperparameter optimization strategy for dynamic linear models

V. Sanchez-Gil, D. Pizarroso
Data Science at Wise Athena [email protected], wiseathena.com

R. A. Queralt, J. M. Lopez Zafra
CUNEF, [email protected]; [email protected]

A new strategy has been developed for finding the best input variances of the state transitions for

any dynamic linear model based on its predictive capability. This new approach, here called the Minimum

Predictive Error (MinPE) criteria, has been applied to measuring price elasticity of demand in order to

evaluate the sensitivity of consumers to daily price changes. The method has been compared to static linear

models using real data from consumer packaged goods companies. It has been proven that subtle oscillations

in price elasticity of demand are essential for correctly modeling the observed sales. The new proposed criteria

for estimating the input variances compared to those provided by the Maximum Likelihood Estimation

(MLE) method not only increases predictive capability of the model but also explores different and consistent

results rejected by MLE. Moreover, it has been proven that subtle daily deviations in the price elasticity of

demand can be unraveled only using MinPE criteria.

Key words : price-demand elasticity, Kalman filter, dynamic linear model, DLM, predictive elasticity, CPG

companies, Maximum Likelihood Estimation, MLE

1. Introduction

Since the 19th century, price elasticity of demand has been used as a main indicator for

describing the behavior of products in the market. It is a valuable metric to evaluate

product sensitivity to price changes, and is essential in pricing and trade promotions for

companies all over the world (Heerde et al. (2013), Kocabiyikoglu and Popescu (2011)).

This is reflected in the several manuscripts recently published about price elasticity of

demand for food (Andreyeva et al. (2010)), water (Espey et al. (1997)), gasoline (Hughes

et al. (2006)), housing (Green et al. (2005)) and online products (Granados et al. (2011)).

Author: New hyperparameter optimization strategy for dynamic linear models. Article submitted to Management Science; manuscript no.

Equally relevant is demand forecasting, which has a major impact on stock and production

planning, inventory management and new product launches in fields as diverse as

Energy (Suganthi and Samuel (2012)), Tourism (Law (2000)), Retail (Ma et al. (2016))

and grocery stores (Ali et al. (2009)). Nowadays, decision-making processes in Consumer

Packaged Goods and Retail companies strongly depend on accurate predictive models for

demand and price elasticity.

Even though studies of elasticity dynamics have provided valuable evidence of its time

variations and its importance for pricing products optimally (Simon (1979), Simon (1989),

Liu and Hanssens (1981), Fibich et al. (2005)), there are no specific computational methods

in the literature for estimating these time variations. The present study tries to correct

this deficiency.

Among the many models available, Dynamic Linear Models (DLM) with time-varying

parameters (Young (2011)) prove extremely useful for evaluating underlying unobserved

variables such as the price elasticity of demand and its variations over time. In combination

with the Kalman Filter (KF), dynamic linear models provide insightful solutions in many

different business problems (Beravs et al. (2012), Vaquero et al. (2017), Li et al. (2004),

Fei et al. (2011)). Since DLM models are very sensitive to input variances, Maximum

Likelihood Estimation (MLE) has been widely used for decades in order to find the best

possible combination (Myung (2003), McCulloch (1997)).

In this article, we propose a new hyperparameter optimization strategy for dynamic linear models, called MinPE, as an alternative to the classic MLE. The performance of MinPE in predicting demand and price elasticity has been compared with that of a dynamic linear model with MLE and of a static linear model, based on real prices and quantities demanded for 83 products.

2. Method description

2.1. Demand modeling

The demand of a product is affected by its price, seasonal effects, marketing campaigns,

cannibalization and competitor effects coming from other products, etcetera (Kapoor

and Ravi (2016)). Since the aim of this study is estimating price elasticity of demand

fluctuations, we assume that the daily demand of a product in the market at time t,

defined as qt, is mainly affected by its price pt and its weekly seasonality. It can be linearly

modeled as:

$$\log q_t = \beta_t \log p_t + \gamma_{1,t} d_{1,t} + \gamma_{2,t} d_{2,t} + \gamma_{3,t} d_{3,t} + \gamma_{4,t} d_{4,t} + \gamma_{5,t} d_{5,t} + \gamma_{6,t} d_{6,t} + \alpha_t \tag{1}$$

where βt is the price elasticity of demand, d1,t, d2,t . . . d6,t are binary variables for the days from

Monday to Saturday and αt is an additive value for a given day t. Sunday is not required

as it is collinear with the remaining days of the week. Long-term seasonality effects (monthly,

yearly. . . ), cannibalization and competitor effects could be added to the model. It is

important to note that this demand model could be generalized for any aggregation level,

from daily sales of a product in a store to monthly sales of one category of goods using a

weighted monthly price (Hoch et al. (1995)).
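As an illustration of how eq. (1) is evaluated for a single day, the following minimal Python sketch computes log qt from a price, a weekday and a set of coefficients. The function name `log_demand` and all numerical values are hypothetical and chosen only for illustration; they are not taken from the data used in this study.

```python
import math

def log_demand(price, weekday, beta, gammas, alpha):
    """Evaluate eq. (1) for one day.

    weekday: 0 = Monday ... 5 = Saturday, 6 = Sunday (baseline, no dummy).
    gammas:  six weekday coefficients, Monday to Saturday.
    """
    # Binary weekday indicators d_1 ... d_6 (all zero on Sunday).
    dummies = [1.0 if weekday == k else 0.0 for k in range(6)]
    seasonal = sum(g * d for g, d in zip(gammas, dummies))
    return beta * math.log(price) + seasonal + alpha

# Hypothetical coefficient values, for illustration only.
beta = -1.3
gammas = [0.10, 0.05, 0.0, -0.05, 0.15, 0.20]
alpha = 9.4

q_monday = math.exp(log_demand(2.0, 0, beta, gammas, alpha))
q_sunday = math.exp(log_demand(2.0, 6, beta, gammas, alpha))
```

Because the model is log-log in price, a 1% price increase changes log demand by β · log(1.01), which is approximately β percent.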

2.2. Static linear model (SLM)

As a first approximation, we use a static linear model where all time-dependent linear

parameters from eq. [1], βt, γ1,t. . . γ6,t and αt, are constant over time. Therefore, a constant

descriptive value for price elasticity of demand over a range of dates can be easily obtained

with a multiple linear regression from real historical data. Due to its simplicity, the SLM

approach is one of the most common procedures followed by companies for estimating price

elasticities (Shy (2008)). Nevertheless, this approach is unstable and very sensitive to local

effects for short periods of time.
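A minimal sketch of the static approach is shown below. For brevity it is a simplification that drops the weekday terms and fits only a constant β and α by ordinary least squares on log prices and log quantities; the function name `static_elasticity` and the synthetic data are assumptions made for illustration.

```python
import math

def static_elasticity(prices, quantities):
    """OLS fit of log q = beta * log p + alpha with constant beta and alpha."""
    x = [math.log(p) for p in prices]
    y = [math.log(q) for q in quantities]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx                 # constant price elasticity
    alpha = mean_y - beta * mean_x   # constant additive term
    return beta, alpha

# Synthetic, noise-free data generated with beta = -1.5 and alpha = 9.0.
prices = [1.5, 1.8, 2.0, 2.2, 2.5]
quantities = [math.exp(9.0) * p ** -1.5 for p in prices]
beta_hat, alpha_hat = static_elasticity(prices, quantities)
```

On noise-free data the fit recovers the generating elasticity; on real data a single constant β is all this approach can return for the whole range of dates.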

2.3. Dynamic linear model with Kalman Filter

The second approach to be considered in this analysis is a dynamic linear model (DLM)

where all linear parameters in eq. [1] are time-dependent. In contrast to SLM, DLM provides

a continuous and smooth time series for βt, γ1,t . . . γ6,t and αt.

For this study, we propose a State-Space Model (SSM) with an integrated random walk

(Young (2011)) for price elasticity βt with its time variations coming from its derivative

δt, elementary random walks for weekly seasonality γ1,t. . . γ6,t and the additive value αt

(see appendix A). The state vector θt, containing statistical noise, changes randomly over

time according to state transition equations. The Kalman filter (KF) provides an educated

guess of the current value of the state vector based on joint probability distributions every

time we make a new observation (Kalman (1960), Grewal (2011)). The observation

and transition equations are mathematically described as:

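$$\log q_t = F_t \theta_t + \varepsilon_t, \qquad \varepsilon_t \sim N(0, \sigma^2_q) \tag{2}$$

$$\theta_t = G_t \theta_{t-1} + \omega_t, \qquad \omega_t \sim N(0, W_t) \tag{3}$$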

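where the 9-dimensional state vector $\theta_t$ and the matrices $F_t$, $G_t$ and $W_t$ are defined as:

$$\theta_t = \begin{pmatrix} \beta_t \\ \delta_t \\ \gamma_{1,t} \\ \vdots \\ \gamma_{6,t} \\ \alpha_t \end{pmatrix}, \qquad F_t = \begin{bmatrix} \log p_t & 0 & d_{1,t} & d_{2,t} & d_{3,t} & d_{4,t} & d_{5,t} & d_{6,t} & 1 \end{bmatrix}, \tag{4}$$

$$G_t = \begin{pmatrix}
1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}, \qquad W_t = \operatorname{diag}\left(\sigma^2_\beta, \sigma^2_\delta, \sigma^2_\gamma, \sigma^2_\gamma, \sigma^2_\gamma, \sigma^2_\gamma, \sigma^2_\gamma, \sigma^2_\gamma, \sigma^2_\alpha\right). \tag{5}$$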

As can be seen in the $W_t$ matrix, it is assumed that the six weekly seasonality terms have the same transition variance, $\sigma^2_\gamma = \sigma^2_{\gamma,1} = \dots = \sigma^2_{\gamma,6}$; hence the three variances of the state transitions to be optimized are $\sigma^2_\gamma$, $\sigma^2_\delta$ and $\sigma^2_\alpha$. More detailed information about the selected SSM can be found in appendix A.
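The matrices of the state-space model can be assembled as plain nested lists, as in the hedged sketch below. The helper name `build_ssm` is hypothetical. Two assumptions are made: the first entry of F is taken as log pt so that F·θ reproduces eq. (1), and the transition variance of β is set to zero as stated in appendix A.

```python
import math

def build_ssm(log_price, weekday, var_delta, var_gamma, var_alpha):
    """Return (F, G, W) for the 9-dimensional state
    [beta, delta, gamma_1..gamma_6, alpha].

    weekday: 0 = Monday ... 5 = Saturday, 6 = Sunday (no dummy).
    """
    n = 9
    # G: identity plus one off-diagonal term so that
    # beta_t = beta_{t-1} + delta_{t-1} (integrated random walk).
    G = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    G[0][1] = 1.0
    # F: observation row; delta does not enter the observation directly.
    F = [log_price, 0.0] + [1.0 if weekday == k else 0.0 for k in range(6)] + [1.0]
    # W: diagonal transition covariance; sigma_beta^2 = 0 (assumption from appendix A).
    diag = [0.0, var_delta] + [var_gamma] * 6 + [var_alpha]
    W = [[diag[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    return F, G, W

# Illustrative call for a Monday with price 2.0 and hypothetical variances.
F, G, W = build_ssm(math.log(2.0), 0, 1e-7, 1e-8, 1e-9)
```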

DLMs with KF can forecast further into the future by a sequence of generation steps

without observation steps, but they are extremely sensitive to the chosen variances of

state transitions which define the variability and the shape of the resulting state vector

curves. For this reason, there are different strategies available in order to find the best

combination of transition variances. In this article, we compare the classical Maximum

Likelihood Estimation (MLE) variances optimization strategy with the new approach called

Minimum Predictive Error (MinPE).
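To make the roles of generation and observation steps concrete, the sketch below runs a Kalman filter on a deliberately simplified one-dimensional analogue (a scalar random walk observed with noise), not on the 9-dimensional model of this article. The function name `kalman_local_level`, the data and the variances are hypothetical. A missing observation (`None`) triggers a generation step only, which is how forecasts beyond the last observation are produced.

```python
def kalman_local_level(observations, var_obs, var_state, m0=0.0, c0=1e6):
    """Filter a scalar random walk observed with noise.

    observations: list of floats; None means 'no observation' (pure forecast).
    Returns the filtered means and variances of the state.
    """
    m, c = m0, c0
    means, variances = [], []
    for y in observations:
        # Generation step: theta_t = theta_{t-1} + omega_t, so uncertainty grows.
        c = c + var_state
        if y is not None:
            # Observation step: blend the prediction with the new data point.
            gain = c / (c + var_obs)
            m = m + gain * (y - m)
            c = (1.0 - gain) * c
        means.append(m)
        variances.append(c)
    return means, variances

# Four observed days followed by three forecast-only days.
ys = [1.0, 1.2, 0.9, 1.1, None, None, None]
means, variances = kalman_local_level(ys, var_obs=0.1, var_state=0.01)
```

In the forecast-only steps the mean stays fixed and the variance grows by the transition variance each day, which shows why the choice of transition variances shapes the resulting state curves.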

2.3.1. Maximum Likelihood Estimation (DLM+MLE)

Maximum Likelihood Estimation (MLE) finds the hyperparameter combination for a given model that maximizes the likelihood of observing the real historical data (White (1982), Akaike (1998)). In this case, MLE automatically finds the values of the variances $\sigma^2_\delta$, $\sigma^2_\gamma$ and $\sigma^2_\alpha$ that best fit the observed demand values q. The initial variances chosen for the MLE optimization are $\sigma^2_{\mathrm{init},\delta} = \sigma^2_{\mathrm{init},\gamma} = \sigma^2_{\mathrm{init},\alpha} = 10^{-5}$ and the maximum number of models tried per product is 364. The actual number of trained models varies from 133 to 364, because the convergence criterion can be reached before all combinations have been calculated. On average, 203 models are tested per product.

2.3.2. Minimum Predictive Error (DLM+MinPE)

In contrast, the Minimum Predictive Error criteria finds the parameter combination for a given model that minimizes the predictive error in the test dataset (Sanchez-Gil et al. (2019)). The main idea behind this approach is that the higher the predictive capability, the more robust the model is. As is standard practice in machine learning, historical data are split into two groups: train and test. The train dataset contains most of the available data and is used by the DLM and KF to estimate the state vector θt. The remaining data form the test dataset, a small fraction of the available data (usually the most recent) that is hidden from the model and reserved for testing purposes. In this case, the first 365 days and the last 30 days are used as train and test datasets, respectively.
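The chronological split described above can be sketched as follows. The helper name `split_train_test` is hypothetical, and the series of integers merely stands in for 395 daily observations.

```python
def split_train_test(series, n_test=30):
    """Chronological split: the last n_test points form the test dataset."""
    if not 0 < n_test < len(series):
        raise ValueError("n_test must be between 1 and len(series) - 1")
    return series[:-n_test], series[-n_test:]

# Placeholder for 395 daily data points, ordered by date.
series = list(range(395))
train, test = split_train_test(series)
```

Unlike a random split, this keeps the most recent days out of the training data, so the test error measures genuine forecasting ability.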

All different combinations from a finite list of possible values of the required transition

variances are tested, and for every combination the predictive error of the trained model

is evaluated using the test dataset following the workflow diagram shown in figure 1. In

order to measure the error of the forecasting model, the metric being used is the mean

absolute percentage error, mape, in the last N test days:

$$\mathrm{mape}(q, \hat{q}) = \frac{1}{N} \sum_{j=1}^{N} \left| \frac{q_j - \hat{q}_j}{q_j} \right|, \tag{6}$$

where $\hat{q}_j$ is the predicted value for a given combination of transition variances and $q_j$ is the observed one. Other error metrics such as mae, wmae, mse, rmse, etcetera could also be used in this new approach.
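The error metric of eq. (6) can be written in a few lines. This is a minimal sketch; the function name `mape` follows the notation of the text, while the sample values are hypothetical.

```python
def mape(observed, predicted):
    """Mean absolute percentage error over the test days, as in eq. (6)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    # Each term is |q_j - q_hat_j| / |q_j|; observed values must be non-zero.
    return sum(abs((q - q_hat) / q) for q, q_hat in zip(observed, predicted)) / len(observed)

# Two hypothetical test days: errors of 10% each.
example_error = mape([100.0, 200.0], [110.0, 180.0])
```

The result is a fraction (0.10 here); multiplying by 100 gives the percentages reported in the tables.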

Figure 1 Flow diagram of MinPE hyperparameter optimization strategy applied to demand prediction and price

elasticity estimation.

In our demand model, the lists of possible values for the three transition variances $\sigma^2_\delta$, $\sigma^2_\gamma$ and $\sigma^2_\alpha$ are given by powers of ten: from $10^{-5}$ to $10^{-10}$ for $\sigma^2_\delta$, from $\sigma^2_\delta \cdot 10^{0}$ to $\sigma^2_\delta \cdot 10^{-5}$ for $\sigma^2_\gamma$, and from $\sigma^2_\delta \cdot 10^{0}$ to $\sigma^2_\delta \cdot 10^{-5}$ for $\sigma^2_\alpha$. Different ranges have been chosen on purpose, exclusively testing equal or lower (but not higher) values for $\sigma^2_\gamma$ and $\sigma^2_\alpha$ compared to $\sigma^2_\delta$ in order to enable β fluctuations. Therefore, 216 models (6 × 6 × 6 combinations) are trained per product to find the optimal combination of variances, which is of the same order of magnitude as the mean of 203 attempts per SKU made using the MLE criteria. In complex models with a higher number of input variances, testing enough combinations in reasonable computational times would require efficient search algorithms.
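The grid of candidate variances and the selection rule can be sketched as below. The names `variance_grid` and `min_pe_select` are hypothetical, and `test_error` is a placeholder callable: in practice it would train the DLM on the train dataset with the given variances and return the forecast error on the test dataset, which is not implemented here.

```python
from itertools import product

def variance_grid():
    """All (var_delta, var_gamma, var_alpha) combinations described in the text."""
    grid = []
    for e_delta in range(5, 11):                  # var_delta = 1e-5 ... 1e-10
        var_delta = 10.0 ** -e_delta
        # var_gamma and var_alpha range from var_delta * 1e0 to var_delta * 1e-5.
        for e_gamma, e_alpha in product(range(0, 6), repeat=2):
            grid.append((var_delta,
                         var_delta * 10.0 ** -e_gamma,
                         var_delta * 10.0 ** -e_alpha))
    return grid

def min_pe_select(grid, test_error):
    """Return the variance combination with the smallest test error.

    test_error: hypothetical callable mapping a combination to its test error.
    """
    return min(grid, key=test_error)

grid = variance_grid()
# Dummy error function used only to exercise the selection step.
best = min_pe_select(grid, lambda combo: sum(combo))
```

An exhaustive search is affordable here because only three variances are tuned; with more hyperparameters the grid would grow multiplicatively.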

3. Input Data

Input data used in this article for testing and comparing the different approaches, provided

by Wise Athena©, are daily sellout prices and sales at a given point of sale. Wise Athena©

helps CPG companies to increase their sales and margins by optimizing pricing and trade

promotion through machine learning models fed with very granular data of dozens of

CPG companies from different countries. These input data have been anonymized for

confidentiality reasons. Among all the available pieces of information, the most recent

aggregated data at Stock-Keeping Unit (SKU) level of 83 selected products from different

countries have been used for an easy comparison of results (83 time series from August

2017 to September 2018 with 395 data points). In the case of DLM+MinPE, the train

dataset contains 365 points from 31st August 2017 to 31st August 2018 and the test

dataset contains 30 points corresponding to 1st-30th September 2018. Two time series

examples are shown in figure 2, where the dashed vertical line indicates the border between

train and test datasets. The remaining input data are attached as additional information

in the Supplementary Materials section.

Figure 2 Daily units sold q (above) and the corresponding daily prices p (below) of two example products, SKU1 and SKU2, plotted against date. In both examples, the vertical dashed line indicates the border between train (left) and test (right) datasets.

4. Results

All calculations have been performed with R programming language and R library dlm

(R Development Core Team (2008), Petri (2010), Petri et al. (2009)) in 8 QEMU virtual

processors (cpu64-rhel6) with a total memory size of 118GiB supplied by Wise Athena©.

Under these conditions, computer processing times per SKU for SLM, DLM+MLE and

DLM+MinPE are 0.05, 14.29 and 27.39 seconds, respectively. SLM is around ∼300 and ∼500 times

faster than dynamic linear models with the MLE and MinPE criteria, respectively. As might be expected,

training around 200 models with time-dependent variables to find a winning combination

of input variances is a lengthy task. In the case of DLM+MinPE, performing

all steps described in figure 1 for a given input variances combination takes about 0.13

seconds. It is important to note that powerful machines in terms of memory size are not

strictly necessary, since the most relevant requirement for the dlm library is the CPU

performance.
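As a quick arithmetic check of the figures quoted above, using only the per-SKU processing times and the 216 MinPE combinations reported in the text:

```latex
\frac{14.29}{0.05} \approx 286, \qquad
\frac{27.39}{0.05} \approx 548, \qquad
\frac{27.39\ \text{s}}{216\ \text{models}} \approx 0.127\ \text{s per model}
```

These values agree with the rounded speed-up factors and with the time of about 0.13 seconds per input variances combination.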

Figure 3 Mean price elasticity of demand β versus its variance $\sigma^2(\beta)$ for the 83 goods under study, in three panels (SLM, DLM+MLE and DLM+MinPE). The color scale denotes the SKU mape prediction error in test. Points aligned at the bottom correspond to SKUs with constant β ($\sigma^2(\beta) < 10^{-8}$). DLM+MinPE is the only method that always provides significant fluctuations for price elasticity.

Regarding demand estimation performance for the three approaches, static linear models

exhibit higher mape errors than dynamic linear models in train and test datasets (see

table 1). This fact evinces that time variations of β, γ, and α from eq. 1 are important for

correctly describing demand (as is clearly shown in the train dataset of figure 5). Although MLE

maximizes the likelihood of observing the input data, it is not the method that provides the best

fit to the train dataset. This can be attributed to both the local minima and the fact that

only a small fraction of the entire state space is scanned using MLE. Alternatively, MinPE

criteria is the best at describing the demand in the train dataset in most of the products.

This could denote that uncorrelated combinations of input variances help faster sampling

of different regions of state space. Besides, having an extra requirement in convergence

criteria (MinPE criteria) reduces the number of equivalent solutions that correctly fit train

and test data at the same time. Unsurprisingly, DLM+MinPE is the best approach at

predicting demand on a test dataset since MinPE criteria optimizes this value.

Table 1 Mean absolute percentage errors (mape) in the train and test datasets for the three approaches considered, and the number of SKUs for which each approach yields the smallest error (winning) in each dataset. In both datasets, MinPE is the model with the smallest error and is the winning model for most of the 83 products in study.

                   SLM       DLM+MLE   DLM+MinPE
mape (train)       19.59%    17.31%    17.10%
winning (train)    0         29        54
mape (test)        31.81%    30.01%    25.69%
winning (test)     5         0         78

Concerning price elasticity estimation, table 2 shows the percentage of products with constant elasticity (β time series with $\sigma^2 < 10^{-8}$ are considered constant) according to SLM, DLM+MLE and DLM+MinPE. Whereas SLM always provides static price elasticities, DLM+MLE also exhibits constant results for 56 of the 83 products. Conversely, all elasticities estimated using DLM+MinPE display fluctuations of varying intensity. In 14 of the remaining 27 cases, for which β estimated with MLE shows variation over time, MLE and MinPE provide similar results (see Supplementary Materials). In all cases, the elasticities provided by the three methods are of the same order of magnitude (between -5.5 and 0.5). For a large majority of SKUs, the non-constant price elasticity estimated by DLM+MinPE is the best one at describing and predicting the real demand (see table 1). All these results for the 83 products under study are summarized in fig. 3.

Table 2 Percentage of SKUs where the estimated price elasticity of demand β is constant over time for the three considered approaches (β time series with $\sigma^2 < 10^{-8}$ are considered constant). DLM+MinPE is the only approach that systematically detects β fluctuations for all tested SKUs.

               SLM     DLM+MLE   DLM+MinPE
constant β     100%    67%       0%

Finally, it is important to highlight the convergence issues found with MLE for a small

fraction of products not included in this study (less than 1%). In these cases, MLE does

not provide any valid solution as the iterative convergence process is not completed for any

of the trained models (Mantel and Myers (1971)). MinPE criteria overcomes this problem

since it simply selects the model with the lowest test error.

4.1. Two SKU examples: a detailed review

For a better understanding, this section includes a more detailed examination of the results

provided by all three approaches for both products shown in figure 2.

SKU1 input data show that the largest price promotion in October 2017 did not bring a relevant sales increase. However, the two price promotions around July 2018 and the multiple price reductions in September 2018 brought visible demand increases at a lower investment (see figure 2). This change in price sensitivity over time is consistent with the price elasticity of demand provided by DLM+MinPE (see fig. 4), where β of SKU1 continuously decreases during the last 13 months. Despite the fact that MLE provides a slightly better fit to the train dataset (see table 3), MinPE finds a similar solution in terms of accuracy which also unveils price elasticity changes over time. For this sample product SKU1, the optimal $\sigma^2_\delta$ provided by MinPE is four orders of magnitude higher than the variance found by MLE (see table 3).

Figure 4 Estimated price elasticities of demand β for the two example products, SKU1 and SKU2, according to the SLM (red), DLM+MLE (blue) and DLM+MinPE (yellow) models, plotted against date. In both examples, the vertical dashed line indicates the border between train (left) and test (right) datasets, and only with the MinPE criteria can subtle changes in elasticity be observed.

Table 3 From top to bottom, mape prediction errors in the train and test datasets and the optimal combinations of input variances $\sigma^2_\delta$, $\sigma^2_\gamma$, $\sigma^2_\alpha$ provided by SLM, DLM+MLE and DLM+MinPE for both SKU examples.

                SKU1                                  SKU2
                SLM      DLM+MLE      DLM+MinPE       SLM       DLM+MLE      DLM+MinPE
mape (train)    24.20%   21.77%       22.81%          100.36%   33.89%       33.18%
mape (test)     38.10%   38.17%       37.80%          46.13%    20.21%       11.25%
σ²_δ            -        8.0·10^-13   10^-9           -         3.5·10^-15   10^-7
σ²_γ            -        7.9·10^-11   10^-13          -         4.3·10^-15   10^-7
σ²_α            -        2.3·10^-4    10^-11          -         7.5·10^-3    10^-7

In the case of SKU2, figure 4 shows that MinPE provides a radically different solution

incompatible with the two other approaches. Whereas β provided by DLM+MinPE indicates

that SKU2 is an elastic product with values around -3, SLM and DLM+MLE estimate

that it is barely price-sensitive, with positive β values close to zero. Input data from figure

2 display price promotions in October 2017, January 2018 and September 2018 that positively

affect demand, indicating an elastic behavior in full agreement with DLM+MinPE

results. Moreover, demand estimation using the MinPE criteria yields the lowest errors

(see table 3) and the best fit, especially in the test dataset (see figure 5). The price

reduction with its correlated demand increase in the test dataset (see figure 2) can only

be explained with a negative and relatively high price elasticity. As table 3 shows, according

to MLE most of the demand fluctuations of SKU2 came from the additive term α, while

MinPE finds the same value, $10^{-7}$, for the three optimized variances. In this specific example,

the best input variances combination (not explored by MLE) can be found using MinPE

criteria only.

Figure 5 Demand prediction for SKU2 in the train and test datasets (panels: Train SKU2, Test SKU2) using the SLM (red), DLM+MLE (blue) and DLM+MinPE (yellow) approaches compared with the real provided sales (grey). In both datasets, MinPE criteria yields the smallest predictive errors.

Ultimately, fluctuations in the β, γ and α variables that define demand in equation 1 (price elasticity, weekly seasonality and additive terms) have a deep impact on short- and long-term pricing decisions. In figure 6, three demand curves of product SKU1 on three different days show distinct behavioral changes during the year depending on the daily values estimated by DLM+MinPE. As dynamic linear models can predict demand and price elasticity values in the near future (generation steps without observations), pricing strategies based on DLM+MinPE results could help CPG and Retail companies determine the optimal price in every promotional cycle.

Figure 6 Demand curve evolution over time for SKU1 (q versus p on three different days, 15th Sept 17, 12th March 18 and 16th Sept 18, marked with grey points below) and the three time series (price elasticity of demand β, weekly seasonality γ and the additive value α) calculated with DLM+MinPE. The third demand curve and the time series to the right of the vertical dashed line are predictive values within the test dataset.

5. Conclusions

In this article, it has been shown that fluctuations over time are really important for

correctly modeling the demand of a product. It has also been proven that among the three

tested approaches, only using a dynamic linear model with the new MinPE criteria can

daily changes in elasticity be observed. These fluctuations open the possibility for more

effective everyday and promotional pricing strategies by CPG and Retail companies that

could exploit these small changes in price sensitivity.

Despite the fact that MinPE criteria is more computationally expensive than

MLE (27.39 versus 14.29 seconds per SKU), it is a more robust and relatively easy to implement alternative to MLE for optimizing

input variances in DLM. It has been proven that exploring solutions which provide

small errors in train as well as in test according to MinPE criteria yields distinctive results that are

more consistent with the observed input data. This happens thanks to better sampling

of the state space. Finally, demand prediction errors using DLM+MinPE compared

with those provided by SLM and DLM+MLE are the lowest for the vast majority of the

products tested.

Appendix A: State-Space model (SSM)

The observation equation for our proposed demand model follows eq. [1] with an additive term, $\varepsilon_t$, arising from the zero-mean observation noise:

$$\log q_t = \beta_t \log p_t + \gamma_{1,t} d_{1,t} + \gamma_{2,t} d_{2,t} + \gamma_{3,t} d_{3,t} + \gamma_{4,t} d_{4,t} + \gamma_{5,t} d_{5,t} + \gamma_{6,t} d_{6,t} + \alpha_t + \varepsilon_t, \tag{7}$$

where $\varepsilon_t \sim N(0, \sigma^2_q)$ follows a normal distribution with variance $\sigma^2_q$, which is directly calculated from the observed q time series.

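Time evolution of linear parameters is mathematically described by the following nine state transition equations:

$$\beta_t = \beta_{t-1} + \delta_{t-1} + \omega_{t,\beta}, \qquad \omega_{t,\beta} \sim N(0, \sigma^2_\beta) \tag{8}$$

$$\delta_t = \delta_{t-1} + \omega_{t,\delta}, \qquad \omega_{t,\delta} \sim N(0, \sigma^2_\delta) \tag{9}$$

$$\gamma_{1,t} = \gamma_{1,t-1} + \omega_{t,\gamma_1}, \qquad \omega_{t,\gamma_1} \sim N(0, \sigma^2_{\gamma_1}) \tag{10}$$

$$\vdots \tag{11}$$

$$\gamma_{6,t} = \gamma_{6,t-1} + \omega_{t,\gamma_6}, \qquad \omega_{t,\gamma_6} \sim N(0, \sigma^2_{\gamma_6}) \tag{12}$$

$$\alpha_t = \alpha_{t-1} + \omega_{t,\alpha}, \qquad \omega_{t,\alpha} \sim N(0, \sigma^2_\alpha) \tag{13}$$

where the price elasticity, β, follows an integrated random walk driven by its derivative δ, and the remaining state variables δ, γ1 . . . γ6 and α are governed by simple random walks. For the sake of simplicity, we assume equal variances for the six components of the weekly seasonality, $\sigma^2_\gamma = \sigma^2_{\gamma_1} = \dots = \sigma^2_{\gamma_6}$.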
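The integrated random walk of eqs. (8) and (9) can be simulated in a few lines to see how the variance of δ alone drives the evolution of β. This is a hedged sketch: the function name `simulate_beta`, the starting values and the seed are hypothetical, and the noise term of eq. (8) is set to zero, following the choice of a zero transition variance for β made in this appendix.

```python
import random

def simulate_beta(n_days, var_delta, beta0=-1.3, delta0=0.0, seed=0):
    """Simulate beta_t under eqs. (8)-(9) with sigma_beta^2 = 0."""
    rng = random.Random(seed)
    beta, delta = beta0, delta0
    path = [beta]
    for _ in range(n_days):
        # Eq. (8): beta_t = beta_{t-1} + delta_{t-1} (no direct noise on beta).
        beta = beta + delta
        # Eq. (9): delta_t = delta_{t-1} + omega, with omega ~ N(0, var_delta).
        delta = delta + rng.gauss(0.0, var_delta ** 0.5)
        path.append(beta)
    return path

# Thirty simulated days with a hypothetical transition variance.
path = simulate_beta(30, var_delta=1e-7)
# With zero variance and a constant slope, beta drifts linearly.
linear_path = simulate_beta(10, var_delta=0.0, delta0=0.01)
```

Because the noise enters through the slope δ rather than through β itself, the simulated elasticity changes smoothly from one day to the next.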

Observation and transition equations can be translated into vectors and matrices to give a condensed view of our proposed SSM:

$$\log q_t = F_t \theta_t + \varepsilon_t, \qquad \varepsilon_t \sim N(0, \sigma^2_q) \tag{14}$$

$$\theta_t = G_t \theta_{t-1} + \omega_t, \qquad \omega_t \sim N(0, W_t) \tag{15}$$

where the 9-dimensional state vector $\theta_t$ and the matrices $F_t$, $G_t$ and $W_t$ are defined as:

$$\theta_t = \begin{pmatrix} \beta_t \\ \delta_t \\ \gamma_{1,t} \\ \vdots \\ \gamma_{6,t} \\ \alpha_t \end{pmatrix}, \qquad F_t = \begin{bmatrix} \log p_t & 0 & d_{1,t} & d_{2,t} & d_{3,t} & d_{4,t} & d_{5,t} & d_{6,t} & 1 \end{bmatrix}, \tag{16}$$

$$G_t = \begin{pmatrix}
1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}, \qquad W_t = \operatorname{diag}\left(\sigma^2_\beta, \sigma^2_\delta, \sigma^2_\gamma, \sigma^2_\gamma, \sigma^2_\gamma, \sigma^2_\gamma, \sigma^2_\gamma, \sigma^2_\gamma, \sigma^2_\alpha\right). \tag{17}$$

Finally, the initial conditions for the state vector, $\beta_0$, $\gamma_{1,0}, \dots, \gamma_{6,0}$, $\alpha_0$, and the initial variances $\sigma^2_{0,\gamma_1}, \dots, \sigma^2_{0,\gamma_6}$, $\sigma^2_{0,\alpha}$ are taken from a multiple linear regression of eq. [1].

The initial values for the derivative of the price elasticity, $\delta_0$ and $\sigma^2_{0,\delta}$, are chosen for convenience as zero. Similarly, we choose $\sigma^2_\beta = 0$ and $\sigma^2_{0,\beta} = 0$ since all price elasticity variations over time are directly derived from its derivative δ. Consequently, the proposed demand model depends only on three input variances of the state transitions: $\sigma^2_\delta$, $\sigma^2_\gamma$ and $\sigma^2_\alpha$.

Acknowledgments

The authors gratefully acknowledge A. Montoro, A. Vazquez, J. Rodríguez, M.J. Martín, M. Romero

and J.M. Lopez-Zafra for their useful comments and corrections. Vicente Sanchez-Gil and Daniel Pizarroso

wish to acknowledge the support of Wise Athena© in facilitating this research.

References

Akaike, H. 1998. Information theory and an extension of the maximum likelihood principle. Springer Series in Statistics (Perspectives in Statistics).

Ali, Ozden Gur, Serpil Sayın, Tom Van Woensel, Jan Fransoo. 2009. SKU demand forecasting in the presence of promotions. Expert Systems with Applications 36(10) 12340–12348.

Andreyeva, Tatiana, Michael W. Long, Kelly D. Brownell. 2010. The impact of food prices on consumption:

A systematic review of research on the price elasticity of demand for food. Am J Public Health 100(2)

216–222.

Beravs, Tadej, Janez Podobnik, Marko Munih. 2012. Three-axial accelerometer calibration using Kalman filter covariance matrix for online estimation of optimal sensor orientation. IEEE Transactions on Instrumentation and Measurement 61(9).

Espey, M., J. Espey, W. D. Shaw. 1997. Price elasticity of residential demand for water: A meta-analysis. Water Resources Research 33(6).

Fei, Xiang, Chung-Cheng Lu, Ke Liu. 2011. A Bayesian dynamic linear model approach for real-time short-term freeway travel time prediction. Transportation Research Part C: Emerging Technologies 19(6) 1306–1318.


Fibich, Gadi, Arieh Gavious, Oded Lowengart. 2005. The dynamics of price elasticity of demand in the

presence of reference price effects. Journal of the Academy of Marketing Science 33(1) 66.

Granados, Nelson, Alok Gupta, Robert J. Kauffman. 2011. Online and offline demand and price elasticities:

Evidence from the air travel industry. Information Systems Research 23(1). doi:10.1287/isre.1100.0312.

Green, Richard K., Stephen Malpezzi, Stephen K. Mayo. 2005. Metropolitan-specific estimates of the price

elasticity of supply of housing, and their sources. American Economic Review 95(2) 334–339.

Grewal, Mohinder S. 2011. Kalman Filtering. Springer Berlin Heidelberg, Berlin, Heidelberg. doi:10.1007/978-3-642-04898-2_321.

Heerde, Harald J. Van, Maarten J. Gijsenberg, Marnik G. Dekimpe, Jan-Benedict E.M. Steenkamp. 2013. Price and advertising effectiveness over the business cycle. Journal of Marketing Research 50(2) 177–193. doi:10.1509/jmr.10.0414.

Hoch, Stephen J., Byung-Do Kim, Alan L. Montgomery, Peter E. Rossi. 1995. Determinants of store-level

price elasticity. Journal of Marketing Research 32(1) 17–29.

Hughes, Jonathan E., Christopher R. Knittel, Daniel Sperling. 2006. Evidence of a shift in the short-run price

elasticity of gasoline demand. The Energy Journal, International Association for Energy Economics

29(1) 113–134.

Kalman, R. E. 1960. A new approach to linear filtering and prediction problems. Journal of Basic Engineering

35–45.

Kapoor, Mudit, Shamika Ravi. 2016. Elasticity of intertemporal substitution in consumption in the presence of inertia: Empirical evidence from a natural experiment. Management Science 63(12). doi:10.1287/mnsc.2016.2564.

Kocabiyikoglu, Ayse, Ioana Popescu. 2011. An elasticity approach to the newsvendor with price-sensitive

demand. Operations Research 59(2). doi:10.1287/opre.1100.0890.

Law, Rob. 2000. Back-propagation learning in improving the accuracy of neural network-based tourism

demand forecasting. Tourism Management 21(4) 331–340.

Li, Gang, Haiyan Song, Stephen F. Witt. 2004. Modeling tourism demand: A dynamic linear AIDS approach. Journal of Travel Research 43(2) 141–150.

Liu, Lon-Mu, Dominique Hanssens. 1981. A Bayesian approach to time-varying cross-sectional regression models. Journal of Econometrics 15 341–356. doi:10.1016/0304-4076(81)90099-3.

Ma, Shaohui, Robert Fildes, Tao Huang. 2016. Demand forecasting with high dimensional data: The case of SKU retail sales forecasting with intra- and inter-category promotional information. European Journal of Operational Research 249(1) 245–257.

Mantel, Nathan, Max Myers. 1971. Problems of convergence of maximum likelihood iterative procedures in

multiparameter situations. Journal of the American Statistical Association 66(335) 484–491.


McCulloch, Charles E. 1997. Maximum likelihood algorithms for generalized linear mixed models. Journal of the American Statistical Association 92(437) 162–170.

Myung, In Jae. 2003. Tutorial on maximum likelihood estimation. Journal of Mathematical Psychology 47(1) 90–100.

Petris, Giovanni. 2010. An R package for dynamic linear models. Journal of Statistical Software 36(12) 1–16.

Petris, Giovanni, Sonia Petrone, Patrizia Campagnoli. 2009. Dynamic Linear Models with R. Springer.

R Development Core Team. 2008. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org. ISBN 3-900051-07-0.

Sanchez-Gil, V., D. Pizarroso, R. A. Queralt, J. M. Lopez-Zafra. 2019. Minimum predictive error: a new

hyperparameter optimization strategy for dynamic linear models. Working paper.

Shy, Oz. 2008. How to Price: A Guide to Pricing Techniques and Yield Management. Cambridge University Press.

Simon, Hermann. 1979. Dynamics of price elasticity and brand life cycles: An empirical study. Journal of

Marketing Research 16(4) 439–452.

Simon, Hermann. 1989. Price Management. North Holland Elsevier Science Publishers, Amsterdam.

Suganthi, L., Anand A. Samuel. 2012. Energy models for demand forecasting: a review. Renewable and Sustainable Energy Reviews 16(2) 1223–1240.

Vaquero, Victor, Ivan del Pino, Francesc Moreno-Noguer, Joan Solà, Alberto Sanfeliu. 2017. Deconvolutional networks for point-cloud vehicle detection and tracking in driving scenarios. 2017 European Conference on Mobile Robots (ECMR).

White, Halbert. 1982. Maximum likelihood estimation of misspecified models. Econometrica, The Econometric Society 50(1) 1–25.

Young, Peter C. 2011. Recursive Estimation and Time-Series Analysis: An Introduction for the Student and Practitioner. Springer.