Optimizing the return of advertising through time series forecasting and decomposition methods

Miguel Jerez (*), Alfredo García-Hiernaux, Sonia Sotoca

Universidad Complutense de Madrid

Abstract: This paper discusses the application of time series analysis,

forecasting and decomposition methods to optimize the return of advertising.

To this end, we assume that there exists a dynamic log-linear relationship

between investment and response. We show that this assumption is supported by the

economic theory and very often by the data analyzed. In this framework, we formulate

three key questions for a decision-maker: How can one measure the return of

advertising? Which is the best timing for investments when the response is seasonal?

How much should a given advertiser invest? The answers and discussion of these

relevant questions are structured around a practical case based on the monthly sales of

Lydia Pinkham’s vegetable compound and the corresponding advertising investments.

Keywords: Marketing, Advertising, Forecasting, Time series decomposition, Lydia

Pinkham, Seasonality

JEL classification: C530; M37; M31

(*) Corresponding author. Departamento de Fundamentos del Análisis Económico II. Facultad de Ciencias Económicas. Campus de Somosaguas. 28223 Madrid (SPAIN). Email: mjerez@ccee.ucm.es Tel: (+34) 91 394 23 61, fax: (+34) 91 394 25 91.

1. Introduction.

While cross-section statistics and econometrics are now standard tools of

marketing, time series methods are not so widely applied. Perhaps it is our fault: much

effort has been devoted to specification, but few examples illustrate how time series

models may help the marketer.

This paper discusses the application of time series forecasting and

decomposition methods to improve the return of advertising. To this end, we assume

that there exists a dynamic log-linear relationship between advertising and response

(from now on “sales”) and, in this framework, we address three key questions: how can

one measure the return of advertising? which is the best timing for investments when

the response is seasonal? and how much should a given advertiser invest? The

discussion is structured around a practical case based on the monthly sales of Lydia

Pinkham’s vegetable compound and the corresponding advertising investments.

Section 2 provides some background on the dataset employed. Section 3

discusses the consequences of specifying a linear or log-linear relationship between

sales and advertising and Section 4 describes the model building process. Sections 5, 6

and 7 provide a step-by-step discussion of how one can apply a log-linear model to

address the three questions posed above. Finally, Section 8 provides some

concluding remarks and summarizes the lessons that can be drawn from these

exercises.

2. The dataset.

Now we will show how to apply time series decomposition methods to improve

the return of advertising. To this end, we will use the famous monthly series of sales

and advertising of the Lydia Pinkham vegetable compound, see Figure 1.

[Insert Figure 1]

This product was introduced in 1873. It was an herbal extract in an 18-20 percent

alcoholic solution, and was considered to be effective against “all those painful

Complaints and Weaknesses so common to our best female population”. Additional

medical claims followed its commercial success. The company gained strong public

exposure because of controversies around the product ingredients and a large court case,

which made this dataset public. The firm was finally sold in 1968, but some medicinal

products with the generic “Lydia Pinkham” brand can be acquired even today by direct

order.

These series are important in empirical research on advertising for several

reasons: (a) the product is a frequently purchased, low-cost consumer nondurable, a

class of products of special interest to marketing research; (b) advertising,

primarily in newspapers, was the only marketing instrument used by the company; (c)

price changes were small and infrequent; (d) distribution, mainly through drug

wholesalers, remained fairly stable; (e) there were no direct competitors for

this product, so the market under study can be considered a closed sales-advertising

system; and (f) for confidentiality reasons, public sales and advertising datasets

are scarce. Because of these convenient features, this dataset has been modeled many

times, by early researchers such as Palda (1964) or Bhattacharyya (1982) and, more

recently, by Kim (2005) or Smith, Naik and Tsai (2006).

3. Data transformations and their implications in a sales-advertising framework.

Simple inspection of Figure 1 shows that both series display seasonal

fluctuations and a downward drift. Their profile suggests also that there is a positive

relationship between level and volatility, so that the higher their level, the higher their

volatility. This impression is confirmed by the mean-standard deviation plots shown in

Figure 2, which suggest that a logarithmic transformation may improve their statistical

properties (Box and Cox, 1964) by: (a) stabilizing the data volatility, (b) making it

independent of the level, (c) linearizing the relationship between both series and (d)

bringing their distribution closer to Gaussian.
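The diagnostic behind a mean-standard deviation plot can be sketched in a few lines. The following is a minimal illustration, assuming yearly blocks of 12 observations and a purely hypothetical series (not the Lydia Pinkham data); the function name `mean_sd_slope` is ours.

```python
import math
import statistics


def mean_sd_slope(series, block=12):
    """Slope of the LS line of block standard deviations on block means."""
    blocks = [series[i:i + block] for i in range(0, len(series) - block + 1, block)]
    means = [statistics.fmean(b) for b in blocks]
    sds = [statistics.stdev(b) for b in blocks]
    mx, my = statistics.fmean(means), statistics.fmean(sds)
    sxx = sum((x - mx) ** 2 for x in means)
    sxy = sum((x - mx) * (y - my) for x, y in zip(means, sds))
    return sxy / sxx


# Hypothetical series: a repeating monthly pattern scaled by a falling level.
pattern = [0.8, 0.9, 1.0, 1.1, 1.2, 1.1, 1.0, 0.9, 0.8, 0.9, 1.0, 1.1]
levels = [200.0, 160.0, 120.0, 80.0]
raw = [lvl * p for lvl in levels for p in pattern]
logged = [math.log(x) for x in raw]

print(round(mean_sd_slope(raw), 4))     # positive: volatility grows with the level
print(round(mean_sd_slope(logged), 4))  # close to zero after taking logs
```

A clearly positive slope on the raw data that vanishes after taking logs is the pattern that motivates the logarithmic transformation.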

[Insert Figure 2]

In addition, log-transforming the data has other advantages when

modeling a sales-advertising system. To discuss this idea, consider the following

regressions:

$$R_t = \beta_0 + \beta_1 A_t^1 + \beta_2 A_t^2 + S_t + E_t \qquad (3.1)$$

$$\ln R_t = \alpha_0 + \alpha_1 \ln A_t^1 + \alpha_2 \ln A_t^2 + s_t + \varepsilon_t \qquad (3.2)$$

where $R_t$ denotes the response at time $t$, $A_t^1$ and $A_t^2$ are advertising investments of type 1 and 2, $S_t$ and $s_t$ are seasonal components and, last, the terms $E_t$ and $\varepsilon_t$ are white noise errors.

If the “true” model were the linear equation (3.1) this would imply that: (a)

advertising has constant returns to scale, (b) the return generated by $A_t^1$ does not

depend on the level of $A_t^2$ and vice versa, (c) ROI is independent of the specific period

$t$ in which the investment is made and, specifically, is not affected by the seasonal

cycle. These implications contradict the practical experience of most marketers.

In contrast, the log-linear specification has radically different

implications. As is well known: (a) if $\alpha_1$ and $\alpha_2$ are less than one, (3.2) implies

decreasing returns to scale and (b) in a log-linear framework all the terms considered

interact multiplicatively with each other. To see this, take antilogs of both sides of (3.2),

obtaining:

$$R_t = \exp(\alpha_0)\,(A_t^1)^{\alpha_1}\,(A_t^2)^{\alpha_2}\exp(s_t)\exp(\varepsilon_t) \qquad (3.3)$$

where $\exp(\cdot)$ denotes the exponential function.

The previously discussed implications have important consequences for optimal

investment decisions. For example, in the linear model (3.1) if β1 > β2 then the whole

budget should be allocated to the type 1 investment. Also the timing of investment is

irrelevant, as the expected response does not depend on $t$. In contrast, it is easy

to see that expression (3.3) implies, for example, that the expected return of an increase

in $A_t^1$ depends not only on the level of $A_t^2$, but also on the phase of the seasonal cycle,

$\exp(s_t)$. Because of this, the log-linear specification implies that diversifying over

different investment types and seasons is optimal.
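The multiplicative interactions in (3.3) can be checked numerically. The sketch below evaluates the expected response with the error term set to zero; the elasticities (0.3 and 0.2), the advertising levels and the seasonal values are hypothetical choices made only for illustration, not estimates from the paper.

```python
import math


def loglinear_response(a1, a2, s, alpha0=0.0, alpha1=0.3, alpha2=0.2):
    """Expected response under (3.3) with the error term set to zero.

    The parameter values are hypothetical, chosen only for illustration.
    """
    return math.exp(alpha0) * a1 ** alpha1 * a2 ** alpha2 * math.exp(s)


def gain_from_extra_a1(a1, a2, s, extra=1.0):
    """Increase in expected response when type-1 advertising rises by `extra`."""
    return loglinear_response(a1 + extra, a2, s) - loglinear_response(a1, a2, s)


# The same extra unit of type-1 advertising, evaluated in different contexts.
low_a2_trough = gain_from_extra_a1(a1=10.0, a2=5.0, s=-0.2)
high_a2_trough = gain_from_extra_a1(a1=10.0, a2=20.0, s=-0.2)
high_a2_peak = gain_from_extra_a1(a1=10.0, a2=20.0, s=0.2)
large_a1_peak = gain_from_extra_a1(a1=40.0, a2=20.0, s=0.2)

# The gain rises with the level of type-2 advertising and with the seasonal factor.
print(low_a2_trough < high_a2_trough < high_a2_peak)
# The gain falls as type-1 advertising itself grows (decreasing returns).
print(large_a1_peak < high_a2_peak)
```

Under the linear model (3.1) all four gains would be identical, which is why that specification makes both the investment mix and its timing irrelevant.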

Therefore, we can conclude that log-transforming the data is very important to

model the relationships between sales and investment over time for both statistical and

practical reasons. It also has some inflexibilities and shortcomings. For example, it

cannot capture increasing returns for the first units invested, as S-shaped functions do,

see e.g. Hanssens, Parsons and Schultz (2003, Chapter 3), but provides a realistic

operational framework with a small cost in complexity.

4. Model building.

According to the notation and ideas in the previous section, we developed the

model-building exercise using the logged and scaled value of sales, $r_t = 100 \ln R_t$, and

the corresponding logged advertising, $a_t = 100 \ln A_t$. The logged series were

multiplied by 100 to improve the scaling of the estimates.

On this basis, the modeling exercise can be viewed as finding a statistically

adequate representation for the terms in the decomposition:

$$r_t = r_t^a + r_t^\varepsilon \qquad (4.1)$$

where $r_t^a$ is the deterministic, or input-related, component, and $r_t^\varepsilon$ is the stochastic, or error-related, component. Following Box, Jenkins and Reinsel (2008), the input-related term can be parametrized with a rational transfer function:

$$r_t^a = \frac{\omega_0 - \omega_1 B - \omega_2 B^2 - \dots - \omega_r B^r}{1 - \delta_1 B - \delta_2 B^2 - \dots - \delta_s B^s}\, a_t \qquad (4.2)$$

where $B$ is the backshift operator, such that for any $w_t$: $B^k w_t = w_{t-k}$, $k = 0, \pm 1, \pm 2, \dots$ On the other hand, the error-related term can be modeled as an ARIMA process:

$$r_t^\varepsilon = \frac{(1 - \theta_1 B - \dots - \theta_q B^q)(1 - \Theta_1 B^S - \dots - \Theta_Q B^{SQ})}{(1 - \phi_1 B - \dots - \phi_p B^p)(1 - \Phi_1 B^S - \dots - \Phi_P B^{SP})(1 - B)^d (1 - B^S)^D}\, \varepsilon_t \qquad (4.3)$$

To choose the parametrization in (4.1)-(4.3) we: (a) did a standard univariate

analysis of log advertising, which yielded an IMA(1,1)x(1,1)12 specification, (b)

prewhitened both series using this model (Box, Jenkins and Reinsel, 2008, Chapter 11)

and (c) computed the corresponding sample cross-correlations, see Figure 3.

[Insert Figure 3]

Figure 3.a displays two significant cross-correlations at lags 0-1, perhaps mixed

with a rough sinusoidal response between current sales and lagged advertising. On the

other hand, Figure 3.b shows no substantial feedback.

We estimated several models within the family (4.1)-(4.3) coherent with the

pattern in Figure 3.a. In all cases, the observations between November 1957 and

February 1958 were treated as missing values because they were outliers. Finally, we

chose the following specification for the transfer function component (4.2):

$$r_t^a = \frac{\underset{(.022)}{.048} + \underset{(.022)}{.016}\,B + \underset{(.026)}{.043}\,B^2}{1 - \underset{(.121)}{.713}\,B + \underset{(.087)}{.751}\,B^2}\; a_t \qquad (4.4)$$

where the figures in parentheses are standard errors. Note that:

1) The roots of the polynomials in the numerator and denominator of (4.4) are,

respectively, -.186±1.040i and .475±1.052i so, even though this model is

heavily parametrized, there are no redundant dynamic factors in this term.

2) Model (4.4) implies an impulse-response function with both positive and negative

values, see Figure 4. In a sales-advertising system this can happen when the

product has a loyal customer base and the advertising accelerates the

consumption, therefore changing the distribution of re-stocking purchases

over time.
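Both remarks can be verified from the point estimates alone. The following sketch, which assumes the numerator .048 + .016B + .043B^2 and the denominator 1 - .713B + .751B^2 implied by the reported roots, computes the polynomial roots, the first impulse-response weights and the long-term gain. The function names are ours, and the recursion is the usual polynomial division rather than the E4 implementation.

```python
import cmath

# Point estimates from (4.4): numerator .048 + .016B + .043B^2,
# denominator 1 - .713B + .751B^2.
NUM = [0.048, 0.016, 0.043]
DEN = [1.0, -0.713, 0.751]


def impulse_response(num, den, n_lags):
    """Weights nu_0..nu_{n_lags-1} of num(B)/den(B), assuming den[0] == 1."""
    nu = []
    for j in range(n_lags):
        value = num[j] if j < len(num) else 0.0
        for i in range(1, min(j, len(den) - 1) + 1):
            value -= den[i] * nu[j - i]
        nu.append(value)
    return nu


def quadratic_roots(c0, c1, c2):
    """Roots in B of c0 + c1*B + c2*B^2 = 0."""
    disc = cmath.sqrt(c1 * c1 - 4.0 * c2 * c0)
    return (-c1 + disc) / (2.0 * c2), (-c1 - disc) / (2.0 * c2)


nu = impulse_response(NUM, DEN, 24)
long_run_gain = sum(NUM) / sum(DEN)  # transfer function evaluated at B = 1

print([round(v, 3) for v in nu[:6]])
print(round(long_run_gain, 3))
print(quadratic_roots(*NUM), quadratic_roots(*DEN))
```

The weights change sign after the first few lags, and the gain obtained by setting B = 1 agrees with the long-term gain of 0.103 quoted in the caption of Figure 4.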

[Insert Figure 4]

As for the error-related term (4.3), the chosen specification was an

IMA(1,1)x(1,1)12 process, with the following results:

$$r_t^\varepsilon = \frac{(1 - \underset{(.029)}{.900}\,B)(1 - \underset{(.063)}{.628}\,B^{12})}{(1 - B)(1 - B^{12})}\,\hat{\varepsilon}_t\,; \qquad \hat{\sigma}_\varepsilon = 7.536$$

where $\hat{\varepsilon}_t$ is a homoscedastic white noise error. Note that standard hypothesis testing would not reject the null $\theta = 1$, thus implying either overdifferencing or a deterministic trend. However, one must take into account that, under this null, Gaussian maximum-likelihood (ML) estimates display a so-called “pile-up” effect, in which the probability density under the null converges to a large positive value, so standard testing is biased towards non-rejection. The test proposed by Davis, Chen and Dunsmuir (1994, Section 3) takes this distortion into account and, in this case, safely rejects $\theta = 1$ in favor of $\theta < 1$ at the 5% significance level.

Finally an analysis of residual autocorrelations did not suggest any additional

structure. The corresponding value of the Q-statistic computed with the first ten

autocorrelations (9.69) confirms this impression. Also, there were no significant values

or patterns in the cross-correlation function between the residuals and the prewhitened

input and the corresponding Q-statistic, also computed with ten lags (6.94), was small

enough to consider that (4.3) captures all linear relationships between these series.
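For readers who want to reproduce this kind of residual check, here is a minimal sketch of a portmanteau statistic. It assumes the Q-statistic is the Ljung and Box (1978) form listed in the references, which the text does not state explicitly; the residual series below is hypothetical.

```python
def autocorrelations(x, max_lag):
    """Sample autocorrelations r_1..r_max_lag of a sequence."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(1, max_lag + 1)]


def ljung_box_q(x, max_lag=10):
    """Ljung-Box statistic Q = n(n+2) * sum_k r_k^2 / (n - k)."""
    n = len(x)
    r = autocorrelations(x, max_lag)
    return n * (n + 2) * sum(rk * rk / (n - k) for k, rk in enumerate(r, start=1))


# Hypothetical residuals: a strongly alternating sequence gives a large Q.
alternating = [1.0, -1.0] * 10
print(ljung_box_q(alternating, max_lag=1))
```

Small values of Q relative to a chi-square reference distribution, as with the 9.69 and 6.94 reported above, indicate that no linear structure is left in the residuals.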

5. Estimating the return of advertising.

One of the basic uses of a sales-advertising model is estimating the ROI of

advertising. Building on previous results, it seems natural to do this using the terms in

the decomposition (4.1), whose estimation requires solving the difference equation

(4.4). We did this using the algorithm proposed by Casals, Jerez and Sotoca (2010),

which provides orthogonal estimates for $r_t^a$ and $r_t^\varepsilon$.

Figure 5 shows the profile of the sales series versus a “baseline”, computed as

$\exp(r_t^\varepsilon/100)$. This baseline can therefore be interpreted as the expected value of sales

if the advertising investment were kept at a null steady state. According to this

interpretation, the grey area would be an estimate of the sales generated by advertising.

[Insert Figure 5]

These numbers provide some interesting conclusions. Total sales during the

sample period were roughly 10 million US$, with total advertising of 4.8 million

(48.4% of the previous figure). The estimate of sales generated by advertising that

results from Figure 5 is 4.5 million, so the ROI is -.3 million US$ or, in percentage

terms, -3.1% of sales. Advertising in this case therefore did not create value for the firm.

The superimposed regression lines in Figure 5 suggest that advertising could

have been inefficient, as the negative slope of the sales line is more pronounced than

that of the baseline, implying that actual sales deteriorated more rapidly than the

underlying market conditions.

6. Pro-seasonal vs. anti-seasonal investment.

To investigate the cause of this poor performance we further decomposed

the error-related component into its structural components:

$$r_t^\varepsilon = t_t + c_t + s_t + \varepsilon_t \qquad (6.1)$$

where the addends on the right-hand side of (6.1) are, respectively, the trend ($t_t$), cycle ($c_t$), seasonal ($s_t$) and irregular ($\varepsilon_t$) components. Substituting (6.1) in (4.1) and taking antilogs of both sides of the resulting expression, we see that sales can be written as:

$$R_t = \exp(r_t^a) \cdot \exp(t_t) \cdot \exp(c_t) \cdot \exp(s_t) \cdot \exp(\varepsilon_t) \qquad (6.2)$$

Assuming for simplicity that $\exp(t_t) = \exp(c_t) = 1$, it is clear that investing a given budget when the seasonal factor, $\exp(s_t)$, is larger should provide a higher return than investing when it is smaller.

Figure 6 compares the advertising and seasonal factors in (6.2), the latter

computed with the method proposed by Casals, Jerez and Sotoca (2002). The sample

correlation between both components is -.87, so advertising was systematically

increased in the “troughs” of the seasonal cycle and decreased in the “peaks”. Taking

into account equation (6.2) above, it is clear that this anti-seasonal budget allocation does

not maximize sales.

[Insert Figure 6]

It is easy to measure the inefficiency of an anti-seasonal budget distribution

through a simple counterfactual experiment. To this end, we: (a) re-distributed the

annual advertising investment as a direct proportion of the multiplicative seasonal

component, (b) backcasted the corresponding sales, and (c) predicted the ROI of

advertising in this new scenario.

In formal terms, assume that advertising budgets are allocated for T

consecutive periods (typically one year) and consider the investment sequences given

by the following vectors:

$$\mathbf{a}_{t:t+T-1} = \begin{bmatrix} a_t & a_{t+1} & \dots & a_{t+T-1} \end{bmatrix} \qquad (6.3)$$

$$\mathbf{a}^{*}_{t:t+T-1} = \begin{bmatrix} a_t^{*} & a_{t+1}^{*} & \dots & a_{t+T-1}^{*} \end{bmatrix}$$

where $\mathbf{a}_{t:t+T-1}$ includes the logs of the investments actually made in $T$ consecutive periods, while $\mathbf{a}^{*}_{t:t+T-1}$ characterizes an alternative or “counterfactual” budget allocation such that total expenditure in both is the same and $\alpha\, a_i^{*} = \exp(s_i)$. Consider also the impulse-response function implied by (4.4), denoted by:

$$\nu(B) = \nu_0 + \nu_1 B + \nu_2 B^2 + \dots$$

Under these conditions, the deterministic component can be written as the

product of a vector of log advertising values and a Toeplitz matrix of impulse-response

weights:

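$$\mathbf{r}^{a}_{t-k:t+T-1} = \begin{bmatrix} a_{t-k} & a_{t-k+1} & \dots & a_{t-1} & a_t & a_{t+1} & \dots & a_{t+T-1} \end{bmatrix} \begin{bmatrix} \nu_0 & \nu_1 & \nu_2 & \dots & \nu_{T+k-1} \\ 0 & \nu_0 & \nu_1 & \dots & \nu_{T+k-2} \\ 0 & 0 & \nu_0 & \dots & \nu_{T+k-3} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \dots & \nu_0 \end{bmatrix}$$

and, therefore, it is straightforward to write the deterministic component under the counterfactual budget allocation as:

$$\mathbf{r}^{a*}_{t-k:t+T-1} = \begin{bmatrix} a_{t-k} & a_{t-k+1} & \dots & a_{t-1} & a_t^{*} & a_{t+1}^{*} & \dots & a_{t+T-1}^{*} \end{bmatrix} \begin{bmatrix} \nu_0 & \nu_1 & \nu_2 & \dots & \nu_{T+k-1} \\ 0 & \nu_0 & \nu_1 & \dots & \nu_{T+k-2} \\ 0 & 0 & \nu_0 & \dots & \nu_{T+k-3} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \dots & \nu_0 \end{bmatrix}$$

so, according to (6.2), the expected sales in the counterfactual scenario would be:

$$R_t^{*} = \exp(r_t^{a*}) \cdot \exp(t_t) \cdot \exp(c_t) \cdot \exp(s_t) \cdot \exp(\varepsilon_t) \qquad (6.7)$$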
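The mechanics of the counterfactual can be sketched as follows. This is a minimal illustration with hypothetical seasonal factors and spending figures, no pre-sample values (k = 0) and an impulse response truncated after four rounded weights; the function names are ours and the code is not the E4 implementation used in the paper.

```python
import math


def pro_seasonal_allocation(spend, seasonal_factor):
    """Redistribute total spend in proportion to the multiplicative seasonal factor."""
    total = sum(spend)
    weight_sum = sum(seasonal_factor)
    return [total * f / weight_sum for f in seasonal_factor]


def scaled_log(values):
    """Transform levels into the 100*ln scale used for a_t."""
    return [100.0 * math.log(v) for v in values]


def toeplitz_response(a, nu):
    """Row vector `a` times the upper-triangular Toeplitz matrix built from `nu`.

    Element j equals sum_{i<=j} a[i] * nu[j-i]; weights beyond len(nu) are zero.
    """
    n = len(a)
    matrix = [[nu[j - i] if 0 <= j - i < len(nu) else 0.0 for j in range(n)]
              for i in range(n)]
    return [sum(a[i] * matrix[i][j] for i in range(n)) for j in range(n)]


# Hypothetical one-year example: spend is high when the seasonal factor is low.
seasonal = [1.2, 1.1, 1.0, 0.9, 0.8, 0.9, 1.0, 1.1, 1.2, 1.1, 1.0, 0.9]
actual_spend = [50.0, 55.0, 60.0, 70.0, 80.0, 70.0, 60.0, 55.0, 50.0, 55.0, 60.0, 70.0]
nu = [0.048, 0.050, 0.043, -0.007]  # first weights implied by (4.4), rounded

counterfactual_spend = pro_seasonal_allocation(actual_spend, seasonal)
r_a = toeplitz_response(scaled_log(actual_spend), nu)
r_a_star = toeplitz_response(scaled_log(counterfactual_spend), nu)
print(round(sum(actual_spend), 6), round(sum(counterfactual_spend), 6))
```

The two printed totals coincide, so the exercise changes only the timing of the budget, never its size. Feeding `r_a_star` instead of `r_a` into (6.7) then gives the counterfactual sales path.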

In our case, the alternative pro-seasonal allocation increased the estimate of total

sales generated by advertising by 13%, from 4.5 to 5.1 million US$. As the total amount

invested in both scenarios is 4.8 million US$, these estimates imply that investing

against seasonality generated a negative ROI (4.5 - 4.8 = -.3 million US$) while the

expected ROI of a pro-seasonal investment policy would have been positive (5.1 - 4.8 =

.3 million US$).
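The arithmetic behind these comparisons is simple enough to script. The sketch below uses only the rounded figures quoted in the text, so the ROI as a share of sales comes out at about -3.0% rather than the -3.1% obtained from the unrounded data; the function and constant names are ours.

```python
# Rounded figures quoted in the text, in millions of US$.
TOTAL_SALES = 10.0
ADVERTISING = 4.8
GENERATED_ACTUAL = 4.5        # anti-seasonal allocation actually used
GENERATED_PRO_SEASONAL = 5.1  # counterfactual pro-seasonal allocation


def roi(generated_sales, investment):
    """Return of advertising: sales attributed to it minus its cost."""
    return generated_sales - investment


roi_actual = roi(GENERATED_ACTUAL, ADVERTISING)
roi_counterfactual = roi(GENERATED_PRO_SEASONAL, ADVERTISING)
uplift = GENERATED_PRO_SEASONAL / GENERATED_ACTUAL - 1.0

print(round(roi_actual, 1), round(roi_counterfactual, 1))
print(round(100 * uplift, 1), round(100 * roi_actual / TOTAL_SALES, 1))
```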

[Insert Figure 7]

7. Optimal investment budget.

A remarkable feature of Lydia Pinkham’s dataset is that, over the period

considered, advertising investment was more than 48% of sales. Therefore, after

estimating the loss of ROI that can be attributed to the anti-seasonal investment policy,

it is natural to wonder if the advertising-to-sales ratio was too high.

To test this hypothesis we defined a new experiment in which we computed the

counterfactual ROI corresponding to a dense grid of advertising budgets, from 0 to 6.5

million US$. The results are shown in Figure 8.

[Insert Figure 8]

In this figure, point “A” corresponds to the improved pro-seasonal allocation of the original

investment, 4.8 million US$ in 78 months. Point “B” corresponds to a pro-seasonal

allocation of the optimal investment: 1.0 million US$ in the same period.
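The logic of a budget grid search can be illustrated with a deliberately simplified, static version of the problem. The sketch below is not the dynamic backcasting exercise behind Figure 8. It assumes a constant elasticity equal to the long-term gain (0.103), ignores lags and seasonality, measures the budget in data units so that a budget of one corresponds to zero log advertising, and uses a hypothetical baseline.

```python
def roi_for_budget(budget, baseline, elasticity):
    """Static ROI: sales generated by advertising minus the amount invested.

    Generated sales are baseline * (budget**elasticity - 1), so a budget of
    one unit (log advertising equal to zero) generates nothing.
    """
    return baseline * (budget ** elasticity - 1.0) - budget


def best_budget_on_grid(grid, baseline, elasticity):
    """Budget on the grid with the highest ROI."""
    return max(grid, key=lambda b: roi_for_budget(b, baseline, elasticity))


BASELINE = 1000.0    # hypothetical baseline sales, in arbitrary units
ELASTICITY = 0.103   # long-term gain reported for the transfer function
grid = [float(b) for b in range(1, 1001)]

best = best_budget_on_grid(grid, BASELINE, ELASTICITY)
# First-order condition of the static problem: elasticity * baseline * b**(elasticity - 1) = 1.
closed_form = (ELASTICITY * BASELINE) ** (1.0 / (1.0 - ELASTICITY))
print(best, round(closed_form, 1))
```

Because the static ROI is concave in the budget, the grid maximum lies within one grid step of the closed-form optimum. With such a low elasticity, the optimum is a small fraction of baseline sales, which is qualitatively in line with the much smaller optimal budget at point “B”.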

8. Concluding remarks.

We have shown how to use a time series model relating sales and advertising to

solve three practical issues: measuring the ROI of advertising, assessing budget

allocation over the seasonal cycle and, last, optimizing the budget size.

Regarding the first problem (ROI measurement), our proposal consists of deriving

estimates for the direct return from the advertising-related components of the sales

series. These measures can be computed “by hand” in the simplest cases, or using the

general decomposition procedure of Casals, Jerez and Sotoca (2010) when the

relationships are complex.

Second, Sections 6 and 7 show how to perform a counterfactual analysis to assess

the efficiency of the investment timing policy and the budget size. The basic idea

consists of comparing the results actually obtained with “backcasts” computed by feeding

alternative investment scenarios to the model.

The results obtained in this particular case indicate that advertising performance

can be substantially improved by reallocating investments over time and adjusting the

budget to an efficient level. Obviously the practical relevance of these results for a

company that was sold in 1968 is limited but, according to our professional experience,

the same improvement opportunities exist in many firms operating today and the

approach explained here can help them to bring this potential to fruition.

The computational procedures employed in this research are implemented in the

E4 functions “detcomp” (decomposition of a vector of time series into input-related and

error-related components) and “e4trend” (structural decomposition of a vector of time

series). E4 is a MATLAB toolbox for time series modeling, which can be downloaded

at: www.ucm.es/info/icae/e4. The source code for all the functions in the toolbox is

freely provided under the terms of the GNU General Public License. This site also

includes a complete user manual and other reference materials.

References:

Akaike, H. (1973). Information Theory and an Extension of the Maximum Likelihood Principle, in B.N. Petrov and F. Csaki (eds.), Proc. 2nd International Symposium on Information Theory, 267-281, Akademiai Kiado, Budapest.

Bhattacharyya, M.N. (1982). Lydia Pinkham Data Remodelled, Journal of Time Series Analysis, 3, 81-102.

Box, G.E.P. and D.R. Cox (1964). An Analysis of Transformations, Journal of the Royal Statistical Society, Series B, 26, 2, 211-252.

Box, G.E.P., G.M. Jenkins and G.C. Reinsel (2008). Time Series Analysis: Forecasting and Control, Wiley, New York.

Casals, J., M. Jerez and S. Sotoca (2002). An Exact Multivariate Model-based Structural Decomposition, Journal of the American Statistical Association, 97, 458, 553-564.

Casals, J., M. Jerez and S. Sotoca (2010). Decomposition of a State-Space Model with Inputs, Journal of Statistical Computation and Simulation, 80, 9, 979-992.

Davis, R.A., M. Chen and W.T.M. Dunsmuir (1994). Inference for MA(1) processes with a unit root on or near the unit circle, Probability and Mathematical Statistics, 15, 227-242.

Hanssens, D.M., L.J. Parsons and R.L. Schultz (2003). Market Response Models: Econometric and Time Series Analysis, International Series in Quantitative Marketing, vol. 12, Kluwer Academic Publishers, Dordrecht (Netherlands).

Kim, J.H. (2005). Investigating the advertising-sales relationship in the Lydia Pinkham data: a bootstrap approach, Applied Economics, 37, 3, 347-354.

Ljung, G. and G.E.P. Box (1978). On a Measure of Lack of Fit in Time Series Models, Biometrika, 65, 297-303.

Palda, K. (1964). The Measurement of Cumulative Advertising Effects, Prentice-Hall, Englewood Cliffs (NJ).

Smith, A., P.A. Naik and C. Tsai (2006). Markov-switching model selection using Kullback-Leibler divergence, Journal of Econometrics, 134, 2, 553-577.

Acknowledgements: This research was partially funded by Ministerio de Economía y

Competitividad through the Grant ECO2011-23972.

Figure 1: Monthly series of sales and advertising of the Lydia Pinkham vegetable

compound from January 1954 to June 1960 (78 monthly values). Source: Palda (1964).

Figure 2.a: Mean-standard deviation plot of sales, with an LS regression line

superimposed.

Figure 2.b: Mean-standard deviation plot of advertising, with an LS regression line

superimposed.

Figure 3.a. Sample cross-correlation function between current sales and lagged

advertising. The lines at ±0.2 are the limits of a 5% individual significance test. The

Q-statistic is 23.40, with a p-value smaller than 1%, thus rejecting the absence of a causal

relationship.

Figure 3.b. Sample cross-correlation function between current advertising and lagged

sales. The Q-statistic is 12.19, with a p-value of 11.5% and, therefore, does not reject

the absence of a causal relationship.

Figure 4: Impulse-response function implied by the transfer function model. The

implied cycle and damping factor are six months and 50% approximately. The

long-term gain is 0.103.

Figure 5: Sales versus its baseline, computed as the exponential of the error-related

component. LS regression lines were added as an indication of the velocity of their

downward drift.

Figure 6: Multiplicative seasonality versus the advertising-related component of sales.

Figure 7: Budget optimization.
