Just Google It: Can Internet Search Terms Help Explain Movements in Retail Sales?
Daniel Ayoubkhani (ONS) & Matthew Swannell (ONS)
Contents
1. Introduction to Google Trends
2. Existing Literature
3. Aims of Current ONS Research
4. Data
5. Methods
6. Results
7. Conclusions
8. Considerations
1. Introduction to Google Trends
• Google provide information on search query share for a given week
• Data are available in 25 top level categories and hundreds of lower level categories
• Reported as the growth in the share of search queries since the first week of January 2004
Chart: search query share for “Football Transfers”, annotated with the Summer Transfer Window, the January Transfer Window, and the points where the January and summer transfer deadlines were reached.
Source: Google Insights for Search
2. Existing Literature
Choi, H and Varian, H (2009) Predicting the Present with Google Trends:
• Paper pioneered use of Google Trends (GT) data as a nowcasting tool
• Applied log-linear “nowcast” models to US retail sales
• Performance of models improved when Google Trends data were included
Chamberlin, G (2010) Googling the Present, Economic and Labour Market Review (Dec 2010):
• Modelled 11 UK Retail Sales Index (RSI) time series
• Relatively simple benchmark models
• Alternative models included GT category data as predictors
• GT terms significant in eight models
3. Aims of Current ONS Research
Focus of this investigation: quality assurance of the UK RSI
1. Fit benchmark models that are representative of current ONS practice
2. Fit alternative models that include appropriate GT terms as predictors
3. Compare models using empirical measures
4. Draw conclusions to inform ONS strategy
4. Data – Retail Sales Index
• All Retail Sales
• Non-Specialised Food Stores
• Non-Specialised Non-Food Stores
• Textiles, Clothing and Footwear
• Furniture and Lighting
• Home Appliances
• Hardware, Paints and Glass
• Audio and Video Equipment and Recordings
• Books, Newspapers and Stationery
• Computers and Telecommunications
• Non-Store Retailing
All extracted RSI time series:
• represent monthly GB retail sales
• start in January 1988
• end in June 2011
• are not seasonally adjusted
• are chained volume indices
Source: ONS
4. Data – Google Trends
All extracted GT time series:
• represent weekly UK search activity
• start in January 2004
• end in July 2011
Each RSI series matched with:
• at least one GT search category
• the top five search queries within each category
RSI Series: Furniture and Lighting
Google Trends Category | Google Trends Queries
Lighting | lighting, light, lights, lamp, lamps
Home and Garden | furniture, ikea, garden, b&q, homebase
Homemaking and Interior Decor | blinds, curtains, curtains curtains curtains, bedroom, ikea
Home Furnishings | furniture, ikea, beds, lighting, table table
• Raw data are weekly growth rates in query shares
• Indices constructed by setting first full week in January 2004 to 100 and applying growth rates
• Monthly data formed by taking weighted averages of weekly data
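The index construction and monthly aggregation described above can be sketched as follows. This is only an illustration under stated assumptions: the growth rates are taken to be fractions measured relative to the base week (so the base week itself has growth 0), and each week is weighted by the number of its days that fall in the month. The slides do not say which weights were used, and the function names are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def build_index(growth_rates, base=100.0):
    """Convert growth rates relative to the base week into an index.

    Assumption: each rate is a fraction (0.10 means +10%) measured
    against the first full week of January 2004, which is set to 100.
    """
    return [base * (1.0 + g) for g in growth_rates]


def monthly_weighted_average(week_starts, weekly_index):
    """Average weekly index values into calendar months.

    Assumption: a week contributes to a month in proportion to how many
    of its seven days fall in that month.
    """
    totals = defaultdict(float)
    weights = defaultdict(int)
    for start, value in zip(week_starts, weekly_index):
        for offset in range(7):
            day = start + timedelta(days=offset)
            key = (day.year, day.month)
            totals[key] += value
            weights[key] += 1
    return {key: totals[key] / weights[key] for key in sorted(totals)}


if __name__ == "__main__":
    # Illustrative values only, not actual Google Trends data.
    starts = [date(2004, 2, 22), date(2004, 2, 29)]
    index = build_index([0.0, 0.70])
    print(monthly_weighted_average(starts, index))
```

In this example the week starting 29 February 2004 has one day in February and six in March, so it receives a weight of 1 in the February average and supplies all of the March value.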
5. Methods – Benchmark Models
• Each RSI “month” is a 4- or 5-week standard reporting period (SRP)
• Disparity between survey and Gregorian months evolves by one or two days each year (“phase shift”)
• One-week long survey break every five or six years
• Example – September SRP:
Figure: calendar grid from 26 August to 5 October showing the position of the September SRP in each year from 2000 to 2011.
Therefore SRPs are not comparable with each other due to:
• their compositions
• moving holidays
Holiday | Position | SRP
Easter | Good Friday and Easter Monday | Mar or Apr
Spring (Late May) | Last Monday in May | May or Jun
Summer (Late August) | Last Monday in August | Aug or Sep
• Regression models used to estimate phase shift effects
• Example – Spring bank holiday variable:
\[
x_t =
\begin{cases}
1 & \text{in May, in years where the bank holiday is in the May SRP} \\
-0.8 & \text{in June, in years where the bank holiday is in the May SRP} \\
0 & \text{otherwise}
\end{cases}
\]
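A minimal sketch of how the spring bank holiday variable could be coded is given below. Whether the bank holiday falls in the May SRP in a given year depends on the ONS reporting calendar, which is not reproduced in these slides, so it is passed in as a flag; the function name and the flags in the example are hypothetical.

```python
def spring_bank_holiday_regressor(month, holiday_in_may_srp):
    """Return the value of the spring bank holiday variable for one period.

    month: SRP month number (5 = May, 6 = June).
    holiday_in_may_srp: True if, in that year, the late May bank holiday
        falls inside the May SRP (assumed to be supplied by the caller).
    """
    if not holiday_in_may_srp:
        return 0.0
    if month == 5:
        return 1.0
    if month == 6:
        return -0.8
    return 0.0


if __name__ == "__main__":
    # Illustrative flags only, not the actual ONS calendar.
    flags = {"year A": True, "year B": False}
    for label, flag in flags.items():
        values = [spring_bank_holiday_regressor(m, flag) for m in range(1, 13)]
        print(label, values)
```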
\[
y_t = \sum_i \beta_i x_{it} + z_t
\]
where:
• \(y_t\) is the RSI series, log transformed and differenced (regular and seasonal)
• \(x_{it}\) are the phase shift regressors
• \(z_t\) follows an ARMA process
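The transformation applied to the RSI series before modelling (a log transform followed by regular and seasonal differencing) can be illustrated with a short sketch. It assumes monthly data with a seasonal period of 12; the function name is hypothetical.

```python
import math


def log_difference(series, seasonal_period=12):
    """Apply log, then regular (lag 1) and seasonal differencing.

    Returns a list that is seasonal_period + 1 observations shorter
    than the input, because each differencing step loses observations.
    """
    logs = [math.log(value) for value in series]
    # Regular difference: log(Y_t) - log(Y_{t-1})
    regular = [logs[t] - logs[t - 1] for t in range(1, len(logs))]
    # Seasonal difference of the regularly differenced series
    return [
        regular[t] - regular[t - seasonal_period]
        for t in range(seasonal_period, len(regular))
    ]


if __name__ == "__main__":
    # Illustrative series: steady growth times a fixed seasonal pattern.
    pattern = [1.0, 0.9, 1.0, 1.1, 1.0, 1.0, 1.1, 1.0, 1.0, 1.1, 1.2, 1.6]
    series = [100.0 * 1.01 ** t * pattern[t % 12] for t in range(36)]
    transformed = log_difference(series)
    print(len(transformed), max(abs(v) for v in transformed))
```

For a series made up only of constant growth and a fixed seasonal pattern, the transformed values are zero up to rounding error, which is why the differencing is said to remove trend and seasonality.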
5. Methods – Alternative Models
Benchmark models extended with (log transformed, differenced) GT variables
• Static relationships estimated for all series
• Lagged relationships modelled where identified
• Relationships identified at more than one lag modelled both individually and together
• Multiple regression models estimated for RSI series matched with more than one GT search category
Lagged relationships identified from cross-correlation plots of pre-whitened series
• ARIMA models fit to all RSI and GT series
  • used the (0,1,1)(0,1,1) model for all series
• Each RSI residual series correlated with each of its corresponding GT residual series
  • series exhibit common trends and seasonality, so correlate the shocks
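The cross-correlation step can be sketched as below. The sketch assumes the two inputs are already the residual series from the ARIMA fits (the pre-whitening itself is not shown) and that they have equal length. The ±2/√n bound is a common rule of thumb for flagging lags, not necessarily the criterion used in this study, and the function names are hypothetical.

```python
import math


def cross_correlation(gt_resid, rsi_resid, lag):
    """Sample correlation between gt_resid[t - lag] and rsi_resid[t], lag >= 0."""
    n = len(gt_resid)
    mean_x = sum(gt_resid) / n
    mean_y = sum(rsi_resid) / n
    cov = sum(
        (gt_resid[t - lag] - mean_x) * (rsi_resid[t] - mean_y)
        for t in range(lag, n)
    ) / n
    sd_x = math.sqrt(sum((v - mean_x) ** 2 for v in gt_resid) / n)
    sd_y = math.sqrt(sum((v - mean_y) ** 2 for v in rsi_resid) / n)
    return cov / (sd_x * sd_y)


def candidate_lags(gt_resid, rsi_resid, max_lag):
    """Lags whose cross-correlation exceeds the approximate 2/sqrt(n) bound."""
    bound = 2.0 / math.sqrt(len(gt_resid))
    return [
        lag
        for lag in range(max_lag + 1)
        if abs(cross_correlation(gt_resid, rsi_resid, lag)) > bound
    ]


if __name__ == "__main__":
    import random

    rng = random.Random(0)
    # Simulated shocks: the RSI shock echoes the GT shock two periods later.
    gt = [rng.gauss(0.0, 1.0) for _ in range(200)]
    rsi = [0.0, 0.0] + gt[:-2]
    print(candidate_lags(gt, rsi, max_lag=6))
```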
• Example – Furniture and Lighting vs “garden”
No significant phase shift effects so models are:
\[
\begin{aligned}
y_t &= \beta x_t + z_t \\
y_t &= \beta x_{t-2} + z_t \\
y_t &= \beta_1 x_{t-2} + \beta_2 x_{t-3} + z_t \\
y_t &= \beta x_{t-3} + z_t
\end{aligned}
\]
6. Results
Component of the RSI (no. of alternative models fitted) | % alternative models with AICC lower than benchmark | % GT terms significant at 5% level
All Retail Sales (8) | 0.0 | 37.5
Non-Specialised Food Stores (6) | 0.0 | 0.0
Non-Specialised Non-Food Stores (6) | 0.0 | 83.3
Textiles, Clothing and Footwear (23) | 30.4 | 36.0
Furniture and Lighting (31) | 90.3 | 78.8
Home Appliances (7) | 14.3 | 0.0
Hardware, Paints and Glass (6) | 50.0 | 100.0
Audio Equipment and Recordings (44) | 43.2 | 51.0
Books, Newspapers and Stationery (6) | 16.7 | 100.0
Computers and Telecoms (31) | 9.7 | 15.2
Non-Store Retailing (7) | 42.9 | 42.9
6. Results – Furniture and Lighting
Top three alternative models in terms of AICC
GT Term in Model | Lag(s) | GT Category | AICC
lighting | 0 | Home Furnishings | 412.47
curtains curtains curtains | 0 & 1 | Homemaking & Interior Decor | 414.76
lights | 0 | Lighting | 415.63
Benchmark | | | 432.29
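The slides do not define AICC. Assuming it is the usual small-sample corrected Akaike Information Criterion, it can be computed from a model's log-likelihood as in the sketch below, with lower values indicating a preferred model. The function name and the example inputs are hypothetical.

```python
def aicc(log_likelihood, n_params, n_obs):
    """Corrected AIC: AIC plus a penalty that grows as n_obs shrinks.

    Assumption: AICC = -2 log L + 2k + 2k(k + 1) / (n - k - 1),
    where k is the number of estimated parameters and n is the number
    of observations used in fitting.
    """
    if n_obs - n_params - 1 <= 0:
        raise ValueError("n_obs must exceed n_params + 1")
    aic = -2.0 * log_likelihood + 2.0 * n_params
    return aic + 2.0 * n_params * (n_params + 1) / (n_obs - n_params - 1)


if __name__ == "__main__":
    # Illustrative inputs only, not values from the fitted models.
    print(aicc(log_likelihood=-200.0, n_params=3, n_obs=100))
```

On this measure, the best alternative model in the table improves on the benchmark by 432.29 - 412.47 = 19.82.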
Top three alternative models in terms of MAPE
• Out-of-sample, one-step-ahead predictions
• 12 periods: July 2010 – June 2011
GT Term in Model | Lag(s) | GT Category | MAPE
lighting | 0 | Home Furnishings | 2.38
lighting | 0 | Lighting | 2.49
Home Furnishings | 0 | N/A | 2.51
Benchmark | | | 3.87
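MAPE (mean absolute percentage error) over a set of out-of-sample predictions can be computed as in the sketch below. It assumes the predictions have already been produced one step ahead, each using only data up to the preceding period; the function name and example values are hypothetical.

```python
def mape(actuals, predictions):
    """Mean absolute percentage error, expressed in percent."""
    if len(actuals) != len(predictions) or not actuals:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actuals, predictions))
    return 100.0 * total / len(actuals)


if __name__ == "__main__":
    # Illustrative values only, not RSI data.
    print(mape([100.0, 200.0], [110.0, 190.0]))
```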
7. Conclusions
• Promising results for some RSI components...
  • Furniture and Lighting
  • Hardware, Paints and Glass
  • Audio Equipment and Recordings
• ...but less so for others
  • All Retail Sales
  • Non-Specialised Food Stores
  • Non-Specialised Non-Food Stores
Additional information is only useful when the RSI series is not dominated by trend and seasonality
8. Considerations
• GT variable selection
• Transitory nature of search queries
• Changes to GT category taxonomy
• Future cost and accessibility of GT data?
• Wider applicability to ONS outputs?
Questions?