NCS’ Continuous Media Optimization with H2O
H2O World
November 11, 2015
Satya Satyamoorthy
Director of Software Development, Nielsen Catalina Solutions
[email protected]
Copyright © 2015 Nielsen Catalina Solutions • Confidential & Proprietary 2 2
The 200 Milliseconds Story - The Life of a Programmatic Ad Impression
eMarketer predicts that advertisers will spend more than $9 billion on Real-Time Bidding (RTB) by 2017.
[Figure: map of US states]
Who is Nielsen Catalina Solutions?
• TV Viewership
• 70+ Million HHs of Daily Frequent Shopper Card Data
• CRM
• All Outlet Purchase
• Verified HHs
• Single Source Anonymous HHs
• Media: Online, Mobile, Television, Radio, Print, Email, In-Store
• Target/Reach & Measure ROI
• Measure ROI
• Target/Reach, Optimize In-Flight & Measure ROI
NCS by the numbers
• More than 50 Media Companies
• More than 100 Agencies
• Over 200 CPG Advertisers
• 300+ Brands
NCS matches what consumers watch ONLINE/TV/Mobile with what they buy
• Watch Data (100+ MM HH): Nielsen Media Data, Nielsen Cookie Pool, Set Top Box Data
• Buy Data (70+ MM HH): Catalina Frequent Shopper Card Data, Nielsen Homescan All-Outlet Data
• ANONYMOUS SINGLE SOURCE HHs
– TV: ~1.4 MM Single Source TV HHDs
– Digital: ~15+ million Single Source HH; ~41+ million Single Source HH
– For Precision Marketing; For Sales Lift Measurement
Brand Specific (UPC-level) Data
• Anything off the bottle or box
• 121,000 Brands/1.5 MM UPCs
• Label elements
• Packaging claims
• Ingredients
• Nutrition information
Household Demographic
• Size of Household
• Gender
• Age
• Child Count
• Education
• Employment Type
• Income
• Marital Status
• Ethnicity
• Language
• Race
• Geography
Shopping Habits
• 52 Week Category Purchase
• 52 Week Brand (UPC) Purchase
• 52 Week Competitive Brand
• Coupon Usage
• Trips/Specific Retailers Usage
Many types of data inform decisions
Attitudes and Behaviors
• Custom Surveys
• Homescan
• Spectra
• Prizm
• Claritas
Lifestyle/Life Stage
• Young Singles
• Established Families
• Older Singles
• New Family
• Pet Ownership
• Appliance Ownership
• Residence
• TV Subscriptions
Best-in-Class Data - Summary
• Largest, most representative CPG dataset – “Scale” from 70 MM HH of shopper data + Homescan for national “all-outlet” representation
• Retailer data is co-mingled for holistic view
• Granularity based on UPC-level shopper data
• Recency/Freshness – daily POS data feeds
• 9 quarters of historical data
• TV data links to Nielsen currency + set-top-box data for scale
• 90%+ access to online HHs
• 10+ years experience, several thousand studies
NCS Closed-Loop Solutions -- Powered by Single Source Data
AdVantics Continuous Media Optimisation
• Audience Selection: Define audiences and predict the potential sales impact
• Audience Activation: Activate audiences via contextual or addressable media
• Measurement & Optimization: Measure delivery and sales response 1) in-flight and 2) post-campaign
NCS Digital Effectiveness Suite
Precision Marketing: Programmatic Audiences
• NCS can build any purchase-based audience definition and sync to DSPs
• Syndicated category or brand buyers for immediate activation on enabled DSPs
In Flight Optimization (Online & Mobile)
• Maximize incremental offline sales conversions in near real time, by continuously refining executional parameters during a campaign with NCS InFlight Measurement
• Any taggable element can be optimized: such as placement types, media units, publishers, creative, content, audience segments, etc.
Sales Lift Measurement (Digital, TV, Cross-Media, Mobile)
• NCS is the CPG leader in Sales Lift Measurement, with over 2,500 digital measurement studies completed to date
• NCS Sales Effect Measurement multivariate test and control methodology is the most rigorous in the industry
• Unique Cross Media Measurement analysis will provide a more accurate estimate of digital advertising ROAS among households also reached by TV
• In-App Mobile Sales Effect Measurement capability – the only Test and Control sales lift measurement for Mobile in the industry
Where We Play in the Programmatic Ecosystem
Publishers, portals, exchanges, ad networks, DSPs, DMPs and agency trading desks
Always On Inflight KPI Measures
• Campaign metrics, days after the exposure, to answer critical questions while the campaign is still in progress
– How many impressions were delivered? Trend by week
– How many impressions were delivered by target and delivery attributes?
– What could I have done to increase my delivery by target?
– How much of my brand sales were influenced by the media, trended by week?
– What did the reached audience look like?
– How much did I pay per HH/volume messaged?
– What else does the reached audience buy?
Sales Lift Measurement
• Did the campaign work for my brand?
• How did the campaign change consumer behavior?
• Was the impact among new or existing buyers?
• How did it impact loyalty?
• How did it impact price sensitivity?
• How did it impact sub brands / total category / competition?
Sales Impact Calculation & Key Drivers
TOTAL SALES = PENETRATION × BUYING RATE
• PENETRATION: How many Brand Buying HHs?
• BUYING RATE: How much are they buying?
– BUYING RATE = PURCHASE FREQUENCY × PURCHASE AMOUNT
– PURCHASE FREQUENCY: How often did HHs buy Brand products during the test period?
– PURCHASE AMOUNT: How much of the products did HHs buy per occasion during the test period?
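A minimal sketch of the driver decomposition on this slide, reading it as total sales = penetration × buying rate, with buying rate = purchase frequency × purchase amount per occasion. The function name and the input values are hypothetical and chosen only for illustration; they are not NCS figures.

```python
def total_sales(penetration_hhs, purchase_frequency, purchase_amount):
    """Total sales = penetration x buying rate.

    Buying rate = purchase frequency x purchase amount per occasion.
    """
    buying_rate = purchase_frequency * purchase_amount
    return penetration_hhs * buying_rate


# Hypothetical inputs: 1,000 buying households, 2.5 purchase occasions each,
# 4.00 dollars per occasion during the test period.
sales = total_sales(penetration_hhs=1000, purchase_frequency=2.5, purchase_amount=4.0)
print(sales)
```

Comparing each driver between exposed and unexposed households shows which one (more buyers, more trips, or bigger baskets) accounts for a change in total sales.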
Sales Impact: For every dollar spent in advertising, $4.42 was delivered in incremental sales
• Number of Exposed Households in the US: Percent of Panel Reached¹ × Total Partner Universe (78,345,000)² = 9,351,673 HHlds Estimated Campaign Reach
• Times incremental sales per household (Exposed − Unexposed): $0.52 Per Exposed HH Incremental $³
• Equals total incremental sales from the campaign: $4,862,869
• Divided by Campaign Spending: $1,100,000
• Equals Incremental $ per $1 spent (Payback): Total Incremental Sales from Campaign ÷ Total Media Spend = $4.42
¹ Percent of Panel Reached = Total Exposed Households ÷ Total Partner/Nielsen matched Panel
² Total Partner Universe = Total Projected US households × Panel Estimated Reach
³ Per Exposed Household Incremental $ = Total incremental $ ÷ Total Exposed Households
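The payback arithmetic on this slide can be checked in a few lines. The figures are the ones shown on the slide, with the per-household value read as $0.52 followed by footnote marker 3. The variable names are illustrative, and the "implied" percent reached is derived here from the two household counts rather than taken from the slide.

```python
# Figures as shown on the slide
partner_universe_hhs = 78_345_000   # Total Partner Universe
exposed_hhs = 9_351_673             # Estimated Campaign Reach
incremental_per_hh = 0.52           # Exposed minus Unexposed, dollars per HH
campaign_spend = 1_100_000          # Total Media Spend, dollars

# Implied share of the universe reached (not stated on the slide)
implied_percent_reached = exposed_hhs / partner_universe_hhs

# Total incremental sales = exposed households x incremental dollars per HH
total_incremental = exposed_hhs * incremental_per_hh

# Payback = incremental dollars per dollar of media spend
payback = total_incremental / campaign_spend
print(round(payback, 2))  # 4.42
```

The product comes to about $4,862,870, which matches the slide's $4,862,869 to within a dollar.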
NCS’ Model Methodology
• Within Super Learner, NCS employs a Gradient Boosting Machine (GBM) and an Elastic Net Regression
– If something is determined to be important, it gets a greater weight
– If something is determined to be unimportant, it gets a lesser weight
• Super Learner picks/creates models
• “How do I estimate Q and how do I estimate g?” One Super Learner (SL) for each
• Cross-validation objectively scores each model OUT of sample
• Rather than just picking the model that looks best within sample, evaluate and use all models
• Each model, even the poor ones, can contribute
• Better models get a greater weight, and the poor ones get a lower weight
• The Ensemble Model is the final creation: the weighted combination of all the models
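The weighting idea can be sketched in plain Python. This is an illustration under stated assumptions, not the NCS or H2O implementation: the two base learners (a mean predictor and a one-variable least-squares line) are simple stand-ins for GBM and Elastic Net, the data are synthetic, and every function name is hypothetical. It shows the two steps the bullets describe: score each learner out of sample with cross-validation, then give the better learner the larger weight.

```python
import random


def fit_mean(xs, ys):
    """Baseline learner: always predicts the training mean."""
    m = sum(ys) / len(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Least-squares line y = a + b * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    a = my - b * mx
    return lambda x: a + b * x


def out_of_fold(fit, xs, ys, k=5):
    """Cross-validated predictions: each row is scored by a model that never saw it."""
    n = len(xs)
    preds = [0.0] * n
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in range(n):
            if i % k == fold:
                preds[i] = model(xs[i])
    return preds


def convex_weight(p1, p2, ys):
    """Weight w in [0, 1] minimising sum of (y - (w * p1 + (1 - w) * p2)) ** 2."""
    num = sum((y - b) * (a - b) for a, b, y in zip(p1, p2, ys))
    den = sum((a - b) ** 2 for a, b in zip(p1, p2))
    w = num / den if den else 0.5
    return min(1.0, max(0.0, w))


def super_learner(xs, ys, k=5):
    """Weight two learners by out-of-sample fit, then refit both on all rows."""
    w = convex_weight(out_of_fold(fit_line, xs, ys, k),
                      out_of_fold(fit_mean, xs, ys, k), ys)
    line, mean = fit_line(xs, ys), fit_mean(xs, ys)
    return w, (lambda x: w * line(x) + (1 - w) * mean(x))


# Synthetic data: a noisy straight line
random.seed(0)
xs = [i / 10 for i in range(100)]
ys = [2.0 * x + 1.0 + random.gauss(0, 0.5) for x in xs]
w, ensemble = super_learner(xs, ys)
print(w, ensemble(5.0))
```

On data like this the line receives nearly all of the weight, but the mean predictor is never discarded outright; with more than two learners the same idea is usually solved as a constrained least-squares problem over all of the out-of-fold predictions.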
NCS’ Model Methodology – contd…
• TMLE (Targeted Maximum Likelihood Estimation) – Tease out an estimated answer based on unbiased assumptions
• Reduce the bias of advertising targeting
• Reduce the impact of the people who would buy anyway
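For reference, the textbook TMLE targeting step for a binary (or [0, 1]-scaled) outcome can be written as below. This is the standard form from the TMLE literature, given as background; the slides do not confirm that NCS uses exactly this variant. Here Q(A, W) is the initial estimate of E(Y | A, W), g(W) is the estimated probability of exposure P(A = 1 | W), and ε is a single coefficient fitted by maximum likelihood with logit Q as an offset.

```latex
\begin{align}
H(A, W) &= \frac{A}{g(W)} - \frac{1 - A}{1 - g(W)} \\
\operatorname{logit} Q^{*}(A, W) &= \operatorname{logit} Q(A, W) + \varepsilon \, H(A, W) \\
\hat{\psi}_{\mathrm{ATE}} &= \frac{1}{n} \sum_{i=1}^{n} \left[ Q^{*}(1, W_i) - Q^{*}(0, W_i) \right]
\end{align}
```

The clever covariate H up-weights households whose exposure status was unlikely given their covariates, which is how the update corrects for advertising targeting.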
The WAY principle
• W is all the independent variables
– Covariates must not be influenced by the outcome
» “Bought on Deal” can’t be used: it indicates the brand was purchased
» Price has to be available for all items, not just the item sold
• A is exposure: either you were exposed or not exposed to the creative
– W to A is which parts about you are causing you to be exposed
• Y is purchase: Yes/No (binary) or how much (continuous)
– A to Y is what we’re trying to ISOLATE: the purchases you make as a result of being exposed
• Predicting A to Y
• WAY is the assumed data structure principle; Q and g are the models built on top of the WAY data structure
Lift Prediction
– Two numbers for predicting lift
• RR is “relative risk”: an average purchase lift
– Q1 / Q0 (Lift Index)
– Q1 is the probability of the event when exposed; Q0 is the probability of the event when not exposed
• ATE (Average Treatment Effect) is absolute risk: the increase in purchase probability averaged over everyone
– Q1 − Q0
– Two ways of looking at the model
• Σ (Q1* − Q0*) / n, summed over all n rows, is the average lift among all the rows
• The Q model is E(Y|A,W), the expected value of Y given A and W; the g model is E(A|W), the expected value of A given W
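A small sketch of how RR and ATE fall out of a fitted Q model on WAY-structured rows. The toy rows, the single two-level covariate, and the cell-mean "models" are assumptions made for illustration; this is a plain plug-in estimate without the TMLE targeting step, not the NCS pipeline.

```python
from collections import defaultdict

# Hypothetical rows of (w, a, y): covariate stratum, exposed 0/1, purchased 0/1
rows = (
    [("heavy", 1, 1)] * 3 + [("heavy", 1, 0)] * 1 +
    [("heavy", 0, 1)] * 1 + [("heavy", 0, 0)] * 1 +
    [("light", 1, 1)] * 1 + [("light", 1, 0)] * 1 +
    [("light", 0, 1)] * 1 + [("light", 0, 0)] * 3
)


def fit_q(rows):
    """Q(a, w) = E(Y | A=a, W=w), estimated here as a simple cell mean."""
    total, count = defaultdict(float), defaultdict(int)
    for w, a, y in rows:
        total[(a, w)] += y
        count[(a, w)] += 1
    return lambda a, w: total[(a, w)] / count[(a, w)]


def fit_g(rows):
    """g(w) = E(A | W=w), the share of each stratum that was exposed."""
    exposed, count = defaultdict(int), defaultdict(int)
    for w, a, _ in rows:
        exposed[w] += a
        count[w] += 1
    return lambda w: exposed[w] / count[w]


def lift(rows):
    """Average Q over every row with A set to 1, then with A set to 0."""
    q = fit_q(rows)
    n = len(rows)
    q1 = sum(q(1, w) for w, _, _ in rows) / n
    q0 = sum(q(0, w) for w, _, _ in rows) / n
    return q1 / q0, q1 - q0  # (RR, ATE)


rr, ate = lift(rows)
g = fit_g(rows)
print(rr, ate, g("heavy"), g("light"))
```

In this toy data heavy buyers are exposed more often than light buyers (g differs by stratum), so a raw exposed-versus-unexposed comparison overstates the lift; averaging Q1 and Q0 over the same set of rows removes that part of the targeting bias.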
H2O Implementation @ NCS
NCS Platform
Lessons
Lessons learnt
• Tried different Super Learner types
– Linear Regression
– Linear with Elastic Net regularization
– GBM » Intense and computationally expensive
– Random Forest » Didn’t outperform GBM
• GBM was most accurate but most expensive
• GLM is fast but not accurate enough
• H2O 3: the Java and REST API/codebase was cleaner and more maintainable for long-term deployments compared to V2
– Went through the pain to migrate, though!
And finally...
Got the best support from H2O team
» Have a dedicated coordinator
» Team jumped to add new features
» Proactive and effective and lightning fast reactive support!
H2O Support - H2O Data Munger
• Goal: Perform ‘join’ and ‘group’ natively in H2O in the same environment as ML
• Use Java as API
• Data:
– Id based: 64-bit integer
– XX Million Ids
– 0 to 5 observations for each id/each day
– 90 to 180 days in test
• Matt Dowle, the Super H2O Developer, scaled up the algorithm (forwards radix) to be parallel and distributed in H2O
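As a rough single-machine illustration of the idea, assuming "forwards radix" refers to a radix sort that works from the most significant byte down, the sketch below sorts 64-bit ids and then groups equal ids. It is not the H2O implementation; the function names and sample ids are hypothetical.

```python
def msd_radix_sort(ids, shift=56):
    """Sort non-negative 64-bit integers, most significant byte first."""
    if len(ids) <= 1 or shift < 0:
        return list(ids)
    buckets = [[] for _ in range(256)]
    for value in ids:
        buckets[(value >> shift) & 0xFF].append(value)
    ordered = []
    for bucket in buckets:
        # Buckets do not depend on each other, so in a distributed setting
        # each one could be sorted by a different worker.
        ordered.extend(msd_radix_sort(bucket, shift - 8))
    return ordered


def group_counts(sorted_ids):
    """Group equal ids in a sorted list and count the observations of each."""
    groups = []
    for value in sorted_ids:
        if groups and groups[-1][0] == value:
            groups[-1][1] += 1
        else:
            groups.append([value, 1])
    return [tuple(g) for g in groups]


# Hypothetical ids with repeated observations
ids = [2**40 + 3, 7, 2**40 + 3, 1, 7, 2**63 - 1]
ordered = msd_radix_sort(ids)
counts = group_counts(ordered)
print(counts)
```

Once both tables are ordered by id, a join reduces to a single merge pass over the two sorted lists, and a group-by reduces to the run-length scan shown in `group_counts`.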