Modeling Process Review - SAS · Database Marketing Applications DataMaApp Inc. 72 Concession 12...
Page 1
Data Driven Results, Maximize your profits, Appeal to your customers
Database Marketing Applications
DataMaApp Inc.
72 Concession 12 East
Tiny, Ontario, Canada L0L 2J0
705-549-0771 fax 705-549-0771
Modelling Process Review
March 2nd 2012
Page 2
DataMaApp - Who are we ?
DataMaApp (Database Marketing Applications)
founded in 1999
DataMaApp strives:
for data-driven results
to maximize your profits
while appealing to your customers
Page 3
DataMaApp - Who are we ?
DataMaApp has over 40 years combined experience applying
statistical data tools on customers’ data to drive positive
business growth and acquisition results.
Experience ranges over a wide variety of clients including the
banking, telco, automotive, research and printing industries
Experience in both the business-to-business and
business-to-consumer spaces
Services range from standard profiling and reporting to more
advanced predictive statistical models, payback metrics and
advanced multivariate test designs
Page 4
DataMaApp - Who are we ?
DataMaApp provides strategic support using data in
four key business areas :
1. How much should I invest ?
2. How should I invest ?
3. Test Design
4. Campaign / Market performance analysis
Page 5
Modelling Process Review
1. Target Universe Definition
2. Variable Creation
3. Variable Reduction
4. Model Development
5. Model Final Decision
6. Model Scoring – ensuring your statistical model will work
Page 6
Target Universe Definition
Analysis of dependent variable to determine optimal response
window and customer universe
Defining and ensuring exclusions
Determining if there is any seasonality
Confirm response counts
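As an illustration of the response-window analysis, the sketch below finds the shortest window (in days since contact) that captures a chosen share of all observed responses. The function name, the `coverage` parameter and the sample data are hypothetical; this is one possible approach, not necessarily the one DataMaApp uses.

```python
def response_window(days_to_response, coverage=0.95):
    """Smallest number of days that captures `coverage` of all responses.

    `days_to_response` holds, for each responder, the number of days
    between the contact date and the response date (hypothetical input).
    """
    if not days_to_response:
        raise ValueError("no responses supplied")
    ordered = sorted(days_to_response)
    n = len(ordered)
    for i, days in enumerate(ordered):
        # (i + 1) / n is the cumulative share of responses seen so far.
        if (i + 1) / n >= coverage:
            return days
    return ordered[-1]


# Made-up example: most responses arrive early, with a long tail.
sample = [1, 2, 2, 3, 5, 8, 13, 21, 34, 90]
window_90 = response_window(sample, coverage=0.9)
```

Plotting the same cumulative share by month of contact is one way to check for seasonality before fixing the window.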
Page 7
Variable creation
Univariate analysis to determine appropriate ranges for variables
Frequency analysis to determine the variable population
CHAID analysis to determine interaction variables
Determine how to deal with missing values (Mean Filled, Missing
Filled, 0 Filled, Large Filled)
Transform variables where appropriate (log, squared, etc.)
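A minimal sketch of the missing-value and transformation options listed above. It assumes missing values are represented as `None`, and it interprets "Missing Filled" as a separate 0/1 missing indicator; both are assumptions, as are all function names and the default `large` value.

```python
import math
from statistics import mean


def fill_missing(values, method="mean", large=999999):
    """Replace None with the mean, 0, or a large sentinel value."""
    present = [v for v in values if v is not None]
    if method == "mean":
        fill = mean(present)  # raises StatisticsError if nothing is populated
    elif method == "zero":
        fill = 0
    elif method == "large":
        fill = large
    else:
        raise ValueError(f"unknown fill method: {method}")
    return [fill if v is None else v for v in values]


def missing_flag(values):
    """0/1 indicator marking which records were missing (assumed meaning)."""
    return [1 if v is None else 0 for v in values]


def log_transform(values):
    """log(1 + x), which stays defined when x is 0."""
    return [math.log1p(v) for v in values]


def squared(values):
    return [v * v for v in values]


spend = [120.0, None, 80.0, 0.0]
spend_mean_filled = fill_missing(spend, "mean")
spend_missing = missing_flag(spend)
spend_log = log_transform(fill_missing(spend, "zero"))
```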
Page 8
Variable reduction
STEP 1
Descriptive Variable Analysis
Review all new variables for accuracy and outliers
Remove sparsely populated variables
Remove those with little variance
STEP 2
Business Intelligence
Variables of importance to the business
Variables that are actionable
Factor Analysis
Analysis of inter-relationships among a large number of variables
PCA (Principal Component Analysis) determines the number of factors
required (scree plot)
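The Step 1 screening can be sketched as a simple filter that drops variables that are sparsely populated or nearly constant. The thresholds, the function name and the use of `None` for missing values are illustrative assumptions, not values taken from this deck.

```python
from statistics import pvariance


def screen_variables(columns, min_populated=0.05, min_variance=1e-8):
    """Return the names of variables that pass both screens.

    `columns` maps a variable name to its list of values, with None
    marking a missing value (hypothetical layout).
    """
    kept = []
    for name, values in columns.items():
        present = [v for v in values if v is not None]
        populated = len(present) / len(values) if values else 0.0
        if populated < min_populated:
            continue  # too few populated records
        if len(present) < 2 or pvariance(present) <= min_variance:
            continue  # little or no variance
        kept.append(name)
    return kept


candidates = {
    "tenure_months": [1, 2, 3, 4],
    "rare_flag": [None, None, None, 5],
    "constant": [7, 7, 7, 7],
}
survivors = screen_variables(candidates, min_populated=0.5)
```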
Page 9
Variable reduction
Correlation
Correlation to dependent
Variable Clustering
Groups like variables into mutually exclusive clusters based on correlation
between variables
Selection Process
Use Business Intelligence, Factor, Correlation, and Clustering to determine
which variables to include in model development
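To illustrate the "correlation to dependent" step, the sketch below ranks candidate variables by the absolute Pearson correlation with the dependent variable. The names and sample data are hypothetical, and in practice this ranking would be weighed together with the business, factor and clustering results rather than used alone.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series carries no linear signal
    return sxy / math.sqrt(sxx * syy)


def rank_by_correlation(columns, dependent):
    """Sort (name, correlation) pairs by absolute correlation, strongest first."""
    scores = {name: pearson(vals, dependent) for name, vals in columns.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


responded = [0, 0, 1, 1]
ranked = rank_by_correlation(
    {"noise": [1, -1, -1, 1], "purchases": [1, 2, 3, 4]},
    responded,
)
```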
Page 10
Model Development
Modelling Techniques
Regression
Uses classical statistics (least squares analysis) to determine which
attributes predict the dependent. (PROC REG, PROC LOGISTIC)
Regression analysis is most effective on continuous, normal data.
Typical technique for customer transactional data modelling (e.g. customer
attrition model, product cross-sell)
Regression modelling has two common pitfalls:
Often we do not have continuous and normal data to build a model on, yet
many statistical analysts proceed as if they do, creating unpredictable model
implementation results.
Regression models are not effective when the dependent is not binary and the
dependent's values have no order relationship to each other (i.e. it is not the
case that 1 is better than 2, which is better than 3).
There is a solution – Bayesian statistics
Page 11
Model Development
Modelling Techniques
Bayesian
Uses Bayesian statistics (conditional probability analysis) to determine
which attributes predict the dependent. (PROC DISCRIM)
Bayesian analysis is most effective on binary data.
The model 'classifies' each customer into the class the customer most likely
falls into
A binary dependent (did the customer do something or not – the most common
model) gives each customer two probability scores, one for each
dependent value
Customers can then be ranked and grouped into likelihood deciles (just
like logistic regression)
Common model applications would include prospect conversion models
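To make the "two probability scores" idea concrete, here is a small naive Bayes classifier for 0/1 predictors. This is a deliberately simple stand-in that applies Bayes' rule directly; it is not what PROC DISCRIM computes, and the function names, Laplace smoothing and sample data are all assumptions for illustration.

```python
def fit_naive_bayes(rows, labels):
    """Estimate class priors and P(feature = 1 | class) with Laplace smoothing.

    rows: list of equal-length lists of 0/1 predictors
    labels: list of 0/1 dependent values
    """
    n_features = len(rows[0])
    model = {}
    for cls in (0, 1):
        members = [r for r, y in zip(rows, labels) if y == cls]
        prior = (len(members) + 1) / (len(rows) + 2)
        rates = [
            (sum(r[j] for r in members) + 1) / (len(members) + 2)
            for j in range(n_features)
        ]
        model[cls] = (prior, rates)
    return model


def score(model, row):
    """Return {0: P(class 0 | row), 1: P(class 1 | row)}."""
    joint = {}
    for cls, (prior, rates) in model.items():
        p = prior
        for x, rate in zip(row, rates):
            p *= rate if x == 1 else 1 - rate
        joint[cls] = p
    total = joint[0] + joint[1]
    return {cls: p / total for cls, p in joint.items()}


# Made-up prospects: [owns_home, opened_last_email]
train_rows = [[1, 1], [1, 0], [0, 0], [0, 1]]
converted = [1, 1, 0, 0]
nb_model = fit_naive_bayes(train_rows, converted)
probabilities = score(nb_model, [1, 0])
```

Ranking customers on the class-1 probability and cutting the ranked list into ten equal groups gives the likelihood deciles mentioned above.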
Page 12
Model Development
Modelling Techniques
Bayesian
Bayesian modelling also provides a powerful tool for placing customers
(prospects) into multiple buckets if the dependent is not binary and there is no
order relationship among the values of the dependent.
An excellent application of this would be to predict which value group a
prospect falls into (RFM, TBS, etc.)
Page 13
Model Development
Modelling Techniques
Combined
Combines both regression and Bayesian techniques (standardized average
score) to determine which attributes predict the dependent.
Combined analysis is most effective on mixed data.
A good example of this would be developing a customer model for which
you have both transaction data and third-party overlay data (income data, what
car they drive, and whatever binary data you may have)
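One reading of "standardized average score" is to convert each model's scores to z-scores and average the two per customer, so that neither model dominates because of its scale. The sketch below follows that reading; the equal weighting and the function names are assumptions.

```python
from statistics import mean, pstdev


def standardize(scores):
    """Convert raw scores to z-scores (mean 0, standard deviation 1)."""
    mu, sigma = mean(scores), pstdev(scores)
    if sigma == 0:
        return [0.0 for _ in scores]  # all scores identical
    return [(s - mu) / sigma for s in scores]


def combined_score(regression_scores, bayesian_scores):
    """Average the standardized scores of the two models, customer by customer."""
    z_reg = standardize(regression_scores)
    z_bay = standardize(bayesian_scores)
    return [(a + b) / 2 for a, b in zip(z_reg, z_bay)]


# The two models score on very different scales (made-up values).
combined = combined_score([0.1, 0.2, 0.3], [10, 20, 30])
```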
Page 14
Final Model Decision
1. Relevant statistical measure
F-stats
Concordance
ANOVA, MANOVA
2. Gains Lift Charts
Observed Dependent activity by Decile
3. Variable Profile
Variable differentiation by Decile
All three steps are utilized to determine optimal model
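The gains/lift view in step 2 can be sketched as follows: rank records by model score, cut them into ten equal groups, and compare each group's observed response rate with the overall rate. The function and field names are hypothetical, and the data is made up.

```python
def decile_gains(scores, responses, n_bins=10):
    """Observed response rate and lift per score decile (decile 1 = highest scores)."""
    ranked = sorted(zip(scores, responses), key=lambda pair: pair[0], reverse=True)
    n = len(ranked)
    overall = sum(responses) / n
    table = []
    for b in range(n_bins):
        chunk = ranked[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue  # fewer records than bins
        rate = sum(r for _, r in chunk) / len(chunk)
        lift = rate / overall if overall else 0.0
        table.append(
            {"decile": b + 1, "count": len(chunk), "response_rate": rate, "lift": lift}
        )
    return table


# 20 records in which the 4 responders happen to hold the 4 highest scores.
example_scores = [i / 20 for i in range(20)]
example_responses = [1 if i >= 16 else 0 for i in range(20)]
gains = decile_gains(example_scores, example_responses)
```

A model that ranks well shows lift well above 1 in the top deciles and a steady decline toward the bottom ones.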
Page 15
Model Scoring – Ensuring your
statistical model will work
1. Apply model scoring algorithm on the universe targeted for the application of
the model
2. Profile the scored universe by decile on the predictive variables in the model
3. Compare the scored variable profile by decile to the model variable profile to
ensure the predictive variables on the scored file closely match the model
sample. If they do not match, possible causes are:
1. The scored universe is different from the model universe
2. The model has aged and may no longer be effective
3. The universe has changed since model development (e.g. acquisition of a
large number of new customers, or a successful new product launch that did not
occur in the model sample)
If the scored decile profiles on the predictive variables do not match the model
sample profiles, the model performance could be suspect
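The decile profile check can be sketched as below: compute the mean of a predictive variable per score decile on both the model sample and the scored file, then flag deciles where the two differ by more than a tolerance. The 10% relative tolerance and all names are illustrative assumptions.

```python
from statistics import mean


def profile_by_decile(scores, values, n_bins=10):
    """Mean of one predictive variable within each score decile (1 = highest scores)."""
    ranked = sorted(zip(scores, values), key=lambda pair: pair[0], reverse=True)
    n = len(ranked)
    profile = []
    for b in range(n_bins):
        chunk = ranked[b * n // n_bins:(b + 1) * n // n_bins]
        profile.append(mean(v for _, v in chunk) if chunk else None)
    return profile


def drifted_deciles(model_profile, scored_profile, tolerance=0.10):
    """Deciles whose scored-file mean differs from the model mean by more than `tolerance`."""
    flagged = []
    for decile, (m, s) in enumerate(zip(model_profile, scored_profile), start=1):
        if m is None or s is None:
            continue
        base = abs(m) if m != 0 else 1.0  # avoid dividing by zero
        if abs(s - m) / base > tolerance:
            flagged.append(decile)
    return flagged


# Made-up example with 5 bins for brevity.
model_profile = profile_by_decile(
    list(range(10)), [float(i * 10) for i in range(10)], n_bins=5
)
flagged = drifted_deciles(model_profile, [86.0, 65.0, 45.0, 25.0, 10.0])
```

An empty result suggests the scored file resembles the model sample on that variable; flagged deciles point to one of the three causes listed above.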
Page 16
Questions ?