Copyright © 2013, SAS Institute Inc. All rights reserved.
PROVEN PRACTICES FOR PREDICTIVE MODELING
MARY-ELIZABETH (“M-E”) EDDLESTONE
PRINCIPAL SYSTEMS ENGINEER, ANALYTICS
CONTRIBUTIONS FROM: DARIUS BAER, DAVID OGDEN, DOUG WIELENGA
BROUGHT TO YOU BY SAS CUSTOMER LOYALTY
BEST PRACTICES DISCLAIMERS
• The choice of “Best Practices” is highly subjective.
• Certain suggested practices may not be suitable for a particular situation.
• It is the responsibility of a data mining practitioner to critically evaluate methods and select the best method for a particular situation.
• This presentation represents the opinions of the contributors.
BEST PRACTICES TO HELP YOU MEET AND EXCEED YOUR GOALS
Faster model development
More useful models
Superior models
BEST PRACTICES FORMAT OF PRESENTATION
• Background & General Guidance
• Developing the Data
• Developing & Delivering the Model
BACKGROUND & GENERAL GUIDANCE
ANALYTICS CYCLE AND THE MODELING PROCESS
ANALYTICS LIFECYCLE
Formulate Problem
Data Preparation
Data Exploration
Transform & Select
Develop Models
Validate Models
Deploy Model
Evaluate & Monitor Model
BEST PRACTICE IT’S ALL ABOUT BALANCE
• Many factors need to be considered and optimized:
  • Time
  • People
  • Money
  • IT Resources
People
Technology
Business Process
DECISION ANALYTICS
Fact-based decision making requires the right technology, talent, processes and culture
ANALYTICS INFRASTRUCTURE
TECHNOLOGY
• BI reporting
• Web portals / dashboards
• Information management
• Problem-specific business solutions
• Predictive analytics
• Hardware
BUSINESS PROCESS
• Continuous Process Improvement
• Planning
• Project methodology
• Standards
PEOPLE
• Vision & Leadership
• Team composition
• Enterprise authority
LIFECYCLE BEST PRACTICE
INVOLVE ALL THE RELEVANT PEOPLE/ROLES
BUSINESS MANAGER
• Domain Expert
• Makes Decisions
• Evaluates Processes & ROI
BUSINESS ANALYST
• Data Exploration
• Data Visualization
• Report Creation
DATA MINER
• Exploratory Analysis
• Descriptive Segmentation
• Predictive Modeling
• Model Validation & Registration
IT/SYSTEMS MANAGEMENT
• Model Validation
• Model Deployment
• Model Monitoring
• Data Preparation
Formulate Problem
Data Preparation
Data Exploration
Transform & Select
Develop Models
Validate Models
Deploy Model
Evaluate & Monitor Model
BEST PRACTICE WEAR MANY HATS
Have a passion to understand not just analytics, but the business and technology.
BEST PRACTICE USE THE TECHNOLOGY AND METHOD THAT FITS THE JOB
Every tool and method has advantages and disadvantages. Whenever possible, select the tool or method that balances long-term goals for the entire process.
BEST PRACTICE BEGIN WITH THE END IN MIND
• What?
• How?
• Who?
• When?
BEST PRACTICE BEGIN WITH THE END IN MIND
• What is the overarching strategic objective/initiative?
• How will the model be used?
• How will it be put into production?
BEST PRACTICE BEGIN WITH THE END IN MIND
• Who will be affected by the use of the model?
• Who needs to be convinced of the value of the model?
• When will the model be used?
BEST PRACTICES BUSINESS CONSIDERATIONS BEFORE YOU MODEL
• Thoroughly understand the business/marketing objectives
• Detail the precise (planned) usage for the output
• Define the target variable (the outcome being modeled/predicted)
• Formulate a theoretical model: Y = f(X1, X2, …); fill in the likely X’s
BEST PRACTICE
SEMMA Process for Model Development
BEST PRACTICE MODELING APPROACH
1. Sample: training set(s), validation set(s), holdout test set
2. Explore: min, max, mean, median, missing values, levels (categorical cardinality)
3. Modify: filtering outliers, reducing cardinality, correcting multicollinearity, imputations, non-linear transformations
BEST PRACTICE MODELING APPROACH
4. Model: variable selection, various model formulations, iterative cycle, insights & client reviews
5. Assess: performance criteria and review
BEST PRACTICE MODELING APPROACH (CONTINUED)
7. Final Assessment & Testing
8. Profile characteristics & indicators
9. Document results
10. Prepare (production-ready) data collection and score code
11. Monitor model performance
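Step 2 (Explore) of the modeling approach lists the summaries worth computing for each input before modifying anything. A minimal Python sketch of those summaries follows. It uses only the standard library; the function names `explore_numeric` and `explore_categorical`, the use of `None` for missing values, and the sample columns are all hypothetical and not part of the original presentation.

```python
from collections import Counter
from statistics import mean, median


def explore_numeric(values):
    """Summarize a numeric column; None marks a missing value (an assumption)."""
    present = [v for v in values if v is not None]
    return {
        "min": min(present),
        "max": max(present),
        "mean": mean(present),
        "median": median(present),
        "missing": len(values) - len(present),
    }


def explore_categorical(values):
    """Count levels (cardinality) and missing values for a categorical column."""
    present = [v for v in values if v is not None]
    counts = Counter(present)
    return {
        "levels": len(counts),
        "missing": len(values) - len(present),
        "counts": dict(counts),
    }


# Hypothetical illustration data
income = [42.0, 55.5, None, 61.0, 38.5]
region = ["N", "S", "S", None, "E"]
print(explore_numeric(income))
print(explore_categorical(region))
```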
DEVELOPING THE DATA
BEST PRACTICES OPTIMIZING DATA
Determining Data
Selecting Target
Preparing Variables
DETERMINING DATA
BEST PRACTICES TECHNICAL CONSIDERATIONS BEFORE MODELING
• Brainstorm all potential input data elements
• Identify source systems, specific data fields, availability/priority/level-of-effort of data
• Finalize data to be collected
BEST PRACTICES TECHNICAL CONSIDERATIONS BEFORE MODELING
• Formulate structure and layout of modeling dataset to be built
• Devil-in-the-details: filters, timeframe of history, etc…
• Build modeling dataset
BEST PRACTICE ALLOW SUFFICIENT TIME FOR ALL ASPECTS
BEST PRACTICE
SEMMA Process for Model Development
SAMPLE
• (Over) Sampling
• Decisioning
• Partitioning
SAMPLING
SAMPLE TO SAMPLE OR NOT?
• Sampling is a valuable tool that can be used to great effect.
• If computing resources are no object, it’s possible to use all data.
• When resource constrained, try increasing sample sizes as model development progresses.
• When model is nearly finalized, try different seeds for samples to ensure model stability.
[Figure labels: SAMPLE, SAMPLE, ALL DATA]
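The advice to try different seeds once the model is nearly final can be sketched as a small stability check: refit the same model on samples drawn with different seeds and look at how much the fitted values move. The Python sketch below uses a one-variable least-squares slope as a stand-in for "the model". The simulated data, the helper names `fit_slope` and `slopes_across_seeds`, and the sample size are hypothetical choices for illustration only.

```python
import random
from statistics import mean


def fit_slope(rows):
    """Ordinary least-squares slope of y on x for (x, y) pairs."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in rows)
    return sxy / sxx


def slopes_across_seeds(data, sample_size, seeds):
    """Refit the same model on samples drawn with different seeds."""
    return {
        seed: fit_slope(random.Random(seed).sample(data, sample_size))
        for seed in seeds
    }


# Hypothetical data: y is roughly 2 * x plus noise
gen = random.Random(0)
data = [(x, 2.0 * x + gen.gauss(0, 1))
        for x in (gen.uniform(0, 10) for _ in range(5000))]

slopes = slopes_across_seeds(data, sample_size=500, seeds=[1, 2, 3, 4, 5])
spread = max(slopes.values()) - min(slopes.values())
print(slopes)
print("spread across seeds:", spread)
```

A small spread relative to the size of the estimate suggests the sample size is adequate; a large spread is a hint to increase the sample before finalizing the model.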
SAMPLE WHAT ABOUT OVERSAMPLING?
• It depends.
• Frequently one needs to oversample in order to allow algorithm(s) to identify effect, especially with rare targets.
• Only oversample as much as you need to in order to obtain a model that makes sense from a business perspective. This is highly subjective.
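One common way to oversample a rare target is to keep every rare-event record and draw only as many common records as are needed to reach a chosen event proportion. The Python sketch below illustrates that idea under stated assumptions: the function `oversample_rare`, its `target_rate` parameter, and the simulated records are hypothetical, and this is one possible approach rather than the presenters' specific method.

```python
import random


def oversample_rare(records, is_event, target_rate, seed=12345):
    """Keep all rare events; sample non-events so events make up about target_rate."""
    events = [r for r in records if is_event(r)]
    non_events = [r for r in records if not is_event(r)]
    # Number of non-events needed so that events / total is about target_rate
    n_non = round(len(events) * (1 - target_rate) / target_rate)
    n_non = min(n_non, len(non_events))
    rng = random.Random(seed)
    sample = events + rng.sample(non_events, n_non)
    rng.shuffle(sample)
    return sample


# Hypothetical data: 2% of 10,000 records are events
records = [{"id": i, "target": 1 if i % 50 == 0 else 0} for i in range(10000)]
balanced = oversample_rare(records, lambda r: r["target"] == 1, target_rate=0.10)
rate = sum(r["target"] for r in balanced) / len(balanced)
print(len(balanced), rate)
```

Because the oversampled data no longer reflects the population event rate, probabilities predicted from it are inflated unless they are adjusted for the true prior proportions.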
SELECTING TARGET
CHOOSING YOUR TARGET
• Choosing the Target
• Response vs. Propensity
• Number of Models
DECISIONING
WEIGHTING YOUR DECISIONS
• Expected Profit
• Decision Boundaries
UNDERSTANDING EXPECTED PROFIT
• Consider this game
  • Flip a fair coin one time
  • If it is heads, you win $10.00
  • Cost of playing one time is $1.00
Do you want to play this game?
E(Profit) = 0.5 * (10 - 1) + 0.5 * (-1) = 4.50 + (-0.50) = 4.00
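The same calculation can be written as a short Python sketch: expected profit is the sum of each outcome's probability times its profit. The function name `expected_profit` and the list-of-pairs layout are hypothetical conveniences; the numbers come from the coin game above.

```python
def expected_profit(outcomes):
    """Sum of probability * profit over (probability, profit) pairs."""
    total_prob = sum(p for p, _ in outcomes)
    if abs(total_prob - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * profit for p, profit in outcomes)


prize, cost = 10.00, 1.00
coin_game = [(0.5, prize - cost), (0.5, -cost)]  # heads, tails
print(expected_profit(coin_game))  # 4.0
```

A positive expected profit is the signal to play, which is the same logic applied later when a model's predicted probabilities are combined with the profit or cost of each decision.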
DECISIONS COMBINING THE WEIGHTS WITH PRIORS
DECISIONS DETERMINING DECISION WEIGHTS
• To determine the amount of weight to assign to the rare event in a binary target, calculate this ratio:
• Ratio = Prob(common event) / Prob(rare event)
• Specify the weight of the rare event to be this ratio
• http://support.sas.com/kb/47/965.html
DECISIONS DETERMINING DECISION WEIGHTS: EXAMPLE
• Consider a binary event where Prob(Yes) = 0.1 and Prob(No) = 0.9
• To determine the amount of weight to assign to the rare event in a binary target, calculate this ratio:
• Ratio = Prob(No) / Prob(Yes) = 0.9 / 0.1 = 9
• Specify the weight of the rare event to be this ratio
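As a minimal sketch, the weight in this example can be computed directly from the rare-event probability. This assumes the ratio is Prob(No) / Prob(Yes), which is the only ratio of the two stated probabilities that equals 9; the function name `rare_event_weight` is hypothetical.

```python
def rare_event_weight(prob_rare):
    """Ratio of the common-event probability to the rare-event probability."""
    if not 0 < prob_rare < 1:
        raise ValueError("prob_rare must be strictly between 0 and 1")
    return (1 - prob_rare) / prob_rare


# Prob(Yes) = 0.1, so Prob(No) = 0.9
print(round(rare_event_weight(0.1), 6))  # 9.0
```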
DECISIONS INCORPORATING PRIORS
• Before fitting model
  • Decision Profile
• After fitting model
  • Decision Node
PARTITIONING
SAMPLE DATA PARTITIONING
PARTITION: ROLE
Training: Used to fit the model
Validation: Used to validate the model and prevent over-fitting
Test: Used to provide unbiased estimate of model performance
SAMPLE SAMPLE: DATA PARTITIONING
WHAT IS OPTIMAL PARTITION?
Training: 40%
Validation: 30%
Test: 30%
BEST PRACTICE SAMPLE: DATA PARTITIONING
WHAT IS OPTIMAL PARTITION?
Training: 60%
Validation: 40%
Test: 0%
BEST PRACTICE SAMPLE: DATA PARTITIONING
WHAT IS OPTIMAL PARTITION?
Training: 70%
Validation: 30%
Test: 0%
It depends!
SAMPLE DATA PARTITIONING CONSIDERATIONS
• How much data is available?
• Is an unbiased measure of model performance required?
• Should test data be in-sample or out-of-sample?
• How many test samples are needed? (e.g. different time periods, different geographies, etc.)
• When should test data be used in the process?
BEST PRACTICE DATA PARTITIONING
• Percentages: frequently used Training/Validation/Test percentages are 50/50/0, 60/40/0 and 70/30/0, with a completely separate Test partition.
• Do not bring Test data into the process until the model is complete. It should not influence the modeling process; it is used only to report performance.
• Multiple Test data sets can be used: consider how the model will be deployed and create representative samples.
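The practice above can be sketched in Python: set the Test data aside first, then split what remains into Training and Validation (60/40 here). The function name `partition`, the seed, and the use of record ids in place of real rows are hypothetical; holding out the last 200 ids as a "later time period" is only an assumption for illustration.

```python
import random


def partition(records, train_frac=0.6, seed=12345):
    """Shuffle a copy of records and split it into Training and Validation."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical data: the Test partition is set aside first and is not
# touched again until the model is complete.
all_ids = list(range(1000))
test_ids = all_ids[800:]       # e.g. a later time period (assumption)
modeling_ids = all_ids[:800]

train_ids, valid_ids = partition(modeling_ids, train_frac=0.6)
print(len(train_ids), len(valid_ids), len(test_ids))  # 480 320 200
```

Fixing the seed makes the split reproducible, and changing it gives a quick way to check that results do not hinge on one particular partition.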
PREPARING DATA
EXPLORE & MODIFY ITERATIVE RELATIONSHIP WITH DATA PREPARATION
Data Prep
Data Exploration
Data Modification
BEST PRACTICE EXPLORE & MODIFY: GETTING THE MOST OUT OF DATA
• Once you have an analytics-ready table:
  • Examine Categorical Variables
  • Examine Continuous Variables
  • Explore Missing Values
  • Cluster Variables
EXPLORE & MODIFY CATEGORICAL VARIABLES
EXPLORE & MODIFY CONTINUOUS VARIABLES
EXPLORE & MODIFY MISSING DATA
• Why is data missing?
• Are there patterns to the missing data within or across variables?
• Imputation methods to consider
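As one example of an imputation method, the Python sketch below fills missing values with the median of the observed values and also records a missing-value indicator, so that any pattern in the missingness stays available to the model. The function name `impute_median`, the use of `None` for missing values, and the sample column are hypothetical; median imputation is just one option among many.

```python
from statistics import median


def impute_median(values):
    """Replace None with the median of observed values; also return a missing flag."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    imputed = [fill if v is None else v for v in values]
    was_missing = [1 if v is None else 0 for v in values]
    return imputed, was_missing


age = [34, None, 51, 29, None, 45]  # hypothetical column
imputed, flag = impute_median(age)
print(imputed)  # [34, 39.5, 51, 29, 39.5, 45]
print(flag)     # [0, 1, 0, 0, 1, 0]
```

In a partitioned workflow the fill value would normally be computed on the Training data and then reused when scoring Validation, Test, and production data.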
EXPLORE & MODIFY CLUSTER VARIABLES
• There is no single answer for clusters
• Design clusters and profiles around themes using smaller set of related variables
SELECTING VARIABLES
EXPLORE & MODIFY VARIABLE SELECTION/REDUCTION TECHNIQUES
• Stepwise Regression
• Variable Selection Node
• Decision Tree Node
• Variable Clustering
• Combined Approach
EXPLORE & MODIFY VARIABLE SELECTION/REDUCTION TECHNIQUES
• Multicollinearity
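A simple first screen for multicollinearity is to flag pairs of inputs whose absolute Pearson correlation exceeds a threshold. The Python sketch below shows that screen only; the function names, the 0.9 threshold, and the sample columns are hypothetical, and pairwise correlation does not catch relationships that involve three or more inputs (variance inflation factors are a more complete check).

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def correlated_pairs(columns, threshold=0.9):
    """Return pairs of inputs whose absolute correlation meets the threshold."""
    flagged = []
    for a, b in combinations(sorted(columns), 2):
        r = pearson(columns[a], columns[b])
        if abs(r) >= threshold:
            flagged.append((a, b, round(r, 3)))
    return flagged


# Hypothetical inputs: income and spend move together, tenure does not
columns = {
    "income": [30, 45, 60, 75, 90],
    "spend": [3.1, 4.4, 6.2, 7.4, 9.1],
    "tenure": [5, 1, 4, 2, 3],
}
print(correlated_pairs(columns))
```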
EXPLORE & MODIFY VARIABLE SELECTION/REDUCTION TECHNIQUES
• Interactions
![Page 61: PROVEN PRACTICES FOR PREDICTIVE · PDF fileproven practices for predictive modeling mary-elizabeth (“m-e”) eddlestone principal systems engineer, ... best practices business considerations](https://reader031.fdocuments.net/reader031/viewer/2022030511/5abb53bf7f8b9a567c8c7d55/html5/thumbnails/61.jpg)
BEST PRACTICES OPTIMIZING DATA
Selecting Target
Determining Data
Preparing Variables
![Page 62: PROVEN PRACTICES FOR PREDICTIVE · PDF fileproven practices for predictive modeling mary-elizabeth (“m-e”) eddlestone principal systems engineer, ... best practices business considerations](https://reader031.fdocuments.net/reader031/viewer/2022030511/5abb53bf7f8b9a567c8c7d55/html5/thumbnails/62.jpg)
DEVELOPING & DELIVERING THE MODEL
![Page 63: PROVEN PRACTICES FOR PREDICTIVE · PDF fileproven practices for predictive modeling mary-elizabeth (“m-e”) eddlestone principal systems engineer, ... best practices business considerations](https://reader031.fdocuments.net/reader031/viewer/2022030511/5abb53bf7f8b9a567c8c7d55/html5/thumbnails/63.jpg)
MODEL & ASSESS DELIVERING THE MODEL
• Developing Your Model
• Choosing a Model
• Deploying the Model
![Page 64: PROVEN PRACTICES FOR PREDICTIVE · PDF fileproven practices for predictive modeling mary-elizabeth (“m-e”) eddlestone principal systems engineer, ... best practices business considerations](https://reader031.fdocuments.net/reader031/viewer/2022030511/5abb53bf7f8b9a567c8c7d55/html5/thumbnails/64.jpg)
DEVELOPING THE MODEL
![Page 65: PROVEN PRACTICES FOR PREDICTIVE · PDF fileproven practices for predictive modeling mary-elizabeth (“m-e”) eddlestone principal systems engineer, ... best practices business considerations](https://reader031.fdocuments.net/reader031/viewer/2022030511/5abb53bf7f8b9a567c8c7d55/html5/thumbnails/65.jpg)
MODEL MODEL DEVELOPMENT
• Regression
• Decision Trees
• Neural Networks
• Ensemble
• Rule Induction
• Something Else?
![Page 66: PROVEN PRACTICES FOR PREDICTIVE · PDF fileproven practices for predictive modeling mary-elizabeth (“m-e”) eddlestone principal systems engineer, ... best practices business considerations](https://reader031.fdocuments.net/reader031/viewer/2022030511/5abb53bf7f8b9a567c8c7d55/html5/thumbnails/66.jpg)
BEST PRACTICE MODEL DEVELOPMENT
• Try various techniques and combinations of techniques.
![Page 67: PROVEN PRACTICES FOR PREDICTIVE · PDF fileproven practices for predictive modeling mary-elizabeth (“m-e”) eddlestone principal systems engineer, ... best practices business considerations](https://reader031.fdocuments.net/reader031/viewer/2022030511/5abb53bf7f8b9a567c8c7d55/html5/thumbnails/67.jpg)
CHOOSING A MODEL
![Page 68: PROVEN PRACTICES FOR PREDICTIVE · PDF fileproven practices for predictive modeling mary-elizabeth (“m-e”) eddlestone principal systems engineer, ... best practices business considerations](https://reader031.fdocuments.net/reader031/viewer/2022030511/5abb53bf7f8b9a567c8c7d55/html5/thumbnails/68.jpg)
BEST PRACTICES MODEL SELECTION
• Evaluate model metrics
• Consider business knowledge
• Recognize constraints
![Page 69: PROVEN PRACTICES FOR PREDICTIVE · PDF fileproven practices for predictive modeling mary-elizabeth (“m-e”) eddlestone principal systems engineer, ... best practices business considerations](https://reader031.fdocuments.net/reader031/viewer/2022030511/5abb53bf7f8b9a567c8c7d55/html5/thumbnails/69.jpg)
ASSESS CUMULATIVE CHARTS
![Page 70: PROVEN PRACTICES FOR PREDICTIVE · PDF fileproven practices for predictive modeling mary-elizabeth (“m-e”) eddlestone principal systems engineer, ... best practices business considerations](https://reader031.fdocuments.net/reader031/viewer/2022030511/5abb53bf7f8b9a567c8c7d55/html5/thumbnails/70.jpg)
ASSESS NON-CUMULATIVE CHARTS
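The chart images for the cumulative and non-cumulative assessment slides did not survive in this transcript. As a rough illustration of what such charts typically plot, the sketch below ranks cases by model score, splits them into equal-sized bins, and reports lift per bin (non-cumulative) and lift up to and including each bin (cumulative). Lift here means the response rate in the group divided by the overall response rate. The scores, outcomes, and bin count are hypothetical, and the code assumes at least one case per bin and at least one event overall.

```python
def lift_table(scores, outcomes, n_bins=5):
    """Rank cases by score (highest first) and report per-bin and cumulative lift."""
    ranked = [y for _, y in sorted(zip(scores, outcomes),
                                   key=lambda pair: pair[0], reverse=True)]
    n = len(ranked)
    overall_rate = sum(ranked) / n
    rows = []
    cum_events = 0
    for b in range(n_bins):
        start, end = b * n // n_bins, (b + 1) * n // n_bins
        bin_events = sum(ranked[start:end])
        cum_events += bin_events
        rows.append({
            "bin": b + 1,
            # Non-cumulative: response rate inside this bin only.
            "lift": (bin_events / (end - start)) / overall_rate,
            # Cumulative: response rate over all cases ranked so far.
            "cumulative_lift": (cum_events / end) / overall_rate,
        })
    return rows


# Hypothetical scored validation cases (1 = event, 0 = non-event).
scores = [0.95, 0.90, 0.80, 0.75, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10]
outcomes = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
rows = lift_table(scores, outcomes)
for row in rows:
    print(row)
```

Cumulative lift always ends at 1.0 in the last bin, since by then every case has been included; the non-cumulative view makes it easier to see where an individual bin stops outperforming the baseline.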
![Page 71: PROVEN PRACTICES FOR PREDICTIVE · PDF fileproven practices for predictive modeling mary-elizabeth (“m-e”) eddlestone principal systems engineer, ... best practices business considerations](https://reader031.fdocuments.net/reader031/viewer/2022030511/5abb53bf7f8b9a567c8c7d55/html5/thumbnails/71.jpg)
DEPLOYING THE MODEL
![Page 72: PROVEN PRACTICES FOR PREDICTIVE · PDF fileproven practices for predictive modeling mary-elizabeth (“m-e”) eddlestone principal systems engineer, ... best practices business considerations](https://reader031.fdocuments.net/reader031/viewer/2022030511/5abb53bf7f8b9a567c8c7d55/html5/thumbnails/72.jpg)
BEST PRACTICES MODEL DEPLOYMENT
• Reporting Results
• Clean up and back up
• Monitor performance
![Page 73: PROVEN PRACTICES FOR PREDICTIVE · PDF fileproven practices for predictive modeling mary-elizabeth (“m-e”) eddlestone principal systems engineer, ... best practices business considerations](https://reader031.fdocuments.net/reader031/viewer/2022030511/5abb53bf7f8b9a567c8c7d55/html5/thumbnails/73.jpg)
BEST PRACTICES MODEL DEPLOYMENT
• Incorporate and share knowledge
• Automate ETL (Extract, Transform, Load)
• Automate process
![Page 74: PROVEN PRACTICES FOR PREDICTIVE · PDF fileproven practices for predictive modeling mary-elizabeth (“m-e”) eddlestone principal systems engineer, ... best practices business considerations](https://reader031.fdocuments.net/reader031/viewer/2022030511/5abb53bf7f8b9a567c8c7d55/html5/thumbnails/74.jpg)
ULTIMATE GOAL
SAS MODEL FACTORY
[Diagram components: SOURCE / OPERATIONAL SYSTEMS, MODEL MANAGEMENT, MODEL DEVELOPMENT, DATA PREPARATION, MODEL DEPLOYMENT]
![Page 75: PROVEN PRACTICES FOR PREDICTIVE · PDF fileproven practices for predictive modeling mary-elizabeth (“m-e”) eddlestone principal systems engineer, ... best practices business considerations](https://reader031.fdocuments.net/reader031/viewer/2022030511/5abb53bf7f8b9a567c8c7d55/html5/thumbnails/75.jpg)
MODEL & ASSESS DELIVERING THE MODEL
• Developing Your Model
• Choosing a Model
• Deploying the Model
![Page 76: PROVEN PRACTICES FOR PREDICTIVE · PDF fileproven practices for predictive modeling mary-elizabeth (“m-e”) eddlestone principal systems engineer, ... best practices business considerations](https://reader031.fdocuments.net/reader031/viewer/2022030511/5abb53bf7f8b9a567c8c7d55/html5/thumbnails/76.jpg)
BEST PRACTICES FORMAT OF PRESENTATION
• Background & General Guidance
• Developing the Data
• Developing & Delivering the Model
![Page 77: PROVEN PRACTICES FOR PREDICTIVE · PDF fileproven practices for predictive modeling mary-elizabeth (“m-e”) eddlestone principal systems engineer, ... best practices business considerations](https://reader031.fdocuments.net/reader031/viewer/2022030511/5abb53bf7f8b9a567c8c7d55/html5/thumbnails/77.jpg)
BEST PRACTICE BE ANALYTICALLY SAVVY AND CREATIVE
analytical / creative
It’s both science and art!
![Page 78: PROVEN PRACTICES FOR PREDICTIVE · PDF fileproven practices for predictive modeling mary-elizabeth (“m-e”) eddlestone principal systems engineer, ... best practices business considerations](https://reader031.fdocuments.net/reader031/viewer/2022030511/5abb53bf7f8b9a567c8c7d55/html5/thumbnails/78.jpg)
Copyright © 2013 SAS Institute Inc. All rights reserved. www.SAS.com
THANK YOU FOR USING SAS!