Measuring Variable Importance with Target Shuffling
![Page 1: Measuring Variable Importance with Target Shuffling · Variable Importance with Target Shuffling. Dean Abbott Abbott Analytics KNIME Fall Summit September 16, 2016 dean@abbottanalytics.com](https://reader033.fdocuments.net/reader033/viewer/2022041609/5e369d9b97a1747eb050b846/html5/thumbnails/1.jpg)
Dean Abbott
Abbott Analytics
KNIME Fall Summit
#KNIMEFallSummit16
September 16, 2016
Twitter: @deanabb
Measuring
Variable Importance
with Target Shuffling
Measuring
Variable Importance
with Target Input Shuffling
Dean Abbott
Co-Founder and Chief Data Scientist and
Chief Technology Officer, SmarterHQ
Twitter: @deanabb
© Abbott Analytics, 2001-2016
A SaaS contextual marketing technology Tier 1 brands use to drive
conversion and loyalty, through multi-channel personalization
AWS: Redshift, MySQL/Aurora, EC2, S3, Kinesis
Why Am I Talking About this
Arcane Topic?
• I’ve been bothered by this for
decades….yes...I’m that old
• It’s conceptually easy to do.
Variable Importance in
Linear Regression
Variable Importance in
Decision Trees
• Decision Trees
• You think they are easy to explain?
Variable Importance in
Neural Networks
• Huh?
Variable Importance in
Neural Networks
• Or what neural
nets really look
like…
Naïve Bayes Model Outputs
• Essentially a series of cross-tabs for every variable!
• Remember, the final probability is the product of the individual variable probabilities.
SVM Output
Neural Networks: Interpretation via Sensitivities
• Sensitivities reflect the amount of change in the outputs when each of the inputs is changed or wiggled some small amount—a larger sensitivity means the output changes more for a small change in the input.
• Provide measure of the importance of each input variable in the model (by itself)
• Can use sensitivities to reduce input variables in other neural network, decision tree, or regression models
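The "wiggle" idea above can be sketched as a finite-difference calculation. This is only an illustration under stated assumptions, not the method of any particular neural network tool: `sensitivities`, `predict`, `delta`, and `toy_model` are hypothetical names, and the toy linear model merely stands in for a trained network.

```python
def sensitivities(predict, record, delta=1e-3):
    """Change in the prediction per unit change in each input, one input at a time."""
    base = predict(record)
    result = []
    for j in range(len(record)):
        wiggled = list(record)
        wiggled[j] += delta  # wiggle only input j by a small amount
        result.append((predict(wiggled) - base) / delta)
    return result


# Hypothetical stand-in for a trained model (assumption: any callable works).
def toy_model(record):
    return 0.5 * record[0] + 0.2 * record[1] + 0.3 * record[2]


sens = sensitivities(toy_model, [20.0, 20.0, 20.0])
```

A larger entry in `sens` means the output moves more for the same small change in that input. For a nonlinear model the values would depend on which record is wiggled.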
KNIME Random Forest Node
Helps with Importance
Outline
• Classical variable importance: linear regression
• Hack #1: use linear regression model statistics to
infer variable importance
• Hack #2: use target shuffling to infer variable
importance
The Data: Easiest Possible!
• 3 inputs: each is a random Normal: mean = 20, std = 5
• Target variable: 0.5*var1 + 0.2*var2 + 0.3*var3
• 95,412 records (same size as cup98lrn)
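A data set like the one described above could be generated roughly as follows. This is a sketch, not the workflow used in the talk: the function name, the seed, and the smaller record count used in the call (chosen to keep the example fast) are assumptions.

```python
import random
import statistics


def make_idealized_data(n_records=95_412, seed=0):
    """Three independent Normal(20, 5) inputs and an exact linear target."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_records):
        var1, var2, var3 = (rng.gauss(20, 5) for _ in range(3))
        target = 0.5 * var1 + 0.2 * var2 + 0.3 * var3  # no noise term
        rows.append((var1, var2, var3, target))
    return rows


# Assumption: 10,000 records instead of 95,412, purely for speed.
rows = make_idealized_data(n_records=10_000)
mean_var1 = statistics.fmean([row[0] for row in rows])
```

Because the target is an exact weighted sum with no noise, any correct importance measure should rank var1 first, var3 second, and var2 third.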
Let’s Start with Normal
Variable Importance Using
Linear Regression Coefficient
• Coefficients match (by definition) the proportions used to
build the target variable
• This is the average influence of each input on the
predictions for all records
t-proportion
For Each Variable to Assess Influence
• T-value measures the significance of the relationship.
• It turns out that the proportions of the t-values for the exact model
match the coefficients
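One way to see why the t-values line up with the coefficients on this data is sketched below. It is a reasoning sketch under assumptions not stated on the slide: the inputs are uncorrelated with equal spread, and there is a small nonzero residual variance so that the t-values are finite. The symbols $\hat\beta_j$ (fitted coefficient), $s_j$ (sample standard deviation of input $j$), $\hat\sigma$ (residual standard error), and $n$ (number of records) are notation introduced here.

```latex
t_j = \frac{\hat\beta_j}{\operatorname{SE}(\hat\beta_j)},
\qquad
\operatorname{SE}(\hat\beta_j) \approx \frac{\hat\sigma}{s_j\sqrt{n-1}}
\quad\text{(uncorrelated inputs)}

s_1 = s_2 = s_3
\;\Longrightarrow\;
\frac{|t_j|}{\sum_k |t_k|} \approx \frac{|\hat\beta_j|}{\sum_k |\hat\beta_k|}
```

With coefficients 0.5, 0.2, and 0.3 the right-hand side is simply 0.5, 0.2, and 0.3. When the inputs have different spreads or are correlated, the standard errors differ and the two sets of proportions no longer have to agree.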
Variable Importance Using
Prediction Proportion
• How would an empiricist compute influence?
1. Compute the proportion of the prediction that comes from each term in the model
   1. Influence of variable 1 = W1 * var1
   2. Influence of variable 2 = W2 * var2
   3. Influence of variable 3 = W3 * var3
2. Average the influences over all records
Variable Importance Using
Prediction Proportion
• Compute the contribution of each term in the linear regression model separately (each record).
• Var1_influence = $var1coef$ * $var1$, etc.
• Express each contribution as a proportion of the predicted target variable value
• Average these proportions over all records to compute the average influence of each variable
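The prediction-proportion steps can be sketched in a few lines. This is a hedged illustration on randomly generated idealized data rather than the KNIME workflow from the talk; `prediction_proportions`, the seed, and the sample size are assumptions.

```python
import random
import statistics

WEIGHTS = (0.5, 0.2, 0.3)  # coefficients of the idealized model


def prediction_proportions(record, weights=WEIGHTS):
    """Share of one record's prediction contributed by each term."""
    contributions = [w * x for w, x in zip(weights, record)]
    prediction = sum(contributions)
    return [c / prediction for c in contributions]


# Assumption: a small random sample (seed and size are arbitrary choices).
rng = random.Random(0)
records = [[rng.gauss(20, 5) for _ in WEIGHTS] for _ in range(5_000)]

# One list of proportions per record, then the average for each variable.
per_record = [prediction_proportions(r) for r in records]
average_influence = [statistics.fmean(col) for col in zip(*per_record)]
```

On this data the averages should land close to, though not exactly on, 0.5, 0.2, and 0.3, because an average of ratios is not the same as a ratio of averages. The division also assumes the prediction is never near zero, which holds here but not in general.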
So Far So Good
• Now let’s do the same
analysis for
• Neural Networks
• Support Vector
Machines.
• Uh.....maybe not
Do it the YACK way
• Yet
• Another
• Creative
use of
• KNIME
Why “Target Shuffling”?
• We don’t always have nice
metrics to identify the best
inputs with predictive models
(NNets, SVM, … anything
other than regression!)
• Even with regression, we don’t
always have nice inputs
• See John Elder’s introduction
of Target Shuffling to the data
mining community
http://semanticommunity.info/@api/deki/files/30744/Elder_-_Target_Shuffling_Sept.2014.pdf
Input Distributions Are
Not Always Ideal
Why “Target Shuffling”?
• Don’t care about the “target” part
• The Target shuffling node doesn’t care either
• Scramble (randomly) a single (input variable) column
• Target Shuffling Node doesn’t have to be in a loop; it can scramble a column while leaving the others in their natural order
• Captures the actual distribution of the data
Let’s call it
Input Shuffling
Principles of Input Shuffling
• Key: randomly re-select the value of a single input variable while
leaving all other variables at their original values
• Compute the standard deviation (or some other measure of
perturbation) for each record
• Of the Target Variable Predictions
• NOT the actual target variable
• This perturbation is a measure of how influential the variable is in
the model
• High standard deviation -> lots of influence
• Low standard deviation -> not much influence
• ~0 standard deviation -> no influence
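The principles above can be sketched as a single function. This is an illustrative sketch and not the KNIME meta node from the talk: `input_shuffling_influence`, `predict`, `toy_model`, the seeds, and the 500-record sample are all assumptions, and the population standard deviation is just one possible measure of perturbation.

```python
import random
import statistics


def input_shuffling_influence(predict, rows, col, n_shuffles=50, seed=0):
    """Average per-record standard deviation of predictions when one input is shuffled."""
    rng = random.Random(seed)
    column = [row[col] for row in rows]
    predictions = [[] for _ in rows]  # one list of predictions per record
    for _ in range(n_shuffles):
        shuffled = column[:]
        rng.shuffle(shuffled)  # scramble only this column
        for i, value in enumerate(shuffled):
            perturbed = list(rows[i])  # all other inputs keep their values
            perturbed[col] = value
            predictions[i].append(predict(perturbed))
    # Spread of the model's predictions, not of the actual target.
    per_record_std = [statistics.pstdev(p) for p in predictions]
    return statistics.fmean(per_record_std)


# Hypothetical stand-in for any trained model.
def toy_model(record):
    return 0.5 * record[0] + 0.2 * record[1] + 0.3 * record[2]


rng = random.Random(1)
rows = [[rng.gauss(20, 5) for _ in range(3)] for _ in range(500)]
influence = [input_shuffling_influence(toy_model, rows, col) for col in range(3)]
```

Because `predict` is only ever called as a black box, the same function would apply to a neural network, SVM, or tree ensemble. An input the model ignores produces identical predictions on every shuffle and therefore a standard deviation of zero.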
Shuffled Inputs Meta Node
Two Loops: (1) loop on input variables and (2) loop on shuffled input variable (50x or so)
Shuffling Inputs
All inputs and target
Just 1 input at a time
Single Record:
What it looks like
• Single Record: 50 “shuffles”: Row0
Average for All Records in data
(~9K for this data set)
• Measures the spread of the predictions when randomly
perturbing the single input variable
Variable Importance Using
Input Shuffling for
Idealized Linear Regression Data
• Compute proportion of the average standard deviation from shuffling
the input (keeping others with the original values)
• (yes, I know I’m averaging standard deviations!)
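For a linear model the expected result can be worked out by hand. The sketch below assumes many shuffles and uses notation introduced here, not on the slides: $w_j$ is the coefficient of input $j$, $\sigma_j$ its standard deviation, and $I_j$ its share of the total shuffling influence.

```latex
\hat{y} = \sum_k w_k x_k
\;\Longrightarrow\;
\operatorname{sd}\big(\hat{y} \mid \text{shuffle } x_j\big) = |w_j|\,\sigma_j

I_j = \frac{|w_j|\,\sigma_j}{\sum_k |w_k|\,\sigma_k}
\;\overset{\sigma_1=\sigma_2=\sigma_3}{=}\;
\frac{|w_j|}{\sum_k |w_k|}
\qquad\Rightarrow\qquad
(I_1, I_2, I_3) = (0.5,\ 0.2,\ 0.3)
```

So on the idealized data, where every input has the same spread, input shuffling should reproduce the coefficient proportions. When the spreads differ, the proportions track $|w_j|\,\sigma_j$ rather than the raw coefficients.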
Realistic Data:
KDD Cup 1998
• 95,412 records: cup98lrn from KDD Cup 1998 Competition
• Use only the responders (4843) in linear regression models
• Hundreds of fields in data, but only use 4 for research
purposes
• LASTGIFT, NGIFTALL,
RFA_2F, D_RFA_2A
• Continuous target
• Two continuous
• One ordinal (RFA_2F)
• One dummy (D_RFA_2A)
Realistic Data:
KDD Cup 1998
• Heavy skew of LASTGIFT, NGIFTALL, TARGET_D
• Makes visualization difficult
• Biases regression coefficients (if one cares)
Could Use Normalized Data
• To remove influence of skew and scale
• Log10 transform LASTGIFT, NGIFTALL, TARGET_D
• Scale all variables (post log10) to [0, 1]
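The two normalization steps can be sketched as below. This is a minimal illustration, assuming every value is strictly positive (log10 is undefined otherwise); the function name and the sample gift amounts are made up for the example and are not taken from the KDD Cup data.

```python
import math


def log10_then_minmax(values):
    """Log10-transform a skewed column, then rescale the result to [0, 1]."""
    logged = [math.log10(v) for v in values]  # raises ValueError if v <= 0
    lo, hi = min(logged), max(logged)
    if hi == lo:
        return [0.0 for _ in logged]  # constant column: nothing to scale
    return [(x - lo) / (hi - lo) for x in logged]


# Hypothetical, heavily skewed gift amounts.
gifts = [1.0, 10.0, 100.0, 1000.0]
scaled = log10_then_minmax(gifts)
```

The log compresses the long right tail, and the min-max step puts every variable on the same scale, so influence scores are no longer dominated by variables with large raw ranges.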
Normalized Data
• Relationships clearer
• LASTGIFT strong positive correlation with TARGET_D
• NGIFTALL, RFA_2F, D_RFA_2A all have apparently slight negative correlation with TARGET_D
The Basic Model:
Linear Regression
Coefficient
Use abs() for all calculations
Linear Regression: Compare
Influence Using Different Methods
[Figure panels: Coefficient, T-Proportion, Prediction Proportion, Input Shuffling]
Use abs() for all calculations
Linear Regression, Neural Network, and
Random Forest: Input Shuffling Influence
[Figure panels: Input Shuffling - LR, Input Shuffling - MLP]
Repeat for More Inputs –
KDD Cup 98
Apply Input Shuffling to
Larger KDD Cup 98 Data
Shuffle LASTGIFT_log10
Variable Influence from
Regression Diagnostics
Input Shuffling Variable Influence:
Regression
currentColumnName VariableInfluence_Linear VariableInfluence_RF VariableInfluence_GBM
D_RFA_2A 0.0518 0.0139 0.0051
LASTGIFT_log10 0.0477 0.0383 0.0596
E_RFA_2A 0.0426 0.0155 0.0153
F_RFA_2A 0.0266 0.0105 0.0037
MINRAMNT_log10 0.0077 0.0127 0.0113
RFA_2F 0.0073 0.0122 0.0063
A_GEOCODE2 0.0060 0.0020 0.0008
B_GEOCODE2 0.0057 0.0011 0.0002
MINRDATE 0.0040 0.0061 0.0085
NGIFTALL 0.0038 0.0075 0.0066
MAXRDATE 0.0028 0.0035 0.0044
C_GEOCODE2 0.0025 0.0005 0.0000
NUMPRM12 0.0024 0.0022 0.0033
DOMAIN3 0.0021 0.0008 0.0009
CARDPM12 0.0016 0.0026 0.0037
LASTDATE 0.0005 0.0029 0.0018
AGE_imputerand 0.0004 0.0029 0.0046
DOMAIN2 0.0002 0.0012 0.0002
NUMPROM 0.0001 0.0036 0.0067
DOMAIN1 0.0000 0.0000 0.0000
Accuracy Comparison on
Testing Data
[Figure panels: Linear Regression, Random Forests, Gradient Boosting]
Input Shuffling Variable Influence:
Regression (Unnormalized!)
currentColumnName VariableInfluence_Linear VariableInfluence_RF VariableInfluence_GBM
E_RFA_2A 4.337 0.807 0.396
LASTGIFT 4.052 2.252 4.016
D_RFA_2A 3.566 0.625 0.245
F_RFA_2A 3.552 0.457 0.000
RAMNTALL 2.429 0.540 1.239
NGIFTALL 2.258 0.692 0.957
MINRAMNT 2.111 0.708 0.722
RFA_2F 1.274 0.618 0.480
FISTDATE 0.970 0.298 0.731
A_GEOCODE2 0.754 0.130 0.086
B_GEOCODE2 0.519 0.082 0.017
DOMAIN3 0.362 0.052 0.066
DOMAIN1 0.358 0.080 0.036
C_GEOCODE2 0.307 0.028 0.000
NUMPRM12 0.304 0.154 0.262
DOMAIN2 0.289 0.072 0.028
MAXRDATE 0.213 0.297 0.444
MINRDATE 0.200 0.345 0.455
CARDPM12 0.178 0.139 0.296
AGE_imputerand 0.174 0.202 0.363
MAXRAMNT 0.168 1.791 1.547
LASTDATE 0.036 0.240 0.243
Input Shuffling Variable Influence:
Classification
Variable VariableInfluence_Logistic VariableInfluence_RF VariableInfluence_GBM
FISTDATE 0.0123 0.0349 0.0124
D_RFA_2A 0.0080 0.0024 0.0027
RFA_2F 0.0080 0.0176 0.0040
DOMAIN3 0.0072 0.0056 0.0057
E_RFA_2A 0.0069 0.0069 0.0055
NGIFTALL 0.0057 0.0347 0.0180
DOMAIN1 0.0011 0.0084 0.0013
LASTGIFT 0.0004 0.0236 0.0132
F_RFA_2A 0.0003 0.0103 0.0001
Discussion
• Why Input Shuffling is good
  • Works for any input distribution
  • Works with any algorithm
  • Measures importance based on other input variables in natural patterns rather than an idealized value (like the mean or mode)
  • Can use many metrics to measure what “importance” means to you
• Why Input Shuffling is not so good
  • Takes a long time to run if you have lots of inputs, lots of records
  • No statistically defensible metric to use (yet)
Conclusion
• Variable influence can be computed as a single
• Coefficients aren’t good measures unless the variables conform to linear
regression assumptions
• Some models don’t have “coefficients” at all so we can’t use the linear regression
approach
• Using target shuffling, we can generate randomized sensitivity scores easily for any
model
• If inputs are not normally distributed, average overall influence doesn’t tell
the full story (or may even tell a misleading story) about how valuable the
variable is in predicting the target
• Breaking predictions into bins (deciles or other number of bins) allows us to
compute an influence score for every part of the predicted range
• Answers the question: for high predicted values, which variables are most
influential
Binning Predicted Values
into Buckets (Deciles, Quintiles,…)
• Deciling predicted values allows us to compute variable influence for each of these ranges of the predicted values. Note that the top and bottom bins have much larger variances.
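Binned influence can be sketched by computing the shuffling spread for each record and then averaging it within equal-sized bins of the predicted value. This is an illustrative sketch, not the workflow from the talk: the function names, the toy interaction model, the seed, and the choice of 30 shuffles are all assumptions.

```python
import random
import statistics


def per_record_spread(predict, rows, col, n_shuffles=30, seed=0):
    """Standard deviation of each record's predictions when one input is shuffled."""
    rng = random.Random(seed)
    column = [row[col] for row in rows]
    predictions = [[] for _ in rows]
    for _ in range(n_shuffles):
        shuffled = column[:]
        rng.shuffle(shuffled)
        for i, value in enumerate(shuffled):
            perturbed = list(rows[i])
            perturbed[col] = value
            predictions[i].append(predict(perturbed))
    return [statistics.pstdev(p) for p in predictions]


def influence_by_bin(predict, rows, col, n_bins=10):
    """Average shuffling spread within equal-sized bins of the predicted value."""
    spreads = per_record_spread(predict, rows, col)
    # Record indices ordered from lowest to highest original prediction.
    order = sorted(range(len(rows)), key=lambda i: predict(rows[i]))
    bins = []
    for b in range(n_bins):
        start = b * len(order) // n_bins
        stop = (b + 1) * len(order) // n_bins
        bins.append(statistics.fmean([spreads[i] for i in order[start:stop]]))
    return bins


# Hypothetical model with an interaction, so influence varies across the range.
def toy_model(record):
    return record[0] * (1.0 + record[1])


rng = random.Random(0)
rows = [[rng.random(), rng.random()] for _ in range(400)]
influence_of_x1_by_decile = influence_by_bin(toy_model, rows, col=1)
```

The returned list runs from the lowest-prediction bin to the highest. In this toy model the second input matters more when the first input is large, so its influence should grow toward the top bins, a pattern that a single overall average would hide.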
LASTGIFT Influence
• LASTGIFT has stronger influence (positive) at the high end of predictions
• Significant influence for all predicted values
• Nearly constant influence for Bins 7-10
• Monotonic influence vs. predicted values
RFA_2F Influence
• RFA_2F has stronger influence (negative) at the low end of predictions
• Almost no influence for Bin 7 – Bin 10
• Monotonic influence vs. predicted values
NGIFTALL Influence
• NGIFTALL has stronger influence (negative) at the low end of predictions
• Mostly monotonic influence vs. predicted values
D_RFA_2A Influence
• D_RFA_2A has strong influence at the low end of predictions only (Bin 1 and Bin 2)
• No influence at all for Bin 3 through Bin 10