Hubbard Decision Research
The Applied Information Economics Company
Bootstrap Hints

Overview of Bootstrapping Hints
The objective of a good bootstrap model is to be a realistic model of the judges' intuitive judgments that is even more accurate than the judges themselves
The measure of effectiveness in this area is R squared
Roughly, R squared is the percentage of variance explained by the model
These hints should help improve R squared
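As a rough illustration of the measure described above, the sketch below computes R squared as 1 minus the ratio of unexplained to total variation. The function name `r_squared` and the sample scores are illustrative assumptions, not data from these slides.

```python
def r_squared(actual, predicted):
    """Share of the variation in `actual` that `predicted` explains."""
    mean_actual = sum(actual) / len(actual)
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    ss_residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1.0 - ss_residual / ss_total

# Hypothetical averaged evaluator scores and the model's estimates of them.
judge_scores = [0.20, 0.35, 0.50, 0.65, 0.80]
model_scores = [0.25, 0.30, 0.50, 0.70, 0.75]
print(round(r_squared(judge_scores, model_scores), 3))
```

A value near 1 means the model reproduces the evaluators' scores closely; a value near 0 means it explains almost none of the variation.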

Strategies for Improving R Squared
Hints for choosing the right variables
Hints for improving data gathering
Hints for improving quantification
Hints for finding higher-order variables

Hints for Choosing Variables
For some commonly bootstrapped variables, such as Confidence Index and Cancellation Probability, these variables may be considered:
  Project cost and/or duration
  Is it a compliance project and/or is the project a documented strategic requirement?
  What is the scope of the business covered? (e.g., number of departments involved, number of users, etc.)
  Sponsor characteristics such as level, whether the sponsor is business or IT, or the sponsor's success record in past projects
  Whether the investment is new software development, package modification, upgrades to previous systems, hardware only, etc.
  Technology risk such as proven track records, IT familiarity with the technology, the maturity of the technology
Watch how many variables are added: going much beyond 8 variables starts to become unproductive and may degrade the accuracy of the model, so stick to the important ones

Data Gathering Hints
You will probably always get a higher R squared when averaging larger groups
Be sure to allow time for calibration
Use a trial bootstrap list that the evaluators discuss as a group
They can check results with “pair-wise comparisons”: they pick pairs of investments at random, determine which they would prefer, then confirm that their evaluator scores reflect this
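The pair-wise check described above could be sketched as follows. This is one possible reading of the procedure, not a documented method: the function name `pairwise_check`, the investment names, and the scores are all hypothetical.

```python
import random

def pairwise_check(scores, prefers, n_pairs=10, seed=0):
    """Return the sampled pairs whose scores disagree with the stated preference.

    scores  -- dict mapping investment name to evaluator score
    prefers -- function (a, b) -> the investment the evaluators say they prefer
    """
    rng = random.Random(seed)
    names = sorted(scores)
    mismatches = []
    for _ in range(n_pairs):
        a, b = rng.sample(names, 2)  # pick a random pair of investments
        higher_scored = a if scores[a] >= scores[b] else b
        if prefers(a, b) != higher_scored:
            mismatches.append((a, b))
    return mismatches

# Hypothetical scores and a stated preference order (best first).
scores = {"A": 0.9, "B": 0.6, "C": 0.3}
ranking = ["A", "B", "C"]

def prefers(a, b):
    return min(a, b, key=ranking.index)

print(pairwise_check(scores, prefers))
```

An empty list means every sampled pair was scored in the same order the evaluators prefer; any pair returned is worth discussing with the group.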

Hints for Quantifying Variables
Regression assumes that each variable's relationship to the output is basically linear
Reviewing each variable for non-linearity and finding a way to make it linear will improve R squared
Variables that can be captured as 0 or 1 (binary) need no review
Continuous variables need to be graphed to check for non-linearity
Discrete variables that are not binary require pivot table analysis (see pivot table procedure for details)
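The pivot table procedure itself is not reproduced in these slides. A plausible core step, averaging the bootstrapped output for each level of a discrete variable, might look like the sketch below; the function name `mean_by_level` and the sample data are assumptions made for illustration.

```python
from collections import defaultdict

def mean_by_level(levels, outputs):
    """Average the bootstrapped output for each level of a discrete variable."""
    totals = defaultdict(lambda: [0.0, 0])  # level -> [running sum, count]
    for level, out in zip(levels, outputs):
        totals[level][0] += out
        totals[level][1] += 1
    return {level: s / n for level, (s, n) in totals.items()}

# Hypothetical investment types and evaluator scores.
types = ["new", "upgrade", "new", "hardware", "upgrade"]
scores = [0.40, 0.70, 0.50, 0.90, 0.80]
print(mean_by_level(types, scores))
```

Levels with similar averages are candidates for being merged into a single value.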

Continuous Variables
One way to improve R squared is to convert your non-linear variables into linear variables
To check which variables are non-linear, make an XY graph with the continuous variable on the X axis and the bootstrapped variable (from the evaluators) on the Y axis
If you find an obviously non-linear relationship, you can change the variable so that it becomes linear
Depending on how the graph looks, you can take the appropriate steps

Linear
This is an obvious linear relationship; leave it just as it is
[Figure: XY plot of a linear relationship; both axes run from 0% to 100%]

Scattered Distribution
If the XY plot is not obviously non-linear, then just leave it as it is
If the Excel regression output indicates that this variable has little or no effect, consider removing it
[Figure: XY plot of scattered points; vertical axis 0% to 6%, horizontal axis 0% to 7%]

Clustered Distribution
Here, a “threshold” would be the best quantification of this variable
Instead of being linear, this variable appears to make a difference only when it is above or below a certain value (in this case, about 6% on the horizontal scale)
Try converting the continuous variable to a binary. In this case you would use “=if(x<.06, 1,0)”
[Figure: XY plot of clustered points; vertical axis 0% to 16%, horizontal axis 0% to 14%]
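For readers working outside a spreadsheet, the threshold formula above translates directly. The function name `below_threshold` and the sample values are illustrative; the 0.06 cutoff is the one given in the slide.

```python
def below_threshold(x, threshold=0.06):
    """Python equivalent of the spreadsheet formula =if(x<.06, 1, 0)."""
    return 1 if x < threshold else 0

# Hypothetical values of the continuous variable.
values = [0.02, 0.05, 0.06, 0.09]
print([below_threshold(v) for v in values])  # [1, 1, 0, 0]
```

A value exactly at the cutoff maps to 0 because the comparison is strict, matching the spreadsheet formula.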

Upward Sloping
If the graph slopes upward, then you might try setting the scale of the X axis to “logarithmic”
If this makes it look linear then use the formula “=log(X)”
If that doesn’t work try “=X^.5” or some other power of X less than 1
[Figure: XY plot; vertical axis 0% to 90%, horizontal axis 0 to 20]
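Instead of judging the graph by eye, one could compare how well a straight line fits each candidate transformation. The sketch below does this with the squared Pearson correlation; the helper `pearson_r` and the data are assumptions made for illustration. It uses a base-10 logarithm, which is what Excel's LOG returns by default, although any base gives the same straightness.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data in which the output rises with log10(X).
xs = [1, 2, 5, 10, 20]
ys = [0.10 + 0.60 * math.log10(x) for x in xs]

candidates = {
    "X": xs,
    "log(X)": [math.log10(x) for x in xs],
    "X^.5": [x ** 0.5 for x in xs],
}
for name, transformed in candidates.items():
    # Squared correlation = R squared of a one-variable straight-line fit.
    print(name, round(pearson_r(transformed, ys) ** 2, 3))
```

The transformation with the highest value is the one that makes the relationship most nearly linear for this data.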

Leveling Off
Try setting the scale of the Y axis to “logarithmic”
If this makes it look linear then use “=exp(X)”
If it doesn’t work, try “=X^2” or some other power of X
[Figure: XY plot; vertical axis 0% to 90%, horizontal axis 0% to 300%]
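The logarithmic Y axis check can also be done numerically: if log(Y) plotted against X is close to a straight line, the output is roughly exponential in X. The sketch below assumes hypothetical data that is exactly proportional to exp(X); the helper name `r_squared_line` is also an assumption.

```python
import math

def r_squared_line(xs, ys):
    """R squared of a straight-line fit of ys on xs (squared Pearson correlation)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)

# Hypothetical data: X from 0% to 300%, output proportional to exp(X).
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [0.04 * math.exp(x) for x in xs]

raw_fit = r_squared_line(xs, ys)                              # untransformed X
log_axis_fit = r_squared_line(xs, [math.log(y) for y in ys])  # log-scale Y axis
exp_fit = r_squared_line([math.exp(x) for x in xs], ys)       # X replaced by exp(X)
print(round(raw_fit, 3), round(log_axis_fit, 3), round(exp_fit, 3))
```

With real data the fit will not be perfect, so the comparison against the untransformed variable is what matters.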

Downward Sloping
Try setting the scale of the Y axis to “logarithmic”
If this makes it look linear then use “=exp(x)”
If it doesn’t work, try “=1/X”
[Figure: XY plot; vertical axis 0% to 90%, horizontal axis 0% to 300%]
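The “=1/X” fallback can be illustrated with a one-variable least-squares fit on the transformed variable. The data below is hypothetical and constructed so that the output is exactly linear in 1/X; the helper `fit_line` is likewise an assumption.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for ys = intercept + slope * xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical downward-sloping data: output = 0.10 + 0.20 / X.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [0.10 + 0.20 / x for x in xs]

# Regress on the transformed variable 1/X instead of on X itself.
intercept, slope = fit_line([1.0 / x for x in xs], ys)
print(round(intercept, 3), round(slope, 3))
```

Note that 1/X is undefined at X = 0, so any zero values would need separate handling before applying this transformation.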

Hints for Higher-Order Terms
After your first attempt at a regression, you may improve your R squared by adding some “higher-order” variables
Higher-order variables include products of other variables, conditional statements involving other variables, etc.
To find potential candidates for higher-order terms, ask yourself if the importance of some variables depends on the values of other variables
Try several new terms and plot each one. If there appears to be an obvious linear relationship, then add it
If you make a higher-order variable, run a new regression, and the R squared is higher, it was probably a good choice

Continuous Higher-Order Terms
If the importance of one variable depends on the value of another, and they are both continuous, try the following (we’ll call these two variables X and Y)
If the bootstrapped variable should increase when both X and Y are high (or when both are low) then try “=X*Y”
If the bootstrapped variable should increase when one variable is high and the other is low then try “=X/Y”
If X is especially important when Y is over/under a certain value N then try “=if(Y>N, X, 0)”
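The three spreadsheet formulas above map onto small functions. The function names, sample rows, and cutoff below are hypothetical and only meant to show how each term would be computed before being added to the regression as a new column.

```python
def product_term(x, y):
    """=X*Y : large when both X and Y are large."""
    return x * y

def ratio_term(x, y):
    """=X/Y : large when X is high and Y is low (Y must be non-zero)."""
    return x / y

def conditional_term(x, y, n):
    """=if(Y>N, X, 0) : X only counts when Y exceeds the cutoff N."""
    return x if y > n else 0

# Hypothetical (X, Y) rows and a hypothetical cutoff N.
rows = [(2.0, 4.0), (6.0, 0.5), (3.0, 1.0)]
N = 0.75
for x, y in rows:
    print(product_term(x, y), ratio_term(x, y), conditional_term(x, y, N))
```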

Discrete Higher-Order Terms
You might try a pivot table that compares the average bootstrapped output variable across combinations of the two variables: put one variable in the columns of the pivot table and the other in the rows
You can then try a nested IF statement that allows you to put a separate discrete value on each combination of the two variables
For example, suppose you found a compounding relationship between “strategic” (Y) and “multiple departments” (X)
You might try “=if(X=1,if(Y=1,.41,.11),.5)”
[Figure: pivot table of the average bootstrapped output]
                     Multiple Departments (X)
                     1       0
Strategic (Y)   1    .41     .51
                0    .11     .49
The two values in the X = 0 column (.51 and .49) are not significantly different, so you can average them and use the same value (.5)
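The nested IF formula above can be written as a small function. It uses only the values given in the slide; the function name `combined_term` is illustrative.

```python
def combined_term(x, y):
    """Python equivalent of =if(X=1, if(Y=1, .41, .11), .5).

    x -- 1 if the investment spans multiple departments, else 0
    y -- 1 if the investment is strategic, else 0
    """
    if x == 1:
        return 0.41 if y == 1 else 0.11
    return 0.5  # single averaged value used whenever X is not 1

for x in (1, 0):
    for y in (1, 0):
        print(x, y, combined_term(x, y))
```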

Improvements Due to Bootstrap
This chart shows the percentage reduction in error of bootstrapped estimates compared to intuitive estimates
Results vary depending on how objective and systematic the model was (as ours is)
[Figure: bar chart of percentage reduction in error, horizontal axis 0% to 40%, with one bar for each of the following]
  Cancer patient life-expectancy
  Life-insurance sales rep performance
  Graduate students' grades
  Changes in stock prices
  Mental illness using personality tests
  Student ratings of teaching effectiveness
  IQ scores using Rorschach tests
  Psychology course grades
  Business failures using financial ratios
  Mean across many studies

Actual Classification Plots
An Illinois insurance company created a classification chart to help prioritize the current list of proposed investments
They wanted to determine which investments could be accepted without more analysis and which needed more analysis
18 investments were plotted on the classification chart
The results had a profound effect on investment priorities
Some investments that were assumed to be beneficial now required analysis and some that required analysis could now be approved immediately

Classification of Example Projects
[Figure: classification chart plotting the numbered investments by Expected Investment Size ($000) on a logarithmic horizontal axis (10 to 10,000) and Confidence Index on the vertical axis (0.3 to 1); the chart includes the label “No Classification Needed”]
Do Abbreviated Risk-Return Analysis:
  6. DLSW Router Network Redesign
  9. Extended Hours
  18. Doc. Access Strategy
Do Full Risk-Return Analysis:
  8. Pearl Indicator and Pearl I/O interface
  11. Richardson Data Center Consolidation
  15. MVS DB2 Tools
Reject; Consider Other Options:
  1. Data Strategy
  2. Enterprise Security Strategy
  3. Remote Server Redundancy
  12. MQ Series: Base
  13. Development Environment 2000 (mf)
  14. “Source Control” Source Code Mgmt
  16. Enterprise InterNet
Success Factor Adjustments:
  4. Network OS migration to Novell 5.x
  10. Optimize Single Code Base
Accept without Further Analysis:
  5. Lucent switch upgrade
  7. Image Server Relocation
  17. Enterprise IntraNet to all sites

Bootstrapping Deliverables
Final presentation including:
  An XY chart showing correlation of original estimates to the bootstrap model
  Any “solution space” that was developed, such as classification charts
A worksheet for input of various values which uses the bootstrap model to estimate some output variable(s)
Any customization to RAVI documentation for that client for proper use of the worksheets and solution spaces
Any recommendations based on the bootstrap