The Perplexing Math of Uncertainty
David Vose
Director, Vose Software
Vose Software BVBA
Franklin Rooseveltlaan 348 9000 Gent Belgium P +32 (0)9 265 02 30 E [email protected]
September 9, 2014 © Vose Software | www.vosesoftware.com
INTRODUCTION
Quantitative risk analysis involves a wide range of skills and tasks that risk modelers need to have
mastered before embarking on an important risk analysis of their own. Among the most overlooked and
underestimated of these skills is knowledge of how to manipulate variables within Monte Carlo simulation
models, which is the standard modeling technique for risk analysis.
Add, subtract, multiply, divide: we learned to use them with numbers at school by the age of
ten at the latest, and we take it for granted that we have mastered them. It is hard to imagine a risk
analysis model in any field that does not include some of these four operations. Yet these basic
operations very often do not work in the same way when those numbers are uncertain. Very worryingly,
nearly every person I encounter who is involved in risk modeling is to some degree unaware or unclear
about the correct ways of manipulating uncertain variables in a model, perhaps because we don’t give
a second thought to calculations using + - * / .
Most risk analyses are performed using Excel with a Monte Carlo add-in like our software, ModelRisk.
There is a widely held belief that one can take a standard spreadsheet model (e.g. a cashflow model
with EBITDA or NPV calculation) and simply replace any value within that model that is uncertain with a
function that generates random samples from some distribution to reflect its uncertainty. People
mistakenly think that the rest of the model’s logic can be left unchanged.
And it really does matter. Incorrect manipulation of uncertain variables in a model will almost always
produce simulation results with something close to the correct average value, which people use as a
‘reality check’, but a completely wrong spread around that average value. The net effect is that
decision-makers are presented with a wildly inaccurate estimate of the uncertainty (risk) of the outcomes of
different decision choices. Some may realize that the model results are unrealistic and dismiss them,
while others won’t and will make very misguided decisions.
I invite you to read these examples very carefully if you intend to write Monte Carlo risk analysis
models, and to share them with colleagues who build such models.
Learn how to model risk correctly
Every three months, we offer a ten-day residential course on risk analysis modeling. The courses are all taught by me, David Vose. You will find a short version of my CV at the end of this paper. The aim of these training courses is to turn people into risk analysts who produce work that their company can trust. The courses cover everything you need to know and provide plenty of exercises for you to gain some experience in applying the ideas and techniques.
Add
CALCULATION 1
Of the four operations, addition is the least problematic. The only issue that occurs is where the two
variables being added are connected in some way (i.e. causally associated with each other either
directly or indirectly so that their possible values are correlated). In other words, if the two variables you
want to add together are affected by one or more of the same external factors then simple addition will
very likely not be appropriate.
That issue aside, performing the simplest possible addition of two uncertain variables gives a very nice
way of illustrating how non-intuitive the results are from a Monte Carlo simulation, and therefore why
it is so dangerous to rely upon one’s intuition when assessing whether the results look about right.
Let’s say that we have two costs, A and B, and we want to calculate the total C. If A = $1 and B = $4,
we’d have C = $5.
Now imagine that costs A and B are uncertain. A is equally likely to be anywhere between $0.50 and
$1.50, and B is equally likely to be anywhere between $3.50 and $4.50. That means that C is also uncertain and
must lie somewhere between $4 and $6. In fact¹, C would also most likely take a value of $5, which can
be shown pictorially like this:
And in risk analysis modeling parlance like this:
Uniform(0.5,1.5) + Uniform(3.5,4.5) = Triangle(4,5,6)
You might be surprised that the uncertainty of C follows a Triangle shape instead of another rectangle
(Uniform). When asked to guess what two Uniform variables sum to, the answer often given is another
Uniform. It isn’t that intuitive, yet it is hard to think of any simpler distributions we might add together.
The following graphs illustrate where the Triangle comes from:
__________________________________
¹ Technically, this is only true if the two uncertainties are independent.
I. A plot of Monte Carlo simulated values of A against C. For example, the bottom left corner
shows samples where A and B were close to their minima (0.5 and 3.5 respectively to give a
total C = 4) and the top right corner shows samples where A and B were close to their maxima
(1.5 and 4.5 respectively to give a total C = 6). The range of the value of B is shown by the red
arrows.
II. The points are statistically evenly spread within this parallelogram because A and B are
uniformly distributed. Imagine that we split these simulated data in two according to the
horizontal line shown …
III. And then flip the position of the top half of the data. You can see that, when projected onto the
vertical axis, the values of C follow a Triangle distribution.
IV. Switch axes and we have a Triangle distribution. Note that the vertical axis now represents
‘probability’² since the height of the triangle is proportional to the fraction of simulated values
that fall at or near the horizontal axis value.
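The Triangle result can also be checked numerically. The following is a minimal sketch in Python using only the standard library; it is not ModelRisk code, and the function names `simulate_sum` and `fraction_between`, the seed and the sample size are illustrative assumptions. It adds independent samples of A and B and compares how often C lands in two bands of equal width, one in the middle of the range and one at the edge.

```python
import random

def simulate_sum(n_samples=100_000, seed=1):
    """Sample C = A + B with A ~ Uniform(0.5, 1.5) and B ~ Uniform(3.5, 4.5), independently."""
    rng = random.Random(seed)
    return [rng.uniform(0.5, 1.5) + rng.uniform(3.5, 4.5) for _ in range(n_samples)]

def fraction_between(samples, low, high):
    """Fraction of samples that fall in the interval [low, high)."""
    return sum(low <= x < high for x in samples) / len(samples)

c = simulate_sum()
# Two bands of equal width ($0.20). A Uniform result would fill them equally;
# Triangle(4,5,6) predicts 0.19 for the middle band and 0.02 for the edge band.
middle = fraction_between(c, 4.9, 5.1)
edge = fraction_between(c, 4.0, 4.2)
print(f"P(4.9 <= C < 5.1) ~ {middle:.3f}, P(4.0 <= C < 4.2) ~ {edge:.3f}")
```

If the sum were Uniform, the two fractions would be about equal; the simulated values should instead sit close to the Triangle predictions given in the comments.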
__________________________________
² To be precise, we should say ‘probability density’ rather than ‘probability’ since the variable is continuous.
In this example, the distributions sum to a Triangle because A and B follow Uniform distributions with
the same width (of $1). If the widths had been different, the resultant summation would have been
trapezoidal, as shown by the following similar set of plots, where A = Uniform(0.5,1.5) and B =
Uniform(3.5,5.5):
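The trapezoidal shape can be seen in a rough text histogram. This is a sketch only, in standard-library Python rather than ModelRisk; the function name `bin_fractions`, the seed and the bin width of $0.50 are assumptions chosen for illustration.

```python
import random

def bin_fractions(n_samples=100_000, seed=2, bin_width=0.5):
    """Fraction of C = A + B samples per bin, for A ~ Uniform(0.5, 1.5) and B ~ Uniform(3.5, 5.5)."""
    rng = random.Random(seed)
    n_bins = int(round((7.0 - 4.0) / bin_width))  # C can only lie between 4 and 7
    counts = [0] * n_bins
    for _ in range(n_samples):
        c = rng.uniform(0.5, 1.5) + rng.uniform(3.5, 5.5)
        index = min(int((c - 4.0) / bin_width), n_bins - 1)  # guard the upper edge
        counts[index] += 1
    return [count / n_samples for count in counts]

fractions = bin_fractions()
for i, frac in enumerate(fractions):
    low = 4.0 + 0.5 * i
    # One '#' per percentage point of the simulated values.
    print(f"{low:.1f} to {low + 0.5:.1f}: {'#' * round(frac * 100)} {frac:.3f}")
```

The bars should rise between $4 and $5, stay level between $5 and $6 (about a quarter of the samples in each half-dollar bin), and fall between $6 and $7.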
Subtract
CALCULATION 2
Let’s say we have two costs A and B that sum to C, but we know the values of A and C. If A = $1 and C =
$5, we could calculate B = $5 - $1 = $4. But when the values of A and C are uncertain, we cannot do this
calculation at all!
From the example above, we had:
A: Uniform(0.5,1.5)
B: Uniform(3.5,4.5)
C: Triangle(4,5,6)
And we saw that A + B = C, i.e.:
Uniform(0.5,1.5) + Uniform(3.5,4.5) = Triangle(4,5,6)
Simple algebra would have us believe that C – A = B, i.e.:
Triangle(4,5,6) - Uniform(0.5,1.5) = Uniform(3.5,4.5)
In fact the left and right sides of this equation are very different. If we calculated in a model
= Triangle(4,5,6) - Uniform(0.5,1.5)
in the hope of retrieving the correct distribution for B, we would in fact have grossly overestimated the
uncertainty of B as shown in the following plot:
The non-intuitive nature of such a simple calculation underlines the difficulty of checking whether the results
of a risk model are correct.
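To put a number on the overestimate, the sketch below compares the spread of the true B with the spread of the naive subtraction. It is an illustration in standard-library Python, not ModelRisk syntax; the function name `compare_subtraction`, the seed and the sample size are assumptions.

```python
import random
import statistics

def compare_subtraction(n_samples=100_000, seed=3):
    """Return the standard deviations of the true B and of the naive 'C - A'."""
    rng = random.Random(seed)
    true_b = [rng.uniform(3.5, 4.5) for _ in range(n_samples)]
    # Naive approach: C and A are sampled independently of each other.
    # Note the stdlib argument order: triangular(low, high, mode).
    naive_b = [rng.triangular(4.0, 6.0, 5.0) - rng.uniform(0.5, 1.5)
               for _ in range(n_samples)]
    return statistics.pstdev(true_b), statistics.pstdev(naive_b)

sd_true, sd_naive = compare_subtraction()
print(f"SD of true B:      {sd_true:.3f}")   # theory: 1/sqrt(12), about 0.289
print(f"SD of naive C - A: {sd_naive:.3f}")  # theory: sqrt(1/6 + 1/12) = 0.5
```

Because the naive version samples C and A independently, their variances add instead of cancelling, and the result can range from $2.50 to $5.50 even though B can only lie between $3.50 and $4.50.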
What went wrong?
Let’s look again at the scatter plot of the previous example.
If the value of C were to be 5.5, as shown in the top arrow, the value of A can only lie between 1 and 1.5.
Similarly, if C were 4.5, A must lie between 0.5 and 1. In other words, the possible distribution of A is
dependent on the value of C, which is not accounted for in a simple formula like:
C – A = Triangle(4,5,6) - Uniform(0.5,1.5)
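The dependence of A on C can be made visible by keeping only the simulated scenarios in which the total lands near a chosen value. This is a minimal sketch in standard-library Python; the function name `a_range_given_c`, the tolerance of $0.02 and the seed are illustrative assumptions.

```python
import random

def a_range_given_c(c_target, tolerance=0.02, n_samples=200_000, seed=4):
    """Observed (min, max) of A among samples whose total C = A + B lies near c_target."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n_samples):
        a = rng.uniform(0.5, 1.5)
        b = rng.uniform(3.5, 4.5)
        if abs((a + b) - c_target) <= tolerance:
            kept.append(a)
    return min(kept), max(kept)

for c_target in (4.5, 5.0, 5.5):
    low, high = a_range_given_c(c_target)
    print(f"C near {c_target}: A ranges from about {low:.2f} to {high:.2f}")
```

A formula that samples C and A separately has no way to apply these restrictions, which is why it spreads the result too widely.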
The general rule here is that one should avoid doing subtractions like C – A when using Monte Carlo
simulation whenever the value of C incorporates the value of A. So, for example:
• C = total cost of running a factory, A = personnel cost: C – A will not calculate the non-personnel cost.
Instead, you should construct your model the other way round: calculate the personnel costs and
non-personnel costs, then add them together to obtain the total cost.
• C = revenue, A = cost: C – A will calculate the profit, as long as any relationship between C and
A has been accounted for in the model (e.g. relationship to the volume of goods sold).
Multiply
CALCULATION 3
Let’s say that we have 600 customers come into a shop and they each spend $20. The total revenue for the
shop is 600*$20 = $12,000. Most business spreadsheet models will have many similar types of calculations
multiplying number of units by cost or revenue per unit.
But if the amount spent varies randomly between customers, the simple multiplication is completely incorrect
and will grossly overestimate the uncertainty of the revenue. To illustrate, let’s assume for simplicity that
the amount a customer spends in the shop is equally likely to lie anywhere between $15 and $25. The incorrect
calculation would look like this:
Total revenue = 600 * Uniform(15,25)
The problem with this formula is that just one random sample from the Uniform distribution is being used for
all 600 customers: so if one customer spends $16, for example, they all do, according to the formula. Using
this formula, the total revenue would follow a Uniform(9000, 15000) distribution.
What we really want to do is add up 600 separate, independent amounts spent, each of which is drawn from
the Uniform(15,25) distribution, to simulate how the money comes in. If we do that, the answer is almost
exactly a Normal(12000, 71) distribution. The following plot compares the results of doing this calculation the
correct and incorrect ways:
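The two approaches can be simulated side by side. The sketch below uses standard-library Python rather than a spreadsheet add-in; the function name `revenue_samples`, the number of iterations and the seed are assumptions made for illustration.

```python
import random
import statistics

def revenue_samples(n_iterations=2_000, n_customers=600, seed=5):
    """Simulate total revenue with one shared draw (incorrect) and one draw per customer (correct)."""
    rng = random.Random(seed)
    incorrect, correct = [], []
    for _ in range(n_iterations):
        # Incorrect: a single sampled spend is applied to every customer.
        incorrect.append(n_customers * rng.uniform(15, 25))
        # Correct: each customer's spend is sampled independently, then summed.
        correct.append(sum(rng.uniform(15, 25) for _ in range(n_customers)))
    return incorrect, correct

incorrect, correct = revenue_samples()
print(f"incorrect: mean {statistics.fmean(incorrect):.0f}, SD {statistics.pstdev(incorrect):.0f}")
print(f"correct:   mean {statistics.fmean(correct):.0f}, SD {statistics.pstdev(correct):.0f}")
```

Both versions should report a mean near $12,000, which is why the average is a poor reality check. The standard deviations differ by a factor of roughly 24: about $1,730 for the incorrect formula against about $71 for the correct one.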
The difference is very dramatic, and is greater the larger the number of variables (in this case, customers)
being summed. It’s probably too dramatic for someone reviewing the results to believe in the incorrect
version, but then one is left wondering why the results are so bad when the model looks correct. In less
dramatic situations, the error will go unnoticed and the reviewer/manager/decision maker will be presented
with a cashflow analysis that illustrates a much greater uncertainty (risk) than is really there. It makes me
wonder how many great business opportunities have been passed over because of some simple modeling
error.
Note that when you add up a number of independent variables (Uniform in this case) the total tends towards
a Normal (i.e. bell-shaped) distribution.
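The parameters of that Normal distribution follow from two standard facts about independent variables: means add, and variances add. The short calculation below works this through for the shop example; the variable names are illustrative.

```python
import math

n_customers = 600
low, high = 15.0, 25.0

# Mean and standard deviation of a single Uniform(low, high) spend.
mean_one = (low + high) / 2            # 20.0
sd_one = (high - low) / math.sqrt(12)  # about 2.89

# For a sum of independent spends, means add and variances add,
# so the SD grows with the square root of the number of customers.
mean_total = n_customers * mean_one
sd_total = math.sqrt(n_customers) * sd_one

# The incorrect formula scales one draw, so its SD grows with n itself.
sd_incorrect = n_customers * sd_one

print(f"correct total:   mean {mean_total:.0f}, SD {sd_total:.1f}")
print(f"incorrect total: mean {mean_total:.0f}, SD {sd_incorrect:.1f}")
```

The ratio between the two standard deviations is the square root of the number of customers, which is why the error grows as more variables are summed.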
Divide
CALCULATION 4
Nearly everyone who starts doing some risk modeling makes mistakes when they include division in their
models. It is very confusing and unintuitive to begin with. I recommend that you avoid using division in your
models unless you have had some really good risk analysis training.
Let me illustrate using the example above for multiplication. The total shop earnings follow a Normal(12000,
71) distribution. Imagine that we have this estimate, and want to figure out how much each person spends.
We might write this:
Normal(12000, 71) / 600
This is the average amount that each person spends in the shop, but it could also be the actual amount
each individual person spends if they all spent the same amount. There is no distinction between the two in
this calculation. However, if each customer spends an amount that is different and independent of other
customers, there is no way to back-calculate the distribution of the individual expenditure (which was a
Uniform(15,25) you’ll remember). We cannot know with the above information what the distribution of the
amount spent by individual customers is, but it turns out that we can state the mean and standard deviation
if they all make purchasing decisions independently.
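The mean and standard deviation of the individual spend can be recovered by running the rule for sums of independent variables in reverse. This is a sketch under that independence assumption, with illustrative variable names; it recovers only the two moments and says nothing about the shape of the distribution.

```python
import math

total_mean, total_sd = 12000.0, 71.0  # the Normal(12000, 71) total from the text
n_customers = 600

# Naive division treats every customer as spending the same amount,
# so it describes the average spend per customer.
average_mean = total_mean / n_customers
average_sd = total_sd / n_customers

# If customers spend independently, means and variances add across
# customers, so the per-customer SD is the total SD divided by sqrt(n).
individual_mean = total_mean / n_customers
individual_sd = total_sd / math.sqrt(n_customers)

print(f"average spend:    mean {average_mean:.2f}, SD {average_sd:.3f}")
print(f"individual spend: mean {individual_mean:.2f}, SD {individual_sd:.2f}")
```

The individual standard deviation comes out at about $2.90, close to the $2.89 of the original Uniform(15,25), while the average spend per customer has a standard deviation of only about $0.12.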
There is another, rather hand-waving and slightly geeky, way of looking at this. If you studied any math, you
will have come across logs at some point. Imagine that we have two uncertain variables, X and Y, and that
Z = X / Y. The equation can be written in logs as follows:
Log Z = Log X – Log Y
If X and Y are random variables, Log X and Log Y are also random variables, and we’ve just learned above
that you can’t subtract one from the other, so division of one independently simulated random variable by
another is also very likely to be wrong. Of course, division comes up a lot in spreadsheet models. Most of
the time the calculation works out okay when converting to a simulation model, but not always, and your
modeler needs to know when.
There are many manifestations of this issue. For example, whenever the word ‘average’ comes up in your
analysis (average cost per unit, average time to serve a customer, average price for a feedstock) there is
an implicit denominator. The spread and shape of the distribution of the ‘average’ something is very
dependent on the number of individuals we are averaging over. The same problem appears when the
data we want to use record averages rather than individual observations. Analyzing and using
such data correctly requires good knowledge of probability and statistics.
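The effect of the implicit denominator is easy to demonstrate. The sketch below reuses the Uniform(15,25) customer spend from the multiplication example and measures the spread of the average over different numbers of customers; the function name `sd_of_average`, the iteration count and the seed are assumptions.

```python
import random
import statistics

def sd_of_average(n_individuals, n_iterations=2_000, seed=6):
    """Standard deviation of the average spend over n_individuals customers, each Uniform(15, 25)."""
    rng = random.Random(seed)
    averages = [
        statistics.fmean(rng.uniform(15, 25) for _ in range(n_individuals))
        for _ in range(n_iterations)
    ]
    return statistics.pstdev(averages)

for n in (1, 10, 100):
    print(f"average over {n:>3} customers: SD ~ {sd_of_average(n):.2f}")
```

Theory predicts standard deviations of about 2.89, 0.91 and 0.29 respectively, so an 'average' recorded over 100 customers says very little about how much any one customer spends.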
About The Author
David Vose is an internationally recognized expert in risk analysis. He has been a specialist
consultant in risk analysis since 1989, working in a wide range of industries including energy,
mining, electricity generation and supply, shipping, project management and economics,
banking, insurance, engineering, forestry, reliability modeling, and human health.
David Vose has consulted to companies and government agencies in over thirty countries around the world.
He helps organizations bring risk analysis to their decision-making process, and provides training seminars
in risk analysis for decision-makers and analysts, as well as leading complex risk assessments. He has also
appeared as an expert witness in a number of successful international legal disputes.
David divides his time between consultancy and designing the technical and usability aspects of
the risk analysis software produced by his company, Vose Software, including the highly acclaimed
ModelRisk product that allows one to perform risk analysis with Microsoft Excel models.
He is the author of Risk Analysis, published by John Wiley and Sons, now in its third edition, and has
authored or co-authored several international guidelines and academic papers on risk analysis.
CONTACT
Email: [email protected]
LinkedIn profile: www.linkedin.com/pub/david-vose/0/1b4/93
Book an obligation-free chat: www.vosesoftware.com/chatwithdavid.php
What People Say About David
Selected recommendations from LinkedIn
Wayne Winston
Professor at Indiana University and Higher Education Consultant
“David is one of the world's greatest experts on Monte Carlo Simulation Modeling. He has written several outstanding books on the subject”
Grant Nicholas
Director at ANZ bank, Natural Resources Group
“I had the pleasure of meeting David back in 2003. I would not hesitate in recommending David as one of the foremost leaders of our time in the area of risk analysis. ... He has a clear and detailed understanding of the theory that forms the foundation of risk analysis, which is complemented by his many years of practical experience covering an array of business areas....”
Linda Tollefson
Associate Commissioner for Foods and Veterinary Medicine at U.S. Food and Drug Administration
“David Vose is a superb risk assessor, risk analyst, and mathematician who provided risk analysis services at the U.S. FDA Center for Veterinary Medicine on the association of the use of antimicrobials in food-producing animals and resistant foodborne pathogens in humans. Dr. Vose was one of the first risk assessors to apply a robust mathematical approach to the problem. He produced solid scientific results that withstood a court (legal) challenge. He is also exceptional in his ability to clearly explain both the results and the processes he used to lay persons.”
Stephen Grey
Associate Director at Broadleaf Capital International
“David's technical and mathematical grasp of distribution and modelling principles and methods is second to none. His books and online materials are an invaluable reference for anyone working in this field.”
Peter Simon
Managing Partner at Lucidus Consulting Limited
“David is one of the world's experts on quantitative risk analysis. He is also a pragmatist and is able to explain theories and concepts at a level we can all understand. David has answered many questions that I have posed him and has never let me down.”
Francois Grobler
Private Consultant
“I regard David as an authority on probabilistic modelling and risk analysis. His books and software are excellent as is his ability to convey the complexities of the stochastic world to industry novices and experts alike. I have attended and recommended his training courses on numerous occasions. I am grateful to have him as a professional contact and recommend him highly for risk analysis related software, training and consulting.”
Stephen Hough
Aviation Consultant
“David was hired as an expert witness in relation to stochastic modelling by the companies I worked for on two major international arbitrations. His work was invaluable in preparing our case for the arbitrations. He was able to get his mind round the issues of the particular cases very quickly and then articulate these very complex issues, both verbally and in writing, in a way that can be understood easily by non-stochastic modellers, e.g. lawyers, barristers and senior company management. He can be relied upon to deliver the needed output on time.”
Stephen Jewell
Director at Risk Champion Ltd
“David is one of the foremost risk experts in the world, particularly around risk analysis and one of the first people I turn to for risk advice on a wide range of topics. His book on Quantitative Risk Analysis is the most comprehensive I have ever read. In addition he has taken all his 20+ years experience to produce quantitative risk analysis modeling software that is far superior to any other of its type in the marketplace”
About Vose Software
Based in Belgium, Vose Software specializes in quantitative risk analysis
software tools. Our primary focus is providing commercial and custom-built risk
analysis tools that help clients make better, more informed decisions in the
face of uncertainty and risk.
Vose Software also offers comprehensive consulting and training
services in the field of risk analysis.
Risk Analysis Software
SOFTWARE
ModelRisk
ModelRisk provides the capability to perform high speed Monte
Carlo simulations within your Microsoft Excel models to help
you better understand the risks and uncertainty in your work.
ModelRisk offers more tools and features than any other
product in its class.
Key ModelRisk features include: largest range of distributions,
copulas (correlation structures) and time series available; fitting
to data; stochastic optimization; modeling extreme events;
conversion tools for competing products; a free viewer
application so you can share your results; and much more.
ModelTree
ModelTree offers a simple way to understand and interpret your
decision options under uncertainty by adding the capabilities of
decision trees to Microsoft Excel. A decision tree is a
flowchart-like graph that uses branches to clarify every possible result of
a decision. Probabilities and expected outcomes are assigned
to the different possible outcomes of a decision, and ModelRisk
then determines the optimal set of decisions to make.
ModelTree can be used to simplify complex challenges,
visualize all possible outcomes and categorize risk topics.
Risk Analysis Training
TRAINING
We provide training in all aspects of risk analysis thinking and modeling, from basic awareness and
communication of the results of a risk assessment to complex mathematical modeling. We particularly
specialize in the quantitative modeling of risk. Our trainers have excellent communication skills and at
least ten years' experience as risk analysts across a wide range of industries. The courses are very
hands-on and designed to focus on the types of risk that your organization faces, the data you have available
and the decisions you need to make.
Risk Analysis Consulting
CONSULTING
Our consultancy services cover a wide range of industries and broadly fall into four areas:
Auditing
Expert Witness Services
Risk analysis and probabilistic modeling
Methodologies
We have consulted to government and business organizations around the world in fields as diverse as
climate change, banking, insurance, capital investment, shipping, steel, aerospace, electricity generation,
railway infrastructure, and animal and human health. We take a very individualistic approach to each
consulting engagement, and have a reputation for providing innovative solutions to complex risk issues.
For more information please visit www.vosesoftware.com.
Visit Our Head Office
OFFICES
Our head office is conveniently based in the historic city of Ghent, Belgium – just 35 minutes from
Brussels by train. Our modern training facilities can accommodate up to 30 people. We also
welcome casual visitors - why not pop in for a chat the next time you’re in the area?