The Perplexing Math of Uncertainty

David Vose

Director, Vose Software

Vose Software BVBA

Franklin Rooseveltlaan 348, 9000 Gent, Belgium

P +32 (0)9 265 02 30

E [email protected]

September 9, 2014 © Vose Software | www.vosesoftware.com

INTRODUCTION

Quantitative risk analysis involves a wide range of skills and tasks that risk modelers need to have mastered before embarking on an important risk analysis of their own. Among the most overlooked and underestimated of these skills is knowing how to manipulate variables within Monte Carlo simulation models, the standard modeling technique for risk analysis.

Add, subtract, multiply, divide – we learned how to use them with numbers at school by the age of ten at the latest, and we take it for granted that we have mastered them. It is hard to imagine a risk analysis model in any field that does not include some of these four operations. Yet these basic operations very often do not work in the same way when the numbers are uncertain. Very worryingly, nearly every person I encounter who is involved in risk modeling is to some degree unaware of, or unclear about, the correct ways of manipulating uncertain variables in a model, perhaps because we don’t give a second thought to calculations using + - * / .

Most risk analyses are performed using Excel with a Monte Carlo add-in like our software, ModelRisk. There is a widely held belief that one can take a standard spreadsheet model (e.g. a cashflow model with an EBITDA or NPV calculation) and simply replace any value within that model that is uncertain with a function that generates random samples from some distribution to reflect its uncertainty. People mistakenly think that the rest of the model’s logic can be left unchanged.

And it really does matter. Incorrect manipulation of uncertain variables in a model will almost always produce simulation results with something close to the correct average value, which people use as a ‘reality check’, but a completely wrong spread around that average value. The net effect is that decision-makers are presented with a wildly inaccurate estimate of the uncertainty (risk) of the outcomes of different decision choices. Some may realize that the model results are unrealistic and dismiss them, while others won’t and will make very misguided decisions.

I invite you to read these examples very carefully if you intend to write Monte Carlo risk analysis models, and to share them with colleagues who build such models.

Learn how to model risk correctly

Every three months, we offer a ten-day residential course on risk analysis modeling. The courses are all taught by me, David Vose. You will find a short version of my CV at the end of this paper. The aim of these training courses is to turn people into risk analysts who produce work that their company can trust. The courses cover everything you need to know and provide plenty of exercises for you to gain some experience in applying the ideas and techniques.

GO TO DETAILS OF TRAINING COURSE

http://www.vosesoftware.com/RAtraining10days.php



Add

CALCULATION 1

Of the four operations, addition is the least problematic. The only issue arises when the two variables being added are connected in some way (i.e. causally associated with each other, either directly or indirectly, so that their possible values are correlated). In other words, if the two variables you want to add together are affected by one or more of the same external factors, then simple addition will very likely not be appropriate.

That issue aside, performing the simplest possible addition of two uncertain variables gives a very nice way of illustrating how non-intuitive the results from a Monte Carlo simulation are – and therefore why it is so dangerous to rely upon one’s intuition when assessing whether the results look about right.

Let’s say that we have two costs, A and B, and we want to calculate the total C. If A = $1 and B = $4, we’d have C = $5.

Now imagine that costs A and B are uncertain. A is equally likely to be anywhere between $0.50 and $1.50, and B is equally likely to be anywhere between $3.50 and $4.50. That means that C is also uncertain and must lie somewhere between $4 and $6. In fact¹, C would also most likely take a value of $5, which can be shown pictorially like this:

And in risk analysis modeling parlance like this:

Uniform(0.5,1.5) + Uniform(3.5,4.5) = Triangle(4,5,6)

You might be surprised that the uncertainty of C follows a Triangle shape instead of another rectangle

(Uniform). When asked to guess what two Uniform variables sum to, the answer often given is another

Uniform. It isn’t that intuitive, yet it is hard to think of any simpler distributions we might add together.

The following graphs illustrate where the Triangle comes from:

__________________________________

¹ Technically, this is only true if the two uncertainties are independent.



I. A plot of Monte Carlo simulated values of A against C. For example, the bottom left corner

shows samples where A and B were close to their minima (0.5 and 3.5 respectively to give a

total C = 4) and the top right corner shows samples where A and B were close to their maxima

(1.5 and 4.5 respectively to give a total C = 6). The range of the value of B is shown by the red

arrows.

II. The points are statistically evenly spread within this parallelogram because A and B are uniformly distributed. Imagine that we split these simulated data in two along the horizontal line shown …

III. … and then flip the position of the top half of the data. You can see that, when projected onto the vertical axis, the values of C follow a Triangle distribution.

IV. Switch axes and we have a Triangle distribution. Note that the vertical axis now represents ‘probability’² since the height of the triangle is proportional to the fraction of simulated values that fall at or near the horizontal axis value.

__________________________________

² To be precise, we should say ‘probability density’ rather than ‘probability’ since the variable is continuous.

In this example, the distributions sum to a Triangle because A and B follow Uniform distributions with the same width (of $1). If the widths had been different, the resultant summation would have been trapezoidal, as shown by the following similar set of plots, where A = Uniform(0.5,1.5) and B = Uniform(3.5,5.5):
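Both shapes can be checked with a few lines of simulation. The following is a minimal sketch in Python with NumPy rather than a spreadsheet add-in; the seed, the number of iterations, the bin counts and the helper name `density` are arbitrary choices made for this illustration, and A and B are assumed to be independent.

```python
import numpy as np

rng = np.random.default_rng(seed=1)  # fixed seed so the sketch is repeatable
n = 200_000                          # number of Monte Carlo iterations (assumed)

# Independent uncertain costs, as in the example
a = rng.uniform(0.5, 1.5, n)
b_same_width = rng.uniform(3.5, 4.5, n)  # width $1, the same as A
b_wider = rng.uniform(3.5, 5.5, n)       # width $2

c_triangle = a + b_same_width   # should look like Triangle(4,5,6)
c_trapezoid = a + b_wider       # should look like a trapezoid on [4, 7]


def density(samples, lo, hi, bins):
    """Histogram-based estimate of the probability density."""
    heights, edges = np.histogram(samples, bins=bins, range=(lo, hi), density=True)
    centres = 0.5 * (edges[:-1] + edges[1:])
    return centres, heights


# Crude text histograms: one row per bin, bar length proportional to density
for label, c, hi in [("same widths", c_triangle, 6.0),
                     ("different widths", c_trapezoid, 7.0)]:
    centres, heights = density(c, 4.0, hi, bins=12)
    print(label)
    for x, h in zip(centres, heights):
        print(f"  C = {x:4.2f}  " + "#" * int(round(40 * h)))
```

With equal widths the bars rise to a single peak at C = 5; with different widths they rise, stay flat between 5 and 6, and fall again.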

Subtract

CALCULATION 2

Let’s say we have two costs A and B that sum to C, but this time we know the values of A and C. If A = $1 and C = $5, we could calculate B = $5 - $1 = $4. But when the values of A and C are uncertain, we cannot do this calculation at all!

From the example above, we had:

A: Uniform(0.5,1.5)

B: Uniform(3.5,4.5)

C: Triangle(4,5,6)

And we saw that A + B = C, i.e.:

Uniform(0.5,1.5) + Uniform(3.5,4.5) = Triangle(4,5,6)

Simple algebra would have us believe that C – A = B, i.e.:

Triangle(4,5,6) - Uniform(0.5,1.5) = Uniform(3.5,4.5)

In fact the left and right sides of this equation are very different. If we calculated in a model

= Triangle(4,5,6) - Uniform(0.5,1.5)

in the hope of retrieving the correct distribution for B, we would in fact have grossly overestimated the uncertainty of B, as shown in the following plot:

The non-intuitive nature of such a simple calculation underlines the difficulty of checking whether the results of a risk model are correct.
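The size of the overestimate can be estimated by simulation. This is a rough sketch in Python with NumPy, not the author's model; the seed and iteration count are arbitrary. When C and A are sampled independently their variances add (1/6 + 1/12 = 1/4), so the naive result should have a standard deviation of about 0.5, against about 0.29 for the true Uniform(3.5,4.5).

```python
import numpy as np

rng = np.random.default_rng(seed=2)  # fixed seed for repeatability (assumed)
n = 200_000                          # number of iterations (assumed)

b_true = rng.uniform(3.5, 4.5, n)           # the B we hope to recover
c_indep = rng.triangular(4.0, 5.0, 6.0, n)  # C sampled on its own
a_indep = rng.uniform(0.5, 1.5, n)          # A sampled on its own
b_naive = c_indep - a_indep                 # Triangle(4,5,6) - Uniform(0.5,1.5)

for label, b in [("true B ", b_true), ("naive B", b_naive)]:
    print(f"{label}: sd = {b.std():.3f}, "
          f"range = [{b.min():.2f}, {b.max():.2f}]")
```

The naive version also produces values well outside the $3.50 to $4.50 range that B can actually take.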



What went wrong?

Let’s look again at the scatter plot of the previous example.



If the value of C were 5.5, as shown by the top arrow, the value of A can only lie between 1 and 1.5. Similarly, if C were 4.5, A must lie between 0.5 and 1. In other words, the possible distribution of A is dependent on the value of C, which is not accounted for in a simple formula like:

C – A = Triangle(4,5,6) - Uniform(0.5,1.5)

The general rule here is that one should avoid doing subtractions like C – A when using Monte Carlo simulation whenever the value of C incorporates the value of A. So, for example:

C = total cost of running a factory, A = personnel cost: C – A will not calculate the non-personnel cost. Instead, you should construct your model the other way round: calculate the personnel costs and non-personnel costs, then add them together to obtain the total cost.

C = revenue, A = cost: C – A will calculate the profit, as long as any relationship between C and A has been accounted for in the model (e.g. their relationship to the volume of goods sold).
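The dependence of A on C can be seen directly in simulated data. The sketch below (Python with NumPy; the seed, iteration count, tolerance and the helper name `a_range_given_c` are all assumptions made for illustration) builds the model "the other way round" by sampling A and B and adding them, then looks at which values of A occur alongside a given value of C.

```python
import numpy as np

rng = np.random.default_rng(seed=3)  # fixed seed for repeatability (assumed)
n = 500_000                          # number of iterations (assumed)

a = rng.uniform(0.5, 1.5, n)
b = rng.uniform(3.5, 4.5, n)
c = a + b                            # C incorporates A, so the two are dependent


def a_range_given_c(target, tol=0.01):
    """Observed range of A among iterations where C is within tol of target."""
    near = np.abs(c - target) < tol
    return a[near].min(), a[near].max()


print("A given C near 5.5:", a_range_given_c(5.5))
print("A given C near 4.5:", a_range_given_c(4.5))
print("correlation(A, C):", np.corrcoef(a, c)[0, 1])

# Subtracting within the same iteration keeps the dependence and recovers B
print("max |(C - A) - B|:", np.abs((c - a) - b).max())
```

Within one iteration C – A returns B (up to floating-point rounding) because the same sample of A is used on both sides. The error arises only when C and A are sampled separately.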

Multiply

CALCULATION 3

Let’s say that 600 customers come into a shop and they each spend $20. The total revenue for the shop is 600 * $20 = $12,000. Most business spreadsheet models will have many similar calculations, multiplying a number of units by the cost or revenue per unit.

But if the amount spent varies randomly between customers, the simple multiplication is completely incorrect and will grossly over-estimate the uncertainty of the revenue. To illustrate, let’s assume for simplicity that the amount a customer spends in the shop is equally likely to lie anywhere between $15 and $25. The incorrect calculation would look like this:

Total revenue = 600 * Uniform(15,25)

The problem with this formula is that just one random sample from the Uniform distribution is being used for all 600 customers: so if one customer spends $16, for example, they all do – according to the formula. Using this formula, the total revenue would follow a Uniform(9000, 15000) distribution.

What we really want to do is add up 600 separate, independent amounts spent, each of which is drawn from the Uniform(15,25) distribution, to simulate how the money comes in. If we do that, the answer is almost exactly a Normal(12000, 71) distribution (a mean of $12,000 and a standard deviation of about $71). The following plot compares the results of doing this calculation the correct and incorrect ways:



The difference is very dramatic, and is greater the larger the number of variables (in this case, customers)

being summed. It’s probably too dramatic for someone reviewing the results to believe in the incorrect

version, but then one is left wondering why the results are so bad when the model looks correct. In less

dramatic situations, the error will go unnoticed and the reviewer/manager/decision maker will be presented

with a cashflow analysis that illustrates a much greater uncertainty (risk) than is really there. It makes me

wonder how many great business opportunities have been passed over because of some simple modeling

error.
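The two approaches are easy to compare side by side. The following is a minimal sketch in Python with NumPy, not a ModelRisk model; the seed and the number of iterations are arbitrary, and each customer's spend is assumed to be independent of the others.

```python
import numpy as np

rng = np.random.default_rng(seed=4)  # fixed seed for repeatability (assumed)
iterations = 10_000                  # number of simulated trading periods (assumed)
customers = 600

# Incorrect: one sampled spend per iteration, applied to all 600 customers
wrong = customers * rng.uniform(15, 25, iterations)

# Correct: 600 independent spends per iteration, added together
right = rng.uniform(15, 25, (iterations, customers)).sum(axis=1)

for label, total in [("600 * Uniform(15,25)", wrong),
                     ("sum of 600 Uniform(15,25)", right)]:
    p5, p95 = np.percentile(total, [5, 95])
    print(f"{label:26s} mean = {total.mean():8.0f}  sd = {total.std():6.0f}  "
          f"5%-95% = [{p5:.0f}, {p95:.0f}]")
```

Both versions give a mean close to $12,000, which is why a check on the average alone does not catch the error. The spreads differ by a factor of roughly 24.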

Note that when you add up a number of independent variables (Uniform in this case) the total tends towards

a Normal (i.e. bell-shaped) distribution.
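For readers who want to see where the 12000 and the 71 come from, here is a short worked calculation. It uses the standard mean and variance of a Uniform distribution and assumes the 600 amounts are independent, so that their variances add; the symbol X_i for the i-th customer's spend is introduced here for illustration.

```latex
\begin{align*}
X_i &\sim \mathrm{Uniform}(15, 25), &
\mathrm{E}[X_i] &= \frac{15 + 25}{2} = 20, &
\mathrm{Var}[X_i] &= \frac{(25 - 15)^2}{12} \approx 8.33 \\[4pt]
\mathrm{E}\Big[\sum_{i=1}^{600} X_i\Big] &= 600 \times 20 = 12000, &
\mathrm{SD}\Big[\sum_{i=1}^{600} X_i\Big] &= \sqrt{600 \times 8.33} \approx 70.7 \\[4pt]
\mathrm{E}[600\,X_1] &= 12000, &
\mathrm{SD}[600\,X_1] &= 600 \times \sqrt{8.33} \approx 1732
\end{align*}
```

The incorrect formula therefore overstates the standard deviation by a factor of √600 ≈ 24.5, and this factor grows with the number of customers.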

Divide

CALCULATION 4

Nearly everyone who starts doing some risk modeling makes mistakes when they include division in their

models. It is very confusing and unintuitive to begin with. I recommend that you avoid using division in your

models unless you have had some really good risk analysis training.

Let me illustrate using the example above for multiplication. The total shop earnings follow a Normal(12000, 71) distribution. Imagine that we have this estimate, and want to figure out how much each person spends. We might write this:

Normal(12000, 71) / 600

This is the average amount that each person spends in the shop, but it could also be the actual amount each individual person spends if they all spent the same amount. There is no distinction between the two in this calculation. However, if each customer spends an amount that is different from, and independent of, the amounts other customers spend, there is no way to back-calculate the distribution of the individual expenditure (which was a Uniform(15,25), you’ll remember). We cannot know from the above information what the distribution of the amount spent by individual customers is, but it turns out that we can state its mean and standard deviation if they all make purchasing decisions independently.
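As a worked illustration of that last point, assuming independent customers so that variances add across the 600 purchases, the mean and standard deviation of an individual's spend follow from the total like this (the subscripts are labels introduced for this sketch):

```latex
\begin{align*}
\mu_{\text{individual}} &= \frac{\mu_{\text{total}}}{600} = \frac{12000}{600} = 20 \\[4pt]
\sigma_{\text{individual}} &= \frac{\sigma_{\text{total}}}{\sqrt{600}} = \frac{71}{\sqrt{600}} \approx 2.90 \\[4pt]
\sigma_{\text{average}} &= \frac{\sigma_{\text{total}}}{600} = \frac{71}{600} \approx 0.12
\end{align*}
```

The second line agrees with the standard deviation of the original Uniform(15,25), which is 10/√12 ≈ 2.89. The third line is what Normal(12000, 71) / 600 actually delivers: the spread of the average spend across 600 customers, which is far narrower than the spread of any one customer's spend.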



There is another, rather hand-waving and slightly geeky, way of looking at this. If you studied any math, you

will have come across logs at some point. Imagine that we have two uncertain variables, X and Y, and that

Z = X / Y. The equation can be written in logs as follows:

Log Z = Log X – Log Y

If X and Y are random variables, Log X and Log Y are also random variables, and we’ve just learned above

that you can’t subtract one from the other, so division of one independently simulated random variable by

another is also very likely to be wrong. Of course, division comes up a lot in spreadsheet models. Most of

the time the calculation works out okay when converting to a simulation model, but not always, and your

modeler needs to know when.

There are many manifestations of this issue. For example, whenever the word ‘average’ comes up in your

analysis (average cost per unit, average time to serve a customer, average price for a feedstock) there is

an implicit denominator. The spread and shape of the distribution of the ‘average’ something is very

dependent on the number of individuals we are averaging over. The same problem appears when the

data we want to use record averages rather than individual observations. Analyzing and using

such data correctly requires good knowledge of probability and statistics.
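How strongly the spread of an ‘average’ depends on the number of individuals can be sketched as follows. This is Python with NumPy; it reuses the Uniform(15,25) spend from the shop example, and the seed, iteration count and function name `sd_of_average` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=5)  # fixed seed for repeatability (assumed)


def sd_of_average(n, iterations=5_000):
    """Standard deviation of the average spend over n independent customers."""
    # Each row is one simulated group of n customers
    averages = rng.uniform(15, 25, (iterations, n)).mean(axis=1)
    return averages.std()


for n in (1, 10, 100, 1000):
    theory = 10 / np.sqrt(12 * n)  # sd of Uniform(15,25) divided by sqrt(n)
    print(f"n = {n:5d}  simulated sd = {sd_of_average(n):.3f}  theory = {theory:.3f}")
```

An ‘average spend per customer’ taken over 1,000 customers is about thirty times less variable than the spend of a single customer, so the two should not be modeled with the same distribution.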


About The Author

David Vose is an internationally recognized expert in risk analysis. He has been a specialist consultant in risk analysis since 1989, working in a wide range of industries including energy, mining, electricity generation and supply, shipping, project management and economics, banking, insurance, engineering, forestry, reliability modeling, and human health.

David Vose has consulted to companies and government agencies in over thirty countries around the world.

He helps organizations bring risk analysis to their decision-making process, and provides training seminars

in risk analysis for decision-makers and analysts, as well as leading complex risk assessments. He has also

appeared as an expert witness in a number of successful international legal disputes.

David divides his time between consultancy, and designing the technical aspects and usability aspects of

the risk analysis software produced by his company, Vose Software, including the highly acclaimed

ModelRisk product that allows one to perform risk analysis with Microsoft Excel models.

He is the author of Risk Analysis, published by John Wiley and Sons, now in its third edition, and has

authored or co-authored several international guidelines and academic papers on risk analysis.

CONTACT

Email: [email protected]

LinkedIn profile: www.linkedin.com/pub/david-vose/0/1b4/93

Book an obligation-free chat: www.vosesoftware.com/chatwithdavid.php

http://www.vosesoftware.com



What People Say About David

Selected recommendations from LinkedIn

Wayne Winston

Professor at Indiana University and Higher Education Consultant “David is one of the world's greatest experts on Monte Carlo Simulation Modeling. He has written several outstanding books on the

subject”

Grant Nicholas

Director at ANZ bank, Natural Resources Group "I had the pleasure of meeting David back in 2003. I would not hesitate in recommending David as one of the foremost leaders of

our time in the area of risk analysis. ... He has a clear and detailed understanding of the theory that forms the foundation of risk

analysis, which is complimented by his many years of practical experience covering an array of business areas...."

Linda Tollefson

Associate Commissioner for Foods and Veterinary Medicine at U.S. Food and Drug Administration “David Vose is a superb risk assessor, risk analyst, and mathematician who provided risk analysis services at the U.S. FDA Center

for Veterinary Medicine on the association of the use of antimicrobials in food-producing animals and resistant foodborne pathogens

in humans. Dr. Vose was one of the first risk assessors to apply a robust mathematical approach to the problem. He produced solid

scientific results that withstood a court (legal) challenge. He is also exceptional in his ability to clearly explain both the results and

the processes he used to lay persons.”

Stephen Grey

Associate Director at Broadleaf Capital International “David's technical and mathematical grasp of distribution and modelling principles and methods is second to none. His books and

online materials are an invaluable reference for anyone working in this field.”

Peter Simon

Managing Partner at Lucidus Consulting Limited “David is one or the World's experts on quantitative risk analysis. He is also a pragmatist and is able to explain theories and

concepts at a level we can all understand. David has answered many questions that I have posed him and has never let me down.”

Francois Grobler

Private Consultant “I regard David as an authority on probabilistic modelling and risk analysis. His books and software are excellent as is his ability to

convey the complexities of the stochastic world to industry novices and experts alike. I have attended and recommended his training

courses on numerous occasions. I am grateful to have him as a professional contact and recommend him highly for risk analysis

related software, training and consulting.”

Stephen Hough

Aviation Consultant “David was hired as an expert witness in relation to stochastic modelling by the companies I worked for on two major international

arbitrations. His work was invaluable in preparing our case for the arbitrations. He was able to get his mind round the issues of the

particular cases very quickly and then articulate these very complex issues, both verbally and in writing, in a way that can be

understood easily by non-stochastic modellers, eg lawyers, barristers and senior company management. He can be relied upon to

deliver the needed output on time.”

Stephen Jewell

Director at Risk Champion Ltd “David is one of the foremost risk experts in the world, particularly around risk analysis and one of the first people I turn to for risk

advice on a wide range of topics. His book on Quantitative Risk Analysis is the most comprehensive I have ever read. In addition he

has taken all his 20+ years experience to produce quantitative risk analysis modeling software that is far superior to any other of its

type in the marketplace”



Risk Analysis Software

SOFTWARE

About Vose Software

Based in Belgium, Vose Software specializes in quantitative risk analysis software tools. Our primary focus is providing commercial and custom built risk analysis tools that help clients make better, more informed decisions in the face of uncertainty and risk.

Vose Software also offers comprehensive consulting and training services in the field of risk analysis.

ModelRisk

ModelRisk provides the capability to perform high speed Monte Carlo simulations within your Microsoft Excel models to help you better understand the risks and uncertainty in your work. ModelRisk offers more tools and features than any other product in its class.

Key ModelRisk features include: largest range of distributions, copulas (correlation structures) and time series available; fitting to data; stochastic optimization; modeling extreme events; conversion tools for competing products; a free viewer application so you can share your results; and much more.

ModelTree

ModelTree offers a simple way to understand and interpret your decision options under uncertainty by adding the capabilities of decision trees to Microsoft Excel. A decision tree is a flowchart-like graph that uses branches to clarify every possible result of a decision. Probabilities and expected outcomes are assigned to the different possible outcomes of a decision, and ModelRisk then determines the optimal set of decisions to make.

ModelTree can be used to simplify complex challenges, visualize all possible outcomes and categorize risk topics.

http://www.vosesoftware.com/videotour.php

http://www.vosesoftware.com/videotourMT.php



Risk Analysis Training

TRAINING

We provide training in all aspects of risk analysis thinking and modeling, from basic awareness and communication of the results of a risk assessment to complex mathematical modeling. We particularly specialize in the quantitative modeling of risk. Our trainers have excellent communication skills and at least ten years’ experience as risk analysts across a wide range of industries. The courses are very hands-on and designed to focus on the types of risk that your organization faces, the data you have available and the decisions you need to make.

Risk Analysis Consulting

CONSULTING

Our consultancy services cover a wide range of industries and broadly fall into four areas:

Auditing

Expert Witness Services

Risk analysis and probabilistic modeling

Methodologies

We have consulted to government and business organizations around the world in fields as diverse as climate change, banking, insurance, capital investment, shipping, steel, aerospace, electricity generation, railway infrastructure, and animal and human health. We take a very individualistic approach to each consulting engagement, and have a reputation for providing innovative solutions to complex risk issues.

For more information please visit www.vosesoftware.com.

Visit Our Head Office

OFFICES

Our head office is conveniently based in the historic city of Ghent, Belgium – just 35 minutes from

Brussels by train. Our modern training facilities can accommodate up to 30 people. We also

welcome casual visitors - why not pop in for a chat the next time you’re in the area?
