A historical perspective on analytical measurement uncertainty: From Cotes, Laplace and Gauss to GUM and implications for current practice

William R. Porter, PhD

Principal Scientist

Peak Process Performance Partners

History of GUM, May 2015

Why measure things?

“In physical science a first essential step in the direction of learning any subject is to find principles of numerical reckoning and practicable methods for measuring some quality connected with it. I often say that when you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meagre and unsatisfactory kind; it may be the beginning of knowledge, but you have scarcely in your thoughts advanced to the stage of science, whatever the matter may be.”

Sir William Thomson, Baron Kelvin (from a lecture to the Institution of Civil Engineers, London (3 May 1883), 'Electrical Units of Measurement', Popular Lectures and Addresses (1889), Vol. 1, 80-81).

Copyright 2015 Peak Process Performance Partners LLC


And the Reportable Value Is…?

In order for science to progress, scientists propose hypotheses to test, and then perform experiments in which they collect data to support or refute their hypotheses.

In order to collect data, scientists have to make experimental measurements.

Statistical analysis of experimentally obtained measured values is performed to test or refute hypotheses.




And the Reportable Value Is…?

In order for engineers to design and implement engineering projects, they must be able to collect pertinent data.

In order to collect data, engineers have to make experimental measurements.

Statistical analysis of experimentally obtained measured values is used to evaluate and control engineering processes.




Metrology

The process of obtaining measured values experimentally is the aim of the branch of science called metrology.

Metrology is “the science of measurement, embracing both experimental and theoretical determinations at any level of uncertainty in any field of science and technology” (as defined by the International Bureau of Weights and Measures).




Measurement is a Process

The true quantity of the objective of the measurement process (the measurand) can never actually be observed. It is an ideal theoretical Platonic Form.

Only experimentally measured values can be observed. These are the actual flickering shadows on the wall of Plato’s cave.

The number of measured values that can be collected in practice to estimate the true quantity of the measurand is limited.




Reality Is Not “Real”…

…but unreality is “real” i.e., “truth”

How can this be true?




The Ideal “True” Quantity to Be Measured…

…is a single, fixed, real-valued number.

The circumference of a truly circular object with a true diameter of one meter is π meters.

• This is a “real” number, in mathematical terms.

• There has never been in the past, is not now, nor will there ever be in the future any actual measurement process capable of proving that an object is truly circular, has a diameter of exactly one meter, or has a circumference of exactly π meters.

• “True” quantities are “real” numbers but are unreal, because they can never actually be measured.




All Practical Measured Values…

…are discrete integer multiples of some fundamentally quantized unit of measurement.

Actual measured values can never be “real” numbers (mathematically) and can never be identically equal to the true quantity to be measured. NOT EVER.

• But they are the real (actually measurable) numbers we call data! They are always discrete integers.




Measurement Granularity

The smallest quantum of the measurement process should be such that:

• When repeated measurements are carefully made of the same object using the same process by the same operator, the results exhibit multiple discontinuous integer values scattered about some central value.




Fraud and Granularity

If two or more sets of measurement values are collected under slightly different circumstances, and the two sets of results agree exactly, then:

• Either fraud has occurred, or

• The measurement process is insufficiently granular (the quantum of measurement is too big), or

• A wildly improbable coincidence has occurred.

No valid measurement process is ever expected to generate perfectly reproducible results except under wildly improbable conditions.




Uncertainty in Measurements

All measurement processes are uncertain; no measurement process yields results that are entirely free of uncertainty.

Only the quantitative magnitude of the estimated uncertainty distinguishes a measurement process that yields useful reportable measured values from one that yields uninterpretable results.

All measured values are wrong, but some are useful.




In Other Words…

“Every measurement is subject to some uncertainty. A measurement result is only complete if it is accompanied by a statement of the uncertainty in the measurement. Measurement uncertainties can come from the measuring instrument, from the item being measured, from the environment, from the operator, and from other sources. Such uncertainties can be estimated using statistical analysis of a set of measurements, and using other kinds of information about the measurement process. There are established rules for how to calculate an overall estimate of uncertainty from these individual pieces of information. The use of good practice – such as traceable calibration, careful calculation, good record keeping, and checking – can reduce measurement uncertainties. When the uncertainty in a measurement is evaluated and stated, the fitness for purpose of the measurement can be properly judged.”

Stephanie Bell, A Beginner’s Guide to Uncertainty of Measurement, The National Physical Laboratory, http://www.npl.co.uk/publications/a-beginners-guide-to-uncertainty-in-measurement




Granularity, again

The rule of thumb is that there should be at least ten discrete, equally spaced quantized values within a span of 6 standard uncertainty units u.

If the measurement process is insufficiently granular, you need a better process.

Clearly, we need a quantitative estimate of the magnitude of the uncertainty, u.
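
A minimal sketch of how this rule of thumb might be checked in Python. It reads the rule as requiring the span 6u to cover at least ten steps of the measurement quantum; that reading, the function names, and the example numbers are illustrative assumptions, not part of the original slides.

```python
def steps_within_span(quantum: float, u: float, span_in_u: float = 6.0) -> float:
    """Number of quantum-sized steps that fit in a span of span_in_u * u."""
    if quantum <= 0 or u <= 0:
        raise ValueError("quantum and u must be positive")
    return span_in_u * u / quantum


def is_sufficiently_granular(quantum: float, u: float, min_steps: int = 10) -> bool:
    """Assumed reading of the rule of thumb: at least min_steps steps within 6u."""
    return steps_within_span(quantum, u) >= min_steps


# Hypothetical balance reading to 0.01 g with u = 0.02 g (about 12 steps in 6u).
print(is_sufficiently_granular(quantum=0.01, u=0.02))
# Hypothetical balance reading to 0.1 g with u = 0.1 g (about 6 steps in 6u).
print(is_sufficiently_granular(quantum=0.1, u=0.1))
```

The check needs a value for u, which is why a quantitative estimate of the uncertainty has to come first.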




The Measurement Process

Collect a limited number of observed values O1,…,On, where n is “small” (e.g. << 30).

Combine these in some way so as to obtain a plausible point estimate Y of the unknowable true quantity of the measurand.

Also combine these in some way, with additional information as needed, so as to obtain a symmetric interval estimate Y ± ku that encompasses the true value of the measurand with some specified level of plausibility indicated by k.

• k = 1 (plausible), k = 2 (highly plausible), k = 3 (very highly plausible)

The reportable measurement value is: Y ± ku. See Guide to the Expression of Uncertainty in Measurement (GUM).
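
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the arithmetic mean is used as the way of combining the observations (the slides say only "in some way"), the standard uncertainty u is supplied from outside as a process-level estimate, and the function name and example data are hypothetical.

```python
from statistics import mean


def reportable_value(observations, u, k=2):
    """Return (Y, lower, upper) for the symmetric interval Y ± k*u.

    Assumptions: Y is the arithmetic mean of the observations, and u is a
    process-level standard uncertainty estimated elsewhere.
    """
    if len(observations) == 0:
        raise ValueError("at least one observation is required")
    if u < 0 or k <= 0:
        raise ValueError("u must be non-negative and k must be positive")
    y = mean(observations)
    half_width = k * u
    return y, y - half_width, y + half_width


# Hypothetical example: three observations, u = 0.1, k = 2 (highly plausible).
y, lower, upper = reportable_value([10.1, 10.3, 10.2], u=0.1, k=2)
print(f"{y:.2f} ± {2 * 0.1:.2f}  (interval {lower:.2f} to {upper:.2f})")
```

Keeping u as an input rather than computing it from the same few observations reflects the small-n situation described above.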




True Quantity of the Measurand

| Method | Symbol | Properties |
| --- | --- | --- |
| Classical frequentist | µ | Fixed real number. |
| Bayesian | µ (for Normal mean) | Uncertain real number with Normal probability distribution. |
| GUM | No symbol | Unknowable hypothetical fixed real number; cannot be reported. |




Best Estimated Value of the Measurand


| Method | Symbol | Properties |
| --- | --- | --- |
| Classical frequentist | ȳ | Random real number with Normal probability distribution. |
| Bayesian | y (for Normal mean) | Fixed real number. |
| GUM | Y | Fixed integer or rational fraction multiplied by quantum of measurement. |



In all three cases, the best estimated value of the measurand is computed from the n observations O1,…,On that are combined to estimate the measured value.


Dispersion of Best Estimated Value of the Measurand


| Method | Symbol | Properties |
| --- | --- | --- |
| Classical frequentist | s_pooled | Random real number. |
| Bayesian | s_posterior (for Normal mean) | Fixed real number. |
| GUM | u_process | Integer or rational fraction multiplied by quantum of measurement; composed of both fixed and random components. |



Estimates of dispersion in all three cases pertain to the process as a whole, and generally should not be estimated just from the one set of observations used to generate the best estimated value of the measurand for a particular reported measured value. Pooling of information obtained from other, similar measurements is nearly always needed.
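
One common way to pool scatter information from several similar sets of measurements is the degrees-of-freedom-weighted pooled standard deviation. The sketch below assumes that this textbook formula is what is wanted here; the slides do not spell out a pooling formula, and the function name and data are hypothetical.

```python
from math import sqrt
from statistics import variance


def pooled_std(groups):
    """Pooled standard deviation across several groups of replicate values.

    Each group's sample variance is weighted by its degrees of freedom (n - 1).
    Groups with fewer than two values carry no scatter information and are skipped.
    """
    weighted_sum = 0.0
    dof = 0
    for group in groups:
        n = len(group)
        if n < 2:
            continue
        weighted_sum += (n - 1) * variance(group)
        dof += n - 1
    if dof == 0:
        raise ValueError("need at least one group with two or more values")
    return sqrt(weighted_sum / dof)


# Hypothetical replicate sets from earlier, similar measurements.
history = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]
print(pooled_std(history))
```

Pooling in this way gives a dispersion estimate with more degrees of freedom than any single small set of observations could provide.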


Interval Estimated Value of the Measurand


| Method | Interval | Properties |
| --- | --- | --- |
| Classical frequentist | ȳ ± t_{α/2} s | Random real number. Probability estimate is objective and quantitative. Only 100(1 – α)% of intervals contain the true quantity of the measurand. |
| Bayesian | ȳ ± t_{α/2} s_posterior (for Normal mean) | Fixed real number. Probability estimate is subjective and quantitative. Only 100(1 – α)% of potential true quantities of the measurand are in the interval. |
| GUM | Y ± ku | Integer or rational fraction multiplied by quantum of measurement; composed of both fixed and random components. Probability estimate is subjective and qualitative. No exact probability can be assigned. |




Uncertainty

The uncertainty u is the square root of the sum of the squares of what we don’t know (Bayesian uncertainty b) and what we can’t know (Frequentist uncertainty s):

$u = \sqrt{b^2 + s^2}$

b is irreducible residual symmetric bias.

s is random scatter.

• Ancillary data may (and should) be used as needed to estimate both b and s!
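
The root-sum-of-squares combination of b and s described above can be written as a short Python function. This is a sketch only; the function name and the example values are hypothetical, and both inputs are assumed to be expressed in the same units.

```python
from math import hypot


def combined_uncertainty(b: float, s: float) -> float:
    """Combine residual bias b and random scatter s as sqrt(b**2 + s**2)."""
    if b < 0 or s < 0:
        raise ValueError("b and s must be non-negative")
    return hypot(b, s)


# Hypothetical example: residual bias 0.03 and random scatter 0.04, same units.
u = combined_uncertainty(0.03, 0.04)
print(f"u = {u:.3f}")
```

Because the terms add in quadrature, the larger of b and s dominates u; reducing the smaller one has comparatively little effect.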


We Have a Problem, Houston.

The aim of metrology is inconsistent with contemporary statistical theories as rigorously defined by Frequentists or Bayesians.

We need a more general theory, because the metrology problem won’t go away and metrologists cannot accept current statistical wisdom.

Neither the Frequentists nor the Bayesians adequately address the problem; a mixture of approaches is needed.

What is needed is a SUPERPOSITION of current principles.




HOW DID WE GET INTO THIS MESS?

Some history…




A Long, Long Time Ago…

There once was a time when scientists were content to take a single measurement of an object meticulously and then report this number, which they had so carefully obtained, as evidence for or against support of some hypothesis about the workings of Nature. That time is long gone. It’s been gone for nearly 3 centuries.




1720: Roger Cotes

Roger Cotes was an English mathematician and colleague of Isaac Newton.

He conjectured that reporting the arithmetic average of a group of observations decreases the error of the measurement process and yields a value more closely approaching the true quantity that we are trying to estimate.

It was just a conjecture; later workers helped to demonstrate the value of this approach.

• Cotes R. Aestimatio errorum in mixta mathesi per variationes partium trianguli plani et sphaerici. In Smith R, ed. Harmonia mensurarum. Cambridge, England: pages 1-22 (1722).




1750: Tobias Mayer

German astronomer Tobias Mayer introduced a method for combining many redundant observations (the method of averages, a forerunner of Least Squares) to refine astronomical measurements.

By this time, averaging of astronomical observations was becoming common practice.

• Mayer T. Abhandlung über die Umwälzung des Monds um seine Axe und die scheinbare Bewegung der Mondsflecken. Kosmographische Nachrichten und Sammlungen auf das Jahr 1748, Nuremberg, pp. 52–183 (1750).




1755: Roger Boscovich

Ragusan (modern-day Dubrovnik, Dalmatia) physicist and astronomer Roger Joseph Boscovich proposed minimizing the sum of absolute deviations from some target “best estimate.”

• Boscovich RJ, Maire C. De Litteraria Expeditione per Pontificiam ditionem ad dimetiendos duos Meridiani gradus. Rome: Palladis (1755).




1755: Thomas Simpson

English mathematician Thomas Simpson proposed that the mean of a series of observations was a better estimate of the true quantity of the object to be measured than any single observation, however meticulously obtained.

The deviations from the mean provided useful information about the uncertainty of the measurement.

• Simpson T. A letter to the Right Honorable George Earl of Macclesfield, President of the Royal Society, on the advantage of taking the mean of a number of observations, in practical astronomy. Philosophical Transactions of the Royal Society of London, 49: 82–93 (1755).




1755: Rev. Thomas Bayes

Bayes, in a comment on Simpson’s surmise, noted that the mean only made sense as a superior estimator if the deviations from the mean were symmetric about it.

Simpson took note and revised his recommendation in 1757.

Report both the mean (as the “best” estimate) and the scatter of the deviations from the mean.

• Simpson T. Miscellaneous Tracts on Some Curious, and Very Interesting Subjects in Mechanics, Physical-Astronomy, and Speculative Mathematics. London: J. Nourse, p. 64 (1757).




1788: Pierre-Simon Laplace

French mathematician and astronomer Pierre-Simon Laplace extended the approach to combining observations previously introduced by Mayer and applied it to studies of planetary motion.

• Laplace P-S. Théorie de Jupiter et de Saturne. Paris: Academy of Sciences (1787).

• Laplace P-S. Mécanique Céleste. Paris (1799–1825).

Laplace dealt both with the case where all observations were obtained under the same conditions (repeated measurements) and with the case where they were obtained under different conditions (what we would call estimates of intermediate precision).




1805: Adrien-Marie Legendre

Another French mathematician, Adrien-Marie Legendre, provided a simple guide in 1805 to the process of data reduction employed by Mayer and Laplace and gave it the name we know it by today: the method of least-squares estimation.

• Legendre A-M. Nouvelles méthodes pour la détermination des orbites des comètes [New Methods for the Determination of the Orbits of Comets] (in French). Paris: F. Didot. See appendix: Sur la Méthode des moindres quarrés [On the method of least squares], pp. 72–75 (1805).




1809: Carl Friedrich Gauss

German mathematician Carl Friedrich Gauss developed the theory of the Normal Probability Distribution to replace a cruder attempt at assigning a probability to distributions of repeated measurements first introduced by Laplace.

• Gauss CF. Theoria motus corporum coelestium in sectionibus conicis solem ambientium (Theory of motion of the celestial bodies moving in conic sections around the Sun). (1809). English translation by C. H. Davis, New York: Dover (1963).

Gauss did not invent the Normal Probability Distribution; that distribution had been proposed earlier by French mathematician Abraham de Moivre working in England in 1738 as a large sample approximation to the binomial distribution.




1809: Gauss’s Claim

Gauss was the first to assert that the arithmetic mean of a set of observations that scattered about some central value with a distribution approaching the Normal Distribution is in fact the best single point estimate of the set of values.

But his reasoning was somewhat circular, as Laplace was quick to point out.




1810: Laplace, Again

Laplace jumped in with an elegant argument based on his Central Limit Theorem to support Gauss’s argument.

• Laplace P-S. Mémoire sur les intégrales définies et leur application aux probabilités, et spécialement à la recherche du milieu qu’il faut choisir entre les résultats des observations. Mémoires de l’Académie des sciences de Paris, pp. 279–347 (1810).

This is the part of the argument that trips up most practicing metrologists.




1823: Gauss, Again

Gauss was able to generalize Laplace’s argument to provide a coherent noncircular derivation of the method of least squares assuming a Normal Probability Distribution with characteristic mean and standard deviation.

• Gauss CF. Theoria Combinationis Observationum Erroribus Minimis Obnoxiae. Göttingen: Dieterich (1823).




Eureka!

The metrology problem is solved (sort of).

The reportable value is:

Mean ± Standard Deviation

Well, maybe…




Long Time Passing…

Frequentist methods for statistical analysis of data evolved.

People like Francis Galton, Francis Edgeworth, William Gosset (Student), the Pearsons, Ronald Fisher and others advanced the theory of statistics but (perhaps excepting Student) forgot about the nitty-gritty granular details of the metrology problem.




1924: Walter Shewhart

Shewhart introduces the Control Chart for engineering quality control and assurance.

As originally envisioned, this DOES NOT use exact probabilities, but only plausible limits.



• Shewhart’s original work was based on granular measured values and did not use Normal Curve probabilities.

• Later, statisticians dressed it up in mumbo-jumbo to make Shewhart’s practical engineering tool fit the restrictive confines of Frequentist statistical theory.


1963: Mary Natrella

National Bureau of Standards statistician Mary G. Natrella introduces a simple set of guidelines for reporting measurement uncertainty in her classic handbook.

• Natrella MG. Experimental Statistics. NBS Handbook 91, ch. 23. Washington: US Government Printing Office (1963).

Both residual bias and random scatter were included in her recommendation.




1964: John Mandel

Another NBS statistician, John Mandel, writes a book on how to evaluate experimental data and includes some discussion on measurement uncertainty.

• Mandel J. “The systematic evaluation of measuring processes,” ch. 13 in The Statistical Analysis of Experimental Data. New York: Interscience (1964).




Over the Next Decade…

Metrologists around the globe struggled to come up with a simple, standardized procedure to estimate measurement uncertainty in a consistent way.




1977–1981: BIPM

An international collaboration instigated by the Bureau International des Poids et Mesures (International Bureau of Weights and Measures) in 1977 resulted in an initial recommendation issued internally in 1980 and then published in 1981.

• Giacomo P. Expression of experimental uncertainties. Metrologia 17:73–74 (1981).




Many Committee Meetings Later…

The Guide to the Expression of Uncertainty in Measurement (GUM) was first published in 1993 and subsequently updated and revised.

• BIPM, IEC, IFCC, ISO, IUPAC, OIML. Guide to the Expression of Uncertainty in Measurement. International Organization for Standardization, Geneva, First Edition (1993), reprinted and corrected (1995).

• BIPM, IEC, ILAC, IFCC, ISO, IUPAC, OIML. Evaluation of Measurement Data: Guide to the Expression of Uncertainty in Measurement. (2008).




The GUM is a Legal Standard

People who make measurements for submission to many regulatory agencies are required to follow the GUM.

European colleagues are especially adamant that such compliance be demonstrated.

No ifs, ands or buts, you have to get this done correctly.

In many venues, it’s the law.

• And you had better do it the GUM way!




The Challenge to Statisticians

Bring statistical theory into congruence with the Bayesian–Frequentist duality required by the GUM.

$u = \sqrt{b^2 + s^2}$

This is analogous to quantum physics and the problem of wave-particle duality.


References

GUM: http://www.bipm.org/utils/common/documents/jcgm/JCGM_100_2008_E.pdf

VIM: http://www.bipm.org/utils/common/documents/jcgm/JCGM_200_2012.pdf

ASTM (§14.02):

• E29-08 Standard Practice for Using Significant Digits in Test Data to Determine Conformance with Specifications

• E2655-08 Standard Guide for Reporting Uncertainty of Test Results and Use of the Term Measurement Uncertainty in ASTM Test Methods

• E2782-11 Standard Guide for Measurement Systems Analysis (MSA)



A superior man, in regard to what he does not know, shows a cautious reserve. If names be not correct, language is not in accordance with the truth of things. If language be not in accordance with the truth of things, affairs cannot be carried on to success. When affairs cannot be carried on to success, proprieties and music do not flourish. When proprieties and music do not flourish, punishments will not be properly awarded. When punishments are not properly awarded, the people do not know how to move hand or foot. Therefore a superior man considers it necessary that the names he uses may be spoken appropriately, and also that what he speaks may be carried out appropriately. What the superior man requires is just that in his words there may be nothing incorrect. — Confucius, Analects, Book XIII, Chapter 3, verses 4-7, translated by James Legge
