
Cantilever Beam End-to-End UQ Test Problem and Evaluation Criteria for UQ Methods Performance Assessment

Vicente Romero, Ben Schroeder, Matt Glickman, Justin Winokur

Sandia National Laboratories, Albuquerque, NM

ASME Verification & Validation Symposium, May 3-5, 2017

Sandia is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.

Sandia National Laboratories document SAND2017-4592 C (unlimited release).

Introduction

• This talk outlines an End-to-End (E2E) UQ model problem

– It is a “test” problem: the physics and uncertainty Truth Models will be released in a year or two, possibly for self-checks and workshops

– It has many features of real problems that make it useful for significant assessment of UQ and Credibility approaches and of their potential practicality and effectiveness on real problems

– This talk only outlines the problem; no approaches or results are presented

• Performance Metrics are introduced that can help quantify UQ method reliability, conservatism, and risk over thousands of realizations of incomplete uncertainty information, and how a method handles that incompleteness

– Opportunity to benchmark and compare methods using specified performance metrics

UQ Elements in End-to-End Analysis

[Figure: diagram of the UQ elements in an end-to-end analysis. Labels recoverable from the figure:]

• Uncertainty Characterization and Representation: model and experiment inputs and results; corrections and/or uncertainties

• Uncertainty categorization: probabilistic v. non-probabilistic; random v. systematic; aleatory v. epistemic; traveling v. non-traveling

• Uncertainty representation: intervals; distributions; Pboxes; random fields; discrete

• Model input uncertainties: continuous; discrete; probabilistic; non-probabilistic

• Model Validation; Model Calibration; Model Conditioning

• Calculation or Solution Verification; Sensitivity Analysis; UQ Based Resource Allocation

• Uncertainty Quantification: model prediction-bias uncertainty estimation; solution bias uncertainty estimation; propagation; extrapolation; aggregation; performance margin UQ

• Uncertainty “Roll Up” (any or all of): propagated uncertainty + extrapolated uncertainty + solution error uncertainty

• Margin Assessment: design margins; safety margins; model acceptability

Motivation: A Relatively Simple End-to-End UQ Problem (generic, not the Beam Problem)

[Figure: flow diagram of a generic end-to-end UQ problem. Labels recoverable from the figure:]

• Stochastic system, e.g. randomly varying geometries and material properties

• Experimental Data Set A and Experimental Data Set B, each with scalar input

• Calibrated sub-model is state-variable dependent (an uncertain function)

• Model in validation conditions; propagate uncertainty realizations through the model

• Model predictions and experimental results are compared for Model Validation (plot of cumulative probability, 0.00 to 1.00, against “E# Al Box n fluence”, 2e+15 to 1.2e+16), yielding validation information

• New uncertainties specific to application conditions; predictions at application conditions

• QMU analysis where a response threshold is specified for a system safety or performance goal

• Scalar response outputs (functional outputs are much more difficult to compare effectively)

• Several more layers and significant sources of uncertainty are usually involved

Sandia is working on identifying, evaluating, and developing practical and effective procedures for:

A) Integration of experimental information into models and predictions

– Experiment design and UQ analysis of experimental results

– Parameter Estimation, Model Calibration

– Model Validation

B) Extrapolation with UQ

C) QMU (Quantification of Margins and Uncertainty)

Evaluate the most promising candidate methods on several engineering test problems relevant to our application space

The Cantilever Beam E2E UQ Test Problem is a simplified end-to-end UQ problem we are currently working on

Ultimate Goal: Systems Engineering of VVUQ in Experimental and M&S Workflows

Experiments → Model Development/Calibration → Validation → Extrapolation → QMU

Establish Credibility of:

Cantilever Beam E2E UQ Test Problem

• Simple prototype problem for stochastic physical systems with scalar inputs and outputs

– population of beams with small random variations in geometry (Length, Width, Height) and material property (E)

• Random and Systematic components of measurement uncertainty on inputs and outputs of tests to characterize Deflection variability

• Probabilistic and Interval uncertainties

• Load/BC control errors (the load varies from test to test about the target load P at which Deflection variability is to be determined)

• Very few experimental replicate tests (sparse variability data; substantial aleatory and epistemic types of uncertainty)

• Model Discretization related solution error/uncertainty

• Model Calibration, Validation, Extrapolative prediction, QMU
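As a rough illustration of the kind of stochastic system described, the sketch below samples a population of beams and computes end deflections. It assumes the textbook Euler-Bernoulli tip deflection for an end-loaded cantilever with a rectangular cross-section, d = P L^3 / (3 E I) with I = W H^3 / 12. The nominal values, the 1% and 2% scatter levels, and the names `tip_deflection` and `sample_population` are hypothetical; the actual physics and uncertainty Truth Models have not been released and may differ.

```python
import random


def tip_deflection(P, L, W, H, E):
    """Tip deflection of an end-loaded cantilever (assumed Euler-Bernoulli
    model with rectangular cross-section; not the released truth model)."""
    I = W * H**3 / 12.0  # second moment of area of the rectangle
    return P * L**3 / (3.0 * E * I)


def sample_population(n, seed=0):
    """Draw n beams with small random variations and return deflections.

    All nominal values and scatter levels below are hypothetical.
    """
    rng = random.Random(seed)
    nominal = {"L": 1.0, "W": 0.05, "H": 0.01, "E": 70e9}  # SI units
    target_load = 10.0
    deflections = []
    for _ in range(n):
        # 1% standard deviation on each geometric/material quantity
        beam = {k: rng.gauss(v, 0.01 * v) for k, v in nominal.items()}
        # load-control error: applied load varies about the target load
        load = rng.gauss(target_load, 0.02 * target_load)
        deflections.append(tip_deflection(load, **beam))
    return deflections


if __name__ == "__main__":
    d = sample_population(10000)
    print("mean deflection:", sum(d) / len(d))
```

Only a handful of such realizations (4 tests) are available to the analyst in the test problem, which is what makes the estimation tasks in Part A difficult.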

Problem Part A: Experimental Data UQ and QMU

A.1

• 4 beams tested at random from the population

• Loads vary about target Po

• Beams vary in geometry and material properties

• No measurement errors

• Use data from the 4 tests to estimate or bound:

– Deflection variability for the whole population

– Probability that deflection > Dcrit

A.2

• Same as A.1, but random and systematic errors exist on measurements of output deflection

• Errors are unknown but consistent with supplied probabilistic and interval characterizations of random and systematic uncertainty
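To make the A.1 task concrete, here is a deliberately naive baseline, not a method proposed in this talk: fit a normal distribution to the 4 measured deflections and read off the probability of exceeding Dcrit. The function name and the normality assumption are illustrative only. With 4 samples the fitted mean and standard deviation are themselves highly uncertain, and this point estimate ignores that, which is exactly the kind of behavior the performance metrics later in the talk are meant to expose.

```python
from statistics import NormalDist, mean, stdev


def naive_exceedance_probability(deflections, d_crit):
    """Point estimate of P(deflection > d_crit) from a normal fit.

    Assumes normality and ignores the sampling uncertainty of the
    fitted parameters (a naive baseline, for illustration only).
    """
    mu = mean(deflections)
    sigma = stdev(deflections)  # sample standard deviation (n - 1)
    return 1.0 - NormalDist(mu, sigma).cdf(d_crit)


if __name__ == "__main__":
    # Hypothetical deflection measurements (mm) from 4 tests
    tests = [11.2, 11.6, 11.4, 11.9]
    print(naive_exceedance_probability(tests, d_crit=12.5))
```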

Part A: Experimental Data UQ and QMU

A.3

• Add random and systematic errors on measurements of input loads in the 4 tests

• Errors are unknown but consistent with supplied probabilistic and interval characterizations of random and systematic uncertainty

• Isolate deflection variability due to beam variations only (no load variations) and predict it for loads Po, 0.9Po, and 1.1Po, at which no experimental deflections exist

• A 5th test is supplied to yield load-deflection relationship information

– potentially used to address the isolation task

– and to account for measurement errors on input loads
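One simple way to think about the isolation task is sketched below, under two assumptions that the problem statement does not guarantee: deflection is proportional to load for each beam (linear elastic response), and the recorded loads are exact. A.3 explicitly introduces load measurement errors, so this is only a starting point; the function name is hypothetical.

```python
def rescale_to_target_load(loads, deflections, target_load):
    """Rescale each test's deflection to a common target load.

    Assumes deflection is proportional to load for every beam and that
    the recorded loads are error-free (both are assumptions).
    """
    return [d / p * target_load for p, d in zip(loads, deflections)]


if __name__ == "__main__":
    # Hypothetical data: loads scattered about a target Po = 10.0
    loads = [9.8, 10.3, 10.1, 9.7]
    deflections = [11.2, 11.6, 11.4, 11.9]
    for factor in (0.9, 1.0, 1.1):
        print(factor, rescale_to_target_load(loads, deflections, factor * 10.0))
```

After rescaling, the remaining scatter in the deflections reflects beam-to-beam variation only, to the extent that the two assumptions hold.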

Part B: Model Introduced (Calibrate and Estimate Deflection Variability and Exceedance Probability)

Scenario 1 – beam dimensions can be varied in calibration model

• Geom. UQ Case A – the 4 beams’ dimensions are not measured but are controlled to within stated manufacturing tolerances

• Geom. UQ Case B – the 4 beams’ dimensions are individually measured but subject to specified measurement uncertainties

• Predict population deflection variability and exceedance probability for loads Po, 0.9Po, 1.1Po

• Model discretization error/uncertainty

• Response-surface surrogate model error/uncertainty

• Compare to estimates using experimental data only
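For Geom. UQ Case A, a minimal sketch of how dimensional tolerances could turn into an interval on a calibrated modulus E is shown below. It assumes the Euler-Bernoulli end-load formula d = P L^3 / (3 E I) with I = W H^3 / 12 as the calibration model, which is an assumption since the problem's model is not specified here, and it ignores load and deflection measurement errors. Function names and numbers are hypothetical.

```python
from itertools import product


def calibrate_modulus(P, L, W, H, deflection):
    """Invert d = P L^3 / (3 E I), I = W H^3 / 12, for E (assumed model)."""
    I = W * H**3 / 12.0
    return P * L**3 / (3.0 * I * deflection)


def modulus_interval(P, deflection, L_tol, W_tol, H_tol):
    """Bound E when each dimension is only known to lie in a (low, high)
    tolerance interval. E is monotonic in each dimension under the assumed
    model, so checking the 8 corner combinations is sufficient."""
    values = [
        calibrate_modulus(P, L, W, H, deflection)
        for L, W, H in product(L_tol, W_tol, H_tol)
    ]
    return min(values), max(values)


if __name__ == "__main__":
    lo, hi = modulus_interval(
        P=10.0,
        deflection=0.0114,
        L_tol=(0.99, 1.01),
        W_tol=(0.0495, 0.0505),
        H_tol=(0.0099, 0.0101),
    )
    print(f"E in [{lo:.3e}, {hi:.3e}] Pa")
```

In Case B the same inversion would use the individually measured dimensions, with their measurement uncertainties replacing the manufacturing tolerances.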

Part B: Model and Calibration to Estimate Deflection Variability and Exceedance Probability

Scenario 2 – beam dimensions cannot be varied in calibration model

• Geom. UQ Case A – the 4 beams’ dimensions are not measured but are controlled to within stated manufacturing tolerances

• Geom. UQ Case B – the 4 beams’ dimensions are individually measured but subject to specified measurement uncertainties

• What strategy is used to set beam dimensions in the model? What are the implications and impacts?

• Predict population deflection variability and exceedance probability for loads Po, 0.9Po, 1.1Po

• Compare to B.1 results and to the use of experimental data only

Part C: Model Validation, Potential Associated Adjustment of Prediction Model, and Extrapolative Prediction and Analysis

Validation Configuration A

• Validation beams have different dimensions than calibration beams

• Different loading (uniform distributed load)

• Same temperature

• Random and systematic measurement uncertainties, both probabilistic and interval, in the validation experiments

• Response-surface surrogate model error/uncertainty

• Predict deflection and compare to results of 1 test

• May elect to adjust model based on comparison results
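Configuration A changes the loading from a point end-load to a uniform distributed load. Assuming small-deflection Euler-Bernoulli theory (an assumption; the problem's actual model is not given here), the two textbook tip deflections are P L^3 / (3 E I) for an end load P and w L^4 / (8 E I) for a uniform load w per unit length. The sketch below compares them; the function names and numbers are illustrative.

```python
def tip_deflection_point_load(P, L, E, I):
    """Tip deflection for a point load P at the free end."""
    return P * L**3 / (3.0 * E * I)


def tip_deflection_uniform_load(w, L, E, I):
    """Tip deflection for a uniform load w (force per unit length)."""
    return w * L**4 / (8.0 * E * I)


if __name__ == "__main__":
    L, E = 1.0, 70e9
    I = 0.05 * 0.01**3 / 12.0
    P = 10.0
    w = P / L  # same total load, spread uniformly along the beam
    ratio = tip_deflection_uniform_load(w, L, E, I) / tip_deflection_point_load(P, L, E, I)
    print(ratio)  # 3/8 under these formulas
```

A model calibrated on end-load data therefore predicts validation deflections through a different load-to-deflection map, which is part of what makes Configuration A a test of the model rather than a repeat of calibration.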

Validation Configuration B

• Much higher temperature, with possible material strength effects

• Validation beams have different dimensions than calibration beams

• Point end-load Po, as in the calibration setting

• Random and systematic measurement uncertainties, both probabilistic and interval, in the validation experiments

• Response-surface surrogate model error/uncertainty

• Predict deflection and compare to results of 2 replicate tests

• May elect to adjust model based on comparison results

Model Validation Relevance and Extrapolative Prediction and QMU

• Validation beams have the same geometry/dimensions as the beams for which post-validation predictions are to be made

• How is the model’s predictive ability characterized from the two validation activities?

• What about after the model is adjusted, if it is adjusted (based on the validation results)?

• What are the caveats, range of use conditions, uncertainty, “confidence”, “credibility”, etc. of the model and/or model predictions for end-load Po and for uniform loading at the validation conditions?

• What about for a specified higher temperature, T_hot?

• Operational Restriction: determine the “safe” temperature for 99.9% reliability of end-deflection not exceeding Dcrit for end-load Po
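The Operational Restriction question can be sketched as a root-finding problem on reliability versus temperature. The example below is illustrative only: the linear softening law in `modulus_at`, all nominal values and scatter levels, and the Euler-Bernoulli deflection formula are assumptions, and no model-form, measurement, or sparse-data uncertainty is included. It uses the same random seed at every temperature (common random numbers), so the estimated reliability is non-increasing in temperature and bisection is valid.

```python
import random


def tip_deflection(P, L, W, H, E):
    """Assumed Euler-Bernoulli tip deflection, rectangular section."""
    I = W * H**3 / 12.0
    return P * L**3 / (3.0 * E * I)


def modulus_at(T, E_ref=70e9, T_ref=20.0, slope=4e-4):
    """Hypothetical linear softening of E with temperature T (deg C)."""
    return E_ref * (1.0 - slope * (T - T_ref))


def reliability(T, d_crit, n=10000, seed=0):
    """Monte Carlo estimate of P(deflection <= d_crit) at temperature T."""
    rng = random.Random(seed)  # same seed for every T: common random numbers
    ok = 0
    for _ in range(n):
        L = rng.gauss(1.0, 0.01)
        W = rng.gauss(0.05, 0.0005)
        H = rng.gauss(0.01, 0.0001)
        E = modulus_at(T) * rng.gauss(1.0, 0.01)
        if tip_deflection(10.0, L, W, H, E) <= d_crit:
            ok += 1
    return ok / n


def safe_temperature(d_crit, target=0.999, T_lo=20.0, T_hi=500.0, tol=0.5):
    """Largest temperature (within tol) whose estimated reliability meets
    the target, found by bisection. Returns None if even T_lo fails."""
    if reliability(T_lo, d_crit) < target:
        return None
    if reliability(T_hi, d_crit) >= target:
        return T_hi
    while T_hi - T_lo > tol:
        mid = 0.5 * (T_lo + T_hi)
        if reliability(mid, d_crit) >= target:
            T_lo = mid
        else:
            T_hi = mid
    return T_lo


if __name__ == "__main__":
    print(safe_temperature(d_crit=0.015))
```

In the actual test problem the temperature dependence is not known to the analyst and must be inferred from the validation data, so the uncertainty in the extrapolation to T_hot would have to be carried into any such reliability estimate.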

Discussion

• Plan to introduce the Beam Problem to ASME, AIAA, the Society of Automotive Engineers, and maybe SIAM

• Physics and Uncertainty “Truth Models” for the Cantilever Beam E2E UQ problem will be released in a year or two

– after the UQ community has had a chance to try the problem without knowledge of the truth models

• Robustness, Efficiency, and Accuracy of UQ methods will be distributional quantities that must be characterized over thousands of realizations of complex system dynamics involving sparse data and errors/uncertainties in measurements, model forms, discretization, surrogate models, and UQ strategies, methods, and implementations

Discussion

• It will be a major effort to characterize and differentiate the performance and practicality of E2E UQ and Credibility strategies and methods

• Some progress on performance characterization measures and metrics has already been made in recent sparse-data UQ work

• The next two slides define metrics that help quantify performance tradeoffs between UQ method reliability, conservatism, and risk over thousands of random trials of application of the methods

Weighted Performance Metric considers Proportions and Magnitudes of Overshoot and Undershoot Errors on estimates of e.g. 95% Central Range of Response

• Winokur & Romero (ASCE/ASME J. UQ/Risk, 2017 in review):

– error types are usually not all equally bad; preference weighting is applied via the denominator

– a larger average magnitude of a given error type drives the numerator and the metric up

– lower metric value = better performance

– $P_{ij}$ is the proportion of error type ij; $w_{ij}$ is a relative preference for error type ij

Errors on the estimated upper and lower percentiles (+ = overshoot, − = shortfall):

$\epsilon_U = \%\mathrm{ile}^{\mathrm{estimate}}_{97.5} - \%\mathrm{ile}^{\mathrm{exact}}_{97.5}$

$\epsilon_L = \%\mathrm{ile}^{\mathrm{exact}}_{2.5} - \%\mathrm{ile}^{\mathrm{estimate}}_{2.5}$

case   w++    w−−    w+−    w−+    Comments
A      1.00   0.00   0.00   0.00   2-sided coverage: bound/encompass both upper and lower percentiles
B      0.50   0.00   0.00   0.50   1-sided: upper bound on upper percentile
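The ingredients of this metric can be sketched as follows: over many trials, classify each estimated 95% central range by the signs of its lower and upper errors and tabulate the proportion and mean error magnitude of each type. The metric formula itself is not reproduced in this text, so the sketch stops at those ingredients. Two things are assumptions here: the index order (first sign for $\epsilon_L$, second for $\epsilon_U$, inferred from case B in the table) and the treatment of an exactly zero error as "+". The function name is hypothetical.

```python
from collections import Counter

ERROR_TYPES = ("++", "--", "+-", "-+")


def classify_trials(est_lo, est_hi, exact_lo, exact_hi):
    """Return (proportions, mean magnitudes) per error type.

    eps_L = exact_lo - est_lo and eps_U = est_hi - exact_hi, so a
    positive value means the estimated range overshoots the true
    percentile and a negative value means it falls short.
    """
    counts = Counter()
    magnitude = Counter()  # summed |eps_L| + |eps_U| per error type
    for lo, hi in zip(est_lo, est_hi):
        eps_l = exact_lo - lo
        eps_u = hi - exact_hi
        # Assumed index order: sign of eps_L first, sign of eps_U second
        key = ("+" if eps_l >= 0 else "-") + ("+" if eps_u >= 0 else "-")
        counts[key] += 1
        magnitude[key] += abs(eps_l) + abs(eps_u)
    n = sum(counts.values())
    proportions = {k: counts[k] / n for k in ERROR_TYPES}
    mean_mag = {k: magnitude[k] / counts[k] if counts[k] else 0.0 for k in ERROR_TYPES}
    return proportions, mean_mag


if __name__ == "__main__":
    # Hypothetical estimates of the 2.5 and 97.5 percentiles from 4 trials
    p, m = classify_trials(
        est_lo=[-2.1, -1.8, -2.2, -1.9],
        est_hi=[2.1, 1.8, 1.9, 2.2],
        exact_lo=-1.96,
        exact_hi=1.96,
    )
    print(p)
    print(m)
```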

Performance Metric for Estimation of Exceedance Probability

• Romero, Bonney, et al. (2017):

– error types are usually not all equally bad; preference weighting is applied via a penalty factor in the numerator

– a larger average magnitude of a given error type drives the numerator and the metric up

– a larger proportion of + (overshoot) errors in the denominator drives the metric down

– lower metric value = better performance

Error metric (number of orders of magnitude by which the predicted probability is off):

$\Delta_{\log} = \log EP_{\mathrm{estimated}} - \log EP_{\mathrm{true}}$

EP performance metric with a 10X penalty for negative errors (under-estimation of exceedance probability):

$\left[ \sum_{N_+} \Delta_{\log} + 10 \sum_{N_-} |\Delta_{\log}| \right] / N_+$

Unpenalized EP performance metric:

$\left[ \sum_{N_+} \Delta_{\log} + \sum_{N_-} |\Delta_{\log}| \right] / N_+$
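A minimal sketch of this metric as stated on the slide is given below. It takes the logarithm as base 10, consistent with the "orders of magnitude" description, counts an exactly zero error as non-negative, and returns infinity when no trial has a non-negative error; the last two choices, and the function name, are assumptions not stated in the source.

```python
import math


def ep_performance_metric(ep_estimates, ep_true, penalty=10.0):
    """[sum over N+ of dlog + penalty * sum over N- of |dlog|] / N+.

    dlog = log10(EP_estimated) - log10(EP_true). N+ is the number of
    trials with dlog >= 0 (zero counted as non-negative: an assumption).
    Use penalty=1.0 for the unpenalized version.
    """
    pos_sum = 0.0
    neg_sum = 0.0
    n_pos = 0
    for est in ep_estimates:
        dlog = math.log10(est) - math.log10(ep_true)
        if dlog >= 0.0:
            pos_sum += dlog
            n_pos += 1
        else:
            neg_sum += abs(dlog)
    if n_pos == 0:
        return math.inf  # assumption: every trial under-estimated
    return (pos_sum + penalty * neg_sum) / n_pos


if __name__ == "__main__":
    # Hypothetical trials: one 10x over-estimate, one exact, one 10x under-estimate
    estimates = [1e-2, 1e-3, 1e-4]
    print(ep_performance_metric(estimates, ep_true=1e-3))               # penalized
    print(ep_performance_metric(estimates, ep_true=1e-3, penalty=1.0))  # unpenalized
```

Dividing by N+ rather than by the total number of trials means a method that rarely over-estimates is penalized twice: through the weighted sum of its under-estimates and through the small denominator.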

Summary

• We need model E2E UQ problems with known answers to get an idea of how well methods and strategies may be working on the real problems we are applying them to.

• It will be a major effort to characterize and differentiate the performance and practicality of E2E UQ and Credibility paradigms, strategies, and methods.

• The Beam E2E UQ problem has many features of real problems that make it useful for significant assessment of UQ and Credibility approaches and their potential practicality and effectiveness on real problems.

• Performance metrics are available to help quantify tradeoffs between UQ method reliability, conservatism, and risk over thousands of random trials.

Contact: [email protected]