Timothy G. Trucano
Optimization and Uncertainty Estimation Department, Org 9211
Sandia National Laboratories
Albuquerque, NM 87185
Workshop on Error Estimation and Uncertainty Quantification
November 13-14, 2003
Johns Hopkins University
Phone: 844-8812, FAX: 844-0918, Email: [email protected]
Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company,for the United States Department of Energy under contract DE-AC04-94AL85000.
V&V Issues
April 19, 2023 Johns Hopkins, November 2003 Page 2
• The problem.
• What is validation?
• What is verification?
• Coupling is required.
• Walking through V&V.
• A few research issues.
Outline of talk.
• Hamming – “The purpose of computing is insight…” (?)
• ASCI – the purpose of computing is to provide “high-performance, full-system, high-fidelity-physics predictive codes to support weapon assessments, renewal process analyses, accident analyses, and certification.” (DOE/DP-99-000010592)
• Philip Holmes – “…a huge simulation of the ‘exact’ equations…may be no more enlightening than the experiments that led to those equations…Solving the equations leads to a deeper understanding of the model itself. Solving is not the same as simulating.” (SIAM News, June, 2002)
Useful quotes to keep in mind.
• Validation is a “physics problem.”
• Verification is a “math problem.”
“Validation” is a process of comparing calculations with experimental data and drawing inferences about the scientific fidelity of the code for a particular application. For example:
[Figure: pressure ratio p_r/p_inc vs. incident angle (30–50 degrees), comparing experiment with error bars, an analytic solution, and an ALEGRA calculation.]
This is math.
This is physics.
• Error bars mean what?
• What is the numerical accuracy of the code?
• Is the comparison good, bad, or indifferent? In what context?
• Why did we choose this means to compare the data and the calculation? Is there something better?
• Why did we choose this problem to begin with?
• What does the work rest on (such as previous knowledge)?
• Where is the work going (e.g. what next)?
Some of the questions that occur to us as a result of this comparison:
• Validation of computational science software is the process of answering the following question:
Are the equations correct?
• It is convenient to recognize that validation is also the process of answering the following question:
Are the software requirements correct?
• It goes without saying that validation is “hard”; but it is sometimes forgotten that the latter definition of validation takes precedence (it applies to ANY software).
– Focus on the former; but remember the latter.
What is validation?
• Verification of computational science software is the process of answering the following question:
Are the equations solved correctly?
• It is convenient to recognize that verification is also the process of answering the following question:
Are the software requirements correctly implemented?
• A strict definition of verification is:
Prove that calculations converge to the correct solution of the equations.
• This latter definition is HARD (impossible)!! Use it as a mission statement.
• Provably correct error estimation is essentially an equivalent problem.
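Since proving convergence in general is out of reach, in practice verification leans on convergence studies against known solutions. A minimal sketch of the standard check follows; the error values are hypothetical stand-ins for a real grid study against an exact solution:

```python
import math

def observed_order(e_coarse, e_fine, r):
    """Observed order of accuracy from errors on two grids
    related by refinement ratio r (classical Richardson argument)."""
    return math.log(e_coarse / e_fine) / math.log(r)

# Hypothetical errors vs. an exact solution on grids h, h/2, h/4
errors = [4.0e-2, 1.0e-2, 2.5e-3]
p1 = observed_order(errors[0], errors[1], 2.0)
p2 = observed_order(errors[1], errors[2], 2.0)
# Both observed orders are ~2, consistent with a formally
# second-order scheme; a mismatch would flag a verification problem.
```

Agreement between the observed and formal order on a problem with a known answer is the kind of evidence a verification test suite accumulates.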
What is verification?
• What evidence is required: V&V plans
• How evidence is accumulated: V&V tasks
• How evidence is “accredited”: V&V assessment
• How evidence is applied: intersects computing
V&V are processes that accumulate information – “evidence”.
Why do we choose specific validation tasks?
This is a single-material, simple EOS strong-shock multi-dimensional hydrodynamics validation problem that develops validation evidence for ALEGRA-HEDP capabilities. It will be used in a “validation grind” for ALEGRA. (Chen-Trucano, 2002)
We have defined and implemented a planning framework for code application validation at Sandia that reflects the hierarchical nature of validation.
• We have defined formal and documented planning guidance. (Trucano, et al, Planning Guidance Ver. 1, 2 SAND99-3098, SAND2000-3101)
• A core concept is the PIRT (Phenomena Identification and Ranking Table) (a “Quality Function Deployment” tool).
• We have defined a formal and documented assessment methodology for the planning component.
2-D Shock Wave Experiment
What do the experimental error bars mean?
This study relied upon existing experimental data that did not characterize uncertainty properly. Our interpretation of the error bars is that they reflect only instrument fidelity. (From this, we might make a strong assumption about “uniform distributions.”)
All serious discussion of validation metrics begins with uncertainty in the experimental data.
• A difficult problem is to characterize the uncertainty embedded in |Calc – Expt|.
• For the short term, validation metrics are driven by the assumption that this uncertainty can be characterized probabilistically.
• An important component in doing this right is to execute dedicated experimental validation.
• A rigorous methodology for experimental validation addresses experimental data requirements. (Trucano, et al “General Concepts for Experimental Validation of ASCI Code Applications” SAND2002-0341)
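As a hedged illustration of characterizing |Calc – Expt| probabilistically, the sketch below standardizes the discrepancy by the scatter of experimental replicates. The data values, and the assumption that the replicates are exchangeable with roughly Gaussian scatter, are invented for illustration:

```python
import statistics

def validation_metric(calc, expt_samples):
    """A minimal probabilistic metric: standardized discrepancy
    between a deterministic calculation and noisy experimental
    replicates. Assumes exchangeable, roughly Gaussian scatter."""
    mean = statistics.fmean(expt_samples)
    s = statistics.stdev(expt_samples)
    return abs(calc - mean) / s

# Hypothetical pressure-ratio replicates at one incident angle
expt = [2.9, 3.1, 3.0, 3.2, 2.8]
z = validation_metric(3.05, expt)
# A z well below ~2 suggests the calculation is statistically
# indistinguishable from the experimental scatter.
```

Even this toy metric makes the point of the slide: without a characterization of the experimental uncertainty, the denominator is unavailable and the metric is undefined.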
[Figure: the same shock reflection comparison (experiment with error bars, analytic solution, ALEGRA calculation), with the experimental error bars annotated “Instrument Fidelity”.]
What are the “Validation Metrics” |Calc – Expt|?
The main metric was reproduction of qualitative trends in shock reflection. The secondary goal was quantitative pointwise comparison of specific Mach No./Angle pairs.
Our key R&D project is the Validation Metrics Project (origin ~ 1998)
• The focus of the project is to answer the questions: (Trucano, et al “Description of the Sandia Validation Metrics Project” SAND2002-0121)
1. What metrics and why?
2. What are relevant Pass/Fail criteria?
3. What are the implications for calculation prediction confidence?
• Critical impact on current V&V milestones.
• Uncertainty Quantification is an enabling technology.
• Current themes are thermal analysis, solid mechanics, & structural dynamics.
[Figure: the same shock reflection comparison, annotated “Viewgraph Norm”.]
• Probabilistic sophistication in development and application of these metrics is a great challenge.
It’s obvious that there are better metrics than the viewgraph norm.
P(M_i | observation) = P(observation | M_i) P(M_i) / Σ_j P(observation | M_j) P(M_j)
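Bayes’ rule over a discrete set of candidate models can be exercised directly; the likelihood and prior values below are hypothetical:

```python
def posterior_model_probs(likelihoods, priors):
    """Bayes' rule over candidate models M_i:
    P(M_i | obs) = P(obs | M_i) P(M_i) / sum_j P(obs | M_j) P(M_j)."""
    evidence = sum(l * p for l, p in zip(likelihoods, priors))
    return [l * p / evidence for l, p in zip(likelihoods, priors)]

# Two hypothetical models with equal priors; model 1 fits
# the observation three times better than model 2.
post = posterior_model_probs([0.30, 0.10], [0.5, 0.5])
# post[0] == 0.75: the data shift probability toward model 1.
```

This is the elementary machinery behind the Bayesian model-assessment approaches (e.g., Mahadevan’s Bayesian net reliability work) mentioned later in the talk.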
Empirical Residual Histogram
[Figure: frequency vs. regression standardized residual, from −3.0 to 2.5.]
What do probabilistic metrics mean?
A study by Hills investigating statistical methodologies for very simple validation data (Hugoniot data for aluminum) has been influential for the entire validation metrics project.
This is the simplest starting point for validation of shock wave calcs.
“Error” = |Calc – Expt| is probabilistic:
• Expt = random field; Calc = random field
• These fields depend on many variables – geometry, initial and boundary condition specifications, numerical parameters in the case of the calculation. (Hills - SAND99-1256, SAND2001-0312, SAND2001-1783)
Hopefully only a “few” of these variables are important
• Predictive confidence results from understanding of the error field; depends on quantity and quality of data.
• Additional complexities arise from the hierarchical nature of validation and the intended applications. This is an important subject for current research (Hills & Leslie – multivariate statistical approaches; Mahadevan – Bayesian net reliability)
Statistical Error Description
What are the calculation error bars?
When we performed the original work we could not converge the calculations because of hardware limitations.
ALEGRA-HEDP has a growing set of verification problems that increase our confidence in the numerical accuracy.
It is critical to “verify” calculations used in validation studies. (Verification guidance currently missing but in progress)
• This requires convergence studies and error estimation techniques.
• Because it is unlikely that this will be fully definitive, our confidence in the numerical accuracy of validation calculations also rests upon:
Code verification processes and results. This includes attention to software engineering (SE).
Careful design and application of verification test suites where convergence to the right answer can be demonstrated.
• DOE has demanded formal attention to SE (and Sandia has responded).
These calculations are not converged.
Is there also uncertainty in the calculation beyond numerical accuracy?
To study the probabilistic content of the error field when we compare calculated and experimental Hugoniot data we studied the influence of uncertainty in certain computational parameters. (L. Lehoucq, using the DDACE UQ tool. This type of study can now be accomplished using DAKOTA.)
Calculations have uncertainties that are composed both of numerical accuracy questions and uncertainties arising from problem specifications.
• Numerical accuracy uncertainties fundamentally reside in lack of convergence for fixed problem specification; one extreme point is known under-resolution of the grid.
• There is uncertainty in translating experimental specs into calculation specs.
• There is uncertainty in specifying a variety of numerical parameters; hence calibration of uncertain models becomes an important question.
Experiment
Is uncertainty in the Grüneisen parameter important? (Part of an ensemble of 120000 calculations.)
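A hedged sketch of such a parameter study: the surrogate below is an invented stand-in for an ALEGRA calculation (a linear us–up Hugoniot with illustrative aluminum-like constants), sampled under the “uniform distribution” assumption mentioned earlier; the sensitivity coefficient is also invented:

```python
import random

def hugoniot_surrogate(up, gamma):
    """Hypothetical stand-in for a shock-physics calculation:
    shock speed us = c0 + s*up, with the slope nudged by an
    uncertain Grueneisen-like parameter gamma."""
    c0, s = 5.35, 1.34          # illustrative aluminum-like constants
    return c0 + (s + 0.05 * (gamma - 2.0)) * up

random.seed(0)
# Uniform distribution on the uncertain parameter -- the "strong
# assumption" made when the data characterize only instrument fidelity
ensemble = [hugoniot_surrogate(1.5, random.uniform(1.8, 2.2))
            for _ in range(1000)]
spread = max(ensemble) - min(ensemble)
# The spread of the ensemble measures how much this one
# parameter matters relative to other uncertainty sources.
```

A real ensemble (like the 120000-calculation study cited above) does exactly this, but with the full code in the loop, which is what makes forward UQ expensive.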
Uncertainty dominates real V&V.
Input: Application Specification, Calibrations, Structural (Model) Uncertainty, Validation Data
Codes: Algorithm lack of rigor, Under-resolution, Code Reliability, Human Reliability, Infrastructure Reliability
Output: Quantitative Margins and Uncertainty, Decisions
What is the intended application and are we accumulating predictive confidence?
ALEGRA-HEDP validation calculation for Imperial College 4x4 arrays.
Success or failure of predictive M&S may have an important influence on the future of the Pulsed Power program.
There is a very important link between V&V and the intended application of modeling and simulation.
• Rigorous assessment of predictive confidence resulting from V&V is important.
This is demanded by the experimental validation methodology and the validation metrics project.
There are technical problems, such as how to quantify the benefit gained by doing additional validation experiments; and how to quantify the risk associated with not having validation experiments.
• We have also devoted significant attention to the issue of stockpile computing – how to make proper use of the investment in V&V. Document in progress.
• “Quantitative Margins and Uncertainty”(!)
Predictive modeling for Z-pinch physics.
WIPP and NUREG-1150 Precedents
High Consequence Regulatory Issues in the National Interest Addressed Primarily Through Modeling and Simulation
WIPP Data
Lessons Learned: (1) Seek BE + Uncertainty
(2) It takes more than one shot to get it right
Example research question: Is Probabilistic Software Reliability (PSR) useful for computational science software?
PSR methodologies may deepen our ability to express operational confidence in our codes as software products.
• A vigorous area of research is the expansion and limits of statistical testing techniques.
“Based on the software developer and user surveys, the national annual costs of an inadequate infrastructure for software testing is estimated to range from $22.2 to $59.5 billion.” (“The Economic Impacts of Inadequate Infrastructure for Software Testing,” NIST report, 2002.)
• Can PSR be extended to include “failures” defined by unacceptable algorithm performance? By inadequate resolution?
• Can interesting code acceptance criteria be devised based on statistical software reliability ideas?
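One way statistical testing ideas translate into code acceptance criteria can be sketched as follows; the numbers and the “rule of three” bound are textbook illustrations, not a Sandia procedure:

```python
import math

def failure_rate_bound(n_tests, n_failures, confidence=0.95):
    """Crude statistical-testing acceptance sketch: MLE failure
    probability per test, plus the 'rule of three' style upper
    confidence bound when no failures are observed."""
    mle = n_failures / n_tests
    if n_failures == 0:
        # P(zero failures in n tests) = (1-p)^n <= 1-confidence
        upper = -math.log(1.0 - confidence) / n_tests
    else:
        upper = mle + 3.0 * math.sqrt(mle * (1 - mle) / n_tests)
    return mle, upper

mle, upper = failure_rate_bound(1000, 0)
# With 1000 clean random tests we can only claim a per-test failure
# rate below ~3e-3 at 95% confidence -- a sobering acceptance bar.
```

The sobering scaling (the bound shrinks only like 1/n) is part of why extending PSR to expensive computational science codes, where each “test” is a full calculation, is a genuine research question.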
Hypothetical reliability model for ASCI codes.
[Figure: hypothetical failure-rate curve across the lifecycle phases “Development and test”, “Capability I”, “Capability II”, “Capability etc.”, annotated with number of users, first use / validation, and application decisions.]
Note the important implication here that the software is NEVER FROZEN! This affects reliability methods.
A general-purpose computational physics code such as ALEGRA-HEDP has a complex software lifecycle and reliability history.
A fundamental complexity is the constant evolution of the software capability.
Example research question: Validation Metric Research.
Uncertainty quantification remains a critical enabling technology for validation:
• Forward uncertainty propagation is computationally demanding (ideal would be stochastic PDE’s).
• UQ needs new ideas in experimental design for simulations and coupled simulation-experiment validation tasks.
• UQ needs tools and the expertise to use them properly.
DAKOTA is the platform of choice for current and future evolution of UQ tool capability at Sandia.
• The “backward” UQ problem – model improvement – is an even harder formal challenge.
This is related to Optimization Under Uncertainty (OUU).
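Forward propagation itself can be sketched with plain Monte Carlo; the model and input distribution below are hypothetical placeholders for a real code run, which is exactly why propagation is computationally demanding in practice:

```python
import random
import statistics

def model(k):
    """Hypothetical forward model: response of a system
    with a single uncertain stiffness-like input k."""
    return 1.0 / (1.0 + k)

random.seed(1)
# Forward UQ by plain Monte Carlo: push input samples through
# the model and summarize the induced output distribution.
samples = [model(random.gauss(2.0, 0.1)) for _ in range(5000)]
out_mean = statistics.fmean(samples)
out_std = statistics.stdev(samples)
# out_mean is near 1/3; out_std is the propagated input uncertainty.
```

When each `model` evaluation is an expensive simulation, brute-force sampling like this is infeasible, motivating the stochastic-PDE and surrogate-based ideas mentioned above.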
Real data
The most relevant data on the Z-machine tends to be complicated, integral, and spatio-temporally correlated. The uncertainty is currently not well-characterized.
Load Implosion Load Stagnation
[Figure: Z-pinch simulation schematic in r–z geometry: wire initiation (3D ALEGRA MHD); conversion (1, 2, 3D ALEGRA rad-MHD); drive and implosion (1, 2D ALEGRA rad-hydro); high-Z dense plasma (design, uncertain, unstable); capsule and foam + ? (design) regions; robust (reliable?) pulse shaping and compensation; Lagrangian, SMALE, Eulerian, and MMALE mesh regions.]
Example research question: OUU (Optimization Under Uncertainty).
Using computational models in reliability-based or robust design is an important goal.
• V&V are the source of the confidence that we have in the modeling component of these activities.
• Model improvement derived from V&V is related to OUU.
For example, calibration under uncertainty.
• It is important to couple research on OUU with research threads in Validation Metrics.
• We are just beginning this work.
• VERY COMPLEX computation underlies this work.
High Gain Capsules
Fusion capsule design for a Z-pinch driver is an interesting and extreme problem in OUU. We are currently using ALEGRA-HEDP and DAKOTA to study features of this problem.
Giunta has been working on the use of multi-fidelity surrogates, which will surely be crucial for use of OUU in such complex problems.
Multifidelity Surrogate Models
Low fidelity: 30,000 DOF. High fidelity: 800,000 DOF. (Finite element models of the same component.)
• The low-fidelity surrogate model retains many of the important features of the high-fidelity “truth” model, but is simplified in some way:
– decreased physical resolution
– decreased FE mesh resolution
– simplified physics
• Independent of number of design parameters.
• Low-fidelity model still may have nonsmooth response trends.
• Works well when low-fidelity trends match high-fidelity trends.
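An additive-correction scheme of the kind described can be sketched as follows; the models and sample points are invented, and production multifidelity surrogates (e.g., those in DAKOTA) are considerably more sophisticated:

```python
def multifidelity_surrogate(x, lo_model, hi_samples):
    """Additive-correction multifidelity sketch: correct a cheap
    low-fidelity model by interpolating (hi - lo) discrepancies
    observed at a few expensive high-fidelity points."""
    xs = sorted(hi_samples)
    # Piecewise-linear interpolation of the discrepancy
    for (x0, d0), (x1, d1) in zip(xs, xs[1:]):
        if x0 <= x <= x1:
            t = (x - x0) / (x1 - x0)
            return lo_model(x) + (1 - t) * d0 + t * d1
    # Outside the sampled range: nearest observed discrepancy
    d = xs[0][1] if x < xs[0][0] else xs[-1][1]
    return lo_model(x) + d

lo = lambda x: x * x                    # cheap, biased model
hi = lambda x: x * x + 0.5 + 0.1 * x    # expensive "truth" model
pts = [(x, hi(x) - lo(x)) for x in (0.0, 1.0, 2.0)]
approx = multifidelity_surrogate(1.5, lo, pts)
# approx recovers hi(1.5) exactly here because the discrepancy
# happens to be linear in x -- the "matching trends" condition.
```

The scheme works precisely when the low-fidelity trend tracks the high-fidelity trend, echoing the last bullet above; when the discrepancy is wild, a few high-fidelity samples cannot repair the surrogate.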
Combining uncertainty and multi-fidelity runs us head-on into probabilistic error models.
[Figure: mean error in contact position for Material #1 / Material #2, and an empirical histogram of error in shock arrival time at the wall (log scale).]
• The problem is a simple shock problem involving shock transmission and reflection from a contact discontinuity.
• Key features are various wave space-time trajectories.
• This is also a common verification test problem and has an analytic solution.
Probabilistic Error Models (PEM) are useful for computational science software and necessary for risk-informed decisions.
• Suppose that we can neither “verify codes” nor “verify calculations.”
– “When quantifying uncertainty, one cannot make errors small and then neglect them, as is the goal of classical numerical analysis; rather we must of necessity study and model these errors.”
– “…most simulations of key problems will continue to be under resolved, and consequently useful models of solution errors must be applicable in such circumstances.”
– “…an uncertain input parameter will lead not only to an uncertain solution but to an uncertain solution error as well.”
• These quotes reflect a new view of “numerical error” expressed in B. DeVolder, J. Glimm, et al. (2001), “Uncertainty Quantification for Multiscale Simulations,” Los Alamos National Laboratory, LAUR-01-4022.
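The view in these quotes can be caricatured in a few lines: treat the discretization-error coefficient as a function of an uncertain input, so the solution error itself acquires a distribution. Everything below (the order p, the coefficient form, the input distribution) is illustrative, not taken from the cited report:

```python
import random
import statistics

def solution_error(h, theta):
    """Hypothetical probabilistic error model: discretization error
    C(theta) * h**p whose coefficient inherits uncertainty from an
    uncertain input parameter theta."""
    p = 2.0                      # nominal order of the scheme
    return (1.0 + 0.3 * theta) * h ** p

random.seed(2)
h = 0.1                          # a fixed, possibly under-resolved grid
errors = [solution_error(h, random.gauss(0.0, 1.0)) for _ in range(2000)]
err_mean = statistics.fmean(errors)
err_std = statistics.stdev(errors)
# The numerical error is now a random variable (mean ~ 0.01 with
# nonzero spread), to be studied and modeled rather than neglected.
```

This is the third quote in miniature: an uncertain input parameter produces not only an uncertain solution but an uncertain solution error.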
“We make no warranties, express or implied, that the programs contained in this volume are FREE OF ERROR, or are consistent with any particular merchantability, or that they will meet your requirements for any particular application. THEY SHOULD NOT BE RELIED UPON FOR SOLVING A PROBLEM WHOSE SOLUTION COULD RESULT IN INJURY TO A PERSON OR LOSS OF PROPERTY…” [Emphasis Mine] (from Numerical Recipes in Fortran, Press, Teukolsky, Vetterling, and Flannery)
Will we be able to seriously claim that ASCI codes are any better than this?!
Conclusion:
How absurd would the following be?
We make no warranties, express or implied, that the
bridge you are about to drive on is free of error…
How much more absurd would the following be?
We make no warranties, express or implied, that the book you are about to read
is free of error…