PDF of all Presentations

25th Anniversary Symposium Promise of a Discipline: Reliability & Risk in Theory and Practice

Transcript of PDF of all Presentations

Page 1: PDF of all Presentations

25th Anniversary Symposium Promise of a Discipline:

Reliability & Risk in Theory and Practice

Page 2: PDF of all Presentations

Dr. Ali Mosleh, Nicole J. Kim Eminent Professor of Reliability Engineering

WELCOME & OPENING REMARKS

25th ANNIVERSARY RELIABILITY ENGINEERING SYMPOSIUM

Page 3: PDF of all Presentations

Dr. George Dieter, Professor Emeritus, Glenn L. Martin Institute Professor of Engineering

HISTORY OF THE RELIABILITY ENGINEERING PROGRAM


Page 4: PDF of all Presentations

Dr. B. Balachandran, Minta Martin Professor & Chair, Department of Mechanical Engineering

RELIABILITY ENGINEERING IN THE DEPARTMENT OF MECHANICAL ENGINEERING


Page 5: PDF of all Presentations

PANEL 1

Frontiers of Reliability Engineering

Dr. Elias L. Anagnostou, Northrop Grumman
Dr. Aris Christou, University of Maryland
Dr. Antoine B. Rauzy, Centrale-Supélec
Dr. Carol Smidts, The Ohio State University

Moderator: Dr. Ali Mosleh

Page 6: PDF of all Presentations

Frontiers of Reliability Engineering

Panel 1 25th Anniversary Symposium

Promise of a Discipline: Reliability and Risk in Theory and Practice

University of Maryland College Park April 2, 2014

Page 7: PDF of all Presentations
Page 8: PDF of all Presentations

Frontiers…
• Integrated Probabilistic Simulation (for design and operational phases)
• Probabilistic Physics of Failure
• X-Ware Systems Reliability
– Hardware/Software/Human
– Interface failures
– Soft Causal Models
• Hybrid Methods
• Advanced Inference Methods (doing more with less)
• New Modeling Languages
• Model-Based System Health Management
• Model-Based System Engineering
• HAL-9000
• Resilience Engineering

Page 9: PDF of all Presentations

Panel 1: Frontiers of Reliability Engineering

Reliability and Risk in Theory and Practice

University of Maryland
April 2, 2014

Elias Anagnostou, Engineering Fellow, Research and Technology

Approved for Public Release, Distribution Unlimited : Northrop Grumman Aerospace Systems Case 14-0601 Dated March 2014

Page 10: PDF of all Presentations

Need For Risk-Based Fleet Management

• Issues

– Uncertainties in legacy approach force conservatisms

– Austere budgets now drive the need to extract all remaining capabilities while minimizing risks

– Pervasive objectives to increase readiness and lower life cycle costs necessitate a change in the current paradigm

– New vehicle requirements for reduced weight and longer life dictate a need for high-fidelity methods to manage risk

• Approach

– Advanced modeling and simulation tools that link materials-design-manufacturing-sustainment (Digital Thread)

– Virtual representation of a system as an integrated system of data, models, and analysis tools applied over the entire life cycle on a tail-number unique basis (Digital Twin)

– Concurrent Uncertainty Management across the material system life cycle

• Enables improved reliability, affordability and maintainability with an overall goal to reduce total ownership costs

– Sustainment - Approximately 40% more life can be extracted without structural modifications (DARPA/Navy Structural Integrity Prognosis System (SIPS) demonstrated results)


Page 11: PDF of all Presentations

DARPA/Navy Structural Integrity Prognosis System (SIPS)

• Prognosis system to manage uncertainty and provide actionable information for risk-informed fleet management
– Increase asset availability and reduce cost w/o increasing risk

[Figure: Probabilistic Predictions Updated By Imperfect Sensor Evidence. Axis: Defect Size. Labels: Anticipated Usage, Actual Usage, Update With Sensor Data]

Approach

Develop the underlying critical technologies that enable prognosis and the demonstration of these in an integrated PROGNOSIS system:

– Physics-based modeling that captures interactions between structural damage drivers and material failure mechanisms

– Sensors that measure critical vehicle and materials parameters

– Reasoning and predictive modules that accept, compare, interpret and correlate the data from the sensors and models to provide structural reliability predictions

[Diagram: Software System combining Physics-based Models, Sensor Systems, and Reasoning & Prediction]

OUTPUT: Current and future state probabilities


Page 12: PDF of all Presentations

SIPS Program Organization

Prognosis Program
Program Manager - Madsen
Principal Scientist - Papazian

Teams:
– Materials & Modeling: Anagnostou
– Sensor Systems: Silberstein
– Reasoning & Predictions: Engel
– System Architecture: Teng
– Demonstrations: Anagnostou, Engel

An integrated team of ≈ 75 engineers, scientists, professors and graduate students

Disciplines: Structures, Material Science, Manufacturing, Computer Science, Experimentalists, Info Management, Mathematics, Sensor Science, Chemistry


Page 13: PDF of all Presentations

SIPS Uncertainty Management

Combines all the available information while accounting for their respective uncertainties

[Diagram: Current State estimated with Physics-Based Models, with the following uncertainty sources]
– Model uncertainty
– Usage uncertainty
– False and missed indications
– Assessment uncertainty
– Repair effectiveness
– As-manufactured state
– Material properties
– Environment
– Maintenance induced damage
– Missing & corrupted data


Page 14: PDF of all Presentations

SIPS Research Progression to Flight Demonstration

• Fixed-Wing Structures Application


Page 15: PDF of all Presentations

Multi-scale Environment

[Figure: row of fastener holes with Hole #14 highlighted, shown at successive magnifications (scale bars 5.7 mm, 1 mm, 100 µm); microstructurally small cracks]


Page 16: PDF of all Presentations

Microstructural Origins of Fatigue (7075)

[Figure: microstructurally small cracks (scale bar 1 mm)]


Page 17: PDF of all Presentations

Failure Progression From Initial State to Failure

Multiple cracked particles cause multiple microstructurally small cracks. Some arrest; others grow, link together, and form the dominant crack that leads to failure.

* Typical images from multiple samples

Page 18: PDF of all Presentations

Physics-Based Models for Fatigue Life Prognosis

$N_{total} = N_{incubation} + N_{nucleation} + N_{small\ crack} + N_{large\ crack}$

Models by stage:
– Geometric Approach: models for incubation & nucleation stages by coupling experimental observations with micro-mechanical crystal plasticity simulations
– Multi-Stage Fatigue / VPS-MICRO: models for nucleation & small crack growth
– FASTRAN (& Crack Coalescence): models for small crack growth & link-up
– FASTRAN/UniGrow

Initial crack size: $a_i = 0.5625\,D_p$

[Equations on slide: incubation law; MSC/PSC crack growth rate $da/dN$ expressed through the CTD and a threshold (TH); CTD expression with an HCF-loading-dominated term and an LCF-loading-dominated term; crack growth rate $dc/dN$ expressed through the effective and maximum $K$]

[Figure: 7075-T651 micrographs at 0, 1, 100, and 3000 cycles (panels a, b, c), with the loading direction indicated]

Page 19: PDF of all Presentations


Multiscale Fatigue Modeling Environment


Page 20: PDF of all Presentations

Multi-Scale Modeling: 3D Digital Materials

• Statistical Characterization of Material

• Digital Replication of microstructure

Two program materials: 7075-T651 & 7050-T7451, and seven 7075-T651 legacy wing panel materials


Page 21: PDF of all Presentations

Investigation of Damage Mechanics

• Experimental methods to characterize damage evolution

• Calibrate fatigue models at various length scales/damage mechanisms

Three specimens tested, ~1000 particles monitored per test

Page 22: PDF of all Presentations

Framework Validation

• Experimental characterization of damage evolution
• Validation of probabilistic framework

Total of 35 specimens tested, 5668 cracks measured

Page 23: PDF of all Presentations

Model Integration

• Captures critical microstructurally-sensitive damage mechanisms

• Captures probability of occurrence of life-limiting fatigue mechanisms

• Produces naturally-occurring initial crack sizes for the start of small crack growth analysis to failure

• Tailored to the as-built manufactured state per aircraft tail number

[Flow diagram: Physics-Based Models for Crack Nucleation]
– Inputs: Geometry, Material & Fatigue Loading; Spectrum Load; Bulk Material µStructure Statistics (Grain Orientation, Particle Aspect Ratio, Particle Size, with particle size distribution P(a))
– Material Cyclic Response at the Notch: Multiaxial Methods, Neuber & Glinka-ESEB
– Incubation Filter: Response Surface to Select Particles Most Likely to Crack
– Nucleation Filter: Response Surface to Select Cracked Particles Most Likely to Spawn a Crack into the Matrix
– Physics-Based Initial Crack Size Distribution
– UniGrow/FASTRAN Predictions: Small & Large Crack Growth to failure
– Probabilistic Output: p(t), Defect Size vs. Flight Hours

[Figure labels: Incubation; Nucleated at Cycle 30; 80% of 1st Cycle; 1000 Cycles]

[Figure: Initial Crack Size Distribution, Probabilistic Predictions vs. Experiment]

Page 24: PDF of all Presentations

EA-6B Outer Wing Panel Fatigue Test Prognosis Validation

• Original life predictions
– Prior flight history of panel
– Panel swap history
– Pre-test NDI
– Distribution of constituent particle sizes in 7075-T651
• Predictions modified by:
– Null sensor readings, detection at sensor threshold, crack size estimates (all accounting for sensor accuracy and uncertainty characteristics)
• Bayesian reasoning system to make a probabilistic prediction based on uncertain input data
• As the test progressed:
– Significant decrease in life uncertainty
– Significant increase in predicted usable life
• Observed crack sizes validated predictions

Predictions converged to truth as test progressed

[Figure: Predictions for largest crack]

Page 25: PDF of all Presentations

SIPS P-3 Flight Demonstration

“Workable Executable Prototype” demonstration of a combination of systems consistent with Navy fleet management practice

NAVAIR chose vehicle and sensor system

[Diagram labels: Onboard Sensors; NDI (Omni-Scan); SIPS (Sensor, Physics-Based Models, Reasoning & Prediction); plot of crack size vs. flight hours]

HAZARD CATEGORIZATION (index by frequency and severity)

FREQUENCY       per 100K Flight Hours   CATASTROPHIC (1)   CRITICAL (2)   MARGINAL (3)   NEGLIGIBLE (4)
FREQUENT (A)    = or > 100              1                  3              7              13
PROBABLE (B)    10 - 99                 2                  5              9              16
OCCASIONAL (C)  1.0 - 9.9               4                  6              11             18
REMOTE (D)      0.1 - 0.9               8                  10             14             19
IMPROBABLE (E)  = or < 0.1              12                 15             17             20
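The hazard categorization matrix on this slide can be read as a simple lookup from an event rate and a severity class to an index. The sketch below is illustrative only: the function names are hypothetical, and the handling of band edges (lower bounds inclusive, bands treated as contiguous, a rate of exactly 0.1 assigned to REMOTE) is an assumption, since the slide leaves those cases open.

```python
# Index values transcribed from the slide's matrix.
# Rows: frequency category A-E; columns: severity 1-4.
RISK_INDEX = {
    "A": (1, 3, 7, 13),     # FREQUENT
    "B": (2, 5, 9, 16),     # PROBABLE
    "C": (4, 6, 11, 18),    # OCCASIONAL
    "D": (8, 10, 14, 19),   # REMOTE
    "E": (12, 15, 17, 20),  # IMPROBABLE
}
SEVERITY_COLUMN = {"CATASTROPHIC": 0, "CRITICAL": 1, "MARGINAL": 2, "NEGLIGIBLE": 3}


def frequency_category(events_per_100k_flight_hours: float) -> str:
    """Map a rate to a frequency letter (band edges are an assumption)."""
    rate = events_per_100k_flight_hours
    if rate >= 100:
        return "A"
    if rate >= 10:
        return "B"
    if rate >= 1:
        return "C"
    if rate >= 0.1:
        return "D"
    return "E"


def hazard_risk_index(events_per_100k_flight_hours: float, severity: str) -> int:
    """Look up the matrix cell for a given rate and severity name."""
    row = RISK_INDEX[frequency_category(events_per_100k_flight_hours)]
    return row[SEVERITY_COLUMN[severity.upper()]]


if __name__ == "__main__":
    print(hazard_risk_index(5.0, "MARGINAL"))
```

A lower index corresponds to a cell nearer the FREQUENT/CATASTROPHIC corner of the matrix.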


Page 26: PDF of all Presentations

Results at a Critical Location

• Model predicted ~50% probability of a significant crack in a critical location
• Sensor had no indications
• Their combination reduced the probability to ~1%

The Combination of Model Predictions and Sensor Evidence Agreed With All Teardown Findings

• What caused the wide disagreement between model and sensor?
– During teardown inspections we discovered that the hole had been drilled out
– We presume this was done to remove an existing crack
– Based on the amount of material removed, the model predicted the repair would have been performed about 2001
– No repair records were available; however, Phased Depot Maintenance was performed 6/00 to 1/01
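How a ~50% model prediction and a null sensor reading can combine to ~1% can be illustrated with a single Bayes update. This is a minimal sketch, not the SIPS reasoning system: the function name is hypothetical, and the probability of detection (0.99) and false-alarm rate (0.0) are assumed values chosen for illustration, not figures from the presentation.

```python
def posterior_crack_probability(prior: float,
                                p_detect: float,
                                p_false_alarm: float = 0.0) -> float:
    """P(crack | sensor shows no indication), by Bayes' rule.

    prior          -- model-predicted probability of a significant crack
    p_detect       -- assumed probability the sensor indicates, given a crack
    p_false_alarm  -- assumed probability the sensor indicates, given no crack
    """
    p_none_given_crack = 1.0 - p_detect
    p_none_given_no_crack = 1.0 - p_false_alarm
    numerator = prior * p_none_given_crack
    evidence = numerator + (1.0 - prior) * p_none_given_no_crack
    return numerator / evidence


if __name__ == "__main__":
    # 50% prior from the model, null reading from a hypothetical 99%-POD sensor
    print(round(posterior_crack_probability(0.5, 0.99), 4))
```

With these assumed sensor characteristics, the posterior is about 0.01, the same order as the ~1% reported on the slide; a less reliable sensor would leave more of the prior probability in place.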


Page 27: PDF of all Presentations
Page 28: PDF of all Presentations

Reliability Physics and Engineering: Key to Transformative Research

Aris Christou, MSE and ME Department, University of Maryland; [email protected]

Page 29: PDF of all Presentations

"Advanced manufacturing is a family of activities that (a) depend on the use and coordination of information, automation, computation, software, sensing, and networking, and/or (b) make use of cutting edge materials and emerging capabilities enabled by the physical and biological sciences, for example nanotechnology, chemistry, and biology. It involves both new ways to manufacture existing products, and the manufacture of new products emerging from new advanced technologies.” —President’s Council of Advisors on Science and Technology Report to the President on Ensuring American Leadership in Advanced Manufacturing,

Page 30: PDF of all Presentations

Introduction and Motivation
• Industry profitability and success depend on yield and reliability.
• Advanced semiconductors, i.e. 2D and wide-bandgap systems, are key for numerous applications that extend from communications to automotive, defense and security.
• Manufacturing of components is strongly dependent on in-depth reliability studies that include physics-based approaches to complement the currently used industry techniques, which are not adequate for improving the current state of the technology.
• Point-like nano/microscopic defects can often cause a macroscopic device to collapse.
• The challenge is a physics-based approach to reliability through an integration of science and engineering.
• The transformative breakthroughs will be based on reliability physics, chemistry, mathematics and engineering.

Page 31: PDF of all Presentations

Approach
• Meeting the challenge will be based on novel material and defect characterization techniques, which are necessary to locate the prevalent defects as well as their concentration and dynamics over time.
• Dimensional reduction, lower and higher voltages, and higher frequencies negatively impact reliability.
• In-situ and ex-situ characterization will be necessary to satisfy the program’s objectives.
• Examples include reliability predictors such as spin-, transport-, Raman-, and noise-spectroscopy, and imaging for defects down to monolayer size.
• The types of defects existing in the fabricated devices need to be identified. Determining which of the defects are the cause of failure and which are effects of the failure is very important.
• Characterization techniques with nanometer resolution, considerably finer than the apparent average separation between traps, are required. Physics-based simulation and experimental validation to further the fundamental understanding of the degradation mechanisms must also be undertaken.

Page 32: PDF of all Presentations

Reliability Grand Challenges
• Identify and quantify the failure mechanisms arising through smaller dimensions, high electric fields, coupled effects of heat, strain, and electric polarization, gate current, and the relatively high density of extended and point defects endemic in most semiconductors.
• Gain a physics-based knowledge through extensive and targeted characterizations and analyses and incorporate it into the failure models, which can then become the basis for the new robust manufacturing science.
• Establish the basis for the new methodology for reliability prediction and manufacturing science for future technologies.
• Take basic science all the way to manufacturing through education and research and enable a competitive industry to be realized.

Page 33: PDF of all Presentations

PAST LESSONS FROM INNOVATIVE RELIABILITY ASSESSMENT TECHNIQUES

CONVENTIONAL METHOD (7 months): Fabrication, Fixture Mounting, Step-Stress Tests, Burn-in Tests, Reliability Assessment (stage durations shown: 1 week, 1 month, 6 months)

NEW METHOD (7 hours): Fabrication, Noise Measurements, Reliability Assessment

[Plot axis: Base Noise Power Density (A²)]

OUTCOME
• Determined a strong correlation between device reliability and baseband noise characteristics
• Temperature dependence of peaks in base noise power density indicates reliability
• Identified trap levels responsible for degradation from temperature-dependent noise measurements

Page 34: PDF of all Presentations

RECENT LESSONS FROM INNOVATIVE POTENTIAL RELIABILITY ASSESSMENT TECHNIQUES

[Figure: Contour map of I2D/IG, 60 points over 63.5×45 μm² (X: 10 pts, step: 7.1 µm; Y: 6 pts, step: 9 µm), with the mapping area marked on the sample]

• Thin graphene layers (mono/bi/tri-layer): I2D/IG > 1
• Half of the graphene layers are covered with thin graphene layers

A. Reina et al., Nano Letters 9, 30 (2009)

Ref: H. Kim, E. Pichonat, D. Vignaud, D. Pavlidis and H. Happy; Graphene Layers Grown by RTP-CVD on Nickel and Their Properties, WOCSDICE 2012

Page 35: PDF of all Presentations

An Interdisciplinary Approach: Device Physics and Electrical Engineering, Mathematics and Materials Science, Chemistry and Physics

[Flowchart labels]
– Future Semiconductors: New Physics (High field effects - stress/temperature - Mechanical)
– Technological effects (Physics and Engineering)
– Process Science, Chemistry
– Design Test structures: Basic test structures (Electrical Engineering) + Future Materials test structures; parameters for modeling
– Characterization techniques (Engineering, Materials, Physics, Chemistry)
– Experimental Results for Future Semiconductor Devices
– Degradation model (Physics and Math)
– Model fits exp. results? No: change parameters/expand model, giving an “Updated” degradation model. Yes: new model reliably predicts degradation and allows for Robust Manufacturing.

Page 36: PDF of all Presentations

Example of Carbon Nanotube Composite Interconnects

[Figure labels: Carbon Nanotube; Aluminum crystal structure]

Future Electronic Approach:
• Mathematical Simulation
• Process Science Modeling of Defects

Page 37: PDF of all Presentations

Education, Research and Innovation

[Chart: Yield (%), 0 to 100, for Wafers 1 to 6]

[Diagram labels: REPRODUCIBLE ROBUST DESIGN; ADVANCED DESIGN; CIRCUITS AND SYSTEMS; PHYSICAL PARAMETER - RELIABILITY CORRELATION]

• Establish material and device models.
• Disseminate results through publications.
• Improve fabrication yield.
• Improved robustness.
• Develop compact designs.
• Improve performance with compact designs.
• Establish correlation between physical parameters and reliability.

Page 38: PDF of all Presentations

Outcome and Conclusions
• Promote cross-disciplinary approaches across scientific disciplines, i.e. reliability physics, materials, chemistry and more, in addition to engineering.
• Initiate “transformative research” with societal impact (power electronics and transport, T-Rays and medicine, communications and low-power, etc.) that yields robust and manufacturable technologies.
• Establish new methodologies for reliability prediction and manufacturing science for future technologies.
• Provide education and research experience for future engineers in new semiconductor technologies.

Thank you for your attention

Page 39: PDF of all Presentations
Page 40: PDF of all Presentations

New Logic Modeling Paradigms for Complex System Reliability and Risk Analysis

Antoine Rauzy

Chair Blériot-Fabre* - Ecole Centrale de Paris Ecole Polytechnique

FRANCE [email protected]

http://www.lgi.ecp.fr/pmwiki.php/PagesPerso/ARauzy

*Sponsored by SAFRAN group

Page 41: PDF of all Presentations

Probabilistic Risk Assessment …

… is now established on a solid scientific ground
… is a mature technology
… is a great tool for decision making

So, what’s next?

• More openness
• Higher level modeling languages
• Wider spectrum of applications

Page 42: PDF of all Presentations

Standard Representation Formats

Issues
• Models are tool-dependent
• Calculations are provably difficult, so calculation engines perform unwarranted approximations

The Open-PSA Standard Representation Format for Fault Trees and Event Trees:

<define-fault-tree name="FT1" >
  <define-gate name="top" >
    <or>
      <gate name="G" />
      <basic-event name="C" />
    </or>
  </define-gate>
  <define-gate name="G" >
    <and>
      <basic-event name="A" />
      <basic-event name="B" />
    </and>
  </define-gate>
</define-fault-tree>

Challenge/research direction:
Define standard representation formats, with all the necessary constructs, with a clear and sound semantics

Version 3 of the Open-PSA standard is being drafted:
• Simplifications
• Block Diagrams
• Multi-phase Markov Chains with Rewards
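To make the FT1 example on this slide concrete, the following sketch parses that XML with Python's standard library and enumerates its minimal cutsets by brute-force expansion. It is an illustration under stated assumptions, not an Open-PSA tool: the function name is hypothetical, only `and`, `or`, `gate`, and `basic-event` elements are handled, and the exhaustive expansion is practical only for very small trees.

```python
import xml.etree.ElementTree as ET
from itertools import product

# The FT1 fault tree from the slide.
FT1_XML = """
<define-fault-tree name="FT1">
  <define-gate name="top">
    <or><gate name="G"/><basic-event name="C"/></or>
  </define-gate>
  <define-gate name="G">
    <and><basic-event name="A"/><basic-event name="B"/></and>
  </define-gate>
</define-fault-tree>
"""


def minimal_cut_sets(xml_text, top="top"):
    """Return the minimal cutsets of an and/or fault tree as frozensets."""
    root = ET.fromstring(xml_text)
    # Map each gate name to its formula (the first child of define-gate).
    gates = {g.get("name"): g[0] for g in root.iter("define-gate")}

    def expand(node):
        if node.tag == "basic-event":
            return [frozenset([node.get("name")])]
        if node.tag == "gate":
            return expand(gates[node.get("name")])
        children = [expand(child) for child in node]
        if node.tag == "or":   # union of the children's cutsets
            return [cs for child in children for cs in child]
        if node.tag == "and":  # one cutset from each child, merged
            return [frozenset().union(*combo) for combo in product(*children)]
        raise ValueError(f"unsupported element: {node.tag}")

    cuts = set(expand(gates[top]))
    # Keep only cutsets that do not strictly contain another cutset.
    minimal = [c for c in cuts if not any(other < c for other in cuts)]
    return sorted(minimal, key=lambda c: (len(c), sorted(c)))


if __name__ == "__main__":
    for cut in minimal_cut_sets(FT1_XML):
        print(sorted(cut))
```

For FT1 the result is the single-event cutset {C} and the pair {A, B}, matching the structure top = G or C, G = A and B.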

Page 43: PDF of all Presentations

New Algorithms for Model Assessment

Typical example (US plant):
• PSA model with ~2 500 Basic Events

What has been calculated:
• ~100 000 Minimal Cutsets
• 95% of the Core Damage Frequency with less than 5% of the Basic Events, 100% with 25%

In a word, 75% of the model is “useless”!

Issues:
• Finding the right level of abstraction is difficult

[Diagram: Original Model, Minimal Cutsets, Simplified Model]

Design filtering algorithms that build simpler models that are equivalent with respect to observation means

Page 44: PDF of all Presentations

Categories of Models

Challenge/research direction: Many possibly very different models are indistinguishable by observation means, i.e. results of virtual experiments (typically, calculation of failure scenarios). They are equivalent in the Turing test sense. Equivalent models form a category. Design mathematical concepts, algorithms and tools to determine the most representative (simplest?) model of a category.

[Diagram: several Original Models yield the same Minimal Cutsets through MCS calculation; which Representative Model (?) stands for them]

Page 45: PDF of all Presentations

High Level Modeling Languages

Issues:
• Completeness of specifications with respect to safety concerns
• Distance between system specifications and safety models
• Integration with other system engineering disciplines

[Diagram labels: System Specification; AltaRica; Automated Generation; Fault Trees; Calculations]

class component
  state Boolean working (init = true);
  event failure (delay = exponential(lambda));
  transition
    failure: working -> working := false;
end

AltaRica features:
• Formal
• Event-Based
• Textual & graphical
• Multiple assessment tools

Page 46: PDF of all Presentations

AltaRica Mathematical Framework

domain componentState { STANDBY, WORKING, FAILED }
class spareComponent
  componentState s (init = WORKING);
  Boolean demanded (reset = false);
  event
    turnOn (delay = 0, expectation = 0.98),
    failureOnDemand (delay = 0, expectation = 0.02),
    turnOff (delay = 0),
    failure (delay = exponential(0.001)),
    repair (delay = exponential(0.1));
  transition
    turnOn: s==STANDBY and demanded -> s := WORKING;
    failureOnDemand: s==STANDBY and demanded -> s := FAILED;
    turnOff: s==WORKING and not demanded -> s := STANDBY;
    failure: s==WORKING -> s := FAILED;
    repair: s==FAILED -> s := STANDBY;
end

[State diagram: s=STANDBY to s=WORKING on turnOn (demanded?); s=STANDBY to s=FAILED on failureOnDemand (demanded?); s=WORKING to s=STANDBY on turnOff (not demanded?); s=WORKING to s=FAILED on failure; s=FAILED to s=STANDBY on repair]

Guarded Transition Systems: well-founded generalization of:
• Fault Trees, Block Diagrams
• Markov chains, Stochastic Petri Nets
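The guard/action structure of the spareComponent class can be mimicked in a few lines of Python to show which events are enabled in a given state. This is only a sketch of the guarded-transition idea: the `State`, `Transition`, `enabled`, and `fire` names are hypothetical, `demanded` is treated as an externally supplied input, and the delays, expectations, and `reset` semantics of AltaRica are deliberately left out.

```python
from typing import Callable, List, NamedTuple


class State(NamedTuple):
    s: str          # "STANDBY", "WORKING" or "FAILED"
    demanded: bool  # treated here as an external input (assumption)


class Transition(NamedTuple):
    event: str
    guard: Callable[[State], bool]
    action: Callable[[State], State]


# One entry per transition of the spareComponent class on the slide.
TRANSITIONS: List[Transition] = [
    Transition("turnOn",
               lambda st: st.s == "STANDBY" and st.demanded,
               lambda st: st._replace(s="WORKING")),
    Transition("failureOnDemand",
               lambda st: st.s == "STANDBY" and st.demanded,
               lambda st: st._replace(s="FAILED")),
    Transition("turnOff",
               lambda st: st.s == "WORKING" and not st.demanded,
               lambda st: st._replace(s="STANDBY")),
    Transition("failure",
               lambda st: st.s == "WORKING",
               lambda st: st._replace(s="FAILED")),
    Transition("repair",
               lambda st: st.s == "FAILED",
               lambda st: st._replace(s="STANDBY")),
]


def enabled(state: State) -> List[str]:
    """Names of the events whose guard holds in the given state."""
    return [t.event for t in TRANSITIONS if t.guard(state)]


def fire(state: State, event: str) -> State:
    """Apply the action of an enabled event and return the new state."""
    for t in TRANSITIONS:
        if t.event == event and t.guard(state):
            return t.action(state)
    raise ValueError(f"{event} is not enabled in {state}")


if __name__ == "__main__":
    start = State("STANDBY", demanded=True)
    print(enabled(start))
    print(fire(start, "turnOn"))
```

In standby with a demand present, both turnOn and failureOnDemand are enabled; in the AltaRica model their expectations (0.98 and 0.02) would weight the choice between them, which this sketch does not model.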

Page 47: PDF of all Presentations

The AltaRica 3.0 Project

[Diagram: the AltaRica 3.0 tool chain]
– Language: AltaRica 3.0 (class Pump … end), with libraries and patterns, based on Guarded Transition Systems
– Assessment tools: compilation to Fault Trees, compilation to Markov Chains, generation of sequences, model checking, stochastic simulation, reliability allocation
– Related formalisms and data: SysML, AADL, FMEA, Petri Nets, Dynamic Fault Trees, Reliability Data
– Environment: GUI for modeling, GUI for simulation, Version & Configuration Management System

[Plot: probability of the feared event ("Probabilité de l'ER"), Pr[top event] from 0.2 to 1.0 over a horizontal axis from 0 to 8000]

Page 48: PDF of all Presentations

Performances Assessment

Issues:
• The business model of industry is moving from selling products to selling capacities
• Companies have to make commitments and, to do so, must assess the performance of systems in the presence of hazards

⇒ PRA languages and tools are well suited to assess capacities (it mainly suffices to assess mathematical expectations rather than probabilities)

Page 49: PDF of all Presentations

Carol Smidts Department of Mechanical and Aerospace Engineering

The Ohio State University [email protected]

To be presented at the Reliability Engineering 25th Anniversary Symposium

April 2, 2014 University of Maryland, College Park

Page 50: PDF of all Presentations

CHARACTERISTICS IN CONTRAST

MORE RECENT
• First hardware reliability paper appears in 1952 in Proceedings of the Institute of Radio Engineers.
• First software reliability paper appears in 1975 in IEEE Transactions on Software Engineering.

MORE COMPLEX
• The complexity of typical hardware systems is several hundreds of components (e.g., nuclear power plants).
• The complexity of current software systems is millions of lines of source code (e.g., 15 million for the Linux kernel). Assuming a typical function consists of 200 lines of code, there are approximately 75,000 functions in the Linux kernel.

Page 51: PDF of all Presentations

CHARACTERISTICS IN CONTRAST

EVOLVES EXTREMELY FAST
• The number of important programming languages introduced per decade is approximately 10. This number has been constant since 1950.

[Chart: Number of Important Programming Languages Emerged in each Decade, 1950 to 2000; vertical axis 0 to 16]

Page 52: PDF of all Presentations

CHARACTERISTICS IN CONTRAST

EVOLVES EXTREMELY FAST (Cont’d)
• Programming paradigms have changed from non-structured to structured, procedural to object-oriented.
• Six main paradigms currently coexist: imperative, declarative, functional, object-oriented, logic and symbolic.

ALWAYS TIED TO HARDWARE
• Software does not run in isolation
• Software is tied to a computer platform
• As such, failures are never observed in isolation
• This has led some to not want software to be modeled at all

Page 53: PDF of all Presentations

CHARACTERISTICS IN CONTRAST

DIFFERENT FAILURE MODE
• Hardware:
– Hardware wears out, leading to degraded performance
– Failures are triggered by harsh environments such as excess heat and radiation
• Software:
– Software does not wear out
– Failures are due to latent faults that are triggered and propagate into failures

HIGHLY DEPENDENT UPON ITS ENVIRONMENT
• Software is particularly sensitive to the environment

CONTINUITY ASSUMPTION ONLY VALID WITHIN THE CONFINES OF A LARGE NUMBER OF SMALL SUBDOMAINS
• Predicates create non-continuous behavior in program logic.
• The typical ratio of predicates to lines of code is on the order of 1/10.

ONE OF A KIND
• Data is difficult to collect

Page 54: PDF of all Presentations

CURRENT AREAS OF RESEARCH

[Chart: research areas. Category labels: SRGM; Reliability / Test / Cost; Measures; Dependable Systems; Domain Characteristics; Other. Breakdown labels: Embedded (71%), Web (14%), Service (14%); Architecture (50%), Modeling (50%)]

Based on a review of papers published between 2008 and 2013 in the Proceedings of the International Symposium on Software Reliability Engineering (ISSRE) [excludes 2012]

Page 55: PDF of all Presentations

AREAS OF RESEARCH EXAMPLES: OP DEFINITION

[Diagram labels: Environment; Institutions/Customers (Factory, Power Plant, Bank, School, Corporation); Network; Computer (1), Computer (2), Computer (n); Computer (Hardware); Software]

Extract from: Smidts, C., Mutha, C., Rodríguez, M., & Gerber, M. J. (2014). Software testing with an operational profile: OP definition. ACM Computing Surveys (CSUR), 46(3), 39.

Page 56: PDF of all Presentations

AREAS OF RESEARCH EXAMPLES: OP DEFINITION

[UML diagram labels: Application; OP; Profile; Structure; Inputs; Input Data (Values, Variable Name, Data Types); Input Data Constraints; External Error; Source Code; Critical Operations; Abstraction Level (Component Level, System Level); Field-of-Interest; Executive Scope; Context (Executable: Y/N, LPhase: Early/Later, ToolSupp: Y/N). Association labels: considers, requires, uses, derived from, mapping, modifies, adds dimension, changes, extends, input; multiplicities 0..1, 1, 1..*, 0..*]

[Pie charts]
Profile: Single 68.5, Multiple 31.6
Structure: Tree 30, State 50, Set 20

Extract from: Smidts, C., Mutha, C., Rodríguez, M., & Gerber, M. J. (2014). Software testing with an operational profile: OP definition. ACM Computing Surveys (CSUR), 46(3), 39.

Page 57: PDF of all Presentations

Pie charts:
Originator: HW 13.1, SW 8.7, Human 43.5, Unspecified 34.8
Critical Operations: Aware 21.1, Unaware 79
Executive Scope: Aware 15.8, Unaware 84.3
External Error: Aware 10.6, Unaware 89.5
Tool Support: Auto 57.9, Non-auto 42.2
Abstraction Level: Component 15.8, System 84.3
Lifecycle phase: Early 84.3, Late 10.6, Unspecified 5.3

Page 58: PDF of all Presentations

AREAS OF RESEARCH EXAMPLES: SOFTWARE AND HARDWARE RELIABILITY

ALU maps, showing usage and probability profiles. (a) Usage in terms of number of demands. (b) Delay probability profile. (c) Different-Function probability profile. (d) Stuck-at probability profile. (e) Combined failure probability profile.

Extracted from: Bing H.; Rodriguez, M.; Ming Li; Bernstein, J.B.; Smidts, C.S., "Hardware Error Likelihood Induced by the Operation of Software," IEEE Transactions on Reliability, vol. 60, no. 3, pp. 622-639, Sept. 2011. “© 20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.”

Page 59: PDF of all Presentations

AREAS OF RESEARCH EXAMPLES: SOFTWARE MEASURES TO SOFTWARE RELIABILITY

Group | Root Metric | Total Rate | Rank | Inaccuracy Ratio
I | BLOC | 0.4 | L | 5.3764
I | CMM | 0.6 | M | 5.5091
I | CC | 0.72 | H | 5.6927
I | FP | 0.5 | L | 5.2303
I | RSCR | 0.69 | M | 5.3095
I | SDC | 0.53 | M | 4.3765
II | CEG | 0.44 | L | 2.7243
II | CF | 0.81 | H | 1.4662
II | COM | 0.36 | L | 2.7211
II | DD | 0.83 | H | 0.1853
II | RT | 0.55 | M | 0.0334
III | FD | 0.72 | H | 0.7397
III | TC | 0.68 | M | 0.2146

Inaccuracy ~ Group + Strata + Group*Strata

Term | Sum Sq
Group | 54.556
Strata | 3.986
Group:Strata | 2.424
Residuals | 1.901

Page 60: PDF of all Presentations

AREAS OF RESEARCH EXAMPLES: CHARACTERIZING SOFTWARE FAILURE MECHANISMS

# Defect Name

1 Missing function

2 Extra function

F1

F3

F2

F1

F3

F2

Page 61: PDF of all Presentations

NEW AREAS OF RESEARCH

Page 62: PDF of all Presentations

THANK YOU

QUESTIONS?

Page 63: PDF of all Presentations

BREAK 10:30 a.m. – 11:00 a.m.


Page 64: PDF of all Presentations

Dr. George Apostolakis, Nuclear Regulatory Commission
Ms. Maria Korsnick, Constellation Energy
Mr. Thomas D. Whitmeyer, NASA

Moderator: Dr. Ali Mosleh

Risk-Informed Regulations, Oversight, and Emergency Response


PANEL 2

Page 65: PDF of all Presentations

Commissioner George Apostolakis U.S. Nuclear Regulatory Commission


25th Anniversary of the Reliability Engineering Education Program

The Center for Risk and Reliability University of Maryland

April 2, 2014

Risk-Informed Regulation at the U.S. NRC

Page 66: PDF of all Presentations


NRC Oversight

New Reactors

Uranium Enrichment

Power Reactors Transportation Storage

Waste Disposal

Uranium Conversion

Medical/Industrial

Page 67: PDF of all Presentations

The Traditional Approach to Regulation (Before Risk Assessment)

• Management of uncertainty (unquantified at the time) was always a concern.

• Defense-in-depth and safety margins became embedded in the regulations (structuralist approach)

• “Defense-in-Depth is an element of the NRC’s safety philosophy that employs successive compensatory measures to prevent accidents or mitigate damage if a malfunction, accident, or naturally caused event occurs at a nuclear facility.” [Commission’s White Paper, February, 1999]

• Questions that defense in depth addresses:

What if we are wrong? Can we protect ourselves from the unknown unknowns?


Page 68: PDF of all Presentations

Design Basis Accidents

• A design basis accident is a postulated accident that a facility is designed and built to withstand without exceeding the offsite exposure guidelines of the NRC’s siting regulation

• They are very unlikely events

• They protect against “unknown unknowns”


Page 69: PDF of all Presentations

Technological Risk Assessment (Reactors)

• Study the system as an integrated socio-technical system

• Probabilistic Risk Assessment (PRA) supports Risk Management by answering the questions:
  – What can go wrong? (thousands of accident sequences or scenarios)
  – How likely are these scenarios?
  – What are their consequences?
  – Which systems and components contribute the most to risk?

Page 70: PDF of all Presentations

What Did We Learn from the Reactor Safety Study?

Prior Beliefs:
1. Protect against large loss-of-coolant accident (LOCA)
2. Core damage frequency (CDF) is low (about once every 100 million years, 10^-8 per reactor year)
3. Consequences of accidents would be disastrous

Major Findings:
1. Dominant contributors: Small LOCAs and Transients
2. CDF higher than earlier believed (best estimate: 5x10^-5 per reactor year, once every 20,000 years; upper bound: 3x10^-4 per reactor year, once every 3,333 years)
3. Consequences significantly smaller
4. Support systems and operator actions very important

Beckjord et al, Reliability Engineering and System Safety, 39 (1993) 159-170.
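The "once every N years" figures on this slide are the reciprocals of the quoted core damage frequencies. A minimal sketch of that conversion follows; the function name is hypothetical and is not part of any PRA tool.

```python
def return_period_years(frequency_per_reactor_year: float) -> float:
    """Mean time between events, in reactor years, for a constant frequency."""
    return 1.0 / frequency_per_reactor_year


# Frequencies quoted on the slide (per reactor year).
quoted = [
    ("prior belief", 1e-8),
    ("best estimate", 5e-5),
    ("upper bound", 3e-4),
]

for label, frequency in quoted:
    years = return_period_years(frequency)
    print(f"{label}: {frequency:.0e} per reactor year = once every {years:,.0f} years")
```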

Page 71: PDF of all Presentations

PRA Model Overview and Subsidiary Objectives

Level I: PLANT MODEL. Results: accident sequences leading to plant damage states. Subsidiary objective: CDF 10^-4/ry
Level II: CONTAINMENT MODEL. Results: containment failure/release sequences. Subsidiary objective: LERF 10^-5/ry
Level III: SITE/CONSEQUENCE MODEL. Results: public health effects. Objective: QHOs

PLANT MODE: At-power Operation; Shutdown / Transition Evolutions
SCOPE: Internal Events; External Events
Uncertainties

Page 72: PDF of all Presentations

PRA Policy Statement (1995)

• The use of PRA should be increased to the extent supported by the state of the art and data and in a manner that complements the defense-in-depth philosophy

• PRA should be used to reduce unnecessary conservatisms associated with current regulatory requirements


Page 73: PDF of all Presentations

Risk-Informed Framework

Traditional “Deterministic” Approach
• Unquantified probabilities
• Design-basis accidents
• Defense in depth and safety margins
• Can impose unnecessary regulatory burden
• Incomplete

Risk-Based Approach
• Quantified probabilities
• Thousands of accident sequences
• Realistic
• Incomplete

Risk-Informed Approach
• Combination of traditional and risk-based approaches through a deliberative process

Page 74: PDF of all Presentations

The Deliberation

[Figure 3-2, Deliberations, from NUREG-2150, A Proposed Risk Management Regulatory Framework. Elements shown: Options; Stakeholder Input; Technical Analysis (one or more techniques); Assumptions, Uncertainties and Sensitivities; Decision Criteria; Resource and Schedule Constraints; Other Factors; Deliberation; Decision & Implementation.]

Page 75: PDF of all Presentations

Evolution of the NRC’s Risk-Informed Regulatory System


• 1980s: New or revised regulatory requirements based on PRA insights introduced

• 1990s: Risk-informed changes to a plant’s licensing basis allowed

• 2000: Change to a risk-informed reactor oversight process made

• 2004: Risk-informed alternative to comply with fire protection requirements introduced

• 2007: Regulation requiring PRAs for licensing new reactors issued

Page 76: PDF of all Presentations

Risk-Informed Decision Making in Regulation

• Improves Safety
  – New requirements (SBO, ATWS)
  – Design of new reactors
  – Focus on important systems and locations

• Makes regulatory system more rational
  – Reduction of unnecessary burden
  – Operating experience accounted for in regulations
  – Consistency in regulations

Page 77: PDF of all Presentations

The Experience

• Successes
  – Maintenance rule
  – Risk-informed inservice inspection
  – Reactor oversight process

• Challenges
  – Fire protection
  – Special treatment requirements
  – Risk-informing Emergency Core Cooling System rule

Page 78: PDF of all Presentations

Summary

• Uncertainties have always been of concern in safety

• Traditional methods manage uncertainties through design basis accidents and conservatism

• Risk assessment provides a global view of accident sequences, quantifies uncertainties, and is more realistic

• Risk-informed regulation combines the best features of both approaches


Page 79: PDF of all Presentations

Risk Informing the Commercial Nuclear Enterprise

Maria Korsnick Constellation Energy Nuclear Group, LLC

April 2, 2014

Promise of a Discipline: Reliability and Risk in Theory and in Practice

University of Maryland

Page 80: PDF of all Presentations

How our Business is Risk-Informed

I. Managing Risk to the Business
II. Managing the Risk of Normal Plant Operation
III. Defining Extreme External Events
IV. Risk-Informed Lessons for External Events
V. The Path Forward

Page 81: PDF of all Presentations

I. Managing Risk to the Business

• Each CENG nuclear plant and the corporate office maintains a risk “Heat Map”
  – An easy-to-read summary of the risks associated with a business unit
  – A method for communicating the risks being managed

• ‘Delphi Method’ for forecasting risk is used: experts come together to perform periodic assessments of Company risks
  – Subjective (non-analytical) probability and impact assessment of each risk
  – Identifies mitigating actions

Page 82: PDF of all Presentations

Operating Fleet Heat Map (example)

[Figure: heat map of fleet risks plotted by probability and impact. Significant risks from site maps are grouped / assigned based on significance to fleet.]

Probability axis: Rare 5%, Remote 20%, Moderate 50%, Likely 80%, Very Likely 95%
Impact axis: Insignificant, Minor, Significant, Major, Critical
Level of Control legend: High, Medium, Low

Labels shown on the map (top risks marked): Prolonged Forced Outage; Key Staffing; Regulatory Compliance; Nuclear Risk; Corporate / Generation; Environmental; Industrial / Radiological; Fire Protection / NFPA 805; Extended Refueling Outage; Tritium; Short-term Output / Forced Outages; Post-Fukushima Response; New NRC Regulations; EPA Cooling Water Intake regulation; GSI 191; Cyber Security; Flood analysis White finding (G)

Page 83: PDF of all Presentations

Heat Map Risk Table (example)

Issue: Fukushima Response
Risk: High cost of studies, modifications, uncertainty of outcomes. Impact on emergency planning
Category: Regulatory | Impact: Major | Probability: Likely | Level of Control: Medium
Mitigation: Active engagement with industry and NRC

Issue: EPA 316b Rule, Clean Water Act
Risk: Potential for significant modifications to intake structures at NY and MD sites
Category: Regulatory | Impact: Critical | Probability: Remote | Level of Control: Low
Mitigation: Industry proposing alternatives to federal and state EPA

Issue: Key Staffing
Risk: High rate of retirements over next ten years, loss of expertise/talent
Category: Corporate | Impact: Significant | Probability: Moderate | Level of Control: Medium
Mitigation: Implement Knowledge Transfer and Retention program
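One possible way to order the table entries is sketched below. It is an illustration only: the slides describe the assessment as subjective and give no scoring formula. The probability percentages are the ones on the heat map's probability axis; the 1 to 5 impact ranks and the probability-times-rank score are assumptions made here.

```python
# Probability labels and percentages as shown on the heat map axis.
PROBABILITY = {
    "Rare": 0.05,
    "Remote": 0.20,
    "Moderate": 0.50,
    "Likely": 0.80,
    "Very Likely": 0.95,
}

# Assumption: impact categories ranked 1 (lowest) to 5 (highest).
IMPACT_RANK = {
    "Insignificant": 1,
    "Minor": 2,
    "Significant": 3,
    "Major": 4,
    "Critical": 5,
}

# (issue, impact, probability) taken from the example risk table.
risks = [
    ("Fukushima Response", "Major", "Likely"),
    ("EPA 316b Rule", "Critical", "Remote"),
    ("Key Staffing", "Significant", "Moderate"),
]


def score(impact: str, probability: str) -> float:
    """Hypothetical score: probability multiplied by impact rank."""
    return PROBABILITY[probability] * IMPACT_RANK[impact]


ranked = sorted(risks, key=lambda r: score(r[1], r[2]), reverse=True)
for issue, impact, probability in ranked:
    print(f"{issue}: {score(impact, probability):.2f}")
```

A score of this kind hides the "Level of Control" dimension, which the heat map reports separately, so it would complement rather than replace the expert deliberation.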

Page 84: PDF of all Presentations

II. Managing Risk during Normal Operations

• Plant-specific PRAs model core damage and large early release frequency
• Risk impact of scheduled maintenance, plant evolutions, and system outages is analyzed
• Four risk levels used to communicate to plant staff and set controls: GREEN, ORANGE, YELLOW, RED
• Pre-established risk mitigation measures applied as higher risk conditions are entered

Page 85: PDF of all Presentations

Example Plant PRA Risk

[Figure: four-panel plant PRA risk summary. Colors correspond to the associated System Health Report status as of 4th quarter in 2013.]

Initiating Event Distribution: Fire 42%; Flood 6%; Div II AC 8%; LOOP 7%; Div I AC 7%; Loss of 2 SWP Pumps 5%; Lake Intake 4%; Feedwater 4%; MSIV 3%; Seismic 4%; Condenser 2%; Other 8%

Key Operator Actions (Description, %CDF). Actions listed: Control Level to Prevent Boron Washout; Align RHR During ATWS; Align Fire Water for EDG Cooling; Manually Depressurize (Transient); Vent PC (Local Actions including use of Port. Powerpack); Isolate SW Header Flood in RB; Respond to Control Room Fire; Control Service Water and Open Room Doors (HVAC); Align Containment Heat Removal; Vent PC (Air or Div I AC lost). %CDF values listed: 30%, 15%, 9%, 4%, 3%, 3%, 2%, 2%, 2%, 1%.

Potential Risk Increase Factor for Key Equipment (logarithmic axis, 1 to 1000; risk thresholds: >x30, >x15, >x3, ≤x3) and System Percentage Contribution to CDF (axis 0.0% to 7.0%). Equipment and system labels: Div I Emergency Switchgear; Div II Emergency Switchgear; Div I Emergency DC; Div I 120V Emergency AC; Div II 120V Emergency AC; 2RHS A/LPCS Supp Pool Return; Div 1 600V Emergency Switchgear; 125V DC Switchgear; Div 2 600V Emergency Switchgear; 2RHS B/RHS C Supp Pool Return

Page 86: PDF of all Presentations

Hypothetical PRA Risk Planetary Charts

Plant 3

Plant 1 Plant 2

Plant 4

Every Plant is Unique – design, internal / external events

Risk insights are gained by comparing plant risk profiles:
• Physical Modifications
• Protective Barriers
• Procedures
• Operator Response Times
• Maintenance Practices
• Housekeeping

Page 87: PDF of all Presentations

III. Defining Extreme External Events

• Original plant design for external events (security, seismic, flood, fire) based on regulations and best state of knowledge of risk at time of licensing
• Industry understanding of risk has been highly dynamic
  – 1975 Browns Ferry fire
  – 2001 terrorist attacks
  – 2011 Japan earthquake and tsunami (Fukushima)
• Evolving risk insights from new data create constant “churn” in design and operation of our plants
  – Fire: industrial fire code - to - “Appendix R” - to - NFPA 805
  – Revised design basis security threat, robust defenses, cyber
  – Post-Fukushima reassessment of earthquake frequency and intensity for central and eastern US plants (NRC GSI-199)
  – Post-Fukushima reassessment of design basis flood/frequency

Page 88: PDF of all Presentations

IV. Risk-informed Lessons for External Events

• The uncertainties are real and unavoidable
  – Extrapolation from internal event modeling experience is not applicable to other models
  – Reliance on numerical mean values is not sufficient
  – Data supporting rare events may have large uncertainty (e.g., floods)
• Undue focus on numerical outcomes leads to a reduced emphasis on important insights
• Adding conservatism in PRA is not an antidote; it can significantly distort sound risk-informed decision-making

Page 89: PDF of all Presentations

Case in Point: NFPA-805 FPRA

• Challenge: Deterministic PRA mentality distorts risk perspective
  – Conservatisms added at every major step of the process to “bound” uncertainties
• Results do not match operating experience benchmarks
  – Risk-significant fires over-predicted
  – Fires with significant spurious operations over-predicted
• Outcome: Disproportionately large resources spent on model refinements and plant modifications

Page 90: PDF of all Presentations

Large Conservatism in Fire PRA Building Blocks

Fire Frequencies Conservatism
+ Fire Severity Conservatism
+ Fire Suppression Conservatism
+ Conditional Core Damage Probability Conservatism
= Significant Departure from Realism
= Ineffective Decision-Making

Compounding conservatism reduces effectiveness of decision making tool
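A small numerical sketch shows why conservatism compounds. It assumes, as a simplification, that a fire scenario's core damage frequency is the product of the four building blocks named on the slide. All numerical values and the uniform factor of 2 are hypothetical and chosen only to show the arithmetic.

```python
from math import prod

# Hypothetical "realistic" values for one fire scenario (assumptions).
realistic = {
    "fire_frequency_per_year": 1e-2,
    "severity_factor": 0.1,
    "non_suppression_probability": 0.1,
    "conditional_core_damage_probability": 1e-2,
}

# Assumption: each building block is overstated by a modest factor of 2.
conservatism_factor = {name: 2.0 for name in realistic}


def scenario_cdf(blocks: dict) -> float:
    """Scenario core damage frequency as the product of its building blocks."""
    return prod(blocks.values())


conservative = {
    name: value * conservatism_factor[name] for name, value in realistic.items()
}

overstatement = scenario_cdf(conservative) / scenario_cdf(realistic)
print(f"Overall overstatement factor: {overstatement:.1f}")
```

Under these assumptions a factor of 2 at each of four steps overstates the scenario by a factor of 16, which can be enough to reorder the ranking of risk contributors.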

Page 91: PDF of all Presentations

V. The Path Forward

Objective: Gain a more complete and balanced understanding of important risk contributors
Proposed action, Industry: Continue development of more realistic and complete plant-specific PRAs
Proposed action, NRC: Move away from embedding conservatism in PRA models, starting with fire PRA

Objective: Clarify risk-informed decision-making process that can deal with uncertainties
Proposed action, Industry: Propose a practical integrated decision-making process
Proposed action, NRC: Adapt/adopt a practical integrated decision-making process consistent with RG 1.174

Objective: Educate decision-making stakeholders on risk-informed decision-making
Proposed action, Industry: Provide focused PRA training to industry staff and decision-makers
Proposed action, NRC: Provide focused PRA training to NRC staff and decision-makers

Objective: Develop technical resources to support better risk-informed understanding
Proposed action, Industry: Expand EPRI/OG commitment to training and technology
Proposed action, NRC: Expand training on truly risk-informed decision-making

Page 92: PDF of all Presentations

Key Takeaway

• PRA has added tremendous value to the Nuclear Industry, allowing us to operate plants more safely.
• Addressing very low probability / high consequence events can be as important as addressing high probability / high consequence events.
• Challenges remain with the tools:
  – Risk insights are masked by over-conservatism or a deterministic approach: back to basics.
  – Uncertainty matters: what can we do to address and reduce uncertainty?

Page 93: PDF of all Presentations
Page 94: PDF of all Presentations

Launch Date | Name | Country | Result | Reason
• 1960 | Korabl 4 | USSR (flyby) | Failure | Didn't reach Earth orbit
• 1960 | Korabl 5 | USSR (flyby) | Failure | Didn't reach Earth orbit
• 1962 | Korabl 11 | USSR (flyby) | Failure | Earth orbit only; spacecraft broke apart
• 1962 | Mars 1 | USSR (flyby) | Failure | Radio failed
• 1962 | Korabl 13 | USSR (flyby) | Failure | Earth orbit only; spacecraft broke apart
• 1964 | Mariner 3 | US (flyby) | Failure | Shroud failed to jettison
• 1964 | Mariner 4 | US (flyby) | Success | Returned 21 images
• 1964 | Zond 2 | USSR (flyby) | Failure | Radio failed
• 1969 | Mars 1969A | USSR | Failure | Launch vehicle failure
• 1969 | Mars 1969B | USSR | Failure | Launch vehicle failure
• 1969 | Mariner 6 | US (flyby) | Success | Returned 75 images
• 1969 | Mariner 7 | US (flyby) | Success | Returned 126 images
• 1971 | Mariner 8 | US | Failure | Launch failure
• 1971 | Kosmos 419 | USSR | Failure | Achieved Earth orbit only
• 1971 | Mars 2 Orb/Lander | USSR | Failure | Orbiter arrived, but no useful data and Lander destroyed
• 1971 | Mars 3 Orb/Lander | USSR | Success | Orbiter obtained approximately 8 months of data and lander landed safely, but only 20 seconds of data
• 1971 | Mariner 9 | US | Success | Returned 7,329 images
• 1973 | Mars 4 | USSR | Failure | Flew past Mars
• 1973 | Mars 5 | USSR | Success | Returned 60 images; only lasted 9 days
• 1973 | Mars 6 Orb/Lander | USSR | Success/Failure | Occultation experiment produced data and Lander failure on descent
• 1973 | Mars 7 Lander | USSR | Failure | Missed planet; now in solar orbit
• 1975 | Viking 1 Orb/Lander | US | Success | Located landing site for Lander and first successful landing on Mars
• 1975 | Viking 2 Orb/Lander | US | Success | Returned 16,000 images and extensive atmospheric data and soil experiments
• 1988 | Phobos 1 Orbiter | USSR | Failure | Lost en route to Mars
• 1988 | Phobos 2 Orb/Lander | USSR | Failure | Lost near Phobos
• 1992 | Mars Observer | US | Failure | Lost prior to Mars arrival
• 1996 | Mars Global Surveyor | US | Success | More images than all Mars Missions
• 1996 | Mars 96 | Russia | Failure | Launch vehicle failure
• 1996 | Mars Pathfinder | US | Success | Technology experiment lasting 5 times longer than warranty
• 1998 | Nozomi | Japan | Failure | No orbit insertion; fuel problems
• 1998 | Mars Climate Orbiter | US | Failure | Lost on arrival
• 1999 | Mars Polar Lander | US | Failure | Lost on arrival
• 1999 | Deep Space 2 Probes | US | Failure | Lost on arrival (carried on Mars Polar Lander)
• 2001 | Mars Odyssey | US | Success | High resolution images of Mars
• 2003 | Mars Express Orbiter/Beagle 2 | ESA | Success/Failure | Orbiter imaging Mars in detail and lander lost on arrival
• 2003 | Mars Rover - Spirit | US | Success | Operating lifetime of more than 15 times original warranty
• 2003 | Mars Rover - Opportunity | US | Success | Operating lifetime of more than 15 times original warranty
• 2005 | Mars Reconnaissance Orbiter | US | Success | Returned more than 26 terabits of data (more than all other Mars missions combined)
• 2007 | Phoenix Mars Lander | US | Success | Returned more than 25 gigabits of data
• 2011 | Mars Science Laboratory | US | Success | Exploring Mars' habitability
• 2011 | Phobos-Grunt/Yinghuo-1 | Russia/China | Failure | Stranded in Earth orbit
• 2013 | Mangalyaan | India | En route | On way to Mars
• 2013 | MAVEN | US | En route | On way to Mars

• http://mars.nasa.gov/programmissions/missions/log/

Page 95: PDF of all Presentations

• The path to Mars involves closing knowledge and performance gaps in a systematic manner:
  – The health threat from exposure to high-energy cosmic rays and other ionizing radiation and negative effects of a prolonged low-gravity environment on human health, including eyesight loss.
  – Human performance considerations related to a long-duration isolated mission in a confined habitable space.
  – The inaccessibility of terrestrial medical facilities.
  – Critical systems, including propulsion, habitation, and life support that are reliable, require little to no maintenance, and have a small mass/volume.
  – Long duration navigation, and operations in deep space environment.
  – Ability for crew to operate autonomously including onboard analysis of crew and environmental samples.

Page 96: PDF of all Presentations

Timeline: Today, 2020’s, 2030’s

ISS (400 kilometers):
• 6 month crew duration
• Crew health and performance research in-work
• Habitation and life support and other critical systems are large and require regular maintenance and consumable resupply
• Prepositioned spares and regular resupply
• Ground analysis of crew/environmental samples and system failures
• Near real-time communications
• Any time crew return
• Heavy lift capability in development

Mars (228,000,000 kilometers):
• 1.5 year + crew duration
• Crew health and performance vital to a mission
• Habitation and life support and other critical systems mass/size limited and must have high reliability with limited consumable resupply
• Limited spares, systems must be reliable
• No opportunity for ground validation of crew/environmental samples or system failure
• Communication delay of up to 42 minutes
• No emergency crew return
• Heavy lift available to support Mars transit

Page 97: PDF of all Presentations

ISS: 400 kilometers
Cis/trans lunar space: 443,400 kilometers
Mars: 228,000,000 kilometers

Timeline: Today, 2020’s, 2030’s

Mission Formulation – System Design – Technical Management – Mission Operations

(1) Establish An Objective Hierarchy

(2) ISS to 2024 and Cis-lunar are Essential to Turn Unknown Risk to Known Risk
• Crew Health
• Human Performance
• System Reliability

(3) Make Risk Informed Decisions
Identify Alternates – Analyze Risk – Make Informed Decisions

Page 98: PDF of all Presentations
Page 99: PDF of all Presentations

LUNCH 12:30 p.m. – 2:00 p.m.


Page 100: PDF of all Presentations

Dr. Wallace Loh
President, University of Maryland

REMARKS


Page 101: PDF of all Presentations

KEYNOTE SPEAKERS

Dr. Jeong H. Kim
Entrepreneur; Chairman, Kiswe Mobile, Inc.; Former President, Bell Labs

Mr. Ken Farquhar
President & General Manager, Systems Engineering and Mission Support Business Unit, ManTech International

Page 102: PDF of all Presentations

Dr. Hoang Pham, Rutgers University
Dr. Vasiliy Krivtsov, Ford Motors
Dr. J. Wesley Hines, University of Tennessee

Moderator: Dr. Marvin Roush

Reliability Education: Challenges and Potential of a Non-Traditional Engineering Discipline

PANEL 3

Page 103: PDF of all Presentations

The Whereabouts of Reliability Education: Challenges & Opportunities

Hoang Pham Department of Industrial & Systems Engineering

Rutgers University

April 2, 2014

Page 104: PDF of all Presentations

Reliability Education

Reliability is a discipline that has been studied for several decades.

Today several dozen graduate programs in the US and hundreds worldwide offer reliability courses, and some universities have entire reliability programs.

There is a gap between reliability theory and practice, between school and industry, book knowledge and real world applications.

Due to changes in technology, the expectation for a reliability engineer has been changing and getting higher.

Page 105: PDF of all Presentations

Some Reliability books in … 1960s

• Igor Bazovsky (1961), Reliability Theory and Practice
• D. K. Lloyd and M. Lipow (1962), Reliability: Management, Methods, and Mathematics
• N. H. Roberts (1964), Mathematical Models in Reliability Engineering
• G. H. Sandler (1964), System Reliability Engineering
• R. E. Barlow and F. Proschan (1965), Mathematical Theory of Reliability

Page 106: PDF of all Presentations

[Diagram: Reliability Programs shown at the overlap of Engineering, Computer Science, Operations Research & Management, and Statistics & Mathematics, with the areas labeled Reliability Engineering, Reliability Management, and Reliability Statistics.]

Page 107: PDF of all Presentations

In today’s global market, the only way to stay ahead of the competition is to provide:

Better products! Better service! Better customer experience every time!

Sample 3D TV

Boeing 787

Page 108: PDF of all Presentations

Reliability Computing

Reliability requirement: 0.999999999

“The airplane systems and associated components … must be designed so that the occurrence of any failure condition which would prevent the continued safe flight and landing … is extremely improbable (1 per billion flights ~ 10^-9). Compliance … must be shown by analysis …”

FAA Federal Aviation Regulations 25.1309
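The requirement of 0.999999999 is one minus the quoted 10^-9 per-flight failure probability. The sketch below shows that arithmetic and, assuming flights fail independently (an assumption made here, not stated on the slide), the probability of at least one such failure over many flights. The function name is hypothetical.

```python
import math

P_FAILURE_PER_FLIGHT = 1e-9
reliability_per_flight = 1.0 - P_FAILURE_PER_FLIGHT  # 0.999999999


def prob_at_least_one_failure(p: float, n_flights: int) -> float:
    """1 - (1 - p)**n, computed with log1p/expm1 to avoid round-off for tiny p.

    Assumes independent, identically distributed flights.
    """
    return -math.expm1(n_flights * math.log1p(-p))


for n in (1, 1_000_000, 1_000_000_000):
    print(n, prob_at_least_one_failure(P_FAILURE_PER_FLIGHT, n))
```

Using `log1p` and `expm1` matters here because `1 - 1e-9` sits close to the resolution of double-precision arithmetic, and naive exponentiation loses digits.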

Page 109: PDF of all Presentations

Reliability Challenges From Theory to Practice

DATA QUALITY The Data of Everything!

Page 110: PDF of all Presentations

Reliability Challenges From Theory to Practice

PREDICTIVE MODELING
* The Uncertainty in Modeling!
* What Models Should Be Used?

Page 111: PDF of all Presentations

Predictive Modeling

based on data and statistical methods

“Prediction is difficult, especially when it’s about the future!”

[Diagram: Testing Environments (controlled) and Operating Environments (random), linked by modelling, prediction, and application.]

Page 112: PDF of all Presentations

Many reliability studies:

Controlled Environment ≈ Operating Environment

Systemability model

[Diagram: Testing Environments (controlled) and Operating Environments (random), linked by modelling, prediction, and application.]

$$\eta = \begin{cases} 1 & \text{Controlled environment} \\ f(\eta) & \text{Operating environment} \end{cases}$$

Page 113: PDF of all Presentations

Reliability -- Definition

The probability that the system is still operating at time t.

$$R(t) = \int_t^{\infty} f(s)\,ds = e^{-\int_0^t h(s)\,ds} = e^{-H(t)}$$

where $f(t)$ is the probability density function and $h(t)$ is the failure intensity rate.
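The identity R(t) = exp(-H(t)), with H(t) the integral of h(s) from 0 to t, can be checked numerically. The sketch below uses a Weibull hazard as a stand-in; the Weibull choice, the parameter values, and the function names are assumptions made for illustration and do not come from the slides.

```python
import math


def weibull_hazard(t: float, shape: float, scale: float) -> float:
    """Weibull failure intensity h(t) = (shape/scale) * (t/scale)**(shape - 1)."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def cumulative_hazard(hazard, t: float, steps: int = 10_000) -> float:
    """Trapezoidal approximation of H(t) = integral of h(s) ds from 0 to t."""
    dt = t / steps
    total = 0.5 * (hazard(0.0) + hazard(t))
    total += sum(hazard(i * dt) for i in range(1, steps))
    return total * dt


shape, scale, t = 2.0, 100.0, 50.0  # hypothetical parameters
H = cumulative_hazard(lambda s: weibull_hazard(s, shape, scale), t)
reliability_numeric = math.exp(-H)
reliability_closed_form = math.exp(-((t / scale) ** shape))
print(reliability_numeric, reliability_closed_form)
```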

Page 114: PDF of all Presentations

Systemability -- Definition

The probability that the system is still operating at time t subject to the uncertainty of the operating environments.

The systemability function is [Pham, 2005]:

$$R_s(t) = \int_{\eta} e^{-\eta \int_0^t h(s)\,ds}\, dF(\eta)$$

where $F$ is a distribution function of $\eta$.

Page 115: PDF of all Presentations

Systemability approximations using Taylor series:

$$R_s(t) = \int_{\eta} e^{-\eta H(t)}\, dF(\eta)$$

$$R_s(t) = E\left[e^{-\eta H(t)}\right] \approx e^{-\mu H(t)}\left[1 + \frac{\sigma^2 H^2(t)}{2!}\right]$$
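The approximation follows from a second-order Taylor expansion of $e^{-\eta H(t)}$ about the mean of $\eta$. The steps below are a sketch, writing $\mu = E[\eta]$ and $\sigma^2 = \mathrm{Var}(\eta)$ (the usual reading of the symbols on the slide):

```latex
\begin{align*}
e^{-\eta H(t)} &\approx e^{-\mu H(t)}\left[1 - H(t)\,(\eta-\mu) + \frac{H^2(t)}{2!}\,(\eta-\mu)^2\right] \\
E\left[e^{-\eta H(t)}\right] &\approx e^{-\mu H(t)}\left[1 - H(t)\,E[\eta-\mu] + \frac{H^2(t)}{2!}\,E\left[(\eta-\mu)^2\right]\right] \\
&= e^{-\mu H(t)}\left[1 + \frac{\sigma^2 H^2(t)}{2!}\right]
\end{align*}
```

The first-order term vanishes because $E[\eta-\mu]=0$, so the correction to $e^{-\mu H(t)}$ depends only on the variance of the environmental factor.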

Page 116: PDF of all Presentations

Loglog Distribution – Example

Assume system lifetime ~ Loglog(a,b) with failure rate

$$h(t) = b \ln(a)\, t^{b-1} a^{t^b}, \quad t > 0,\ a > 1,\ b > 0$$

Assume η ~ gamma(α, β)

System reliability function

$$R_1(t) = e^{1 - a^{t^b}}$$

Page 117: PDF of all Presentations

Failure rate h(t) for various values of a and b = 0.5

[Figure: loglog distribution failure rate h(t) versus t (0 to 250), with curves for b = 0.5 and a = 1.1, 1.13, 1.15.]

Page 118: PDF of all Presentations

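Loglog Dist. - Example

Systemability function

$$R_2(t) = \left[\frac{\beta}{\beta + a^{t^b} - 1}\right]^{\alpha}$$

Systemability approximations

$$R_3(t) = e^{-\frac{\alpha}{\beta}\left(a^{t^b} - 1\right)}\left[1 + \frac{\alpha}{2\beta^2}\left(a^{t^b} - 1\right)^2\right]$$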
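The loglog example can be checked numerically. The sketch below assumes η follows a gamma distribution with shape α and rate β (density proportional to η^(α-1) e^(-βη)), for which the systemability integral has the closed form [β / (β + H(t))]^α with H(t) = a^(t^b) - 1. It compares that closed form with direct numerical integration and with the second-order Taylor approximation. Function names are hypothetical; the parameter values a = 1.15, b = 0.05, (α, β) = (2, 3) are the ones quoted on the neighboring plot slide.

```python
import math


def loglog_cumulative_hazard(t: float, a: float, b: float) -> float:
    """H(t) = a**(t**b) - 1 for the loglog distribution."""
    return a ** (t ** b) - 1.0


def systemability_closed_form(t, a, b, alpha, beta):
    """[beta / (beta + H(t))]**alpha, for gamma(alpha, rate beta) environments."""
    H = loglog_cumulative_hazard(t, a, b)
    return (beta / (beta + H)) ** alpha


def systemability_taylor(t, a, b, alpha, beta):
    """exp(-mu*H) * (1 + sigma^2 * H^2 / 2), mu = alpha/beta, sigma^2 = alpha/beta^2."""
    H = loglog_cumulative_hazard(t, a, b)
    mu, var = alpha / beta, alpha / beta ** 2
    return math.exp(-mu * H) * (1.0 + 0.5 * var * H * H)


def systemability_quadrature(t, a, b, alpha, beta, upper=40.0, steps=40_000):
    """Trapezoid rule for the integral of exp(-eta*H) times the gamma density."""
    H = loglog_cumulative_hazard(t, a, b)

    def integrand(eta):
        if eta == 0.0:
            return 0.0  # density vanishes at 0 when alpha > 1
        pdf = beta ** alpha * eta ** (alpha - 1) * math.exp(-beta * eta) / math.gamma(alpha)
        return math.exp(-eta * H) * pdf

    d = upper / steps
    total = 0.5 * (integrand(0.0) + integrand(upper))
    total += sum(integrand(i * d) for i in range(1, steps))
    return total * d


a, b, alpha, beta, t = 1.15, 0.05, 2.0, 3.0, 50.0
exact = systemability_closed_form(t, a, b, alpha, beta)
approx = systemability_taylor(t, a, b, alpha, beta)
numeric = systemability_quadrature(t, a, b, alpha, beta)
print(exact, approx, numeric)
```

The quadrature is truncated at η = 40, where the gamma density for these parameters is negligible; other parameter choices may need a different upper limit.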

Page 119: PDF of all Presentations

Systemability vs Systemability approximation for a = 1.15, b = 0.05

[Figure: three panels plotting the systemability functions R1, R2, R3 against time (0 to 100) for (alpha, beta) = (2, 3), (3, 2), and (12.5, 2.5).]

Page 120: PDF of all Presentations

What Models Should Be Used?

(H. Pham, “A New Software Reliability Model with Vtub-Shaped Fault-Detection Rate and the Uncertainty of Operating Environments”, Optimization, vol. 63, 2014; published online December 2013)

Page 121: PDF of all Presentations

Model Comparison

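Model: m(t)

Goel-Okumoto (G-O):
$$m(t) = a\left(1 - e^{-bt}\right)$$

Delayed S-shaped:
$$m(t) = a\left(1 - (1 + bt)e^{-bt}\right)$$

Inflection S-shaped:
$$m(t) = \frac{a\left(1 - e^{-bt}\right)}{1 + \beta e^{-bt}}$$

Yamada imperfect debugging model:
$$m(t) = a\left[1 - e^{-bt}\right]\left[1 - \frac{\alpha}{b}\right] + \alpha a t$$

PNZ model:
$$m(t) = \frac{a}{1 + \beta e^{-bt}}\left\{\left[1 - e^{-bt}\right]\left[1 - \frac{\alpha}{b}\right] + \alpha t\right\}$$

Pham-Zhang model:
$$m(t) = \frac{1}{1 + \beta e^{-bt}}\left[(c + a)\left(1 - e^{-bt}\right) - \frac{ab}{b - \alpha}\left(e^{-\alpha t} - e^{-bt}\right)\right]$$

Dependent-parameter model 1:
$$m(t) = \alpha(1 + \gamma t)\left(\gamma t + e^{-\gamma t} - 1\right)$$

Dependent-parameter model 2:
$$m(t) = m_0\left(\frac{\gamma t + 1}{\gamma t_0 + 1}\right)e^{-\gamma(t - t_0)} + \alpha(\gamma t + 1)\left[\gamma t - 1 + (1 - \gamma t_0)e^{-\gamma(t - t_0)}\right]$$

Vtub-shaped fault-detection rate model:
$$m(t) = N\left[1 - \left(\frac{\beta}{\beta + a^{t^b} - 1}\right)^{\alpha}\right]$$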
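As a concrete reading of the first three mean value functions in the comparison, the sketch below implements the Goel-Okumoto, delayed S-shaped, and inflection S-shaped forms in their standard published versions. The parameter values are hypothetical and serve only to show the shapes.

```python
import math


def goel_okumoto(t: float, a: float, b: float) -> float:
    """m(t) = a * (1 - exp(-b*t))."""
    return a * (1.0 - math.exp(-b * t))


def delayed_s_shaped(t: float, a: float, b: float) -> float:
    """m(t) = a * (1 - (1 + b*t) * exp(-b*t))."""
    return a * (1.0 - (1.0 + b * t) * math.exp(-b * t))


def inflection_s_shaped(t: float, a: float, b: float, beta: float) -> float:
    """m(t) = a * (1 - exp(-b*t)) / (1 + beta * exp(-b*t))."""
    return a * (1.0 - math.exp(-b * t)) / (1.0 + beta * math.exp(-b * t))


a, b, beta = 50.0, 0.1, 2.0  # hypothetical parameters
for t in (0, 5, 10, 20, 40):
    print(
        t,
        round(goel_okumoto(t, a, b), 2),
        round(delayed_s_shaped(t, a, b), 2),
        round(inflection_s_shaped(t, a, b, beta), 2),
    )
```

All three start at zero expected faults and saturate at a, the expected total number of faults; they differ in how quickly detection ramps up early in testing.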

Page 122: PDF of all Presentations

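Criteria For Model Selection

MSE: measures the deviation between the predicted values and the actual observations

$$\mathrm{MSE} = \frac{\sum_{i=1}^{n}\left(\hat{m}(t_i) - y_i\right)^2}{n - k}$$

where $n$ is the number of observations and $k$ is the number of model parameters.

Predictive ratio risk (PRR): measures the distance of model estimates from the actual data against the model estimate

$$\mathrm{PRR} = \sum_{i=1}^{n}\left(\frac{\hat{m}(t_i) - y_i}{\hat{m}(t_i)}\right)^2$$

Predictive power (PP): measures the distance of model estimates from the actual data against the actual data

$$\mathrm{PP} = \sum_{i=1}^{n}\left(\frac{\hat{m}(t_i) - y_i}{y_i}\right)^2$$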

Page 123: PDF of all Presentations


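Normalized Criteria Distance (NCD) value, $D_k$, measures the distance of the normalized criteria from the origin for the kth model:

$$D_k = \sqrt{\sum_{j=1}^{d}\left[\left(\frac{C_{kj}}{\sum_{i=1}^{s} C_{ij}}\right) w_j\right]^2}$$

where $C_{ij}$ is the value of criterion $j$ for model $i$, $s$ is the number of models, and $w_j$ denotes the weight of criterion $j$ for $j = 1, 2, \ldots, d$.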

Page 124: PDF of all Presentations

Software System Test Data (System Software Reliability,2006)

Week index | Exposure time (cum. system test hours) | Fault | Cum. fault
1 | 416 | 3 | 3
2 | 832 | 1 | 4
3 | 1248 | 0 | 4
4 | 1664 | 3 | 7
5 | 2080 | 2 | 9
6 | 2496 | 0 | 9
7 | 2912 | 1 | 10
8 | 3328 | 3 | 13
9 | 3744 | 4 | 17
10 | 4160 | 2 | 19
11 | 4576 | 4 | 23
12 | 4992 | 2 | 25
13 | 5408 | 5 | 30
14 | 5824 | 2 | 32
15 | 6240 | 4 | 36
16 | 6656 | 1 | 37
17 | 7072 | 2 | 39
18 | 7488 | 0 | 39
19 | 7904 | 0 | 39
20 | 8320 | 3 | 42
21 | 8736 | 1 | 43
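To show how the selection criteria are applied to data of this kind, the sketch below fits the Goel-Okumoto model m(t) = a(1 - e^(-bt)) to the cumulative fault counts by a simple least-squares search and then evaluates MSE, PRR, and PP. The fitting procedure, the grid over b, and the use of the week index as the time variable are assumptions made here; the slides do not state how the models were fitted, so the numbers are not expected to reproduce the published results.

```python
import math

# Cumulative faults for weeks 1-21, from the system test data.
cum_faults = [3, 4, 4, 7, 9, 9, 10, 13, 17, 19, 23, 25, 30, 32, 36, 37, 39, 39, 39, 42, 43]
weeks = list(range(1, len(cum_faults) + 1))


def goel_okumoto(t, a, b):
    return a * (1.0 - math.exp(-b * t))


def fit_goel_okumoto(ts, ys, b_grid=None):
    """Least-squares fit: search b on a grid; the best a for each b is closed form."""
    if b_grid is None:
        b_grid = [k * 1e-4 for k in range(1, 2001)]  # hypothetical search range
    best = None
    for b in b_grid:
        g = [1.0 - math.exp(-b * t) for t in ts]
        a = sum(y * gi for y, gi in zip(ys, g)) / sum(gi * gi for gi in g)
        sse = sum((a * gi - y) ** 2 for y, gi in zip(ys, g))
        if best is None or sse < best[2]:
            best = (a, b, sse)
    return best


def selection_criteria(ts, ys, predict, n_params):
    """MSE, PRR, and PP for a fitted mean value function."""
    preds = [predict(t) for t in ts]
    sse = sum((p - y) ** 2 for p, y in zip(preds, ys))
    return {
        "MSE": sse / (len(ys) - n_params),
        "PRR": sum(((p - y) / p) ** 2 for p, y in zip(preds, ys)),
        "PP": sum(((p - y) / y) ** 2 for p, y in zip(preds, ys)),
    }


a_hat, b_hat, sse_hat = fit_goel_okumoto(weeks, cum_faults)
scores = selection_criteria(
    weeks, cum_faults, lambda t: goel_okumoto(t, a_hat, b_hat), n_params=2
)
print(a_hat, b_hat, scores)
```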

Page 125: PDF of all Presentations

Model Comparisons & Results

Model / Criteria | MSE (Rank) | PRR (Rank) | PP (Rank)
1. G-O model | 6.61 (7) | 0.69 (1) | 1.10 (7)
2. Delayed S-shaped | 3.27 (5) | 44.27 (8) | 1.43 (8)
3. Inflection S-shaped | 1.87 (2) | 5.94 (5) | 0.90 (4)
4. Yamada imperfect debugging model | 4.98 (6) | 4.30 (4) | 0.81 (3)
5. PNZ model | 1.99 (3) | 6.83 (7) | 0.96 (6)
6. Pham-Zhang model | 2.12 (4) | 6.79 (6) | 0.95 (5)
7. Dependent-parameter model 1 | 43.69 (9) | 601.34 (9) | 4.53 (9)
8. Dependent-parameter model 2 | 24.79 (8) | 1.14 (2) | 0.73 (1)
9. Vtub-shaped fault-detection rate model | 1.80 (1) | 2.06 (3) | 0.77 (2)

Page 126: PDF of all Presentations

Model Comparisons & Results (cont.)

Model / Criteria                              MSE (Rank)   PRR (Rank)    PP (Rank)   NCD Value (Dk)   Model Rank
1. G-O model                                  6.61 (7)     0.69 (1)      1.10 (7)    0.115843         6
2. Delayed S-shaped                           3.27 (5)     44.27 (8)     1.43 (8)    0.139264         7
3. Inflection S-shaped                        1.87 (2)     5.94 (5)      0.90 (4)    0.077194         2
4. Yamada imperfect debugging model           4.98 (6)     4.30 (4)      0.81 (3)    0.086315         5
5. PNZ model                                  1.99 (3)     6.83 (7)      0.96 (6)    0.082414         4
6. Pham-Zhang model                           2.12 (4)     6.79 (6)      0.95 (5)    0.082015         3
7. Dependent-parameter model 1                43.69 (9)    601.34 (9)    4.53 (9)    1.079700         9
8. Dependent-parameter model 2                24.79 (8)    1.14 (2)      0.73 (1)    0.278587         8
9. Vtub-shaped fault-detection rate model     1.80 (1)     2.06 (3)      0.77 (2)    0.066303         1
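The NCD column can be reproduced from the three criteria columns. The sketch below assumes equal weights (w_j = 1 for every criterion) and a Euclidean distance over the normalized criteria; both are assumptions inferred from the tabulated values rather than stated on the slide, and the short model labels are chosen here for brevity.

```python
import math

# (MSE, PRR, PP) per model, copied from the comparison table.
criteria = {
    "G-O": (6.61, 0.69, 1.10),
    "Delayed S-shaped": (3.27, 44.27, 1.43),
    "Inflection S-shaped": (1.87, 5.94, 0.90),
    "Yamada imperfect debugging": (4.98, 4.30, 0.81),
    "PNZ": (1.99, 6.83, 0.96),
    "Pham-Zhang": (2.12, 6.79, 0.95),
    "Dependent-parameter 1": (43.69, 601.34, 4.53),
    "Dependent-parameter 2": (24.79, 1.14, 0.73),
    "Vtub-shaped": (1.80, 2.06, 0.77),
}
weights = (1.0, 1.0, 1.0)  # assumption: equal weights


def ncd(criteria, weights):
    """Distance from the origin of each model's normalized, weighted criteria."""
    d = len(weights)
    totals = [sum(vals[j] for vals in criteria.values()) for j in range(d)]
    return {
        name: math.sqrt(sum((vals[j] / totals[j] * weights[j]) ** 2 for j in range(d)))
        for name, vals in criteria.items()
    }


distances = ncd(criteria, weights)
ranking = sorted(distances, key=distances.get)  # smallest distance ranks first
for rank, name in enumerate(ranking, start=1):
    print(f"{rank}. {name}: {distances[name]:.6f}")
```

Because every criterion is divided by its column total, one extreme value (such as PRR = 601.34) compresses the normalized PRR of all other models, so the ranking among them is then driven mostly by MSE and PP.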

Page 127: PDF of all Presentations

Reliability Opportunities: Big Data!

High Tech Companies in the past 20 years!

Amazon Inc.      Founded: 1994
Yahoo            Founded: 1994
eBay             Founded: 1995
Google           Founded: 1998
Facebook, Inc.   Founded: 2004
YouTube          Founded: 2005
Twitter Inc.     Founded: 2006

Page 128: PDF of all Presentations

Knowledge That Should Be Covered in Reliability Programs

[Diagram: Reliability Programs, linked to the following areas]

• Engineering Knowledge
• Computer Skill
• School-Industry Projects
• Statistics/Management Skill

Page 129: PDF of all Presentations

Have a Wonderful Day!

Page 130: PDF of all Presentations

Reliability Education Opportunity: “Reliability Analysis of Field Data”

25th Anniversary of Reliability Engineering @ University of Maryland

Vasiliy Krivtsov, PhD Sr. Staff Technical Specialist Reliability & Risk Analysis

Ford Motor Company

Page 131: PDF of all Presentations

Discussion Outline

• Introduction
• Practical Importance of Reliability Analysis of Field Data
• Modelling Peculiarities in Reliability Analysis of Field Data
  – Staggered Production/Sales
  – Bivariate Models (Time & Usage)
  – Seasonality
  – Data Maturation Issues
• Illustrative Case Studies
• Proposed Course Structure
• Conclusions

Page 132: PDF of all Presentations

Practical Importance of Reliability Analysis of Field Data

• Root cause analysis and future failure avoidance through statistical engineering inferences on the failure rate trends and factors (covariates) affecting them
• Lab test calibration by equating percentiles of the failure time distributions in the field and in the lab
• Cost avoidance through early detection of field reliability problems
• Cash flow optimization through the prediction of the required warranty reserve and/or the expected maintenance costs

Page 133: PDF of all Presentations

Staggered Production/Sales

Page 134: PDF of all Presentations

Nonparametric Estimation

Formalized Data Structure:

Number of   Time in service   Failure time intervals, j = 1, …, k
vehicles    intervals,
            i = 1, …, k       1     2     3     4     5     6     7     8     9     k
v1          1                 r11
v2          2                 r21   r22
v3          3                 r31   r32   r33
v4          4                 r41   r42   r43   r44
v5          5                 r51   r52   r53   r54   r55
v6          6                 r61   r62   r63   r64   r65   r66
v7          7                 r71   r72   r73   r74   r75   r76   r77
v8          8                 r81   r82   r83   r84   r85   r86   r87   r88
v9          9                 r91   r92   r93   r94   r95   r96   r97   r98   r99
vk          k                 rk1   rk2   rk3   rk4   rk5   rk6   rk7   rk8   rk9   rkk

Number of failures at time unit interval j, with $r_0 = 0$:

$$d_j = \sum_{p=j}^{k} r_{pj}$$

Risk set exposed at time unit interval j:

$$n_j = \sum_{p=j}^{k} \left( v_p - \sum_{q=1}^{j-1} r_{pq} \right)$$

Hazard function at the j-th failure time unit interval:

$$h_j = \frac{d_j}{n_j}$$

Page 135: PDF of all Presentations

Numerical Example

Mechanical Transfuser Example: 24MIS/Unlm usage warranty plan

Repairs by sales month (rows) and repair month (columns):

Sales    Volume   Repair Month
Month             Jan'02 Feb'02 Mar'02 Apr'02 May'02 Jun'02 Jul'02 Aug'02 Sep'02 Oct'02 Nov'02
Jan'02   10,000               1      3      6      9     15     17     20     22     41     64
Feb'02   10,000                      0      2      5     10     12     18     19     24     45
Mar'02   10,000                             1      4      5     10     14     18     20     23
Apr'02   10,000                                    1      2      7     11     16     17     20
May'02   10,000                                           0      1      6     12     17     18
Jun'02   10,000                                                  1      3      4      9     16
Jul'02   10,000                                                         2      3      7     11
Aug'02   10,000                                                                1      4      6
Sep'02   10,000                                                                       1      3
Oct'02   10,000                                                                              0
Nov'02   10,000

Estimates by month in service:

Time   Risk Set   Repairs   Risk Set (corr)   Hazard            Cum Hazard    Reliability      CDF
t      n(t)       d(t)      n'(t)             h(t)=d(t)/n'(t)   H(t)=Σh(t)    R(t)=e^{-H(t)}   F(t)=1-R(t)
0      110,000    0         110000            0                 0             1                0
1      100,000    8         100000            0.00008           0.00008       0.99992          0.00008
2      90,000     25        89992             0.00028           0.00036       0.99964          0.00036
3      80,000     46        79967             0.00058           0.00093       0.99907          0.00093
4      70,000     72        69921             0.00103           0.00196       0.99804          0.00196
5      60,000     90        59849             0.00150           0.00347       0.99654          0.00346
6      50,000     88        49759             0.00177           0.00524       0.99478          0.00522
7      40,000     79        39671             0.00199           0.00723       0.99280          0.00720
8      30,000     69        29592             0.00233           0.00956       0.99049          0.00951
9      20,000     86        19523             0.00441           0.01396       0.98613          0.01387
10     10,000     64        9437              0.00678           0.02075       0.97947          0.02053
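The numerical example can be retraced with a short script. This is a sketch under one assumption inferred from the tabulated figures: the corrected risk set n'(t) is taken as n(t) minus all repairs recorded before month t. The variable names are illustrative.

```python
import math
from itertools import accumulate

VOLUME = 10_000  # vehicles sold in each sales month (Jan'02 to Nov'02)

# repairs[s][m] = repairs among sales month s during month in service m + 1
repairs = [
    [1, 3, 6, 9, 15, 17, 20, 22, 41, 64],  # Jan'02 sales
    [0, 2, 5, 10, 12, 18, 19, 24, 45],     # Feb'02
    [1, 4, 5, 10, 14, 18, 20, 23],         # Mar'02
    [1, 2, 7, 11, 16, 17, 20],             # Apr'02
    [0, 1, 6, 12, 17, 18],                 # May'02
    [1, 3, 4, 9, 16],                      # Jun'02
    [2, 3, 7, 11],                         # Jul'02
    [1, 4, 6],                             # Aug'02
    [1, 3],                                # Sep'02
    [0],                                   # Oct'02
    [],                                    # Nov'02 (no month in service completed)
]

k = max(len(row) for row in repairs)  # longest observed time in service

# Risk set: vehicles that have reached at least t months in service.
n = [VOLUME * sum(1 for row in repairs if len(row) >= t) for t in range(k + 1)]
# Repairs observed in month in service t (none at t = 0).
d = [0] + [sum(row[t - 1] for row in repairs if len(row) >= t) for t in range(1, k + 1)]
# Corrected risk set: remove repairs recorded before month t (assumed convention).
n_corr = [n[t] - sum(d[:t]) for t in range(k + 1)]

h = [d[t] / n_corr[t] for t in range(k + 1)]  # hazard
H = list(accumulate(h))                       # cumulative hazard
R = [math.exp(-x) for x in H]                 # reliability
F = [1.0 - r for r in R]                      # CDF

for t in range(k + 1):
    print(f"{t:2d} {n[t]:7d} {d[t]:3d} {n_corr[t]:7d} "
          f"{h[t]:.5f} {H[t]:.5f} {R[t]:.5f} {F[t]:.5f}")
```

For small hazards the CDF and the cumulative hazard nearly coincide, which is why the last two computed columns differ only in the fifth decimal place for the early months.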

Page 136: PDF of all Presentations

Mechanical Transfuser: Nonparametric Inferences

[Plot: nonparametric CDF estimate F(t) (0 to 0.020) versus time t (0 to 12); points labeled with the monthly repair counts 8, 25, 46, 72, 90, 88, 79, 69, 86.]

~1.4% failing @ 9 MIS

Concavity is an indication of an IFR (increasing failure rate). Note: F(t)≈H(t), for small F(t).

Page 137: PDF of all Presentations

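Nonparametric Estimation under a Bivariate Warranty Plan

Risk set exposed at time unit interval j:

$$n_j = \sum_{p=j}^{k} \left[ \left( v_p - \sum_{q=1}^{j-1} r_{pq} \right) \Pr\{\text{mileage} \le \text{warranty mileage limit at interval } j\} \right]$$

where the last factor is the probability of mileage not exceeding the warranty mileage limit at failure time unit interval j.

[Figure: mileage (12,000 to 60,000) versus time in service t (12 to 48 MIS).]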

Page 138: PDF of all Presentations

Weibull Probability Plot: Mechanical Transfuser Data

[Plot (ReliaSoft Weibull++ 7, www.ReliaSoft.com): Weibull probability plot of the CDF F(t) versus time t, showing the data points (labeled with repair counts 8, 25, 46, 72, 90, 88, 79, 69, 86, 64), the probability line, and 95% two-sided confidence bounds. All data, Weibull-2P, RRX SRM MED FM, F=627/S=99373. The fitted line reads 13.64% at t = 24. Generated by Vasiliy Krivtsov, 9/22/2007.]

Mechanical Transfuser – Warranty Forecast Summary:

Failure probability @ 24MIS: 0.1364
Population Size: 110,000
Total Expected Repairs: 15,004
Cost per repair: $30
Total Expected Warranty Cost: $450,120
Year-to-date Cost: $18,810
Required Warranty Reserve: $431,310
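The arithmetic behind the forecast summary is simple enough to retrace. The sketch below uses only the figures quoted in the summary; the variable names are illustrative.

```python
failure_prob_24mis = 0.1364   # Weibull CDF read at 24 months in service
population = 110_000          # vehicles in the field
cost_per_repair = 30          # dollars
year_to_date_cost = 18_810    # dollars already paid out

# Expected number of repairs over the 24-month warranty period.
expected_repairs = round(failure_prob_24mis * population)

# Total expected cost, and what still has to be reserved.
total_cost = expected_repairs * cost_per_repair
required_reserve = total_cost - year_to_date_cost

print(f"Total expected repairs:    {expected_repairs:,}")
print(f"Total expected cost:       ${total_cost:,}")
print(f"Required warranty reserve: ${required_reserve:,}")
```

The year-to-date cost is consistent with the 627 failures shown in the plot legend at $30 per repair (627 × 30 = 18,810).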

Page 139: PDF of all Presentations

Calendarized Forecasting


How will this total number of repairs be distributed along the calendar time, i.e. how many repairs to expect next month, the following month, etc.?

Page 140: PDF of all Presentations

Calendarized Forecast (generic example)

Predicted number of repairs in calendar period j:

$$d_j = \sum_{i=j}^{k} f(t_i)\, n_{ij}\, (t_i - t_{i-1})$$

Rows are time in service; columns are calendar time.

Time  Parametric  Population Exposed                            Predicted Number of Repairs
      PDF         thru    in      in      …  in      in         thru    in      in      …  in      in
                  Oct'02  Nov'02  Dec'02     Sep'04  Oct'04     Oct'02  Nov'02  Dec'02     Sep'04  Oct'04
0     0           110000                                        0       0       0          0       0
1     0.0001      100000  10000   0          0       0          6       1       0          0       0
2     0.0003      89992   10008   10000      0       0          27      3       3          0       0
3     0.0006      79967   10025   10008      0       0          49      6       6          0       0
4     0.0010      69921   10046   10025      0       0          69      10      10         0       0
5     0.0014      59849   10072   10046      0       0          84      14      14         0       0
6     0.0019      49759   10090   10072      0       0          92      19      19         0       0
7     0.0023      39671   10088   10090      0       0          93      24      24         0       0
8     0.0029      29592   10079   10088      0       0          84      29      29         0       0
9     0.0034      19523   10069   10079      0       0          66      34      34         0       0
10    0.0039      9437    10086   10069      0       0          37      40      40         0       0
11    0.0045      0       9437    10086      0       0          0       43      46         0       0
12    0.0051      0       0       9437       0       0          0       0       48         0       0
13    0.0057      0       0       0          0       0          0       0       0          0       0
14    0.0063      0       0       0          0       0          0       0       0          0       0
15    0.0069      0       0       0          0       0          0       0       0          0       0
16    0.0076      0       0       0          0       0          0       0       0          0       0
17    0.0082      0       0       0          0       0          0       0       0          0       0
18    0.0088      0       0       0          0       0          0       0       0          0       0
19    0.0094      0       0       0          0       0          0       0       0          0       0
20    0.0100      0       0       0          0       0          0       0       0          0       0
21    0.0106      0       0       0          0       0          0       0       0          0       0
22    0.0112      0       0       0          0       0          0       0       0          0       0
23    0.0118      0       0       0          10000   0          0       0       0          118     0
24    0.0124      0       0       0          10008   10000      0       0       0          124     124
total ->                                                        609     222     272     …  242     124

Total over all calendar periods: 15,004
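As a rough check of the calendarized forecast, the sketch below multiplies the tabulated parametric PDF by the population exposed in each month in service and sums over months in service. It uses the PDF values as printed (rounded to four decimals), so individual cells may differ by one repair from the slide while the monthly totals agree after rounding. The helper name `forecast` is hypothetical.

```python
# Parametric PDF by month in service, index = t (as tabulated, 4 decimals).
pdf = [0.0, 0.0001, 0.0003, 0.0006, 0.0010, 0.0014, 0.0019,
       0.0023, 0.0029, 0.0034, 0.0039, 0.0045, 0.0051]

# Population exposed in Nov'02 at t = 1..11; the same cohorts are exposed
# in Dec'02 one month in service later (t = 2..12).
exposed = [10000, 10008, 10025, 10046, 10072, 10090,
           10088, 10079, 10069, 10086, 9437]


def forecast(pdf, exposed, first_t, dt=1.0):
    """Sum f(t_i) * n_ij * (t_i - t_{i-1}) over the exposed months in service."""
    return sum(pdf[first_t + i] * n * dt for i, n in enumerate(exposed))


repairs_nov02 = forecast(pdf, exposed, first_t=1)
repairs_dec02 = forecast(pdf, exposed, first_t=2)

print(f"Predicted repairs in Nov'02: {repairs_nov02:.1f}")
print(f"Predicted repairs in Dec'02: {repairs_dec02:.1f}")
```

Shifting the same exposure vector by one month in service is what makes the forecast "calendarized": each cohort ages by one interval per calendar month and picks up the PDF value of its new age.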

Page 141: PDF of all Presentations

Time vs. Usage

Page 142: PDF of all Presentations

Time or usage?

[Two mileage-versus-time panels. Panel 1: DFR in time domain, DFR in mileage domain. Panel 2: IFR in time domain, DFR in mileage domain.]

Depending on variability in mileage accumulation rates of individual vehicles, the same data may result in a contradicting inference in time and mileage domains.

Page 143: PDF of all Presentations

Time or usage? (Hu, Lawless & Suzuki, 1998)

[Plot: cumulative hazard H(t) versus time (MIS), one curve per mileage accumulation rate: ~1.1K/mo, ~1K/mo, ~0.9K/mo, ~0.8K/mo, ~0.6K/mo.]

Note: cum haz functions in time domain appear to be dependent on mileage accumulation, which suggests that time may NOT be the appropriate domain for this failure mode.

[Plot: cumulative hazard H(t) versus mileage, for the same mileage accumulation rates: ~1.1K/mo, ~1K/mo, ~0.9K/mo, ~0.8K/mo, ~0.6K/mo.]

Note: cum haz functions in mileage domain appear to be independent of mileage accumulation, which suggests that mileage may be the appropriate domain for this failure mode.

Page 144: PDF of all Presentations

Time or usage? (Kordonsky & Gertsbakh, 1997)

[Two plots: failure density f(t) versus time (MIS) and f(t) versus mileage.]

Choose the scale that provides a lower coefficient of variation of the respective failure distribution.
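As a rough illustration of this selection rule, the sketch below compares the sample coefficient of variation (standard deviation divided by mean) of the same failures measured on both scales. The failure data are hypothetical, invented only to show the mechanics.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the sample mean."""
    return stdev(values) / mean(values)


# Hypothetical failures of five units, recorded on both scales.
failures_time_mis = [6, 12, 18, 24, 30]                 # months in service
failures_mileage = [9000, 10000, 11000, 12000, 13000]   # miles at failure

cv = {
    "time": coefficient_of_variation(failures_time_mis),
    "mileage": coefficient_of_variation(failures_mileage),
}
preferred_scale = min(cv, key=cv.get)  # the scale with the lower CV

for scale, value in cv.items():
    print(f"CV in {scale} domain: {value:.3f}")
print(f"Preferred scale: {preferred_scale}")
```

In this made-up data set the failures cluster tightly in mileage but are spread out in time, so mileage would be the preferred scale.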

Page 145: PDF of all Presentations

Data Maturity

Page 146: PDF of all Presentations

Data Maturity: Lot Rot

Data Maturity Problem:

CDF estimates for a nominally homogeneous population at a fixed failure time change as a function of the observation time.

[Plot: F(t) versus t at various observation times (Jan’06, Mar’06, May’06), compared at a fixed time t0.]

Possible cause:

“Lot Rot”, i.e., vehicle reliability degrades from sitting on the lot prior to being sold.

Solution:

Stratify vehicle population by the time spent on lot (the difference between sale date and production date).

[Plot: F(t) versus t for units with 0-10 days on lot, at observation times Jan’06, Mar’06, May’06, compared at t0.]

Page 147: PDF of all Presentations

Data Maturity: Reporting Delays

Data Maturity Problem:

CDF estimates for a nominally homogeneous population at a fixed failure time change as a function of the observation time.

[Plot: F(t) versus t at various observation times (Jan’06, Mar’06, May’06), compared at a fixed time t0.]

Possible cause:

The number of claims processed at each observation time is under-reported due to the lag between repair date and warranty system entry date.

Solution:

Adjust* the risk set by the probability of the lag time, $W_j$:

$$n_j = \sum_{p=j}^{k} \left[ \left( v_p - \sum_{q=1}^{j-1} r_{pq} \right) W_j \right]$$

[Plot: F(t) versus t for Jan’06, Mar’06, May’06, compared at t0. At each observation time, risk sets adjusted to account for the under-reported claims.]

* J. Kalbfleisch, J. Lawless and J. Robinson, "Methods for the Analysis and Prediction of Warranty Claims", Technometrics, Vol. 33, No. 3, 1991, pp. 273-285.
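A minimal sketch of the risk-set adjustment follows. It assumes that W_j is the probability that a claim occurring in interval j has already been entered into the warranty system at the observation time; the W values are hypothetical, and the risk sets and repair counts are the first three months of the mechanical transfuser example.

```python
# Corrected risk sets and repairs for months in service 1..3 (from the example).
risk_set = [100000, 89992, 79967]
repairs = [8, 25, 46]

# Hypothetical reporting probabilities: recent intervals are less completely reported.
W = [1.00, 0.95, 0.80]

# Scale each risk set by the reporting probability of its interval.
adjusted_risk_set = [n * w for n, w in zip(risk_set, W)]

hazard_raw = [d / n for d, n in zip(repairs, risk_set)]
hazard_adjusted = [d / n for d, n in zip(repairs, adjusted_risk_set)]

for j, (raw, adj) in enumerate(zip(hazard_raw, hazard_adjusted), start=1):
    print(f"j={j}: raw hazard {raw:.5f}, lag-adjusted hazard {adj:.5f}")
```

Shrinking the risk set where reporting is incomplete raises the estimated hazard for those intervals, which offsets the under-count of claims.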

Page 148: PDF of all Presentations

Data Maturity: Warranty Expiration Rush

Data Maturity Problem:

CDF estimates for a nominally homogeneous population increase disproportionately as a function of the observation time and proximity to the warranty expiration time.

[Plot: F(t) versus t at observation times Jan’06, Mar’06, May’06, with t0 and the warranty expiration time tw marked.]

Possible cause:

“Soft” (non-critical) failures tend to not get reported until the customer realizes the proximity of warranty expiration date.

Solution:

Use historical data on similar components to empirically* adjust for the warranty-expiration rush phenomenon.

[Plot: F(t) versus t for Mar’04 and May’04, with t0 and tw marked: a basis for adjustment.]

* B. Rai, N. Singh, “Modeling and analysis of automobile warranty data in presence of bias due to customer-rush near warranty expiration limit”, Reliability Engineering & System Safety, Vol. 86, Issue 1, pp. 83-94.

Page 149: PDF of all Presentations

Development of a Successful Program in Reliability and Maintainability Engineering

Dr. Wes Hines Head, Nuclear Engineering

College of Engineering The University of Tennessee

UMD Reliability Engineering Symposium April 2, 2014

Page 150: PDF of all Presentations

Overview

• Goal
  – Provide a case study that may be useful in developing new reliability programs.
• Outline
  – What Reliability programs do we have at UT
  – History of how they were developed
  – Components of the program
  – What makes them successful

Page 151: PDF of all Presentations

Reliability Programs at UT

• Reliability and Maintainability Center (RMC)
  – University-industry association dedicated to improving industrial productivity, efficiency, safety & profitability through advanced maintenance and reliability technologies and management principles
  – Industrial Center since 1996 with ~30 members
• Reliability and Maintainability Engineering Program (RME)
  – Interdisciplinary Academic Program
    • Undergraduate Minor in RME
    • Graduate Certificate and/or MS in RME
  – Local or Synchronous, Interactive Distance Delivery
• Prognostics, Reliability Optimization and Control Technologies (PROaCT) Laboratory
  – Interdisciplinary research program with professors and students in industrial, mechanical, nuclear engineering, and statistics.

Page 152: PDF of all Presentations

UT History in Industry Focused RME

• 1988 - Preventive Maintenance Engineering Laboratory (PMEL) under Nuclear Engineering
• 1995 - Proposal to Develop College-wide Maintenance and Reliability Center (MRC)
  – Industry roundtable in July
  – Director named in September
• 1996 - Initial Meeting with 12 Charter Members
• 1997 - NSF Combined Research and Curricula Development (CRCD) Grant to develop 4 MRE courses
• 1997 - Internship Program Created
• 2000 - Initial Academic Program
  – Undergraduate Certificate
  – Graduate Certificate
• 2007 - New RME Programs Approved
  – Master of Science in Reliability and Maintainability Engineering
  – Undergraduate Minor in Reliability and Maintainability Engineering
• 2009 - MS with Specialization in Prognostics
• 2010 - RME Minor most utilized minor in the COE

Page 153: PDF of all Presentations

UT Reliability and Maintainability Center

The Maintenance and Reliability Center is a university-industry association dedicated to improving industrial productivity, efficiency, safety & profitability through advanced maintenance and reliability technologies and management principles.

* Education
* Research & Technology Assessment
* Information Sharing
* Business Support & Alliances

50 Companies with a Desire to Improve

Page 154: PDF of all Presentations

Components of Reliability and Maintainability Engineering Program

• Process vs. Product Focus
• Original Academic Programs
  – Undergraduate Certificate with Industry Partnership
    • Coursework (2 courses)
    • Summer Bootcamp
    • Internship (12 weeks)
  – Graduate Certificate
    • 4 courses: 12 hours
    • Stats 560 Mathematical Statistics for Reliability
    • NE 483 Introduction to Reliability Engineering
    • NE 484 Advanced Maintenance Engineering
    • NE 579 Advanced Monitoring and Diagnostic Techniques

[Photos: Internship Class of 2000; Internship Class of 1998]

Page 155: PDF of all Presentations

Alcoa, Bayer, Dow, DuPont, Eastman, Energizer, Fluor Global, Harley Davidson, Jacobs, Nissan, NiSource, Novelis, ORNL, Owens Corning, Redstone Arsenal, SABIC, Schlumberger, SNL, Y-12, ….

Boot Camp Course

Internships

Page 156: PDF of all Presentations

Maintenance Technology Teaching Labs

Page 157: PDF of all Presentations

Real Time Interactive Distance Delivery

• Supports the working class.
• Courses are delivered live and interactively (i.e., synchronous delivery) to the student's desktop computer via the World Wide Web
• Taught in “Dual Delivery” format
• Instructor wears wireless microphone
• Local students attend class or log in from home
• Distance students
  – Multipoint audio communication
  – View slides, whiteboard, demos, etc.
  – Students can raise hands
  – Make presentations to class
  – Courses archived
• Content Delivery Methods
  – PowerPoint slides
  – Whiteboard
  – Windows application sharing
  – Video or audio clips

Page 158: PDF of all Presentations

Graduate Programs in Reliability and Maintainability Engineering

• Interdisciplinary program offered by the College of Engineering through one of the following six departments:
  – Chemical and Biomolecular Engineering
  – Electrical Engineering and Computer Science
  – Industrial and Systems Engineering
  – Materials Science and Engineering
  – Mechanical, Aerospace and Biomedical Engineering
  – Nuclear Engineering
• Offered on campus and through web-based, synchronous, interactive, distance education.
• The RME graduate certificate program (12 hours) is designed to allow the credits to be applied towards an M.S. degree.

Page 159: PDF of all Presentations

Support and Integrate with Research Programs

Page 160: PDF of all Presentations

Give your COE Graduates a Niche (RME Minor)

Fifteen hours of coursework are required:

                                                                          Hours
Core courses:                                                             6
  Introduction to Maintenance Engineering
  Introduction to Reliability Engineering
Statistics or Math Requirement (choose 1):                                3
  Probability and Statistics for Scientists and Engineers (Stats 251)
  Probability and Statistics (Math 323)
  Chemical Engineering Data Analysis (ChE 301)
  Probability and Random Variables (ECE 313)
Electives (choose at least 2):                                            6
  Process Dynamics and Control (ChE 360)
  Engineering Data Analysis and Process Improvement (IE 300)
  Statistical Process Control (Stats 365) (for non IE)
  Process Improvement through Planned Experimentation (IE 440)
  Signals and Systems (ECE 315)
  Introduction to Pattern Recognition (ECE 471)
  Mechanical Engineering Instrumentation and Measurement (ME 345)
  System Dynamics (ME 363)
  Nuclear and Radiological Engineering Laboratory (NE 304)
Total:                                                                    15

• 10% of COE graduates have the RME Minor – most desired minor in COE

Page 161: PDF of all Presentations

Summary

• Garner strong industrial support
  – Get their input on curriculum and laboratories
  – Partner through internship programs
  – Partner with research opportunities
  – Meet their needs!
• Make it available to a wide range of students
  – An interdepartmental college-based program reaches more students
  – Increase your reach through distance education
• Build expertise to increase industrial and government research opportunities
• Explain the employment benefits to increase enrollment and promote student success (students will figure this out themselves)

Page 162: PDF of all Presentations

Questions ?

Page 163: PDF of all Presentations

BREAK 3:30 p.m. – 4:00 p.m.

25th ANNIVERSARY RELIABILITY ENGINEERING SYMPOSIUM

Page 164: PDF of all Presentations

Dr. Darryll Pines, Nariman Farvardin Professor, Dean, A. James Clark School of Engineering

FUTURE AND IMPACT OF RELIABILITY ENGINEERING AT THE CLARK SCHOOL

25th ANNIVERSARY RELIABILITY ENGINEERING SYMPOSIUM

Special Presentation

Page 165: PDF of all Presentations

Impact and Future of Reliability Engineering

DARRYLL J. PINES

APRIL 2, 2014

25TH ANNIVERSARY OF CENTER ON RISK AND RELIABILITY

Page 166: PDF of all Presentations

The Center for Risk and Reliability (CRR) was formed in 1989 as the umbrella organization for many of the risk and reliability research and development activities at the UMD Clark School of Engineering. CRR research covers a wide range of subjects involving systems and processes, and includes topics on predictive reliability modeling and simulation, physics of failure fundamentals, software reliability and human reliability analysis methods, advanced probabilistic inference methods, system-level health monitoring and prognostics, and risk analysis theory and applications to complex systems such as space missions, civil aviation, nuclear power plants, petro-chemical installations, medical devices, information systems, and civil infrastructures. Over 20 core and adjunct faculty from various engineering departments of the Clark School of Engineering form the pool of experts at CRR. CRR is also home to numerous research laboratories with extensive state-of-the-art equipment and high performance computers. CRR is the research arm of the Reliability Engineering educational program, the largest and most comprehensive degree-granting graduate program in the field of reliability and risk analysis of engineered systems and processes. The program offers MS, PhD, and Graduate Certificate in Reliability Engineering and Risk Analysis. All courses are available through both traditional on-campus and online delivery modes.

Center for Risk and Reliability

Page 167: PDF of all Presentations

Current Core Faculty

Page 168: PDF of all Presentations

Professor Neil Goldsman (ECE)
Professor Carol Smidts (ME, OSU)
Professor Joseph Bernstein (ECE, Israel)

Adjunct Faculty and Lecturers

Dr. Stuart Katzke (NIST)
Dr. Nathan Siu (NRC)
Dr. Norman Eisenberg (Independent Consultant)
Dr. Mark Kaminskiy (CRR-CEE)
Dr. Roy Schuyler (Independent Consultant)

Affiliate and Adjunct Faculty

Al-Sheikhly, Mohamad
Professor, Materials Science and Engineering
2309F Chemical and Nuclear Engineering Building
Phone: 301-405-5214

Desai, Jaydev
Professor, Mechanical Engineering
0160 Glenn L. Martin Hall
Phone: 301-405-4427

di Marzo, Marino
Professor, Fire Protection Engineering
3104B JM Patterson Building
Phone: 301-405-5257

Sandborn, Peter
Professor, Director of MTECH, Mechanical Engineering
2106A Glenn L. Martin Hall
Phone: 301-405-3167

Schmidt, Linda
Associate Professor, Mechanical Engineering
2104B Glenn L. Martin Hall
Phone: 301-405-0417

Page 169: PDF of all Presentations

Impact-Rankings

Microsoft Academic Ranking Reliability Engineering (based on publications)

1. City University of Hong Kong
2. Sandia National Laboratories
3. University of Southern California
4. National University of Singapore
5. University of California Berkeley
6. Politecnico di Milano
7. University of Electronic Science & Technol...
8. University of Maryland
9. University of Manchester
10. Stanford University

2010 NRC Rankings (Industrial Engineering, Operations Research, Reliability Engineering)

1. Stanford University-Management Science and Engineering, 1-2
2. Massachusetts Institute of Technology-Operations Research, 2-4
3. Georgia Institute of Technology-Main Campus-Industrial Engineering, 1-3
4. Northwestern University-Industrial Engineering and Management Sciences, 4-12
5. Carnegie Mellon University-Operations Res/Information Systems/Manufacturing and Operating Systems, 4-17
6. University of California-Berkeley-Industrial Engineering and Operations Research, 3-10
7. University of Michigan-Ann Arbor-Industrial Operations and Engineering, 4-11
8. Cornell University-Operations Research, 6-18
9. Carnegie Mellon University-Engineering and Public Policy, 8-28
10. Purdue University-Main Campus-Industrial Engineering, 6-22
11. Princeton University-Operations Research and Financial Engineering, 11-29
12. University of Iowa-Industrial Engineering, 11-37
13. University of Nebraska-Lincoln-Industrial and Management Systems Engineering, 28-65
14. University of Wisconsin-Madison-Industrial Engineering, 6-22
15. Virginia Polytechnic Institute and State University-Industrial and Systems Engineering, 5-28
16. University of Florida-Industrial and Systems Engineering, 12-40
17. University at Buffalo-Industrial Engineering, 27-53
18. University of Pennsylvania
19. Operations and Information Management, 5-26
20. Arizona State University-Industrial Engineering, 11-34
21. Pennsylvania State University-Main Campus-Industrial and Manufacturing Engineering, 7-23
22. University of Pittsburgh-Pittsburgh Campus-Industrial Engineering, 30-55
23. University of Maryland-College Park-Reliability Engineering, 6-29

2015 US News Mechanical Engineering Ranking

1   Stanford University
1   Massachusetts Institute of Technology
3   California Institute of Technology
3   University of California--Berkeley
5   Georgia Institute of Technology
5   University of Illinois--Urbana-Champaign
5   University of Michigan--Ann Arbor
8   Princeton University
8   Cornell University
10  Purdue University--West Lafayette
10  Carnegie Mellon University
10  University of Texas--Austin (Cockrell)
13  University of California--Los Angeles (Samueli)
13  Northwestern University (McCormick)
13  Johns Hopkins University (Whiting)
16  University of Minnesota--Twin Cities
17  University of Maryland--College Park (Clark)
17  Pennsylvania State University--University Park
17  Texas A&M University--College Station (Look)
17  Virginia Tech
21  University of California--San Diego (Jacobs)
21  University of Wisconsin--Madison
23  Rensselaer Polytechnic Institute
23  Ohio State University
23  University of Washington

Page 170: PDF of all Presentations

Impact-Prestige

Professional Society Fellows of Center

• Mohammad Modarres
  – Fellow, American Nuclear Society
• Ali Mosleh
  – Fellow, Society for Risk Analysis
• Bilal Ayyub
  – Fellow, ASEE
• Shapour Azarm
  – Fellow, ASME
• Greg Baecher
  – Fellow, ASCE
• Aris Christou
  – Fellow, ASME
  – Fellow, APS

Faculty Service on Leading Journals

• Editorial Boards/Associate Editors
  – Reliability Engineering and System Safety Journal
  – Journal of Risk and Reliability
  – International Journal on Performability Engineering
  – International Journal of Reliability and Safety (IJR)
  – SNAME’s Journal of Ship Research, Ships and Offshore Structures Journal, Naval Engineers Journal (NEJ)

Page 171: PDF of all Presentations

Impact-NAE

For contributions to the development of Bayesian methods and computational tools in probabilistic risk assessment and reliability engineering.

For contributions to national defense and security through improved battlefield communication. Also Inducted in May 2004 for innovative engineering and entrepreneurship in communications technologies.

For the development, explication, and implementation of probabilistic- and reliability-based approaches to geotechnical and water-resources engineering.

Page 172: PDF of all Presentations

Impact-Awards

Significant Junior Faculty Awards/Recognition

1. Michel Cukier, NSF CAREER
2. Jeffrey Herrmann, Innovator of year
3. Monifa Vaughn-Cooke

Page 173: PDF of all Presentations

Impact-Book/Monograph Contributions

Page 174: PDF of all Presentations

Impact-Partnerships

CRR Research Partnerships

• Cooperative Research Agreements with government agencies:
  – US NRC
  – US Navy /NAVAIR-NAWCAD
  – NASA
  – EC Halden Research Center, Norway
  – EEC Joint Research Center, Italy
  – ETH Center for System Safety, Switzerland
  – Norwegian Institute of Technology
  – Paul Scherrer Research Institute, Switzerland
• Partnership with the industry:
  – ManTech
  – Reliability Information Analysis Center

RIAC Partnership

Page 175: PDF of all Presentations

Impact-Education Innovations

Professional Education-OAEE

• Online Professional Masters Degree
• Graduate Certificate

#1  Columbia University (Fu Foundation), New York, NY
#2  University of California--Los Angeles (Samueli), Los Angeles, CA
#3  University of Wisconsin--Madison, Madison, WI
#4  University of Southern California (Viterbi), Los Angeles, CA
#5  Pennsylvania State University--World Campus College, PA
#6  Purdue University--West Lafayette, West Lafayette, IN
#7  University of Michigan--Ann Arbor, Ann Arbor, MI
#7  Virginia Tech, Blacksburg, VA
#9  North Carolina State University, Raleigh, NC
#9  Texas A&M University--Kingsville (Dotterweich), Kingsville, TX
#11 Arizona State University (Fulton), Tempe, AZ
#12 Polytechnic Institute of New York University, New York, NY
#12 South Dakota School of Mines and Technology, Rapid City, SD
#14 Johns Hopkins University (Whiting), Baltimore, MD
#14 University of Maryland--College Park (Clark), College Park, MD
#16 California State University--Fullerton, Fullerton, CA
#17 Cornell University, Ithaca, NY
#17 Lawrence Technological University, Southfield, MI
#17 Missouri University of Science & Technology, Rolla, MO
#20 Texas Tech University (Whitacre), Lubbock, TX

since 1993 are as follows:

MS – 211
PhD – 97

Per OAEE’s records, the Master of Engineering and Graduate Certificate in Engineering degrees awarded since 1994 and 2000 respectively are as follows:

M. Eng. Reliability On-Campus   46
M. Eng. Reliability Online      16
Total M. Eng.                   62
GCEN Reliability On-Campus      10
GCEN Reliability Online         22
Total GCEN                      32

Page 176: PDF of all Presentations

Impact-Placement of MEng, MS and PhDs

2006 Kristine Fretz (currently with Johns Hopkins Applied Research Lab.)
2004 S. Chamberlain (Currently, ITT - Industrial Products Group Reliability Specialist and Area Manager, ITT Industries)
2003 Chi Yeh (Currently, Systems Engineering & Integration Branch, NASA, Glenn Research)
2001 F. Li (Currently, Materials Research Scientist, Corning, Inc.)
2000 F. Joglar (Currently, Manager, Fire Risk Group, SAIC)
2000 V. Krivtsov (Currently, Ford Technical Leader for Reliability & Statistical Analysis, Ford Motor Company)
1998 H. Hadavi (Energy Research Corp., Rockville, MD)
1998 J. O’Brien (Currently Director of Office of Nuclear Safety, DOE)
1998 Y. Guan (President and CEO, Advanced System Technology Management, Inc.)
1998 K. Ouliddren (Currently, Staff Researcher, Nuclear Research Centre SCK-CEN, Mol, Belgium)
1997 T. Ni (Currently, Deputy Dean, Shanghai University, China)
1997 A. Thunem (Currently, Halden Reactor Project, Norway)
1994 Y-S. Hu (Currently, Dean, Beijing Technology & Business University, and CEO of DML International Corp.)
1991 L. Hammersten (Currently, Research Analyst, MITRE Corp.)
1990 L. Chen (VP at JP Morgan)

Page 177: PDF of all Presentations

Work on Grand Challenge Problems

Disaster Resilience

Risk and Reliability of Critical Infrastructure

Page 178: PDF of all Presentations

Work on Grand Challenge Problems

Global Public and Human Health

• Risk and Reliability of Devices and System

Page 179: PDF of all Presentations

What of the Future?

New Faculty Hires in ME-Reliability Engineering:
• Monifa Vaughn-Cooke
• Offers out to at least 2 individuals

Facilities:
• Upgrades to Virtual Reality Cave under review to support future research thrusts

Education:
• Develop MOOC Course Series in Reliability Engineering

Page 180: PDF of all Presentations

Some Perspectives from Dilbert

Page 181: PDF of all Presentations

Dr. Monifa Vaughn-Cooke Assistant Professor

FACULTY VISION FOR THE FUTURE OF RELIABILITY ENGINEERING

25th ANNIVERSARY RELIABILITY ENGINEERING SYMPOSIUM

Page 182: PDF of all Presentations

Dr. Mohammad Modarres, Minta Martin Professor

CLOSING REMARKS

25th ANNIVERSARY RELIABILITY ENGINEERING SYMPOSIUM

Page 183: PDF of all Presentations

Join us for the

ANNIVERSARY RECEPTION AND ALUMNI REUNION

25th ANNIVERSARY RELIABILITY ENGINEERING SYMPOSIUM

5:00 p.m. – 7:00 p.m.

Presentations by
Dr. Marvin Roush, Professor Emeritus
Frank Groen, Ph.D., ’00
Tim Hajenko, M.S., ’13
Lesa Ross, Ph.D., ’09
Ken LaSala, Ph.D., ’93

Page 184: PDF of all Presentations

THANK YOU TO OUR GENEROUS EVENT SPONSORS

25th ANNIVERSARY RELIABILITY ENGINEERING SYMPOSIUM

ISSA Technologies, Inc.