PDF of all Presentations

25th Anniversary Symposium Promise of a Discipline:

Reliability & Risk in Theory and Practice

Dr. Ali Mosleh Nicole J. Kim Eminent Professor of Reliability Engineering

WELCOME & OPENING REMARKS

25th ANNIVERSARY RELIABILITY ENGINEERING SYMPOSIUM

Dr. George Dieter Professor Emeritus Glenn L. Martin Institute Professor of Engineering

HISTORY OF THE RELIABILITY ENGINEERING PROGRAM


Dr. B. Balachandran Minta Martin Professor & Chair, Department of Mechanical Engineering

RELIABILITY ENGINEERING IN THE DEPARTMENT OF MECHANICAL ENGINEERING


Dr. Elias L. Anagnostou, Northrop Grumman
Dr. Aris Christou, University of Maryland
Dr. Antoine B. Rauzy, Centrale-Supélec
Dr. Carol Smidts, The Ohio State University

Moderator: Dr. Ali Mosleh

Frontiers of Reliability Engineering


PANEL 1

Frontiers of Reliability Engineering

Panel 1 25th Anniversary Symposium

Promise of a Discipline: Reliability and Risk in Theory and Practice

University of Maryland College Park April 2, 2014

Frontiers…
• Integrated Probabilistic Simulation (for design and operational phases)
• Probabilistic Physics of Failure
• X-Ware Systems Reliability
  – Hardware/Software/Human
  – Interface failures
  – Soft Causal Models
• Hybrid Methods
• Advanced Inference Methods (doing more with less)
• New Modeling Languages
• Model-Based System Health Management
• Model-Based System Engineering
• HAL-9000
• Resilience Engineering

Reliability and Risk in Theory and Practice
University of Maryland
April 2, 2014

Elias Anagnostou
Engineering Fellow, Research and Technology

Panel 1: Frontiers of Reliability Engineering

Approved for Public Release, Distribution Unlimited : Northrop Grumman Aerospace Systems Case 14-0601 Dated March 2014


Need For Risk-Based Fleet Management

• Issues

– Uncertainties in legacy approach force conservatisms

– Austere budgets now drive the need to extract all remaining capabilities while minimizing risks

– Pervasive objectives to increase readiness and lower life cycle costs necessitate a change in the current paradigm

– New vehicle requirements for reduced weight and longer life dictate a need for high-fidelity methods to manage risk

• Approach

– Advanced modeling and simulation tools that link materials-design-manufacturing-sustainment (Digital Thread)

– Virtual representation of a system as an integrated system of data, models, and analysis tools applied over the entire life cycle on a tail-number unique basis (Digital Twin)

– Concurrent Uncertainty Management across the material system life cycle

• Enables improved reliability, affordability and maintainability with an overall goal to reduce total ownership costs

– Sustainment - Approximately 40% more life can be extracted without structural modifications (DARPA/Navy Structural Integrity Prognosis System (SIPS) demonstrated results)



Probabilistic Predictions Updated By Imperfect Sensor Evidence

[Figure: defect size predicted under anticipated usage versus actual usage, with the prediction updated with sensor data]

DARPA/Navy Structural Integrity Prognosis System (SIPS)

• Prognosis system to manage uncertainty and provide actionable information for risk-informed fleet management
  – Increase asset availability and reduce cost without increasing risk

Approach

Develop the underlying critical technologies that enable prognosis and the demonstration of these in an integrated PROGNOSIS system:

– Physics-based modeling that captures interactions between structural damage drivers and material failure mechanisms

– Sensors that measure critical vehicle and materials parameters

– Reasoning and predictive modules that accept, compare, interpret and correlate the data from the sensors and models to provide structural reliability predictions

[Figure: Software System integrating Sensor Systems, Physics-based Models, and Reasoning & Prediction]

OUTPUT: Current and future state probabilities



SIPS Program Organization

Prognosis Program
• Program Manager: Madsen
• Principal Scientist: Papazian
• Materials & Modeling: Anagnostou
• Sensor Systems: Silberstein
• Reasoning & Predictions: Engel
• System Architecture: Teng
• Demonstrations: Anagnostou, Engel

An integrated team of ≈ 75 engineers, scientists, professors and graduate students

Disciplines: Structures, Material Science, Manufacturing, Computer Science, Experimentalists, Info Management, Mathematics, Sensor Science, Chemistry



SIPS Uncertainty Management

[Figure: Current State estimate from Physics-Based Models]

Combines all the available information while accounting for their respective uncertainties:
• Model uncertainty
• Usage uncertainty
• False and missed indications
• Assessment uncertainty
• Repair effectiveness
• As-manufactured state
• Material properties
• Environment
• Maintenance-induced damage
• Missing & corrupted data



SIPS Research Progression to Flight Demonstration

• Fixed-Wing Structures Application



Multi-scale Environment

[Figure: row of fastener holes with detail of Hole #14 and microstructurally small cracks; scale bars 5.7 mm, 1 mm and 100 µm]

Microstructural Origins of Fatigue (7075)

[Figure: microstructurally small cracks; scale bar 1 mm]



Failure Progression From Initial State to Failure

Multiple cracked particles cause multiple micro-structurally small cracks. Some arrest, some grow then link together and form the dominant crack that leads to failure.

* Typical images from multiple samples


Physics-Based Models for Fatigue Life Prognosis

$N_{\text{total}} = N_{\text{incubation}} + N_{\text{nucleation}} + N_{\text{small crack}} + N_{\text{large crack}}$

• Geometric Approach: models for incubation & nucleation stages by coupling experimental observations with micro-mechanical crystal plasticity simulations
• Multi-Stage Fatigue/VPS-MICRO: models for nucleation & small crack growth
• FASTRAN (& Crack Coalescence): models for small crack growth & link-up
• FASTRAN/UniGrow

[Figure: model equations for each stage, including crack growth rate $da/dN$ expressed through $\Delta CTD$ and $\Delta CTD_{TH}$ for HCF-dominated and LCF-dominated loading, and initial crack size $a_i = 0.5625\,D_p$; the full expressions are not legible in this copy]

[Figure: 7075-T651 micrographs at Cycle 0, Cycle 1, Cycle 100 and Cycle 3000 (panels a to c) with loading direction; scale bars 250 mm and 10 mm as printed]

Multiscale Fatigue Modeling Environment



Multi-Scale Modeling: 3D Digital Materials

• Statistical Characterization of Material

• Digital Replication of microstructure

Two program materials: 7075-T651 & 7050-T7451, and seven 7075-T651 legacy wing panel materials



Investigation of Damage Mechanics

• Experimental methods to characterize damage evolution

• Calibrate fatigue models at various length scales/damage mechanisms

Three specimens tested, ~1000 particles monitored per test



Framework Validation

• Experimental characterization of damage evolution
• Validation of probabilistic framework

Total of 35 specimens tested, 5668 cracks measured


Model Integration

• Captures critical microstructurally-sensitive damage mechanisms

• Captures probability of occurrence of life-limiting fatigue mechanisms

• Produces naturally-occurring initial crack sizes for the start of small crack growth analysis to failure

• Tailored to the as-built manufactured state per aircraft tail number

[Figure: model integration workflow]
• Inputs: Geometry, Material & Fatigue Loading (Spectrum Load); Bulk Material µStructure Statistics (Grain Orientation, Particle Aspect Ratio, Particle Size)
• Material Cyclic Response at the Notch: Multiaxial Methods, Neuber & Glinka-ESEB
• Physics-Based Models for Crack Nucleation:
  – Response Surface to Select Particles Most Likely to Crack (Incubation Filter)
  – Response Surface to Select Cracked Particles Most Likely to Spawn a Crack into the Matrix (Nucleation Filter)
• Physics-Based Initial Crack Size Distribution: P(a) versus Particle Size, a
• UniGrow/FASTRAN Predictions: Small & Large Crack Growth to failure
• Probabilistic Output: p(t), Defect Size versus Flight Hours, starting from the Initial Crack Size Distribution
• Probabilistic Predictions vs. Experiment

[Images labeled: Incubation; Nucleated at Cycle 30; 80% of 1st Cycle; 1000 Cycles]



EA-6B Outer Wing Panel Fatigue Test Prognosis Validation

• Original life predictions
  – Prior flight history of panel
  – Panel swap history
  – Pre-test NDI
  – Distribution of constituent particle sizes in 7075-T651
• Predictions modified by:
  – Null sensor readings, detection at sensor threshold, crack size estimates (all accounting for sensor accuracy and uncertainty characteristics)
• Bayesian reasoning system to make a probabilistic prediction based on uncertain input data
• As the test progressed:
  – Significant decrease in life uncertainty
  – Significant increase in predicted usable life
• Observed crack sizes validated predictions

Predictions converged to truth as test progressed

[Figure: predictions for largest crack]



SIPS P-3 Flight Demonstration

“Workable Executable Prototype” demonstration of a combination of systems consistent with Navy fleet management practice

[Figure: SIPS combines Onboard Sensors and NDI (Omni-Scan) sensor data with Physics-Based Models and Reasoning & Prediction; outputs shown are crack size versus flight hours and a hazard categorization]

HAZARD CATEGORIZATION (frequency versus severity)

FREQUENCY (per 100K Flight Hours)   CATASTROPHIC (1)   CRITICAL (2)   MARGINAL (3)   NEGLIGIBLE (4)
FREQUENT (A): = or > 100                   1                3              7              13
PROBABLE (B): 10 - 99                      2                5              9              16
OCCASIONAL (C): 1.0 - 9.9                  4                6             11              18
REMOTE (D): 0.1 - 0.9                      8               10             14              19
IMPROBABLE (E): = or < 0.1                12               15             17              20

NAVAIR Chose Vehicle & Sensor System
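The hazard categorization matrix on this slide can be read as a lookup from an event frequency and a severity class to a risk index. The following is a minimal sketch of that lookup, not the SIPS implementation. The function names are hypothetical, and the band edges are an assumption: the slide's bands leave small gaps (for example between 0.9 and 1.0) and list 0.1 under both REMOTE and IMPROBABLE, so the sketch treats each band's lower bound as inclusive.

```python
# Risk index values transcribed from the slide's matrix.
# Column order: CATASTROPHIC (1), CRITICAL (2), MARGINAL (3), NEGLIGIBLE (4).
SEVERITIES = ("CATASTROPHIC", "CRITICAL", "MARGINAL", "NEGLIGIBLE")

RISK_INDEX = {
    "A": (1, 3, 7, 13),     # FREQUENT
    "B": (2, 5, 9, 16),     # PROBABLE
    "C": (4, 6, 11, 18),    # OCCASIONAL
    "D": (8, 10, 14, 19),   # REMOTE
    "E": (12, 15, 17, 20),  # IMPROBABLE
}

# Assumed inclusive lower bounds, in events per 100K flight hours.
FREQUENCY_BANDS = ((100.0, "A"), (10.0, "B"), (1.0, "C"), (0.1, "D"))


def frequency_level(rate_per_100k_fh: float) -> str:
    """Map an event rate (per 100K flight hours) to a frequency level A..E."""
    if rate_per_100k_fh < 0:
        raise ValueError("rate must be non-negative")
    for lower_bound, level in FREQUENCY_BANDS:
        if rate_per_100k_fh >= lower_bound:
            return level
    return "E"


def hazard_risk_index(rate_per_100k_fh: float, severity: str) -> int:
    """Look up the risk index for a rate and a severity class name."""
    column = SEVERITIES.index(severity.upper())
    return RISK_INDEX[frequency_level(rate_per_100k_fh)][column]


if __name__ == "__main__":
    print(hazard_risk_index(50, "CRITICAL"))
```

A probabilistic prognosis output (a failure probability per flight hour) would first have to be converted to a rate per 100K flight hours before calling the lookup.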



Results at a Critical Location

• Model predicted ~50% probability of a significant crack in a critical location

• Sensor had no indications

• Their combination reduced the probability to ~ 1%


The Combination of Model Predictions and Sensor Evidence Agreed With All Teardown Findings
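The drop from ~50% to ~1% is what Bayes' rule gives when a sensor with a high probability of detection stays silent. The sketch below illustrates the arithmetic only. The probability of detection (0.99) and the zero false-call rate are assumptions chosen to reproduce the reported order of magnitude, not values from the SIPS program, and the function name is hypothetical.

```python
def posterior_given_no_indication(prior: float, pod: float,
                                  false_call_rate: float = 0.0) -> float:
    """P(crack present | sensor gives no indication), by Bayes' rule.

    prior           -- model-predicted probability that a significant crack exists
    pod             -- probability of detection: P(indication | crack)
    false_call_rate -- P(indication | no crack)
    """
    for name, value in (("prior", prior), ("pod", pod),
                        ("false_call_rate", false_call_rate)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1]")
    missed = prior * (1.0 - pod)                           # crack present, sensor silent
    clean = (1.0 - prior) * (1.0 - false_call_rate)        # no crack, sensor silent
    if missed + clean == 0.0:
        raise ValueError("a silent sensor is impossible under these inputs")
    return missed / (missed + clean)


if __name__ == "__main__":
    # Assumed POD of 0.99: a 50% prior falls to roughly 1%.
    print(round(posterior_given_no_indication(0.5, 0.99), 4))
```

With a lower assumed POD the posterior stays correspondingly higher, which is why the slide stresses accounting for sensor accuracy and uncertainty characteristics.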

• What Caused the Wide Disagreement Between Model and Sensor?
  – During teardown inspections we discovered that the hole had been drilled out
  – We presume this was done to remove an existing crack
  – Based on the amount of material removed, the model predicted the repair would have been performed about 2001
  – No repair records were available; however, Phased Depot Maintenance was performed 6/00 to 1/01



Reliability Physics and Engineering: Key to Transformative Research

Aris Christou, MSE and ME Department, University of Maryland

“Advanced manufacturing is a family of activities that (a) depend on the use and coordination of information, automation, computation, software, sensing, and networking, and/or (b) make use of cutting edge materials and emerging capabilities enabled by the physical and biological sciences, for example nanotechnology, chemistry, and biology. It involves both new ways to manufacture existing products, and the manufacture of new products emerging from new advanced technologies.”
President’s Council of Advisors on Science and Technology, Report to the President on Ensuring American Leadership in Advanced Manufacturing

Introduction and Motivation
• Industry profitability and success depend on yield and reliability.
• Advanced semiconductors (e.g., 2D and wide-bandgap systems) are key for numerous applications that extend from communications to automotive, defense and security.
• Manufacturing of components depends strongly on in-depth reliability studies that include physics-based approaches, complementing the currently used industry techniques, which are not adequate for improving the current status of technology.
• Point-like nano/microscopic defects can often cause a macroscopic device to fail.
• The challenge is a physics-based approach to reliability through an integration of science and engineering.
• The transformative breakthroughs will be based on reliability physics, chemistry, mathematics and engineering.

Approach
• Meeting the challenge will rely on novel material and defect characterization techniques, which are necessary to locate the prevalent defects as well as their concentration and dynamics over time.
• Dimensional reduction, lower and higher voltages, and higher frequencies negatively impact reliability.
• In-situ and ex-situ characterization will be necessary to satisfy the program’s objectives.
• Examples include reliability predictors such as spin, transport, Raman and noise spectroscopy, and imaging for defects down to monolayer size.
• The types of defects existing in the fabricated devices need to be identified. Determining which defects are the cause of failure and which are effects of the failure is very important.
• Characterization techniques with nanometer resolution, considerably finer than the apparent average separation between traps, are required. Physics-based simulation and experimental validation to further the fundamental understanding of the degradation mechanisms must also be undertaken.

Reliability Grand Challenges
• Identify and quantify the failure mechanisms arising through smaller dimensions, high electric fields, coupled effects of heat, strain, and electric polarization, gate current, and the relatively high density of extended and point defects endemic in most semiconductors.
• Gain a physics-based knowledge through extensive and targeted characterizations and analyses and incorporate it into the failure models, which can then become the basis for the new robust manufacturing science.
• Establish the basis for the new methodology for reliability prediction and manufacturing science for future technologies.
• Take basic science all the way to manufacturing through education and research and enable a competitive industry to be realized.

PAST LESSONS FROM INNOVATIVE RELIABILITY ASSESSMENT TECHNIQUES

CONVENTIONAL METHOD: Fabrication → Fixture Mounting → Step-Stress Tests → Burn-in Tests → Reliability Assessment (stage durations shown: 1 week, 1 month, 6 months; total 7 months)

NEW METHOD: Fabrication → Noise Measurements → Reliability Assessment (total 7 hours)

[Figure: Base Noise Power Density (A²)]

OUTCOME
• Determined a strong correlation between device reliability and baseband noise characteristics
• Temperature dependence of peaks in base noise power density indicates reliability
• Identified trap levels responsible for degradation from temperature-dependent noise measurements

RECENT LESSONS FROM INNOVATIVE POTENTIAL RELIABILITY ASSESSMENT TECHNIQUES

[Figure: contour map of I2D/IG from mapping on sample, 60 points over 63.5×45 μm² (X: 10 points, step 7.1 µm; Y: 6 points, step 9 µm)]

• Thin graphene layers (mono/bi/tri-layer): I2D/IG > 1
• Half of the graphene layers are covered with thin graphene layers

A. Reina et al., Nano Letters 9, 30 (2009)

Ref: H. Kim, E. Pichonat, D. Vignaud, D. Pavlidis and H. Happy; Graphene Layers Grown by RTP-CVD on Nickel and Their Properties, WOCSDICE 2012

[Flowchart: iterative development of a degradation model]
• Future Semiconductors: New Physics (high field effects, stress/temperature, mechanical)
• Future Materials test structures; Design Test structures; Basic test structures (Electrical Engineering) + parameters for modeling
• Technological effects (Physics and Engineering); Process Science, Chemistry
• Characterization techniques (Engineering, Materials, Physics, Chemistry)
• Degradation model (Physics and Math) compared against Experimental Results for Future Semiconductor Devices
• Decision: Model fits exp. results?
  – No: Change parameters/expand model, giving an “Updated” degradation model
  – Yes: New model reliably predicts degradation and allows for Robust Manufacturing

An Interdisciplinary Approach
• Device Physics and Electrical Engineering
• Mathematics and Materials Science
• Chemistry and Physics

Example of Carbon Nanotube Composite Interconnects

[Figure: Carbon Nanotube and Aluminum crystal structure]

Future Electronic Approach:
• Mathematical Simulation
• Process Science Modeling of Defects

[Figure: Yield (%), axis 0 to 100, for Wafers 1 to 6]

Education, Research and Innovation

[Figure: REPRODUCIBLE ROBUST DESIGN; ADVANCED DESIGN; CIRCUITS AND SYSTEMS; PHYSICAL PARAMETER-RELIABILITY CORRELATION]

• Establish material and device models.
• Establish correlation between physical parameters and reliability.
• Improve fabrication yield.
• Improved robustness.
• Develop compact designs.
• Improve performance with compact designs.
• Disseminate results through publications.

Outcome and Conclusions
• Promote cross-disciplinary approaches across scientific disciplines such as reliability physics, materials, chemistry and more, in addition to engineering.
• Initiate “transformative research” with societal impact, such as power electronics and transport, T-Rays and medicine, communications and low-power, which are robust and manufacturable.
• Establish new methodologies for reliability prediction and manufacturing science for future technologies.
• Provide education and research experience for future engineers in new semiconductor technologies.

Thank you for your attention

New Logic Modeling Paradigms for Complex System Reliability and Risk Analysis

Antoine Rauzy

Chair Blériot-Fabre* - Ecole Centrale de Paris Ecole Polytechnique

FRANCE

http://www.lgi.ecp.fr/pmwiki.php/PagesPerso/ARauzy

*Sponsored by SAFRAN group

Probabilistic Risk Assessment …
• … is now established on a solid scientific ground
• … is a mature technology
• … is a great tool for decision making

So, what’s next?
• More openness
• Higher level modeling languages
• Wider spectrum of applications

Standard Representation Formats

Issues
• Models are tool-dependent
• Calculations are provably difficult, so calculation engines perform unwarranted approximations

The Open-PSA Standard Representation Format for Fault Trees and Event Trees

    <define-fault-tree name="FT1" >
      <define-gate name="top" >
        <or>
          <gate name="G" />
          <basic-event name="C" />
        </or>
      </define-gate>
      <define-gate name="G" >
        <and>
          <basic-event name="A" />
          <basic-event name="B" />
        </and>
      </define-gate>
    </define-fault-tree>

Challenge/research direction: Define standard representation formats, with all the necessary constructs, with a clear and sound semantics

Version 3 of the Open-PSA standard is being drafted
• Simplifications
• Block Diagrams
• Multi-phase Markov Chains with Rewards
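To make the tool-independence point concrete, the fault tree FT1 from this slide can be read with nothing more than a standard XML parser. The sketch below parses it and derives its minimal cutsets by brute-force expansion. It is only an illustration: it handles `and`/`or` gates over `gate` and `basic-event` children, as in this example, and it is not how production PSA engines compute cutsets. The helper names are hypothetical.

```python
import xml.etree.ElementTree as ET
from itertools import product

FT1 = """
<define-fault-tree name="FT1">
  <define-gate name="top">
    <or> <gate name="G" /> <basic-event name="C" /> </or>
  </define-gate>
  <define-gate name="G">
    <and> <basic-event name="A" /> <basic-event name="B" /> </and>
  </define-gate>
</define-fault-tree>
"""


def parse_gates(xml_text):
    """Return {gate name: (operator, [(child tag, child name), ...])}."""
    root = ET.fromstring(xml_text)
    gates = {}
    for gate in root.iter("define-gate"):
        operator = gate[0]  # the single <or> or <and> element
        children = [(child.tag, child.get("name")) for child in operator]
        gates[gate.get("name")] = (operator.tag, children)
    return gates


def minimize(cutsets):
    """Drop duplicates and any cutset that strictly contains another."""
    unique = set(cutsets)
    return [cs for cs in unique if not any(other < cs for other in unique)]


def minimal_cutsets(gates, name):
    """Expand a gate recursively into its minimal cutsets (and/or only)."""
    operator, children = gates[name]
    child_sets = []
    for tag, child in children:
        if tag == "basic-event":
            child_sets.append([frozenset([child])])
        elif tag == "gate":
            child_sets.append(minimal_cutsets(gates, child))
        else:
            raise ValueError(f"unsupported element: {tag}")
    if operator == "or":
        expanded = [cs for sets in child_sets for cs in sets]
    elif operator == "and":
        expanded = [frozenset().union(*combo) for combo in product(*child_sets)]
    else:
        raise ValueError(f"unsupported operator: {operator}")
    return minimize(expanded)


if __name__ == "__main__":
    for cutset in sorted(sorted(cs) for cs in minimal_cutsets(parse_gates(FT1), "top")):
        print(cutset)
```

For FT1 the top event is `(A and B) or C`, so the minimal cutsets are {A, B} and {C}.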

New Algorithms for Model Assessment

Typical example (US plant):
• PSA model with ~2,500 basic events

What has been calculated:
• ~100,000 minimal cutsets
• 95% of the Core Damage Frequency with less than 5% of the basic events, 100% with 25%

In a word, 75% of the model is “useless”!

Issues:
• Finding the right level of abstraction is difficult

[Figure: Original Model → Minimal Cutsets → Simplified Model]

Design filtering algorithms that build simpler models that are equivalent w.r.t. observation means
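One simple form of filtering can be sketched as follows: rank minimal cutsets by probability, keep the smallest prefix that covers a target share of the total, and report which basic events survive. This is an illustration under stated assumptions, not the algorithm the slide calls for. It uses the rare-event approximation (top-event frequency taken as the sum of cutset values), and the cutsets and numbers below are made up.

```python
def filter_cutsets(cutsets, coverage=0.95):
    """Keep the most probable cutsets until `coverage` of the total is reached.

    cutsets  -- dict mapping frozenset of basic-event names to a probability
    coverage -- target fraction of the summed probability (rare-event approximation)
    Returns (kept cutsets, achieved fraction, basic events still needed).
    """
    total = sum(cutsets.values())
    if total <= 0:
        raise ValueError("cutset probabilities must sum to a positive value")
    kept, accumulated = [], 0.0
    ranked = sorted(cutsets.items(), key=lambda item: item[1], reverse=True)
    for events, probability in ranked:
        if accumulated >= coverage * total:
            break
        kept.append(events)
        accumulated += probability
    needed_events = set().union(*kept)
    return kept, accumulated / total, needed_events


if __name__ == "__main__":
    # Hypothetical cutsets and probabilities, for illustration only.
    example = {
        frozenset({"A", "B"}): 60e-6,
        frozenset({"C"}): 30e-6,
        frozenset({"A", "D"}): 6e-6,
        frozenset({"E", "F"}): 3e-6,
        frozenset({"G", "H", "I"}): 1e-6,
    }
    kept, fraction, events = filter_cutsets(example)
    print(len(kept), round(fraction, 2), sorted(events))
```

In the hypothetical example, three of five cutsets and four of nine basic events reproduce 96% of the total, which mirrors on a toy scale the slide's observation that a small part of the model carries most of the result.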

Categories of Models

Challenge/research direction: Many possibly very different models are indistinguishable by observation means, i.e. by the results of virtual experiments (typically, calculation of failure scenarios). They are equivalent in the Turing test sense. Equivalent models form a category. Design mathematical concepts, algorithms and tools to determine the most representative (simplest?) model of a category.

[Figure: several Original Models yield the same Minimal Cutsets through MCS calculation; which Representative Model?]

High Level Modeling Languages

Issues:
• Completeness of specifications with respect to safety concerns
• Distance between system specifications and safety models
• Integration with other system engineering disciplines

[Figure: System Specification, AltaRica model, Automated Generation of Fault Trees, Calculations]

AltaRica

    class component
      state Boolean working (init = true);
      event failure (delay = exponential(lambda));
      transition
        failure: working -> working := false;
    end

AltaRica features
• Formal
• Event-Based
• Textual & graphical
• Multiple assessment tools

AltaRica Mathematical Framework: Guarded Transition Systems

Well-founded generalization of:
• Fault Trees, Block Diagrams
• Markov chains, Stochastic Petri Nets

    domain componentState { STANDBY, WORKING, FAILED }
    class spareComponent
      componentState s (init = WORKING);
      Boolean demanded (reset = false);
      event
        turnOn (delay = 0, expectation = 0.98),
        failureOnDemand (delay = 0, expectation = 0.02),
        turnOff (delay = 0),
        failure (delay = exponential(0.001)),
        repair (delay = exponential(0.1));
      transition
        turnOn: s==STANDBY and demanded -> s := WORKING;
        failureOnDemand: s==STANDBY and demanded -> s := FAILED;
        turnOff: s==WORKING and not demanded -> s := STANDBY;
        failure: s==WORKING -> s := FAILED;
        repair: s==FAILED -> s := STANDBY;
    end

[Figure: state diagram. STANDBY → WORKING on turnOn (demanded?); STANDBY → FAILED on failureOnDemand (demanded?); WORKING → STANDBY on turnOff (not demanded?); WORKING → FAILED on failure; FAILED → STANDBY on repair]
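The guarded-transition reading of the spareComponent class can be sketched in a few lines: each transition is an event name, a guard over the current variable assignment, and an action producing the next assignment. This is a minimal illustration of the semantics only. It ignores delays and expectations, it is not the AltaRica 3.0 tool set, and the Python names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, object]


@dataclass(frozen=True)
class Transition:
    event: str
    guard: Callable[[State], bool]     # when the event may fire
    action: Callable[[State], State]   # assignment performed when it fires


def set_s(value: str) -> Callable[[State], State]:
    """Action that assigns s := value and leaves other variables unchanged."""
    return lambda state: {**state, "s": value}


# Transitions transcribed from the spareComponent class (delays omitted).
TRANSITIONS: List[Transition] = [
    Transition("turnOn", lambda st: st["s"] == "STANDBY" and bool(st["demanded"]), set_s("WORKING")),
    Transition("failureOnDemand", lambda st: st["s"] == "STANDBY" and bool(st["demanded"]), set_s("FAILED")),
    Transition("turnOff", lambda st: st["s"] == "WORKING" and not st["demanded"], set_s("STANDBY")),
    Transition("failure", lambda st: st["s"] == "WORKING", set_s("FAILED")),
    Transition("repair", lambda st: st["s"] == "FAILED", set_s("STANDBY")),
]

INITIAL: State = {"s": "WORKING", "demanded": False}


def enabled(state: State) -> List[str]:
    """Names of the events whose guards hold in `state`."""
    return [t.event for t in TRANSITIONS if t.guard(state)]


def fire(state: State, event: str) -> State:
    """Fire `event` if its guard holds and return the successor state."""
    for t in TRANSITIONS:
        if t.event == event and t.guard(state):
            return t.action(state)
    raise ValueError(f"event {event!r} is not enabled in {state}")


if __name__ == "__main__":
    print(enabled(INITIAL))
    print(fire(INITIAL, "failure"))
```

In standby with a demand pending, both turnOn and failureOnDemand are enabled; in the AltaRica source their expectations (0.98 and 0.02) decide between them, which this sketch leaves out.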

The AltaRica 3.0 Project

[Figure: the AltaRica 3.0 tool chain]
• Models: AltaRica 3.0 (class Pump … end), Libraries, Patterns, Reliability Data
• Related formalisms: SysML, AADL, FMEA, Petri Nets, Dynamic Fault Trees
• Core: Guarded Transition Systems
• Assessment tools: compilation to Fault Trees, compilation to Markov Chains, generation of sequences, model checking, stochastic simulation, reliability allocation
• Environment: GUI for modeling, GUI for simulation, Version & Configuration Management System
• Sample output: plot titled “Probabilité de l'ER” (probability of the feared event), Pr[top event] from 0.2 to 1.0 over a horizontal axis from 0 to 8000

Performance Assessment

Issues:
• The business model of industry is moving from selling products to selling capacities
• Companies have to make commitments and, to do so, must assess the performance of systems in the presence of hazards

⇒ PRA languages and tools are well suited to assess capacities (it mainly suffices to assess mathematical expectations rather than probabilities)

Carol Smidts
Department of Mechanical and Aerospace Engineering
The Ohio State University

To be presented at the Reliability Engineering 25th Anniversary Symposium
April 2, 2014
University of Maryland, College Park

CHARACTERISTICS IN CONTRAST

MORE RECENT
• First hardware reliability paper appears in 1952 in Proceedings of the Institute of Radio Engineers.
• First software reliability paper appears in 1975 in IEEE Transactions on Software Engineering.

MORE COMPLEX
• The complexity of typical hardware systems is several hundreds of components (e.g., nuclear power plants).
• The complexity of current software systems is millions of lines of source code (e.g., 15 million for the Linux kernel). Assuming a typical function consists of 200 lines of code, there are approximately 75,000 functions in the Linux kernel.

CHARACTERISTICS IN CONTRAST

EVOLVES EXTREMELY FAST
• The number of important programming languages introduced per decade is approximately 10. This number has been constant since 1950.

[Figure: Number of Important Programming Languages Emerged in each Decade, 1950 to 2000; vertical axis 0 to 16]

CHARACTERISTICS IN CONTRAST

EVOLVES EXTREMELY FAST (Cont’d)
• Programming paradigms have changed from non-structured to structured, procedural to object-oriented.
• Six main paradigms currently coexist: imperative, declarative, functional, object-oriented, logic and symbolic.

ALWAYS TIED TO HARDWARE
• Software does not run in isolation
• Software is tied to a computer platform
• As such, failures are never observed in isolation
• This has led some to not want software to be modeled at all

CHARACTERISTICS IN CONTRAST

DIFFERENT FAILURE MODE
• Hardware:
  – Hardware wears out, leading to degraded performance
  – Failures are triggered by harsh environments such as excess heat and radiation
• Software:
  – Software does not wear out
  – Failures are due to latent faults that are triggered and propagate into failures

HIGHLY DEPENDENT UPON ITS ENVIRONMENT
• Software is particularly sensitive to the environment

CONTINUITY ASSUMPTION ONLY VALID WITHIN THE CONFINES OF A LARGE NUMBER OF SMALL SUBDOMAINS
• Predicates create non-continuous behavior in program logic.
• The typical ratio of predicates to lines of code is on the order of 1/10.

ONE OF A KIND
• Data is difficult to collect

CURRENT AREAS OF RESEARCH

[Figure: chart of current research areas. Labels: SRGM; Reliability / Test / Cost; Measures; Domain Characteristics; Dependable Systems; Other; with breakdowns Embedded (71%), Web (14%), Service (14%) and Architecture (50%), Modeling (50%)]

Based on a review of papers published between 2008 and 2013 in the Proceedings of the International Symposium on Software Reliability Engineering (ISSRE) [excludes 2012]

AREAS OF RESEARCH EXAMPLES: OP DEFINITION

[Figure: the software's operational context. Environment; Institutions/Customers (Factory, Power Plant, Bank, School, Corporation); Network of Computer (1), Computer (2), …, Computer (n); each Computer (Hardware) running Software]

Extract from: Smidts, C., Mutha, C., Rodríguez, M., & Gerber, M. J. (2014). Software testing with an operational profile: OP definition. ACM Computing Surveys (CSUR), 46(3), 39.

AREAS OF RESEARCH EXAMPLES: OP DEFINITION

[Figure: class diagram of the OP definition. Elements: Application OP, Profile, Structure, Inputs, Input Data, Values, Variable Name, Data Types, Input Data Constraints, Source Code, Context, Critical Operations, Executive Scope, External Error, Abstraction Level (Component Level, System Level), Field-of-Interest; attributes Executable: Y/N, LPhase: Early/Later, ToolSupp: Y/N; associations include considers, requires, uses, derived from, mapping, modifies, adds dimension, changes, extends, input]

Attribute             Breakdown (%)
Profile               Single 68.5; Multiple 31.6
Structure             Tree 30; State 50; Set 20
Originator            HW 13.1; SW 8.7; Human 43.5; Unspecified 34.8
Critical Operations   Aware 21.1; Unaware 79
Executive Scope       Aware 15.8; Unaware 84.3
External Error        Aware 10.6; Unaware 89.5
Tool Support          Auto 57.9; Non-auto 42.2
Abstraction Level     Component 15.8; System 84.3
Lifecycle phase       Early 84.3; Late 10.6; Unspecified 5.3

Extract from: Smidts, C., Mutha, C., Rodríguez, M., & Gerber, M. J. (2014). Software testing with an operational profile: OP definition. ACM Computing Surveys (CSUR), 46(3), 39.

AREAS OF RESEARCH EXAMPLES: SOFTWARE AND HARDWARE RELIABILITY

ALU maps, showing usage and probability profiles. (a) Usage in terms of number of demands. (b) Delay probability profile. (c) Different-Function probability profile. (d) Stuck-at probability profile. (e) Combined failure probability profile.

Extracted from: Bing H.; Rodriguez, M.; Ming Li; Bernstein, J.B.; Smidts, C.S., "Hardware Error Likelihood Induced by the Operation of Software," IEEE Transactions on Reliability, vol. 60, no. 3, pp. 622-639, Sept. 2011. “© 20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.”

AREAS OF RESEARCH EXAMPLES: SOFTWARE MEASURES TO SOFTWARE RELIABILITY

Group   Root Metric   Total Rate   Rank   Inaccuracy Ratio
I       BLOC          0.4          L      5.3764
I       CMM           0.6          M      5.5091
I       CC            0.72         H      5.6927
I       FP            0.5          L      5.2303
I       RSCR          0.69         M      5.3095
I       SDC           0.53         M      4.3765
II      CEG           0.44         L      2.7243
II      CF            0.81         H      1.4662
II      COM           0.36         L      2.7211
II      DD            0.83         H      0.1853
II      RT            0.55         M      0.0334
III     FD            0.72         H      0.7397
III     TC            0.68         M      0.2146

Inaccuracy ~ Group + Strata + Group*Strata

Term           Sum Sq
Group          54.556
Strata          3.986
Group:Strata    2.424
Residuals       1.901
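One way to read the sums of squares is as shares of the total variation in the inaccuracy ratio. The sketch below computes each term's share (the eta-squared effect size). That measure is an addition for illustration and does not appear on the slide; only the four Sum Sq values are taken from it.

```python
# Sums of squares transcribed from the slide's ANOVA table.
SUM_SQ = {
    "Group": 54.556,
    "Strata": 3.986,
    "Group:Strata": 2.424,
    "Residuals": 1.901,
}


def variance_shares(sum_sq):
    """Return each term's fraction of the total sum of squares (eta squared)."""
    total = sum(sum_sq.values())
    if total <= 0:
        raise ValueError("total sum of squares must be positive")
    return {term: value / total for term, value in sum_sq.items()}


if __name__ == "__main__":
    for term, share in variance_shares(SUM_SQ).items():
        print(f"{term:<13} {share:6.1%}")
```

On these numbers the Group term accounts for roughly 87% of the total sum of squares, consistent with the inaccuracy ratios in the table clustering by group.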

AREAS OF RESEARCH EXAMPLES: CHARACTERIZING SOFTWARE FAILURE MECHANISMS

#   Defect Name
1   Missing function
2   Extra function
…

[Figure: two function diagrams, each with functions F1, F2 and F3]

NEW AREAS OF RESEARCH

THANK YOU

QUESTIONS?

BREAK 10:30 a.m. – 11:00 a.m.

Dr. George Apostolakis, Nuclear Regulatory Commission
Ms. Maria Korsnick, Constellation Energy
Mr. Thomas D. Whitmeyer, NASA

Moderator: Dr. Ali Mosleh

Risk-Informed Regulations, Oversight, and Emergency Response


PANEL 2

Commissioner George Apostolakis U.S. Nuclear Regulatory Commission


25th Anniversary of the Reliability Engineering Education Program

The Center for Risk and Reliability University of Maryland

April 2, 2014

Risk-Informed Regulation at the U.S. NRC

NRC Oversight

[Figure: areas under NRC oversight: New Reactors, Power Reactors, Uranium Conversion, Uranium Enrichment, Transportation, Storage, Waste Disposal, Medical/Industrial]

The Traditional Approach to Regulation (Before Risk Assessment)

• Management of uncertainty (unquantified at the time) was always a concern.

• Defense-in-depth and safety margins became embedded in the regulations (structuralist approach)

• “Defense-in-Depth is an element of the NRC’s safety philosophy that employs successive compensatory measures to prevent accidents or mitigate damage if a malfunction, accident, or naturally caused event occurs at a nuclear facility.” [Commission’s White Paper, February, 1999]

• Questions that defense in depth addresses:
  – What if we are wrong?
  – Can we protect ourselves from the unknown unknowns?

Design Basis Accidents

• A design basis accident is a postulated accident that a facility is designed and built to withstand without exceeding the offsite exposure guidelines of the NRC’s siting regulation

• They are very unlikely events

• They protect against “unknown unknowns”


Technological Risk Assessment (Reactors)

• Study the system as an integrated socio-technical system

• Probabilistic Risk Assessment (PRA) supports Risk Management by answering the questions:
  – What can go wrong? (thousands of accident sequences or scenarios)
  – How likely are these scenarios?
  – What are their consequences?
  – Which systems and components contribute the most to risk?

What Did We Learn from the Reactor Safety Study?

Prior Beliefs:
1. Protect against large loss-of-coolant accident (LOCA)
2. Core damage frequency (CDF) is low (about once every 100 million years, $10^{-8}$ per reactor year)
3. Consequences of accidents would be disastrous

Major Findings
1. Dominant contributors: Small LOCAs and Transients
2. CDF higher than earlier believed (best estimate: $5\times10^{-5}$ per reactor year, once every 20,000 years; upper bound: $3\times10^{-4}$ per reactor year, once every 3,333 years)
3. Consequences significantly smaller
4. Support systems and operator actions very important

Beckjord et al, Reliability Engineering and System Safety, 39 (1993) 159-170.
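The "once every N years" figures quoted for the Reactor Safety Study are the reciprocals of the per-reactor-year frequencies. A short sketch of that conversion follows; the function name is hypothetical and the frequencies are the ones stated on the slide.

```python
def return_period_years(frequency_per_reactor_year: float) -> float:
    """Mean interval between events, in reactor years, for a given frequency."""
    if frequency_per_reactor_year <= 0:
        raise ValueError("frequency must be positive")
    return 1.0 / frequency_per_reactor_year


if __name__ == "__main__":
    cases = (
        ("prior belief", 1e-8),
        ("best estimate", 5e-5),
        ("upper bound", 3e-4),
    )
    for label, cdf in cases:
        print(f"{label:<14} {cdf:.0e} per reactor year -> once every "
              f"{return_period_years(cdf):,.0f} years")
```

The figure is per reactor: for a fleet of n reactors the expected interval between events anywhere in the fleet is shorter by a factor of n.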

PRA Model Overview and Subsidiary Objectives

Level   Model                    Results                                             Subsidiary objective
I       PLANT MODEL              Accident sequences leading to plant damage states   CDF $10^{-4}$/ry
II      CONTAINMENT MODEL        Containment failure/release sequences               LERF $10^{-5}$/ry
III     SITE/CONSEQUENCE MODEL   Public health effects                               QHOs

PLANT MODE: At-power Operation; Shutdown / Transition Evolutions

SCOPE: Internal Events; External Events

Uncertainties

PRA Policy Statement (1995)

• The use of PRA should be increased to the extent supported by the state of the art and data and in a manner that complements the defense-in-depth philosophy

• PRA should be used to reduce unnecessary conservatisms associated with current regulatory requirements


Risk-Informed Framework

Traditional “Deterministic” Approach
• Unquantified probabilities
• Design-basis accidents
• Defense in depth and safety margins
• Can impose unnecessary regulatory burden
• Incomplete

Risk-Based Approach
• Quantified probabilities
• Thousands of accident sequences
• Realistic
• Incomplete

Risk-Informed Approach
• Combination of traditional and risk-based approaches through a deliberative process

The Deliberation

[Figure 3-2, Deliberations, from NUREG-2150, A Proposed Risk Management Regulatory Framework. Elements feeding the deliberation: options; technical analysis (one or more techniques); assumptions, uncertainties and sensitivities; stakeholder input; decision criteria; resource and schedule constraints; other factors. Outcome: decision and implementation.]

Evolution of the NRC’s Risk-Informed Regulatory System


• 1980s: New or revised regulatory requirements based on PRA insights introduced

• 1990s: Risk-informed changes to a plant’s licensing basis allowed

• 2000: Change to a risk-informed reactor oversight process made

• 2004: Risk-informed alternative to comply with fire protection requirements introduced

• 2007: Regulation requiring PRAs for licensing new reactors issued

Risk-Informed Decision Making in Regulation

• Improves safety
– New requirements (SBO, ATWS)
– Design of new reactors
– Focus on important systems and locations

• Makes regulatory system more rational
– Reduction of unnecessary burden
– Operating experience accounted for in regulations
– Consistency in regulations

The Experience

• Successes
– Maintenance rule
– Risk-informed inservice inspection
– Reactor oversight process

• Challenges
– Fire protection
– Special treatment requirements
– Risk-informing Emergency Core Cooling System rule

Summary

• Uncertainties have always been of concern in safety

• Traditional methods manage uncertainties through design basis accidents and conservatism

• Risk assessment provides a global view of accident sequences, quantifies uncertainties, and is more realistic

• Risk-informed regulation combines the best features of both approaches

14

Risk Informing the Commercial Nuclear Enterprise

Maria Korsnick Constellation Energy Nuclear Group, LLC

April 2, 2014

Promise of a Discipline: Reliability and Risk in Theory and in Practice

University of Maryland

How our Business is Risk-Informed

I. Managing Risk to the Business
II. Managing the Risk of Normal Plant Operation
III. Defining Extreme External Events
IV. Risk-Informed Lessons for External Events
V. The Path Forward

I. Managing Risk to the Business

Each CENG nuclear plant and the corporate office maintains a risk "Heat Map"
– An easy-to-read summary of the risks associated with a business unit
– A method for communicating the risks being managed

The "Delphi Method" for forecasting risk is used: experts come together to perform periodic assessments of Company risks
– Subjective (non-analytical) probability and impact assessment of each risk
– Identifies mitigating actions

Operating Fleet Heat Map (example)

[Figure: heat map of top risks. Probability axis: Rare (5%), Remote (20%), Moderate (50%), Likely (80%), Very Likely (95%). Impact axis: Insignificant, Minor, Significant, Major, Critical. Legend, Level of Control: Low, Medium, High.]

Labels shown on the map: Prolonged Forced Outage; Key Staffing; Regulatory Compliance; Nuclear Risk; Corporate / Generation; Environmental; Industrial / Radiological; Fire Protection / NFPA 805; Extended Refueling Outage; Tritium; Short term Output / Forced outages; Post-Fukushima Response; New NRC Regulations; EPA Cooling Water Intake regulation; GSI 191; Cyber Security; Flood analysis White finding (G)

Significant risks from site maps are grouped / assigned based on significance to the fleet.

Heat Map Risk Table (example)

Issue | Risk | Category | Impact | Probability | Level of Control | Mitigation
Fukushima Response | High cost of studies, modifications, uncertainty of outcomes; impact on emergency planning | Regulatory | Major | Likely | Medium | Active engagement with industry and NRC
EPA 316b Rule, Clean Water Act | Potential for significant modifications to intake structures at NY and MD sites | Regulatory | Critical | Remote | Low | Industry proposing alternatives to federal and state EPA
Key Staffing | High rate of retirements over next ten years, loss of expertise/talent | Corporate | Significant | Moderate | Medium | Implement Knowledge Transfer and Retention program

II. Managing Risk during Normal Operations
• Plant-specific PRAs model core damage and large early release frequency
• Risk impact of scheduled maintenance, plant evolutions, and system outages is analyzed
• Four risk levels are used to communicate to plant staff and set controls: GREEN, ORANGE, YELLOW, RED
• Pre-established risk mitigation measures are applied as higher risk conditions are entered

Example Plant PRA Risk

[Figure: four-panel summary of an example plant PRA]

Initiating Event Distribution: Fire, 42%; Div II AC, 8%; Other, 8%; LOOP, 7%; Div I AC, 7%; Flood, 6%; Loss of 2 SWP Pumps, 5%; Lake Intake, 4%; Feedwater, 4%; Seismic, 4%; MSIV, 3%; Condenser, 2%

Key Operator Actions (description and %CDF; chart values 30%, 15%, 9%, 4%, 3%, 3%, 2%, 2%, 2%, 1%): Respond to Control Room Fire; Control Service Water and Open Room Doors (HVAC); Align Containment Heat Removal; Vent PC (Air or Div I AC lost); Control Level to Prevent Boron Washout; Align RHR During ATWS; Align Fire Water for EDG Cooling; Manually Depressurize (Transient); Vent PC (Local Actions including use of Port. Powerpack); Isolate SW Header Flood in RB

Potential Risk Increase Factor for Key Equipment (log scale, 1 to 1000) and System Percentage Contribution to CDF (scale 0% to 7%). Equipment and systems shown: Div I Emergency Switchgear; Div II Emergency Switchgear; Div I Emergency DC; Div I 120V Emergency AC; Div II 120V Emergency AC; 2RHS A/LPCS Supp Pool Return; Div 1 600V Emergency Switchgear; 125V DC Switchgear; Div 2 600V Emergency Switchgear; 2RHS B/RHS C Supp Pool Return

Risk thresholds: >x30, >x15, >x3, ≤x3

Colors correspond to the associated System Health Report status as of the 4th quarter of 2013.

Hypothetical PRA Risk Planetary Charts

[Figure: risk "planetary" charts for Plant 1, Plant 2, Plant 3 and Plant 4]

Every plant is unique: design, internal / external events

Risk insights are gained by comparing plant risk profiles:
• Physical modifications
• Protective barriers
• Procedures
• Operator response times
• Maintenance practices
• Housekeeping

III. Defining Extreme External Events
• Original plant design for external events (security, seismic, flood, fire) was based on regulations and the best state of knowledge of risk at the time of licensing
• Industry understanding of risk has been highly dynamic
– 1975 Browns Ferry fire
– 2001 terrorist attacks
– 2011 Japan earthquake and tsunami (Fukushima)
• Evolving risk insights from new data create constant "churn" in the design and operation of our plants
– Fire: industrial fire code, to "Appendix R", to NFPA 805
– Revised design basis security threat, robust defenses, cyber
– Post-Fukushima reassessment of earthquake frequency and intensity for central and eastern US plants (NRC GSI-199)
– Post-Fukushima reassessment of design basis flood/frequency

IV. Risk-informed Lessons for External Events
• The uncertainties are real and unavoidable
– Extrapolation from internal event modeling experience is not applicable to other models
– Reliance on numerical mean values is not sufficient
– Data supporting rare events may have large uncertainty (e.g., floods)
• Undue focus on numerical outcomes leads to a reduced emphasis on important insights
• Adding conservatism in PRA is not an antidote; it can significantly distort sound risk-informed decision-making

Case in Point: NFPA-805 FPRA
• Challenge: Deterministic PRA mentality distorts risk perspective
– Conservatisms added at every major step of the process to "bound" uncertainties
• Results do not match operating experience benchmarks
– Risk-significant fires over-predicted
– Fires with significant spurious operations over-predicted
• Outcome: Disproportionately large resources spent on model refinements and plant modifications

Significant Departure from Realism = Ineffective Decision-Making

[Figure: Large Conservatism in Fire PRA Building Blocks. Fire Frequencies Conservatism + Fire Severity Conservatism + Fire Suppression Conservatism + Conditional Core Damage Probability Conservatism]

Compounding conservatism reduces the effectiveness of the decision-making tool
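To illustrate how conservatism compounds, the sketch below assumes that the four building blocks multiply together into a fire scenario's core damage frequency, so a modest bias in each one multiplies into a large overall bias. Both the multiplicative structure and the bias factors are assumptions made for illustration; none of the numbers come from the presentation.

```python
from math import prod

# Hypothetical bias factors: how many times larger than a best estimate
# each conservative input is. Illustrative values only.
conservatism = {
    "fire frequency": 2.0,
    "fire severity": 2.0,
    "fire suppression": 2.0,
    "conditional core damage probability": 2.0,
}


def compounded_bias(factors: dict) -> float:
    """Overall over-prediction factor when the inputs multiply together."""
    return prod(factors.values())


overall = compounded_bias(conservatism)
print(f"Each input conservative by 2x gives a result conservative by {overall:.0f}x")
```

Under these assumptions a factor of two at each step becomes a factor of sixteen in the result, which is the kind of departure from realism the slide warns about.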

V. The Path Forward

Objective | Proposed Actions: Industry | Proposed Actions: NRC
Gain a more complete and balanced understanding of important risk contributors | Continue development of more realistic and complete plant-specific PRAs | Move away from imbedding conservatism in PRA models; starts with fire PRA
Clarify risk-informed decision-making process that can deal with uncertainties | Propose a practical integrated decision-making process | Adapt/adopt a practical integrated decision-making process consistent with RG 1.174
Educate decision-making stakeholders on risk-informed decision-making | Provide focused PRA training to industry staff and decision-makers | Provide focused PRA training to NRC staff and decision-makers
Develop technical resources to support better risk-informed understanding | Expand EPRI/OG commitment to training and technology | Expand training on truly risk-informed decision-making

Key Takeaway
• PRA has added tremendous value to the nuclear industry, allowing us to operate plants more safely.
• Addressing very low probability / high consequence events can be as important as addressing high probability / high consequence events.
• Challenges remain with the tools:
– Risk insights are masked by over-conservatism or a deterministic approach: back to basics.
– Uncertainty matters: what can we do to address and reduce uncertainty?

Launch Date | Name | Country | Result | Reason
1960 | Korabl 4 | USSR (flyby) | Failure | Didn't reach Earth orbit
1960 | Korabl 5 | USSR (flyby) | Failure | Didn't reach Earth orbit
1962 | Korabl 11 | USSR (flyby) | Failure | Earth orbit only; spacecraft broke apart
1962 | Mars 1 | USSR (flyby) | Failure | Radio failed
1962 | Korabl 13 | USSR (flyby) | Failure | Earth orbit only; spacecraft broke apart
1964 | Mariner 3 | US (flyby) | Failure | Shroud failed to jettison
1964 | Mariner 4 | US (flyby) | Success | Returned 21 images
1964 | Zond 2 | USSR (flyby) | Failure | Radio failed
1969 | Mars 1969A | USSR | Failure | Launch vehicle failure
1969 | Mars 1969B | USSR | Failure | Launch vehicle failure
1969 | Mariner 6 | US (flyby) | Success | Returned 75 images
1969 | Mariner 7 | US (flyby) | Success | Returned 126 images
1971 | Mariner 8 | US | Failure | Launch failure
1971 | Kosmos 419 | USSR | Failure | Achieved Earth orbit only
1971 | Mars 2 Orb/Lander | USSR | Failure | Orbiter arrived, but no useful data and Lander destroyed
1971 | Mars 3 Orb/Lander | USSR | Success | Orbiter obtained approximately 8 months of data and lander landed safely, but only 20 seconds of data
1971 | Mariner 9 | US | Success | Returned 7,329 images
1973 | Mars 4 | USSR | Failure | Flew past Mars
1973 | Mars 5 | USSR | Success | Returned 60 images; only lasted 9 days
1973 | Mars 6 Orb/Lander | USSR | Success/Failure | Occultation experiment produced data and Lander failure on descent
1973 | Mars 7 Lander | USSR | Failure | Missed planet; now in solar orbit
1975 | Viking 1 Orb/Lander | US | Success | Located landing site for Lander and first successful landing on Mars
1975 | Viking 2 Orb/Lander | US | Success | Returned 16,000 images and extensive atmospheric data and soil experiments
1988 | Phobos 1 Orbiter | USSR | Failure | Lost en route to Mars
1988 | Phobos 2 Orb/Lander | USSR | Failure | Lost near Phobos
1992 | Mars Observer | US | Failure | Lost prior to Mars arrival
1996 | Mars Global Surveyor | US | Success | More images than all Mars Missions
1996 | Mars 96 | Russia | Failure | Launch vehicle failure
1996 | Mars Pathfinder | US | Success | Technology experiment lasting 5 times longer than warranty
1998 | Nozomi | Japan | Failure | No orbit insertion; fuel problems
1998 | Mars Climate Orbiter | US | Failure | Lost on arrival
1999 | Mars Polar Lander | US | Failure | Lost on arrival
1999 | Deep Space 2 Probes | US | Failure | Lost on arrival (carried on Mars Polar Lander)
2001 | Mars Odyssey | US | Success | High resolution images of Mars
2003 | Mars Express Orbiter/Beagle 2 | ESA | Success/Failure | Orbiter imaging Mars in detail and lander lost on arrival
2003 | Mars Rover - Spirit | US | Success | Operating lifetime of more than 15 times original warranty
2003 | Mars Rover - Opportunity | US | Success | Operating lifetime of more than 15 times original warranty
2005 | Mars Reconnaissance Orbiter | US | Success | Returned more than 26 terabits of data (more than all other Mars missions combined)
2007 | Phoenix Mars Lander | US | Success | Returned more than 25 gigabits of data
2011 | Mars Science Laboratory | US | Success | Exploring Mars' habitability
2011 | Phobos-Grunt/Yinghuo-1 | Russia/China | Failure | Stranded in Earth orbit
2013 | Mangalyaan | India | En route | On way to Mars
2013 | MAVEN | US | En route | On way to Mars

• http://mars.nasa.gov/programmissions/missions/log/

• The path to Mars involves closing knowledge and performance gaps in a systematic manner:
– The health threat from exposure to high-energy cosmic rays and other ionizing radiation, and negative effects of a prolonged low-gravity environment on human health, including eyesight loss.
– Human performance considerations related to a long-duration isolated mission in a confined habitable space.
– The inaccessibility of terrestrial medical facilities.
– Critical systems, including propulsion, habitation, and life support, that are reliable, require little to no maintenance, and have a small mass/volume.
– Long-duration navigation and operations in a deep space environment.
– Ability for crew to operate autonomously, including onboard analysis of crew and environmental samples.

[Figure: timeline (Today, 2020's, 2030's) of destinations: ISS, 400 kilometers; cis/trans lunar space, 443,400 kilometers; Mars, 228,000,000 kilometers]

ISS (400 kilometers)
• 6 month crew duration
• Crew health and performance research in work
• Habitation, life support and other critical systems are large and require regular maintenance and consumable resupply
• Prepositioned spares and regular resupply
• Ground analysis of crew/environmental samples and system failures
• Near real-time communications
• Any-time crew return
• Heavy lift capability in development

Mars (228,000,000 kilometers)
• 1.5 year + crew duration
• Crew health and performance vital to a mission
• Habitation, life support and other critical systems are mass/size limited and must have high reliability with limited consumable resupply
• Limited spares, systems must be reliable
• No opportunity for ground validation of crew/environmental samples or system failure
• Communication delay of up to 42 minutes
• No emergency crew return
• Heavy lift available to support Mars transit

(1) Establish an Objective Hierarchy

(2) ISS to 2024 and Cis-lunar are Essential to Turn Unknown Risk to Known Risk
• Crew Health
• Human Performance
• System Reliability

(3) Make Risk Informed Decisions: Identify Alternates, Analyze Risk, Make Informed Decisions

Mission Formulation, System Design, Technical Management, Mission Operations

LUNCH 12:30 p.m. – 2:00 p.m.


Dr. Wallace Loh President University of Maryland

REMARKS


Dr. Jeong H. Kim Entrepreneur Chairman, Kiswe Mobile, Inc. Former President, Bell Labs

KEYNOTE SPEAKERS


Mr. Ken Farquhar President & General Manager, Systems Engineering and Mission Support Business Unit, ManTech International

Dr. Hoang Pham, Rutgers University Dr. Vasiliy Krivtsov, Ford Motors Dr. J. Wesley Hines, University of Tennessee

Moderator: Dr. Marvin Roush

Reliability Education: Challenges and Potential of a Non-Traditional Engineering Discipline


PANEL 3

The Whereabouts of Reliability Education: Challenges & Opportunities

Hoang Pham Department of Industrial & Systems Engineering

Rutgers University

April 2, 2014

Reliability Education

Reliability is a discipline that has been studied for several decades.

Today several dozen graduate programs in the US and hundreds worldwide offer reliability courses, and some universities have entire reliability programs.

There is a gap between reliability theory and practice, between school and industry, book knowledge and real world applications.

Due to changes in technology, the expectations placed on a reliability engineer keep changing and rising.

Some Reliability Books in the 1960s
• Igor Bazovsky (1961), Reliability Theory and Practice
• D. K. Lloyd and M. Lipow (1962), Reliability: Management, Methods, and Mathematics
• N. H. Roberts (1964), Mathematical Models in Reliability Engineering
• G. H. Sandler (1964), System Reliability Engineering
• R. E. Barlow and F. Proschan (1965), Mathematical Theory of Reliability

[Figure: Reliability Programs shown at the overlap of Engineering (Reliability Engineering), Computer Science, Operations Research & Management (Reliability Management), and Statistics & Mathematics (Reliability Statistics)]

In today’s global market, the only way to stay ahead of the competition is to provide:

Better products! Better service! Better customer experience every time!

Sample 3D TV

Boeing 787

Reliability Computing

Reliability requirement: 0.999999999

"The airplane systems and associated components … must be designed so that the occurrence of any failure condition which would prevent the continued safe flight and landing … is extremely improbable (1 per billion flights ~ 10^-9). Compliance … must be shown by analysis …"

FAA Federal Aviation Regulations 25.1309

Reliability Challenges From Theory to Practice

DATA QUALITY The Data of Everything!

Reliability Challenges From Theory to Practice PREDICTIVE MODELING * The Uncertainty in Modeling! * What Models Should Be Used?

Predictive Modeling

based on data and statistical methods

“Prediction is difficult, especially when it’s about the future!”

[Figure: testing environments (controlled) feed modelling; the resulting prediction is applied to operating environments (random)]

Many reliability studies: Controlled Environment ≈ Operating Environment

Systemability model

[Figure: the same diagram with an environmental factor η linking the two settings: η = 1 for the controlled environment, η ~ f(η) for the operating environment]

Reliability -- Definition

The probability that the system is still operating at time t:

$$R(t) = \int_t^{\infty} f(s)\,ds = e^{-\int_0^t h(s)\,ds} = e^{-H(t)}$$

where f(t) is the probability density function and h(t) is the failure intensity rate.

Systemability -- Definition

The probability that the system is still operating at time t subject to the uncertainty of the operating environments.

The systemability function is [Pham, 2005]:

$$R_s(t) = \int_{\eta} e^{-\eta \int_0^t h(s)\,ds}\, dF(\eta)$$

where F is a distribution function of η.

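Systemability approximations using Taylor series:

$$R_s(t) = \int_{\eta} e^{-\eta H(t)}\, dF(\eta)$$

$$R_s(t) = E\left[e^{-\eta H(t)}\right] \approx e^{-\mu H(t)}\left(1 + \frac{\sigma^2 H^2(t)}{2!}\right)$$

where μ and σ² denote the mean and variance of η.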

Loglog Distribution – Example

Assume system lifetime ~ Loglog(a, b) with failure rate

$$h(t) = b \ln(a)\, t^{b-1} a^{t^b}, \quad t > 0,\ a > 1,\ b > 0$$

System reliability function

$$R_1(t) = e^{1 - a^{t^b}}$$

Assume η ~ gamma(α, β)

[Figure: Loglog distribution, failure rate h(t) versus t (0 to 250) for b = 0.5 and a = 1.1, 1.13, 1.15]
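As a consistency check on the loglog formulas, the sketch below codes the failure rate and reliability function and confirms numerically that R(t) = exp(-H(t)) with H(t) the integral of h. The function names and the midpoint-rule integration are illustrative choices, not part of the presentation.

```python
import math


def loglog_hazard(t: float, a: float, b: float) -> float:
    """Failure rate h(t) = b ln(a) t^(b-1) a^(t^b) of the loglog distribution."""
    return b * math.log(a) * t ** (b - 1) * a ** (t ** b)


def loglog_reliability(t: float, a: float, b: float) -> float:
    """Reliability R(t) = exp(1 - a^(t^b))."""
    return math.exp(1.0 - a ** (t ** b))


def cumulative_hazard_numeric(t: float, a: float, b: float, steps: int = 20000) -> float:
    """Integrate h(s) over [0, t] with the midpoint rule.

    The substitution u = s^b turns h(s) ds into ln(a) a^u du, which avoids
    the singularity of s^(b-1) at s = 0 when b < 1.
    """
    upper = t ** b
    du = upper / steps
    return sum(math.log(a) * a ** ((i + 0.5) * du) * du for i in range(steps))


a, b, t = 1.1, 0.5, 100.0
print("h(t)           =", loglog_hazard(t, a, b))
print("R(t) closed    =", loglog_reliability(t, a, b))
print("R(t) numerical =", math.exp(-cumulative_hazard_numeric(t, a, b)))
```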

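Loglog Dist. - Example

Systemability function

$$R_2(t) = \left(\frac{\beta}{\beta + a^{t^b} - 1}\right)^{\alpha}$$

Systemability approximation

$$R_3(t) = e^{-\frac{\alpha}{\beta}\left(a^{t^b} - 1\right)}\left(1 + \frac{\alpha}{2\beta^2}\left(a^{t^b} - 1\right)^2\right)$$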

Systemability vs systemability approximation for a = 1.15, b = 0.05

[Figures: systemability functions R1, R2 and R3 versus time (0 to 100) in three panels, for (alpha, beta) = (2, 3), (3, 2) and (12.5, 2.5)]
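The comparison in these panels can be sketched numerically. The code below assumes η ~ gamma(α, β) with β acting as a rate parameter (mean α/β, variance α/β²), under which the systemability integral has a closed gamma-mixture form; a seeded Monte Carlo average is included only as an independent check. Function names and the sample size are illustrative.

```python
import math
import random


def cumulative_hazard(t, a, b):
    """H(t) = a^(t^b) - 1 for the loglog distribution."""
    return a ** (t ** b) - 1.0


def reliability(t, a, b):
    """R1: reliability in the controlled environment, exp(-H(t))."""
    return math.exp(-cumulative_hazard(t, a, b))


def systemability(t, a, b, alpha, beta):
    """R2: E[exp(-eta H(t))] for eta ~ gamma(shape alpha, rate beta)."""
    return (beta / (beta + cumulative_hazard(t, a, b))) ** alpha


def systemability_taylor(t, a, b, alpha, beta):
    """R3: second-order Taylor approximation around the mean of eta."""
    h_cum = cumulative_hazard(t, a, b)
    mu, var = alpha / beta, alpha / beta ** 2
    return math.exp(-mu * h_cum) * (1.0 + var * h_cum ** 2 / 2.0)


def systemability_monte_carlo(t, a, b, alpha, beta, n=50_000, seed=1):
    """Sample eta and average exp(-eta H(t)); random.gammavariate takes a scale."""
    rng = random.Random(seed)
    h_cum = cumulative_hazard(t, a, b)
    return sum(math.exp(-rng.gammavariate(alpha, 1.0 / beta) * h_cum) for _ in range(n)) / n


a, b, t = 1.15, 0.05, 50.0
for alpha, beta in [(2, 3), (3, 2), (12.5, 2.5)]:
    print((alpha, beta),
          round(reliability(t, a, b), 4),
          round(systemability(t, a, b, alpha, beta), 4),
          round(systemability_taylor(t, a, b, alpha, beta), 4))
```

When H(t) is small the approximation stays close to the closed form; the gap widens as the variance of η or the cumulative hazard grows.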

What Models Should Be Used?

(H. Pham, "A New Software Reliability Model with Vtub-Shaped Fault-Detection Rate and the Uncertainty of Operating Environments", Optimization, vol. 63, 2014; published online December 2013)

Model Comparison

Model | m(t)
Goel-Okumoto (G-O) | $m(t) = a\left(1 - e^{-bt}\right)$
Delayed S-shaped | $m(t) = a\left(1 - (1 + bt)e^{-bt}\right)$
Inflection S-shaped | $m(t) = \frac{a\left(1 - e^{-bt}\right)}{1 + \beta e^{-bt}}$
Yamada imperfect debugging | $m(t) = a\left[1 - e^{-bt}\right]\left[1 - \frac{\alpha}{b}\right] + \alpha a t$
PNZ model | $m(t) = \frac{a}{1 + \beta e^{-bt}}\left(\left[1 - e^{-bt}\right]\left[1 - \frac{\alpha}{b}\right] + \alpha t\right)$
Pham-Zhang model | $m(t) = \frac{1}{1 + \beta e^{-bt}}\left((c + a)\left(1 - e^{-bt}\right) - \frac{ab}{b - \alpha}\left(e^{-\alpha t} - e^{-bt}\right)\right)$
Dependent-parameter model 1 | $m(t) = \alpha(1 + \gamma t)\left(\gamma t + e^{-\gamma t} - 1\right)$
Dependent-parameter model 2 | $m(t) = m_0\left(\frac{\gamma t + 1}{\gamma t_0 + 1}\right)e^{-\gamma(t - t_0)} + \alpha(\gamma t + 1)\left(\gamma t - 1 + (1 - \gamma t_0)e^{-\gamma(t - t_0)}\right)$
Vtub-shaped fault-detection rate model | $m(t) = N\left(1 - \left(\frac{\beta}{\beta + a^{t^b} - 1}\right)^{\alpha}\right)$

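Criteria For Model Selection

MSE: measures the deviation between the predicted values and the actual observations

$$\mathrm{MSE} = \frac{\sum_{i=1}^{n}\left(\hat{m}(t_i) - y_i\right)^2}{n - l}$$

Predictive ratio risk (PRR): measures the distance of model estimates from the actual data against the model estimate

$$\mathrm{PRR} = \sum_{i=1}^{n}\left(\frac{\hat{m}(t_i) - y_i}{\hat{m}(t_i)}\right)^2$$

Predictive power (PP): the distance of model estimates from the actual data against the actual data

$$\mathrm{PP} = \sum_{i=1}^{n}\left(\frac{\hat{m}(t_i) - y_i}{y_i}\right)^2$$

Here $\hat{m}(t_i)$ is the model estimate and $y_i$ the observed value at time $t_i$, n is the number of observations, and l is presumably the number of model parameters, the usual convention for this criterion.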
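The three criteria translate directly into code. The sketch below is a minimal version, assuming the MSE denominator is the number of observations minus the number of model parameters; the function and argument names are illustrative.

```python
def mse(predicted, observed, n_params):
    """Sum of squared errors divided by (observations - model parameters)."""
    n = len(observed)
    return sum((m - y) ** 2 for m, y in zip(predicted, observed)) / (n - n_params)


def prr(predicted, observed):
    """Predictive ratio risk: errors scaled by the model estimate."""
    return sum(((m - y) / m) ** 2 for m, y in zip(predicted, observed))


def pp(predicted, observed):
    """Predictive power: errors scaled by the observed value."""
    return sum(((m - y) / y) ** 2 for m, y in zip(predicted, observed))


# Toy values chosen only to exercise the formulas, not real test data.
predicted = [2.0, 4.0]
observed = [1.0, 5.0]
print(mse(predicted, observed, n_params=1), prr(predicted, observed), pp(predicted, observed))
```

PRR penalizes a model more heavily when it underestimates, while PP penalizes errors made where the observed counts are small; smaller is better for all three.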


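Normalized Criteria Distance (NCD) value, D_k, measures the distance of the normalized criteria from the origin for the kth model:

$$D_k = \sqrt{\sum_{j=1}^{d}\left[\left(\frac{C_{kj}}{\sum_{i=1}^{s} C_{ij}}\right) w_j\right]^2}$$

where C_ij is the value of criterion j for model i, s is the number of models, and w_j denotes the weight of criterion j for j = 1, 2, …, d.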

Software System Test Data (System Software Reliability, 2006)

Week index | Exposure time (cum. system test hours) | Faults | Cum. faults
1 | 416 | 3 | 3
2 | 832 | 1 | 4
3 | 1248 | 0 | 4
4 | 1664 | 3 | 7
5 | 2080 | 2 | 9
6 | 2496 | 0 | 9
7 | 2912 | 1 | 10
8 | 3328 | 3 | 13
9 | 3744 | 4 | 17
10 | 4160 | 2 | 19
11 | 4576 | 4 | 23
12 | 4992 | 2 | 25
13 | 5408 | 5 | 30
14 | 5824 | 2 | 32
15 | 6240 | 4 | 36
16 | 6656 | 1 | 37
17 | 7072 | 2 | 39
18 | 7488 | 0 | 39
19 | 7904 | 0 | 39
20 | 8320 | 3 | 42
21 | 8736 | 1 | 43


Model Comparisons & Results

Model | MSE (Rank) | PRR (Rank) | PP (Rank) | NCD Value (Dk) | Model Rank
1. G-O model | 6.61 (7) | 0.69 (1) | 1.10 (7) | 0.115843 | 6
2. Delayed S-shaped | 3.27 (5) | 44.27 (8) | 1.43 (8) | 0.139264 | 7
3. Inflection S-shaped | 1.87 (2) | 5.94 (5) | 0.90 (4) | 0.077194 | 2
4. Yamada imperfect debugging model | 4.98 (6) | 4.30 (4) | 0.81 (3) | 0.086315 | 5
5. PNZ model | 1.99 (3) | 6.83 (7) | 0.96 (6) | 0.082414 | 4
6. Pham-Zhang model | 2.12 (4) | 6.79 (6) | 0.95 (5) | 0.082015 | 3
7. Dependent-parameter model 1 | 43.69 (9) | 601.34 (9) | 4.53 (9) | 1.079700 | 9
8. Dependent-parameter model 2 | 24.79 (8) | 1.14 (2) | 0.73 (1) | 0.278587 | 8
9. Vtub-shaped fault-detection rate model | 1.80 (1) | 2.06 (3) | 0.77 (2) | 0.066303 | 1
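The NCD column can be recomputed from the MSE, PRR and PP columns. The sketch below assumes equal weights (w_j = 1), which is an assumption rather than something the slides state, although it does reproduce the tabulated values to the precision of the rounded inputs. Each criterion is normalized by its column total before the Euclidean distance from the origin is taken.

```python
import math

# (MSE, PRR, PP) per model, as tabulated above.
criteria = {
    "G-O": (6.61, 0.69, 1.10),
    "Delayed S-shaped": (3.27, 44.27, 1.43),
    "Inflection S-shaped": (1.87, 5.94, 0.90),
    "Yamada imperfect debugging": (4.98, 4.30, 0.81),
    "PNZ": (1.99, 6.83, 0.96),
    "Pham-Zhang": (2.12, 6.79, 0.95),
    "Dependent-parameter 1": (43.69, 601.34, 4.53),
    "Dependent-parameter 2": (24.79, 1.14, 0.73),
    "Vtub-shaped": (1.80, 2.06, 0.77),
}


def ncd(criteria, weights=None):
    """Normalized criteria distance D_k for every model k."""
    d = len(next(iter(criteria.values())))
    weights = weights or [1.0] * d  # equal weights: an assumption
    totals = [sum(values[j] for values in criteria.values()) for j in range(d)]
    return {
        model: math.sqrt(sum((values[j] / totals[j] * weights[j]) ** 2 for j in range(d)))
        for model, values in criteria.items()
    }


distances = ncd(criteria)
ranking = sorted(distances, key=distances.get)
for rank, model in enumerate(ranking, start=1):
    print(rank, model, round(distances[model], 6))
```

Because each criterion is divided by its total across models, a single very large value (such as the PRR of dependent-parameter model 1) dominates that model's distance without inflating the others.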

Reliability Opportunities: Big Data! High Tech Companies in the past 20 years!
• Amazon Inc., founded 1994
• Yahoo, founded 1994
• eBay, founded 1995
• Google, founded 1998
• Facebook, Inc., founded 2004
• YouTube, founded 2005
• Twitter Inc., founded 2006

Knowledge That Should Be Covered in Reliability Programs

[Figure: Reliability Programs draw on Engineering Knowledge, Computer Skill, Statistics/Management Skill, and School-Industry Projects]

Have a Wonderful Day!

Reliability Education Opportunity: “Reliability Analysis of Field Data”

25th Anniversary of Reliability Engineering @ University of Maryland

Vasiliy Krivtsov, PhD Sr. Staff Technical Specialist Reliability & Risk Analysis

Ford Motor Company

2

Discussion Outline

Introduction

Practical Importance of Reliability Analysis of Field Data

Modelling Peculiarities in Reliability Analysis of Field Data

Staggered Production/Sales

Bivariate Models (Time & Usage)

Seasonality

Data Maturation Issues

Illustrative Case Studies

Proposed Course Structure

Conclusions

Practical Importance of Reliability Analysis of Field Data
• Root cause analysis and future failure avoidance through statistical engineering inferences on the failure rate trends and factors (covariates) affecting them
• Lab test calibration by equating percentiles of the failure time distributions in the field and in the lab
• Cost avoidance through early detection of field reliability problems
• Cash flow optimization through the prediction of the required warranty reserve and/or the expected maintenance costs

Staggered Production/Sales

Nonparametric Estimation

Formalized Data Structure:

Number of vehicles | Time in service intervals (i = 1, …, k) | Failures by failure time interval (j = 1, …, k)
v1 | 1 | r11
v2 | 2 | r21 r22
v3 | 3 | r31 r32 r33
v4 | 4 | r41 r42 r43 r44
v5 | 5 | r51 r52 r53 r54 r55
v6 | 6 | r61 r62 r63 r64 r65 r66
v7 | 7 | r71 r72 r73 r74 r75 r76 r77
v8 | 8 | r81 r82 r83 r84 r85 r86 r87 r88
v9 | 9 | r91 r92 r93 r94 r95 r96 r97 r98 r99
vk | k | rk1 rk2 rk3 rk4 rk5 rk6 rk7 rk8 rk9 … rkk

Number of failures at time unit interval j, with r0 = 0:

$$d_j = \sum_{p=j}^{k} r_{pj}$$

Risk set exposed at time unit interval j:

$$n_j = \sum_{p=j}^{k}\left(v_p - \sum_{q=1}^{j-1} r_{pq}\right)$$

Hazard function at the j-th failure time unit interval:

$$h_j = \frac{d_j}{n_j}$$

Numerical Example

Repairs by sales month (rows) and month in service (columns); on the slide the columns are the repair months Jan'02 through Nov'02.

Sales month | Volume | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10
Jan'02 | 10,000 | 1 | 3 | 6 | 9 | 15 | 17 | 20 | 22 | 41 | 64
Feb'02 | 10,000 | 0 | 2 | 5 | 10 | 12 | 18 | 19 | 24 | 45 |
Mar'02 | 10,000 | 1 | 4 | 5 | 10 | 14 | 18 | 20 | 23 | |
Apr'02 | 10,000 | 1 | 2 | 7 | 11 | 16 | 17 | 20 | | |
May'02 | 10,000 | 0 | 1 | 6 | 12 | 17 | 18 | | | |
Jun'02 | 10,000 | 1 | 3 | 4 | 9 | 16 | | | | |
Jul'02 | 10,000 | 2 | 3 | 7 | 11 | | | | | |
Aug'02 | 10,000 | 1 | 4 | 6 | | | | | | |
Sep'02 | 10,000 | 1 | 3 | | | | | | | |
Oct'02 | 10,000 | 0 | | | | | | | | |
Nov'02 | 10,000 | | | | | | | | | |

Time t | Risk Set n(t) | Repairs d(t) | Risk Set (corr) n'(t) | Hazard h(t)=d(t)/n'(t) | Cum Hazard H(t)=Σh(t) | Reliability R(t)=e^{-H(t)} | CDF F(t)=1-R(t)
0 | 110,000 | 0 | 110,000 | 0 | 0 | 1 | 0
1 | 100,000 | 8 | 100,000 | 0.00008 | 0.00008 | 0.99992 | 0.00008
2 | 90,000 | 25 | 89,992 | 0.00028 | 0.00036 | 0.99964 | 0.00036
3 | 80,000 | 46 | 79,967 | 0.00058 | 0.00093 | 0.99907 | 0.00093
4 | 70,000 | 72 | 69,921 | 0.00103 | 0.00196 | 0.99804 | 0.00196
5 | 60,000 | 90 | 59,849 | 0.00150 | 0.00347 | 0.99654 | 0.00346
6 | 50,000 | 88 | 49,759 | 0.00177 | 0.00524 | 0.99478 | 0.00522
7 | 40,000 | 79 | 39,671 | 0.00199 | 0.00723 | 0.99280 | 0.00720
8 | 30,000 | 69 | 29,592 | 0.00233 | 0.00956 | 0.99049 | 0.00951
9 | 20,000 | 86 | 19,523 | 0.00441 | 0.01396 | 0.98613 | 0.01387
10 | 10,000 | 64 | 9,437 | 0.00678 | 0.02075 | 0.97947 | 0.02053
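The life table in this example can be rebuilt from the triangular repair data. In the sketch below the corrected risk set is taken as n(t) minus all repairs recorded at earlier months in service; that rule is inferred from the tabulated "Risk Set (corr)" values, which it reproduces, and is not stated on the slide. Variable and function names are illustrative.

```python
import math

# Repairs per sales month (Jan'02 .. Nov'02), ordered by month in service 1, 2, ...
claims = [
    [1, 3, 6, 9, 15, 17, 20, 22, 41, 64],
    [0, 2, 5, 10, 12, 18, 19, 24, 45],
    [1, 4, 5, 10, 14, 18, 20, 23],
    [1, 2, 7, 11, 16, 17, 20],
    [0, 1, 6, 12, 17, 18],
    [1, 3, 4, 9, 16],
    [2, 3, 7, 11],
    [1, 4, 6],
    [1, 3],
    [0],
    [],
]
VOLUME = 10_000  # vehicles sold in each month


def life_table(claims, volume):
    """Nonparametric hazard, cumulative hazard, reliability and CDF by month in service."""
    months = len(claims)
    rows, cum_repairs, cum_hazard = [], 0, 0.0
    for t in range(months):
        n = volume * (months - t)  # vehicles with at least t months in service
        d = 0 if t == 0 else sum(row[t - 1] for row in claims if len(row) >= t)
        n_corr = n - cum_repairs   # remove units already repaired (inferred rule)
        h = d / n_corr
        cum_hazard += h
        r = math.exp(-cum_hazard)
        rows.append({"t": t, "n": n, "d": d, "n_corr": n_corr,
                     "h": h, "H": cum_hazard, "R": r, "F": 1.0 - r})
        cum_repairs += d
    return rows


table = life_table(claims, VOLUME)
for row in table:
    print(row["t"], row["n_corr"], row["d"], f"{row['h']:.5f}", f"{row['F']:.5f}")
```

The value of F at 9 months in service comes out near 1.4%, matching the figure quoted for the nonparametric inference.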

Mechanical Transfuser Example: 24MIS/Unlm usage warranty plan

[Figure: nonparametric CDF F(t) versus time t (0 to 12 MIS, F from 0 to 0.020), with data points labeled by repair counts 8, 25, 46, 72, 90, 88, 79, 69, 86]

Mechanical Transfuser: Nonparametric Inferences

~1.4% failing @ 9 MIS

Concavity is an indication of an IFR. Note: F(t)≈H(t), for small F(t).

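Nonparametric Estimation under a Bivariate Warranty Plan

Risk set exposed at time unit interval j:

$$n_j = \sum_{p=j}^{k}\left(\left(v_p - \sum_{q=1}^{j-1} r_{pq}\right) P_j\right)$$

where P_j is the probability of mileage not exceeding the warranty mileage limit at failure time unit interval j (the slide's own symbol for this probability is not legible; P_j is used in its place).

[Figure: mileage (12,000; 36,000; 60,000) versus time t in MIS (12, 24, 36, 48), marking interval j]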

Weibull Probability Plot: Mechanical Transfuser Data

[Figure: ReliaSoft Weibull++ 7 (www.ReliaSoft.com) Weibull probability plot of CDF F(t) versus time t, both on log scales. Two-parameter Weibull fit, rank regression on X (RRX), with 95% two-sided confidence bounds; F=627/S=99373. Data points labeled by repair counts 8, 25, 46, 72, 90, 88, 79, 69, 86, 64. Plot by Vasiliy Krivtsov, 9/22/2007.]

Mechanical Transfuser – Warranty Forecast Summary:
• Failure probability @ 24 MIS: 0.1364 (13.64%)
• Population size: 110,000
• Total expected repairs: 15,004
• Cost per repair: $30
• Total expected warranty cost: $450,120
• Year-to-date cost: $18,810
• Required warranty reserve: $431,310
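The arithmetic behind the forecast summary can be written out as a short sketch. The year-to-date repair count of 627 is taken from the failure count reported on the probability plot (F=627); treating it as the basis of the year-to-date cost is an inference, though 627 repairs at $30 does equal the quoted $18,810.

```python
population = 110_000
failure_probability_24mis = 0.1364   # fitted CDF at 24 months in service
cost_per_repair = 30                 # dollars
repairs_to_date = 627                # failures observed so far (from the plot)

expected_repairs = round(population * failure_probability_24mis)
expected_cost = expected_repairs * cost_per_repair
cost_to_date = repairs_to_date * cost_per_repair
required_reserve = expected_cost - cost_to_date

print(f"Total expected repairs:       {expected_repairs:,}")
print(f"Total expected warranty cost: ${expected_cost:,}")
print(f"Year-to-date cost:            ${cost_to_date:,}")
print(f"Required warranty reserve:    ${required_reserve:,}")
```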

Calendarized Forecasting


How will this total number of repairs be distributed along the calendar time, i.e. how many repairs to expect next month, the following month, etc.?

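Calendarized Forecast (generic example)

Predicted number of repairs in calendar interval j:

$$d_j = \sum_{i=j}^{k} f(t_i)\, n_{ij}\, (t_i - t_{i-1})$$

Rows are time in service; columns are calendar time. Intermediate calendar months are elided (…) as on the slide.

Time in service | Parametric PDF | Exposed thru Oct'02 | Exposed in Nov'02 | Exposed in Dec'02 | … | Exposed in Sep'04 | Exposed in Oct'04 | Repairs thru Oct'02 | Repairs in Nov'02 | Repairs in Dec'02 | … | Repairs in Sep'04 | Repairs in Oct'04
0 | 0 | 110000 | 0 | 0 | … | 0 | 0 | 0 | | | … | |
1 | 0.0001 | 100000 | 10000 | 0 | … | 0 | 0 | 6 | 1 | 0 | … | 0 | 0
2 | 0.0003 | 89992 | 10008 | 10000 | … | 0 | 0 | 27 | 3 | 3 | … | 0 | 0
3 | 0.0006 | 79967 | 10025 | 10008 | … | 0 | 0 | 49 | 6 | 6 | … | 0 | 0
4 | 0.0010 | 69921 | 10046 | 10025 | … | 0 | 0 | 69 | 10 | 10 | … | 0 | 0
5 | 0.0014 | 59849 | 10072 | 10046 | … | 0 | 0 | 84 | 14 | 14 | … | 0 | 0
6 | 0.0019 | 49759 | 10090 | 10072 | … | 0 | 0 | 92 | 19 | 19 | … | 0 | 0
7 | 0.0023 | 39671 | 10088 | 10090 | … | 0 | 0 | 93 | 24 | 24 | … | 0 | 0
8 | 0.0029 | 29592 | 10079 | 10088 | … | 0 | 0 | 84 | 29 | 29 | … | 0 | 0
9 | 0.0034 | 19523 | 10069 | 10079 | … | 0 | 0 | 66 | 34 | 34 | … | 0 | 0
10 | 0.0039 | 9437 | 10086 | 10069 | … | 0 | 0 | 37 | 40 | 40 | … | 0 | 0
11 | 0.0045 | 0 | 9437 | 10086 | … | 0 | 0 | 0 | 43 | 46 | … | 0 | 0
12 | 0.0051 | 0 | 0 | 9437 | … | 0 | 0 | 0 | 0 | 48 | … | 0 | 0
13 | 0.0057 | 0 | 0 | 0 | … | 0 | 0 | 0 | 0 | 0 | … | 0 | 0
14 | 0.0063 | 0 | 0 | 0 | … | 0 | 0 | 0 | 0 | 0 | … | 0 | 0
15 | 0.0069 | 0 | 0 | 0 | … | 0 | 0 | 0 | 0 | 0 | … | 0 | 0
16 | 0.0076 | 0 | 0 | 0 | … | 0 | 0 | 0 | 0 | 0 | … | 0 | 0
17 | 0.0082 | 0 | 0 | 0 | … | 0 | 0 | 0 | 0 | 0 | … | 0 | 0
18 | 0.0088 | 0 | 0 | 0 | … | 0 | 0 | 0 | 0 | 0 | … | 0 | 0
19 | 0.0094 | 0 | 0 | 0 | … | 0 | 0 | 0 | 0 | 0 | … | 0 | 0
20 | 0.0100 | 0 | 0 | 0 | … | 0 | 0 | 0 | 0 | 0 | … | 0 | 0
21 | 0.0106 | 0 | 0 | 0 | … | 0 | 0 | 0 | 0 | 0 | … | 0 | 0
22 | 0.0112 | 0 | 0 | 0 | … | 0 | 0 | 0 | 0 | 0 | … | 0 | 0
23 | 0.0118 | 0 | 0 | 0 | … | 10000 | 0 | 0 | 0 | 0 | … | 118 | 0
24 | 0.0124 | 0 | 0 | 0 | … | 10008 | 10000 | 0 | 0 | 0 | … | 124 | 124
total | | | | | | | | 609 | 222 | 272 | … | 242 | 124

Total predicted repairs over all calendar months: 15,004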
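The calendarized forecast for a single month is the sum, over time-in-service rows, of the parametric PDF times the population exposed at that age in that month. The sketch below recomputes the Nov'02 and Dec'02 totals from the tabulated PDF and exposure values, assuming monthly intervals (t_i - t_{i-1} = 1); the names are illustrative.

```python
# Parametric PDF at months in service 1..12, as tabulated.
pdf = [0.0001, 0.0003, 0.0006, 0.0010, 0.0014, 0.0019,
       0.0023, 0.0029, 0.0034, 0.0039, 0.0045, 0.0051]

# Population exposed at months in service 1..12 during each calendar month.
exposed_nov02 = [10000, 10008, 10025, 10046, 10072, 10090,
                 10088, 10079, 10069, 10086, 9437, 0]
exposed_dec02 = [0, 10000, 10008, 10025, 10046, 10072,
                 10090, 10088, 10079, 10069, 10086, 9437]


def predicted_repairs(pdf, exposed, dt=1.0):
    """Sum of f(t_i) * n_ij * (t_i - t_{i-1}) over time-in-service rows."""
    return sum(f * n * dt for f, n in zip(pdf, exposed))


nov02 = predicted_repairs(pdf, exposed_nov02)
dec02 = predicted_repairs(pdf, exposed_dec02)
print(f"Nov'02: {nov02:.1f} repairs, Dec'02: {dec02:.1f} repairs")
```

Summing unrounded terms gives about 222 and 272 repairs, in line with the column totals; adding the rounded per-row values instead gives totals one higher.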

Time vs. Usage

Time or usage?

[Figure: two data sets, each plotted against time and against mileage. In the time domain the first shows a DFR and the second an IFR; in the mileage domain both show a DFR.]

Depending on variability in mileage accumulation rates of individual vehicles, the same data may result in contradictory inferences in the time and mileage domains.

Time or usage? (Hu, Lawless & Suzuki, 1998)

[Figure: cumulative hazard H(t) versus time (MIS), one curve per mileage accumulation rate: ~0.6K/mo, ~0.8K/mo, ~0.9K/mo, ~1K/mo, ~1.1K/mo]

Note: cumulative hazard functions in the time domain appear to be dependent on mileage accumulation, which suggests that time may NOT be the appropriate domain for this failure mode.

[Figure: cumulative hazard H(t) versus mileage, for the same mileage accumulation rates]

Note: cumulative hazard functions in the mileage domain appear to be independent of mileage accumulation, which suggests that mileage may be the appropriate domain for this failure mode.

Time or usage? (Kordonsky & Gertsbakh, 1997)

[Figure: failure density f(t) plotted against time (MIS) and against mileage]

Choose the scale that provides a lower coefficient of variation of the respective failure distribution.
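The selection rule can be sketched in a few lines: compute the coefficient of variation (standard deviation divided by mean) of the failure data on each candidate scale and keep the scale with the smaller value. The failure records below are hypothetical and serve only to show the mechanics.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)


def preferred_scale(failures_by_scale):
    """Return the scale name whose failure data has the lowest coefficient of variation."""
    return min(failures_by_scale, key=lambda name: coefficient_of_variation(failures_by_scale[name]))


# Hypothetical failures of the same six vehicles, recorded on both scales.
failures = {
    "time (MIS)": [10, 30, 14, 40, 22, 8],
    "mileage": [19000, 21000, 20000, 22500, 20500, 18500],
}

for name, values in failures.items():
    print(name, round(coefficient_of_variation(values), 3))
print("preferred scale:", preferred_scale(failures))
```

In this made-up data the failures cluster tightly in mileage but are spread widely in calendar time, so mileage is selected.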

Data Maturity

Data Maturity: Lot Rot

Data Maturity Problem:

CDF estimates for a nominally homogeneous population at a fixed failure time change as a function of the observation time.

[Figure: F(t) versus t estimated at various observation times (Jan'06, Mar'06, May'06); the estimates at a fixed time t0 differ]

Possible cause:

"Lot Rot", i.e., vehicle reliability degrades from sitting on the lot prior to being sold.

Solution:

Stratify the vehicle population by the time spent on the lot (the difference between sale date and production date).

[Figure: F(t) versus t for units with 0-10 days on lot, at the same observation times (Jan'06, Mar'06, May'06), with t0 marked]

Data Maturity: Reporting Delays

[Figure: F(t) vs. t; CDF estimates made at various observation times (Jan’06, Mar’06, May’06) compared at a fixed time t0]

Data Maturity Problem:

CDF estimates for a nominally homogeneous population at a fixed failure time change as a function of the observation time.

Possible cause:

The number of claims processed at each observation time is under-reported due to the lag between repair date and warranty system entry date.

Solution:

Adjust* the risk set by the probability of the lag time, Wj:

[Equation for Wj: sums over the indices j, p and q, with upper limit k, of terms in n, v and r]

[Figure: F(t) vs. t at t0 for the Jan’06, Mar’06 and May’06 observation times. At each observation time, risk sets are adjusted to account for the under-reported claims.]

* J. Kalbfleisch, J. Lawless and J. Robinson, "Methods for the Analysis and Prediction of Warranty Claims", Technometrics, Vol. 33, No. 3, 1991, pp. 273-285.
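The idea of shrinking the risk set by the probability that a claim has already been entered can be illustrated with a deliberately simplified sketch. It is not the estimator from the cited paper: it assumes a known discrete lag distribution and a single group of units whose repairs all occurred the same number of periods before the observation time. All names and numbers are hypothetical.

```python
def lag_cdf(lag_probs, max_lag):
    """P(reporting lag <= max_lag), where lag_probs[l] = P(lag == l periods)."""
    return sum(lag_probs[: max_lag + 1])


def adjusted_claim_rate(reported_claims, units_at_risk,
                        periods_since_repair, lag_probs):
    """Claim rate with the risk set scaled by the reported fraction.

    Only claims whose lag is <= periods_since_repair can have been
    entered, so the effective risk set is units_at_risk times that
    probability.
    """
    reported_fraction = lag_cdf(lag_probs, periods_since_repair)
    if reported_fraction <= 0:
        raise ValueError("no claims can have been reported yet")
    return reported_claims / (units_at_risk * reported_fraction)


lag_probs = [0.5, 0.3, 0.2]  # hypothetical: 50% entered at once, all within 2 periods
recent = adjusted_claim_rate(40, 10000, 0, lag_probs)
mature = adjusted_claim_rate(40, 10000, 2, lag_probs)
```

In this toy case the same 40 reported claims imply a rate of 0.008 when only half of the claims could have been entered, and 0.004 once reporting is complete.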


Data Maturity: Warranty Expiration Rush

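[Figure: F(t) vs. t; CDF estimates made at various observation times (Jan’06, Mar’06, May’06) compared at a fixed time t0]

Data Maturity Problem:

CDF estimates for a nominally homogeneous population increase disproportionately as a function of the observation time and the proximity to the warranty expiration time.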

Possible cause:

“Soft” (non-critical) failures tend to not get reported until the customer realizes the proximity of warranty expiration date.

Solution:

Use historical data on similar components to empirically* adjust for the warranty-expiration rush phenomenon.

*B. Rai, N. Singh “Modeling and analysis of automobile warranty data in presence of bias due to customer-rush near warranty expiration limit”, Reliability Engineering & System Safety, Vol. 86, Issue 1, pp. 83-94.

[Figure: F(t) vs. t with the warranty expiration time tw marked; curves for Mar’04 and May’04, compared at t0 and tw, provide a basis for adjustment]

Development of a Successful Program in Reliability and Maintainability Engineering

Dr. Wes Hines Head, Nuclear Engineering

College of Engineering The University of Tennessee

[email protected]

UMD Reliability Engineering Symposium April 2, 2014

Overview

• Goal
  – Provide a case study that may be useful in developing new reliability programs.

• Outline
  – What Reliability programs do we have at UT
  – History of how they were developed
  – Components of the program
  – What makes them successful

Reliability Programs at UT

• Reliability and Maintainability Center (RMC)
  – University - industry association dedicated to improving industrial productivity, efficiency, safety & profitability through advanced maintenance and reliability technologies and management principles
  – Industrial Center since 1996 with ~30 members

• Reliability and Maintainability Engineering Program (RME)
  – Interdisciplinary Academic Program
    • Undergraduate Minor in RME
    • Graduate Certificate and/or MS in RME
  – Local or Synchronous, Interactive Distance Delivery

• Prognostics, Reliability Optimization and Control Technologies (PROaCT) Laboratory
  – Interdisciplinary research program with professors and students in industrial, mechanical, nuclear engineering, and statistics.

UT History in Industry Focused RME

• 1988 - Preventive Maintenance Engineering Laboratory (PMEL) under Nuclear Engineering
• 1995 - Proposal to Develop College-wide Maintenance and Reliability Center (MRC)
  – Industry roundtable in July
  – Director named in September
• 1996 - Initial Meeting with 12 Charter Members
• 1997 - NSF Combined Research and Curricula Development (CRCD) Grant to develop 4 MRE courses
• 1997 - Internship Program Created
• 2000 - Initial Academic Program
  – Undergraduate Certificate
  – Graduate Certificate
• 2007 - New RME Programs Approved
  – Master of Science in Reliability and Maintainability Engineering
  – Undergraduate Minor in Reliability and Maintainability Engineering
• 2009 - MS with Specialization in Prognostics
• 2010 - RME Minor most utilized minor in the COE

UT Reliability and Maintainability Center

The Maintenance and Reliability Center is a university - industry association dedicated to improving industrial productivity, efficiency, safety & profitability through advanced maintenance and reliability technologies and management principles.

* Education
* Research & Technology Assessment
* Information Sharing
* Business Support & Alliances

50 Companies with a Desire to Improve

Components of Reliability and Maintainability Engineering Program

• Process vs. Product Focus

• Original Academic Programs
  – Undergraduate Certificate with Industry Partnership
    • Coursework (2 courses)
    • Summer Bootcamp
    • Internship (12 weeks)
  – Graduate Certificate
    • 4 courses: 12 hours
    • Stats 560 Mathematical Statistics for Reliability
    • NE 483 Introduction to Reliability Engineering
    • NE 484 Advanced Maintenance Engineering
    • NE 579 Advanced Monitoring and Diagnostic Techniques

Internship Class of 2000

Internship Class of 1998

Alcoa, Bayer, Dow, DuPont, Eastman, Energizer, Fluor Global, Harley Davidson, Jacobs, Nissan, NiSource, Novelis, ORNL, Owens Corning, Redstone Arsenal, SABIC, Schlumberger, SNL, Y-12, ….

Boot Camp Course

Internships

Maintenance Technology Teaching Labs

Real Time Interactive Distance Delivery

• Supports the working class.
• Courses are delivered live and interactively (i.e., synchronous delivery) to the student's desktop computer via the World Wide Web
• Taught in “Dual Delivery” format
• Instructor wears wireless microphone
• Local students attend class or log in from home
• Distance students
  – Multipoint audio communication
  – View slides, whiteboard, demos, etc.
  – Students can raise hands
  – Make presentations to class
  – Courses archived
• Content Delivery Methods
  – PowerPoint slides
  – Whiteboard
  – Windows application sharing
  – Video or audio clips

Graduate Programs in Reliability and Maintainability Engineering

• Interdisciplinary program offered by the College of Engineering through one of the following six departments:
  – Chemical and Biomolecular Engineering
  – Electrical Engineering and Computer Science
  – Industrial and Systems Engineering
  – Materials Science and Engineering
  – Mechanical, Aerospace and Biomedical Engineering
  – Nuclear Engineering

• Offered on campus and through web-based, synchronous, interactive, distance education.

• The RME graduate certificate program (12 hours) is designed to allow the credits to be applied towards an M.S. degree.

Support and Integrate with Research Programs

Give your COE Graduates a Niche (RME Minor)

Fifteen hours of coursework are required:

Core courses (6 hours):
  – Introduction to Maintenance Engineering
  – Introduction to Reliability Engineering

Statistics or Math Requirement (choose 1; 3 hours):
  – Probability and Statistics for Scientists and Engineers (Stats 251)
  – Probability and Statistics (Math 323)
  – Chemical Engineering Data Analysis (ChE 301)
  – Probability and Random Variables (ECE 313)

Electives (choose at least 2; 6 hours):
  – Process Dynamics and Control (ChE 360)
  – Engineering Data Analysis and Process Improvement (IE 300)
  – Statistical Process Control (Stats 365) (for non IE)
  – Process Improvement through Planned Experimentation (IE 440)
  – Signals and Systems (ECE 315)
  – Introduction to Pattern Recognition (ECE 471)
  – Mechanical Engineering Instrumentation and Measurement (ME 345)
  – System Dynamics (ME 363)
  – Nuclear and Radiological Engineering Laboratory (NE 304)

Total: 15 hours

• 10% of COE graduates have the RME Minor – most desired minor in COE

Summary

• Garner strong industrial support
  – Get their input on curriculum and laboratories
  – Partner through internship programs
  – Partner with research opportunities
  – Meet their needs!

• Make it available to a wide range of students
  – An interdepartmental college-based program reaches more students
  – Increase your reach through distance education

• Build expertise to increase industrial and government research opportunities

• Explain the employment benefits to increase enrollment and promote student success (students will figure this out themselves)

Questions ?

BREAK 3:30 p.m. – 4:00 p.m.


Dr. Darryll Pines, Nariman Farvardin Professor and Dean, A. James Clark School of Engineering

FUTURE AND IMPACT OF RELIABILITY ENGINEERING AT THE CLARK SCHOOL


Special Presentation

Mpact and Future of Reliability Engineering

DARRYLL J. PINES

APRIL 2, 2014

25TH ANNIVERSARY OF CENTER ON RISK AND RELIABILITY

The Center for Risk and Reliability (CRR) was formed in 1989 as the umbrella organization for many of the risk and reliability research and development activities at the UMD Clark School of Engineering. CRR research covers a wide range of subjects involving systems and processes, and includes topics such as predictive reliability modeling and simulation, physics of failure fundamentals, software reliability and human reliability analysis methods, advanced probabilistic inference methods, system-level health monitoring and prognostics, and risk analysis theory and applications to complex systems such as space missions, civil aviation, nuclear power plants, petro-chemical installations, medical devices, information systems, and civil infrastructures. Over 20 core and adjunct faculty from various engineering departments of the Clark School of Engineering form the pool of experts at CRR. CRR is also home to numerous research laboratories with extensive state-of-the-art equipment and high-performance computers. CRR is the research arm of the Reliability Engineering educational program, the largest and most comprehensive degree-granting graduate program in the field of reliability and risk analysis of engineered systems and processes. The program offers MS, PhD, and Graduate Certificate in Reliability Engineering and Risk Analysis. All courses are available through both traditional on-campus and online delivery modes.

Center for Risk and Reliability

Current Core Faculty

Professor Neil Goldsman (ECE)
Professor Carol Smidts (ME, OSU)
Professor Joseph Bernstein (ECE, Israel)

Adjunct Faculty and Lecturers

Dr. Stuart Katzke (NIST)
Dr. Nathan Siu (NRC)
Dr. Norman Eisenberg (Independent Consultant)
Dr. Mark Kaminskiy (CRR-CEE)
Dr. Roy Schuyler (Independent Consultant)

Affiliate and Adjunct Faculty

Al-Sheikhly, Mohamad Professor Materials Science and Engineering 2309F Chemical and Nuclear Engineering Building Phone: 301-405-5214 | [email protected]

Desai, Jaydev Professor Mechanical Engineering 0160 Glenn L. Martin Hall Phone: 301-405-4427 | [email protected]

di Marzo, Marino Professor Fire Protection Engineering 3104B JM Patterson Building Phone: 301-405-5257 | [email protected]

Sandborn, Peter Professor, Director of MTECH Mechanical Engineering 2106A Glenn L. Martin Hall Phone: 301-405-3167 | [email protected]

Schmidt, Linda Associate Professor Mechanical Engineering 2104B Glenn L. Martin Hall Phone: 301-405-0417 | [email protected]

http://crr.umd.edu/faculty/sheikly


http://crr.umd.edu/faculty/desai



http://crr.umd.edu/faculty/di-marzo



http://crr.umd.edu/faculty/sandborn



http://crr.umd.edu/faculty/schmidt







Mpact-Rankings

1. City University of Hong Kong
2. Sandia National Laboratories
3. University of Southern California
4. National University of Singapore
5. University of California Berkeley
6. Politecnico di Milano
7. University of Electronic Science & Technol...
8. University of Maryland
9. University of Manchester
10. Stanford University

Microsoft Academic Ranking Reliability Engineering (based on publications)

1. Stanford University-Management Science and Engineering, 1-2
2. Massachusetts Institute of Technology-Operations Research, 2-4
3. Georgia Institute of Technology-Main Campus-Industrial Engineering, 1-3
4. Northwestern University-Industrial Engineering and Management Sciences, 4-12
5. Carnegie Mellon University-Operations Res/Information Systems/Manufacturing and Operating Systems, 4-17
6. University of California-Berkeley-Industrial Engineering and Operations Research, 3-10
7. University of Michigan-Ann Arbor-Industrial Operations and Engineering 4-11
8. Cornell University-Operations Research 6-18
9. Carnegie Mellon University-Engineering and Public Policy 8-28
10. Purdue University-Main Campus-Industrial Engineering 6-22
11. Princeton University-Operations Research and Financial Engineering 11-29
12. University of Iowa-Industrial Engineering 11-37
13. University of Nebraska-Lincoln-Industrial and Management Systems Engineering 28-65
14. University of Wisconsin-Madison-Industrial Engineering 6-22
15. Virginia Polytechnic Institute and State University Industrial and Systems Engineering 5-28
16. University of Florida-Industrial and Systems Engineering 12-40
17. University at Buffalo-Industrial Engineering 27-53
18. University of Pennsylvania
19. Operations and Information Management 5-26
20. Arizona State University-Industrial Engineering 11-34
21. Pennsylvania State University-Main Campus-Industrial and Manufacturing Engineering 7-23
22. University of Pittsburgh-Pittsburgh Campus-Industrial Engineering 30-55

23. University of Maryland-College Park- Reliability Engineering 6-29

2010 NRC Rankings (Industrial Engineering, Operations Research, Reliability Engineering)

Stanford University 1
Massachusetts Institute of Technology 1
California Institute of Technology 3
University of California--Berkeley 3
Georgia Institute of Technology 5
University of Illinois--Urbana-Champaign 5
University of Michigan--Ann Arbor 5
Princeton University 8
Cornell University 8
Purdue University--West Lafayette 10
Carnegie Mellon University 10
University of Texas--Austin (Cockrell) 10
University of California--Los Angeles (Samueli) 13
Northwestern University (McCormick) 13
Johns Hopkins University (Whiting) 13
University of Minnesota--Twin Cities 16
University of Maryland--College Park (Clark) 17
Pennsylvania State University--University Park 17
Texas A&M University--College Station (Look) 17
Virginia Tech 17
University of California--San Diego (Jacobs) 21
University of Wisconsin--Madison 21
Rensselaer Polytechnic Institute 23
Ohio State University 23
University of Washington 23

2015 US News Mechanical Engineering Ranking

Mpact-Prestige

Professional Society Fellows of Center

• Mohammad Modarres
  – Fellow, American Nuclear Society
• Ali Mosleh
  – Fellow, Society for Risk Analysis
• Bilal Ayyub
  – Fellow, ASEE
• Shapour Azarm
  – Fellow, ASME
• Greg Baecher
  – Fellow, ASCE
• Aris Christou
  – Fellow, ASME
  – Fellow, APS

Faculty Service on Leading Journals

• Editorial Boards/Associate Editors
  • Reliability Engineering and System Safety Journal
  • Journal of Risk and Reliability
  • International Journal on Performability Engineering
  • International Journal of Reliability and Safety (IJR)
  • SNAME’s Journal of Ship Research, Ships and Offshore Structures Journal, Naval Engineers Journal (NEJ),

Mpact-NAE For contributions to the development of Bayesian methods and computational tools in probabilistic risk assessment and reliability engineering.

For contributions to national defense and security through improved battlefield communication. Also Inducted in May 2004 for innovative engineering and entrepreneurship in communications technologies.

For the development, explication, and implementation of probabilistic- and reliability-based approaches to geotechnical and water-resources engineering.

Mpact-Awards

1. Michel Cukier, NSF CAREER
2. Jeffrey Herrmann, Innovator of year
3. Monifa Vaughn-Cooke

Significant Junior Faculty Awards/Recognition

Mpact-Book/Monograph Contributions

Mpact-Partnerships

CRR Research Partnerships

• Cooperative Research Agreements with government agencies:
  – US NRC
  – US Navy /NAVAIR-NAWCAD
  – NASA
  – EC Halden Research Center, Norway
  – EEC Joint Research Center, Italy
  – ETH Center for System Safety, Switzerland
  – Norwegian Institute of Technology
  – Paul Scherrer Research Institute, Switzerland
• Partnership with the industry:
  – ManTech
  – Reliability Information Analysis Center RIAC Partnership

Mpact-Education Innovations

Professional Education-OAEE

• Online Professional Masters Degree
• Graduate Certificate

#1 Columbia University (Fu Foundation) New York, NY
#2 University of California—Los Angeles (Samueli) Los Angeles, CA
#3 University of Wisconsin—Madison Madison, WI
#4 University of Southern California (Viterbi) Los Angeles, CA
#5 Pennsylvania State University—World Campus College, PA
#6 Purdue University—West Lafayette West Lafayette, IN
#7 University of Michigan—Ann Arbor Ann Arbor, MI
#7 Virginia Tech Blacksburg, VA
#9 North Carolina State University Raleigh, NC
#9 Texas A&M University—Kingsville (Dotterweich) Kingsville, TX
#11 Arizona State University (Fulton) Tempe, AZ
#12 Polytechnic Institute of New York University New York, NY
#12 South Dakota School of Mines and Technology Rapid City, SD
#14 Johns Hopkins University (Whiting) Baltimore, MD
#14 University of Maryland—College Park (Clark) College Park, MD
#16 California State University—Fullerton Fullerton, CA
#17 Cornell University Ithaca, NY
#17 Lawrence Technological University Southfield, MI
#17 Missouri University of Science & Technology Rolla, MO
#20 Texas Tech University (Whitacre) Lubbock, TX

since 1993 are as follows:

MS – 211
PhD – 97

Per OAEE’s records, the Master of Engineering and Graduate Certificate in Engineering degrees awarded since 1994 and 2000 respectively are as follows:

M. Eng. Reliability On-Campus 46
M. Eng. Reliability Online 16
Total M. Eng. 62

GCEN Reliability On-Campus 10
GCEN Reliability Online 22
Total GCEN 32

Mpact-Placement of MEng, MS and PhDs

2006 Kristine Fretz (currently with Johns Hopkins Applied Research Lab.)
2004 S. Chamberlain (Currently, ITT - Industrial Products Group Reliability Specialist and Area Manager, ITT Industries)
2003 Chi Yeh (Currently, Systems Engineering & Integration Branch, NASA, Glenn Research)
2001 F. Li (Currently, Materials Research Scientist, Corning, Inc.)
2000 F. Joglar (Currently, Manager, Fire Risk Group, SAIC)
2000 V. Krivtsov (Currently, Ford Technical Leader for Reliability & Statistical Analysis, Ford Motor Company)
1998 H. Hadavi (Energy Research Corp., Rockville, MD)
1998 J. O’Brien (Currently Director of Office of Nuclear Safety, DOE)
1998 Y. Guan (President and CEO, Advanced System Technology Management, Inc.)
1998 K. Ouliddren (Currently, Staff Researcher, Nuclear Research Centre SCK-CEN, Mol, Belgium)
1997 T. Ni (Currently, Deputy Dean, Shanghai University, China)
1997 A. Thunem (Currently, Halden Reactor Project, Norway)
1994 Y-S. Hu (Currently, Dean, Beijing Technology & Business University, and CEO of DML International Corp.)
1991 L. Hammersten (Currently, Research Analyst, MITRE Corp.)
1990 L. Chen (VP at JP Morgan)

Work on Grand Challenge Problems Disaster Resilience Risk and Reliability of Critical Infrastructure

Work on Grand Challenge Problems Global Public and Human Health • Risk and Reliability of Devices and System

What of the Future?

New Faculty Hires in ME-Reliability Engineering:
• Monifa Vaughn-Cooke
• Offers out to at least 2 individuals

Facilities:
• Upgrades to Virtual Reality Cave under review to support future research thrusts

Education:
• Develop MOOC Course Series in Reliability Engineering

Some Perspectives from Dilbert

Dr. Monifa Vaughn-Cooke Assistant Professor

FACULTY VISION FOR THE FUTURE OF RELIABILITY ENGINEERING


Dr. Mohammad Modarres Minta Martin Professor

CLOSING REMARKS


Join us for the

ANNIVERSARY RECEPTION AND ALUMNI REUNION

5:00 p.m. – 7:00 p.m.

Presentations by
Dr. Marvin Roush, Professor Emeritus
Frank Groen, Ph.D., ’00
Tim Hajenko, M.S., ’13
Lesa Ross, Ph.D., ’09
Ken LaSala, Ph.D., ’93

THANK YOU TO OUR GENEROUS EVENT SPONSORS


ISSA Technologies, Inc.
