QRMEC-Elsayed May 28 2013

  • 8/12/2019 QRMEC-Elsayed May 28 2013

    1/47

    Reliability and Maintainability Engineering: An Overview

    E. A. Elsayed

    Department of Industrial and Systems Engineering

    Rutgers University

    New Jersey, USA

  • 8/12/2019 QRMEC-Elsayed May 28 2013

    2/47

    Brief Topics

    Some initial thoughts and personal experience

    Reliability definition

    Reliability testing

    Methods for improving reliability

    Maintenance strategies: PM (preventive maintenance), FR (repair upon failure), inspection policies

    Condition-based maintenance based on current sensor technologies

  • 8/12/2019 QRMEC-Elsayed May 28 2013

    5/47

    Reliability Economics

    Oil Pipeline Shutdown (Hardware Failure)

    BP shuts oilfield, August 8, 2006

    Damaged pipeline in Alaska affects 8% of U.S. oil production; crude surges; record gas prices seen.

    The threat of a stoppage also endangers Alaska's budget: oil taxes account for more than 90 percent of its revenues.

    BP officials have acknowledged that they did not test the pipes adequately using a so-called pig, a device that is run through a pipe to gauge corrosion (it uses ultrasound to detect corrosion).

    This points to inspection and maintenance strategies.

  • 8/12/2019 QRMEC-Elsayed May 28 2013

    6/47

    Reliability Engineering

    Air Traffic Delays (Software Failure)

    November 19, 2009: A computer glitch caused flight cancellations and delays across the U.S.

    The problem involved the FAA computer systems in Salt Lake City and Atlanta that handle automated flight plans, forcing air traffic controllers to revert to the much more time-consuming approach of entering flight plans by hand.

    Software failure (7,000 flights)

  • 8/12/2019 QRMEC-Elsayed May 28 2013

    7/47

    Global Impact: Oceanic Airspace Data Link Communication Reliability

    [Figure: oceanic data link communication. Labels: Oakland ARTCC; ARINC and SITA ground earth stations; uplink; engine monitoring; AOC (Aircraft Operational Communication); position monitoring.]

  • 8/12/2019 QRMEC-Elsayed May 28 2013

    8/47

    Recall of Cars due to Degradation (Environmental Conditions)

    October 1, 2012

    General Motors Co. recalled more than 40,000 cars sold in warm-weather states because a plastic part might crack and cause a fuel leak.

    The recall affects vehicles sold or currently registered in Arizona, California, Florida, Nevada or Texas. Owners in Arkansas and Oklahoma are also included in the recall of the 2009 Cobalt and G5.

    The vehicles have plastic parts connected to the fuel pump that could crack. If the crack gets large enough, fuel could leak out of the vehicle and cause a fire.

  • 8/12/2019 QRMEC-Elsayed May 28 2013

    9/47

    Reliability Definitions and Measurements

    When you buy a product or service, you request high quality and high reliability.

    How do you measure it? What is high?

    How long? Reliability: 0.99 at year 5, 0.999 at year 4

    Time-dependent quality: reliability

    How do companies predict reliability and estimate warranty?

    Reliability of cold standby units

    New tires and old tires

  • 8/12/2019 QRMEC-Elsayed May 28 2013

    10/47

    Some Initial Thoughts

    Repairable and Non-Repairable

    [Figure: reliability versus time for a system with repairs and a system with no repairs, with the maximum reliability level marked.]

    Another measure of reliability is availability (the probability that the system provides its functions when needed).

  • 8/12/2019 QRMEC-Elsayed May 28 2013

    11/47

    Some Initial Thoughts

    Failure Rate During Life Cycle

    Will you buy additional warranty?

    Burn-in and removal of early failures.

    [Figure: failure rate versus time, showing three regions: early failures, constant failure rate, increasing failure rate.]
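
The three regions of the failure-rate curve can be illustrated with a small sketch. It assumes a Weibull failure-rate function, which the slides do not name; the shape values 0.5, 1.0 and 3.0 and the scale of 1000 hours are hypothetical and chosen only to show a decreasing, a constant and an increasing failure rate.

```python
def weibull_hazard(t, shape, scale):
    """Failure rate h(t) = (shape / scale) * (t / scale) ** (shape - 1), for t > 0.

    Assumption: Weibull model (not stated in the slides).
    shape < 1 gives a decreasing rate, shape == 1 a constant rate,
    shape > 1 an increasing rate.
    """
    if t <= 0 or shape <= 0 or scale <= 0:
        raise ValueError("t, shape and scale must be positive")
    return (shape / scale) * (t / scale) ** (shape - 1)


# Hypothetical evaluation times (hours) and parameters.
times = [10.0, 100.0, 1000.0]
early_failures = [weibull_hazard(t, shape=0.5, scale=1000.0) for t in times]
constant_rate = [weibull_hazard(t, shape=1.0, scale=1000.0) for t in times]
wear_out = [weibull_hazard(t, shape=3.0, scale=1000.0) for t in times]

for label, values in [("early failures", early_failures),
                      ("constant rate", constant_rate),
                      ("wear-out", wear_out)]:
    print(label, values)
```

A decreasing rate is where burn-in pays off, since units that survive the early period are less likely to fail per unit time than new ones.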

  • 8/12/2019 QRMEC-Elsayed May 28 2013

    13/47

    Reliability Definitions

    Reliability is a time-dependent characteristic.

    It can only be determined after an elapsed time but can be predicted at any time.

    It is the probability that a product or service will operate properly for a specified period of time (design life) under the design operating conditions without failure.
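
As a minimal sketch of this definition, reliability at time t can be estimated from a complete (uncensored) sample of observed failure times as the fraction of units still operating at t. The function name and the sample values below are hypothetical.

```python
def empirical_reliability(failure_times, t):
    """Estimate R(t) = P(T > t) as the fraction of units that survive past t.

    Assumption: every unit in the sample was observed until it failed
    (no censoring).
    """
    if not failure_times:
        raise ValueError("need at least one observed failure time")
    survivors = sum(1 for x in failure_times if x > t)
    return survivors / len(failure_times)


# Hypothetical failure times, in years.
sample = [2, 4, 6, 8, 10]
print(empirical_reliability(sample, 5))
```

The estimate can only be computed after the units have been observed, which matches the point that reliability is determined after an elapsed time, while a fitted model is what allows prediction at any time.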

  • 8/12/2019 QRMEC-Elsayed May 28 2013

    14/47

    Other Measures of Reliability

    Availability is used for repairable systems.

    It is the probability that the system is operational at any random time t.

    It can also be specified as the proportion of time that the system is available for use in a given interval (0, T).

    Mission availability: play time, military products

    One-shot devices: missiles, standby generators.
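
The proportion-of-time definition over (0, T) can be sketched as follows. The function name and the downtime log are hypothetical, and the sketch assumes the recorded downtime intervals do not overlap.

```python
def interval_availability(down_intervals, horizon):
    """Proportion of (0, horizon) during which the system was available.

    down_intervals: list of (start, end) downtime periods.
    Assumption: intervals do not overlap; parts outside (0, horizon) are clipped.
    """
    if horizon <= 0:
        raise ValueError("horizon must be positive")
    downtime = 0.0
    for start, end in down_intervals:
        clipped_start = max(start, 0.0)
        clipped_end = min(end, horizon)
        if clipped_end > clipped_start:
            downtime += clipped_end - clipped_start
    return 1.0 - downtime / horizon


# Hypothetical log: two outages in a 100-hour window.
print(interval_availability([(10, 15), (50, 60)], 100))
```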

  • 8/12/2019 QRMEC-Elsayed May 28 2013

    15/47

    Other Measures of Reliability

    Mean Time To Failure (MTTF): the average time that elapses until a failure occurs.

    It does not provide information about the distribution of the time to failure (TTF); hence we also need to estimate the variance of the TTF.

    Mean Time Between Failures (MTBF): the average time between successive failures.

    It is used for repairable systems.
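
A minimal sketch of both measures, using Python's standard `statistics` module. The function names and data are hypothetical. The MTBF helper assumes that repair times are negligible, so the gap between successive failure epochs is treated as time between failures.

```python
import statistics


def mttf_summary(times_to_failure):
    """Return (sample mean, sample variance) of times to failure.

    The mean estimates the MTTF; the variance indicates how spread out
    the times to failure are around it.
    """
    return statistics.mean(times_to_failure), statistics.variance(times_to_failure)


def mtbf_from_failure_epochs(failure_epochs):
    """Average gap between successive failure epochs of one repairable system.

    Assumption: repair time is negligible compared with operating time.
    """
    gaps = [later - earlier for earlier, later in zip(failure_epochs, failure_epochs[1:])]
    return statistics.mean(gaps)


# Hypothetical data, in hours.
mean_ttf, var_ttf = mttf_summary([100.0, 200.0, 300.0])
mtbf = mtbf_from_failure_epochs([10.0, 30.0, 60.0, 100.0])
print(mean_ttf, var_ttf, mtbf)
```

Two samples can share the same MTTF and still differ widely in variance, which is why the mean alone is not enough to describe the TTF.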

  • 8/12/2019 QRMEC-Elsayed May 28 2013

    18/47

    Types of Reliability Testing

    Highly accelerated life testing (HALT)

    Highly accelerated stress screening (HASS)

    Accelerated life testing (ALT)

    Degradation testing (DT)

    Accelerated degradation testing (ADT)

    Acceptance test (AT)

  • 8/12/2019 QRMEC-Elsayed May 28 2013

    20/47

    Purposes and Needs

    1. Investigate failure modes and correct designs

    2. Improve reliability during development

    3. Demonstrate reliability for accepting a design

    4. Predict reliability

    5. Eliminate units with manufacturing defects (infant mortality failures)

    6. Accept or reject a product

  • 8/12/2019 QRMEC-Elsayed May 28 2013

    21/47

    Idea of Accelerated Stress Testing

    Compression and extrapolation

    [Figure: life (MTTF) versus stress. Life observed at the severer condition is extrapolated to the unknown life at the use condition.]
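
The extrapolation idea can be sketched with a simple life-stress fit. The log-linear relationship ln(life) = a + b * stress used here is an assumption made for illustration only; the slides do not state which life-stress model is used, and the stress levels and lives below are hypothetical.

```python
import math


def fit_log_linear_life(stresses, lives):
    """Least-squares fit of ln(life) = intercept + slope * stress.

    Assumption: a log-linear life-stress relationship (illustrative only).
    """
    n = len(stresses)
    if n < 2 or n != len(lives):
        raise ValueError("need at least two matching (stress, life) pairs")
    log_lives = [math.log(v) for v in lives]
    s_mean = sum(stresses) / n
    l_mean = sum(log_lives) / n
    sxx = sum((s - s_mean) ** 2 for s in stresses)
    sxy = sum((s - s_mean) * (l - l_mean) for s, l in zip(stresses, log_lives))
    slope = sxy / sxx
    intercept = l_mean - slope * s_mean
    return intercept, slope


def predict_life(intercept, slope, stress):
    """Extrapolate the fitted relationship to a stress level, e.g. the use condition."""
    return math.exp(intercept + slope * stress)


# Hypothetical test results at three elevated stress levels.
test_stresses = [100.0, 150.0, 200.0]
test_lives = [math.exp(10.0 - 0.02 * s) for s in test_stresses]
a, b = fit_log_linear_life(test_stresses, test_lives)
life_at_use = predict_life(a, b, 50.0)  # use condition lies below the tested range
print(life_at_use)
```

Because the use condition lies outside the tested range, the prediction is only as good as the assumed model, which is the risk the word "unknown" in the figure points to.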

  • 8/12/2019 QRMEC-Elsayed May 28 2013

    22/47

    Testing and Acceptance

    Testing and Life Prediction:

    Conduct extensive reliability testing for both demonstration (acceptance of products) and life prediction.

    Collect data continuously to maintain a constant assessment of the system reliability. Always assess the mean residual life to determine the optimum time to replace the system, especially in the wear-out region.

    Suppliers need to provide evidence of the system's reliability.
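
The mean residual life at age t is the expected remaining life given survival to t, MRL(t) = (1 / R(t)) * integral of R(u) du from t to infinity. The sketch below evaluates it numerically with the trapezoidal rule. The function name, the truncation point `upper`, and the exponential and Weibull reliability functions are illustrative assumptions, not taken from the slides.

```python
import math


def mean_residual_life(reliability, t, upper, steps=20000):
    """Numerically evaluate MRL(t) = integral_t^upper R(u) du / R(t).

    reliability: function R(u) returning the survival probability at u.
    upper: truncation point, assumed large enough that R(upper) is negligible.
    """
    r_t = reliability(t)
    if r_t <= 0:
        raise ValueError("R(t) must be positive")
    h = (upper - t) / steps
    area = 0.5 * (reliability(t) + reliability(upper))
    for k in range(1, steps):
        area += reliability(t + k * h)
    return area * h / r_t


# Constant failure rate (hypothetical rate 0.1 per year): remaining life does not depend on age.
def exponential(u):
    return math.exp(-0.1 * u)


# Increasing failure rate (hypothetical Weibull, shape 2, scale 10): remaining life shrinks with age.
def wear_out(u):
    return math.exp(-((u / 10.0) ** 2))


print(mean_residual_life(exponential, 5.0, 405.0))
print(mean_residual_life(wear_out, 0.0, 100.0), mean_residual_life(wear_out, 10.0, 110.0))
```

In the wear-out case the mean residual life falls as the unit ages, which is what makes it a useful input to the replacement decision.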

  • 8/12/2019 QRMEC-Elsayed May 28 2013

    23/47

    Current Cycle Profile: -40 °C to 125 °C

    New Cycle Profile: -65 °C to 170 °C

    Open questions: stress type, stress loading, sample size, and duration of the test?

  • 8/12/2019 QRMEC-Elsayed May 28 2013

    24/47

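    Summary of Reliability Testing

    Columns 1 to 6 refer to the numbered items under "Purposes and Needs"; an X marks a purpose served by the test.

    1 2 3 4 5 6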

    RDT X

    RGT X

    HALT X X

    HASS X X

    ALT X X

    DT X X

    ADT X X

    Burn-in X

    Acceptance X X

  • 8/12/2019 QRMEC-Elsayed May 28 2013

    27/47

    System Design (Cont'd)

    System design: implicit redundancy

    Pumps connected in series.

    Two consecutive failures are needed for the system to fail.
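
One way to read this slide is as a linear consecutive-2-out-of-n:F system: the line of pumps keeps working unless two adjacent pumps have failed. Under that reading, and assuming independent pumps with identical reliability p (both assumptions, not stated in the slides), the system reliability can be sketched with a short recursion over the pumps.

```python
def consecutive_k_out_of_n_f_reliability(p, n, k=2):
    """Reliability of a linear consecutive-k-out-of-n:F system.

    The system fails only if at least k adjacent components have failed.
    Assumption: components are independent, each with reliability p.
    """
    if not 0.0 <= p <= 1.0 or n < 0 or k < 1:
        raise ValueError("need 0 <= p <= 1, n >= 0 and k >= 1")
    q = 1.0 - p
    # state[j] = probability that the system has not failed so far and the
    # last j components examined are all failed (j = 0 .. k-1).
    state = [1.0] + [0.0] * (k - 1)
    for _ in range(n):
        new_state = [0.0] * k
        new_state[0] = sum(state) * p          # next component works: run resets
        for j in range(k - 1):
            new_state[j + 1] = state[j] * q    # next component fails: run grows
        state = new_state                      # a run of length k is dropped (system failed)
    return sum(state)


# Hypothetical line of 3 pumps, each with reliability 0.9.
print(consecutive_k_out_of_n_f_reliability(0.9, 3))
```

For comparison, an ordinary series system of the same three pumps would have reliability 0.9 ** 3 = 0.729, so tolerating isolated pump failures gives a substantial gain without adding explicit standby units.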

  • 8/12/2019 QRMEC-Elsayed May 28 2013

    29/47

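    Failure Rate and Maintenance

    Availability

    Region 1 (early failures): repair upon failure (FR)

    Region 2 (constant failure rate): no PM

    Region 3 (increasing failure rate): PM, condition-based if possible

    [Figure: failure rate versus time, showing the three regions: early failures, constant failure rate, increasing failure rate.]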

  • 8/12/2019 QRMEC-Elsayed May 28 2013

    30/47

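    System State under Imperfect Maintenance

    [Figure: system state X(t) versus time t under imperfect maintenance, with maintenance intervals T1 > T2 > T3 > T4. Points R_i and R_i^+ (i = 1, ..., 6) are marked at each maintenance action, together with the distribution of the state after maintenance and the levels D_PM and D_F.]

    D_F is the failure threshold, while D_PM is the preventive maintenance threshold.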
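
The two thresholds imply a simple decision rule at each inspection, sketched below. The function name and the returned labels are hypothetical, and the sketch assumes that a larger state value means more degradation and that D_PM lies below D_F.

```python
def maintenance_action(state, pm_threshold, failure_threshold):
    """Map an observed degradation state X(t) to a maintenance action.

    Assumptions: larger state means more degradation, and
    pm_threshold (D_PM) < failure_threshold (D_F).
    """
    if pm_threshold >= failure_threshold:
        raise ValueError("D_PM must be below D_F")
    if state >= failure_threshold:
        return "failure replacement"
    if state >= pm_threshold:
        return "preventive maintenance"
    return "continue operating"


# Hypothetical thresholds and inspection readings.
for reading in [2.0, 6.5, 9.1]:
    print(reading, maintenance_action(reading, pm_threshold=6.0, failure_threshold=9.0))
```

Under imperfect maintenance the state after a preventive action does not return to zero, so the same rule is reapplied to the post-maintenance state at the next inspection.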

  • 8/12/2019 QRMEC-Elsayed May 28 2013

    31/47

    Maintenance

    The time to perform maintenance is normally recommended by the manufacturer for equipment operating under normal conditions. But what is normal?

    If the estimate of failure time is conservative, one risks replacing items before the end of their useful life.

    If the estimate of failure time is optimistic, one risks that the component or system may fail in service.

    Use condition-based maintenance.

  • 8/12/2019 QRMEC-Elsayed May 28 2013

    33/47

    Technological Advances and Maintenance

    Condition-based Maintenance

    Recent advances in sensor technology, chemical and physical non-destructive testing (NDT), sophisticated measurement techniques, information processing, wireless communications, and internet capabilities have significantly impacted the condition-based maintenance approach by providing dynamic maintenance schedules that minimize cost and downtime and increase system availability.

  • 8/12/2019 QRMEC-Elsayed May 28 2013

    34/47

    Condition-Based Maintenance Example

    Otis Elevators / United Technologies

    Remote Elevator Monitoring

    It is an interconnected system of sensors, monitors, circuits, hardware and software used to collect, record, analyze and communicate elevator data. It continuously monitors hundreds of different functions on elevators worldwide (braking time, acceleration, deceleration, floor level, door opening, forces on strands, ...).

  • 8/12/2019 QRMEC-Elsayed May 28 2013

    35/47

    Condition-Based Maintenance Example

    If the system detects a problem, it analyzes and diagnoses the problem's cause and location. It makes the service call and aids the Otis mechanic in identifying the exact component(s) causing the problem, which helps to facilitate a timely and accurate resolution.

    Issues:

    Which component or subsystem?

    What is the indicator?

    What is the alarm level of the indicator?

    What is the action?

  • 8/12/2019 QRMEC-Elsayed May 28 2013

    36/47

    Degradation Indicators

    Most mechanical components, such as gears, brakes and bearings, exhibit degradation before failure.

    Ductile materials have degradation indicators: changes in their physical properties.

    Electronic components such as resistors, capacitors and diodes exhibit changes in performance; resistance drift is an example.

    Polymers and elastomers change properties with time and stress.

  • 8/12/2019 QRMEC-Elsayed May 28 2013

    37/47

    Degradation Indicators

    Examples of these indicators include hardness, which is a measure of the degradation of elastomers. Elastomeric materials are critical to many applications, including hoses, seals and dampers of various types, and their hardness increases over time to a critical level at which their ability to absorb energy is severely degraded. This may lead to cracks or excessive wear and related failure modes in components.

  • 8/12/2019 QRMEC-Elsayed May 28 2013

    38/47

    Continuous-time Continuous-state Degradation (Indicators)

    Examples

    Light intensity degradation of light emitting diode (LED)

    Metal crack propagation

    Wear of tires, brake system wear out, nozzle blocking.

    Increase of stiffness of energy absorbing material.

    Strength loss of steel beams due to corrosion

  • 8/12/2019 QRMEC-Elsayed May 28 2013

    39/47

    Degradation Path

    \[ dX(t) = \mu\,dt + \sigma\,dW(t) \]

    \[ dX(t) = \mu_i\,dt + \sigma_i\,dW(t) \quad\Rightarrow\quad X(t) = x_0 + \mu_i t + \sigma_i W(t) \]
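
A degradation path of this kind, Brownian motion with drift, can be simulated on a time grid by adding independent normal increments with mean mu * dt and standard deviation sigma * sqrt(dt). This is a minimal sketch; the function name, parameter values and step size are hypothetical.

```python
import math
import random


def simulate_degradation_path(x0, mu, sigma, dt, n_steps, rng=None):
    """Simulate X(t) = x0 + mu * t + sigma * W(t) at times 0, dt, 2*dt, ...

    Each step adds an independent Normal(mu * dt, sigma**2 * dt) increment.
    rng: optional random.Random instance, for reproducible paths.
    """
    rng = rng or random.Random()
    path = [x0]
    for _ in range(n_steps):
        increment = mu * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        path.append(path[-1] + increment)
    return path


# Hypothetical parameters: drift 2.0 per unit time, diffusion 0.5, step 0.5.
path = simulate_degradation_path(x0=0.0, mu=2.0, sigma=0.5, dt=0.5,
                                 n_steps=10, rng=random.Random(42))
print(path)
```

Note that a Brownian path is not monotone: with a large sigma relative to mu the simulated degradation can temporarily decrease, which is a known property of this model rather than a bug in the sketch.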

  • 8/12/2019 QRMEC-Elsayed May 28 2013

    40/47

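    Brownian Motion: Parameters Estimation

    Using the maximum likelihood estimator, we obtain the parameters of the Brownian motion degradation path as

    \[ \hat{m} = \frac{\sum_{i=1}^{n} y_i}{n}, \qquad \hat{v} = \frac{\sum_{i=1}^{n} (y_i - \hat{m})^2}{n} \]

    where y_i is the i-th observed degradation increment over a time step \Delta t, with

    \[ m = \mu\,\Delta t, \qquad v = \sigma^2\,\Delta t \]

    so that

    \[ \hat{\mu} = \frac{\sum_{i=1}^{n} y_i}{n\,\Delta t}, \qquad \hat{\sigma}^2 = \frac{\sum_{i=1}^{n} (y_i - \hat{m})^2}{n\,\Delta t} \]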
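
The estimation step can be sketched as follows, assuming the degradation level is measured at equally spaced times so that every increment covers the same step dt. The function name and the readings are hypothetical.

```python
def estimate_brownian_parameters(increments, dt):
    """Maximum likelihood estimates of drift and diffusion from equally spaced increments.

    increments: observed changes in degradation, one per time step of length dt.
    Returns (mu_hat, sigma2_hat), where each increment is assumed to be
    Normal(mu * dt, sigma**2 * dt) and independent of the others.
    """
    n = len(increments)
    if n == 0 or dt <= 0:
        raise ValueError("need at least one increment and a positive dt")
    m_hat = sum(increments) / n                              # estimates mu * dt
    v_hat = sum((y - m_hat) ** 2 for y in increments) / n    # estimates sigma**2 * dt
    return m_hat / dt, v_hat / dt


# Hypothetical degradation readings taken every 0.5 time units.
levels = [0.0, 1.0, 3.0, 6.0]
increments = [later - earlier for earlier, later in zip(levels, levels[1:])]
mu_hat, sigma2_hat = estimate_brownian_parameters(increments, dt=0.5)
print(mu_hat, sigma2_hat)
```

The variance estimate divides by n rather than n - 1, which is what maximum likelihood gives; it is slightly biased for small samples.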

  • 8/12/2019 QRMEC-Elsayed May 28 2013

    41/47

    First Passage Time Distribution

    [Figure: two degradation paths (degradation 1, degradation 2) and their failure thresholds (threshold 1, threshold 2).]

  • 8/12/2019 QRMEC-Elsayed May 28 2013

    42/47

    Relate Degradation to Failure Time Distribution (First Passage Time)

    [Figure: degradation versus time (time axis 0 to 200, degradation axis 0 to 50) for four data sets, each with an exponential fit. A horizontal line marks the critical degradation level (failure threshold); the time at which a path crosses it is the failure time.]

    R(t) = Prob(the first time for the degradation measure to cross the failure threshold > t)

  • 8/12/2019 QRMEC-Elsayed May 28 2013

    43/47

    First Passage Time Distribution

    In degradation modeling we set a degradation threshold level at which the performance of the system is considered unacceptable. The threshold level is D_f. The probability of failure can be defined as

    \[ F(t) = P(T \le t) \]

    where T is the time at which the degradation first reaches D_f.

    The time to cross the threshold level is referred to as the first passage time. It has a distribution with a corresponding density function.
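
For a Brownian motion degradation path with positive drift, the first passage time to a fixed threshold follows an inverse Gaussian distribution. The sketch below evaluates its distribution and density functions. The function names are hypothetical, `distance` stands for D_f minus the initial level x0, and the formulas apply only when the drift mu is positive.

```python
import math


def _standard_normal_cdf(z):
    # erfc keeps accuracy in the far tails, where 1 + erf(z) would lose digits.
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def first_passage_cdf(t, distance, mu, sigma):
    """F(t) = P(T <= t) for Brownian motion with drift mu > 0 and diffusion sigma,
    where T is the first time the path rises by `distance` above its start."""
    if t <= 0:
        return 0.0
    s = sigma * math.sqrt(t)
    return (_standard_normal_cdf((mu * t - distance) / s)
            + math.exp(2.0 * mu * distance / sigma ** 2)
            * _standard_normal_cdf(-(mu * t + distance) / s))


def first_passage_pdf(t, distance, mu, sigma):
    """Inverse Gaussian density of the first passage time T."""
    if t <= 0:
        return 0.0
    return (distance / (sigma * math.sqrt(2.0 * math.pi * t ** 3))
            * math.exp(-(distance - mu * t) ** 2 / (2.0 * sigma ** 2 * t)))


# Hypothetical case: threshold 10 units above the start, drift 1.0, diffusion 1.0.
for t in [5.0, 10.0, 20.0]:
    print(t, first_passage_cdf(t, 10.0, 1.0, 1.0), first_passage_pdf(t, 10.0, 1.0, 1.0))
```

The reliability at time t is then 1 minus the value of `first_passage_cdf`, and the mean first passage time under these assumptions is `distance / mu`.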

  • 8/12/2019 QRMEC-Elsayed May 28 2013

    44/47

    Failure Data in Oil and Gas Industry

    OREDA database:

    MTBF of a motor-driven dry-wet compressor unit = 1.8 years (it is really MTTF)

    Major failures: 80% of all forced outages are caused by unforeseen liquid ingress into the compressor.

  • 8/12/2019 QRMEC-Elsayed May 28 2013

    45/47

    Failure Data in Oil and Gas Industry

    OREDA database:

    Seal failures: 80% of all seal failures are caused by contamination and aging (elastomers).

    Solution:

    Analysis of 11,000 mechanical seal failures from 148 different reliability contract and alliance plant sites over two years shows that 13% of the seal failures are attributable to a lack of effective corrective and preventive maintenance.
